Linking Tables and Worksheets

Linking provides a simple yet powerful way of combining the data from two tables.

When you link tables (or worksheets) together, you essentially combine them into a single, bigger table with all the columns from both. (The change applies only to your session and does not affect the original tables or other people in any way.) For example, if one table contains addresses and telephone numbers for a group of people and another table contains their ages, heights, and weights, linking the two tables results in a table that has all the information about each person.

Linking is quite important for analysis. For example, using the first table described above, you could easily find all the people who live in a given city. Similarly, with the second table you could find all the people older than a given age. Both are simple row selections. But suppose you want to find all people older than a given age who also live in a given city? Link the tables and it's just another row selection.

Note: When you link to a worksheet (a table with operations applied to it), the full worksheet is pulled into memory, so there is a limit on how large the worksheet can be (roughly about 250,000,000 cells). When you link to a table, no such restriction applies. See the section on Linking under Writing Efficient Queries for tips on how to help limit the size of the worksheet you're linking in.

If, for a particular row in the table that you are viewing, there is no information in the foreign table, N/A values are filled in for that row. In many cases, you may not be interested in such rows; after all, they have a significant amount of missing information. One way to eliminate such rows from the analysis is to do the link and then select only those rows that have non-N/A values in the columns you have linked in. Another, more efficient, method is to link-and-select, which combines the link and the selection in one operation. The result is the same as a simple link, except that only those rows for which there are matches in the foreign table are linked in.

Note that this is another way of doing row selections and may be useful in certain circumstances. Suppose, for example, that you are looking at a Sales table that has millions of rows involving thousands of customer IDs. Now suppose that you have a list of a few hundred customer IDs that you would like to select for analysis and that this list is in a text file or Excel spreadsheet. An easy way of proceeding in 1010data would be:

  1. Upload the list of customer IDs as a new table.
  2. Go to the Sales table and perform a link-and-select to your uploaded table, linking on the customer ID.

Your view of the Sales table will now reflect only those customers in your list.