Linking tables and worksheets

Use linking to combine tables or worksheets into a single, bigger table with columns from both.

When you work a lot with data, you will often find yourself in a position where you want to use two separate tables, which contain different but related data, to answer a question. The 1010data Insights Platform has a highly effective feature that allows you join the information from related data tables together in a straightforward manner. It is called linking, and it is one of the things the Insights Platform does best.

When you link tables (or worksheets) together, you essentially combine them into a single, bigger table with all the columns from both. The change applies only to your current workspace and does not affect the original tables or other people in any way. For example, if one table contains addresses and telephone numbers for a group of people and another table contains their ages, heights, and weights, linking the two tables results in a table that has all the information about each person.

Linking is quite important for analysis. For example, using the first table described above, you could easily find all the people who live in a given city. Similarly, with the second table you could find all the people older than a given age. Both are simple row selections. But suppose you want to find all people older than a given age who also live in a given city? Link the tables and it's just another row selection.

Note: When you link to a worksheet (a table with operations applied to it), the full worksheet is pulled into memory, so there is a limit on how large the worksheet can be (roughly about 250,000,000 cells). When you link to a table, no such restriction applies.

If, for a particular row in the table that you are viewing, there is no information in the foreign table, N/A values are filled in for that row. In many cases, you may not be interested in such rows; after all, they have a significant amount of missing information. One way to eliminate such rows from the analysis is to do the link and then select only those rows that have non-N/A values in the columns you have linked in. Another, more efficient, method is to link-and-select, which combines the link and the selection in one operation. The result is the same as a simple link, except that only those rows for which there are matches in the foreign table are linked in.

Note that this is another way of doing row selections and may be useful in certain circumstances. Suppose, for example, that you are looking at a Sales table that has millions of rows involving thousands of customer IDs. Now suppose that you have a list of a few hundred customer IDs that you would like to select for analysis and that this list is in a text file or Excel spreadsheet. An easy way of proceeding in the Insights Platform would be:

  1. Upload the list of customer IDs as a new table.
  2. Go to the Sales table and perform a link-and-select to your uploaded table, linking on the customer ID.

Your view of the Sales table will now reflect only those customers in your list.