Performing a left or right join

With some manipulation, a simple link in 1010data can produce the same results as a LEFT JOIN or a RIGHT JOIN in SQL.

Difficulty

Objective

You want to perform a "LEFT JOIN" or a "RIGHT JOIN" in 1010data, that is you want to match all of the rows in your base table with the appropriate row in the foreign table, and if their is not a match, you want the result from the foreign table to be NULL.

SQL Solution

SELECT store_master.store,sum_sales,division,
subdivision,manager,city,state
FROM 
  (SELECT store,sum(xsales) as sum_sales 
  FROM sales_item_detail 
  WHERE date=20150621
  GROUP BY store) 
  AS sales 
LEFT JOIN 
  store_master 
  ON sales.store=store_master.store;

Performing a LEFT JOIN in SQL combines all of the columns from the left table to those of the right table. If there is an exact match between a row in both tables, then the corresponding data is joined in, if there is not a match to the right table, the columns from the right table for that row are populated with NULL values.

SELECT store_master.store,sum_sales,division,
subdivision,manager,city,state
FROM 
  (SELECT store,sum(xsales) as sum_sales 
  FROM sales_item_detail 
  WHERE date=20150621
  GROUP BY store) 
  AS sales 
RIGHT JOIN 
  store_master 
  ON sales.store=store_master.store;

A RIGHT JOIN is the opposite of a LEFT JOIN, where all information from the right table is kept and if there is not a match in the left table, the row is populated with NULL values in each of the columns.

1010data Macro Language solution

<base table="pub.doc.retail.altseg.sales_detail_transid"/>
<sel value="trans_date=20150621"/>
<tabu breaks="store" label="Tabulation">
  <break col="store" sort="up"/>
  <tcol fun="sum" source="xsales" name="sum_sales" format="dec:2"/>
</tabu>
<link table2="pub.doc.retail.altseg.stores" col="store" col2="store" 
cols="division,subdivision,manager,city,state"/>
<<base table="pub.doc.retail.altseg.stores"/>
<colord cols="store,division,subdivision,manager,city,state"/>
<sort col="store" dir="up"/>
<link table2="pub.doc.retail.altseg.sales_detail_transid" col="store" col2="store">
  <sel value="trans_date=20150621"/>
  <tabu breaks="store" label="Tabulation">
    <tcol fun="sum" source="xsales" name="sum_sales" format="dec:2"/>
  </tabu>
</link>

The difference of simulating a LEFT JOIN or a RIGHT JOIN in 1010data is the table that is designated as your base table. If you say that pub.doc.retail.altseg.sales_detail_transid is your "left table" and pub.doc.retail.altseg.stores is your "right table," then simulating a LEFT JOIN would require you to use the Sales Item Detail table as your base and then perform a simple link on store to the Store Master table.

Similarly, simulating a "RIGHT JOIN" requires you to use Store Master as your base table, then link to Sales Item Detail on store. As you can imagine, the results of these two "joins" differ from each other. Due to the selection of a specific date, some stores do not have entries in the Sales Item Detail table, because they did not have transactions on that day. Therefore, similar to the LEFT JOIN in SQL, only the store information for the stores that had sales on that day is brought in when using the Sales Item Detail table as a base. With the Store Master as the base, you obtain the same result as a RIGHT JOIN- the sum of sales is brought in for every store. Therefore, if a store had no sales on that day, the result is a NULL value.

In this example, there is only one row for each store, therefore the only difference between a "left join" or a "right join" is whether or not there is an entry for store 186, which had no sales on the date selected, June 21, 2016. However, if you did not tabulate to find the sum of sales and instead just wanted to bring in the store information for each individual transaction, the table that is used as your base is very important.

Joins in SQL will bring in all possible matches from the foreign table to the base table. Therefore, if you perform the RIGHT JOIN in the SQL example above, and use the whole Sales Item Detail instead of calculating the sum of sales per store, the information in the Store Master table will be duplicated to accommodate every entry in the sales table.

Regular links in 1010data, only bring in the first entry from the foreign table if there are multiple matches to the base table. Again starting with the Store Master and linking in the whole Sales Item Detail table, the result will be the same size as the store table and only the first transaction for each store will be linked in.

Thus, you should think carefully about which table you should use as your base. In general, your foreign table should contain the higher number of unique values. In this example, the transaction table does not contain any unique values in the store column, meaning that there is at least two entries for every store. However the store table only contains one entry for every store, therefore it has the higher number of unique values. So when performing a link between these two tables in their entirety, you should use the transaction table as your base.

Further reading

If you would like to learn more about the functions and operations discussed in this recipe, click on the links below:

<link>