Perform a tabulation

A tabulation groups rows of data together based on the values in one or more columns and performs calculations on data in each of those groups. The summarization results are provided in a small, easy-to-read table.

A tabulation is a great place to start when you want to get a feel for what all those rows of data in your table really mean. Tabulations allow you to calculate various metrics, such as the sum, average, or highest value of a particular column, grouping those calculations based on common values in one or more other columns. For instance, in a table containing weather data, you could calculate the highest temperature during the year for every unique combination of ZIP Code and month in the table.

In this tutorial, you will perform a tabulation on the Sales Item Detail table in the Trillion-Row Spreadsheet (TRS) to find the total amount of sales for each of three stores in a chain. You will then perform a separate tabulation to calculate the total sales amount and number of items sold in each transaction. Finally, you will export the tabulation to a Microsoft Excel workbook file.

To perform a tabulation:

  1. From the menu bar, select Tools > Trillion-Row Spreadsheet.
    The Insights Platform displays the TRS window.

  2. Open the Sales Item Detail table (pub.demo.retail.item).
    TRS displays the New operation panel in the analysis pane on the left and the Sales Item Detail table in the results pane on the right.

  3. In the New operation panel, click Tabulate.
    TRS displays the Tabulation panel.

    The Tabulation panel allows you to create a basic tabulation.

  4. Remove the default cnt (count) tabulation function by clicking the Remove () icon.

Next, choose the column by which you want to group the information. In other words, specify how you want to subtotal the data. When performing a tabulation, a helpful question to ask yourself is "What values do I want to use to group the records?" Since you want to find the total amount of sales for each store, you should group the information by the Store column. Grouping is a way of pooling all the records for a single entity or value (in this case, a store) into a single entry in the table.

  1. Drag the Store column from the Worksheet Columns section to the Row Breaks section.
    This groups your tabulation by store.

Now that the data grouping is set, choose the data you want to summarize. Since you want to find the total amount of sales, you should summarize the Sales column.

  1. Drag the Sales column from the Worksheet Columns section to the lower area of the Column Breaks section.
    This selection chooses the column of data you want to summarize.

    TRS adds the sum of Sales summarization and returns the Sales column to the Worksheet Columns section.

The summarization is a combination of the selected column and the default tabulation function for that column's data type. In this case, sum is the default tabulation function for Sales. This means the sum of Sales summarization will calculate the total amount in the Sales column.

The next step of a tabulation is to select the tabulation function you want to perform on the column. This is done by clicking the summarization (in this case, sum of Sales) and choosing a tabulation function from the menu.

However, since the default tabulation function happens to be exactly what you are looking for, you do not need to make this selection.

At this point, your tabulation should look like the following image.

Starting from the right and working left, the tabulation panel indicates that the resultant worksheet will contain a single column (Sales) and each row in that column will list the total sales amount (sum of Sales) grouped by Store.

By default, columns in the resultant worksheet are named t0, t1, t2, and so on. You can give the tabulated columns more meaningful names within the More Options view of the Tabulation panel. In addition, you can define the format of the tabulation.

  1. Click the More Options switch.
    TRS displays the formatting options.

    Select sum of Sales in the Tabulation Column drop-down. This indicates that the name and formatting options you define apply to the sum of Sales tabulated column.

  2. Complete the following fields:
    Tabulation Label Enter Sum of Sales Tabulation
    Result Name Enter sumsales.

    This is the column name. Remember, column names must begin with an alphabetic character, may only contain alphanumeric characters or underscores, and may not contain any spaces or other special characters.

    Result Heading Enter Sum of`Sales.

    This is the column label. For easier readability, column labels may contain any combination of uppercase and lowercase letters, numbers, spaces, and special characters. The backtick ( ` ) special character is used to break the column label text line into multiple lines. In this case, "Sum of" will be on the first line of the column label and "Sales" will be on the second line.

    Display Format Select Currency: $1,234.56.

    Since your totals will be dollar amounts, this option is selected. When currency is selected, TRS adds the dollar sign and uses two decimal places for each tabulated sales value.

    Column Width Select Default.
    Decimal Places Select Default.

    Because you selected currency above, TRS will automatically use two decimal places.

Now that you have selected the data you want to group by, the data you want to summarize, and the formatting options, generate the tabulation.

  1. Click the Submit operation () icon.
    TRS displays the results of your tabulation.

    Note: In the image above, the column heading (Sum of Sales) is at the top of the column. If you prefer, you can change this to show the column name instead. For details, see Edit your data preferences in the 1010data Insights Platform User's Guide.

    Notice that there is a row of data, with values in orange, at the top of the worksheet. This row shows the totals for each column of summarized data. The three fictional stores generated $59.85 in sales with store 1 generating $23.19, store 2 generating $16.31, and store 3 generating $20.35. While this tabulation might not seem very useful for a chain of three stores, it becomes much more useful when summarizing chains with hundreds, or even thousands, of stores.

Now, you will take a look at a slightly more complex example of a tabulation.

Instead of looking at the total sales per store, you will determine the sales totals for each transaction in the Sales Item Detail table. However, you also want to know at which store any given transaction took place, and also the total number of items in each transaction. The objective is to have a summary table that shows you the transaction ID, store, sum of sales, and total units for each transaction. To accomplish this, you will edit the existing tabulation operation in the timeline.

  1. In the Analyze tab, click the Tabulation operation in step 2 of the timeline.
    TRS displays the Tabulation panel.

Just as before, start by selecting a column to group your data. In this case, you want to see the total sales figures for each transaction. You also want to know which store recorded each of the transactions in the table, so you should also group by store.

  1. Drag the Transaction column from the Worksheet Columns section to the Row Breaks section placing it above the Store column.
    This selection first groups the tabulation by transaction and then divides the tabulated results into subgroups by store. The result of the calculation for each transaction will be displayed on a separate row in the new worksheet.
    Note: The order by which you choose to group the data is important because it can affect the results of the tabulation.

Now that you have grouped your data, first by transaction, then by store, you can choose the data you want to summarize. In this example, you are still summarizing sales, so you can leave the sum of Sales summarization and simply add the summarization of units.

  1. Drag the Units column from the Worksheet Columns section to the lower area of the Column Breaks section placing it below the sum of Sales summarization.
    This selection is chosen because you want to determine the number of units.
    TRS adds the sum of Units summarization and returns the Units column to the Worksheet Columns section.

Since the default tabulation function (sum) is what you want, once again you do not need to select a tabulation function from the menu.

The tabulation panel indicates that the resultant worksheet will contain two columns, Sales and Units. The rows in each column will list the total sales amount (sum of Sales) and total number of items sold (sum of Units) grouped first by Transaction and then subgrouped by Store.

Next, as before, give the new tabulated column a more meaningful name and define the formatting.

  1. Click the More Options switch.
    TRS displays the formatting options.

  2. In the Summary section, click sum of Units.
    This selection is chosen because this time you want to define the name and formatting options for the sum of Units tabulated column.
    TRS displays the formatting options for the sum of Units tabulated column.

  3. Complete the following fields:
    Tabulation Label Sum of Sales Tabulation
    Result Name Enter totunits.
    Result Heading Enter Total Units.
    Display Format Select Default.
    Column Width Select Default.
    Decimal Places Select Default.

Now that you have selected the data you want to group by and the data you want to summarize, generate the tabulation.

  1. Click the Submit operation () icon.
    TRS displays the results of your tabulation.

    Notice that in the new summary, the total sales figure in the row at the top of the Sum of Sales column remains the same. However, instead of the total sales by store, TRS displays the total sales figure for each transaction and the store at which the transaction took place. In addition, the tabulation includes the total number of units sold in every transaction to provide a little more insight into volume. This provides you with an extra layer of detail or granularity in your results. Yet, the new table is still much easier to read and understand than the original Sales Item Detail table.

The final step in this tutorial is to download your results in a Microsoft Excel workbook file.

  1. In the analysis pane, click the Export tab.
    TRS displays the export options.

  2. Click Microsoft Excel.
  3. Click the Export button.
    TRS saves the tabulation to your computer as download.xlsx.
    Note: Depending on the browser you are using, you may be prompted to open or save the file. If you are prompted, save the file.
  4. When you are finished with this tutorial, close the TRS window.
After you are comfortable with them, tabulations are a very fast and powerful way to get a sense of what information your data contains and how it can be leveraged to make decisions. So with that in mind, good luck and keep practicing.