Select rows

Selecting rows is one of the most basic tasks in 1010data. Use row selection to isolate data that is important to your analysis and create more efficient queries.

Whether you are working with a table that is very small or very large, the first step to answering an analytical question is to narrow down the data you are working with so you see only the pieces of information that apply. While there are many ways to do this, the most basic is a row selection.

The purpose of row selection is to reduce a large collection of general data to a smaller collection of data specific to the question being asked. For example, a question might be "Which store in our chain had the highest sales in December?" Another could be "What were the highest and lowest sales figures for a single transaction?" In both cases, you start with all of your data to answer a question that only requires a subset of that data.

When you work with very large data sets like those typically found in 1010data, the order in which you perform your operations on the data affects how fast the system can complete them. If you first make the selection that eliminates the most data, the subsequent operations finish faster because 1010data does not need to work with as much information. As a best practice, if your data contains columns with date- or time-related information, that is generally the best place to start. However, the selection that eliminates the most data may involve something other than date or time information; it depends on your data and the analysis you want to perform.
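
The effect of selection order can be illustrated with a small Python sketch. This is not how 1010data executes selections internally; it only counts how many rows each ordering has to examine. The sample rows, the `run` helper, and the two predicates are hypothetical and are not taken from the Sales Item Detail table.

```python
from datetime import date

# Hypothetical sample: 28 rows spread over May 10-23, 2012, across two stores.
rows = [
    {"date": date(2012, 5, 10 + i % 14), "store": 1 + i % 2}
    for i in range(28)
]

def in_date_range(row):
    # Inclusive range, matching the selection used later in this tutorial.
    return date(2012, 5, 15) <= row["date"] <= date(2012, 5, 17)

def is_store_2(row):
    return row["store"] == 2

def run(rows, predicates):
    """Apply each selection in order; count the rows examined along the way."""
    examined = 0
    for predicate in predicates:
        examined += len(rows)
        rows = [r for r in rows if predicate(r)]
    return rows, examined

date_first, date_first_examined = run(rows, [in_date_range, is_store_2])
store_first, store_first_examined = run(rows, [is_store_2, in_date_range])

print(len(date_first), date_first_examined)    # same result, fewer rows examined
print(len(store_first), store_first_examined)
```

Both orderings return the same rows, but in this sample the date range eliminates more data than the store selection, so applying it first leaves fewer rows for the second selection to examine.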

This tutorial explains how to perform a series of row selections to reduce the amount of information in the Sales Item Detail data set.

To perform a row selection:

  1. Open the Sales Item Detail table (pub.demo.retail.item).
    The full name of the Sales Item Detail table is pub.demo.retail.item. The table name reflects the location of the table. In this case, the table is located in the following folder: Published Data > Demo > Retail.
    1010data displays the Sales Item Detail table. It contains 35 rows of data, of which only a limited number display in the image below.

    For the purposes of this tutorial, we developed a smaller data set so you can see the changes being made to the data. Even though real analyses typically involve much larger data sets, all the principles in this tutorial apply.

  2. Click Rows > Select Rows.

    1010data displays the Select Rows dialog.

When selecting rows in 1010data, you should always make the selection that eliminates the largest amount of data first. This almost always means your first selection should be a date range. For the purposes of this tutorial, limit the data to May 15, 2012 through May 17, 2012.

  1. From the drop-down list preceding the first is between label, select Date. In the next field, enter 05/15/12, and in the last field, enter 5/17/2012.
    Dates entered must exist in the data set.

    Note: A row selection specified using the is between criterion is inclusive, meaning rows whose values match either of the values entered in the corresponding fields are included in the results.

    You may have noticed that different date formats were used in each field. Different formats are valid in 1010data, so you can use the date format that you prefer.

  2. Click Select.
    1010data displays the results of your selection.

    Congratulations, you just performed your first row selection in 1010data! As you can see in the image above, the amount of data in the table is reduced by more than half.
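
The inclusive date range used in the steps above can be sketched in Python. This is only an illustration, not 1010data's implementation: `parse_date` and `is_between` are hypothetical helpers, the sample dates are made up, and the parser handles only the two formats shown in this tutorial (the full set of formats 1010data accepts is not listed here).

```python
from datetime import date, datetime

def parse_date(text):
    """Parse MM/DD/YYYY or MM/DD/YY, the two formats used in this tutorial."""
    for fmt in ("%m/%d/%Y", "%m/%d/%y"):
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")

def is_between(value, low, high):
    """Inclusive comparison: both endpoints count as matches."""
    return low <= value <= high

start = parse_date("05/15/12")
end = parse_date("5/17/2012")

# Hypothetical sample dates: May 14 through May 18, 2012.
sample_dates = [date(2012, 5, day) for day in range(14, 19)]
selected = [d for d in sample_dates if is_between(d, start, end)]

print(selected)
```

Because the comparison is inclusive, May 15 and May 17 are kept along with May 16, while May 14 and May 18 are dropped.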

What if you want to see information about a particular store? To do this, again use the Select Rows dialog and enter additional comparison criteria. In this case, you can enter a store number to limit the data to the rows that identify a specific store.

  1. In the first row of the Select Rows dialog, do the following:
    1. From the first drop-down list, select Store.
    2. From the second drop-down list, select has the value(s).
    3. In the last field, enter 2.

    These selections indicate that you want to see only rows for store 2. If the second comparison option, does not have the value(s), is selected, all of the stores in the table except store 2 display.

    If you want to include more than one value (more than a single store in the example above), separate the values with a space.

    For example, entering 1 3 displays results for both store 1 and store 3. Searching for multiple values is useful when working with information like product numbers.

    Now that all of the information you need for your results is entered, you can perform the selection.

  2. Click Select.
    1010data displays the results of your selection.

    In just a few minutes, you narrowed the data down from 35 rows to 5. That is one-seventh of the original data. Also, notice that the worksheet indicates the selections you already performed (outlined in red above). This can help prevent duplicating effort when making additional selections.

    You can also quickly perform a basic row selection directly from the worksheet by right-clicking any cell and choosing an option that appears in a pop-up menu.

  3. Right-click the cell in the first row of the Date column.
    1010data displays a pop-up menu.
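
The has the value(s) and does not have the value(s) comparisons from the Store selection above can be sketched in Python as set membership tests. This is a hedged illustration rather than 1010data's own logic: the helper functions and the sample rows are hypothetical, and the sketch assumes store numbers are integers entered as space-separated text.

```python
def parse_values(text):
    """Split a space-separated entry such as "1 3" into a set of integers."""
    return {int(value) for value in text.split()}

def has_values(rows, column, text):
    """Keep rows whose column matches any of the entered values."""
    wanted = parse_values(text)
    return [row for row in rows if row[column] in wanted]

def does_not_have_values(rows, column, text):
    """Keep rows whose column matches none of the entered values."""
    excluded = parse_values(text)
    return [row for row in rows if row[column] not in excluded]

# Hypothetical sample rows; not the real Sales Item Detail data.
rows = [{"store": s} for s in (1, 2, 3, 2, 1)]

only_store_2 = has_values(rows, "store", "2")
stores_1_and_3 = has_values(rows, "store", "1 3")
all_but_store_2 = does_not_have_values(rows, "store", "2")

print(only_store_2)
print(stores_1_and_3)
print(all_but_store_2)
```

In this sample, excluding store 2 gives the same rows as selecting stores 1 and 3, because those are the only other stores present.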

By right-clicking a cell in the worksheet, you can choose from several row selection options that pertain to that cell. Take a look at the available options when you right-click an Account cell.

  1. Right-click the cell in the first row of the Account column.
    1010data displays a pop-up menu.

    As you can see, when you right-click a cell in the Account column, the options are related to Account and not Date. If you need to perform a simple selection, this is a great way to quickly select rows.

  2. When you are finished with this tutorial, close the Sales Item Detail worksheet.

Row selection is an incredibly powerful tool you can use when working with large data sets. Using this simple operation, you can eliminate unnecessary data and focus on what's pertinent. Of course, you can use the advanced selection features to narrow down your results even more, but that is a topic for another tutorial. Feel free to make up a few row selections of your own and try them out on any of the tables in the Published Data > Demo folder. It will make a big difference as you learn how to work with the 1010data Trillion-Row Spreadsheet and, ultimately, make your analyses faster and more accurate. Good luck!

For more information about selecting rows, see Rows in the 1010data User's Guide.