Column Information

Descriptions for fields, options, and icons in the Column Information section of the Advanced Uploader tab.

For every column in your table, you need to provide the information required by 1010data. If your data file was formatted and delimited properly, the Auto Detect File Specs button should have identified the number of columns in your table, created a section for each column where you provide the necessary information, and populated as much information as possible from the data in the file. The following tables describe all fields, options, and icons available in the Column Information sections of the Advanced Uploader tab.

Table 1. Column Information fields and options
Field Description
Name Also known as the column name, the Name provides a unique identifier for each column of a table in the 1010data interface.

The Name may only contain alphanumeric characters or underscores and must begin with an alphabetic character (e.g., percent_total_sales). It may not contain any spaces or other special characters.

Header Also known as the column heading, the Header is the label displayed by default at the top of the column in the user interface.

The Header may contain any combination of uppercase and lowercase letters, numbers, spaces, and special characters (e.g., "Percentage of Total Sales (%)").

Type The kind of data contained in the column. The most common options include the following:
Text
The column contains text, as opposed to numbers. When this option is selected, the Format Type drop-down changes to the Force Case drop-down.
Integer
The column contains only whole numbers. For example, 1, 53, and 1,234,597 are all integers.
Float
The column contains numbers with place values to the right of a decimal point. For example, 1.1, 23.845 and 0.37383636833930 are all floating point numbers.
Expression
Allows you to derive the value of a column with a 1010data Macro Language expression.
Big Integer
The column contains 64-bit integer data, which can represent very large positive or negative whole numbers.
Note: In addition to the types explained above, the Type drop-down menu offers numerous options for date and time data. These options apply 1010data date formatting to the date data in the source file. Dates in the source file can be delimited with dashes ('-'), forward slashes ('/'), or spaces (' '). Columns that contain only date information are created as integers in 1010data, and columns that contain both date and time information are created as floats. In both cases, the time-handling formats are applied as specified in the Type drop-down menu.
Force Case Allows you to convert the text data in the column to all uppercase or all lowercase letters. Select N to keep the text data as is.

This option is displayed only when Text or Expression is selected from the Type drop-down list.

Format Type Defines how the data is displayed on the screen. For example, you can choose to exclude commas in numbers or change the way a date is formatted. Choices made in the Format Type do not change the type of data in the column, only the way the data is displayed.

This option is displayed only when Integer, Float, Big Integer, or Expression is selected from the Type drop-down list.

Display Width The width of the column in the table. This field is used for fixed-width files.
Dec Places The number of digits to display after a decimal point.

This option is displayed only when Float or Expression is selected from the Type drop-down list.

Time Series Break Order Allows you to group together, or segment, like information in the uploaded table. A table must be segmented before g_functions or Time Series functions can be used to analyze the data in the table.

The Time Series Break Order field accepts sequential whole numbers starting with 1.

For g_functions, a minimum of two columns must be identified. The first column, indicated by entering a 1 in the Time Series Break Order field, is the data that will be grouped together in the same segment. The second column, indicated by entering a 2 in the Time Series Break Order field, is used to order the data.

For Time Series functions, a minimum of three columns must be identified. The first column, indicated by entering a 1 in the Time Series Break Order field, is the data that will be grouped together in the same segment. The second column through the next-to-last column, indicated by entering a 2 (and then sequential whole numbers for each additional column) in the Time Series Break Order field, identify the grouping arguments. The last column, indicated by entering the next highest sequential number in the Time Series Break Order field, is used to order the data.

For example, if you enter a 1 for the AccountID column, a 2 for the Store column, and a 3 for the Date column, 1010data breaks the data into groups so that all records with the same combination of AccountID and Store are in the same group, and sorts the table by Date.

1010data always sorts the table being loaded by the column with the highest value in the Time Series Break Order field. If, in the previous example, a 3 had not been entered for the Date column, 1010data would group all records with the same AccountID value in the same segment and sort the table by the Store column.

This field is displayed only when YES is selected from the Time Series drop-down list in the Table Information section.
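
Two of the rules in Table 1, the Name rule and the Time Series Break Order behavior, can be illustrated outside of 1010data. The following Python sketch is not 1010data code and does not reproduce how the uploader is implemented. The function names is_valid_column_name and segment_and_sort, the sample rows, and the integer date values are hypothetical. The sketch also assumes that "alphanumeric" means ASCII letters and digits, and that ordering is applied within each segment, neither of which is stated explicitly above.

```python
import re
from itertools import groupby

# Assumption: "alphanumeric" means ASCII letters and digits only.
_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_column_name(name):
    """Check a Name: a letter first, then letters, digits, or underscores."""
    return _NAME_RE.fullmatch(name) is not None


def segment_and_sort(rows, break_order):
    """Group rows into segments and order the rows inside each segment.

    rows        -- list of dicts, one per record (hypothetical in-memory table)
    break_order -- dict mapping a column name to its Time Series Break Order value
    """
    values = sorted(break_order.values())
    if len(values) < 2 or values != list(range(1, len(values) + 1)):
        raise ValueError("break order must be sequential whole numbers 1, 2, ...")

    columns = sorted(break_order, key=break_order.get)
    # The column with the highest value orders the data; the rest define segments.
    *group_cols, sort_col = columns

    def segment_key(row):
        return tuple(row[c] for c in group_cols)

    ordered = sorted(rows, key=lambda r: (segment_key(r), r[sort_col]))
    return [(key, list(group)) for key, group in groupby(ordered, key=segment_key)]


# Hypothetical sample data; dates are written as plain integers for illustration.
rows = [
    {"AccountID": 2, "Store": "A", "Date": 20240105},
    {"AccountID": 1, "Store": "B", "Date": 20240102},
    {"AccountID": 1, "Store": "A", "Date": 20240103},
    {"AccountID": 1, "Store": "A", "Date": 20240101},
]

# 1 = AccountID, 2 = Store, 3 = Date: segment by AccountID and Store, order by Date.
by_account_store = segment_and_sort(rows, {"AccountID": 1, "Store": 2, "Date": 3})

# Without a 3 for Date: segment by AccountID only, order by Store.
by_account = segment_and_sort(rows, {"AccountID": 1, "Store": 2})
```

In the first call, the two records for AccountID 1 and Store A land in one segment in Date order. In the second call, Store becomes the ordering column because it holds the highest Time Series Break Order value, which mirrors the behavior described for the case where no 3 is entered.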

Column Information icons

The following icons appear above the fields in each Column Information section. The table below describes the function of each icon.

Table 2. Column Information icons
Icon Function
Add a new column to the table. All the values in the new column will be blank.
Clone a column so that an exact copy, with all associated values, is created.
Move this column up in the order. This will move the column to the left in the final table.
Move this column down in the order. This will move the column to the right in the final table.
Delete this column.

Advanced Options

The advanced options provide additional settings for columns.

To view these additional fields, click Advanced Options.

Table 3. Advanced Options fields
Field Description
Description Explains the values and their meaning in a given column.

When a user clicks the Show Information icon at the top of a column, the text entered in this field is displayed in the Description field under Meta Information.

Expression Macro Language expression for computed columns. Also helps control formatting for date and time data.
Note: Expressions are built from a subset of the 1010data Macro Language. All functions from the Macro Language are available for use in expressions except for g_ (group), r_ (row), and ts_ (time-series) functions.
Fixed Column When you select this option, the column will remain in its current position when you scroll horizontally through the table columns in the 1010data Trillion-Row Spreadsheet.
Hide Column Creates the column in 1010data, but hides it. The column will be available in the Rearrange columns dialog in 1010data. Hidden columns are also available for use in expressions.
Destroy Column Excludes the column from the new table. The column will not be available in the 1010data base table after it is created.
Custom Compression Allows for custom compression settings.

When selected, the Type, Method, and Enumerate drop-down lists are enabled.

In general, the default compression setting should be used. However, if the table is extremely large and the column contains unique text values, the compression settings may need to be changed. Contact 1010data Support to discuss your needs and we will help you determine the appropriate settings.

Type Determines the Default values for Method and Enumerate.
This drop-down list is enabled when Custom is selected from the Compression Settings drop-down list.
Default
Dynamic is the default setting.
Static
Allows you to manually select the options for Method and Enumerate.
Dynamic
1010data will determine the best Method and Enumerate settings to use based on the data in the column.
Method The type of compression to use.
Default
The result of this option changes based on whether Static or Dynamic is selected from the Type drop-down list. If Dynamic is selected, leave Method set to Default. This allows 1010data to determine the best type of compression to use based on the data in the column. If Static is selected, and Method is set to Default, lzo compression is used.
bitpack
Very fast compression and decompression speeds. Use only for compact integer vectors with few unique row values. The integer vectors cannot contain the special values 0I or -0I (infinity) or 0N (N/A).
lzo
Moderate compression ratios. Fast compression and very fast decompression speeds. This is the recommended compression method.
bzip2
Greater compression ratio than lzo compression. Slower compression and decompression speeds than lzo compression.
lzma
Excellent compression ratio. Very slow compression and moderately fast decompression speeds.
Enumerate Enumeration is faster for sets of strings with many repeated values because each distinct string is stored once and indexed instead of being stored repeatedly. In general, leave Enumerate set to Default.
Default
The result of this option changes based on the type of data selected in the column data Type drop-down list (not the compression Type drop-down list). If Text is selected and Enumerate is set to Default, enumeration is used. If Integer or Float is selected and Enumerate is set to Default, enumeration is not used.
yes
Use enumeration. Select this setting when the column contains many repeated strings.
Note: Static strings are always enumerated.
no
Enumeration is not used. This setting should be selected when there are no repeated row values in a very large string column.
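
The Default rules for Method and Enumerate, and the idea behind enumeration, can be summarized in a short Python sketch. This is an illustration only and does not show how 1010data implements compression. The function names default_method, enumeration_used, and enumerate_strings are hypothetical, and None is used as a placeholder for "1010data chooses the method". The sketch does not model the note that static strings are always enumerated, and it raises an error for column types whose default the table does not state.

```python
def default_method(compression_type):
    """Return what Method = Default resolves to for a given compression Type."""
    if compression_type == "Static":
        return "lzo"
    if compression_type in ("Default", "Dynamic"):
        # Dynamic is the default Type; 1010data picks the method from the data.
        return None
    raise ValueError("unknown compression Type: " + repr(compression_type))


def enumeration_used(column_type, enumerate_setting):
    """Return True if enumeration applies for this column Type and Enumerate setting."""
    if enumerate_setting == "yes":
        return True
    if enumerate_setting == "no":
        return False
    if enumerate_setting != "Default":
        raise ValueError("unknown Enumerate setting: " + repr(enumerate_setting))
    if column_type == "Text":
        return True
    if column_type in ("Integer", "Float"):
        return False
    # The table does not state the default for other column types.
    raise ValueError("default not documented for column Type " + repr(column_type))


def enumerate_strings(values):
    """Store each distinct string once and replace every row with an index."""
    distinct = []   # each distinct string, stored once
    position = {}   # string -> index into distinct
    codes = []      # one small integer per row
    for value in values:
        if value not in position:
            position[value] = len(distinct)
            distinct.append(value)
        codes.append(position[value])
    return distinct, codes


# Hypothetical column with repeated values.
distinct, codes = enumerate_strings(["NY", "CA", "NY", "NY", "CA"])
```

With many repeated values, the list of distinct strings stays short and each row holds only an index, which is why enumeration suits such columns. When every value in a very large string column is unique, the distinct list is as long as the column itself and the index adds overhead, which is consistent with the guidance for the no setting.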