analyze_="table"

<do action_="sql" analyze_="table" path_="[PATH]" value_="@r"/> analyzes the 1010data table at [PATH] and populates @r with complete SQL metadata for the table.

Description

analyze_="table" creates a SQL column for each column in the 1010data table and uses the following default values for creating the equivalent SQL table.

  • The 1010data column label is used as the SQL name.

    If you want to use the 1010data column name as the SQL name, set colnames_="names".

  • segby and sortseg attributes in the 1010data table are recorded in the SQL metadata.
  • Each column's SQL type is determined by the underlying column's datatype and format.

    For example, an integer column with a date format is given a date SQL type. Since the SQL to 1010data type mapping is not a 1 to 1 correlation, you may need to edit some of the SQL types manually after analysis.

  • An index in a 1010data column is recorded in the SQL metadata indexed attribute.
  • Each column's data is fully scanned for the following:
    • Whether it contains any null (NA) values (null attribute in the metadata)
    • Whether it is sorted (sorted attribute in the metadata)
    • What the cardinality, or the number of unique values is.

      If the cardinality of the column is equal to the depth of the table, it is given a unique attribute in the SQL metadata. These unique columns are also added to the keys of the table.

Note that since analyze_="table" scans all the data in the table, it can take a very long time, particularly when computing cardinality and sortedness. By default, analyze_="table" omits sortedness in tables over 1 billion rows, and approximates cardinality (using aucnt rather than ucnt). Use limit_="[DEPTH]" to change this default value.

Syntax

<dynamic>
    <do action_="sql" analyze_="table"
     path_="[PATH]"
     colnames_="labels"|"names"|[LIST_OF_COLUMN_NAMES]
     clearcache_="0|1"
     limit_="[DEPTH]"
     value_="@r"/>
</dynamic>

Attributes for analyze_="table"

path_
The path to the table to analyze.
colnames_
How SQL should create a column from the 1010data table. The default value is "label" (use 1010data column labels). You can also use "names" (use 1010data column names) or define a non-comma-separated list of unique column names. The user-defined column names must consist of valid SQL identifiers and be the same length as the number of columns in the table being analyzed.
clearcache_
clearcache_="1" clears the MDB cache after each column in the table is scanned. The default value is 0.
limit_
Specifies the row threshold at which a table is considered too long to perform exact cardinality and sortedness operations. The default value is 1,000,000,000.
value_
The information returned by analyze_="table" is saved as a package to the dynamic variable @r. {@r} may then be used as the dict_= attribute of a put_="table" transaction.