`action_="importtable"`

<do action_="importtable"> imports an iceberg-formatted table or a set of parquet files, and returns basic metadata statistics. (Available as of version 20.05)

Description

To use tables defined in an external data-lake location, like Azure Blob Storage, users may run <do action_="importtable"> to create tables. Queries run on these tables will fetch external data only once per segment/parquet file. The fetch will occur at query time and not when the import is initially run. The externally-defined tables may be in iceberg format or a list of parquet files may be supplied in the content of the do action (see Parquet File List Syntax). Special care is taken to organize the imported data. There are several options for mapping external data. Users may choose to map each external file (either defined in the list of parquet files or the iceberg metadata) to one segment. Users may also map several files per 1010data segment (see method_ and segmentation_). If partition information exists in the iceberg metadata it may automatically set segby and if both partition information and sort information exists it may result in sortseg.

For example, if there is a bucket partition on column store, then the 1010data target table will have segby set with value store. If there is a truncation partition on column date as well as sort metadata, then sortseg is set with value date, and if the sort data does not exist, segby will be set.

Syntax

<!-- Iceberg Syntax -->
<dynamic res="">
  <do action_="importtable"
      on_="render"
      path_="[TARGET PATH]"
      replace_="[1|0 DEFAULT 0]"
      iceberg_="{pkg('warehouse' 'database' 'table' 'version';'[WAREHOUSE]' '[DB]' '[TABLE]' '[OPTIONAL - ICEBERG VERSION JSON FILE NAME]')}"
      azparameters_="{pkg('sasToken' 'storageaccount' 'container'; '[SAS_TOKEN]' '[STORAGE_ACCOUNT]' '[CONTAINER]')}"
      method_="[file | bin DEFAULT bin]"
      value_="res"/>
</dynamic>

<!-- Parquet File List Syntax -->
<dynamic>
  <do action_="getdata" tablevalue_="@filepaths">
  <table cols="files">
demo/path/to/sales_detail_dir0
demo/path/to/sales_detail_dir1
    </table>
  </do>
  <do action_="importtable"
...
      importpaths_="{@filepaths['files']}"/>
  <widget class_="scope"/>
</dynamic>

Attributes for `action_="importtable"`

path_

Specifies the location where the table is to be saved.

replace_

Specifies whether to replace an existing table, if it exists. Accepts a 1 or 0.

iceberg_

If specified it must be a pkg that includes keys 'warehouse' and 'table'. Optionally, the pkg may include a database and version. The version must be the name of a valid iceberg json metadata file. If version is not supplied the latest json file in the table's metadata is used. An examples iceberg_ might be:

iceberg_="{pkg('warehouse' 'database' 'table' 'version';'retail' 'discountsupplyinc' 'sales_detail' 'w34g-wrgv-vbsr-fwrg.metadata.json')}"

where the json file would exist at retail/discountsupplyinc/sales_detail/metadata/w34g-wrgv-vbsr-fwrg.metadata.json.

Another example but this time with a minimum amount of information:

iceberg_="{pkg('warehouse' 'table';'tmptables' 'combinedsales')}"

icebergpath_ and icebergversion_

An alternative way to provide the iceberg metadata location and iceberg table version. Instead of providing a dictionary as shown above, icebergpath_ can be used to provide a path, as

icebergpath_="retail/discountsupplyinc/sales_detail"

Adding an iceberg version with icebergversion_ is optional.

importpath_

For non-iceberg use. Must be passed a list containing paths.

azparameter_

Credentials or authentication information for data/metadata stored in Azure. Valid pkg keys are sastoken, storageaccount, container or connectionString. Example:

azparameter_="{pkg('sastoken' 'storageaccount' 'container'; '[SAS_TOKEN]' '[STORAGE_ACCOUNT]' '[CONTAINER]')}"

method_

Determines how external files are mapped to 1010data segments.

method_="files": Each file maps to a single segment.
method_="bin": Packs or maps multiple files to each segment. If both segmentation_="N" and method_="bin" are specified, then a segment have at most N rows. The default for segmentation_ is 5 million rows.

segby_advise_

Appends segby column to the table's metadata.

sortseg_advise_

Appends sortseg column to the table's metadata.

status_

If set to 1, traps errors and returns status and msg keys in the value returned by importtable.

value_

Variable name in which to place the table information inferred. The package contains:

columns: names of columns found in iceberg schema or parquet files.
segby: segby columns both inferred and/or specified in segby_advise_.
sortseg: sortseg columns both inferred and/or specified in sortseg_advise_.
files: path to parquet files used to create native tables.
rows: number of rows, determine by row counts of each file.
number_segments: the number of segments generated with method_.
replace: if the replace was specified in the import.

<dynamic res=""
         azpars="{pkg('sasToken' 'storageaccount' 'container'; '24389gh34ovhr4h38owh35r' 'bigdatainc' 'user001')}"
         ice="{pkg('warehouse' 'database' 'table';'retail' 'pos' 'sales_detail')}">
  <do action_="importtable"
      on_="render"
      path_="retail.sales_detail"
      replace_="1"
      iceberg_="{@ice}"
      azparameters_="{@azpars}"
      method_="bin"
      value_="res"/>
  <widget class_="scope"/>
</dynamic>

action_="importtable"

Description

Syntax

Attributes for action_="importtable"

`action_="importtable"`

Attributes for `action_="importtable"`