action_="importtable"

<do action_="importtable"> imports an iceberg-formatted table or a set of parquet files, and returns basic metadata statistics. (Available as of version 20.05)

Description

To use tables defined in an external data-lake location, like Azure Blob Storage, users may run <do action_="importtable"> to create tables. Queries run on these tables will fetch external data only once per segment/parquet file. The fetch will occur at query time and not when the import is initially run. The externally defined tables maybe in iceberg format or a list of parquet files maybe supplied in the content of the do action (see Parquet File List Syntax). Special care is taken to organize the imported data. There are several options for mapping external data. Users may choose to map each external file (either defined in the list of parquet files or the iceberg metadata) to one segment. Users may also map several files per 1010data segment (see method_ and segmentation_). If partition information exists in the iceberg metadata it may automatically set segby and if both partition information and sort information exists it may result in sortseg.

For example, if there is a bucket partition on column store, then the 1010data target table will have segby set with value store. If there is a truncation partition on column date as well as sort metadata, then sortseg is set with value date, and if the sort data does not exist, segby will be set.

Syntax

<!-- Iceberg Syntax -->
<dynamic res="">
  <do action_="importtable"
      on_="render"
      path_="[TARGET PATH]"
      replace_="[1|0 DEFAULT 0]"
      iceberg_="{pkg('warehouse' 'database' 'table' 'version';'[WAREHOUSE]' '[DB]' '[TABLE]' '[OPTIONAL - ICEBERG VERSION JSON FILE NAME]')}"
      azparameters_="{pkg('sasToken' 'storageaccount' 'container'; '[SAS_TOKEN]' '[STORAGE_ACCOUNT]' '[CONTAINER]')}"
      method_="[file | bin DEFAULT bin]"
      value_="res"/>
</dynamic>

<!-- Parquet File List Syntax -->
<dynamic>
  <do action_="getdata" tablevalue_="@filepaths">
  <table cols="files">
demo/path/to/sales_detail_dir0
demo/path/to/sales_detail_dir1
    </table>
  </do>
  <do action_="importtable"
...
      importpaths_="{@filepaths['files']}"/>
  <widget class_="scope"/>
</dynamic>

Attributes for action_="importtable"

path_

Specifies the location where the table is to be saved.

replace_

Specifies whether to replace an existing table, if it exists. Accepts a 1 or 0.

iceberg_

If specified it must be a pkg that includes keys 'warehouse' and 'table'. Optionally, the pkg may include a database and version. The version must be the name of a valid iceberg json metadata file. If version is not supplied the latest json file in the table's metadata is used. An examples iceberg_ might be:

iceberg_="{pkg('warehouse' 'database' 'table' 'version';'retail' 'discountsupplyinc' 'sales_detail' 'w34g-wrgv-vbsr-fwrg.metadata.json')}"

where the json file would exist at retail/discountsupplyinc/sales_detail/metadata/w34g-wrgv-vbsr-fwrg.metadata.json.

Another example but this time with a minimum amount of information:
iceberg_="{pkg('warehouse' 'table';'tmptables' 'combinedsales')}"
icebergpath_ and icebergversion_
An alternative way to provide the iceberg metadata location and iceberg table version. Instead of providing a dictionary as shown above, icebergpath_ can be used to provide a path, as
icebergpath_="retail/discountsupplyinc/sales_detail"

Adding an iceberg version with icebergversion_ is optional.

importpath_
For non-iceberg use. Must be passed a list containing filenames.
azparameter_
Credentials or authentication information for data/metadata stored in Azure. Valid pkg keys are sastoken, storageaccount, container or connectionString. Example:
azparameter_="{pkg('sastoken' 'storageaccount' 'container'; '[SAS_TOKEN]' '[STORAGE_ACCOUNT]' '[CONTAINER]')}"
method_
Determines how external files are mapped to 1010data segments.
method_="files"
Each file maps to a single segment.
method_="bin"
Packs or maps multiple files to each segment. If both segmentation_="N" and method_="bin" are specified, then a segment have at most N rows. The default for segmentation_ is 5 million rows.
segby_advise_

Appends segby column to the table's metadata.

sortseg_advise_

Appends sortseg column to the table's metadata.

status_
If set to 1, traps errors and returns status and msg keys in the value returned by importtable.
value_
Variable name in which to place the table information inferred. The package contains:
columns
names of columns found in iceberg schema or parquet files.
segby
segby columns both inferred and/or specified in segby_advise_.
sortseg
sortseg columns both inferred and/or specified in sortseg_advise_.
files
path to parquet files used to create native tables.
rows
number of rows, determine by row counts of each file.
number_segments
the number of segments generated with method_.
replace
if the replace was specified in the import.
<dynamic res=""
         azpars="{pkg('sasToken' 'storageaccount' 'container'; '24389gh34ovhr4h38owh35r' 'bigdatainc' 'user001')}"
         ice="{pkg('warehouse' 'database' 'table';'retail' 'pos' 'sales_detail')}">
  <do action_="importtable"
      on_="render"
      path_="retail.sales_detail"
      replace_="1"
      iceberg_="{@ice}"
      azparameters_="{@azpars}"
      method_="bin"
      value_="res"/>
  <widget class_="scope"/>
</dynamic>