action_="importtable"
<do action_="importtable">
imports an iceberg-formatted table or a set of parquet files, and returns basic metadata
statistics. (Available as of version 20.05)
Description
To use tables defined in an external data-lake location, like Azure Blob Storage,
users may run <do action_="importtable">
to create tables.
Queries run on these tables will fetch external data only once per segment/parquet
file. The fetch will occur at query time and not when the import is initially run.
The externally-defined tables may be in iceberg format or a list of parquet files
may be supplied in the content of the do action (see Parquet File List Syntax).
Special care is taken to organize the imported data. There are several options for
mapping external data. Users may choose to map each external file (either defined in
the list of parquet files or the iceberg metadata) to one segment. Users may also
map several files per 1010data segment (see method_
and
segmentation_
). If partition information exists in the iceberg
metadata it may automatically set segby
and if both partition
information and sort information exists it may result in
sortseg
.
For example, if there is a bucket partition on column store
, then
the 1010data target table will have segby
set with value
store
. If there is a truncation partition on column
date
as well as sort metadata, then sortseg
is
set with value date
, and if the sort data does not exist,
segby
will be set.
Syntax
<!-- Iceberg Syntax --> <dynamic res=""> <do action_="importtable" on_="render" path_="[TARGET PATH]" replace_="[1|0 DEFAULT 0]" iceberg_="{pkg('warehouse' 'database' 'table' 'version';'[WAREHOUSE]' '[DB]' '[TABLE]' '[OPTIONAL - ICEBERG VERSION JSON FILE NAME]')}" azparameters_="{pkg('sasToken' 'storageaccount' 'container'; '[SAS_TOKEN]' '[STORAGE_ACCOUNT]' '[CONTAINER]')}" method_="[file | bin DEFAULT bin]" value_="res"/> </dynamic> <!-- Parquet File List Syntax --> <dynamic> <do action_="getdata" tablevalue_="@filepaths"> <table cols="files"> demo/path/to/sales_detail_dir0 demo/path/to/sales_detail_dir1 </table> </do> <do action_="importtable" ... importpaths_="{@filepaths['files']}"/> <widget class_="scope"/> </dynamic>
Attributes for action_="importtable"
path_
-
Specifies the location where the table is to be saved.
replace_
-
Specifies whether to replace an existing table, if it exists. Accepts a 1 or 0.
iceberg_
-
If specified it must be a
pkg
that includes keys'warehouse'
and'table'
. Optionally, thepkg
may include a database and version. The version must be the name of a valid iceberg json metadata file. If version is not supplied the latest json file in the table's metadata is used. An examplesiceberg_
might be:iceberg_="{pkg('warehouse' 'database' 'table' 'version';'retail' 'discountsupplyinc' 'sales_detail' 'w34g-wrgv-vbsr-fwrg.metadata.json')}"
where the json file would exist at
retail/discountsupplyinc/sales_detail/metadata/w34g-wrgv-vbsr-fwrg.metadata.json
.Another example but this time with a minimum amount of information:iceberg_="{pkg('warehouse' 'table';'tmptables' 'combinedsales')}"
icebergpath_
andicebergversion_
- An alternative way to provide the iceberg metadata location and iceberg
table version. Instead of providing a dictionary as shown above,
icebergpath_ can be used to provide a path, as
icebergpath_="retail/discountsupplyinc/sales_detail"
Adding an iceberg version with
icebergversion_
is optional. importpath_
- For non-iceberg use. Must be passed a list containing paths.
azparameter_
-
Credentials or authentication information for data/metadata stored in Azure. Valid
pkg
keys aresastoken
,storageaccount
,container
orconnectionString
. Example:azparameter_="{pkg('sastoken' 'storageaccount' 'container'; '[SAS_TOKEN]' '[STORAGE_ACCOUNT]' '[CONTAINER]')}"
method_
-
Determines how external files are mapped to 1010data segments.
method_="files"
- Each file maps to a single segment.
method_="bin"
- Packs or maps multiple files to each segment. If both
segmentation_="N"
andmethod_="bin"
are specified, then a segment have at mostN
rows. The default forsegmentation_
is 5 million rows.
segby_advise_
-
Appends
segby
column to the table's metadata. sortseg_advise_
-
Appends
sortseg
column to the table's metadata. status_
- If set to 1, traps errors and returns
status
andmsg
keys in the value returned byimporttable
. value_
-
Variable name in which to place the table information inferred. The package contains:
columns
- names of columns found in iceberg schema or parquet files.
segby
- segby columns both inferred and/or specified in
segby_advise_
. sortseg
- sortseg columns both inferred and/or specified in
sortseg_advise_
. files
- path to parquet files used to create native tables.
rows
- number of rows, determine by row counts of each file.
number_segments
- the number of segments generated with
method_
. replace
- if the replace was specified in the import.
<dynamic res="" azpars="{pkg('sasToken' 'storageaccount' 'container'; '24389gh34ovhr4h38owh35r' 'bigdatainc' 'user001')}" ice="{pkg('warehouse' 'database' 'table';'retail' 'pos' 'sales_detail')}"> <do action_="importtable" on_="render" path_="retail.sales_detail" replace_="1" iceberg_="{@ice}" azparameters_="{@azpars}" method_="bin" value_="res"/> <widget class_="scope"/> </dynamic>