<merge>
Merge a table to another table or worksheet.
Description
<merge>
combines the contents of the current table and another one (the
foreign table) such that all rows of the foreign table are appended to the current
table. Optionally, the operation can remove rows after the merge to perform a
set-theoretical operation. The resulting table may contain all columns in either table, all
common columns, or only a subset of the common columns. <merge>
may also
be used without specifying a foreign table, in which case the resulting table is a copy of
the current table, but with any segmentation constraints lifted (so that, for example,
G_Functions may be used without restriction).
The tags contained within a <merge>
tag (<link>
,
<willbe>
, etc.) are applied to the foreign table prior to
effecting the merge, so that the current table is merged to the foreign table as transformed
by those operations. The operations are applied to the foreign table in the order in
which they appear. Thus, to select on a computed column, the
<willbe>
tag must precede the <sel>
tag.
<tabu>
) are applied to the
results of the tabulation, not the original table.Size limitations for <merge>
When using <merge>
on one or more physical tables, there are no
limitations on the size of the tables (or the final results of the
<merge>
) as long as the tables share at least one column in common.
However, if either side of a <merge>
operation is a worksheet (i.e., one
or both tables have operations applied to them), neither may exceed the limit of 2GB
each.
A simple formula for determining the approximate memory footprint of a worksheet is:
(R*C*8)/1B
R
is the number of rows in the worksheetC
is the number of columns in the worksheet8
is the number of bytes for each value in the worksheet1B
is one billion (the number of bytes in a gigabyte)
In general, if either side of a <merge>
is a worksheet, neither should
exceed ~250MM cells (i.e., R*C
<
250MM).
tab
, is temporarily materialized for the session
before merging with
tab2
:<base table="tab"/>
<willbe name="c1" value="1"/>
<willbe name="c2" value="2"/>
<materialize path="tab3" temp="1"/>
<merge table2="tab2"/>
Syntax
<merge table2="[FOREIGN TABLE NAME]" match="names|order|pad" cols="[COLUMN_NAME_1,COLUMN_NAME_2,...COLUMN_NAME_N]" type="all|union|intersect|complement" materialize="[0|1]" path="[PATH_TO_FOLDER]" replace="[0|1]" users="[USER_NAME_1,USER_NAME_2,...USER_NAME_N]" title="[TABLE_TITLE]" linkheader="[MERGED_TABLE_HEADER]" sdesc="[SHORT_DESCRIPTION]" ldesc="[LONG_DESCRIPTION]" segby="[SEGBY_COL_1,SEGBY_COL_2,...SEGBY_COL_N]" sortseg="[SORTSEG_COL_1,SORTSEG_COL_2,...SORTSEG_COL_N]" maxdown="[CELLS]" nomaxdown="[0|1]" virtual="[0|1]" tolerant="[0|1]" job_responsibility="[0|1]"/>
Attributes
table2
- The name or ID of the foreign table.
If
table2
is omitted, then the current table is consolidated into a single segment. match
- The method to be used in matching columns from the current table to those in the
foreign table.Note: The
match
attribute is ignored iftable2
is not specified.Valid values are:
names
- Columns in the current table will be matched with columns having the same name in the foreign table. Matching columns must have the same data type. The merged table will contain those columns that are common to both tables.
order
- The columns in the current table will be matched with the corresponding columns (by order, regardless of name) in the foreign table. The columns must have the same data type and must be in the same order; in addition, there must be the same number of columns in both tables.
pad
- The merged table will contain all columns in the current table and the foreign
table. Columns that match by name will appear once with data from both tables,
whereas those that are unmatched will be "padded" in one or the other table with
the appropriate N/A value.Note:
match="pad"
always turns a table merge into a worksheet merge, and is therefore subject to the size limitations discussed above.
The default is
names
. cols
- A comma-separated list of the names of columns to be merged. Only the specified
columns will appear in the merged table.
cols
may not be used withmatch="order"
.Note: Thecols
attribute may be used to specify the columns of the result table even iftable2
is not specified.The default (if
cols
is omitted) is for the merged table to contain all matching columns. type
- The type of merge.Note: The
type
attribute may be used even iftable2
is not specified.The effect of specifying any type other than
all
is to eliminate duplicate rows from the resulting table.Valid values are:
all
- All rows from the foreign table are appended to all rows in the current table.
union
- Duplicate rows are removed from the merged table (i.e., the result is the set-theoretical union of the rows of the two tables).
intersect
- Duplicates as well as all rows that do not appear in both tables are removed from the merged table (i.e., the result is the set-theoretical intersection of the rows of the two tables, which means that only rows that are common to both tables appear in the result).
complement
- Duplicates and all rows that appear in the second table are removed from the merged table (i.e., the result is the set-theoretical difference of the current and the foreign table, which means that only rows that are in the current table but not the foreign table appear in the result).
The default is
all
. materialize
- Indicates whether the merged table should be materialized as a new merged table.
When using
materialize="1"
:- The
<merge>
operation (or series of such operations) must appear before all other operations in a macro (except<note>
). - The default attribute
type="all"
must be used. - Either
match="names"
ormatch="pad"
may be used. (match="pad"
available after version 14.01) - No other operations may be contained within the
<merge>
operation.
Note: When merging multiple tables, thematerialize
attribute should only be applied to the final merge.The default is 0 (i.e., do not materialize).
- The
Attributes when materialize="1"
The following attributes are only valid when materialize="1"
:
path
- Specifies the folder path or full table pathname under which the merged table will be created.Note: You must have upload permissions to the target folder.
The default location is a new unambiguous name in the user's My Data folder.
replace
- Specifies whether the merged table should be allowed to replace an existing table.Note: You must own an existing table that is to be replaced.
The default is 0 (i.e., do not replace).
users
- A comma-separated list of the users who will have access to the merged table.
If absent, privileges are inherited from the parent folder. If
users=""
is specified, the merged table is private (i.e., accessible only to the user). title
- Specifies the title of the merged table.
linkheader
- Specifies the link header of the merged table.
sdesc
- Specifies the short description of the merged table.
ldesc
- Specifies the long description of the merged table.
segby
- A comma-separated list of columns by which the new merged table is marked as being segmented.Note: This is advisory only; that is, you are responsible for ensuring that the table is, in fact, segby.
sortseg
- A comma-separated list of columns by which the new merged table is marked as being
sorted and segmented.Note: This is advisory only; that is, you are responsible for ensuring that the table is, in fact, sortseg.
maxdown
- Sets target table's maximum cell download limit to the value specified.
The default is no limit.
nomaxdown
- Specifies whether to remove any limit on the maximum cell download for the table when
using
replace=1
.The default is 0 (i.e., the
maxdown
attribute will be observed even when usingreplace=1
). virtual
- Specifies whether the merged table is built from references to the constituent
tables.
When
virtual="1"
, changes to the underlying tables will change the virtual merged result.Setting
virtual="0"
will result in a merged table made from the state of the tables at the time of the merge. Any changes to the constituent tables will not affect the merged result.The default is 1 (i.e., the merged table is built from references to the constituent tables).
tolerant
- When
virtual="1"
, this specifies whether the virtual merge will be tolerant of future missing constituent tables.Note: The query is not tolerant of missing tables, just the resulting table.The default is 0 (i.e., the merged table will not be tolerant of missing constituent tables).
job_responsibility
- Setting
job_responsibility="1"
prevents synchronization of the table being materialized on an environment configured for High Availability (HA). This is assuming a job on the other HA sites will also be performing the same materialize, so synchronization will be unnecessary. Does nothing in an environment not configured for HA.The default is 0.
(Available as of version 9.36)
Example: Simple merge
<base table="pub.demo.weather.hourly90"/> <merge table2="pub.demo.weather.hourly91" match="names" type="all"/>
cols
attribute is
omitted in this example.Example: Merge with operations
In the following example, the<willbe>
operation is performed on the foreign table prior to the merge
with the current table. The attribute match="pad"
pads the merged table with
NA values in cases where the columns do not
match.<base table="default.lonely"/>
<willbe name="a" value="1"/>
<merge table2="default.lonely" match="pad">
<willbe name="b" value="2"/>
</merge>
The
merge produces the following result: