<merge>

Merge a table to another table or worksheet.

Description

<merge> combines the contents of the current table and another one (the foreign table) such that all rows of the foreign table are appended to the current table. Optionally, the operation can remove rows after the merge to perform a set-theoretical operation. The resulting table may contain all columns in either table, all common columns, or only a subset of the common columns. <merge> may also be used without specifying a foreign table, in which case the resulting table is a copy of the current table, but with any segmentation constraints lifted (so that, for example, G_Functions may be used without restriction).

The tags contained within a <merge> tag (<link>, <willbe>, etc.) are applied to the foreign table prior to effecting the merge, so that the current table is merged to the foreign table as transformed by those operations. The operations are applied to the foreign table in the order in which they appear. Thus, to select on a computed column, the <willbe> tag must precede the <sel> tag.

Note: All tags following a tabulation (<tabu>) are applied to the results of the tabulation, not the original table.

Size limitations for <merge>

When using <merge> on one or more physical tables, there are no limitations on the size of the tables (or the final results of the <merge>) as long as the tables share at least one column in common. However, if either side of a <merge> operation is a worksheet (i.e., one or both tables have operations applied to them), neither may exceed the limit of 2GB each.

A simple formula for determining the approximate memory footprint of a worksheet is:

(R*C*8)/1B

where:
  • R is the number of rows in the worksheet
  • C is the number of columns in the worksheet
  • 8 is the number of bytes for each value in the worksheet
  • 1B is one billion (the number of bytes in a gigabyte)

In general, if either side of a <merge> is a worksheet, neither should exceed ~250MM cells (i.e., R*C < 250MM).

If your merge fails because of size, you can try to eliminate unnecessary columns. If that is not possible, you can re-materialize the worksheet(s) that exceeds the limit, and then perform the merge. Another possible workaround is to use a temporary materialize, which creates a new DBM table entry for that user's session only, and then merge. In the following code, the base table, tab, is temporarily materialized for the session before merging with tab2:
<base table="tab"/>
<willbe name="c1" value="1"/>
<willbe name="c2" value="2"/>
<materialize path="tab3" temp="1"/>
<merge table2="tab2"/>

Syntax

<merge table2="[FOREIGN TABLE NAME]" 
       match="names|order|pad"
       cols="[COLUMN_NAME_1,COLUMN_NAME_2,...COLUMN_NAME_N]"
       type="all|union|intersect|complement"
       materialize="[0|1]" 
       path="[PATH_TO_FOLDER]" 
       replace="[0|1]"
       users="[USER_NAME_1,USER_NAME_2,...USER_NAME_N]"
       title="[TABLE_TITLE]" 
       linkheader="[MERGED_TABLE_HEADER]"
       sdesc="[SHORT_DESCRIPTION]" 
       ldesc="[LONG_DESCRIPTION]"
       segby="[SEGBY_COL_1,SEGBY_COL_2,...SEGBY_COL_N]"
       sortseg="[SORTSEG_COL_1,SORTSEG_COL_2,...SORTSEG_COL_N]"
       maxdown="[CELLS]" 
       nomaxdown="[0|1]"
       virtual="[0|1]" 
       tolerant="[0|1]"
       job_responsibility="[0|1]"/>

Attributes

table2
The name or ID of the foreign table.

If table2 is omitted, then the current table is consolidated into a single segment.

match
The method to be used in matching columns from the current table to those in the foreign table.
Note: The match attribute is ignored if table2 is not specified.

Valid values are:

names
Columns in the current table will be matched with columns having the same name in the foreign table. Matching columns must have the same data type. The merged table will contain those columns that are common to both tables.
order
The columns in the current table will be matched with the corresponding columns (by order, regardless of name) in the foreign table. The columns must have the same data type and must be in the same order; in addition, there must be the same number of columns in both tables.
pad
The merged table will contain all columns in the current table and the foreign table. Columns that match by name will appear once with data from both tables, whereas those that are unmatched will be "padded" in one or the other table with the appropriate N/A value.
Note: match="pad" always turns a table merge into a worksheet merge, and is therefore subject to the size limitations discussed above.

The default is names.

cols
A comma-separated list of the names of columns to be merged. Only the specified columns will appear in the merged table. 

cols may not be used with match="order".

Note: The cols attribute may be used to specify the columns of the result table even if table2 is not specified.

The default (if cols is omitted) is for the merged table to contain all matching columns.

type
The type of merge.
Note: The type attribute may be used even if table2 is not specified.

The effect of specifying any type other than all is to eliminate duplicate rows from the resulting table.

Valid values are:

all
All rows from the foreign table are appended to all rows in the current table.
union
Duplicate rows are removed from the merged table (i.e., the result is the set-theoretical union of the rows of the two tables).
intersect
Duplicates as well as all rows that do not appear in both tables are removed from the merged table (i.e., the result is the set-theoretical intersection of the rows of the two tables, which means that only rows that are common to both tables appear in the result).
complement
Duplicates and all rows that appear in the second table are removed from the merged table (i.e., the result is the set-theoretical difference of the current and the foreign table, which means that only rows that are in the current table but not the foreign table appear in the result).

The default is all.

materialize
Indicates whether the merged table should be materialized as a new merged table.
When using materialize="1":
  • The <merge> operation (or series of such operations) must appear before all other operations in a macro (except <note>).
  • The default attribute type="all" must be used.
  • Either match="names" or match="pad" may be used. (match="pad" available after version 14.01)
  • No other operations may be contained within the <merge> operation.
Note: When merging multiple tables, the materialize attribute should only be applied to the final merge.

The default is 0 (i.e., do not materialize).

Attributes when materialize="1"

The following attributes are only valid when materialize="1":

path
Specifies the folder path or full table pathname under which the merged table will be created.
Note: You must have upload permissions to the target folder.

The default location is a new unambiguous name in the user's My Data folder.

replace
Specifies whether the merged table should be allowed to replace an existing table.
Note: You must own an existing table that is to be replaced.

The default is 0 (i.e., do not replace).

users
A comma-separated list of the users who will have access to the merged table.

If absent, privileges are inherited from the parent folder. If users="" is specified, the merged table is private (i.e., accessible only to the user).

title
Specifies the title of the merged table.
linkheader
Specifies the link header of the merged table.
sdesc
Specifies the short description of the merged table.
ldesc
Specifies the long description of the merged table.
segby
A comma-separated list of columns by which the new merged table is marked as being segmented.
Note: This is advisory only; that is, you are responsible for ensuring that the table is, in fact, segby.
sortseg
A comma-separated list of columns by which the new merged table is marked as being sorted and segmented.
Note: This is advisory only; that is, you are responsible for ensuring that the table is, in fact, sortseg.
maxdown
Sets target table's maximum cell download limit to the value specified.

The default is no limit.

nomaxdown
Specifies whether to remove any limit on the maximum cell download for the table when using replace=1.

The default is 0 (i.e., the maxdown attribute will be observed even when using replace=1).

virtual
Specifies whether the merged table is built from references to the constituent tables.

When virtual="1", changes to the underlying tables will change the virtual merged result.

Setting virtual="0" will result in a merged table made from the state of the tables at the time of the merge. Any changes to the constituent tables will not affect the merged result.

The default is 1 (i.e., the merged table is built from references to the constituent tables).

tolerant
When virtual="1", this specifies whether the virtual merge will be tolerant of future missing constituent tables.
Note: The query is not tolerant of missing tables, just the resulting table.

The default is 0 (i.e., the merged table will not be tolerant of missing constituent tables).

job_responsibility
Setting job_responsibility="1" prevents synchronization of the table being materialized on an environment configured for High Availability (HA). This is assuming a job on the other HA sites will also be performing the same materialize, so synchronization will be unnecessary. Does nothing in an environment not configured for HA.

The default is 0.

(Available as of version 9.36)

Example: Simple merge

In this example, assume the "current" table is pub.demo.weather.hourly90. This table contains hourly weather data for calendar year 1990. It will be merged with a table with identical data for the year 1991.
Note: These tables are accessible to all 1010data users in the pub.demo folder.
<base table="pub.demo.weather.hourly90"/>
<merge table2="pub.demo.weather.hourly91" match="names" type="all"/>
Note: Since both tables have identical column names, the cols attribute is omitted in this example.

Example: Merge with operations

In the following example, the <willbe> operation is performed on the foreign table prior to the merge with the current table. The attribute match="pad" pads the merged table with NA values in cases where the columns do not match.
<base table="default.lonely"/>
<willbe name="a" value="1"/>
<merge table2="default.lonely" match="pad">
  <willbe name="b" value="2"/>
</merge>
The merge produces the following result: