segmentation

Segmentation is the process of partitioning a table horizontally (that is, by rows) in the underlying file structure so that the table's rows are spread across multiple files rather than stored in a single file.

For example, a 45-row table can be split so that each "segment" (i.e., file) contains ten rows; this would yield five segments, where the first four segments contain ten rows and the last segment contains five.
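The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration of fixed-size row segmentation, not the platform's storage implementation; the function name `segment_rows` and its parameters are hypothetical.

```python
def segment_rows(rows, rows_per_segment):
    """Split rows into consecutive segments of at most rows_per_segment rows."""
    return [
        rows[start:start + rows_per_segment]
        for start in range(0, len(rows), rows_per_segment)
    ]


# A stand-in for a 45-row table: each "row" is just its row number.
table = list(range(45))

segments = segment_rows(table, 10)
segment_sizes = [len(segment) for segment in segments]

print(len(segments))   # 5
print(segment_sizes)   # [10, 10, 10, 10, 5]
```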

The 1010data Insights Platform provides two specialized forms of segmentation: segby and sortseg. If a table is segby a given column, all rows that share a given value in that column reside in the same segment; no value of the column appears in more than one segment. If a table is sortseg on a given column, the same guarantee holds and, in addition, the rows within each segment are sorted on that column. These specialized forms of segmentation allow for optimized performance when aggregating (or using g_functions) on the segmented column.
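The two properties can be made concrete with a small Python sketch that checks whether a given set of segments satisfies them. This is not platform code or a platform API; the helper names `is_segby` and `is_sortseg`, the `store` column, and the sample rows are all hypothetical, and segments are modeled simply as lists of dictionaries.

```python
def is_segby(segments, column):
    """Return True if no value of `column` appears in more than one segment."""
    first_segment_for_value = {}
    for index, segment in enumerate(segments):
        for row in segment:
            # Remember the first segment each value was seen in.
            if first_segment_for_value.setdefault(row[column], index) != index:
                return False
    return True


def is_sortseg(segments, column):
    """Return True if the segby property holds and each segment is sorted on `column`."""
    if not is_segby(segments, column):
        return False
    for segment in segments:
        values = [row[column] for row in segment]
        if values != sorted(values):
            return False
    return True


# Values are confined to one segment each, but the first segment is unsorted.
unsorted_segments = [
    [{"store": 2}, {"store": 1}, {"store": 2}],
    [{"store": 3}, {"store": 3}],
]

# Same rows, with each segment sorted on the column.
sorted_segments = [
    [{"store": 1}, {"store": 2}, {"store": 2}],
    [{"store": 3}, {"store": 3}],
]

# The value 2 is found in two different segments.
split_value_segments = [
    [{"store": 1}, {"store": 2}],
    [{"store": 2}, {"store": 3}],
]

print(is_segby(unsorted_segments, "store"), is_sortseg(unsorted_segments, "store"))
print(is_segby(sorted_segments, "store"), is_sortseg(sorted_segments, "store"))
print(is_segby(split_value_segments, "store"), is_sortseg(split_value_segments, "store"))
```

Because every row for a given value sits in one segment, an aggregation grouped on that column can, in principle, be computed segment by segment without combining partial results across files, which is the kind of optimization the definition above refers to.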

See also: