Functions in <tcol>

This table lists the valid values for the fun attribute in a <tcol> element and indicates which of those functions require a second column (via the weight attribute).

Value Description Requires 2nd column
cnt Number of records
Note: As of version 10.19, it is no longer necessary to specify the source attribute when fun="cnt".
ucnt Number of unique values

Use ucnt when the table is not segmented by the column you are counting. If the table is not segmented by the column you are counting, and speed is more important than exact precision, use aucnt.

ucnte Number of unique values (segmented column only)

If the table is segmented by the column to be counted, use ucnte instead of ucnt for a faster approach to unique counting. If the table is not segmented by the counted column, the ucnte results will be incorrect. Use ucnt instead.

aucnt Approximate number of unique values

Uses the hyper-log-log algorithm and results in a floating point value.

Note: Use aucnt instead of ucnt when the table is not segmented by the column you are counting and speed is more important than exact precision. While this function can be used for small tables, it is generally used to compute the approximate number of distinct values in a column with an extremely large number of uniques (cardinalities in the several hundred million or more). aucnt is slower than ucnt for cardinalities fewer than 100 million.

(Available as of version 10.38)

wucnt The weighted number of unique values
Note: When computing the weighted number of unique values in tabular data, it is possible that some unique values have more than one entry and therefore have more than one weight. wucnt is computed by taking the average weight for each unique value and then adding the averages together.

(Available as of version 13.33)

sum Sum
avg Average
Note: avg yields the same result as mean.
wavg Weighted average
hi Highest value
lo Lowest value
first First value
last Last value
std Population standard deviation
std_s Sample standard deviation (divide by n-1 instead of n, where n is the number of rows)
var Population variance
var_s Sample variance (divide by n-1 instead of n, where n is the number of rows)
wstd Weighted standard deviation
wvar Weighted variance
ssqr Sum of squares
rho Correlation
cov Covariance
sumwd Dot product
valcnt Number of valid values
Note: valcnt yields the same result as ssize.
nacnt Number of NA values
wcnt Number of valid pairs
wnacnt Number of NA pairs
mean Sample mean value
Note: mean yields the same result as avg.

(Available as of version 13.03)

median Median value
mode Mode
modef Frequency of mode
moden Number of modes
lquart Lower quartile

(Available as of version 13.03)

uquart Upper quartile

(Available as of version 13.03)

aquant Approximate quantiles. aquant uses the tdigest algorithm. It is significantly faster for high cardinality quantile calculations, such as when there are at least 100 million unique values for each break. However, aquant will be slower when the cardinality of the source for each break is small.

The argument p is the probability to retrieve. It must be between 0-1.

The argument nbin is optional and is the number of storage nodes the internal C implementation uses (default: 100, maximum: 1000). The larger nbin is, the greater the accuracy.
<base table="retaildemo.retail.sales_detail"/>
<willbe name="g" value="draw_(321;100)"/>
<willbe name="r" value="draw_(123;0)"/>
<tabu breaks="g">
  <tcol name="aquant" source="r" fun="aquant" args="p:0.5,nbin:1000"/>
</tabu>

(Available as of version 19.15)

In addition, one of the following optional prefixes may be prepended to cnt, sum, or avg. (As of version 10.23, the optional prefixes may be used with any of the <tcol> functions that yield numeric results.)

Prefix Description
PG Express the result as a percentage of the grand total
FG Express the result as a fraction of the grand total
RG Express the result as a rank
PR Express the result as a percentage of the row total in a cross-tabulation
FR Express the result as a fraction of the row total in a cross-tabulation
RR Express the result as a rank within the row in a cross-tabulation
PC Express the result as a percentage of the column total in a cross-tabulation
FC Express the result as a fraction of the column total in a cross-tabulation
RC Express the result as a rank within the column in a cross-tabulation