Using g_functions

Provides an overview of the basic functionality of g_functions in 1010data.

Group functions, or g_functions as they are known in 1010data, perform operations such as data summaries and aggregations across one or more entire columns in a table. Often, especially with basic g_functions, the operation is applied to one column while the rows are grouped by another. For instance, you can use a g_function to calculate total sales by store or average temperature by city. In both cases, the column named after the word "by" (store or city) defines the groups.
Note: All group functions work in the vector context only.
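
To make the idea of a group concrete, the following is a minimal Python sketch of "average temperature by city". It is not 1010data code and does not call any 1010data API; the helper average_by and the sample values are hypothetical and exist only to show how rows that share a grouping value are summarized together.

```python
from collections import defaultdict

def average_by(groups, values):
    """Hypothetical helper: average of `values` for each distinct entry in `groups`."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for group, value in zip(groups, values):
        totals[group] += value
        counts[group] += 1
    return {group: totals[group] / counts[group] for group in totals}

# Illustrative data: one list per column, one position per row.
city = ["Boston", "Boston", "Miami", "Miami"]
temperature = [10.0, 14.0, 28.0, 30.0]

averages = average_by(city, temperature)
```

Here the city column plays the role of the group and the temperature column is the data being summarized.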

Basic parameters

While many g_functions are available in 1010data, most take at least a few of the same basic parameters:
  • G - The name of one or more columns that will be used for grouping (e.g., store or city).
  • S - The name of a single column whose values tell the function to either include or omit a row in the table (i.e., 1 for include and 0 for omit). This is the selection column. See Creating a selection column for more information.
  • O - The order of the rows in the group.
  • X - The name of a column whose values will be operated on by the function (such as sales or temperature).

Basic function syntax

The arguments to a g_function are separated by semicolons and must be provided in the order specified in the function documentation. As an example, the function g_sum(G;S;X) is called as follows (where store, storeflag, and sales are names of columns in a table):

g_sum(store;storeflag;sales)

The example above returns the sum of sales for each store, counting only the rows where the storeflag column contains a 1.
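
The roles of G, S, and X in this call can be emulated with a short Python sketch. This is an illustration of the semantics described above under stated assumptions, not the 1010data implementation: the function grouped_sum and the sample data are hypothetical, and the sketch returns one total per group rather than reproducing the exact shape of the result in 1010data.

```python
from collections import defaultdict

def grouped_sum(g, s, x):
    """Hypothetical emulation: sum x per value of g, using only rows where s is 1."""
    totals = defaultdict(int)
    for group, selected, value in zip(g, s, x):
        if selected == 1:  # rows with 0 in the selection column are omitted
            totals[group] += value
    return dict(totals)

# Illustrative data: one list per column, one position per row.
store = ["A", "A", "B", "B", "B"]
storeflag = [1, 0, 1, 1, 0]
sales = [100, 50, 20, 30, 40]

totals = grouped_sum(store, storeflag, sales)
```

In this sample, the second row of store A and the last row of store B are left out of the totals because their storeflag value is 0.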

Depending on the function, some g_function parameters may be omitted. When you omit a parameter, however, you must still include its semicolon so that the remaining arguments keep their positions. For example, to omit the S parameter, call the function as shown below:

g_sum(store;;sales)

Because no selection column is provided, the example above includes every row and returns total sales for each store in the table.
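
Why the extra semicolon matters can be seen by treating the argument list as positional slots. The Python sketch below is only an illustration and makes no claim about how 1010data parses expressions; bind_arguments is a hypothetical helper that splits the text on semicolons and pairs each slot with a parameter name.

```python
def bind_arguments(arg_text, parameter_names):
    """Hypothetical helper: pair each semicolon-separated slot with a parameter name."""
    # An empty slot becomes None, meaning the parameter was omitted.
    values = [part.strip() or None for part in arg_text.split(";")]
    return dict(zip(parameter_names, values))

# Empty slot kept: sales stays in the third position (X).
with_slot = bind_arguments("store;;sales", ["G", "S", "X"])

# Semicolon dropped: sales slides into the second position (S).
without_slot = bind_arguments("store;sales", ["G", "S", "X"])
```

With the empty slot, sales is still bound to X. Without it, sales would occupy the S position and nothing would be supplied for X.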

For specific information on required and optional parameters, see the topic for the specific g_function in the Function Reference.