g_rankavg(G;S;X)

Returns the rank of unique values within a given group, averaging the rank for repeated values.

Function type

Vector only

Syntax

g_rankavg(G;S;X)
t_rankavg(X)

Input

Argument Type Description
G any A space- or comma-separated list of column names

Rows are in the same group if their values for all of the columns listed in G are the same.

If G is omitted, all rows are considered to be in the same group.

If any of the columns listed in G contain N/A, the N/A value is considered a valid grouping value.

S integer The name of a column in which every row evaluates to a 1 or 0, which determines whether or not that row is selected to be included in the calculation

If S is omitted, all rows will be considered by the function (subject to any prior row selections).

If any of the values in S are neither 1 nor 0, an error is returned.

X any numeric type A column name

Return Value

For every row in each group defined by G (and for those rows where S=1, if specified), g_rankavg returns a decimal value corresponding to the rank of X in the list of all valid (non-N/A) values of X within the group.

The largest value has rank 1, and subsequent (decreasing) values are ranked in increasing increments of 1 (e.g., 2, 3, 4).

Repeated values are assigned the average of the ranks they would receive according to g_rank(G;S;;X), and the value following a set of repeated values is ranked incrementally by the number of repeated values in the set.

If X is N/A, the result is N/A.

Sample Usage

<base table="pub.doc.samples.ref.func.g_func_sample_usage_two_values"/>
<willbe name="g_rankavg_1" value="g_rankavg(state;include;value1)"/>
<willbe name="g_rankavg_2" value="g_rankavg(state city;include;value2)"/>

Example

Let's say we want to rank the items for all the daily transactions from highest to lowest in terms of sales price for each store in our table, but let's say we want those items with the same sales price to have the same (average) rank. Furthermore, we want subsequent items to be ranked as if the items with the sames sales price were given incremental rankings. We can easily do this using g_rankavg(G;S;X).

Let's use the Sales Item Detail table (pub.demo.retail.item) to demonstrate the use of this function. When we open the table, it looks something like this:

Since we want to rank the transactions for every date, and we want to do that for each store, we'll specify the store and date columns for the G parameter. We'll omit the S parameter so that all of the rows will be considered, and we'll specify the sales column for X, since we want the ranking to be based on the sales prices of the items.

Let's create a computed column for the ranking:
<willbe name="rank" value="g_rankavg(store date;;sales)"/>

This will give us a column that looks like the following:

To see the transactions in ranking order for every date for each store, you would sort first on the rank column, then on the date column, and then on the store column. You could do this by clicking on the up arrow () at the top of each of those columns in that order. (See Sorting rows in the 1010data User's Guide for more information about sorting multiple columns.)

Alternately, you could sort those columns using the following Macro Language code:
<sort col="rank" dir="up"/>
<sort col="date" dir="up"/>
<sort col="store" dir="up"/>

The table would then show all the transactions for each store ordered by date, and all the transactions for each date ordered by rank:

You can see that on 06/03/12, there are seven transactions that now appear in the table ranked from the one with the highest sales to the lowest. You'll also notice that there are two transactions with rank 4.5, which is the average of the ranks they would receive from g_rank(G;S;;X), since they have the same value in the sales column (1.1). The next value is given the rank 6 (taking into account the rank positions of the previous two repeating values in the group and skipping over them).

Additional Information

  • The t_ version of this function defaults the G argument and omits the S argument. The default for G is set at table load time based on the organization of the table.