g_tshift_nearest(G;S;T;TS;X;N)
Returns the value of the nearest row that is at least a specified time period before (or after) the current row within a given group.
Function type
Vector only
Syntax
g_tshift_nearest(G;S;T;TS;X;N)
t_tshift_nearest(X;N)
Input
Argument | Type | Description |
---|---|---|
G |
any | A space- or comma-separated list of column names Rows are in the same group
if their values for all of the columns listed in If If any of the columns listed in |
S |
integer | The name of a column in which every row evaluates to a 1 or 0, which determines
whether or not that row is selected to be included in the calculation If
If any of the values in
|
T |
integer or decimal | The name of a column whose values are used as the basis of the time
series Row order is determined by Note:
T may not be omitted. |
TS |
text | A string representing the type of the time T The choice of
Valid types are:
If |
X |
any | A column name |
N |
integer | The number of rows to shift within the group
|
Return Value
G
and ordered by T
(and for those rows where S
=1, if specified):- If
N
>= 0:-
g_tshift_nearest
returns the value ofX
for the row that is exactlyN
time periods after this row in the same group as this row.If there are multiple rows that are exactlyN
time periods after this row,g_tshift_nearest
returns the value from the first row within that set of rows.Note: This behavior is identical tog_tshift
. -
If there are no rows that are exactly
N
time periods after this row,g_tshift_nearest
returns the value ofX
for the nearest row that is at leastN
time periods after this row.
-
- If
N
< 0:-
g_tshift_nearest
returns the value ofX
for the row that is exactly |N
| time periods before this row in the same group as this row.If there are multiple rows that are exactly |
N
| time periods before this row,g_tshift_nearest
returns the value from the first row within that set of rows.Note: This behavior is identical tog_tshift
. -
If there are no rows that are exactly |
N
| time periods before this row,g_tshift_nearest
returns the value ofX
for the nearest row that is at least |N
| time periods before this row.
-
Time periods are determined by TS
with respect to the
values in T
.
The result is the same data type as X
.
X
are treated like any other value; i.e. they are
shifted.If no rows in a group have valid (non-N/A) values for X
, the result for
every row of the group is N/A.
Sample Usage
<base table="pub.doc.samples.ref.func.g_func_time_series_sample_usage"/> <willbe name="g_tshift_nearest_1" value="g_tshift_nearest(state;include;order;;value;-1)"/> <willbe name="g_tshift_nearest_2" value="g_tshift_nearest(state city;include;order;;value;-1)"/>
Example
For this example, we'll use the sales data available in pub.demo.retail.item, which contains sales transactions from three stores.
Let's say we want to find out the total sales from the previous day. However, for the sake
of example, let's suppose our store isn't open every day, so there will be gaps in our
transactional data. We can't just use a function like g_tshift
, because there will not always be an exact match for
the previous day. What we really want is to find the total sales from the day that is
nearest to the previous day. To do this, we can use
g_tshift_nearest
.
For simplicity, let's say we just want to look at the transactions from store 1. We'll do a
simple <sel>
operation to pare down the data.
<base table="pub.demo.retail.item"/> <sel value="(store=1)"/>
(Obviously, this is not too critical for our sample table, which only contains 35 rows, but it will make our example a little easier to understand).
Now our table just shows the transactions from store 1:
Since we're interested in total sales for each day, let's use the g_sum
function to calculate that. We'll add a computed column
(using the <willbe>
operator) grouping by date
and
totaling the items in the sales
column.
<willbe name="total_sales" value="g_sum(date;;sales)" label="Total Sales" format="dec:2"/>
This adds a new column to our table, which shows the total sales for each day:
In order to contrast and compare, let's first see what would happen if we used
g_tshift
here to find the total sales from the day before.
The g_tshift
function has the form:
g_tshift(G;S;T;TS;X;N)
Since we already selected the rows for store 1, which is
the group we're interested in, we can omit the G
parameter. (If we hadn't
done the <sel>
operation earlier, we could have set G
to store
here in order to group the results by store.)
We can also omit the S
parameter, since we want to consider all rows when
applying the function.
We'll use the date
column for our
T
parameter and set the TS
parameter to
'D'
(since the data in the date
column is of the
date type), and we'll set the N
parameter to
-1, which will allow us to shift back to the previous day in our
table.
And, finally, since we're interested in the total sales for the previous day, we'll set the
X
parameter to the value in our computed column,
total_sales.
So, our g_tshift
function should look like:
g_tshift(;;date;'D';total_sales;-1)
G
and S
parameters,
we still need to account for them in the function call; hence, the ";;
"
before date
.Let's add a computed column to our table with this g_tshift
function using
<willbe>
:
<willbe name="previous_day_sales" label="Previous Day`Total Sales" value="g_tshift(;;date;'D';total_sales;-1)" format="type:num;dec:2"/>
Now our table should look something like:
Notice all the N/A values in our new column. That's because
the only data we have for consecutive days is on 05/15/12 and 05/16/12. So we can see that
g_tshift
applied to the rows with the date 05/16/12 result in the total
sales for the rows from the previous day, 05/15/12, which is
-3.40.
g_tshift
returns the value of X
(total_sales
) from the first row
with that date.All the other transactions are more than one day apart;
therefore, g_tshift
returns N/A because there is no exact match with
respect to the time period specified.
To find the total sales for the day nearest to the previous day, we can use
g_tshift_nearest
, which has the form:
g_tshift_nearest(G;S;T;TS;X;N)
All of the parameters will be the same as what we used for g_tshift
. Let's
add another computed column for the g_tshift_nearest
function:
<willbe name="nearest_day_sales" label="Nearest Previous Day`Total Sales" value="g_tshift_nearest(;;date;'D';total_sales;-1)" format="type:num;dec:2"/>
Our new computed column looks like:
Because there is an exact match with respect to the time period specified, the
results from g_tshift_nearest
for the rows with the date 05/16/12 are the
same as the results from g_tshift
(the first row with the date
05/15/12).
However, for those rows where there is no exact match with respect to the time
period specified, g_tshift_nearest
will return the value of X for the
nearest row that is at least the specified time period away.
For instance, since we have no transactional data for 05/17/12,
g_tshift_nearest
for the rows with the date 05/18/12 will return the
total sales from the nearest row with the date 05/16/12 (the closest previous day).
Also, since there is no transactional data for 06/02/12, the results from
g_tshift_nearest
for the rows with the date 06/03/12 show the total sales
from the nearest row with the date 05/18/12.
g_tshift
and
g_tshift_nearest
for those transactions dated 05/15/12 because no
transactions prior to that date exist in this table.Additional Information
- The
t_
version of this function defaults theG
andT
/TS
arguments, and omits theS
argument. The defaults forG
andT
/TS
are set at table load time based on the organization of the table.