<sel>
Select the rows relevant to your analysis.
Description
<sel>
is applied to a table or worksheet of some arbitrary number of
rows and produces a worksheet containing only those rows that were not eliminated by the
value expression. The expression passed to the value
attribute must
evaluate to either 0
or 1
. For instance, the
expression:
sales_price>5.00
where sales_price
is a column in a table or worksheet, will be evaluated
for each row in the table. If the value of sales_price
for that row is
greater than 5.00
, the expression evaluates to 1
and the
row is selected. If the value of sales_price
is less than or equal to
5.00
, the expression evaluates to 0
and the row is
eliminated from the worksheet (but not the physical table).
Expressions without comparisons can also be used for selecting rows. For example, column
foo
is column containing a 0
or 1
, and
column bar
is a column containing 0
or 1
.
An expression can be written that will only select the rows where both columns contain a
1
, as follows:
<sel value="foo*bar"/>
The selection expression above will only select rows that have the value 1
for both foo
and bar
.
Syntax
<sel value="[EXPR]"/>
This construction selects the rows for which [EXPR]
is
true.
Alternate syntax
(Available as of 11.06)
The alternate <sel>
syntax allows for selecting a disjunction of
multiple boolean expressions by specifying multiple <and>
and
<or>
child elements, as follows:
<sel> <or value= [EXPR_1]/> <or> <and value="[EXPR_2]"/> <and value="[EXPR_3]"/> </or> ... </sel>
The construction above selects the rows for which either
[EXPR_1]
or
[EXPR_2]
and
[EXPR_3]
are true.
Each <sel>
, <and>
, or <or>
tag
may contain zero or more <or>
tags or zero or more
<and>
tags. While <and>
and
<or>
may not be mixed at a given nesting level within a single tag,
you may have a complex selection that includes both <and>
s and
<or>
s. In this way, complex selections may be built up from multiple
smaller expressions. See the example section.
This alternate syntax may be more convenient for disjunctive selections, especially those constructed by block code. In addition, using this syntax may provide a memory or efficiency improvement compared to evaluating one long expression.
Including a value="[EXPR_0]"
as an attribute of the
<sel>
tag itself is permitted; it will be treated as an additional
<or>
or <and>
clause.
An <and/>
or <or/>
tag with neither a value
attribute nor contents has no effect.
Attributes
value
- Accepts a Macro Language expression which is used to filter the current dataset.
spflag
- Indicates whether special values should be treated as N/As when evaluating the
value
expression.The default is 0 (i.e., do not treat special values as N/As).
expand
- Expand the number of rows in a table based on the selection expression.
For
<sel value="N" expand="1"/>
, whereN
is a column containing integer values, or an expression (i.e., an expression involving two or more columns that results in a vector of integer values) ,expand="1"
will create a copy of each row according to the value ofN
.For rows where
N
is0
, the row will still be eliminated.If
N
is negative, an error will be returned.The system variable
ii_(B)
can be used to provide an enumeration of duplicate rows for each original row after the expansion, where B specifies a starting base number for the enumeration. For more information and a list of other system variables, see System variables.See Example:
expand="1"
below for additional detail. sample
- Accepts an integer value of
1
or0
that specifies whether or not to generalize the selection based on the value of thevalue
attribute.When
sample="1"
, a row will appear in the post-selection result with a probability equal to the value of thevalue
attribute. Thevalue
attribute can be an expression (or number) that evaluates at each row to a float between0
and1
or may also be N/A, which is equivalent to 0.For instance,
<sel sample="1" value="0.1"/>
should select approximately 10% of the rows uniformly across the entire table.The default is
0
.Note: The optional attributeseed
can be specified to change the random seed for the sample selection.(Available as of version 10.17)
seed
- Accepts a decimal number that specifies the random seed for the sample selection.
By default, the seed is based on the query hash so that the identical query will deterministically create the same sample.
If
seed="0"
, a seed is non-deterministically generated when the query is run. Because of caching, the sample won't change for the current session unless the cache is cleared, but runs of the same query withseed="0"
in different sessions will in general take a different sample.(Available as of version 10.17)
Example
In the following example, a selection is performed on
pub.demo.retail.item to select those rows where values in the
store
column are equal to 1.
<base table="pub.demo.retail.item"/> <sel value="store=1"/>
Example: expand="1"
In this example, the value in the units
column is used to determine the
factor of row expansion for each row in the table pub.demo.retail.item.
Therefore, a row where units=3
will be expanded to three rows, a row where
units=2
will be expanded to two rows, etc. Before the expansion, a
selection is performed to make sure that the units
column does not contain
negative values. A computed column that references the system variable
ii_()
displays the enumeration of duplicate rows for each original row
after the expansion.
<base table="pub.demo.retail.item"/> <sel value="units>=0"/> <sel value="units" expand="1"/> <willbe name="enum_expand" value="ii_()"/>
Example: Escape a string used in a selection expression
In the following example, the qv(X)
function is used to properly escape a
string used in a selection expression. To search for items in the
description
column of pub.doc.retail.product that
contain the string 9"
, the qv(X)
function must be used to
properly escape it.
<base table="pub.doc.retail.product"/> <let search_string="9""> <sel value="(contains(description;{qv(@search_string)}))"/> </let>
If the qv(X)
function was not specified, the query would have resulted in
an error stating that the "
was a bad character.
Example: Create a string of quoted values from a comma-separated list
The following example illustrates how to use the str_to_lst(S;D)
and
qv(X)
functions to turn a comma-separated list into a string of quoted
values that can be used in a selection expression.
<base table="pub.doc.retail.store"/> <set divs="West,North"/> <let divs_quoted="{qv(str_to_lst(@divs;','))}"> <sel value="divisiondesc={@divs_quoted}"/> </let>
The variable divs
is a comma-separated string consisting of two elements:
West
and North
. The string is first turned into a
list-value using the function str_to_lst(S;D)
, which is then passed to the
qv(X)
function. The qv(X)
function converts a list-value
into a string of quoted values, which can be used in a <sel>
operation.
The string of quoted values is assigned to the variable divs_quoted
, which
is declared in a <let>
statement. Inside the <let>
,
the <sel>
operation then references the value of the
divs_quoted
variable.
The code, after expansion, is:
<base table="pub.doc.retail.store"/> <sel value="divisiondesc='West''North'"/>
The result is shown below:
Example: Selecting a sample of rows
The following example demonstrates how to select half of an ordered set of rows with a
recency bias. Because the <sel>
operation specifies
seed="0"
, the query may result in a different sample each time it is run
either after clearing the cache or in a new session.
<base table="pub.demo.retail.item"/> <willbe name="rownum" value="rn_"/> <willbe name="prob" value="i_/n_"/> <sel value="i_/n_" sample="1" seed="0"/>
Example: Select a disjunction of multiple boolean expressions using
<or>
The following example uses the alternate syntax for <sel>
to select a
disjunction of multiple boolean expressions using <or>
elements. (Available as of 11.06)
The query selects rows for which the value expression of any <or>
tag
is true.
<base table="pub.demo.retail.item"/> <sel> <or value="store=1"/> <or value="date=20120515"/> </sel>
Example: Select a conjunction of multiple boolean expressions using
<and>
The following example uses the alternate syntax for <sel>
to select a
conjunction of multiple boolean expressions using <and>
elements. (Available as of 11.06)
The query selects rows for which the value expressions of all <and>
tags are true.
<base table="pub.demo.retail.item"/> <sel> <and value="store=1"/> <and value="date=20120515"/> </sel>
Example: Mix <and>
and <or>
tags within
<sel>
The following code shows how to nest
<and>
and <or>
selections. In this example, the
expression is included within the <sel>
tag.
<base table="pub.demo.weather.stations"/>
<sel value="tz=-7">
<or value="state='NY'">
<and value="elev>100"/>
</or>
<or value="state='CA'">
<and value="elev<100"/>
</or>
<or value="state='TX'"/>
</sel>
The
rows returned include tz=-7
, state='NY'
and
elev>100
, state='CA'
and
elev<100
, or state='TX'
.