<find>
Efficiently finds (without selecting) the first n rows, between the specified row numbers, where the specified Boolean value expression is true.
Description
This operation identifies the first n rows in a table, based upon a Boolean
expression, constrained by an upper and lower bound. The upper bound and lower bound default
to 0
if omitted and are not themselves included. Because the search is
carried out on only as many segments required to satisfy the request,
<find>
has the potential to be much faster than
<sel>
. This may also result in less memory needed to cache a
<find>
operation than a <sel>
operation.
If the search condition used is a commonly occurring word or phrase, and the table is large
and widely distributed, this can result in high parallelism which incurs the overhead of
many subprocesses needing to coordinate and report back. The attribute
maxpar="n"
specifies the maximum number of subprocesses to be enlisted to
search for the condition.
Syntax
<find value="[EXPRESSION]" after="[AFTER_ROW_NUM]" before="[BEFORE_ROW_NUM]" count="[N]">
Alternate syntax
(Available as of 12.24)
An alternate syntax of <find>
(containing a tree of
<or>
tags) is also supported. In this form, the optional attribute
spflag="1"
is available which indicates whether special values should be
treated as N/As when evaluating the value
expression or executing multiple
successive <find>
statements, each of which caches its results.
Shown here, this syntax allows for finding a disjunction of multiple Boolean expressions by
specifying multiple <or>
child elements.
<find> <or value="[EXPR_1]"/> <or value="[EXPR_2]"/> ... </find>
This construction finds the rows for which any of
[EXPR_1]
,
[EXPR_2]
, etc. are true.
This is logically equivalent to <find
value="([EXPR_1])|([EXPR_2])|..."/>
, but
it may be more convenient for disjunctive criteria, especially those constructed by block
code. In addition, using this syntax may provide a memory or efficiency improvement compared
to evaluating one long expression or executing multiple successive
<find>
statements, each of which caches its results.
Including a value="[EXPR_0]"
as an attribute of the
<find>
tag itself is permitted; it will be treated as an additional
<or>
clause.
<find value="[EXPR_0]"> <or value="[EXPR_1]"/> <or value="[EXPR_2]"/> ... </find>
Each <or>
tag may contain one or more <or>
tags. In
this way, complex searches may be built up from multiple smaller expressions.
<find>
must either be <and>
or
<or>
; they may not be mixed.Alternate syntax
(Available as of 12.24)
An alternate syntax of <find>
(containing a tree of
<and>
tags) is also supported. In this form, the optional attribute
spflag="1"
is available which indicates whether special values should be
treated as N/As when evaluating the value
expression or executing multiple
successive <find>
statements, each of which caches its results.
Shown here, this syntax allows for finding a disjunction of multiple Boolean expressions by
specifying multiple <and>
child elements.
<find> <and value="[EXPR_1]"/> <and value="[EXPR_2]"/> ... </find>
This construction selects the rows for which all of
[EXPR_1]
,
[EXPR_2]
, etc. are true.
This is logically equivalent to <find
value="([EXPR_1])&([EXPR_2])&..."/>
,
but it may be more convenient for conjunctive criteria, especially those constructed by
block code. In addition, using this syntax may provide a memory or efficiency improvement
compared to evaluating one long expression or executing multiple successive
<find>
statements, each of which caches its results.
Including a value="[EXPR_0]"
as an attribute of the
<find>
tag itself is permitted; it will be treated as an additional
<and>
clause.
<find value="[EXPR_0]"> <and value="[EXPR_1]"/> <and value="[EXPR_2]"/> ... </find>
Each <and>
tag may contain one or more <and>
tags.
In this way, complex selections may be built up from multiple smaller expressions.
<find>
must either be <and>
or
<or>
; they may not be mixed.Attributes
value
- Specifies an expression that determines which rows are searched for. The expression
specified by the
value
attribute must evaluate to either0
or1
.The row numbers in which the Boolean expression evaluates to
1
are returned; rows in which the Boolean expression evaluates to0
are ignored. after
- Indicates the starting point from which
<find>
searches for matches. The row number specified here is excluded from the search set.Defaults to
0
(i.e. just before the first row, row one) if omitted, so that the first row will be found if the expression is satisfied there. before
- Indicates the stopping point before which
<find>
searches for matches. The row number specified here is excluded from the search set.Defaults to a number 1-greater than the number of rows in the table or worksheet on which the operation is applied. This ensures that all rows are included in the search space.
count
- Specifies the number of rows to find within the given range of rows
<find>
is operating on.[N] defaults to
1
, and a negative [N] may be specified, in which case the search proceeds backwards from [BEFORE_ROW_NUM] (or the end of the table) and the results are returned in descending order. maxpar
-
Accepts a positive integer indicating the maximum number of subprocesses that will be put to work once searching for the condition.
The default is no maximum, whereas
maxpar="1"
ensures that segments are searched sequentially, one at a time. This is theoretically fastest if the requested number of results is guaranteed to be found in the first segment.maxpar="auto"
can be used to mitigate this; it starts maximum parallelism at1
and then doubles it at each step when the requested row count is not yet satisfied. This is very effective in minimizing overhead for frequent conditions while maximizing parallelism for rare conditions. softcount
- Specifies the number of rows [M] (larger than
[N]) to return if they are found while performing the search.
Used in conjunction with
count
, or ifcount
is omitted with the default [N]=1, [M] should be a positive number greater than the absolute value of [N]. Searches only as far as necessary to find [N] rows, but return a larger number [M] of rows if they are found in the course of that search.
Example
In
pub.demo.weather.stations
, find the first ten occurrences where state =
'NY'.
The result is a tabulation-like table of a single column row
containing row numbers, with as many rows as there were search results (i.e. between 0 and
10 results,
inclusive).
<base table="pub.demo.weather.stations"/> <find value="state='NY'" count="10"/>
Example: Find the last three rows
In pub.demo.weather.stations
, find the last three rows before row
100 where state='GA'
.
<base table="pub.demo.weather.stations"/> <find count="-3" before="100" value="state='GA'"/>
Example: Alternate syntax
In
pub.demo.retail.item
, find the first 15 rows matching any of three
different values from the account
column.
<base table="pub.demo.retail.item"/> <find count="15"> <or value="account=957"/> <or value="account=478"/> <or value="account=709"/> </find>