Scalar expressions in 1010data
Scalar expressions allow for the storage and access of individual variables and values in 1010data.
Traditionally, 1010data is a vector processing system. In other words, it has always been good at dealing with columns of values of the same type. With the development of higher-level tools such as blocks and QuickApps, scalar expressions have become much more important for those both developing queries and designing front-end interfaces. This guide is meant to serve as an introduction to the topic of scalar values and expressions.
Traditional Syntax
Before blocks came along, the typical way to write an expression in 1010data was to perform an operation on an entire column. For instance, to select all the rows in a table or worksheet that are associated with a specific column's value, the expression would be written as:
<sel value="store=1"/>
In the example above, store
is the name of a column in a table, and
1
is the value to which the operation will compare every value in the
column. The operation above selects only rows in the table where the
store
column has the value of 1
. This is probably the
most basic expression in 1010data. Of course, other comparison operators could also be
used:
<sel value="store<>1"/>
will select all the stores in the table that are not equal to 1
.
Computed columns operate in a very similar fashion; however, the result would be a new column, which has to have a name:
<willbe name="new_column" value="column1+column2"/>
where the new column will be the sum of columns column1
and
column2
. This made querying data in the system very straightforward, but also
had several drawbacks. For long and/or complex queries, scalability and maintainability were
lacking.
Variable Assignment and Retrieval
Blocks made it possible for 1010data queries to be parameterized. A value could be assigned to a variable once and called as many times as necessary. This meant if a value was used multiple times in a query, it would only need to be changed in a single place, which made Macro Language code much easier to maintain. Here's the first example as a block:
Works!
<block name="my_block" col="store" val="1"> <sel value="{@col}={@val}"/> </block>
The code above translates identically to <sel value="store=1"/>
. To 1010data,
the two are identical. But to the person writing the query, the comparison values can be
changed in the opening tag of the block, and the change will propagate to every instance
of the variable.
The syntax here is straightforward. A variable is created, and its value is assigned in the
opening tag of the block. In this example, two variables are created and given values:
col="store"
and value="1"
. There are now two variables whose
values can be retrieved by variable name.
The syntax for retrieving a value is also easy to spot. Scalar values are always retrieved by
enclosing the variable name in curly braces ({}
). Inside the curly braces, the
variable name is preceded with an @
symbol.
Notice in the example above that each variable is enclosed in its own set of curly braces. This is because each variable resolves to its own unique value. This is an important rule. Anything enclosed in curly braces must resolve to a single value. This will become much more interesting when list-values come into play. For now, keep in mind that the following will not work:
Does not work!
<block name="my_block" col="store" val="1"> <sel value="{@col=@val}"/> </block>
The above does not result in a single scalar value being returned. It results in a comparison operation that returns some unknown number of rows when evaluated by the system. Remember, scalar values must resolve to a single value.
Data Types in Scalar Expressions
As with all 1010data Macro Language queries, the data type for values must be considered when working with scalar expressions. For instance, when comparing a value or column to a string value, the string (or variable that resolves to a string) must be enclosed in single quotes. This is not true of integer or decimal values. Take our original example:
<sel value="store=1"/>
Notice above that the value to which store
is being compared,
1
, need not be enclosed in single quotes. However, if a comparison is
made between a column's value and a text value (for instance, a state such as Alaska), the
following syntax applies:
<sel value="state='AK'"/>
Notice that the text value needs to be enclosed in single quotes for the query to work, whereas the integer value in the previous query does not. The same rule applies to referencing variables in scalar expressions. If the underlying value of the variable is text, then the expression must be enclosed in single quotes:
Works!
<note type="base">Applied to table: pub.fin.fred2.bls.smsu</note> <block column_name="state" state_code="AK"> <sel value="{@column_name}='{@state_code}'"/> </block>
Works!
<note type="base">Applied to table: pub.fin.fred2.bls.smsu</note> <block column_name="state" state_code="'AK'"> <sel value="{@column_name}={@state_code}"/> </block>
The code above illustrates the use of single quotes outside the curly braces to ensure that strings are interpreted by the system properly.
Does not work!
<note type="base">Applied to table: pub.fin.fred2.bls.smsu</note> <block column_name="state" state_code="AK"> <sel value="{@column_name}={@state_code}"/> </block>
Does not work!
<note type="base">Applied to table: pub.fin.fred2.bls.smsu</note> <block column_name="state" state_code="AK"> <sel value="{@column_name}={'@state_code'}"/> </block>
When working with variables that could potentially hold values of multiple types, it is best practice to enclose the entire expression in single quotes. This will "cast" any non-text types to text and prevent type errors.
Functions that Accept and Return Scalar Values
Traditional 1010data functions, such as g_functions, typically accept and return vectors. In
other words, the function typically accepts some column as an input and does something to
every value of the column. For example, g_avg(G;S;X)
accepts one or more columns for
grouping, an optional selection column, and a column on which to perform the operation,
which in this case is to find the average:
<note type="base">Applied to table: pub.demo.weather.wunderground.observed_hourly</note> <willbe name="avg_tmp" value="g_avg(zipcode date;;tempi)"/> <sel value="g_first1(zipcode date;;)"/>
The resulting column from the code above will find the average mean temperature for all combinations of zipcode/date in the table. Both g_functions operate across all the values in each column provided as an argument.
With blocks and QuickApps, many new functions have been added to the system that accept and
return scalar values instead of vector values. These functions are ideal for assigning
values to variables declared in <block>
and <dynamic>
code. They follow the same
syntactical rules as creating and referencing variables in the scalar context.
As an example, take a function that works in either the scalar or vector context: csl_len(X)
. This function returns the length of a
comma-separated list. If it is passed a column containing comma-separated lists, it will
return the value for each in the vector context:
<let> <table cols="list">"list,of,separated,values";"another,list,of,values,that,is,of,different,length" </table> <col name="list" format="width:50"/> <willbe name="list_len" value="csl_len(list)"/> </let>
The code above will produce a new column that gives the number of elements in each of the two comma-separated lists. However, this function will also work in the scalar context:
Works!
<letseq delimited_list="yet,another,list,of,values" list_len="{csl_len(@delimited_list)}"> <willbe name="scalar_list_length" value="{@list_len}"/> </letseq>
The code above can be written with identical results as follows:
Works!
<let delimited_list="yet,another,list,of,values"> <willbe name="scalar_list_length" value="{csl_len(@delimited_list)}"/> </let>
Notice that in both examples above, the function call is nested immediately inside the curly
braces. The variable is referenced with the @
symbol, but does not need to be
encased directly within curly braces. The curly braces represent the entire scalar expression,
not simply the variable reference. The following will not work:
Does not work!
<let delimited_list="yet,another,list,of,values"> <willbe name="scalar_list_length" value="csl_len({@delimited_list})"/> </let>
Many functions in 1010data will work in either the scalar or vector context. This is annotated in the 1010data function reference topics with language to the following effect for a given parameter: "Accepts a value or column name containing values..."
1010data now provides functions that are specific to scalar expressions. Most notable are:
lst(X)
and pkg(K;V)
. Both of these
functions, as well as the associated functions for manipulating their values, can only be used
in the scalar context. These functions must accept and return scalar values called list-values.
List-values contain indexed collections of either values or key/value pairs.
For example, here is a basic list:
<note type="base">Applied to table: pub.demo.retail.item</note> <let my_list="{lst(1 2 3)}"> <sel value="store={@my_list.1}"/> </let>
Lists allow for cleaner, more readable code because they allow for the storage of many values in a single variable. Without lists, each value would require its own variable to hold it.
Since lists are indexed, it's also easy to iterate over them using the <foreach>
construct without the need to
reference index values:
<let ideas="{3.14159, {'a', 'e', 'i' 'o' 'u'}, 'The course of true love never did run smooth'}"> <foreach important_list="{@ideas}" tally_="@i"> <willbe name="important_stuff{@i}" value="'{@important_list}'"/> </foreach> </let>
The code above uses the implicit list syntax to build a list of important ideas. Notice that the
vowels listed above are placed in a list of their own, which is nested inside the larger
list. The <foreach>
construct is able to iterate over each value in
@important_list
without the need to reference an index position, as
would be necessary in a <willbe>
operation or <for>
loop. The list-value (that's the
vowels) contained within the list will be automatically converted to a comma-separated
string.
Finally, take note that the list-variable provided to the value
attribute is
enclosed in single quotes outside the curly braces. Since the list contains mixed value types,
this will ensure all types are cast as strings and will not cause errors.
Packages are similar to lists. Packages also provide a way to build indexed variables containing multiple values. However, packages allow for "keys" to be provided, making them similar to a dictionary or associative array in other languages. Referencing package values in scalar expressions is very similar to accessing values in lists:
<let ideas="{pkg('pi' 'vowels' 'true_love';3.14159 {'a', 'e', 'i', 'o', 'u'} 'The course of true love never did run smooth')}"> <table depth="1"/> <willbe name="lysander" value="'{@ideas.true_love}'"/> </let>
where lysander
takes the place of the index position in the
aforementioned list syntax.
Notice again that packages can contain lists. To access the values of the list of vowels, use the index position of the item after identifying the list by its package key:
<let ideas="{pkg('pi' 'vowels' 'true_love';3.14159, {'a', 'e', 'i', 'o', 'u'} 'The course of true love never did run smooth')}"> <table depth="1"/> <willbe name="vowel" value="'{@ideas.vowels.3}'"/> </let>
will locate the value: 'i'
.
Parameterizing Keys and Indexes
It is often the case that you will need to programmatically interact with data and use return values from queries to identify the key or keys you need in a package (or the index position of a list). When you don't know exactly where you'll need data from a list, there is a special syntax for accessing such values. This syntax is clearly demonstrated when iterating over a package or perhaps only specific key/value pairs of a package.
Let's say we have some stooges:
<let stooges="{pkg('wise_guy' 'numbskull' 'leader' 'imposter';'Larry' 'Curly' 'Moe' 'Shemp')}"/>
Only three stooges are required to make a Three Stooges film, so one will have to go. The following code will build a list of only the keys we want, then iterate over them using the special syntax provided for accessing packages and lists with parameterized index/key information:
<letseq stooges="{pkg('wise_guy' 'numbskull' 'leader' 'imposter';'Larry' 'Curly' 'Moe' 'Shemp')}" keys="{pkg_names(@stooges)}"> <table depth="1"/> <foreach stooge_role="{lst_take(@keys;3)}" tally_="@i"> <willbe name="{@stooge_role}" value="'{@stooges.{@stooge_role}}'"/> </foreach> </letseq>
will iterate over the values in the package stooges
by iterating over
the values in the list stooge_role
.
Placing Package Values into Tables
Whole lists and packages can be stored in tables as values in individual cells. Within
the context of block constructs like <set>
and <let>
(as well as <block>
itself), placing a variable
reference to a list or package value will create a column of that value:
<base label="test" table="default.lonely"/> <set package="{pkg('a' 'b' 'c';1 2 3)}"/> <willbe name="test" value="{@package}"/>
will product a single-cell table containing the value of @package
. Like
the models that are produced by 1010data's Machine-Learning functions, package values that
are stored in tables may be accessed using the param(M;P;I)
function, as follows:
<base label="test" table="default.lonely"/> <set package="{pkg('a' 'b' 'c';1 2 3)}"/> <willbe name="test" value="{@package}"/> <willbe name="test2" value="param(test;'b';)"/>