Scalar expressions in 1010data

Scalar expressions allow for the storage and access of individual variables and values in 1010data.

Traditionally, 1010data is a vector processing system. In other words, it has always been good at dealing with columns of values of the same type. With the development of higher-level tools such as blocks and QuickApps, scalar expressions have become much more important for those both developing queries and designing front-end interfaces. This guide is meant to serve as an introduction to the topic of scalar values and expressions.

Traditional Syntax

Before blocks came along, the typical way to write an expression in 1010data was to perform an operation on an entire column. For instance, to select all the rows in a table or worksheet that are associated with a specific column's value, the expression would be written as:

<sel value="store=1"/>

In the example above, store is the name of a column in a table, and 1 is the value to which the operation will compare every value in the column. The operation above selects only rows in the table where the store column has the value of 1. This is probably the most basic expression in 1010data. Of course, other comparison operators could also be used:

<sel value="store<>1"/>

will select all the stores in the table that are not equal to 1.

Computed columns operate in a very similar fashion; however, the result would be a new column, which has to have a name:

<willbe name="new_column" value="column1+column2"/>

where the new column will be the sum of columns column1 and column2. This made querying data in the system very straightforward, but also had several drawbacks. For long and/or complex queries, scalability and maintainability were lacking.

Variable Assignment and Retrieval

Blocks made it possible for 1010data queries to be parameterized. A value could be assigned to a variable once and called as many times as necessary. This meant if a value was used multiple times in a query, it would only need to be changed in a single place, which made Macro Language code much easier to maintain. Here's the first example as a block:

Works!

<block name="my_block" col="store" val="1">
    <sel value="{@col}={@val}"/>
</block>

The code above translates identically to <sel value="store=1"/>. To 1010data, the two are identical. But to the person writing the query, the comparison values can be changed in the opening tag of the block, and the change will propagate to every instance of the variable.

The syntax here is straightforward. A variable is created, and its value is assigned in the opening tag of the block. In this example, two variables are created and given values: col="store" and value="1". There are now two variables whose values can be retrieved by variable name.

The syntax for retrieving a value is also easy to spot. Scalar values are always retrieved by enclosing the variable name in curly braces ({}). Inside the curly braces, the variable name is preceded with an @ symbol.

Notice in the example above that each variable is enclosed in its own set of curly braces. This is because each variable resolves to its own unique value. This is an important rule. Anything enclosed in curly braces must resolve to a single value. This will become much more interesting when list-values come into play. For now, keep in mind that the following will not work:

Does not work!

<block name="my_block" col="store" val="1">
    <sel value="{@col=@val}"/>
</block>

The above does not result in a single scalar value being returned. It results in a comparison operation that returns some unknown number of rows when evaluated by the system. Remember, scalar values must resolve to a single value.

Data Types in Scalar Expressions

As with all 1010data Macro Language queries, the data type for values must be considered when working with scalar expressions. For instance, when comparing a value or column to a string value, the string (or variable that resolves to a string) must be enclosed in single quotes. This is not true of integer or decimal values. Take our original example:

<sel value="store=1"/>

Notice above that the value to which store is being compared, 1, need not be enclosed in single quotes. However, if a comparison is made between a column's value and a text value (for instance, a state such as Alaska), the following syntax applies:

<sel value="state='AK'"/>

Notice that the text value needs to be enclosed in single quotes for the query to work, whereas the integer value in the previous query does not. The same rule applies to referencing variables in scalar expressions. If the underlying value of the variable is text, then the expression must be enclosed in single quotes:

Works!

<note type="base">Applied to table: pub.fin.fred2.bls.smsu</note>
<block column_name="state" state_code="AK">
  <sel value="{@column_name}='{@state_code}'"/>
</block>

Works!

<note type="base">Applied to table: pub.fin.fred2.bls.smsu</note>
<block column_name="state" state_code="'AK'">
  <sel value="{@column_name}={@state_code}"/>
</block>

The code above illustrates the use of single quotes outside the curly braces to ensure that strings are interpreted by the system properly.

Does not work!

<note type="base">Applied to table: pub.fin.fred2.bls.smsu</note>
<block column_name="state" state_code="AK">
  <sel value="{@column_name}={@state_code}"/>
</block>

Does not work!

<note type="base">Applied to table: pub.fin.fred2.bls.smsu</note>
<block column_name="state" state_code="AK">
  <sel value="{@column_name}={'@state_code'}"/>
</block>

When working with variables that could potentially hold values of multiple types, it is best practice to enclose the entire expression in single quotes. This will "cast" any non-text types to text and prevent type errors.

Functions that Accept and Return Scalar Values

Traditional 1010data functions, such as g_functions, typically accept and return vectors. In other words, the function typically accepts some column as an input and does something to every value of the column. For example, g_avg(G;S;X) accepts one or more columns for grouping, an optional selection column, and a column on which to perform the operation, which in this case is to find the average:

<note type="base">Applied to table: pub.demo.weather.wunderground.observed_hourly</note>
<willbe name="avg_tmp" value="g_avg(zipcode date;;tempi)"/>
<sel value="g_first1(zipcode date;;)"/>

The resulting column from the code above will find the average mean temperature for all combinations of zipcode/date in the table. Both g_functions operate across all the values in each column provided as an argument.

With blocks and QuickApps, many new functions have been added to the system that accept and return scalar values instead of vector values. These functions are ideal for assigning values to variables declared in <block> and <dynamic> code. They follow the same syntactical rules as creating and referencing variables in the scalar context.

As an example, take a function that works in either the scalar or vector context: csl_len(X). This function returns the length of a comma-separated list. If it is passed a column containing comma-separated lists, it will return the value for each in the vector context:

<let>
  <table cols="list">"list,of,separated,values";"another,list,of,values,that,is,of,different,length"
  </table>
  <col name="list" format="width:50"/>
  <willbe name="list_len" value="csl_len(list)"/>
</let>

The code above will produce a new column that gives the number of elements in each of the two comma-separated lists. However, this function will also work in the scalar context:

Works!

<letseq delimited_list="yet,another,list,of,values" list_len="{csl_len(@delimited_list)}">
 <willbe name="scalar_list_length" value="{@list_len}"/>
</letseq>

The code above can be written with identical results as follows:

Works!

<let delimited_list="yet,another,list,of,values">
 <willbe name="scalar_list_length" value="{csl_len(@delimited_list)}"/>
</let>

Notice that in both examples above, the function call is nested immediately inside the curly braces. The variable is referenced with the @ symbol, but does not need to be encased directly within curly braces. The curly braces represent the entire scalar expression, not simply the variable reference. The following will not work:

Does not work!

<let delimited_list="yet,another,list,of,values">
 <willbe name="scalar_list_length" value="csl_len({@delimited_list})"/>
</let>

Many functions in 1010data will work in either the scalar or vector context. This is annotated in the 1010data function reference topics with language to the following effect for a given parameter: "Accepts a value or column name containing values..."

1010data now provides functions that are specific to scalar expressions. Most notable are: lst(X) and pkg(K;V). Both of these functions, as well as the associated functions for manipulating their values, can only be used in the scalar context. These functions must accept and return scalar values called list-values. List-values contain indexed collections of either values or key/value pairs.

For example, here is a basic list:

<note type="base">Applied to table: pub.demo.retail.item</note>
<let my_list="{lst(1 2 3)}">
    <sel value="store={@my_list.1}"/>
</let>

Lists allow for cleaner, more readable code because they allow for the storage of many values in a single variable. Without lists, each value would require its own variable to hold it.

Since lists are indexed, it's also easy to iterate over them using the <foreach> construct without the need to reference index values:

<let ideas="{3.14159, {'a', 'e', 'i' 'o' 'u'}, 'The course of true love never did run smooth'}">
  <foreach important_list="{@ideas}" tally_="@i">
    <willbe name="important_stuff{@i}" value="'{@important_list}'"/>
  </foreach>
</let>

The code above uses the implicit list syntax to build a list of important ideas. Notice that the vowels listed above are placed in a list of their own, which is nested inside the larger list. The <foreach> construct is able to iterate over each value in @important_list without the need to reference an index position, as would be necessary in a <willbe> operation or <for> loop. The list-value (that's the vowels) contained within the list will be automatically converted to a comma-separated string.

Finally, take note that the list-variable provided to the value attribute is enclosed in single quotes outside the curly braces. Since the list contains mixed value types, this will ensure all types are cast as strings and will not cause errors.

Packages are similar to lists. Packages also provide a way to build indexed variables containing multiple values. However, packages allow for "keys" to be provided, making them similar to a dictionary or associative array in other languages. Referencing package values in scalar expressions is very similar to accessing values in lists:

<let ideas="{pkg('pi' 'vowels' 'true_love';3.14159 {'a', 'e', 'i', 'o', 'u'} 'The course of true love never did run smooth')}">
  <table depth="1"/>
  <willbe name="lysander" value="'{@ideas.true_love}'"/>
</let>

where lysander takes the place of the index position in the aforementioned list syntax.

Notice again that packages can contain lists. To access the values of the list of vowels, use the index position of the item after identifying the list by its package key:

<let ideas="{pkg('pi' 'vowels' 'true_love';3.14159, {'a', 'e', 'i', 'o', 'u'} 'The course of true love never did run smooth')}">
  <table depth="1"/>
  <willbe name="vowel" value="'{@ideas.vowels.3}'"/>
</let>

will locate the value: 'i'.

Parameterizing Keys and Indexes

It is often the case that you will need to programmatically interact with data and use return values from queries to identify the key or keys you need in a package (or the index position of a list). When you don't know exactly where you'll need data from a list, there is a special syntax for accessing such values. This syntax is clearly demonstrated when iterating over a package or perhaps only specific key/value pairs of a package.

Let's say we have some stooges:

<let stooges="{pkg('wise_guy' 'numbskull' 'leader' 'imposter';'Larry' 'Curly' 'Moe' 'Shemp')}"/>

Only three stooges are required to make a Three Stooges film, so one will have to go. The following code will build a list of only the keys we want, then iterate over them using the special syntax provided for accessing packages and lists with parameterized index/key information:

<letseq stooges="{pkg('wise_guy' 'numbskull' 'leader' 'imposter';'Larry' 'Curly' 'Moe' 'Shemp')}" keys="{pkg_names(@stooges)}">
  <table depth="1"/>
  <foreach stooge_role="{lst_take(@keys;3)}" tally_="@i">
    <willbe name="{@stooge_role}" value="'{@stooges.{@stooge_role}}'"/>
  </foreach>
</letseq>

will iterate over the values in the package stooges by iterating over the values in the list stooge_role.

Placing Package Values into Tables

Whole lists and packages can be stored in tables as values in individual cells. Within the context of block constructs like <set> and <let> (as well as <block> itself), placing a variable reference to a list or package value will create a column of that value:

<base label="test" table="default.lonely"/>
<set package="{pkg('a' 'b' 'c';1 2 3)}"/>
<willbe name="test" value="{@package}"/>

will product a single-cell table containing the value of @package. Like the models that are produced by 1010data's Machine-Learning functions, package values that are stored in tables may be accessed using the param(M;P;I) function, as follows:

<base label="test" table="default.lonely"/>
<set package="{pkg('a' 'b' 'c';1 2 3)}"/>
<willbe name="test" value="{@package}"/>
<willbe name="test2" value="param(test;'b';)"/>