Query Parameters#

The 1010data platforms supports queries that can be “parameterized” to some extent, to make them more general and more generally usable. So one could write an XML query that computes some statistics given, say, an individual store-ID out of a huge table of sales data, and then call the same query with different store-IDs at need, without having to edit the query. TenFrame incorporates this functionality, so you can develop TenFrames (which are, after all, really 1010data queries) with parameters in them, and then save them to the 1010data object tree and perhaps call them via other interfaces.

It should be noted that this feature’s usefulness is somewhat limited in the face of how much you can use and control parameters in python itself, and how much more simply you can do so. You can easily define a function that takes a store-ID as a parameter and builds a TenFrame that computes statistics for it, with no need for the kind of parameterization being described here. This feature is more for developing queries that can be called or used from other interfaces, perhaps the 1010data web GUI in the macro-language workshop, or even via simple calls to the web API. So don’t be discouraged and think you have to understand this feature in order to do parameterized things in tenFrame! Most of the time you can just do normal things in python the way you would be used to doing them with pandas. This is an advanced topic for making parameterized queries you can save to the server.

The point of tenFrame is to prevent you, the user, from having to learn our XML language in order to do things, but unfortunately at least to understand what is happening, you will need to read up at least a little on how parameters work in macro-language. A look at this page should at least give you some idea.

Param Objects#

As you perform tenFrame operations on a TenFrame and its columns, the operations you do become a 1010data query in our XML macro-language. We’ve already seen, for example, how a pandas-like selection like fr[fr.storeid == 1023] becomes a 1010data XML operator like <sel value="storeid = 1023"/>. So you can see how also the “parameters” of your operations become incorporated into the query: the “1023” that you gave as the value to select on obviously has to become part of the query as well, and it does.

Query parameters work by replacing these values that you use with special Param objects, which cause the resulting query to be built in such a way that they remain parameters which can be set using 1010data’s ordinary query parameter machinery, as described in the page linked above. We’ve seen how

sales = tf.TenFrame(session, "retaildemo.retail.sales_detail")
sales = sales[sales.storeid == 1023]
sales.printprettyXML()

results in the XML

<?xml version="1.0" ?>
<macro>
        <base table="retaildemo.retail.sales_detail"/>
        <sel value="(storeid) = (1023)"/>
</macro>

which should make a certain amount of sense even without knowing our XML language: we specify the base table, and then we do a selection on it. If we want to parameterize the ID we search for, we just have to substitute a Param object in its place:

sales = tf.TenFrame(None, "retaildemo.retail.sales_detail")
sales = sales[sales.storeid == sales.Param("store_id", 1023)]
sales.printprettyXML()

which results in the XML

<?xml version="1.0" ?>
<macro>
        <library>
                <block name="block" store_id="1023">
                        <sel value="(storeid) = ({@store_id})"/>
                </block>
        </library>
        <base table="retaildemo.retail.sales_detail"/>
        <call block="block" store_id="1023"/>
</macro>

This isn’t really as mystifying as it might look, once you read some of the basics of 1010data blocks. The “body” of the query, that is, everything after setting the base table (which in our case is just a selection operation) has been removed to a <block> (named "block") inside the query’s <library>, and replaced by a call to invoke that block with the supplied value for the variable store_id. Inside the block, the variable is surrounded by @{ and }, as explained in the 1010data documentation, to make it expand to its value.

Constructing the Param object is done with frame.Param(). Note that this is called on an existing TenFrame: parameters are always part of some TenFrame and should not be considered to exist independently of them. (There is an exception to this which we’ll see below, for the anomalous but very common case in which the base table name is itself a parameter. But it’s true for the most part.) Even though .Param() is a method, it’s spelled with a capital letter, because it’s acting like a constructor, creating a Param object that’s specific to this TenFrame.

To construct a parameter, you must supply a name and a default value. The default value is necessary for several reasons, but the one that is most relevant to you, the user, is because this way your TenFrame will run normally, with the supplied default value, and you can continue to work with it and develop it just as you would any other query (TenFrame). (The default value is not strictly necessary; it defaults to the empty string "", but that is likely to cause you only trouble if you rely on it.)

The actual signature of the .Param method looks like this:

  • Param(self, name, value=””, *, quoted=None, separator=’,’)

The keyword-only parameters, quoted= and separator=, have to do with some of the finer points of parameter-handling, which we will have to look at.

Types of Parameter#

Parameters come in several types, which are mostly easy to distinguish by the type you pass in as the default value when creating it:

  1. numeric

  2. string (or column, that is, TenSeries)

  3. list, which may be one of a. list of numeric b. list of string (or column, that is, TenSeries)

So tenFrame examines the default value you pass in on construction and determines which of these types you want. If you pass in an int or a float value, you are making a numeric parameter. If you pass in a string value, you are making a string parameter. If you pass in a list, then the type of list is determined by the type of its first element. It turns out that a list of mixed numbers and strings does not really make much sense as a parameter (there aren’t any constructions in the 1010data macro language that you would be using that can handle such a mixed list), so a list only has one type. If you pass in an empty list, it is considered to be a list of strings.

String parameters are by default quoted, i.e. they appear as string literals when the macro-code is expanded before it is interpreted, with quotes around them (supplied by the 1010data qv(X) function, which handles corner-cases such as strings containing quotes). To disable this and use the strings unquoted (for example, if they are column names), you can specify quoted=False in the .Param() call. If you use a TenSeries (a column), it is considered to be a string (the column name) but without quoting (i.e., quoted=False without you having to say so.)

Numeric parameters are not quoted (unless you say quoted=True for some reason).

List parameters are converted into strings separated by the separator character (default ",") because that’s how 1010data language handles lists. You can use another character if you expect your values to contain commas, but do not use the single-quote character ' as your separator character! Lists of strings (i.e., quoted strings and not column names) are subjected to the 1010data str_to_lst(S;D) function, which splits the list on the given delimiter and considers its constituents to be literals. Lists of strings with quoted=False (likewise lists of TenSeries, which become their column names) are not so wrapped and thus remain effectively unquoted. Lists of numeric values (as determined by the first member, as mentioned above) are not quoted either.

Base Table as Parameter#

There is one exception case that is also quite common and needs to be mentioned: the case wherein the base table name is itself a parameter. The problem is that you need the base table name to construct the TenFrame, and you need the TenFrame to call the .Param() method on. The solution is simply to use |tf.that|__ to create the parameter and pass that in to the TenFrame constructor:

frm = tf.TenFrame(session, tf.that.Param("basetable", "retaildemo.retail.sales_detail"))

This is considered a special case, and you don’t need to specify quoted=False, and also in this case the <base> operator will be moved into the <block>, whereas normally it isn’t.

Param Everything!#

Here’s an example of a query that parameterizes practically everything:

b = tf.TenFrame(session, tf.that.Param("basetable", "retaildemo.retail.sales_detail"))
b = b.merge(b.Param("prodtable", "retaildemo.retail.products", quoted=False), on=b.Param("linkcol", "sku", quoted=False))
datecol = b.Param("datecol", "date", quoted=False)
storecol = b.Param("storecol", "store", quoted=False)
col2sum = b.Param("col2sum", "xsales", quoted=False)
b = b[(b[datecol].yrmo() == b.Param("yrmo", 201212)) & (b[storecol] == b.Param("store", 728))]
b = b.groupby(b.Param("grp", "category", quoted=False))[col2sum].sum()

It winds up with nine parameters, allowing the caller to customize it many ways:

basetable

The name of the sales detail table to start with. Defaults to retaildemo.retail.sales_detail.

prodtable

The name of the table which holds product information, which has to be merged with sales detail table in order to provide information on the products. Defaults to retaildemo.retail.products.

linkcol

The name of the column on which to join the basetable and the prodtable. Defaults to sku.

datecol

The name of the column containing the date of transactions. Defaults to date.

storecol

The name of the column containing store numbers. Defaults to store.

col2sum

The name of the column which is to be summed up by the query. Defaults to xsales.

yrmo

The six-digit integer specifying the month (YYYYMM) we want to look at and limit our query to. Defaults to 201212.

store

The store number we want to look at. Defaults to 728.

grp

The name of the column we want to group by when doing the sum. Defaults to category.

The yrmo and store parameters are integers, and all the rest are strings. In this case, since we are using the parameters for the names of tables and columns and such, and not as string constants, it turns out that we want quoted=False for all of them (though we don’t have to specify it for basetable, since that’s a special case.)

The XML generated looks like this:

<macro>
  <library>
    <block name="block" basetable="retaildemo.retail.sales_detail" prodtable="retaildemo.retail.products" linkcol="sku" datecol="date" storecol="store" col2sum="xsales" yrmo="201212" store="728" grp="category">
      <base table="{@basetable}"/>
      <link table2="{@prodtable}" col="{@linkcol}" type="exact"/>
      <sel value="(((yrmo(({@datecol}))) = ({@yrmo})) & (({@storecol}) = ({@store})))"/>
      <tabu breaks="{@grp}">
        <tcol name="{@col2sum}_sum" source="{@col2sum}" fun="sum"/>
      </tabu>
    </block>
  </library>
  <call block="block" basetable="retaildemo.retail.sales_detail" prodtable="retaildemo.retail.products" linkcol="sku" datecol="date" storecol="store" col2sum="xsales" yrmo="201212" store="728" grp="category"/>
</macro>