py1010 reference#

py1010: 1010data Insights Platform’s Python SDK#

py1010 lets you make 1010data calls from within Python and inspect the results in a familiar, Python-like way. To create a new session and log in, you can do something like:

>>> import py1010
>>> session=py1010.Session("http://www2.1010data.com/prod-latest/gw",
                           "USERNAME", "PASSWORD", py1010.POSSESS)

The last parameter is for the login type, which determines what happens when the application establishes a connection to the Insights Platform and another session is currently active. The login types are: py1010.KILL, py1010.NOKILL, or py1010.POSSESS.

For a group login, supply a group name after the login type parameter (or with the group= keyword). py1010 acquires a userid and logs in.

You can create queries using the XML query language.

>>> query=session.query("pub.demo.weather.hourly90",
                        '<sort cols="date,hour,id"/>')
>>> query.run()

The columns from the query are available as items or attributes of the query, and the contents of each column are available by subscripting:

>>> query['date'][0], query['temp'][0]
(datetime.date(1990, 1, 1), -12.2)

Insights Platform “date” data is automatically converted to Python datetime.date objects.

Most of the CSDK API is exposed through py1010; you may need to refer to the CSDK documentation for further explanation of some of the calls. Generally, in cases where the API function would return a non-zero value to indicate an error, py1010 will throw a TentenException object.

In the docstrings of this module, an asterisk “(*)” generally means that the documented method may cause network access.

Many of the parameters and return values of the methods will be “bytes” objects and not strings, because 1010data treats data as plain bytes and not as encoded strings, but see below.

Data Conversion#

When py1010 retrieves data from a 1010data database, by default it performs some conversions before presenting the data to you. The default conversions, which are usually what you want, are described below, but they can also be overridden by user-specified conversion functions, giving you full control over how data is converted. For example, you may want to say:

py1010.globalConverters.aConvertFunc = py1010.bytes2string

Integer Columns#

Integer columns in 1010data can include things that aren’t strictly integers from Python’s point of view, like infinities and NaN (NA in the dataset). Also, dates are stored as integers in 1010data, and so appear as integer columns.

By default, an integer NA, infinity, or negative infinity is converted into float('nan'), float('inf'), or float('-inf'), respectively. For other values, if the format_type of the column is date, date4y, or ansidate, the value is converted into a Python date object, and if the format_type is hm, the value is converted into a Python time object. Otherwise it is returned as an integer. Note that these conversions may raise exceptions, for example if the integer would represent an invalid date.
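
As a rough illustration of the date and time branch, the sketch below converts integers to Python objects outside of py1010. It assumes that a date is encoded as a YYYYMMDD integer and an hm time as an HHMM integer; these encodings are assumptions made for the example. The helper names int_to_date and int_to_time are invented here and are not part of py1010.

```python
import datetime


def int_to_date(value):
    """Convert an integer assumed to be YYYYMMDD into a datetime.date."""
    year, rest = divmod(value, 10000)
    month, day = divmod(rest, 100)
    # datetime.date raises ValueError for impossible dates such as month 13.
    return datetime.date(year, month, day)


def int_to_time(value):
    """Convert an integer assumed to be HHMM into a datetime.time."""
    hour, minute = divmod(value, 100)
    # datetime.time raises ValueError for out-of-range hours or minutes.
    return datetime.time(hour, minute)
```

Under these assumptions, an integer such as 19901301 raises ValueError, which matches the note above that the conversions may raise exceptions.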

Long integer columns (K-type “j”) work the same way, except they are never converted into date or time objects.

Float Columns#

In float columns, NA, infinity, and negative infinity are presented as float('nan'), float('inf'), and float('-inf'), as you might expect. This isn’t a conversion; that is how they are stored to begin with.

But float values are also used for datetime values, so if the format_type of the column is datehms24 or ansidatetime, the value is converted into a Python datetime object (and the conversion may raise an exception).

String Columns#

String values are normally returned as bytes, because 1010data stores them that way, as a sequence of bytes, which might not be valid Unicode characters or might have an unknown encoding. If the format_type of the column is json, then the value is converted into a string and then interpreted as JSON, returning a Python object of the appropriate type (usually a list or dictionary). The conversion to string or decoding of JSON may raise exceptions.

If you want the data to be returned as strings and not as bytes, you can set the conversion function of the column or the query (see below) to the supplied function py1010.bytes2string(), which attempts to decode the bytes as UTF-8 data, raising a UnicodeDecodeError on failure. Or you may write your own conversion function, perhaps to handle errors differently, etc.
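
A custom conversion function that handles errors differently might look like the following minimal sketch. The name lenient_bytes2string is hypothetical and not part of py1010. Instead of raising UnicodeDecodeError, it replaces undecodable bytes with the Unicode replacement character.

```python
def lenient_bytes2string(val, col=None):
    """Decode bytes as UTF-8, substituting U+FFFD for undecodable bytes.

    The optional col argument is accepted but unused, so the function
    can be called with either one or two arguments.
    """
    if isinstance(val, bytes):
        return val.decode("utf-8", errors="replace")
    # Anything that is not bytes is passed through unchanged.
    return val
```

Such a function could then be assigned wherever py1010.bytes2string would be, for example to py1010.globalConverters.aConvertFunc.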

Conversion Functions#

If these default conversions do not suit your purposes, or perhaps you need to handle exceptions at conversion-time, you can use your own conversion functions. There are several places the converters may be set, so they can be given a broad scope (inherited by many columns with only one setting) or a narrow one, and a narrow scope can be excepted from a broader one by setting the converter to the string "default".

The order of resolution is as follows:
  1. Use the converter set on the Column in question, if any, unless it is the string "default", in which case do the default conversions as described above.

  2. If that is None, use the converter set on the Query to which the column belongs, unless it is the string "default", in which case do the default conversions.

  3. If that is None, use the converter set on the Session to which the query belongs, unless it is the string "default", in which case do the default conversions.

  4. If that is None, look up the converter in py1010.globalConverters.

  5. If that is None (or "default"), do the default conversion as described above.
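
The resolution order above can be modelled in plain Python as follows. This is only a sketch of the documented lookup rules, not py1010's actual implementation. The function name resolve_converter and its four arguments are invented for the example, and a return value of None stands for "apply the default conversions".

```python
DEFAULT = "default"


def resolve_converter(column_conv, query_conv, session_conv, global_conv):
    """Return the first converter that is set, or None for the default conversion."""
    for candidate in (column_conv, query_conv, session_conv, global_conv):
        if candidate is None:
            continue  # nothing set at this level; look at the next, broader one
        if isinstance(candidate, str) and candidate == DEFAULT:
            return None  # explicit opt-out: stop looking and use the defaults
        return candidate
    return None  # nothing set anywhere: use the defaults
```

Note how a narrow scope set to "default" stops the search, so a converter set on the query or session is not consulted for that column.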

There are four conversion functions at each step, one for each type of column. So the Column, Query, and Session objects each have attributes called iConvertFunc, jConvertFunc, fConvertFunc, and aConvertFunc, for converting integers, long integers, floats, and strings, respectively. So does the module variable globalConverters, which acts as the top-level, final lookup.

The resulting function should be able to accept one or two arguments, viz. the value to be converted and optionally the column object (since such a formatter may want to vary its output depending on, for example, the contents of the column’s .format_dict).
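
A converter that accepts either one or two arguments might be sketched as below. The function round_to_format and the stand-in class FakeColumn are hypothetical and exist only for this illustration. The sketch assumes format_dict may carry a 'dec' entry, as described for the Column.dec property.

```python
class FakeColumn:
    """Stand-in for a py1010 Column; only format_dict is modelled."""

    def __init__(self, format_dict):
        self.format_dict = format_dict


def round_to_format(val, col=None):
    """Round a float to the column's 'dec' setting when a column is supplied."""
    if col is None:
        return val  # called with one argument: nothing to consult
    dec = int(col.format_dict.get("dec", -1))
    # A negative dec means the format does not specify decimal places.
    return round(val, dec) if dec >= 0 else val
```

Giving the second parameter a default value is one simple way to satisfy the "one or two arguments" requirement.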

exception py1010.DBMError#

Exception raised when a DBM transaction fails.

exception py1010.TentenClosedSessionException#

Exception for trying to access a session that has been closed.

exception py1010.TentenException#

exception py1010.TentenNotLastException#

Exception when something is done that would affect the “last” query, but the calling query is not that query.

exception py1010.TentenNotRunException#

Exception for trying to access information about a query that has not been run yet.

exception py1010.TentenPoolFullException#

Exception for failing to log in using a login pool, after the user-specified number of retries has been exhausted.

exception py1010.TentenTransactionException#

Exception raised when something fails in creating or running a transaction.

py1010.AWSKey#

alias of CloudKey

class py1010.AddtableStatus(ptr=None)#

Status of an ongoing (asynchronous) Addtable transaction.

The status property is given in terms of the AddtableStatus.Status enumeration, viz:

AddtableStatus.Status.IDLE          (0)
AddtableStatus.Status.FAILED        (1)
AddtableStatus.Status.INITIALIZING  (2)
AddtableStatus.Status.LOADING       (3)
AddtableStatus.Status.COMPLETED     (4)
AddtableStatus.Status.DIAGNOSED     (5)
Variables:
  • tablepath – Name of table being uploaded to

  • status – Status of transaction (see above)

  • totalRecs – Number of records uploaded so far (or total)

  • expectedBytes – Number of bytes expected, if known

  • filteredBytes – Bytes filtered out by autocorrection

  • skippedBytes – Bytes skipped (not filtered)

  • message – Status message from server

  • targets – Location (a SourceInfo object) where filtered bytes are written

class Status(value)#

An enumeration.

expectedBytes#

Expected bytes to be uploaded.

filteredBytes#

Number of bytes filtered out by autocorrection.

message#

Status message returned by server.

skippedBytes#

Number of bytes skipped.

status#

Status of the transaction:

AddtableStatus.Status.IDLE          (0)
AddtableStatus.Status.FAILED        (1)
AddtableStatus.Status.INITIALIZING  (2)
AddtableStatus.Status.LOADING       (3)
AddtableStatus.Status.COMPLETED     (4)
AddtableStatus.Status.DIAGNOSED     (5)

tablepath#

Full path name of table being uploaded.

targets#

SourceInfo object describing file where filtered bytes are written.

totalBytes#

Total number of bytes uploaded (so far).

totalRecs#

Total number of records uploaded (so far).

class py1010.BaseQuery(table, xml)#

Base class for Query objects, not bound to sessions.

These objects basically contain only table and ops information, and cannot be run (as they have no session associated with them). They are used in pool-warming.

Variables:
  • table – Name of the base table of this query

  • xml – The XML of the operations for this query.

class py1010.CloudKey(name, key, id_=None, region=None, keytype=TBL_SOURCE_S3, ptr=None, *)#

Class holding AWS/ABS/etc keys, wrapping CSDK AWSKEY structs.

Constructor for CloudKey.

Parameters:
  • name – A string (or bytes): the name by which the API will refer to this key.

  • key – A string/bytes containing the “secret” key.

  • id_ – A string/bytes containing the AWS id (AWS only).

  • region – A string/bytes containing the AWS region (AWS only).

  • keytype – The type of key. Uses constants from SourceInfo.SrcType class (S3, ABS, GCP)

  • ptr – For internal use only.

id#

The AWS key id.

key#

The “secret” AWS key.

keytype#

The type of key (int)

name#

The name by which the system knows this key.

region#

The AWS region for this key.

class py1010.Column(query, magic)#

Represents a tenten column. This is created by Query objects when they are run(). Column objects can be accessed as lists (though potentially very large ones) by subscripting (including ranges) or using iterators. Data is automatically fetched as needed in ‘windows’ of size win_size (only one window is in memory at a given time for each column).

Variables:
  • name – Column name

  • type – Column type (‘a’, ‘i’, ‘j’, or ‘f’)

  • format – Format string

  • nrows – Number of rows in query

  • win_size – The win_size of the query associated with this column

  • colwin_start – The start of the current window of data for this column

  • colwin_end – The end of the current window of data for this column

  • ktype (int) – K type of the column

  • title – Column title

  • width (int) – Width field of format string (0 if not given)

  • format_dict – The format of the column as a dictionary

  • format_type – Type field of format string (None if not given)

  • dec (int) – Dec field of format string (-1 if not given)

  • rawValues – Handle infinity and NA values in integer columns, as described for the rawValues property of the Query class. If None (the default), use the value set for this column’s Query. Otherwise, override it for this column.

classmethod defaultAConv(cls, val, col, raw=False)#

Default string converter.

The function that py1010 uses to convert string columns if there is no user-specified converter.

If the column’s format indicates that it is a JSON value, attempt to render the json into a python object. If an exception is raised, re-raise it to the caller, unless the raw parameter is set to "safe", in which case just return the string value. Otherwise, return the value.

Parameters:
  • val – The value being converted.

  • col – The column containing the value.

  • raw – How to handle date conversion exceptions.

Returns:

A string or converted JSON value, as described above.

classmethod defaultFConv(cls, val, col, raw=False)#

Default float converter.

The function that py1010 uses to convert floating-point columns if there is no user-specified converter.

If the column’s format indicates that it is a timestamp (datehms24, ansidatetime), attempt to convert it to a datetime.datetime object. If an exception is raised, re-raise it to the caller, unless the raw parameter is set to "safe", in which case just return the float value. Otherwise, return the value.

Parameters:
  • val – The value being converted.

  • col – The column containing the value.

  • raw – How to handle date conversion exceptions.

Returns:

A float or datetime value, as described above.

classmethod defaultIConv(cls, val, col, raw=False)#

Default integer converter.

The function that py1010 uses to convert integer columns if there is no user-specified converter.

  1. If the value is K’s integer NA or integer plus or minus infinity, return the K integer value only if the column’s .rawValues attribute is True (or if it is None and the query’s .rawValues attribute is True). Otherwise, return float values for NaN or infinity.

  2. If the column’s format indicates that it is some kind of date or time, attempt to convert it to a datetime.date or datetime.time object. If an exception is raised, re-raise it to the caller, unless the raw parameter is set to "safe", in which case just return the integer value.

  3. Otherwise, return the value.

Parameters:
  • val – The value being converted.

  • col – The column containing the value.

  • raw – How to handle date conversion exceptions.

Returns:

An integer, date, time, or float value, as described above.

classmethod defaultJConv(cls, val, col, raw=False)#

Default long integer converter.

The function that py1010 uses to convert long integer columns if there is no user-specified converter.

If the value is K’s integer NA or integer plus or minus infinity, return the K integer value only if the column’s .rawValues attribute is True (or if it is None and the query’s .rawValues attribute is True); otherwise, return float values for NaN or infinity. For any other value, return the value unchanged.

Parameters:
  • val – The value being converted.

  • col – The column containing the value.

  • raw – How to handle date conversion exceptions.

Returns:

An integer or float value, as described above.

fetchrows(self, starting=None)#

Fetch a window of rows. Users should not need to call this. (*)

format_value(self, index)#

Format the value at the given index. (*)

Returns:

A string with the column’s value at the given index, formatted according to the column’s formatting, for dates and times.

Return type:

str

get(self, n, raw=False)#

The value of the column at the given row. (*)

If the row is not currently available for this column, it is fetched. The parameter raw controls translating dates, infinities, model columns, etc. If raw=False (default), values are converted to Python objects. If raw=True, no conversion is done. If raw="safe", conversion is done, unless the conversion results in an exception, in which case the raw value is returned. Note that self.rawValues is respected when raw is not True. Use raw=True to override self.rawValues for this particular value.
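
The three settings of raw can be modelled with the short sketch below. It is an illustration of the documented behaviour only. The function get_value is hypothetical, and the sketch does not model self.rawValues or window fetching.

```python
def get_value(raw_value, converter, raw=False):
    """Apply converter to raw_value according to the raw setting."""
    if raw is True:
        return raw_value  # raw=True: no conversion at all
    try:
        return converter(raw_value)
    except Exception:
        if raw == "safe":
            return raw_value  # raw="safe": fall back to the unconverted value
        raise  # raw=False: let the caller see the conversion error
```

With raw="safe", a value that cannot be converted (for instance an integer that is not a valid date) comes back unchanged instead of raising.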

getaConv(self)#

Get the text “conversion function” in use for this column.

The order of resolution is:
  1. self.aConvertFunc

  2. self.query.aConvertFunc

  3. self.query.session.aConvertFunc

  4. py1010.globalConverters.aConvertFunc

If it all resolves to None, use the default.

getfConv(self)#

Get the float “conversion function” in use for this column.

The order of resolution is:
  1. self.fConvertFunc

  2. self.query.fConvertFunc

  3. self.query.session.fConvertFunc

  4. py1010.globalConverters.fConvertFunc

If it all resolves to None, use the default.

getiConv(self)#

Get the integer “conversion function” in use for this column.

The order of resolution is:
  1. self.iConvertFunc

  2. self.query.iConvertFunc

  3. self.query.session.iConvertFunc

  4. py1010.globalConverters.iConvertFunc

If it all resolves to None, use the default.

getjConv(self)#

Get the long integer “conversion function” in use for this column.

The order of resolution is:
  1. self.jConvertFunc

  2. self.query.jConvertFunc

  3. self.query.session.jConvertFunc

  4. py1010.globalConverters.jConvertFunc

If it all resolves to None, use the default.

getlocal(self, n)#

Fetch a value from the local cache only.

This is the same as column[n], except that it raises an IndexError if the value is not available in the currently-loaded window.

has(self, n)#

Is row n currently loaded?

Checks the given index within the window of values currently loaded for this column.

dec#

Decimal places specified in the format, or -1

Same as int(self.format_dict.get('dec', -1))

format#

Column format string

format_type#

Format type (self.format_dict.get('type',""))

ktype#

The K type of this column

name#

Column name

nrows#

Total number of rows

rawValues#

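How infinity and NA values in integer columns are handled. If None (the default), use the value set for this column’s Query; otherwise, override it for this column.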

title#

Column title (*)

type#

Column type (a, i, j, or f)

width#

Column format width (int(self.format_dict.get('width',0)))

win_size#

The win_size of the query associated with this column

class py1010.ColumnInfo(name, type_, title=None, desc=None, format=None, int index=Index.NOINDEX, int fix=Fix.NOFIX, ptr=None, *)#

Class for holding metadata about a column being uploaded into.

Fix#

alias of Index

class Index(value)#

An enumeration.

classmethod fromSourceCol(cls, scol)#

Translate a SourceColumnInfo into a ColumnInfo by copying over some key fields.

init(self, name, type_, title=None, desc=None, format=None, int index=Index.NOINDEX, int fix=Fix.NOFIX)#

desc#

Description of the column

format#

Format of the column.

name#

Name of the column

title#

Title of the column, displayed when the table is viewed.

type#

Type of the column (“integer”, “yyyymmdd”, etc.)

class py1010.DBMDo(self, session)#

Used to implement the dbmdo attribute of Sessions.

With an open py1010 Session object session, the expression session.dbm("endpoint", key1=value1, key2=value2,...) is exactly the same as session.dbmdo.endpoint(key1=value1, key2=value2,...).

So you can say session.dbmdo.get_user(uid="name") and so on.

There are a few exceptions: convenience functions added to this attribute.

session.dbmdo.list_tabs(path) returns a list of all the tables within the directory named by path, searching recursively.

session.dbmdo.load_tab(table_spec, sync=False) returns a tuple (path, status, msg) with the path of the table, the status code (see Session.addTableStatus()), and any message returned by the add-table command.
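
The attribute-to-endpoint equivalence described above can be illustrated with a small proxy class. This is a sketch of the general technique, not py1010's implementation. The class name EndpointProxy and the callable passed to it are assumptions made for the example.

```python
class EndpointProxy:
    """Turn attribute access into calls of the form call(endpoint, **kwargs)."""

    def __init__(self, call):
        self._call = call

    def __getattr__(self, endpoint):
        # __getattr__ only runs for names that normal lookup does not find,
        # so self._call itself is resolved without coming through here.
        def invoke(**kwargs):
            return self._call(endpoint, **kwargs)

        return invoke
```

With a proxy built around a function dbm(endpoint, **kwargs), proxy.get_user(uid="name") forwards to dbm("get_user", uid="name").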

combine_tabs(self, source_paths, dest_path)#

Call the combine_tabs endpoint for the given tables.

Parameters:
  • source_paths – List of table names

  • dest_path – Path to which to save the combined table

list_tabs(self, path)#

List the tables within a given directory.

Searches recursively all through the given path.

Parameters:

path – Path to list.

class py1010.DirEntry(session, magic)#

A wrapper for the directory entry C structure that should only be created internally.

Variables:
  • type – 0 for directory, 1 for table

  • id – id of this object

  • name – file/directory name

  • title – title of object

  • sdesc – short description

  • ldesc – long description

  • owner – owner of this object

  • update – last update timestamp

  • secure – if 1, object only accessible via SSL connection

asdict(self)#

The information on this object as a dictionary.

metaInfo(self)#

Get the MetaData for this directory entry (*)

setMetaInfo(self, MetaData meta, flags=-1)#

Set the MetaData on this directory entry (*)

id#

ID of this object

ldesc#

Long description

name#

File or directory name

owner#

The object’s owner

sdesc#

Short description

title#

Title of this object

type#

Type of entry: 0 for directory and 1 for table(?)

update#

Update timestamp, as a datetime object

class py1010.Directory(Session session, magic)#

Represents a directory in the 1010 file tree. You should not create these directly; use the fromPath() static method instead, or call directory() on a Session object.

children(self)#

Iterate over the children of this directory. (*)

Returns:

An iterator which generates the children of this directory as DirEntry objects.

Raises:

TentenException – if the transaction fails

static createDirectory(session, pathname, title=None, users=None, uploaders=None, inherit_users=False, inherit_uploaders=False)#

Create a directory in the 1010data object tree.

Parameters:
  • pathname – Full name of folder to create.

  • title – Title of the directory (default None)

  • users – List of users allowed to read the directory (default None)

  • uploaders – List of users allowed to upload to the directory (default None)

  • inherit_users – If True, inherit allowed users from parent directory (ignoring any list of users given). Default False.

  • inherit_uploaders – If True, inherit allowed uploaders from parent directory (ignoring any list of users given). Default False.

Raises:

static fromPath(session, pathname)#

Find a directory from its pathname. (*)

You must supply a valid Session.

Return type:

Directory

Raises:

TentenException – if the transaction fails

parents(self)#

Iterate over the parents of this directory (as DirEntry objects). (*)

static removeDirectory(Session session, pathname)#

Remove a directory given the pathname.

Raises:

class py1010.Holder(magic, size, ptr=None, *)#

Parent class for classes that will wrap a C struct.

Contains a StructWrapper which does the actual wrapping.

Superclass constructor for Holder.

Parameters:
  • magic – The magic number you have to know to use this.

  • size – The size of the structure to be wrapped.

  • ptr – A PyCapsule holding a pointer to a C struct to be wrapped. If None (not supplied), allocate a new one.

PTR#

A PyCapsule holding the struct’s pointer. Internal use only.

class py1010.MetaData(session, magic)#

Contains metadata for objects in the 1010 directory system.

These are returned by the DirEntry.metaInfo() method on DirEntry objects. Various attributes of the metadata object are configurable, and they can be altered and used in setMetaInfo(), which changes the metadata on the file entry. Note: changing the values on the metadata object does not alter the actual values on the server, until setMetaInfo() is called.

This class is not meant to be used directly; either use the metaInfo() method on a DirEntry object or use the fromPath() class method.
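
The fetch, modify, and write-back cycle described above might be wrapped as in the sketch below. The helper retitle is hypothetical. It also assumes that title is among the configurable attributes, which the text above does not state explicitly.

```python
def retitle(entry, new_title):
    """Change the title of a directory entry and push it to the server.

    entry is expected to behave like a DirEntry, i.e. to provide
    metaInfo() and setMetaInfo().
    """
    meta = entry.metaInfo()    # fetch the current metadata
    meta.title = new_title     # local change only; the server is untouched
    entry.setMetaInfo(meta)    # write the modified metadata back
    return meta
```

Until setMetaInfo() is called, the change exists only on the local metadata object.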

Variables:
  • id – Object’s id in 1010 system

  • path – Path-name in 1010 filetree

  • title – Object’s title

  • sdesc – Short description

  • ldesc – Long description

  • type – Item type (“REAL”, “DIR”, etc)

  • secure

  • own

  • owner – Item’s owner

  • update – Timestamp of last update

  • favorite

  • users – Users authorized to access this object

  • display

  • report

  • chart

  • link – Link description

  • rows – Number of rows (if a table)

  • bytes – Size of item (in bytes)

  • segs – Number of segments

  • tstat

  • access

  • maxdown – Maximum number of rows permitted per download

  • upload

  • uploaders

  • numchild

  • segby

  • keys

asdict(self)#

The metadata in dictionary form.

classmethod fromPath(cls, Session session, path)#

Fetch the metadata for the named table or directory. (*)

bytes#

Size of the item (in bytes)

id#

Object’s id in the 1010 system.

ldesc#

Long description.

link#

Link description.

maxdown#

Maximum number of rows permitted per download

owner#

Item’s owner

path#

Path (directories and filename)

rows#

Number of rows (for a table)

sdesc#

Short description

segs#

Number of segments

title#

Item’s title.

type#

Item type.

update#

Timestamp of item’s last update.

users#

Users who are authorized to access this object.

class py1010.OktaLogin(email, password, gusurl='http://gus.1010data.com')#

Used by Session to log in using Okta credentials.

This object can be passed to Session in the constructor, or a new Session can be created by calling getSession. A new Okta token and the list of environments available to the user account are fetched from GUS, and all the environment information is available under the environments property. To use an environment other than the default one, change the chosenEnvId property. To change the environment version, change the chosenVersion property.

Create an OktaLogin object.

Parameters:
  • email (str) – The user email

  • password (str) – User password

  • gusurl (str) – The GUS URL to fetch the Okta token from

getSession(self, logintype)#

Get a new Session initialized with this OktaLogin object.

Parameters:

logintype – KILL, NOKILL, or POSSESS

sessionLogin(self, logintype, Session session)#

class py1010.Query(session, table, xml='')#

Represents 1010data Insights Platform queries and their result sets.

Created with the query() method on Session objects. It specifies the table the query operates on and the query’s XML. Before accessing the results of a query, the query’s run() method must be executed. This creates the Column objects that hold the result set and populates the query’s coldict and cols attributes.

The columns are accessible through the query by subscripting with column names; e.g., a column named “transaction” can be accessed as query['transaction']. Columns are also accessible as attributes of the query object, so query.transaction works.

If you have names that are already used for members of the class or are Python reserved words, you must use subscripting.

Accessing some methods or properties of a Query object before that query is run raises a TentenNotRunException. The relevant properties are marked with (-) in the list below.

Variables:
  • nrows – Number of rows in the result set (-)

  • win_start – Index of starting point of last-fetched window of data (-)

  • win_end – Index of end point of last-fetched window of data (-)

  • win_size – Size of window to be downloaded (configurable)

  • title – Title of the base table of this query (-)

  • sdesc – Short description of the base table of this query (-)

  • ldesc – Long description of the base table of this query (-)

  • messagetext – Last message text from the 1010data Insights Platform session

  • table – Name of the base table of this query

  • session (Session) – The Session object for this query

  • cols – A tuple of the columns. (-)

  • coldict (dict) – A dictionary of columns indexed by name (-)

  • lastrun – A datetime object of when this query was last run.

  • rows (RowIterator) – A new row iterator object each time accessed. (-)

  • xml – The XML of the operations for this query.

  • transfermode – The transfer mode (raw, (un)compressed, etc) of the query.

  • rawValues – If True, use the raw (internal) values for infinity and NA values in integer (and 64bit-integer) columns. If False (the default), return float('inf') or float('-inf') for infinities, and float('NaN') for NA values.

Create a query with the given session, table, and XML ops. This is usually called via session.query(…); you shouldn’t need to call it directly.

colbyname(self, name)#

Find column in this query by name.

dictslice(self, start, stop, step=1)#

Dictionary of lists, a slice of the data.

Return a dictionary of lists, the dictionary indexed on the names of the columns of the Query and the lists being slices of the columns.

>>> query.dictslice(start, stop, step)

is equivalent to

>>> {k : list(v[start:stop:step]) for k, v in query.coldict.items()}

Note that this instantiates lists, not iterators, so all the data called for is downloaded and stored in memory. Make sure the data size is small enough to fit in memory when calling this. (*)

expand(self, bufsize=20000)#

Returns the expanded XML query. Optional parameter is the size of the buffer (default 20000). Use a larger buffer if it is too small. (*)

formatted_row(self, index)#

Values formatted according to column formats.

A list of the values of all the columns at the given index, each one is formatted according to the format of that column. (*)

freshen(self)#

Call tenten_Freshen() to refresh the state of the table(s). (*)

getrow(self, n)#

Row at a given index.

A list of the values in all the columns at that index. (*)

has(self, n)#

This is computed with respect to the _last_ read made for this query, regardless of the column. It probably is not the method you were looking for.

resave(self, querypath, title='', sdesc='', ldesc='', users=None, inherit_users=False, owner='')#

Replace the Quick Query with this new version. (*)

rowasdict(self, n)#

Row at given index, presented as a dictionary. (*)

rowiterator(self, *args, **kwargs)#

Iterate over the rows of this query.

Arguments may be integers or strings, which are used to specify columns of the query (as index or colbyname). If none are specified, it’s taken to mean all the columns, in index order. The iterator returns a tuple of the values of all the specified columns, in the specified order.

This iterator is distinct and separate from ordinary access to the columns, and pulls down all the named columns together, instead of making a separate call for each column, so it will be more efficient if accessing a table row by row.
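
As a usage sketch, the helper below collects the first few rows of selected columns through rowiterator(). The function head is hypothetical. It assumes only that the object returned by rowiterator() can be iterated over, yielding one tuple per row as described above.

```python
from itertools import islice


def head(query, *columns, limit=5):
    """Return up to limit rows as tuples, using the query's row iterator."""
    # islice stops after limit rows, so the rest of the table is not consumed.
    return list(islice(query.rowiterator(*columns), limit))
```

For a query that has been run, head(query, "date", "temp", limit=3) would return a list of at most three (date, temp) tuples.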

run(self)#

Execute query. Must be done before data can be accessed. (*)

save(self, querypath, title='', sdesc='', ldesc='', force=False, users=None, inherit_users=False, owner='')#

Save this query as a Quick Query (*).

Pass in force=True to write a new query or replace an existing one of the same name, if the first write fails.

saveFile(self, dest, targettype, names=False, headers=False, compression='zip')#

Save the results of a query to a file on FTP or cloud storage.

Parameters:
  • dest – A SourceInfo object with a single SourceFile in it, or a string, which is converted to one.

  • targettype – A string (or bytes) specifying the format to save, currently one of “csv”, “xlsx”, “pdf”, “tde”, “parquet”.

  • names – Use first row as column names? (default False)

  • compression – Should the file be saved compressed? Choices are “zip” (the default), “none”, or “gzip”.

  • headers – Use first row as column headers? (default False)

saveTableMaterialize(self, *args, **kwargs)#

Call saveTableMaterialize on the Session object; raise an Exception if this Query is not the ‘current’ one which would be saved.

saveToFTP(self, filename, sep='|', linesep='\n', namerow=False, headrow=False, compression='zip')#

Save this query to FTP. If namerow is True, first row will be taken to be column names. If headrow is True, first row (or second if namerow is also True) will be taken as column headers. (*)

cursor#

Current index

ldesc#

Long description of the base table of this query. (*)

messagetext#

Session message text.

This is the session’s message text, which is filled in by CSDK layer, and serves as some documentation of the status of the last operation performed.

modelcols#

If True, retrieve “model-type” columns as dictionaries. If False (the default), the complex calculation is not performed and the columns just contain the word “MODEL”.

nrows#

Total number of rows in the result

rawValues#

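If True, use the raw (internal) values for infinity and NA values in integer (and 64-bit integer) columns. If False (the default), return float('inf') or float('-inf') for infinities, and float('NaN') for NA values.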

rows#

A RowIterator that iterates over the rows of the query.

This is exactly the same as rowiterator(). Note: a new RowIterator is created for each access of this property.

sdesc#

Short description of the base table of this query (*)

title#

Title of the base table of this query (*)

transfermode#

Get the transfer mode (compressed, binary, etc) for this query.

win_end#

The index of the end point of the currently loaded window (last fetch).

win_size#

The size of window to download (configurable)

Note: you can change win_size after reading in some values, so win_start+win_size does not necessarily equal win_end.

win_start#

The index of the starting point of the currently loaded window (last fetch)

class py1010.RowIterator(query, cols, start=0, stop=None, step=1, notuple=False, *)#

An object that represents a generator that yields rows, or multi-column tuples, of successive indexes through the query. It shares the columns with the query. Accessing the columns outside of the iterator at indices outside of the iterator’s current window may cause performance issues.

Construct RowIterator for query.

Creates a row iterator for the named query, starting at the given index (default zero); the rest of the arguments specify the columns, either by name or by index.

Parameters:
  • query – The Query object.

  • cols – A list of column numbers or names (can be mixed) specifying which columns are to be iterated over, and the order.

  • start – Starting index. Default 0.

  • stop – Stopping index. Default None, i.e. end of the query.

  • step – Step size, default 1. Negative step values are not supported.

  • notuple – If True and there is only one column, yield the values “bare” and not wrapped in one-element tuples (as it normally would, since it’s considered a “row”).
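The start, stop, step, and notuple parameters behave much like a slice over the rows. The following pure-Python sketch models that behaviour with in-memory lists standing in for query columns. The function iter_rows and the sample data are hypothetical and are not part of py1010; the sketch also handles column names only, not column numbers.

```python
def iter_rows(columns, cols, start=0, stop=None, step=1, notuple=False):
    """Model of row iteration over in-memory columns (not the py1010 implementation).

    columns: dict mapping column name -> list of values (stand-in for a Query).
    cols: column names to yield, in order.
    """
    if step < 1:
        raise ValueError("zero or negative step values are not supported")
    nrows = len(next(iter(columns.values()))) if columns else 0
    end = nrows if stop is None else min(stop, nrows)
    for i in range(start, end, step):
        row = tuple(columns[name][i] for name in cols)
        # With notuple=True and a single column, yield the bare value.
        if notuple and len(cols) == 1:
            yield row[0]
        else:
            yield row


data = {"hour": [0, 1, 2, 3], "temp": [-12.2, -11.7, -11.1, -10.6]}
pairs = list(iter_rows(data, ["hour", "temp"], start=1, step=2))
bare = list(iter_rows(data, ["temp"], stop=2, notuple=True))
wrapped = list(iter_rows(data, ["temp"], stop=2))
```

Here pairs holds two-element tuples, wrapped holds one-element tuples, and bare holds plain floats.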

convertindex(self, n)#

Convert an index based on the RowIterator’s slice info.

When an index of a slice is referenced, the slice’s information has to be factored in to determine the “base” index. That is, arr[10:50:5][2] is actually arr[20], and arr[10:50:5][15] is out of bounds, even if arr itself is long enough. A RowIterator is (or may be) a slice, so this translation has to happen.
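The arithmetic described above can be sketched in plain Python. The helper convert_index is a hypothetical name for this illustration, handles non-negative indices and positive steps only, and is not the method's actual implementation.

```python
def convert_index(n, start, stop, step):
    """Map index n within the slice (start, stop, step) to an index in the base sequence."""
    length = len(range(start, stop, step))
    if not 0 <= n < length:
        raise IndexError("index %d out of bounds for a slice of length %d" % (n, length))
    return start + n * step


arr = list(range(100))
base_index = convert_index(2, 10, 50, 5)

# The slice 10:50:5 has only 8 elements, so index 15 is rejected
# even though arr itself is long enough.
try:
    convert_index(15, 10, 50, 5)
    out_of_bounds = False
except IndexError:
    out_of_bounds = True
```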

fetchrows(self, starting=None)#

Fetches a window of rows. Users should not need to call this. (*)

getrow(self, n)#

Returns a tuple of the elements in this iterator’s columns at a given index. If the index is not available from the relevant rows, it fetches the window of rows. Users should not be calling this, but should be using next(). (*)

has(self, n)#

Reports whether the given index is within the window of values currently loaded by this iterator. This checks the windows of all the columns in the iterator to make sure the index is available for all of them. Columns may have different windows or may be accessed outside the iterator.
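As a rough model of this check, the sketch below treats each column's window as a (win_start, win_end) pair and requires the index to fall inside every one of them. The function name window_has is hypothetical, and treating win_end as exclusive is an assumption of this sketch rather than something this reference states.

```python
def window_has(windows, n):
    """Return True if index n lies inside every column's loaded window.

    windows: list of (win_start, win_end) pairs, one per column.
    The end is treated as exclusive here (an assumption of this sketch).
    """
    return all(start <= n < end for start, end in windows)


# Two columns whose windows overlap only on rows 500..999.
windows = [(0, 1000), (500, 1500)]
in_both = window_has(windows, 700)
in_first_only = window_has(windows, 100)
in_second_only = window_has(windows, 1200)
```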

class py1010.Session(url, username, password, logintype, group=None, retry=0x40000000, wait=10, logfile=None, mode='w', oktadesc=None, authentication=None, *, authscript=None, **kwargs)#

Object representing a 1010 session.

Create with 1010 URL, username (or group ownername), password, and type of login (py1010.KILL, py1010.NOKILL, or py1010.POSSESS). If this is a SAM pool (group) login, add another parameter for the groupID. py1010 automatically acquires a UID and logs in with it.

Attempting to use a closed session, either directly or indirectly by using a Query that was created for it, raises a TentenClosedSessionException.

Properties (default value in parentheses; (w) means the value is writable, and (w-) means the value is writable but not readable):

Variables:
  • messagetext – Last message text from 1010 API

  • rc (int) – Return code of last operation performed

  • timeout (tuple) – Timeout parameter tuple: (connectTimeout, timeout) (w)

  • reuseConnection – Reuse the same connection (True) (w-)

  • lenient – Treat non-conforming columns leniently (False) (w-)

  • ignoreSSLErrors – No exception on SSL errors (False) (w-)

  • APIversion – API version of this session (w)

  • systemVersion – System version this session is logged into

  • lastResponse – Last response from the session

  • transactions – Raw bitmask of available transactions

  • dbmdo – Used for calling the dbm() method with a slightly different syntax. session.dbmdo.endpoint(key1=val1, ...) is equivalent to session.dbm("endpoint", key1=val1, ...), with one or two special cases (see the docstrings on the DBMDo object.)
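To illustrate the dbmdo equivalence, here is a minimal sketch of how attribute-style calls can be forwarded to a dbm() method. DbmProxy and FakeSession are hypothetical stand-ins written for this example; the real DBMDo object has special cases that are not modelled here.

```python
class DbmProxy:
    """Hypothetical stand-in showing attribute-style calls forwarded to dbm()."""

    def __init__(self, session):
        self._session = session

    def __getattr__(self, endpoint):
        # Called only for names not found normally, so each unknown
        # attribute becomes a callable bound to that endpoint name.
        def call(**kw):
            return self._session.dbm(endpoint, **kw)
        return call


class FakeSession:
    """Records dbm() calls instead of contacting a server."""

    def __init__(self):
        self.calls = []

    def dbm(self, trans, **kw):
        self.calls.append((trans, kw))
        return {"trans": trans, "args": kw}


fake = FakeSession()
proxy = DbmProxy(fake)
result = proxy.endpoint(key1="val1")
```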

Create or connect to a 1010 session

Create or connect to a 1010 session with the specified URL, username (or owner name for SAM pool login), password, login type (py1010.KILL, py1010.NOKILL, or py1010.POSSESS), and group name (for SAM pool login). For SAM pool logins, a maximum number of retries and the number of seconds to wait between retries can also be specified; the wait is silently increased if it is below the minimum of 10 seconds. (*)

If the password supplied is None, py1010 will prompt the user for it.

You can also create a Session object without logging in if you have the SessionID (SID) and the Encrypted Password (EPW) of an existing 1010data session. Pass in the url and username, and pass in the EPW as the password. Use py1010.POSSESS as the logintype, and add the keyword-only parameter sid=SID to pass in the session ID.

For SSO login or other external authentication, use the keyword-only parameter authentication=TAG, with TAG being a string specifying the style of authentication. See your administrator for your organization’s custom tag, if one is being used.

Parameters:
  • url – The 1010data gateway URL.

  • username – Username on 1010data or SSO, or SAM-pool owner ID.

  • password – Password for 1010data or SSO.

  • logintype – py1010.POSSESS to use an existing session or start a new one; py1010.KILL to end an existing session (if any) before starting this one; py1010.NOKILL to error if a session already exists.

  • group – SAM-pool group name, if applicable.

  • retry – Number of times to retry if no SAM-pool user-id is available (default no limit).

  • wait – Seconds to wait between retries for a user-id.

  • logfile – Filename (or python filehandle) for writing XML log.

  • mode – Mode for writing logfile (“a” for appending, “w” (default) for truncating.)

  • oktadesc – OktaLogin object for Okta logins.

  • authentication – (keyword-only) Authentication tag for SSO logins.

  • authscript – (keyword-only) External authentication script for SSO logins, if not the standard one.

addKey(self, keyobj, key='', id_='', region='', keytype=TBL_SOURCE_S3)#

Add a key to the server-side keystore.

Add a new Cloud key to the keystore under a user-chosen name.

Parameters:

keyobj – The CloudKey object to add

OR you can run with just the information for creating the CloudKey object and the function will construct it for you

Parameters:
  • keyobj – The user-defined name (a string) for the key.

  • key – The secret key.

  • id_ – The AWS id (AWS only).

  • region – The AWS region (AWS only).

  • keytype (SourceInfo.SrcType) – Type of key: an element of the SourceInfo.SrcType enumeration. Default SourceInfo.SrcType.S3

Raises:

TentenTransactionException if a key with this name already exists in the keystore (among other reasons.)

addTable(self, spec, sync=False)#

Add a table from data already uploaded to FTP account.

Returns:

The name of the table.

Raises:

TentenException – if transaction fails

addTableEnd(self, table)#

Close and commit a transactional addtab.

Run this function at the end of a “transactional” addtab process to indicate that you are finished uploading chunks of data and to close and save the table. Equivalent to sending a zero-byte chunk, i.e. self.addTableFeed(table, b'', False)

Parameters:

table – Name of the table to finish.

addTableFeed(self, table, data, compress=True)#

Send a chunk of data in a transactional addtab process.

In a “transactional” addtab process, you run Session.addTableSpecs() with a source spec which includes no SourceFiles. Then you use this method to send chunks of data to be added to the table, and finally call addTableEnd() to close and commit the table.

Parameters:
  • table – Name of the table you are uploading to.

  • data – The chunk of data to send, formatted as described in the SourceSpec.

  • compress – Whether or not to compress the data being sent (default True)

Raises:

TentenException – if transaction fails
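The feed-then-end sequence described above might be wrapped in a helper like the following sketch. It relies only on the addTableFeed() and addTableEnd() signatures documented here; the helper name feed_lines, the max_bytes limit, the table name, and the RecordingSession stand-in are all hypothetical. Chunks are cut on record boundaries as a cautious choice, since this reference does not say whether a record may span two chunks.

```python
def feed_lines(session, table, lines, max_bytes=1 << 20, compress=True):
    """Send whole records in chunks, then close the table.

    session: any object with addTableFeed() and addTableEnd() as documented.
    lines: iterable of bytes, each one complete record including its separator.
    max_bytes: hypothetical chunk-size limit chosen for this sketch.
    Returns the number of chunks sent.
    """
    sent = 0
    buf, size = [], 0
    for line in lines:
        # Flush before the buffer would exceed the limit, so records stay whole.
        if buf and size + len(line) > max_bytes:
            session.addTableFeed(table, b"".join(buf), compress)
            sent += 1
            buf, size = [], 0
        buf.append(line)
        size += len(line)
    if buf:
        session.addTableFeed(table, b"".join(buf), compress)
        sent += 1
    session.addTableEnd(table)
    return sent


class RecordingSession:
    """Stand-in that records calls so the sketch runs without a server."""

    def __init__(self):
        self.log = []

    def addTableFeed(self, table, data, compress=True):
        self.log.append(("feed", table, data))

    def addTableEnd(self, table):
        self.log.append(("end", table))


rec = RecordingSession()
chunks = feed_lines(rec, "uploads.demo.t", [b"1|a\n", b"2|b\n", b"3|c\n"], max_bytes=8)
```

If a feed call raises, this helper does not call addTableEnd(); whether an unfinished table should be closed or abandoned in that case is left to the caller.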

addTableSpecs(self, sourcespec, tablespec, sync=False)#

Add a table using given specifications.

Runs the API’s addtable transaction. To initiate a “transactional” addtable, in which you upload the data in chunks, use a SourceSpec that lists no files.

Parameters:
  • sourcespec – A SourceSpec object giving the location and structure of the source file(s) and the columns therein.

  • tablespec – A TableSpec object giving the name and other metadata for the target table, as well as the columns there. If no columns are supplied here, the table column information will be copied from the source column information.

Raises:

TentenException – if transaction fails

addTableStatus(self)#

Check the status of an asynchronous upload with addTable()

Returns a tuple (status, numrecs). The status is an integer with the following meanings:

0: TENTEN_ADDTAB_IDLE

1: TENTEN_ADDTAB_FAILED

2: TENTEN_ADDTAB_INITIALIZING

3: TENTEN_ADDTAB_LOADING

4: TENTEN_ADDTAB_COMPLETED

5: TENTEN_ADDTAB_DIAGNOSED

The numrecs is the number of records (rows) uploaded so far (or in total if the upload is complete).

Raises:

TentenException – if transaction fails

Return type:

tuple(int, int)
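One way to work with these status codes is to name them with an IntEnum and poll until the upload leaves its in-progress states. This is a sketch under assumptions: the names AddTabStatus, wait_for_upload, and ScriptedSession are invented for this example, and treating every status other than INITIALIZING and LOADING as final is a guess about the semantics, not documented behaviour.

```python
import time
from enum import IntEnum


class AddTabStatus(IntEnum):
    """Status codes as listed above; the class name is this sketch's own."""
    IDLE = 0
    FAILED = 1
    INITIALIZING = 2
    LOADING = 3
    COMPLETED = 4
    DIAGNOSED = 5


def wait_for_upload(session, poll_seconds=5.0, max_polls=100, sleep=time.sleep):
    """Poll session.addTableStatus() until the upload stops initializing or loading."""
    for _ in range(max_polls):
        status, numrecs = session.addTableStatus()
        status = AddTabStatus(status)
        if status not in (AddTabStatus.INITIALIZING, AddTabStatus.LOADING):
            return status, numrecs
        sleep(poll_seconds)
    raise TimeoutError("upload still in progress after %d polls" % max_polls)


class ScriptedSession:
    """Stand-in that replays canned (status, numrecs) tuples."""

    def __init__(self, replies):
        self._replies = iter(replies)

    def addTableStatus(self):
        return next(self._replies)


scripted = ScriptedSession([(2, 0), (3, 500), (4, 1200)])
final = wait_for_upload(scripted, sleep=lambda seconds: None)
```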

addTableStatusList(self)#

Get a list of the status of addtable transactions.

Returns:

A list of AddtableStatus objects for all of the addtable transactions that have been started so far in this session.

Raises:

TentenException – if transaction fails

api2(self, method, data=None, *, form='json', ignoreJSONErrors=False)#

Call the QuickApp API.

Parameters:
  • method – The API (QuickApp) to call.

  • data – If given, the call will be a POST with this data (json-encoded)

  • form – (keyword-only) Request the data in a given format. Default is “json”; other options may include “html” or “text”; the actual possibilities are determined by the server-side code.

  • ignoreJSONErrors – (keyword-only) If True, return the raw response data instead of raising an exception when json.loads() (or decode('utf-8')) fails to decode the response.

Returns:

Decoded json, or plain bytes from server if form is not “json”. If form is “json” and ignoreJSONErrors is True, returns the decoded json unless there is an error either in decoding the bytes to a string or in decoding the json, in which case it returns the plain bytes.
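The return rules above can be modelled in a few lines of standard-library Python. The function decode_response is a hypothetical illustration of those rules applied to a raw bytes response; it is not py1010's implementation and makes no network call.

```python
import json


def decode_response(raw, form="json", ignoreJSONErrors=False):
    """Model of the documented return rules for a raw bytes response."""
    if form != "json":
        return raw
    try:
        return json.loads(raw.decode("utf-8"))
    except ValueError:
        # UnicodeDecodeError and json.JSONDecodeError both subclass ValueError.
        if ignoreJSONErrors:
            return raw
        raise


ok = decode_response(b'{"rows": 3}')
html = decode_response(b"<p>hi</p>", form="html")
fallback = decode_response(b"not json", ignoreJSONErrors=True)
bad_utf8 = decode_response(b"\xff", ignoreJSONErrors=True)
```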

classmethod authenticateCredentials(cls, url, user, password)#

Check that the given credentials are valid.

This does not require an open session (class method) and does not create one. This returns True if the credentials are valid. (*)

autoSpec(self, data)#

Run the API’s autospecfromdata transaction on the given data.

Returns a dictionary of sourcefile information, with the source columns as dictionaries in a list under ‘columns’.

The autoSpecFile() method is probably superior in most situations.

Return type:

dict

Raises:

TentenException – if transaction fails

autoSpecFile(self, spec=None)#

Run the API’s autospec transaction on the source information provided.

Parameters:

spec – A SourceSpec object, or something that can be converted into one (a string or list of strings, a SourceFile object or list of SourceFiles, or a SourceInfo). It may also contain source columns and other information to be used as “hints” by the auto-spec.

Return type:

SourceSpec

Raises:

TentenException – if transaction fails

clearCache(self)#

Clear the 1010data cache. (*)

close(self)#

Close the session and release userID if necessary.

Closed sessions can no longer be accessed for most purposes; attempting to do so raises a TentenClosedSessionException.
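Because close() may release a pooled userID, it can be useful to guarantee that it runs even when an error occurs. This reference does not say whether a Session can be used directly in a with statement, so the sketch below uses the standard library's contextlib.closing, which works with any object that has a close() method. StubSession is a hypothetical stand-in so the example runs without a server.

```python
from contextlib import closing


class StubSession:
    """Stand-in with a close() method, used so the sketch runs offline."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


stub = StubSession()
with closing(stub) as session:
    # Real code would create and run queries here.
    open_inside_block = not session.closed
# On leaving the block, close() has been called, even after an exception.
closed_after_block = stub.closed
```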

createDirectory(self, pathname, title=None, users=None, uploaders=None, inherit_users=False, inherit_uploaders=False)#

Create a directory in the 1010data object tree.

Parameters:
  • pathname – Full name of folder to create.

  • title – Title of the directory (default None)

  • users – List of users allowed to read the directory (default None)

  • uploaders – List of users allowed to upload to the directory (default None)

  • inherit_users – If True, inherit allowed users from parent directory (ignoring any list of users given). Default False.

  • inherit_uploaders – If True, inherit allowed uploaders from parent directory (ignoring any list of users given). Default False.

Raises:

dbm(self, trans, **kw)#

Call the ‘dbm’ API2 endpoint.

Parameters:
  • trans – The dbm transaction to invoke

  • kw – Key-value arguments for the given transaction

Returns:

The transaction’s response, decoded from json

Raises:
  • AssertionError – if the transaction fails or the result is not a dict with a return-code

  • KeyError – if the response does not have a value key

  • DBMError – if the return-code of the response indicates failure

debugAccum(self, on=True)#

Switch ACCUM (server) into (or out of) Debug mode.

delKey(self, keyobj, key='', id_='', region='', keytype=TBL_SOURCE_S3)#

Remove a key from the server-side keystore.

Remove an AWS key in the keystore, specified by name.

Parameters:

keyobj – The CloudKey object containing the name of the key you want to remove. The other information in the object is ignored.

OR you can run with just the information for creating the CloudKey object and the function will construct it for you

Parameters:
  • keyobj – The user-defined name (a string) for the key.

  • key – (optional, ignored)

  • id_ – (optional, ignored)

  • region – (optional, ignored)

  • keytype – (optional, ignored)

Raises:

TentenTransactionException if a key with this name does not already exist in the keystore (among other reasons.)

directory(self, dirname)#

Return a Directory object for the directory named (*)

disableLog(self)#

Disable XML logging

dropTable(self, table)#

Delete a table (*)

Raises:

TentenException – if transaction fails

enableLog(self, filename='log.log', mode='a')#

Enable XML logging

Parameters:
  • filename – Name of the file to log to, or a writeable file handle object (instance of IOBase)

  • mode – If a filename is given, fopen-type mode with which to open the file. Default “a” (append).

getQueryOps(self, pathname)#

Get the ops of a QuickQuery.

Returns the ops (the XML macro-code) of a saved QuickQuery in the 1010data object tree.

Parameters:

pathname – The path in the object tree to the saved QuickQuery.

Returns:

A dictionary {'base': basename, 'ops': optext}, containing the base table of the QuickQuery (if any; None if there is none) and the text of the ops saved in the query.

getRetry(self)#

Get retry parameters.

Returns a tuple: (retryMax, retryTimeMin, retryTimeRange, retryBase)

classmethod getUID(cls, url, owner, password, group)#

DEBUG ONLY: acquire a UID from a group. Leaves UID unused, claimed, and unreleasable!

logRawXML(self)#

Enable raw XML logging

memUsage(self)#

Return the memory usage of the session, in bytes (*)

mergeTablesMaterialize(self, tablelist, destpath, title=u'', sdesc=u'', ldesc=u'', link=u'', int maxdown=-1, int replace=False, users=None, char *segmentation=NULL, int sortseg=False, segbyAdvise=None, sortsegAdvise=None, int timeSeries=False, links=None, char *sort=NULL, *, inherit_users=False)#

Merge a list of tables.

Runs the CSDK tenten_MergeTablesMaterialize function. This saves the most recently-run Query. (*)

Parameters:
  • tablelist – List of table-names to be merged

  • destpath – Path in the 1010 object tree to save this table.

  • title – Title of the table (default “”)

  • sdesc – Short description (default “”)

  • ldesc – Long description (default “”)

  • link – String to be prepended to titles of linked columns (default “”)

  • maxdown – Maximum number of cells that can be downloaded at once. Default -1, for no limit

  • replace – Replace the table if it already exists (default False)

  • users – List of usernames with access (default None)

  • segmentation – Comma-separated string, a list of columns to segment on (Default None)

  • sortseg – Use SortSeg (True) or SegBy (False, default)

  • segbyAdvise – List of comma-separated column-name sequences which can be assumed to be segmented with SegBy segmentation if segmentation is performed (Default None)

  • sortsegAdvise – List of comma-separated column-name sequences which can be assumed to be segmented with SortSeg segmentation if segmentation is performed (Default None)

  • timeSeries – Generate time-series metadata (default False)

  • links – A list of 1010 link ops (default None)

  • sort – Comma-separated string, a list of columns by which to sort the contents of each segment (default None)

  • inherit_users – (keyword-only): If True, ignore the list of users given (if any) and set the user permissions to ‘inherit’.

Raises:

TentenException – if transaction fails

moveTable(self, oldpath, newpath)#

Move a table from one pathname to another (*)

Raises:

TentenException – if transaction fails

peekOnce(self, key, mode)#

Get the value of a server-side variable

Parameters:
  • key – Content of <peek> tag to send to server

  • mode – Value of attribute <peek mode> to send to server

Returns:

Value of the variable

Return type:

bytes

ptr(self, char *name=NULL)#

putKey(self, keyobj, key='', id_='', region='', keytype=TBL_SOURCE_S3)#

Change a key on the server-side keystore.

Alter a Cloud key in the keystore, specified by name, to contain new information.

Parameters:

keyobj – The CloudKey object with the name and updated info

OR you can run with just the information for creating the CloudKey object and the function will construct it for you

Parameters:
  • keyobj – The user-defined name (a string) for the key.

  • key – The new secret key.

  • id_ – The new AWS id (AWS only).

  • region – The new AWS region (AWS only).

  • keytype (SourceInfo.SrcType) – Type of key: an element of the SourceInfo.SrcType enumeration. Default SourceInfo.SrcType.S3

Raises:

TentenTransactionException if a key with this name does not already exist in the keystore (among other reasons.)

query(self, table, xml='')#

Create a query object for this session, for a given table and XML.

You can also pass an existing query object (from this or another session) to clone it in this session. The query is not run.

Return type:

Query

readKeys(self)#

Get a list of keys in the server-side keystore

Returns:

a list of CloudKey objects

releaseUID(self)#

DEBUG ONLY: release a claimed UID

relog(self, password, logintype=-1)#

Get a new Session object based on this one.

Returns a NEW session object, logging in again with the same URL and username, and the given password and logintype (default POSSESS). Other configurable properties are _not_ carried over into the new session. May be called on a closed session. (*)

Parameters:
  • password – Password for account; prompt for password if None.

  • logintype – py1010.POSSESS (default), py1010.KILL, or py1010.NOKILL.

Return type:

Session

saveTableMaterialize(self, tablepath, title=u'', sdesc=u'', ldesc=u'', link=u'', int maxdown=-1, int replace=False, int append=False, int appendable=True, int temporary=False, users=None, char *segmentation=NULL, int sortseg=False, segbyAdvise=None, sortsegAdvise=None, int timeSeries=False, links=None, char *sort=NULL, *, inherit_users=False)#

Create, replace, or append a table.

Runs the CSDK tenten_SaveTableMaterialize function. This saves the most recently-run Query. (*)

Parameters:
  • tablepath – Path in the 1010 object tree to save this table.

  • title – Title of the table (default “”)

  • sdesc – Short description (default “”)

  • ldesc – Long description (default “”)

  • link – String to be prepended to titles of linked columns (default “”)

  • maxdown – Maximum number of cells that can be downloaded at once. Default -1, for no limit

  • replace – Replace the table if it already exists (default False)

  • append – Append to the end of an existing table (default False)

  • appendable – Allow this table to be merged with other tables (default True)

  • temporary – Make this table temporary (default False)

  • users – List of usernames with access (default None)

  • segmentation – Comma-separated string, a list of columns to segment on (Default None)

  • sortseg – Use SortSeg (True) or SegBy (False, default)

  • segbyAdvise – List of comma-separated column-name sequences which can be assumed to be segmented with SegBy segmentation if segmentation is performed (Default None)

  • sortsegAdvise – List of comma-separated column-name sequences which can be assumed to be segmented with SortSeg segmentation if segmentation is performed (Default None)

  • timeSeries – Generate time-series metadata (default False)

  • links – A list of 1010 link ops (default None)

  • sort – Comma-separated string, a list of columns by which to sort the contents of each segment (default None)

  • inherit_users – (keyword-only): If True, ignore the list of users given (if any) and set the user permissions to ‘inherit’.

Raises:

TentenException – if transaction fails

searchTables(self, filter=None, maxresults=0)#

Search for tables with matching names. (*)

Returns:

a list of DirEntry objects.

Raises:

TentenException – if transaction fails

setRetry(self, long retryMax, long retryTimeMin, long retryTimeRange, double retryBase)#

Set retry parameters.

setUserAgent(self, agent, password)#

Set the UserAgent string. For internal use only.

setXFF(self, xff)#

Set X-Forwarded-For string

stop(self)#

Send a stop transaction

Raises:

TentenException – if transaction fails

updateObject(self, dirname)#

Refresh the session’s view of a file or a directory recursively. (*)

Raises:

TentenException – if transaction fails

uploadToFTP(self, localpath, password, remotename=None)#

Upload a file to your account’s FTP server.

Uploads a given file to the 1010data ftp account associated with the username of this session. The password must be provided.

Parameters:
  • localpath (str) – Path (relative or absolute) to the file to be uploaded on the local system.

  • password (str) – Password for the FTP account. It should be the same as the password to the 1010data account (but needs to be supplied, as the password is not cached, for security reasons.)

  • remotename (str) – Name to give the file on the server side. Must be a plain base filename, not a path (i.e., it may not have / or \ characters in it). If not provided, defaults to the base name of the localpath.
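The default and the restriction on remotename can be pictured with this small sketch. The function resolve_remote_name is hypothetical and only mirrors the rules stated above; the exception py1010 actually raises for an invalid name is not documented here, so ValueError is a placeholder.

```python
import os


def resolve_remote_name(localpath, remotename=None):
    """Model of the documented default and restriction for remotename."""
    if remotename is None:
        remotename = os.path.basename(localpath)
    if "/" in remotename or "\\" in remotename:
        raise ValueError("remotename must be a plain filename, not a path")
    return remotename


default_name = resolve_remote_name("/tmp/data/sales.csv")
explicit_name = resolve_remote_name("/tmp/data/sales.csv", "sales_2024.csv")

try:
    resolve_remote_name("sales.csv", "dir/sales.csv")
    rejected = False
except ValueError:
    rejected = True
```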

uploadXMLTable(self, char *xmlbuf)#

Upload raw XML as a table (*)

Raises:

TentenException – if transaction fails

classmethod warmPool(cls, url, owner, password, group, queries=None, logfile=None)#

Same as py1010.warmpool().

APIversion#

The API version of this session (Configurable)

epw#

The encrypted password of this session.

ignoreSSLErrors#

Determines if you get an exception on SSL errors or ignore them

Default: False

lastResponse#

The last response from the session

messagetext#

Message text of the session.

pswd#

PSWD of a session

rc#

The return-code (integer) of the last transaction performed

realurl#

realurl: object

reuseConnection#

Determines if the same connection is reused between transactions

Default: True

sessionid#

SID of a session

sid#

The session ID of this session.

systemVersion#

The system version this session is logged into

timeout#

Connection timeout parameters, as a tuple: (connectTimeout, timeout)

Settable.

transactions#

The raw bitmask of available transactions

url#

url: object

userAgent#

The user agent for this session

username#

username: object

class py1010.SourceColumnInfo(name=None, title=None, type_=None, format=None, int width=0, exp=None, double scale=0.0, int alpha=Alpha.SKIP, int order=0, int skip=Skip.NOSKIP, int nowrite=Write.WRITE, ptr=None, *)#

Class for holding metadata about a column in a source to be uploaded.

Several of the fields are meant to hold values from special enumeration classes, which are internal classes of SourceColumnInfo. So the “nowrite” parameter can be SourceColumnInfo.Write.WRITE or SourceColumnInfo.Write.NOWRITE. See below, and individual docstrings.

Variables:
  • name – Column name

  • title – Column title

  • type_ – Type of column: a string (or bytes): “text”, “int”, “float”, or “bigint”

  • format – Column format descriptor (string)

  • width – Width of column; 0 (default) for no input width.

  • exp – Expression to be applied to this column before upload

  • scale – Decimal value by which to divide this column before upload. 0.0 (default) for none

  • alpha – Alphabetic case into which to force this column before upload. One of Alpha.UPPER, Alpha.LOWER, or Alpha.SKIP (default)

  • order – Positive integer for the position of this column in a reordering; 0 (default) for no reordering.

  • skip – Skip this column or not? One of Skip.SKIP or Skip.NOSKIP (default)

  • nowrite – Write this column? One of Write.WRITE (default) or Write.NOWRITE

class Alpha(value)#

An enumeration.

class Skip(value)#

An enumeration.

class Write(value)#

An enumeration.

alpha#

Alphabetic case into which to force this column prior to upload

One of SourceColumnInfo.Alpha.UPPER, SourceColumnInfo.Alpha.LOWER, or SourceColumnInfo.Alpha.SKIP.

exp#

Column expression.

format#

Column format.

name#

Name of the column.

nowrite#

Whether or not to write this column.

Set to SourceColumnInfo.Write.WRITE or SourceColumnInfo.Write.NOWRITE

order#

A positive integer indicating this column’s position in a revised column order, or 0 for no reordering.

scale#

Column scale

Decimal value by which to divide the values in this column prior to upload, or 0 for none.

skip#

Whether or not to skip this column on loading.

Set to SourceColumnInfo.Skip.SKIP or SourceColumnInfo.Skip.NOSKIP

title#

Column title.

type#

Type of column.

A string (or bytes), one of: “text”, “int”, “float”, “bigint”

width#

Column width.

class py1010.SourceFile(path, bucket=None, keyname=None, sheetID=None, range=None, account=None, container=None, sourcetype=None, ptr=None, *)#

Class describing an individual file as a data source.

A “source” for 1010data is a description of some file outside of the 1010data object tree. A source is described by a SourceInfo object, which specifies features like format and column-separators, etc, and a SourceInfo contains one or more SourceFile objects, which specify locations of actual files (FTP upload directories or cloud storage services.)

Since a “source” describes an external resource, the same objects are used as “destinations” to describe files and formats for writing output.

Construct a SourceFile object.

The SourceFile contains location information for a file outside of 1010 (in an FTP upload directory, an S3/GCS bucket, or in ABS).

Parameters:
  • path – The filename of the file.

  • bucket – The S3 bucket, for files in S3 or GCS. Leave as None for files in FTP or ABS.

  • keyname – The name assigned to the AWS key to use to access the file. See the Session.addKey() method of Session objects. Leave as None for files in FTP.

  • sheetID – To specify a worksheet in an XLSX workbook, pass the sheet’s ID here (as returned by the getworksheets transaction).

  • range – For specifying a cell-range in an XLSX worksheet.

  • account – The ABS account to be used. Leave as None for files in FTP or S3/GCS storage.

  • container – The ABS container to be used. Leave as None for files in FTP or S3/GCS storage.

  • sourcetype – The type of source, a value from the SourceInfo.SrcType enumeration. Defaults to None, in which case the value is inferred from the other data given: if the bucket parameter is non-empty, the value will be SourceInfo.SrcType.S3. If the container is non-empty, the value will be SourceInfo.SrcType.ABS. Otherwise, the value will be SourceInfo.SrcType.FTP. Note that any other value (SourceInfo.SrcType.GCS) must be passed in explicitly (this is for backward compatibility).
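The inference rule for sourcetype can be written out as a short decision function. In this sketch, SrcType is a stand-in enumeration with placeholder values (the real members live in SourceInfo.SrcType), and infer_source_type is a hypothetical name; only the order of the checks is taken from the description above.

```python
from enum import Enum


class SrcType(Enum):
    """Stand-in for SourceInfo.SrcType; member values here are placeholders."""
    FTP = "ftp"
    S3 = "s3"
    ABS = "abs"
    GCS = "gcs"


def infer_source_type(bucket=None, container=None, sourcetype=None):
    """Apply the documented rule: an explicit value wins, then bucket, then container."""
    if sourcetype is not None:
        return sourcetype
    if bucket:
        return SrcType.S3
    if container:
        return SrcType.ABS
    return SrcType.FTP


ftp_case = infer_source_type()
s3_case = infer_source_type(bucket="my-bucket")
abs_case = infer_source_type(container="my-container")
# GCS also uses a bucket, so it has to be requested explicitly.
gcs_case = infer_source_type(bucket="my-bucket", sourcetype=SrcType.GCS)
```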

__init__(self, path, bucket, keyname, sheetID, range, account, container, sourcetype)#

Set object attributes on construction.

account#

The account to use on ABS to access the file, or None.

bucket#

The S3 bucket containing the file (or None).

container#

The container to use on ABS to access the file, or None.

keyname#

The user-assigned name of the AWS key to use to access the file, or None

path#

The filename of the file.

range#

A range of cells in an XLSX worksheet which this object refers to, if relevant.

(Implementation note: this property will not contain the value None. If you set it to None, that really sets it to b''.)

sheetID#

The sheetID of the worksheet this object refers to, within an XLSX workbook, if relevant.

(Implementation note: this property will not contain the value None. If you set it to None, that really sets it to b''.)

class py1010.SourceInfo(files=None, rectype=None, sep=None, eor=None, maskw=None, mchr=None, arch=None, format=None, long begbytes=0, long begrecs=0, long numrecs=0, int autoCorrect=0, int numCols=0, ptr=None, *)#

Class describing a data source: its format and the files it contains.

A “source” for 1010data is a description of some file outside of the 1010data object tree. A source is described by a SourceInfo object, which specifies features like format and column-separators, etc, and a SourceInfo contains one or more SourceFile objects, which specify locations of actual files (FTP upload directories or cloud storage services.)

Since a “source” describes an external resource, the same objects are used as “destinations” to describe files and formats for writing output.

SourceInfo objects contain metadata about the format of an external file. Several of the fields are meant to hold values from special enumeration classes, which are internal classes of SourceInfo. So the rectype can be SourceInfo.RecType.SEPARATED or SourceInfo.RecType.FIXED. See below, and individual docstrings.

Variables:
  • sourceType – Type of source (FTP, S3, etc.)

  • sep – Column separator

  • eor – Record separator

  • maskw – Max length of variable-width columns

  • mchr – “Masking” character

  • arch – Architecture: little-endian or big-endian

  • format – Type of file (“xlsx” or empty)

  • begbytes – Number of bytes to skip at the start

  • begrecs – Number of records to skip at the start

  • numrecs – Number of records to upload (0 for all)

  • autoCorrect – Enable simple autocorrection feature?

  • truncate – Autocorrect truncate control

  • pad – Autocorrect pad control

  • fix_mask – Autocorrect fix-mask control

  • numCols – Number of columns

  • ignoreNull – Replace '\0' with ' ' (space)?

Constructor for SourceInfo objects.

Parameters:
  • files – A list of SourceFile objects (or strings, which are taken to be filenames in an FTP directory)

  • rectype – Record type: RecType.SEPARATED or RecType.FIXED or None

  • sep – Column separator

  • eor – Record separator

  • maskw – Max width of variable columns

  • mchr – “Masking” character

  • arch – Architecture: Arch.BENDIAN or Arch.LENDIAN or None

  • format – Either “xlsx” or None (default) for text files

  • begbytes – Bytes to skip at the beginning (default 0)

  • begrecs – Records to skip at the beginning (default 0)

  • numrecs – Number of records to load (default 0, for “all”)

  • autoCorrect – Enable autoCorrect? (default 0 (False))

  • numCols – Number of columns

class Arch(value)#

An enumeration.

class AutoCorrectType(value)#

An enumeration.

class RecType(value)#

An enumeration.

class SrcType(value)#

An enumeration.

getWorksheets(self, Session s)#

Run the getworksheets transaction on this SourceInfo object (which should describe a .xlsx source) using the supplied session.

__init__(self, files=None, rectype=None, sep=None, eor=None, maskw=None, mchr=None, arch=None, format=None, long begbytes=0, long begrecs=0, long numrecs=0, int autoCorrect=0, int numCols=0)#

Initialize fields on construction.

arch#

Architecture or “endianness” of this source. May be Arch.BENDIAN (big-endian) or Arch.LENDIAN (little-endian) or None (unspecified).

autoCorrect#

Specify “simple” autocorrection: True or False.

begbytes#

Bytes to skip at the beginning.

begrecs#

Records to skip at the beginning.

eor#

Row separator for this Source.

files#

files: object

filter_target#

filter_target: object

fix_mask#

Autocorrect fix-mask control, for delimited columns only.

Set to AutoCorrectType.NONE, AutoCorrectType.LEFT, AutoCorrectType.RIGHT, AutoCorrectType.LONG, or AutoCorrectType.SHORT.

format#

File-format of this Source. May be None or “” (for text files) or “xlsx”.

ignoreNull#

Replace NUL ('\0') characters with spaces? Set to True, False, or None (unspecified, default).

maskw#

The “masking width,” or the maximum width of variable-length columns in this source. Default 10000.

mchr#

Masking character for this Source.

numCols#

Number of columns in input data.

numFiles#

Number of SourceFiles in this Source.

This property may not be set directly.

numrecs#

Number of records to read.

pad#

Autocorrect pad control.

Set to AutoCorrectType.NONE, AutoCorrectType.RIGHT, or AutoCorrectType.LEFT.

rectype#

Record type for this Source.

May be RecType.SEPARATED or RecType.FIXED or None (unspecified).

sep#

Column separator for this Source.

sourceType#

Type of this source.

May be SrcType.S3, SrcType.ABS, SrcType.GCS, or SrcType.FTP. This property is not set directly, but is determined by the sourceType of the first SourceFile.

(SourceFiles of different types may not be combined in the same SourceInfo.)

truncate#

Autocorrect truncate control.

Set to AutoCorrectType.NONE, AutoCorrectType.RIGHT, or AutoCorrectType.LEFT.

class py1010.SourceSpec(self, source=None, sourceCols=None)#

A class to hold both SourceInfo and a list of SourceColumnInfo.

toXML(self, session)#

Convert to XML representation.

class py1010.TableInfo(name, int ID=0, title=None, sdesc=None, ldesc=None, type_=u'', int secure=0, int own=0, owner=None, update=None, int favorite=0, users=None, display=None, int report=0, int chart=0, link=u'', long numRows=0, long numBytes=0, int segs=0, int access=0, long maxdown=0, mode=Mode.REPLACE, stripe=None, stripe_factor=None, ptr=None, *)#

Class for holding metadata about a table as an upload target (for use with addTableSpecs).

Several of the fields hold values from special enumeration classes, which are internal classes of TableInfo. For example, the mode can be TableInfo.Mode.APPEND, TableInfo.Mode.REPLACE, or TableInfo.Mode.NOREPLACE. See below, and the individual docstrings.

class Mode(value)#

An enumeration.

class Perm(value)#

An enumeration.

class SegType(value)#

An enumeration.

class TimeSeries(value)#

An enumeration.

__init__(self, name, int ID=0, title=None, sdesc=None, ldesc=None, type_=u'', int secure=0, int own=0, owner=None, update=None, int favorite=0, users=None, display=None, int report=0, int chart=0, link=u'', long numRows=0, long numBytes=0, int segs=0, int access=0, long maxdown=0, mode=Mode.REPLACE, stripe=None, stripe_factor=None)#

access#

Boolean 1 or 0 indicating whether or not this table is accessible.

chart#

Boolean 1 or 0 indicating whether or not chart specifications are saved for this table.

favorite#

Boolean 1 or 0 indicating whether or not the transaction UID has favorited this table.

id#

Unique identifier for this table.

ldesc#

Long description of the table, if any.

link#

Link header of table, or NULL for no link header.

materialize#

Boolean 1 or 0 indicating whether or not this table is materialized.

maxdown#

Maximum download limit of table, or a non-positive integer for the default maxdown.

merge#

Boolean 1 or 0 indicating whether or not this table is appendable.

method#

Materialize method, or None for the default method.

mode#

Upload mode: TableInfo.Mode.APPEND, TableInfo.Mode.REPLACE, or TableInfo.Mode.NOREPLACE.

name#

Full path to the table.

numBytes#

Number of bytes in the table.

numCols#

Number of columns in this table.

numRows#

Number of rows in the table.

own#

Boolean 1 or 0 indicating whether or not the transaction UID is the owner of this table.

owner#

UID or groupname of the owner of this table, or None for the default owner.

report#

Boolean 1 or 0 indicating whether or not report specifications are saved for this table.

responsible#

Boolean 1 or 0 indicating whether or not the user is responsible for replication of data.

sdesc#

Short description of the table, if any.

secure#

Boolean 1 or 0 indicating whether or not this table is secure. Deprecated in API.

segmentation#

Comma-separated list of the names of segmentation columns.

segs#

Number of segments spanned by this table.

segsize#

Size of the segments of this table.

segtype#

Integer representing segmentation type of this table. Either TableInfo.SegType.SEGBY or TableInfo.SegType.SORTSEG. 0 if “segmentation” is None.

sort#

Comma-separated list of the names of sort columns.

stripe#

How many machines to stripe the data across.

stripe_factor#

Fraction of machines to stripe data across.

timeSeries#

Integer representing whether or not time-series segmentation is used for this table. Either TENTEN_TS or TENTEN_NOTS. 0 if “segmentation” is None.

title#

Title of the table, if any.

type#

Type of table. Currently, can be “REAL”, “VIEW”, “PARAM”, “MERGED”, “UQ”, or “TOLERANT”.

update#

Datetime of last modification to this table.

users#

users: object

class py1010.TableSpec(self, table=None, cols=None)#

Class to hold a TableInfo and a list of ColumnInfo.

toXML(self, Session session)#

Convert to XML notation.

py1010.by(s)#

(internal function: string2bytes for python3)

py1010.flushpool(url, owner, password, group, logfile=None, mode='w')#

Flush a SAM Pool.

Invokes the “markgid” or “FlushPool” transaction on the group. This guarantees that any UIDs subsequently returned for the group are fresh, that is, have not previously been logged in on a session. (*)

Parameters:
  • url (str) – the URL of the 1010 connection

  • owner (str) – username of the group owner

  • password (str) – the password for the group

  • group (str) – the name of the group (pool) to be flushed

  • logfile (str) – Optional name of file used for logging this transaction

  • mode (str) – fopen-style mode for opening logfile, defaults to “w”

Raises:

TentenException – if transaction fails

py1010.resetpool(url, owner, password, group, logfile=None)#

Release a SAM Pool’s IDs.

Invokes the “resetpool” transaction on the group. This releases all the group’s UIDs. (*)

Parameters:
  • url – the URL of the 1010 connection

  • owner – username of the group owner

  • password – the password for the group

  • group – the name of the group (pool) to be reset

  • logfile – filename for logging, if any.

Raises:

TentenException – if transaction fails

py1010.sampoolstatus(url, owner, password, group, logfile=None)#

Check the status of a SAM pool.

Run the “samstatus” transaction on the given SAM pool and return the results.

Parameters:
  • url – The URL of the 1010data gateway.

  • owner – The userid of the owner of the SAM pool.

  • password – The owner’s password.

  • group – The name of the group.

  • logfile – Filename to log the transaction (optional).

Returns:

A list of tuples of the form [("groupname", "username", free, loggedin, marked), ...]. The groupname is the same for every user. The “free”, “loggedin”, and “marked” values are booleans indicating whether that user ID is free, logged in, and/or marked.
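As a rough illustration of working with the documented return shape, the sketch below tallies how many user IDs in a pool are free, logged in, or marked. The helper name summarize_pool and the sample rows are hypothetical, not part of py1010. The rows are hand-written to mimic the five-element tuples described above, so the block runs without a gateway connection.

```python
def summarize_pool(status_rows):
    """Count free, logged-in, and marked user IDs.

    status_rows is assumed to be a list of
    (groupname, username, free, loggedin, marked) tuples,
    as described for the sampoolstatus return value.
    """
    summary = {"total": 0, "free": 0, "loggedin": 0, "marked": 0}
    for _group, _user, free, loggedin, marked in status_rows:
        summary["total"] += 1
        summary["free"] += int(bool(free))
        summary["loggedin"] += int(bool(loggedin))
        summary["marked"] += int(bool(marked))
    return summary


# Hypothetical rows, shaped like the documented return value.
sample = [
    ("mygroup", "mygroup_user1", True, False, False),
    ("mygroup", "mygroup_user2", False, True, False),
    ("mygroup", "mygroup_user3", False, True, True),
]
summary = summarize_pool(sample)
print(summary)
```

With a live pool, the same helper could presumably be applied directly to the list returned by py1010.sampoolstatus(url, owner, password, group).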

py1010.st(s)#

(internal function: bytes2string for python3)

py1010.startLogging(filename='Py1010.log')#

Start API-level logging.

py1010.stopLogging()#

Close logging.
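One possible way to pair startLogging and stopLogging is a small context manager, so that logging is closed even if the code in between raises. This is only a sketch: api_logging and FakeApi are hypothetical names introduced here. The api argument is meant to be the py1010 module, and FakeApi is a stand-in that lets the example run without py1010 installed.

```python
from contextlib import contextmanager


@contextmanager
def api_logging(api, filename="Py1010.log"):
    """Start API-level logging on entry and always stop it on exit.

    `api` is expected to provide startLogging(filename) and stopLogging(),
    as the py1010 module does according to this reference.
    """
    api.startLogging(filename)
    try:
        yield
    finally:
        api.stopLogging()


class FakeApi:
    """Hypothetical stand-in for the py1010 module; it only records calls."""

    def __init__(self):
        self.calls = []

    def startLogging(self, filename="Py1010.log"):
        self.calls.append(("start", filename))

    def stopLogging(self):
        self.calls.append(("stop",))


fake = FakeApi()
try:
    with api_logging(fake, "session.log"):
        raise RuntimeError("simulated failure inside the logged block")
except RuntimeError:
    pass  # logging was still stopped by the context manager

print(fake.calls)
```

In real use the call would look like `with api_logging(py1010, "session.log"): ...`, assuming the module-level functions behave as documented above.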

py1010.warmpool(url, owner, password, group, queries=None, logfile=None)#

Warm a SAM Pool.

Parameters:
  • url – The URL of the 1010 connection

  • owner – Username of the group owner

  • password – Password for the group

  • group – Name of the group (SAM pool)

  • queries – List of BaseQuery objects

  • logfile – File to log transactions to (default None)

Returns:

the number of IDs warmed.