Classes Involved in Uploading and Downloading

py1010.AWSKey

alias of CloudKey

class py1010.SourceInfo(files=None, rectype=None, sep=None, eor=None, maskw=None, mchr=None, arch=None, format=None, long begbytes=0, long begrecs=0, long numrecs=0, int autoCorrect=0, int numCols=0, ptr=None, *)

Class describing the format of an external data source.

A “source” for 1010data is a description of some file outside of the 1010data object tree. A source is described by a SourceInfo object, which specifies features such as format and column separators. A SourceInfo contains one or more SourceFile objects, which specify the locations of the actual files (FTP upload directories or cloud storage services).

Since a “source” describes an external resource, the same objects are used as “destinations” to describe files and formats for writing output.

SourceInfo objects contain metadata about the format of an external file. Several of the fields are meant to hold values from special enumeration classes, which are internal classes of SourceInfo. For example, rectype can be SourceInfo.RecType.SEPARATED or SourceInfo.RecType.FIXED. See below, and the individual docstrings.

Variables:
  • sourceType – Type of source (FTP, S3, etc.)

  • sep – Column separator

  • eor – Record separator

  • maskw – Max length of variable-width columns

  • mchr – “Masking” character

  • arch – Architecture: little-endian or big-endian

  • format – Type of file (“xlsx” or empty)

  • begbytes – Number of bytes to skip at the start

  • begrecs – Number of records to skip at the start

  • numrecs – Number of records to upload (0 for all)

  • autoCorrect – Enable simple autocorrection feature?

  • truncate – Autocorrect truncate control

  • pad – Autocorrect pad control

  • fix_mask – Autocorrect fix-mask control

  • numCols – Number of columns

  • ignoreNull – Replace '\0' with ' ' (space)?

Constructor for SourceInfo objects.

Parameters:
  • files – A list of SourceFile objects (or strings, which are taken to be filenames in an FTP directory)

  • rectype – Record type: RecType.SEPARATED or RecType.FIXED or None

  • sep – Column separator

  • eor – Record separator

  • maskw – Max width of variable columns

  • mchr – “Masking” character

  • arch – Architecture: Arch.BENDIAN or Arch.LENDIAN or None

  • format – Either “xlsx” or None (default) for text files

  • begbytes – Bytes to skip at the beginning (default 0)

  • begrecs – Records to skip at the beginning (default 0)

  • numrecs – Number of records to load (default 0, for “all”)

  • autoCorrect – Enable autoCorrect? (default 0 (False))

  • numCols – Number of columns
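As a rough illustration of how these parameters fit together, the sketch below collects constructor arguments for a comma-separated file with one header row. The helper names csv_source_kwargs and build_csv_source and the file name are hypothetical; only the parameter names and SourceInfo.RecType.SEPARATED come from the reference above. Whether sep and eor accept str or bytes is not stated here, so the literal values are assumptions.

```python
def csv_source_kwargs(filenames, header_rows=1):
    """Collect SourceInfo keyword arguments for comma-separated text files.

    Hypothetical helper: it only builds a dict and does not touch py1010.
    """
    return {
        "files": list(filenames),  # strings are taken as FTP filenames
        "sep": ",",                # column separator (assumed str is accepted)
        "eor": "\n",               # record separator
        "begrecs": header_rows,    # records to skip at the start
        "numrecs": 0,              # 0 means "all records"
    }


def build_csv_source(filenames, header_rows=1):
    """Build a SourceInfo; needs the py1010 package to be installed."""
    import py1010  # imported lazily so the helper above works without it

    kwargs = csv_source_kwargs(filenames, header_rows)
    kwargs["rectype"] = py1010.SourceInfo.RecType.SEPARATED
    return py1010.SourceInfo(**kwargs)


kwargs = csv_source_kwargs(["sales.csv"])
print(kwargs["begrecs"], kwargs["numrecs"])
```

Keeping the argument dictionary separate from the constructor call makes it possible to inspect or log the settings before a session is available.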

class Arch(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)

class AutoCorrectType(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)

class RecType(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)

class SrcType(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)

getWorksheets(self, Session s)

Run the getworksheets transaction on this SourceInfo object (which should describe a .xlsx source) using the supplied session.

__init__(self, files=None, rectype=None, sep=None, eor=None, maskw=None, mchr=None, arch=None, format=None, long begbytes=0, long begrecs=0, long numrecs=0, int autoCorrect=0, int numCols=0)

Initialize fields on construction.

arch

Architecture or “endianness” of this source. May be Arch.BENDIAN (big-endian) or Arch.LENDIAN (little-endian) or None (unspecified).

autoCorrect

Specify “simple” autocorrection: True or False.

begbytes

Bytes to skip at the beginning.

begrecs

Records to skip at the beginning.

eor

Row separator for this Source.

fix_mask

Autocorrect fix-mask control, for delimited columns only.

Set to AutoCorrectType.NONE, AutoCorrectType.LEFT, AutoCorrectType.RIGHT, AutoCorrectType.LONG, or AutoCorrectType.SHORT.

format

File format of this Source. May be None or “” (for text files) or “xlsx”.

ignoreNull

Replace NUL (’\0’) characters with spaces? Set to True, False, or None (unspecified, default).

maskw

The “masking width,” or the maximum width of variable-length columns in this source. Default 10000.

mchr

Masking character for this Source.

numCols

Number of columns in input data.

numFiles

Number of SourceFiles in this Source.

This property may not be set directly.

numrecs

Number of records to read.
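The interaction of begbytes, begrecs, and numrecs can be pictured with a small pure-Python sketch. It does not call py1010; select_records is a hypothetical helper, and the order of operations (skip bytes, then split into records, then skip records, then limit) is an assumption rather than documented behavior.

```python
def select_records(data, eor=b"\n", begbytes=0, begrecs=0, numrecs=0):
    """Return the records that would remain after skipping.

    Assumed order: drop begbytes bytes, split on eor, drop begrecs records,
    then keep numrecs records (0 keeps everything that is left).
    """
    body = data[begbytes:]
    # Drop empty pieces such as the one after the final separator.
    records = [r for r in body.split(eor) if r]
    records = records[begrecs:]
    if numrecs > 0:
        records = records[:numrecs]
    return records


raw = b"##id,qty\n1,10\n2,20\n3,30\n"
rows = select_records(raw, begbytes=2, begrecs=1, numrecs=2)
print(rows)
```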

pad

Autocorrect pad control.

Set to AutoCorrectType.NONE, AutoCorrectType.RIGHT, or AutoCorrectType.LEFT.

rectype

Record type for this Source.

May be RecType.SEPARATED or RecType.FIXED or None (unspecified).

sep

Column separator for this Source.

sourceType

Type of this source.

May be SrcType.S3, SrcType.ABS, SrcType.GCS, or SrcType.FTP. This property is not set directly, but is determined by the sourceType of the first SourceFile.

(SourceFiles of different types may not be combined in the same SourceInfo.)
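The rule above can be sketched as a small check. SrcType here is a local stand-in for SourceInfo.SrcType with arbitrary member values, and common_source_type is a hypothetical helper; how py1010 itself reports a mixed list is not stated in this reference, so the ValueError and the None result for an empty list are assumptions.

```python
from enum import Enum


class SrcType(Enum):
    """Local stand-in for SourceInfo.SrcType (member values are arbitrary)."""
    FTP = "ftp"
    S3 = "s3"
    ABS = "abs"
    GCS = "gcs"


def common_source_type(file_types):
    """Return the type of the first file, rejecting mixed lists."""
    file_types = list(file_types)
    if not file_types:
        return None  # assumption: no files means no source type yet
    first = file_types[0]
    if any(t is not first for t in file_types):
        raise ValueError("SourceFiles of different types may not be combined")
    return first


print(common_source_type([SrcType.S3, SrcType.S3]))
```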

truncate

Autocorrect truncate control.

Set to AutoCorrectType.NONE, AutoCorrectType.RIGHT, or AutoCorrectType.LEFT.

class py1010.SourceFile(path, bucket=None, keyname=None, sheetID=None, range=None, account=None, container=None, sourcetype=None, ptr=None, *)

Class describing an individual file as a data source.

A “source” for 1010data is a description of some file outside of the 1010data object tree. A source is described by a SourceInfo object, which specifies features such as format and column separators. A SourceInfo contains one or more SourceFile objects, which specify the locations of the actual files (FTP upload directories or cloud storage services).

Since a “source” describes an external resource, the same objects are used as “destinations” to describe files and formats for writing output.

Construct a SourceFile object.

The SourceFile contains location information for a file outside of 1010 (in an FTP upload directory, an S3/GCS bucket, or in ABS).

Parameters:
  • path – The filename of the file.

  • bucket – The S3 bucket, for files in S3 or GCS. Leave as None for files in FTP or ABS.

  • keyname – The name assigned to the AWS key to use to access the file. See the Session.addKey() method of Session objects. Leave as None for files in FTP.

  • sheetID – To specify a worksheet in an XLSX workbook, pass the sheet’s ID here (as returned by the getworksheets transaction).

  • range – For specifying a cell range in an XLSX worksheet.

  • account – The ABS account to be used. Leave as None for files in FTP or S3/GCS storage.

  • container – The ABS container to be used. Leave as None for files in FTP or S3/GCS storage.

  • sourcetype – The type of source, a value from the SourceInfo.SrcType enumeration. Defaults to None, in which case the value is inferred from the other arguments: if the bucket parameter is non-empty, the value will be SourceInfo.SrcType.S3. If the container is non-empty, the value will be SourceInfo.SrcType.ABS. Otherwise, the value will be SourceInfo.SrcType.FTP. Note that any other value (SourceInfo.SrcType.GCS) must be passed in explicitly (this is for backward compatibility).
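The inference rule for sourcetype can be written out as a short sketch. It is not the py1010 implementation: SrcType is a local stand-in for SourceInfo.SrcType with arbitrary member values, and infer_source_type is a hypothetical function that follows the rule as described in the parameter list.

```python
from enum import Enum


class SrcType(Enum):
    """Local stand-in for SourceInfo.SrcType (member values are arbitrary)."""
    FTP = "ftp"
    S3 = "s3"
    ABS = "abs"
    GCS = "gcs"


def infer_source_type(bucket=None, container=None, sourcetype=None):
    """Mirror the documented defaulting rule for sourcetype."""
    if sourcetype is not None:
        return sourcetype    # an explicit value always wins (needed for GCS)
    if bucket:
        return SrcType.S3    # a non-empty bucket implies S3
    if container:
        return SrcType.ABS   # a non-empty container implies ABS
    return SrcType.FTP       # otherwise fall back to FTP


print(infer_source_type(bucket="my-bucket"))
print(infer_source_type(bucket="my-bucket", sourcetype=SrcType.GCS))
```

Note that a GCS file also has a bucket, which is why GCS cannot be inferred and has to be passed explicitly.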

__init__(self, path, bucket, keyname, sheetID, range, account, container, sourcetype)

Set object attributes on construction.

account

The account to use on ABS to access the file, or None.

bucket

The S3 bucket containing the file (or None).

container

The container to use on ABS to access the file, or None.

keyname

The user-assigned name of the AWS key to use to access the file, or None.

path

The filename of the file.

range

A range of cells in an XLSX worksheet which this object refers to, if relevant.

(Implementation note: this property will not contain the value None. If you set it to None, that really sets it to b’’)

sheetID

The sheetID of the worksheet this object refers to, within an XLSX workbook, if relevant.

(Implementation note: this property will not contain the value None. If you set it to None, that really sets it to b’’)

class py1010.SourceColumnInfo(name=None, title=None, type_=None, format=None, int width=0, exp=None, double scale=0.0, int alpha=Alpha.SKIP, int order=0, int skip=Skip.NOSKIP, int nowrite=Write.WRITE, ptr=None, *)

Class for holding metadata about a column in a source to be uploaded.

Several of the fields are meant to hold values from special enumeration classes, which are internal classes of SourceColumnInfo. For example, the “nowrite” parameter can be SourceColumnInfo.Write.WRITE or SourceColumnInfo.Write.NOWRITE. See below, and the individual docstrings.

Variables:
  • name – Column name

  • title – Column title

  • type_ – Type of column: a string (or bytes): “text”, “int”, “float”, or “bigint”

  • format – Column format descriptor (string)

  • width – Width of column; 0 (default) for no input width.

  • exp – Expression to be applied to this column before upload

  • scale – Decimal value by which to divide this column before upload. 0.0 (default) for none

  • alpha – Alphabetic case into which to force this column before upload. One of Alpha.UPPER, Alpha.LOWER, or Alpha.SKIP (default)

  • order – Positive integer for the position of this column in a reordering; 0 (default) for no reordering.

  • skip – Skip this column or not? One of Skip.SKIP or Skip.NOSKIP (default)

  • nowrite – Write this column? One of Write.WRITE (default) or Write.NOWRITE
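To make the scale and alpha fields concrete, the sketch below applies the described transformations to plain Python values. It only models the documented meaning (divide by scale unless it is 0.0; force case unless Alpha.SKIP) and does not call py1010. Alpha is a local stand-in for SourceColumnInfo.Alpha with arbitrary member values, and apply_scale and apply_alpha are hypothetical helpers.

```python
from enum import Enum


class Alpha(Enum):
    """Local stand-in for SourceColumnInfo.Alpha (values are arbitrary)."""
    SKIP = 0
    UPPER = 1
    LOWER = 2


def apply_scale(value, scale=0.0):
    """Divide value by scale; a scale of 0.0 means no scaling."""
    return value if scale == 0.0 else value / scale


def apply_alpha(text, alpha=Alpha.SKIP):
    """Force the alphabetic case of text; Alpha.SKIP leaves it unchanged."""
    if alpha is Alpha.UPPER:
        return text.upper()
    if alpha is Alpha.LOWER:
        return text.lower()
    return text


print(apply_scale(1234, 100.0))
print(apply_alpha("New York", Alpha.UPPER))
```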

class Alpha(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)

class Skip(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)

class Write(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)

alpha

Alphabetic case into which to force this column prior to upload.

One of SourceColumnInfo.Alpha.UPPER, SourceColumnInfo.Alpha.LOWER, or SourceColumnInfo.Alpha.SKIP.

exp

Column expression.

format

Column format.

name

Name of the column.

nowrite

Whether or not to write this column.

Set to SourceColumnInfo.Write.WRITE or SourceColumnInfo.Write.NOWRITE.

order

A positive integer indicating this column’s position in a revised column order, or 0 for no reordering.

scale

Column scale.

Decimal value by which to divide the values in this column prior to upload, or 0 for none.

skip

Whether or not to skip this column on loading.

Set to SourceColumnInfo.Skip.SKIP or SourceColumnInfo.Skip.NOSKIP.

title

Column title.

type

Type of column.

A string (or bytes), one of: “text”, “int”, “float”, “bigint”

width

Column width.

class py1010.TableInfo(name, int ID=0, title=None, sdesc=None, ldesc=None, type_=u'', int secure=0, int own=0, owner=None, update=None, int favorite=0, users=None, display=None, int report=0, int chart=0, link=u'', long numRows=0, long numBytes=0, int segs=0, int access=0, long maxdown=0, mode=Mode.REPLACE, stripe=None, stripe_factor=None, ptr=None, *)

Class for holding metadata about a table as an upload target.

Holds data about a table for uploading (with addTableSpecs).

Several of the fields are meant to hold values from special enumeration classes, which are internal classes of TableInfo. For example, mode can be TableInfo.Mode.APPEND, TableInfo.Mode.REPLACE, or TableInfo.Mode.NOREPLACE. See below, and the individual docstrings.

class Mode(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)

class Perm(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)

class SegType(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)

class TimeSeries(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)

__init__(self, name, int ID=0, title=None, sdesc=None, ldesc=None, type_=u'', int secure=0, int own=0, owner=None, update=None, int favorite=0, users=None, display=None, int report=0, int chart=0, link=u'', long numRows=0, long numBytes=0, int segs=0, int access=0, long maxdown=0, mode=Mode.REPLACE, stripe=None, stripe_factor=None)

access

Boolean 1 or 0 indicating whether or not this table is accessible.

chart

Boolean 1 or 0 indicating whether or not chart specifications are saved for this table.

favorite

Boolean 1 or 0 indicating whether or not the transaction UID has favorited this table.

id

Unique identifier for this table.

ldesc

Long description of the table, if any.

link

Link header of table, or NULL for no link header.

materialize

Boolean 1 or 0 indicating whether or not this table is materialized.

maxdown

Maximum download limit of table, or a non-positive integer for the default maxdown.

merge

Boolean 1 or 0 indicating whether or not this table is appendable.

method

Materialize method, or None for the default method.

mode

Upload mode: TableInfo.Mode.APPEND, TableInfo.Mode.REPLACE, or TableInfo.Mode.NOREPLACE.

name

Full path to the table.

numBytes

Number of bytes in the table.

numCols

Number of columns in this table.

numRows

Number of rows in the table.

own

Boolean 1 or 0 indicating whether or not the transaction UID is the owner of this table.

owner

UID or groupname of the owner of this table, or None for the default owner.

report

Boolean 1 or 0 indicating whether or not report specifications are saved for this table.

responsible

Boolean 1 or 0 indicating whether or not the user is responsible for replication of data.

sdesc

Short description of the table, if any.

secure

Boolean 1 or 0 indicating whether or not this table is secure. Deprecated in API.

segmentation

Comma-separated list of the names of segmentation columns.

segs

Number of segments spanned by this table.

segsize

Size of the segments of this table.

segtype

Integer representing segmentation type of this table. Either TableInfo.SegType.SEGBY or TableInfo.SegType.SORTSEG. 0 if “segmentation” is None.

sort

Comma-separated list of the names of sort columns.

stripe

How many machines to stripe the data across.

stripe_factor

Fraction of machines to stripe data across.

timeSeries

Integer representing whether or not time-series segmentation is used for this table. Either TENTEN_TS or TENTEN_NOTS. 0 if “segmentation” is None.

title

Title of the table, if any.

type

Type of table. Currently, can be “REAL”, “VIEW”, “PARAM”, “MERGED”, “UQ”, or “TOLERANT”.

update

Datetime of last modification to this table.

users

users: object

class py1010.ColumnInfo(name, type_, title=None, desc=None, format=None, int index=Index.NOINDEX, int fix=Fix.NOFIX, ptr=None, *)

Class for holding metadata about a column being uploaded into.

Fix

alias of Index

class Index(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)

classmethod fromSourceCol(cls, scol)

Translate a SourceColumnInfo into a ColumnInfo by copying over some key fields.
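A typical use would be to translate a whole list of source columns at once. The sketch below assumes only the fromSourceCol classmethod documented above; columns_from_source is a hypothetical helper, and FakeColumnInfo is a stand-in so the example runs without py1010 (the real method takes a SourceColumnInfo, not a dict, and which fields it copies is not specified here).

```python
def columns_from_source(source_columns, column_cls):
    """Translate each source column using column_cls.fromSourceCol."""
    return [column_cls.fromSourceCol(scol) for scol in source_columns]


class FakeColumnInfo:
    """Stand-in used only so the sketch runs without py1010."""

    def __init__(self, name):
        self.name = name

    @classmethod
    def fromSourceCol(cls, scol):
        # The stand-in copies just the name from a plain dict.
        return cls(scol["name"])


cols = columns_from_source([{"name": "store"}, {"name": "qty"}], FakeColumnInfo)
print([c.name for c in cols])
```

With py1010 installed, the same helper would be called with py1010.ColumnInfo as column_cls and a list of SourceColumnInfo objects.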

__init__(self, name, type_, title=None, desc=None, format=None, int index=Index.NOINDEX, int fix=Fix.NOFIX)

desc

Description of the column.

format

Format of the column.

name

Name of the column.

title

Title of the column, displayed when the table is viewed.

type

Type of the column (“integer”, “yyyymmdd”, etc.)