Best practices
There are several recommended best practices when using the 1010data Python SDK.
Clean up your sessions upon query completion
Use the with keyword when opening a session so that the session is cleaned up when the block exits. For more information, see Cleaning up a session.
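As a minimal sketch of why the with keyword helps: Python calls a context manager's __exit__ method when the block ends, even if an exception is raised inside it. The DemoSession class and run_and_fail function below are hypothetical stand-ins, not part of py1010; they only record whether cleanup ran.

```python
class DemoSession:
    """Hypothetical stand-in for a session object; not part of py1010."""

    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs when the with block exits, whether normally or by exception.
        self.closed = True
        return False  # do not swallow the exception


def run_and_fail(session):
    """Use the session inside a with block, then fail partway through."""
    with session:
        raise RuntimeError("query failed")


demo = DemoSession()
try:
    run_and_fail(demo)
except RuntimeError:
    pass
# demo.closed is now True: cleanup ran despite the exception.
```

Without the with block, the cleanup call would have to be placed in an explicit try/finally to get the same guarantee.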
Consider your result set when accessing data
When you access a piece of data from the result set of a Query, py1010 downloads a “window” of data from the column(s) in question. The default window size is 50000. (When there are many columns, py1010 automatically reduces the window size if the number of cells being downloaded would exceed the limit that the server is willing to send.) So if you access query[1][150], py1010 downloads elements 0-49999 of column 1 from the result set, and if you access query.rows[120030], py1010 downloads elements 100000-149999 of all the columns, and so on.
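The window arithmetic in these two examples can be sketched as follows. This is an illustration only, not py1010 code: window_bounds is a hypothetical helper, and it assumes windows are aligned to multiples of the window size, which is consistent with both examples. It ignores the automatic reduction for many columns and the end of the result set.

```python
DEFAULT_WIN_SIZE = 50000  # default window size stated in the text


def window_bounds(index, win_size=DEFAULT_WIN_SIZE):
    """Return (first, last) element indices of the window containing index.

    Assumes windows start at multiples of win_size.
    """
    first = (index // win_size) * win_size
    return first, first + win_size - 1


print(window_bounds(150))     # (0, 49999)
print(window_bounds(120030))  # (100000, 149999)
```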
This works well when you are mostly accessing data that is close together, such as when stepping through the query row by row (for example, for row in query.rows: ...). If your access is more random, getting only one or two elements from each window, the overhead of fetching these large windows may hurt performance. This is especially true if you are fetching many columns at once (such as using query.rows on a query with many columns). In such cases, you may want to reduce the size of the window downloaded each time by setting the py1010.Query.win_size member of the Query object to something smaller, such as 100 or 1000.
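To get a rough feel for the trade-off, the sketch below estimates how many cells are downloaded for a scattered access pattern at two window sizes. The cells_downloaded helper is hypothetical and rests on simplifying assumptions: windows are aligned to multiples of the window size, each distinct window is fetched once, and the automatic window reduction for many columns is ignored.

```python
def cells_downloaded(indices, num_columns, win_size):
    """Estimate cells fetched when reading whole rows at the given indices.

    Assumes aligned windows, one fetch per distinct window, and that each
    window covers win_size rows of every column.
    """
    windows = {index // win_size for index in indices}
    return len(windows) * win_size * num_columns


# Ten scattered rows, each 50000 rows apart, from a 20-column result set.
scattered = [i * 50000 + 17 for i in range(10)]

large = cells_downloaded(scattered, num_columns=20, win_size=50000)
small = cells_downloaded(scattered, num_columns=20, win_size=100)
print(large, small)  # 10000000 20000
```

The estimate counts cells only, not round trips. For sequential access a smaller window means more requests, which is why the large default suits row-by-row iteration.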
Have each thread create a Session object
In a multithreaded application, each thread should create its own Session object. If multiple threads share a single Session object and interact with it in parallel, all but one of the threads will be queued. Users will experience much better performance if each thread has its own Session object. These objects can all be connected to the same Insights Platform session via py1010.POSSESS.
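The per-thread pattern can be sketched with the standard threading module. DemoSession, worker, and run_workers are hypothetical names used for illustration and are not part of py1010. In a real application, make_session would be a function that builds a py1010 Session using the py1010.POSSESS login type; the constructor arguments are omitted here because this section does not specify them.

```python
import threading


class DemoSession:
    """Hypothetical stand-in for a session object; not part of py1010."""


def worker(make_session, results, slot):
    # Each thread builds its own session instead of sharing one.
    session = make_session()
    # ... run this thread's queries on `session` here ...
    results[slot] = session


def run_workers(make_session, count):
    """Start `count` threads, each creating a session via make_session."""
    results = [None] * count
    threads = [
        threading.Thread(target=worker, args=(make_session, results, slot))
        for slot in range(count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


sessions = run_workers(DemoSession, 4)
# Every thread ended up with its own, distinct session object.
distinct = len({id(s) for s in sessions})
```

Passing a factory rather than a ready-made session makes it hard to share one object across threads by accident.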