Best practices

Follow these best practices when using the 1010data Python SDK.

Clean up your sessions upon query completion
Use the with keyword to clean up sessions. For more information, see Cleaning up a session.
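
The cleanup guarantee that the with keyword provides can be sketched in isolation. The example below is a minimal illustration only: StubSession, run_query, and demo are hypothetical stand-ins and are not part of the py1010 API. It shows that the cleanup step runs whether the block finishes normally or raises.

```python
class StubSession:
    """Hypothetical stand-in for an SDK session; not the py1010 API."""

    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs whether the with block finishes normally or raises.
        self.closed = True
        return False  # do not swallow the exception

    def run_query(self, fail=False):
        # Hypothetical method that simulates a query succeeding or failing.
        if fail:
            raise RuntimeError("query failed")
        return "ok"


def demo(fail):
    """Run a query inside a with block and report whether cleanup happened."""
    session = StubSession()
    try:
        with session as s:
            s.run_query(fail=fail)
    except RuntimeError:
        pass  # the error still propagates out of the with block
    return session.closed
```

In both the success case and the failure case, demo reports that the session was closed, which is the behavior you want from a real session object.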
Scroll through data sequentially
Reading rows in sequential order performs better than jumping between arbitrary rows.
Use XML queries for random access or set window size to 1

As a best practice, users should access random data via queries. The example query below requests a random 10% sample:

exampleQuery = testSession.query(path, '<sel value="draw_(33;10)"/>')

If your users need rows at arbitrary, non-incremental positions and cannot use a query, the best practice is to set the window size to 1. The window is the number of rows that are buffered when the Python SDK requests results from the 1010data Insights Platform. With a window of 1, each row is retrieved individually, which keeps the system from fetching larger sets of unneeded rows around each random position in the data set and avoids unnecessary data transfer. In general, you won't have to worry about the window. Note that even with a window size of 1, this kind of access will most likely perform poorly.

If you do want to change the window size, set the win_size attribute on the query object. For example:
query.win_size = 100
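
To see why the window size matters for scattered access, consider a simplified model of a windowed reader. This is a sketch under stated assumptions, not a description of how the SDK buffers internally: it assumes that a miss at row i fetches rows i through i + win_size - 1, and the simulate function and the access patterns are hypothetical.

```python
def simulate(indices, win_size):
    """Count round trips and rows transferred for a windowed reader.

    Assumed model (hypothetical): a miss at row i fetches the
    half-open range [i, i + win_size).
    """
    fetches = 0
    rows = 0
    start, end = 0, 0  # currently buffered range [start, end), initially empty
    for i in indices:
        if not (start <= i < end):
            # Row is not buffered: fetch a new window starting at i.
            start, end = i, i + win_size
            fetches += 1
            rows += win_size
    return fetches, rows


TOTAL = 100_000

# 1,000 rows read in order.
sequential = range(1000)

# 1,000 rows at widely separated positions (deterministic stand-in for
# random access; consecutive positions are never within 100 rows).
scattered = [(i * 7919) % TOTAL for i in range(1000)]

seq_big = simulate(sequential, 100)       # (10, 1000)
scattered_big = simulate(scattered, 100)  # (1000, 100000)
scattered_one = simulate(scattered, 1)    # (1000, 1000)
```

In this model, sequential reads use every buffered row, while scattered reads with a window of 100 transfer 100 rows for each row actually used. A window of 1 removes the wasted transfer but still pays one round trip per row, which is consistent with the note above that this kind of access is likely to perform poorly either way.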
Have each thread create a Session object

In a multithreaded application, each thread should create its own Session object. If multiple threads that share a single Session object interact with the object in parallel, all but one of the threads will be queued. Users will experience much better performance if each thread has its own Session object. These objects can all be connected to the same Insights Platform session via py1010.POSSESS.