Getting Started with Comotion Python SDK

This documentation helps you get set up on and use the Comotion Python SDK. It includes this getting started page, as well as full documentation on the SDK functions.

This is an open source project, and community contributions are welcome.

The full Comotion documentation portal can be found here.

Installation

Install comotion-sdk in your Python environment using pip:

pip install comotion-sdk
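
To confirm the installation, you can try importing the package; the command should return without error:

python -c "from comotion import dash"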

Uploading Data to Dash Using the SDK

In order to use the SDK in your Python code, you must first import it. In these examples we import the dash module directly, as follows:

from comotion import dash

The dash module has a number of useful functions to interact with the Comotion Dash API.

Uploading a csv file to Dash

The read_and_upload_file_to_dash function reads a csv file, breaks it up into chunks, gzips those chunks and pushes them to the Dash API.

# Break up and upload file to Dash

from comotion import dash
from getpass import getpass

# set relevant parameters
dash_orgname = 'my_org_name'
dash_api_key = getpass(
    prompt='Enter your ' + dash_orgname + '.comodash.io api key:'
) # this prompts the user to enter the api key

dash.read_and_upload_file_to_dash(
    './path/to/csv/file.csv',
    dash_table='my_table_name',
    dash_orgname=dash_orgname,
    dash_api_key=dash_api_key
)
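
Note that getpass prompts for the api key at runtime without echoing it, which keeps the key out of your script and your shell history.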

Modifying a file for upload

Often you will want to add a column - such as an upload timestamp or batch number - to the csv file to be uploaded. This can easily be done using the modify_lambda parameter. It accepts a python function that receives a pandas.DataFrame read from the csv. Here is an example of adding a timestamp as a column called snapshot_timestamp to the csv:

# Break up and upload file to Dash with Extra Column

from comotion import dash
from getpass import getpass
from datetime import datetime

# define the function used to modify the file

myTimeStamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S.%f")

def addTimeStamp(df):
    # modifies the dataframe in place; the same timestamp is applied to every row
    df['snapshot_timestamp'] = myTimeStamp

# set relevant parameters

dash_orgname = 'my_org_name'
dash_api_key = getpass(
    prompt='Enter your ' + dash_orgname + '.comodash.io api key:'
) # this prompts the user to enter the api key

dash.read_and_upload_file_to_dash(
    './path/to/csv/file.csv',
    dash_table='my_table_name',
    dash_orgname=dash_orgname,
    dash_api_key=dash_api_key,
    modify_lambda=addTimeStamp
)
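
The modify_lambda function can make any in-place change to the DataFrame. For example, a batch number can be added alongside the timestamp (a sketch; addMetadata and myBatchNumber are hypothetical names, and myTimeStamp is defined as above):

# add both a snapshot timestamp and a batch number before upload
myBatchNumber = 42  # hypothetical identifier for this upload run

def addMetadata(df):
    df['snapshot_timestamp'] = myTimeStamp
    df['batch_number'] = myBatchNumber

Pass modify_lambda=addMetadata to read_and_upload_file_to_dash as in the example above.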

Testing and debugging an upload script

In order to check that your script is working, you can run a dry run. This saves the files locally rather than uploading them to Dash, so that you can check the result before uploading.

# First test the upload using the dry_run feature

from comotion import dash
from getpass import getpass

# set relevant parameters
dash_orgname = 'my_org_name'
dash_api_key = getpass(
    prompt='Enter your ' + dash_orgname + '.comodash.io api key:'
) # this prompts the user to enter the api key

dash.read_and_upload_file_to_dash(
    './path/to/csv/file.csv',
    dash_table='my_table_name',
    dash_orgname=dash_orgname,
    dash_api_key=dash_api_key,
    path_to_output_for_dryrun='./outputpath/'
)

Instead of uploading, this will output the files that would have been uploaded to ./outputpath/. If the file to be uploaded is large, it will be broken up, and all the resulting files will be placed in the output path.
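
You can then inspect the dry-run output before running the real upload. Here is a minimal sketch using pandas (the exact file names in ./outputpath/ depend on the input file; compression='gzip' is set explicitly since the chunks are gzipped):

# inspect the files produced by the dry run
import glob
import pandas as pd

output_files = glob.glob('./outputpath/*')
print(output_files)

# read one chunk back to check its contents
df_check = pd.read_csv(output_files[0], compression='gzip')
print(df_check.head())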

Advanced usage with Pandas

Using this SDK in conjunction with pandas provides a powerful toolset for integrating with any data source.

Here is an example of reading a table named my_table from a postgres database:

# upload a postgres table to dash using Pandas

from comotion import dash
from getpass import getpass
from datetime import datetime
import pandas as pd

# set relevant parameters

dash_orgname = 'my_org_name'
dash_api_key = getpass(
    prompt='Enter your ' + dash_orgname + '.comodash.io api key:'
) # this prompts the user to enter the api key


# set timestamp to use as a snapshot indication
myTimeStamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S.%f")

# create dataframe from source db:
df_iterable = pd.read_sql_table(
    'my_table',
    con='postgresql://username:password@address.of.db:5432/mydatabase',
    chunksize=30000
)

# note the use of chunksize which will ensure the whole table is not read at once.
# this is also important to ensure that files uploaded to Dash are below the size limit


for df in df_iterable:
    # add timestamp as a column
    df['snapshot_timestamp'] = myTimeStamp

    # create a gzipped csv stream from the dataframe
    csv_stream = dash.create_gzipped_csv_stream_from_df(df)

    # upload the stream to dash
    dash_response = dash.upload_csv_to_dash(
        dash_orgname,
        dash_api_key,
        'my_table_in_dash',
        csv_stream
    )
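
Note that myTimeStamp is set once, before the loop, so every chunk of the table carries the same snapshot_timestamp. This makes it easy to identify all the rows belonging to a single snapshot once they are in Dash.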

Running Queries and Extracting Data

You can use the SDK to run queries on Dash, as well as download the results in csv format.

Logging In

The query API is built on v2 of the Comotion API, which uses a new way to authenticate. You do not need an API key; instead you log in with your normal username and password.

In order to do this, after you have installed the SDK, you need to authenticate from the command line by typing the following:

> comotion authenticate

You will be prompted for your orgname, which is your organisation's unique name, and then a browser will open for you to log in.

Once this process is complete, the relevant keys will automatically be saved in your computer's credentials manager.

To avoid being prompted for your orgname every time, you can save it as the environment variable COMOTION_ORGNAME:

> export COMOTION_ORGNAME=orgname
> comotion authenticate

or include it directly in the command line:

> comotion -o orgname authenticate

Running a query

You can then use the Query class in the comotion.dash module to create a query and download the results. Here is an example code snippet. You do not need to authenticate in your code - the Auth class takes care of that.

from comotion.auth import Auth
from comotion.dash import DashConfig, Query

config = DashConfig(Auth("myorgname"))

# this step creates the query object and kicks off the query
query = Query(query_text="select 1", config=config)

# this step blocks until the query is complete and provides the query metadata
final_query_info = query.wait_to_complete()


with open("myfile.csv", "wb") as file:
   with query.get_csv_for_streaming() as response:
      for chunk in response.stream(524288):
         file.write(chunk)
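
Once the download is complete, you can read the results back with pandas to check them (a quick sketch; assumes the first line of the csv is a header row):

import pandas as pd

# read the downloaded query results back in
df_results = pd.read_csv("myfile.csv")
print(df_results.head())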