comotion.dash module

class comotion.dash.DashConfig(auth: comotion.auth.Auth)[source]

Object containing configuration information for Dash API

auth

comotion.Auth object holding information about authentication

Type

comotion.Auth
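
A minimal construction sketch (assuming the comotion.Auth object is created with your Dash orgname; "myorgname" is a placeholder):

from comotion import Auth
from comotion import dash

auth = Auth("myorgname")  # placeholder orgname; authenticate as described in the comotion.Auth docs
config = dash.DashConfig(auth)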

class comotion.dash.Query(config: comotion.dash.DashConfig, query_text: str = None, query_id: str = None)[source]

The query object starts and tracks a query on Comotion Dash.

Initialising this class runs a query on Comotion Dash and stores the resulting query id in query_id

COMPLETED_STATES = ['SUCCEEDED', 'CANCELLED', 'FAILED']
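
A sketch of typical usage, assuming a DashConfig built as above; the table name is a placeholder, and reattaching by query_id is assumed from the constructor signature:

# start a new query on Dash
query = dash.Query(config, query_text="select * from my_table")

# or, assuming query_id reattaches to a previously started query
same_query = dash.Query(config, query_id=query.query_id())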
download_csv(output_file_path, fail_if_exists=False)[source]

Download a csv of the query results and check that the total file size is correct

Parameters
  • output_file_path (File path) – Path of the file to output to

  • fail_if_exists (bool, optional) – If True, will fail if the target file already exists. Defaults to False.

Raises

IncompleteRead – If only part of the file is downloaded, this is raised
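
For example, once the query has completed, the results can be saved locally (a sketch; the output path is a placeholder):

query.wait_to_complete()
query.download_csv(
    output_file_path="results.csv",  # placeholder path
    fail_if_exists=True              # fail rather than overwrite an existing file
)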

get_csv_for_streaming() → urllib3.response.HTTPResponse[source]

Returns a urllib3.response.HTTPResponse object that can be used for streaming. This allows use of the downloaded file without having to save it to local storage.

Be sure to call .release_conn() when complete to ensure that the connection is released.

This can be achieved using the with notation, e.g.:

with query.get_csv_for_streaming() as response:
    for chunk in response.stream():
        # do something with chunk
        # chunk is a bytes object
        ...
get_query_info() → comodash_api_client_lowlevel.model.query.Query[source]

Gets the full information for the query.

Returns

Model containing all query info, with the following attributes:

  • query – the query sql

  • query_id – query_id of the query

  • status – status object with the following attributes:

      • completion_date_time – GMT completion time

      • state – current state of the query; one of QUEUED, RUNNING, SUCCEEDED, FAILED, CANCELLED

      • stateChangeReason – info about the reason for the state change (generally failure)

      • submission_date_time – GMT submission time

Return type

QueryInfo
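
A sketch of inspecting the returned model; the nested attribute access is assumed from the structure described above:

info = query.get_query_info()
print(info.query_id)
print(info.status.state)  # e.g. 'SUCCEEDED'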

is_complete() → bool[source]

Indicates whether the query is in a final state. This means it has either succeeded, failed or been cancelled.

Returns

Whether the query is complete

Return type

bool

query_id() → str[source]

Returns query id for this query

state() → str[source]

Gets the state of the query.

Returns

One of QUEUED, RUNNING, SUCCEEDED, FAILED, CANCELLED

Return type

str
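
For example, state() and is_complete() can be combined into a simple polling loop (an illustrative sketch; the 5 second interval is arbitrary):

import time

while not query.is_complete():
    print("query state:", query.state())
    time.sleep(5)  # arbitrary polling interval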

stop()[source]

Stop the query

wait_to_complete() → bool[source]

Blocks until the query is in a completed state

Returns

Final state, one of ‘SUCCEEDED’, ‘CANCELLED’, ‘FAILED’

Return type

str
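
In most cases the polling can be left to wait_to_complete(), e.g. (a sketch assuming the documented string return value):

final_state = query.wait_to_complete()
if final_state != 'SUCCEEDED':
    raise RuntimeError("query ended in state " + str(final_state))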

comotion.dash.create_gzipped_csv_stream_from_df(df: pandas.core.frame.DataFrame) → _io.BytesIO[source]

Returns a gzipped, utf-8 csv file bytestream from a pandas dataframe

Useful to help upload dataframes to dash

It does not break the file up, so be sure to apply a maximum chunksize to the dataframe before calling this function, otherwise the Dash max file size limit will cause an error

Parameters

df (pd.DataFrame) – Dataframe to be turned into a bytestream

Returns

The Bytestream

Return type

io.BytesIO
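
For example, a large dataframe can be split into chunks before each chunk is turned into a gzipped stream (a sketch; the source file and the chunk size of 30000 are placeholders):

import pandas as pd
from comotion import dash

df = pd.read_csv("source.csv")  # placeholder source file

chunksize = 30000
streams = [
    dash.create_gzipped_csv_stream_from_df(df[i:i + chunksize])
    for i in range(0, len(df), chunksize)
]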

comotion.dash.read_and_upload_file_to_dash(file: Union[str, _io.FileIO], dash_table: str, dash_orgname: str, dash_api_key: str, encoding: str = 'utf-8', chunksize: int = 30000, modify_lambda: Callable = None, path_to_output_for_dryrun: str = None, service_client_id: str = '0')[source]

Reads a file and uploads it to dash.

This function will:

  • read a csv file

  • break it up into multiple csv's, each with a maximum number of lines defined by chunksize

  • upload them to dash

Parameters
  • file (Union[str, io.FileIO]) – Either a path to the file to be uploaded, or a FileIO stream representing the file to be uploaded. Should be an unencrypted, uncompressed CSV file.

  • dash_table (str) – name of Dash table to upload the file to

  • dash_orgname (str) – orgname of your Dash instance

  • dash_api_key (str) – valid api key for Dash API

  • encoding (str) – the encoding of the source file. Defaults to utf-8.

  • chunksize (int) – (optional) the maximum number of lines to be included in each file. Note that this should be low enough that the zipped file is less than the Dash maximum gzipped file size. Defaults to 30000.

  • modify_lambda – (optional) a callable that receives the pandas dataframe read from the csv, giving the opportunity to modify it, such as adding a timestamp column. Not required.

  • path_to_output_for_dryrun (str) – (optional) if specified, no upload will be made to dash, but files will be saved to the location specified. This is useful for testing. Multiple files will be created: [table_name].[i].csv.gz, where i represents the file part number.

  • service_client_id (str) – (optional) the service client to use for the upload. See the Dash documentation for an explanation of service clients.

Returns

List of http responses

Return type

List
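
A sketch of a typical call, with a modify_lambda that stamps each chunk; all names and the api key are placeholders, and the in-place modification is assumed from the parameter description:

import datetime

def add_snapshot_column(df):
    # add a timestamp column to each chunk before it is uploaded
    df["snapshot_timestamp"] = datetime.datetime.utcnow().isoformat()

responses = dash.read_and_upload_file_to_dash(
    file="source.csv",          # placeholder path
    dash_table="my_table",      # placeholder table name
    dash_orgname="myorgname",   # placeholder orgname
    dash_api_key="my_api_key",  # placeholder api key
    modify_lambda=add_snapshot_column,
)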

comotion.dash.upload_csv_to_dash(dash_orgname: str, dash_api_key: str, dash_table: str, csv_gz_stream: _io.FileIO, service_client_id: str = '0') → requests.models.Response[source]

Uploads a gzipped csv stream to Dash.

Expects a gzipped csv stream to upload to Dash.

Parameters
  • dash_orgname (str) – Dash organisation name for dash instance

  • dash_api_key (str) – Valid API key for the organisation instance

  • dash_table (str) – Table name to upload to

  • csv_gz_stream (io.FileIO) – gzipped csv stream to be uploaded

Returns

response from dash api

Return type

requests.Response

Raises

HTTPError – If one is raised by the call
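
For example, combined with create_gzipped_csv_stream_from_df (a sketch; credentials and names are placeholders):

csv_gz_stream = dash.create_gzipped_csv_stream_from_df(df)

response = dash.upload_csv_to_dash(
    dash_orgname="myorgname",   # placeholder orgname
    dash_api_key="my_api_key",  # placeholder api key
    dash_table="my_table",      # placeholder table name
    csv_gz_stream=csv_gz_stream,
)
response.raise_for_status()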

comotion.dash.upload_from_oracle(sql_host: str, sql_port: int, sql_service_name: str, sql_username: str, sql_password: str, sql_query: str, dash_table: str, dash_orgname: str, dash_api_key: str, dtypes: dict = None, include_snapshot: bool = True, export_csvs: bool = True, chunksize: int = 50000, output_path: str = None, sep: str = '\t', max_tries: int = 5)[source]

Uploads data from an Oracle SQL database to dash.

This function will:

  • get the total number of row chunks needed for the sql query

  • get chunks of data from the SQL database

  • upload them to dash

  • append them to the csv output (if specified)

  • save error chunks as csv (if any)

Parameters
  • sql_host (str) – SQL host e.g. ‘192.168.0.1’

  • sql_port (int) – SQL port number e.g. 9005

  • sql_service_name (str) – SQL service name e.g. ‘myservice’

  • sql_username (str) – SQL username

  • sql_password (str) – SQL password

  • sql_query (str) – SQL query

  • dash_table (str) – name of Dash table to upload the data to

  • dash_orgname (str) – orgname of your Dash instance

  • dash_api_key (str) – valid api key for Dash API

  • export_csvs – (optional) If True, data successfully uploaded is exported as csv. Defaults to True, i.e. output is included.

  • dtypes (dict) – (optional) A dictionary that maps column names to the data types to convert to. Defaults to None, i.e. dataframe columns are loaded as they are.

  • chunksize (int) – (optional) the maximum number of lines to be included in each file. Note that this should be low enough that the zipped file is less than the Dash maximum gzipped file size. Defaults to 50000.

  • sep (str) – (optional) Field delimiter for the ComoDash table to upload the dataframe to. Defaults to '\t' (tab).

  • output_path (str) – (optional) if specified, output csvs are saved in that location. If not specified, output is placed in the same location as the script. Defaults to None.

  • include_snapshot (bool) – (optional) If True, an additional column ‘snapshot_timestamp’ will be added to the DataFrame. This column will contain the time that the data is loaded, in “%Y-%m-%d %H:%M:%S.%f” format, in order to help with database management. Defaults to True, i.e. snapshot_timestamp is included.

  • max_tries (int) – (optional) Maximum number of times to retry if there is an HTTP error. Defaults to 5.

Returns

DataFrame with errors

Return type

pd.DataFrame
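
A sketch of a typical call; connection details, credentials and names are placeholders:

error_df = dash.upload_from_oracle(
    sql_host="192.168.0.1",        # placeholder host
    sql_port=9005,                 # placeholder port
    sql_service_name="myservice",  # placeholder service name
    sql_username="my_user",        # placeholder credentials
    sql_password="my_password",
    sql_query="select * from my_schema.my_table",  # placeholder query
    dash_table="my_table",         # placeholder Dash table name
    dash_orgname="myorgname",      # placeholder orgname
    dash_api_key="my_api_key",     # placeholder api key
    output_path="./output",        # where csvs of uploaded chunks are written
    chunksize=50000,
)

print(error_df)  # DataFrame describing any chunks that failed to upload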