comotion.dash module
class comotion.dash.DashConfig(auth: comotion.auth.Auth)

    Object containing configuration information for the Dash API.
    auth

        comotion.Auth object holding information about authentication.

        Type: comotion.Auth
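    A minimal sketch of building a DashConfig. The orgname is a placeholder,
    and the assumption that comotion.Auth is constructed from the Dash
    orgname should be checked against the comotion.auth documentation:

        from comotion import Auth
        from comotion.dash import DashConfig

        # Assumption: Auth is constructed with your Dash orgname.
        auth = Auth("myorgname")
        config = DashConfig(auth)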
class comotion.dash.Query(config: comotion.dash.DashConfig, query_text: str = None, query_id: str = None)

    The query object starts and tracks a query on Comotion Dash.

    Initialising this class runs a query on Comotion Dash and stores the
    resulting query id in query_id.
    COMPLETED_STATES = ['SUCCEEDED', 'CANCELLED', 'FAILED']
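    A short sketch of starting a query (config as built above; the SQL is
    illustrative):

        from comotion.dash import Query

        # Initialising the object runs the query on Dash.
        query = Query(config, query_text="SELECT * FROM my_table")
        print(query.query_id)  # id assigned by Dash for tracking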
    download_csv(output_file_path, fail_if_exists=False)

        Download a csv of the results and check that the total file size is
        correct.

        Parameters:
            output_file_path (file path) – Path of the file to output to
            fail_if_exists (bool, optional) – If True, fail if the target
                file already exists. Defaults to False.

        Raises:
            IncompleteRead – Raised if only part of the file was downloaded
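        For example, once the query has reached a final state (a sketch; the
        output path is illustrative):

            query.download_csv(
                output_file_path="results.csv",
                fail_if_exists=True,  # refuse to overwrite an existing file
            )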
    get_csv_for_streaming() → urllib3.response.HTTPResponse

        Returns a urllib3.response.HTTPResponse object that can be used for
        streaming. This allows use of the downloaded file without having to
        save it to local storage. Be sure to call .release_conn() when
        completed, to ensure that the connection is released. This can be
        achieved using the with notation, e.g.:

            with query.get_csv_for_streaming() as response:
                for chunk in response.stream():
                    # do something with chunk
                    # chunk is a bytes object
    get_query_info() → comodash_api_client_lowlevel.model.query.Query

        Gets the state of the query.

        Returns:
            Model containing all query info, with the following attributes:

            query
                query sql
            query_id
                query_id of the query
            status
                status of the query, with the following attributes:

                completion_date_time
                    GMT completion time
                state
                    Current state of the query. One of QUEUED, RUNNING,
                    SUCCEEDED, FAILED, CANCELLED
                stateChangeReason
                    Information about the reason for the state change
                    (generally failure)
                submission_date_time
                    GMT submission time

        Return type:
            QueryInfo
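        A sketch of inspecting the result; the nested attribute access
        assumes the low-level model exposes status as an object, per the
        structure above:

            info = query.get_query_info()
            print(info.query_id)
            # Assumption: nested status attributes are reachable like this.
            print(info.status.state)  # e.g. 'RUNNING'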
    is_complete() → bool

        Indicates whether the query is in a final state. This means it has
        either succeeded, failed or been cancelled.

        Returns:
            Whether the query is complete

        Return type:
            bool
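        A sketch of a polling loop with a simple timeout (the timeout
        handling is illustrative, not part of the SDK):

            import time

            deadline = time.time() + 600  # ten minutes, illustrative
            while not query.is_complete():
                if time.time() > deadline:
                    raise TimeoutError("query did not complete in time")
                time.sleep(5)  # avoid hammering the API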
comotion.dash.create_gzipped_csv_stream_from_df(df: pandas.core.frame.DataFrame) → _io.BytesIO

    Returns a gzipped, utf-8 csv file bytestream from a pandas dataframe.

    Useful to help upload dataframes to Dash.

    It does not break the file up, so be sure to chunk the dataframe to a
    suitable maximum size before applying it; otherwise the Dash maximum
    file limits will cause an error.

    Parameters:
        df (pd.DataFrame) – Dataframe to be turned into a bytestream

    Returns:
        The bytestream

    Return type:
        io.BytesIO
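    For example, chunking a dataframe before converting each piece (a
    sketch; the chunk size is illustrative):

        import pandas as pd

        from comotion.dash import create_gzipped_csv_stream_from_df

        df = pd.DataFrame({"id": range(100000), "value": "x"})

        # Chunk the dataframe first: this function does not split large files.
        chunksize = 30000  # keep below the Dash maximum gzipped file size
        for start in range(0, len(df), chunksize):
            chunk = df.iloc[start:start + chunksize]
            stream = create_gzipped_csv_stream_from_df(chunk)
            # stream is an io.BytesIO holding a gzipped, utf-8 csv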
comotion.dash.read_and_upload_file_to_dash(file: Union[str, _io.FileIO], dash_table: str, dash_orgname: str, dash_api_key: str, encoding: str = 'utf-8', chunksize: int = 30000, modify_lambda: Callable = None, path_to_output_for_dryrun: str = None, service_client_id: str = '0')

    Reads a file and uploads it to Dash.

    This function will:

    - read a csv file
    - break it up into multiple csvs, each with a maximum number of lines
      defined by chunksize
    - upload them to Dash
    Parameters:
        file (Union[str, io.FileIO]) – Either a path to the file to be
            uploaded, or a FileIO stream representing the file to be
            uploaded. Should be an unencrypted, uncompressed CSV file.
        dash_table (str) – Name of the Dash table to upload the file to
        dash_orgname (str) – Orgname of your Dash instance
        dash_api_key (str) – Valid api key for the Dash API
        encoding (str) – The encoding of the source file. Defaults to utf-8.
        chunksize (int) – (optional) The maximum number of lines to be
            included in each file. Note that this should be low enough that
            each gzipped file is smaller than the Dash maximum gzipped file
            size. Defaults to 30000.
        modify_lambda – (optional) A callable that receives the pandas
            dataframe read from the csv, giving the opportunity to modify
            it, such as adding a timestamp column. Not required.
        path_to_output_for_dryrun (str) – (optional) If specified, no upload
            is made to Dash; instead, files are saved to the given location.
            This is useful for testing. Multiple files are created:
            [table_name].[i].csv.gz, where i represents each file part.
        service_client_id (str) – (optional) If specified, sets the service
            client for the upload. See the Dash documentation for an
            explanation of service clients.

    Returns:
        List of http responses

    Return type:
        List
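    A usage sketch (the file path, table name, orgname and key are
    placeholders):

        from comotion.dash import read_and_upload_file_to_dash

        def add_source_column(df):
            # Illustrative modify_lambda; the dataframe is assumed to be
            # modified in place before upload.
            df["source_file"] = "sales.csv"

        responses = read_and_upload_file_to_dash(
            file="sales.csv",
            dash_table="sales",
            dash_orgname="myorgname",
            dash_api_key="my-api-key",
            modify_lambda=add_source_column,
        )

    Passing path_to_output_for_dryrun instead writes the gzipped chunks
    locally, which is useful for checking the output before a real upload.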
comotion.dash.upload_csv_to_dash(dash_orgname: str, dash_api_key: str, dash_table: str, csv_gz_stream: _io.FileIO, service_client_id: str = '0') → requests.models.Response

    Uploads a gzipped csv stream to Dash.
    Parameters:
        dash_orgname (str) – Dash organisation name for the Dash instance
        dash_api_key (str) – Valid API key for the organisation instance
        dash_table (str) – Table name to upload to
        csv_gz_stream (io.FileIO) – Gzipped csv stream to upload

    Returns:
        Response from the Dash api

    Return type:
        requests.Response

    Raises:
        HTTPError – If one is raised by the call
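    This pairs naturally with create_gzipped_csv_stream_from_df (a sketch;
    credentials are placeholders):

        import pandas as pd

        from comotion.dash import (
            create_gzipped_csv_stream_from_df,
            upload_csv_to_dash,
        )

        df = pd.DataFrame({"id": [1, 2], "value": ["a", "b"]})
        stream = create_gzipped_csv_stream_from_df(df)

        response = upload_csv_to_dash(
            dash_orgname="myorgname",
            dash_api_key="my-api-key",
            dash_table="my_table",
            csv_gz_stream=stream,
        )
        print(response.status_code)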
comotion.dash.upload_from_oracle(sql_host: str, sql_port: int, sql_service_name: str, sql_username: str, sql_password: str, sql_query: str, dash_table: str, dash_orgname: str, dash_api_key: str, dtypes: dict = None, include_snapshot: bool = True, export_csvs: bool = True, chunksize: int = 50000, output_path: str = None, sep: str = '\t', max_tries: int = 5)

    Uploads data from an Oracle SQL database object to Dash.

    This function will:

    - get the total number of row chunks needed for the sql query
    - get chunks of data from the SQL database
    - upload them to Dash
    - append them to csv output (if specified)
    - save error chunks as csv (if any)
    Parameters:
        sql_host (str) – SQL host, e.g. '192.168.0.1'
        sql_port (int) – SQL port number, e.g. 9005
        sql_service_name (str) – SQL service name, e.g. 'myservice'
        sql_username (str) – SQL username
        sql_password (str) – SQL password
        sql_query (str) – SQL query
        dash_table (str) – Name of the Dash table to upload the data to
        dash_orgname (str) – Orgname of your Dash instance
        dash_api_key (str) – Valid api key for the Dash API
        export_csvs – (optional) If True, data successfully uploaded is
            exported as csv. Defaults to True, i.e. output is included.
        dtypes (dict) – (optional) A dictionary mapping column names to the
            data types to convert them to. Defaults to None, i.e. dataframe
            columns are loaded as they are.
        chunksize (int) – (optional) The maximum number of lines to be
            included in each file. Note that this should be low enough that
            each gzipped file is smaller than the Dash maximum gzipped file
            size. Defaults to 50000.
        sep (str) – (optional) Field delimiter for the ComoDash table to
            upload the dataframe to. Defaults to '\t'.
        output_path (str) – (optional) If specified, output csvs are saved
            in that location; if not, output is placed in the same location
            as the script. Defaults to None.
        include_snapshot (bool) – (optional) If True, an additional column
            'snapshot_timestamp' is added to the DataFrame, containing the
            time the data was loaded, in "%Y-%m-%d %H:%M:%S.%f" format, to
            help with database management. Defaults to True, i.e.
            snapshot_timestamp is included.
        max_tries (int) – (optional) Maximum number of times to retry if
            there is an HTTP error. Defaults to 5.

    Returns:
        DataFrame with errors

    Return type:
        pd.DataFrame
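    A usage sketch (connection details, query and credentials are
    placeholders):

        from comotion.dash import upload_from_oracle

        error_df = upload_from_oracle(
            sql_host="192.168.0.1",
            sql_port=9005,
            sql_service_name="myservice",
            sql_username="scott",
            sql_password="secret",
            sql_query="SELECT * FROM sales",
            dash_table="sales",
            dash_orgname="myorgname",
            dash_api_key="my-api-key",
        )

        # Chunks that failed to upload come back in the returned DataFrame;
        # the function also saves them as csv.
        if not error_df.empty:
            print("some chunks failed to upload")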