Prodata Documentation#
prodata#
Simplified proapi access, response formatting and preprocessing of machine
data for data analytics.
Dependencies#
Python: >= 3.10
In addition to the dependencies listed in requirements.txt, the package proapi
(provided by Proemion) is required.
What is it?#
This package has two focus areas, described below:
ProQuery#
The ProQuery class offers several helpers for passing data into API queries
and for returning responses directly in formats such as pandas.DataFrame.
It also handles API pagination, date formatting, and similar tasks.
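To make the pagination point concrete, here is a minimal sketch that is independent of prodata and proapi. The function `fetch_page` and its `offset`/`limit` parameters are hypothetical stand-ins for a paginated endpoint; they are assumptions for illustration and not the actual API.

```python
from typing import Callable


def collect_all_pages(
    fetch_page: Callable[[int, int], list[dict]], limit: int = 2
) -> list[dict]:
    """Call fetch_page(offset, limit) until a short or empty page comes back."""
    records: list[dict] = []
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        records.extend(page)
        # A page shorter than the limit means there is nothing left to fetch.
        if len(page) < limit:
            break
        offset += limit
    return records


# Hypothetical in-memory "endpoint", used only for this demonstration.
_FAKE_MACHINES = [{"id": f"m{i}"} for i in range(5)]


def fake_fetch_page(offset: int, limit: int) -> list[dict]:
    return _FAKE_MACHINES[offset:offset + limit]


all_machines = collect_all_pages(fake_fetch_page, limit=2)
print(len(all_machines))
```

A helper of this kind lets the caller work with one complete result set instead of looping over pages by hand.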
Preprocessing#
The preprocessing component offers advanced data processing pipelines built on
a scikit-learn-based transforms framework.
How to use#
ProQuery#
Example: Get all machines with a specific ECU software version and a certain operating hours range.#
# imports
from prodata.proquery import ProQuery
from collections import namedtuple
# Instantiate ProQuery and API and perform authentication.
pq = ProQuery(client_id="YOUR_CLIENT_ID", client_secret="YOUR_CLIENT_SECRET")
API = pq.api
# Set signal keys and query machines with the query string 'q'.
signal = namedtuple("Signal", "type key")
signal_hours = signal(type='numeric',
                      key='value.common.machine.hours.operation.total')
signal_ecu = signal(type='string',
                    key='value.custom.ecu.drive.software.identification')
machines = pq.get_df(
    API.machines_api.machines_get_machines,
    q=f'measurements.{signal_hours.type}.{signal_hours.key}=gt=1000 and '
      f'measurements.{signal_hours.type}.{signal_hours.key}=lt=3000 and '
      f'measurements.{signal_ecu.type}.{signal_ecu.key}==1.09'
)
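The query string above is built from three comparisons joined with `and`. As a minimal sketch, the same string can be assembled with two small helpers. `measurement_clause` and `all_of` are hypothetical names introduced here for illustration; they are not part of prodata or proapi.

```python
from collections import namedtuple

Signal = namedtuple("Signal", "type key")


def measurement_clause(signal: Signal, operator: str, value) -> str:
    """Return one comparison, e.g. 'measurements.numeric.<key>=gt=1000'."""
    return f"measurements.{signal.type}.{signal.key}{operator}{value}"


def all_of(*clauses: str) -> str:
    """Join comparisons so that all of them must hold."""
    return " and ".join(clauses)


signal_hours = Signal(type="numeric",
                      key="value.common.machine.hours.operation.total")
signal_ecu = Signal(type="string",
                    key="value.custom.ecu.drive.software.identification")

q = all_of(
    measurement_clause(signal_hours, "=gt=", 1000),
    measurement_clause(signal_hours, "=lt=", 3000),
    measurement_clause(signal_ecu, "==", "1.09"),
)
print(q)
```

The resulting `q` is identical to the f-string version and could be passed to `pq.get_df` in the same way.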
Alright, that was easy. Now let's check which of these machines have a specific DTC active:
# additional imports
import datetime
# Define a timerange.
_from, to = pq.convert_to_posix((2024, 10, 1), datetime.datetime.now())
# Query machines which have DTCs with specific target Source, SPN and FMI.
target_spn = '520568'
target_fmi = '22'
target_source = '0'
machines_w_errors = pq.input_df(
    API.j1939_api.j1939_get_machines_id_dtcs,
    data=machines,
    params_col_names={'id': 'id'},
    q=f'source == {target_source} '
      f'and spn=={target_spn} '
      f'and fmi=={target_fmi} '
      f'and active == true',
    _from=_from, to=to)
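The time range above is given as a date tuple and a datetime and then converted to POSIX timestamps. The sketch below shows one way such a conversion can be done with the standard library alone. It is not the implementation of `pq.convert_to_posix`: the helper name `to_posix_seconds`, the use of seconds as the unit, and the treatment of naive values as UTC are all assumptions made for illustration.

```python
import datetime


def to_posix_seconds(value) -> int:
    """Convert a (year, month, day) tuple or a datetime to POSIX seconds.

    Assumption: values without timezone information are interpreted as UTC.
    """
    if isinstance(value, tuple):
        value = datetime.datetime(*value)
    if value.tzinfo is None:
        value = value.replace(tzinfo=datetime.timezone.utc)
    return int(value.timestamp())


start = to_posix_seconds((2024, 10, 1))
end = to_posix_seconds(datetime.datetime(2024, 10, 2))
print(start, end - start)
```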
Great, now let's check whether the machines showing the error have overdue maintenance tasks that have not been addressed yet:
# Filter all machines having the DTC.
machines_w_errors = machines_w_errors[machines_w_errors['status'].notna()]
# Get info for maintenance tasks which were not addressed yet.
# NOTE: The call below shows how DataFrame columns are passed into an RSQL query.
machines_w_errors = pq.input_df(
    API.maintenance_tasks_api.maintenance_tasks_get_maintenance_tasks,
    rsql_cols=['input_id'],
    q='machine.id=={} '
      'and deadline == "overdue" '
      'and progress =out= ("skipped", "completed")',
    data=machines_w_errors)
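The `{}` placeholder in the query above is filled from the column named in `rsql_cols`. How prodata performs this substitution internally is not shown here, so the following is only a sketch of the general idea in plain Python. `fill_rsql_template` is a hypothetical helper, and filling the placeholders positionally, one query per row, is an assumption.

```python
def fill_rsql_template(template: str, rows: list[dict], cols: list[str]) -> list[str]:
    """Return one query string per row.

    Each '{}' in the template is filled, in order, with the row's value
    for the corresponding column name in cols.
    """
    return [template.format(*(row[col] for col in cols)) for row in rows]


# Hypothetical input rows standing in for the rows of a DataFrame.
rows = [{"input_id": "machine-a"}, {"input_id": "machine-b"}]
template = ('machine.id=={} '
            'and deadline == "overdue" '
            'and progress =out= ("skipped", "completed")')

queries = fill_rsql_template(template, rows, cols=["input_id"])
print(queries[0])
```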