Machines from various projects record data at different frequencies, and when certain events occur, into tables that can be stored on the Edge or in the Cloud, depending on the architecture in place.
The Data Manager exposes an endpoint for retrieving the data written to these tables. It can also perform complex operations on the data and return the processed result to the user.
The endpoint to refer to is the following:
post
Sends a query to the computer, describing a table, to be parsed and evaluated.
Authorizations
Authorization (string, required)
The identity provider authentication token.
Path parameters
assetId (string, required)
Asset instance ID.
Query parameters
orientation (string, required)
Format in which the response should be returned.
Example: list_of_dicts or dict_of_lists
Body
from (string, optional)
Start of the time range.
Example: 2023-01-01T00:00:00.000Z
to (string, optional)
End of the time range.
Example: 2023-01-01T00:00:00.001Z
resample (string, optional)
Time interval to resample by, or the name of a column. When a column is given, all contiguous values of that column are grouped together. When a resample is applied, the resampled key is returned together with timeStart and timeEnd, which represent the start and end timestamps of the resampled interval.
Example: state or 1D
maxPointsToReturn (integer, optional)
Maximum number of values to return when resample is null.
Example: 200
operation (string, optional)
An operation applied to every requested column. The same behaviour can be obtained by specifying the operation on each column.
Example: sum
fill (string, optional)
Either 'None', meaning that no fill operation is performed on the data, or 'null', meaning that the data are filled with None values.
Example: None
subAssetId (string, optional)
Machine code of a component of a main machine (multi-machine case), used to retrieve only that component's data.
Example: Line2
timezone (string, optional)
Timezone used when resampling.
Example: Europe/Berlin
filters (object, optional)
Filters to apply to the data to be returned.
Dictionary representing tables that the DM can compute using some conditions and that can then be used to calculate expressions.
limit (integer, optional)
Number of values to take from the resulting table. E.g. 1 means that only the last value is taken.
Example: 3
orderBy (string, optional)
How the returned data should be ordered.
Example: DESC or ASC
groupBy (object, optional)
List of column names that the Data Manager should use to organize the data. Every table specified in the columns parameter should contain the column names listed here. The output is a dict whose keys are the values taken by the groupBy field and whose values are the tables, as list_of_dicts or dict_of_lists according to orientation.
calendar (boolean, optional)
Whether the calendar should be applied.
checkTimeIntervalExtremes (string, optional)
How to check for data at the extremes of the specified time interval (from - to). Accepted values: 'narrow' (do not check the extremes), 'wide' (check both extremes), 'left' (check only the left extreme, from), 'right' (check only the right extreme, to).
Responses
200
OK
Response (any)
This is an example of a response from the data manager when requested to format its response using lists of dictionaries.
To invoke this endpoint, it is necessary to select the machine from which you want to retrieve data and pass its ID directly within the path.
The only query parameter is "orientation", a string that specifies the format in which the data should be returned. The possible values are "list_of_dicts", which returns the data as a list of dictionaries, and "dict_of_lists", which returns the data as a dictionary of lists.
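The two orientations carry the same information in different shapes. The following is a minimal client-side sketch of how one shape maps onto the other; the field names "time" and "count" are illustrative assumptions, not the service's documented response keys.

```python
from typing import Any


def to_dict_of_lists(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Turn a list of row dictionaries into one dictionary of column lists.

    Assumes every row has the same keys; otherwise the lists would misalign.
    """
    columns: dict[str, list[Any]] = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


def to_list_of_dicts(columns: dict[str, list[Any]]) -> list[dict[str, Any]]:
    """Turn a dictionary of equally long column lists back into row dictionaries."""
    keys = list(columns)
    return [dict(zip(keys, values)) for values in zip(*columns.values())]


# Illustrative data only ("time" and "count" are assumed names).
rows = [
    {"time": "2023-12-01T00:00:00.000Z", "count": 120},
    {"time": "2023-12-02T00:00:00.000Z", "count": 95},
]
as_columns = to_dict_of_lists(rows)
```

"list_of_dicts" is usually convenient when iterating record by record, while "dict_of_lists" tends to suit plotting or column-wise calculations.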
The body is particularly detailed, allowing the service to perform complex data processing operations.
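As a rough sketch of how the three parts of the call fit together (asset ID in the path, "orientation" in the query string, JSON body), the request could be assembled as below. The host and path are hypothetical placeholders, since the actual URL is not given in this text, and whether the token needs a prefix such as "Bearer" is also an assumption left to the reader. The request is only built, not sent.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical placeholders: the real host and path are not stated in this text.
BASE_URL = "https://data-manager.example.com"
PATH_TEMPLATE = "/assets/{asset_id}/query"

ORIENTATIONS = ("list_of_dicts", "dict_of_lists")


def build_request(asset_id: str, token: str, orientation: str, body: dict) -> urllib.request.Request:
    """Build (without sending) a POST request for the given asset."""
    if orientation not in ORIENTATIONS:
        raise ValueError(f"orientation must be one of {ORIENTATIONS}")
    # The asset ID goes in the path, the orientation in the query string.
    path = PATH_TEMPLATE.format(asset_id=urllib.parse.quote(asset_id, safe=""))
    query = urllib.parse.urlencode({"orientation": orientation})
    return urllib.request.Request(
        url=f"{BASE_URL}{path}?{query}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": token, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it once the real URL is known.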
Basic Usage
The first key to analyze is called "columns". This field must be a list of dictionaries where each dictionary indicates something that needs to be calculated or extracted from the data: it could be a formula, an entire column, or the whole table. Each of these dictionaries must be structured with the following keys:
"expression" :
through this key, you can specify to the Data Manager what you want to calculate, as the key must contain a string that indicates a formula to be applied to the data. Note that there is a specific syntax to indicate the columns of the various tables:
If you want to refer to another expression, you can use the following syntax, where "alias" refers to the key used in the dictionary that describes that formula:
For example, if you wanted to retrieve a table column you might use the expression:
If, on the other hand, you wanted to retrieve an entire table, you would use the expression:
Finally, you can construct more complicated forms, such as:
"alias" :
When the service returns the calculated data using what is indicated in the "expression" key, it must be able to assign a name to that value. The content of this key is the value that will be associated with the calculated expression;
"operation" :
for each individual expression, you can specify an operation that must be applied. In this case, the key "operation" must contain a string representing the operation;
"returnScalar" :
a Boolean value that allows specifying whether to return (True) or not (False) the specified value calculated over the entire reference period. If the value is set to False and data are requested over a period of 30 days, then many points representing the trend of the expression in those thirty days will be returned. Specifying the key "returnScalar" to True, on the other hand, will return a single value that corresponds to the expression calculated over the entire reference period.
In addition to the "columns" key, to provide a first example, it is necessary to mention the following keys:
"tz" : the timezone of the considered machine;
"from" : the initial moment from which to retrieve data or perform calculations;
"to" : the final moment beyond which data are not to be considered;
"resample" : defines how to aggregate the data, so as to summarize in points at a frequency established by this value. If no string is specified, then all points in the time period will be returned. Accepted values for this key are, for example, "1h", "1d", "5h", "2d" etc.
Below are some basic examples that can be recreated from the information just mentioned.
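As a starting point, a body using the keys described above could be assembled with small helpers like the ones in this sketch. The helper names are hypothetical, and the expression strings are deliberately left as placeholders because the exact expression syntax is not reproduced in this text.

```python
from typing import Any, Optional

# Placeholders: substitute real expressions written in the Data Manager syntax.
COUNT_EXPRESSION = "<expression selecting the count column of processData>"
GOOD_COUNT_EXPRESSION = "<expression selecting the goodCount column of processData>"


def column(expression: str, alias: str,
           operation: Optional[str] = None,
           return_scalar: Optional[bool] = None) -> dict[str, Any]:
    """Build one entry of the "columns" list, omitting keys that are not set."""
    entry: dict[str, Any] = {"expression": expression, "alias": alias}
    if operation is not None:
        entry["operation"] = operation
    if return_scalar is not None:
        entry["returnScalar"] = return_scalar
    return entry


def build_body(columns: list[dict[str, Any]], start: str, end: str,
               resample: Optional[str] = None,
               tz: Optional[str] = None) -> dict[str, Any]:
    """Build a request body with "columns", "from", "to" and the optional keys."""
    body: dict[str, Any] = {"columns": columns, "from": start, "to": end}
    if resample is not None:
        body["resample"] = resample
    if tz is not None:
        body["tz"] = tz
    return body


body = build_body(
    [column(COUNT_EXPRESSION, "count"), column(GOOD_COUNT_EXPRESSION, "goodCount")],
    start="2023-12-01T00:00:00.000Z",
    end="2024-01-01T00:00:00.000Z",
    resample="1d",
    tz="Europe/Berlin",
)
```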
1- Retrieving entire columns:
The response will be in a format similar to the one presented below:
In the request, the user asked to download the two columns "count" and "goodCount" from the table "processData", with a single point per day (resample = "1d"), over the period delimited by "from": "2023-12-01T00:00:00.000Z" and "to": "2024-01-01T00:00:00.000Z".
2- Retrieving a single scalar value over a broad time period:
The response will be in a format similar to the following:
As can be seen, after setting the "returnScalar" key in each of the expressions, a single value summarizing each of the two columns over the whole period was returned, in addition to the time series.
3- Calculating various formulas with references to other expressions:
The response will be similar to the following:
In this example, the "$K_" expression was used, which allowed referring to other formulas previously specified.
Intermediate Usage
Other keys of interest that can be used within the body to refine the data processing further include:
"operation" :
this key allows specifying an operation that must be applied to each of the expressions listed in the "columns" key. It can be a valid alternative when you do not want to specify the same key in every dictionary of "columns", or if you want to ensure compatibility in data dimension. Indeed, if an operation were specified for only one of the indicated expressions, it could lead to an error due to the difference in the size of the various calculated data.
An example of use is as follows:
The response will no longer contain the time series, as an operation to sum all columns has been requested, which will be performed over the requested time period. Therefore, a single scalar value will be returned for each element in "columns". An example response is provided below:
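The relationship between the top-level "operation" and the per-column "operation" can be pictured with the sketch below. It rewrites a body so that the top-level operation is copied into each column, and it flags the mixed case that the text warns about. The function names are hypothetical, and keeping a column's own operation when both levels are set is an assumption, since the precedence is not stated here.

```python
from typing import Any


def spread_operation(body: dict[str, Any]) -> dict[str, Any]:
    """Return a body where the top-level "operation" is copied into each column.

    Assumption: a column that already has its own operation keeps it.
    """
    operation = body.get("operation")
    if operation is None:
        return dict(body)
    columns = [
        {**col, "operation": col.get("operation", operation)}
        for col in body["columns"]
    ]
    rest = {k: v for k, v in body.items() if k not in ("operation", "columns")}
    return {**rest, "columns": columns}


def has_mixed_operations(columns: list[dict[str, Any]]) -> bool:
    """True when some, but not all, columns specify an operation."""
    flags = ["operation" in col for col in columns]
    return any(flags) and not all(flags)


body = {
    "operation": "sum",
    "columns": [
        {"expression": "<expression 1>", "alias": "count"},
        {"expression": "<expression 2>", "alias": "goodCount"},
    ],
}
expanded = spread_operation(body)
```

Checking `has_mixed_operations` before sending a request is one way to catch the size mismatch early on the client side.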
"limit" :
through this key, you can specify how many rows should be returned starting from the bottom of the specified table. This key is used as an alternative to "from" and "to". An example of use is as follows:
In this case, the request asks for the last three values taken by the aggregates "aggr0" and "aggr1". The response is in the following format:
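The effect of "limit" can be pictured locally as taking rows from the bottom of a table. This is only an illustration of the semantics described above, not the service's implementation, and the sample values are made up.

```python
from typing import Any


def take_last(rows: list[dict[str, Any]], limit: int) -> list[dict[str, Any]]:
    """Return the last `limit` rows of a table, preserving their order."""
    if limit <= 0:
        # Guard needed because rows[-0:] would return the whole table.
        return []
    return rows[-limit:]


# Made-up values for the "aggr0" aggregate.
table = [{"aggr0": value} for value in (10, 11, 12, 13, 14)]
last_three = take_last(table, 3)
```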
"filters" :
the Data Manager supports the use of filters to be applied to the data to be processed. This can be done using the appropriate key and the structure for the filter as specified on the General Rules of the Data Manager page.
An example is provided:
The response below represents how many pieces were made in March by the operator "Waluigi":
"groupBy" : this field allows grouping based on the values taken by one or more columns. The field can be a single string or a list of strings. For example:
The response should contain the data of "historyBreakdowns" grouped by the value taken by the "cause" column. An example of response is as follows:
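To make the grouped shape concrete, the sketch below reproduces it locally for the "list_of_dicts" orientation: a dictionary keyed by the values of the grouping column, each holding the rows for that value. The sample rows, the "duration" field, and the use of a tuple as key when grouping by several columns are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Any, Union


def group_rows(rows: list[dict[str, Any]],
               group_by: Union[str, list[str]]) -> dict[Any, list[dict[str, Any]]]:
    """Group rows by one column name or by a list of column names."""
    keys = [group_by] if isinstance(group_by, str) else list(group_by)
    groups: dict[Any, list[dict[str, Any]]] = defaultdict(list)
    for row in rows:
        # Single column: the value itself is the key. Several columns: a tuple (assumption).
        label = row[keys[0]] if len(keys) == 1 else tuple(row[k] for k in keys)
        groups[label].append(row)
    return dict(groups)


# Made-up rows standing in for the "historyBreakdowns" table.
breakdowns = [
    {"cause": "jam", "duration": 30},
    {"cause": "ukw", "duration": 12},
    {"cause": "jam", "duration": 45},
]
grouped = group_rows(breakdowns, "cause")
```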
Advanced Usage
"resample" :
this key can be used to indicate how to aggregate data to summarize multiple points into a single value, but it can also be used to aggregate data according to the value of a certain column. In fact, you can pass the name of the column for which you want to resample within the key. An example is as follows:
In the response, you can expect to obtain all the states that have occurred over the requested time period. Note that, by performing resampling on a column, the response includes additional keys such as "timeFrom", "timeTo", and "duration" that indicate the start of a record, the end time, and its duration. This functionality is particularly useful when using tables written at a constant frequency as it allows transforming the table as if it had been written at the occurrence of an event of change in the column used for resample. Below is an example of the output:
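The transformation described above, from a table written at constant frequency to one row per change of value, can be sketched locally as a run-length grouping. The "time" field name, the choice of closing each interval at the first sample of the next run, and expressing "duration" in seconds are all assumptions; the service may define these details differently.

```python
from datetime import datetime
from itertools import groupby
from typing import Any


def resample_by_column(rows: list[dict[str, Any]], column: str,
                       time_key: str = "time") -> list[dict[str, Any]]:
    """Collapse consecutive rows with the same value of `column` into intervals."""
    runs = [list(run) for _, run in groupby(rows, key=lambda row: row[column])]
    intervals = []
    for index, run in enumerate(runs):
        start = run[0][time_key]
        # Assumption: an interval ends when the next run starts;
        # the final run ends at its own last sample.
        end = runs[index + 1][0][time_key] if index + 1 < len(runs) else run[-1][time_key]
        intervals.append({
            column: run[0][column],
            "timeFrom": start,
            "timeTo": end,
            "duration": (end - start).total_seconds(),
        })
    return intervals


# Made-up samples written once per minute.
samples = [
    {"time": datetime(2023, 12, 1, 0, minute), "state": state}
    for minute, state in enumerate(["RUN", "RUN", "STOP", "STOP", "RUN"])
]
intervals = resample_by_column(samples, "state")
```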
"placeholders" :
some projects use machines composed of multiple sub-machines or components. This key allows specifying which sub-machine or component is being referred to, in order to use the correct tables. The key is associated with a dictionary in the following form:
"maxPointsToReturn" :
as an alternative to the "resample" key, the "maxPointsToReturn" key can be used; it must be an integer. Given the reference interval of the call, identified by the "from" and "to" keys, it tells the Data Manager how many points should be returned in that interval. This lets the service limit the amount of data without the caller needing to know which resample value would be appropriate.
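To get a feel for what a given "maxPointsToReturn" implies, the sketch below computes the average spacing between points if the interval were split evenly. This is only a back-of-the-envelope aid; how the Data Manager actually selects or aggregates the points is not described here.

```python
from datetime import datetime, timedelta


def average_spacing(start: datetime, end: datetime, max_points: int) -> timedelta:
    """Average time between points if `max_points` were spread evenly over the interval."""
    if max_points <= 0:
        raise ValueError("max_points must be a positive integer")
    if end < start:
        raise ValueError("end must not precede start")
    return (end - start) / max_points


# One day split into at most 200 points.
spacing = average_spacing(datetime(2023, 12, 1), datetime(2023, 12, 2), 200)
```

Here one day divided into 200 points gives a spacing of 7 minutes and 12 seconds.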
"calendar" :
this key takes a boolean value that tells the service whether to use the shifts set on the calendar when calculating the requested data. Note that the calendar can only be set via the application. Furthermore, if no calendar has been created for the selected machine and the request is made with the "calendar" key set to "True", no data will be returned.
"childTables" :
in some cases, it may be necessary to define custom tables. These structures are not really written by the machine, but are created from existing tables by applying certain conditions. This functionality can be useful, for example, to apply filters on tables not prepared for filtering. Below is an example:
In the example, the request asks for the rows of the "historyBreakdowns" table whose "cause" column has a value other than "ukw". Here the column is of the "field" type, so a filter cannot be applied to it directly; the child table structure defined in the example above can be used instead. Each child table must be defined using the following structure, where "columnType" identifies the type of the column and can take the value "tags" or "fields", while "parent" indicates the actual table to which the filter is applied.
The output of the example is shown below:
The Data Manager also exposes a second endpoint that allows for data calculation.
post
Sends a query to the computer, describing a complex table of data that has to be computed using data from different assets.
Authorizations
Authorization (string, required)
The identity provider authentication token.
Query parameters
orientation (string, required)
Format in which the response should be returned.
Example: list_of_dicts or dict_of_lists
Body
subAssetId (string, optional)
Present when the machine/asset is composed of multiple lines.
from (string, optional)
Start of the time range.
Example: 2023-01-01T00:00:00.000Z
to (string, optional)
End of the time range.
Example: 2023-01-01T00:00:00.001Z
resample (string, optional)
Time interval to resample by, or the name of a column. When a column is given, all contiguous values of that column are grouped together. When a resample is applied, the resampled key is returned together with timeStart and timeEnd, which represent the start and end timestamps of the resampled interval.
Example: state or 1D
maxPointsToReturn (integer, optional)
Maximum number of values to return when resample is null.
Example: 200
operation (string, optional)
An operation applied to every requested column. The same behaviour can be obtained by specifying the operation on each column.
Example: sum
fill (string, optional)
Either 'None', meaning that no fill operation is performed on the data, or 'null', meaning that the data are filled with None values.
Example: None
timezone (string, optional)
Timezone used when resampling.
Example: Europe/Berlin
filters (object, optional)
Filters to apply to the data to be returned.
Dictionary representing tables that the DM can compute using some conditions and that can then be used to calculate expressions.
limit (integer, optional)
Number of values to take from the resulting table. E.g. 1 means that only the last value is taken.
Example: 3
orderBy (string, optional)
How the returned data should be ordered.
Example: DESC or ASC
groupBy (object, optional)
List of column names that the Data Manager should use to organize the data. Every table specified in the columns parameter should contain the column names listed here. The output is a dict whose keys are the values taken by the groupBy field and whose values are the tables, as list_of_dicts or dict_of_lists according to orientation.
calendar (boolean, optional)
Whether the calendar should be applied.
checkTimeIntervalExtremes (string, optional)
How to check for data at the extremes of the specified time interval (from - to). Accepted values: 'narrow' (do not check the extremes), 'wide' (check both extremes), 'left' (check only the left extreme, from), 'right' (check only the right extreme, to).
Responses
200
OK
Response (any)
This is an example of a response from the data manager when requested to format its response using lists of dictionaries.
The functionality is exactly the same as that described so far, except that the ID of the machine for which the calculations are made is not passed in the path. Instead, the expressions in the "columns" key are more elaborate than those shown earlier, because the ID of the asset to which a table belongs must be specified in front of the table name. The expression for a column then becomes:
while the expression for referring to a previously defined expression remains the same: