Process Analytics: Let Engineers Find the Answers Themselves

Real-time monitoring tells you whether equipment is running normally right now. Alerts notify you when something goes wrong. But the most valuable questions in a plant come after that: why did this batch underperform? How strong is the relationship between injection temperature and defect rate? Two machines of the same model, running under the same conditions — why do their performance curves diverge?

Answering these questions does not require a better dashboard. It requires tools for exploration: overlaying several signals on the same chart, selecting time ranges of interest, quantifying the relationship between two parameters, pulling out the “golden batches” and the “problem batches” and comparing them side by side. Until now, this kind of work lived in SQL queries, Excel sheets, or a data engineer’s backlog.

TDengine’s process analytics puts this capability directly in the hands of process and operations engineers. TDengine TSDB provides the underlying statistical and AI analysis functions. TDengine IDMP embeds these capabilities inside the panels and views where data already lives — so engineers can investigate without switching tools and without writing code.

Batches Are the Natural Unit of Industrial Analysis

In process manufacturing, discrete production, and chemical reactions, the most meaningful unit of analysis is not a time window — it is a batch. What happened to process parameters across the full run? Where did this batch diverge from the previous one? Which batches over the past three months had elevated defect rates, and what do they have in common that the good batches do not?

TDengine models each production batch as a complete event with a defined start time, end time, and custom attributes — batch ID, operator, quality outcome, and more. When a batch ID changes, TDengine stream processing automatically closes the previous batch, calculates summary statistics, and generates the batch event record. No manual entry required. Historical data can be backfilled on setup.
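The batch-closing logic can be sketched in plain Python. This is illustrative only: TDengine performs this server-side in stream processing, and the row format and summary fields below are assumptions for the sketch.

```python
from statistics import mean

def segment_batches(rows):
    """Group (timestamp, batch_id, value) rows into batch events.
    A batch closes when the batch ID changes; summary statistics
    are computed for the batch being closed."""
    batches = []
    current_id, start, values, last_ts = None, None, [], None
    for ts, batch_id, value in rows:
        if batch_id != current_id:
            if current_id is not None:
                batches.append({
                    "batch_id": current_id, "start": start, "end": last_ts,
                    "min": min(values), "max": max(values), "avg": mean(values),
                })
            current_id, start, values = batch_id, ts, []
        values.append(value)
        last_ts = ts
    if current_id is not None:  # close the final batch
        batches.append({
            "batch_id": current_id, "start": start, "end": last_ts,
            "min": min(values), "max": max(values), "avg": mean(values),
        })
    return batches

# Two batches: ID 101 then ID 102; the ID change closes batch 101
rows = [(0, 101, 5.0), (1, 101, 7.0), (2, 102, 6.0), (3, 102, 8.0)]
events = segment_batches(rows)
```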

With a full batch archive in place, the analysis can begin. A process team filters for high-defect batches, then overlays their parameter curves against normal batches. IDMP provides several comparison modes. Overlaying all batch curves on a shared time axis quickly surfaces the outliers. Arranging each batch in its own swimlane lets engineers inspect individual run profiles without visual clutter. For batches of unequal duration, time normalization maps every curve to a 0%–100% completion scale — making it possible to compare “the midpoint of the reaction” across runs of any length. Envelope analysis automatically computes the normal parameter range from historical reference batches, drawing a corridor on the chart; any new batch that steps outside that corridor is immediately visible.
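Time normalization and envelope analysis are simple to state precisely. The sketch below (not IDMP's implementation; function names and the 101-point resolution are assumptions) resamples each batch curve onto a 0%–100% completion axis, then takes the per-point min/max across reference batches as the corridor.

```python
def normalize_curve(values, n_points=101):
    """Resample a batch curve onto a 0%-100% completion axis via
    linear interpolation, so runs of unequal length can be compared."""
    if len(values) == 1:
        return [float(values[0])] * n_points
    out = []
    for i in range(n_points):
        pos = i / (n_points - 1) * (len(values) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out

def envelope(reference_batches, n_points=101):
    """Per-point min/max corridor across normalized reference batches."""
    curves = [normalize_curve(b, n_points) for b in reference_batches]
    lower = [min(c[i] for c in curves) for i in range(n_points)]
    upper = [max(c[i] for c in curves) for i in range(n_points)]
    return lower, upper

# Three hypothetical "golden" runs of a reaction parameter
good_runs = [[10, 12, 14, 12, 10], [10, 13, 15, 13, 11], [9, 12, 14, 12, 10]]
lower, upper = envelope(good_runs)
```

A new batch is normalized the same way; any point falling below `lower` or above `upper` steps outside the corridor.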

[Figure: Comparing batch events and generating the metric envelope with a single click in TDengine]

Quantifying the Relationship Between Two Signals

A common engineering question is: do these two parameters move together, and how strongly? A scatter plot with one attribute on each axis makes the relationship visible at a glance. Fitting a regression curve — linear, exponential, or polynomial — quantifies it: the direction, the slope, the degree of nonlinearity. How much does chiller power drop per degree of increase in chilled water setpoint temperature? The answer comes from the data, not from intuition.
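The chiller question reduces to the slope of a least-squares fit. A minimal sketch, with hypothetical readings (the numbers below are invented for illustration, not measured data):

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = a*x + b; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sxy / sxx
    return a, my - a * mx

# Hypothetical samples: chilled water setpoint (degrees C) vs chiller power (kW)
setpoint = [6.0, 6.5, 7.0, 7.5, 8.0]
power = [420.0, 408.0, 397.0, 385.0, 373.0]
slope, intercept = linear_fit(setpoint, power)
# slope is the power change in kW per degree of setpoint increase
```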

For statistical correlation, TDengine’s built-in CORR function computes the Pearson correlation coefficient between any two time-series attributes — grouped by device, or windowed over rolling time intervals. The result is a number between -1 and 1, directly expressing the strength and direction of the linear relationship.
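The statistic itself is standard. A self-contained sketch of the Pearson coefficient that CORR computes (the sample data here is hypothetical):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series:
    covariance divided by the product of standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical injection temperatures and defect rates per run
temps = [180, 185, 190, 195, 200]
defect_rate = [1.2, 1.5, 1.9, 2.4, 2.8]
r = pearson(temps, defect_rate)  # near +1: strong positive linear relation
```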

When the relationship involves a time lag — an upstream parameter change that takes minutes or hours to propagate to a downstream metric — the TLCC function computes cross-correlation across a range of lag steps, estimating the physical propagation delay. This provides a quantitative basis for closed-loop control optimization. For comparing two time series that share a similar shape but differ in phase or sampling rate, the DTW function aligns them in the time domain before computing similarity, handling the kind of temporal distortion that standard correlation cannot.
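The idea behind time-lagged cross-correlation can be sketched as follows: shift one series against the other across a range of lags and keep the lag that maximizes correlation. This is an illustration of the technique, not TDengine's TLCC implementation, and the series below are synthetic.

```python
from math import sqrt

def _pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def tlcc_best_lag(upstream, downstream, max_lag):
    """Correlate the downstream series against the upstream series at
    each candidate lag; the lag with the strongest correlation is an
    estimate of the physical propagation delay."""
    best_lag, best_r = 0, float("-inf")
    for lag in range(max_lag + 1):
        a = upstream[: len(upstream) - lag] if lag else upstream[:]
        b = downstream[lag:]
        r = _pearson(a, b)
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag, best_r

upstream = [0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3]
downstream = [0, 0] + upstream[:-2]  # downstream trails upstream by 2 steps
lag, r = tlcc_best_lag(upstream, downstream, max_lag=4)
```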

Operational Patterns Emerge from Clustering

Equipment operates across multiple distinct regimes, but engineers often cannot say in advance how many regimes exist or where their boundaries lie. Running a clustering algorithm on a scatter plot of two attributes lets the data answer that question directly. Points naturally aggregate into colored regions — the normal operating zone, the degraded zone, and the outliers — without any predefined rules.

TDengine supports K-Means, DBSCAN, Gaussian Mixture Models, and several other clustering algorithms, each suited to a different data distribution. For a wind turbine, clustering wind speed against active power reveals which periods correspond to normal operation, which to yaw misalignment, and which to curtailment — in a single chart. A displaced scatter cluster often carries more diagnostic information than a column of alert log entries.
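For intuition, here is a minimal Lloyd's K-Means on two attributes. This is a toy sketch, not TDengine's implementation; the wind-speed/power samples are invented, and initializing centroids from the first k points is a simplification.

```python
def kmeans(points, k, iters=20):
    """Minimal Lloyd's K-Means: assign each point to its nearest
    centroid, then move each centroid to its cluster's mean.
    Centroids are seeded from the first k points (sketch only)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        for c in range(k):
            members = [p for i, p in enumerate(points) if labels[i] == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids

# Hypothetical wind speed (m/s) vs active power (kW): two operating regimes
points = [(5, 800), (12, 2900), (6, 900), (11, 3000), (5.5, 850), (12.5, 3100)]
labels, centroids = kmeans(points, k=2)
```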

Forecasting: See Where Things Are Heading Before They Arrive

Real-time monitoring tells you where a process is now. Forecasting tells you where it is going. TDgpt’s built-in time-series forecasting engine, exposed through the FORECAST() SQL function, estimates future values of any attribute based on its historical behavior. The forecast curve appears in trend panels as a natural continuation of the historical line. When will a storage tank reach its upper limit? Will compressor discharge temperature breach its threshold in the next 24 hours? How will wastewater plant influent flow change across a holiday weekend? These become data-driven answers rather than judgment calls.

TDgpt’s forecasting library spans statistical models, machine learning, and deep learning — Holt–Winters and ARIMA for structured periodic patterns, Prophet for series with holiday effects and irregular gaps, LSTM and PatchTST for complex nonlinear dependencies, and TDtsfm, TDengine’s pretrained time-series foundation model, for zero-shot forecasting when historical data is limited. FORECAST() can be called directly in SQL for ad hoc queries, or enabled through IDMP’s attribute configuration interface to display predictions persistently on trend panels.
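To make the idea of trend extrapolation concrete, here is a toy Holt linear-trend smoother, one of the simplest members of the statistical family above. It is a sketch of the concept, not TDgpt's FORECAST() implementation, and the tank-level readings are hypothetical.

```python
def holt_forecast(series, horizon, alpha=0.5, beta=0.3):
    """Holt's linear-trend exponential smoothing: maintain a level and
    a trend estimate over the history, then extrapolate both
    `horizon` steps past the last observation."""
    level, trend = series[0], series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + (h + 1) * trend for h in range(horizon)]

# Hypothetical tank level rising about 2 units per sampling interval
tank_level = [10, 12, 14, 16, 18, 20]
forecast = holt_forecast(tank_level, horizon=3)
```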

Exploration Is the Starting Point — Findings Need to Land

When a process engineer discovers through batch analysis that injection temperature consistently drops in the second half of problem batches, the next step is configuring a real-time analysis to monitor that deviation continuously — generating an alert event the moment it appears. The insight developed through process analytics becomes a monitoring rule that runs without anyone watching.

Process analytics, real-time analytics, and AI-powered insights share the same TDengine data foundation. KPI attributes produced by real-time stream computations feed directly into scatter plot regression. Anomaly events detected by AI can be pulled into batch comparison for root cause investigation. Findings validated through process analytics can be pinned as dashboard panels for long-term tracking. Exploration is not an end in itself — it is the first step from data to action.

Frequently Asked Questions

What is the difference between process analytics and real-time analytics, and when should I use each?

Real-time analytics runs continuously on the data stream — automatic, persistent, operating whether or not anyone is watching. Process analytics is on-demand investigation — used after something goes wrong to understand why, or before a process change to establish a quantitative baseline. Both work on the same data. The typical flow is: a real-time alert surfaces an issue, and process analytics is where engineers dig into the root cause.

What setup is needed before batch analysis is available?

The device needs an integer-typed batch ID attribute that increments with each new batch. Once a state window–based stream is configured, TDengine maintains batch event records automatically and can backfill historical data. If batch boundaries are defined by data silence rather than a batch ID — as with some process equipment — a session window trigger achieves the same result.
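Session-window segmentation by data silence amounts to splitting wherever the gap between consecutive points exceeds a threshold. A minimal sketch (the timestamps and 60-second threshold are assumptions for illustration):

```python
def split_by_silence(timestamps, gap_seconds):
    """Session-window style segmentation: start a new batch whenever
    the gap between consecutive points exceeds gap_seconds."""
    batches = []
    current = [timestamps[0]]
    for ts in timestamps[1:]:
        if ts - current[-1] > gap_seconds:
            batches.append(current)
            current = []
        current.append(ts)
    batches.append(current)
    return batches

# Three runs separated by silence (gaps of 280s and 590s)
ts = [0, 10, 20, 300, 310, 900]
runs = split_by_silence(ts, gap_seconds=60)
```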

Can correlation analysis establish causality?

Correlation quantifies the strength and direction of a statistical relationship, but does not prove causation. It is highly effective at narrowing the field — identifying which of many candidate parameters is most strongly associated with a quality outcome — and at validating physical hypotheses with data. The TLCC function goes a step further by estimating propagation delay between variables, which provides directional evidence that can inform causal reasoning alongside domain knowledge.

Both clustering and anomaly detection can surface unusual behavior. What is the difference?

Anomaly detection scans a single time series for deviations over time: it answers “when did this attribute behave abnormally?” Clustering groups data points in a feature space: it answers “what distinct operating regimes exist, and which data points fall outside the normal zone?” The two approaches complement each other — use clustering to define the boundaries of normal operation, then use real-time anomaly detection to monitor whether live data stays within those boundaries.

Do engineers need a data science background to use process analytics?

Process analytics is designed for process and operations engineers, not data scientists. Scatter plot regression and clustering require only selecting the axis attributes and an algorithm — no code. Batch comparison requires filtering events and choosing a display mode. For engineers comfortable with SQL, the CORR, TLCC, and FORECAST functions are available for more flexible ad hoc queries directly in TDengine.