by Anahid Jalali (AIT), Andreas Rauber (TUWien), Jasmin Lampert (AIT)

Deep learning models for time series prediction have become popular with the rise of IoT and the growing availability of sensor data. However, their lack of explainability hampers their use in critical industrial applications. Existing model-agnostic approaches such as LIME and SHAP have been applied to time series classification, but they are not always well suited to it. For example, the random sampling process used by LIME leads to unstable explanations. To address this issue, we propose a counterfactual explanation approach that provides interpretable insights into time series predictions. We choose an industrial use case, determining machine health, and employ k-means clustering and Dynamic Time Warping (DTW) to handle the temporal dimension. DTW compares and aligns two time series by finding the alignment path that minimises the disparities in their temporal patterns. We explain the model's decisions using local surrogate decision trees, analysing feature importance and decision cuts.

For this purpose, we focus on a time series classification task that determines whether a machine is healthy or unhealthy. We introduce an automatic example-based approach to extract factual and counterfactual samples from clustered input data, enabling justification of model classification results. To achieve this, we employ the k-means algorithm to partition the time series into clusters, minimising clustering error by considering the sum of squared distances from each data point to its cluster centre. Unlike spatial distance-based approaches, we utilise Dynamic Time Warping (DTW) as a similarity measure to account for the temporal dimension inherent in time series data.
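The DTW similarity measure mentioned above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the implementation used in the study: the function name dtw_distance, the absolute difference as point-wise cost and the restriction to univariate sequences are all assumptions.

```python
import math


def dtw_distance(a, b):
    """Dynamic Time Warping distance between two univariate sequences.

    Assumption: the point-wise cost is the absolute difference.
    """
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment cost of a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three admissible predecessor alignments
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Because the alignment may repeat time steps, two sequences that show the same pattern at different speeds obtain a small distance, which a point-by-point Euclidean comparison would not give.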

To explain the model's decisions, we identify each cluster's most important time series features, including frequencies, amplitude, pitch, mean and standard deviation. We further employ local surrogate decision trees (DTs) as interpretable models to explain the black-box decisions. DTs are well-known for their interpretability and require fewer resources than other methods. We elucidate the temporal changes and their contributions to the prediction results by analysing these DTs' feature importance and decision cuts. The tree's feature importance is used to identify the top influential parameters, and the decision cuts help extract classification decision boundaries.
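To make the notions of feature importance and decision cut concrete, the sketch below finds the single best split of a depth-one tree (a decision stump) using Gini impurity. It is a deliberate simplification of a full surrogate decision tree and not the code used in the study; the names gini and best_cut, and the feature layout in the usage note, are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_cut(rows, labels):
    """Return (feature_index, threshold, impurity_decrease) of the best single split.

    rows is a list of equal-length feature vectors, labels the classes
    that the black-box model predicted for them.
    """
    n = len(rows)
    parent = gini(labels)
    best = (None, None, 0.0)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        # candidate thresholds lie midway between consecutive distinct values
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= thr]
            right = [y for r, y in zip(rows, labels) if r[f] > thr]
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            gain = parent - child
            if gain > best[2]:
                best = (f, thr, gain)
    return best
```

Called with rows such as [[0.1, 5.0], [0.2, 1.0], [0.8, 4.0], [0.9, 2.0]] and the corresponding predicted classes, the returned feature index plays the role of the most important feature and the threshold that of the decision cut. A full tree repeats this search recursively on each side of the cut.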

We tested our proposed approach on the Commercial Modular Aero-Propulsion System Simulation (C-MAPSS) dataset [L1], a widely cited dataset used for Prognostics and Health Management tasks. The dataset consists of 218 engines that start in a healthy state and experience artificially injected faults until breakdown. We experimented with different numbers of clusters, evaluating them using the silhouette metric to determine the optimal number of clusters and to create smaller neighbourhoods.
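The silhouette evaluation can be illustrated with a small, self-contained sketch. It is not the code used in the experiments: the function names silhouette and _mean and the dist parameter are hypothetical, and any pairwise distance, for example a DTW distance, could be passed in.

```python
def _mean(values):
    return sum(values) / len(values)


def silhouette(samples, labels, dist):
    """Mean silhouette coefficient over all samples.

    dist is any pairwise distance function; samples in singleton
    clusters contribute a coefficient of 0.
    """
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")
    total = 0.0
    for i, (x, own) in enumerate(zip(samples, labels)):
        same = [
            dist(x, y)
            for j, (y, lab) in enumerate(zip(samples, labels))
            if lab == own and j != i
        ]
        if not same:
            continue  # singleton cluster
        a = _mean(same)  # mean distance to the sample's own cluster
        b = min(         # mean distance to the nearest other cluster
            _mean([dist(x, y) for y, lab in zip(samples, labels) if lab == other])
            for other in clusters
            if other != own
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(samples)
```

Values close to 1 indicate compact, well-separated clusters. Comparing the score across candidate numbers of clusters is one way to choose between them.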

We calculate the closest clusters representing healthy and unhealthy conditions for each test sequence. We then utilise a trained LSTM model to predict the class of the test sample and, based on the prediction, assign the factual and counterfactual clusters accordingly. We repeat this process for all sequences of one engine.
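The assignment of factual and counterfactual clusters can be sketched as follows. This is an illustration under stated assumptions rather than the study's code: pick_examples and dtw_distance are hypothetical names, each cluster centre is assumed to carry a single class label, and the model's prediction is passed in as predicted_label instead of calling a trained LSTM.

```python
import math


def dtw_distance(a, b):
    """DTW distance with absolute-difference cost (univariate sequences)."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]


def pick_examples(sequence, centres, centre_labels, predicted_label):
    """Return (factual_index, counterfactual_index) into centres.

    factual: closest centre whose class equals the model prediction.
    counterfactual: closest centre of any other class.
    An index is None if no centre of that kind exists.
    """
    factual = counterfactual = None
    best_f = best_c = math.inf
    for idx, (centre, label) in enumerate(zip(centres, centre_labels)):
        d = dtw_distance(sequence, centre)
        if label == predicted_label:
            if d < best_f:
                factual, best_f = idx, d
        elif d < best_c:
            counterfactual, best_c = idx, d
    return factual, counterfactual
```

Repeating the call for every sequence of one engine yields one factual and one counterfactual cluster per sequence.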

Additionally, we investigate the contribution of time series characteristics to the classification output. For this, we extract time-domain features such as mean, standard deviation, minimum and maximum from each cluster and train local surrogate DTs on these clustered features. The feature importance analysis reveals that the mean values at the signal's first and last time steps are the most influential features in predicting the healthy or unhealthy classes. Other significant features include the lower frequency of the time series and the standard deviation of the second half of the sequence's time steps. By extracting rules from the DTs, we gain insights into parameter changes and their impact on decision-making.
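A minimal sketch of the time-domain feature extraction, using only the Python standard library. The function name, the dictionary keys and the choice of the population standard deviation are assumptions; frequency-domain features are left out, and the exact feature set used in the study may differ.

```python
import statistics


def time_domain_features(sequence):
    """Simple time-domain descriptors of one univariate sequence."""
    half = len(sequence) // 2
    return {
        "mean": statistics.fmean(sequence),
        "std": statistics.pstdev(sequence),  # population standard deviation
        "min": min(sequence),
        "max": max(sequence),
        # spread of the second half of the time steps only
        "std_second_half": statistics.pstdev(sequence[half:]),
    }
```

Applied to every sequence in a cluster, the resulting dictionaries can be stacked into a feature table on which a surrogate decision tree is trained.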

In Figure 1, we present visual representations of factual and counterfactual examples. A factual example includes: i) the test sequence sample, ii) the centre of the closest cluster with the same class, and iii) all the samples in the cluster. Similarly, a counterfactual example includes: i) the test sequence sample, ii) the centre of the closest cluster with the opposite class and iii) all the samples in the cluster. These visualisations help illustrate the dissimilarities between the clusters and provide a further understanding of the model's decisions.

Figure 1: Illustration of the extracted counterfactual samples for one engine, in which the surrogate model had 100% accuracy in predicting its health state. The background colour indicates the model prediction for each flight sequence: green for healthy and red for unhealthy. This plot shows the dissimilarities of the counterfactual to the sequence, e.g. when the predicted unhealthy clusters behave differently from the predicted sequences in both the spatial and time dimensions [4].

As our research progresses, we are expanding our approach to encompass time series forecasting. Additionally, we strongly advocate incorporating expert feedback into the explanations, as it can enhance both the model's performance and the overall quality of the explanations.

Links:
[L1] https://www.nasa.gov/content/prognostics-center-of-excellence-data-set-repository

References:
[1] T. Sivill and P. Flach, “LIMESegment: meaningful, realistic time series explanations,” in Int. Conf. on Artificial Intelligence and Statistics, 2022, pp. 3418–3433.
[2] M. Guillemé, et al., “Agnostic local explanation for time series classification,” in 2019 IEEE 31st Int. Conf. on Tools with Artificial Intelligence (ICTAI), Nov. 2019, pp. 432–439.
[3] A. Theissler, et al., “Explainable AI for time series classification: a review, taxonomy and research directions,” IEEE Access, 2022.
[4] A. Jalali, et al., “Explaining binary time series classification with counterfactuals in an industrial use case,” presented at ACM CHI Workshop on Human-Centered Perspectives in Explainable AI, 2022.

Please contact:
Anahid Jalali, AIT Austrian Institute of Technology, Austria