by Noémi Friedman (Institute for Computer Science and Control (SZTAKI)) and Abdel Labbi (IBM Research – Europe)
Machine learning (ML) brings many new and innovative approaches to engineering, improving the efficiency, flexibility and quality of systems. This special theme of ERCIM News focuses on ML applications in industrial engineering (see keynote by Christopher Ganz), with an emphasis on civil, environmental, mechanical, chemical, process, agricultural, and transportation engineering.
Data-driven, surrogate and hybrid modelling
ML models can learn progressively from empirical evidence, which makes them strong candidates for continuously correcting modelling errors and adapting to drifts in engineering systems that are classically modelled by the laws of physics. Replacing these physics-based simulations with ML algorithms, or combining the two, can overcome limitations of knowledge and computational capacity and lead to improved predictions and engineering innovations (see the introductory paper on the advantages of ML-based surrogate modelling, Asgari et al.).
As reflected in this special theme, data-driven modelling, surrogate modelling and hybrid modelling (a combination of physics-based and learning-based models) have been successfully used in various engineering applications.
Digital twins or meta-models that replace computationally expensive physics-based simulation models can significantly reduce the computational time of modelling complex systems, such as the simulation of a methanation reactor in a power-to-gas process (Asgari et al.) or the dynamic simulation of a tall timber building (Kurent et al.).
Complex and highly non-linear systems can be extremely sensitive to changes in their governing parameters. To quantify the uncertainties of model outputs due to possible deviations of the input parameters from their estimated values, one is often forced to run the deterministic simulation model many times over a wide range of parameters. The same is true when parameters have to be identified, calibrated, or optimised. The computational time of an ML-based surrogate model can be a small fraction of that of the original physics-based model, so surrogates offer an efficient way of handling such problems (Hoang et al.). Some examples presented in this special theme include the estimation of aerofoil aerodynamic performance statistics (Liu et al.), the surrogate-based calibration of tall timber building dynamics (Kurent et al.), the reduced-order flow simulation by a theory- and data-driven hybrid model (Deng et al.), and the deep learning model for an accurate spatio-temporal rainfall estimation (Folino et al.).
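The idea of propagating input uncertainty through a cheap surrogate rather than the original model can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `expensive_model` is a hypothetical stand-in for a costly simulation, the input distribution is invented, and a piecewise-linear interpolant takes the place of a trained ML regressor.

```python
import bisect
import math
import random
import statistics


def expensive_model(k):
    """Hypothetical stand-in for a costly physics-based simulation."""
    return math.exp(-k) * math.sin(3.0 * k) + k ** 2


def fit_surrogate(model, lo, hi, n_train):
    """Run the model at n_train equally spaced points and return a
    piecewise-linear interpolant (a simple data-driven surrogate)."""
    xs = [lo + (hi - lo) * i / (n_train - 1) for i in range(n_train)]
    ys = [model(x) for x in xs]

    def surrogate(x):
        x = min(max(x, lo), hi)  # clamp to the training range
        j = min(max(bisect.bisect_right(xs, x) - 1, 0), n_train - 2)
        t = (x - xs[j]) / (xs[j + 1] - xs[j])
        return ys[j] + t * (ys[j + 1] - ys[j])

    return surrogate


# Only 41 runs of the "expensive" model are needed to build the surrogate.
surrogate = fit_surrogate(expensive_model, 0.0, 2.0, n_train=41)

# Monte Carlo propagation of an uncertain input (assumed normal, clipped
# to the training range) through the surrogate instead of the model.
rng = random.Random(0)
samples = [min(max(rng.gauss(1.0, 0.2), 0.0), 2.0) for _ in range(20000)]
surrogate_outputs = [surrogate(k) for k in samples]
mean_out = statistics.fmean(surrogate_outputs)
std_out = statistics.stdev(surrogate_outputs)
print(f"output mean = {mean_out:.4f}, output std = {std_out:.4f}")
```

In this sketch the 20,000 Monte Carlo evaluations hit only the interpolant; the same pattern applies when the interpolant is swapped for a neural network or another learned regressor.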
ML algorithms can also contribute to coarse-grained models. Coarse-grained models are simplified models, usually defined on a coarse grid or scale, that can accurately simulate phenomena that happen on a finer scale. Van Halder et al. describe the process of creating a coarse-grained model that simulates the sloshing motion of water, and Karavelic et al. describe a probabilistic scale bridging of micro- and macro-scales to model the plastic behaviour of heterogeneous composite materials. Both models apply ML tools for upscaling.
ML algorithms have been receiving particular attention in the field of autonomous vehicles. Using sensor data, learning agents can address the problems of traffic congestion, energy consumption and emissions. Nevertheless, when it comes to passenger safety, a learning-based control design can never give a 100% guarantee of avoiding emergency scenarios. In such cases, a hybrid model that combines the benefits of model-based and learning-based control design can provide an efficient but still robust compromise (Németh et al.).
ML tools can control not only the autonomous vehicle but also the flow around it, enabling a more efficient vehicle design. Using sensor data and actuators, an automatic ML-based closed-loop control can be built (Cornejo-Macedas et al.) to reduce drag or to increase lift, usually by aiming to avoid flow separation. Manipulating the flow in this way can increase performance and reduce energy consumption – important goals in aircraft design.
ML for production control and process optimisation
Process optimisation aims to reduce production time, optimise material and energy consumption, and increase product quality. Manufacturers can profit greatly from ML tools that can discover hidden dependencies between production parameters, foster production efficiency and flexibility, and manage complex optimisation tasks (Samsonov et al.).
ML tools are always based on observed data, which, unfortunately, is often difficult or expensive to collect or may raise privacy concerns. The more data available, the more ML can discover and improve. Collecting additional data can sometimes be replaced by synthetic data generation, an approach known as a soft or virtual sensor. Garcia-Ceja et al. describe a soft-sensing system for the optimisation of a chemical process.
In cases where synthetic data cannot replace actual data, we may still improve prediction accuracy by incorporating all available background engineering knowledge. For example, an agricultural prediction model of yield or nitrogen status can be improved by combining ML tools with complex systems theory (Raubitzek et al.).
The high dimensionality of descriptive data can cause another type of problem for process optimisation (Savvopoulos et al., Gaudin et al.). Autoencoders enable high-dimensional data to be encoded in a much lower-dimensional representation, and optimisation tasks can be carried out in this reduced latent space. The use of a variational autoencoder, one that learns a probabilistic rather than a deterministic description of the latent variables, can increase the robustness of the description (Gaudin et al., Savvopoulos et al.).
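Optimising in a reduced latent space can be sketched with the simplest possible encoder and decoder. The example below is an illustration under stated assumptions, not the method of the cited papers: it uses a one-dimensional linear autoencoder (equivalent to keeping the first principal component, fitted here by power iteration) instead of a deep or variational autoencoder, and the data, the `cost` function and its target are all hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fit_linear_autoencoder(data, iters=50):
    """Fit a 1-D linear encoder/decoder: returns the data mean and the
    unit-length principal direction found by power iteration."""
    dim = len(data[0])
    mean = [sum(row[j] for row in data) / len(data) for j in range(dim)]
    centred = [[row[j] - mean[j] for j in range(dim)] for row in data]
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        # Multiply v by the (unnormalised) covariance matrix.
        w = [0.0] * dim
        for x in centred:
            s = dot(x, v)
            for j in range(dim):
                w[j] += s * x[j]
        norm = math.sqrt(dot(w, w))
        v = [c / norm for c in w]
    return mean, v


def encode(x, mean, v):
    """Map a full-dimensional point to its scalar latent coordinate."""
    return dot([xi - mi for xi, mi in zip(x, mean)], v)


def decode(z, mean, v):
    """Map a latent coordinate back to the full-dimensional space."""
    return [mi + z * vi for mi, vi in zip(mean, v)]


# Synthetic process data: three correlated settings driven by one
# hidden factor, plus small measurement noise (all values assumed).
rng = random.Random(1)
true_dir = [1.0, 2.0, -1.0]
data = []
for _ in range(300):
    t = rng.uniform(-1.0, 1.0)
    data.append([t * d + rng.gauss(0.0, 0.01) for d in true_dir])

mean, v = fit_linear_autoencoder(data)


def cost(x):
    """Hypothetical process cost, smallest at x = (0.5, 1.0, -0.5)."""
    target = [0.5, 1.0, -0.5]
    return sum((a - b) ** 2 for a, b in zip(x, target))


# Search over the single latent variable instead of three settings.
z_grid = [-3.0 + 6.0 * i / 600 for i in range(601)]
best_z = min(z_grid, key=lambda z: cost(decode(z, mean, v)))
best_x = decode(best_z, mean, v)
print("best latent value:", round(best_z, 2))
print("decoded settings:", [round(c, 3) for c in best_x])
```

The search runs over one latent coordinate rather than three process settings; with a non-linear autoencoder the `encode` and `decode` functions would be learned networks, but the optimisation loop would keep the same shape.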
Switching from a deterministic to a probabilistic approach can also alleviate the difficulties of so-called inverse problems, in which the input parameters of processes or models are to be calibrated or optimised. Inverse problems are usually ill-posed, since several values of the parameters may result in an equally good fit for the desired or measured output of the model or process. Consequently, a probabilistic description of the optimised or calibrated parameters (Hoang et al., Smaragdakis et al., Kurent et al.) gives a more robust and informative solution.
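A toy example makes the ill-posedness concrete. In the sketch below, which rests entirely on assumed values, the hypothetical forward model squares its parameter, so two parameter values explain the measurement equally well. A grid-based Bayesian posterior with a uniform prior and Gaussian measurement noise reports both candidates with their probability mass, where a single point estimate would silently pick one.

```python
import math


def forward_model(theta):
    """Hypothetical forward model: +theta and -theta give the same output."""
    return theta ** 2


observed = 4.0   # assumed measurement of the model output
noise_std = 0.5  # assumed standard deviation of the measurement noise

# Uniform prior on [-4, 4], discretised on a regular grid.
n = 801
grid = [-4.0 + 8.0 * i / (n - 1) for i in range(n)]

# With a uniform prior the posterior is proportional to the likelihood.
unnorm = [
    math.exp(-0.5 * ((observed - forward_model(t)) / noise_std) ** 2)
    for t in grid
]
total = sum(unnorm)
posterior = [w / total for w in unnorm]

# The posterior mass splits between the two equally plausible solutions.
mass_negative = sum(p for t, p in zip(grid, posterior) if t < 0)
mass_positive = sum(p for t, p in zip(grid, posterior) if t > 0)
best = max(zip(posterior, grid))[1]

print(f"one most probable value: {best:.2f}")
print(f"P(theta < 0) = {mass_negative:.3f}, P(theta > 0) = {mass_positive:.3f}")
```

The posterior is bimodal, with peaks near -2 and 2, which is the kind of information a deterministic calibration cannot convey.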
Monitoring and anomaly detection
Inherent changes in the environment and the system itself can create anomalies or drifts that incrementally build up and result in performance degradation. Continuous monitoring and control are therefore essential for the optimal operation of most engineering processes and systems.
The use of modern machine learning methods, such as Deep Learning and Graph Neural Networks, allows complex system behaviour to be modelled without the need to define a large, and usually partial, set of rules and patterns of “normal” behaviour that can quickly become obsolete with time. The application of machine learning methods is made possible by the extensive instrumentation of most engineering systems and processes as well as the high frequency at which those sensors operate. This leads to innovation in monitoring, such as the new positioning system using 5G millimetre wave networks (Gante et al.), and renders classical feature-based and rule-based methods for anomaly detection obsolete because of the combinatorial number of data and feature combinations and rules. Deep Neural Networks are particularly well suited in such cases as they automatically learn a reduced dimensionality representation in which anomalies are more efficiently characterised (Kumar Jha et al.). By continuously retraining the system as new data arrives, machine learning models continuously adapt their internal representation of the normal and anomalous behaviour.
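The principle of a model of "normal" behaviour that keeps adapting as new data arrives can be shown with a far simpler stand-in for a deep network. The sketch below is an assumption-laden illustration: the class name `DriftAwareDetector`, its parameters and the simulated sensor stream are all hypothetical, and the learned representation is reduced to an exponentially weighted mean and variance.

```python
import math
import random


class DriftAwareDetector:
    """Flags readings that deviate strongly from an exponentially
    weighted estimate of recent normal behaviour."""

    def __init__(self, alpha=0.05, threshold=5.0, warmup=50):
        self.alpha = alpha          # forgetting factor of the running model
        self.threshold = threshold  # allowed deviation in standard deviations
        self.warmup = warmup        # readings to learn from before flagging
        self.count = 0
        self.mean = 0.0
        self.var = 0.0

    def update(self, x):
        """Score x against the current model, then learn from it unless
        it was flagged. Returns True if x is considered anomalous."""
        if self.count == 0:
            self.mean = x
            self.count = 1
            return False
        delta = x - self.mean
        std = math.sqrt(self.var)
        is_anomaly = (
            self.count >= self.warmup and abs(delta) > self.threshold * std
        )
        if not is_anomaly:
            # Incremental "retraining": fold the new reading into the model.
            self.mean += self.alpha * delta
            self.var = (1.0 - self.alpha) * (self.var + self.alpha * delta * delta)
            self.count += 1
        return is_anomaly


# Simulated sensor: slow upward drift, Gaussian noise, one injected fault.
rng = random.Random(42)
detector = DriftAwareDetector()
flagged = []
for step in range(500):
    reading = 20.0 + 0.002 * step + rng.gauss(0.0, 0.5)
    if step == 300:
        reading += 10.0  # injected fault
    if detector.update(reading):
        flagged.append(step)

print("flagged steps:", flagged)
```

Because the running statistics follow the slow drift, the drift itself is absorbed into the notion of normal behaviour while the sudden jump at step 300 stands out. A fixed threshold on the raw reading would need manual retuning as the system drifts.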
This special theme features several examples of combining machine learning techniques and domain-specific models for monitoring and optimising systems in domains such as autonomous transportation (Sahin et al., Lo Duca et al.) and neonatal intensive care (Földesy et al.).
Innovative approaches to anomaly detection in complex networks are addressed by Gutiérrez-Gómez et al. in the context of anomalies or outliers of node attributes in graphs. Defining anomalies in a subgraph or a view is an interesting approach to multi-context anomaly detection, since graphs can represent complex dynamics that are difficult to characterise at a global scale.
As fine-grained instrumentation becomes pervasive in system and process engineering, the adoption of machine learning methods for data-driven anomaly detection, monitoring, and online control is becoming mainstream in the engineering of complex systems.
This special theme of the ERCIM News explores different fields in which ML algorithms are replacing or enhancing analysis-based methods. By using all available data to simulate complex engineering systems, we can reduce computational time or increase accuracy and efficiency. ML can help engineers create designs with increased performance and reduced consumption, identify hidden dependencies and anomalies, and optimise and control manufacturing. Nevertheless, a knowledge gap still exists between engineering, manufacturing, and big data analysis. We strongly encourage initiatives to close the gap as described by Bernijazov et al., and improve the efficiency of ML tools as outlined by Muccini et al. and Pikoulis et al.
Institute for Computer Science and Control (SZTAKI), Hungary
IBM Research - Europe, Zurich, Switzerland