A Segmental Autoencoder-based Fault Detection for Nonlinear Dynamic Systems: An Interpretable Learning Framework

This paper presents a segmental autoencoder-based fault detection (FD) framework for nonlinear dynamic systems. The basic idea behind the proposed FD scheme is to identify a generalized kernel representation based on the representation knowledge learned from an autoencoder. By using the system data, several cascades, linking nonlinear operators, are employed to obtain a data-based model which describes the nonlinear dynamic behaviors. With the help of the segmental structure of an autoencoder, a residual generator is then constructed. Rigorous mathematical analysis and an application on a continuous stirred tank reactor demonstrate the effectiveness of the proposed FD method.


Introduction
Dynamic systems often involve complex interactions that are prone to numerous uncertainties and abnormal behaviors [1]. When the nominal operation is affected by uncertainties or abnormal behaviors, system identification, monitoring, and control become more challenging, resulting in losses of process safety, optimality, and economic performance [2-6]. Fault detection (FD) techniques provide systematic solutions to monitor the process and improve control performance [7][8][9][10].
Recent techniques utilize deep neural networks with nonlinear activation units to take the behavioral nonlinearities into consideration [20].
In the pattern recognition literature, these neural networks have also been used in anomaly/outlier/out-of-distribution detection [21], which is similar to process monitoring and diagnosis in terms of motivation and methodology. These data-driven approaches construct mappings between process input-output data to accurately estimate whether the operational conditions are within the desired ranges. Such an estimation is used to provide crucial information to the operator and improve the operational performance and safety. In [20], a fully connected neural network and a recurrent neural network were employed to design two FD methods, corresponding to finite impulse response filters and a recursive filter, respectively. The study [15] proposed a variational autoencoder-based FD scheme by combining it with a Kalman filter. More recently, both unsupervised and supervised learning tools were involved in system identification and the design of residual generators in [22]. The study [22] has proven the equivalence condition under which both unsupervised and supervised learning-based FD approaches can achieve the same FD performance.
Inspired by the study [22] where a bridge links unsupervised to supervised learning-based FD approaches, this work proposes a novel and optimal FD scheme for nonlinear dynamic systems. The proposed method consists of a segmental autoencoder network with stable kernel representations to detect the fault dynamically, without requiring prior process knowledge. Because it relies on nonlinear deep neural networks with a small number of trainable parameters, the computational efficiency is preserved during both the training and inference phases. The contributions of the study are summarized as follows:
• Formulating a data-based nonlinear model with the aid of the cascades to capture system dynamics,
• Constructing a generalized kernel representation using a segmental autoencoder,
• Developing a novel FD algorithm to detect the faults more accurately,
• Rigorously deriving analytical expressions to prove optimality of the proposed FD method,
• Quantitatively analyzing the performance of the proposed scheme through a continuous stirred tank reactor (CSTR) simulation.
The rest of this article is organized as follows. Detailed background information and problem formulation are provided in Section 2. The proposed FD method using the segmental autoencoder and performance analysis are detailed in Section 3. Simulation studies are presented in Section 4, and the conclusion is given in Section 5.
where $\phi_{en}$ and $\phi_{de}$ are the mappings generated by the encoder and decoder, respectively; the input signal and its estimate are the input and output of the autoencoder, and the latent variable has a reduced dimension; "$\circ$" represents the cascade connection between two nonlinear operators.
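The cascade of encoder and decoder described above can be sketched as follows. This is a minimal illustration, assuming a single tanh encoder layer, a linear decoder, and random untrained weights; the names `make_autoencoder` and `cascade` and the layer sizes are hypothetical and not taken from the paper.

```python
import numpy as np

def make_autoencoder(n_in, n_latent, seed=0):
    """Return (encoder, decoder) with random, untrained weights (illustration only)."""
    rng = np.random.default_rng(seed)
    w_en = rng.standard_normal((n_latent, n_in)) / np.sqrt(n_in)
    w_de = rng.standard_normal((n_in, n_latent)) / np.sqrt(n_latent)

    def encoder(z):
        # Nonlinear map to the reduced-dimension latent variable.
        return np.tanh(w_en @ z)

    def decoder(h):
        # Linear map from the latent variable back to the signal space.
        return w_de @ h

    return encoder, decoder

def cascade(outer, inner):
    """Cascade connection of two operators: (outer o inner)(z) = outer(inner(z))."""
    return lambda z: outer(inner(z))

encoder, decoder = make_autoencoder(n_in=6, n_latent=2)
autoencoder = cascade(decoder, encoder)

z = np.arange(6, dtype=float)   # hypothetical input signal
latent = encoder(z)             # latent variable, dimension 2
z_hat = autoencoder(z)          # estimate of the input, dimension 6
```

Writing the autoencoder as an explicit cascade keeps the encoder and decoder available as separate operators, which is convenient when only part of the network is reused later.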

Problem Description
Consider a nonlinear system that is defined as
$$x(k+1) = f(x(k), u(k), w(k)), \quad y(k) = g(x(k), u(k), v(k)), \tag{3}$$
where $x \in \mathbb{R}^{n}$, $u \in \mathbb{R}^{l}$, and $y \in \mathbb{R}^{m}$ are the system state, input, and output, respectively; $w \in \mathbb{R}^{n}$ and $v \in \mathbb{R}^{m}$ are noise sequences; $f(\cdot)$ and $g(\cdot)$ are continuous nonlinear mappings. Since only $u$ and $y$ of (3) can be used, a moving horizon (i.e., the stacked vector [22]) is usually adopted for system identification. The stacked notation can be used for any variable in (3).
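As an illustration of the moving horizon, the sketch below stacks the samples of a recorded signal over a window ending at time $k$. The exact ordering and horizon convention of the stacked vector in [22] are not reproduced here; the oldest-to-newest ordering, the horizon length `s`, and the function name `stack_horizon` are assumptions made for this example.

```python
import numpy as np

def stack_horizon(signal, k, s):
    """Stack signal[k-s], ..., signal[k] (oldest first) into one flat vector.

    `signal` has one row per time step; a 1-D array is treated as a scalar signal.
    The ordering convention is an assumption for illustration.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.ndim == 1:
        signal = signal[:, None]
    if k - s < 0 or k >= len(signal):
        raise IndexError("horizon [k-s, k] falls outside the recorded data")
    return signal[k - s : k + 1].reshape(-1)

u = np.arange(10, dtype=float)                  # hypothetical scalar input sequence
y = np.arange(20, dtype=float).reshape(10, 2)   # hypothetical 2-D output sequence

u_stacked = stack_horizon(u, k=5, s=2)   # samples at k = 3, 4, 5
y_stacked = stack_horizon(y, k=3, s=1)   # rows at k = 2, 3, flattened
```

The same helper applies to any recorded variable, so stacked inputs and outputs can be built with one call each.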
In this paper, we intend to design a data-driven FD scheme for the dynamic system given in (3) with the help of an autoencoder. Considering the actuator fault $\theta_a$ and the sensor fault $\theta_s$, the faulty system is described by (4) and (5).
Definition 1. Given a nonlinear system (3), a nonlinear operator $\mathcal{K}$ is called a generalized stable kernel representation of (3) if, for $w = 0$, $v = 0$, and a given $x(0)$, the condition (6) holds, where "|-|" represents the composite operator, and $z_{mix}(k)$ is the mixture of the stacked $u$ and $y$.
Based on the nonlinear $\mathcal{K}$ defined in (6), the residual signal can be obtained by applying $\mathcal{K}$ to $z_{mix}(k)$. In (12) to (14), the superscripted mapping denotes a cascade of nonlinear operators.

Data-based State Estimation
Since the system state, $x$, is unknown, the data-based model (11) is intractable. As an estimate, $\hat{x}$ can be obtained through a nonlinear estimator, such as a mapping generated by neural networks, by using the stacked data over the past horizon. Substituting (17) into (11) yields (19), in which the nonlinear mappings are cascaded with this estimator.
Based on the objective given in (20), the optimal hyperparameter of a segmental autoencoder can be obtained, equivalent to $\mathcal{K}^*$. Motivated by (20), the design concept, together with optimal FD performance, of the proposed FD scheme is summarized in the following theorem.
Proof. Based on the autoencoder $\Pi(\Upsilon)$ defined in (24), the subsequent expressions use $\mathrm{Tr}(\cdot)$ to denote the trace operator. Combining (5), (22) in the faulty condition becomes
$$r^f(k) = r(k) + f_{res}, \tag{28}$$
where $f_{res}$ includes the $\theta_a$- and $\theta_s$-related terms, whose mathematical description is given in [22]. Correspondingly, (25) will be
$$T^2(r^f(k)) = T^2(r(k)) + f_{res}^T f_{res} \tag{29}$$
because, in general, $\theta_a$ and $\theta_s$ are independent of $e$. By combining (27) with (29), it can be verified that the segment of $\Pi(\Upsilon^*)$ and the associated $\mathcal{K}^*$ maximize the influence of $f_{res}^T f_{res}$, indicating the optimal FD power of the proposed scheme. Therefore, this theorem is proven.
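The effect of an additive fault term on the expected test statistic can be checked numerically. The sketch below assumes a Hotelling-type statistic $T^2(r) = r^T \Sigma^{-1} r$ with $\Sigma$ the fault-free residual covariance, a zero-mean Gaussian residual, and a constant fault term independent of it. The covariance, the fault vector, and the function name `t2_statistic` are hypothetical values chosen for illustration, not taken from the CSTR study.

```python
import numpy as np

def t2_statistic(residuals, cov):
    """Return r^T cov^{-1} r for each row r of `residuals`."""
    solved = np.linalg.solve(cov, residuals.T).T   # rows are cov^{-1} r
    return np.einsum("ij,ij->i", residuals, solved)

rng = np.random.default_rng(1)
cov = np.array([[1.0, 0.3],
                [0.3, 0.5]])                 # hypothetical fault-free covariance
r = rng.multivariate_normal(np.zeros(2), cov, size=200_000)
f = np.array([0.8, -0.4])                    # hypothetical additive fault term

t2_free = t2_statistic(r, cov)               # fault-free statistic
t2_faulty = t2_statistic(r + f, cov)         # statistic under the additive fault

# For a zero-mean residual independent of f, the cross term vanishes in
# expectation, so the mean shift should be close to f^T cov^{-1} f.
expected_shift = f @ np.linalg.solve(cov, f)
observed_shift = t2_faulty.mean() - t2_free.mean()
```

With an identity covariance the shift reduces to the squared norm of the fault term, so any nonzero fault term raises the expected statistic.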
It is worth mentioning that when neural networks are employed to design an FD scheme, the fault-related term $f_{res}$ must not disappear or be weakened. In addition, $f_{res}$ can be quantitatively described. For example, the study [24] considered this problem based on the following Taylor series expansion:
$$r^f \approx r + \nabla r \, \Delta\theta + \frac{1}{2} (\Delta\theta)^T \mathcal{H} \, \Delta\theta,$$
where $\Delta\theta$ is the change caused by $\theta_a$ and $\theta_s$. Since $r$ is obtained via the optimally trained $\Pi(\Upsilon^*)$, one can obtain the following two results: $\nabla r = 0$ and $\mathcal{H} \neq 0$, where $\mathcal{H}$ is the Hessian matrix of $\Pi(\Upsilon)$. Therefore, we have $f_{res} \approx \frac{1}{2} (\Delta\theta)^T \mathcal{H} \, \Delta\theta$ and
$$f_{res}^T f_{res} > 0 \Rightarrow E(T^2(r^f(k))) > E(T^2(r(k))). \tag{33}$$
Here, $\mathcal{H}$ is a three-dimensional tensor, ensuring that the fault influences will exist in the obtained residual signal.
Fig. 2: The proposed segmental autoencoder-aided FD architecture
The architecture of the proposed segmental autoencoder-aided FD scheme is depicted in Fig. 2. We can observe that only a segment of the autoencoder is used to construct residual signals. This is the main difference between the proposed scheme and the existing autoencoder-based FD approaches.

Performance Comparison and Implementation
To provide a more detailed analysis, performance comparisons with the traditional autoencoder-based FD strategy and some possible variants will be conducted. For this purpose, we choose a semi-autoencoder (SmAB), a total
Therefore, the aforementioned mathematical analysis has illustrated the optimal FD power of the proposed segmental autoencoder-based scheme.
According to Fig. 2 and Theorem 1, the complete procedures of the segmental autoencoder used for FD are summarized in Algorithms 1 and 2.
Algorithm 1 The segmental autoencoder-based FD application: Off-line learning
1: By using multiple unit delays, both the input and reference signals of an autoencoder are formed according to (24);
2: Construction of an autoencoder with the structure $\Pi(\Upsilon)$ is fol-
A total of $1.32 \times 10^{4}$ data samples are obtained from the CSTR simulation. These data are divided into a training data set with $1.2 \times 10^{4}$ samples and an online test data set with $1.2 \times 10^{3}$ samples. For the construction of a segmental autoencoder, 75 unit-delays are introduced to the input and 75 unit-delays are introduced to the reference output; the hidden layer has 95 neurons. The main configurations of the autoencoder used in this study are summarized in Table 1.
Table 1: Configurations of the autoencoder used in constructing residual generators
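The data preparation described above can be sketched as follows. The sample counts (13,200 in total, 12,000 for training, 1,200 for testing), the 75 unit delays, and the 95 hidden neurons come from the text. The two-channel random signal, the chronological split, the reading of "75 unit delays" as a window of 76 consecutive samples, and the function name `delay_embed` are assumptions made for this illustration.

```python
import numpy as np

N_TOTAL, N_TRAIN, N_TEST = 13_200, 12_000, 1_200   # sample counts from the text
N_DELAYS = 75                                      # unit delays from the text
N_HIDDEN = 95                                      # hidden-layer width from the text

def delay_embed(signal, n_delays):
    """Row t holds signal[t], ..., signal[t + n_delays], flattened (oldest first)."""
    signal = np.asarray(signal, dtype=float)
    if signal.ndim == 1:
        signal = signal[:, None]
    # Shape (N - n_delays, dim, n_delays + 1): the window axis is appended last.
    windows = np.lib.stride_tricks.sliding_window_view(signal, n_delays + 1, axis=0)
    # Reorder to (time window, time step, channel) before flattening each window.
    return windows.transpose(0, 2, 1).reshape(len(windows), -1)

# Placeholder data standing in for the simulated process signals (assumption).
rng = np.random.default_rng(0)
data = rng.standard_normal((N_TOTAL, 2))

# Chronological split into training and online test sets (assumption).
train, test = data[:N_TRAIN], data[N_TRAIN:]

x_train = delay_embed(train, N_DELAYS)   # shape (11925, 152)
x_test = delay_embed(test, N_DELAYS)     # shape (1125, 152)
```

Each row of `x_train` could then serve as one input sample for an autoencoder with `N_HIDDEN` hidden neurons; the reference signal would be formed in the same way.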

FD Results Using The Segmental Autoencoder
Three faults occurring from the 201st step, denoted as $\theta_a$, $\theta_{s,1}$, and $\theta_{s,2}$, are adopted in this case study. Following the description given in [25], $\theta_a$ is a structure (and multiplicative) fault. Fig. 4 shows that the drifted faults contribute to a gradual increase of the test statistics. Specifically, the false alarm rates (FARs) are 0%, 0.52%, and 0.52%, and the missed detection rates (MDRs) are 9.49%, 0%, and 2.22%; they respectively correspond to the results in Fig. 4 from the top to the bottom. Overall, the proposed segmental autoencoder-based approach shows satisfactory results in the simulation of various FD tasks.
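The two reported metrics can be computed from a sequence of test statistics as sketched below. The definitions used here are the common ones (FAR as the fraction of fault-free samples above the threshold, MDR as the fraction of faulty samples at or below it) and are assumed rather than quoted from the paper; the statistic values, the threshold, and the function name `far_mdr` are hypothetical.

```python
import numpy as np

def far_mdr(statistic, threshold, fault_start):
    """Return (FAR, MDR) in percent.

    `fault_start` is the 0-based index of the first faulty sample, so a fault
    occurring from the 201st step corresponds to fault_start=200.
    """
    statistic = np.asarray(statistic, dtype=float)
    alarms = statistic > threshold
    far = 100.0 * alarms[:fault_start].mean()       # alarms raised before the fault
    mdr = 100.0 * (~alarms[fault_start:]).mean()    # faulty samples without alarm
    return far, mdr

# Toy sequence: four fault-free samples followed by four faulty samples.
stat = np.array([0.2, 0.4, 1.5, 0.3, 0.9, 2.0, 3.0, 4.0])
far, mdr = far_mdr(stat, threshold=1.0, fault_start=4)
```

In the toy sequence one of the four fault-free samples exceeds the threshold and one of the four faulty samples stays below it, so both rates evaluate to 25%.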

Conclusion
This paper has proposed an optimal FD scheme for nonlinear systems, whose core is an unsupervised learning strategy that can identify nonlinear system dynamics. By adopting a segmental autoencoder, the minimized reconstruction error can be expressed through an explainable knowledge representation. Furthermore, multiple unit delays are incorporated into the autoencoder to improve the computational efficiency and the stability of the learning procedure. A residual generator can then be constructed for FD purposes. The validation studies have shown that the proposed FD method effectively detects faults in nonlinear dynamic systems.