Rotor Blade Pitch Imbalance Fault Detection for Variable-Speed Marine Current Turbines via Generator Power Signal Analysis

Marine hydrokinetic (MHK) turbines extract renewable energy from oceanic environments. However, due to the harsh conditions that these turbines operate in, system performance naturally degrades over time. Thus, ensuring efficient condition-based maintenance is imperative to guaranteeing reliable operation and reduced costs for hydroelectric power. This paper proposes a novel framework aimed at identifying and classifying the severity of rotor blade pitch imbalance faults experienced by marine current turbines (MCTs). In the framework, a Continuous Morlet Wavelet Transform (CMWT) is first utilized to acquire the wavelet coefficients encompassed within the 1P frequency range of the turbine’s rotor shaft. From these coefficients, several statistical indices are tabulated into a six-dimensional feature space. Next, Principal Component Analysis (PCA) is employed on the resulting feature space for dimensionality reduction, followed by the application of a K-Nearest Neighbor (KNN) machine learning algorithm for fault detection and severity classification. The framework’s effectiveness is validated using a high-fidelity MCT numerical simulation platform, where results demonstrate that pitch imbalance faults can be accurately detected 100% of the time and classified based upon severity more than 97% of the time.


I. INTRODUCTION
THE kinetic energy housed within open-ocean, tidal, and riverine currents represents a highly concentrated source of energy that is both renewable and sustainable. It has been estimated that the energy extracted from the U.S. Gulf Stream current alone has the potential to generate up to 18.6 GW (163 TWh/yr) of electrical power [1]. In addition, the temporally averaged power density encompassed within the section of the Gulf Stream that runs between South Florida and North Carolina reaches 3.3 kW/m² [2], while the technically extractable riverine and tidal power encompassed within the entirety of the U.S. has been calculated at 14 GW (123 TWh/yr) [3] and 50 GW (438 TWh/yr) [4], respectively. Various forms of marine current turbines (MCTs) are utilized to convert the kinetic energy residing within open-ocean, tidal, and riverine currents into electricity. However, for the purposes of this research, only horizontal axis MCTs equipped with permanent magnet synchronous generators are considered.
Significant cost reductions need to be made before marine energy technologies can become cost competitive with other forms of contemporary energy generation (e.g., coal, gas, oil, and nuclear power) [5]. A fundamental challenge impeding the wider-scale implementation of marine-based electricity generation revolves around its currently high Levelized Cost of Energy (LCOE). From the perspective of a marine energy device, the LCOE is an integrated metric that takes into account both the cost and estimated electricity producing capability of the device. Such metrics aid in estimating the revenue per megawatt-hour (MWh) of electricity generation needed to satisfy a minimum rate of return, while also considering the costs associated with building, deploying, and operating the device over its lifetime. Investments in marine-based electricity generation are expected to remain sparse until more reliable baseline cost scenarios that utilize standardized cost reporting methodologies and assumptions are made available [6].
Research aimed at reducing the LCOE generally progresses down one of two paths. Path one focuses on increasing the marine energy device's overall power output, while path two deals with decreasing the device's operation and maintenance (O&M) costs. Regardless of the path, however, the overall goal is to maximize the amount of revenue obtainable from the grid-tied electricity produced by the device.
Path one research was performed in [7], where control strategies for MCTs operating at overrated current speeds were investigated. The results of the research revealed that a flux-weakening control strategy could be used to accelerate the turbine's permanent magnet synchronous generator (PMSG) over its nominal speed for better power limiting control at high marine current speeds. In one of our recent works [8], an adaptive super-twisting sliding mode control strategy was developed and validated on the same 720-kW numerical simulation platform utilized in this work. Path two research was performed in [9], where a Smart Vibration Monitoring System (SVMS) was developed to improve ocean current turbine efficiency and reliability. The SVMS methodology utilized advanced signal processing on vibration data captured in real time to perform condition-based monitoring and incipient fault detection. In [10], high frequency modal analysis, power trending, and envelope analysis were used to identify faulty bearings in an emulated ocean current turbine (OCT).
The research performed in this work is focused on O&M cost reduction, as such costs were found to account for an estimated 26-32% of the LCOE [11]. Due to the lack of industry maturity, research relating to this field remains under-represented. Therefore, to address such headwinds, we propose an innovative framework that possesses the following scientific merits: 1) The creation of a fault detection framework that does not require the use of a multi-sensor network. We propose a novel single-sensor-based methodology that only requires the use of a shaft key encoder. 2) To combat challenges associated with non-stationary signal analysis, our method incorporates the use of wavelet analysis to overcome issues related to spectral leakage. 3) Our framework incorporates the use of dimension reduction techniques for optimal fault feature representation learning. This step facilitates improved downstream machine learning-based fault detection and identification. The remainder of this paper is structured as follows: Section II discusses the advancements made by related work in this field. Section III briefly references the MCT numerical simulation platform utilized in this work and touches upon the nature of the rotor blade pitch imbalance faults being analyzed. Section IV introduces our fault detection framework along with a background literature synopsis. Section V discusses our experimental design and quantifies its results. Lastly, Section VI presents the conclusion and future work.

II. RELATED WORK
Due to the intrinsic similarities between wind turbines and marine current turbines, many of the same fault detection methodologies developed for wind turbine generators (WTGs) can be employed on MCT generators without major modification. Electrical stator current analysis is one of the most widely used fault detection techniques for such tasks [12]-[14], as it provides an effective means of identifying abnormal frequency excitations within the characteristic frequency ranges of WTGs. Such methods traditionally incorporate the use of advanced signal processing and machine learning techniques due to their adeptness at facilitating varying degrees of non-intrusive condition-based monitoring. However, since many of the signals utilized for non-intrusive fault detection are non-stationary in nature, the occurrence of spectral leakage adds a layer of complexity to their analysis. Spectral leakage occurs when the power of a signal is smeared across its frequency spectrum due to the signal not being periodic within the sampling interval. This phenomenon adds spurious frequency components to the spectrum of the signal, thereby masking important spectral details about the signal. To overcome such challenges, the following research has been carried out.
Synchronous Sampling-based Methods: Gong et al. developed a synchronous sampling algorithm for the detection of rotor eccentricities and bearing faults experienced by variable speed direct drive wind turbines. The algorithm synchronously sampled the turbine's non-stationary generator stator current signal and applied an impulse detection module to identify fault signatures within the stator current's frequency spectrum. Unfortunately, this version of the algorithm is only applicable to wind turbines possessing PMSGs, as frequency smearing is induced within the spectrum of the stator current signal when such a method is employed on a turbine possessing a DFIG [15]. In [15], Cheng et al. utilized synchronous sampling on the Hilbert envelope obtained from the generator stator current signal of a DFIG wind turbine for gearbox fault detection. Since the stator current's envelope signal still retains fault frequencies proportional to the turbine's shaft rotating frequency, frequency excitations at the gearbox's characteristic frequencies could be easily identified when applying synchronous sampling on the amplitude demodulated envelope signal. However, such methods are limited to fault diagnosis, and provide no insight toward wind turbine fault prognosis.
Empirical Mode Decomposition-based Methods: In [16], Lu et al. leveraged the abilities of Empirical Mode Decomposition (EMD) and the Hilbert-Huang Transform (HHT) to extract wind turbine fault signatures related to rotor blade imbalances and inner-raceway bearing faults from the frequency spectrum of the turbine's electrical stator current signal. Lu's proposed method accomplished this by accurately capturing the instantaneous amplitude and frequency of the generator current signals and then performing a comparative study between the signals in healthy and faulty conditions. In [13], Zhang utilized synchronous sampling and EMD in conjunction with a Generalized Likelihood Ratio (GLR) test to demodulate ocean current and turbulence fluctuation effects from the frequency spectrum of the generator's stator current signal. Doing so allowed for much easier fault signature identification within the frequency spectrum of the stator current signal. Challenges encountered when utilizing EMD-based methods for fault detection include the lack of a strong theoretical foundation, limitations associated with the creation of edge effects, a strong reliance on the stopping criteria of the sifting process, and extreme amounts of interpolation.
Wavelet-based Methods: In [17], Freeman employed a Morlet Continuous Wavelet Transform (MCWT) on the generator power signal of an MCT. The wavelet coefficient energy contained within the frequency range of the turbine's rotor shaft (1P frequency) was quantified and utilized as a fault signature for rotor blade pitch imbalance faults. In [18], an adaptive Morlet wavelet was used to create a rolling bearing fault detection algorithm for wind turbine planetary gearboxes. Unfortunately, the performance of wavelet-based fault detection depends on the user's ability to match the shape of the wavelet kernel to the shape of the fault signature within a signal of interest. Thus, it is imperative for the user to have extensive domain knowledge in the wavelet field.
Artificial Intelligence-based Methods: In [19], a stacked autoencoder was used in conjunction with a support vector machine to extract fault features affecting gearboxes in DFIG wind turbines. In [20], an adaptive feature extraction algorithm that utilized signal resampling, frequency tracking, and a particle swarm optimized multi-class support vector machine was developed to identify gearbox fault signatures masked within the stator current signal of a PMSG.
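The spectral leakage described at the start of this section can be illustrated with a short numerical sketch. It is not taken from any of the cited works; the helper name `peak_fraction`, the 30 s window, and the two test frequencies are assumptions chosen only for illustration. A tone that completes a whole number of cycles in the window keeps its power in one FFT bin, while a tone that does not is smeared across neighboring bins.

```python
import numpy as np

def peak_fraction(freq_hz, fs=10.0, duration_s=30.0):
    """Fraction of the spectral power that lands in the strongest FFT bin."""
    n = int(fs * duration_s)
    t = np.arange(n) / fs
    x = np.sin(2 * np.pi * freq_hz * t)
    power = np.abs(np.fft.rfft(x)) ** 2
    return power.max() / power.sum()

# 0.2 Hz completes exactly 6 cycles in 30 s, so it is periodic in the window.
periodic = peak_fraction(0.2)
# 0.215 Hz completes 6.45 cycles, so it is not periodic and its power smears.
leaky = peak_fraction(0.215)
print(f"periodic tone: {periodic:.3f}, non-periodic tone: {leaky:.3f}")
```

For a variable-speed rotor the 1P frequency drifts continuously, so the non-periodic case is the norm rather than the exception.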

III. MCT MODELING AND IMBALANCE FAULTS ANALYSIS
A. MCT Modeling and Simulation
The MCT numerical simulation platform utilized for this work incorporates the blade element momentum modeling technique and dynamic wake inflow model presented in [21] to calculate the fluid-structure interactions. Additionally, the 20-m diameter variable pitch rotor utilized by the platform is based upon the work done in [22]. The degrees of freedom of the rotor were limited to the rotation angle and velocity, and the model incorporates the "Tidal Turbine" version of the numerical simulation platform presented in [23]. The turbulence model accounts for spatial coherence over the swept area of the rotor blade as presented in [24], and a vertical shear profile that linearly decreases with depth. Including vertical shear was very important in this study, as the coupling between the naturally occurring water shear and the rotor blade pitch imbalance is responsible for generating the primary fault signatures that appear in the MCT's generated power signal. Thus, increasing the shear magnifies the fault signature while increasing the turbulence creates the fault signatures.

B. Analysis of Rotor Blade Pitch Imbalance Faults
Rotor blade pitch imbalance faults are induced by misaligned rotor blades. Such misalignments create differences in the vertical shear profiles experienced by the blades and have the potential to induce shaft torque variations [25].
The kinetic energy inherent within the varying dynamic loads and vibrations produced by shaft torque variations is transferred between the turbine's rotor shaft and generator via the electromagnetic coupling shared between the two. It was shown in [26] that, through such interconnections, amplitude and frequency modulations (AM and FM) are induced upon the generator's electrical stator current signal. Thus, for a direct drive MCT under the influence of pitch imbalance faults, the shaft torque generated by the variable speed rotor can be described by (1), where $T_t$ is the torque experienced by the rotor shaft, $T_{tw}$ is the torque resulting from turbulence and wave activity, $A_{stv}$ is the amplitude of the shaft torque variation stemming from the pitch imbalance fault, and $f_{st}$ is the frequency of the shaft torque variation. The resulting AM and FM affect both the generator stator current signal, $I_{gen}(t)$, and its fundamental frequency, $f_I(t)$, as described by (2) and (3). In (2), $I_{gen}$ is the amplitude of the stator current signal, $I_{vca}$ is the component of the stator current amplitude that is generated from the variable ocean current power, and $A_{pfc}$ and its accompanying phase term are, respectively, the amplitude and phase components of the signal that are created by the pitch imbalance fault. In (3), $f_{gen}$ is the fundamental frequency of the stator current signal, $f_{vcf}$ is the component of the fundamental frequency that is created by the variable ocean current power, and $A_{pff}$ and its accompanying phase term are, respectively, the amplitude and phase components of the signal created by the pitch imbalance fault.
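The displayed forms of (1)-(3) are not reproduced in this text. A plausible reconstruction, built only from the symbol definitions given above, is sketched below. It assumes a purely sinusoidal torque variation at $f_{st}$; the phase symbols $\varphi_{pfc}$ and $\varphi_{pff}$, the choice of cosine versus sine, and the time dependence of each term are assumptions rather than the authors' stated equations. The left-hand side of (3) follows the definition "In (3), $f_{gen}$ is the fundamental frequency", which the introducing sentence writes as $f_I(t)$.

```latex
\begin{align}
T_t(t)     &= T_{tw}(t) + A_{stv}\cos\left(2\pi f_{st}\, t\right) \tag{1}\\
I_{gen}(t) &= I_{vca}(t) + A_{pfc}(t)\sin\left(2\pi f_{st}\, t + \varphi_{pfc}\right) \tag{2}\\
f_{gen}(t) &= f_{vcf}(t) + A_{pff}(t)\sin\left(2\pi f_{st}\, t + \varphi_{pff}\right) \tag{3}
\end{align}
```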
Combining equations (2) and (3) allows for the formulation of the modulated stator current signal in (4), where $C_{gen}$ is the AM and FM version of the generator stator current signal under pitch imbalance fault conditions. Lastly, if it is assumed that the generator outputs an ideal three-phase supply voltage, then the instantaneous power can be described by (5), where $V_{gen}(t)$ is the single-phase stator terminal voltage. Thus, the single-phase power of the generator can be described similarly to [27], as in (6). This analysis yields the following conclusions: 1) Rotor blade pitch imbalance faults induce shaft torque variations; 2) The shaft torque variations create dynamic loads and vibrations that are transferred onto the rotor shaft [28]; 3) The kinetic energy housed within these dynamic loads and vibrations modulates the generator's electrical power signal; and 4) The shaft torque variations create excitations within the frequency spectra of the electrical power signals. However, due to the induced AM, FM, and spectral leakage, these excitations become masked within the frequency spectrum of $P_{gen}(t)$. Thus, traditional Fourier-based frequency analysis methods are unable to fully capture the entire range of such dynamics, which highlights the need for more robust techniques.
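As a rough numerical illustration of the modulation chain described above, the sketch below synthesizes an AM and FM stator current and multiplies it by an idealized voltage. Everything in it is an assumption made for illustration: the modulated current is taken as $I_{gen}(t)\sin(2\pi\int f_{gen}(t)\,dt)$, the voltage is a unit-amplitude sinusoid in phase with the current, and the numerical values (0.2 Hz fault frequency, 2 Hz electrical frequency, modulation depths) are hypothetical and not drawn from the simulation platform.

```python
import numpy as np

fs = 10.0                              # sampling rate in Hz
t = np.arange(int(fs * 300)) / fs      # 300 s record
f_st = 0.2                             # hypothetical torque-variation (1P) frequency, Hz

# Hypothetical amplitude and frequency modulation caused by the fault.
i_gen = 1.0 + 0.10 * np.sin(2 * np.pi * f_st * t)
f_gen = 2.0 + 0.05 * np.sin(2 * np.pi * f_st * t)

# Integrate the instantaneous frequency to obtain the phase of the current.
phase = 2 * np.pi * np.cumsum(f_gen) / fs
c_gen = i_gen * np.sin(phase)          # assumed AM + FM stator current
v_gen = np.sin(phase)                  # idealized, in-phase terminal voltage
p_gen = v_gen * c_gen                  # single-phase instantaneous power

# Locate the strongest low-frequency component of the power signal.
spectrum = np.abs(np.fft.rfft(p_gen - p_gen.mean()))
freqs = np.fft.rfftfreq(len(p_gen), d=1 / fs)
band = (freqs > 0.05) & (freqs < 1.0)
peak_freq = freqs[band][np.argmax(spectrum[band])]
print(f"dominant low-frequency component: {peak_freq:.3f} Hz")
```

In this idealized, constant-speed case the excitation sits cleanly at $f_{st}$; with a drifting rotor speed and turbulence the same component would be smeared, which is the situation the wavelet analysis in Section IV targets.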

IV. PROPOSED FAULT DETECTION FRAMEWORK
Our novel fault detection and severity classification framework is shown in Fig. 1. It begins with a signal acquisition phase, in which the instantaneous rotor frequency signal, $f_{rot}$, and $P_{gen}(t)$ are acquired from the turbine's generator. The signal $f_{rot}$ is strictly used to determine the 1P frequency range that $P_{gen}(t)$ is band-pass filtered around and is not used afterwards.
Since the CWT is highly adept at analyzing the frequency spectra of non-stationary signals, a time-frequency spectrum of $P_{gen}(t)$ is created utilizing a Morlet wavelet-based CWT. The resulting wavelet coefficients are then tabulated into a six-dimensional feature space, where each dimension of the feature space corresponds to a different statistical index calculated from the wavelet coefficients. The six chosen statistical features are the mean, standard deviation, skewness, kurtosis, RMS, and peak-to-peak values.
Principal Component Analysis (PCA) is then employed to reduce the dimensionality of the feature space by only utilizing the components that correspond to the directions of maximum variation within the data set. Finally, a KNN machine learning algorithm is employed on the lower-dimensional feature space for the purpose of fault detection and severity classification. Additional details for each step of the framework are provided below.

A. Signal Acquisition and Conditioning
Band-pass filtering is performed on $P_{gen}(t)$ via the application of a Gaussian filter kernel that is centered at the mean frequency of $f_{rot}$, such that the full width at half maximum of the kernel corresponds to the +/-3 standard deviation (3-STD) range of $f_{rot}$. This filtering removes the DC offset, sampling noise, and any other frequencies unrelated to the band of interest from the spectrum of $P_{gen}(t)$.
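The filtering step above can be sketched as a frequency-domain multiplication. This is a minimal illustration rather than the authors' implementation: the function name `gaussian_bandpass`, the choice to apply the kernel to the one-sided FFT, and the synthetic test signal are all assumptions. The kernel width uses the standard relation FWHM = $2\sqrt{2\ln 2}\,s$ for a Gaussian with standard deviation $s$, with the FWHM set to the full +/-3 STD span (six standard deviations) of $f_{rot}$.

```python
import numpy as np

def gaussian_bandpass(p_gen, f_rot, fs):
    """Band-pass p_gen around the mean of f_rot with a Gaussian kernel."""
    n = len(p_gen)
    freqs = np.fft.rfftfreq(n, d=1 / fs)
    center = np.mean(f_rot)
    fwhm = 6.0 * np.std(f_rot)                    # width of the +/-3 STD range
    s = fwhm / (2 * np.sqrt(2 * np.log(2)))       # Gaussian std from the FWHM
    kernel = np.exp(-0.5 * ((freqs - center) / s) ** 2)
    return np.fft.irfft(np.fft.rfft(p_gen) * kernel, n=n)

# Synthetic example: DC offset + 1P tone at 0.2 Hz + unrelated 1.5 Hz tone.
fs = 10.0
t = np.arange(3000) / fs
rng = np.random.default_rng(0)
f_rot = 0.2 + 0.01 * rng.normal(size=t.size)      # hypothetical rotor frequency
p_gen = 5.0 + np.sin(2 * np.pi * 0.2 * t) + 0.5 * np.sin(2 * np.pi * 1.5 * t)
filtered = gaussian_bandpass(p_gen, f_rot, fs)
print(filtered.mean(), filtered.std())
```

Because the kernel is essentially zero at 0 Hz and at 1.5 Hz, only the 0.2 Hz component survives in this example.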

B. Time Frequency Spectrum Analysis
A Continuous Wavelet Transform (CWT) is used to generate the time-frequency spectrum of $P_{gen}(t)$. The CWT can be interpreted as the convolution between $P_{gen}(t)$ and several localized wave-like functions of oscillatory nature. These wave-like functions possess finite energy, zero mean, and are all derived from a single mother wavelet basis function, $\Psi(t)$ [29], [30].
$\Psi(t)$ contains two hyper-parameters that can be tuned to increase its robustness for non-stationary signal analysis. The first parameter is the dilation parameter, $a$, which controls the stretching and contraction of $\Psi(t)$. The second parameter is the translation parameter, $b$, which shifts $\Psi(t)$ along the length of $P_{gen}(t)$. The dilated and translated wavelet takes the general form $\Psi_{a,b}(t) = \Psi((t-b)/a)$, and the CWT is evaluated over a range of $a$'s and $b$'s as:
$$T(a,b) = w(a)\int_{-\infty}^{\infty} P_{gen}(t)\,\Psi^{*}\!\left(\frac{t-b}{a}\right)dt$$
where the quantities represented by $T(a,b)$ are known as the wavelet coefficients, and are measures of cross-correlation between $P_{gen}(t)$ and $\Psi_{a,b}(t)$ [30]. Values of $T(a,b)$ are usually visualized via a spectrogram image, whose axes represent the various translation and dilation parameters of $\Psi(t)$. Additionally, $w(a)$ is a weighting function that is customarily set equal to $1/\sqrt{a}$ to ensure that the wavelets at every scale all possess equal amounts of energy. Lastly, the * symbol indicates that the complex conjugate of $\Psi(t)$ is used. A wide variety of $\Psi(t)$ functions may be used when applying the CWT, with the optimal choice largely dependent upon the similarity of shape between $\Psi(t)$ and the fault signature being analyzed. Since this research is concerned with analyzing 1P band-specific frequency activity, the increased control, precision, and shape of the Morlet wavelet's windowing kernel made it the ideal $\Psi(t)$ for this research [31].
A complex Morlet wavelet, $w$, can be constructed by multiplying a Gaussian window function with a complex sine wave: $w = e^{i 2\pi f t}\, e^{-t^{2}/(2\sigma^{2})}$, where $i$ is the imaginary operator, $f$ is the peak frequency in Hertz of the sine wave, and $t$ is the time in seconds [32]. Additionally, $\sigma = n/(2\pi f)$ is the parameter that controls the width of the Gaussian window function, for which $n$ dictates the trade-off between frequency and time precision. Furthermore, $n$ is a non-trivial parameter: it governs the time-frequency trade-off imposed by the Heisenberg uncertainty principle and heavily influences the quality of results that are achievable from the data [31]. Through empirical analysis, $n$ was selected to range from 5 to 15, with 100 increments between these values.
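The wavelet construction above can be sketched numerically as follows. This is an illustrative sketch, not the authors' code: the function names `morlet` and `morlet_power`, the truncation of the wavelet at four Gaussian standard deviations, the amplitude normalization, and the use of a single fixed $n$ (rather than the 5 to 15 range reported above) are all assumptions. Convolving with this wavelet is equivalent to correlating with its complex conjugate, because the Gaussian envelope is symmetric.

```python
import numpy as np

def morlet(f, n_cycles, fs):
    """Complex Morlet wavelet w = exp(i*2*pi*f*t) * exp(-t^2 / (2*sigma^2))."""
    sigma = n_cycles / (2 * np.pi * f)
    half = int(np.ceil(4 * sigma * fs))           # truncate at +/- 4 sigma
    t = np.arange(-half, half + 1) / fs
    return np.exp(2j * np.pi * f * t) * np.exp(-t ** 2 / (2 * sigma ** 2))

def morlet_power(x, freqs, n_cycles, fs):
    """Time-frequency power of x, one row per analysis frequency."""
    power = np.empty((len(freqs), len(x)))
    for k, f in enumerate(freqs):
        w = morlet(f, n_cycles, fs)
        w = w / np.sum(np.abs(w))                 # amplitude normalization (assumed)
        power[k] = np.abs(np.convolve(x, w, mode="same")) ** 2
    return power

# Synthetic check: a 0.2 Hz tone should peak at the 0.2 Hz analysis frequency.
fs = 10.0
t = np.arange(3000) / fs
x = np.sin(2 * np.pi * 0.2 * t)
freqs = np.linspace(0.1, 0.5, 9)
tf_power = morlet_power(x, freqs, n_cycles=7, fs=fs)
best = freqs[np.argmax(tf_power.mean(axis=1))]
print(f"frequency with the most wavelet energy: {best:.2f} Hz")
```

Larger values of `n_cycles` narrow the frequency response at the cost of a longer wavelet in time, which is the trade-off that $n$ controls in the text.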

C. Feature Space Creation & Optimization
The wavelet coefficients generated by the CWT are statistically tabulated into a feature space that comprises features corresponding to the mean, standard deviation, skewness, kurtosis, RMS, and peak-to-peak values. PCA is then utilized to reduce the dimensionality of this six-dimensional feature space by extracting only the principal components associated with the directions of maximum variation within the data set.
PCA is a data reduction technique whose aim is to construct a set of principal components based upon the covariance amongst a set of correlated features in an N-dimensional data set [31]. Each principal component represents a vector in N-dimensional space that characterizes the direction of a certain amount of variance contained within the data set.
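A minimal sketch of the six-feature computation is given below. The function name `six_features` is hypothetical, and the specific conventions (population standard deviation, standardized third and fourth moments for skewness and kurtosis, non-excess kurtosis) are assumptions, since the text does not state which estimators were used.

```python
import numpy as np

def six_features(coeffs):
    """Return [mean, std, skewness, kurtosis, RMS, peak-to-peak] of the input."""
    x = np.asarray(coeffs, dtype=float).ravel()
    mean = x.mean()
    std = x.std()                         # population standard deviation
    z = (x - mean) / std
    skewness = np.mean(z ** 3)            # standardized third moment
    kurtosis = np.mean(z ** 4)            # standardized fourth moment (not excess)
    rms = np.sqrt(np.mean(x ** 2))
    peak_to_peak = x.max() - x.min()
    return np.array([mean, std, skewness, kurtosis, rms, peak_to_peak])

features = six_features([1.0, 2.0, 3.0, 4.0, 5.0])
print(features)
```

Applying the function to the 1P-band wavelet energy of each simulated signal would yield one six-element row per signal, which together form the feature matrix passed to PCA.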
Performing PCA begins with the construction of a covariance matrix, CovMat [31]:
$$\mathrm{CovMat} = \frac{(X-\bar{X})^{T}(X-\bar{X})}{n-1}$$
In the equation above, $X$ is the given n by m data set or feature matrix being analyzed and $\bar{X}$ is the mean value of $X$. After the covariance matrix has been calculated, an eigendecomposition of the covariance matrix is performed. MATLAB's eig function makes this process convenient and efficient via the call $[W, \Lambda] = \mathrm{eig}(\mathrm{CovMat})$, where $W$ is an m by m matrix of the resulting principal components, and $\Lambda$ is a diagonal matrix of the associated eigenvalues. The eigenvectors characterize patterns within the covariance matrix, and can be viewed as a set of new coordinate axes with which to analyze the feature space. The eigenvalues, $\lambda$, represent the amount of variance captured along each eigenvector. Principal components are customarily ordered according to decreasing eigenvalue, such that the 1st principal component characterizes the "direction" of the largest amount of variance within the data set, the 2nd principal component characterizes the "direction" of the 2nd largest amount of variance within the data set, and so forth. Since the n by m score matrix, $T = X \cdot W$, inherits this ordering, it can be truncated to contain only the $r$ most relevant principal components, such that $T_r = X \cdot W_r$. Here, $W_r$ is an m by r matrix, and $T_r$ is the n by r truncated score matrix obtained from $T$.
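The PCA steps described above can be sketched with NumPy in place of MATLAB's eig. This is an illustrative sketch under stated assumptions: the function name `pca_scores` is hypothetical, the scores are computed from the mean-centered feature matrix, and the synthetic data (two latent factors spread across six columns) is invented purely to exercise the code.

```python
import numpy as np

def pca_scores(X, r):
    """Project X onto its r leading principal components."""
    Xc = X - X.mean(axis=0)                       # center each feature column
    cov = Xc.T @ Xc / (X.shape[0] - 1)            # m-by-m covariance matrix
    eigvals, W = np.linalg.eigh(cov)              # eigh: symmetric matrices
    order = np.argsort(eigvals)[::-1]             # sort by decreasing eigenvalue
    eigvals, W = eigvals[order], W[:, order]
    T_r = Xc @ W[:, :r]                           # n-by-r truncated score matrix
    explained = eigvals / eigvals.sum()           # variance fraction per component
    return T_r, explained

# Synthetic 150-by-6 feature matrix driven by two latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(150, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.01 * rng.normal(size=(150, 6))
T_r, explained = pca_scores(X, r=2)
print(T_r.shape, explained[:2].sum())
```

The cumulative sum of `explained` is the quantity one would inspect to decide how many components to keep.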

D. Application of Machine Learning Algorithm
A KNN machine learning algorithm is then employed on the truncated feature space for fault detection and severity classification.
The KNN machine learning algorithm is a lazy, non-parametric, supervised learning algorithm that is adept at solving both regression and classification problems. The decision making capability of the KNN algorithm is based on feature similarity, in that the algorithm assumes that similar instances exist in close proximity to each other. The prediction of a new instance is determined by a majority voting mechanism, in which the class of the new instance is chosen to be the same as that of the majority class of the K nearest instances (neighbors) around this new instance. The most common way to compute the distance between two instances, $p$ and $q$, is via the Euclidean distance, $d(p,q) = \sqrt{\sum_{i}(p_i - q_i)^2}$. When classifying a new instance, the selection of the optimal number of neighbors is usually done empirically.
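The majority-vote rule described above can be sketched in a few lines of standard-library Python. The function name `knn_predict`, the toy two-dimensional points, and the class labels are hypothetical and serve only to illustrate the mechanism; they are not the paper's data.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=5):
    """Label a query point by majority vote among its k nearest neighbors."""
    # Pair each training point's Euclidean distance to the query with its label.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy training set: three well-separated clusters, one per fault class.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (10, 0), (10, 1), (11, 0)]
train_y = ["0deg"] * 3 + ["2deg"] * 3 + ["4deg"] * 3

print(knn_predict(train_X, train_y, (0.5, 0.5), k=3))
print(knn_predict(train_X, train_y, (5.5, 5.2), k=3))
```

Because the algorithm is lazy, there is no training step beyond storing the labeled points; all of the work happens at prediction time.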

A. Simulation Setup and Data Set Preparation
Fifty random seeds of data were simulated for the experiment, where for each seed a generator power signal and a rotor frequency signal were simulated for each of the three studied fault cases (i.e., zero degrees, two degrees, and four degrees).
Each signal was sampled at a frequency of 10 Hz for a total duration of 300 seconds. The simulation platform utilized a turbulence intensity of 10% and a shear reduction rate of 0.0035 (m/s)/m with depth. The mean flow speed of the simulated ocean currents was 2 m/s at a max height of 20 m above the center of rotation of the rotor. To control the rotor speed, an industry-standard fixed-gain torque controller, $\tau = k \cdot \omega^{2}$, was used to maintain an operating speed near that of maximum power production. A representative look at the simulation output corresponding to seed one of the data set can be seen in Fig. 2.

B. Experimental Results
Fig. 2a displays the $P_{gen}(t)$ signals corresponding to seed 1 of the data set. In Figs. 2b-d, the time-frequency spectra of the $P_{gen}(t)$ signals are shown. The bottom portion of each spectrum portrays the sum of the wavelet coefficient energy contained within the 1P frequency region, for which the center red line represents the mean of $f_{rot}$, and the two boundary red lines represent the +/-3 STD range limits of $f_{rot}$. While there does appear to be a clear distinction within the 1P band-specific activity amongst these plots, establishing a consistent means of accurately quantifying the difference in this activity throughout the entirety of the data set is a fundamental goal of this work.
To aid in developing a means of accurately quantifying the 1P band-specific frequency activity for all fault cases, a six-dimensional statistical feature space was created to concisely describe the statistical characteristics of the wavelet coefficient energy contained within the +/-3 STD regions. Fig. 3 is a plot of the Pearson correlation matrix, which is used to simultaneously investigate the linear dependencies between the different wavelet coefficient statistical features used to create the feature space.
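A Pearson correlation matrix of the kind described above can be computed directly with NumPy. The sketch below uses an invented three-column feature matrix (one column a linear function of another, one independent) purely to show the call; it does not reproduce the values in Fig. 3.

```python
import numpy as np

rng = np.random.default_rng(0)
base = rng.normal(size=150)

# Hypothetical features: column 1 depends linearly on column 0, column 2 does not.
features = np.column_stack([base, 2.0 * base + 1.0, rng.normal(size=150)])

# rowvar=False treats each column as one variable (feature).
corr = np.corrcoef(features, rowvar=False)
print(np.round(corr, 2))
```

Off-diagonal entries close to +1 or -1 indicate redundant features, which is the condition under which PCA can compress the feature space effectively.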
Since the correlation coefficients possessed by many of the pairs of features have large magnitudes, PCA can be employed to reduce the dimensionality of the feature space. Fig. 4a depicts the cumulative sum of the variance possessed by the principal components, where it is shown that approximately 95% of the total variance contained within the feature space can be captured by the first two principal components. Figs. 4b-c show the relative percentage-wise contribution that each statistical feature makes towards the creation of principal components one and two, respectively. Furthermore, the shear and turbulence characteristics that are responsible for creating the analyzed fault signatures and their intensities in this research are captured very well via the use of PCA. As shown, principal component one has its largest contributions made by the RMS, Mean, and STD features, while principal component two is constructed primarily from the Skew and Kurt features.
Before the KNN machine learning algorithm is employed, the data set is partitioned into a 70% training set and a 30% testing set. Fig. 5a is a display of the projection matrix obtained after employing PCA on the testing partition. Repeated cross validation was utilized to train the KNN classifier, such that 10-fold cross validation was performed 10 distinct times on the training data set. The trained classifier was then employed on the testing data set displayed in Fig. 5a. Correspondingly, Fig. 5b displays the confusion matrix, for which the reported accuracy shows that the classifier is able to correctly predict the presence of a fault 100% of the time and correctly classify the severity of the fault 97.78% of the time. Through empirical analysis, the KNN algorithm was optimized to use five neighbors, for which a simple Euclidean distance measure was used to determine the distance between each neighbor.
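The repeated cross-validation scheme described above (10-fold, repeated 10 times) can be sketched as an index generator. The function name `repeated_kfold`, the seed, and the training-set size of 105 (70% of 150 signals, assuming 50 seeds times three fault cases) are assumptions for illustration; the text does not state how the folds were drawn.

```python
import numpy as np

def repeated_kfold(n_samples, n_folds=10, n_repeats=10, seed=0):
    """Yield (train_idx, val_idx) pairs for repeated k-fold cross validation."""
    rng = np.random.default_rng(seed)
    for _ in range(n_repeats):
        perm = rng.permutation(n_samples)         # fresh shuffle for each repeat
        folds = np.array_split(perm, n_folds)
        for i in range(n_folds):
            val_idx = folds[i]
            train_idx = np.concatenate(
                [folds[j] for j in range(n_folds) if j != i]
            )
            yield train_idx, val_idx

splits = list(repeated_kfold(105))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Each candidate number of neighbors would be scored by its average validation accuracy over all splits, and the best-scoring value retained for the final evaluation on the held-out testing set.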
Lastly, a summary of the experimental results is presented in TABLE I. Of note in the table is the Kappa metric, which was found to be 96.67%. The Kappa metric, which compares the observed accuracy of the classifier with the accuracy expected from random chance, is a measure of how closely the instances classified by the KNN classifier matched the ground truth while also taking into account the extent to which the data collected in the experiment is correctly represented by the measured features [33]. The Kappa metric is relevant because it removes the element of instances being correctly classified based upon random chance from its evaluation of accuracy. Additionally, the table also shows that the classifier proposed in our framework outperformed the No Information Rate, which is the accuracy obtained when a naive classifier is employed, and which serves as the baseline for demonstrating that the model presented in this work possesses a statistically significant P-value.
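The relationship between accuracy and the Kappa metric can be checked with a short calculation. The confusion matrix below is hypothetical: it assumes 45 test instances (30% of 150) split evenly across the three fault cases with a single severity misclassification, and it is not the matrix shown in Fig. 5b. Under those assumptions it reproduces the reported 97.78% accuracy and 96.67% Kappa.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: true, cols: predicted)."""
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical confusion matrix for the 0, 2, and 4 degree classes.
confusion = [
    [15, 0, 0],
    [0, 14, 1],
    [0, 0, 15],
]
accuracy = sum(confusion[i][i] for i in range(3)) / 45
kappa = cohen_kappa(confusion)
no_information_rate = max(sum(row) for row in confusion) / 45
print(f"accuracy={accuracy:.4f}, kappa={kappa:.4f}, NIR={no_information_rate:.4f}")
```

With balanced classes the chance agreement is one third, which is also the No Information Rate of a classifier that always predicts a single class.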

VI. CONCLUSIONS
The objective of this research was to develop a fault detection and severity classification framework for MCT rotor blade pitch imbalance faults. In the proposed framework, a Morlet CWT was first utilized to view the time-frequency spectra of the turbine's electrical power signals. Next, the wavelet coefficient energy encompassed within the 1P rotational frequency range of the turbine's rotor shaft was extracted and statistically tabulated into a six-dimensional feature space. PCA was then used to reduce the dimensionality of the feature space so that the new coordinate axes of the space are aligned with the directions of maximum variation. Lastly, a KNN machine learning algorithm was employed on the resulting feature space for fault detection and severity classification. This research found that the KNN machine learning algorithm correctly predicted the presence of pitch imbalance faults 100% of the time and correctly classified their severity 97.78% of the time.
Our proposed method has very low hardware cost, and does not require the use of complex sensor networks (containing vibration, strain, torque, and/or acoustic emission sensors) that are used in contemporary fault detection and condition monitoring systems. Additionally, the proposed framework can also be effortlessly integrated into existing MCT control systems. In the future, it is hoped that the robustness of this framework can be expanded to allow for the prediction and classification of an extended range of faults that affect MCT systems, such as gear faults, drivetrain and bearing faults, and