Overvoltage prediction method integrating the model-driven and data-driven techniques

The improved DT-based method proposed in Section 2 achieves reliable prediction accuracy and strong adaptability to high-risk scenarios. However, data-driven methods lack theoretical analysis and may over-fit when training samples are insufficient, and both issues should be taken into account. Furthermore, the interpretability of prediction results is crucial for decision-making in power system operation. To address these issues, this section proposes an overvoltage analysis method integrating the model-driven and data-driven techniques, and presents the integration framework of the two methods.
Model-driven overvoltage analysis method
The transient overvoltage mechanism and an analytical expression for the transient overvoltage peak value are studied, and the key influencing factors leading to overvoltage problems are extracted. The proposed model-driven method has the potential for online application, which provides effective support for the integration with data-driven methods.
(1) Transient overvoltage mechanism
The equivalent circuit of a typical wind-thermal-bundled HVDC transmission system is shown in Fig. 3.
Fig.3 Typical two-terminal AC/DC hybrid system model
When the AC/DC hybrid system is operating normally, the active and reactive powers should be balanced, which can be expressed as:
where the subscript N indicates the normal operating condition, QdrN is the reactive power consumption of the converter station, and QCrN, Qac1N and QwN denote the reactive output of the reactive power compensation device, the sending AC system, and the wind farm, respectively. A short-circuit fault in the receiving AC system may lead to CF. For the sending AC system, the voltage is directly related to the reactive power, and the transient overvoltage of the converter bus can be expressed as follows.
where ∆Qr is the reactive surplus of the converter station and Scr is the short-circuit capacity of the converter station. During the CF, the reactive power consumed by the converter is dynamic. Therefore, ∆Qr in the expression above can be expressed as follows.
Combining with , the transient overvoltage level under different DC faults can be obtained.
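The relationship between reactive surplus, short-circuit capacity and voltage rise described above can be illustrated with a minimal sketch. It assumes the common first-order approximation that the per-unit voltage rise is roughly ∆Qr / Scr, and a sign convention in which the surplus is the reactive power supplied minus the reactive power consumed by the converter. Both are assumptions made for illustration; the voltage dependence of the compensation devices is ignored, and the function names and numerical values are hypothetical rather than taken from the test system.

```python
def reactive_surplus(q_cr, q_ac, q_w, q_dr):
    """Reactive surplus at the converter bus (Mvar): supplied minus consumed.

    Assumed sign convention: a positive value means excess reactive power.
    """
    return q_cr + q_ac + q_w - q_dr


def overvoltage_estimate(delta_q, s_cr, u_n=1.0):
    """First-order voltage estimate in p.u.: U ~= U_N * (1 + dQ / S_cr).

    Textbook approximation used for illustration only; it is not
    claimed to be the analytical expression derived in this section.
    """
    if s_cr <= 0:
        raise ValueError("short-circuit capacity must be positive")
    return u_n * (1.0 + delta_q / s_cr)


# Hypothetical operating point (Mvar and MVA), not taken from the test system.
balanced = reactive_surplus(q_cr=3000.0, q_ac=500.0, q_w=200.0, q_dr=3700.0)
during_cf = reactive_surplus(q_cr=3000.0, q_ac=500.0, q_w=200.0, q_dr=1200.0)
u_peak = overvoltage_estimate(during_cf, s_cr=25000.0)
print(balanced, during_cf, u_peak)
```

In this sketch the surplus is zero under normal operation and becomes positive once the converter consumption drops during the CF, so a smaller short-circuit capacity yields a larger estimated voltage rise.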
(2) Expression of transient overvoltage level
According to the dynamic characteristic of reactive power compensation device, can be derived from .
Combining and , the transient overvoltage of converter bus can be derived.
According to [28], the Qdr can be calculated as follows.
where Udr and Udr0 are the DC voltage and no-load DC voltage of the rectifier bus, respectively, and Idr is the DC current. Idr is always greater than 0 during the CF. According to the expression above, Qdr is therefore greater than 0, which indicates that the rectifier and inverter only absorb reactive power. Furthermore, the derivative of Qdr is shown as follows.
When a CF occurs, the DC voltage of the inverter plummets directly to 0, and the DC current increases rapidly. The rectifier constant-current controller then increases the firing angle to reduce the DC current, which settles at I min (I min depends on the VDCOL). Consequently, the DC current increases first and then decreases to I min, and Udr also decreases to Udr min. Setting dQdr/dt = 0 gives
When Udr = Udr min and Idr = Idr min, the condition above is satisfied, and the rectifier absorbs the minimum reactive power. Udr min can be calculated as follows.
Substituting into , the Qdr can be expressed as:
Therefore, the transient overvoltage of converter bus can be calculated as follows.
It can be seen from the expression above that the crucial factors contributing to overvoltage issues are the short-circuit capacity, the reactive output of the AC system during normal operation, and the reactive power consumption of the converter station during the fault. Therefore, alternative transient overvoltage control measures can be summarized as follows: reducing the reactive-voltage sensitivity [29] and suppressing the sources of reactive surplus [30,31].
Framework of the integrated method
The proposed theoretical analysis method for calculating the overvoltage peak value of converter buses balances computation speed and accuracy for online application, providing effective support for integration with the data-driven method. Specifically, the theoretical overvoltage values can be obtained efficiently from the analytical expression derived above by using the equivalent parameters of the AC system and the operating parameters of the DC system. To avoid the over-fitting caused by improper feature selection and to enhance the interpretability of the regression prediction results, the theoretical analysis results are added to the original training samples as additional input features for the data-driven method. The detailed integration mode between the model-driven and data-driven overvoltage analysis methods is depicted in Fig. 4. The objective of the improved DT model is thereby transformed from mining relationships in massive data to revealing the association pattern between the theoretical evaluation values and the true values. Therefore, for typical fault scenarios, the key electrical quantities and the corresponding theoretical overvoltage values are selected as input features, and the overvoltage peak values obtained by time-domain simulation are taken as the output. The DT model is trained on the augmented samples to achieve fast error correction.
Fig.4 Integrated prediction network structure
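The integration step, in which the theoretical overvoltage value is appended to each sample as an extra input feature, can be sketched as follows. This is an illustrative sketch only: the feature names (`delta_q`, `s_cr`, `p_dc`, `u_theory`) and the simple first-order estimator standing in for the analytical expression are hypothetical and are not the features or formula used in the paper.

```python
from typing import Callable, Dict, List

Sample = Dict[str, float]


def augment_with_theory(samples: List[Sample],
                        theory_fn: Callable[[Sample], float],
                        key: str = "u_theory") -> List[Sample]:
    """Return copies of the samples with the model-driven estimate added as a feature."""
    augmented = []
    for sample in samples:
        row = dict(sample)            # copy so the raw samples stay untouched
        row[key] = theory_fn(sample)  # model-driven value becomes an input feature
        augmented.append(row)
    return augmented


def to_matrix(samples: List[Sample], feature_names: List[str]) -> List[List[float]]:
    """Arrange the selected features column-wise for a regression model."""
    return [[sample[name] for name in feature_names] for sample in samples]


def first_order_theory(sample: Sample) -> float:
    """Hypothetical stand-in for the analytical overvoltage expression (p.u.)."""
    return 1.0 + sample["delta_q"] / sample["s_cr"]


# Hypothetical key electrical quantities for two fault scenarios.
raw = [
    {"delta_q": 2500.0, "s_cr": 25000.0, "p_dc": 0.8},
    {"delta_q": 4000.0, "s_cr": 20000.0, "p_dc": 1.0},
]
rows = augment_with_theory(raw, first_order_theory)
X = to_matrix(rows, ["delta_q", "s_cr", "p_dc", "u_theory"])
print(X)
```

With this arrangement the regression model only has to learn the correction between the last column and the simulated peak value, which is the role assigned to the improved DT model in the framework.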
Specific procedure of overvoltage level prediction
The specific procedure of overvoltage peak value prediction is depicted in Fig. 5 and comprises offline training and online prediction.
Fig.5 Specific procedure of overvoltage level prediction
In the offline training stage, typical fault scenarios are simulated in the PSASP software, taking into account factors such as load levels, renewable energy penetration rates, and the active power transmitted by the DC link. From the simulation results, characteristic quantities related to overvoltage levels are extracted to form the sample set. The input features of the sample set consist of the key electrical quantities and the corresponding overvoltage peak values calculated by the theoretical analysis method, while the output labels are the overvoltage peak values obtained by time-domain simulation. The tree growing and pruning procedures are then carried out on the sample set, and the splitting rule of each node is determined according to the prediction performance to obtain the optimal overvoltage level prediction model.
In the online application stage, when a fault occurs in the actual power grid, the key electrical quantities are collected by WAMS, and the corresponding theoretical overvoltage peak value is obtained through the analytical expression. The combined input features are then fed into the trained improved DT model, which predicts the overvoltage level and guides the secure and stable operation of power systems.

Case study

In this section, the local hybrid AC/DC power grid in Northwest China depicted in Fig. 6 is adopted as the test system to verify the accuracy and effectiveness of the proposed method. The test system is built on the 750 kV grid structure. The Qingyu DC transmission project is used to deliver renewable energy from the Northwest China region, forming a typical hybrid AC/DC power grid.
Fig.6 Northwest China local region hybrid AC/DC power grid
Performance of improved DT model
(1) Evaluation indices
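Performance evaluation indices, including the mean absolute error (MAE), mean absolute percentage error (MAPE), root mean squared error (RMSE) and coefficient of determination (R2), are adopted in this paper to evaluate the overvoltage prediction performance of different models. The calculation formulas are:
$$\mathrm{MAE} = \frac{1}{m}\sum_{i=1}^{m}\left|\hat{y}_i - y_i\right|$$
$$\mathrm{MAPE} = \frac{100\%}{m}\sum_{i=1}^{m}\left|\frac{\hat{y}_i - y_i}{y_i}\right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(\hat{y}_i - y_i\right)^2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{m}\left(\hat{y}_i - y_i\right)^2}{\sum_{i=1}^{m}\left(y_i - \bar{y}\right)^2}$$
where m is the number of testing samples, $\hat{y}_i$ is the predicted value, $y_i$ is the actual value, and $\bar{y}$ is the mean value of $y_i$.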
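The four indices can be computed with a few lines of standard-library Python. The sketch below uses the usual textbook definitions of MAE, MAPE (in percent), RMSE and R2; the function name and the sample values are hypothetical and serve only to show the calculation.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MAPE (in percent), RMSE and R2 for paired actual/predicted values."""
    m = len(y_true)
    if m == 0 or m != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / m
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / m
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    rmse = math.sqrt(ss_res / m)
    mean_true = sum(y_true) / m
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all actual values are identical")
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MAPE": mape, "RMSE": rmse, "R2": r2}


# Hypothetical overvoltage peak values in p.u.
actual = [1.30, 1.40, 1.50]
predicted = [1.32, 1.38, 1.50]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

MAE and RMSE are expressed in the unit of the target (p.u. here), whereas MAPE is relative to each actual value, which is why it is the more informative index when comparing regions with different overvoltage levels.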
(2) Improved DT model training
Taking the operating mode and fault type of power systems into account, the simulation software PSASP is adopted to establish a dataset of overvoltage under large disturbances. For the operating mode, the outputs of traditional power stations and renewable energy stations are adjusted under load levels of 90%, 100% and 110%. In addition, the DC transmission power is adjusted in increments of 10% within the range of 60% to 100%. As for the fault type, CF is set at the rectifier station, including single-pole and double-pole faults. The number of CF occurrences is set to 1, 2 and 3, respectively, and the duration of CF ranges from 0.15 s to 0.25 s. The simulation time is 20 s, and the rated frequency of the test system is 50 Hz. For each DT, the total number of samples is 4480, of which 60% are selected as the training dataset and the remaining 40% are used as the testing dataset.
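The scenario enumeration and the 60%/40% split can be sketched as follows. Only the factors with explicitly stated levels are enumerated, so the grid below is illustrative and does not reproduce the 4480 samples, which also depend on generator dispatch and fault duration settings not listed here. The field names, the random shuffling and the seed are assumptions.

```python
import itertools
import random

LOAD_LEVELS = (0.9, 1.0, 1.1)            # 90%, 100%, 110% load
DC_LEVELS = (0.6, 0.7, 0.8, 0.9, 1.0)    # DC power in 10% steps
FAULT_POLES = ("single", "double")       # single- and double-pole faults
CF_COUNTS = (1, 2, 3)                    # number of CF occurrences


def scenario_grid():
    """Enumerate every combination of the explicitly stated factors."""
    return [
        {"load": load, "dc": dc, "poles": poles, "cf_count": count}
        for load, dc, poles, count in itertools.product(
            LOAD_LEVELS, DC_LEVELS, FAULT_POLES, CF_COUNTS)
    ]


def split_samples(samples, train_fraction=0.6, seed=0):
    """Shuffle reproducibly, then split into training and testing subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


grid = scenario_grid()
train, test = split_samples(range(4480))
print(len(grid), len(train), len(test))
```

Applied to 4480 samples, the split gives 2688 training samples and 1792 testing samples.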
The key electrical characteristics are selected as the input features, and the overvoltage peak value of the Qingnan 750 kV bus is collected as the output. As a common approach for building models and verifying model parameters, 10-fold cross-validation [32] is adopted to determine the depth of the DT model. The prediction performance on the testing set as the DT depth increases is depicted in Fig. 7. Considering the trade-off between calculation efficiency and prediction accuracy, the optimal DT depth is determined as 11.
Fig.7 Prediction effect under different DT depth
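Depth selection by k-fold cross-validation can be sketched with a small self-contained example. The regression tree below is a plain least-squares tree written only for illustration, not the improved DT of this paper, and the selection rule (lowest mean validation RMSE) is an assumption, since the paper also weighs calculation efficiency. The synthetic data are hypothetical.

```python
import random


def _sse(values):
    """Sum of squared deviations from the mean."""
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values)


def fit_tree(X, y, max_depth):
    """Grow a regression tree: a leaf is a float, a node is (feature, threshold, left, right)."""
    leaf = sum(y) / len(y)
    if max_depth == 0 or len(set(y)) == 1:
        return leaf
    best = None  # (score, feature, threshold, left indices, right indices)
    for j in range(len(X[0])):
        order = sorted(range(len(y)), key=lambda i: X[i][j])
        for k in range(1, len(y)):
            a, b = X[order[k - 1]][j], X[order[k]][j]
            if a == b:
                continue  # cannot split between equal feature values
            left, right = order[:k], order[k:]
            score = _sse([y[i] for i in left]) + _sse([y[i] for i in right])
            if best is None or score < best[0]:
                best = (score, j, (a + b) / 2.0, left, right)
    if best is None:
        return leaf
    _, j, thr, left, right = best
    return (j, thr,
            fit_tree([X[i] for i in left], [y[i] for i in left], max_depth - 1),
            fit_tree([X[i] for i in right], [y[i] for i in right], max_depth - 1))


def predict(tree, x):
    """Walk from the root to a leaf and return the leaf value."""
    while isinstance(tree, tuple):
        j, thr, left, right = tree
        tree = left if x[j] <= thr else right
    return tree


def cv_rmse(X, y, depth, k=10, seed=0):
    """Mean validation RMSE over k folds for a given maximum depth."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    scores = []
    for f in range(k):
        fold = idx[f::k]
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        tree = fit_tree([X[i] for i in train], [y[i] for i in train], depth)
        mse = sum((predict(tree, X[i]) - y[i]) ** 2 for i in fold) / len(fold)
        scores.append(mse ** 0.5)
    return sum(scores) / k


def select_depth(X, y, depths, k=10):
    """Pick the depth with the lowest cross-validated RMSE (first one on ties)."""
    return min(depths, key=lambda d: cv_rmse(X, y, d, k))


# Hypothetical data: the peak value steps from 1.30 to 1.40 p.u. at x = 0.5.
X_demo = [[i / 100] for i in range(100)]
y_demo = [1.30 if i < 50 else 1.40 for i in range(100)]
best_depth = select_depth(X_demo, y_demo, [0, 1, 2])
print(best_depth)
```

In practice a shallower tree whose validation error is close to the minimum may be preferred, which corresponds to the trade-off between efficiency and accuracy mentioned above.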
The RMSE and R2 of the training set and testing set as the number of samples increases during the training process are depicted in Figs. 8 and 9, respectively. The prediction accuracy is consistent between the training set and the testing set, which verifies the effectiveness and generalization capability of the integrated method.
Fig.8 RMSE of training and testing set
Fig.9 R2 of training and testing set
(3) Prediction effect and visualization of improved DT model
Figs. 10 and 11 depict the MAPE index of the traditional DT model and the improved DT model under different ranges of actual overvoltage peak values. The prediction results of the traditional integrated method show that the MAPE for testing samples with high actual peak values is considerably higher than that for samples with low actual peak values. Hence, modifying the DT algorithm is expected to reduce the MAPE for cases with high actual peak values. After the common DT algorithm is modified, the overall prediction performance on the testing samples is improved, especially in the cases with high actual peak values.
In detail, to evaluate the efficacy of the improved DT algorithm, four regions are delineated based on overvoltage levels: region 1 below 1.34 p.u., region 2 from 1.34 p.u. to 1.38 p.u., region 3 from 1.38 p.u. to 1.42 p.u., and region 4 above 1.42 p.u. The corresponding MAE and MAPE indices for each region are presented in Tables 1 and 2. The maximum improvement in the MAE and MAPE indices, amounting to 6.9% and 34.3%, respectively, occurs in region 4. These findings indicate that the proposed approach yields a more significant improvement in high-risk scenarios.
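The region-wise evaluation can be sketched as follows. The region boundaries are those stated above; assigning a value that lies exactly on a boundary to the higher region is an assumption, and the sample values are hypothetical.

```python
from bisect import bisect_right

REGION_EDGES = (1.34, 1.38, 1.42)  # p.u. boundaries between regions 1 to 4


def region_of(u_peak):
    """Map an actual overvoltage peak (p.u.) to its region number (1 to 4).

    Assumption: a value exactly on a boundary belongs to the higher region.
    """
    return bisect_right(REGION_EDGES, u_peak) + 1


def regional_errors(y_true, y_pred):
    """Return the sample count, MAE and MAPE (in percent) per overvoltage region."""
    buckets = {}
    for t, p in zip(y_true, y_pred):
        buckets.setdefault(region_of(t), []).append((t, p))
    report = {}
    for region, pairs in sorted(buckets.items()):
        n = len(pairs)
        mae = sum(abs(p - t) for t, p in pairs) / n
        mape = 100.0 * sum(abs((p - t) / t) for t, p in pairs) / n
        report[region] = {"n": n, "MAE": mae, "MAPE": mape}
    return report


# Hypothetical actual and predicted peak values in p.u.
actual = [1.30, 1.36, 1.45, 1.50]
predicted = [1.30, 1.36, 1.45, 1.47]
report = regional_errors(actual, predicted)
print(report)
```

Computing the indices per region, rather than over the whole testing set, keeps the few high-overvoltage samples from being averaged out by the more numerous low-overvoltage ones.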
Table 1 MAE (p.u.) of four overvoltage regions