Critical Failure Mechanisms in Isolated Three-phase Multilevel Inverters Design

Abstract: Isolated multilevel inverters are widely used in renewable energy systems and industrial applications. Isolated IGBT topologies use low-frequency transformers that improve robustness and reliability. However, critical failure mechanisms must be considered at the design stage to ensure proper performance. This paper describes these critical failure mechanisms: short circuit, cross-conduction, IGBT high inductive load avalanche, IGBT second turn-on, VS-undershoot, transformer inrush current, IGBT thermal runaway, and cable switching interference. It also presents design techniques to prevent these failures. All of these mechanisms originate in the inverter's power stage except cable switching interference, which is picked up by the control signal cables and directly affects the control device functionality. This work also proposes a circuit topology based on FPGA resources to reduce switching interference from control signal cables. It behaves like a fault-tolerant digital input that effectively filters bouncing events shorter than 2µs. Measurements on a constructed 45kVA AC-side-isolated 13-level inverter show satisfactory experimental results.


I. INTRODUCTION
MULTILEVEL inverters (MLIs) have become more popular than conventional DC/AC inverters in some application fields because they operate with many voltage steps that improve the output waveform. MLIs feature several advantages over conventional topologies, as follows. 1. MLIs operate at both high and low frequency, reducing the switching rate of the devices. 2. MLIs feature low output harmonic distortion. 3. MLIs can achieve high power capability by dividing the total power among several switching devices. 4. MLIs extend the lifetime and reliability of the switching devices due to low switching rates. 5. MLIs that operate at low frequency considerably reduce switching losses, enhancing the overall efficiency. An extensive comparison among state-of-the-art multilevel topologies is given in [1].
MLIs are categorized into non-isolated multilevel inverters and isolated ones, among others. Isolated MLIs are subcategorized into DC-side and AC-side isolated. This paper focuses on AC-side isolated inverters, especially the Cascade Transformer Multilevel Inverter (CTMI).
CTMIs are extensively used in large solar power plants because they feature low harmonic distortion, galvanic isolation, and low-frequency switching operation [2]. Moreover, [3], [4], [5], and [6] show that CTMIs can improve reliability, robustness, and efficiency despite the use of large low-frequency transformers.
Reference [7] elaborates on the CTMI features, most importantly the advantage of using a single DC source. It also notes that the main disadvantage of CTMIs is the use of large transformers, which increases the inverter's weight and size. Figure 1 depicts the block diagram of the constructed 45kVA multilevel inverter. It shows a modular CTMI comprising 18 transformer-isolated power modules, the FPGA-based control signal processing, the microcontroller-based analog signal processing and Human-Machine Interface (HMI), the analog front end, the DC main switch and protections, the control signals isolation module, the isolated low-voltage power supplies, the voltage and current sensors, and the DC-link and control signals distribution. No filtering stage was developed; however, the harmonic distortion at the inverter's output can be significantly improved by using a passive filter. A proper passive filter design approach is presented in [8]. Figure 1 highlights a single power stage, called a ''single tap.'' A ''single tap'' comprises an Insulated Gate Bipolar Transistor (IGBT) full-bridge and a 2.5kVA low-frequency transformer. It is the key focus of this work since the critical failure mechanisms come from this stage.
The low-frequency transformer plays a critical role in the design stage. Its electrical properties, such as high inductance, high inrush current, reduced di/dt, and high efficiency at heavy loads [7], [9], determine several design techniques to prevent failures [10].
The paper is organized as follows. Section II describes the short circuit failure mechanism. The modular CTMI topology involves two types of short circuits. A type I short circuit occurs when the current travels across a single arm of the full-bridge, whereas a type II short circuit occurs when the current passes through the transformer primary winding. Several techniques to prevent IGBT failure due to short circuits have been published, e.g., [11].
However, many of these techniques are difficult to implement and are not yet mature. In this work, the well-known and proven desaturation technique combined with a short-circuit-tolerant IGBT [13] was implemented as a robust solution to prevent IGBT failure due to short circuit events.
Fig. 1. The 45kVA constructed modular CTMI. Each output line contains six stages to perform a 13-level inverter. This work focuses on the critical failure mechanisms that mainly come from a single power stage (highlighted as single tap).
Section III addresses the cross-conduction failure mechanism. It describes how this mechanism may drive the IGBT into a failure scenario, e.g., [14]. Two passive delay lines were inserted into the control signal path to prevent cross-conduction.
Section IV deals with the high inductive load avalanche failure mechanism. It concludes that a meticulous design tradeoff between cross-conduction and proper transformer inductance energy deviation should be fulfilled to prevent IGBT avalanche.
Section V covers the second turn-on occurrence due to the IGBT Miller capacitance and a high dv/dt. The circuit design ensures a low IGBT gate impedance to prevent this situation, which would otherwise lead to a type I short circuit event.
Section VI focuses on the VS-undershoot event predominantly caused by the parasitic inductances of the collector and emitter IGBT terminal paths. Proper Printed Circuit Board (PCB) design techniques should be performed to diminish these parasitic inductances.
Section VII describes how the high transformer inrush current at the starting process may affect the DC-link voltage, leading to a low input voltage failure. This section also introduces a soft-start process by using Pulse Width Modulation (PWM) techniques.
Section VIII addresses the importance of thermal management. It states that a relatively low IGBT package temperature reduces the probability of IGBT thermal runaway in case of overcurrent, short circuit, or avalanche events [15].
Section IX describes the negative effects of switching interference picked up by the control signal cables and carried to the control device pins (FPGA). For this reason, a fault-tolerant digital input implemented using FPGA resources was introduced.
Finally, section X summarizes the failure mechanisms presented throughout the paper and shows pictures of the testbench and the entire multilevel inverter.

II. SHORT CIRCUIT FAILURE MECHANISM
Section I introduced two possible short-circuit events, classified according to the path of the short-circuit current through the full-bridge circuit. Figure 2 illustrates these short-circuit scenarios further. The full-bridge topology has two arms, each containing two IGBTs. In normal operation, only one IGBT within an arm is in the on-state at any time. However, when both IGBTs are in the on-state due to a failure scenario, the current through the IGBTs increases abruptly. This current is, in most cases, high enough to damage the switching devices almost instantly. Indeed, the enormous di/dt may damage both the IGBT and the IGBT driver, as discussed later. This failure mechanism is called the ''type I'' short circuit event.
When the short circuit event comes from the transformer output (see Fig. 2), the IGBT current flows through the transformer primary winding, which reduces the di/dt considerably. A reduced di/dt allows a longer time window within which the IGBTs must be turned off before damage occurs. This externally originated failure scenario is called the ''type II'' short circuit event.
The type II short circuit event has a higher occurrence probability than type I. However, the circuit design must consider both failure events. The circuit design aims to prevent IGBT damage by turning the device off as soon as possible after a short circuit event occurs.
Several design techniques are available today. For example, [16] introduces a particular type of Metal Oxide Semiconductor Field Effect Transistor (MOSFET) that eases drain current measurements by adding a current-sensing terminal. The MOSFET driver stage uses this additional terminal to detect short circuit events and turn the transistor off quickly. This technology has an advantage over the conventional shunt method because it does not directly affect the load voltage. However, these special switching devices are not widely commercially available. Reference [17] introduces a short circuit detection method based on current shunts built into an IGBT power module that effectively reduces the IGBT turn-off time upon a short circuit event. However, this method uses shunt resistors that require additional circuit stages, such as an RC compensation circuit and a filtered amplifier. Other methods use di/dt and dv/dt detection to infer that a short circuit event occurred [18], [19]. Nonetheless, these methods require complex hardware. Another method [20] analyses the IGBT gate voltage to detect hard switching faults and faults under load; again, the IGBT driver hardware is significantly more complex.
This work prefers the conventional IGBT desaturation method over the previous methods to detect short circuit events due to its robustness, its well-known trade-offs, and the commercial availability of IGBT drivers that support it. However, the IGBT desaturation technique involves a significant IGBT turn-off time upon a short circuit event. This time is called the ''blanking time (tb).'' A large tb combined with a high desaturation current could lead to an IGBT thermal breakdown. References [21], [22], [23], and [24] improve the desaturation technique by reducing tb even if the IGBT operates with high desaturation currents.
The proposed circuit design to prevent IGBT failures due to short circuit events enhances the one proposed in [25]. The principal changes comprise the driver substitution by the IR2127, the gate voltage clamp, the VS-undershoot reduction resistor (R4), and the use of a short-circuit-tolerant IGBT. Figure 3 depicts the enhanced design.

A. Principle of Operation
The circuit in Fig. 3 uses R1, R2, R3, C2, D1, and the IR2127 driver to accomplish the short circuit detection. The principle of operation is based on the fact that the collector-emitter voltage (VCE) of a saturated IGBT (Q1) increases as the collector current (IC) increases: the higher IC, the higher VCE. The IGBT driver uses a current sense terminal (CS) to detect an IC increase by reading the VCE voltage.
When Q1 is inactive, IC is close to zero, and VCE equals the DC-link power supply voltage (VBUS). At this stage, D1 isolates the collector voltage from the voltage divider built with R1 and R3. The HO terminal remains in the low state; thus, the CS voltage stays below the threshold voltage VCSth [26], indicating regular operation.
When the HO terminal asserts, VCE remains equal to VBUS until the turn-on time elapses. Meanwhile, the CS voltage starts rising gradually due to the integrator action of R2, R1, R3, and C2. The CS voltage must not reach VCSth before VCE goes down, to prevent a false short circuit detection. When VCE finally goes down, D1 conducts and establishes a VCS-VS voltage below VCSth.
If a short circuit event occurs while Q1 is active, VCE increases to a value determined by the desaturation current and the VGE voltage [13]; therefore, D1 is reverse-biased, isolating the CS voltage divider from the collector voltage. As the HO terminal is still asserted, the CS voltage starts rising again until it reaches VCSth. At this point, the CS signal propagates through the driver until the HO signal shuts down the IGBT and the FAULT terminal is driven low, indicating a short circuit failure event.
According to [27], R2 is typically chosen to be 10kΩ for a bootstrap voltage VB = 12V, 22kΩ for VB = 15V, and 33kΩ for VB = 18V. D1 must be an ultra-fast recovery diode. The voltage VX is typically 9.2V. R1 and R3 values are obtained from (1).
(1)
Fig. 3. Short-circuit current detection topology based on the IR2127 integrated circuit. The figure shows the high-side driver only.

B. The Field Stop Trench Technology
The total blanking time depends on the passive integrator RC constant and the propagation delay of the IGBT driver. It determines the short circuit time the IGBT must withstand. The proposed IGBT features a robust Field Stop II trench construction that provides superior performance; furthermore, it can withstand a 10µs short circuit [13]. For more information about this technology and its performance, refer to [28] and [29].
Figure 4 shows the type I short circuit waveform. Notice the high di/dt and collector current. The figure also reveals a total blanking time of around 2µs. The IGBT [13] can withstand the measured collector current during the blanking time without suffering failure. The maximum collector current is controlled by VGE: the higher VGE, the higher IC. However, an excessive collector current during turn-off can lead the IGBT to either a latch-up failure mechanism or thermal runaway. In either case, the IGBT would be destroyed. The proper VGE value is a trade-off between maximum collector current and switching losses. It should be higher than the Miller plateau voltage and low enough to prevent latch-up and thermal runaway failures. The nominal VGE value is 13V.
Figure 5 depicts the type II short circuit waveform. A reduced di/dt can be observed due to the high inductance of the transformer primary winding. Here, the collector current is restricted by VGE and the CS voltage divider. Therefore, the designer must ensure that the product of VCE and IC during desaturation lies within the Safe Operating Area (SOA). In this case, the blanking time is 470µs for VGE = 13V. The IGBT operates within the SOA according to [13].
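The dependence of the blanking time on the integrator RC constant and the driver propagation delay can be illustrated with a first-order sketch. It assumes the CS node behaves as a single RC stage charging toward a final voltage, so the time to reach the threshold is t = -RC ln(1 - Vth/Vfinal). The function names and all numeric values below are hypothetical placeholders, not the component values of Fig. 3 or datasheet figures.

```python
import math


def rc_time_to_threshold(r_ohm, c_farad, v_final, v_threshold):
    """Time for a first-order RC node charging from 0 V toward v_final
    to cross v_threshold (simplifying assumption: single RC stage)."""
    if not 0.0 < v_threshold < v_final:
        raise ValueError("threshold must lie between 0 and the final voltage")
    return -r_ohm * c_farad * math.log(1.0 - v_threshold / v_final)


def blanking_time(r_ohm, c_farad, v_final, v_threshold, t_prop):
    """Total blanking time: RC integration time plus driver propagation delay."""
    return rc_time_to_threshold(r_ohm, c_farad, v_final, v_threshold) + t_prop


# Illustrative values only (assumptions, not taken from the paper):
R_EQ = 10e3        # equivalent charging resistance, ohms
C_CS = 100e-12     # integrator capacitance, farads
V_FINAL = 0.5      # voltage the CS node would settle to, volts
V_CS_TH = 0.25     # detection threshold, volts
T_PROP = 0.3e-6    # driver propagation delay, seconds

t_b = blanking_time(R_EQ, C_CS, V_FINAL, V_CS_TH, T_PROP)
print(f"estimated blanking time: {t_b * 1e6:.2f} us")
```

Under this model, a larger RC constant or a threshold closer to the final voltage lengthens the blanking time, which must remain below the short circuit withstand time of the IGBT.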

III. CROSS-CONDUCTION FAILURE MECHANISM
The cross-conduction mechanism emerges during the switching intervals while two IGBTs of the same arm are active. It means that an IGBT turns on while the opposite IGBT has not turned off yet. This phenomenon originates from long IGBT turn-off times and driver propagation delays. Cross-conduction is directly related to the switching losses: the higher the cross-conduction, the higher the switching losses. Furthermore, excessive cross-conduction produces high collector current spikes during transitions that gradually degrade the switching device. IGBT degradation leads to failures such as bonding wire cracking and detachment, die stresses, and in some cases, thermal breakdown.
Since the low-side IGBTs of the full-bridge also operate as clamping devices for the high inductive load of the transformer's primary winding, it is necessary to keep a certain level of cross-conduction to prevent an unclamped high inductive scenario.
A good trade-off is to equalize the IGBT turn-on and turn-off times. Passive delay lines equalize the switching times to achieve a good balance between switching losses and unclamped inductive switching (UIS) spikes. For example, RG, D3, R5, C4, and D6 in Fig. 3 form two delay lines to control cross-conduction. Figure 6 shows experimental measurements of the low-side and high-side switching times. With the passive delay lines in place, turn-on and turn-off times are almost equal.

IV. IGBT HIGH INDUCTIVE AVALANCHE FAILURE MECHANISM
Modular CTMIs use large low-frequency transformers that feature high inductance. In high-inductance switching applications, the proper management of the energy stored in the inductor remains a challenge. This stored energy becomes a high voltage spike that appears upon the IGBT turn-off transition and can cause an avalanche failure. The avalanche mechanism deals with the energy in a different way than a short circuit does. In the short-circuit event, the IGBT is active. Under this condition, the current distribution across the surface of the die is uniform, maximizing the ability to dissipate the energy. In the avalanche event, the IGBT is non-conducting with no bias on the gate. Thus, when an avalanche occurs, the current travels around the perimeter of the die, creating a nonuniform current distribution across the die surface, so the maximum current to failure is much lower in the avalanche event than in the short-circuit event [30], [31].
Because of the high transformer inductance, several precautions should be taken, as follows. 1. Limit the dv/dt by increasing the RG value. 2. Keep a certain level of cross-conduction. 3. Use a switching code that lets the inductive load clamp through the low-side IGBTs. 4. Reduce PCB parasitic inductances (Lstray). 5. Reduce the Equivalent Series Resistance (ESR) and the Equivalent Series Inductance (ESL) of the DC-link capacitors.
As seen previously, the dv/dt should be reduced. Notice that efficiency is not affected because of the low switching frequency operation. According to [32], the dv/dt is limited by (2).
(2)
where dv/dt is the rate of rise of the collector voltage waveform at turn-on, VTH is the plateau level of the gate, RG is the total gate resistance, and CGC is the gate-collector capacitance.
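As a rough illustration of how RG limits the dv/dt, the sketch below uses a common first-order estimate: the gate current during the Miller plateau is taken as VTH/RG and is assumed to flow entirely through CGC, giving dv/dt ≈ VTH/(RG·CGC). This is an assumption built only from the symbols defined above and is not necessarily the exact form of (2) in [32]. The VTH and CGC values are placeholders; only RG = 10Ω appears elsewhere in this paper.

```python
def collector_dv_dt(v_plateau, r_gate, c_gc):
    """First-order dv/dt estimate in V/s.

    Assumes the gate current during the Miller plateau is v_plateau / r_gate
    and that all of it charges the gate-collector capacitance c_gc.
    """
    if r_gate <= 0 or c_gc <= 0:
        raise ValueError("r_gate and c_gc must be positive")
    return v_plateau / (r_gate * c_gc)


V_TH = 6.0       # assumed gate plateau level, volts (placeholder)
C_GC = 100e-12   # assumed gate-collector capacitance, farads (placeholder)

# Sweep a few gate resistances; a larger RG gives a lower dv/dt.
for r_gate in (10.0, 22.0, 47.0):
    dv_dt = collector_dv_dt(V_TH, r_gate, C_GC)
    print(f"RG = {r_gate:5.1f} ohm -> dv/dt ~ {dv_dt / 1e9:.2f} kV/us")
```

The estimate shows the inverse relation between RG and dv/dt that motivates the first precaution in the list above.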
The cross-conduction trade-off was discussed previously. The switching code is a programming issue that is out of the scope of this paper. Another clamping solution can be found in [33].
The ESR was reduced by paralleling low ESR capacitors as close as possible to the IGBTs. Section X shows a picture where these capacitors can be seen.

A. Experimental Results of Avalanche Failure Mechanism
Measurements of VCE, IC, and VGE were taken during the turn-off transitions of the high-side and low-side IGBTs. Figure 9 shows the VCE, IC, and VGE waveforms. Notice that VCE shows no voltage spikes.
Fig. 9. VCE, IC, and VGE waveforms during the turn-off transition of the high-side IGBT. The IGBT tail current is visible.
Figure 10 shows the VCE, IC, and VGE waveforms of the low-side IGBT. The VCE transition does not present spikes due to the clamping process implemented by programming.
Fig. 10. VCE, IC, and VGE waveforms during the turn-off transition of the low-side IGBT. Again, the low-side IGBT turn-off transition occurs during the clamping time, so no further IC transition is observed.

V. IGBT SECOND TURN-ON FAILURE MECHANISM
The second turn-on is a failure mechanism caused by the IGBT's Miller capacitance combined with a high dv/dt at the collector node. The mechanism occurs during the IGBT turn-off transition due to the voltage induced from the collector node. The induced voltage at the gate node may rise high enough to reach the plateau level, turning on the IGBT; see Fig. 11. The second turn-on leads to a type I short-circuit failure.
The gate node impedance and the dv/dt have a strong influence on the gate-emitter voltage. One solution is to reduce the induced gate voltage by adding an external capacitor between the gate and emitter nodes [34].
In this work, the gate impedance was reduced by paralleling an ultra-fast-recovery diode with RG. RG is obtained according to [35] and equals 10Ω.
Fig. 11. The second turn-on failure mechanism. During the turn-off transition, the high voltage at the collector node induces a voltage spike into the gate node. If the voltage spike in the gate node reaches the IGBT plateau level, the device will turn on again.
Figure 12 and Fig. 13 present the influence of dv/dt during the turn-on transition. The dv/dt influence on VGE can also be observed during the turn-on transition; measuring there avoids the risk of causing a second turn-on event through spikes induced via the scope probes.
Fig. 11. dv/dt influence on the gate voltage during the high-side IGBT turn-on transition. After decreasing the gate impedance, only a small influence can be observed.

VI. VS-UNDERSHOOT FAILURE MECHANISM
The VS-undershoot failure mechanism is an undervoltage spike that appears at the driver's VS terminal. It is generated by the energy stored in the parasitic inductances of the PCB tracks (Lstray) when a high di/dt transition occurs. VS-undershoot directly affects the high-side IGBT driver, as seen in Fig. 13.
Since CTMIs use low-frequency transformers that limit di/dt, one would think that VS-undershoot could be neglected. However, during a short-circuit event or an overload condition, the Lstray current could be high enough to cause a VS undershoot. Therefore, the high-side driver could fail. Reference [36] suggests some general recommendations to minimize VS undershoot, as follows: 1. Use thick, direct tracks between switches, avoiding loops or deviations. 2. Avoid interconnect links. 3. Consider relocating IGBTs to reduce track lengths. 4. Connect VS and COM as close as possible to the IGBT pins. 5. Locate the driver as close as possible to the IGBTs. 6. Increase the bootstrap capacitance using low-ESR capacitors. 7. Use a second low-ESR capacitor from VCC to COM. 8. Add a resistor from COM to the IGBT's emitter (R4). 9. Add a fast recovery diode from COM to VS (D4); see Fig. 3.
Fig. 13. VS-undershoot failure mechanism. It is produced by a high di/dt across the parasitic inductances and directly affects the high-side IGBT driver.
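The relation between the parasitic inductance, the di/dt, and the resulting spike can be sketched with the basic inductor law v = Lstray · di/dt. The function name and the numeric values are hypothetical and serve only to show the order of magnitude; they are not measurements from this work.

```python
def undershoot_magnitude(l_stray, di_dt):
    """Voltage developed across a parasitic inductance (V = L * di/dt).

    l_stray in henries, di_dt in amperes per second.
    """
    return l_stray * di_dt


L_STRAY = 50e-9   # assumed total track inductance, henries (placeholder)

# Compare a transformer-limited transition with a short-circuit-like one.
for label, di_dt in (("normal load", 5e6), ("short circuit", 5e8)):
    v_spike = undershoot_magnitude(L_STRAY, di_dt)
    print(f"{label:>13}: di/dt = {di_dt:.0e} A/s -> spike ~ {v_spike:.2f} V")
```

With these placeholder values, the spike is negligible under transformer-limited di/dt but reaches tens of volts under a short-circuit-like di/dt, which is why reducing Lstray through layout remains worthwhile.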

A. Measuring VS undershoot
VS undershoot severity is measured as the difference (VB - VS) - (VS - COM) under overload or short-circuit conditions. Figure 14 shows the VS-undershoot phenomenon under an overload condition. Isolated probes are required to accomplish this measurement.
The Overload condition meets the following parameters:

VII. TRANSFORMER INRUSH CURRENT
Large power transformers feature a high efficiency under relatively low load conditions [7], i.e., under a load higher than 10%, the power efficiency can reach 98%. However, they also feature a high inrush current at the starting process due to core magnetization.
Modular CTMIs contain several transformers, so the total inrush current drawn from the DC-link is the sum of the individual transformers' inrush currents. As a result, a huge inrush current drawn from the DC-link leads to a failure due to a voltage drop transient. A good selection of DC-link capacitors is crucial to prevent design drawbacks. Reference [37] covers the DC-link selection and calculation.
This work proposes a PWM-based soft-start technique to minimize the inrush current. It consists of a PWM generator implemented in the FPGA that gradually increments the pulse width during the starting process. Figure 15 shows the transformer's inrush current in two scenarios. Notice the considerable reduction of the inrush current by using the PWM technique.
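The gradual pulse-width increment can be sketched as a duty-cycle ramp. The linear profile, the function name, and the ramp length below are assumptions chosen for illustration; the ramp shape actually implemented in the FPGA is not specified here.

```python
def soft_start_duty(cycle_index, ramp_cycles, duty_max=1.0):
    """Duty cycle for a given PWM cycle during a linear soft start.

    The duty rises from 0 to duty_max over ramp_cycles cycles and then
    stays at duty_max (assumed linear profile).
    """
    if ramp_cycles <= 0:
        raise ValueError("ramp_cycles must be positive")
    clamped = min(max(cycle_index, 0), ramp_cycles)
    return duty_max * clamped / ramp_cycles


# Example: a 10-cycle ramp observed over 12 cycles.
ramp = [soft_start_duty(n, 10) for n in range(12)]
print(ramp)
```

A longer ramp spreads the core magnetization over more cycles and lowers the peak current drawn from the DC-link, at the cost of a slower start.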

VIII. IGBT THERMAL RUNAWAY FAILURE MECHANISM
IGBT thermal runaway is a failure mechanism that occurs when a temperature rise changes the operating conditions in a way that causes a further increase in temperature. It is therefore an uncontrolled positive temperature feedback that drives the IGBT to destruction. Thermal runaway is mainly caused by a high collector current during the turn-off transition [29]. Reference [38] introduces a method to predict the IGBT junction temperature and presents the negative impact of high temperature on the die solder joints.
One should take the following design considerations to prevent a thermal runaway condition: 1. Limit the short-circuit current by decreasing VGE, but not so much as to cause excessive power losses that increase the device temperature. 2. Keep the short-circuit blanking time as short as possible. 3. Keep the VB ripple voltage as low as possible by installing a proper bootstrap capacitor. 4. Maintain the IGBT case temperature as low as possible by using a robust thermal management system.
Previous sections discussed short-circuit current limiting and blanking time reduction. Thermal management is not described in this paper; however, experimental results include IGBT temperatures under several load conditions. Figure 16 presents the temperature behavior of the IGBT case under three different load conditions. In addition, the average temperature of the four IGBT cases is reported.
Fig. 16. The IGBT-case thermal behavior under three different load conditions. IGBTs were mounted on a 180mm x 130mm x 33mm aluminum heat sink without forced convection.

A. Influence of Bootstrap Capacitor on IGBT Operating Temperature in Low-Frequency Switching Inverters
In low-frequency switching applications, the bootstrap capacitor plays a critical role in the inverter's efficiency. In this situation, conduction losses are higher than switching losses. Therefore, VCE highly influences the overall inverter efficiency.
Large bootstrap capacitors are required to prevent a considerable voltage drop in VGE during conduction. A relatively low VGE implies a higher VCE, which increases power losses and the IGBT temperature. Therefore, one must keep the VB ripple voltage as low as possible. Reference [35] presents a calculation method for the bootstrap capacitor. Figure 17 shows the VB ripple voltage using a bootstrap capacitor value of 47µF. Again, low-ESR capacitors are recommended, so a tantalum capacitor in parallel with a ceramic capacitor is connected.
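The effect of the bootstrap capacitance on the VB ripple can be sketched with a simple charge balance: the droop is the charge drawn during the on-time divided by the capacitance. This is a simplified model, not the calculation method of [35]. The gate charge, leakage current, and on-time below are assumed placeholder values; only the 47µF capacitance comes from this work.

```python
def bootstrap_droop(c_boot, q_gate, i_leak, t_on):
    """Approximate VB droop in volts over one high-side on-time.

    Charge drawn = gate charge + (total leakage/quiescent current * on-time).
    """
    if c_boot <= 0:
        raise ValueError("c_boot must be positive")
    return (q_gate + i_leak * t_on) / c_boot


Q_GATE = 100e-9   # assumed IGBT gate charge, coulombs (placeholder)
I_LEAK = 200e-6   # assumed driver quiescent plus leakage current, amperes
T_ON = 10e-3      # assumed on-time, seconds (half period at 50 Hz)

for c_boot in (1e-6, 10e-6, 47e-6):
    droop = bootstrap_droop(c_boot, Q_GATE, I_LEAK, T_ON)
    print(f"C = {c_boot * 1e6:4.0f} uF -> droop ~ {droop * 1e3:.1f} mV")
```

At low switching frequency the long on-time makes the leakage term dominate over the gate charge, which is why a comparatively large bootstrap capacitor is needed to keep VGE close to its nominal value.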

IX. CABLE SWITCHING INTERFERENCE
Modular CTMIs require many control-signal cables that connect each power module to the main control board. These cables lie within a noisy environment due to nearby power switching events. Noise often induces voltage spikes of short duration (<1µs) in the control-signal cables, causing erratic pulses at the control device inputs (FPGA). These pulses are misinterpreted by the control device, causing a failure.
Several EMI models for switching electronics have been published, e.g., [39]. They help predict EMI's influence on the hardware. Reference [40] describes some failure mechanisms that influence I/O interfaces in digital devices like microcontrollers.
This paper proposes a hardware-based fault-tolerant digital input that filters any bounce event shorter than 2µs. Figure 18 depicts the circuit design implemented into the control FPGA.
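The filtering behavior can be illustrated with a behavioral model of a counter-based glitch filter: the output changes only after the input has held a new level for a minimum number of consecutive clock samples. This is a software sketch of the general technique, not the circuit of Fig. 18; the function names and the 10ns clock period are assumptions.

```python
def min_samples_for(min_pulse_ns, clk_period_ns):
    """Number of clock samples covering min_pulse_ns (ceiling division)."""
    return -(-min_pulse_ns // clk_period_ns)


def glitch_filter(samples, min_samples, initial=0):
    """Return the filtered output, one value per input sample.

    The output switches to a new level only after the input has differed
    from the current output for min_samples consecutive samples.
    """
    state = initial
    count = 0
    out = []
    for s in samples:
        if s != state:
            count += 1
            if count >= min_samples:
                state = s      # new level held long enough: accept it
                count = 0
        else:
            count = 0          # input returned to current level: reject glitch
        out.append(state)
    return out


# 2 us window with an assumed 10 ns sampling clock -> 200 samples.
N = min_samples_for(2000, 10)

short_glitch = [0] * 5 + [1] * (N - 1) + [0] * 5   # shorter than the window
long_pulse = [0] * 5 + [1] * (N + 50) + [0] * 5    # longer than the window

print(max(glitch_filter(short_glitch, N)))   # glitch never reaches the output
print(max(glitch_filter(long_pulse, N)))     # genuine pulse is passed through
```

The trade-off of this approach is latency: a genuine fault signal reaches the output one filter window after it appears at the input.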

A. Testing the Fault-Tolerant Digital Input
Testing the fault-tolerant digital input circuit requires the overall inverter, i.e., all the hardware in Fig. 1 must operate normally. Therefore, this measurement is a field test that must capture erratic pulses caused by switching interference during the inverter's regular operation and confirm that they are effectively filtered. Figure 19 shows three captured erratic pulses that were filtered by the fault-tolerant digital input. An additional pulse was generated by causing a massive short-circuit event between two output lines of the inverter. The control device successfully processed this failure event.
Fig. 19. Experimental results of the field test on the proposed fault-tolerant digital input. VFLIN is the circuit input voltage at the FPGA pin, whereas VEN is the output voltage that disables the entire inverter when it is asserted. (a) shows the pulse generated upon a massive short-circuit event. In this situation, the control device processed the failure event and stopped the inverter operation. (b) through (d) are captured events caused by switching noise. Notice that they were successfully filtered.

X. TESTBENCH AND INVERTER PICTURES
The testbench shown in Fig. 20 discloses the measurement setup. All measurements were taken in the laboratory except the fault-tolerant input measurements, which were taken in the field.
Fig. 20. The testbench. It shows a single tap only, three 700W resistive loads, a Xilinx Zynq-7000 FPGA board, an isolation interface located between the FPGA and the power board, and a 2.5kVA low-frequency transformer. Under the table, there is a 10kVA three-phase multi-tap isolation transformer for safety purposes.
Figure 21 depicts the main components of a single tap. A Xilinx Zynq-7000 FPGA board and an electrical isolation board were used. Figure 22 shows a significant part of the complete three-phase multilevel inverter and the entire control stages. Eighteen cascaded full-bridge transformer-isolated stages make up the inverter.
The inverter dimensions are: 200cm x 80cm x 80cm.

A. About the research project
The power inverter is a part of an overall project of sustainable energy generation and distribution within a geographical region where electric power lines are not available. All parts of the system are commercially available. However, the inverter lies within a research project intending to build a reliable and robust design.
The power generation plant consists of a 12kW solar panel generator, a 10kW wind turbine, a 240V-435Ah battery bank, a 45kVA three-phase inverter, and a power distribution box.
The project concluded in September 2019, and the inverter has been operating correctly since that date.

XI. CONCLUSION
Based on the experimental results and on the proper operation of the isolated multilevel inverter for more than twenty months under adverse conditions since the conclusion of the research project, I conclude that the failure mechanisms described in this paper are the most critical part of the design stage to ensure an efficient, reliable, and robust operation of isolated multilevel inverters.