Segmentation and quantification of solar PV losses using data-driven algorithms to help better design and operational monitoring

With the demand for solar photovoltaic (PV) systems increasing, the need to predict their feasibility and monitor their performance is greater than ever. Although PV systems are known for their reliability, they are not immune to the damaging effects of their surroundings, and various loss mechanisms affect overall plant performance. This paper discusses several such losses, namely thermal, soiling, module degradation and inverter clipping, and proposes data-driven, empirical algorithms to evaluate them, in an effort to leverage the analytical capabilities provided by plant data. The paper also compares the estimated losses with those obtained from a PVsyst simulation. As the latter is an independent industry standard, the comparison helps in understanding the ground reality of PV performance and provides insights for better operational monitoring. These insights are of considerable business value and are aimed at optimizing performance and thereby revenue.


Introduction
PV systems are a predominant means of harnessing solar energy. They are cheaper than most other renewable energy technologies and require little periodic maintenance. They are also highly durable and easily scalable. Hence, demand for them is growing rapidly worldwide. The major bottleneck to widespread installation of solar PV systems is their high dependence on the surrounding weather. Even a sporadic event such as passing cloud cover can drastically reduce generation. This creates the need for feasibility and performance analysis of such projects, for which simulation using energy modelling tools is the most widely used approach. S.M. Maleki et al [1] and N.M. Kumar et al [2] establish the context of the discussion that follows and serve as references. Since any decrease in PV generation can result in considerable financial penalties, it is important to predict the likely generation. Apart from incident irradiation, the major parameters of interest in estimating generation are PV losses. There are several types of losses, each with its own degree of complexity. Factors such as module temperature, ambient temperature, communication loss, grid unavailability, soiling and design flaws contribute to the overall generation loss. N.M. Kumar et al [2] study various PV losses using simulations in the energy modelling tool PVsyst.
The first loss mechanism considered is tied to the operating temperature of a PV module. Irradiation is incident energy, and when the module absorbs it, only a portion of the captured energy is converted into electric power. The remainder raises the material's temperature. The module temperature rises with the incident irradiation and with the duration of exposure. As temperature rises, material efficiency decreases, which further degrades module operation. T. Dierauf et al [3] and R. Bohra [4] elucidate the requisite concepts regarding thermal loss and emphasize its contribution to a fuller understanding of PV losses.
Soiling of the solar modules is another loss mechanism with a significant effect on their performance. The amount and rate of soiling are highly dependent on the plant surroundings. Soiling rate is defined as the loss in generation due to soiling, expressed as a percentage of the generation that clean modules would have produced. M.G. Deceglie et al [5] make the case for estimating soiling rates from plant data. They also present the motivation for pursuing such numerical models to better monitor PV systems.
The above types of generation loss have more to do with the surroundings than with the material properties themselves. Module degradation, in contrast, is mainly attributed to the module's electrical characteristics. It describes the inherent tendency of the material to perform worse over time, even under the most standardized operating conditions. Regardless of weather and geographic influence, module degradation poses a great financial risk. Therefore, understanding it properly is not just a research interest but also a matter of fiscal responsibility. A. Ndiaye et al [6] deal with the standardization of various measures relating to module degradation and provide a thorough review of the technicalities. M. Malvoni et al [7] study degradation estimates in a grid-connected PV system. Since the dataset for this study is also based on grid-connected rooftop PV, that work serves as a reference and helps in drawing parallels.
Inverter clipping is more a result of design optimization than a loss mechanism. It nevertheless results in saturated generation when irradiation is high enough. In simple terms, if the DC power injected by the module array into the inverter is higher than the inverter's rated DC capacity, then its AC output flattens and results in clipped generation. Various design constraints such as loading ratio, inverter capacity, inverter cost, plant capacity and project cost determine whether clipping is acceptable in a PV system, and to what extent.
This paper is organized as follows. Section 2 explains the methodology for estimating each of the aforementioned losses. Section 3 discusses the results obtained, compares the estimates with PVsyst simulation results to understand the variations, and discusses the limitations of the model. Section 4 summarizes the content and gives concluding remarks.

Input dataset
As part of asset management, all the solar PV components have sensors whose measurements are sent to the servers in real time. These are incorporated into the analytics portal used for operations and monitoring. The above snippet is sample data showing readings from the plant components. It is time-series data with a temporal resolution of 5 minutes. This table is only intended to introduce the scope of sensor measurements possible in a PV system. The actual data and its list of parameters depend on plant capacity and design. For the reference dataset, a grid-connected solar rooftop PV plant in India was studied and its loss parameters were estimated. The plant components are discussed in the prologue of the results section.

Performance Ratio (PR)
This is the standard measure of PV system performance. It is the ratio of the actual plant efficiency to the theoretical efficiency.
pltcap is henceforth used to denote DC plant capacity (in kW).
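The definition above can be illustrated with a minimal sketch. It assumes the common yield-ratio convention, in which the theoretical generation is the plane-of-array irradiation (kWh/m²) multiplied by pltcap and normalized by the STC irradiance of 1 kW/m². The function name and the sample values are illustrative and do not come from the plant dataset.

```python
G_STC = 1.0  # kW/m^2, reference irradiance at STC (assumed normalization)

def performance_ratio(energy_kwh, irradiation_kwh_m2, pltcap_kw):
    """PR = actual energy / (irradiation * pltcap / G_STC)."""
    if irradiation_kwh_m2 <= 0 or pltcap_kw <= 0:
        raise ValueError("irradiation and pltcap must be positive")
    reference_energy_kwh = irradiation_kwh_m2 * pltcap_kw / G_STC
    return energy_kwh / reference_energy_kwh

# Illustrative day: 100 kW plant, 5.0 kWh/m^2 of irradiation, 400 kWh generated.
pr = performance_ratio(400.0, 5.0, 100.0)
```

With these sample numbers the reference generation is 500 kWh, so the PR is the ratio of 400 kWh to 500 kWh.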

Temperature Corrected PR and Thermal loss
PR is generally observed to have significant seasonal variation. It tends to be higher in winter, which can be mistaken for better plant performance, and lower in summer. It is quite possible that the PV system was neither overperforming in winter nor underperforming in summer; instead, this spurious variation is due to the adverse effect of module temperature on module efficiency. Therefore, PR is corrected for changes in module temperature. S. Pandey et al [8] is a technical briefing paper that examines the variation in temperature corrections under Indian operating conditions. It presents the concept of the temperature coefficient and studies thermal loss. This section follows a similar path and goes a step further by using the temperature coefficient to calculate what the generation would have been without the loss. Standard Test Conditions (STC) specify a module temperature of 25°C, which is taken as the reference.

Procedure:
- Given the AC active power and module temperature, the STC-corrected active power is calculated, where
  Power_stc: temperature-corrected output AC power
  μ: temperature coefficient (%/°C)
  T_mod: module temperature
- Power_stc is an instantaneous value and is therefore calculated for each timestamp. Summing Power_stc over the day gives the temperature-corrected daily generation, Energy_stc.


With the irradiation data also available, the temperature-corrected PR, PR_stc, is then calculated.
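The procedure can be sketched as follows. The correction formula used here, Power_stc = Power / (1 + μ (T_mod − 25)), is a commonly used linear temperature model and is an assumption of this sketch rather than a confirmed reproduction of the expression used in this study. μ is entered as a fraction per °C (negative for typical modules), the 5-minute step matches the input dataset, and all function names and sample values are illustrative.

```python
T_STC = 25.0         # reference module temperature at STC, deg C
STEP_HOURS = 5 / 60  # 5-minute samples, as in the input dataset

def power_stc(power_kw, t_mod_c, mu_per_c):
    """Scale measured AC power back to 25 deg C (assumed linear model)."""
    return power_kw / (1.0 + mu_per_c * (t_mod_c - T_STC))

def daily_energy_stc(powers_kw, t_mods_c, mu_per_c):
    """Sum Power_stc over the day's timestamps and convert kW to kWh."""
    corrected = (power_stc(p, t, mu_per_c) for p, t in zip(powers_kw, t_mods_c))
    return sum(corrected) * STEP_HOURS

def pr_stc(energy_stc_kwh, irradiation_kwh_m2, pltcap_kw):
    """Temperature-corrected PR, using the same yield ratio as PR."""
    return energy_stc_kwh / (irradiation_kwh_m2 * pltcap_kw)

# Illustrative values only: mu = -0.4 %/deg C expressed as a fraction.
MU = -0.004
powers = [60.0, 60.0]  # kW at two consecutive timestamps
t_mods = [25.0, 50.0]  # module temperature, deg C
energy_measured = sum(powers) * STEP_HOURS
energy_corrected = daily_energy_stc(powers, t_mods, MU)
thermal_loss_kwh = energy_corrected - energy_measured
```

Under this model the thermal loss for a day is the difference between Energy_stc and the measured generation, which is positive whenever the modules run hotter than 25°C and μ is negative.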

Module soiling loss
Since the data captures only the generation after soiling, the ideal generation has to be estimated from the actual generation data. The majority of existing approaches compare dirty and clean modules to quantify the difference in generation. This conventional method, while useful for benchmarking, proves cumbersome for large-scale PV plants. This section presents a novel, data-driven method of estimating soiling rates.
Assumptions
- The trend in PR is generally assumed to be periodically linear with a negative slope, with a sharp rise at each cleaning event.
- A cleaning event cannot always be completed in a single day and can stretch over several days depending on the plant capacity. In such cases, the effect of cleaning on the PR trend is not a sharp rise but a gradual yet considerable increase.

Proposed algorithm
- Within a month, the maximum of PR_stc (see the temperature-corrected PR section) is assumed to be representative of a cleaning event. Let it be Max.PR_stc.
- The local minimum of PR_stc in the left neighbourhood of the maximum is taken as the dirtiest state of the modules. Let it be Min.PR_stc.
- The aggregate loss in PR is the difference between the two, normalized by ACC (average cleaning cycle in days) to get the average daily PR loss, ∆PR (%/day).
- The daily generation loss (kWh) is then calculated using ∆PR (%/day), irradiation and pltcap.
- This daily loss is summed over the cleaning cycle period to get the total generation loss (in kWh) due to soiling, where
  ∆Gen_S (kWh): rise in generation due to cleaning
  ∆Energy_S (kWh): total generation loss due to soiling between consecutive cleanings
- Ideal generation is the sum of the actual generation (Energy_stc) and the total loss (∆Energy_S).
- The soiling rate is then the total loss as a percentage of the ideal generation.
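The steps above can be sketched as follows. Two points are interpretations made for this sketch, not statements from the study: the local minimum is found by walking left from the monthly maximum until PR stops falling, and the daily loss is taken to build up linearly, so that the k-th day after a cleaning loses k times ∆PR. PR values are given as fractions, and all function names and sample data are hypothetical.

```python
def daily_pr_loss(pr_stc_month, acc_days):
    """Average daily PR loss from one month of daily PR_stc values."""
    # Maximum of PR_stc in the month: taken as the cleaning event.
    i_max = max(range(len(pr_stc_month)), key=lambda i: pr_stc_month[i])
    # Walk left while PR_stc keeps falling, stopping at the nearest local minimum.
    i_min = i_max
    while i_min > 0 and pr_stc_month[i_min - 1] < pr_stc_month[i_min]:
        i_min -= 1
    return (pr_stc_month[i_max] - pr_stc_month[i_min]) / acc_days

def soiling_rate(delta_pr, irradiation_cycle, energy_stc_cycle, pltcap_kw):
    """Return (total loss in kWh, soiling rate in %) for one cleaning cycle."""
    # Assumed linear build-up: day k after cleaning loses k * delta_pr of PR.
    loss_kwh = sum(
        k * delta_pr * h * pltcap_kw
        for k, h in enumerate(irradiation_cycle, start=1)
    )
    ideal_kwh = sum(energy_stc_cycle) + loss_kwh
    return loss_kwh, 100.0 * loss_kwh / ideal_kwh

# Hypothetical month of daily PR_stc with a cleaning on the fifth day.
pr_month = [0.80, 0.79, 0.78, 0.76, 0.82, 0.81]
delta_pr = daily_pr_loss(pr_month, acc_days=15)

# Hypothetical two-day cycle: irradiation in kWh/m^2, Energy_stc in kWh.
loss_kwh, rate_pct = soiling_rate(0.004, [5.0, 5.0], [400.0, 400.0], 100.0)
```

If the monthly maximum falls on the first day of the month, this sketch finds no left neighbourhood and returns a zero loss; a practical implementation would need to look back into the previous month.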
The motivation for using PR_stc instead of PR is to calculate the soiling loss independently of the thermal loss. PR_stc represents an idealized notion of generation that excludes the loss due to module temperature. The logic is therefore to estimate the soiling loss from the generation before thermal loss, and it is reasonable to assume that this approach better captures the actual figures. As stated in the assumptions, the cleaning event is not clearly visible in time in the majority of cases, since the data comes from industrial-sized PV plants. Including such a variable in the estimation model could also reduce accuracy while adding complexity. Hence, PR_stc was used to quantify the cleaning action as well.

Inverter clipping loss
In the clipping-loss expression, the expected-power term denotes what would have been the unclipped active power at 'T_X' with the reference PR, whereas Active power_iX is the actual inverter output power, which is clipped. Here, INV_i ∆gen_D is the generation loss due to clipping of INV_i on date 'D'. These daily losses are summed month-wise and year-wise (for month 'M' and year 'Y').
- The monthly and yearly losses are summed over all INV_i to get the total clipping loss in kWh.
- This is then expressed as a percentage of Energy_stc (see the thermal loss section).
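A minimal sketch of this calculation for one inverter and one day is given below. It rests on assumptions not stated in the text: a timestamp is treated as clipped when the measured power is within a small tolerance of the inverter's AC limit, and the expected unclipped power is taken as the reference PR multiplied by the irradiance (kW/m²) and the DC capacity connected to that inverter. All names, thresholds and sample values are hypothetical.

```python
STEP_HOURS = 5 / 60  # 5-minute samples, as in the input dataset

def clipping_loss_kwh(active_kw, irradiance_kw_m2, ref_pr, dc_cap_kw,
                      ac_limit_kw, tol=0.01):
    """Daily clipping loss of one inverter, summed over clipped timestamps."""
    loss = 0.0
    for p, g in zip(active_kw, irradiance_kw_m2):
        if p >= ac_limit_kw * (1.0 - tol):       # output is saturated
            expected_kw = ref_pr * g * dc_cap_kw  # assumed unclipped power
            loss += max(expected_kw - p, 0.0) * STEP_HOURS
    return loss

def clipping_loss_pct(losses_per_inverter_kwh, energy_stc_kwh):
    """Total clipping loss over all inverters as a percentage of Energy_stc."""
    return 100.0 * sum(losses_per_inverter_kwh) / energy_stc_kwh

# Hypothetical inverter: 75 kW DC connected, 50 kW AC limit, reference PR 0.8.
loss = clipping_loss_kwh(
    active_kw=[40.0, 50.0, 50.0],
    irradiance_kw_m2=[0.6, 0.9, 1.0],
    ref_pr=0.8, dc_cap_kw=75.0, ac_limit_kw=50.0,
)
pct = clipping_loss_pct([loss, loss], energy_stc_kwh=1000.0)
```

Monthly and yearly figures follow by summing the daily values over the dates in month 'M' or year 'Y' before calling the percentage helper.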

Module degradation
The performance of a solar PV module decreases over time due to weather and operating conditions. Module degradation is one of the key markers in assessing actual photovoltaic performance. Assessing it is also necessary for predicting the plant's performance in the upcoming year, which is required for preventive maintenance of the modules and for calculating the lifetime of PV plants. This paper calculates the year-on-year module degradation with respect to the previous year and compares simulated standard data (PVsyst) with actual on-site data from the plant. Plant data from January 2018 to June 2019 has been used for the study.
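One way to express a year-on-year comparison is sketched below. The text does not state which quantity is compared, so this sketch assumes monthly PR_stc values and compares each month with the same calendar month of the previous year, which for the period used here leaves January to June as the overlapping months. The function name and the sample values are hypothetical.

```python
def yearly_degradation_pct(pr_stc_by_month):
    """Mean year-on-year drop (%) over months present in consecutive years.

    pr_stc_by_month maps (year, month) to a monthly PR_stc fraction.
    """
    changes = []
    for (year, month), pr_now in pr_stc_by_month.items():
        pr_prev = pr_stc_by_month.get((year - 1, month))
        if pr_prev:  # skip months with no previous-year counterpart
            changes.append(100.0 * (pr_prev - pr_now) / pr_prev)
    if not changes:
        raise ValueError("no month has data in two consecutive years")
    return sum(changes) / len(changes)

# Hypothetical monthly PR_stc values for two overlapping months.
sample = {(2018, 1): 0.80, (2018, 2): 0.80, (2019, 1): 0.79, (2019, 2): 0.78}
degradation = yearly_degradation_pct(sample)
```

Comparing like months limits the influence of seasonal effects that remain after temperature correction, although with only six overlapping months the estimate stays sensitive to soiling and data gaps.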

Monthly plots
Yearly Module degradation

PVsyst simulation
The snapshot below is from the simulation report generated with PVsyst v6.68 for the same plant. The 'soiling loss factor' refers to the loss due to module soiling and is shown to be 2%. 'PV loss due to temperature' refers to the thermal loss estimated above, and is shown to be 10.2% in this simulation. The total inverter loss is taken here to be 1.9%, which includes the various losses shown below. LID (light-induced degradation), an important contributor to overall module degradation, is shown here to be 2.5%.

Discussion
- The need for precise calculation of thermal loss is clearly illustrated in the above results, where PR and PR_stc are shown. The seasonal variation in PR is evident, whereas PR_stc is a more robust measure of plant performance.
- The monthly variation in the losses calculated above gives the necessary insights to understand the plant environment and thereby helps in preventive maintenance.
- Such losses can be monitored to understand the plant's limitations in terms of generation, and thereby help the financial and business side of operations in adjusting revenue models.
- Large-scale application of the proposed models to sites across India has the potential to map the generation capability of future PV systems with respect to Indian geography.

Conclusion
Solar PV is a technology that has been enjoying increasing demand, and this market scenario is quite favourable for innovation in energy research. This paper aims not only to introduce the context of PV losses but also to motivate the adoption of data-driven and empirical methodologies for understanding modern systems. Such an approach has the advantage that its predictions improve as time goes by and more data becomes available. Industrial research such as this critical analysis of PV systems not only helps identify possible limitations but also suggests room for improvement. Since energy generation and project cost are key to maximizing revenue, estimation models aimed at predicting PV losses are indispensable. As with any estimation, there is no single way of arriving at the exact value. The algorithms proposed above depend heavily on the quality and quantity of data. However, the comparison between losses estimated from plant data and those from standard simulation using energy modelling can act as feedback towards improving the design and maintenance of such PV systems.