Intrahour Direct Normal Irradiance Forecasting Based on Sky Image Processing and Time-series Analysis



Introduction
The development of sophisticated and efficient techniques for renewable energy is a major concern nowadays. Among the various renewable energy technologies, concentrating solar power (CSP) is expected to play a key role in the future: its share of global electricity is envisioned to reach 13-15% by 2050 [1]. The deployment of CSP technologies is hindered by various scientific and technical bottlenecks, such as the efficient control of CSP systems [2]. The H2020 project SFERA III (Solar Facilities for the European Research Area) [3] aims to tackle some of these bottlenecks in order to make CSP systems more competitive. One way to optimize their management is to use model predictive control, which requires predictions of the variables of interest, in particular direct normal irradiance (DNI), which is the direct irradiance received on a plane normal to the Sun. Accurate DNI forecasts can help reduce the fluctuations in CSP plant output caused by the intermittency and variability of solar irradiance [4]. Intrahour DNI forecasts are thus needed, with an accurate prediction of ramp events. The main approaches to forecasting the solar resource are statistical models, image-based models, and numerical weather prediction (NWP) models [4]. As this work addresses DNI forecasting for short-term horizons (up to 15 minutes), the focus is on statistical models and ground-based sky imagery models. A hybrid forecast model is developed that harnesses the advantages of both approaches.

Related work
Even though we are interested in DNI forecasting, the scope of this section is broadened to the solar resource in general, since a large share of existing works is dedicated to global horizontal irradiance (GHI) and photovoltaic (PV) power production forecasting. DNI forecasting is considered more complicated due to its high variability. In addition, CSP technologies are very sensitive to ramp events, which makes their accurate prediction crucial. Nonetheless, approaches dedicated to GHI or PV power forecasting are worth studying, since parts of the forecasting procedure can be similar. Existing works can be divided according to the input data used.
• Solar resource (and possibly other variables) measurements. Even if some classical time-series approaches are employed, most of the recent works are statistical models based on artificial intelligence tools [5], [6], [7]. These models can predict the solar resource for horizons ranging from a few minutes to several days. However, as they are based purely on historical solar resource data, these models usually fail to predict ramp events accurately.
• Ground-based sky images and solar resource (and possibly other variables) measurements. For these models, the forecast horizon is greatly reduced and generally does not exceed 30 minutes, since the information contained in sky images becomes limited near their boundaries. However, using sky images allows for the prediction of atmospheric disturbances that will affect the solar resource, which in turn can lead to accurate prediction of ramp events. Here, the forecast generally consists of three major steps: first, image acquisition and cloud detection; second, cloud motion estimation; and finally, forecasting using the extracted features. In particular, that last step can involve using a clear-sky model [8], [9], [10], [11]. Indeed, DNI can be decomposed into the clear-sky DNI ($\mathrm{DNI}_{cs}$) and the clear-sky index ($k_c$) as follows:

$$\mathrm{DNI}(t) = k_c(t) \cdot \mathrm{DNI}_{cs}(t) \qquad (1)$$

In this paper, high dynamic range (HDR) images provided by a sky imager are fed to a machine-learning-based model, which is used to accurately detect the clouds without losing critical information in the circumsolar region. Then, the dominant cloud motion is estimated with an optical flow algorithm and clustering. An adaptive approach is then used to determine a region of interest (ROI) that may contain clouds that will block the Sun once the considered forecast horizon has elapsed. All of this information is then fed to a complex artificial neural network model. The developed approach is designed to forecast DNI ramps and to be robust when faced with different kinds of clouds. These are among the difficulties that most models struggle with.

Hybrid model
The proposed hybrid model is presented in this section. Contrary to purely statistical models, this model relies on both DNI measurements and sky images to predict DNI. Sky images are used to detect clouds and estimate their motion, so as to achieve better detection of ramp events. As can be seen in its global architecture shown in Figure 1, the model consists of four steps, detailed in the sequel: HDR image acquisition, clear-sky DNI forecast, image processing, and DNI forecast.

Image processing
In this step, features are extracted from the acquired images (Figure 2). The HDR images are first corrected for the fisheye lens distortion. They are then used to detect clouds with a k-nearest neighbour (k-NN) model and to estimate cloud motion with the Farnebäck optical flow algorithm [14]. The ROI is then located based on the estimated motion, with the aid of the k-means clustering method. Finally, the cloud fraction in the ROI is calculated and fed to the DNI forecast model, which also takes the RGB image of the ROI as input.
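The last three operations of this step (dominant motion, ROI location, cloud fraction) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. It assumes that optical flow vectors (for example, from a Farnebäck implementation) are already available as (dx, dy) pairs in pixels per frame, and that the cloud detector has produced a binary mask. It also assumes that the dominant motion is the centroid of the largest of k = 2 clusters, and that the ROI is a fixed-size square placed upstream of the Sun along the reversed dominant motion. All function names and parameters are hypothetical.

```python
import random

def nearest(v, centroids):
    """Index of the centroid closest to vector v (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: (v[0] - centroids[j][0]) ** 2 + (v[1] - centroids[j][1]) ** 2)

def kmeans_2d(vectors, k=2, iters=20, seed=0):
    """Plain k-means on 2-D motion vectors; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)
    for _ in range(iters):
        labels = [nearest(v, centroids) for v in vectors]
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the previous centroid if a cluster becomes empty
                centroids[j] = (sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members))
    labels = [nearest(v, centroids) for v in vectors]
    return centroids, labels

def dominant_motion(vectors, k=2):
    """Centroid of the most populated cluster (assumed to be the dominant motion)."""
    centroids, labels = kmeans_2d(vectors, k)
    counts = [labels.count(j) for j in range(k)]
    return centroids[counts.index(max(counts))]

def roi_centre(sun_xy, motion_px_per_frame, horizon_frames):
    """Point from which a cloud moving with the dominant motion reaches the Sun
    after horizon_frames frames (assumes constant, uniform motion)."""
    return (sun_xy[0] - motion_px_per_frame[0] * horizon_frames,
            sun_xy[1] - motion_px_per_frame[1] * horizon_frames)

def cloud_fraction(mask, centre, half_size):
    """Fraction of cloudy pixels (value 1) in a square ROI clipped to the image."""
    rows, cols = len(mask), len(mask[0])
    cx, cy = int(round(centre[0])), int(round(centre[1]))
    r0, r1 = max(0, cy - half_size), min(rows, cy + half_size + 1)
    c0, c1 = max(0, cx - half_size), min(cols, cx + half_size + 1)
    pixels = [mask[r][c] for r in range(r0, r1) for c in range(c0, c1)]
    return sum(pixels) / len(pixels) if pixels else 0.0

# Hypothetical data: eight vectors moving right, two outliers.
flow = [(2.0, 0.0), (2.1, 0.1), (1.9, -0.1), (2.0, 0.1), (2.1, 0.0),
        (1.9, 0.0), (2.0, -0.1), (2.05, 0.05), (-5.0, 5.0), (-5.2, 4.8)]
motion = dominant_motion(flow)
centre = roi_centre((50, 50), motion, horizon_frames=10)
mask = [[1 if c < 30 else 0 for c in range(100)] for _ in range(100)]
print(motion, centre, cloud_fraction(mask, centre, half_size=5))
```

Clustering the flow vectors before averaging keeps a few spurious vectors (here, the two outliers) from biasing the estimated cloud displacement, which is the role the text assigns to the clustering step.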

DNI forecast model
The DNI forecast model can be divided into three main parts (see Figure 3). The first part, responsible for image feature extraction, is a convolutional neural network (CNN). The second part is a multi-layer perceptron (MLP) with the cloud fraction in the ROI and the clear-sky DNI forecast as inputs. The outputs of the CNN and the MLP are then fed to a "Regression MLP", which merges the extracted features and forecasts DNI at time instant $t + H$, where $H$ is the forecast horizon.
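The data flow through the three parts can be sketched as a forward pass in plain Python. This is only an illustration of the two-branch structure described above, not the authors' network: the layer sizes, the single 3x3 convolution with global average pooling, the ReLU activations, the random (untrained) weights, and the assumption that all inputs are scaled to roughly [0, 1] are hypothetical choices made for the example.

```python
import random

def relu(x):
    return max(0.0, x)

def dense(inputs, weights, biases, activation=relu):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def conv_branch(image, kernels):
    """Tiny CNN branch: 3x3 valid convolution over RGB, ReLU, global average pooling.
    image[r][c] is an (R, G, B) tuple; kernels[f][ch] is a 3x3 filter for channel ch."""
    rows, cols = len(image), len(image[0])
    features = []
    for kernel in kernels:
        total, count = 0.0, 0
        for r in range(rows - 2):
            for c in range(cols - 2):
                acc = sum(kernel[ch][i][j] * image[r + i][c + j][ch]
                          for ch in range(3) for i in range(3) for j in range(3))
                total += relu(acc)
                count += 1
        features.append(total / count)
    return features

def random_matrix(rng, n_out, n_in):
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]

def build_params(seed=0, n_filters=4, n_hidden=4):
    """Random placeholder weights; a real model would learn these by training."""
    rng = random.Random(seed)
    return {
        "kernels": [[random_matrix(rng, 3, 3) for _ in range(3)] for _ in range(n_filters)],
        "mlp_w": random_matrix(rng, n_hidden, 2), "mlp_b": [0.0] * n_hidden,
        "reg_w": random_matrix(rng, n_hidden, n_filters + n_hidden), "reg_b": [0.0] * n_hidden,
        "out_w": random_matrix(rng, 1, n_hidden), "out_b": [0.0],
    }

def forecast_dni(roi_image, cloud_fraction, dni_cs_forecast, params):
    """CNN branch on the ROI image, MLP branch on the scalars, then a regression MLP."""
    image_features = conv_branch(roi_image, params["kernels"])
    scalar_features = dense([cloud_fraction, dni_cs_forecast],
                            params["mlp_w"], params["mlp_b"])
    merged = image_features + scalar_features  # feature concatenation
    hidden = dense(merged, params["reg_w"], params["reg_b"])
    # Linear output unit: the (normalized) DNI forecast at t + H.
    return dense(hidden, params["out_w"], params["out_b"], activation=lambda x: x)[0]

# Hypothetical 8x8 ROI image with uniform colour, normalized scalar inputs.
roi = [[(0.2, 0.4, 0.6)] * 8 for _ in range(8)]
print(forecast_dni(roi, cloud_fraction=0.3, dni_cs_forecast=0.8, params=build_params()))
```

Keeping the image branch and the scalar branch separate until the regression stage lets each input type be processed by a suitable network before the features are merged.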

Reference model
The proposed hybrid model is compared to three reference models: the smart persistence model and two recurrent neural network (RNN) models. The smart persistence model is based on Equation 1 and on the supposition that the clear-sky index $k_c$ is constant over the forecast horizon $H$. The DNI forecast is obtained as follows:

$$\widehat{\mathrm{DNI}}(t+H) = k_c(t) \cdot \widehat{\mathrm{DNI}}_{cs}(t+H), \qquad k_c(t) = \frac{\mathrm{DNI}(t)}{\mathrm{DNI}_{cs}(t)}$$

where $\widehat{\mathrm{DNI}}_{cs}$ is the forecast clear-sky DNI. The smart persistence model thus needs clear-sky DNI forecasts, provided by the clear-sky DNI model described in Section 3.1. The two reference RNN models use past DNI observations as input. The first model consists of multiple layers of long short-term memory (LSTM) units, followed by fully connected layers, while the second model consists of a convolutional layer, LSTM layers, and fully connected layers (CNN-LSTM).
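The smart persistence forecast can be written in a few lines. The sketch below assumes DNI values in W/m² and sets the clear-sky index to zero when the clear-sky DNI is close to zero (for example, around sunrise or sunset); that guard and the function names are assumptions made for illustration, not part of the source.

```python
def clear_sky_index(dni, dni_cs, eps=1e-6):
    """Clear-sky index: measured DNI divided by clear-sky DNI.
    Returns 0.0 when the clear-sky DNI is (almost) zero (assumed convention)."""
    return dni / dni_cs if dni_cs > eps else 0.0

def smart_persistence(dni_now, dni_cs_now, dni_cs_future):
    """Forecast DNI at t + H, assuming the clear-sky index stays at its value at t."""
    return clear_sky_index(dni_now, dni_cs_now) * dni_cs_future

# Hypothetical values: 600 W/m² measured under an 800 W/m² clear-sky DNI,
# with a clear-sky DNI forecast of 820 W/m² at t + H.
print(smart_persistence(600.0, 800.0, 820.0))
```

Because the clear-sky index is held constant, this reference model follows the deterministic diurnal trend of the clear-sky DNI but cannot anticipate a cloud arriving within the horizon.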

Database and networks' training
The database used in this study covers 373 days, during which DNI and sky images were recorded every 30 seconds. DNI is measured with a pyrheliometer, and the sky images are obtained using a ground-based camera developed by PROMECA (http://promecaweb.com), with the help of PROMES-CNRS (see Figure 4). A simple classification of the database reveals 128 clear-sky days (34.4%), 49 overcast days (13.1%) and 196 days with mixed situations (52.5%). All the networks are trained using a subset of 40 days, selected to include examples from different seasons and with various DNI profiles. This subset is then split into a training dataset of 22 days and a test dataset of 18 days.

Performance metrics
In this paper, three performance metrics are used:
1. The normalized root mean squared error (nRMSE) (Equation 5).
2. The skill factor (SF), which evaluates each model's performance relative to the smart persistence model.
3. The ramp detection index (RDI) [15], which evaluates each model's ability to predict ramps (sudden DNI variations).

This demonstrates the benefits of including sky images: even though it has been trained with the same DNI measurements, the hybrid model is able to generalize to these situations.
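The first two metrics listed above can be sketched as follows. The exact definitions used in the paper are not reproduced here, so two common conventions are assumed: the RMSE is normalized by the mean observed DNI, and the skill factor is one minus the ratio of the model's nRMSE to that of the reference (smart persistence) model. The ramp detection index is not sketched, since its definition is given in [15].

```python
import math

def rmse(forecast, observed):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(observed))

def nrmse(forecast, observed):
    """RMSE divided by the mean observed value (assumed normalization)."""
    return rmse(forecast, observed) / (sum(observed) / len(observed))

def skill_factor(nrmse_model, nrmse_reference):
    """1 - nRMSE_model / nRMSE_reference (assumed form).
    Positive when the model beats the reference, negative when it is worse."""
    return 1.0 - nrmse_model / nrmse_reference

# Hypothetical DNI values in W/m².
observed = [500.0, 600.0, 700.0]
model = [510.0, 590.0, 700.0]
persistence = [520.0, 580.0, 700.0]
print(skill_factor(nrmse(model, observed), nrmse(persistence, observed)))
```

Under this convention, a skill factor of 0.26 (26%) means the model's nRMSE is 26% lower than that of the smart persistence model, while a negative value means the reference model performs better.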

Conclusion
This paper deals with the development of a hybrid intrahour forecast model that combines knowledge-based and machine-learning approaches and takes DNI measurements and HDR sky images as inputs. This hybrid model is compared to LSTM and CNN-LSTM models that take solely past DNI observations as input, in order to assess the benefits of integrating HDR sky images in the forecasting process. The smart persistence model is also used as a reference for the comparison. Results show that the tested models are capable of outperforming the persistence model: for the hybrid model, skill factor values range from 12% to 26% as the forecast horizon increases from 5 to 15 minutes, whereas the LSTM and CNN-LSTM models score around 10% for all horizons. The ramp detection index shows that the tested models are capable of predicting DNI ramps: the hybrid model is able to forecast 72% to 80% of the ramps, whereas the LSTM and CNN-LSTM models are less effective, detecting between 53% and 66% of the ramps. This difference is due to the fact that the LSTM and CNN-LSTM models are purely statistical and rely solely on past DNI observations, without taking the atmospheric situation into account: efficient cloud detection and accurate cloud motion estimation translate into better ramp detection and more precise DNI forecasts. For clear-sky and overcast situations, the persistence model produces very good results, and the results obtained by the LSTM and CNN-LSTM models are considerably inferior. However, the hybrid model still manages to outperform the persistence model, with skill factor values ranging from 6% to 9.5%. Thanks to the inclusion of sky images, it successfully handles clear-sky, overcast, and mixed situations. As for the complexity of the models, the analysis shows that, while the hybrid model is more complex, more time-consuming, and more demanding in computational resources, it is still able to provide forecasts within 7% of the 30-second sampling time. In the framework of the European project SFERA III, the proposed model has been implemented in situ to provide real-time DNI forecasts to CSP infrastructure users.

Figure 1. The hybrid model. The symbols in the figure denote the cloud fraction in the ROI, an RGB image of the region of interest, the forecast clear-sky DNI, and the forecast DNI.

Figure 2. Image processing steps leading to the calculation of the features. The symbols in the figure denote an RGB image of the region of interest and the cloud fraction in the ROI.

Figure 3. DNI forecast model. The symbols in the figure denote an RGB image of the region of interest, the cloud fraction in the ROI, the forecast clear-sky DNI, and the forecast DNI.

Figure 6. Comparison of the LSTM model, the CNN-LSTM model and the hybrid model on clear-sky and overcast days (test dataset), for H = 5 minutes, H = 10 minutes and H = 15 minutes. The skill factor of the LSTM and CNN-LSTM models is not represented because it is too low (around -120% for H = 5 minutes, around -50% for H = 10 minutes and around -25% for H = 15 minutes). The ramp detection index is not included because there are very few ramps during these low-variability days.