AI-powered Energy Internet Towards Carbon Neutrality: Challenges and Opportunities

From self-driving vehicles and voice-recognition-based virtual digital assistants to smart thermostats and recommendation systems, Artificial Intelligence (AI) is becoming a crucial part of the carbon neutral society and has drawn considerable interest from energy supply firms, startups, technology developers, financial institutions, national governments and the academic community. The emergence of AI creates numerous opportunities to transform the energy industry into an AI-powered smart system, which can revolutionize traditional approaches to creative thinking, strategic operation, and solution seeking, especially for accelerating the carbon neutrality of our society. This survey provides a comprehensive overview of the fundamental principles that underpin applications of big data analysis in the Energy Internet (EI), such as smart energy supply and consumption, smart health and Fintech. Next, we focus on intelligent decision-making for the energy industry and present the state of the art by thoroughly reviewing the literature. Subsequently, cybersecurity issues for AI systems related to EI are discussed, with recent advancements ranging from vulnerability analysis of AI systems to differential privacy and blockchain-based security technology. To our knowledge, this is one of the first academic, peer-reviewed works to provide a systematic review of AI applications for EI research and initiatives in terms of big data analysis, intelligent decision-making and AI-related cybersecurity. These initiatives are systematically classified into different groups according to field of application, methodology and contribution. Afterwards, potential challenges and limitations of existing research and opportunities for future directions are discussed, ranging from emerging explainable AI to localized multi-energy marketplaces, self-driving electric vehicle charging and e-mobility. This paper can help us understand how to build smart cities and critical infrastructure for a climate-changed world towards the UN's sustainable development goals.


I. INTRODUCTION
Since the evolution of life, the Sun and the energy with which it nourishes the Earth have sustained the prosperity of human agricultural civilization. The use of coal and other fossil energy brought about the industrial revolution. Computers alongside communication technology helped us enter the information society.

A. Recent Advance of AI
AI has made significant progress, demonstrated by a variety of high-profile successes such as self-driving vehicles, virtual assistants, and video game-playing systems, built on supervised, unsupervised, semi-supervised and reinforcement learning methods [1]. Specifically, much of the progress in the last decade has been driven by three main catalysts, i.e., data, powerful computational resources, and distributed algorithms, which make it possible to train much deeper neural networks and thereby extract more complicated features in a wide range of challenging problems.
Image Understanding. Deep learning techniques can significantly improve the performance of various classification tasks without human-defined feature engineering. Once plenty of image data becomes available, humans no longer need to decipher the unique features that distinguish one species from another. This has been successfully applied to medical image recognition of skin cancer [2].
Intelligent Decision Making. In 2016, AlphaGo, a deep reinforcement learning system developed by the Google DeepMind team, became the breakthrough AI system that defeated a world-champion Go player [3]. The strong learning ability of deep learning multi-agent systems in multiplayer games is also demonstrated by OpenAI Five, which defeated a professional Dota 2 team using a vastly complex strategy including self-play techniques, self-imitation learning, LSTM networks, proximal policy optimization, and so forth [4].
Natural Language Processing. Deep learning is becoming the leading approach in the field of language translation. With new contextualized word embedding techniques [5], deep neural network based machine translation, for example with the transformer architecture, is rapidly approaching human levels.
Physical Automation. The application of deep learning techniques to physical systems is another area of AI in which significant achievements have been made. This technology makes self-driving cars increasingly smarter and safer. Newly developed driving aids reduce driver stress by taking over partial steering duties, automatically matching the driving speed to posted road signs, and helping to stop the car if an accident risk is detected.

B. Concept of EI
Today, the Internet is eliminating the boundaries of time and space. Now with Cloud computing integration, our society is becoming more connected, collaborative and intelligent than ever before. In the energy field, the environment can no longer tolerate the pollution produced by fossil fuels. With the changes taking place in energy production and consumption, renewable energy is leading us to a sustainable future [6].
In energy production, technological innovation is bringing photovoltaic power costs down. With established applications of energy storage technology, renewable energy will progressively become the main source for energy supply to achieve efficient multi energy co-existence. In energy consumption, with the transition to electrification, electric vehicles will progressively replace traditional vehicles. As home energy management technology develops, homes, whilst consuming energy, will also become independent power production units [7].
In energy information sharing, communication technology and IoT devices continue to draw energy production and consumption closer together. In the foreseeable future, energy information will be gathered efficiently on the cloud platform to centralize intelligent allocation and management [8]. The Energy Internet (EI) will expand from districts to cities and to countries, providing sustainable energy to the smart society through big data analysis, intelligent decision making and cybersecurity. The characteristics of EI could include: 1) Multiple energy sources, including electricity, gas/biomass, heat/cool and hydrogen, 2) Hybrid AC and DC power supply, 3) Electrification of transportation, 4) Digitized and distributed operation of the energy system, i.e., peer-to-peer (P2P) operation, 5) Highly flexible and autonomous system with self-management, 6) Resilience to extreme weather and conditions, 7) Bidirectional power and information flow at every stage, 8) Easy expansion and support for the scalability of the global energy system, 9) Plug and play of PEVs, renewables, energy storage, and balanced loads, 10) Intelligent prewarning of faults, real-time fault location and isolation, 11) A highly integrated and reliable communication system, 12) Integrated IoT infrastructure and security measures to prevent cyberattacks, 13) Blockchain-enabled transparent, tamper-proof and secure energy systems, 14) More energy plan choices and large bill savings for consumers, 15) An innovative and competitive localized energy market.
The application of AI in the energy sector is essential to the development of EI, which is creating a critical impact as a clean, cheap, and reliable energy system. AI-powered EI focuses on smart power generation, consumption and infrastructure by applying machine learning techniques, sharing-economy concepts, and the integration of renewable energy with big data analysis, and by acting as an EI enabler and supplier, which eventually simplifies, digitizes, and automates the global operation and maintenance service towards a

C. The Role of AI-powered EI in Achieving Carbon Neutrality
By 2050, more than two-thirds of the world's population is expected to live in urban areas, which is attributable to the gradual shift from rural to urban areas and the overall natural growth of the population. This large population growth in cities could cause a series of challenging issues including: 1) Poor air quality, 2) Insufficient clean water, 3) Lack of sanitation, 4) Waste-disposal problems, 5) Inadequate medical care infrastructure, 6) Traffic congestion, 7) High energy consumption, 8) Cyberattacks.
Carbon Neutrality&SDGs. Successfully solving these challenging issues plays a significant role in achieving sustainability under the UN's Sustainable Development Goals (SDGs) [9] in terms of balance between the economy, society and the environment. Achieving the SDGs and achieving carbon neutrality are fundamentally connected. The Intergovernmental Panel on Climate Change (IPCC) has already found that climate change can undermine sustainable development, and that well-designed mitigation and adaptation responses can support the SDGs. As illustrated in Fig. 1, SDGs 6, 7, 11, 8, 9 and 13 benefit substantially from the carbon neutrality achieved by AI-powered EI, which enables low-carbon equipment, recyclable materials, renewable energy resources and ICT technologies for solving challenging issues.
Carbon neutrality requires mitigation actions and adaptation measures to be taken at all levels to reduce carbon emissions. These actions can interact with the SDGs in positive ways that are known as synergies, or in negative ways by hindering the SDGs (known as trade-offs). Thus, a balance between synergies and trade-offs is important when pursuing both the SDGs and carbon neutrality. In particular, the emerging AI-powered EI provides an efficient solution and can contribute to sustainable development while also lowering emissions, reducing climate change impacts, and facilitating adaptation.
Figure: The challenging issues solved by AI-powered EI to achieve the UN's SDGs
Addressing these challenging issues involves the mitigation of the negative impacts by rapid urbanization and the adaptation of innovative technologies for unavoidable consequences. Mitigating the negative impacts of rapid urbanization requires changes in energy consumption patterns in commercial buildings, industries, residential homes, and transportation. Adapting innovative technologies requires an understanding of extreme conditions to plan for resilience and emergency management when disaster happens.
The emerging EI concept brings together integrated energy systems, diversified energy generation technology, reliable energy storage technology, advanced ICT technology, etc. to improve energy efficiency and resource usage, reduce air and water pollution, and create a sustainable solution of mitigation and adaptation to overcome the challenges faced by our world. With regard to the UN's SDGs, the benefits contributed by AI-powered EI to Society, Economy and Environment are illustrated in Fig. 2, and the detailed assessments are presented as follows.
AI&EI for Society. AI-powered EI could be significantly beneficial for our society by achieving the targets of clean water and sanitation (SDG 6), affordable and clean energy (SDG 7), and sustainable cities (SDG 11). Children and the elderly with several chronic conditions from low-income communities are exceptionally vulnerable to adverse health outcomes and social consequences from exposure to water and air pollution. AI-powered EI improves the efficiency of clean technologies [10] so that less water and air pollution is created when supplying the same volume of energy to end customers, which can considerably mitigate the health problems caused by pollution.
AI-powered EI can support low-carbon energy consumption associated with smart urban planning, management and development, which encompasses a variety of interacting green technologies to achieve a more sustainable future. Through big data analysis, smart buildings have great potential to respond to peak-shaving signals to relieve congestion, which can substantially postpone expensive electrical infrastructure upgrades [11]. Through deep learning techniques, self-driving EVs can make autonomous renewable energy harvesting possible without worsening traffic problems across cities in the future [12].
AI&EI for Economy. AI-powered EI could play a meaningful role in promoting the transition to a low-carbon economy by creating decent work and economic growth (SDG 8), and industry, innovation and infrastructure (SDG 9). Although many job positions can be replaced by AI, the outcome of the technological revolution is rising employment and economic growth. As AI revolutionizes the energy industry, innovation can boost the number of jobs in related sectors, especially in knowledge-intensive ones. For instance, distribution network companies require a number of data analyst positions to improve the efficiency of network operation by identifying various abnormalities. AI-powered EI could directly bring a huge economic benefit on the way to being 100% powered by renewable energy resources. By improving energy efficiency and decreasing emissions overall, DeepMind AI has reduced Google's data centre cooling bill by 40% [13].
AI&EI for Environment. AI-powered EI could directly address challenging environmental issues caused by climate change, which relates to climate action (SDG 13). All innovative clean technologies of AI-powered EI aim to provide a promising approach to fighting climate change and preserving the environment by supporting low-carbon energy systems, which result in environmentally friendly energy consumption [14]. By intelligently integrating renewable energy, energy storage and EVs into our communities, greenhouse gas emissions can be materially reduced through a significant reduction in the use of fossil fuels.
This paper surveys how AI-powered EI can help achieve the UN's SDGs through different AI applications in the energy system, from Section III onwards. This involves big data analysis for prediction and anomaly detection, intelligent decision making for energy management, and cybersecurity issues. Section VI discusses possible future directions of AI applications in EI. Conclusions are drawn in Section VII.

II. MOTIVATION
In this section, we primarily focus on the motivations of this survey which summarizes the key challenges that may be faced in achieving a carbon neutral energy Internet.

A. Key Challenges Towards Carbon Neutrality
To achieve carbon neutrality, there are four key challenges in the energy industry as illustrated in Fig. 3.
Digitalization. The energy system is confronting increasing challenges for sustainability: the large-scale integration of distributed energy resources, changes in electricity market regulation, the highly elevated risk of cyber-attacks, aging power delivery infrastructure and generation plants, and volatile fossil fuel prices in recent times. To tackle these challenges, digital technologies such as the Internet of Things (IoT) and Blockchain are essential prerequisites for supporting low-carbon power sector transformation objectives, assisting the collection and processing of huge amounts of data, delivering real-time solutions in emergencies, and providing optimized decision-making support for increasingly complex energy systems. However, how to digitalize the energy system efficiently is the first key challenge, which should be intensively investigated.
Decentralization. The rapidly declining cost of distributed energy generation and energy storage, i.e., rooftop solar PV and home batteries, and gradually rising environmental awareness are making customers seek lower-carbon power generation infrastructure, which in effect decentralizes the energy supply. At the national electricity market level, utilities are undergoing a green cultural transformation in which they are moving from hierarchical management models to a decentralized electricity market and allowing third-party asset ownership, i.e., community batteries. For instance, replacing traditional electricity business models with Peer-to-Peer (P2P) energy trading can bring new sources of revenue for retailers and distribution network companies. However, the impact of decentralization on energy supply and demand needs careful consideration, which this paper provides.
Decarbonization. In order to fight climate change and global warming, humanity should take bold, ambitious climate actions not only to reduce energy consumption, but also to lower carbon emissions. Consumer behavior is evolving to the stage where people prefer to use products or services that have a positive effect on the environment. This in turn will incentivize companies to develop new offsetting technologies which counteract unavoidable emissions. For example, hydrogen has been introduced to blend with natural gas, which is widely used for heat pumps in domestic cooling/heating, and is targeted at the decarbonization of large-scale energy-hungry industrial processes, e.g., the nonferrous metals industry. Nonetheless, decarbonization aspires to deliver better energy efficiency, and the long-standing environmental questions it raises are examined, and where possible answered, in this survey. More importantly, measuring progress towards carbon neutrality requires a standard quantitative measurement of carbon footprint.
Electrification. Electrification refers to replacing technologies that use fossil fuels, e.g., coal and natural gas, with technologies that use only electricity as their source of energy. A report by the International Renewable Energy Agency (IRENA) said that renewable energy is increasingly cheaper than fossil fuels for power generation. The rise of e-mobility is changing transportation, which can now be powered by renewables rather than fossil fuels. Long-term carbon neutrality not only relies on high energy efficiency through decarbonization, but also largely depends on electrifying energy-hungry machines. Nevertheless, how electrification can reduce and mitigate the carbon dioxide emissions from powering transportation, heating buildings, and supporting industrial sectors should be comprehensively studied.

B. Leveraging AI for Solving Key Challenges
This survey covers three important aspects, namely big data analysis, intelligent decision-making and cybersecurity, which leverage AI to address these four key challenges as follows.
Big Data Analysis. The energy system transformation makes EI fit for a zero-emissions future. However, it also brings various technologically and socio-economically challenging issues. These originate from electrifying the heating and transportation systems, understanding the uncertainty of load demand and variable renewable energy on existing assets, processing energy-related business and quantifying climate-related risks. AI spans various applications of EI, providing powerful tools for understanding the spatio-temporal data streams now being collected by advanced metering infrastructure (AMI) smart meters and phasor measurement units (PMUs), and unlocking great potential in the energy system. Many important applications have been studied through big data analysis across the technical, social and environmental sciences, and all of these topics are currently experiencing a large flow of data from smart meters, environmental monitoring systems, renewable energy power plants, the electricity market, electrical grid monitoring systems, as well as physics-based models.
Intelligent Decision-making. Optimization based decision-making can be applied to all operations of the energy system, from energy generation to energy delivery and energy consumption. Intelligent decision-making for energy management in the Energy Internet could considerably promote optimal energy consumption and generation scheduling strategies, maintain system stability and reliability, and respond to real-time electricity prices towards a sustainable future [15]. Given an operation and control task of the energy system, an optimization based mathematical model can be established according to expert knowledge, which results in a model based solution. Moreover, in a real energy system, the objective function of the optimization problem, which is to improve system efficiency, often suffers from uncertainties and is often non-convex, requiring a robust approach. Existing research mainly leverages a Predict, then Optimize paradigm to solve this issue. In addition, as more energy sources are integrated into the energy system, it becomes difficult to describe the operation process precisely with a mathematical model, in which case a data-driven approach can be applied. In particular, most energy management problems can be transformed into policy decision problems and can be solved well using deep learning and reinforcement learning [16].
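The Predict, then Optimize pattern mentioned above can be illustrated with a deliberately small sketch. It is not taken from any of the cited works: the price data, the hour-of-day averaging predictor and the greedy scheduler are all illustrative assumptions, chosen only to show the two stages in sequence.

```python
from statistics import mean

def predict_prices(history):
    """Predict step: forecast each hour's price as the mean of that hour over past days."""
    hours = len(history[0])
    return [mean(day[h] for day in history) for h in range(hours)]

def schedule_load(prices, hours_needed):
    """Optimize step: run a flexible load in the cheapest forecast hours."""
    ranked = sorted(range(len(prices)), key=lambda h: prices[h])
    return sorted(ranked[:hours_needed])

# Hypothetical prices ($/MWh) for a 6-hour horizon over three past days.
history = [
    [50, 40, 30, 60, 80, 70],
    [52, 38, 28, 62, 85, 75],
    [48, 42, 32, 58, 75, 65],
]
forecast = predict_prices(history)
plan = schedule_load(forecast, hours_needed=2)
print(forecast, plan)
```

In practice the predictor would be one of the learned forecasters surveyed in Section III, and the optimizer would carry the system constraints; the quality of the final schedule then depends on how forecast errors propagate through the optimization.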
Cybersecurity. Prediction and optimization are widely used in the Energy Internet, where AI systems can significantly improve the performance of operation and control. Deep learning based forecasting algorithms are becoming more complex but can bring better accuracy. Deep reinforcement learning based decision-making can produce a better operational strategy under uncertainty. However, these AI algorithms are more susceptible to cyberattacks, a risk that is not yet well recognized in the energy systems community.
Transforming the way in which we produce and consume energy to reduce our carbon footprint requires digitalization, decentralization, decarbonization, and electrification to be taken into account, which are key challenging issues widely investigated by existing research. Addressing these challenges of the energy transition also benefits from AI, where the large influx of data in the Energy Internet can be leveraged to improve system efficiency. Therefore, this survey investigates the social, economic and environmental benefits of AI in terms of big data analysis, intelligent decision making and cybersecurity, which are expected to yield valuable new knowledge for all stakeholders.

III. THE OVERVIEW OF BIG DATA ANALYSIS IN EI
The state-of-the-art applications of big data analysis for EI are presented by machine learning techniques including time series modelling and forecasting, supervised and unsupervised classification, fault diagnosis, anomaly detection and computer vision. In addition, to enable the reproducibility of these results, the data source is also provided as in Fig. 4.

A. Forecasting Tasks
To capture the way of energy production and consumption, a large influx of data [37] can be harnessed for building models to characterize consumption patterns, forecast production with future information, and predict price against uncertainty.
In this section, we will mainly present the recent advancement of deep learning based forecasting in the energy system.
Load Forecasting. AI algorithms aim to address the high volatility and uncertainty of individuals' power consumption at the residential level through a time series regression approach. The long short-term memory (LSTM) [74] architecture is a typical recurrent neural network that can capture the temporal dependency of time series data. In [19] and [17], standard LSTM based deep networks are developed for short-term forecasting of the energy consumed by individual residential households, where the residential smart meter data are publicly available in [18] and [20], respectively. By collecting house temperature and humidity conditions through a ZigBee wireless sensor network, the performance of a gradient boosting machine (GBM) with data-driven feature engineering is compared with traditional machine learning techniques including support-vector machine (SVM) and random forest (RF) [75]. This smart appliance energy prediction task is also used to verify the effectiveness of time series prediction by a generative adversarial network (GAN) [25]. Moreover, many other learning systems have been introduced for load forecasting problems, e.g., extreme learning machine (ELM) [21], active learning based LSTM model [23], ensemble learning [27], [76], and so forth.
Fig. 4: The taxonomy of big data analysis in EI to achieve UN's SDGs
Solar Energy Forecasting. AI aims to predict the solar energy produced by rooftop PV panels and solar farms according to different feature inputs. In [36], an Auto-LSTM model is employed to predict the time series of solar energy, where the model takes advantage of feature learning by an AutoEncoder and temporal feature extraction by LSTM. To exploit the spatio-temporal distribution of solar irradiance, a convolutional graph autoencoder [38] is developed for solar irradiance forecasting, while deep graph dictionary learning [77] is built for the problem of net load disaggregation. When the real dataset is largely corrupted by noise or partially missing due to sensor malfunction, a graph signal processing approach is proposed for high-resolution PV forecasting, where PV data is modelled by numerical simulation [40]. Further details of solar energy forecasting can be found in [78], whilst a commercial solar forecasting toolkit can be found in [79].
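Whatever model is used, the load and solar forecasters described above are trained on fixed-length windows of past measurements paired with a future target. The following minimal sketch shows that framing together with a persistence baseline against which a learned model is commonly compared. The series values, the function names and the choice of three lags are illustrative assumptions, not taken from the cited datasets.

```python
def make_windows(series, n_lags, horizon=1):
    """Turn a 1-D series into (input window, target) pairs for supervised learning."""
    samples = []
    for start in range(len(series) - n_lags - horizon + 1):
        window = series[start:start + n_lags]
        target = series[start + n_lags + horizon - 1]
        samples.append((window, target))
    return samples

def persistence_forecast(window):
    """Naive baseline: predict that the next value equals the last observed one."""
    return window[-1]

def mean_absolute_error(samples, predictor):
    """Average absolute gap between the predictor's output and the true target."""
    errors = [abs(predictor(w) - y) for w, y in samples]
    return sum(errors) / len(errors)

# Hypothetical half-hourly household load in kW.
load = [0.4, 0.5, 0.9, 1.2, 0.8, 0.6, 0.5, 0.7]
samples = make_windows(load, n_lags=3)
baseline_mae = mean_absolute_error(samples, persistence_forecast)
print(len(samples), round(baseline_mae, 3))
```

An LSTM or GBM forecaster would replace `persistence_forecast` while consuming the same `(window, target)` pairs, so the baseline error gives a reference point for judging whether the added model complexity is worthwhile.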
Wind Energy Forecasting. Wind energy is another typical renewable energy resource, which generally has higher volatility. To reduce the risk of overfitting in the prevailing LSTM model, concatenated residual learning with stacked bidirectional long short-term memory (Bi-LSTM) layers, connecting a multi-level residual network and DenseNet, is proposed for wind energy forecasting in [42]. In [45], hierarchical forecasting is introduced for wind power, where a generalized least squares method is first established to reconcile wind power predictions at different levels and achieve better accuracy. A pairs bootstrap based ELM is introduced for probabilistic wind power forecasting to model the regression uncertainty, with the best performance reported [80].
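To make the idea of reconciling forecasts across levels concrete, the sketch below treats the simplest possible hierarchy: one regional total and its individual wind farms. It assumes independent forecast errors with known variances (a diagonal covariance), for which the generalized least squares solution reduces to spreading the incoherence between the total and the sum of the farms in proportion to each forecast's variance. This is an illustrative special case, not the formulation of [45], and all numbers are hypothetical.

```python
def reconcile(total_fc, leaf_fcs, total_var=1.0, leaf_vars=None):
    """Weighted least squares reconciliation for a two-level hierarchy.

    Finds the coherent forecasts (total == sum of leaves) closest to the
    base forecasts, weighting each squared adjustment by 1 / variance.
    """
    if leaf_vars is None:
        leaf_vars = [1.0] * len(leaf_fcs)
    gap = total_fc - sum(leaf_fcs)          # incoherence of the base forecasts
    var_sum = total_var + sum(leaf_vars)
    new_total = total_fc - gap * total_var / var_sum
    new_leaves = [f + gap * v / var_sum for f, v in zip(leaf_fcs, leaf_vars)]
    return new_total, new_leaves

# Hypothetical base forecasts in MW: region total and two wind farms.
total, farms = reconcile(100.0, [40.0, 50.0])
print(round(total, 3), [round(f, 3) for f in farms])
```

Forecasts with a larger assumed variance absorb more of the adjustment, so a trusted total forecast (small `total_var`) pulls the farm-level forecasts towards it, and vice versa.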
Energy Price Forecasting. Electricity price plays a significant role in maintaining the balance of power supply and demand. A wavelet transform based neural network (NN) model is proposed for electricity price forecasting [46], where wavelet decomposition is applied to characterize the original price signal before it is processed by the NN. A two-stage approach to probabilistic electricity price forecasting is investigated to establish prediction intervals of market clearing prices, where an ELM is used to estimate the point forecast of the price in the first stage and the maximum likelihood method is used to estimate the noise variance in the second stage [81]. Moreover, a bootstrapping based ELM [82] is proposed for improving the accuracy of electricity price interval forecasting.
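As a small illustration of wavelet decomposition as a preprocessing step, the sketch below applies a one-level Haar transform to a price series, splitting it into a smooth approximation and a detail component that could each be fed to a separate predictor. The Haar wavelet is chosen here only because it is the simplest to write out; the wavelet family used in [46] is not stated in this survey, and the price values are hypothetical.

```python
import math

def haar_decompose(signal):
    """One-level Haar transform: returns (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_reconstruct(approx, detail):
    """Inverse of haar_decompose: rebuilds the original samples pair by pair."""
    s = math.sqrt(2.0)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / s, (a - d) / s])
    return signal

# Hypothetical hourly prices ($/MWh) with a short spike.
prices = [30.0, 32.0, 45.0, 80.0, 78.0, 40.0, 35.0, 33.0]
approx, detail = haar_decompose(prices)
restored = haar_reconstruct(approx, detail)
print([round(a, 2) for a in approx])
print([round(d, 2) for d in detail])
```

The transform is lossless, so forecasts made on the separate components can be recombined with `haar_reconstruct` to obtain a forecast of the original price signal.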
EV Charging Demand Forecasting. With the increasing  [48]. A longitudinal dynamics model [50] is employed to estimate the charging demand for electric buses, which can be easily applied to large transportation networks by bus transit operators. A comparative case study of scenario-based forecasting of EV consumption is investigated in [83]. To capture the spatio-temporal distribution of the charging load for EVs, a probabilistic model is developed by combining real-world household travel survey (HTS) data with GIS information, which is computationally efficient for large quantities of EVs in the transportation sector at a country scale [84].
Probabilistic Forecasting. To convey forecast uncertainties in practice, probabilistic forecasting [30] is introduced through quantile regression, where a data-driven feature engineering technique and a clustering method are applied before the net load prediction task is carried out by Gradient Boosting Regression Trees (GBRT). Deep Bayesian LSTM networks are proposed in [32] with the same probabilistic forecasting procedure as [30], where the set of customers is divided into different groups by clustering. To further handle the uncertainty [85], multitask learning (MTL) based Bayesian LSTM networks are studied to simultaneously account for customers' differences across distinct groups. Moreover, using estimation theory, peak load estimation is studied in [34], where different statistical analytical models are used.
By embedding a mixture density network layer and the Monte Carlo dropout technique into a multi-attention recurrent neural network (MARNN) [56], a new probabilistic forecasting method for battery SOC under primary frequency control is proposed, in which the probabilistic forecast is not obtained through the quantile loss function. In [86], a hierarchical probabilistic forecasting model is introduced to decompose the problem into time and energy flexibility prediction of EVs. More details of energy forecasting can be found in [87].
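Quantile regression, referred to above as the basis of several probabilistic forecasters, trains a model by minimizing the quantile (pinball) loss. The sketch below writes out that standard loss and shows its asymmetry for an upper quantile. The function names and the numerical values are illustrative assumptions.

```python
def pinball_loss(actual, predicted_quantile, tau):
    """Quantile (pinball) loss for a single observation and quantile level tau."""
    diff = actual - predicted_quantile
    return max(tau * diff, (tau - 1.0) * diff)

def mean_pinball_loss(actuals, predictions, tau):
    """Average pinball loss over a set of observations."""
    losses = [pinball_loss(y, q, tau) for y, q in zip(actuals, predictions)]
    return sum(losses) / len(losses)

# For the 0.9 quantile, predicting too low is penalized nine times
# more heavily than predicting too high by the same margin.
under = pinball_loss(actual=10.0, predicted_quantile=8.0, tau=0.9)
over = pinball_loss(actual=10.0, predicted_quantile=12.0, tau=0.9)
print(round(under, 3), round(over, 3))
```

Training one model per quantile level (for example 0.1, 0.5 and 0.9) with this loss yields a set of quantile forecasts that together describe a prediction interval around the expected load.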

B. Smart Home & Smart Building
Non-intrusive Load Monitoring / Energy Disaggregation. Non-intrusive load monitoring [88] aims to disaggregate a customer's total power consumption into individual appliance-level consumption, which offers promising potential for enhancing demand response. The first attempt to solve energy disaggregation with deep learning can be found in [89], where energy disaggregation is treated as a 'denoising' task handled by a multi-layer denoising autoencoder. In [54], an 18-layer CNN structure is developed to considerably boost the classification accuracy for type II home appliances. Hybrid intelligent methods for non-intrusive load monitoring can be found in [90] and [91]. In addition, energy disaggregation is remodeled as a temporal dictionary learning problem in which the representation of the temporal features of power consumption is learned in the latent space of an LSTM autoencoder [92].
Load Characteristic Modelling. It is critical to better understand the relationship between power consumption patterns and customers' behaviour in order to meaningfully improve the energy efficiency and reliability of the energy system by enabling demand management. Clustering algorithms such as K-means and agglomerative clustering are powerful unsupervised learning techniques that divide a set of load data into different groups according to a similarity measure, as in [93] and [80]. Typically, similarities in customer power consumption behavior can be exploited to improve performance on the forecasting problem [94] and the problem of residential baseline estimation [95].
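The clustering of load profiles described above can be sketched with a plain K-means implementation. The example is illustrative only: each customer's day is summarized by four hypothetical average readings (night, morning, afternoon, evening), and the centroids are initialized from the first k profiles so that the run is deterministic, which a production implementation would normally replace with a randomized or k-means++ initialization.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equally long profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(profiles, k, n_iter=20):
    """Plain k-means with deterministic initialization from the first k profiles."""
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(n_iter):
        # Assignment step: attach each profile to its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in profiles]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical average load (kW) per customer: night, morning, afternoon, evening.
profiles = [
    [0.2, 0.3, 0.4, 1.5],  # evening peak
    [0.3, 1.4, 0.5, 0.4],  # morning peak
    [0.2, 0.4, 0.3, 1.6],  # evening peak
    [0.4, 1.5, 0.4, 0.3],  # morning peak
]
labels, centroids = k_means(profiles, k=2)
print(labels)
```

The resulting group labels can then serve as an input feature, or as a way of training one forecaster per group, in the forecasting and baseline estimation tasks mentioned above.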
Residential Electricity Plan Recommendation. In the Australian electricity market, different retailers offer a large number of energy consumption plans with various discounts. However, finding the most suitable plan for a specific customer is quite challenging. In [52], a personalized recommendation system is proposed that learns user experiences of different smart household appliances and recommends potentially cost-saving energy plans to customers through collaborative filtering while taking their daily lifestyles into account. In [96] and [97], similar personalized recommender systems are developed for electricity plan recommendation using collaborative filtering, where different feature extraction methods are applied to capture customers' power consumption preferences.
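A minimal sketch of user-based collaborative filtering applied to plan recommendation is given below. It is not the method of [52], [96] or [97]: the customers, plan names and ratings are hypothetical, ratings are assumed to be strictly positive satisfaction scores, and similarity is measured by cosine similarity over the plans two customers have both rated.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity over the plans both customers have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[p] * b[p] for p in shared)
    norm_a = math.sqrt(sum(a[p] ** 2 for p in shared))
    norm_b = math.sqrt(sum(b[p] ** 2 for p in shared))
    return dot / (norm_a * norm_b)  # assumes strictly positive ratings

def predict_rating(ratings, customer, plan):
    """Similarity-weighted average of other customers' ratings for the plan."""
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == customer or plan not in theirs:
            continue
        weight = cosine_similarity(ratings[customer], theirs)
        num += weight * theirs[plan]
        den += weight
    return num / den if den else None

def recommend(ratings, customer, plans):
    """Return the unrated plan with the highest predicted rating, if any."""
    unseen = [p for p in plans if p not in ratings[customer]]
    scored = [(predict_rating(ratings, customer, p), p) for p in unseen]
    scored = [(s, p) for s, p in scored if s is not None]
    return max(scored)[1] if scored else None

# Hypothetical satisfaction scores (1 to 5) for hypothetical plan names.
ratings = {
    "alice": {"flat": 2, "time_of_use": 5},
    "bob": {"flat": 1, "time_of_use": 5, "solar_feed_in": 5, "ev_overnight": 2},
    "carol": {"flat": 5, "time_of_use": 1, "solar_feed_in": 1, "ev_overnight": 4},
}
plans = ["flat", "time_of_use", "solar_feed_in", "ev_overnight"]
print(recommend(ratings, "alice", plans))
```

Because alice's ratings resemble bob's far more than carol's, bob's preferences dominate the weighted average. The cited systems differ mainly in how the preference features are extracted from consumption data rather than entered as explicit ratings.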

C. Smart Cities
Waste Management. When AI technologies are applied to the wastewater treatment process, sludge expansion problems can be addressed in improving aeration and pump more efficiently which could meaningfully reduce energy cost [71]. Machine learning technique is integrated into wastewater treatment plants modelling where highly complex energy cost function can be obtained and solvable [98]. The state of wastewater treatment plant can be monitored by various sensors to control energy consumption and discharge quality where detecting faults in wastewater treatment plants can be implemented by LSTM [99].
Smart Healthcare. AI technologies make the diagnosis and treatment of diseases more intelligent and can be harnessed to build clinical decision support systems, e.g., for skin cancer [2]. The most representative smart healthcare system is IBM's Watson, which can provide an optimal clinical diagnosis solution through in-depth clinical data analysis for disease risk prevention [100]. An explainable AI based early warning system [101] is introduced for the early detection of acute critical illness, where clinical translation is facilitated by accompanying the results with explanations based on information from the electronic health records.
FinTech & InsurTech. AI technologies are penetrating the FinTech and InsurTech industries, where they can provide cost-effectiveness, better customer experience, and enhanced financial security. In [102], a deep learning based relational stock ranking method is proposed for stock prediction, where the temporal evolution and the relationships among different stocks are captured by temporal graph convolution. By leveraging Bayesian mixture density networks [103], a forecasting framework for individual claims reserving is developed for claims analytics tasks, which can estimate cash flows over multiple periods.

D. Power System Stability Assessment
Power system stability is defined as the ability of the power system, for a given normal operating condition, to return to its initial condition after being disturbed; steady-state stability and transient stability are the major concerns [104]. In [105], an AI based early-warning system is developed for the online detection of risky operating conditions, where the proposed method involves an ELM based prediction and a decision-making process within a multi-objective optimization framework. In [106], an ensemble model is proposed for pre-contingency stability assessment with incomplete PMU datasets, where the learning model requires only a minimum number of classifiers for clustered PMU measurements.

E. System State Estimation
State estimation aims to identify unknown values of the power system state variables based on imperfect measurements, which provides useful information for the security and reliability of the power system [107]. A Bayesian state estimation [63] that is robust against modeling errors in the presence of bad and missing data for unobservable distribution systems is investigated in [108], where a distribution learning technique and a Monte Carlo technique are developed to train a deep neural network. By leveraging only historical smart meter data [70], a data-driven power network topology estimation method is introduced for medium-voltage and low-voltage distribution networks, where the statistical dependencies amongst bus voltages are captured by a probabilistic graphical model.

F. System Anomaly Detection
Electricity Theft. Electricity theft is a growing concern for utilities because it causes large financial losses, so effective ways to identify and prevent it are required [109]. Clustering algorithms are applied to identify the group of highly suspected fraudulent customers in a first stage, while a one-class classification technique is harnessed to detect the true energy theft [60]. In [109], energy theft detection is addressed by a gradient boosting machine (GBM), where feature engineering based preprocessing is highlighted. Energy-intensive appliance detection can greatly help demand response programs design mechanisms that target customers to curtail the load of high-consumption appliances, e.g., water pump operation detection [62].
Grid Resilience Management. Transmission line outages are mainly caused by lightning, storms, bushfires and similar events, which can lead to a large number of reliability issues. To improve the resilience of the energy system, predicting the potential risk of natural disasters is an effective way to prevent catastrophic consequences before they happen. A general regression neural network (GRNN) [47] is designed to predict lightning outages of transmission lines. In [110], Pearson correlation analysis is used to study the strength of the connection between weather parameters and distribution system failures, while regression analysis is applied to predict the damage caused by storms.

G. Image Processing Tasks
Given the significantly increased accuracy of image classification, CNN based deep learning algorithms are among the most effective ways to automatically capture the important features of images for different tasks. In [63], whole-sky images are utilized for the PV generation forecasting problem by using the ConvLSTM technique, as shown in Fig. 5.
Power grid assets suffer from corrosion and aging to varying degrees over the years. Mask R-CNN is harnessed to perform automatic image-based corrosion monitoring of steel transmission towers [111] and automatic inspection of aging electrical distribution poles [112], which can significantly reduce human inspections.

H. Battery Management System
Battery SOC/SOH. An effective battery management system is critical for ensuring battery safety, where the state of charge (SOC) and the state of health (SOH) are essential for battery monitoring, thermal management, and charging/discharging management. In [64], a deep fully convolutional neural network is leveraged to estimate a Li-ion battery's SOC, where the learning rate is strategically optimized to outperform recurrent models, e.g., LSTM and GRU. A hierarchical ensemble model [66] and a hybrid data-driven ensemble learning method [113] are developed for Li-ion battery SOH estimation, where the single-layer ELM enables online estimation.
Lithium Battery Fault Diagnosis. Lithium batteries have been widely applied in electric vehicles, distributed energy storage, and large-scale energy storage thanks to their high power density and long life-cycle. However, battery failures such as overcharge, over-discharge, and overheating reduce the reliability of the battery system and may also cause safety problems. Therefore, the diagnosis and elimination of faults are of great significance for reducing the failure risk of batteries.
The model-based fault diagnosis uses the residual between the model estimated value and the measured value to determine the fault type. The battery model mainly includes mechanism models, equivalent circuit models, and data-driven models. Among the three models, the equivalent circuit models are frequently used, as the mechanism model is too complex and the extrapolation performance of the data-driven model is poor. The equivalent circuit model usually applies the parameter estimation or state estimation to realize fault diagnosis [114].
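To make the residual idea concrete, the sketch below compares measured terminal voltages with a very simple internal-resistance (Rint) equivalent circuit model and flags samples whose residual exceeds a threshold. The model form, the parameter values (`ocv`, `r0`), the threshold and the measurements are all illustrative assumptions rather than the method of [114].

```python
def terminal_voltage(ocv, current, r0):
    """Rint equivalent circuit: V = OCV - I * R0 (discharge current positive)."""
    return ocv - current * r0

def diagnose(measured_v, currents, ocv=3.7, r0=0.05, threshold=0.1):
    """Return the sample indices whose |measured - model| residual exceeds the threshold."""
    flagged = []
    for k, (v, i) in enumerate(zip(measured_v, currents)):
        residual = v - terminal_voltage(ocv, i, r0)
        if abs(residual) > threshold:
            flagged.append(k)
    return flagged

# Hypothetical measurements: the third sample shows an abnormal 0.3 V drop.
currents = [1.0, 2.0, 2.0, 1.0]     # A
measured = [3.65, 3.61, 3.30, 3.66]  # V
print(diagnose(measured, currents))
```

In practice the open-circuit voltage depends on the state of charge and the parameters drift with aging, which is why parameter or state estimation is usually wrapped around such a model.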
Model-free fault diagnosis methods can generally be divided into two categories: statistical analysis and data-driven fault diagnosis methods. Statistical measures such as information entropy and the normal distribution are usually used in the statistical analysis methods [115]: the measured current, voltage, and temperature signals are analyzed, and reasonable thresholds are set to achieve fault diagnosis. The data-driven methods usually use machine learning and a multi-model fusion strategy for fault detection [116]. Data-driven algorithms include random forest (RF) and principal component analysis (PCA), which are used for detecting battery internal short-circuit faults [117]. The multiclass relevance vector machine is also used for internal short-circuit faults in batteries [118]. A backpropagation neural network (BPNN) is deployed in [119] to predict the current and thus to diagnose the external short-circuit fault of the battery. The long short-term memory recurrent neural network (LSTM) is used to predict battery voltage for early fault warning [120].
The Gaussian distribution and a multi-level screening strategy are applied to detect abnormal voltage fluctuations [121]. For battery-pack fault diagnosis, the K-means clustering algorithm and a 3σ screening approach are exploited to detect and locate the abnormal cells [122].
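Assuming the screening step follows the common three-sigma rule, abnormal cells can be located by flagging voltages that lie more than three standard deviations from the pack mean. The sketch below is a generic illustration with hypothetical cell voltages; it does not reproduce the procedure of [122].

```python
import statistics

def three_sigma_outliers(voltages, n_sigma=3.0):
    """Return indices of cells whose voltage deviates from the pack mean
    by more than n_sigma population standard deviations."""
    mu = statistics.mean(voltages)
    sigma = statistics.pstdev(voltages)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(voltages) if abs(v - mu) > n_sigma * sigma]

# Hypothetical pack of 20 cells; the last cell sits about 0.5 V below the others.
cells = [3.70, 3.71, 3.69, 3.70, 3.72, 3.68, 3.70, 3.71, 3.69, 3.70,
         3.70, 3.71, 3.69, 3.70, 3.72, 3.68, 3.70, 3.71, 3.69, 3.20]
print(three_sigma_outliers(cells))
```

The rule needs a reasonably large pack: with ten or fewer values, no single point can lie more than three population standard deviations from the mean, so a robust statistic or a clustering step is the safer choice for small packs.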
Fuel Cell Fault Diagnosis. Proton exchange membrane fuel cells (PEMFC) are considered to be the preferred candidate for future power sources because of their high efficiency, zero emissions, and high energy density. They have been widely used in stationary, portable, and transportation applications. However, low reliability and poor durability block their large-scale commercialization. Studies have shown that the fault diagnosis of fuel cells is of great significance to their stable operation and durability improvement.
Model-based fault diagnosis can be divided into analytical model, gray-box model, and black-box model based methods. The analytical model can describe the internal electrochemical reaction of the fuel cell, but it is complicated and difficult to apply online. For gray-box model based methods, parameter identification, observer, and equivalent space methods are usually used to diagnose faults [123]. For black-box model based methods, neural networks, fuzzy logic, and support vector machines (SVM) are usually used [124].
Model-free fault diagnosis mainly uses signal processing and machine learning algorithms to detect faults. Fast Fourier Transform (FFT) is used in [125] to obtain spectrum information for flooding fault determination. Discrete wavelet transform (DWT) is applied to the PEMFC output voltage to diagnose membrane drying and flooding faults [126]. The performance of K-means clustering, the Gaussian mixture model (GMM), and SVM is studied on PEMFC water management faults [127]. An extreme learning machine (ELM) is deployed to identify PEMFC system failures [128]. To detect unknown system faults, a spherical-shaped multiple-class SVM (SSM-SVM) is adopted [129].
Overall, PEMFC is a power generation device and the lithium battery is an energy storage device. The difference in operating principles results in different types of faults. Despite the fault diagnosis methods of these two devices being similar, the model-based methods are frequently used for lithium battery fault detection whereas the data-driven methods are preferred for PEMFC.

IV. INTELLIGENT DECISION-MAKING IN ENERGY INTERNET
In this section, state-of-the-art methodologies by which AI can make decision-making in the Energy Internet more intelligent are discussed in three categories: knowledge based optimization, the smart Predict-Then-Optimize approach, and data-driven frameworks for solving various optimization problems in energy systems.

A. Knowledge Based Optimization
Linear Programming. Linear programming has a long history of use in various business applications, e.g., long-term energy planning with renewable resources [130] and EV charging/discharging management [150].
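As a small worked example of a linear program in this setting, consider charging one EV at minimum cost: minimize sum(p_t * x_t) subject to sum(x_t) = E and 0 <= x_t <= P_max. For this particular structure (one coupling constraint plus box bounds), filling the cheapest hours first is optimal, so the sketch below solves the LP exactly without a solver. The prices, energy requirement and charger limit are hypothetical and are not taken from [150].

```python
def schedule_charging(prices, energy_needed, max_rate):
    """Solve  min sum(p_t * x_t)  s.t.  sum(x_t) = energy_needed, 0 <= x_t <= max_rate.
    With a single equality constraint and box bounds, cheapest-hours-first is optimal."""
    if energy_needed > max_rate * len(prices):
        raise ValueError("infeasible: not enough hours to deliver the energy")
    x = [0.0] * len(prices)
    remaining = energy_needed
    for t in sorted(range(len(prices)), key=lambda t: prices[t]):
        x[t] = min(max_rate, remaining)
        remaining -= x[t]
        if remaining <= 0:
            break
    return x

prices = [0.30, 0.12, 0.10, 0.25]  # hypothetical $/kWh for four hours
plan = schedule_charging(prices, energy_needed=10.0, max_rate=7.0)
cost = sum(p * x for p, x in zip(prices, plan))
print(plan, round(cost, 2))
```

Once network limits, discharging or several vehicles are added, the greedy shortcut no longer applies and a general LP solver would be the natural tool.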

Bilevel Optimization. A bilevel program is an optimization model that is subject to constraints generated by another optimization problem [131]. In [151], a bilevel program for coordinating EV charging requirements is developed, where the aggregator determines charging retail prices through a mathematical program with equilibrium constraints in order to optimize its decision-making.
Nonlinear Optimization. Apart from linear programming, all mathematical programming models are categorized as nonlinear optimization, e.g., the optimal power flow (OPF) model [133]. In [152], a mixed-integer linear programming (MILP) method is introduced to provide worst-case guarantees for deep neural network predictions in terms of maximum constraint violations, which enables a significant reduction of the computational time for the OPF model. In [153], Lyapunov optimization is applied to develop an online energy management strategy for the operation and control of microgrids that takes the power flow constraints into account and can adapt to uncertain environments in real time.
Non-cooperative Game. To model the competition between individual players, the non-cooperative game is proposed to capture each player's behaviour. In [154], a sparsity-promoting game model is developed for the load shifting problem in residential demand side management, where an accelerated distributed optimization method is studied for Nash equilibrium computation.
Stackelberg Competition. To develop pricing mechanisms for DERs and DR, a Stackelberg game model is proposed for integrating renewable energy resources (RES) [135]. To enable peer-to-peer (P2P) energy trading among different prosumers, a similar competition model is introduced for pricing demand response and RES in an energy sharing community [155].
Stochastic Programming. Stochastic programming refers to optimization models in which some of the parameters are uncertain but follow known probability distributions. In [156], a two-stage stochastic game model is employed for modeling energy trading, where the risk arising from uncertainty is measured by the conditional value at risk (CVaR) [157].
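For readers unfamiliar with CVaR, the following sketch computes a simple empirical estimate from equally likely loss scenarios: the average of the worst (1 - alpha) share of outcomes. The scenario values and the function name are assumptions for illustration; the formulation used in [156] and [157] may differ.

```python
import math

def cvar(losses, alpha=0.95):
    """Empirical CVaR for equally likely scenarios: the mean of the worst
    (1 - alpha) share of losses, using at least one scenario."""
    ordered = sorted(losses, reverse=True)
    # round() guards against floating-point noise such as 20 * (1 - 0.95) = 1.0000000000000009
    n_tail = max(1, math.ceil(round(len(ordered) * (1 - alpha), 9)))
    return sum(ordered[:n_tail]) / n_tail

# Hypothetical trading losses ($) under ten equally likely scenarios.
scenario_losses = [10, 12, 8, 30, 11, 9, 50, 10, 12, 11]
print(cvar(scenario_losses, alpha=0.8))  # average of the two worst scenarios
```

Unlike the expected loss, this measure reacts only to the tail, which is why it is often added as a risk term to a stochastic program.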
Robust Optimization (RO). RO does not require the probability distributions of uncertain data to be known in advance; instead, the main assumption is only that the uncertain data belong to an uncertainty set. A two-stage robust optimization model for maximizing the total profit of the microgrid, which considers the operational and maintenance costs of energy storage, wind turbines and PVs as well as the transaction cost with the main grid, is investigated in [158].
Integer Programming. Quadratic utility functions with integer constraints are widely used to model the economics of energy system operation, e.g., unit commitment (UC) [159] and economic dispatch [160]. In [138], an outer approximation method is applied to UC through a decoupling technique that divides the problem into a nonlinear programming problem and a MILP problem.

Category | Method | Year | Ref. | Application | Summary
 |  |  |  |  | The optimal operation of the community battery is controlled by the double deep Q-learning method.
Distributed control | Proximal policy optimization | 2020 | [144] | Capacity scheduling of PV-Battery system | A proximal policy optimization-based DRL is studied to perform the capacity scheduling of PV-BSS where the method can readily leverage continuous action space and determine the specific amount of charging/discharging.
Policy gradient method | DQN & DPG | 2018 | [145] | Smart building energy scheduling | The energy scheduling problem for building loads is solved by online learning.
HVAC control | DDPG | 2020 | [146] | Residential load management | The noncooperative stochastic game is introduced to model interactions between households and the electricity price.
Distributed DRL | Multi-agent DDPG | 2020 | [147] | Cooperative load frequency control | A data-driven method for cooperative load frequency control is developed for the multi-area power system in continuous action domain.
MA-DRL | DDPG+PER strategy | 2019 | [148] | Strategic bidding | A bilevel optimization for bidding in electricity market is solved by DRL in multi-dimensional continuous state and action spaces which enables market participants with more profitable bidding decisions.
Deregulated electricity market | Actor-critic model | 2020 | [149] | Frequency regulation by battery | A DRL-based data-driven approach is proposed for optimal frequency control by BESS which involves a cost model considering battery cycle aging cost, unscheduled interchange price, and generation cost.

B. End-to-end Learning Approach for Predict-Then-Optimize
Unlike the standard predict-then-optimize paradigm, the end-to-end approach leverages the structure of the optimization problem, including its objective and constraints, to design a better prediction loss function that measures the decision error induced by the prediction algorithm [161]. By incorporating stochastic programming [162], a task-based end-to-end learning technique is designed for the generator scheduling problem and battery charging/discharging management. In [139], a model-free end-to-end learning model is proposed for the problem of stochastic economic dispatch based on forecasting; by exploiting the structure of the optimization problem, it avoids solving the stochastic economic dispatch problem at each iteration.
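The difference between a prediction loss and a decision loss can be seen in a toy dispatch problem. In the sketch below, generation is committed according to the forecast and any shortfall must be bought at a much higher reserve price, so under-forecasting is far more costly than over-forecasting even though both errors have the same squared error. The cost coefficients and function names are hypothetical, and this is not the loss function of [161].

```python
C_GEN = 20.0     # hypothetical generation cost per MWh committed
C_SHORT = 100.0  # hypothetical reserve cost per MWh of shortfall

def dispatch(forecast):
    """Downstream decision: commit exactly the forecast demand."""
    return forecast

def cost(committed, demand):
    """Cost of a commitment once the true demand is revealed."""
    return C_GEN * committed + C_SHORT * max(demand - committed, 0.0)

def decision_regret(forecast, demand):
    """Extra cost incurred by acting on the forecast instead of the true demand."""
    return cost(dispatch(forecast), demand) - cost(dispatch(demand), demand)

def squared_error(forecast, demand):
    return (forecast - demand) ** 2

demand = 100.0
for forecast in (90.0, 110.0):
    print(forecast, squared_error(forecast, demand), decision_regret(forecast, demand))
```

Both forecasts have a squared error of 100, yet the regret is 800 for the under-forecast and 200 for the over-forecast; a loss built on regret would therefore steer the predictor toward the cheaper kind of mistake.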

C. Data-driven Optimization
As demonstrated in Section IV-A, stochastic optimization can address various uncertainties of RES and load demands if their probability distributions are known. Robust optimization merely requires the support of the uncertain parameters in the mathematical model. However, it is often either too conservative or computationally intractable [163], and decisions produced by maximizing the worst-case pay-off may perform poorly in practice. Moreover, although the exact probability distribution of the parameters may be unknown, rich empirical data can usually be collected for decision-making. In some cases, it is also difficult to obtain exact models of DGs, DSs, load demands and their parameters. It is therefore essential to develop an intermediate approach that incorporates partial knowledge of the model together with the available data and could perform better in practical cases. Deep reinforcement learning is effectively trial-and-error learning for self-improvement: the agent is deployed for energy system operation with no prior understanding of how the system works and is trained extensively in a data-driven manner through reward mechanisms.
Q-learning. An automatic demand response strategy for a single building is investigated in [164]; thanks to the model-free nature of reinforcement learning, it requires no explicit knowledge of the customers' dissatisfaction with job rescheduling. Although the computational complexity of the automated control system is linear in the number of device clusters, smart grids will suffer from the shifted peak problem if many consumers adopt a similar automatic control system. This can be mitigated by randomizing the automatic decisions to some extent or by coordinating users in an integrated management system. In [140], future electricity prices are forecast by artificial neural networks, and the non-shiftable, shiftable, and controllable loads are scheduled hours ahead; however, the uncertainty of appliance energy usage patterns and working time, as well as network congestion in the demand response, are ignored. In [165], 50 electric water heaters with temperature sensors were used in lab experiments to show the efficiency of scheduling with Q-learning.
The approach in [166] requires full information about other customers or services as part of the input. Instead of considering each customer independently, it is of great importance to learn over the entire smart grid system and find its optimal energy consumption schedule. Besides, thermostatically controlled loads such as water heaters or heat pumps are scheduled by Q-learning without expert knowledge of the system dynamics or solutions [167]. In [168], the demand response problem is transformed into a stochastic Stackelberg game, and the optimal strategy is learned with fewer user responses to preserve user privacy.
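A minimal sketch of tabular Q-learning for a single shiftable load may help to fix ideas. The agent decides each hour whether to wait or run the appliance; it pays the hourly price when it runs and a penalty if it never does. The prices, penalty, hyperparameters and function names are assumptions for illustration and do not correspond to any of the cited studies.

```python
import random

PRICES = [0.30, 0.10, 0.25, 0.40]  # hypothetical hourly prices
MISS_PENALTY = 1.0                 # hypothetical cost of never running the load
WAIT, RUN = 0, 1

def step(hour, action):
    """Return (reward, next_hour); next_hour is None when the episode ends."""
    if action == RUN:
        return -PRICES[hour], None
    if hour == len(PRICES) - 1:
        return -MISS_PENALTY, None
    return 0.0, hour + 1

def train(episodes=5000, alpha=0.1, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in PRICES]  # Q-table indexed as q[hour][action]
    for _ in range(episodes):
        hour = 0
        while hour is not None:
            if rng.random() < epsilon:
                action = rng.choice((WAIT, RUN))
            else:
                action = max((WAIT, RUN), key=lambda a: q[hour][a])
            reward, nxt = step(hour, action)
            # Q-learning target: immediate reward plus best value of the next state.
            target = reward if nxt is None else reward + gamma * max(q[nxt])
            q[hour][action] += alpha * (target - q[hour][action])
            hour = nxt
    return q

def greedy_run_hour(q):
    """Follow the greedy policy and return the hour at which the load is run."""
    hour = 0
    while hour is not None:
        action = max((WAIT, RUN), key=lambda a: q[hour][a])
        if action == RUN:
            return hour
        _, hour = step(hour, action)
    return None

q_table = train()
print(greedy_run_hour(q_table))
```

The agent never sees the price vector directly; it only observes rewards, which is the model-free property emphasized above. The table has one entry per state-action pair, which also hints at why discretization becomes a bottleneck for larger problems.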
Deep Q-learning Network (DQN). Building the Q-table in Q-learning requires discretizing the state and action spaces, which increases the computation time and heavily influences the performance. To address this issue, DQN approximates the Q-table by deep neural networks so that the state space can be continuous. In [141], electric vehicle charging tasks are scheduled by DQN based on the forecasted electricity price. In [169], DQN is used to navigate electric vehicles to charging stations using price and traffic information. However, these studies ignore the overestimation problem in deep reinforcement learning.
Twin Delayed Deep Deterministic (TD3) policy gradient algorithm. Policy gradient [145] is another popular class of model-free reinforcement learning algorithms. DQN and DPG (Deep Policy Gradient) were utilized in [170] to implement online optimization of building loads. The difficulty of discretizing the action space in DQN and DPG remains, especially when the number of appliances and households is large. DDPG (Deep Deterministic Policy Gradient) was adopted to tackle the scalability issue of the smart grid in [171], [146]. In [172], a knowledge-assisted DDPG is introduced to the cooperative wind farm control framework for maximizing the total power generation under uncertainty. However, the overestimation of Q values in DQN and DDPG, which affects convergence, cannot be ignored. Compared with DDPG, TD3 [173] adds two additional neural networks and uses the minimum of the estimated Q values in the target networks.
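The way TD3 counteracts overestimation can be illustrated by the target value alone. With two target critics, the bootstrap term uses the smaller of the two estimates, whereas DDPG relies on a single critic. The sketch below shows only this clipped double-Q target; target policy smoothing and delayed actor updates, which TD3 also uses, are omitted, and the numbers are hypothetical.

```python
def ddpg_target(reward, done, q_next, gamma=0.99):
    """Single-critic bootstrap target: y = r + gamma * (1 - done) * Q'(s', a')."""
    return reward + gamma * (1.0 - float(done)) * q_next

def td3_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target: y = r + gamma * (1 - done) * min(Q1', Q2')."""
    return reward + gamma * (1.0 - float(done)) * min(q1_next, q2_next)

# Hypothetical critic outputs for the same next state-action pair.
q1_next, q2_next = 10.0, 8.0
print(ddpg_target(1.0, False, q1_next))          # uses the optimistic estimate
print(td3_target(1.0, False, q1_next, q2_next))  # uses the more pessimistic one
```

Taking the minimum biases the target downward, which trades a small underestimation for protection against the compounding overestimation described above.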
Soft Actor-Critic (SAC). The main characteristic of SAC, motivated by DDPG, is entropy regularization [174], which relies on a stochastic policy. Introducing entropy into the reinforcement learning algorithm has several benefits: the policy is kept as random as possible, the agent explores the state space more fully and avoids falling into a local optimum early, and multiple feasible solutions can be explored to complete the specified task, which improves the anti-interference ability for optimal energy management [175]. The goal of the SAC algorithm is to maximize the expected return while simultaneously maximizing the entropy of the policy, i.e., keeping the policy as stochastic as possible. This resembles the trade-off between exploration and exploitation, since greater entropy encourages exploration and helps prevent the policy from converging prematurely to a local optimum.
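The entropy-regularized goal described above is commonly written as the following maximum-entropy objective. The notation is an assumption introduced here for illustration: \pi is the policy, \rho_\pi the state-action distribution it induces, r the reward, \mathcal{H} the entropy, and \alpha a temperature coefficient that weights entropy against reward.

```latex
J(\pi) = \sum_{t=0}^{T} \mathbb{E}_{(s_t, a_t) \sim \rho_\pi}
  \left[ r(s_t, a_t) + \alpha \, \mathcal{H}\!\left( \pi(\cdot \mid s_t) \right) \right],
\qquad
\mathcal{H}\!\left( \pi(\cdot \mid s_t) \right)
  = - \mathbb{E}_{a \sim \pi(\cdot \mid s_t)} \left[ \log \pi(a \mid s_t) \right]
```

Setting \alpha = 0 recovers the usual expected-return objective, while a larger \alpha rewards more stochastic policies and hence more exploration.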

V. CYBERSECURITY FOR AI IN ENERGY INTERNET
In this section, cybersecurity issues associated with different energy systems are investigated by the vulnerability analysis of AI system in EI, the privacy preserving machine learning technique and blockchains, respectively.

A. Vulnerability of AI System in EI
Load Forecasting under Cyberattacks. Machine learning based AI systems in EI are vulnerable to various cyberattacks. In [176], it is shown how forecasting algorithms can be attacked without any prior knowledge, where black-box attacks generate hard-to-detect false data for the forecasting algorithms. In [177], adversarial attacks on energy theft detection algorithms are investigated by black-box iterative cyberattacks, where the case study shows that these attacks can easily evade classic anomaly detection algorithms. Although countermeasures to load forecasting under cyberattacks have been developed with machine learning techniques [178], strategically designed adversarial attacks on load forecasting can cause either increased operational costs or undesired load shedding.
Smart Home under Cyberattacks. Smart home Wi-Fi IoT devices are prevalent nowadays and bring significant energy efficiency improvements to our daily lives. However, they are also an attractive target for adversaries seeking to launch attacks. In [179], 29 of 40 popular smart home Wi-Fi IoT devices are shown to exhibit security vulnerabilities. More specifically, 23 of the 40 tested devices could communicate without any encryption, and 6 of the remaining 17 devices accepted forged server certificates. Based on the discovered vulnerabilities, an adversary could launch man-in-the-middle (MITM) attacks and, through them, steal the advanced encryption standard (AES) key of a smart camera and decrypt the recorded video streams. To address the discovered vulnerabilities, a solution that protects IoT communications by encryption at the smart home Wi-Fi router is developed.

B. Privacy Preserving Machine Learning
Privacy concerns are the major obstacle for many AI applications that involve individuals' personal information.
For the big data analysis of smart meter data, a high-resolution dataset, e.g., with 1-minute sampling, could disclose much private information about a consumer's daily routine, including occupancy, habits and individual preferences for different appliances. Yet these big data analytic tasks on smart meter data have great potential to save energy and mitigate carbon emissions through improved energy efficiency.
Federated Learning. To protect data privacy, federated learning [185] trains a shared learning model across multiple decentralized edge devices, preserving privacy by keeping data samples local rather than exchanging them, as opposed to training the model on one server with the aggregated dataset. In [183], the load data collected by edge devices are leveraged to train a shared model by federated learning so that privacy is maintained locally.
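The training loop behind this idea can be sketched with federated averaging on a one-parameter model. Each client takes a few gradient steps on its own data, and the server only receives the updated weight and the sample count, never the raw readings. The model (load proportional to a single feature), the data and the hyperparameters are assumptions for illustration and are not the setup of [183].

```python
def local_update(w, data, lr=0.01, epochs=5):
    """A few gradient-descent steps on one client's (x, y) pairs for the model y = w * x."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def federated_averaging(clients, rounds=50, w0=0.0):
    """Server loop: broadcast w, collect local weights, average them by sample count."""
    w = w0
    for _ in range(rounds):
        updates = [(local_update(w, data), len(data)) for data in clients]
        total = sum(n for _, n in updates)
        w = sum(w_i * n for w_i, n in updates) / total
    return w

# Hypothetical local datasets held by two smart meters (feature, load), with load = 2 * feature.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0), (4.0, 8.0), (5.0, 10.0)],
]
w_global = federated_averaging(clients)
print(round(w_global, 4))
```

The shared weights can still leak some information about local data, which is one reason federated learning is often combined with differential privacy.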
Differential Privacy. Although distributed computation can partially address this issue, a further technique is required to overcome the obstacle. Differential privacy makes it possible for the energy system to acquire and share aggregated information about customers' daily habits while maintaining each customer's privacy. In [180], NILM is implemented in a compressive sensing framework that incorporates differential privacy to achieve a trade-off between the inference accuracy and the privacy sensitivity level. In [181], differentially private load forecasting is demonstrated with TensorFlow Privacy, where the privacy issue is addressed by the Laplace mechanism of differential privacy. Load characteristic clustering is widely applied in different big data analytic tasks of the energy system, e.g., forecasting, which raises privacy concerns. In [182], a privacy preserving load clustering algorithm is designed using differential privacy.
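As a minimal sketch of the Laplace mechanism, the code below releases the mean of bounded smart meter readings with noise calibrated to the query's sensitivity. The bounds, readings, privacy budget `epsilon` and function names are hypothetical; this is a textbook construction and not the implementation used in [181].

```python
import random

def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_mean(readings, lower, upper, epsilon, rng):
    """Release the mean of readings clipped to [lower, upper] with epsilon-DP Laplace noise."""
    clipped = [min(max(r, lower), upper) for r in readings]
    # Changing one record moves the mean of n bounded values by at most (upper - lower) / n.
    sensitivity = (upper - lower) / len(clipped)
    return sum(clipped) / len(clipped) + laplace_noise(sensitivity / epsilon, rng)

rng = random.Random(0)
readings_kw = [1.2, 0.8, 3.5, 2.1, 0.4, 1.9]  # hypothetical household demands
print(private_mean(readings_kw, lower=0.0, upper=5.0, epsilon=1.0, rng=rng))
```

A smaller `epsilon` gives stronger privacy but noisier answers, which is the accuracy-privacy trade-off mentioned above.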

C. Blockchains
Blockchains are distributed ledgers that enable secure storage of digital transactions without a central authority. Importantly, the automatic execution of smart contracts can be implemented by blockchain technology in a P2P energy trading network [186]. Instead of operating the power ledger in one single trusted centre, each individual prosumer can hold a copy of the recorded renewable energy supply and consumption chain, which can be used to reach a consensus on the valid state of the power ledger. New energy trading transactions are linked to previous ones by cryptography, which makes a P2P energy trading system secure and resilient. Each prosumer can validate all transactions independently, facilitating a transparent, trustworthy, and tamper-proof energy trading procedure. Nevertheless, a blockchain based energy trading system can still suffer from cyberattacks. In [184], a new blockchain framework for energy trading among neighbors is designed, which uses deep learning as a countermeasure against cybersecurity threats to avoid attacks and identify fraudulent transactions.
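The cryptographic linking of transactions can be illustrated with a toy hash chain in which every block stores the hash of its predecessor, so that altering an old trade invalidates the chain. The block layout and trade fields are assumptions for this example; consensus, digital signatures and smart contracts, which a real energy trading blockchain needs, are deliberately left out.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64

def block_hash(block):
    """SHA-256 over the block's index, previous hash and trade payload."""
    payload = json.dumps(
        {key: block[key] for key in ("index", "prev_hash", "trade")}, sort_keys=True
    )
    return hashlib.sha256(payload.encode()).hexdigest()

def append_trade(chain, trade):
    """Link a new trade to the current tip of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS_HASH
    block = {"index": len(chain), "prev_hash": prev, "trade": trade}
    block["hash"] = block_hash(block)
    chain.append(block)

def is_valid(chain):
    """Any prosumer holding a copy can recompute the hashes and check every link."""
    prev = GENESIS_HASH
    for block in chain:
        if block["prev_hash"] != prev or block["hash"] != block_hash(block):
            return False
        prev = block["hash"]
    return True

ledger = []
append_trade(ledger, {"seller": "A", "buyer": "B", "kwh": 3.0})  # hypothetical trades
append_trade(ledger, {"seller": "C", "buyer": "A", "kwh": 1.5})
print(is_valid(ledger))
```

Changing the `kwh` value of the first trade after the fact makes `is_valid` return `False`, because the stored hash no longer matches the recomputed one.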

VI. FUTURE GRID
In this section, the potential research areas relevant to AI technologies will be discussed for applications in EI.

A. Human-centric AI
Despite these advances, many AI application areas are still fairly immature in the EI system. There are increasing concerns about persistent ethical problems around AI technology, e.g., prejudice and bias. More specifically, the most immediate concern is that workers would be replaced by AI systems across an extensive range of industry sectors. What is more, the question persists whether AI-enabled robots will replace or enslave humans. As shown in Fig. 6, there is no argument that in certain situations a robot can be much more efficient than a human being. More controversially, some fear that mankind will be enslaved by AI in the future.
However, human-centric AI has been introduced as a concept that will benefit both individuals and society as a whole. Its objective is to support and empower humans by incorporating human-centric design with appropriate digital ethical standards and social values, which contribute to non-discrimination through explainable AI, autonomy through hyperautomation, and privacy protection through AI for cybersecurity. We argue that AI is an enabler of a smart future, not only toward carbon neutrality but also toward the UN's SDGs, provided that we insist on human-centric design.

B. The Growing Role Of Explainable AI (XAI)
Most popular deep learning techniques are unexplainable black boxes, which creates a barrier for the system operator in understanding the results, as the engineering practice of the energy system requires a high standard of reliability [187].
First of all, although forecasting tasks such as load forecasting and renewables prediction can be addressed with better performance by newly developed deep learning techniques, system stakeholders want to know what happened, why it happened, what the prediction is for, and how to make better use of the results. In fact, prosumers can harness this information to get the best out of P2P energy trading by understanding what actually impacts the load and the electricity price.
Moreover, explainable AI is crucial for anomaly detection in the EI system. If a customer is flagged for electricity theft by an intelligent algorithm, the distribution system operator would not simply disconnect this customer's power, but would rather conduct an investigation and validate the suspicion with explanations produced by XAI.
In addition, decision-making based on deep reinforcement learning should also be explainable. If an auditor is validating the rejection of an insurance claim that a DRL algorithm has flagged as anomalous, the auditor can pinpoint the specific misinformation with XAI. This makes it much easier to reach a final decision than simply saying "This is fraud, so we refuse to pay it".

C. The Rise of Hyperautomation
Hyperautomation is the concept that most applications within the Energy Internet can be automated, as in self-driving cars. The COVID-19 pandemic is accelerating the adoption of this concept, which is regarded as the future of digital automation.
Energy market operations should be automated by explainable AI techniques, to acquire useful knowledge of energy supply and demand in the market, and by deep reinforcement learning, to achieve optimal energy trading and consumption. Furthermore, by leveraging IoT techniques, fault diagnosis of the energy system can automatically collect important data, identify faults at an early stage, and take the right actions to prevent system failure through robots and drones running new deep learning algorithms. For hyperautomation initiatives to be successful, the automated EI should adapt to uncertainty and respond promptly to unexpected circumstances. The decision-making of energy market operation will only be strengthened when long-term operational cost reduction methodologies can be leveraged with minimal human interaction.
Tesla's Autobidder platform [188] exemplifies hyperautomation: it enables various DERs, operating as VPPs, to interact autonomously with the Australian electricity market to provide energy arbitrage and ancillary services.

D. Increased Use Of AI For Cybersecurity Applications
Prevention of cyberattacks on EI system is a never-ending competition which requires constant updates to the defending technology to keep pace with ever-evolving cybersecurity threats such as malware, FDI attack, denial-of-service (DDoS) attack and so forth [189]. AI technologies can be employed to improve the performance of threats identification. However, state-of-the-art AI technologies are also vulnerable to the cybersecurity attack which is referred as artificial intelligence attack.
Unlike traditional cybersecurity issues that are caused by system vulnerability, AI attacks can be created by inherent limitations in the deep learning algorithms that cannot be debugged at present. For example, load forecasting algorithm can be attacked by false information [177].
Privacy concerns are another critical issue that has hindered progress in applications based on smart meter data analysis. Combined with differential privacy techniques, federated learning can provide a practical solution to this issue. Only when privacy by design (PbD) is widely adopted in business practice can data-driven applications such as smart home demand side management be implemented in practice.
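As a rough illustration of how differential privacy can be combined with federated aggregation, the sketch below lets each smart meter clip its local value and add Laplace noise before sharing it, so that the server only ever sees noisy contributions and averages them. This is a toy example under stated assumptions rather than a method taken from the reviewed literature: each household shares a single scalar (its average load) instead of a model update, and the clipping bound `max_value`, the privacy parameter `epsilon`, the number of households and all function names are hypothetical.

```python
import random


def clip(value, lower, upper):
    """Bound a value so that one household's contribution has limited sensitivity."""
    return max(lower, min(upper, value))


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize_update(local_value, max_value, epsilon, rng):
    """Clip to [0, max_value], then add Laplace noise with scale max_value / epsilon."""
    clipped = clip(local_value, 0.0, max_value)
    return clipped + laplace_noise(max_value / epsilon, rng)


def federated_average(noisy_updates):
    """Server-side aggregation: it receives noisy values only, never raw readings."""
    return sum(noisy_updates) / len(noisy_updates)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
# Hypothetical average household loads in kW.
household_loads_kw = [rng.uniform(0.2, 3.0) for _ in range(2000)]
noisy_updates = [
    privatize_update(load, max_value=5.0, epsilon=1.0, rng=rng)
    for load in household_loads_kw
]
true_mean = sum(household_loads_kw) / len(household_loads_kw)
private_mean = federated_average(noisy_updates)
print(round(true_mean, 3), round(private_mean, 3))
```

Each noisy value on its own reveals little about the household, while the noise largely cancels in the average over many households. A smaller `epsilon` gives stronger privacy at the cost of a noisier aggregate.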

E. Digital Twin for Data Analysis and System Operation
The digital twin is not a new concept: it was introduced by NASA in 2010 to improve physical model simulation of spacecraft [190]. Its renaissance is coupled with advancements in AI, which significantly increase the value of this virtual design tool by creating virtualized models based on historical data and the laws of physics rather than merely on the design configuration.
With the high penetration of DERs, the energy system carries a wide variety of spatio-temporal information, which complicates decision-making. Graph neural networks could potentially exploit this spatio-temporal information to produce real-time solutions across energy system operations. In [191], the EV charging planning problem is solved by a multi-graph neural network based approach that allocates investment in a competitive environment. The digital twin, which can generate an evolving profile from power generation to power delivery, is able to transform power industry operations, create additional business opportunities, and present new insights into the performance of all assets.
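To indicate how a graph neural network uses network topology, the sketch below performs one message-passing step on a small graph: each node combines its own feature with the mean of its neighbours' features and applies a ReLU. It is a simplified illustration and not the multi-graph approach of [191]. The three-bus feeder, the scalar node features and the fixed weights `w_self` and `w_neigh` are hypothetical; in a trained model the weights would be learned and the features would be vectors.

```python
def relu(x):
    """Rectified linear activation."""
    return x if x > 0.0 else 0.0


def message_passing_step(features, neighbors, w_self, w_neigh):
    """One mean-aggregation message-passing layer with scalar node features."""
    updated = {}
    for node, value in features.items():
        nbrs = neighbors.get(node, [])
        # Aggregate neighbour information; isolated nodes receive no message.
        mean_nbr = sum(features[n] for n in nbrs) / len(nbrs) if nbrs else 0.0
        updated[node] = relu(w_self * value + w_neigh * mean_nbr)
    return updated


# Hypothetical three-bus feeder: A - B - C.
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
# Hypothetical node feature: load at each bus in MW.
load_mw = {"A": 1.0, "B": 2.0, "C": 4.0}

hidden = message_passing_step(load_mw, neighbors, w_self=0.5, w_neigh=0.5)
print(hidden)
```

Stacking several such steps lets information from distant buses reach each node, which is how spatial structure enters the model. Temporal structure would require additional components, such as feeding a window of past measurements as node features.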

VII. CONCLUSION
To conclude, AI technologies can clearly benefit Energy Internet operations, multi-energy markets and end consumers.
They offer a variety of useful applications through big data analysis, optimal management strategies through intelligent decision-making, and better information security through vulnerability analysis and countermeasures against cyberattacks. Most importantly, however, AI technologies offer novel solutions that empower the Energy Internet to play a more active role in achieving the UN's SDGs. These technologies have enabled applications ranging from the conventional energy sector to health and finance, which has sparked a great deal of interest in academia and industry. Many research and commercial entities are currently pursuing AI-related innovation to strike the delicate balance between economic growth and environmental sustainability. All of these technologies belong to a fast-moving area of research and development; a review of these emerging technologies is therefore required to improve understanding, inform the body of knowledge on different applications and realise their potential.
The contribution of this work is a timely, academic-led review of AI-enabled applications within the general concept of the Energy Internet, such as smart energy consumption, smart health and FinTech, covering data processing, system operation and cybersecurity. Firstly, this paper reviewed academic research activities and presented an overview of the applications of AI technologies, including applications based on big data analysis, AI-enabled intelligent decision-making, and cybersecurity issues for AI systems in the Energy Internet. Next, the paper presented two renewable energy use cases along with an in-depth discussion of interpretability in probabilistic forecasting. Subsequently, a systematic review of a broad range of research activities was provided, showcasing the hot topics in which Energy Internet stakeholders and industries are pursuing innovative techniques. The review shows that these projects are still at an early stage of development, and that research is ongoing in key improvement areas that would deliver the desired high efficiency, distributed operation and security. Future research directions were also discussed for each topic.