Predictive Analytics Have the Highest Value, So Why Are Utilities Analyzing the Past?
Many utilities are realizing that the huge amount of granular data coming at them can be leveraged to tackle
their biggest challenges, such as distributed generation and engaging consumers beyond traditional power delivery.
A growing number of power companies are initiating big-data analytics projects, which include reports,
dashboards and visualizations of historical data. But will they get the most out of the data by analyzing the past?
Can slicing and dicing yesterday’s data provide them with the maximum value they so urgently need to address
emerging operational and marketing needs?
According to the technology research firm Gartner, predictive analytics can hold significantly more value for
companies than simply analyzing the past. Answering questions such as “What will happen?” and “How will we make
it happen?” instead of “What happened?” makes a big difference in value. Yet most of the analytics projects that
utilities are initiating today still answer only the last question, “What happened?”
One of the main reasons for this is the lack of input from machine-learning experts and data scientists who possess
a deep understanding of the energy markets. Another reason may be a lack of understanding of what predictive
analytics is, how it differs from descriptive analytics, and what value it can create for utilities.
So what is predictive analytics? Predictive analytics uses advanced machine learning and statistical algorithms to learn
from historical data in order to provide accurate forecasts and actionable insights with a predictive nature (defined as
“foresights” by Gartner), which enable users to be more proactive and positively influence tomorrow’s bottom-line results.
Examples for the energy industry on the operational side include outage predictions, system failure predictions, accurate
load forecasting at the meter/sub-meter and sub-hour levels for balancing supply and demand, optimizing demand response
programs, and detecting early warnings of irregularities.
On the marketing side, examples include predicting customer churn and analyzing its root causes, predicting customers’
responses to specific pricing, marketing energy-efficiency offerings, and detecting early warnings of irregularities in a
household’s electricity usage and informing the household as part of customer engagement initiatives.
All these examples of predictive analytics applications can have a huge impact on utilities’ marketing campaigns and customer
engagement, as well as customer acquisition and retention (in competitive markets). Predictive analytics applications can also
help balance the grid and reduce risky spikes during peak hours.
These benefits are attractive to utilities. However, they tend to start with descriptive analytics because predictive analytics are
deemed expensive, complex and resource-intensive. This is far from the truth. Since predictive analytics projects are usually led
by machine-learning experts, the algorithms they develop are almost always fully automated. Moreover, if we take into account
the financial impact of the predictive analytics projects, the time to reaping the full value of predictive analytics could be significantly
In cases where utilities decide to initiate predictive analytics implementations, how can they make sure they derive maximum value
out of these projects? Here are four important suggestions resulting from my direct experience working on these solutions.
Correctly define the business objective
When it comes to machine learning and data mining, defining the target function correctly is crucial. The nuances can make all the
difference. In many of the cases where we are required by utilities to provide accurate sub-hour load forecasts at the meter level,
they define the target function to be maximum accuracy. But when I ask if the penalties of overestimation and underestimation of
load are equal, the answer is “no” in many cases.
At this stage, I suggest that the target function should be maximum profit instead of maximum accuracy, since solving the problem
for maximum accuracy assumes that these penalties are symmetric. It is important to understand that these are two
different problems that will produce different results. Another example is the importance of aligning the algorithms’ definition of
accuracy to the business’ definition of accuracy, since the method by which accuracy is measured affects the results that the algorithms
produce. My suggestion when defining the objectives is to discuss them with a machine-learning expert who has an in-depth understanding
of the energy industry.
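To illustrate why the two target functions lead to different answers, here is a minimal sketch. The load values and the penalty ratio are hypothetical numbers chosen only for illustration, and `total_cost` and `best_forecast` are made-up helper names, not part of any particular product. The sketch picks a single forecast that minimizes total penalty, first with equal penalties and then with under-forecasting penalized three times as heavily as over-forecasting.

```python
def total_cost(forecast, actual_loads, under_penalty, over_penalty):
    """Total penalty of using one fixed forecast against the observed loads."""
    cost = 0.0
    for actual in actual_loads:
        error = actual - forecast
        if error > 0:
            # Under-forecast: the actual load exceeded the forecast.
            cost += under_penalty * error
        else:
            # Over-forecast (or exact hit): the forecast met or exceeded the load.
            cost += over_penalty * -error
    return cost


def best_forecast(actual_loads, under_penalty, over_penalty):
    """Return the observed value that gives the lowest total penalty."""
    candidates = sorted(set(actual_loads))
    return min(
        candidates,
        key=lambda f: total_cost(f, actual_loads, under_penalty, over_penalty),
    )


# Hypothetical sub-hour loads for one meter, in kWh (illustrative only).
loads = [10, 11, 12, 13, 14, 15, 16, 17, 30]

# Equal penalties: the best single forecast is the median of the loads.
symmetric = best_forecast(loads, under_penalty=1.0, over_penalty=1.0)

# Assumed 3:1 penalty ratio: under-forecasting costs three times more.
asymmetric = best_forecast(loads, under_penalty=3.0, over_penalty=1.0)

print("symmetric penalties  ->", symmetric)
print("asymmetric penalties ->", asymmetric)
```

With these numbers the asymmetric case settles on a higher forecast than the symmetric one, because deliberately leaning toward the cheaper kind of error lowers the total penalty even though it makes the forecast less accurate in the conventional sense.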
Start with a proof of concept based on sampled data
The best way to start a predictive analytics implementation is with a proof of concept, which consists of only a sample of the data and does
not require integration with other systems. This is a good way to learn the value that could be derived from the data with a minimum
investment of time and money.
We always suggest starting with sampled data in order to emphasize the significant business value that can be derived by using predictive
analytics (this process usually takes us between one and three weeks to complete). Nevertheless, it is extremely important to sample the
data correctly for these proofs of concept and to make sure the sampled data is not biased. What we do in these cases is either sample the
data ourselves or provide very detailed instructions on how to do it correctly. The danger with biased data is that accuracy measures may
indicate that results and algorithms are on target, when in actuality, the predictions (as well as the accuracy measures) are wrong.
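One common way to reduce sampling bias, offered here as an assumption rather than as a description of our own procedure, is stratified sampling: draw the same fraction from each customer segment instead of taking the first rows of a file. The sketch below uses made-up meter records and a hypothetical `segment` field.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, fraction, seed=0):
    """Draw the same fraction of records from every group defined by key."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    groups = defaultdict(list)
    for record in records:
        groups[key(record)].append(record)

    sample = []
    for members in groups.values():
        # Keep at least one record per group so small segments are not lost.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 900 residential meters followed by 100 commercial.
meters = [{"meter_id": i, "segment": "residential"} for i in range(900)]
meters += [{"meter_id": 900 + i, "segment": "commercial"} for i in range(100)]

# Naive sample: the first 100 rows of the file contain no commercial meters.
first_rows = meters[:100]

# Stratified sample: 10 percent of each segment.
sample = stratified_sample(meters, key=lambda m: m["segment"], fraction=0.1)

commercial_in_first_rows = sum(m["segment"] == "commercial" for m in first_rows)
commercial_in_sample = sum(m["segment"] == "commercial" for m in sample)
print(commercial_in_first_rows, commercial_in_sample)
```

A model evaluated on the naive sample would report accuracy for residential meters only, which is exactly the situation in which the accuracy measures look fine while the predictions for the rest of the population are wrong.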
Always insist on receiving the accuracy measures
Although utilities very commonly require accuracy measures for granular load forecasting implementations, I have found that
they don’t always require them when the task is forecasting customer responses to energy-efficiency offerings, predicting customer
churn, or identifying system failures.
Obtaining and understanding these measures is crucial for evaluating the project and for understanding how to deal with the results. When
predicting customer responses for a demand response program, for example, it is important to know the false positive measures (that is,
what percentage of customers were predicted to say “yes” but instead said “no”), as well as the false negative measures (what percentage of
customers were predicted to say “no” but instead said “yes”).
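As a minimal sketch of how these two measures can be computed, the example below counts both kinds of error on a small set of made-up predictions. It expresses each measure as a share of all customers, which is one possible reading of the definitions above; other denominators are also in common use, so the choice should be agreed with the vendor. The function name `error_measures` is hypothetical.

```python
def error_measures(predicted, actual):
    """Shares of false positives and false negatives among all customers."""
    pairs = list(zip(predicted, actual))
    total = len(pairs)
    # Predicted "yes" but the customer said "no".
    false_positives = sum(1 for p, a in pairs if p and not a)
    # Predicted "no" but the customer said "yes".
    false_negatives = sum(1 for p, a in pairs if not p and a)
    return {
        "false_positive_share": false_positives / total,
        "false_negative_share": false_negatives / total,
        "accuracy": (total - false_positives - false_negatives) / total,
    }


# Illustrative data for ten customers (True means "yes" to the program).
predicted = [True, True, True, False, False, False, False, True, False, False]
actual = [True, False, True, False, True, False, False, True, False, True]

measures = error_measures(predicted, actual)
print(measures)
```

Reporting the two shares separately matters because a single accuracy figure hides which kind of mistake the algorithm tends to make, and the two mistakes usually carry different business costs.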
Opt for transparent predictive algorithms
Algorithms are divided into the categories of “black boxes,” in which no one understands what leads the algorithm to produce its results, and
“transparent algorithms,” in which the method by which the results are achieved is readily recognizable. As an example, for system failure
predictions, the black-box algorithm could simply provide the failure probability of a system at a certain time. A transparent algorithm would
also present and rank the importance of the parameters that lead to that result or generate a decision tree that presents how results were
produced. In most cases, this information has as much business value as the predictions themselves, so you always get more value with
transparent predictive algorithms.
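To make the idea of transparency concrete, here is a toy sketch of the simplest possible decision tree, a single-split rule per parameter, used to rank parameters by how well each one alone separates failed from healthy equipment. The equipment records, the field names `age_years` and `peak_load_pct`, and the helper functions are all hypothetical; a real implementation would use far more data and a full tree or another interpretable model.

```python
def best_split(rows, labels, feature):
    """Find the rule 'feature > threshold means failure' with the best accuracy."""
    best_accuracy, best_threshold = 0.0, None
    for threshold in sorted({row[feature] for row in rows}):
        predictions = [row[feature] > threshold for row in rows]
        hits = sum(p == y for p, y in zip(predictions, labels))
        accuracy = hits / len(labels)
        if accuracy > best_accuracy:
            best_accuracy, best_threshold = accuracy, threshold
    return best_accuracy, best_threshold


def rank_features(rows, labels):
    """Rank features by the accuracy of their best single-split rule."""
    ranked = []
    for feature in rows[0]:
        accuracy, threshold = best_split(rows, labels, feature)
        ranked.append((accuracy, feature, threshold))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return ranked


# Made-up equipment records, for illustration only.
rows = [
    {"age_years": 5, "peak_load_pct": 60},
    {"age_years": 8, "peak_load_pct": 95},
    {"age_years": 12, "peak_load_pct": 70},
    {"age_years": 25, "peak_load_pct": 65},
    {"age_years": 30, "peak_load_pct": 90},
    {"age_years": 35, "peak_load_pct": 75},
]
failed = [False, False, False, True, True, True]

ranking = rank_features(rows, failed)
for accuracy, feature, threshold in ranking:
    print(f"{feature} > {threshold}: training accuracy {accuracy:.2f}")
```

Unlike a bare failure probability, the output names the parameter and the threshold behind each prediction, which is the kind of information an operations team can act on.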
There is certainly value to be gained from dashboards and graphs that analyze the past. But the big promise of data analytics can only be
realized with granular, accurate predictions.
Dr. Noa Ruschin-Rimini is the founder and CEO of Grid4C. She holds a Ph.D. in machine learning and data mining from Tel Aviv University’s
engineering department, where she specialized in predictive analytics and anomaly detection of time series.