Answering the hard questions on machine learning and predictive analytics
We wanted to get the skinny on utility analytics, an insider viewpoint on what’s working and what’s not. So we sat down with Dr. Noa Ruschin-Rimini, founder and CEO of Grid4C, a company that’s all about machine learning and smart grid predictive analytics. And we asked her a few hard questions about the industry and what’s really important.
Why should utilities care about machine learning?
Ruschin-Rimini: The deployment of smart hardware is accelerating as these devices become more affordable.
Many utilities are realizing that the huge amount of granular data generated by these devices can be leveraged to tackle their biggest challenges, such as distributed generation and engaging consumers beyond traditional power delivery. As a result, the number of data analytics initiatives implemented by utilities is growing rapidly.
On the data analytics side, analytics capabilities are changing from what is called descriptive analytics — which is simply slicing and dicing historical data — to predictive analytics and the use of machine learning algorithms, which provide sophisticated actionable insights and answer questions like: what will happen and how will we make it happen.
This enables utilities to be proactive and have better control over tomorrow’s bottom-line results. According to Gartner, that’s where most of the value is.
Today utilities are seeking to maximize the value derived from their data analytics initiatives and are starting to realize that predictive analytics and machine learning are the answer.
What are utilities doing right when it comes to grid analytics?
Ruschin-Rimini: First, utilities are realizing the value of predictive analytics and machine learning over descriptive analytics and are initiating machine-learning projects. Second, many utilities are asking analytics firms to define analytics use cases for them. Additionally, utilities are initiating proofs of concept: asking analytics firms to use sampled data and demonstrate value before they get into a big analytics project. And utilities are conducting pilots in which they measure the accuracy of the analytics vendors’ results. Those are all good things.
What are utilities doing wrong when it comes to grid analytics?
Ruschin-Rimini: There are a few problem areas. Utilities are settling for descriptive analytics, and they shouldn’t. In many cases, utilities are settling for slicing and dicing historical data and trying to figure out the trends and patterns themselves instead of throwing the huge amount of data into the machine and letting that machine use sophisticated algorithms that automatically produce predictions and actionable insights. Machine-learning implementations produce much more value than descriptive analytics and, in many cases, cost less.
The second problem: Utilities are going for what we describe as the big bang approach. They often decide to implement one end-to-end big data analytics project, which includes a first phase of integrating many systems, without testing the value of these integrations. In these cases, it may take years to realize value. When it comes to machine learning and predictive analytics, value can be realized very quickly. So the recommended approach would be to focus on the biggest pains and quick wins, run pilots to test the value, integrate systems only where value is demonstrated, and implement one use case after the other.
Also a problem: Utilities are not defining the business objective correctly. When it comes to machine learning and data mining, defining the business objective correctly is crucial. In many cases where we are asked to provide accurate granular load forecasts, utilities define the business objective as maximum accuracy. When I ask whether the penalties for overestimating and underestimating load are equal, the answer in many cases is no. At that point, I suggest that the target function should be maximum profit instead of maximum accuracy, since solving the problem for maximum accuracy assumes that the penalty is symmetric. It is important to understand that these are two different problems that will produce different results.
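To make the difference between the two objectives concrete, here is a minimal sketch in Python. It is not Grid4C's method. It assumes a simplified cost model in which each unit of under-forecast load costs three times as much as a unit of over-forecast load; the penalty values, the simulated loads, and all function names are hypothetical. Minimizing this cost stands in for maximizing profit.

```python
import random
import statistics

def asymmetric_cost(forecast, actual, under_penalty=3.0, over_penalty=1.0):
    """Cost of one forecast when under- and over-forecasting are priced differently."""
    error = actual - forecast
    if error > 0:                      # forecast too low
        return under_penalty * error
    return over_penalty * (-error)     # forecast too high (or exact)

def total_cost(forecast, actuals):
    """Total cost of using one fixed forecast against a history of actual loads."""
    return sum(asymmetric_cost(forecast, a) for a in actuals)

random.seed(0)
# Hypothetical historical loads (MW) for a single hour of the day.
loads = [random.gauss(100.0, 10.0) for _ in range(1000)]

# "Maximum accuracy" objective: the mean minimizes squared error.
accuracy_forecast = statistics.fmean(loads)

# Cost objective: pick the candidate forecast with the lowest total cost.
candidates = [80.0 + 0.1 * i for i in range(401)]   # 80.0 to 120.0 MW
cost_forecast = min(candidates, key=lambda f: total_cost(f, loads))

print(f"accuracy-optimal forecast: {accuracy_forecast:.1f} MW")
print(f"cost-optimal forecast:     {cost_forecast:.1f} MW")
```

Under these assumptions the cost-optimal forecast sits above the accuracy-optimal one, because the model deliberately leans toward the cheaper kind of error. With symmetric penalties the two objectives would give much closer answers.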
Another issue: Utilities are deciding to develop machine learning/statistical algorithms themselves. Machine learning is a broad domain with numerous specialties and categories. Even selecting the right machine-learning expert to hire requires a deep understanding of the field. In the case of machine-learning implementations, choosing “make” over “buy” is risky, will be much more expensive, and will significantly increase the time to value.
Finally, utilities are conducting accuracy competitions between machine learning vendors. Conducting those competitions requires a deep understanding of statistics and machine learning. In many cases I’ve seen, utilities receive results from all vendors and then realize they do not know how to compare or rank them.
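One common way to make vendor results comparable is to score every vendor's forecasts against the same held-out actuals with the same metric. The sketch below illustrates that idea and is not a procedure described in the interview. The vendor names, the load values, and the choice of mean absolute percentage error (MAPE) are all assumptions.

```python
def mape(forecasts, actuals):
    """Mean absolute percentage error, in percent (actuals must be non-zero)."""
    total = sum(abs(f - a) / abs(a) for f, a in zip(forecasts, actuals))
    return 100.0 * total / len(actuals)

def rank_vendors(vendor_forecasts, actuals, metric):
    """Score each vendor with one shared metric; return (name, score) pairs, best first."""
    scores = {name: metric(f, actuals) for name, f in vendor_forecasts.items()}
    return sorted(scores.items(), key=lambda item: item[1])

# Hypothetical held-out actual loads (MW) that no vendor saw while building models.
actuals = [100.0, 120.0, 90.0, 110.0]

# Hypothetical forecasts submitted for the same four intervals.
vendor_forecasts = {
    "vendor_a": [98.0, 123.0, 91.0, 108.0],
    "vendor_b": [105.0, 110.0, 95.0, 118.0],
}

ranking = rank_vendors(vendor_forecasts, actuals, mape)
for name, score in ranking:
    print(f"{name}: MAPE {score:.2f}%")
```

The metric is passed in as a parameter so that a business-driven cost function can replace MAPE when over- and under-forecasting carry different penalties.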
How can analytics make a utility more profitable?
Ruschin-Rimini: Examples of machine learning use cases on the operational side include outage predictions; system failure predictions; accurate load forecasting at the meter/sub-meter and sub-hour levels to better balance supply and demand; optimizing demand response programs; and detecting early warnings of irregularities and identifying their signatures in order to predict sophisticated thefts, meter malfunctions and more.
Examples of machine learning use cases on the customer side include detecting usage irregularities at the household level and identifying the appliance that caused the deviation; disaggregating a home’s total usage into the usage of individual appliances so residential customers know how much every appliance costs; customer churn predictions and churn root-cause analysis; prediction of customers’ responses to specific pricing, marketing and energy efficiency offerings; and more.
All of these machine learning use cases can have a huge impact on the utilities’ efforts to deploy effective marketing campaigns, to improve customer engagement, to increase customer acquisition and retention (in competitive markets), to reduce spikes during peak hours, to better balance the grid and more.
What advice would you give utilities to prep for the analytics they’ll need in the future?
Ruschin-Rimini: First and foremost, choose machine learning and predictive analytics over descriptive analytics. Then start with a proof of concept. The best way to start a machine-learning implementation is with a proof of concept that uses only a sample of the data and does not require integration with other systems. It is a good way to learn the value that could be derived from the data with a minimal investment of time and money.
Use transparent machine-learning algorithms. Algorithms are divided into black box algorithms, in which no one understands what led the algorithm to produce its results, and transparent algorithms, which present how the results were achieved. In most cases, this information has as much business value as the predictions themselves, so you always get more value with transparent predictive algorithms.
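As a rough illustration of what "transparent" can mean in practice, the sketch below fits a one-variable linear model, a commonly cited example of an interpretable algorithm. The interview does not name a specific algorithm, so this choice is an assumption, and the temperature and usage figures are hypothetical. The fitted slope states directly how much predicted usage changes per degree.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical hourly readings: outdoor temperature (C) and household usage (kWh).
temps_c = [20.0, 25.0, 30.0, 35.0]
usage_kwh = [1.0, 1.5, 2.0, 2.5]

slope, intercept = fit_line(temps_c, usage_kwh)
prediction = slope * 32.0 + intercept

# The model explains itself: the slope is the predicted change in kWh per degree.
print(f"each additional degree adds about {slope:.2f} kWh")
print(f"predicted usage at 32 C: {prediction:.2f} kWh")
```

A black box model might produce a similar prediction, but it would not expose the per-degree effect that tells the business why usage is expected to rise.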
Choose industry-specific machine-learning products over generic statistical tools. Implementing plug-and-play products, which are aimed at utilities and developed around utility use cases, will cost less and provide significantly shorter time to value than buying statistical tools that come with statistics experts, who are most likely to “reinvent the wheel” for you, with all the money and time that unnecessary reinvention entails.