Back

 Industry News Details

 
Avoiding industrial IoT digital exhaust with machine learning Posted on : Apr 29 - 2017

The internet of things is speeding from concept to reality, with sensors and smart connected devices feeding us a deluge of data 24/7. A study conducted by Cisco estimates that IoT devices will generate 600 zettabytes of data per year by 2020. Most of that data is likely to be generated by automotive, manufacturing, heavy industrial and energy sectors. Such massive growth in industrial IoT data suggests we’re about to enter a new industrial revolution where industries undergo as radical a transformation as that of the first industrial revolution. With the Industry 4.0 factory automation trend catching on, data-driven artificial intelligence promises to create cyber-physical systems that learn as they grow, predict failures before they impact performance, and connect factories and supply chains more efficiently than we could ever have imagined. In this brave new world, precise and timely data provided by low-cost, connected IoT sensors, is the coin of the realm, potentially reshaping entire industries and upsetting the balance of power between large incumbents and ambitious nimble startups.

But as valuable as this sensor data is, the challenges of achieving this utopian vision are often underestimated. A torrent of even the most precise and timely data will likely create more problems than it solves if a company is unprepared to handle it. The result is what I refer to as “digital exhaust.” The term digital exhaust can either refer to undesirable leakage of valuable (often personal) data that can be abused by bad actors using the internet or simply to data that goes to waste. This article discusses the latter use, data that never generates any value.

A 2015 report by the McKinsey Global Institute estimated that on an oil rig with 30,000 sensors “only 1% of the data are examined.” Another industry study suggests that only one-third of companies collecting IoT data were actually using it. So where does this unused data go? A large volume of IIoT data simply disappears milliseconds after it is created. It is created by sensors, examined locally (either by an IoT device or a gateway) and discarded because it is not considered valuable enough for retention. The majority of the rest goes into what I often think of as digital landfills — vast storage repositories where data is buried and quickly forgotten. Often the decisions about which data to discard, store and/or examine closely are driven by a short-term perspective of the value propositions for which the IIoT application was created. But that short-term perspective can place a company at a competitive disadvantage over a longer period. Large, archival data sets can make enormous contributions to developing effective analytic models that can be used for anomaly detection and predictive analytics.

To avoid IIoT digital exhaust and preserve the potential latent value of IIoT data, enterprises need to develop long-term IIoT data retention and governance policies that will ensure they can evolve and enrich their IoT value proposition over time and harness IIoT data as a strategic asset. While it is helpful for a business to have a clear strategic roadmap for the evolution of its IIoT applications, most organizations simply do not have the foresight to properly assess the full range of potential business opportunities for their IIoT data. These opportunities will eventually emerge over time. But by taking the time to carefully evaluate strategies for IIoT data retention, an enterprise can lay a good foundation upon which future value can be built.

So how can I avoid discarding data that might provide valuable insight or be monetizable, while still not storing everything? If storage cost and network bandwidth were unlimited, the answer would be easy. Simply sample sensor data at the highest resolution possible and send every sample over the network for archival storage. However, for many, IIoT applications with large numbers of sensors and high-frequency sampling, this approach is impractical. A balance must be struck between sampling data at a high enough rate to enable the responsiveness of real-time automated logic and preserving data at a low enough rate to be economically sustainable.

Is artificial intelligence the answer?

AI software algorithms that can emulate certain aspects of human cognition are becoming increasingly commonplace and accessible to open source communities. In recent years, the capabilities of these algorithms have improved to the point where they can approximate human performance in certain narrow tasks, such as image and speech recognition and language translation. They often exceed human performance in their ability to identify anomalies, patterns and correlation in certain data sets that are too large to be meaningfully evaluated using traditional analytic dashboards. And their ability to learn continually, while using that knowledge to make accurate and valuable predictions as data sets grow, increasingly impacts our daily lives. Whether it’s Amazon recommending books or movies, your bank’s fraud detection department giving you a call, or even self-driving cars now being tested — machine learning AI algorithms are transforming our world.

Many consider artificial intelligence critical to quickly obtaining valuable insight from IIoT data that might otherwise go to waste by surfacing critical operational anomalies, patterns and correlations. AI can also play a valuable role in identifying important data for retention. But employing AI in IIoT settings is not as simple as it sounds. Sure, AI cloud services (such as IBM’s Watson IoT and Microsoft’s Cortana) can be fed data and generate insight in a growing number of areas. However, IIoT poses some special challenges that make using AI to decide which (and how much) data to retain a non-trivial exercise. View More