Global Big Data Conference

The Basics of Deep Learning and How To Apply It To Predict Failures

Posted on: Jul 28, 2016

Deep Learning is a hot topic. Big players like Google, Microsoft and IBM invest heavily in new projects around Deep Learning. Their goal? Developing neural networks which learn more and more complex tasks. But how does it work?

Spam filters already filter out our unwanted email with extremely high precision, yet not many people understand how these unwanted messages are separated from wanted ones. You can’t simply filter on the sender address: new spam addresses are easily created, and spam is often sent from legitimate email accounts hijacked by third parties. The best way to separate spam is to look at the content of the email messages. The most effective techniques to do this are based on machine learning.

Machine learning is a discipline concerned with the development of self-learning systems. These systems learn in an automated fashion to recognize structure in data. In this way, the system learns a model that explains the data and that we can use to make predictions on unseen data. Well-known examples of machine learning are face recognition, voice recognition, and text translation. The self-driving car from Google is also loaded with different kinds of machine learning systems to recognize pedestrians and traffic signs.

The Underlying Principle

The principle behind machine learning is rather simple. Imagine we want to build a machine that separates apples from pears. A digital image is made of an object, and two values called features are extracted from this image by a small piece of hand-written code. The code extracts the color of the object on a scale from red to green and the shape of the object on a scale from circular to oval. Now imagine we have a set of images containing both apples and pears. For each image, we also know whether it contains an apple or a pear; we call these the labels of the images. When we compute the features for the images of this training set and plot them, each image becomes a point in a two-dimensional graph of color against shape.

We see that apples and pears mostly occupy their own areas. Both object classes are therefore largely separable by dividing the feature space into two distinct regions (blue line). Given a new image of an object, we can now identify it as an apple or a pear by computing the features and checking in which region it lies. In essence, the algorithm has learned from data to differentiate apples from pears.

Even so, the system can make mistakes if the computed features are close to the line separating the two object classes. This is because there exist green, oval apples and rounder, red pears. The accuracy of the algorithm is therefore highly dependent on the number of samples in the training set and on the quality and number of features used. For example, we could have used a third feature that quantifies the texture of the object. This might have increased the accuracy of the algorithm.
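As a rough illustration of learning such a separating line, here is a minimal sketch using a perceptron, one of the simplest linear classifiers. The article does not say which algorithm produced the line, so the perceptron is an assumption, and the feature values below are made up for the example rather than measured from real images.

```python
# Each sample is (color, shape), both scaled to 0..1 (made-up illustrative values).
# color: 0 = red, 1 = green; shape: 0 = circular, 1 = oval.
apples = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.25, 0.4), (0.4, 0.2)]
pears = [(0.8, 0.9), (0.7, 0.8), (0.9, 0.7), (0.6, 0.75), (0.75, 0.6)]
training_set = [(f, "apple") for f in apples] + [(f, "pear") for f in pears]


def predict(features, weights, bias):
    """Report which side of the line w0*color + w1*shape + bias = 0 a point is on."""
    score = weights[0] * features[0] + weights[1] * features[1] + bias
    return "pear" if score > 0 else "apple"


def train_perceptron(samples, epochs=100, lr=0.1):
    """Nudge the line toward every misclassified sample until none are left."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, label in samples:
            if predict(features, weights, bias) != label:
                direction = 1.0 if label == "pear" else -1.0
                weights[0] += lr * direction * features[0]
                weights[1] += lr * direction * features[1]
                bias += lr * direction
                mistakes += 1
        if mistakes == 0:  # every training sample is on the correct side
            break
    return weights, bias


weights, bias = train_perceptron(training_set)
print(predict((0.15, 0.15), weights, bias))  # apple
print(predict((0.75, 0.85), weights, bias))  # pear
```

The learned weights and bias play the role of the blue line. A new object is classified by computing its two features and checking the sign of the score. Samples close to the line are the ones most likely to be misclassified.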

Deep Learning

The method described above is the essence of machine learning and has been applied in this manner for decades. The most important element is constructing quality features such that object categories are separable. One might ask, however, whether it is also possible to learn these features directly instead of hand coding them. This is indeed possible, and methods to do so have existed since the seventies. One method that can be used to learn features automatically is the neural network. Neural networks are loosely based on the way the brain works.

Artificial neural networks are made up of artificial neurons that model single brain cells. Each artificial neuron represents one unit of computation. An artificial neuron receives several values as input (for example from other artificial neurons) and then computes a simple equation to produce a single output value. This output value may then function as the input for other neurons. By connecting neurons in layers we construct one big artificial neural network. Although single neurons perform simple computations, the network as a whole can perform a very complex calculation. Such a network is usually drawn with neurons as circles and the output-to-input connections between neurons as lines. The interesting thing about neural networks is that they automatically learn the features required. We can imagine a neural network that can separate apples and pears by learning the shape and color features directly from the images it receives as input.
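To make the idea of a neuron and of neurons connected in layers concrete, here is a minimal sketch. The article does not specify the "simple equation", so a weighted sum followed by a sigmoid is assumed here, which is one common choice. The weights are hand-picked, hypothetical values; in a real network they would be learned from data.

```python
import math


def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of its inputs squashed into 0..1."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))


def layer(inputs, weight_rows, biases):
    """A layer is a group of neurons that all read the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


def network(inputs, layers):
    """Feed the outputs of each layer into the next one."""
    values = inputs
    for weight_rows, biases in layers:
        values = layer(values, weight_rows, biases)
    return values


# Hypothetical weights: 2 inputs -> 3 hidden neurons -> 1 output neuron.
layers = [
    ([[0.5, -0.2], [0.1, 0.8], [-0.7, 0.3]], [0.0, 0.1, -0.1]),
    ([[1.0, -1.0, 0.5]], [0.2]),
]
output = network([0.9, 0.1], layers)
print(output)
```

Each neuron does very little on its own. The complexity comes from composing many of them, with the output of one layer serving as the input of the next.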

The term deep learning refers to the number of layers in the neural network, also called the depth of the network. The depth plays an important role in learning good features, because every layer learns a set of features based on the features learned in the layer before it. The deeper the network, the more complex the features that can be learned.

Although neural networks learn the features by themselves, for a long time they weren’t often applied in practice. The reason for this is two-fold. First, many training examples are required. Second, many layers are needed to learn good features, which in turn requires a lot of computational power. With the rise of big data and the increase in computational power over the last few years, it has become possible to apply these neural networks in practice. Neural networks can learn far more complex features than can be constructed by hand. Therefore they often outperform systems built on hand-coded features.

Applications

Machine learning and deep learning are widely applicable and are not limited to separating pears and apples for industrial agriculture. For example, systems exist that learn to distinguish cancer cells from healthy cells in medical scans. The accuracy of these kinds of systems has improved rapidly over the last few years. Facebook, for example, has created a Siri-like system that is capable of analyzing the content of pictures with a high degree of accuracy and can then answer questions about the image content.

Although these kinds of systems do not perform better than humans yet, there exist more specialized systems that already do. For example, an application by Microsoft recognizes dog breeds with high accuracy, outperforming humans.

Machine learning is not only used for classifying images but also for analyzing text. A neural network can, for example, be used to extract the sentiment of a text, which indicates how positive or negative the text is. This is a well-known technique used, for example, to automatically assess the content of product reviews.

The most impressive application of machine learning, in my opinion, is in the field of artificial intelligence. The combination of neural networks with reinforcement learning makes it possible to construct intelligent agents that can learn from their environment.

The best example of this is a system produced by Google DeepMind which learns to play Atari video games like Pong and Breakout completely autonomously by trial and error. The system only receives screen input and can only produce button presses on the video game controller, just like a human. In some video games, the system actually outperforms humans.

Anomaly Detection

One of the areas we are currently focusing our attention on is the applicability of deep learning to anomaly detection in signals (streams). This means detecting patterns in a signal that deviate from the frequently occurring patterns we normally observe.

The aim is to use anomaly detection to detect problems before they escalate and become catastrophic. This way the problem can be resolved faster, allowing for shorter downtime. This is based on the premise that abnormal behavior precedes system failure.

Anomaly Detection & Deep Learning

Neural networks can be used to implement anomaly detection. The idea is to construct a neural network that takes in a signal as input and then reconstructs the same signal in its output. In this manner, the neural network learns features that characterize the signals it is trained on. These features thus define the normal behavior of the signal.

When the network is confronted with an abnormal signal, it will not be able to properly reconstruct that signal in its output. This causes a large deviation between the input signal and the reconstructed output signal, which we call the reconstruction error. Computed in real time, a large reconstruction error indicates abnormal behavior of the signal.
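The reconstruction idea can be sketched with a very small network of this kind (often called an autoencoder). Everything specific below is an assumption made for illustration and not the authors' setup: the "normal" signal is one period of a sine wave, the network is linear with a two-value bottleneck, the anomaly is an artificial spike, and the threshold rule (three times the largest training error plus a small floor) is arbitrary.

```python
import math
import random

WINDOW = 8  # samples per signal window (assumed)
HIDDEN = 2  # size of the bottleneck layer (assumed)


def sine_window(phase, spike_at=None, spike=3.0):
    """One period of a sine wave; optionally add a spike to simulate a fault."""
    w = [math.sin(2 * math.pi * k / WINDOW + phase) for k in range(WINDOW)]
    if spike_at is not None:
        w[spike_at] += spike
    return w


def reconstruct(x, enc, dec):
    """Compress x to HIDDEN values, then expand back to WINDOW values."""
    h = [sum(w * xi for w, xi in zip(row, x)) for row in enc]
    y = [sum(w * hj for w, hj in zip(row, h)) for row in dec]
    return h, y


def reconstruction_error(x, enc, dec):
    """Mean squared difference between the input and its reconstruction."""
    _, y = reconstruct(x, enc, dec)
    return sum((yi - xi) ** 2 for yi, xi in zip(y, x)) / len(x)


def train(windows, epochs=300, lr=0.05, seed=0):
    """Plain stochastic gradient descent on the squared reconstruction error."""
    rng = random.Random(seed)
    enc = [[rng.uniform(-0.1, 0.1) for _ in range(WINDOW)] for _ in range(HIDDEN)]
    dec = [[rng.uniform(-0.1, 0.1) for _ in range(HIDDEN)] for _ in range(WINDOW)]
    for _ in range(epochs):
        for x in windows:
            h, y = reconstruct(x, enc, dec)
            err = [yi - xi for yi, xi in zip(y, x)]
            # Error signal at the hidden layer, computed before the decoder changes.
            grad_h = [sum(err[i] * dec[i][j] for i in range(WINDOW))
                      for j in range(HIDDEN)]
            for i in range(WINDOW):
                for j in range(HIDDEN):
                    dec[i][j] -= lr * err[i] * h[j]
            for j in range(HIDDEN):
                for k in range(WINDOW):
                    enc[j][k] -= lr * grad_h[j] * x[k]
    return enc, dec


# Train only on normal behavior: the same sine wave at 16 different phases.
normal = [sine_window(2 * math.pi * p / 16) for p in range(16)]
enc, dec = train(normal)

# Arbitrary rule: flag anything reconstructed clearly worse than the training data.
# The small floor keeps the threshold above zero when the fit is near perfect.
threshold = 3 * max(reconstruction_error(x, enc, dec) for x in normal) + 1e-6


def is_anomaly(x):
    return reconstruction_error(x, enc, dec) > threshold


print(is_anomaly(sine_window(0.3)))              # unseen phase, normal shape
print(is_anomaly(sine_window(0.3, spike_at=5)))  # same window with a spike
```

Because the network has only seen sine-shaped windows, the bottleneck learns to represent that shape and little else. The spike cannot pass through the bottleneck, so it shows up as reconstruction error. A production system would use a deeper, nonlinear network and a threshold tuned on held-out data.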
