How Recurrent Neural Networks Teach Computers to Read

Posted on: Jun 25, 2017

Memory and context play a huge role in helping humans interpret the world. If you encounter a word like “spring” while reading, you don’t usually have to ask yourself whether it’s a verb meaning “to jump,” or a noun referring to the season after winter, or another noun referring to a coil of metal, because the context of the sentence makes it clear. But as with many tasks, what’s effortless for humans can be incredibly difficult for computers.

We’ve spent a good deal of time looking at various types of neural networks and their applications. We started with feedforward networks and their ability to sort and label objects based on shared characteristics, and then we explored convolutional networks, which are particularly well-suited to decoding images. These types of networks fall short, however, when it comes to processing sequences of inputs, where interpreting each element depends on the ones that came before it.

Fortunately, there’s another type of neural network that solves this very problem. Recurrent neural networks (RNNs) are ideal for processing sequences and lists, making them essential for many advanced natural language processing (NLP) tasks. They’re the secret sauce behind Google’s revamped Google Translate program, and they’re becoming increasingly important to a number of deep learning tasks, from machine translation to natural language generation to computer vision.
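To make the recurrence concrete, here is a minimal sketch of a single RNN step in NumPy. The weight names (W_xh, W_hh) and toy dimensions are illustrative assumptions, not any particular library’s API; the point is simply that a hidden state carries context forward from one input to the next.

```python
import numpy as np

# Minimal sketch of one recurrent step (illustrative, with assumed sizes):
#   h_t = tanh(W_xh @ x_t + W_hh @ h_{t-1} + b_h)
rng = np.random.default_rng(0)
input_size, hidden_size = 8, 16  # assumed toy dimensions

W_xh = rng.standard_normal((hidden_size, input_size)) * 0.1   # input -> hidden
W_hh = rng.standard_normal((hidden_size, hidden_size)) * 0.1  # hidden -> hidden (the recurrent part)
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    """Combine the current input with the previous hidden state."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# Process a sequence: each step sees the running summary of everything before it.
sequence = rng.standard_normal((5, input_size))  # 5 time steps
h = np.zeros(hidden_size)
for x_t in sequence:
    h = rnn_step(x_t, h)
print(h.shape)  # (16,) -- the final state summarizes the whole sequence
```

Because h is fed back in at every step, the network’s output at any point reflects the entire sequence so far, which is exactly the kind of context that feedforward networks lack.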

The Shortcomings of Feedforward Networks

Before we cover what makes recurrent networks special, let’s take another brief look at a standard feedforward neural network. Generally, a fixed input leads to a fixed output. For example, you input a series of photographs, and for each one the network tells you whether or not a person is present. Another example: you might give the network a number of suspicious-looking financial transactions, and the network might give you the likelihood that each of those transactions is fraudulent. In both these examples, the neural network assumes each input and output to be independent; i.e., whether a given transaction looks fishy has no bearing on whether the one following it will also be fishy.
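For contrast with the recurrent sketch above, here is a minimal feedforward pass in NumPy; the layer sizes and the fraud-scoring framing are illustrative assumptions. Nothing is carried over between calls, so each input is scored in isolation.

```python
import numpy as np

# Sketch of a fixed-input, fixed-output feedforward pass (illustrative).
# The network keeps no state between inputs: one transaction's score
# never influences the next one's.
rng = np.random.default_rng(1)
n_features, n_hidden = 6, 12  # assumed toy dimensions

W1 = rng.standard_normal((n_hidden, n_features)) * 0.1
b1 = np.zeros(n_hidden)
W2 = rng.standard_normal(n_hidden) * 0.1
b2 = 0.0

def fraud_score(x):
    """Map one fixed-size input to one output, with no memory of past inputs."""
    h = np.maximum(0.0, W1 @ x + b1)             # hidden layer (ReLU)
    return 1.0 / (1.0 + np.exp(-(W2 @ h + b2)))  # probability-like score

transactions = rng.standard_normal((3, n_features))  # 3 unrelated inputs
for x in transactions:
    print(round(float(fraud_score(x)), 3))  # each score computed independently
```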