How does deep learning work?
Feature extraction … Early layers detect low-level features (edges, colors, textures) in the input.
Abstraction … Deeper layers combine these features into higher-level concepts (faces, words, objects).
Prediction … The final layer produces an output, such as a classification label or numerical prediction.
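The three stages above can be sketched as a tiny stacked network in plain Python. This is a minimal illustration, not a real model: the layer sizes, random weights, and example input are all invented for demonstration, and no training is shown.

```python
import math
import random

random.seed(0)

def dense(inputs, weights, biases):
    """One fully connected layer: each output is a weighted sum of all inputs."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    """Nonlinearity that lets stacked layers build up more complex features."""
    return [max(0.0, v) for v in values]

def softmax(values):
    """Turn the final layer's scores into probabilities that sum to 1."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def init_layer(n_in, n_out):
    """Random weights stand in for learned feature detectors (illustrative only)."""
    weights = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

# Three layers mirroring the stages above:
w1, b1 = init_layer(4, 8)  # early layer: low-level feature detectors
w2, b2 = init_layer(8, 6)  # deeper layer: combinations of early features
w3, b3 = init_layer(6, 3)  # output layer: scores for 3 hypothetical classes

def forward(x):
    h1 = relu(dense(x, w1, b1))   # feature extraction
    h2 = relu(dense(h1, w2, b2))  # abstraction
    return softmax(dense(h2, w3, b3))  # prediction

probs = forward([0.2, 0.9, 0.1, 0.5])
print(probs)  # three class probabilities summing to 1
```

The depth here is just the number of stacked `dense` layers; a deeper network would repeat the middle step more times, which is the sense of "depth" discussed below.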
A brief history
Deep learning began with neural network research in the 1940s, although progress was limited by the computing power and data available at the time. In 2006, Geoffrey Hinton, Ruslan Salakhutdinov, and collaborators popularized deep belief networks, revitalizing interest in training deep architectures. The biggest breakthrough came in 2012, when Hinton's students Alex Krizhevsky and Ilya Sutskever developed AlexNet, a deep convolutional neural network whose dramatic win at the ImageNet competition sparked the modern deep learning boom. In 2016, Ian Goodfellow, Yoshua Bengio, and Aaron Courville published the influential textbook Deep Learning, consolidating the field.

Neural networks use layers of low- and high-level feature detectors to filter the input data, with the number of layers depending on the task; face recognition, for example, often requires more layers than audio processing. The term "depth" refers to the number of layers: a system with 10 layers is considered shallower than one with 100. Capsule networks, introduced in the mid-2010s, aim to better model spatial relationships between objects in images, with further improvements in accuracy.

Today, deep learning powers autonomous vehicles and medical diagnostics, as well as fraud detection, product recommendations, speech recognition, translation, and robotics, and its influence continues to grow across industries.