Artificial Neural Network vs Deep Neural Network

Nina Almaamary
4 min read · May 9, 2021


When looking up images of these two topics, you’ll see something similar to Fig 1.1. At first glance, the difference between an artificial neural network and a deep neural network is the depth, i.e. the number of hidden layers, but is that it?

Fig 1.1. Neural Network and Deep Neural Network

It’s apparent that the neural network (NN) has only 1 hidden layer, so we say it’s shallow, while the deep neural network (DNN) has 3 hidden layers, so we say it’s deep (it doesn’t have to be 3, it could be 2 or more!). A network with only 1 hidden layer is also called an Artificial Neural Network (ANN).
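To make the naming concrete, here’s a tiny sketch in Python (the layer sizes are arbitrary, purely for illustration):

```python
# Layer sizes written out as plain lists (sizes are arbitrary here):
shallow_ann = [784, 30, 10]          # 1 hidden layer  -> shallow
deep_dnn    = [784, 30, 30, 30, 10]  # 3 hidden layers -> deep
```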

- So you’re telling me that a deep neural network is just another name for a neural network with more than 1 hidden layer?
Yes and No!

There is a ‘deeper’ reason for the different names, and it involves the concept of Backpropagation. I’ll provide an abstract explanation, just for the purpose of comparing NNs with DNNs. First, however, it’s important to understand the concept of Feedforward in a network.

Feedforward

Feedforward: you can think of it as ‘feeding’ the input through the network. The information moves in a forward direction, ‘fed in a forward manner’, from the input layer through the hidden layer(s) to the output layer. The ‘lines’ in the network are called weights, and each line has a certain weight to it (one might be heavier than the others).

This post shows different neural network architectures, each of which feeds information through the network in a different manner.

Let’s say we have a neural network whose task is to identify handwritten digits. The input, in this case, would be the image pixels of a single handwritten digit, and each pixel would have a weight assigned to it (how important is that pixel, ‘how heavy is it?’). The output would be the network’s predicted digit. This prediction might be correct or wrong, which leads us to the second concept.
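Here’s a minimal sketch of such a feedforward pass in Python, assuming 28×28 digit images (784 pixels), one hidden layer of 30 neurons and 10 outputs; the sizes and the random weights are illustrative, not anything the network has learned:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# The 'lines' in the diagram: one weight per connection, plus biases.
W1 = rng.standard_normal((30, 784))   # input -> hidden
b1 = rng.standard_normal(30)
W2 = rng.standard_normal((10, 30))    # hidden -> output
b2 = rng.standard_normal(10)

x = rng.random(784)                   # stand-in for the 28x28 image pixels

# Feed the input forward: input layer -> hidden layer -> output layer.
hidden = sigmoid(W1 @ x + b1)
output = sigmoid(W2 @ hidden + b2)

print("predicted digit:", np.argmax(output))
```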

Backpropagation

Backpropagation: you can think of it as the process of ‘training’ the network, or allowing the network to ‘learn’: in some sense, finding a set of weights that allows the network to produce the desired behavior.

So, in the case of identifying handwritten digits, if the network’s prediction was incorrect, the network would backpropagate this error to identify which weights contribute the most to it, and change their values accordingly.
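Continuing the sketch above, here’s roughly what one backpropagation step could look like, assuming a squared-error loss and a made-up one-hot target for the digit 3 (the variables W1, b1, W2, b2, x and sigmoid carry over from the feedforward sketch):

```python
# A made-up one-hot target: pretend the image actually shows a 3.
target = np.zeros(10)
target[3] = 1.0

# Forward pass again, keeping the intermediate values backprop needs.
z1 = W1 @ x + b1
a1 = sigmoid(z1)
z2 = W2 @ a1 + b2
a2 = sigmoid(z2)

# Backward pass: how much does each weight contribute to the error?
delta2 = (a2 - target) * a2 * (1 - a2)     # error signal at the output layer
delta1 = (W2.T @ delta2) * a1 * (1 - a1)   # error propagated back to the hidden layer

# Nudge each weight against its contribution ('change its value accordingly').
lr = 0.5
W2 -= lr * np.outer(delta2, a1)
b2 -= lr * delta2
W1 -= lr * np.outer(delta1, x)
b1 -= lr * delta1
```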

You see, before deep learning, a neural network with many hidden layers would perform poorly and sometimes would not be capable of learning at all. The network would suffer from vanishing gradients.

Vanishing gradients refers to the network’s inability to learn: when backpropagating through multiple hidden layers, the changes to the weights that contribute the most to the error become so small that the network makes no progress in its learning (this happens mostly in the early layers). This book explains vanishing gradients in greater detail.
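A quick numerical illustration, assuming sigmoid activations (a common choice at the time): the sigmoid’s derivative is at most 0.25, and backpropagation multiplies in roughly one such factor per layer, so the gradient reaching the early layers shrinks geometrically with depth:

```python
# The sigmoid's derivative never exceeds 0.25, and backprop multiplies
# one such factor per layer, so early-layer gradients shrink fast.
max_sigmoid_grad = 0.25
for depth in (1, 3, 5, 10, 20):
    print(f"{depth:2d} hidden layers -> gradient factor <= {max_sigmoid_grad ** depth:.2e}")
```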

- So what did we do back then?
We made sure that the network grew vertically rather than horizontally, meaning we kept the 1 hidden layer and just added neurons to it.

- Okay... and how does that solve the problem?
Universal approximation theorem

According to the universal approximation theorem, a neural network with a single hidden layer (given enough neurons) can approximate any continuous function. So keeping the 1 hidden layer and making it wider would still allow the network to solve its tasks without worrying about vanishing gradients. This book explains the universality theorem in great detail.
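Here’s a minimal sketch of the universality idea: one wide hidden layer of sigmoids approximating sin(x). To keep it short, the hidden weights are left random and only the output weights are fitted (by least squares); the width of 200 and the scaling factors are arbitrary choices:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

xs = np.linspace(-np.pi, np.pi, 400)
ys = np.sin(xs)                               # the function to approximate

rng = np.random.default_rng(1)
width = 200
w = rng.standard_normal(width) * 4.0          # random hidden weights
b = rng.standard_normal(width) * 4.0          # random hidden biases

H = sigmoid(np.outer(xs, w) + b)              # hidden activations (400 x 200)
v, *_ = np.linalg.lstsq(H, ys, rcond=None)    # fit the output layer only

print("max approximation error:", np.max(np.abs(H @ v - ys)))
```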
But… for some reason, on some tasks it still did not perform as well as we expected.

And so, with the increase in data, better hardware and the improvement of our algorithms, we found different techniques for avoiding vanishing gradients, which made it possible to train a deep neural network!
So the set of techniques that allows deep neural networks to learn is called Deep Learning. You’ll constantly see deep learning used in many tasks such as image recognition, self-driving cars, NLP and much more!
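One well-known example of such a technique (just one of several, as a rough illustration) is the ReLU activation: its derivative is exactly 1 for positive inputs, so backpropagation no longer multiplies in a factor of at most 0.25 at every layer:

```python
# ReLU's derivative is 1 wherever the input is positive, so gradients
# passing back through active ReLU units are not shrunk layer by layer.
def relu_grad(z):
    return 1.0 if z > 0 else 0.0

print("sigmoid gradient factor over 20 layers: <=", 0.25 ** 20)
print("ReLU gradient factor over 20 layers (active units):", relu_grad(1.0) ** 20)
```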

Team shallow or team deep?
Do Deep Nets Really Need to be Deep? shows that shallow networks can achieve accuracies similar to those of deep neural networks.

So if shallow networks can achieve accuracies similar to deep neural networks, why bother putting effort into building deep ones?

It just happens that, in practice, a deep neural network performs better at solving real-world problems, given the set of techniques that we currently have/know of.

Conclusion

- So you’re telling me that a deep neural network is just another name for a neural network with more than 1 hidden layer?
Yes, and… the set of techniques used to train the network.

Deep networks use deep learning to achieve high accuracies and, in some tasks, surpass human performance.
