How Recurrent Neural Networks Work | LSTM | Deep Learning
- December 03, 2020
- By Saurav prasad
- 3 Comments
While most deep-learning problems can be handled by artificial neural networks and convolutional neural networks, some areas need a new approach. To be more precise, these are problems involving a relation between past and present information, e.g., text generation, audio analysis, stock price prediction, etc.
Suppose you ask Google Assistant, "Who is Tom Cruise? Where does he live?" and it tells you, "He is an American actor and lives in Beverly Hills." You must be thinking, "What's so great about that?" We humans understand the contextual meaning of words, but that is not the case with computers. So in the above example, when we asked, "Where does he live?", how does the assistant know we are still talking about Tom Cruise? To tackle these kinds of problems, we need recurrent neural networks, because they can remember past events and build a meaningful relationship between past and present information.
So what is a Recurrent Neural Network?
It is a type of neural network where the output of the previous state is fed as input to the current state. This allows RNNs to have some memory, which is lacking in ANNs and CNNs. Hence we can use them for any kind of sequential data.
The above diagram is a standard representation of RNNs in deep learning, and we are going to stick to it for the sake of convenience. At time t, an RNN gets input from two sources: first from the data, and second from the previous timestamp. So whatever output we get is routed back along with the current data. This structure gives RNNs a short-term memory, which is why we find them used so often in deep learning.
Let's understand it with an example.
Suppose I asked you orally for the sum of five integers. It would take you a fraction of a second to answer. You were able to do that because, first, you know how to add, and second, you have a short-term memory that allows you to remember the numbers for a while.
Similarly, suppose we have five integers and we want to find their sum. We feed these numbers to the RNN at different timestamps, as you can see in the picture. Since RNNs have memory, they remember the output they got at a particular instant, and we get the desired result. It is a rudimentary example, aimed only at conveying the basic idea behind RNNs.
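To make the recurrence idea concrete, here is a toy sketch in plain Python. It is not a real neural network, just an illustration: a "hidden state" is updated at every timestamp from the current input and the previous state, and after the last timestamp it holds the running sum.

```python
# Toy illustration of recurrence: the "network" keeps a hidden state that is
# updated at every timestamp from the current input and the previous state.
# Here the update rule is simply addition, so the state ends up holding the
# running sum of the sequence.

def toy_recurrent_sum(sequence):
    hidden_state = 0                       # memory carried between timestamps
    for x_t in sequence:                   # one input per timestamp
        hidden_state = hidden_state + x_t  # new state = f(previous state, current input)
    return hidden_state

print(toy_recurrent_sum([3, 1, 4, 1, 5]))  # prints 14
```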
Mechanism of Recurrent Neural Networks
- Xt is the input at time t. Xt gets multiplied by the weight matrix Wxh.
- ht-1 is the hidden state at timestamp t-1. It gets multiplied by the weight matrix Whh. This weight matrix is the same across the whole network.
- To get the next hidden state (ht), we squash the two incoming inputs using a tanh, or hyperbolic tangent, function.
- To get the output at each timestamp, we multiply ht by another weight matrix, Wyh.
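Putting these steps together, here is a minimal NumPy sketch of one RNN forward step, reusing the names Wxh, Whh, and Wyh from the list above. The sizes and random weights are invented purely for illustration.

```python
import numpy as np

input_size, hidden_size, output_size = 4, 3, 2  # made-up sizes for illustration

rng = np.random.default_rng(0)
Wxh = rng.standard_normal((hidden_size, input_size)) * 0.1   # input  -> hidden
Whh = rng.standard_normal((hidden_size, hidden_size)) * 0.1  # hidden -> hidden (shared across timestamps)
Wyh = rng.standard_normal((output_size, hidden_size)) * 0.1  # hidden -> output

def rnn_step(x_t, h_prev):
    """One timestamp: combine the current input with the previous hidden state."""
    h_t = np.tanh(Wxh @ x_t + Whh @ h_prev)  # squash the two incoming inputs
    y_t = Wyh @ h_t                          # output at this timestamp
    return h_t, y_t

h = np.zeros(hidden_size)
for x in rng.standard_normal((5, input_size)):  # a sequence of 5 inputs
    h, y = rnn_step(x, h)                       # hidden state is carried forward
```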
Types of Recurrent Neural Networks
Training a Recurrent Neural Network
- To convert these characters into a numeric format, we first apply one-hot encoding.
- The one-hot encoded vector is passed to the hidden layer along with the attached weights.
- The hidden layer squashes the incoming inputs using the tanh, or hyperbolic tangent, function. The output of the hidden layer then diverges: first, it is passed on to the hidden state at the next timestamp, and second, it is sent out as an output with some weights attached.
- This weighted output is then passed to the softmax function to get the probabilities of the next character in the sequence.
- Recurrent neural networks also use backpropagation to learn from the data. I have talked about backpropagation in this post.
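As a rough sketch of the forward pass just described, the snippet below one-hot encodes characters from a tiny made-up vocabulary, updates the hidden state with tanh, and applies softmax to get next-character probabilities. The weights are random here; a real model would learn them through backpropagation.

```python
import numpy as np

vocab = sorted(set("hello"))               # tiny toy vocabulary, for illustration only
char_to_idx = {c: i for i, c in enumerate(vocab)}
vocab_size, hidden_size = len(vocab), 8

rng = np.random.default_rng(1)
Wxh = rng.standard_normal((hidden_size, vocab_size)) * 0.1   # one-hot input -> hidden
Whh = rng.standard_normal((hidden_size, hidden_size)) * 0.1  # hidden -> hidden
Wyh = rng.standard_normal((vocab_size, hidden_size)) * 0.1   # hidden -> scores over characters

def one_hot(ch):
    v = np.zeros(vocab_size)
    v[char_to_idx[ch]] = 1.0
    return v

def softmax(z):
    z = z - z.max()                        # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

h = np.zeros(hidden_size)
for ch in "hell":                          # feed the sequence one character at a time
    h = np.tanh(Wxh @ one_hot(ch) + Whh @ h)
    probs = softmax(Wyh @ h)               # probabilities of the next character
```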
So how do RNNs backpropagate?
What is a Long Short-Term Memory (LSTM) Network?
- Input at time t, i.e., the data at that timestamp.
- Hidden state, or, you could say, the short-term memory carrier.
- Cell state, or the long-term memory carrier.
- Output at time t, i.e., the prediction at that timestamp.
- The hidden state, or short-term memory, gets updated based on the computation done in these four gates.
- The cell state, or long-term memory, gets updated based on the computation done in those gates.
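Before looking at each gate individually, here is a minimal NumPy sketch of a single LSTM step in its standard textbook form: the forget gate decides what to erase from the cell state, the input gate decides what new information to write into it, and the output gate decides what to expose as the new hidden state. All names, shapes, and weights below are made up for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

input_size, hidden_size = 4, 3             # made-up sizes
rng = np.random.default_rng(2)

def weights():
    W = rng.standard_normal((hidden_size, input_size + hidden_size)) * 0.1
    b = np.zeros(hidden_size)
    return W, b

(Wf, bf), (Wi, bi), (Wc, bc), (Wo, bo) = weights(), weights(), weights(), weights()

def lstm_step(x_t, h_prev, c_prev):
    z = np.concatenate([x_t, h_prev])      # current input + short-term memory
    f = sigmoid(Wf @ z + bf)               # forget gate: what to drop from the cell state
    i = sigmoid(Wi @ z + bi)               # input gate: what new information to add
    c_tilde = np.tanh(Wc @ z + bc)         # candidate values for the cell state
    c_t = f * c_prev + i * c_tilde         # updated long-term memory
    o = sigmoid(Wo @ z + bo)               # output gate: what to expose
    h_t = o * np.tanh(c_t)                 # updated short-term memory
    return h_t, c_t

h, c = np.zeros(hidden_size), np.zeros(hidden_size)
h, c = lstm_step(rng.standard_normal(input_size), h, c)
```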
What is the Forget Gate?
What is the Input Gate?
What is the Update Gate?
- Where does she live?
- What is her profession?
- Whom did she marry?
3 Comments
Well explained 👍👍
Amazing work!
Personally, what I liked most is that all the concepts are covered in simple terms. Instead of going through thousands of pages, everything is covered in a minimum of pages. It's recommended to people who have a basic understanding of data science and ML/AI and want to get a job; it will help them brush up on their concepts. Still, there are some things left to cover, but overall it is well explained. Keep rocking...