Diving Into the LSTM

Hello data enthusiasts! Since you have landed here, I presume you have made it this far (up to the time-series problem) in your journey and are eager to grasp the concepts and get some hands-on experience. So, without further ado, let's get started. In this article, I will explain all the concepts related to LSTMs as simply as possible to help you get a strong grasp of the topic.

LSTM (Long Short-Term Memory) is a recurrent neural network (RNN) architecture that can remember values over arbitrary intervals. This relative insensitivity to gap length is what sets the LSTM apart from a plain RNN or an HMM (Hidden Markov Model), and it is what makes the LSTM well suited to classifying, processing, and predicting time series. A plain RNN or HMM relies on the hidden state just before the current emission in the sequence, so if we want to predict something 1,000 steps later instead of 10, the model may well have forgotten the starting point by then. LSTM REMEMBERS.
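To make the "LSTM remembers" claim concrete, here is a minimal, hedged sketch of a toy experiment (my own illustration, not from the original article, and assuming TensorFlow/Keras is installed): each sequence's label is simply its first value, so the model has to carry that first value across the entire gap to classify correctly. In my experience an LSTM handles this comfortably, while swapping in a SimpleRNN tends to struggle as the gap grows.

```python
import numpy as np
import tensorflow as tf

# Toy task: the label is the FIRST element of a long sequence,
# so the model must remember it across the whole gap.
timesteps, n_samples = 200, 2000
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(n_samples, timesteps, 1)).astype("float32")
y = X[:, 0, 0]                                # label = value at the first time step

model = tf.keras.Sequential([
    tf.keras.Input(shape=(timesteps, 1)),
    tf.keras.layers.LSTM(16),                 # try tf.keras.layers.SimpleRNN(16) for comparison
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=64, validation_split=0.2, verbose=2)
```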

Before getting into how LSTMs actually work, you should have some background knowledge of RNNs. In case you don't, I suggest you read up on them first so that you have a better grasp of the problem we are trying to solve.

Many of you might have seen the movie Memento, where the main character suffers from short-term memory loss and keeps leaving himself pointers and notes to stay aware of the situation over the long term. RNNs suffer from a similar problem: because they effectively keep track of only a very limited number of states before the current one, they have a short-term memory problem. Let's understand that with an example.
Suppose you have a sentence-completion task in which the words to be written by the autocomplete model depend mostly on the first word of a very long sentence. For example, consider two long sentences that are identical except that one starts with "Today" and the other with "Yesterday".

Everything in both sentences is the same except for the starting word, and the decision about what the first word of the auto-completed part should be comes down to that first word of the sentence. If we feed a problem like this to a normal RNN, then, since it can only remember a limited interval and suffers from the vanishing gradient problem, it won't be able to reliably remember the first word it read, and it will therefore produce a very vague auto-completion.

How LSTM solves this problem is essentially how LSTM works. In a normal RNN we have only the short-term memory state. The main problem is that it can only reliably remember the last few words/inputs of the sequence, and since every single word is pushed through that short-term memory cell, it tries to remember all of it. What if we introduced another unit into the RNN to cater to long-term memory, one that remembers only the words/inputs that are bound to matter in the long term? This would essentially solve our problem: we would have the traditional RNN structure to remember the short-term relations between inputs, plus a long-term unit to remember the inputs that will have an impact later on.
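For contrast with the LSTM cell shown later, here is a minimal NumPy sketch (my own illustration, with made-up shapes) of a vanilla RNN step: there is only a single short-term hidden state, recomputed from the previous state and the current input at every time step, with no separate place to stash long-term information.

```python
import numpy as np

def rnn_step(x_t, h_prev, W, b):
    """One vanilla RNN step: a single short-term hidden state and nothing else."""
    z = np.concatenate([h_prev, x_t])      # [previous state, current input]
    return np.tanh(W @ z + b)              # the new state overwrites the old context

# Toy shapes: hidden size 4, input size 3 (illustrative only).
hidden, inputs = 4, 3
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(hidden, hidden + inputs))
b = np.zeros(hidden)

h = np.zeros(hidden)
for x in rng.normal(size=(10, inputs)):    # every input is squeezed through the same state
    h = rnn_step(x, h, W, b)
print(h)
```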

If we had such a unit in our architecture, then, for the same problem as above, the network would essentially remember words such as Today and Yesterday and would use them to make the decision at the beginning of the auto-completed statement.

So essentially what we are doing is identifying, at run time, the words/inputs that are likely to impact our final decision, whether it is a classification problem or a prediction problem like auto-complete.

The main question that arises here is: how does the LSTM know which inputs to keep in the long-term memory unit? Also, when should it drop an input it is currently holding there, and when should it keep multiple inputs in the long-term memory unit?

The answer to all of these questions is the way any deep learning model learns: TRAINING. During training, the model optimizes itself to learn which inputs it should hold onto for a long time, i.e., to identify the inputs that can play a role in the long term.

Let's take the example of a university lecture room with two students. The first has 100% attendance; the second shows up once in a blue moon, probably only in class today because he was short on attendance. Either way, the professor makes a rule that everyone must constantly take notes in class. The student who has been attending every class has optimized his note-taking technique so that he can identify the key points and write down only those; when the lecturer moves further into the lecture, say 45 minutes in, he can still relate those points to the current discussion and answer any question correctly. Now consider the student who has been skipping lectures. Since he has not had the chance to optimize his note-taking technique, he ends up writing down every single thing the professor says, and when asked a question he is likely to just repeat the professor's last two sentences.

Both were taking notes and paying attention, so what makes the difference here? The 100% attendance student, named LSTM, has:

- an optimized note-taking technique that filters out everything except the key points (a long-term memory for what matters), and
- the ability to relate those stored points to the discussion much later in the lecture.

The student with the short attendance has neither of the above. This is essentially the difference between how a normal RNN operates and how the LSTM solves the problem. I hope the analogy clears it all up for you.

Let's expand this example further and say the lecturer starts another topic within the same lecture. Since the 100% attendance student has noted long-term points that are no longer relevant to the new topic, while the short-attendance student has been writing down every single thing, you might think the latter would perform better here: he only builds context from the last few sentences, whereas the 100% attendance student would be trying to relate everything to what was previously discussed. But guess what, you'd be wrong, because the LSTM also has the ability to forget long-term inputs, dropping them once they are no longer relevant. So yes, practice does make perfect.

An LSTM module has three gates: the forget gate, the input gate, and the output gate.

The LSTM's core component is the memory cell, which can maintain its state over time. It consists of an explicit memory (also called the cell state vector) and gating units, which regulate the flow of information into and out of the memory.

Cell State Vector

- Cell state vector represents the memory of the LSTM and it undergoes changes via forgetting of old memory (forget gate) and addition of new memory (input gate).

Gates

- Gate: Sigmoid neural network layer followed by point-wise multiplication operator.
- Gates control the flow of information to/from the memory.
- Gates are controlled by a concatenation of the output from the previous time step and the current input and optionally the cell state vector.
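To make that definition concrete, here is a minimal NumPy sketch of a single gate (the weight shapes and names are my own illustration, not from the original article): a sigmoid layer applied to the concatenation of the previous output and the current input, whose result is then multiplied point-wise with the vector being gated.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gate(h_prev, x_t, W, b):
    """A generic LSTM gate: a sigmoid layer over the concatenation [h_prev, x_t].

    Returns values in (0, 1) that are multiplied point-wise with whatever
    vector is being gated (e.g. the cell state).
    """
    z = np.concatenate([h_prev, x_t])
    return sigmoid(W @ z + b)

# Toy shapes: hidden size 4, input size 3 (illustrative only).
hidden, inputs = 4, 3
rng = np.random.default_rng(0)
W, b = rng.normal(size=(hidden, hidden + inputs)), np.zeros(hidden)

h_prev, x_t, c_prev = rng.normal(size=hidden), rng.normal(size=inputs), rng.normal(size=hidden)
f = gate(h_prev, x_t, W, b)   # e.g. a forget-gate activation
print(f * c_prev)             # point-wise multiplication: "how much of the memory to keep"
```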

Forget Gate

- Controls what information to throw away from memory.
- Decides how much of the past you should remember.

Update/Input Gate

- Controls what new information is added to the cell state from the current input.
- Decides how much of the new input should be added to the current state.

Output Gate

- Conditionally decides what to output from the memory.
- Decides which part of the current cell state makes it to the output.
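Putting the three gates and the cell state vector together, one LSTM time step can be sketched in NumPy as below. This follows the standard LSTM gate equations; the parameter names (Wf, Wi, Wo, Wc and their biases) and the toy shapes are my own illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM time step using the standard gate equations.

    p holds the parameters: each W* has shape (hidden, hidden + inputs),
    each b* has shape (hidden,).
    """
    z = np.concatenate([h_prev, x_t])            # [previous output, current input]

    f = sigmoid(p["Wf"] @ z + p["bf"])           # forget gate: what to throw away from memory
    i = sigmoid(p["Wi"] @ z + p["bi"])           # input gate: how much new info to add
    o = sigmoid(p["Wo"] @ z + p["bo"])           # output gate: which part of the cell to expose
    c_tilde = np.tanh(p["Wc"] @ z + p["bc"])     # candidate memory from the current input

    c_t = f * c_prev + i * c_tilde               # forget old memory, add new memory
    h_t = o * np.tanh(c_t)                       # the part of the cell state that reaches the output
    return h_t, c_t

# Tiny illustrative run: hidden size 4, input size 3.
hidden, inputs = 4, 3
rng = np.random.default_rng(42)
p = {k: rng.normal(scale=0.1, size=(hidden, hidden + inputs)) for k in ("Wf", "Wi", "Wo", "Wc")}
p.update({k: np.zeros(hidden) for k in ("bf", "bi", "bo", "bc")})

h, c = np.zeros(hidden), np.zeros(hidden)
for x in rng.normal(size=(5, inputs)):           # roll the cell over a short sequence
    h, c = lstm_step(x, h, c, p)
print(h, c)
```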

I have been waiting the whole article to say this: the LSTM is also a Stark, it remembers; the North remembers xD.

LSTM REMEMBERS!

With its memory and gating mechanisms, the LSTM is a good choice for sequences that have long-term dependencies in them.
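To tie this back to the auto-complete example from earlier, here is a hedged sketch of how such a next-word model is commonly wired up in Keras (the vocabulary size, sequence length, and layer sizes are placeholders of my own, and the training data pipeline is only indicated in comments): an Embedding layer feeds an LSTM, and a softmax over the vocabulary predicts the next word.

```python
import tensorflow as tf

# Hypothetical sizes; in practice these come from your tokenized corpus.
vocab_size, max_len, embed_dim = 10_000, 50, 64

next_word_model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_len,), dtype="int32"),          # token ids
    tf.keras.layers.Embedding(vocab_size, embed_dim),
    tf.keras.layers.LSTM(128),                                # carries long-range context (e.g. the first word)
    tf.keras.layers.Dense(vocab_size, activation="softmax"),  # distribution over the next word
])
next_word_model.compile(optimizer="adam",
                        loss="sparse_categorical_crossentropy",
                        metrics=["accuracy"])
next_word_model.summary()

# Training would use (token sequence, next-token id) pairs, e.g.:
# next_word_model.fit(X_tokens, y_next_token, epochs=..., batch_size=...)
```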

The Applications of LSTMs:

- Time-series forecasting (e.g., demand, weather, prices)
- Speech recognition
- Machine translation and other sequence-to-sequence tasks
- Text generation and auto-complete / next-word prediction
- Handwriting recognition
