An RNN, or recurrent neural network, is a neural network built for sequences such as text, speech, or time series. At each step, it combines the current input with a hidden state from the previous step, so the output can depend on what came earlier.
That is the key idea: an RNN has a running memory. An LSTM is a gated kind of RNN that manages that memory more carefully when important information must survive for many steps.
What an RNN does at each time step
At time step t, a simple RNN updates its hidden state with a rule like

h_t = tanh(W_x · x_t + W_h · h_{t-1} + b)

Here x_t is the current input, h_{t-1} is the previous hidden state, and h_t is the new hidden state. The matrices W_x and W_h and the bias b are learned during training.
If the model also produces an output at each step, a common form is

y_t = g(W_y · h_t + b_y)

where g is an output activation such as softmax.
The exact output rule depends on the task. Some problems need one output per step, while others use only the final hidden state.
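The per-step update can be sketched in a few lines of NumPy. The sizes here (3-dimensional inputs, a 4-dimensional hidden state) and the random weights are purely illustrative, not a trained model:

```python
import numpy as np

rng = np.random.default_rng(0)
W_x = rng.normal(size=(4, 3))   # input-to-hidden weights
W_h = rng.normal(size=(4, 4))   # hidden-to-hidden weights
b = np.zeros(4)                 # bias

def rnn_step(x_t, h_prev):
    """One RNN step: h_t = tanh(W_x x_t + W_h h_prev + b)."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

h = np.zeros(4)                        # initial hidden state
for x_t in rng.normal(size=(5, 3)):    # a sequence of 5 inputs
    h = rnn_step(x_t, h)               # state carried step to step

print(h.shape)  # (4,)
```

Note that the same weights are reused at every step; only the hidden state changes as the sequence is consumed.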
Why the hidden state matters
A feedforward network sees one input and moves on. An RNN reuses part of its previous computation. That reuse is what makes it useful for text, speech, time series, and other ordered data.
You can think of the hidden state as a compact note the model writes to itself after each step. The next step reads that note, updates it, and passes the revised version forward.
If you change the order of the same inputs, the hidden states usually change too. Sequence order matters.
Worked RNN example
Real RNNs usually use vectors and nonlinear activations. To keep the arithmetic readable, use a toy one-number state with the rule

h_t = 0.5 · h_{t-1} + x_t, starting from h_0 = 0.

Now process the sequence x_1 = 1, x_2 = 2, x_3 = 3.
First step: h_1 = 0.5 · 0 + 1 = 1.
Second step: h_2 = 0.5 · 1 + 2 = 2.5.
Third step: h_3 = 0.5 · 2.5 + 3 = 4.25.
What matters here is not the exact formula. It is the dependence on the previous state. At step 2, the update does not use only x_2; it also uses what was carried from step 1. That is the core RNN idea.
If you swap the order and use x_1 = 3, x_2 = 2, x_3 = 1, then h_1 = 3, h_2 = 0.5 · 3 + 2 = 3.5, and h_3 = 0.5 · 3.5 + 1 = 2.75.
The final state is different even though the same numbers appeared. That is exactly why RNNs are sequence models rather than bag-of-inputs models.
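The toy computation is easy to script. This sketch assumes the one-number rule h_t = 0.5 · h_{t-1} + x_t with h_0 = 0, and runs the same inputs in both orders:

```python
def run(xs, h0=0.0):
    """Apply the toy rule h_t = 0.5 * h_prev + x_t over a sequence."""
    h = h0
    for x in xs:
        h = 0.5 * h + x
    return h

print(run([1, 2, 3]))  # 4.25
print(run([3, 2, 1]))  # 2.75
```

Same numbers, different order, different final state: the recurrence makes the model order-sensitive.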
Why basic RNNs struggle on long sequences
In a basic RNN, old information has to survive through many repeated updates. If the sequence is long, that can be hard. Useful signals may fade, and during training the gradients can also shrink or blow up across many steps.
That is why plain RNNs often struggle when the task depends on information from far back in the sequence. The issue is not that recurrence is wrong. The issue is that long-range memory is hard to maintain with a simple hidden-state update.
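A toy illustration of the fading: suppose the state is simply halved at every step and nothing new reinforces the old signal. After twenty steps, almost none of it remains:

```python
h = 1.0  # a signal stored in the state at step 0
for step in range(20):
    h = 0.5 * h  # each update attenuates the old state

print(h)  # roughly 9.5e-07: the stored signal is nearly gone
```

Trained RNNs are not literally this simple, but repeated multiplicative updates behave in a related way, which is why both stored signals and gradients can shrink (or blow up) over long sequences.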
How LSTM improves RNN memory
An LSTM, short for long short-term memory, is a gated RNN. It introduces a more structured memory path, usually called a cell state, plus gates that control what information is forgotten, what new information is written, and what part is exposed as output.
You do not need the full gate equations to understand the point. The design gives the model more control over memory. If a detail should survive for many steps, an LSTM is better equipped to keep it than a plain RNN.
That does not mean an LSTM remembers everything forever. It means the architecture is better at learning when to preserve information and when to discard it.
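For readers who do want to see the mechanics, here is a minimal sketch of one LSTM step using the standard gate structure (forget, input, and output gates around a cell state). The sizes and random weights are illustrative, not trained:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM step: gates decide what to forget, write, and expose."""
    W, U, b = params                 # input weights, recurrent weights, biases
    z = W @ x_t + U @ h_prev + b
    f, i, o, g = np.split(z, 4)      # forget, input, output gates; candidate
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)
    c_t = f * c_prev + i * np.tanh(g)  # cell state: keep some old, add some new
    h_t = o * np.tanh(c_t)             # hidden state: expose part of the cell
    return h_t, c_t

# Illustrative sizes: 3-dim input, 2-dim hidden and cell state.
rng = np.random.default_rng(1)
params = (rng.normal(size=(8, 3)), rng.normal(size=(8, 2)), np.zeros(8))
h, c = np.zeros(2), np.zeros(2)
for x_t in rng.normal(size=(4, 3)):
    h, c = lstm_step(x_t, h, c, params)

print(h.shape, c.shape)  # (2,) (2,)
```

The key line is the cell-state update: because the forget gate f multiplies the old cell state directly, the model can learn to set f near 1 and carry a value forward almost unchanged for many steps.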
RNN vs. LSTM in plain language
A basic RNN has one running state and updates it repeatedly. An LSTM adds a stronger memory mechanism around that idea.
If the sequence is short and the dependency is local, a plain RNN may be enough. If the task depends on information from much earlier in the sequence, an LSTM is often the safer choice.
Common RNN and LSTM mistakes
Thinking an RNN sees the whole sequence at once
It usually does not. The standard picture is step-by-step processing, with state carried forward.
Assuming LSTM solves memory perfectly
It helps with long-range dependencies, but it is still a trained model with finite capacity and practical limits.
Ignoring sequence order
RNNs are built for ordered data. Shuffling sequence elements changes the computation.
Treating the hidden state as human-readable memory
The hidden state is a learned numerical representation, not a clean sentence-like summary.
When RNNs and LSTMs are used
They are used for sequence problems such as language modeling, speech, handwriting, sensor streams, and time-series forecasting. Today, many language tasks use transformers instead, but RNNs and LSTMs still matter because they teach sequence memory clearly and can still be useful in smaller or specialized settings.
Try your own version
Write a four-step sequence of your own and apply the toy rule h_t = 0.5 · h_{t-1} + x_t, starting from h_0 = 0. Then swap the order of two inputs and compare the final state. That small experiment makes the role of recurrence much clearer than the acronym alone.
If you want to explore another case, compare this page with a transformer or Markov chain explainer and notice what each model does with past information.