Database normalization can be summed up in one sentence: "Store each fact in exactly one place." The systematic way to do that is 1NF → 2NF → 3NF.
Why does this matter? Because if the same information is scattered across multiple rows, the moment you update one and forget the others, your data becomes inconsistent. This is called an update anomaly. Try it yourself below.
If Prof.A's office changes, in a denormalized table you have to manually update both rows. Miss even one, and you get a contradiction like "Prof.A is in Room 301 and Room 502 at the same time." In a 3NF design, you only need to update exactly 1 row.
That’s really the whole point of normalization — the rest is just a set of rules for deciding which column belongs where so each fact is stored in only one place.
Functional Dependency — The Criterion for Normalization
Normalization asks, "What does this column depend on?" This is called a functional dependency.
For example, student_id → student_name means "if you know the student ID, the name is determined."
In the diagram below, pay attention to the arrow colors:
- Green = depends on the whole key (good)
- Yellow = depends on only part of the key (partial dependency → violates 2NF)
- Red dashed = depends through a non-key column (transitive dependency → violates 3NF)
Removing the yellow and red arrows = splitting the table = normalization.
1NF → 2NF → 3NF at a Glance
Click through the steps below to see how the table gets split at each stage. Red cells mark columns that should not be in that table.
1NF: One Value per Cell
If you put multiple values like MATH101, CS102 into a single cell, you can’t search properly and you can’t join properly either.
Rule: Every cell must be atomic, meaning it contains only one value.
Columns like phone1, phone2, phone3 have the same problem — they just hide repeated values inside the column names.
2NF: Remove Partial Dependencies
In a table with a composite key (student_id, course_id), student_name depends only on student_id. It does not depend on the whole key, only part of it — that’s a partial functional dependency.
Rule: Every non-key column must depend on the entire key.
If there is a partial dependency, move that column out into a separate table.
What if the key is a single column? Then partial dependency is impossible, so if the table is in 1NF, it is automatically in 2NF.
3NF: Remove Transitive Dependencies
In a Courses table, instructor_office does not depend directly on course_id (the key). It depends through the path course_id → instructor_id → instructor_office — in other words, a non-key column depends on another non-key column. That is a transitive dependency.
Rule: Non-key columns must depend only on the key. They must not depend through another non-key column.
Solution: move instructor_office into an Instructors table.
One Question Instead of Memorizing Rules
"Every non-key attribute must depend on the key, the whole key, and nothing but the key."
That one sentence is the whole story:
- the key = 1NF (a fact must be identifiable by a key)
- the whole key = 2NF (it must depend on the entire key, not just part of it)
- nothing but the key = 3NF (it must depend directly on the key, not through another non-key column)
If you get a normalization question on an exam, ask just this: "What does this column depend on? The whole key, part of the key, or a non-key column?"
Common Mistakes
"Once you reach 3NF, you're done." — 3NF is a great starting point, not the finish line. Sometimes you need the stricter BCNF, and other times you intentionally denormalize for performance.
"More tables is always better." — The goal of normalization is not to increase the number of tables. It’s to make sure each table is responsible for only one kind of fact. If you split tables without a dependency-based reason, you just make JOINs more complicated.
"If it’s NoSQL, normalization doesn’t matter." — Normalization is a term specific to relational databases, but the problem of keeping duplicated data consistent exists in any database. Even when you denormalize, you should know exactly what tradeoff you’re making.
Check Your Own Schema
Pick one table from your own project and try this:
- Write down what the key is.
- Check whether each non-key column depends on the whole key or only part of it. → If it’s only part, that violates 2NF.
- Check whether any non-key column depends on another non-key column. → If so, that violates 3NF.
These 3 steps are enough to catch most normalization problems.
Need help with a problem?
Upload your question and get a verified, step-by-step solution in seconds.
Open GPAI Solver →