A scatter plot shows the relationship between two numerical variables. Each point is one pair of values, so you can quickly see whether the data rises, falls, spreads out, clusters, or contains unusual points.
That makes a scatter plot a quick way to answer the question most students actually have: "What is going on in this data?" Before you compute correlation or draw a line of best fit, the plot tells you whether those summaries even make sense.
How to read a scatter plot
The horizontal axis shows one variable and the vertical axis shows the other. If one student studied for x hours and scored y, that student's point is (x, y).
Once the points are on the graph, look for the overall pattern:
- Positive correlation: points tend to rise from left to right.
- Negative correlation: points tend to fall from left to right.
- Little or no clear correlation: points do not show a strong linear trend.
Also check for clusters, gaps, and outliers. Real data almost never lands exactly on one line, so the goal is to see the trend, not perfect alignment.
What correlation means on a scatter plot
Correlation describes the direction and strength of a linear relationship. "Linear" is the key condition: correlation summarizes how closely the points follow a straight-line trend.
If the points cluster around an upward-sloping line, the correlation is positive. If they cluster around a downward-sloping line, the correlation is negative. If the points look scattered with no clear straight-line direction, the linear correlation is weak or close to zero.
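One common way to put a number on this idea is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below computes it by hand under the assumption that this is the measure intended here. The function name `pearson_r` and both data sets are hypothetical, chosen only for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired lists xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / sqrt(sxx * syy)

# Hypothetical data: one rising cloud and one falling cloud.
rising = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 6])
falling = pearson_r([1, 2, 3, 4, 5], [9, 7, 6, 4, 2])

print("rising:", round(rising, 2))    # positive value
print("falling:", round(falling, 2))  # negative value
```

The sign of the result matches the direction you see on the plot, and values closer to -1 or 1 mean the points sit closer to a straight line.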
A curved pattern can still show a real relationship. It just may not have a strong linear correlation.
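A small sketch makes the point concrete. The data below is hypothetical: y is exactly x squared, so y is completely determined by x, yet the linear (Pearson) correlation comes out at zero because the left half of the curve falls and the right half rises.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical U-shaped data: a perfect relationship, but not a linear one.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curved = pearson_r(xs, ys)
print("linear correlation of the curved data:", round(r_curved, 6))
```

A near-zero coefficient here says only that no straight line describes the data, which is exactly what the plot would show at a glance.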
When a line of best fit helps
A line of best fit is a straight line drawn to represent the overall trend of the points. It does not need to pass through every point. Its job is to stay close to the cloud of points overall.
Use a line of best fit only when the scatter plot is roughly linear. In that case, the line helps with two things:
- summarizing the trend
- making rough predictions inside the observed range
If the pattern is curved, broken into clusters, or dominated by outliers, a straight best-fit line can hide more than it explains.
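For a roughly linear cloud, one standard way to choose the line is ordinary least squares. The sketch below assumes that method, and the names `fit_line` and `predict` are hypothetical, as is the data. The `predict` helper refuses inputs outside the observed range, which is this sketch's way of enforcing the "inside the observed range" advice, not a general rule of the method.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for paired lists xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x-values are identical; no line can be fitted")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, xs, slope, intercept):
    """Rough prediction, allowed only inside the observed x-range."""
    if not min(xs) <= x <= max(xs):
        raise ValueError("x is outside the observed range")
    return intercept + slope * x

# Hypothetical study hours and quiz scores.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 63, 71, 79]

slope, intercept = fit_line(hours, scores)
estimate = predict(3.5, hours, slope, intercept)
print("slope:", slope, "intercept:", intercept, "estimate at 3.5 h:", estimate)
```

Asking for `predict(10, hours, slope, intercept)` raises an error, because nothing in the data says the straight-line trend continues that far.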
Scatter plot example: study hours and quiz scores
Suppose a teacher records study time and quiz score for five students:
These points rise from left to right and stay fairly close to a straight line. That means the relationship is positive and roughly linear.
So both correlation and a line of best fit are reasonable summaries here. You would expect the best-fit line to have a positive slope because larger study times tend to go with larger quiz scores.
Now add one extra point that lies far below the rest. The trend may still be positive, but this point is an outlier, and it could pull the line of best fit downward. That is why the graph should come before the summary: the picture tells you whether the summary is trustworthy.
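How much one point can move the line is easy to sketch. The numbers below are hypothetical, not the teacher's actual data: five points that rise steadily, plus one made-up outlier with a long study time and a low score. The slope is computed by ordinary least squares, which is an assumption about how the line is fitted.

```python
def ls_slope(xs, ys):
    """Least-squares slope for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data: five points close to a rising line.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 63, 71, 79]

# Hypothetical outlier: 6 hours of study but a score of 40.
hours_with_outlier = hours + [6]
scores_with_outlier = scores + [40]

slope_without = ls_slope(hours, scores)
slope_with = ls_slope(hours_with_outlier, scores_with_outlier)

print("slope without outlier:", round(slope_without, 2))
print("slope with outlier:", round(slope_with, 2))
```

With these made-up values the slope stays positive but shrinks to a small fraction of its original size, even though five of the six points are unchanged.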
Common scatter plot mistakes
Treating correlation as causation
If two variables move together, that does not automatically mean one causes the other. A third factor may affect both, or the pattern may be more complicated than it first appears.
Forcing a line onto a curved pattern
Some data follows a curve rather than a straight line. In that case, a linear best-fit line may give a misleading summary.
Ignoring outliers
One unusual point can change the apparent trend a lot. Outliers do not always mean the data is wrong, but they should never be ignored without checking the context.
Forgetting what one point represents
A scatter plot only works for paired data. Each point must come from one observation that has both an x-value and a y-value.
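In code, this usually means building the points from whole records rather than from two separate lists. The sketch below is one possible approach. The record layout, the field names `hours` and `score`, and the helper `to_points` are all hypothetical.

```python
# Hypothetical records: each dict is one student, i.e. one observation.
records = [
    {"hours": 1.0, "score": 55},
    {"hours": 2.5, "score": None},  # score missing: this student cannot be plotted
    {"hours": 4.0, "score": 80},
]

def to_points(records, x_key, y_key):
    """Return (x, y) tuples for the records that have both values."""
    points = []
    for rec in records:
        x, y = rec.get(x_key), rec.get(y_key)
        if x is not None and y is not None:
            points.append((x, y))
    return points

points = to_points(records, "hours", "score")
print(points)
```

Keeping each x with its own y this way avoids the silent error of matching one student's study time with another student's score.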
When scatter plots are used
Scatter plots are used in statistics, science, business, and social research whenever you want to compare two numerical variables. Common examples include height and weight, advertising spend and sales, or time studied and test score.
They are especially useful at the start of an analysis because they can reveal patterns a single formula may hide, such as clusters, gaps, or outliers.
Picture First, Summary Second
A scatter plot answers three questions before any formula does: does the pattern rise or fall, is it roughly linear, and are any points unusually far from the rest? Only when the cloud looks roughly linear do a correlation coefficient or a line of best fit become trustworthy summaries. Letting the picture come first is what keeps those summaries from hiding clusters, curves, or outliers.
Frequently Asked Questions
- What does a scatter plot show?
- A scatter plot shows the relationship between two numerical variables, with each point representing one pair of values. It lets you quickly see whether the data rises, falls, clusters, spreads out, or contains unusual points, which tells you whether summaries like correlation or a line of best fit even make sense.
- How do you tell positive from negative correlation on a scatter plot?
- Look at the overall direction of the points from left to right. If they tend to rise, the correlation is positive. If they tend to fall, it is negative. If the points show no clear straight-line trend, the linear correlation is weak or close to zero. Also check for clusters, gaps, and outliers.
- When should you use a line of best fit?
- Use a line of best fit only when the scatter plot shows a roughly linear pattern. The line summarizes the overall trend and supports rough predictions inside the observed range. If the pattern is curved, broken into clusters, or dominated by outliers, a straight best-fit line can be misleading.
- Does a line of best fit have to pass through every point?
- No. Real data almost never lands exactly on one line. The job of a best-fit line is to stay close to the overall cloud of points and represent the general trend, not to connect or touch individual data values.