Covariance and correlation both describe how two variables move together, but they answer slightly different questions. The one-line version: covariance gives the direction of joint movement and keeps the original units, while correlation standardizes that relationship into a unitless number between -1 and 1.
Covariance itself measures whether two variables tend to be above or below their means together. A positive covariance means the variables usually move the same way relative to their averages; a negative covariance means one tends to be above average when the other is below.
Covariance Vs. Correlation, Side By Side
| | Covariance | Correlation |
| --- | --- | --- |
| Measures | direction of joint movement | direction + standardized strength |
| Units | original units (x times y) | unitless |
| Range | no fixed range | between -1 and 1 |
| Best for | original-units variation, covariance matrices | comparing across data sets |
| Formula | s_xy | r = s_xy / (s_x s_y) |
Correlation standardizes covariance by dividing it by the product of the two standard deviations, provided both are nonzero:

$$r = \frac{s_{xy}}{s_x s_y}$$
That is why correlation is unitless and easy to compare across data sets, while covariance has no fixed range.
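A minimal sketch of that standardization in plain Python may help. The helper names `sample_cov` and `correlation` and the height/weight numbers are illustrative assumptions, not values from this article.

```python
from math import sqrt

def sample_cov(xs, ys):
    """Sample covariance with the n - 1 denominator."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """r = s_xy / (s_x * s_y); undefined when either standard deviation is zero."""
    s_x = sqrt(sample_cov(xs, xs))  # the covariance of a variable with itself is its variance
    s_y = sqrt(sample_cov(ys, ys))
    if s_x == 0 or s_y == 0:
        raise ValueError("correlation is undefined when a variable has zero spread")
    return sample_cov(xs, ys) / (s_x * s_y)

# Hypothetical data, made up for illustration only.
heights_cm = [150.0, 160.0, 170.0, 180.0]
weights_kg = [52.0, 58.0, 69.0, 77.0]

r = correlation(heights_cm, weights_kg)
print(round(r, 3))
```

The explicit zero check mirrors the "when those are nonzero" condition: a constant variable has no spread to standardize by.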
The Formulas, For Samples And Populations
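For a sample of $n$ paired observations $(x_i, y_i)$, a common formula is

$$s_{xy} = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$

where $\bar{x}$ and $\bar{y}$ are the sample means. Each product $(x_i - \bar{x})(y_i - \bar{y})$ is positive when both values fall on the same side of their respective means, and negative when they fall on opposite sides.
For a full population of $N$ pairs rather than a sample, the denominator is typically $N$ instead of $n - 1$:

$$\sigma_{xy} = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)$$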
Use the sample version for sample data and the population version only when the data represents the entire population.
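The two denominators can be compared directly in a short sketch. The function name `covariance`, its `population` flag, and the data are assumptions chosen for illustration.

```python
def covariance(xs, ys, population=False):
    """Covariance of paired data.

    Divides by n - 1 by default (sample) or by n when population=True.
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n < 2 and not population:
        raise ValueError("sample covariance needs at least two pairs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (n if population else n - 1)

# Hypothetical paired data, made up for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 9.0]

sample_value = covariance(xs, ys)                       # divides by 3
population_value = covariance(xs, ys, population=True)  # divides by 4
print(sample_value, population_value)
```

The sum of products is the same in both cases; only the divisor differs, so the gap between the two results shrinks as the number of pairs grows.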
Reading The Sign
Covariance is built from paired deviations from the mean. If both deviations are positive, their product is positive; if both are negative, their product is also positive. Those pairs push covariance upward, because the variables move together relative to their centers. If one deviation is positive and the other negative, the product is negative, pulling covariance downward. So covariance is really an average of joint movement around the mean.
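The sign logic can be traced pair by pair. The following is a small sketch with made-up numbers; `deviation_products` is a hypothetical helper name.

```python
def deviation_products(xs, ys):
    """Return (dx, dy, dx * dy) for each pair, measured from the two means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return [(x - mean_x, y - mean_y, (x - mean_x) * (y - mean_y))
            for x, y in zip(xs, ys)]

# Hypothetical data, made up for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 5.0, 4.0, 3.0, 7.0]

rows = deviation_products(xs, ys)
for dx, dy, product in rows:
    if product > 0:
        note = "same side of both means, pushes covariance up"
    elif product < 0:
        note = "opposite sides, pulls covariance down"
    else:
        note = "at least one value sits on its mean, no contribution"
    print(f"dx={dx:+.1f} dy={dy:+.1f} product={product:+.1f}  {note}")

total = sum(product for _, _, product in rows)
print("sum of products:", total)
```

Here the positive products outweigh the negative ones, so the sum, and therefore the covariance, comes out positive.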
When To Use Which
- Use covariance when you care about joint variation in the original units, or when it appears inside a larger calculation such as a covariance matrix.
- Use correlation when you want a unitless summary that is easier to compare across data sets.
Covariance is especially common in covariance matrices, where each entry summarizes how two variables vary jointly. That matters in portfolio risk, principal component analysis, and multivariable modeling.
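As a rough sketch of what such a matrix contains, the plain-Python function below builds one from several variables. The name `covariance_matrix` and the three data columns are illustrative assumptions; in practice a numerical library would normally do this.

```python
def covariance_matrix(columns):
    """Sample covariance matrix for a list of equal-length variables.

    Entry [i][j] is the sample covariance of variable i with variable j,
    so the diagonal holds each variable's sample variance.
    """
    n = len(columns[0])
    k = len(columns)
    means = [sum(col) / n for col in columns]
    return [
        [
            sum((columns[i][t] - means[i]) * (columns[j][t] - means[j])
                for t in range(n)) / (n - 1)
            for j in range(k)
        ]
        for i in range(k)
    ]

# Three hypothetical variables, made up for illustration only.
a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 1.0, 4.0, 3.0]
c = [4.0, 3.0, 2.0, 1.0]

matrix = covariance_matrix([a, b, c])
for row in matrix:
    print([round(value, 3) for value in row])
```

The matrix is symmetric because the covariance of a with b equals the covariance of b with a.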
Worked Example: Study Hours And Quiz Scores
Suppose a small sample records study hours and quiz scores:
First find the means:
Now the deviations and their products:
- For :
- For :
- For :
Add the products:
Because this is sample covariance, divide by $n - 1$:
The covariance is positive, so more study time goes with higher quiz scores here. But the covariance value is not a universal strength scale: its size depends on the units, hours times score points. Change the measurement scale and the covariance changes too, even if the pattern stays the same. This is exactly the case where correlation helps, because it strips out the units. To feel the contrast, recompute this data with the correlation coefficient and notice how standardizing the scales changes the interpretation.
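The effect of a unit change can be checked with a quick sketch. The hours and scores below are placeholder values assumed for illustration, not the data from the example above, and the helper names are hypothetical.

```python
from math import sqrt

def sample_cov(xs, ys):
    """Sample covariance with the n - 1 denominator."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    """Sample correlation: covariance divided by both standard deviations."""
    return sample_cov(xs, ys) / sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))

# Placeholder study data, made up for illustration only.
hours = [1.0, 2.0, 3.0, 4.0]
scores = [60.0, 70.0, 75.0, 95.0]
minutes = [h * 60.0 for h in hours]  # same study time, different unit

cov_hours = sample_cov(hours, scores)
cov_minutes = sample_cov(minutes, scores)
r_hours = correlation(hours, scores)
r_minutes = correlation(minutes, scores)

print("covariance (hours):  ", cov_hours)
print("covariance (minutes):", cov_minutes)
print("correlation (hours):  ", round(r_hours, 4))
print("correlation (minutes):", round(r_minutes, 4))
```

Measuring study time in minutes multiplies the covariance by 60, while the correlation stays the same up to floating-point rounding.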
Confusion Points To Watch
Treating A Large Covariance As Automatically Strong
A covariance with a large magnitude does not automatically signal a stronger relationship than one with a small magnitude. The variables may simply be measured on larger scales.
Mixing Up Sample And Population Formulas
If your data is a sample, divide by $n - 1$. If it is the whole population, divide by $N$.
Thinking Zero Covariance Means No Relationship At All
A covariance near 0 means little linear co-movement around the means. It does not rule out a nonlinear relationship. If two variables are independent and the covariance exists, the covariance is 0; the reverse is not always true.
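A small constructed case makes the point concrete. In this sketch y is completely determined by x through y = x squared, yet the covariance is zero; the data and the helper name `sample_cov` are assumptions for illustration.

```python
def sample_cov(xs, ys):
    """Sample covariance with the n - 1 denominator."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Constructed data: x is symmetric around 0 and y depends on x exactly.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 for x in xs]

cov_xy = sample_cov(xs, ys)
print(cov_xy)
```

The positive products on the right side of the parabola cancel the negative products on the left, so covariance sees no linear trend even though the relationship is perfectly predictable.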
Reading Covariance As Causation
Covariance only describes how variables vary together. It does not explain why.
Frequently Asked Questions
- What does covariance measure?
- Covariance measures whether two variables tend to be above their means together, below their means together, or move in opposite directions.
- Can covariance be negative?
- Yes. A negative covariance means higher values of one variable tend to occur with lower values of the other, relative to their means.
- What is the difference between covariance and correlation?
- Covariance keeps the original units and scale, while correlation standardizes the relationship so the result is unitless and easier to compare across data sets.