Grouped-data questions about medians and percentiles almost always come down to one column you can build yourself: the cumulative frequency, or running total of how many observations fall at or below each class boundary. Add that column and an ogive, the graph of that running total, and finding the median, quartiles, and percentiles becomes a procedure you can repeat.

When To Use Cumulative Frequency

Reach for cumulative frequency whenever a question asks for ordered position in a data set rather than a class-by-class count: a median, a quartile, or a percentile. It is the right tool when raw data are large and a grouped table is easier to read than a long list, and it is what an ogive is built from. If you only need to know how many observations land in one class, plain frequency is enough; the moment "at or below" enters the question, switch to cumulative frequency.

The Procedure, Step By Step

If the class frequencies are $f_1, f_2, \dots, f_k$, then the cumulative frequency up to class $k$ is

$$F_k = f_1 + f_2 + \cdots + f_k$$

Each row adds one more class to the total. If the cumulative frequency is 28 at the end of a class, then 28 observations are in that class or below it. For ungrouped data, cumulative frequency is just a running count; for grouped data, it is a running count by class interval.

To locate a value, work from the total frequency $N$:

  • the median is around the $N/2$th value
  • the first quartile is around the $N/4$th value
  • the third quartile is around the $3N/4$th value
  • the $p$th percentile is around the $(p/100)N$th value
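The position rule in the list above can be sketched in a few lines of Python. The function name `target_position` and the example total of 40 are illustrative assumptions, not part of any library.

```python
def target_position(total, percent):
    """Return the position (p/100) * N used to locate the p-th percentile.

    `total` is the total frequency N; `percent` is p (50 for the median,
    25 and 75 for the first and third quartiles).
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return percent * total / 100


N = 40  # example total, assumed for illustration
median_position = target_position(N, 50)
first_quartile_position = target_position(N, 25)
third_quartile_position = target_position(N, 75)
print(median_position, first_quartile_position, third_quartile_position)
```

The median and quartiles are just the 50th, 25th, and 75th percentiles, so one function covers all four bullets.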

An ogive turns those positions into a reading. Plot the upper class boundary on the horizontal axis and the cumulative frequency on the vertical axis, then join the points; the curve only rises, because cumulative frequency never decreases. Start from the target position on the vertical axis, move across to the curve, and drop down to the horizontal axis to estimate the value.

A Complete Pass Through One Table

Suppose test scores for 40 students are grouped like this:

Score    Frequency    Cumulative frequency
0-10     2            2
10-20    5            7
20-30    9            16
30-40    12           28
40-50    8            36
50-60    4            40

The total frequency is $N = 40$.
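The cumulative column in the table above is a running sum of the frequency column. A minimal sketch using the standard library's `itertools.accumulate`; the variable names are illustrative choices.

```python
from itertools import accumulate

# Class intervals and frequencies copied from the table above.
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
frequencies = [2, 5, 9, 12, 8, 4]

# Each cumulative entry is the frequency of its class plus all earlier classes.
cumulative = list(accumulate(frequencies))
total = cumulative[-1]  # the last cumulative entry is N

for (lower, upper), f, cf in zip(classes, frequencies, cumulative):
    print(f"{lower}-{upper}\t{f}\t{cf}")
print("N =", total)
```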

The median

The median is the $N/2 = 20$th value. Reading the cumulative frequencies, the total is 16 up to 20-30 and 28 up to 30-40, so the 20th value lies in the 30-40 class. If you want a grouped-data estimate, use interpolation only when it is reasonable to treat the values as spread fairly evenly through that class:

$$\text{median} \approx L + \frac{N/2 - F_{\text{before}}}{f} \cdot w$$

Here $L = 30$ is the lower boundary, $F_{\text{before}} = 16$ is the cumulative frequency before the class, $f = 12$ is the class frequency, and $w = 10$ is the class width. So

$$\text{median} \approx 30 + \frac{20 - 16}{12} \cdot 10 = 30 + \frac{40}{12} \approx 33.3$$

That estimate is not exact. It depends on the assumption that the values inside the 30-40 class are spread fairly smoothly.
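The interpolation step can be written as one small function. This is a sketch under the same even-spread assumption as the formula; the name `grouped_quantile` and its argument layout are hypothetical choices for illustration.

```python
def grouped_quantile(classes, frequencies, position):
    """Estimate the value at a cumulative position by linear interpolation.

    classes: list of (lower_boundary, upper_boundary) pairs, in order.
    frequencies: class frequencies, in the same order.
    position: target position, such as N/2 for the median.
    Assumes values are spread evenly inside the class that holds the position.
    """
    if position < 0:
        raise ValueError("position must be non-negative")
    before = 0  # cumulative frequency before the current class
    for (lower, upper), f in zip(classes, frequencies):
        if f > 0 and position <= before + f:
            width = upper - lower
            return lower + (position - before) / f * width
        before += f
    raise ValueError("position exceeds the total frequency")


classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
frequencies = [2, 5, 9, 12, 8, 4]
N = sum(frequencies)

median_estimate = grouped_quantile(classes, frequencies, N / 2)
print(round(median_estimate, 1))
```

The loop finds the first class whose cumulative frequency reaches the target position, which mirrors reading down the cumulative column by hand.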

The 75th percentile

The 75th percentile is the $(75/100) \cdot 40 = 30$th value. The total is 28 up to 30-40 and 36 up to 40-50, so the 30th value lies in the 40-50 class. Using the same interpolation idea,

$$P_{75} \approx 40 + \frac{30 - 28}{8} \cdot 10 = 42.5$$

On an ogive, you would mark 30 on the cumulative-frequency axis, move across to the curve, and read down to about 42.5 on the score axis.
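The "across, then down" reading can be mimicked numerically by treating the ogive as straight segments between plotted points. In the sketch below, the helper names `ogive_points` and `read_ogive` are hypothetical, and starting the curve at zero on the lowest class boundary is an assumed (though common) convention.

```python
def ogive_points(upper_boundaries, frequencies, start):
    """Return (boundary, cumulative frequency) points for an ogive.

    `start` is the lowest class boundary, plotted at cumulative frequency 0
    (an assumed convention for anchoring the curve).
    """
    points = [(start, 0)]
    running = 0
    for boundary, f in zip(upper_boundaries, frequencies):
        running += f
        points.append((boundary, running))
    return points


def read_ogive(points, target):
    """Move across at height `target`, then read down to the value axis."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y1 > y0 and y0 <= target <= y1:
            return x0 + (target - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("target is outside the range of the ogive")


upper_boundaries = [10, 20, 30, 40, 50, 60]
frequencies = [2, 5, 9, 12, 8, 4]
points = ogive_points(upper_boundaries, frequencies, start=0)

p75_reading = read_ogive(points, 30)
print(p75_reading)
```

With straight segments the reading agrees with the interpolation formula, because a straight line across a class is the same even-spread assumption. A hand-drawn smooth curve may give a slightly different reading.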

Where Each Step Tends To Break, And How To Check It

Step: building the column. The frequent slip is confusing frequency with cumulative frequency. Frequency counts one class; cumulative frequency counts that class plus all earlier ones. Self-check: the last cumulative entry must equal $N$.

Step: locating the position. The position comes from the total $N$. Use the wrong total and every later step is off. Self-check: re-derive the position from $N$ before reading the table.

Step: reading off the value. A grouped estimate lands inside a class, not on an exact original value, and it depends on how the data are distributed in the interval. Self-check: confirm your answer falls between the class boundaries you used.

Step: drawing the ogive. For grouped data, plot against class boundaries, especially upper class boundaries; plotting against midpoints changes the meaning. Self-check: the leftmost point should sit at the first cumulative frequency, the rightmost at $N$.
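Two of the self-checks above are easy to automate. The sketch below uses hypothetical helper names and the frequencies from the worked table; it is one possible way to express the checks, not a standard routine.

```python
from itertools import accumulate


def cumulative_column_ok(frequencies, cumulative):
    """True if `cumulative` is the running total of `frequencies`.

    A correct column necessarily ends at N, the sum of the frequencies.
    """
    return list(cumulative) == list(accumulate(frequencies))


def estimate_in_class(estimate, lower, upper):
    """True if a grouped estimate lies between the class boundaries used."""
    return lower <= estimate <= upper


frequencies = [2, 5, 9, 12, 8, 4]
good_column = [2, 7, 16, 28, 36, 40]
slipped_column = [2, 5, 9, 12, 8, 4]  # frequency copied in place of the running total

print(cumulative_column_ok(frequencies, good_column))     # correct column
print(cumulative_column_ok(frequencies, slipped_column))  # the frequent slip
print(estimate_in_class(42.5, 40, 50))                    # 75th percentile estimate
```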

Where Cumulative Frequency Shows Up

Cumulative frequency appears in exam-score summaries, income distributions, quality-control data, and any setting where percentiles or medians matter more than individual bin counts. It is especially handy when raw data are large and a grouped table reads more clearly than the full list of observations.

Run The Procedure Once More

Build a cumulative frequency column for a small grouped table with $N = 50$, then locate where the 20th, 25th, and 45th values fall. Draw the ogive, read off the median and one percentile, and compare those readings with the table-based estimates. Matching the two is the cleanest sign the procedure has clicked.

Frequently Asked Questions

What is cumulative frequency?
Cumulative frequency is the running total of frequencies, so it shows how many observations are at or below a given value or class boundary.

What is an ogive?
An ogive is a graph of cumulative frequency against value or class boundary. It is commonly used to read medians, quartiles, and percentiles.

Can you find exact percentiles from grouped data?
Usually only approximately. An estimate inside a class depends on an interpolation assumption, which treats values as spread fairly smoothly through that class.
