Interquartile Range (IQR) Explained: How to Find It Step by Step

The interquartile range is one of the most useful numbers in statistics, yet it’s often skipped over in favour of the average. This guide explains exactly what the IQR is, how to find it step by step for both odd and even data sets, how it powers outlier detection and box plots, and why it quietly does a lot of work in machine learning.

Figure: The interquartile range is the length of the box in a box plot.

Key takeaways

  • The interquartile range (IQR) is the spread of the middle 50% of your data: $\text{IQR}=Q_3-Q_1$.
  • It is robust — outliers barely affect it, unlike the range or the standard deviation.
  • It powers the 1.5 × IQR rule for spotting outliers and is the body of every box plot.
  • In machine learning it drives robust feature scaling and outlier removal on skewed data.

What is the interquartile range?

The IQR measures how spread out the middle half of a data set is. To get it, you split your ordered data into four equal parts using three cut points called quartiles, then look at the distance between the first and third of those cut points. Everything in between — the central 50% of your values — sits inside the interquartile range.

That “middle 50%” idea is what makes the IQR so valuable. The ordinary range (maximum minus minimum) can be wrecked by a single freak value. The interquartile range ignores the extreme top 25% and bottom 25% entirely, so one ridiculous outlier can’t distort it. For a formal treatment, see the IQR reference.

Definition. The interquartile range is the difference between the third quartile and the first quartile of an ordered data set: $\text{IQR}=Q_3-Q_1$. It describes the spread of the central 50% of the values.

Quartiles: the building blocks

You can’t understand the IQR without quartiles, so let’s be precise about them. Sort your data from smallest to largest and the quartiles are the three values that chop it into four equally sized groups:

  • First quartile (Q1) — the 25th percentile. A quarter of the data lies below it. It is the median of the lower half.
  • Second quartile (Q2) — the 50th percentile, which is simply the median. Half the data lies below it.
  • Third quartile (Q3) — the 75th percentile. Three quarters of the data lies below it. It is the median of the upper half.

Together with the minimum and maximum, these give the five-number summary (min, Q1, median, Q3, max) — a compact snapshot of any data set. The interquartile range is just the gap between Q1 and Q3 in that summary.

The IQR formula

The formula could not be simpler:

$$\text{IQR}=Q_3-Q_1$$

All the real work is in finding Q1 and Q3 correctly. There is a small wrinkle: statisticians use a few slightly different conventions for locating quartiles (the “exclusive” and “inclusive” methods differ in whether the median is included when you split the data). The method below — treating Q1 and Q3 as the medians of the lower and upper halves, excluding the overall median when the count is odd — is the one taught in most courses and used by our median calculator.
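To see how much the choice of convention can matter, here is a small sketch using Python's standard-library `statistics.quantiles` on a seven-value sample. Note the assumption: the `"exclusive"` and `"inclusive"` labels are that library's own names for two interpolation schemes, and they are not guaranteed to line up with the median-of-halves method on every data set.

```python
from statistics import quantiles

# Seven illustrative values (unsorted on purpose; quantiles() sorts internally).
data = [7, 2, 9, 4, 12, 5, 8]

# With n=4 the function returns the three cut points [Q1, Q2, Q3].
q1_exc, _, q3_exc = quantiles(data, n=4, method="exclusive")
q1_inc, _, q3_inc = quantiles(data, n=4, method="inclusive")

iqr_exclusive = q3_exc - q1_exc
iqr_inclusive = q3_inc - q1_inc

print(f"exclusive: Q1={q1_exc}, Q3={q3_exc}, IQR={iqr_exclusive}")  # Q1=4.0, Q3=9.0, IQR=5.0
print(f"inclusive: Q1={q1_inc}, Q3={q3_inc}, IQR={iqr_inclusive}")  # Q1=4.5, Q3=8.5, IQR=4.0
```

Same data, two defensible answers. That is why it pays to state which method you used whenever you report an IQR.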

How to find the interquartile range step by step

  1. Sort the data from smallest to largest. Quartiles are meaningless on unordered numbers.
  2. Find the median (Q2) to split the data into a lower half and an upper half.
  3. Find Q1 as the median of the lower half, and Q3 as the median of the upper half.
  4. Subtract: the IQR is $Q_3-Q_1$.

ℹ️ Odd vs even counts

If the data set has an odd number of values, exclude the middle value (the median) from both halves before finding Q1 and Q3. If it has an even number, the data splits cleanly into two halves and you use every value.
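The four steps and the odd/even rule can be sketched in a few lines of Python. This is a minimal illustration of the median-of-halves method, not a reference implementation; the function names `quartiles` and `iqr` and the sample values are made up for the example.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)                  # step 1: sort
    if len(ordered) < 2:
        raise ValueError("need at least two values to form two halves")
    half = len(ordered) // 2
    lower = ordered[:half]                    # step 2: split around the median
    # Odd count: skip the middle value. Even count: use every value.
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)   # step 3

def iqr(values):
    q1, _, q3 = quartiles(values)
    return q3 - q1                            # step 4: subtract

odd_sample = [11, 3, 8, 15, 6]    # sorted: 3, 6, 8, 11, 15
even_sample = [4, 1, 9, 6]        # sorted: 1, 4, 6, 9

print(iqr(odd_sample))    # 8.5  (Q1 = 4.5, Q3 = 13.0)
print(iqr(even_sample))   # 5.0  (Q1 = 2.5, Q3 = 7.5)
```

When a half has an even number of values, its median is the average of the two middle values, which is why Q1 and Q3 can land between data points.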

Worked example 1: an odd number of values

Take the data set 7, 2, 9, 4, 12, 5, 8 — seven values.

First, sort it: 2, 4, 5, 7, 8, 9, 12. With seven values the median is the 4th value, 7. Because the count is odd, we set that middle value aside. The lower half is 2, 4, 5 and the upper half is 8, 9, 12.

The median of the lower half is Q1 = 4. The median of the upper half is Q3 = 9. Therefore the interquartile range is:

$$\text{IQR}=Q_3-Q_1=9-4=5$$

So the middle 50% of this data set is spread across a range of 5 units. Notice that the largest value, 12, played no part in the calculation — that robustness is the whole point.

Worked example 2: an even number of values

Now take 10, 3, 7, 1, 8, 5 — six values. Sorted: 1, 3, 5, 7, 8, 10. With an even count the data splits straight down the middle into a lower half 1, 3, 5 and an upper half 7, 8, 10.

Q1 is the median of 1, 3, 5, which is 3. Q3 is the median of 7, 8, 10, which is 8. So:

$$\text{IQR}=8-3=5$$

If you ever want to skip the arithmetic, paste either data set into the median calculator — it reports Q1, Q3 and the IQR automatically, which is handy for checking your hand calculations.

The interquartile range and outliers: the 1.5 × IQR rule

The most famous use of the IQR is detecting outliers. A value is flagged as a likely outlier if it falls beyond “fences” placed one and a half interquartile ranges past the quartiles:

$$\text{Lower fence}=Q_1-1.5\times\text{IQR},\qquad \text{Upper fence}=Q_3+1.5\times\text{IQR}$$

Anything below the lower fence or above the upper fence is treated as an outlier. Let’s apply it to the first example, where Q1 = 4, Q3 = 9 and IQR = 5. The lower fence is $4-1.5\times5=-3.5$ and the upper fence is $9+1.5\times5=16.5$. Every value in 2, 4, 5, 7, 8, 9, 12 sits comfortably inside that window, so there are no outliers — which matches our intuition.

⚠️ Common mistake

Treating the 1.5 multiplier as a law of nature. It is only a convention, and some analysts use 3 × IQR to flag “extreme” outliers. The IQR itself never changes; only the multiplier you choose for the fences does.
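As a rough sketch, the fence rule translates into code like this. The helper names (`quartiles`, `iqr_fences`, `find_outliers`) and the extra value 40 are illustrative assumptions; the multiplier is exposed as a parameter `k` because it is a convention you can change.

```python
from statistics import median

def quartiles(values):
    """(Q1, Q2, Q3) by the median-of-halves method; needs at least two values."""
    ordered = sorted(values)
    half = len(ordered) // 2
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(ordered[:half]), median(ordered), median(upper)

def iqr_fences(values, k=1.5):
    """Return (lower fence, upper fence) placed k IQRs beyond the quartiles."""
    q1, _, q3 = quartiles(values)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread

def find_outliers(values, k=1.5):
    """Return the values that fall strictly outside the fences."""
    low, high = iqr_fences(values, k)
    return [v for v in values if v < low or v > high]

example = [2, 4, 5, 7, 8, 9, 12]
print(iqr_fences(example))        # (-3.5, 16.5)
print(find_outliers(example))     # []

with_extreme = example + [40]     # hypothetical extra observation
print(iqr_fences(with_extreme))   # (-4.5, 19.5)
print(find_outliers(with_extreme))        # [40]
print(find_outliers(with_extreme, k=3))   # [40], still outside the wider fences
```

Adding a value changes the quartiles slightly (Q1 moves to 4.5 and Q3 to 10.5 here), so the fences are always recomputed from the full data set.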

The box plot connection

Open any box plot and you are looking at the interquartile range directly. The box stretches from Q1 to Q3, so its length is the IQR. The line inside the box is the median, and the “whiskers” reach out to the most extreme values that still fall inside the 1.5 × IQR fences. Any point beyond a whisker is drawn separately as an outlier. This is why box plots are such an efficient way to compare several groups at a glance: a wide box means a large interquartile range and lots of spread; a narrow box means the middle of the data is tightly packed.
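To make the anatomy concrete, here is a sketch that computes the numbers a box plot draws, without plotting anything. It assumes the median-of-halves quartiles used in this guide; plotting libraries often interpolate percentiles instead, so their boxes may differ slightly. The function name `box_plot_stats` and the sample data are illustrative.

```python
from statistics import median

def quartiles(values):
    """(Q1, Q2, Q3) by the median-of-halves method; needs at least two values."""
    ordered = sorted(values)
    half = len(ordered) // 2
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(ordered[:half]), median(ordered), median(upper)

def box_plot_stats(values, k=1.5):
    """Return the box, median line, whisker ends and outlier points."""
    ordered = sorted(values)
    q1, q2, q3 = quartiles(ordered)
    low_fence = q1 - k * (q3 - q1)
    high_fence = q3 + k * (q3 - q1)
    # Whiskers stop at the most extreme data values still inside the fences.
    inside = [v for v in ordered if low_fence <= v <= high_fence]
    return {
        "box": (q1, q3),
        "median": q2,
        "whiskers": (inside[0], inside[-1]),
        "outliers": [v for v in ordered if v < low_fence or v > high_fence],
    }

stats = box_plot_stats([2, 4, 5, 7, 8, 9, 12, 40])
print(stats["box"])        # (4.5, 10.5)
print(stats["median"])     # 7.5
print(stats["whiskers"])   # (2, 12)
print(stats["outliers"])   # [40]
```

Note that the upper whisker ends at 12, the largest value inside the fence, not at the fence value 19.5 itself.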

The IQR turns a messy column of numbers into a single, outlier-resistant measure of how spread out the typical values really are.

Interquartile range vs range vs standard deviation

Three measures describe spread, and choosing the right one matters. Here is how they compare:

| Measure | What it captures | Affected by outliers? | Best for |
| --- | --- | --- | --- |
| Range | Max − min (full spread) | Extremely: one value sets it | Quick, rough idea |
| Interquartile range | Spread of the middle 50% | Barely: ignores the tails | Skewed data, outlier detection |
| Standard deviation | Average distance from the mean | Yes: squares the deviations | Roughly symmetric data |

The rule of thumb: report the median and interquartile range when your data is skewed or full of outliers, and the mean and variance or standard deviation when it is roughly symmetric. House prices and incomes, which have long right tails, are almost always summarised with the median and the IQR for exactly this reason.
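A quick experiment shows the difference between the three measures. The sketch below swaps one value of a small sample for a wild one and recomputes each measure; the data, the replacement value 1200 and the helper names are assumptions chosen for illustration.

```python
from statistics import median, stdev

def quartiles(values):
    """(Q1, Q2, Q3) by the median-of-halves method; needs at least two values."""
    ordered = sorted(values)
    half = len(ordered) // 2
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(ordered[:half]), median(ordered), median(upper)

def spread_summary(values):
    """Return the range, IQR and sample standard deviation of the values."""
    q1, _, q3 = quartiles(values)
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "stdev": stdev(values),
    }

clean = [2, 4, 5, 7, 8, 9, 12]
contaminated = [2, 4, 5, 7, 8, 9, 1200]   # largest value replaced by a freak reading

before = spread_summary(clean)
after = spread_summary(contaminated)

for name in ("range", "iqr", "stdev"):
    print(f"{name:>5}: {before[name]:.2f} -> {after[name]:.2f}")
```

The range jumps from 10 to 1198 and the standard deviation grows by roughly two orders of magnitude, while the IQR stays at 5 because the quartiles never look at the largest value.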

Why the interquartile range matters in machine learning

The IQR earns its place in any machine learning toolkit, and it shows up in three practical ways. First, robust feature scaling: libraries like scikit-learn include a RobustScaler that centres each feature on its median and scales it by the interquartile range instead of the standard deviation. On data with heavy outliers, this keeps the scaling stable where standard scaling would be dragged around by the extremes.
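The idea behind robust scaling fits in a few lines. The sketch below is a hand-rolled illustration, not scikit-learn's implementation: it uses the median-of-halves quartiles from this guide, whereas RobustScaler computes its default 25th to 75th percentile range with interpolated percentiles, so its output can differ slightly. The function name `robust_scale` and the feature values are hypothetical.

```python
from statistics import median

def quartiles(values):
    """(Q1, Q2, Q3) by the median-of-halves method; needs at least two values."""
    ordered = sorted(values)
    half = len(ordered) // 2
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(ordered[:half]), median(ordered), median(upper)

def robust_scale(values):
    """Centre on the median and divide by the IQR."""
    q1, med, q3 = quartiles(values)
    spread = q3 - q1
    if spread == 0:
        raise ValueError("IQR is zero; cannot scale this feature")
    return [(v - med) / spread for v in values]

feature = [2, 4, 5, 7, 8, 9, 1200]   # one extreme value in an otherwise tame column
scaled = robust_scale(feature)
print([round(x, 2) for x in scaled])
# [-1.0, -0.6, -0.4, 0.0, 0.2, 0.4, 238.6]
```

The six ordinary values land in a compact band around zero, and the extreme value stays visibly extreme instead of squashing everything else toward a single point.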

Second, outlier removal: the 1.5 × IQR rule is one of the simplest, most defensible ways to filter anomalous rows before training. Because the IQR is robust, the very outliers you want to remove don’t inflate the threshold that’s supposed to catch them — a problem that wrecks naive standard-deviation-based filters.
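The sketch below illustrates that masking problem on a tiny made-up sample. It compares the 1.5 × IQR rule with a "mean ± 3 standard deviations" filter; the 3-sigma threshold is an assumption here (a common choice, not something this guide prescribes), and the function names are illustrative.

```python
from statistics import mean, median, stdev

def quartiles(values):
    """(Q1, Q2, Q3) by the median-of-halves method; needs at least two values."""
    ordered = sorted(values)
    half = len(ordered) // 2
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(ordered[:half]), median(ordered), median(upper)

def flag_by_iqr(values, k=1.5):
    """Values outside the k x IQR fences."""
    q1, _, q3 = quartiles(values)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v for v in values if v < low or v > high]

def flag_by_stdev(values, k=3.0):
    """Values more than k sample standard deviations from the mean."""
    centre, sd = mean(values), stdev(values)
    return [v for v in values if abs(v - centre) > k * sd]

rows = [2, 4, 5, 7, 8, 9, 12, 40]

print(flag_by_iqr(rows))     # [40]
print(flag_by_stdev(rows))   # []  (40 inflates the mean and the SD that should catch it)
```

Here the value 40 pulls the mean up to about 10.9 and the standard deviation up to about 12.2, so the 3-sigma cut-off sits near 47 and the outlier slips through. The IQR fences, built from the middle of the data, are not moved nearly as much.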

Third, honest data exploration: when you first meet a dataset, plotting the interquartile range for each feature (via box plots) instantly reveals which columns are skewed, which carry outliers, and which need transforming before a model ever sees them.

🤖 ML insight

When a feature is heavily skewed, scaling it by the standard deviation lets a handful of extreme values dominate. Scaling by the IQR instead — the idea behind RobustScaler — keeps the bulk of the data well-behaved, which is why it’s a go-to for real-world, messy features.

Tips for working with the interquartile range

A few practical pointers save a lot of confusion. Always sort first; skipping the sort is the single most common source of wrong answers. Be consistent about your quartile method: if you compare two datasets, compute their IQRs the same way. And remember that a small interquartile range doesn’t mean “no variation”; it means the middle of the data is tight, while the tails could still be long. Pair the IQR with a box plot whenever you can, so you see the shape, not just the number.

Frequently asked questions

What is the interquartile range in simple terms?
It is the spread of the middle 50% of your data: the distance between the first quartile (Q1) and the third quartile (Q3).

How do you calculate the IQR?
Sort the data, find the median to split it in half, take Q1 as the median of the lower half and Q3 as the median of the upper half, then subtract: IQR = Q3 − Q1.

Why is the interquartile range better than the range?
The range depends entirely on the two most extreme values, so a single outlier distorts it. The IQR ignores the top and bottom 25%, so it is far more robust.

What is the 1.5 × IQR rule?
It flags outliers: any value below Q1 − 1.5×IQR or above Q3 + 1.5×IQR is treated as an outlier. It is the basis of box-plot whiskers.

Is the interquartile range the same as the standard deviation?
No. The standard deviation measures average distance from the mean and is sensitive to outliers; the IQR measures the spread of the middle half and is robust to them.

When should I use the interquartile range in machine learning?
Use it for robust feature scaling (RobustScaler), for outlier removal via the 1.5×IQR rule, and for exploring skewed features with box plots.

Interquartile range: summary

The IQR is the spread of the central 50% of a data set, found by subtracting the first quartile from the third. Its great strength is robustness: outliers, which destroy the range and distort the standard deviation, barely touch it. That robustness is exactly why it underpins outlier detection, box plots, and robust scaling in machine learning. Work through the two examples above, try your own numbers in the median calculator, and explore the rest of the foundations in our statistics for machine learning guide.
