Ordinal Logistic Regression: Best Simple Guide 2026

Ordinal logistic regression is the model you reach for when your outcome has three or more ordered categories — low/medium/high, disagree/neutral/agree, or a survey rating from 1 to 5 — and you want to use that ordering instead of throwing it away. This guide explains the proportional-odds model, the cumulative-logit math, a worked example, and exactly when to pick it over its cousins.

[Figure: Ordinal logistic regression models ordered outcome categories.]

What is ordinal logistic regression?

Ordinal logistic regression — also called the ordered logit or proportional odds model — predicts an outcome that falls into one of several ordered categories. Ordinary binary logistic regression handles a two-way yes/no split. Ordinal logistic regression generalizes it to $J$ ordered levels, such as a satisfaction rating of low, medium, or high, where high sits above medium which sits above low. The order is real information, and the model is built to respect it.

The trick is to avoid modeling each category in isolation. Instead, ordinal logistic regression cuts the ordered scale at every boundary between adjacent categories and models the cumulative probability of being at or below each cut. One set of slopes is shared across all those cuts, which is what makes the model compact and interpretable. You can experiment with the binary building block in our logistic regression calculator before stacking up the ordered version.

Ordinal logistic regression: a model for an outcome with $J \ge 3$ ordered categories. It estimates $J-1$ threshold intercepts $\alpha_j$ and a single shared set of slopes $\beta$, modeling the log-odds of falling at or below each category boundary as a linear function of the predictors.

The proportional-odds (cumulative-logit) model

Let the outcome $Y$ take ordered values $1, 2, \dots, J$. For each cut point $j$ (from $1$ up to $J-1$), the model writes the log-odds of being in category $j$ or lower as:

$$\ln\!\left(\frac{P(Y \le j)}{P(Y > j)}\right) = \alpha_j - \beta_1 x_1 - \beta_2 x_2 - \cdots - \beta_k x_k$$

Read the pieces carefully. The left side is a cumulative logit — the log-odds of landing at or below category $j$. Each $\alpha_j$ is a threshold (a category-specific intercept), and because the categories are ordered, the thresholds are too: $\alpha_1 < \alpha_2 < \cdots < \alpha_{J-1}$. The slopes $\beta_1, \dots, \beta_k$ are shared across every threshold — there is only one $\beta$ per predictor, not a different one at each cut. That single shared-slope rule is the famous proportional-odds assumption.

The minus sign in front of the $\beta x$ terms is a convention that makes interpretation pleasant: a positive $\beta$ means larger $x$ pushes the outcome toward higher categories. With the cumulative form in hand, the probability of any single category is just a difference of two adjacent cumulative probabilities, $P(Y = j) = P(Y \le j) - P(Y \le j-1)$, with the boundary conventions $P(Y \le 0) = 0$ and $P(Y \le J) = 1$.
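The cumulative form translates directly into code. The following is a minimal sketch in plain Python, assuming the sign convention above; the function names `cumulative_prob` and `category_probs` and the numbers in the usage line are illustrative, not taken from a library or a fitted model.

```python
import math


def cumulative_prob(alpha_j, betas, xs):
    """P(Y <= j) under the cumulative-logit model: logit = alpha_j - sum(beta * x)."""
    eta = alpha_j - sum(b * x for b, x in zip(betas, xs))
    return 1.0 / (1.0 + math.exp(-eta))


def category_probs(alphas, betas, xs):
    """Return [P(Y=1), ..., P(Y=J)] given J-1 strictly increasing thresholds."""
    if any(a >= b for a, b in zip(alphas, alphas[1:])):
        raise ValueError("thresholds must be strictly increasing")
    # Pad with P(Y <= 0) = 0 and P(Y <= J) = 1, then take adjacent differences.
    cum = [0.0] + [cumulative_prob(a, betas, xs) for a in alphas] + [1.0]
    return [hi - lo for lo, hi in zip(cum, cum[1:])]


# Illustrative values only: 4 categories (3 thresholds), 2 predictors.
probs = category_probs(alphas=[-1.0, 0.5, 2.0], betas=[0.7, -0.3], xs=[1.0, 2.0])
print([round(p, 3) for p in probs])
```

Because the thresholds are increasing and the slopes are shared, the cumulative probabilities are increasing in $j$, so every difference is positive and the category probabilities sum to one.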

📊 What “proportional odds” means

For any predictor, $e^{\beta}$ is a single odds ratio that applies identically at every threshold. Moving one unit in $x$ multiplies the odds of being in a higher category by the same factor whether you are comparing low-vs-rest or high-vs-rest. One number summarizes the whole effect.
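A short derivation shows where the single odds ratio comes from. Take one predictor $x$ with the others held fixed. Inverting the cumulative logit gives the odds of landing above cut $j$:

$$\frac{P(Y > j \mid x)}{P(Y \le j \mid x)} = e^{-(\alpha_j - \beta x)} = e^{\beta x - \alpha_j}$$

Dividing the odds at $x + 1$ by the odds at $x$:

$$\frac{e^{\beta (x+1) - \alpha_j}}{e^{\beta x - \alpha_j}} = e^{\beta}$$

The threshold $\alpha_j$ cancels, so the ratio does not depend on $j$. That cancellation is the proportional-odds property.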

Checking the proportional-odds assumption

Because the model leans on one shared slope per predictor, you should sanity-check that the assumption is reasonable rather than just assume it. Three practical checks:

Fit separate binary models at each cut. Collapse the outcome into “at or below $j$” versus “above $j$” for each threshold and fit a plain logistic regression. If the slope for a predictor is roughly stable across those cuts, proportional odds is plausible.
Run a formal test. The Brant test (or a likelihood-ratio test against a model with cut-specific slopes) flags predictors whose effect drifts across thresholds. A small p-value warns the assumption is shaky.
Plot the cumulative logits. If the cumulative-logit lines stay roughly parallel across levels of a predictor (the gaps between them stay constant), the proportional-odds picture holds; lines that clearly converge or cross suggest it does not.
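The first check can be sketched without a fitting library when the predictor is binary, because the slope of a logistic regression on a single binary $x$ equals the log odds ratio of the 2x2 table. The example below is an illustration on simulated data, not a general-purpose test: the helper names, the sample size, and the parameter values are all assumptions chosen for the demo.

```python
import math
import random


def simulate(n, alphas, beta, seed=0):
    """Draw (x, y) pairs from a proportional-odds model with one binary predictor."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.randint(0, 1)
        # Cumulative probabilities P(Y <= j) = logistic(alpha_j - beta * x).
        cum = [1.0 / (1.0 + math.exp(-(a - beta * x))) for a in alphas]
        u = rng.random()
        # y is the first category whose cumulative probability covers u.
        y = 1 + sum(u > c for c in cum)
        data.append((x, y))
    return data


def log_odds_ratio_at_cut(data, j):
    """Log odds ratio of Y > j for x = 1 versus x = 0, from the 2x2 table."""
    counts = {(x, above): 0 for x in (0, 1) for above in (False, True)}
    for x, y in data:
        counts[(x, y > j)] += 1
    odds_x1 = counts[(1, True)] / counts[(1, False)]
    odds_x0 = counts[(0, True)] / counts[(0, False)]
    return math.log(odds_x1 / odds_x0)


# Simulated data where proportional odds holds by construction (true beta = 0.8).
data = simulate(n=20000, alphas=[-1.2, 1.0], beta=0.8)
slopes = [log_odds_ratio_at_cut(data, j) for j in (1, 2)]
print([round(s, 2) for s in slopes])  # cut-specific slopes; compare them to each other
```

When the assumption holds, the cut-specific slopes differ only by sampling noise. On real data with continuous or multiple predictors, the same idea requires fitting an actual binary logistic regression at each cut.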

⚠ When the assumption fails

If one predictor clearly violates proportional odds, you are not stuck. Options include a partial proportional odds model (free slopes only for the offending predictor), an adjacent-categories or continuation-ratio model, or falling back to multinomial logistic regression, which estimates separate slopes everywhere at the cost of ignoring order.
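As a sketch of the first option, assume two predictors where only $x_2$ violates the assumption (the subscripts here are illustrative). A partial proportional odds model keeps a shared slope for $x_1$ and lets the slope for $x_2$ vary by cut:

$$\ln\!\left(\frac{P(Y \le j)}{P(Y > j)}\right) = \alpha_j - \beta_1 x_1 - \beta_{2j} x_2$$

Setting $\beta_{2j} = \beta_2$ for every $j$ recovers the proportional-odds model, so the two are nested and can be compared with a likelihood-ratio test.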

Worked example: predicting satisfaction

Suppose we predict a customer satisfaction rating with three ordered levels — low (1), medium (2), high (3) — from a single predictor $x$, the number of support interactions resolved on first contact. Fitting an ordinal logistic regression returns two thresholds and one slope:

$$\alpha_1 = -1.2, \qquad \alpha_2 = 1.0, \qquad \beta_1 = 0.8$$

The shared slope $\beta_1 = 0.8$ gives an odds ratio of $e^{0.8} \approx 2.23$. Interpret it as the effect on the odds of landing in a higher satisfaction category: each additional first-contact resolution multiplies the odds of being in a higher rating bucket by about $2.23$, and this holds at both the low-vs-(medium+high) cut and the (low+medium)-vs-high cut.

To get an actual probability, plug a value of $x$ into the cumulative logits. For $x = 2$:

$$P(Y \le 1) = \frac{1}{1 + e^{-(\alpha_1 - \beta_1 x)}} = \frac{1}{1 + e^{-(-1.2 - 1.6)}} \approx 0.057$$

$$P(Y \le 2) = \frac{1}{1 + e^{-(\alpha_2 - \beta_1 x)}} = \frac{1}{1 + e^{-(1.0 - 1.6)}} \approx 0.354$$

From these cumulatives, the per-category probabilities follow by subtraction: $P(Y=1) \approx 0.057$, $P(Y=2) = 0.354 - 0.057 \approx 0.297$, and $P(Y=3) = 1 - 0.354 \approx 0.646$. So a customer with two first-contact resolutions is most likely to be highly satisfied, which is exactly the kind of ordered, probabilistic answer ordinal logistic regression is built to give.
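The arithmetic is easy to reproduce. This short Python sketch recomputes the three probabilities at $x = 2$ from the thresholds and slope given in the example, then checks numerically that a one-unit increase in $x$ multiplies the odds of a higher category by $e^{0.8}$ at both cuts. The helper names are illustrative.

```python
import math

alpha = [-1.2, 1.0]  # thresholds from the worked example
beta = 0.8           # shared slope from the worked example


def cum_probs(x):
    """P(Y <= 1) and P(Y <= 2) at predictor value x."""
    return [1.0 / (1.0 + math.exp(-(a - beta * x))) for a in alpha]


def odds_above(x, j):
    """Odds of Y > j at predictor value x."""
    p_le = cum_probs(x)[j - 1]
    return (1.0 - p_le) / p_le


c1, c2 = cum_probs(2)
probs = [c1, c2 - c1, 1.0 - c2]
print([round(p, 3) for p in probs])  # [0.057, 0.297, 0.646]

# Odds ratio for moving from x = 2 to x = 3, evaluated at each cut.
ratios = [odds_above(3, j) / odds_above(2, j) for j in (1, 2)]
print(ratios, math.exp(beta))
```

Both entries of `ratios` agree with `math.exp(beta)` up to floating-point error, which is the proportional-odds property seen in numbers.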

Ordinal vs multinomial vs binary logistic regression

All three are logistic models; they differ in how many classes the outcome has and whether those classes are ordered. The table makes the choice concrete:

Model	Outcome ordered?	Number of classes	What it estimates
Binary logistic	N/A (just two)	2	One intercept and one slope set; a single logit
Ordinal logistic	Yes, ordered	3 or more	$J-1$ thresholds, one shared slope set (proportional odds)
Multinomial logistic	No, unordered	3 or more	Separate slope set for each class vs a reference

The key contrast is the slope count. Ordinal logistic regression spends one slope per predictor and many thresholds; multinomial logistic regression spends a full slope set per class. Ordinal is leaner precisely because it borrows strength from the ordering.
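To make the slope-count contrast concrete, the sketch below counts free parameters for each model given $J$ categories and $k$ predictors. The function names are illustrative, and the multinomial count assumes the usual reference-class parameterization described in the table.

```python
def ordinal_param_count(n_categories, n_predictors):
    """J-1 thresholds plus one shared slope per predictor."""
    return (n_categories - 1) + n_predictors


def multinomial_param_count(n_categories, n_predictors):
    """An intercept and a full slope set for each of the J-1 non-reference classes."""
    return (n_categories - 1) * (1 + n_predictors)


# Compare the two counts for a fixed number of predictors as J grows.
for j in (3, 5, 7):
    print(j, ordinal_param_count(j, 4), multinomial_param_count(j, 4))
```

With a 5-point rating and four predictors, the ordinal model has 8 parameters and the multinomial model has 20, and the gap widens as categories are added.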

When to use ordinal vs multinomial

The decision rule is short: use ordinal logistic regression when the order of the categories carries information, and use multinomial when it does not.

Are the categories genuinely ordered? Ratings, agreement scales, severity levels (mild/moderate/severe) → ordinal logistic regression.
Are they just labels with no natural ranking? Predicting which of red/green/blue, or which product category → multinomial logistic regression.
Are there exactly two outcomes? → plain binary logistic regression.

Ignoring order has a real cost. If you feed an ordered outcome into a multinomial model, you throw away the ranking and lose statistical power: you estimate $J-1$ slopes per predictor where one would do, and the coefficients become harder to interpret. When order is meaningful, the proportional-odds model gives you a tighter, more interpretable fit with a single odds ratio per predictor.

✅ Rule of thumb

If you could sensibly say one category is “more” than another, the order matters: reach for ordinal logistic regression first and only relax to multinomial if the proportional-odds assumption fails badly.

🤖 ML context

Ordinal logistic regression belongs to the same generalized-linear-model family as binary and multinomial logistic regression: it keeps the same linear backbone and applies the logit link to cumulative probabilities rather than to a single class probability. It shows up across supervised learning wherever targets are ordered ratings: review scores, credit grades, disease staging. Master the binary case first with the logistic regression calculator, then review the shared logistic regression assumptions before trusting an ordinal fit.

Frequently asked questions

What is ordinal logistic regression?

Ordinal logistic regression, also called the ordered logit or proportional odds model, predicts an outcome that falls into three or more ordered categories such as low/medium/high. It models cumulative probabilities with category-specific thresholds and one shared set of slopes.

What is the proportional-odds assumption?

It is the rule that each predictor has a single slope shared across all category thresholds, so the odds ratio is the same at every cut point. One unit of the predictor multiplies the odds of a higher category by the same factor everywhere.

How is ordinal different from multinomial logistic regression?

Ordinal logistic regression assumes the categories are ordered and estimates one shared slope per predictor. Multinomial logistic regression treats the categories as unordered labels and estimates a separate slope set for each class versus a reference.

How do I interpret the odds ratio in ordinal logistic regression?

Exponentiate the slope to get $e^{\beta}$. It is the factor by which the odds of being in a higher outcome category multiply for each one-unit increase in the predictor, and under proportional odds it applies at every threshold.

When should I use ordinal logistic regression?

Use it when the outcome has three or more categories that are genuinely ordered, such as ratings or severity levels, so the ordering carries information. Ignoring that order with a multinomial model loses statistical power.

Key takeaways

Ordinal logistic regression models an ordered outcome by stacking cumulative logits: $J-1$ thresholds capture where the category boundaries sit, while a single shared slope per predictor captures the effect, summarized by one odds ratio under the proportional-odds assumption. Check that assumption, and if it holds you get a compact, interpretable model that respects the ordering instead of discarding it. Continue with the logistic regression calculator, compare it against multinomial logistic regression, review the logistic regression assumptions, or read the formal reference on Wikipedia.