Logistic Regression Assumptions

The logistic regression assumptions are the short list of conditions your data must satisfy for the model’s coefficients, p-values, and predicted probabilities to be trustworthy — and they are noticeably gentler than the assumptions linear regression demands. This guide walks through all seven, explains why each one matters, and shows you exactly how to check it before you trust the output.

[Figure: logistic regression assumptions checklist. The key assumptions behind a valid logistic regression model.]

The seven logistic regression assumptions at a glance

Logistic regression is a generalized linear model, so it inherits a linear backbone but applies it to the log-odds rather than the raw outcome. That single design choice reshapes the whole assumption list. Here are all seven, what each means, and how to verify it:

| Assumption | What it means | How to check |
| --- | --- | --- |
| Binary / categorical outcome | The dependent variable is a class label, not a continuous number | Confirm the target has two (or a fixed set of) categories |
| Independence of observations | Each row is a separate, unrelated case | Check study design; watch for repeated measures or clustering |
| Linearity of the logit | Each continuous predictor is linear in the log-odds, not the probability | Box-Tidwell test; plot logit vs predictor; add an $x\ln x$ term |
| Little multicollinearity | Predictors are not strongly correlated with each other | Variance inflation factor (VIF); correlation matrix |
| No extreme outliers | No single point dominates the fit | Standardized residuals, Cook’s distance, leverage |
| Adequate sample size | Enough events per predictor to estimate stable coefficients | Rule of thumb: about 10 events per predictor |
| No perfect separation | No predictor splits the classes perfectly | Watch for huge coefficients and standard errors that explode |

You can experiment with all of these in practice using our logistic regression calculator. Now let us walk through each assumption in turn.

1. A binary or properly categorical outcome

The first of the logistic regression assumptions is the most basic: the response variable must be categorical. Standard (binary) logistic regression expects exactly two outcomes — coded $0$ and $1$, such as pass/fail, churn/retain, or disease/no disease. If you have more than two unordered categories you need multinomial logistic regression; for ordered categories you need ordinal logistic regression. Feeding a continuous target into a logistic model is a category error — that is a job for linear regression.

How to check: simply inspect your target column. For the classic two-class case, see our guide to binary logistic regression.
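
The inspection can be automated in a few lines. The following is a minimal sketch in plain Python; the function name `check_binary_target` and the sample column `y` are hypothetical, chosen only for illustration.

```python
def check_binary_target(values):
    """Return the sorted distinct labels if there are exactly two, else raise."""
    labels = sorted(set(values))
    if len(labels) != 2:
        raise ValueError(f"expected 2 classes, found {len(labels)}: {labels}")
    return labels

# Hypothetical target column
y = [0, 1, 1, 0, 1, 0, 0]
labels = check_binary_target(y)
print(labels)  # [0, 1]
```

A column that raises here with three or more labels is a candidate for multinomial or ordinal logistic regression rather than the binary model.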

2. Independence of observations

Each observation must be independent of the others. The likelihood that the model maximizes assumes that knowing the outcome of one row tells you nothing extra about another. This breaks when you have repeated measurements on the same subject, time-series data, or observations clustered within groups (students within schools, patients within hospitals).

How to check: there is no single statistic; this is mostly a question of study design. If the same unit appears multiple times, or rows are grouped, reach for a mixed-effects (multilevel) model or generalized estimating equations (GEE) instead. Ignoring positive within-group correlation typically makes standard errors too small, so significance gets overstated.
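
Although independence is a design question, one symptom is easy to screen for: the same unit appearing in more than one row. The sketch below assumes your data has some identifier column; the name `patient_ids` and the helper `repeated_units` are hypothetical.

```python
from collections import Counter

def repeated_units(unit_ids):
    """Return {unit_id: count} for every unit that appears more than once."""
    counts = Counter(unit_ids)
    return {unit: n for unit, n in counts.items() if n > 1}

# Hypothetical patient identifiers, one per row
patient_ids = ["p01", "p02", "p03", "p02", "p04", "p02", "p01"]
repeats = repeated_units(patient_ids)
print(repeats)  # {'p01': 2, 'p02': 3}
```

An empty result does not prove independence (clustering by school or hospital would not show up here), but a non-empty one is a clear sign that a plain logistic model is the wrong tool.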

3. Linearity of the logit (not the probability)

This is the assumption beginners most often misread. Logistic regression does not assume a straight-line relationship between predictors and the probability — that relationship is the curved sigmoid. What it assumes is that each continuous predictor is linear in the log-odds:

$$\ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k$$

So the linearity assumption lives on the logit scale. If the true relationship between, say, age and the log-odds bends, the model will be misspecified even though the probability curve still looks like a reasonable S.
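
A small numerical illustration may help here. The coefficients `b0` and `b1` and the age grid are made up for the example; the point is only that equal steps in the predictor give equal steps in the log-odds but unequal steps in the probability.

```python
import math

def sigmoid(z):
    """Map log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Map a probability back to log-odds."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients for a one-predictor model
b0, b1 = -4.0, 0.1
ages = [20, 30, 40, 50, 60]
probs = [sigmoid(b0 + b1 * a) for a in ages]
logits = [logit(p) for p in probs]

# Successive differences for each 10-year step in age
logit_steps = [logits[i + 1] - logits[i] for i in range(len(ages) - 1)]
prob_steps = [probs[i + 1] - probs[i] for i in range(len(ages) - 1)]
print([round(s, 3) for s in logit_steps])  # every step equals b1 * 10 = 1.0
print([round(s, 3) for s in prob_steps])   # steps differ: larger near p = 0.5
```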

📊 The Box-Tidwell check
The standard test is Box-Tidwell: for each continuous predictor $x$, keep $x$ in the model and add an interaction term between $x$ and its own natural log, $x \cdot \ln(x)$. Because $\ln(x)$ is only defined for $x > 0$, shift any predictor that contains zeros or negative values before building the term. Refit the model. If that $x\ln x$ term is statistically significant, the linearity-of-the-logit assumption is violated for that predictor and you should transform it (log, spline, or polynomial). You can also simply plot the empirical logit against the predictor and look for curvature.
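
The check can be sketched with NumPy alone. This is an illustration under stated assumptions, not a reference implementation: `fit_logit` is a hand-rolled Newton-Raphson fit standing in for whatever statistics package you normally use, `box_tidwell_z` is a hypothetical helper name, and the data is synthetic with a log-odds that is truly linear in `x`.

```python
import numpy as np

def fit_logit(X, y, n_iter=50):
    """Fit logistic regression by Newton-Raphson; return (coefficients, standard errors)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        hessian = X.T @ (X * w[:, None])
        step = np.linalg.solve(hessian, X.T @ (y - p))
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    # Standard errors from the inverse information matrix at the final estimate
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    w = p * (1.0 - p)
    cov = np.linalg.inv(X.T @ (X * w[:, None]))
    return beta, np.sqrt(np.diag(cov))

def box_tidwell_z(x, y):
    """Wald z-statistic for the x*ln(x) term; x must be strictly positive."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("Box-Tidwell needs x > 0; shift the predictor first")
    X = np.column_stack([np.ones_like(x), x, x * np.log(x)])
    beta, se = fit_logit(X, np.asarray(y, dtype=float))
    return beta[2] / se[2]

# Synthetic data (an assumption for illustration): true log-odds are linear in x
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 10.0, size=2000)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(-2.0 + 0.4 * x))))
z = box_tidwell_z(x, y)
print(f"z for x*ln(x): {z:.2f}")  # |z| > 1.96 would flag non-linearity at the 5% level
```

With several continuous predictors, you would add one $x\ln x$ term per predictor and read off each term's z-statistic separately.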

4. Little or no multicollinearity

Like any regression with multiple inputs, logistic regression assumes the predictors are not highly correlated with one another. When two predictors carry nearly the same information, the model cannot tell their effects apart: coefficients become unstable, their signs can flip, and standard errors balloon.

How to check: compute the variance inflation factor (VIF) for each predictor. A common guideline is that $\text{VIF} > 5$ (or more conservatively $> 10$) signals a multicollinearity problem worth addressing — by dropping a redundant predictor, combining correlated ones, or using regularization. A correlation matrix of the predictors is a quick first screen.
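
One compact way to get every VIF at once uses the identity that the VIFs are the diagonal entries of the inverse of the predictors' correlation matrix. The sketch below relies on that identity with NumPy; the helper name `vif` and the synthetic predictors `x1`, `x2`, `x3` are assumptions made for the example.

```python
import numpy as np

def vif(X):
    """VIF for each column of X: the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Synthetic predictors (assumption): x2 is nearly a copy of x1, x3 is unrelated
rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
x2 = x1 + 0.3 * rng.normal(size=500)
x3 = rng.normal(size=500)
scores = vif(np.column_stack([x1, x2, x3]))
for name, v in zip(["x1", "x2", "x3"], scores):
    print(f"{name}: VIF = {v:.1f}")
```

Here `x1` and `x2` should land well above the cutoff of 5 while `x3` stays close to 1, which is the pattern you would act on by dropping or combining one of the correlated pair.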

5. No extreme outliers or influential points

A handful of unusual rows can drag the fitted coefficients far from where the bulk of the data would put them. Logistic regression assumes no single observation exerts outsized influence. Outliers in predictor space (high leverage) are especially dangerous when combined with a surprising outcome.

How to check: examine standardized (Pearson or deviance) residuals, leverage values, and Cook’s distance. Points with large residuals or high Cook’s distance deserve a look — verify they are not data-entry errors, and consider how sensitive your conclusions are with and without them.
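
As a rough illustration, the Pearson residual for row $i$ is $(y_i - \hat p_i)/\sqrt{\hat p_i(1-\hat p_i)}$. The sketch below computes it from fitted probabilities you already have; the values in `y` and `p_hat` are invented, the cutoff of 3 is a conventional but arbitrary choice, and these residuals are not adjusted for leverage.

```python
import math

def pearson_residuals(y, p_hat):
    """Pearson residual per row: (y - p) / sqrt(p * (1 - p))."""
    return [(yi - pi) / math.sqrt(pi * (1.0 - pi)) for yi, pi in zip(y, p_hat)]

# Hypothetical outcomes and fitted probabilities from an already-fitted model
y = [1, 0, 0, 1, 0]
p_hat = [0.80, 0.10, 0.30, 0.02, 0.50]
residuals = pearson_residuals(y, p_hat)

# Flag rows whose residual is large in absolute value (cutoff is an assumption)
flagged = [i for i, r in enumerate(residuals) if abs(r) > 3]
print(flagged)  # [3]: a positive outcome the model rated at 2%
```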

6. A large enough sample size

Maximum likelihood estimation, which fits logistic regression, only behaves well with enough data — and what matters is not the total row count but the number of cases in the rarer class. The widely cited rule of thumb is about 10 events per predictor (the EPV, or events-per-variable, guideline).

🧮 Sizing example
If only $12\%$ of your $500$ rows are positives, you have roughly $60$ events. At $10$ events per predictor, you can responsibly fit about $6$ predictors, not $20$. Overshoot this and coefficients become noisy and overfit, and confidence intervals widen unhelpfully.
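
The same arithmetic as a reusable check, in a minimal sketch: the helper name `max_predictors` is hypothetical, and it treats the rarer class as the "events" regardless of how it is coded.

```python
def max_predictors(y, events_per_predictor=10):
    """Predictor budget under the EPV rule, counting the rarer class as events."""
    positives = sum(1 for v in y if v == 1)
    events = min(positives, len(y) - positives)
    return events, events // events_per_predictor

# 500 rows with 12% positives, matching the sizing example above
y = [1] * 60 + [0] * 440
events, budget = max_predictors(y)
print(events, budget)  # 60 6
```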

7. No perfect separation

Perfect (complete) separation happens when some predictor, or combination of predictors, splits the two classes cleanly: every case above a cutoff is a 1 and every case below it is a 0. Quasi-complete separation is the near-miss version, where the split is clean except for cases tied at the cutoff itself. In either situation, the maximum-likelihood estimate of the corresponding coefficient does not exist as a finite number: the optimizer keeps pushing it toward $\pm\infty$ to make the predicted probabilities approach 0 and 1 exactly.

How to check: watch for the tell-tale symptoms — coefficients and standard errors that blow up to implausibly large values, or software warnings about non-convergence or “fitted probabilities numerically 0 or 1.” Remedies include removing or combining the offending predictor, gathering more data, or using penalized (Firth) logistic regression, which keeps estimates finite.
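
For a single numeric predictor you can test for complete separation directly, before fitting anything. The sketch below is deliberately narrow: `separates_perfectly` is a hypothetical helper, the credit-score data is invented, and it will not catch separation caused by a combination of predictors or the quasi-complete case with ties at the cutoff.

```python
def separates_perfectly(x, y):
    """True if a single cutoff on x splits class 0 from class 1 with no overlap."""
    x0 = [xi for xi, yi in zip(x, y) if yi == 0]
    x1 = [xi for xi, yi in zip(x, y) if yi == 1]
    if not x0 or not x1:
        raise ValueError("both classes must be present")
    return max(x0) < min(x1) or max(x1) < min(x0)

# Hypothetical credit scores and default flags
print(separates_perfectly([520, 560, 610, 680, 720], [1, 1, 1, 0, 0]))  # True
print(separates_perfectly([520, 560, 610, 680, 720], [1, 0, 1, 0, 0]))  # False
```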

⚠ What logistic regression does NOT assume
A common point of confusion: unlike linear regression, logistic regression does not assume normally distributed residuals, and it does not assume homoscedasticity (constant error variance). The outcome is binary, so the errors are Bernoulli, not Gaussian, and their variance $p(1-p)$ depends on the predicted probability by design. Do not waste time running normality or constant-variance tests on a logistic model.

A short worked sense-check

Imagine predicting loan default from three predictors (income, credit score, and debt-to-income ratio) using $400$ applicants of whom $40$ defaulted. Run the checklist: the outcome is binary (default yes/no) ✓; applicants are independent ✓; you have $40$ events for $3$ predictors, about $13$ events per predictor and comfortably above the $30$ events that the 10-per-predictor rule asks for ✓. Then you compute VIF and find income and debt-to-income are heavily correlated (VIF $> 8$), a multicollinearity flag, so you combine them or drop one. Finally a Box-Tidwell test flags credit score as non-linear in the logit, so you add a spline. Add a quick look at Cook’s distance and at the coefficient sizes for signs of separation, and in a few minutes you have covered the whole checklist and fixed the two items that failed. That is the entire workflow.

Quick definition
The logit (log-odds) is $\ln\!\big(p/(1-p)\big)$. The linearity assumption of logistic regression applies to this quantity, not to $p$ itself, which is why a curved probability plot is perfectly normal and expected.

🤖 ML context

Checking the logistic regression assumptions is part of responsible supervised learning. Logistic regression is also the simplest neural network — a single sigmoid neuron — so the same intuitions about linearity and separation carry into deep learning. Build hands-on intuition with the logistic regression calculator, then compare with multivariate logistic regression.

Frequently asked questions

What are the main assumptions of logistic regression?
The main assumptions are a binary or categorical outcome, independent observations, linearity between continuous predictors and the log-odds, little multicollinearity among predictors, no extreme influential outliers, an adequate sample size, and no perfect separation of the classes.
Does logistic regression assume linearity?
Yes, but only linearity between each continuous predictor and the log-odds (the logit), not the probability. The probability follows the curved sigmoid, so a non-linear probability plot is normal and expected.
Does logistic regression assume normality of residuals?
No. Unlike linear regression, logistic regression does not assume normally distributed residuals or homoscedasticity. The outcome is binary, so the errors are Bernoulli and their variance depends on the predicted probability by design.
How do I check the linearity of the logit assumption?
Use the Box-Tidwell test: add an interaction between each continuous predictor and its natural log. If that term is significant, linearity of the logit is violated and you should transform the predictor. You can also plot the empirical logit against the predictor and look for curvature.
What is perfect separation in logistic regression?
Perfect separation is when a predictor or combination of predictors splits the two classes perfectly. The maximum-likelihood coefficient then diverges toward infinity, causing huge coefficients and standard errors. Fix it with penalized (Firth) logistic regression, more data, or by removing the offending predictor.

Key takeaways

The logistic regression assumptions are fewer and friendlier than those of linear regression: a categorical outcome, independent rows, linearity of the logit, low multicollinearity, no dominant outliers, enough events per predictor, and no perfect separation. Crucially, normality and homoscedasticity are not required. Check each with the right tool (VIF, Box-Tidwell, Cook’s distance, an events-per-variable count) and your model’s probabilities and p-values are far more likely to hold up. Continue with the logistic regression calculator, the binary logistic regression guide, or the formal reference on Wikipedia.
