The logistic regression decision boundary is the exact place where the model stops favoring one class and starts favoring the other: the set of inputs where the predicted probability is precisely $0.5$. Understanding it is the fastest way to see how a logistic regression model separates two classes, and why that separation is always flat: a straight line with two features, and a hyperplane in general.

What is the logistic regression decision boundary?
A logistic regression decision boundary is the set of input points where the model is exactly undecided — it assigns equal probability to both classes. Logistic regression turns a linear score into a probability with the sigmoid function:
$$p = \sigma(z) = \frac{1}{1 + e^{-z}}, \qquad z = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots$$
The model predicts class 1 when $p \ge 0.5$ and class 0 otherwise. The boundary is where $p = 0.5$. Because the sigmoid satisfies $\sigma(0) = 0.5$, the predicted probability equals one half exactly when $z = 0$. So the entire boundary collapses to one clean equation:
$$\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots = 0$$
Why the logistic regression decision boundary is linear
Here is the key fact every beginner should internalize: even though the probability curve $p = \sigma(z)$ is an S-shaped, non-linear sigmoid, the logistic regression decision boundary is linear. The reason is that the boundary is defined by $z = 0$, and $z$ itself is a plain linear combination of the inputs. Setting a linear expression equal to zero always produces a flat object: a point, a line, a plane, or a hyperplane.
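The following is a minimal sketch of that argument, not part of any particular library. The helper names `sigmoid` and `predict` and the sample scores are assumptions chosen for illustration; the point is only that the test $p \ge 0.5$ and the test $z \ge 0$ agree.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function: maps a linear score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(z: float) -> int:
    """Class 1 when p >= 0.5, class 0 otherwise (the rule stated in the text)."""
    return 1 if sigmoid(z) >= 0.5 else 0

# Illustrative scores on both sides of zero (hypothetical values).
scores = (-2.0, -0.5, 0.0, 0.5, 2.0)

for z in scores:
    p = sigmoid(z)
    # sign_rule looks only at the sign of z and never evaluates the sigmoid.
    sign_rule = int(z >= 0)
    print(f"z={z:+.1f}  p={p:.3f}  class={predict(z)}  sign_rule={sign_rule}")
```

For these scores the two columns match, so the boundary inherits its shape from the linear score $z$ rather than from the curved sigmoid.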
This is simultaneously logistic regression’s greatest strength and its main limitation. A linear boundary is easy to fit, easy to interpret, and less prone to overfitting than more flexible models, but it cannot separate classes that are tangled in a curved or circular pattern. The good news: you can bend the boundary by engineering new features. Add a squared term like $x_1^2$, or an interaction term like $x_1 x_2$, and the boundary in the original feature space becomes a curve, even though it is still linear in the expanded feature set.
One predictor: the boundary is a threshold point
With a single input $x$, the score is $z = \beta_0 + \beta_1 x$. Setting $z = 0$ and solving gives one number:
$$x^{*} = -\frac{\beta_0}{\beta_1}$$
Everything on one side of $x^{*}$ is predicted class 1 and everything on the other side is class 0; when $\beta_1 > 0$, class 1 is the side with $x > x^{*}$. Suppose we model whether a student passes an exam from hours studied, and the fit gives $\beta_0 = -4.0777$ and $\beta_1 = 1.5046$. Then:
$$x^{*} = -\frac{-4.0777}{1.5046} = \frac{4.0777}{1.5046} \approx 2.71 \text{ hours}$$
So a student studying more than about 2.71 hours is predicted to pass, and less than that is predicted to fail. That single threshold is the decision boundary in one dimension.
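The threshold calculation above can be checked with a few lines of Python. This is a sketch using the two coefficients quoted in the text; the sample study times of 2.0 and 3.5 hours are hypothetical values added for illustration.

```python
import math

# Coefficients quoted in the exam example.
beta0, beta1 = -4.0777, 1.5046

def sigmoid(z: float) -> float:
    """Logistic function: maps a linear score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def pass_probability(hours: float) -> float:
    """Predicted probability of passing for a given number of hours studied."""
    return sigmoid(beta0 + beta1 * hours)

# One-dimensional decision boundary: solve beta0 + beta1 * x = 0 for x.
x_star = -beta0 / beta1
print(f"threshold: {x_star:.2f} hours")  # threshold: 2.71 hours

# Hypothetical study times on either side of the threshold.
for hours in (2.0, 3.5):
    p = pass_probability(hours)
    outcome = "pass" if p >= 0.5 else "fail"
    print(f"{hours:.1f} hours -> p={p:.3f} -> predicted {outcome}")
```

Because $\beta_1$ is positive here, the probability rises with hours studied, so the "pass" side is the one above the threshold.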
Two predictors: a straight line in the plane
Add a second feature and the boundary equation becomes:
$$\beta_0 + \beta_1 x_1 + \beta_2 x_2 = 0$$
This is the equation of a straight line in the $(x_1, x_2)$ plane. You can even rewrite it in the familiar slope-intercept form:
$$x_2 = -\frac{\beta_0}{\beta_2} - \frac{\beta_1}{\beta_2}\, x_1$$
This form requires $\beta_2 \neq 0$. Points on one side of this line get $z > 0$ (predicted class 1); points on the other side get $z < 0$ (predicted class 0). Move up to three features and the boundary is a flat plane; with more features it becomes a hyperplane. In every case it stays flat, and that is what “linear decision boundary” means.
| Dimensions (features) | Boundary equation | Boundary shape |
|---|---|---|
| 1 feature | $\beta_0 + \beta_1 x_1 = 0$ | A point (single threshold) |
| 2 features | $\beta_0 + \beta_1 x_1 + \beta_2 x_2 = 0$ | A straight line |
| 3 features | $\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 = 0$ | A flat plane |
| $n$ features | $\beta_0 + \sum_i \beta_i x_i = 0$ | A hyperplane |
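
For the two-feature row, the slope-intercept rewrite can be sketched as a small helper. The function name `boundary_line` and the coefficients $(-4, 1, 2)$ are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def boundary_line(beta0: float, beta1: float, beta2: float) -> tuple[float, float]:
    """Return (slope, intercept) of the boundary written as x2 = intercept + slope * x1.

    Only valid when beta2 is non-zero; otherwise x2 cannot be isolated.
    """
    if beta2 == 0:
        raise ValueError("beta2 must be non-zero to solve the boundary for x2")
    return -beta1 / beta2, -beta0 / beta2

# Hypothetical coefficients, not taken from any fitted model.
b0, b1, b2 = -4.0, 1.0, 2.0
slope, intercept = boundary_line(b0, b1, b2)
print(slope, intercept)  # -0.5 2.0

# Any point on the line should give a score of exactly zero.
x1 = 2.0
x2 = intercept + slope * x1
z_on_line = b0 + b1 * x1 + b2 * x2
print(z_on_line)  # 0.0
```

The explicit check for `beta2 == 0` is a design choice of this sketch: in that case the boundary is a vertical line in the plane and has no slope-intercept form.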
Worked example: classifying points with a 2D boundary
Let the fitted coefficients be $\beta_0 = -3$, $\beta_1 = 1$, and $\beta_2 = 1$. The decision boundary is:
$$-3 + x_1 + x_2 = 0 \quad\Longrightarrow\quad x_1 + x_2 = 3$$
So any point whose coordinates add up to more than 3 lands on the class-1 side, and any point summing to less than 3 lands on the class-0 side. Let us classify two points step by step:
- Point $(1, 1)$: compute $z = -3 + 1 + 1 = -1$. Since the sum $1 + 1 = 2$ is less than 3, $z < 0$ and $p = \sigma(-1) \approx 0.27 < 0.5$ → predict class 0.
- Point $(2, 2)$: compute $z = -3 + 2 + 2 = 1$. Since the sum $2 + 2 = 4$ is greater than 3, $z > 0$ and $p = \sigma(1) \approx 0.73 > 0.5$ → predict class 1.
- The boundary itself: any point with $x_1 + x_2 = 3$, such as $(1.5, 1.5)$, gives $z = 0$ and $p = 0.5$ exactly — the model is perfectly undecided.
Notice how the two test points sit on opposite sides of the same straight line. That line is the logistic regression decision boundary, and the sign of $z$ is all you need to read off the class.
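The three cases above can be reproduced with a short script. It is a sketch that uses the coefficients from this worked example; the helper names `score` and `sigmoid` are assumptions made for illustration.

```python
import math

# Coefficients from the worked example: beta0 = -3, beta1 = 1, beta2 = 1.
BETA0, BETA1, BETA2 = -3.0, 1.0, 1.0

def score(x1: float, x2: float) -> float:
    """Linear score z = beta0 + beta1 * x1 + beta2 * x2."""
    return BETA0 + BETA1 * x1 + BETA2 * x2

def sigmoid(z: float) -> float:
    """Logistic function: maps a linear score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

for point in [(1.0, 1.0), (2.0, 2.0), (1.5, 1.5)]:
    z = score(*point)
    p = sigmoid(z)
    # With the p >= 0.5 rule stated earlier, a point exactly on the boundary
    # is assigned to class 1, even though the model is undecided there.
    label = 1 if p >= 0.5 else 0
    print(f"{point}: z={z:+.1f}, p={p:.2f}, class={label}")
```

The printed probabilities round to 0.27, 0.73 and 0.50, matching the hand calculation.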
Reading the geometry of the boundary
The coefficient vector $(\beta_1, \beta_2, \dots)$ is perpendicular (normal) to the decision boundary and points toward the class-1 region. Its magnitude controls how steep the probability transition is: large coefficients make the sigmoid rise sharply, so the model jumps from “almost certainly class 0” to “almost certainly class 1” over a narrow band around the boundary. Small coefficients spread that transition out into a gentle gradient. If every coefficient, intercept included, is scaled by the same positive factor, the boundary stays in the same place and only the confidence around it changes.
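A small sketch of this geometry, assuming illustrative coefficients $(-3, 1, 1)$ and a hypothetical factor of 5 applied to every coefficient, intercept included. Scaling all of them together is what keeps the boundary fixed; scaling only the slopes would move it.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function: maps a linear score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def prob(beta: tuple[float, float, float], x1: float, x2: float) -> float:
    """Predicted class-1 probability for coefficients (beta0, beta1, beta2)."""
    b0, b1, b2 = beta
    return sigmoid(b0 + b1 * x1 + b2 * x2)

base = (-3.0, 1.0, 1.0)               # illustrative coefficients
sharp = tuple(5.0 * b for b in base)  # same boundary, steeper transition

# The normal vector (beta1, beta2) is perpendicular to the boundary:
# (1, -1) is a direction along the line x1 + x2 = 3.
along_line = (1.0, -1.0)
dot = base[1] * along_line[0] + base[2] * along_line[1]
print("dot product with boundary direction:", dot)

# Compare probabilities on the boundary and one step to either side.
for point in [(1.0, 1.0), (1.5, 1.5), (2.0, 2.0)]:
    print(point, f"base={prob(base, *point):.3f}", f"sharp={prob(sharp, *point):.3f}")
```

Both coefficient sets give $p = 0.5$ at $(1.5, 1.5)$, but the scaled set is far more confident at the two off-boundary points.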
Getting curved boundaries when you need them
If your two classes form rings or crescents, a straight line will never separate them well. You do not have to abandon logistic regression — you expand its feature space. Add polynomial features such as $x_1^2$, $x_2^2$, and the interaction $x_1 x_2$, and the model fits coefficients on those too. The boundary $\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \cdots = 0$ is still linear in the new features, but traced back into the original $(x_1, x_2)$ plane it can be an ellipse, parabola, or other curve. This is exactly the trick that lets a “linear” model carve non-linear regions.
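As a sketch of the idea, the snippet below expands two inputs into polynomial features and applies hand-picked coefficients rather than fitted ones. The coefficient values are an assumption chosen so that the boundary is the circle $x_1^2 + x_2^2 = 4$; a real model would learn them from data.

```python
def expand(x1: float, x2: float) -> tuple[float, ...]:
    """Map (x1, x2) to the expanded features (1, x1, x2, x1^2, x2^2, x1*x2)."""
    return (1.0, x1, x2, x1 * x1, x2 * x2, x1 * x2)

# Hand-picked (not fitted) coefficients, one per expanded feature:
# z = -4 + x1^2 + x2^2, so z = 0 on the circle of radius 2.
BETA = (-4.0, 0.0, 0.0, 1.0, 1.0, 0.0)

def score(x1: float, x2: float) -> float:
    """Linear score in the expanded feature space."""
    return sum(b * f for b, f in zip(BETA, expand(x1, x2)))

def predict(x1: float, x2: float) -> int:
    """Class 1 when z >= 0, which is the same as p >= 0.5."""
    return 1 if score(x1, x2) >= 0 else 0

# Points left of, inside, and right of the circle along the x1 axis.
for point in [(-3.0, 0.0), (0.0, 0.0), (3.0, 0.0)]:
    print(point, "z =", score(*point), "class =", predict(*point))
```

The two outer points are class 1 and the point between them is class 0, a pattern no single straight line in the $(x_1, x_2)$ plane can produce, yet the score is still a plain weighted sum of the expanded features.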
🤖 ML context
The linear decision boundary is the bridge from logistic regression to the support vector machine and the single-neuron perceptron, which both separate classes with a hyperplane. Build intuition with the logistic regression calculator, then see how the coefficients that shape this boundary are learned in logistic regression gradient descent.
Key takeaways
The logistic regression decision boundary is simply where the predicted probability equals $0.5$, which is exactly where the linear score $z = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots = 0$. That makes it a point in 1D, a straight line in 2D, a plane in 3D, and a hyperplane beyond. It is always flat, which is why logistic regression is a linear classifier. Add polynomial features to curve it, or move the probability threshold from $0.5$ to some other value $t$ to shift it to the parallel hyperplane $z = \ln\frac{t}{1-t}$. Keep going with the logistic regression calculator, the guide to binary logistic regression, and logistic regression gradient descent, or read the formal reference on Wikipedia.