Supervised vs unsupervised vs reinforcement learning is the first big map you need when learning machine learning: three families of algorithms that differ by what data they see and what feedback they get. This guide compares all three side by side, shows the math and a real example of each, and answers the questions beginners ask most, including whether reinforcement learning is supervised or unsupervised.

📝 Supervised
Learns from labeled data — inputs paired with known answers. Goal: predict the answer for new inputs.
🔮 Unsupervised
Learns from unlabeled data — inputs only. Goal: discover hidden structure and groups.
🎮 Reinforcement
Learns from trial and error in an environment, guided by rewards. Goal: choose actions that maximize long-term reward.
The three types of machine learning at a glance
Almost every algorithm you will meet falls into one of three buckets. The single fastest way to tell them apart is to ask: what does the model learn from, and how is it told whether it did well?
| Aspect | Supervised learning | Unsupervised learning | Reinforcement learning |
|---|---|---|---|
| Input data | Labeled: pairs $(x_i, y_i)$ | Unlabeled: inputs $x_i$ only | No fixed dataset — an environment |
| Feedback signal | The correct answer for every example | None — the model is on its own | A reward or penalty, often delayed |
| Goal | Predict the label of new inputs | Find hidden structure or groups | Maximize cumulative reward over time |
| Typical tasks | Regression, classification | Clustering, dimensionality reduction | Control, sequential decision-making |
| Example algorithms | Linear & logistic regression, SVM, neural nets | k-means, PCA, hierarchical clustering | Q-learning, policy gradients |
| Real example | Spam detection, house-price prediction | Customer segmentation | Game-playing AI, robotics |
What is supervised learning?
Supervised learning is the most common and the easiest to grasp. You give the algorithm a dataset where every input $x$ comes with the correct output $y$ — the label — and it learns the mapping between them. Formally, it searches for a function $f$ that fits the labeled pairs:
$$f:\;X \to Y \qquad\text{by minimizing}\qquad \frac{1}{n}\sum_{i=1}^{n} L\big(f(x_i),\, y_i\big)$$
where $L$ is a loss function measuring how far each prediction is from the true label. The two flavors are:
- Regression — predicting a number, such as a price or temperature. Try our linear regression calculator to fit a line to labeled data.
- Classification — predicting a category, such as spam vs. not spam.
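To make the loss-minimization formula concrete, here is a minimal sketch of the regression flavor, assuming a squared-error loss $L$ and a straight-line model $f(x) = wx + b$. The function name `fit_linear_regression`, the learning rate, the step count, and the toy data are all illustrative assumptions, not something prescribed by this guide.

```python
import numpy as np

def fit_linear_regression(x, y, lr=0.05, steps=2000):
    """Fit f(x) = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        error = (w * x + b) - y            # prediction minus true label
        grad_w = 2.0 * np.mean(error * x)  # d/dw of mean(error ** 2)
        grad_b = 2.0 * np.mean(error)      # d/db of mean(error ** 2)
        w -= lr * grad_w
        b -= lr * grad_b
    return float(w), float(b)

# Hypothetical labeled pairs (x_i, y_i) generated from y = 2x + 1, without noise
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

w, b = fit_linear_regression(x, y)
print(f"w = {w:.3f}, b = {b:.3f}")
```

On this noise-free data the fitted slope and intercept should land very close to 2 and 1. Swapping in a different loss (for example, cross-entropy for classification) changes the gradients but not the overall recipe.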
What is unsupervised learning?
Unsupervised learning removes the answer key. The algorithm sees only the inputs $\{x_1, x_2, \dots, x_n\}$ with no labels, and its job is to discover structure on its own. Because there is no correct answer to compare against, success is measured by how well the model organizes the data. A classic example is k-means clustering, which groups points into $k$ clusters by minimizing the spread inside each one:
$$\min \sum_{j=1}^{k}\;\sum_{x \in C_j} \lVert x - \mu_j \rVert^2$$
Here $\mu_j$ is the center of cluster $C_j$. The two main jobs of unsupervised learning are:
- Clustering — grouping similar items, such as segmenting customers by behavior.
- Dimensionality reduction — compressing many features into a few (for example with PCA) while keeping the important variation.
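The k-means objective above is usually minimized with Lloyd's algorithm, which alternates between assigning each point to its nearest center and moving each center to the mean of its points. The sketch below is one simple way to write that loop; the function name `kmeans`, the random initialization, and the six toy points are assumptions made for illustration.

```python
import numpy as np

def kmeans(points, k, n_iters=100, seed=0):
    """Lloyd's algorithm: alternate nearest-center assignment and mean updates."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points chosen at random
    centers = points[rng.choice(len(points), size=k, replace=False)]

    def assign(current_centers):
        # Distance from every point to every center, then pick the nearest
        dists = np.linalg.norm(points[:, None, :] - current_centers[None, :, :], axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iters):
        labels = assign(centers)
        # Move each center to the mean of its cluster (keep it if the cluster is empty)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers

    labels = assign(centers)
    # Value of the objective: total squared distance to the assigned centers
    inertia = float(((points - centers[labels]) ** 2).sum())
    return centers, labels, inertia

# Hypothetical unlabeled data: two well-separated groups of three points
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
centers, labels, inertia = kmeans(points, k=2)
print(labels, round(inertia, 3))
```

No labels are passed in anywhere: the grouping comes only from the geometry of the points. Which group ends up as cluster 0 or cluster 1 depends on the random initialization, so only the grouping itself is meaningful.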
What is reinforcement learning?
Reinforcement learning (RL) is different from both. There is no dataset of examples at all. Instead an agent interacts with an environment: it observes a state, takes an action, and receives a reward (or penalty). Over thousands of attempts it learns a policy — a strategy for choosing actions — that maximizes the expected total reward, where future rewards are discounted by a factor $\gamma$:
$$\max\; \mathbb{E}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r_t\right], \qquad 0 \le \gamma < 1$$
Keeping $\gamma$ strictly below 1 keeps this infinite sum finite whenever the rewards are bounded. The catch that makes RL hard is delayed feedback: a move now (such as sacrificing a chess piece) may only pay off many steps later. The agent must balance exploration (trying new actions) against exploitation (using what already works). This is how AI learns to play games, control robots, and tune recommendation systems.
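The state, action, reward loop can be sketched with tabular Q-learning, one of the algorithms listed in the comparison table. Everything concrete here is an assumption chosen for illustration: the five-cell corridor environment, the `step` function, the epsilon-greedy exploration rule, and the hyperparameter values.

```python
import numpy as np

N_STATES = 5            # corridor cells 0..4; cell 4 is the goal
LEFT, RIGHT = 0, 1

def step(state, action):
    """Hypothetical toy environment: walk along a corridor, reward 1 at the goal."""
    next_state = max(state - 1, 0) if action == LEFT else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done

def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t for one finished episode."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))

def train_q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = np.random.default_rng(seed)
    q = np.zeros((N_STATES, 2))              # q[state, action] value estimates
    for _ in range(episodes):
        state = 0
        for _ in range(200):                 # cap on episode length
            if rng.random() < epsilon:
                action = int(rng.integers(2))            # explore: random action
            else:
                best = np.flatnonzero(q[state] == q[state].max())
                action = int(rng.choice(best))           # exploit, ties broken at random
            next_state, reward, done = step(state, action)
            # Q-learning target: reward now plus discounted best value afterwards
            target = reward + (0.0 if done else gamma * q[next_state].max())
            q[state, action] += alpha * (target - q[state, action])
            state = next_state
            if done:
                break
    return q

q = train_q_learning()
policy = q[:-1].argmax(axis=1)               # greedy action in each non-goal cell
print(policy, discounted_return([0, 0, 0, 1], gamma=0.9))
```

In this toy corridor the greedy policy should end up choosing `RIGHT` in every non-goal cell, even though the only reward arrives at the final step. The agent is never told which action is correct; the value of early moves is inferred from the discounted reward that eventually follows them.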
Is reinforcement learning supervised or unsupervised?
This is the question that trips up almost every beginner, so let’s answer it directly: reinforcement learning is neither supervised nor unsupervised — it is its own third category.
Here is why it sits apart:
- It is not supervised, because no one hands the agent the correct action for each state. There is no labeled answer key — only a reward that says “that was good” or “that was bad,” often long after the action.
- It is not unsupervised, because the reward is a feedback signal. Unsupervised learning gets no feedback at all; RL is constantly guided by reward.
Supervised vs unsupervised vs reinforcement learning: the key difference
Strip away the jargon and the whole comparison reduces to one axis: the feedback the model receives.
| Paradigm | Feedback per example | What it optimizes |
|---|---|---|
| Supervised | Full — the exact correct answer | Prediction accuracy on labels |
| Unsupervised | None | Structure / similarity in the data |
| Reinforcement | Partial — a delayed reward | Long-term cumulative reward |
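Ordered by how much feedback the model gets, the signal fades from a precise label (supervised), to a partial and often delayed reward (reinforcement), to nothing at all (unsupervised). As the feedback thins out, it generally becomes harder to tell whether the model is doing well.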
Where do semi-supervised and self-supervised learning fit?
Once the three main paradigms click, you will quickly run into two hybrids that live between supervised and unsupervised learning — and they matter because labeled data is expensive while unlabeled data is everywhere.
- Semi-supervised learning uses a small amount of labeled data together with a large pool of unlabeled data. The few labels steer the model while the unlabeled examples reveal the overall shape of the data. It is common in domains like medical imaging, where labeling every scan by hand is costly.
- Self-supervised learning is the engine behind modern large language models. It is technically unsupervised — there are no human labels — but the model creates its own labels from the data, for example by hiding a word in a sentence and learning to predict it. This turns a giant unlabeled corpus into a supervised-style training task without anyone tagging it.
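The "hide a word and predict it" idea can be shown in a few lines. This is a simplified sketch: the function name `make_masked_examples` and the `[MASK]` placeholder are assumptions for illustration, and real language models work on subword tokens and mask only a fraction of positions rather than every word in turn.

```python
def make_masked_examples(sentence, mask_token="[MASK]"):
    """Turn one unlabeled sentence into (masked input, hidden word) training pairs."""
    words = sentence.split()
    examples = []
    for i, target in enumerate(words):
        masked = words[:i] + [mask_token] + words[i + 1:]
        examples.append((" ".join(masked), target))
    return examples

# Hypothetical unlabeled text; no human annotation is involved
for masked_input, label in make_masked_examples("the cat sat"):
    print(masked_input, "->", label)
# [MASK] cat sat -> the
# the [MASK] sat -> cat
# the cat [MASK] -> sat
```

Each pair looks exactly like a supervised example, an input with a known answer, but the answer was taken from the data itself.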
A note on deep learning
Beginners often ask where deep learning fits in this picture. It is not a fourth category: deep learning simply means using multi-layer neural networks, and those networks can be trained in any of the three paradigms. A neural net can do supervised image classification, power an unsupervised autoencoder, or act as the policy in a reinforcement learning agent. The paradigm describes how the model learns from feedback; deep learning describes what kind of model is doing the learning.
Which type should you use?
Pick the paradigm by matching it to your data and your goal:
- Do you have labeled examples and want to predict the label of new data? Use supervised learning (regression for numbers, classification for categories).
- Do you have data but no labels, and want to find patterns or groups? Use unsupervised learning (clustering or dimensionality reduction).
- Are you training an agent to make a sequence of decisions where good behavior is rewarded? Use reinforcement learning.
🤖 ML insight: where the math overlaps
All three paradigms optimize an objective, and many of their methods share the same engine: gradient-based optimization. Supervised learning descends a loss surface, policy-gradient RL ascends an expected-reward surface, and many unsupervised methods minimize a reconstruction or distance cost. Not every algorithm uses gradients (k-means, for instance, alternates assignment and averaging steps), but the ones that do all lean on derivatives. See our derivative calculator and regression tools to build the intuition.
Frequently asked questions
What is the difference between supervised, unsupervised, and reinforcement learning?
Is reinforcement learning supervised or unsupervised?
Which type of machine learning is easiest to start with?
Can you combine the three types?
What are real examples of each?
Key takeaways
The supervised vs unsupervised vs reinforcement learning split comes down to feedback: supervised learning is taught with labeled answers, unsupervised learning finds structure with no feedback, and reinforcement learning is guided by rewards. Master this map and the rest of machine learning falls into place. Keep going with the linear regression calculator for a hands-on supervised example, or read the formal overview of machine learning on Wikipedia.