The Birth of Neural Networks
In 1958, Frank Rosenblatt built the Mark I Perceptron — a machine that could learn to classify visual patterns by adjusting connection weights. The New York Times reported it as a machine that 'will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.' The reality was more modest but no less revolutionary: the perceptron was the first algorithm that could learn from examples, adjusting its behavior based on feedback rather than explicit programming. This simulation recreates that fundamental learning process.
How the Perceptron Learns
A perceptron takes two inputs (x1, x2), multiplies each by a weight (w1, w2), adds a bias (b), and applies a threshold: if w1*x1 + w2*x2 + b > 0, output 1; otherwise output 0. Training adjusts the weights using the perceptron learning rule: for each misclassified point with true label y and prediction ŷ, update w1 ← w1 + η(y − ŷ)x1, w2 ← w2 + η(y − ŷ)x2, and b ← b + η(y − ŷ). The learning rate η controls the nudge size. The decision boundary is the line w1*x1 + w2*x2 + b = 0 — you can watch it rotate and shift as training progresses.
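The learning process above can be sketched in a few lines of Python. This is a minimal illustration, not the simulation's actual code; it trains a perceptron on AND, a linearly separable problem, using the threshold unit and update rule just described.

```python
def predict(w1, w2, b, x1, x2):
    """Threshold unit: 1 if w1*x1 + w2*x2 + b > 0, else 0."""
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0

def train(points, eta=0.1, epochs=100):
    """points: list of ((x1, x2), label) pairs. Returns learned w1, w2, b."""
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        errors = 0
        for (x1, x2), y in points:
            y_hat = predict(w1, w2, b, x1, x2)
            if y_hat != y:
                # Perceptron rule: nudge weights toward the correct class.
                w1 += eta * (y - y_hat) * x1
                w2 += eta * (y - y_hat) * x2
                b += eta * (y - y_hat)
                errors += 1
        if errors == 0:  # converged: every point classified correctly
            break
    return w1, w2, b

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w1, w2, b = train(AND)
print([predict(w1, w2, b, x1, x2) for (x1, x2), _ in AND])  # → [0, 0, 0, 1]
```

Because AND is linearly separable, the perceptron convergence theorem guarantees this loop stops after finitely many updates; the same code would loop fruitlessly forever on XOR.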
The XOR Problem and Its Resolution
In 1969, Marvin Minsky and Seymour Papert published their devastating analysis: the perceptron cannot solve XOR (points at (0,0) and (1,1) are class A; (0,1) and (1,0) are class B). No single straight line can separate them. This proof, which applies to any linearly inseparable problem, triggered the first AI winter — funding for neural network research dried up for over a decade. The solution came in the 1980s: stack multiple perceptrons in layers, add nonlinear activation functions, and use backpropagation to train. These multi-layer networks can draw curved decision boundaries, solving XOR and much more complex problems.
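To make the resolution concrete, here is a sketch of a two-layer network that solves XOR. The weights are hand-picked for illustration rather than learned by backpropagation: one hidden unit computes OR, the other NAND, and the output unit ANDs them together, since XOR(x1, x2) = OR(x1, x2) AND NAND(x1, x2).

```python
def step(z):
    """Threshold activation: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    # Hand-wired weights (illustrative assumption, not trained values).
    h1 = step(1.0 * x1 + 1.0 * x2 - 0.5)    # OR:   fires if either input is 1
    h2 = step(-1.0 * x1 - 1.0 * x2 + 1.5)   # NAND: fires unless both inputs are 1
    return step(1.0 * h1 + 1.0 * h2 - 1.5)  # AND of the two hidden units

print([xor_net(x1, x2) for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]])
# → [0, 1, 1, 0]
```

Each hidden unit draws one straight line; the output unit intersects the two half-planes, carving out the diagonal band that no single line could isolate.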
From Perceptron to Deep Learning
Every modern deep learning system — from GPT to AlphaFold — is fundamentally an enormous stack of perceptron-like units. A single perceptron separates points with a line. Two layers can carve out convex regions, and in fact a network with just one hidden layer of nonlinear units can approximate any continuous function to arbitrary accuracy (the Universal Approximation Theorem); deeper networks can represent many functions far more compactly. Today's language models contain billions of these simple units organized in transformer architectures. The perceptron learning rule evolved into backpropagation, then into AdaGrad, Adam, and other optimizers. But the core idea remains: adjust weights to reduce error, one step at a time.