Layers of Computation
A neural network transforms input data through successive layers of neurons. Each neuron computes a weighted sum of its inputs, adds a bias term, and applies a nonlinear activation function. The first layer extracts simple features; deeper layers combine these into increasingly abstract representations. A face recognition network might detect edges in layer 1, eyes and noses in layer 3, and complete faces in layer 5. This hierarchical feature learning is what makes deep networks so powerful.
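To make the per-neuron computation concrete, here is a minimal sketch in NumPy. The inputs, weights, and bias are made-up illustrative values, not numbers taken from the simulation, and ReLU stands in for whichever activation is selected.

```python
import numpy as np

def neuron(x, w, b):
    """One neuron: weighted sum of its inputs, plus a bias, through a ReLU activation."""
    z = np.dot(w, x) + b          # weighted sum + bias (the pre-activation)
    return np.maximum(0.0, z)     # ReLU nonlinearity

# Illustrative values: one neuron with 3 inputs
x = np.array([0.5, -1.2, 0.3])    # input features
w = np.array([0.8, 0.1, -0.4])    # one weight per input connection
b = 0.2                           # bias term

print(neuron(x, w, b))            # a single activation value
```

A layer is just many of these neurons applied to the same inputs, which is why the layer computation can be written as one matrix multiplication.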
Activation Functions: The Source of Power
Without activation functions, a neural network would just be a series of matrix multiplications, which collapses to a single linear transformation no matter how many layers you stack. Nonlinear activations like ReLU, sigmoid, and tanh give networks the ability to model curved decision boundaries and complex functions. The simulation lets you switch between activations and see how they change the network's behavior: sigmoid squashes everything into (0, 1), tanh into (-1, 1), while ReLU passes positive values unchanged and zeros out negatives.
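For reference, each of these activations is a one-line function. The sketch below uses NumPy and a few illustrative input values to show their output ranges.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))   # squashes values into (0, 1)

def tanh(z):
    return np.tanh(z)                 # squashes values into (-1, 1)

def relu(z):
    return np.maximum(0.0, z)         # passes positives unchanged, zeros out negatives

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(z))   # all outputs between 0 and 1
print(tanh(z))      # all outputs between -1 and 1
print(relu(z))      # [0.  0.  0.  0.5 2. ]
```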
The Forward Pass
The forward pass is the computation that transforms input to output. Data flows from the input layer through hidden layers to the output. At each layer, the operation is: multiply inputs by weights, add biases, apply activation. This simulation visualizes the process — watch activation values flow through connections, with line thickness representing weight magnitude and color representing positive (cyan) or negative (red) values.
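The full forward pass is that per-layer operation repeated from input to output. The sketch below assumes a hypothetical fully connected 2-4-3-1 network with randomly initialized weights and ReLU after every layer; it mirrors the computation the visualization animates rather than the simulation's actual code.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def forward(x, layers):
    """Forward pass: at each layer, multiply by weights, add biases, apply activation."""
    a = x
    for W, b in layers:
        a = relu(W @ a + b)   # one layer's computation
    return a

rng = np.random.default_rng(0)
# Hypothetical 2 -> 4 -> 3 -> 1 network with random weights and zero biases
layers = [
    (rng.normal(size=(4, 2)), np.zeros(4)),   # hidden layer 1
    (rng.normal(size=(3, 4)), np.zeros(3)),   # hidden layer 2
    (rng.normal(size=(1, 3)), np.zeros(1)),   # output layer
]

print(forward(np.array([0.5, -0.25]), layers))
```

In practice the output layer often uses a different (or no) activation depending on the task; applying ReLU everywhere here just keeps the sketch short.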
Counting Parameters
A network's capacity is largely determined by its parameter count — the total number of weights and biases. Each connection between neurons is one weight; each neuron has one bias. A fully connected layer with n inputs and m outputs has n×m + m parameters. Modern language models have billions of parameters, but even small networks with a few hundred parameters can learn surprisingly complex functions, as this simulation demonstrates.
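Applying the n×m + m formula layer by layer gives the total. The short sketch below counts parameters for a hypothetical 2-4-3-1 fully connected network.

```python
def count_parameters(layer_sizes):
    """Total weights and biases in a fully connected network.

    Each layer with n inputs and m outputs contributes n*m weights plus m biases.
    """
    total = 0
    for n, m in zip(layer_sizes, layer_sizes[1:]):
        total += n * m + m
    return total

# Hypothetical small network: 2 inputs -> 4 -> 3 -> 1 output
print(count_parameters([2, 4, 3, 1]))   # (2*4+4) + (4*3+3) + (3*1+1) = 31
```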