Central Limit Theorem — Interactive Demonstration

Simulator · Intermediate · ~8 min
Central Limit Theorem — sample means from ANY distribution converge to a normal distribution as sample size increases

The Central Limit Theorem states that the distribution of sample means approaches a normal distribution as the sample size increases, regardless of the shape of the original distribution.

Formula

Standard Error: SE = sigma / sqrt(n)
Sampling distribution: X_bar ≈ N(mu, sigma^2 / n) for large n
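The formula can be checked numerically. A minimal sketch (the exponential source, sample size, and trial count are illustrative choices, not part of the simulator): draw many samples of size n from an Exp(1) distribution, where mu = 1 and sigma = 1, and compare the mean and spread of the sample means against the CLT prediction.

```python
import random
import statistics

# Source distribution: exponential with rate 1, so mu = 1 and sigma = 1.
# The CLT predicts the sample means are approximately N(mu, sigma^2 / n).
random.seed(0)
n = 30          # sample size
trials = 20000  # number of sample means to collect

means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
         for _ in range(trials)]

print(round(statistics.fmean(means), 2))  # close to mu = 1
print(round(statistics.stdev(means), 2))  # close to sigma/sqrt(30) ≈ 0.18
```

Even though the exponential distribution is strongly skewed, the means cluster around mu with spread sigma/sqrt(n), exactly as the formula says.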

The Theorem That Powers Statistics

The Central Limit Theorem (CLT) is the reason most of statistics works. It says that no matter what distribution you start with — uniform, exponential, bimodal, or anything else — the average of many independent samples will follow a normal distribution. This universality is what makes polls reliable, clinical trials valid, and quality control possible.

Watching Normality Emerge

This simulation lets you see the CLT in action. Choose a wildly non-normal source distribution (try exponential or bimodal), then increase the sample size. With sample size 1, the distribution of means matches the source. By sample size 5-10, it starts looking bell-shaped. By 30, it is nearly perfectly Gaussian. The transformation is startling and beautiful.
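The same progression can be quantified instead of eyeballed. A hedged sketch (the `skewness` helper and the exponential source are illustrative assumptions): the exponential distribution has skewness 2, and the CLT implies the skewness of the sampling distribution shrinks roughly like 2/sqrt(n), so it should drop visibly between n = 1, 5, and 30.

```python
import random
import statistics

random.seed(1)

def skewness(xs):
    """Sample skewness: the third standardized moment."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean(((x - m) / s) ** 3 for x in xs)

def sampling_distribution(n, trials=10000):
    """Means of `trials` samples of size n from an Exp(1) source."""
    return [statistics.fmean(random.expovariate(1.0) for _ in range(n))
            for _ in range(trials)]

# Skewness of the sampling distribution shrinks as n grows,
# which is the bell shape emerging in the simulator.
for n in (1, 5, 30):
    print(n, round(skewness(sampling_distribution(n)), 2))
```

At n = 1 the skewness is near 2 (the source itself); by n = 30 it is a fraction of that, matching what the histogram shows.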

The Standard Error

The CLT also tells us how spread out the sample means will be: the standard error equals the population standard deviation divided by the square root of the sample size. This explains why averaging more data gives better estimates — but with diminishing returns. To cut the error in half, you need four times as much data.
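The diminishing-returns claim is easy to demonstrate. A minimal sketch (sample sizes 25 and 100 and the Exp(1) source are illustrative choices): measure the empirical standard error at two sample sizes where one is four times the other, and check that the error is cut in half.

```python
import random
import statistics

random.seed(2)
# Source: Exp(1), so the population standard deviation sigma = 1.

def se_of_mean(n, trials=20000):
    """Empirical standard error of the mean for sample size n."""
    means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
             for _ in range(trials)]
    return statistics.stdev(means)

se_25, se_100 = se_of_mean(25), se_of_mean(100)
print(round(se_25, 2))           # theory: sigma/sqrt(25)  = 0.20
print(round(se_100, 2))          # theory: sigma/sqrt(100) = 0.10
print(round(se_25 / se_100, 1))  # quadrupling n halves the error: ratio ≈ 2
```

Going from 25 to 100 observations costs four times the data but only doubles the precision, which is the sqrt(n) in the denominator at work.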

History and Importance

The CLT was first proved in limited form by Abraham de Moivre in 1733 and generalized by Pierre-Simon Laplace in 1812. The modern rigorous version was established by Aleksandr Lyapunov in 1901. Today it underpins virtually all of inferential statistics, from the humble confidence interval to sophisticated Bayesian methods that assume approximate normality of posteriors.

FAQ

What is the Central Limit Theorem?

The CLT states that when you take many samples from any distribution and compute their means, those means will be approximately normally distributed — even if the original data is not normal at all. This is arguably the most important theorem in statistics.

Why is sample size 30 considered the magic number?

For most practical distributions, a sample size of 30 is sufficient for the sampling distribution of the mean to be approximately normal. However, this is a rule of thumb — highly skewed distributions may require larger samples, while symmetric distributions converge faster.

How does the standard error relate to sample size?

The standard error equals sigma/sqrt(n), where sigma is the population standard deviation and n is the sample size. This means quadrupling the sample size halves the standard error, explaining why larger studies yield more precise estimates.

Why does the CLT matter in practice?

The CLT justifies the use of normal-based statistical tests (t-tests, confidence intervals, regression) even when the underlying data is not normally distributed. It is the reason polls, clinical trials, and quality control processes work reliably.


Embed

<iframe src="https://homo-deus.com/lab/mathematics/central-limit-theorem/embed" width="100%" height="400" frameborder="0"></iframe>