DNA Fingerprinting: Match Probability Calculator

simulator intermediate ~10 min
1 in 1.2 × 10²²: virtually unique identification

With 13 STR loci and an average allele frequency of 0.1, the simplified formula gives (2 × 0.1²)¹³ ≈ 8.2 × 10⁻²³, a random match probability of roughly 1 in 1.2 × 10²². That is more than a trillion times the world population, so a coincidental match between unrelated individuals is essentially impossible under the model's assumptions.

Formula

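RMP = ∏ g_i over all tested loci, where g_i = 2 × p_i × q_i at a heterozygous locus and g_i = p_i² at a homozygous locus (p_i and q_i are the frequencies of the observed alleles at locus i)
Simplified: RMP ≈ (2 × f²)^num_loci, where f = average allele frequency and every locus is treated as heterozygous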
Expected_matches = population_size × RMP
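
As a minimal sketch, the simplified formula and the expected-match count can be written in Python. The function names and the population figure of 8 billion are assumptions chosen for illustration; they are not taken from the simulator's source.

```python
def simplified_rmp(avg_allele_freq: float, num_loci: int) -> float:
    """Simplified RMP: each locus is treated as heterozygous with two
    alleles of the same average frequency f, contributing 2 * f**2."""
    if not 0.0 < avg_allele_freq <= 1.0:
        raise ValueError("allele frequency must be in (0, 1]")
    if num_loci < 1:
        raise ValueError("need at least one locus")
    return (2.0 * avg_allele_freq ** 2) ** num_loci


def expected_matches(population_size: float, rmp: float) -> float:
    """Expected number of unrelated people in the population who match by chance."""
    return population_size * rmp


rmp = simplified_rmp(0.1, 13)
world = 8e9  # assumed world population, for illustration only
print(f"RMP = {rmp:.3e} (about 1 in {1 / rmp:.2e})")
print(f"Expected coincidental matches: {expected_matches(world, rmp):.2e}")
```

With f = 0.1 and 13 loci the per-locus factor is 0.02, so the result is 0.02¹³, roughly 8.2 × 10⁻²³. The expected number of matches in the assumed population is then far below one.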

The Genetic Barcode

In 1984, Alec Jeffreys at the University of Leicester discovered that certain regions of human DNA vary enormously between individuals. Jeffreys worked with hypervariable minisatellites; modern forensic profiling targets shorter repeats called Short Tandem Repeats (STRs), which act like a biological barcode. By measuring the number of repeats at multiple independent loci, forensic scientists create a profile so specific that the probability of two unrelated people sharing it is typically less than 1 in a trillion.

The Product Rule

The statistical power of DNA fingerprinting comes from independence. If the STR loci are inherited independently (forensic loci are chosen to sit on different chromosomes or far apart on the same one), the probability of a full-profile match equals the product of the individual locus match probabilities. Even if each locus has a modest 1-in-100 match probability, 13 independent loci yield 1 in 100¹³, or 1 in 10²⁶, a number so vast it dwarfs the number of humans who have ever lived.
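
The product rule can be sketched in a few lines of Python. The allele frequencies below are invented for illustration and do not come from any real database; the helper names are likewise hypothetical.

```python
from math import prod
from typing import Optional, Sequence, Tuple


def genotype_frequency(p: float, q: Optional[float] = None) -> float:
    """Hardy-Weinberg genotype frequency at one locus:
    p**2 for a homozygote (one allele given), 2*p*q for a heterozygote."""
    return p * p if q is None else 2.0 * p * q


def profile_rmp(loci: Sequence[Tuple[float, ...]]) -> float:
    """Multiply per-locus genotype frequencies, assuming the loci are independent."""
    return prod(genotype_frequency(*alleles) for alleles in loci)


# Hypothetical three-locus profile: heterozygous, homozygous, heterozygous.
example_profile = [(0.10, 0.20), (0.15,), (0.05, 0.30)]
print(f"Three-locus RMP: {profile_rmp(example_profile):.2e}")

# Thirteen independent loci, each with a 1-in-100 match probability.
thirteen_loci = prod([0.01] * 13)
print(f"13 loci at 1 in 100 each: {thirteen_loci:.0e}")
```

Each locus contributes one factor, so adding a locus can only shrink the product. The independence assumption carries the whole calculation: if two loci were correlated, multiplying their frequencies would understate the match probability.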

Population Genetics Matters

Match probability calculations depend on allele frequencies within the relevant population, and those frequencies differ between ethnic groups because of population history. Forensic labs maintain frequency databases for major population groups and typically report the most conservative match probability, meaning the highest one (the figure most favorable to the defendant). The 1996 National Research Council report known as NRC II set out guidelines for handling population substructure in court.
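
A rough sketch of both ideas follows: computing the profile frequency against several databases and keeping the highest value, with an optional subpopulation correction. The correction is written in the form usually associated with NRC II (often credited to Balding and Nichols); treat the exact formula and any choice of theta as assumptions to check against the report. The locus names, allele labels, populations, and frequencies are all made up.

```python
from math import prod


def theta_corrected_frequency(p, q=None, theta=0.0):
    """Genotype frequency with a subpopulation correction (assumed form).
    With theta = 0 this reduces to p**2 (homozygote) or 2*p*q (heterozygote)."""
    denom = (1 + theta) * (1 + 2 * theta)
    if q is None:  # homozygote
        return (2 * theta + (1 - theta) * p) * (3 * theta + (1 - theta) * p) / denom
    return 2 * (theta + (1 - theta) * p) * (theta + (1 - theta) * q) / denom


def profile_rmp(profile, freqs, theta=0.0):
    """RMP of a profile against one population's allele-frequency table."""
    factors = []
    for locus, a1, a2 in profile:
        p = freqs[locus][a1]
        q = None if a1 == a2 else freqs[locus][a2]
        factors.append(theta_corrected_frequency(p, q, theta))
    return prod(factors)


def most_conservative(profile, databases, theta=0.0):
    """Return (population, RMP) for the database giving the highest RMP."""
    results = {name: profile_rmp(profile, f, theta) for name, f in databases.items()}
    name = max(results, key=results.get)
    return name, results[name]


# Hypothetical two-locus profile and two invented databases.
profile = [("locus_1", "12", "14"), ("locus_2", "9", "9")]
databases = {
    "population_A": {"locus_1": {"12": 0.10, "14": 0.20}, "locus_2": {"9": 0.30}},
    "population_B": {"locus_1": {"12": 0.15, "14": 0.25}, "locus_2": {"9": 0.20}},
}
print(most_conservative(profile, databases))
print(most_conservative(profile, databases, theta=0.01))
```

Reporting the maximum across databases means the stated figure never overstates the rarity of the profile for any of the populations considered.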

Limitations and Controversies

DNA statistics assume the evidence sample is clean, single-source, and properly handled. In practice, crime scene DNA is often degraded, mixed with multiple contributors, or present in trace quantities. Mixture interpretation — separating two or more contributors — remains an active area of research and courtroom debate, leading to the development of probabilistic genotyping software.

FAQ

How does DNA fingerprinting work?

DNA fingerprinting analyzes Short Tandem Repeat (STR) regions — sections of DNA where a short sequence (2–6 base pairs) repeats a variable number of times. Each person has two alleles per locus (one from each parent). By examining multiple STR loci, forensic scientists create a genetic profile that is statistically unique.
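
To make the idea of counting repeats concrete, here is a small Python sketch that finds the longest run of a motif in a sequence. The sequences and the GATA motif are toy values, and real genotyping works from instrument data rather than raw strings, so this is only an illustration of what an allele "number" means.

```python
import re


def longest_tandem_repeat(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of motif in sequence."""
    if not motif:
        raise ValueError("motif must not be empty")
    motif = motif.upper()
    # Find every maximal run of consecutive motif copies.
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


# Toy example: one copy of a locus from each parent, with different repeat counts.
maternal = "ACGT" + "GATA" * 7 + "TTCA"
paternal = "ACGT" + "GATA" * 9 + "TTCA"
genotype = (longest_tandem_repeat(maternal, "GATA"),
            longest_tandem_repeat(paternal, "GATA"))
print(genotype)
```

The pair of repeat counts is the genotype at that locus; a full profile is the list of such pairs across all tested loci.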

What is a random match probability?

The random match probability (RMP) is the chance that a randomly selected unrelated person from the population would have the same DNA profile as the evidence sample. With 13 or more loci, RMP typically falls below 1 in a trillion, making coincidental matches essentially impossible among unrelated individuals. Close relatives share alleles by descent, so the figure does not apply to them.

How many STR loci are used in forensic DNA analysis?

The FBI's CODIS system originally used 13 core STR loci; in 2017 the core set expanded to 20. The European Standard Set uses 12 loci. Each added locus multiplies the random match probability by another small factor, so discrimination between individuals improves exponentially with the number of loci.
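
The effect of the locus count can be sketched with the page's simplified model. The average allele frequency of 0.1 is an assumption; real loci differ in how informative they are, so these figures show only the trend.

```python
def simplified_rmp(avg_allele_freq: float, num_loci: int) -> float:
    """Simplified model: every locus contributes a factor of 2 * f**2."""
    return (2.0 * avg_allele_freq ** 2) ** num_loci


# Locus counts taken from the answer above; f = 0.1 is an assumed average.
locus_sets = {
    "European Standard Set": 12,
    "CODIS (original)": 13,
    "CODIS (expanded)": 20,
}
rmp_by_set = {name: simplified_rmp(0.1, n) for name, n in locus_sets.items()}

for name, rmp in rmp_by_set.items():
    print(f"{name:>22}: {locus_sets[name]:2d} loci, RMP about {rmp:.1e}")
```

Under this model each extra locus multiplies the RMP by 0.02, which is why moving from 13 to 20 loci lowers it by many orders of magnitude.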

Can DNA evidence be wrong?

The statistics themselves are extremely reliable, but errors enter through the samples: contamination, mislabeling, degradation, or mixtures from multiple contributors. The match probability calculation assumes a clean, single-source sample, so real casework must account for these complications.
