DNA Profiling Simulator: STR Match Probability and Forensic Genetics

RMP ≈ 1 in 10¹⁵ — 13 loci, average profile

A 13-locus STR profile with average heterozygosity of 0.78 yields a random match probability of approximately 1 in 10¹⁵ — one in a quadrillion — providing extremely strong identification evidence.

Formula

RMP = Π(genotype_freq) over all loci, where genotype_freq = 2·p_i·p_j at a heterozygous locus and p_i² at a homozygous locus
DP = 1 - Σ(genotype_freq²) per locus
E(coincidental matches) = RMP × database_size
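The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the simulator's actual code: the function names are hypothetical, the allele frequencies are made up, and Hardy-Weinberg proportions are assumed at every locus.

```python
from math import prod

def genotype_frequency(p_i, p_j=None):
    """Hardy-Weinberg genotype frequency (assumed model).

    One argument means a homozygous locus (p_i^2);
    two arguments mean a heterozygous locus (2 * p_i * p_j).
    """
    if p_j is None:
        return p_i ** 2
    return 2 * p_i * p_j

def random_match_probability(locus_genotype_freqs):
    """Product rule: multiply the genotype frequencies of all typed loci."""
    return prod(locus_genotype_freqs)

def discrimination_power(allele_freqs):
    """DP = 1 - sum of squared genotype frequencies at a single locus."""
    alleles = list(allele_freqs)
    total = 0.0
    for a in range(len(alleles)):
        for b in range(a, len(alleles)):
            if a == b:
                g = genotype_frequency(alleles[a])
            else:
                g = genotype_frequency(alleles[a], alleles[b])
            total += g ** 2
    return 1 - total

# Hypothetical two-locus profile: one heterozygous locus, one homozygous locus.
rmp = random_match_probability([
    genotype_frequency(0.1, 0.2),  # 2 * 0.1 * 0.2 = 0.04
    genotype_frequency(0.3),       # 0.3^2 = 0.09
])

# Hypothetical locus with three alleles at frequencies 0.5, 0.3 and 0.2.
dp = discrimination_power([0.5, 0.3, 0.2])
```

With these made-up inputs, rmp comes out near 0.0036 and dp near 0.78. Real casework uses published population allele frequency tables and often applies corrections for population substructure, which this sketch omits.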

The Gold Standard

DNA profiling revolutionized forensic science after Alec Jeffreys developed the technique in 1984. Its first use in a criminal investigation, beginning in 1986, both exonerated an innocent suspect and led to the identification of the true perpetrator. Today, DNA evidence is the most powerful identification tool available to law enforcement, with random match probabilities that can fall below one in a sextillion (10⁻²¹) when full profiles are obtained.

How STR Profiling Works

Modern forensic DNA typing analyzes Short Tandem Repeat (STR) loci: specific chromosomal locations where a short sequence (e.g., AGAT) repeats a variable number of times. Each person inherits two copies (alleles) per locus, one from each parent. By typing 20 loci (the CODIS core set since 2017), analysts generate a profile that is effectively unique among unrelated individuals. PCR amplification allows profiling from as little as 100 picograms of DNA, about 15 cells at roughly 6.6 picograms per diploid cell.
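As a rough illustration of what a profile looks like as data, the sketch below stores each locus as a pair of repeat counts and compares two profiles locus by locus. The locus names are real CODIS loci, but the allele values, variable names, and function names are hypothetical, and real comparison software also handles dropout, stutter, and mixtures.

```python
def normalize(genotype):
    """Sort the allele pair so (14, 13) and (13, 14) compare equal."""
    return tuple(sorted(genotype))

def compare_profiles(evidence, reference):
    """Return (matching, mismatching) locus names for loci typed in both profiles."""
    shared = sorted(set(evidence) & set(reference))
    matching = [
        locus for locus in shared
        if normalize(evidence[locus]) == normalize(reference[locus])
    ]
    mismatching = [locus for locus in shared if locus not in matching]
    return matching, mismatching

# Hypothetical profiles: locus name -> (allele from one parent, allele from the other).
# Alleles are repeat counts; 9.3 denotes a partial-repeat microvariant.
evidence = {"D8S1179": (13, 14), "TH01": (6, 9.3), "FGA": (22, 22)}
suspect_a = {"D8S1179": (14, 13), "TH01": (9.3, 6), "FGA": (22, 22)}
suspect_b = {"D8S1179": (13, 14), "TH01": (7, 9.3), "FGA": (22, 24)}

match_a, mismatch_a = compare_profiles(evidence, suspect_a)
match_b, mismatch_b = compare_profiles(evidence, suspect_b)
```

Here suspect_a is consistent with the evidence at all three loci, while suspect_b differs at two of them. A single confirmed mismatch is normally enough to exclude a person as the source of a single-source sample.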

The Product Rule

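The strength of DNA evidence comes from the product rule: the probability of a multi-locus match is the product of the individual locus probabilities. If each locus has a genotype frequency of roughly 3-7%, multiplying across 13 to 20 loci yields combined probabilities from about 10⁻¹⁵ (13 loci at 7%) to about 10⁻³⁰ (20 loci at 3%). This multiplication is valid when the loci are statistically independent in the population (linkage equilibrium). Most CODIS loci lie on different chromosomes; a few share a chromosome (for example, CSF1PO and D5S818 on chromosome 5) but are far enough apart to be treated as independent among unrelated individuals.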
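The arithmetic behind the product rule is easiest to check on a log scale. The sketch below assumes, purely for illustration, that every locus has the same genotype frequency; real profiles vary from locus to locus, and the function name is hypothetical.

```python
from math import log10

def combined_log10_probability(per_locus_freq, n_loci):
    """log10 of the product of n_loci identical genotype frequencies."""
    return n_loci * log10(per_locus_freq)

# Illustrative values only: 13 loci at 7% each, and 20 loci at 3% each.
log_rmp_13 = combined_log10_probability(0.07, 13)
log_rmp_20 = combined_log10_probability(0.03, 20)
```

Under these assumptions log_rmp_13 is close to -15 and log_rmp_20 is close to -30.5, that is, roughly 1 in 10¹⁵ and 1 in 3 × 10³⁰. Adding loci shifts the exponent linearly, which is why each additional locus adds so much discriminating power.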

Challenges and Mixtures

Not all DNA evidence is straightforward. Crime scenes often yield mixed profiles from multiple contributors, degraded samples with allele dropout, or touch DNA with very low template amounts. Probabilistic genotyping software uses likelihood ratios to evaluate mixed and partial profiles statistically. Sample degradation, modeled in this simulation, reduces the number of usable loci and weakens the statistical power of the match.
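The effect of losing loci can be sketched as follows. This is not the simulator's degradation model: it simply assumes that the loci with the longest amplicons fail first, uses made-up genotype frequencies and placeholder locus names, and reports the single-source likelihood ratio as 1/RMP, ignoring mixtures, dropout probabilities, and laboratory error.

```python
from math import prod

# Hypothetical per-locus genotype frequencies, ordered from shortest to longest amplicon.
locus_freqs = {"locus_A": 0.05, "locus_B": 0.08, "locus_C": 0.04, "locus_D": 0.10}

def loci_surviving(freqs, n_lost):
    """Assumed model: the n_lost loci with the longest amplicons give no result."""
    names = list(freqs)
    return names[: len(names) - n_lost]

def partial_profile_rmp(freqs, usable_loci):
    """Product rule restricted to the loci that produced a usable genotype."""
    return prod(freqs[name] for name in usable_loci)

full_rmp = partial_profile_rmp(locus_freqs, loci_surviving(locus_freqs, 0))
degraded_rmp = partial_profile_rmp(locus_freqs, loci_surviving(locus_freqs, 2))

# Simplified single-source likelihood ratio: how much more probable the match is
# if the person of interest is the source than if an unrelated person is.
full_lr = 1 / full_rmp
degraded_lr = 1 / degraded_rmp
```

With these numbers, losing two of four loci raises the RMP from about 1.6 × 10⁻⁵ to 0.004 and cuts the likelihood ratio from about 62,500 to 250. The same pattern holds at full scale: every locus lost to degradation removes one factor from the product.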

FAQ

What are STR loci in DNA profiling?

Short Tandem Repeats (STRs) are regions of DNA where a 2–6 base pair sequence repeats multiple times. Different people have different numbers of repeats at each locus. The FBI's CODIS system uses 20 STR loci; the combined profile is virtually unique to each individual (except identical twins).

What is random match probability?

Random match probability (RMP) is the chance that a randomly selected unrelated person from the population would have the same DNA profile as the evidence sample. With 13+ loci, RMP typically ranges from 10⁻¹⁵ to 10⁻³⁰, making false matches astronomically unlikely.

Can DNA evidence be wrong?

The DNA science itself is extremely reliable, but errors can occur through contamination (mixing samples), mislabeling, misinterpretation of mixed profiles, or laboratory procedural failures. The Phantom of Heilbronn case, in which swabs contaminated during manufacture led investigators to pursue a nonexistent serial offender until the error was recognized in 2009, illustrates the importance of quality control.

What is CODIS?

The Combined DNA Index System (CODIS) is the FBI's database of DNA profiles, containing over 20 million offender profiles and 1 million forensic profiles. It enables 'cold hits', in which crime scene DNA is matched to a previously convicted offender already in the database, aiding thousands of investigations annually.
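Searching a large database raises the question of how many matches would be expected by coincidence alone. The sketch below assumes that every database profile comes from a person unrelated to the true source and that the same RMP applies to each comparison; the function names and the example figures are illustrative.

```python
from math import expm1, log1p

def expected_coincidental_matches(rmp, database_size):
    """Expected number of chance matches when one profile is searched against the database."""
    return rmp * database_size

def prob_at_least_one_coincidental_match(rmp, database_size):
    """1 - (1 - rmp)^N, computed with log1p/expm1 so tiny values are not rounded away."""
    return -expm1(database_size * log1p(-rmp))

# Illustrative figures: RMP of 1 in 10^15 searched against 20 million profiles.
rmp = 1e-15
n_profiles = 20_000_000
expected = expected_coincidental_matches(rmp, n_profiles)
p_any = prob_at_least_one_coincidental_match(rmp, n_profiles)
```

For a full profile both quantities are about 2 × 10⁻⁸, so a coincidental hit is very unlikely. For a partial profile with an RMP near 1 in 10⁶, the same search would be expected to produce about 20 chance matches, which is why partial-profile database hits need careful interpretation.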
