Regression Diagnostics: Validating Model Assumptions

Simulator · Intermediate · ~10 min
R² ≈ 0.56 — moderate fit with diagnostics

With 80 observations, slope 0.8, noise σ=1.0, and 5% outliers, the OLS regression achieves R²≈0.56. Diagnostic plots reveal the outliers as high-leverage or high-residual points.
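The setup above can be reproduced in a few lines of numpy. This is a minimal sketch under assumed details (uniform x on [0, 10], vertical outliers, a fixed random seed); the exact R² will vary with the seed and outlier placement.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 80
x = rng.uniform(0, 10, n)
y = 0.8 * x + rng.normal(0, 1.0, n)          # slope 0.8, noise sigma = 1.0

# Contaminate ~5% of observations with large vertical outliers
out_idx = rng.choice(n, size=4, replace=False)
y[out_idx] += rng.normal(0, 8.0, size=4)

# OLS via least squares on the design matrix [1, x]
X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ beta

r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - np.mean(y)) ** 2)
print(f"slope = {beta[1]:.3f}, R^2 = {r2:.3f}")
```

With this contamination level, R² typically lands well below the ~0.84 the clean data would give.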

Formula

R² = 1 − Σ(y_i − ŷ_i)² / Σ(y_i − ȳ)²
Cook's D_i = (e_i² / (p · MSE)) · (h_ii / (1 − h_ii)²)
h_ii = x_i'(X'X)⁻¹x_i (leverage; x_i is the i-th predictor row of X, e_i the i-th residual, p the number of coefficients, MSE the mean squared error)
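The formulas above translate directly into numpy. A small sketch on made-up data, where the last point is given an extreme x value and an off-trend y so that it dominates Cook's distance:

```python
import numpy as np

# Toy data: last point has extreme x (high leverage) and an off-trend y
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 15.0])
y = np.array([1.1, 2.0, 2.9, 4.2, 5.1, 25.0])

X = np.column_stack([np.ones_like(x), x])    # design matrix with intercept
n, p = X.shape

beta = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta                             # residuals e_i
mse = e @ e / (n - p)

# Leverage: diagonal of the hat matrix H = X (X'X)^-1 X'
H = X @ np.linalg.inv(X.T @ X) @ X.T
h = np.diag(H)

# Cook's distance, term by term as in the formula above
cooks_d = (e ** 2 / (p * mse)) * (h / (1 - h) ** 2)
print(np.round(cooks_d, 3))
```

The extreme point's leverage exceeds 0.9, and its Cook's distance is far above the D > 1 rule of thumb.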

Trust, But Verify

Fitting a regression model is easy. Trusting its results requires checking that the model's assumptions are approximately satisfied and that no small set of observations is driving the conclusions. Regression diagnostics provide the tools for this verification. In biostatistics, where regression results may determine treatment guidelines affecting millions of patients, thorough diagnostics are not optional — they are essential to responsible analysis.

The Four Diagnostic Plots

The visualization displays four standard diagnostic panels. Top-left: residuals vs. fitted values (checking linearity and homoscedasticity). Top-right: normal Q-Q plot of standardized residuals (checking normality). Bottom-left: scale-location plot (√|standardized residuals| vs. fitted values, a clearer check for heteroscedasticity). Bottom-right: residuals vs. leverage with Cook's distance contours (identifying influential observations).
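The four panels can be sketched with matplotlib, computing each plotted quantity by hand on simulated data (statsmodels and R generate these plots automatically; the simulated dataset and seed here are illustrative assumptions):

```python
import numpy as np
from statistics import NormalDist
import matplotlib
matplotlib.use("Agg")                        # render off-screen
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
n = 80
x = rng.uniform(0, 10, n)
y = 0.8 * x + rng.normal(0, 1.0, n)

X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
fitted = X @ beta
resid = y - fitted

# Standardized residuals need the leverage h_ii (hat-matrix diagonal)
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)
p = X.shape[1]
mse = resid @ resid / (n - p)
std_resid = resid / np.sqrt(mse * (1 - h))

fig, ax = plt.subplots(2, 2, figsize=(9, 7))
ax[0, 0].scatter(fitted, resid, s=12)
ax[0, 0].axhline(0, ls="--")
ax[0, 0].set_title("Residuals vs Fitted")

# Normal Q-Q: sorted standardized residuals vs theoretical normal quantiles
theo = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
ax[0, 1].scatter(theo, np.sort(std_resid), s=12)
ax[0, 1].set_title("Normal Q-Q")

ax[1, 0].scatter(fitted, np.sqrt(np.abs(std_resid)), s=12)
ax[1, 0].set_title("Scale-Location")

ax[1, 1].scatter(h, std_resid, s=12)
ax[1, 1].set_title("Residuals vs Leverage")
fig.tight_layout()
```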

Outliers, Leverage, and Influence

Three related but distinct concepts determine how individual observations affect regression. Outliers have large residuals — they deviate from the pattern of other data. High-leverage points have unusual predictor values — they sit far from the center of the predictor space. Influential points actually change the regression results when removed — they combine outlier status with high leverage. Cook's distance elegantly combines these into a single measure of influence.
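The distinction can be demonstrated numerically: add one hypothetical point of each kind to clean data on the line y = x and watch which one actually moves the slope. The specific coordinates below are made up for illustration.

```python
import numpy as np

def fit_slope(x, y):
    """OLS slope (with intercept) via least squares."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

rng = np.random.default_rng(2)
x0 = rng.uniform(0, 10, 50)
y0 = x0 + rng.normal(0, 0.5, 50)             # clean data scattered around y = x
base = fit_slope(x0, y0)

# Three hypothetical added points, one per concept
cases = {
    "outlier (low leverage)":   (5.0, 15.0),   # central x, y far off the line
    "high leverage (on-trend)": (30.0, 30.0),  # extreme x, consistent with the line
    "influential":              (30.0, 10.0),  # extreme x AND far off the line
}
slopes = {}
for name, (xi, yi) in cases.items():
    slopes[name] = fit_slope(np.append(x0, xi), np.append(y0, yi))
    print(f"{name:26s}: slope {base:.2f} -> {slopes[name]:.2f}")
```

Only the third point, combining high leverage with a large residual, pulls the slope appreciably away from 1.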

What to Do When Assumptions Fail

When diagnostics reveal problems, the response depends on the violation. Heteroscedasticity can be addressed with weighted least squares or robust standard errors (HC estimators). Non-normality often matters less than feared (the central limit theorem protects inference for large samples). Nonlinearity suggests adding polynomial terms, splines, or using generalized additive models. Influential observations should be investigated substantively — not automatically deleted — as they may represent important subpopulations or data quality issues.

FAQ

Why are regression diagnostics important?

Regression results (coefficients, p-values, confidence intervals) are only valid when model assumptions hold: linearity, independence, homoscedasticity (constant variance), and normality of residuals. Diagnostics check these assumptions and reveal influential observations that may distort conclusions.

What is Cook's distance?

Cook's distance measures how much all fitted values change when a single observation is removed. It combines leverage (how unusual the predictor values are) with residual size. Observations with Cook's D > 0.5 deserve scrutiny; D > 1 is highly influential.
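The definition can be checked directly: delete each observation, refit, and measure the shift in all fitted values. A numpy sketch on simulated data with one planted outlier, where the brute-force computation should match the analytic formula:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(0, 10, 30)
y = 0.8 * x + rng.normal(0, 1.0, 30)
y[0] += 6.0                                  # plant one outlier

X = np.column_stack([np.ones_like(x), x])
n, p = X.shape
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
fitted = X @ beta
e = y - fitted
mse = e @ e / (n - p)
h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)

# Analytic shortcut: no refitting needed
d_formula = (e ** 2 / (p * mse)) * (h / (1 - h) ** 2)

# Brute force: delete observation i, refit, compare ALL fitted values
d_loo = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    beta_i = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
    d_loo[i] = np.sum((fitted - X @ beta_i) ** 2) / (p * mse)

print(np.allclose(d_formula, d_loo))
```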

What do residual plots reveal?

Residuals vs. fitted values should show a random scatter with constant spread. Patterns indicate violations: a funnel shape suggests heteroscedasticity, a curve suggests nonlinearity, and clusters suggest missing variables or subgroups. Q-Q plots check normality of residuals.
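The funnel-shape check has a crude numeric companion: correlate |residuals| with fitted values. On homoscedastic data the correlation hovers near zero; on funnel-shaped data it is clearly positive. A sketch on simulated data (the noise models are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(0, 10, 300)

def abs_resid_corr(y):
    """Correlation of |residuals| with fitted values; near 0 if spread is constant."""
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    fitted = X @ beta
    resid = y - fitted
    return np.corrcoef(fitted, np.abs(resid))[0, 1]

y_const = 0.8 * x + rng.normal(0, 1.0, 300)              # constant spread
y_funnel = 0.8 * x + rng.normal(0, 0.2 + 0.4 * x, 300)   # spread grows with x

print(f"constant spread: {abs_resid_corr(y_const):+.2f}")
print(f"funnel shape:    {abs_resid_corr(y_funnel):+.2f}")
```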

What is leverage in regression?

Leverage measures how far an observation's predictor values are from the mean. High-leverage points have disproportionate influence on the fitted line. They are not necessarily outliers in the response, but if they are, their combination of high leverage and large residual makes them highly influential.
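For simple regression, "distance from the mean" is literal: the closed form h_ii = 1/n + (x_i − x̄)² / Σ(x_j − x̄)² matches the hat-matrix diagonal exactly. A numpy sketch with one made-up far-out point:

```python
import numpy as np

rng = np.random.default_rng(6)
x = rng.uniform(0, 10, 40)
x[0] = 30.0                                  # one point far from the others

X = np.column_stack([np.ones_like(x), x])
h_hat = np.einsum("ij,jk,ik->i", X, np.linalg.inv(X.T @ X), X)

# Closed form: leverage is driven by squared distance from the mean of x
h_closed = 1 / len(x) + (x - x.mean()) ** 2 / np.sum((x - x.mean()) ** 2)

print(np.allclose(h_hat, h_closed), round(h_hat[0], 2))
```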

Embed

<iframe src="https://homo-deus.com/lab/biostatistics/regression-diagnostics/embed" width="100%" height="400" frameborder="0"></iframe>