PCA: Reducing Dimensions While Preserving Information

2 PCs capture ≈ 85% variance — effective compression

With correlated 5-dimensional data, the first 2 principal components typically capture about 85% of the total variance, enabling visualization of high-dimensional structure in 2D.
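To make that claim concrete, here is a minimal sketch (not the simulation's own code) that builds correlated 5-dimensional data from two latent factors and checks how much variance the first two components capture. The factor structure, noise level, random seed, and use of scikit-learn are all assumptions for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical setup (not taken from the simulation): 5 features driven
# mostly by 2 latent factors, the kind of correlation structure where a
# couple of PCs dominate the variance.
rng = np.random.default_rng(0)
latent = rng.normal(size=(2000, 2))                       # 2 underlying factors
X = latent @ rng.normal(size=(2, 5)) + 0.8 * rng.normal(size=(2000, 5))

pca = PCA().fit(X)
top2 = pca.explained_variance_ratio_[:2].sum()
print(f"first 2 PCs explain {100 * top2:.1f}% of the variance")
# The exact figure depends on the noise level and seed, but it sits well
# above what any 2 of the 5 raw features would cover on their own.
```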

Formula

Covariance Matrix: C = (1/n) Xᵀ X (where X is mean-centered)
Eigendecomposition: C v = λ v
Variance Explained = λᵢ / Σλⱼ × 100%
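A from-scratch NumPy sketch of these three steps, run on arbitrary demo data, might look like this:

```python
import numpy as np

# Arbitrary demo data: rows = samples, columns = features.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))

Xc = X - X.mean(axis=0)                     # mean-center each column
C = (Xc.T @ Xc) / len(Xc)                   # covariance matrix C = (1/n) Xᵀ X
eigvals, eigvecs = np.linalg.eigh(C)        # eigendecomposition C v = λ v
order = np.argsort(eigvals)[::-1]           # sort by eigenvalue, descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained = 100 * eigvals / eigvals.sum()   # variance explained per PC, in %
print(np.round(explained, 1))
```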

Finding the Essential Dimensions

In a world drowning in high-dimensional data — genomes with 20,000 genes, images with millions of pixels, customer profiles with hundreds of features — PCA answers a fundamental question: what are the few directions that matter most? Karl Pearson invented the method in 1901, and it remains one of the most widely used techniques in all of data science, from genomics to finance to natural language processing.

How PCA Works

PCA finds the principal components — orthogonal directions ordered by how much variance they capture. The first component points in the direction of maximum spread in the data. The second captures the most remaining spread perpendicular to the first. Mathematically, these are the eigenvectors of the data's covariance matrix, ordered by their eigenvalues. The simulation above visualizes these eigenvectors emerging from the data cloud.
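As a quick sanity check of those properties, a small scikit-learn sketch on synthetic correlated data (the data and the library choice are illustrative, not part of the simulation) could look like this:

```python
import numpy as np
from sklearn.decomposition import PCA

# Check the two properties just described: the principal directions are
# mutually orthogonal, and the variance they capture is non-increasing.
rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4)) @ rng.normal(size=(4, 6))    # correlated 6-D data

pca = PCA().fit(X)
V = pca.components_                                   # rows = principal directions
print(np.allclose(V @ V.T, np.eye(V.shape[0])))       # orthonormal -> True
print(np.all(np.diff(pca.explained_variance_) <= 0))  # sorted by variance -> True

Z = pca.transform(X)[:, :2]                           # project onto first 2 PCs
print(Z.shape)                                        # (300, 2)
```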

Compression Without Losing Meaning

The magic of PCA is that real-world data is often intrinsically low-dimensional. A dataset with 100 features might have 95% of its variance captured by just 3 principal components. By projecting onto these 3 components, you achieve roughly 33:1 compression while discarding only 5% of the variance. This enables visualization of high-dimensional data, reduces computational costs, and often improves model performance by removing noise.
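A hedged sketch of that workflow, with made-up sizes (1000 samples, 100 features, 3 latent factors), might look like this:

```python
import numpy as np
from sklearn.decomposition import PCA

# Illustrative data: 100 features that are mostly explained by 3 latent
# factors, compressed to 3 columns and then reconstructed.
rng = np.random.default_rng(3)
latent = rng.normal(size=(1000, 3))
X = latent @ rng.normal(size=(3, 100)) + 0.1 * rng.normal(size=(1000, 100))

pca = PCA(n_components=3).fit(X)
Z = pca.transform(X)                      # 1000 x 3 instead of 1000 x 100
X_hat = pca.inverse_transform(Z)          # approximate reconstruction

retained = pca.explained_variance_ratio_.sum()
print(f"compression ratio ~{X.shape[1] // Z.shape[1]}:1, "
      f"variance retained {100 * retained:.1f}%")
```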

Beyond Linear PCA

Standard PCA only captures linear relationships. For complex, nonlinear data, extensions like kernel PCA, t-SNE, and UMAP can discover curved manifolds and clusters that PCA misses. However, PCA remains the essential first step in any dimensionality reduction pipeline — it's fast, interpretable, and often surprisingly effective even for complex datasets.
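As a rough illustration of the linear limitation, the sketch below contrasts plain PCA with scikit-learn's KernelPCA on two concentric circles; the RBF kernel and gamma value are illustrative choices, not tuned recommendations.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA

# Two concentric circles are a classic nonlinear structure that a linear
# projection cannot untangle.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

Z_lin = PCA(n_components=2).fit_transform(X)
Z_ker = KernelPCA(n_components=2, kernel="rbf", gamma=10).fit_transform(X)

# Scatter-plotting Z_lin and Z_ker colored by y would show the rings still
# interleaved after linear PCA, but typically separable in the kernel PCA
# embedding.
print(Z_lin.shape, Z_ker.shape)
```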

FAQ

What is PCA (Principal Component Analysis)?

PCA finds the directions of maximum variance in high-dimensional data and projects the data onto these directions. The first principal component captures the most variance, the second captures the most remaining variance orthogonal to the first, and so on.

When should you use PCA?

Use PCA when you have many correlated features and want to reduce dimensionality for visualization, noise reduction, or computational efficiency. PCA is most effective when the data lies near a low-dimensional subspace — that is, when a few directions capture most of the variance.

How do you choose the number of components?

Common methods include keeping enough components to explain 90-95% of the variance, applying the Kaiser criterion (keep components with eigenvalue > 1, most meaningful when PCA is run on standardized data), or looking for an 'elbow' in the scree plot. Cross-validation on downstream task performance is the most rigorous approach.
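For the variance-threshold rule, a small sketch (synthetic data, with the 95% threshold chosen arbitrarily) might look like this:

```python
import numpy as np
from sklearn.decomposition import PCA

# Pick the smallest k whose cumulative explained variance reaches 95%.
rng = np.random.default_rng(4)
X = rng.normal(size=(800, 10)) @ rng.normal(size=(10, 30))   # rank-10 data in 30-D

pca = PCA().fit(X)
cumulative = np.cumsum(pca.explained_variance_ratio_)
k = int(np.searchsorted(cumulative, 0.95)) + 1
print(f"components needed for 95% of the variance: {k}")
```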

What are the limitations of PCA?

PCA only captures linear relationships, and it is sensitive to feature scaling, so standardize features first. Its components are orthogonal linear combinations of all the original features, which can make them hard to interpret and may not match the true underlying factors. For nonlinear dimensionality reduction, consider kernel PCA, t-SNE, or UMAP.
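To see the scaling sensitivity concretely, this sketch compares PCA on raw versus standardized features; the feature scales and their interpretations are invented for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# One feature measured in much larger units dominates unstandardized PCA.
rng = np.random.default_rng(5)
X = np.column_stack([
    rng.normal(scale=1.0, size=500),     # e.g. a ratio near 1
    rng.normal(scale=1.0, size=500),     # another small-scale feature
    rng.normal(scale=1000.0, size=500),  # e.g. income in dollars
])

raw = PCA().fit(X).explained_variance_ratio_
std = PCA().fit(StandardScaler().fit_transform(X)).explained_variance_ratio_
print("raw         :", np.round(raw, 3))   # first PC ≈ the large-scale feature
print("standardized:", np.round(std, 3))   # variance spread across components
```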
