ASSESSING RIDGE REGRESSION AND PLS REGRESSION: A THOROUGH EXAMINATION

Authors

  • Felipe Andrade, Institute for Data Science, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil

Keywords:

Ridge Regression, Partial Least Squares, statistical modeling, chemometrics, variable projection.

Abstract

A longstanding debate in statistical modeling concerns whether Ridge Regression (RR) or Partial Least Squares (PLS) regression is the preferable method. Statisticians tend to favor RR because it rests on a well-established mathematical foundation, whereas chemometricians tend to favor PLS because it is built on orthogonal variable projection. The divide stems in part from the fact that the statistical properties of PLS remain poorly understood. Like Canonical Correlation (CC), PLS projects the data onto orthogonal vectors, but it maximizes the covariance between the X- and Y-score vectors, whereas CC maximizes their correlation; the two theories are therefore closely connected. A critical advantage of PLS is that it can handle data with more variables than samples. Results are validated with cross-validation and test sets and are complemented by graphical tools for data exploration. The seminal comparison by Frank et al. (1993) concluded with a slight preference for RR, although the difference between the two methods was minimal. Critics noted that the study relied on simulated data of full rank but with small singular values, and they urged a shift towards real-world chemometric datasets. Subsequent research, searchable on scholar.google.com, has produced mixed findings: Basak et al. (2002) advocate RR, while Irfan et al. (2013) and Wold et al. (1983) support PLS.
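
As an illustration of the two methods contrasted above, the following minimal sketch fits RR and single-response PLS (PLS1) to data with more variables than samples and scores both on a held-out test set. It is not taken from any of the cited studies. The simulated data, the penalty `lam`, the number of components, and the helper names `ridge_fit` and `pls1_fit` are all assumptions chosen for illustration.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge coefficients for centered X and y: (X'X + lam*I)^-1 X'y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def pls1_fit(X, y, n_components):
    """PLS1 coefficients for centered X and y, using NIPALS-style deflation."""
    Xk = X.copy()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ y
        w = w / np.linalg.norm(w)      # unit weight maximizing cov(Xk @ w, y)
        t = Xk @ w                     # X-score vector
        tt = t @ t
        p = Xk.T @ t / tt              # X-loading vector
        q.append(y @ t / tt)           # regression of y on the score
        Xk = Xk - np.outer(t, p)       # deflate X so later scores are orthogonal
        W.append(w)
        P.append(p)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)

# Simulated data (an assumption): 3 latent factors, 50 variables, 20 training samples.
rng = np.random.default_rng(0)
n_train, n_test, n_vars = 20, 10, 50
T = rng.normal(size=(n_train + n_test, 3))
X = T @ rng.normal(size=(3, n_vars)) + 0.1 * rng.normal(size=(n_train + n_test, n_vars))
y = T @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n_train + n_test)

# Center with training means only, so the test set stays untouched.
x_mean, y_mean = X[:n_train].mean(axis=0), y[:n_train].mean()
X_tr, y_tr = X[:n_train] - x_mean, y[:n_train] - y_mean
X_te, y_te = X[n_train:] - x_mean, y[n_train:] - y_mean

b_ridge = ridge_fit(X_tr, y_tr, lam=1.0)        # lam is an arbitrary choice
b_pls = pls1_fit(X_tr, y_tr, n_components=3)    # so is the component count

rmse_ridge = np.sqrt(np.mean((y_te - X_te @ b_ridge) ** 2))
rmse_pls = np.sqrt(np.mean((y_te - X_te @ b_pls) ** 2))
print(f"test RMSE  ridge: {rmse_ridge:.3f}  PLS1: {rmse_pls:.3f}")
```

In practice the penalty and the number of components would be chosen by cross-validation rather than fixed in advance, and a single simulated dataset says nothing general about which method is better.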

Published

2024-06-28

Section

Articles