COMBSS: Best Subset Selection via Continuous Optimization

Samuel Muller (Macquarie University)

Friday 1st July, 2022 15:00-16:00 Zoom

Abstract

Recent rapid developments in information technology have enabled the collection of high-dimensional complex data in fields including engineering, economics, finance, biology, and health sciences. High-dimensional means that the number of features is large, often far larger than the number of collected data samples. In many of these applications, it is desirable to find a small best subset of predictors so that the resulting model has good prediction accuracy.

In this talk, we will first briefly review existing optimization and search methods in the literature that tackle the problem of identifying or selecting the set of important predictors. We then present COMBSS, a novel continuous optimization-based solution that directly solves the best subset selection problem in linear regression. COMBSS turns out to be very fast, potentially making best subset selection possible even when the number of features runs into the thousands.
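
For readers unfamiliar with the problem, the sketch below illustrates what best subset selection in linear regression asks for: among all subsets of k predictors, find the one whose least-squares fit has the smallest residual sum of squares. It is a brute-force baseline and not the COMBSS method from the talk. The function name best_subset_exhaustive, the simulated data, and all parameter values are assumptions made for illustration only.

```python
from itertools import combinations

import numpy as np


def best_subset_exhaustive(X, y, k):
    """Return (indices, rss) for the size-k column subset of X with the
    smallest residual sum of squares. Hypothetical helper, illustration only."""
    best_idx, best_rss = None, np.inf
    for idx in combinations(range(X.shape[1]), k):
        cols = list(idx)
        # Ordinary least squares restricted to the chosen columns.
        coef, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
        resid = y - X[:, cols] @ coef
        rss = float(resid @ resid)
        if rss < best_rss:
            best_idx, best_rss = idx, rss
    return best_idx, best_rss


# Simulated data (an assumption): 8 predictors, only columns 1 and 4 matter.
rng = np.random.default_rng(0)
n, p = 50, 8
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[[1, 4]] = [3.0, -2.0]
y = X @ beta + 0.1 * rng.standard_normal(n)

subset, rss = best_subset_exhaustive(X, y, k=2)
print(subset, rss)
```

The loop visits every one of the "p choose k" subsets, so its cost grows combinatorially with the number of features. That is why exhaustive search stops being practical long before the number of features reaches the thousands.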

Simulation results are presented that highlight the strong performance of COMBSS in comparison to existing popular non-exhaustive methods such as Forward Stepwise and the Lasso, as well as exhaustive methods such as Mixed-Integer Optimization. Because of the outstanding overall performance of COMBSS, framing the best subset selection challenge as a continuous optimization problem opens new research directions for feature extraction for a large variety of regression models. We conclude the presentation with a brief discussion on future avenues of research.
