Evaluation of Likelihood Ratio Tests in Penalized Logistic Regression with Factors
Lea Kaufmann, RWTH Aachen University
Co-author: Maria Kateri, RWTH Aachen University
Abstract: High-dimensional regression-type problems, in which the number of parameters to be estimated is large relative to the sample size, arise in numerous applications. Penalized regression techniques offer a way to estimate regression parameters while simultaneously performing model selection. However, conducting hypothesis tests, such as likelihood ratio tests (LRTs), in such high-dimensional settings is challenging because of the large number of parameters. To address this for a penalty function designed for simultaneous factor selection and level fusion, we propose a two-stage method for statistical inference using LRTs. The method begins by splitting the dataset into two parts: one for model selection and the other for hypothesis testing.
Additionally, we consider two multiple-split extensions aimed at stabilizing the results. After discussing these techniques from a theoretical perspective, we present simulation results comparing the performance of the single-split method and its multiple-split extensions across various simulation designs and highlighting their characteristics and advantages.
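
The single-split idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the helper names (split_indices, chi2_sf, lrt), the 50/50 split fraction, and the use of the classical chi-square reference distribution for the LRT statistic are all assumptions made for illustration, and the penalized selection step itself is left out. The sketch only shows how the data could be divided and how a p-value could be obtained from the log-likelihoods of two nested models fitted on the testing part.

```python
import math
import random


def split_indices(n, frac_select=0.5, seed=0):
    """Randomly split range(n) into a selection part and a testing part.

    frac_select is a hypothetical tuning choice, not taken from the abstract.
    """
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    cut = int(n * frac_select)
    return sorted(idx[:cut]), sorted(idx[cut:])


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for X ~ chi-square, integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        # Even df = 2m: exp(-z) * sum_{k=0}^{m-1} z^k / k!
        terms = (z ** k / math.factorial(k) for k in range(df // 2))
        return math.exp(-z) * sum(terms)
    # Odd df = 2m + 1:
    # erfc(sqrt(z)) + exp(-z) * sum_{k=1}^{m} z^(k - 1/2) / Gamma(k + 1/2)
    m = (df - 1) // 2
    tail = sum(z ** (k - 0.5) / math.gamma(k + 0.5) for k in range(1, m + 1))
    return math.erfc(math.sqrt(z)) + math.exp(-z) * tail


def lrt(loglik_null, loglik_alt, df):
    """LRT statistic and p-value for nested models (assumed chi-square reference).

    Both log-likelihoods are assumed to come from unpenalized fits on the
    testing part; df is the difference in the number of free parameters.
    """
    stat = max(0.0, 2.0 * (loglik_alt - loglik_null))
    return stat, chi2_sf(stat, df)


# Usage: split 200 observations, then test with made-up log-likelihoods.
select_part, test_part = split_indices(200, seed=42)
statistic, p_value = lrt(loglik_null=-120.0, loglik_alt=-116.0, df=3)
```

With the made-up log-likelihoods above, the statistic is 8 on 3 degrees of freedom, giving a p-value of roughly 0.046. A multiple-split variant would repeat the split with different seeds and combine the resulting p-values; the combination rule is not specified in the abstract and is therefore not sketched here.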