Statistical Modelling Techniques for Default Rate Estimation in Credit Risk Analysis: An Ensemble Learning Approach for Large Financial Datasets
Fernando Luiz Pereira de Oliveira, Federal University of Ouro Preto
Co-authors: Carolina Soares Vieira, Federal University of Ouro Preto; Tiago Martins Pereira, Federal University of Ouro Preto; Gustavo de Souza, Federal University of Ouro Preto
Abstract: This article presents a case study of default rate estimation using tree-based and ensemble learning methods, namely decision trees, random forests, and boosting, applied to a large dataset from a Brazilian fintech company. The study emphasizes current challenges in financial data modeling, including class imbalance, high dimensionality, and data heterogeneity. It highlights the practical benefits of using more interpretable machine learning frameworks, rather than complex black-box alternatives, to transform large volumes of complex data into actionable insights for credit managers, thereby contributing to safer and more transparent risk assessment practices.