Online Distributional Regression in Python

Jonathan Berrisch, University of Duisburg-Essen

Co-authors: Simon Hirsch, University of Duisburg-Essen and Statkraft; Florian Ziel, University of Duisburg-Essen

Abstract: Despite the substantial computational power available today, optimized algorithm implementations remain crucial. This holds particularly for online learning algorithms, given the growth of large-scale streaming data. ROLCH contributes to this field. It is a Python package for probabilistic online learning in a GAMLSS (generalized additive models for location, scale, and shape) framework. ROLCH integrates deeply into the current ecosystem of statistical computing in Python. We use the data structures provided by NumPy and accelerate computations with the just-in-time compilation provided by Numba. Furthermore, distributions use a mixin class to inherit from SciPy where possible. Stable package versions are available through PyPI. This paper gives an overview of the package. It explains the package’s motivation and the key structural aspects of its core components. We also discuss how ROLCH can be extended, e.g., with additional distributions.
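
To make the idea of online distributional estimation concrete, the following is a minimal sketch, not ROLCH's API: it updates the location and scale of a Gaussian one observation at a time using Welford's algorithm, so the stream never has to be stored or re-fitted. The class name OnlineGaussian and its attributes are hypothetical and chosen for illustration only; the sketch uses the Python standard library rather than NumPy or Numba to stay self-contained.

```python
import math
import statistics


class OnlineGaussian:
    """Hypothetical helper: running mean and scale of a stream (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the current mean

    def update(self, y):
        """Incorporate one new observation without revisiting earlier ones."""
        self.n += 1
        delta = y - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (y - self.mean)

    @property
    def scale(self):
        """Sample standard deviation; NaN until two observations have been seen."""
        if self.n < 2:
            return math.nan
        return math.sqrt(self._m2 / (self.n - 1))


stream = [2.1, 1.9, 2.4, 2.0, 2.6, 1.8]
model = OnlineGaussian()
for y in stream:
    model.update(y)

# The online estimates agree with a batch fit on the full sample.
assert math.isclose(model.mean, statistics.fmean(stream))
assert math.isclose(model.scale, statistics.stdev(stream))
```

A regression setting replaces these scalar updates with recursive updates of coefficient vectors for each distribution parameter, but the principle of refining the fit per observation is the same.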