Additive Density Regression

Sonja Greven, Humboldt University of Berlin

Abstract: We present structured additive regression models for probability density functions given scalar covariates. To preserve nonnegativity and integration to one, we formulate our models for densities in a Bayes Hilbert space with respect to an arbitrary finite measure. This allows us to consider not only continuous densities but also, e.g., discrete densities (compositional data) or mixed densities. Mixed densities occur in our application, which is motivated by research on gender identity norms and the distribution of the woman’s share in a couple’s total labor income: the woman’s income share is a continuous variable with discrete point masses at zero and one for single-earner couples. We show how to handle the challenging case of mixed densities using an orthogonal decomposition, and we discuss the interpretation of effect functions in our model via odds ratios. We consider two cases. In the first, densities are observed and used directly as responses. In the second, only individual scalar realizations of the conditional distributions are observed rather than the whole conditional densities, and we use our additive regression approach to model the conditional density given covariates. We show approximate equivalence of the resulting Bayes space penalized likelihood to a certain penalized Poisson likelihood, which facilitates estimation. We apply our framework to a motivating gender economics data set from the German Socio-Economic Panel Study (SOEP) to analyze the distribution of the woman’s share in a couple’s total labor income given year, place of residence, and age of the youngest child. Results show a more symmetric distribution and a smaller child penalty (comparing couples with and without minor children) among East German than among West German couples after German reunification. These West-East differences become smaller over time but persist.
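
Illustration (not part of the talk): the following minimal Python sketch shows why modeling in a Bayes Hilbert space keeps fitted densities nonnegative and integrating to one. It assumes the standard centered log-ratio (clr) representation of a Bayes Hilbert space, which the abstract does not spell out, and it discretizes a mixed reference measure (Dirac masses at 0 and 1 plus Lebesgue measure on the open interval) on a grid. The function names, the grid, the two example densities, and the single covariate x are all hypothetical choices made for illustration; the sketch does not reproduce the orthogonal decomposition or the estimation procedure of the talk.

```python
import numpy as np

def clr(f, w):
    """Centered log-ratio transform of density values f with respect to
    the discretized finite measure given by the weights w."""
    log_f = np.log(f)
    return log_f - np.sum(w * log_f) / np.sum(w)

def clr_inv(g, w):
    """Inverse clr: exponentiate and normalize so that the result
    integrates to one under the weights w."""
    e = np.exp(g - np.max(g))  # shift by the maximum for numerical stability
    return e / np.sum(w * e)

# Assumed mixed reference measure on [0, 1]:
# weight 1 at s = 0 and s = 1 (point masses), weight h per interior grid cell.
n = 50
h = 1.0 / n
s = (np.arange(n) + 0.5) * h
w = np.concatenate(([1.0], np.full(n, h), [1.0]))

def make_density(p0, p1, shape):
    """Hypothetical mixed density: point masses p0 and p1 at 0 and 1,
    plus a continuous part with total mass 1 - p0 - p1."""
    cont = shape / np.sum(shape * h) * (1.0 - p0 - p1)
    return np.concatenate(([p0], cont, [p1]))

f_a = make_density(0.20, 0.05, np.exp(-8.0 * (s - 0.4) ** 2))
f_b = make_density(0.10, 0.10, np.exp(-8.0 * (s - 0.5) ** 2))

# A linear model in clr space: intercept plus one effect for a covariate x.
beta0 = clr(f_a, w)
beta1 = clr(f_b, w) - clr(f_a, w)

x = 0.5
f_x = clr_inv(beta0 + x * beta1, w)

print(f_x.min() > 0)      # every value of the fitted density is positive
print(np.sum(w * f_x))    # integral under the mixed measure, approximately 1
```

Any linear combination of clr-transformed effects maps back to a valid density, so no constraints need to be imposed on the effects beyond integrating to zero under the reference measure. Setting x = 0 recovers f_a and x = 1 recovers f_b.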