Use this identifier to cite or link to this item: http://hdl.handle.net/1843/61118
Type: Thesis
Title: Robust linear mixed models, alternative methods to quantile regression for panel data, and adaptive LASSO quantile regression with fixed effects
Alternative title(s): Modelos lineares mistos robustos, métodos alternativos de regressão quantílica para dados de painel, e regressão quantílica com LASSO adaptativo para efeitos fixos
Author(s): Ian Meneghel Danilevicz
First advisor: Valdério Anselmo Reisen
Second advisor: Pascal Bondon
First committee member: Arthur Tenenhaus
Second committee member: Stéphane Chrétien
Third committee member: Paulo Canas Rodrigues
Fourth committee member: Maria Helena Mourino Silva Nunes
Fifth committee member: Glaura da Conceição Franco
Abstract: This thesis consists of three chapters on longitudinal data analysis. Linear mixed models are discussed in both their random effects form (where individual intercepts are treated as random variables) and their fixed effects form (where individual intercepts are unknown constants that must be estimated). Furthermore, robust models (resistant to outliers) and efficient models (with low estimator variability) are proposed for repeated measures. The second part of the thesis is dedicated to quantile regression, which explores the full conditional distribution of an outcome given its predictors, and introduces a more general method for dealing with heteroscedastic variables and longitudinal data. The first chapter is motivated by the evaluation of the statistical association between air pollution exposure and the lung function of children and adolescents over six months. A robust linear mixed model combined with an equally robust principal component analysis is proposed to deal with multicollinearity among the covariates and with the impact of extreme observations on the estimates. Huber and Tukey loss functions (examples of M-estimation) are considered in order to obtain estimators that are more robust than those based on the least squares loss usually used to estimate the parameters of linear mixed models. A finite-sample study is carried out for the case where the covariates follow linear time series models with or without additive outliers. The impact of temporal correlation and outliers on the fixed effect parameter estimates in linear mixed models is investigated. In addition, weights are introduced to further reduce the bias of the estimates. The analysis of the real data reveals that the robust principal component analysis yields three principal components explaining more than 90% of the total variability. The second principal component, which corresponds to particles smaller than 10 microns, significantly affects respiratory capacity.
In addition, biological indicators such as passive smoking have a negative and significant effect on children's lung function. The second chapter analyses fixed effects panel data models with three different loss functions. To prevent the number of parameters from increasing with the sample size, we propose to penalize each regression method with the least absolute shrinkage and selection operator (LASSO). The asymptotic properties of two of these new techniques are established. A Monte Carlo study is performed for homoscedastic and heteroscedastic models. Although the heteroscedastic case is more challenging to estimate for most statistical methods, the proposed methods perform well in both scenarios. This confirms that the proposed quantile regression methods are robust to heteroscedasticity. Their performance is tested on economic panel data from the Organisation for Economic Co-operation and Development (OECD). The objective of the third chapter is to simultaneously restrict the number of individual regression constants and explanatory covariates. In addition to the LASSO, an adaptive LASSO is proposed, which enjoys oracle properties, i.e., it asymptotically selects the true model if it exists and has the classical asymptotic normality property. Monte Carlo simulations are performed in the case of low dimensionality (many more observations than parameters) and in the case of moderate dimensionality (comparable numbers of observations and parameters). In both cases, the adaptive method performs much better than the non-adaptive methods. Finally, we apply our methodology to a cohort dataset of moderate dimensionality. For each chapter, open-source software has been written and is available to the scientific community.
Abstract (translated from Portuguese): The thesis consists of three chapters. The first focuses on the relationship between exposure to air pollution and respiratory diseases in children and adolescents. The cohort includes 82 individuals observed monthly for six months. We propose a robust linear mixed model combined with a principal component analysis to handle multicollinearity among the covariates and the impact of extreme observations on the estimates. The second chapter analyses panel data using fixed effects models and different loss functions. To prevent the parameter dimension from growing with the sample size, we penalize each regression method with the LASSO. The asymptotic properties of these new techniques are established. We test the performance of the methods on economic panel data from the OECD. The aim of the third chapter is to restrict the individual regression constants and the explanatory covariates simultaneously. The adaptive LASSO reduces dimensionality while asymptotically guaranteeing correct model selection. We test the accuracy of the proposed methods on cohort data of moderate size.
Asunto: . Estatística – Teses
Dados longitudinais – Teses
.Regressão quantílica, – Teses
M-estimation – Teses
Observações atípicas – Teses
Language: eng
Country: Brazil
Publisher: Universidade Federal de Minas Gerais
Institution acronym: UFMG
Department: ICX - DEPARTAMENTO DE ESTATÍSTICA
Program: Programa de Pós-Graduação em Estatística
Access type: Restricted access
Rights URI: http://creativecommons.org/licenses/by-nc-nd/3.0/pt/
URI: http://hdl.handle.net/1843/61118
Issue date: 15-Dec-2022
Embargo end date: 15-Dec-2024
Appears in collections: Teses de Doutorado

Files associated with this item:
File: exemplo_v2 (1) (1).pdf (restricted until 2024-12-15)
Description: doctoral thesis manuscript, Ian M. Danilevicz
Size: 3.76 MB
Format: Adobe PDF


This item is licensed under a Creative Commons License