Missing information is a common problem in many disciplines. Many multivariate statistical methods require complete data, for instance GGE models, additive main effects and multiplicative interaction (AMMI) models, and principal component analysis, so missing value imputation is relevant to any area of knowledge. The literature and the associated statistical software offer several imputation alternatives, such as parametric regression, the propensity score method, and the Markov chain Monte Carlo (MCMC) method (Zhang 2003, Yuan 2011). However, these methods require certain assumptions to be met. All three assume that the missingness depends only on observed variables, that is, a missing at random (MAR) mechanism as defined by Little and Rubin (2002). In addition, parametric regression and MCMC rely on the assumption of multivariate normality. Other imputation methods make no structural or distributional assumptions, such as those based on the singular value decomposition (SVD); one of these is the Krzanowski algorithm. However, no generalisation of this method using a regularised singular value decomposition is known to have been developed. This project aims to propose a generalisation of the algorithm using different regularisation alternatives found in the literature. To achieve this objective, the project will explore simulation and computational techniques for calculating the regularisations, search for real data for a practical application, and use statistics to evaluate imputation uncertainty. The expected impact of this research is a set of new statistical methodologies that address the missing value problem without distributional or structural assumptions.
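As a rough illustration of what SVD-based imputation with a regularised decomposition can look like, the sketch below fills missing entries iteratively with a low-rank reconstruction whose singular values are soft-thresholded. It is not the Krzanowski algorithm, nor the generalisation proposed by the project; it is a simpler iterative scheme chosen only to make the idea concrete. The function name `svd_impute` and the parameters `rank`, `shrink`, `max_iter` and `tol` are assumptions introduced for this example.

```python
import numpy as np


def svd_impute(X, rank=2, shrink=0.0, max_iter=500, tol=1e-8):
    """Fill NaN entries of X with an iterative low-rank SVD reconstruction.

    `shrink` is a hypothetical regularisation parameter: it is subtracted
    from each retained singular value (soft-thresholding). With shrink=0
    the scheme reduces to plain truncated-SVD imputation.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)

    # Initial guess: column means computed from the observed entries only.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)

    for _ in range(max_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)

        # Keep the leading `rank` components and shrink their singular values.
        s_reg = np.maximum(s[:rank] - shrink, 0.0)
        approx = (U[:, :rank] * s_reg) @ Vt[:rank, :]

        # Only the missing cells are updated; observed cells stay fixed.
        new_filled = np.where(missing, approx, X)
        change = np.linalg.norm(new_filled - filled)
        filled = new_filled
        if change < tol:
            break

    return filled
```

Because the scheme uses no model for the data beyond low rank, it makes no distributional assumption. The choice of `rank` and `shrink` is left open here; in practice they would typically be selected by cross-validation on the observed entries.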
missing values, singular value decomposition, cross-validation, genotype-by-environment trials
| Status | Finished |
|---|---|
| Start date/End date | 12/02/21 → 12/08/22 |