Abstract
The Bayesian approach to feature extraction, known as factor analysis (FA), has been widely studied in machine learning as a way to obtain a latent representation of the data. An adequate selection of the probabilities and priors of these Bayesian models allows the model to better adapt to the nature of the data (e.g., heterogeneity, sparsity), yielding a more representative latent space. The objective of this article is to propose a general FA framework capable of modelling any problem. To do so, we start from the Bayesian Inter-Battery Factor Analysis (BIBFA) model and enhance it with new functionalities so that it can work with heterogeneous data, include feature selection, and handle missing values as well as semi-supervised problems. The performance of the proposed model, Sparse Semi-supervised Heterogeneous Interbattery Bayesian Analysis (SSHIBA), has been tested in different scenarios to evaluate each of its novelties, showing not only great versatility and a gain in interpretability, but also outperforming most state-of-the-art algorithms.
Links
- https://www.sciencedirect.com/science/article/pii/S0031320321003289
- doi:https://doi.org/10.1016/j.patcog.2021.108141
BibTeX
@article{SEVILLASALCEDO2021108141,
  title = {Sparse semi-supervised heterogeneous interbattery {B}ayesian analysis},
  author = {Carlos Sevilla-Salcedo and Vanessa G\'{o}mez-Verdejo and Pablo M. Olmos},
  url = {https://www.sciencedirect.com/science/article/pii/S0031320321003289},
  doi = {10.1016/j.patcog.2021.108141},
  issn = {0031-3203},
  year = {2021},
  date = {2021-01-01},
  urldate = {2021-01-01},
  journal = {Pattern Recognition},
  volume = {120},
  pages = {108141},
  abstract = {The Bayesian approach to feature extraction, known as factor analysis (FA), has been widely studied in machine learning to obtain a latent representation of the data. An adequate selection of the probabilities and priors of these Bayesian models allows the model to better adapt to the data nature (i.e. heterogeneity, sparsity), obtaining a more representative latent space. The objective of this article is to propose a general FA framework capable of modelling any problem. To do so, we start from the Bayesian Inter-Battery Factor Analysis (BIBFA) model, enhancing it with new functionalities to be able to work with heterogeneous data, to include feature selection, and to handle missing values as well as semi-supervised problems. The performance of the proposed model, Sparse Semi-supervised Heterogeneous Interbattery Bayesian Analysis (SSHIBA), has been tested on different scenarios to evaluate each one of its novelties, showing not only a great versatility and an interpretability gain, but also outperforming most of the state-of-the-art algorithms.},
  keywords = {Bayesian model, Canonical correlation analysis, Factor analysis, Feature selection, Multi-task, Principal component analysis, Semi-supervised},
  pubstate = {published},
  tppubtype = {article}
}