## 2016

## Journal Articles

Valera, Isabel; Ruiz, Francisco J R; Perez-Cruz, Fernando: Infinite Factorial Unbounded-State Hidden Markov Model. Journal Article. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, 38 (9), pp. 1816–1828, 2016, ISSN: 1939-3539.

@article{Valera2016b,
  title     = {Infinite Factorial Unbounded-State Hidden Markov Model},
  author    = {Isabel Valera and Francisco J R Ruiz and Fernando Perez-Cruz},
  url       = {http://www.ncbi.nlm.nih.gov/pubmed/26571511 http://ieeexplore.ieee.org/xpl/articleDetails.jsp?reload=true&arnumber=7322279},
  doi       = {10.1109/TPAMI.2015.2498931},
  issn      = {1939-3539},
  year      = {2016},
  date      = {2016-09-01},
  journal   = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume    = {38},
  number    = {9},
  pages     = {1816--1828},
  abstract  = {There are many scenarios in artificial intelligence, signal processing or medicine, in which a temporal sequence consists of several unknown overlapping independent causes, and we are interested in accurately recovering those canonical causes. Factorial hidden Markov models (FHMMs) present the versatility to provide a good fit to these scenarios. However, in some scenarios, the number of causes or the number of states of the FHMM cannot be known or limited a priori. In this paper, we propose an infinite factorial unbounded-state hidden Markov model (IFUHMM), in which the number of parallel hidden Markov models (HMMs) and states in each HMM are potentially unbounded. We rely on a Bayesian nonparametric (BNP) prior over integer-valued matrices, in which the columns represent the Markov chains, the rows the time indexes, and the integers the state for each chain and time instant. First, we extend the existent infinite factorial binary-state HMM to allow for any number of states. Then, we modify this model to allow for an unbounded number of states and derive an MCMC-based inference algorithm that properly deals with the trade-off between the unbounded number of states and chains. We illustrate the performance of our proposed models in the power disaggregation problem.},
  keywords  = {Bayes methods, Bayesian nonparametrics, CASI CAM CM, Computational modeling, GAMMA-L+ UC3M, Gibbs sampling, Hidden Markov models, Inference algorithms, Journal, Markov processes, Probability distribution, reversible jump Markov chain Monte Carlo, slice sampling, Time series, variational inference, Yttrium},
  pubstate  = {published},
  tppubtype = {article}
}

Valera, Isabel; Ruiz, Francisco J R; Perez-Cruz, Fernando: Infinite Factorial Unbounded-State Hidden Markov Model. Journal Article. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, To appear (99), pp. 1, 2016, ISSN: 1939-3539.

@article{Valera2016c,
  title     = {Infinite Factorial Unbounded-State Hidden Markov Model},
  author    = {Isabel Valera and Francisco J R Ruiz and Fernando Perez-Cruz},
  url       = {http://www.ncbi.nlm.nih.gov/pubmed/26571511 http://ieeexplore.ieee.org/xpl/articleDetails.jsp?reload=true&arnumber=7322279},
  doi       = {10.1109/TPAMI.2015.2498931},
  issn      = {1939-3539},
  year      = {2016},
  date      = {2016-01-01},
  journal   = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume    = {To appear},
  number    = {99},
  pages     = {1},
  abstract  = {There are many scenarios in artificial intelligence, signal processing or medicine, in which a temporal sequence consists of several unknown overlapping independent causes, and we are interested in accurately recovering those canonical causes. Factorial hidden Markov models (FHMMs) present the versatility to provide a good fit to these scenarios. However, in some scenarios, the number of causes or the number of states of the FHMM cannot be known or limited a priori. In this paper, we propose an infinite factorial unbounded-state hidden Markov model (IFUHMM), in which the number of parallel hidden Markov models (HMMs) and states in each HMM are potentially unbounded. We rely on a Bayesian nonparametric (BNP) prior over integer-valued matrices, in which the columns represent the Markov chains, the rows the time indexes, and the integers the state for each chain and time instant. First, we extend the existent infinite factorial binary-state HMM to allow for any number of states. Then, we modify this model to allow for an unbounded number of states and derive an MCMC-based inference algorithm that properly deals with the trade-off between the unbounded number of states and chains. We illustrate the performance of our proposed models in the power disaggregation problem.},
  keywords  = {Bayes methods, Bayesian nonparametrics, CASI CAM CM, Computational modeling, GAMMA-L+ UC3M, Gibbs sampling, Hidden Markov models, Inference algorithms, Markov processes, Probability distribution, reversible jump Markov chain Monte Carlo, slice sampling, Time series, variational inference, Yttrium},
  pubstate  = {published},
  tppubtype = {article}
}

## 2015

## Journal Articles

Moreno, Pablo G; Teh, Yee Whye; Perez-Cruz, Fernando; Artés-Rodríguez, Antonio: Bayesian Nonparametric Crowdsourcing. Journal Article. In: Journal of Machine Learning Research, 16 (August), pp. 1607–1627, 2015.

@article{Moreno2015b,
  title     = {Bayesian Nonparametric Crowdsourcing},
  author    = {Pablo G Moreno and Yee Whye Teh and Fernando Perez-Cruz and Antonio Artés-Rodríguez},
  url       = {http://www.jmlr.org/papers/volume16/moreno15a/moreno15a.pdf},
  year      = {2015},
  date      = {2015-08-01},
  journal   = {Journal of Machine Learning Research},
  volume    = {16},
  number    = {August},
  pages     = {1607--1627},
  abstract  = {Crowdsourcing has been proven to be an effective and efficient tool to annotate large datasets. User annotations are often noisy, so methods to combine the annotations to produce reliable estimates of the ground truth are necessary. We claim that considering the existence of clusters of users in this combination step can improve the performance. This is especially important in early stages of crowdsourcing implementations, where the number of annotations is low. At this stage there is not enough information to accurately estimate the bias introduced by each annotator separately, so we have to resort to models that consider the statistical links among them. In addition, finding these clusters is interesting in itself as knowing the behavior of the pool of annotators allows implementing efficient active learning strategies. Based on this, we propose in this paper two new fully unsupervised models based on a Chinese Restaurant Process (CRP) prior and a hierarchical structure that allows inferring these groups jointly with the ground truth and the properties of the users. Efficient inference algorithms based on Gibbs sampling with auxiliary variables are proposed. Finally, we perform experiments, both on synthetic and real databases, to show the advantages of our models over state-of-the-art algorithms.},
  keywords  = {Bayesian nonparametrics, Dirichlet process, Gibbs sampling, Hierarchical clustering, Journal, Multiple annotators},
  pubstate  = {published},
  tppubtype = {article}
}