## 2014

## Journal Articles

Alvarado, Alex; Brannstrom, Fredrik; Agrell, Erik; Koch, Tobias. High-SNR Asymptotics of Mutual Information for Discrete Constellations With Applications to BICM (Journal Article). IEEE Transactions on Information Theory, 60 (2), pp. 1061–1076, 2014, ISSN: 0018-9448.

@article{Alvarado2014,
  title = {High-SNR Asymptotics of Mutual Information for Discrete Constellations With Applications to BICM},
  author = {Alvarado, Alex and Brannstrom, Fredrik and Agrell, Erik and Koch, Tobias},
  url = {http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6671479 http://www.tsc.uc3m.es/~koch/files/IEEE_TIT_60%282%29.pdf},
  issn = {0018-9448},
  year = {2014},
  date = {2014-01-01},
  journal = {IEEE Transactions on Information Theory},
  volume = {60},
  number = {2},
  pages = {1061--1076},
  abstract = {Asymptotic expressions of the mutual information between any discrete input and the corresponding output of the scalar additive white Gaussian noise channel are presented in the limit as the signal-to-noise ratio (SNR) tends to infinity. Asymptotic expressions of the symbol-error probability (SEP) and the minimum mean-square error (MMSE) achieved by estimating the channel input given the channel output are also developed. It is shown that for any input distribution, the conditional entropy of the channel input given the output, MMSE, and SEP have an asymptotic behavior proportional to the Gaussian Q-function. The argument of the Q-function depends only on the minimum Euclidean distance (MED) of the constellation and the SNR, and the proportionality constants are functions of the MED and the probabilities of the pairs of constellation points at MED. The developed expressions are then generalized to study the high-SNR behavior of the generalized mutual information (GMI) for bit-interleaved coded modulation (BICM). By means of these asymptotic expressions, the long-standing conjecture that Gray codes are the binary labelings that maximize the BICM-GMI at high SNR is proven. It is further shown that for any equally spaced constellation whose size is a power of two, there always exists an anti-Gray code giving the lowest BICM-GMI at high SNR.},
  keywords = {additive white Gaussian noise channel, Anti-Gray code, bit-interleaved coded modulation, discrete constellations, Entropy, Gray code, high-SNR asymptotics, IP networks, Labeling, minimum-mean square error, Modulation, Mutual information, Signal to noise ratio, Vectors},
  pubstate = {published},
  tppubtype = {article}
}
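The Alvarado et al. abstract above says that the high-SNR behavior is governed by a Gaussian Q-function whose argument depends only on the minimum Euclidean distance (MED) of the constellation and the SNR. The following is a minimal sketch of those two ingredients for a one-dimensional constellation. The normalization used here (unit noise variance, argument `sqrt(snr) * d / 2`) and all function names are assumptions made for illustration, and the paper's proportionality constants are not reproduced.

```python
import math


def q_function(x):
    """Gaussian Q-function: upper-tail probability of a standard normal."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def min_euclidean_distance(points):
    """Smallest distance between neighbouring points of a 1-D constellation."""
    pts = sorted(points)
    return min(b - a for a, b in zip(pts, pts[1:]))


def q_term(points, snr):
    """Q-function term driven by the MED and the SNR.

    Assumed normalization (not taken from the paper): the receiver sees
    sqrt(snr) * x plus unit-variance Gaussian noise, so the two nearest
    points are separated by sqrt(snr) * d and the decision boundary sits
    half-way between them.
    """
    d = min_euclidean_distance(points)
    return q_function(math.sqrt(snr) * d / 2.0)


if __name__ == "__main__":
    four_pam = [-3.0, -1.0, 1.0, 3.0]  # equally spaced, size a power of two
    for snr in (1.0, 10.0, 100.0):
        print(snr, q_term(four_pam, snr))
```

The sketch only shows how quickly the Q-function term decays as the SNR grows for a fixed MED; it makes no claim about the constants in front of it.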

Pastore, Adriano; Koch, Tobias; Fonollosa, Javier Rodriguez. A Rate-Splitting Approach to Fading Channels With Imperfect Channel-State Information (Journal Article). IEEE Transactions on Information Theory, 60 (7), pp. 4266–4285, 2014, ISSN: 0018-9448.

@article{Pastore2014a,
  title = {A Rate-Splitting Approach to Fading Channels With Imperfect Channel-State Information},
  author = {Pastore, Adriano and Koch, Tobias and Fonollosa, Javier Rodriguez},
  url = {http://ieeexplore.ieee.org/articleDetails.jsp?arnumber=6832779 http://www.tsc.uc3m.es/~koch/files/IEEE_TIT_60(7).pdf http://arxiv.org/pdf/1301.6120.pdf},
  issn = {0018-9448},
  year = {2014},
  date = {2014-01-01},
  journal = {IEEE Transactions on Information Theory},
  volume = {60},
  number = {7},
  pages = {4266--4285},
  publisher = {IEEE},
  abstract = {As shown by Médard, the capacity of fading channels with imperfect channel-state information can be lower-bounded by assuming a Gaussian channel input $X$ with power $P$ and by upper-bounding the conditional entropy $h(X|Y,\hat{H})$ by the entropy of a Gaussian random variable with variance equal to the linear minimum mean-square error in estimating $X$ from $(Y,\hat{H})$. We demonstrate that, using a rate-splitting approach, this lower bound can be sharpened: by expressing the Gaussian input $X$ as the sum of two independent Gaussian variables $X_1$ and $X_2$ and by applying Médard's lower bound first to bound the mutual information between $X_1$ and $Y$ while treating $X_2$ as noise, and by applying it a second time to the mutual information between $X_2$ and $Y$ while assuming $X_1$ to be known, we obtain a capacity lower bound that is strictly larger than Médard's lower bound. We then generalize this approach to an arbitrary number $L$ of layers, where $X$ is expressed as the sum of $L$ independent Gaussian random variables of respective variances $P_\ell$, $\ell = 1,\dotsc,L$ summing up to $P$. Among all such rate-splitting bounds, we determine the supremum over power allocations $P_\ell$ and total number of layers $L$. This supremum is achieved for $L \rightarrow \infty$ and gives rise to an analytically expressible capacity lower bound. For Gaussian fading, this novel bound is shown to converge to the Gaussian-input mutual information as the signal-to-noise ratio (SNR) grows, provided that the variance of the channel estimation error $H-\hat{H}$ tends to zero as the SNR tends to infinity.},
  keywords = {channel capacity, COMONSENS, DEIPRO, Entropy, Fading, fading channels, flat fading, imperfect channel-state information, MobileNET, Mutual information, OTOSiS, Random variables, Receivers, Signal to noise ratio, Upper bound},
  pubstate = {published},
  tppubtype = {article}
}

## 2013

## Inproceedings

Alvarado, Alex; Brannstrom, Fredrik; Agrell, Erik; Koch, Tobias. High-SNR Asymptotics of Mutual Information for Discrete Constellations (Inproceeding). 2013 IEEE International Symposium on Information Theory, pp. 2274–2278, IEEE, Istanbul, 2013, ISSN: 2157-8095.

@inproceedings{Alvarado2013b,
  title = {High-SNR Asymptotics of Mutual Information for Discrete Constellations},
  author = {Alvarado, Alex and Brannstrom, Fredrik and Agrell, Erik and Koch, Tobias},
  url = {http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6620631},
  issn = {2157-8095},
  year = {2013},
  date = {2013-01-01},
  booktitle = {2013 IEEE International Symposium on Information Theory},
  pages = {2274--2278},
  publisher = {IEEE},
  address = {Istanbul},
  abstract = {The asymptotic behavior of the mutual information (MI) at high signal-to-noise ratio (SNR) for discrete constellations over the scalar additive white Gaussian noise channel is studied. Exact asymptotic expressions for the MI for arbitrary one-dimensional constellations and input distributions are presented in the limit as the SNR tends to infinity. Asymptotics of the minimum mean-square error (MMSE) are also developed. It is shown that for any input distribution, the MI and the MMSE have an asymptotic behavior proportional to a Gaussian Q-function, whose argument depends on the minimum Euclidean distance of the constellation and the SNR. Closed-form expressions for the coefficients of these Q-functions are calculated.},
  keywords = {AWGN channels, discrete constellations, Entropy, Fading, Gaussian Q-function, high-SNR asymptotics, IP networks, least mean squares methods, minimum mean-square error, MMSE, Mutual information, scalar additive white Gaussian noise channel, Signal to noise ratio, signal-to-noise ratio, Upper bound},
  pubstate = {published},
  tppubtype = {inproceedings}
}

## 2012

## Inproceedings

Taborda, Camilo; Perez-Cruz, Fernando. Derivative of the Relative Entropy over the Poisson and Binomial Channel (Inproceeding). 2012 IEEE Information Theory Workshop, pp. 386–390, IEEE, Lausanne, 2012, ISBN: 978-1-4673-0223-4.

@inproceedings{Taborda2012,
  title = {Derivative of the Relative Entropy over the Poisson and Binomial Channel},
  author = {Taborda, Camilo G. and Perez-Cruz, Fernando},
  url = {http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6404699},
  isbn = {978-1-4673-0223-4},
  year = {2012},
  date = {2012-01-01},
  booktitle = {2012 IEEE Information Theory Workshop},
  pages = {386--390},
  publisher = {IEEE},
  address = {Lausanne},
  abstract = {In this paper it is found that, regardless of the statistics of the input, the derivative of the relative entropy over the Binomial channel can be seen as the expectation of a function that has as argument the mean of the conditional distribution that models the channel. Based on this relationship we formulate a similar expression for the mutual information concept. In addition to this, using the connection between the Binomial and Poisson distribution we develop similar results for the Poisson channel. Novelty of the results presented here lies on the fact that, expressions obtained can be applied to a wide range of scenarios.},
  keywords = {binomial channel, binomial distribution, Channel estimation, conditional distribution, Entropy, Estimation, function expectation, Mutual information, mutual information concept, Poisson channel, Poisson distribution, Random variables, relative entropy derivative, similar expression},
  pubstate = {published},
  tppubtype = {inproceedings}
}

Pastore, Adriano; Koch, Tobias; Fonollosa, Javier Rodriguez. Improved Capacity Lower Bounds for Fading Channels with Imperfect CSI Using Rate Splitting (Inproceeding). 2012 IEEE 27th Convention of Electrical and Electronics Engineers in Israel, pp. 1–5, IEEE, Eilat, 2012, ISBN: 978-1-4673-4681-8.

@inproceedings{Pastore2012,
  title = {Improved Capacity Lower Bounds for Fading Channels with Imperfect CSI Using Rate Splitting},
  author = {Pastore, Adriano and Koch, Tobias and Fonollosa, Javier Rodriguez},
  url = {http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6377031},
  isbn = {978-1-4673-4681-8},
  year = {2012},
  date = {2012-01-01},
  booktitle = {2012 IEEE 27th Convention of Electrical and Electronics Engineers in Israel},
  pages = {1--5},
  publisher = {IEEE},
  address = {Eilat},
  abstract = {As shown by Médard (“The effect upon channel capacity in wireless communications of perfect and imperfect knowledge of the channel,” IEEE Trans. Inform. Theory, May 2000), the capacity of fading channels with imperfect channel-state information (CSI) can be lower-bounded by assuming a Gaussian channel input X, and by upper-bounding the conditional entropy h(X|Y, Ĥ), conditioned on the channel output Y and the CSI Ĥ, by the entropy of a Gaussian random variable with variance equal to the linear minimum mean-square error in estimating X from (Y, Ĥ). We demonstrate that, by using a rate-splitting approach, this lower bound can be sharpened: we show that by expressing the Gaussian input X as the sum of two independent Gaussian variables X(1) and X(2), and by applying Médard's lower bound first to analyze the mutual information between X(1) and Y conditioned on Ĥ while treating X(2) as noise, and by applying the lower bound then to analyze the mutual information between X(2) and Y conditioned on (X(1), Ĥ), we obtain a lower bound on the capacity that is larger than Médard's lower bound.},
  keywords = {channel capacity, channel capacity lower bounds, conditional entropy, Decoding, Entropy, Fading, fading channels, Gaussian channel, Gaussian channels, Gaussian random variable, imperfect channel-state information, imperfect CSI, independent Gaussian variables, linear minimum mean-square error, mean square error methods, Medard lower bound, Mutual information, Random variables, rate splitting approach, Resource management, Upper bound, wireless communications},
  pubstate = {published},
  tppubtype = {inproceedings}
}

Taborda, Camilo; Perez-Cruz, Fernando. Mutual Information and Relative Entropy over the Binomial and Negative Binomial Channels (Inproceeding). 2012 IEEE International Symposium on Information Theory Proceedings, pp. 696–700, IEEE, Cambridge, MA, 2012, ISSN: 2157-8095.

@inproceedings{Taborda2012a,
  title = {Mutual Information and Relative Entropy over the Binomial and Negative Binomial Channels},
  author = {Taborda, Camilo G. and Perez-Cruz, Fernando},
  url = {http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=6284304},
  issn = {2157-8095},
  year = {2012},
  date = {2012-01-01},
  booktitle = {2012 IEEE International Symposium on Information Theory Proceedings},
  pages = {696--700},
  publisher = {IEEE},
  address = {Cambridge, MA},
  abstract = {We study the relation of the mutual information and relative entropy over the Binomial and Negative Binomial channels with estimation theoretical quantities, in which we extend already known results for Gaussian and Poisson channels. We establish general expressions for these information theory concepts with a direct connection with estimation theory through the conditional mean estimation and a particular loss function.},
  keywords = {Channel estimation, conditional mean estimation, Entropy, Estimation, estimation theoretical quantity, estimation theory, Gaussian channel, Gaussian channels, information theory concept, loss function, mean square error methods, Mutual information, negative binomial channel, Poisson channel, Random variables, relative entropy},
  pubstate = {published},
  tppubtype = {inproceedings}
}

## 2011

## Inproceedings

Goparaju, S.; Calderbank, A. R.; Carson, W. R.; Rodrigues, Miguel R. D.; Perez-Cruz, Fernando. When to Add Another Dimension when Communicating over MIMO Channels (Inproceeding). 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 3100–3103, IEEE, Prague, 2011, ISSN: 1520-6149.

@inproceedings{Goparaju2011,
  title = {When to Add Another Dimension when Communicating over MIMO Channels},
  author = {Goparaju, S. and Calderbank, A. R. and Carson, W. R. and Rodrigues, Miguel R. D. and Perez-Cruz, Fernando},
  url = {http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5946351},
  issn = {1520-6149},
  year = {2011},
  date = {2011-01-01},
  booktitle = {2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages = {3100--3103},
  publisher = {IEEE},
  address = {Prague},
  abstract = {This paper introduces a divide and conquer approach to the design of transmit and receive filters for communication over a Multiple Input Multiple Output (MIMO) Gaussian channel subject to an average power constraint. It involves conversion to a set of parallel scalar channels, possibly with very different gains, followed by coding per sub-channel (i.e. over time) rather than coding across sub-channels (i.e. over time and space). The loss in performance is negligible at high signal-to-noise ratio (SNR) and not significant at medium SNR. The advantages are reduction in signal processing complexity and greater insight into the SNR thresholds at which a channel is first allocated power. This insight is a consequence of formulating the optimal power allocation in terms of an upper bound on error rate that is determined by parameters of the input lattice such as the minimum distance and kissing number. The resulting thresholds are given explicitly in terms of these lattice parameters. By contrast, when the optimization problem is phrased in terms of maximizing mutual information, the solution is mercury waterfilling, and the thresholds are implicit.},
  keywords = {divide and conquer approach, divide and conquer methods, error probability, error rate, error statistics, Gaussian channels, Lattices, Manganese, MIMO, MIMO channel, MIMO communication, multiple input multiple output Gaussian channel, Mutual information, optimal power allocation, power allocation, power constraint, receive filter, Resource management, Signal to noise ratio, signal-to-noise ratio, transmit filter, Upper bound},
  pubstate = {published},
  tppubtype = {inproceedings}
}

## 2010

## Journal Articles

Perez-Cruz, Fernando; Rodrigues, Miguel; Verdu, Sergio. MIMO Gaussian Channels With Arbitrary Inputs: Optimal Precoding and Power Allocation (Journal Article). IEEE Transactions on Information Theory, 56 (3), pp. 1070–1084, 2010, ISSN: 0018-9448.

@article{Perez-Cruz2010a,
  title = {MIMO Gaussian Channels With Arbitrary Inputs: Optimal Precoding and Power Allocation},
  author = {Perez-Cruz, Fernando and Rodrigues, Miguel R. D. and Verdu, Sergio},
  url = {http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=5429131},
  issn = {0018-9448},
  year = {2010},
  date = {2010-01-01},
  journal = {IEEE Transactions on Information Theory},
  volume = {56},
  number = {3},
  pages = {1070--1084},
  abstract = {In this paper, we investigate the linear precoding and power allocation policies that maximize the mutual information for general multiple-input-multiple-output (MIMO) Gaussian channels with arbitrary input distributions, by capitalizing on the relationship between mutual information and minimum mean-square error (MMSE). The optimal linear precoder satisfies a fixed-point equation as a function of the channel and the input constellation. For non-Gaussian inputs, a nondiagonal precoding matrix in general increases the information transmission rate, even for parallel noninteracting channels. Whenever precoding is precluded, the optimal power allocation policy also satisfies a fixed-point equation; we put forth a generalization of the mercury/waterfilling algorithm, previously proposed for parallel noninterfering channels, in which the mercury level accounts not only for the non-Gaussian input distributions, but also for the interference among inputs.},
  keywords = {Collaborative work, Equations, fixed-point equation, Gaussian channels, Gaussian noise channels, Gaussian processes, Government, Interference, linear precoding, matrix algebra, mean square error methods, mercury-waterfilling algorithm, MIMO, MIMO communication, MIMO Gaussian channel, minimum mean-square error, minimum mean-square error (MMSE), multiple-input-multiple-output channel, multiple-input–multiple-output (MIMO) systems, Mutual information, nondiagonal precoding matrix, optimal linear precoder, optimal power allocation policy, optimal precoding, optimum power allocation, Phase shift keying, precoding, Quadrature amplitude modulation, Telecommunications, waterfilling},
  pubstate = {published},
  tppubtype = {article}
}
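The Perez-Cruz, Rodrigues and Verdu abstract above builds on the relationship between mutual information and MMSE. For orientation, the scalar real-valued form of that relationship is the well-known identity sketched below, with mutual information measured in nats. The notation (snr, X, N, mmse) is assumed here for illustration, and the MIMO fixed-point equations of the paper are not reproduced.

```latex
% Assumed scalar model: real input X, unit-variance Gaussian noise N independent of X.
\begin{align}
  Y &= \sqrt{\mathsf{snr}}\, X + N, \qquad N \sim \mathcal{N}(0, 1), \\
  \mathsf{mmse}(\mathsf{snr}) &= \mathbb{E}\!\left[ \bigl( X - \mathbb{E}[X \mid Y] \bigr)^{2} \right], \\
  \frac{\mathrm{d}}{\mathrm{d}\,\mathsf{snr}}\, I(X; Y) &= \frac{1}{2}\, \mathsf{mmse}(\mathsf{snr}).
\end{align}
```

The identity holds for any input distribution with finite variance, which is why it is a natural tool when the inputs are discrete constellations rather than Gaussian.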

## 2008

## Inproceedings

Perez-Cruz, Fernando. Kullback-Leibler Divergence Estimation of Continuous Distributions (Inproceeding). 2008 IEEE International Symposium on Information Theory, pp. 1666–1670, IEEE, Toronto, 2008, ISBN: 978-1-4244-2256-2.

@inproceedings{Perez-Cruz2008,
  title = {Kullback-Leibler Divergence Estimation of Continuous Distributions},
  author = {Perez-Cruz, Fernando},
  url = {http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4595271},
  isbn = {978-1-4244-2256-2},
  year = {2008},
  date = {2008-01-01},
  booktitle = {2008 IEEE International Symposium on Information Theory},
  pages = {1666--1670},
  publisher = {IEEE},
  address = {Toronto},
  abstract = {We present a method for estimating the KL divergence between continuous densities and we prove it converges almost surely. Divergence estimation is typically solved estimating the densities first. Our main result shows this intermediate step is unnecessary and that the divergence can be either estimated using the empirical cdf or k-nearest-neighbour density estimation, which does not converge to the true measure for finite k. The convergence proof is based on describing the statistics of our estimator using waiting-times distributions, as the exponential or Erlang. We illustrate the proposed estimators and show how they compare to existing methods based on density estimation, and we also outline how our divergence estimators can be used for solving the two-sample problem.},
  keywords = {Convergence, density estimation, Density measurement, Entropy, Frequency estimation, H infinity control, information theory, k-nearest-neighbour density estimation, Kullback-Leibler divergence estimation, Machine learning, Mutual information, neuroscience, Random variables, statistical distributions, waiting-times distributions},
  pubstate = {published},
  tppubtype = {inproceedings}
}
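The Kullback-Leibler entry above describes estimating the divergence directly from samples with k-nearest-neighbour distances, without fitting the densities first. A minimal one-dimensional sketch of a commonly used k-NN form is given below. It is an illustration under stated assumptions, not necessarily the exact estimator of the paper: the function names are hypothetical, the samples are assumed to be real-valued with no repeated values, and the brute-force neighbour search is chosen for clarity rather than speed.

```python
import math
import random


def knn_distance(point, samples, k, skip_self=False):
    """Distance from `point` to its k-th nearest neighbour in `samples` (1-D)."""
    dists = sorted(abs(point - s) for s in samples)
    # If `point` is a member of `samples`, its own zero distance sorts first
    # and must be skipped, so the k-th neighbour sits at index k.
    return dists[k] if skip_self else dists[k - 1]


def kl_knn_1d(x, y, k=1):
    """k-NN estimate of D(P || Q) from samples x ~ P and y ~ Q (in nats).

    Assumed form: (1/n) * sum_i log(s_k(x_i) / r_k(x_i)) + log(m / (n - 1)),
    where r_k is the k-NN distance within x (excluding x_i itself) and
    s_k is the k-NN distance from x_i to the samples in y.
    """
    n, m = len(x), len(y)
    total = 0.0
    for xi in x:
        r = knn_distance(xi, x, k, skip_self=True)
        s = knn_distance(xi, y, k)
        total += math.log(s / r)  # breaks on repeated values (r == 0 or s == 0)
    return total / n + math.log(m / (n - 1))


if __name__ == "__main__":
    random.seed(0)
    p = [random.gauss(0.0, 1.0) for _ in range(500)]
    q = [random.gauss(1.0, 1.0) for _ in range(500)]
    # For these two unit-variance Gaussians the true divergence is 0.5 nats.
    print(kl_knn_1d(p, q, k=1))
```

A k-d tree or a sort-and-scan pass would replace the brute-force search for larger sample sizes; the estimate itself would be unchanged.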

Perez-Cruz, Fernando; Rodrigues, Miguel; Verdu, Sergio Optimal Precoding for Digital Subscriber Lines (Inproceeding) 2008 IEEE International Conference on Communications, pp. 1200–1204, IEEE, Beijing, 2008, ISBN: 978-1-4244-2075-9. (Abstract | Links | BibTeX | Tags: Bit error rate, channel matrix diagonalization, Communications Society, Computer science, digital subscriber lines, DSL, Equations, fixed-point equation, Gaussian channels, least mean squares methods, linear codes, matrix algebra, MIMO, MIMO communication, MIMO Gaussian channel, minimum mean squared error method, MMSE, multiple-input multiple-output communication, Mutual information, optimal linear precoder, precoding, Telecommunications, Telephony) @inproceedings{Perez-Cruz2008a, title = {Optimal Precoding for Digital Subscriber Lines}, author = {Perez-Cruz, Fernando and Rodrigues, Miguel R. D. and Verdu, Sergio}, url = {http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4533270}, isbn = {978-1-4244-2075-9}, year = {2008}, date = {2008-01-01}, booktitle = {2008 IEEE International Conference on Communications}, pages = {1200--1204}, publisher = {IEEE}, address = {Beijing}, abstract = {We determine the linear precoding policy that maximizes the mutual information for general multiple-input multiple-output (MIMO) Gaussian channels with arbitrary input distributions, by capitalizing on the relationship between mutual information and minimum mean squared error (MMSE). The optimal linear precoder can be computed by means of a fixed-point equation as a function of the channel and the input constellation. We show that diagonalizing the channel matrix does not maximize the information transmission rate for non-Gaussian inputs. A full precoding matrix may significantly increase the information transmission rate, even for parallel non-interacting channels. 
We illustrate the application of our results to typical Gigabit DSL systems.}, keywords = {Bit error rate, channel matrix diagonalization, Communications Society, Computer science, digital subscriber lines, DSL, Equations, fixed-point equation, Gaussian channels, least mean squares methods, linear codes, matrix algebra, MIMO, MIMO communication, MIMO Gaussian channel, minimum mean squared error method, MMSE, multiple-input multiple-output communication, Mutual information, optimal linear precoder, precoding, Telecommunications, Telephony}, pubstate = {published}, tppubtype = {inproceedings} } We determine the linear precoding policy that maximizes the mutual information for general multiple-input multiple-output (MIMO) Gaussian channels with arbitrary input distributions, by capitalizing on the relationship between mutual information and minimum mean squared error (MMSE). The optimal linear precoder can be computed by means of a fixed-point equation as a function of the channel and the input constellation. We show that diagonalizing the channel matrix does not maximize the information transmission rate for non-Gaussian inputs. A full precoding matrix may significantly increase the information transmission rate, even for parallel non-interacting channels. We illustrate the application of our results to typical Gigabit DSL systems. |

Rodrigues, Miguel; Perez-Cruz, Fernando; Verdu, Sergio Multiple-Input Multiple-Output Gaussian Channels: Optimal Covariance for Non-Gaussian Inputs (Inproceeding) 2008 IEEE Information Theory Workshop, pp. 445–449, IEEE, Porto, 2008, ISBN: 978-1-4244-2269-2. (Abstract | Links | BibTeX | Tags: Binary phase shift keying, covariance matrices, Covariance matrix, deterministic MIMO Gaussian channel, fixed-point equation, Gaussian channels, Gaussian noise, Information rates, intersymbol interference, least mean squares methods, Magnetic recording, mercury-waterfilling power allocation policy, MIMO, MIMO communication, minimum mean-squared error, MMSE, MMSE matrix, multiple-input multiple-output system, Multiple-Input Multiple-Output Systems, Mutual information, Optimal Input Covariance, Optimization, Telecommunications) @inproceedings{Rodrigues2008, title = {Multiple-Input Multiple-Output Gaussian Channels: Optimal Covariance for Non-Gaussian Inputs}, author = {Rodrigues, Miguel R. D. and Perez-Cruz, Fernando and Verdu, Sergio}, url = {http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4578704}, isbn = {978-1-4244-2269-2}, year = {2008}, date = {2008-01-01}, booktitle = {2008 IEEE Information Theory Workshop}, pages = {445--449}, publisher = {IEEE}, address = {Porto}, abstract = {We investigate the input covariance that maximizes the mutual information of deterministic multiple-input multiple-output (MIMO) Gaussian channels with arbitrary (not necessarily Gaussian) input distributions, by capitalizing on the relationship between the gradient of the mutual information and the minimum mean-squared error (MMSE) matrix. We show that the optimal input covariance satisfies a simple fixed-point equation involving key system quantities, including the MMSE matrix. We also specialize the form of the optimal input covariance to the asymptotic regimes of low and high SNR. 
We demonstrate that in the low-SNR regime the optimal covariance fully correlates the inputs to better combat noise. In contrast, in the high-SNR regime the optimal covariance is diagonal with diagonal elements obeying the generalized mercury/waterfilling power allocation policy. Numerical results illustrate that covariance optimization may lead to significant gains with respect to conventional strategies based on channel diagonalization followed by mercury/waterfilling or waterfilling power allocation, particularly in the regimes of medium and high SNR.}, keywords = {Binary phase shift keying, covariance matrices, Covariance matrix, deterministic MIMO Gaussian channel, fixed-point equation, Gaussian channels, Gaussian noise, Information rates, intersymbol interference, least mean squares methods, Magnetic recording, mercury-waterfilling power allocation policy, MIMO, MIMO communication, minimum mean-squared error, MMSE, MMSE matrix, multiple-input multiple-output system, Multiple-Input Multiple-Output Systems, Mutual information, Optimal Input Covariance, Optimization, Telecommunications}, pubstate = {published}, tppubtype = {inproceedings} } We investigate the input covariance that maximizes the mutual information of deterministic multiple-input multiple-output (MIMO) Gaussian channels with arbitrary (not necessarily Gaussian) input distributions, by capitalizing on the relationship between the gradient of the mutual information and the minimum mean-squared error (MMSE) matrix. We show that the optimal input covariance satisfies a simple fixed-point equation involving key system quantities, including the MMSE matrix. We also specialize the form of the optimal input covariance to the asymptotic regimes of low and high SNR. We demonstrate that in the low-SNR regime the optimal covariance fully correlates the inputs to better combat noise. 
In contrast, in the high-SNR regime the optimal covariance is diagonal with diagonal elements obeying the generalized mercury/waterfilling power allocation policy. Numerical results illustrate that covariance optimization may lead to significant gains with respect to conventional strategies based on channel diagonalization followed by mercury/waterfilling or waterfilling power allocation, particularly in the regimes of medium and high SNR. |

Vila-Forcen, J.E.; Artés-Rodríguez, Antonio; Garcia-Frias, J. Compressive Sensing Detection of Stochastic Signals (Inproceeding) 2008 42nd Annual Conference on Information Sciences and Systems, pp. 956–960, IEEE, Princeton, 2008, ISBN: 978-1-4244-2246-3. (Abstract | Links | BibTeX | Tags: Additive white noise, AWGN, compressive sensing detection, dimensionality reduction techniques, Distortion measurement, Gaussian noise, matrix algebra, Mutual information, optimized projections, projection matrix, signal detection, Signal processing, signal reconstruction, Stochastic processes, stochastic signals, Support vector machine classification, Support vector machines, SVM) @inproceedings{Vila-Forcen2008, title = {Compressive Sensing Detection of Stochastic Signals}, author = {Vila-Forcen, J.E. and Artés-Rodríguez, Antonio and Garcia-Frias, J.}, url = {http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4558656}, isbn = {978-1-4244-2246-3}, year = {2008}, date = {2008-01-01}, booktitle = {2008 42nd Annual Conference on Information Sciences and Systems}, pages = {956--960}, publisher = {IEEE}, address = {Princeton}, abstract = {Inspired by recent work in compressive sensing, we propose a framework for the detection of stochastic signals from optimized projections. In order to generate a good projection matrix, we use dimensionality reduction techniques based on the maximization of the mutual information between the projected signals and their corresponding class labels. In addition, classification techniques based on support vector machines (SVMs) are applied for the final decision process. 
Simulation results show that the realizations of the stochastic process are detected with higher accuracy and lower complexity than a scheme performing signal reconstruction first, followed by detection based on the reconstructed signal.}, keywords = {Additive white noise, AWGN, compressive sensing detection, dimensionality reduction techniques, Distortion measurement, Gaussian noise, matrix algebra, Mutual information, optimized projections, projection matrix, signal detection, Signal processing, signal reconstruction, Stochastic processes, stochastic signals, Support vector machine classification, Support vector machines, SVM}, pubstate = {published}, tppubtype = {inproceedings} } Inspired by recent work in compressive sensing, we propose a framework for the detection of stochastic signals from optimized projections. In order to generate a good projection matrix, we use dimensionality reduction techniques based on the maximization of the mutual information between the projected signals and their corresponding class labels. In addition, classification techniques based on support vector machines (SVMs) are applied for the final decision process. Simulation results show that the realizations of the stochastic process are detected with higher accuracy and lower complexity than a scheme performing signal reconstruction first, followed by detection based on the reconstructed signal. |

## 2007 |

## Journal Articles |

Leiva-Murillo, Jose; Artés-Rodríguez, Antonio Maximization of Mutual Information for Supervised Linear Feature Extraction (Journal Article) IEEE Transactions on Neural Networks, 18 (5), pp. 1433–1441, 2007, ISSN: 1045-9227. (Abstract | Links | BibTeX | Tags: Algorithms, Artificial Intelligence, Automated, component-by-component gradient-ascent method, Computer Simulation, Data Mining, Entropy, Feature extraction, gradient methods, gradient-based entropy, Independent component analysis, Information Storage and Retrieval, information theory, Iron, learning (artificial intelligence), Linear discriminant analysis, Linear Models, Mutual information, Optimization methods, Pattern recognition, Reproducibility of Results, Sensitivity and Specificity, supervised linear feature extraction, Vectors) @article{Leiva-Murillo2007, title = {Maximization of Mutual Information for Supervised Linear Feature Extraction}, author = {Leiva-Murillo, Jose M. and Artés-Rodríguez, Antonio}, url = {http://ieeexplore.ieee.org/articleDetails.jsp?arnumber=4298118}, issn = {1045-9227}, year = {2007}, date = {2007-01-01}, journal = {IEEE Transactions on Neural Networks}, volume = {18}, number = {5}, pages = {1433--1441}, publisher = {IEEE}, abstract = {In this paper, we present a novel scheme for linear feature extraction in classification. The method is based on the maximization of the mutual information (MI) between the features extracted and the classes. The sum of the MI corresponding to each of the features is taken as a heuristic that approximates the MI of the whole output vector. Then, a component-by-component gradient-ascent method is proposed for the maximization of the MI, similar to the gradient-based entropy optimization used in independent component analysis (ICA). 
The simulation results show that not only is the method competitive when compared to existing supervised feature extraction methods in all cases studied, but it also remarkably outperforms them when the data are characterized by strongly nonlinear boundaries between classes.}, keywords = {Algorithms, Artificial Intelligence, Automated, component-by-component gradient-ascent method, Computer Simulation, Data Mining, Entropy, Feature extraction, gradient methods, gradient-based entropy, Independent component analysis, Information Storage and Retrieval, information theory, Iron, learning (artificial intelligence), Linear discriminant analysis, Linear Models, Mutual information, Optimization methods, Pattern recognition, Reproducibility of Results, Sensitivity and Specificity, supervised linear feature extraction, Vectors}, pubstate = {published}, tppubtype = {article} } In this paper, we present a novel scheme for linear feature extraction in classification. The method is based on the maximization of the mutual information (MI) between the features extracted and the classes. The sum of the MI corresponding to each of the features is taken as a heuristic that approximates the MI of the whole output vector. Then, a component-by-component gradient-ascent method is proposed for the maximization of the MI, similar to the gradient-based entropy optimization used in independent component analysis (ICA). The simulation results show that not only is the method competitive when compared to existing supervised feature extraction methods in all cases studied, but it also remarkably outperforms them when the data are characterized by strongly nonlinear boundaries between classes. |