## Compressed Sensing

Compressed sensing (CS) is a new framework for integrated sensing and compression. The fundamental revelation is that if an N-sample signal x is sparse and has a good K-term approximation in some basis, then it can be reconstructed using M = O(K log(N/K)) << N linear projections of x onto another basis. Furthermore, x can be reconstructed using linear programming, which has polynomial complexity. Some of the CS projects I have worked on are described here, and links to numerous other papers appear on the compressed sensing resource page. My contribution to CS was to introduce an information-theoretic approach, where a probabilistic framework makes it possible to leverage ideas from source and channel coding. The CS system resembles a communication system, where source and channel encoding are replaced by a matrix-vector multiplication, and channel and source decoding are replaced by CS reconstruction, which typically relies on optimization routines. I sometimes refer to these steps as CS encoding and decoding, respectively.
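As a concrete illustration of these encoding and decoding steps, the following sketch measures a K-sparse signal with a random Gaussian matrix and reconstructs it by linear programming (basis pursuit). The sizes and solver choice are illustrative, not taken from any of the papers below:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
N, K = 128, 5                      # signal length and sparsity (illustrative)
M = int(4 * K * np.log(N / K))     # M = O(K log(N/K)) << N measurements

# K-sparse signal in the canonical basis
x = np.zeros(N)
x[rng.choice(N, K, replace=False)] = rng.standard_normal(K)

A = rng.standard_normal((M, N)) / np.sqrt(M)   # random projections (CS encoding)
y = A @ x                                      # M linear measurements

# Basis pursuit: min ||x||_1 subject to Ax = y, posed as an LP with x = u - v
res = linprog(np.ones(2 * N), A_eq=np.hstack([A, -A]), b_eq=y,
              bounds=(0, None), method="highs")
x_hat = res.x[:N] - res.x[N:]
print(np.max(np.abs(x_hat - x)))   # reconstruction error should be tiny
```

Note that decoding here costs a polynomial-time LP solve, in contrast to the exponential cost of combinatorial (ell_0) search.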
Information theoretic results on CS performance: By thinking about a linear measurement system in terms of joint source-channel coding, it is straightforward to employ the source-channel separation theorems of information theory to derive bounds on the number of CS measurements. The paper below introduced this interpretation, leading to a simple bound:
S. Sarvotham, D. Baron, and R. G. Baraniuk, "Measurements vs. Bits: Compressed Sensing meets Information Theory," Proceedings of the 44th Allerton Conference on Communication, Control, and Computing, Monticello, IL, September 2006 (talk, pdf).
Later, Dongning Guo, Shlomo Shamai, and I established a precise characterization of CS performance using replica methods, which have been used to analyze CDMA systems. Interestingly, we proved that sparse measurement matrices offer the same performance as dense ones, and that belief propagation (see below) is an asymptotically optimal algorithm. It is important to stress that our results are specific to Gaussian measurement noise and a specific large-system limit. That said, the results can be extended to arbitrary (non-Gaussian) noise via techniques introduced by Dongning and his collaborator C. Wang.
D. Guo, D. Baron, and S. Shamai, "A Single-letter Characterization of Optimal Noisy Compressed Sensing," Proceedings of the 47th Allerton Conference on Communication, Control, and Computing, Monticello, IL, September 2009 (ppt, pdf).
This work was also featured on Igor Carron's famous blog, and is discussed in detail in a talk I gave at Google Research, Mountain View, CA, October 2009. These results are easily extended beyond compressed sensing to other linear measurement systems. In fact, many areas of science and engineering feature systems that acquire signals in a linear or approximately linear fashion. Applications include multiuser detection (e.g., CDMA), medical imaging, financial prediction, electromagnetic scattering, and seismics.

Distributed compressed sensing: Ensembles of signals often contain both inter- and intra-signal correlation structures. (For example, sensor network data often contain spatial and temporal correlations.) Such structures can be exploited by distributed source coding algorithms, where each signal is encoded separately and all signals are recovered jointly. Unfortunately, designing practical schemes for distributed compression of sources with both types of correlation has remained a challenging problem for quite some time.

CS offers a new way to approach these problems. Each sensor takes random projections of its data, and the measurements of all the sensors are used by the decoder jointly. We call this approach distributed compressed sensing (DCS). The DCS theory rests on the joint sparsity of a signal ensemble. We proposed several simple models for jointly sparse signals, developed algorithms for joint reconstruction, and characterized the number of measurements required. As with distributed compression, DCS makes it possible to reduce the number of measurements, and it is applicable to sensor networks.
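One way to picture joint sparsity is an ensemble in which all sensors share a common sparse component and each sensor adds a small sparse innovation; the sketch below generates such an ensemble and measures each signal with its own matrix. The model choice and all sizes here are illustrative stand-ins, not the exact models defined in the DCS papers:

```python
import numpy as np

rng = np.random.default_rng(1)
N, J = 256, 4              # signal length, number of sensors (illustrative)
K_common, K_innov = 8, 2   # shared vs. per-sensor sparsity (illustrative)
M = 40                     # per-sensor measurements; joint decoding amortizes
                           # the cost of the shared component across sensors

# Common sparse component shared by all sensors
z_common = np.zeros(N)
z_common[rng.choice(N, K_common, replace=False)] = rng.standard_normal(K_common)

signals, matrices, measurements = [], [], []
for j in range(J):
    z_j = np.zeros(N)      # small per-sensor innovation
    z_j[rng.choice(N, K_innov, replace=False)] = rng.standard_normal(K_innov)
    x_j = z_common + z_j
    A_j = rng.standard_normal((M, N)) / np.sqrt(M)  # distinct matrix per sensor
    signals.append(x_j)
    matrices.append(A_j)
    measurements.append(A_j @ x_j)   # each sensor encodes independently
```

A joint decoder that sees all J measurement vectors only needs to "pay" for the common component once, which is the intuition behind the measurement regions described below.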

We also provided a unifying theory that considers generic joint sparsity models using bipartite graphical models. The interesting point is that in the world of noiseless measurement of strictly sparse signals, dimensionality plays a volumetric role analogous to entropy in the data compression world. We have shown a bound that applies in the following settings:

• Single signal CS: one signal being measured noiselessly.
• Joint CS: one encoder looks at an ensemble of signals and measures them jointly; the CS measurements are obtained via multiplication of a matrix by the vector formed by the entire ensemble. Not surprisingly, the bounds here are similar to single signal CS.
• Distributed CS: here the multiple signals are measured in a distributed manner, and each is encoded by a different measurement matrix. We provide a region describing the number of noiseless measurements required for each of the signals. The number of measurements required for each sensor must account for the minimal features unique to that sensor, while features that appear among multiple sensors are amortized.
The contribution here is that in the idealized world of noiseless measurement of exactly sparse signals, the dimensionality of the signal ensemble under evaluation provides a precise characterization of the number of measurements required. This result emphasizes the role that dimensionality plays in these systems. Despite the idealized setting, it provides insights into noisy measurement systems.

Fast CS decoding algorithms: Linear programming CS decoding received much initial attention, yet requires at least quadratic (or even cubic) computation, which is excessive for decoding large signals such as million-pixel images. Decoding is computationally challenging, because (i) decoders must take products of the measurement matrix with the signal vector, and (ii) many algorithms iterate over the data numerous times. To reduce computation, we proposed using sparse CS encoding matrices that offer fast matrix-vector multiplication. We first used these matrices along with a graph-reduction technique intended for the somewhat narrow application of decoding noiseless measurements of exactly sparse data. For this problem, our technique requires O(K polylog(N)) computation.
S. Sarvotham, D. Baron, and R. G. Baraniuk, "Sudocodes - Fast Measurement and Reconstruction of Sparse Signals," 2006 IEEE International Symposium on Information Theory (ISIT2006), Seattle, WA, July 2006 (pdf).
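To see why sparse encoding matrices help, note that multiplying by a matrix with about d nonzeros per column costs O(Nd) rather than O(MN). The sketch below builds such a matrix with SciPy; the sizes and the choice of ±1 entries are illustrative assumptions, not the exact construction from the paper:

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(2)
N, M, d = 100_000, 10_000, 8   # illustrative sizes

# Sparse +/-1 encoding matrix with roughly d nonzeros per column
rows = rng.integers(0, M, size=N * d)        # row index of each nonzero
cols = np.repeat(np.arange(N), d)            # d entries per column
vals = rng.choice([-1.0, 1.0], size=N * d)
A = sparse.coo_matrix((vals, (rows, cols)), shape=(M, N)).tocsr()

x = np.zeros(N)
x[rng.choice(N, 100, replace=False)] = 1.0   # sparse test signal
y = A @ x                                    # fast: touches only A's nonzeros
print(A.nnz, y.shape)
```

A dense M-by-N matrix of the same size would hold a billion entries; the sparse version stores under a million and multiplies correspondingly faster.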
To approach CS decoding problems where the data is not exactly sparse and measurements may be noisy, we proposed a Bayesian setting. We focused on a two-state mixture Gaussian signal model, but more sophisticated models can also be incorporated, including support for measurement noise. Our decoder relies on belief propagation (BP), a technique that is commonly used for signal estimation in various DSP applications and for channel decoding of turbo and LDPC codes. The CS-BP decoder represents the CS encoding matrix by a bipartite graph. In each iteration, nodes that correspond to signal coefficients and measurements estimate their corresponding conditional probabilities and then convey this information to neighboring nodes. Because the encoding matrix is sparse, the bipartite graph is also sparse, and so each CS-BP iteration runs in O(N log(N)) time. The number of iterations is logarithmic in N. Feel free to download the Matlab code.
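For concreteness, a two-state mixture Gaussian signal can be sampled as follows: each coefficient is drawn from a high-variance Gaussian with small probability and from a low-variance Gaussian otherwise, giving an approximately sparse signal. The parameter values below are illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 10_000
p = 0.05                               # probability a coefficient is "large"
sigma_large, sigma_small = 1.0, 0.01   # illustrative state variances

# Two-state mixture Gaussian: large w.p. p, small w.p. 1-p
state = rng.random(N) < p
x = np.where(state,
             sigma_large * rng.standard_normal(N),
             sigma_small * rng.standard_normal(N))
print(state.sum(), "large coefficients out of", N)
```

This prior is what the CS-BP decoder's message passing exploits: nodes exchange conditional probabilities under exactly this kind of mixture model.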

A potentially important feature of CS-BP is that it combines computational efficiency with a Bayesian approach. In terms of computation, the CS-BP encoder and decoder are about as computationally efficient as the best available algorithms. At the same time, the Bayesian approach makes it possible to exploit knowledge of the signal statistics in order to estimate the input better, thus enabling a further reduction in the number of measurements. A discussion of CS-BP, which compares it to other cutting-edge CS algorithms, appears on Igor Carron's blog.

Secrecy properties of CS: Several papers mention the possibility that CS measurements provide a form of encryption. Yaron Rachlin and I have investigated this claim. We consider a scenario where Alice has a secret message (in our model, the message is a real-valued K-sparse signal) that she would like to share with Bob. She encodes this signal using an M by N Gaussian encoding matrix. Bob receives the measurements and can decode the signal because he knows the encoding matrix (in practice, Alice and Bob could share the seed of a random number generator used to produce the matrix). Can an eavesdropper (Eve), who intercepts the measurements, recover the signal without knowing the encoding matrix? We evaluate this question using two well-established approaches to encryption: information-theoretic and computational.

First, we consider the stronger information-theoretic notion of perfect secrecy. This notion requires the mutual information between signals and measurements to be zero. However, the signals and measurements are statistically dependent, which rules out perfect secrecy. Second, we consider the weaker notion of computational secrecy, which means that Eve can only recover the signal with a prohibitively large computational cost. We prove that CS achieves a computational notion of secrecy in the face of an adversary attempting to use either ell_0 or ell_1 minimization (combinatorial search and linear programming, respectively). Our result hinges on a theorem that shows that with probability one Eve will recover an M-sparse explanation instead of a K-sparse explanation when using the wrong encoding matrix.

Y. Rachlin and D. Baron, "The Secrecy of Compressive Sensing Measurements," Proceedings of the 46th Allerton Conference on Communication, Control, and Computing, Monticello, IL, September 2008 (pdf, ppt). 
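The ell_1 side of this claim can be illustrated numerically: decoding with the correct matrix recovers the K-sparse secret, while decoding with a different matrix typically yields a roughly M-sparse explanation instead. The sketch below uses illustrative sizes and SciPy's LP solver; it is a demonstration of the phenomenon, not the proof technique from the paper:

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(4)
N, K, M = 60, 3, 25   # illustrative sizes

# Alice's K-sparse secret signal
x = np.zeros(N)
x[rng.choice(N, K, replace=False)] = rng.standard_normal(K)

A_bob = rng.standard_normal((M, N))   # shared secret (e.g., via a PRNG seed)
A_eve = rng.standard_normal((M, N))   # Eve's guess: a different random matrix
y = A_bob @ x                         # intercepted measurements

def l1_min(A, y, N):
    # min ||z||_1 subject to Az = y, via the standard LP split z = u - v
    res = linprog(np.ones(2 * N), A_eq=np.hstack([A, -A]), b_eq=y,
                  bounds=(0, None), method="highs")
    return res.x[:N] - res.x[N:]

z_bob = l1_min(A_bob, y, N)   # recovers the K-sparse secret
z_eve = l1_min(A_eve, y, N)   # typically a dense, ~M-sparse wrong answer
print(np.count_nonzero(np.abs(z_bob) > 1e-6),
      np.count_nonzero(np.abs(z_eve) > 1e-6))
```

This matches the theorem above: with the wrong matrix, Eve's ell_1 decoder finds an M-sparse explanation rather than Alice's K-sparse signal.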