3 resultados para Sparsity
em CaltechTHESIS
Resumo:
A central objective in signal processing is to infer meaningful information from a set of measurements or data. While most signal models have an overdetermined structure (the number of unknowns less than the number of equations), traditionally very few statistical estimation problems have considered a data model which is underdetermined (number of unknowns more than the number of equations). However, in recent times, an explosion of theoretical and computational methods have been developed primarily to study underdetermined systems by imposing sparsity on the unknown variables. This is motivated by the observation that inspite of the huge volume of data that arises in sensor networks, genomics, imaging, particle physics, web search etc., their information content is often much smaller compared to the number of raw measurements. This has given rise to the possibility of reducing the number of measurements by down sampling the data, which automatically gives rise to underdetermined systems.
In this thesis, we provide new directions for estimation in an underdetermined system, both for a class of parameter estimation problems and also for the problem of sparse recovery in compressive sensing. There are two main contributions of the thesis: design of new sampling and statistical estimation algorithms for array processing, and development of improved guarantees for sparse reconstruction by introducing a statistical framework to the recovery problem.
We consider underdetermined observation models in array processing where the number of unknown sources simultaneously received by the array can be considerably larger than the number of physical sensors. We study new sparse spatial sampling schemes (array geometries) as well as propose new recovery algorithms that can exploit priors on the unknown signals and unambiguously identify all the sources. The proposed sampling structure is generic enough to be extended to multiple dimensions as well as to exploit different kinds of priors in the model such as correlation, higher order moments, etc.
Recognizing the role of correlation priors and suitable sampling schemes for underdetermined estimation in array processing, we introduce a correlation aware framework for recovering sparse support in compressive sensing. We show that it is possible to strictly increase the size of the recoverable sparse support using this framework provided the measurement matrix is suitably designed. The proposed nested and coprime arrays are shown to be appropriate candidates in this regard. We also provide new guarantees for convex and greedy formulations of the support recovery problem and demonstrate that it is possible to strictly improve upon existing guarantees.
This new paradigm of underdetermined estimation that explicitly establishes the fundamental interplay between sampling, statistical priors and the underlying sparsity, leads to exciting future research directions in a variety of application areas, and also gives rise to new questions that can lead to stand-alone theoretical results in their own right.
Resumo:
This thesis studies three classes of randomized numerical linear algebra algorithms, namely: (i) randomized matrix sparsification algorithms, (ii) low-rank approximation algorithms that use randomized unitary transformations, and (iii) low-rank approximation algorithms for positive-semidefinite (PSD) matrices.
Randomized matrix sparsification algorithms set randomly chosen entries of the input matrix to zero. When the approximant is substituted for the original matrix in computations, its sparsity allows one to employ faster sparsity-exploiting algorithms. This thesis contributes bounds on the approximation error of nonuniform randomized sparsification schemes, measured in the spectral norm and two NP-hard norms that are of interest in computational graph theory and subset selection applications.
Low-rank approximations based on randomized unitary transformations have several desirable properties: they have low communication costs, are amenable to parallel implementation, and exploit the existence of fast transform algorithms. This thesis investigates the tradeoff between the accuracy and cost of generating such approximations. State-of-the-art spectral and Frobenius-norm error bounds are provided.
The last class of algorithms considered are SPSD "sketching" algorithms. Such sketches can be computed faster than approximations based on projecting onto mixtures of the columns of the matrix. The performance of several such sketching schemes is empirically evaluated using a suite of canonical matrices drawn from machine learning and data analysis applications, and a framework is developed for establishing theoretical error bounds.
In addition to studying these algorithms, this thesis extends the Matrix Laplace Transform framework to derive Chernoff and Bernstein inequalities that apply to all the eigenvalues of certain classes of random matrices. These inequalities are used to investigate the behavior of the singular values of a matrix under random sampling, and to derive convergence rates for each individual eigenvalue of a sample covariance matrix.
Resumo:
The search for reliable proxies of past deep ocean temperature and salinity has proved difficult, thereby limiting our ability to understand the coupling of ocean circulation and climate over glacial-interglacial timescales. Previous inferences of deep ocean temperature and salinity from sediment pore fluid oxygen isotopes and chlorinity indicate that the deep ocean density structure at the Last Glacial Maximum (LGM, approximately 20,000 years BP) was set by salinity, and that the density contrast between northern and southern sourced deep waters was markedly greater than in the modern ocean. High density stratification could help explain the marked contrast in carbon isotope distribution recorded in the LGM ocean relative to that we observe today, but what made the ocean's density structure so different at the LGM? How did it evolve from one state to another? Further, given the sparsity of the LGM temperature and salinity data set, what else can we learn by increasing the spatial density of proxy records?
We investigate the cause and feasibility of a highly and salinity stratified deep ocean at the LGM and we work to increase the amount of information we can glean about the past ocean from pore fluid profiles of oxygen isotopes and chloride. Using a coupled ocean--sea ice--ice shelf cavity model we test whether the deep ocean density structure at the LGM can be explained by ice--ocean interactions over the Antarctic continental shelves, and show that a large contribution of the LGM salinity stratification can be explained through lower ocean temperature. In order to extract the maximum information from pore fluid profiles of oxygen isotopes and chloride we evaluate several inverse methods for ill-posed problems and their ability to recover bottom water histories from sediment pore fluid profiles. We demonstrate that Bayesian Markov Chain Monte Carlo parameter estimation techniques enable us to robustly recover the full solution space of bottom water histories, not only at the LGM, but through the most recent deglaciation and the Holocene up to the present. Finally, we evaluate a non-destructive pore fluid sampling technique, Rhizon samplers, in comparison to traditional squeezing methods and show that despite their promise, Rhizons are unlikely to be a good sampling tool for pore fluid measurements of oxygen isotopes and chloride.