10 resultados para Idempotent Rank
em CaltechTHESIS
Resumo:
Demixing is the task of identifying multiple signals given only their sum and prior information about their structures. Examples of demixing problems include (i) separating a signal that is sparse with respect to one basis from a signal that is sparse with respect to a second basis; (ii) decomposing an observed matrix into low-rank and sparse components; and (iii) identifying a binary codeword with impulsive corruptions. This thesis describes and analyzes a convex optimization framework for solving an array of demixing problems.
Our framework includes a random orientation model for the constituent signals that ensures the structures are incoherent. This work introduces a summary parameter, the statistical dimension, that reflects the intrinsic complexity of a signal. The main result indicates that the difficulty of demixing under this random model depends only on the total complexity of the constituent signals involved: demixing succeeds with high probability when the sum of the complexities is less than the ambient dimension; otherwise, it fails with high probability.
The fact that a phase transition between success and failure occurs in demixing is a consequence of a new inequality in conic integral geometry. Roughly speaking, this inequality asserts that a convex cone behaves like a subspace whose dimension is equal to the statistical dimension of the cone. When combined with a geometric optimality condition for demixing, this inequality provides precise quantitative information about the phase transition, including the location and width of the transition region.
Resumo:
Storage systems are widely used and have played a crucial rule in both consumer and industrial products, for example, personal computers, data centers, and embedded systems. However, such system suffers from issues of cost, restricted-lifetime, and reliability with the emergence of new systems and devices, such as distributed storage and flash memory, respectively. Information theory, on the other hand, provides fundamental bounds and solutions to fully utilize resources such as data density, information I/O and network bandwidth. This thesis bridges these two topics, and proposes to solve challenges in data storage using a variety of coding techniques, so that storage becomes faster, more affordable, and more reliable.
We consider the system level and study the integration of RAID schemes and distributed storage. Erasure-correcting codes are the basis of the ubiquitous RAID schemes for storage systems, where disks correspond to symbols in the code and are located in a (distributed) network. Specifically, RAID schemes are based on MDS (maximum distance separable) array codes that enable optimal storage and efficient encoding and decoding algorithms. With r redundancy symbols an MDS code can sustain r erasures. For example, consider an MDS code that can correct two erasures. It is clear that when two symbols are erased, one needs to access and transmit all the remaining information to rebuild the erasures. However, an interesting and practical question is: What is the smallest fraction of information that one needs to access and transmit in order to correct a single erasure? In Part I we will show that the lower bound of 1/2 is achievable and that the result can be generalized to codes with arbitrary number of parities and optimal rebuilding.
We consider the device level and study coding and modulation techniques for emerging non-volatile memories such as flash memory. In particular, rank modulation is a novel data representation scheme proposed by Jiang et al. for multi-level flash memory cells, in which a set of n cells stores information in the permutation induced by the different charge levels of the individual cells. It eliminates the need for discrete cell levels, as well as overshoot errors, when programming cells. In order to decrease the decoding complexity, we propose two variations of this scheme in Part II: bounded rank modulation where only small sliding windows of cells are sorted to generated permutations, and partial rank modulation where only part of the n cells are used to represent data. We study limits on the capacity of bounded rank modulation and propose encoding and decoding algorithms. We show that overlaps between windows will increase capacity. We present Gray codes spanning all possible partial-rank states and using only ``push-to-the-top'' operations. These Gray codes turn out to solve an open combinatorial problem called universal cycle, which is a sequence of integers generating all possible partial permutations.
Resumo:
This thesis studies three classes of randomized numerical linear algebra algorithms, namely: (i) randomized matrix sparsification algorithms, (ii) low-rank approximation algorithms that use randomized unitary transformations, and (iii) low-rank approximation algorithms for positive-semidefinite (PSD) matrices.
Randomized matrix sparsification algorithms set randomly chosen entries of the input matrix to zero. When the approximant is substituted for the original matrix in computations, its sparsity allows one to employ faster sparsity-exploiting algorithms. This thesis contributes bounds on the approximation error of nonuniform randomized sparsification schemes, measured in the spectral norm and two NP-hard norms that are of interest in computational graph theory and subset selection applications.
Low-rank approximations based on randomized unitary transformations have several desirable properties: they have low communication costs, are amenable to parallel implementation, and exploit the existence of fast transform algorithms. This thesis investigates the tradeoff between the accuracy and cost of generating such approximations. State-of-the-art spectral and Frobenius-norm error bounds are provided.
The last class of algorithms considered are SPSD "sketching" algorithms. Such sketches can be computed faster than approximations based on projecting onto mixtures of the columns of the matrix. The performance of several such sketching schemes is empirically evaluated using a suite of canonical matrices drawn from machine learning and data analysis applications, and a framework is developed for establishing theoretical error bounds.
In addition to studying these algorithms, this thesis extends the Matrix Laplace Transform framework to derive Chernoff and Bernstein inequalities that apply to all the eigenvalues of certain classes of random matrices. These inequalities are used to investigate the behavior of the singular values of a matrix under random sampling, and to derive convergence rates for each individual eigenvalue of a sample covariance matrix.
Resumo:
The concept of a "projection function" in a finite-dimensional real or complex normed linear space H (the function PM which carries every element into the closest element of a given subspace M) is set forth and examined.
If dim M = dim H - 1, then PM is linear. If PN is linear for all k-dimensional subspaces N, where 1 ≤ k < dim M, then PM is linear.
The projective bound Q, defined to be the supremum of the operator norm of PM for all subspaces, is in the range 1 ≤ Q < 2, and these limits are the best possible. For norms with Q = 1, PM is always linear, and a characterization of those norms is given.
If H also has an inner product (defined independently of the norm), so that a dual norm can be defined, then when PM is linear its adjoint PMH is the projection on (kernel PM)⊥ by the dual norm. The projective bounds of a norm and its dual are equal.
The notion of a pseudo-inverse F+ of a linear transformation F is extended to non-Euclidean norms. The distance from F to the set of linear transformations G of lower rank (in the sense of the operator norm ∥F - G∥) is c/∥F+∥, where c = 1 if the range of F fills its space, and 1 ≤ c < Q otherwise. The norms on both domain and range spaces have Q = 1 if and only if (F+)+ = F for every F. This condition is also sufficient to prove that we have (F+)H = (FH)+, where the latter pseudo-inverse is taken using dual norms.
In all results, the real and complex cases are handled in a completely parallel fashion.
Resumo:
The Hamilton Jacobi Bellman (HJB) equation is central to stochastic optimal control (SOC) theory, yielding the optimal solution to general problems specified by known dynamics and a specified cost functional. Given the assumption of quadratic cost on the control input, it is well known that the HJB reduces to a particular partial differential equation (PDE). While powerful, this reduction is not commonly used as the PDE is of second order, is nonlinear, and examples exist where the problem may not have a solution in a classical sense. Furthermore, each state of the system appears as another dimension of the PDE, giving rise to the curse of dimensionality. Since the number of degrees of freedom required to solve the optimal control problem grows exponentially with dimension, the problem becomes intractable for systems with all but modest dimension.
In the last decade researchers have found that under certain, fairly non-restrictive structural assumptions, the HJB may be transformed into a linear PDE, with an interesting analogue in the discretized domain of Markov Decision Processes (MDP). The work presented in this thesis uses the linearity of this particular form of the HJB PDE to push the computational boundaries of stochastic optimal control.
This is done by crafting together previously disjoint lines of research in computation. The first of these is the use of Sum of Squares (SOS) techniques for synthesis of control policies. A candidate polynomial with variable coefficients is proposed as the solution to the stochastic optimal control problem. An SOS relaxation is then taken to the partial differential constraints, leading to a hierarchy of semidefinite relaxations with improving sub-optimality gap. The resulting approximate solutions are shown to be guaranteed over- and under-approximations for the optimal value function. It is shown that these results extend to arbitrary parabolic and elliptic PDEs, yielding a novel method for Uncertainty Quantification (UQ) of systems governed by partial differential constraints. Domain decomposition techniques are also made available, allowing for such problems to be solved via parallelization and low-order polynomials.
The optimization-based SOS technique is then contrasted with the Separated Representation (SR) approach from the applied mathematics community. The technique allows for systems of equations to be solved through a low-rank decomposition that results in algorithms that scale linearly with dimensionality. Its application in stochastic optimal control allows for previously uncomputable problems to be solved quickly, scaling to such complex systems as the Quadcopter and VTOL aircraft. This technique may be combined with the SOS approach, yielding not only a numerical technique, but also an analytical one that allows for entirely new classes of systems to be studied and for stability properties to be guaranteed.
The analysis of the linear HJB is completed by the study of its implications in application. It is shown that the HJB and a popular technique in robotics, the use of navigation functions, sit on opposite ends of a spectrum of optimization problems, upon which tradeoffs may be made in problem complexity. Analytical solutions to the HJB in these settings are available in simplified domains, yielding guidance towards optimality for approximation schemes. Finally, the use of HJB equations in temporal multi-task planning problems is investigated. It is demonstrated that such problems are reducible to a sequence of SOC problems linked via boundary conditions. The linearity of the PDE allows us to pre-compute control policy primitives and then compose them, at essentially zero cost, to satisfy a complex temporal logic specification.
Resumo:
There is a growing interest in taking advantage of possible patterns and structures in data so as to extract the desired information and overcome the curse of dimensionality. In a wide range of applications, including computer vision, machine learning, medical imaging, and social networks, the signal that gives rise to the observations can be modeled to be approximately sparse and exploiting this fact can be very beneficial. This has led to an immense interest in the problem of efficiently reconstructing a sparse signal from limited linear observations. More recently, low-rank approximation techniques have become prominent tools to approach problems arising in machine learning, system identification and quantum tomography.
In sparse and low-rank estimation problems, the challenge is the inherent intractability of the objective function, and one needs efficient methods to capture the low-dimensionality of these models. Convex optimization is often a promising tool to attack such problems. An intractable problem with a combinatorial objective can often be "relaxed" to obtain a tractable but almost as powerful convex optimization problem. This dissertation studies convex optimization techniques that can take advantage of low-dimensional representations of the underlying high-dimensional data. We provide provable guarantees that ensure that the proposed algorithms will succeed under reasonable conditions, and answer questions of the following flavor:
- For a given number of measurements, can we reliably estimate the true signal?
- If so, how good is the reconstruction as a function of the model parameters?
More specifically, i) Focusing on linear inverse problems, we generalize the classical error bounds known for the least-squares technique to the lasso formulation, which incorporates the signal model. ii) We show that intuitive convex approaches do not perform as well as expected when it comes to signals that have multiple low-dimensional structures simultaneously. iii) Finally, we propose convex relaxations for the graph clustering problem and give sharp performance guarantees for a family of graphs arising from the so-called stochastic block model. We pay particular attention to the following aspects. For i) and ii), we aim to provide a general geometric framework, in which the results on sparse and low-rank estimation can be obtained as special cases. For i) and iii), we investigate the precise performance characterization, which yields the right constants in our bounds and the true dependence between the problem parameters.
Resumo:
Flash memory is a leading storage media with excellent features such as random access and high storage density. However, it also faces significant reliability and endurance challenges. In flash memory, the charge level in the cells can be easily increased, but removing charge requires an expensive erasure operation. In this thesis we study rewriting schemes that enable the data stored in a set of cells to be rewritten by only increasing the charge level in the cells. We consider two types of modulation scheme; a convectional modulation based on the absolute levels of the cells, and a recently-proposed scheme based on the relative cell levels, called rank modulation. The contributions of this thesis to the study of rewriting schemes for rank modulation include the following: we
•propose a new method of rewriting in rank modulation, beyond the previously proposed method of “push-to-the-top”;
•study the limits of rewriting with the newly proposed method, and derive a tight upper bound of 1 bit per cell;
•extend the rank-modulation scheme to support rankings with repetitions, in order to improve the storage density;
•derive a tight upper bound of 2 bits per cell for rewriting in rank modulation with repetitions;
•construct an efficient rewriting scheme that asymptotically approaches the upper bound of 2 bit per cell.
The next part of this thesis studies rewriting schemes for a conventional absolute-levels modulation. The considered model is called “write-once memory” (WOM). We focus on WOM schemes that achieve the capacity of the model. In recent years several capacity-achieving WOM schemes were proposed, based on polar codes and randomness extractors. The contributions of this thesis to the study of WOM scheme include the following: we
•propose a new capacity-achievingWOM scheme based on sparse-graph codes, and show its attractive properties for practical implementation;
•improve the design of polarWOMschemes to remove the reliance on shared randomness and include an error-correction capability.
The last part of the thesis studies the local rank-modulation (LRM) scheme, in which a sliding window going over a sequence of real-valued variables induces a sequence of permutations. The LRM scheme is used to simulate a single conventional multi-level flash cell. The simulated cell is realized by a Gray code traversing all the relative-value states where, physically, the transition between two adjacent states in the Gray code is achieved by using a single “push-to-the-top” operation. The main results of the last part of the thesis are two constructions of Gray codes with asymptotically-optimal rate.
Resumo:
Let F = Ǫ(ζ + ζ –1) be the maximal real subfield of the cyclotomic field Ǫ(ζ) where ζ is a primitive qth root of unity and q is an odd rational prime. The numbers u1=-1, uk=(ζk-ζ-k)/(ζ-ζ-1), k=2,…,p, p=(q-1)/2, are units in F and are called the cyclotomic units. In this thesis the sign distribution of the conjugates in F of the cyclotomic units is studied.
Let G(F/Ǫ) denote the Galoi's group of F over Ǫ, and let V denote the units in F. For each σϵ G(F/Ǫ) and μϵV define a mapping sgnσ: V→GF(2) by sgnσ(μ) = 1 iff σ(μ) ˂ 0 and sgnσ(μ) = 0 iff σ(μ) ˃ 0. Let {σ1, ... , σp} be a fixed ordering of G(F/Ǫ). The matrix Mq=(sgnσj(vi) ) , i, j = 1, ... , p is called the matrix of cyclotomic signatures. The rank of this matrix determines the sign distribution of the conjugates of the cyclotomic units. The matrix of cyclotomic signatures is associated with an ideal in the ring GF(2) [x] / (xp+ 1) in such a way that the rank of the matrix equals the GF(2)-dimension of the ideal. It is shown that if p = (q-1)/ 2 is a prime and if 2 is a primitive root mod p, then Mq is non-singular. Also let p be arbitrary, let ℓ be a primitive root mod q and let L = {i | 0 ≤ i ≤ p-1, the least positive residue of defined by ℓi mod q is greater than p}. Let Hq(x) ϵ GF(2)[x] be defined by Hq(x) = g. c. d. ((Σ xi/I ϵ L) (x+1) + 1, xp + 1). It is shown that the rank of Mq equals the difference p - degree Hq(x).
Further results are obtained by using the reciprocity theorem of class field theory. The reciprocity maps for a certain abelian extension of F and for the infinite primes in F are associated with the signs of conjugates. The product formula for the reciprocity maps is used to associate the signs of conjugates with the reciprocity maps at the primes which lie above (2). The case when (2) is a prime in F is studied in detail. Let T denote the group of totally positive units in F. Let U be the group generated by the cyclotomic units. Assume that (2) is a prime in F and that p is odd. Let F(2) denote the completion of F at (2) and let V(2) denote the units in F(2). The following statements are shown to be equivalent. 1) The matrix of cyclotomic signatures is non-singular. 2) U∩T = U2. 3) U∩F2(2) = U2. 4) V(2)/ V(2)2 = ˂v1 V(2)2˃ ʘ…ʘ˂vp V(2)2˃ ʘ ˂3V(2)2˃.
The rank of Mq was computed for 5≤q≤929 and the results appear in tables. On the basis of these results and additional calculations the following conjecture is made: If q and p = (q -1)/ 2 are both primes, then Mq is non-singular.
Resumo:
Let L be a finite geometric lattice of dimension n, and let w(k) denote the number of elements in L of rank k. Two theorems about the numbers w(k) are proved: first, w(k) ≥ w(1) for k = 2, 3, ..., n-1. Second, w(k) = w(1) if and only if k = n-1 and L is modular. Several corollaries concerning the "matching" of points and dual points are derived from these theorems.
Both theorems can be regarded as a generalization of a theorem of de Bruijn and Erdös concerning ʎ= 1 designs. The second can also be considered as the converse to a special case of Dilworth's theorem on finite modular lattices.
These results are related to two conjectures due to G. -C. Rota. The "unimodality" conjecture states that the w(k)'s form a unimodal sequence. The "Sperner" conjecture states that a set of non-comparable elements in L has cardinality at most max/k {w(k)}. In this thesis, a counterexample to the Sperner conjecture is exhibited.
Resumo:
The centralized paradigm of a single controller and a single plant upon which modern control theory is built is no longer applicable to modern cyber-physical systems of interest, such as the power-grid, software defined networks or automated highways systems, as these are all large-scale and spatially distributed. Both the scale and the distributed nature of these systems has motivated the decentralization of control schemes into local sub-controllers that measure, exchange and act on locally available subsets of the globally available system information. This decentralization of control logic leads to different decision makers acting on asymmetric information sets, introduces the need for coordination between them, and perhaps not surprisingly makes the resulting optimal control problem much harder to solve. In fact, shortly after such questions were posed, it was realized that seemingly simple decentralized optimal control problems are computationally intractable to solve, with the Wistenhausen counterexample being a famous instance of this phenomenon. Spurred on by this perhaps discouraging result, a concerted 40 year effort to identify tractable classes of distributed optimal control problems culminated in the notion of quadratic invariance, which loosely states that if sub-controllers can exchange information with each other at least as quickly as the effect of their control actions propagates through the plant, then the resulting distributed optimal control problem admits a convex formulation.
The identification of quadratic invariance as an appropriate means of "convexifying" distributed optimal control problems led to a renewed enthusiasm in the controller synthesis community, resulting in a rich set of results over the past decade. The contributions of this thesis can be seen as being a part of this broader family of results, with a particular focus on closing the gap between theory and practice by relaxing or removing assumptions made in the traditional distributed optimal control framework. Our contributions are to the foundational theory of distributed optimal control, and fall under three broad categories, namely controller synthesis, architecture design and system identification.
We begin by providing two novel controller synthesis algorithms. The first is a solution to the distributed H-infinity optimal control problem subject to delay constraints, and provides the only known exact characterization of delay-constrained distributed controllers satisfying an H-infinity norm bound. The second is an explicit dynamic programming solution to a two player LQR state-feedback problem with varying delays. Accommodating varying delays represents an important first step in combining distributed optimal control theory with the area of Networked Control Systems that considers lossy channels in the feedback loop. Our next set of results are concerned with controller architecture design. When designing controllers for large-scale systems, the architectural aspects of the controller such as the placement of actuators, sensors, and the communication links between them can no longer be taken as given -- indeed the task of designing this architecture is now as important as the design of the control laws themselves. To address this task, we formulate the Regularization for Design (RFD) framework, which is a unifying computationally tractable approach, based on the model matching framework and atomic norm regularization, for the simultaneous co-design of a structured optimal controller and the architecture needed to implement it. Our final result is a contribution to distributed system identification. Traditional system identification techniques such as subspace identification are not computationally scalable, and destroy rather than leverage any a priori information about the system's interconnection structure. We argue that in the context of system identification, an essential building block of any scalable algorithm is the ability to estimate local dynamics within a large interconnected system. To that end we propose a promising heuristic for identifying the dynamics of a subsystem that is still connected to a large system. We exploit the fact that the transfer function of the local dynamics is low-order, but full-rank, while the transfer function of the global dynamics is high-order, but low-rank, to formulate this separation task as a nuclear norm minimization problem. Finally, we conclude with a brief discussion of future research directions, with a particular emphasis on how to incorporate the results of this thesis, and those of optimal control theory in general, into a broader theory of dynamics, control and optimization in layered architectures.