4 results for Algebraic path formulation
in CaltechTHESIS
Abstract:
This thesis focuses mainly on linear algebraic aspects of combinatorics. Let N_t(H) be the incidence matrix of edges versus all subhypergraphs of a complete hypergraph that are isomorphic to H. Richard M. Wilson and the author find a general formula for the Smith normal form, or diagonal form, of N_t(H) for all simple graphs H and for a very general class of t-uniform hypergraphs H.
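As a hedged illustration (a toy computation of my own, not taken from the thesis): for an ordinary graph H, the matrix in question can be written down explicitly as a 0/1 incidence matrix of copies of H in a complete graph versus its edges, and its Smith normal form computed directly for small cases, here H = K_3 inside K_5 using SymPy.

```python
from itertools import combinations
from sympy import Matrix, ZZ
from sympy.matrices.normalforms import smith_normal_form

n = 5
edges = list(combinations(range(n), 2))       # edges of the complete graph K_5
triangles = list(combinations(range(n), 3))   # subgraphs isomorphic to H = K_3

# Incidence matrix: rows indexed by copies of H, columns by edges;
# entry is 1 when the edge lies inside that copy of H.
N = Matrix([[1 if set(e) <= set(t) else 0 for e in edges] for t in triangles])

print(smith_normal_form(N, domain=ZZ))
```

The diagonal entries produced this way are the invariant factors whose general formula the thesis establishes.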
As a continuation, the author determines formulas for the diagonal forms of integer matrices obtained from other combinatorial structures, including incidence matrices for subgraphs of a complete bipartite graph and inclusion matrices for multisets.
One major application of diagonal forms is in zero-sum Ramsey theory. For instance, Caro's results on zero-sum Ramsey numbers for graphs and Caro and Yuster's results on zero-sum bipartite Ramsey numbers can be reproduced. These results are further generalized to t-uniform hypergraphs. Other applications include signed bipartite graph designs.
Research results on some other problems are also included in this thesis, such as a Ramsey-type problem on equipartitions, Hartman's conjecture on large sets of designs and a matroid theory problem proposed by Welsh.
Abstract:
The work presented in this thesis revolves around erasure correction coding, as applied to distributed data storage and real-time streaming communications.
First, we examine the problem of allocating a given storage budget over a set of nodes for maximum reliability. The objective is to find an allocation of the budget that maximizes the probability of successful recovery by a data collector accessing a random subset of the nodes. This optimization problem is challenging in general because of its combinatorial nature, despite its simple formulation. We study several variations of the problem, assuming different allocation models and access models, and determine the optimal allocation and the optimal symmetric allocation (in which all nonempty nodes store the same amount of data) for a variety of cases. Although the optimal allocation can have nonintuitive structure and can be difficult to find in general, our results suggest that, as a simple heuristic, reliable storage can be achieved by spreading the budget maximally over all nodes when the budget is large, and spreading it minimally over a few nodes when it is small. Coding would therefore be beneficial in the former case, while uncoded replication would suffice in the latter case.
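A minimal Monte Carlo sketch of the spreading heuristic (a simplified model of my own: a unit-size data object, a fixed budget, and a collector that accesses each node independently with a fixed probability, which is only one of the access models considered):

```python
import random

def success_prob(n=10, budget=2.0, m=10, p=0.6, trials=20000):
    """Symmetric allocation: spread `budget` evenly over m of the n nodes;
    the collector accesses each nonempty node independently with probability p
    and succeeds if it gathers at least 1 unit (the size of the data object)."""
    assert m <= n
    share = budget / m
    hits = 0
    for _ in range(trials):
        accessed = sum(share for _ in range(m) if random.random() < p)
        hits += accessed >= 1.0
    return hits / trials

random.seed(0)
for m in (1, 2, 5, 10):
    print(f"nonempty nodes = {m:2d}, recovery probability ~ {success_prob(m=m):.3f}")
```

Varying the budget and access probability in this sketch makes it easy to compare symmetric allocations of different spreads.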
Second, we study how distributed storage allocations affect the recovery delay in a mobile setting. Specifically, two recovery delay optimization problems are considered for a network of mobile storage nodes: the maximization of the probability of successful recovery by a given deadline, and the minimization of the expected recovery delay. We show that the first problem is closely related to the earlier allocation problem, and solve the second problem completely for the case of symmetric allocations. It turns out that the optimal allocations for the two problems can be quite different. In a simulation study, we evaluated the performance of a simple data dissemination and storage protocol for mobile delay-tolerant networks, and observed that the choice of allocation can have a significant impact on the recovery delay under a variety of scenarios.
Third, we consider a real-time streaming system where messages created at regular time intervals at a source are encoded for transmission to a receiver over a packet erasure link; the receiver must subsequently decode each message within a given delay from its creation time. For erasure models containing a limited number of erasures per coding window, per sliding window, and containing erasure bursts whose maximum length is sufficiently short or long, we show that a time-invariant intrasession code asymptotically achieves the maximum message size among all codes that allow decoding under all admissible erasure patterns. For the bursty erasure model, we also show that diagonally interleaved codes derived from specific systematic block codes are asymptotically optimal over all codes in certain cases. We also study an i.i.d. erasure model in which each transmitted packet is erased independently with the same probability; the objective is to maximize the decoding probability for a given message size. We derive an upper bound on the decoding probability for any time-invariant code, and show that the gap between this bound and the performance of a family of time-invariant intrasession codes is small when the message size and packet erasure probability are small. In a simulation study, these codes performed well against a family of random time-invariant convolutional codes under a number of scenarios.
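To make the i.i.d. erasure model concrete, here is a rough sketch under an assumption of my own (an idealized code that decodes a message whenever at least k of the T packets carrying it within its delay window arrive; the intrasession codes analyzed in the thesis are more structured):

```python
import random

def decoding_prob(k, T, e, trials=50000):
    """Probability that at least k of the T packets in a message's window
    survive an i.i.d. erasure channel with erasure probability e."""
    ok = 0
    for _ in range(trials):
        received = sum(random.random() >= e for _ in range(T))
        ok += received >= k
    return ok / trials

random.seed(0)
print(decoding_prob(k=4, T=8, e=0.1))
```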
Finally, we consider the joint problems of routing and caching for named data networking. We propose a backpressure-based policy that employs virtual interest packets to make routing and caching decisions. In a packet-level simulation, the proposed policy outperformed a basic protocol that combines shortest-path routing with least-recently-used (LRU) cache replacement.
Abstract:
The study of codes, classically motivated by the need to communicate information reliably in the presence of error, has found new life in fields as diverse as network communication and distributed data storage, and even has connections to the design of linear measurements used in compressive sensing. But in all contexts, a code typically involves exploiting the algebraic or geometric structure underlying an application. In this thesis, we examine several problems in coding theory and try to gain some insight into the algebraic structure behind them.
The first is the study of the entropy region, the space of all possible vectors of joint entropies that can arise from a set of discrete random variables. Understanding this region is essentially the key to optimizing network codes for a given network. To this end, we employ a group-theoretic method of constructing random variables that produces so-called "group-characterizable" entropy vectors, which are capable of approximating any point in the entropy region. We show how small groups can be used to produce entropy vectors that violate the Ingleton inequality, a fundamental bound on entropy vectors arising from the random variables involved in linear network codes. We discuss the suitability of these groups for designing codes for networks that could potentially outperform linear coding.
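A minimal sketch of the group-characterizable construction (the toy group and subgroups are my own choice; the thesis needs groups rich enough to violate the Ingleton inequality, which an abelian example like this one cannot do): given a finite group G and subgroups G_1, ..., G_n, the associated entropy vector has entries H(X_S) = log(|G| / |intersection of G_i for i in S|).

```python
from itertools import combinations
from math import log2

def entropy_vector(G, subgroups):
    """H(X_S) = log2(|G| / |intersection of G_i, i in S|) for every
    nonempty subset S of the index set."""
    n = len(subgroups)
    h = {}
    for r in range(1, n + 1):
        for S in combinations(range(n), r):
            inter = set(G)
            for i in S:
                inter &= subgroups[i]
            h[S] = log2(len(G) / len(inter))
    return h

# Toy example: G is the Klein four-group Z_2 x Z_2 with two order-2 subgroups;
# the resulting vector is that of two independent fair bits.
G  = {(0, 0), (0, 1), (1, 0), (1, 1)}
G1 = {(0, 0), (0, 1)}
G2 = {(0, 0), (1, 0)}
print(entropy_vector(G, [G1, G2]))
```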
The second topic we discuss is the design of frames with low coherence, closely related to finding spherical codes in which the codewords are unit vectors spaced out around the unit sphere so as to minimize the magnitudes of their mutual inner products. We show how to build frames by selecting a cleverly chosen set of representations of a finite group to produce a "group code" as described by Slepian decades ago. We go on to reinterpret our method as selecting a subset of rows of a group Fourier matrix, allowing us to study and bound our frames' coherences using character theory. We discuss the usefulness of our frames in sparse signal recovery using linear measurements.
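As a small hedged illustration of the row-selection idea (cyclic-group case only, with a hypothetical row set of my choosing): take the Fourier matrix of Z_7, keep three of its rows, and measure the coherence of the resulting unit-norm frame vectors.

```python
import numpy as np

n = 7
rows = [1, 2, 4]   # a hypothetical row selection (here the quadratic residues mod 7)
F = np.array([[np.exp(2j * np.pi * k * j / n) for j in range(n)] for k in rows])
frame = F / np.sqrt(len(rows))              # n unit-norm frame vectors in C^3 (the columns)
gram = frame.conj().T @ frame
coherence = np.abs(gram - np.eye(n)).max()  # largest off-diagonal inner-product magnitude
print(round(float(coherence), 4))
```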
The final problem we investigate is that of coding with constraints, most recently motivated by the demand for ways to encode large amounts of data using error-correcting codes so that any small loss can be recovered from a small set of surviving data. Most often, this involves using a systematic linear error-correcting code in which each parity symbol is constrained to be a function of some subset of the message symbols. We derive bounds on the minimum distance of such a code based on its constraints, and characterize when these bounds can be achieved using subcodes of Reed-Solomon codes.
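A toy sketch of the setting (the constraint sets and the binary field are my own choices for illustration; the thesis works with subcodes of Reed-Solomon codes over larger fields): each parity symbol is a function of only its allowed message symbols, and the minimum distance of the resulting systematic code can be checked by brute force for small parameters.

```python
from itertools import product
import numpy as np

k = 4
# Hypothetical constraint sets: parity j may depend only on message symbols in sets[j].
sets = [{0, 1}, {1, 2}, {2, 3}, {0, 3}]

P = np.zeros((k, len(sets)), dtype=int)
for j, S in enumerate(sets):
    for i in S:
        P[i, j] = 1                        # parity j = sum (mod 2) of its allowed symbols

G = np.hstack([np.eye(k, dtype=int), P])   # systematic generator matrix [I | P]

dmin = min(int(((np.array(m) @ G) % 2).sum())
           for m in product([0, 1], repeat=k) if any(m))
print("minimum distance:", dmin)
```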
Abstract:
The quasicontinuum (QC) method was introduced to coarse-grain crystalline atomic ensembles in order to bridge the scales from individual atoms to the micro- and mesoscales. Though many QC formulations have been proposed with varying characteristics and capabilities, a crucial cornerstone of all QC techniques is the concept of summation rules, which attempt to efficiently approximate the total Hamiltonian of a crystalline atomic ensemble by a weighted sum over a small subset of atoms. In this work we propose a novel, fully nonlocal, energy-based formulation of the QC method with support for legacy and new summation rules through a general energy-sampling scheme. Our formulation does not conceptually differentiate between atomistic and coarse-grained regions and thus allows for seamless bridging without domain-coupling interfaces. Within this structure, we introduce a new class of summation rules which leverage the affine kinematics of this QC formulation to most accurately integrate thermodynamic quantities of interest. By comparing this new class of summation rules to commonly employed rules through analysis of energy and spurious force errors, we find that the new rules produce no residual or spurious force artifacts in the large-element limit under arbitrary affine deformation, while allowing us to seamlessly bridge to full atomistics. We verify that the new summation rules exhibit significantly smaller force artifacts and energy approximation errors than all comparable previous summation rules through a comprehensive suite of examples with spatially non-uniform QC discretizations in two and three dimensions. Due to the unique structure of these summation rules, we also use the new formulation to study scenarios with large regions of free surface, a class of problems previously out of reach of the QC method. Lastly, we present the key components of a high-performance, distributed-memory realization of the new method, including a novel algorithm for supporting unparalleled levels of deformation. Overall, this new formulation and implementation allow us to efficiently perform simulations containing an unprecedented number of degrees of freedom with low approximation error.
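A schematic one-dimensional illustration of the summation-rule idea (a drastic simplification of my own, not the formulation proposed in the thesis): the exact Hamiltonian of a harmonic chain is compared against a weighted sum of site energies over a sparse set of sampling atoms. Under the uniform (affine) stretch used here the interior site energies are identical, so almost all of the residual discrepancy comes from the two surface atoms, which hints at why free surfaces need special care.

```python
import numpy as np

n, r0, k, stretch = 1001, 1.0, 1.0, 1.05
x = stretch * r0 * np.arange(n)          # a uniformly (affinely) stretched chain

def site_energy(i):
    """Energy attributed to atom i: half of each adjacent harmonic bond."""
    e = 0.0
    if i > 0:
        e += 0.25 * k * (x[i] - x[i - 1] - r0) ** 2
    if i < n - 1:
        e += 0.25 * k * (x[i + 1] - x[i] - r0) ** 2
    return e

E_full = sum(site_energy(i) for i in range(n))           # exact Hamiltonian

stride = 50
samples = list(range(0, n, stride))                      # sampling atoms
weight = n / len(samples)                                # each one stands in for `weight` atoms
E_sampled = weight * sum(site_energy(i) for i in samples)

print(f"full sum: {E_full:.4f}   sampled sum: {E_sampled:.4f}")
```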