4 results for message authentication code
in CaltechTHESIS
Abstract:
The work presented in this thesis revolves around erasure correction coding, as applied to distributed data storage and real-time streaming communications.
First, we examine the problem of allocating a given storage budget over a set of nodes for maximum reliability. The objective is to find an allocation of the budget that maximizes the probability of successful recovery by a data collector accessing a random subset of the nodes. Despite its simple formulation, this optimization problem is challenging in general because of its combinatorial nature. We study several variations of the problem, assuming different allocation models and access models, and determine the optimal allocation and the optimal symmetric allocation (in which all nonempty nodes store the same amount of data) for a variety of cases. Although the optimal allocation can have nonintuitive structure and can be difficult to find in general, our results suggest a simple heuristic: reliable storage can be achieved by spreading the budget maximally over all nodes when the budget is large, and minimally over a few nodes when it is small. Coding would therefore be beneficial in the former case, while uncoded replication would suffice in the latter.
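To make the access model concrete, here is a minimal Python sketch (an illustration, not code from the thesis) that evaluates an allocation under one common variant: each node is accessed independently with probability p, and recovery succeeds whenever the accessed amounts sum to at least the object size, normalized to 1, as an MDS-style code would allow.

```python
from itertools import combinations

def recovery_probability(allocation, p):
    """P(successful recovery) when each node is accessed independently
    with probability p and recovery needs the accessed amounts to sum
    to >= 1 (MDS-style coding; budget normalized to object size 1)."""
    n = len(allocation)
    prob = 0.0
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            if sum(allocation[i] for i in subset) >= 1:
                prob += p**r * (1 - p)**(n - r)
    return prob

# Budget T = 2 spread over n = 4 nodes, access probability p = 0.6:
print(recovery_probability([0.5, 0.5, 0.5, 0.5], 0.6))  # maximal spreading
print(recovery_probability([1.0, 1.0, 0.0, 0.0], 0.6))  # minimal spreading
```

Sweeping the budget and access probability in this toy model is a quick way to watch the spreading heuristic emerge.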
Second, we study how distributed storage allocations affect the recovery delay in a mobile setting. Specifically, two recovery delay optimization problems are considered for a network of mobile storage nodes: the maximization of the probability of successful recovery by a given deadline, and the minimization of the expected recovery delay. We show that the first problem is closely related to the earlier allocation problem, and solve the second problem completely for the case of symmetric allocations. It turns out that the optimal allocations for the two problems can be quite different. In a simulation study, we evaluated the performance of a simple data dissemination and storage protocol for mobile delay-tolerant networks, and observed that the choice of allocation can have a significant impact on the recovery delay under a variety of scenarios.
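The expected-delay objective can also be explored numerically. The sketch below is a hypothetical Monte Carlo model (the exponential contact process is an assumption made for illustration): a collector meets the storage nodes one by one in random order and recovers the data once the met nodes' stored amounts reach the normalized object size.

```python
import random

def mean_recovery_delay(allocation, rate=1.0, trials=10_000):
    """Monte Carlo estimate of the expected recovery delay: the collector
    meets nodes in a uniformly random order with i.i.d. exponential
    inter-contact times; recovery occurs once met amounts sum to >= 1.
    The allocation must sum to >= 1 so recovery always succeeds."""
    delays = []
    for _ in range(trials):
        order = random.sample(range(len(allocation)), len(allocation))
        t, total = 0.0, 0.0
        for i in order:
            t += random.expovariate(rate)
            total += allocation[i]
            if total >= 1:
                delays.append(t)
                break
    return sum(delays) / len(delays)

print(mean_recovery_delay([0.5, 0.5, 0.5, 0.5]))  # spread allocation
print(mean_recovery_delay([1.0, 1.0, 0.0, 0.0]))  # concentrated allocation
```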
Third, we consider a real-time streaming system where messages created at regular time intervals at a source are encoded for transmission to a receiver over a packet erasure link; the receiver must subsequently decode each message within a given delay from its creation time. For erasure models containing a limited number of erasures per coding window, per sliding window, and containing erasure bursts whose maximum length is sufficiently short or long, we show that a time-invariant intrasession code asymptotically achieves the maximum message size among all codes that allow decoding under all admissible erasure patterns. For the bursty erasure model, we also show that diagonally interleaved codes derived from specific systematic block codes are asymptotically optimal over all codes in certain cases. We also study an i.i.d. erasure model in which each transmitted packet is erased independently with the same probability; the objective is to maximize the decoding probability for a given message size. We derive an upper bound on the decoding probability for any time-invariant code, and show that the gap between this bound and the performance of a family of time-invariant intrasession codes is small when the message size and packet erasure probability are small. In a simulation study, these codes performed well against a family of random time-invariant convolutional codes under a number of scenarios.
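For intuition about the i.i.d. erasure model, consider a simplified intrasession scheme (an assumption for illustration, not the thesis's construction): a message of k symbols is spread uniformly, via an MDS-style code, over the d packets available before its deadline, so it decodes exactly when at least k of those packets arrive.

```python
from math import comb

def decode_probability(k, d, e):
    """P(decoding) when k message symbols are spread over d packets and
    each packet is erased i.i.d. with probability e; decoding succeeds
    when at least k of the d packets arrive (MDS-style assumption)."""
    return sum(comb(d, r) * (1 - e)**r * e**(d - r) for r in range(k, d + 1))

# Deadline of d = 10 packets, erasure probability e = 0.1:
for k in (5, 7, 9):
    print(k, decode_probability(k, 10, 0.1))
```

This makes the message-size/reliability trade-off visible: pushing k toward d raises the rate but sharply lowers the decoding probability.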
Finally, we consider the joint problems of routing and caching for named data networking. We propose a backpressure-based policy that employs virtual interest packets to make routing and caching decisions. In a packet-level simulation, the proposed policy outperformed a basic protocol that combines shortest-path routing with least-recently-used (LRU) cache replacement.
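For reference, the LRU replacement rule used in the baseline takes only a few lines. This sketch (the class name and interface are illustrative) caches named data objects and evicts the least recently requested one when full.

```python
from collections import OrderedDict

class LRUCache:
    """Least-recently-used replacement: on a miss with a full cache,
    evict the object whose most recent request is oldest."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def request(self, name):
        if name in self.store:              # hit: refresh recency
            self.store.move_to_end(name)
            return True
        if len(self.store) >= self.capacity:
            self.store.popitem(last=False)  # evict least recently used
        self.store[name] = True             # cache the fetched object
        return False

cache = LRUCache(2)
for name in ["a", "b", "a", "c", "b"]:
    print(name, cache.request(name))  # only the second "a" is a hit
```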
Abstract:
The study of codes, classically motivated by the need to communicate information reliably in the presence of error, has found new life in fields as diverse as network communication and distributed data storage, and even has connections to the design of linear measurements used in compressive sensing. But in every such context, a code typically exploits the algebraic or geometric structure underlying an application. In this thesis, we examine several problems in coding theory and try to gain some insight into the algebraic structure behind them.
The first is the study of the entropy region: the space of all possible vectors of joint entropies that can arise from a set of discrete random variables. Understanding this region is essentially the key to optimizing network codes for a given network. To this end, we employ a group-theoretic method of constructing random variables that produces so-called "group-characterizable" entropy vectors, which are capable of approximating any point in the entropy region. We show how small groups can be used to produce entropy vectors that violate the Ingleton inequality, a fundamental bound on entropy vectors arising from the random variables involved in linear network codes. We discuss the suitability of these groups for designing network codes that could potentially outperform linear coding.
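To make the construction concrete, here is a small sketch (a toy illustration, not code from the thesis) of the group-characterizable recipe: given subgroups G_1, ..., G_n of a finite group G, the vector h(S) = log2(|G| / |∩_{i∈S} G_i|) is an entropy vector, and the Ingleton expression can be evaluated directly on it. Abelian examples like the one below are known to satisfy Ingleton; violations require nonabelian groups.

```python
from itertools import combinations
from functools import reduce
from math import log2

def entropy_vector(G, subgroups):
    """h(S) = log2(|G| / |intersection of G_i for i in S|), the entropy
    vector characterized by subgroups G_1..G_n of a finite group G."""
    n = len(subgroups)
    return {S: log2(len(G) / len(reduce(lambda A, B: A & B,
                                        (subgroups[i] for i in S))))
            for r in range(1, n + 1) for S in combinations(range(n), r)}

def ingleton_gap(h):
    """Ingleton: I(X1;X2) <= I(X1;X2|X3) + I(X1;X2|X4) + I(X3;X4).
    Returns RHS - LHS; a negative value would be a violation."""
    def H(*S): return h[tuple(sorted(S))]
    lhs = H(0) + H(1) - H(0, 1)
    rhs = (H(0, 2) + H(1, 2) - H(2) - H(0, 1, 2)
           + H(0, 3) + H(1, 3) - H(3) - H(0, 1, 3)
           + H(2) + H(3) - H(2, 3))
    return rhs - lhs

# Toy abelian example: G = Z_6 under addition, with four of its subgroups.
G = frozenset(range(6))
subs = [frozenset({0, 3}), frozenset({0, 2, 4}), frozenset({0}), G]
print(ingleton_gap(entropy_vector(G, subs)))  # nonnegative: no violation
```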
The second topic we discuss is the design of frames with low coherence, closely related to finding spherical codes in which the codewords are unit vectors spaced out around the unit sphere so as to minimize the magnitudes of their mutual inner products. We show how to build frames by selecting a cleverly chosen set of representations of a finite group to produce a "group code" as described by Slepian decades ago. We go on to reinterpret our method as selecting a subset of rows of a group Fourier matrix, allowing us to study and bound our frames' coherences using character theory. We discuss the usefulness of our frames in sparse signal recovery using linear measurements.
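The coherence objective is easy to state in code. This sketch (the specific row choice is illustrative) builds a harmonic frame by keeping a few rows of the Fourier matrix of the cyclic group Z_N, the simplest abelian instance of the group Fourier construction described above, and measures the resulting coherence.

```python
import numpy as np

def coherence(F):
    """Largest |inner product| over distinct pairs of the unit-normalized
    columns of F (each column is one frame vector)."""
    F = F / np.linalg.norm(F, axis=0)
    gram = np.abs(F.conj().T @ F)
    np.fill_diagonal(gram, 0.0)
    return gram.max()

N, rows = 7, [0, 1, 2]  # which rows to keep is the design question
dft = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N)
F = dft[rows, :]        # a 3 x 7 frame: 7 unit vectors in C^3
print(coherence(F))
```

Choosing the rows well (e.g., via difference sets, in the cyclic case) is exactly where the character-theoretic bounds come into play.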
The final problem we investigate is that of coding with constraints, most recently motivated by the demand for ways to encode large amounts of data using error-correcting codes so that any small loss can be recovered from a small set of surviving data. Most often, this involves using a systematic linear error-correcting code in which each parity symbol is constrained to be a function of some subset of the message symbols. We derive bounds on the minimum distance of such a code based on its constraints, and characterize when these bounds can be achieved using subcodes of Reed-Solomon codes.
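As a small worked instance of the setting (the generator matrix is hypothetical), the sketch below brute-forces the minimum distance of a systematic binary code whose parity symbols are each constrained to depend on only two of the three message symbols.

```python
from itertools import product
import numpy as np

def min_distance(G):
    """Brute-force minimum Hamming distance of the binary linear code
    with generator matrix G; feasible only for small dimension k."""
    k, n = G.shape
    best = n
    for msg in product([0, 1], repeat=k):
        if any(msg):
            best = min(best, int(np.mod(np.array(msg) @ G, 2).sum()))
    return best

# Systematic [6,3] code; each parity column touches two message symbols.
G = np.array([[1, 0, 0, 1, 1, 0],
              [0, 1, 0, 1, 0, 1],
              [0, 0, 1, 0, 1, 1]])
print(min_distance(G))  # 3: this constraint pattern caps the distance
```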
Abstract:
Proper encoding of transmitted information can improve the performance of a communication system. To recover the information at the receiver it is necessary to decode the received signal. For many codes, the complexity and slowness of the decoder are so severe that the code is not feasible for practical use. This thesis considers the decoding problem for one such class of codes: the comma-free codes related to the first-order Reed-Muller codes.
A factorization of the code matrix is found which leads to a simple, fast, minimum-memory decoder. The decoder is modular, and only n modules are needed to decode a code of length 2^n. The relevant factorization is extended to any code defined by a sequence of Kronecker products.
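The factorization idea can be illustrated with the fast Hadamard transform, whose n butterfly stages correspond to a Kronecker factorization of the length-2^n Hadamard matrix; this sketch (a modern rendering, not the thesis's design) decodes by correlating against all rows at once and picking the peak.

```python
def fast_hadamard(x):
    """In-place fast Hadamard transform: n butterfly stages for a vector
    of length 2^n, one stage per Kronecker factor of the matrix."""
    x = list(x)
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):
            for j in range(i, i + h):
                x[j], x[j + h] = x[j] + x[j + h], x[j] - x[j + h]
        h *= 2
    return x

# Correlation decoding of a length-8 word in +/-1 form: the index and sign
# of the largest-magnitude coefficient identify the nearest Hadamard row.
r = [1, 1, 1, -1, 1, 1, 1, 1]  # one flipped position
c = fast_hadamard(r)
print(max(range(len(c)), key=lambda i: abs(c[i])), c)  # peak at index 0
```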
The problem of monitoring the correct synchronization position is also considered. A general answer seems to depend upon more detailed knowledge of the structure of comma-free codes. However, a technique is presented which gives useful results in many specific cases.
Abstract:
Part I. Complexes of Biological Bases and Oligonucleotides with RNA
The physical nature of complexes of several biological bases and oligonucleotides with single-stranded ribonucleic acids has been studied by high-resolution proton magnetic resonance spectroscopy. The importance of various forces in the stabilization of these complexes is also discussed.
Previous work has shown that purine forms an intercalated complex with single-stranded nucleic acids. This complex formation led to severe and stereospecific broadening of the purine resonances. From the field dependence of the linewidths, from T1 measurements of the purine protons, and from nuclear Overhauser enhancement experiments, the mechanism for the line broadening was ascertained to be dipole-dipole interactions between the purine protons and the ribose protons of the nucleic acid.
The interactions of ethidium bromide (EB) with several RNA residues have been studied. EB forms vertically stacked aggregates with itself as well as with uridine, 3'-uridine monophosphate and 5'-uridine monophosphate and forms an intercalated complex with uridylyl (3' → 5') uridine and polyuridylic acid (poly U). The geometry of EB in the intercalated complex has also been determined.
The effect of chain length of oligo-A-nucleotides on their mode of interaction with poly U in D2O at neutral pD has also been studied. Below room temperature, ApA and ApApA form a rigid triple-stranded complex involving a stoichiometry of one adenine to two uracil bases, presumably via specific adenine-uracil base pairing and cooperative base stacking of the adenine bases. While no evidence was obtained for the interaction of ApA with poly U above room temperature, ApApA exhibited 1:1 complex formation with poly U by forming Watson-Crick base pairs. The thermodynamics of these systems are discussed.
Part II. Template Recognition and the Degeneracy of the Genetic Code
The interaction of ApApG and poly U was studied as a model system for the codon-anticodon interaction of tRNA and mRNA in vivo. ApApG was shown to interact with poly U below ~20°C. The interaction was 1:1 in nature and exhibited the Hoogsteen bonding scheme. The three bases of ApApG are in an anti conformation, and the guanosine base appears to be in the lactim tautomeric form in the complex.
Because previous models for the degeneracy of the genetic code fail to explain the observed interactions of ApApG with poly U, the "tautomeric doublet" model is proposed as a possible explanation of the degenerate interactions of tRNA with mRNA during protein synthesis in vivo.