6 resultados para Decoding
em CaltechTHESIS
Resumo:
Proper encoding of transmitted information can improve the performance of a communication system. To recover the information at the receiver it is necessary to decode the received signal. For many codes the complexity and slowness of the decoder is so severe that the code is not feasible for practical use. This thesis considers the decoding problem for one such class of codes, the comma-free codes related to the first-order Reed-Muller codes.
A factorization of the code matrix is found which leads to a simple, fast, minimum memory, decoder. The decoder is modular and only n modules are needed to decode a code of length 2n. The relevant factorization is extended to any code defined by a sequence of Kronecker products.
The problem of monitoring the correct synchronization position is also considered. A general answer seems to depend upon more detailed knowledge of the structure of comma-free codes. However, a technique is presented which gives useful results in many specific cases.
Resumo:
Storage systems are widely used and have played a crucial rule in both consumer and industrial products, for example, personal computers, data centers, and embedded systems. However, such system suffers from issues of cost, restricted-lifetime, and reliability with the emergence of new systems and devices, such as distributed storage and flash memory, respectively. Information theory, on the other hand, provides fundamental bounds and solutions to fully utilize resources such as data density, information I/O and network bandwidth. This thesis bridges these two topics, and proposes to solve challenges in data storage using a variety of coding techniques, so that storage becomes faster, more affordable, and more reliable.
We consider the system level and study the integration of RAID schemes and distributed storage. Erasure-correcting codes are the basis of the ubiquitous RAID schemes for storage systems, where disks correspond to symbols in the code and are located in a (distributed) network. Specifically, RAID schemes are based on MDS (maximum distance separable) array codes that enable optimal storage and efficient encoding and decoding algorithms. With r redundancy symbols an MDS code can sustain r erasures. For example, consider an MDS code that can correct two erasures. It is clear that when two symbols are erased, one needs to access and transmit all the remaining information to rebuild the erasures. However, an interesting and practical question is: What is the smallest fraction of information that one needs to access and transmit in order to correct a single erasure? In Part I we will show that the lower bound of 1/2 is achievable and that the result can be generalized to codes with arbitrary number of parities and optimal rebuilding.
We consider the device level and study coding and modulation techniques for emerging non-volatile memories such as flash memory. In particular, rank modulation is a novel data representation scheme proposed by Jiang et al. for multi-level flash memory cells, in which a set of n cells stores information in the permutation induced by the different charge levels of the individual cells. It eliminates the need for discrete cell levels, as well as overshoot errors, when programming cells. In order to decrease the decoding complexity, we propose two variations of this scheme in Part II: bounded rank modulation where only small sliding windows of cells are sorted to generated permutations, and partial rank modulation where only part of the n cells are used to represent data. We study limits on the capacity of bounded rank modulation and propose encoding and decoding algorithms. We show that overlaps between windows will increase capacity. We present Gray codes spanning all possible partial-rank states and using only ``push-to-the-top'' operations. These Gray codes turn out to solve an open combinatorial problem called universal cycle, which is a sequence of integers generating all possible partial permutations.
Resumo:
The work presented in this thesis revolves around erasure correction coding, as applied to distributed data storage and real-time streaming communications.
First, we examine the problem of allocating a given storage budget over a set of nodes for maximum reliability. The objective is to find an allocation of the budget that maximizes the probability of successful recovery by a data collector accessing a random subset of the nodes. This optimization problem is challenging in general because of its combinatorial nature, despite its simple formulation. We study several variations of the problem, assuming different allocation models and access models, and determine the optimal allocation and the optimal symmetric allocation (in which all nonempty nodes store the same amount of data) for a variety of cases. Although the optimal allocation can have nonintuitive structure and can be difficult to find in general, our results suggest that, as a simple heuristic, reliable storage can be achieved by spreading the budget maximally over all nodes when the budget is large, and spreading it minimally over a few nodes when it is small. Coding would therefore be beneficial in the former case, while uncoded replication would suffice in the latter case.
Second, we study how distributed storage allocations affect the recovery delay in a mobile setting. Specifically, two recovery delay optimization problems are considered for a network of mobile storage nodes: the maximization of the probability of successful recovery by a given deadline, and the minimization of the expected recovery delay. We show that the first problem is closely related to the earlier allocation problem, and solve the second problem completely for the case of symmetric allocations. It turns out that the optimal allocations for the two problems can be quite different. In a simulation study, we evaluated the performance of a simple data dissemination and storage protocol for mobile delay-tolerant networks, and observed that the choice of allocation can have a significant impact on the recovery delay under a variety of scenarios.
Third, we consider a real-time streaming system where messages created at regular time intervals at a source are encoded for transmission to a receiver over a packet erasure link; the receiver must subsequently decode each message within a given delay from its creation time. For erasure models containing a limited number of erasures per coding window, per sliding window, and containing erasure bursts whose maximum length is sufficiently short or long, we show that a time-invariant intrasession code asymptotically achieves the maximum message size among all codes that allow decoding under all admissible erasure patterns. For the bursty erasure model, we also show that diagonally interleaved codes derived from specific systematic block codes are asymptotically optimal over all codes in certain cases. We also study an i.i.d. erasure model in which each transmitted packet is erased independently with the same probability; the objective is to maximize the decoding probability for a given message size. We derive an upper bound on the decoding probability for any time-invariant code, and show that the gap between this bound and the performance of a family of time-invariant intrasession codes is small when the message size and packet erasure probability are small. In a simulation study, these codes performed well against a family of random time-invariant convolutional codes under a number of scenarios.
Finally, we consider the joint problems of routing and caching for named data networking. We propose a backpressure-based policy that employs virtual interest packets to make routing and caching decisions. In a packet-level simulation, the proposed policy outperformed a basic protocol that combines shortest-path routing with least-recently-used (LRU) cache replacement.
Resumo:
Waking up from a dreamless sleep, I open my eyes, recognize my wife’s face and am filled with joy. In this thesis, I used functional Magnetic Resonance Imaging (fMRI) to gain insights into the mechanisms involved in this seemingly simple daily occurrence, which poses at least three great challenges to neuroscience: how does conscious experience arise from the activity of the brain? How does the brain process visual input to the point of recognizing individual faces? How does the brain store semantic knowledge about people that we know? To start tackling the first question, I studied the neural correlates of unconscious processing of invisible faces. I was unable to image significant activations related to the processing of completely invisible faces, despite existing reports in the literature. I thus moved on to the next question and studied how recognition of a familiar person was achieved in the brain; I focused on finding invariant representations of person identity – representations that would be activated any time we think of a familiar person, read their name, see their picture, hear them talk, etc. There again, I could not find significant evidence for such representations with fMRI, even in regions where they had previously been found with single unit recordings in human patients (the Jennifer Aniston neurons). Faced with these null outcomes, the scope of my investigations eventually turned back towards the technique that I had been using, fMRI, and the recently praised analytical tools that I had been trusting, Multivariate Pattern Analysis. After a mostly disappointing attempt at replicating a strong single unit finding of a categorical response to animals in the right human amygdala with fMRI, I put fMRI decoding to an ultimate test with a unique dataset acquired in the macaque monkey. There I showed a dissociation between the ability of fMRI to pick up face viewpoint information and its inability to pick up face identity information, which I mostly traced back to the poor clustering of identity selective units. Though fMRI decoding is a powerful new analytical tool, it does not rid fMRI of its inherent limitations as a hemodynamics-based measure.
Resumo:
Curve samplers are sampling algorithms that proceed by viewing the domain as a vector space over a finite field, and randomly picking a low-degree curve in it as the sample. Curve samplers exhibit a nice property besides the sampling property: the restriction of low-degree polynomials over the domain to the sampled curve is still low-degree. This property is often used in combination with the sampling property and has found many applications, including PCP constructions, local decoding of codes, and algebraic PRG constructions.
The randomness complexity of curve samplers is a crucial parameter for its applications. It is known that (non-explicit) curve samplers using O(log N + log(1/δ)) random bits exist, where N is the domain size and δ is the confidence error. The question of explicitly constructing randomness-efficient curve samplers was first raised in [TU06] where they obtained curve samplers with near-optimal randomness complexity.
In this thesis, we present an explicit construction of low-degree curve samplers with optimal randomness complexity (up to a constant factor) that sample curves of degree (m logq(1/δ))O(1) in Fqm. Our construction is a delicate combination of several components, including extractor machinery, limited independence, iterated sampling, and list-recoverable codes.
Resumo:
The problem of global optimization of M phase-incoherent signals in N complex dimensions is formulated. Then, by using the geometric approach of Landau and Slepian, conditions for optimality are established for N = 2 and the optimal signal sets are determined for M = 2, 3, 4, 6, and 12.
The method is the following: The signals are assumed to be equally probable and to have equal energy, and thus are represented by points ṡi, i = 1, 2, …, M, on the unit sphere S1 in CN. If Wik is the halfspace determined by ṡi and ṡk and containing ṡi, i.e. Wik = {ṙϵCN:| ≥ | ˂ṙ, ṡk˃|}, then the Ʀi = ∩/k≠i Wik, i = 1, 2, …, M, the maximum likelihood decision regions, partition S1. For additive complex Gaussian noise ṅ and a received signal ṙ = ṡiejϴ + ṅ, where ϴ is uniformly distributed over [0, 2π], the probability of correct decoding is PC = 1/πN ∞/ʃ/0 r2N-1e-(r2+1)U(r)dr, where U(r) = 1/M M/Ʃ/i=1 Ʀi ʃ/∩ S1 I0(2r | ˂ṡ, ṡi˃|)dσ(ṡ), and r = ǁṙǁ.
For N = 2, it is proved that U(r) ≤ ʃ/Cα I0(2r|˂ṡ, ṡi˃|)dσ(ṡ) – 2K/M. h(1/2K [Mσ(Cα)-σ(S1)]), where Cα = {ṡϵS1:|˂ṡ, ṡi˃| ≥ α}, K is the total number of boundaries of the net on S1 determined by the decision regions, and h is the strictly increasing strictly convex function of σ(Cα∩W), (where W is a halfspace not containing ṡi), given by h = ʃ/Cα∩W I0 (2r|˂ṡ, ṡi˃|)dσ(ṡ). Conditions for equality are established and these give rise to the globally optimal signal sets for M = 2, 3, 4, 6, and 12.