997 results for Distributed coding
Abstract:
We consider the problem of distributed joint source-channel coding of correlated Gaussian sources over a Gaussian multiple access channel (MAC). Side information may be available at the encoders and/or at the decoder. First, we specialize a general result in [16] to obtain sufficient conditions for reliable transmission over a Gaussian MAC. Source-channel separation does not hold for this system, so we then study and compare three joint source-channel coding schemes available in the literature.
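As an illustrative sketch of why separation fails on a Gaussian MAC (this is not the paper's sufficient condition, which also accounts for the side information and the result specialized from [16]): source correlation lets the encoders produce correlated channel inputs that add coherently at the receiver, enlarging the achievable sum rate beyond what independent, separation-based codebooks allow. With input powers P1 and P2, noise variance N, and input correlation rho,

```latex
\begin{align*}
\text{independent inputs (separation):}\quad
  R_1 + R_2 &\le \tfrac{1}{2}\log_2\!\left(1 + \frac{P_1 + P_2}{N}\right),\\[2pt]
\text{correlated inputs (joint coding):}\quad
  R_1 + R_2 &\le \tfrac{1}{2}\log_2\!\left(1 + \frac{P_1 + P_2 + 2\rho\sqrt{P_1 P_2}}{N}\right),
\end{align*}
```

and the second bound strictly exceeds the first whenever rho > 0.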
Abstract:
This paper considers sequential hypothesis testing in a decentralized framework. We start with two simple decentralized sequential hypothesis testing algorithms, one of which is later proved to be asymptotically Bayes optimal. We also consider composite versions of decentralized sequential hypothesis testing, and develop a novel nonparametric version based on universal source coding theory. Finally, we design a simple decentralized multihypothesis sequential detection algorithm.
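The abstract does not spell out the two algorithms; as a hedged illustration of the general decentralized pattern only (the sensor count, thresholds, and one-bit quantizer below are assumptions for the example, not the paper's design), each sensor sends a coarsely quantized local log-likelihood ratio and a fusion center runs a sequential probability ratio test on the accumulated messages:

```python
import numpy as np

def one_bit_quantize(llr):
    """Each sensor sends only the sign of its local log-likelihood ratio."""
    return 1 if llr >= 0 else -1

def fusion_sprt(sensor_llr_streams, a=5.0, b=-5.0, delta=1.0):
    """Toy decentralized sequential test: the fusion center accumulates a score
    driven by the sensors' 1-bit messages and stops when it crosses the upper
    threshold a (decide H1) or the lower threshold b (decide H0)."""
    score = 0.0
    for t, llrs in enumerate(zip(*sensor_llr_streams), start=1):
        score += delta * sum(one_bit_quantize(l) for l in llrs)
        if score >= a:
            return "H1", t
        if score <= b:
            return "H0", t
    return "undecided", t

# Example: 3 sensors observing Gaussian data, H1: mean 0.5 vs H0: mean 0.
rng = np.random.default_rng(0)
x = rng.normal(0.5, 1.0, size=(3, 1000))   # data generated under H1
llr = 0.5 * x - 0.125                      # per-sample Gaussian log-likelihood ratio
print(fusion_sprt(list(llr)))              # typically decides H1 after a few samples
```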
Abstract:
Erasure codes are an efficient means of storing data across a network in comparison to data replication, as they tend to reduce the amount of data stored in the network and offer increased resilience in the presence of node failures. These codes perform poorly, however, when a failed node must be repaired, as they typically require the entire file to be downloaded to rebuild a single node. A new class of erasure codes, termed regenerating codes, was recently introduced and does much better in this respect. However, given the variety of efficient erasure codes available in the literature, there is considerable interest in coding schemes that enable traditional erasure codes to be used while retaining the feature that only a fraction of the data need be downloaded for node repair. In this paper, we present a simple yet powerful framework that does precisely this. Under this framework, the nodes are partitioned into two types and encoded using two codes in a manner that reduces the problem of node repair to that of erasure decoding of the constituent codes. Depending upon the choice of the two codes, the framework can be used to obtain one or more of the following advantages: simultaneous minimization of storage space and repair bandwidth, low complexity of operation, fewer disk reads at helper nodes during repair, and error detection and correction.
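The abstract does not reproduce the two-code construction; the toy below (a single-parity XOR code chosen purely for illustration, not the paper's framework) only demonstrates the core principle that repairing a failed node can be cast as erasure decoding of a constituent code:

```python
import numpy as np

def encode(data_blocks):
    """Systematic single-parity (XOR) code with n = k + 1 nodes:
    node i < k stores data_blocks[i]; node k stores the XOR parity."""
    parity = np.bitwise_xor.reduce(data_blocks, axis=0)
    return list(data_blocks) + [parity]

def repair(nodes, failed):
    """Repairing any one node is erasure decoding: XOR the survivors."""
    survivors = [blk for i, blk in enumerate(nodes) if i != failed]
    return np.bitwise_xor.reduce(survivors, axis=0)

# Toy example with k = 3 data nodes of 4 bytes each.
data = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]], dtype=np.uint8)
nodes = encode(data)
for failed in range(len(nodes)):
    assert np.array_equal(repair(nodes, failed), nodes[failed])
print("every single-node failure repaired by erasure decoding")
```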
Abstract:
Storage systems are widely used and play a crucial role in both consumer and industrial products, for example personal computers, data centers, and embedded systems. However, such systems face issues of cost, restricted lifetime, and reliability with the emergence of new architectures and devices such as distributed storage and flash memory. Information theory, on the other hand, provides fundamental bounds and solutions for fully utilizing resources such as data density, information I/O, and network bandwidth. This thesis bridges these two topics and proposes to solve challenges in data storage using a variety of coding techniques, so that storage becomes faster, more affordable, and more reliable.
We consider the system level and study the integration of RAID schemes and distributed storage. Erasure-correcting codes are the basis of the ubiquitous RAID schemes for storage systems, where disks correspond to symbols in the code and are located in a (distributed) network. Specifically, RAID schemes are based on MDS (maximum distance separable) array codes that enable optimal storage and efficient encoding and decoding algorithms. With r redundancy symbols, an MDS code can sustain r erasures. For example, consider an MDS code that can correct two erasures. When two symbols are erased, one clearly needs to access and transmit all the remaining information to rebuild them. An interesting and practical question, however, is: what is the smallest fraction of information that one needs to access and transmit in order to correct a single erasure? In Part I we show that the lower bound of 1/2 is achievable and that the result generalizes to codes with an arbitrary number of parities and optimal rebuilding.
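As a hedged sketch of where the 1/2 figure comes from, under the standard repair-bandwidth setting in which all n-1 surviving nodes act as helpers with equal downloads (the thesis' exact access model may differ): an (n, k) MDS code storing a file of M symbols places M/k symbols per node, and the minimum data that must be read and transferred to rebuild one node satisfies

```latex
\[
  \gamma_{\min} \;=\; \frac{n-1}{n-k}\cdot\frac{M}{k},
  \qquad\text{so}\qquad
  \frac{\gamma_{\min}}{(n-1)\,M/k} \;=\; \frac{1}{n-k} \;=\; \frac{1}{r},
\]
```

which equals 1/2 for r = 2 parities and generalizes to a fraction of 1/r for r parities.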
We consider the device level and study coding and modulation techniques for emerging non-volatile memories such as flash memory. In particular, rank modulation is a novel data representation scheme proposed by Jiang et al. for multi-level flash memory cells, in which a set of n cells stores information in the permutation induced by the different charge levels of the individual cells. It eliminates the need for discrete cell levels, as well as overshoot errors, when programming cells. In order to decrease the decoding complexity, we propose two variations of this scheme in Part II: bounded rank modulation, where only small sliding windows of cells are sorted to generate permutations, and partial rank modulation, where only some of the n cells are used to represent data. We study limits on the capacity of bounded rank modulation and propose encoding and decoding algorithms, showing that overlaps between windows increase capacity. We also present Gray codes spanning all possible partial-rank states and using only ``push-to-the-top'' operations. These Gray codes turn out to solve an open combinatorial problem on universal cycles, i.e., sequences of integers that generate all possible partial permutations.
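As a minimal sketch of the basic rank-modulation idea (the cell counts, charge values, and margin below are illustrative, not the thesis' bounded or partial variants): data is read off as the permutation induced by sorting the cells' analog charge levels, and programming uses only ``push-to-the-top'' operations that raise one cell above all others.

```python
import numpy as np

def induced_permutation(charges):
    """Rank-modulation readout: the stored symbol is the permutation of
    cell indices ordered from highest to lowest charge."""
    return tuple(int(i) for i in np.argsort(charges)[::-1])

def push_to_top(charges, cell, margin=1.0):
    """'Push-to-the-top': raise one cell's charge above the current maximum.
    This is the only programming primitive rank modulation needs."""
    charges = charges.copy()
    charges[cell] = charges.max() + margin
    return charges

# Toy example with n = 4 cells.
charges = np.array([0.3, 2.1, 1.4, 0.9])
print(induced_permutation(charges))        # (1, 2, 3, 0)
charges = push_to_top(charges, cell=3)
print(induced_permutation(charges))        # cell 3 now ranks first: (3, 1, 2, 0)
```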
Abstract:
The study of codes, classically motivated by the need to communicate information reliably in the presence of error, has found new life in fields as diverse as network communication and distributed storage of data, and even connects to the design of linear measurements used in compressive sensing. In all of these contexts, a code typically exploits the algebraic or geometric structure underlying the application. In this thesis, we examine several problems in coding theory and try to gain some insight into the algebraic structure behind them.
The first is the study of the entropy region - the space of all possible vectors of joint entropies which can arise from a set of discrete random variables. Understanding this region is essentially the key to optimizing network codes for a given network. To this end, we employ a group-theoretic method of constructing random variables producing so-called "group-characterizable" entropy vectors, which are capable of approximating any point in the entropy region. We show how small groups can be used to produce entropy vectors which violate the Ingleton inequality, a fundamental bound on entropy vectors arising from the random variables involved in linear network codes. We discuss the suitability of these groups to design codes for networks which could potentially outperform linear coding.
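For reference, the Ingleton inequality in its standard mutual-information form for four random variables (a generic statement, not a result of this thesis) is

```latex
\[
  I(X_1;X_2) \;\le\; I(X_1;X_2 \mid X_3) \;+\; I(X_1;X_2 \mid X_4) \;+\; I(X_3;X_4).
\]
```

Entropy vectors arising from linear network codes satisfy this bound, so group-characterizable vectors built from suitable non-abelian groups that violate it point toward networks where nonlinear codes could outperform linear ones.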
The second topic we discuss is the design of frames with low coherence, closely related to finding spherical codes in which the codewords are unit vectors spaced out around the unit sphere so as to minimize the magnitudes of their mutual inner products. We show how to build frames by selecting a cleverly chosen set of representations of a finite group to produce a "group code" as described by Slepian decades ago. We go on to reinterpret our method as selecting a subset of rows of a group Fourier matrix, allowing us to study and bound our frames' coherences using character theory. We discuss the usefulness of our frames in sparse signal recovery using linear measurements.
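A minimal sketch of the coherence criterion itself (the generic definition, not the group-theoretic construction of the thesis): for a frame whose unit-norm vectors form the columns of a matrix F, the coherence is the largest off-diagonal entry of the Gram matrix in magnitude, and the Welch bound is the benchmark it is compared against.

```python
import numpy as np

def coherence(F):
    """Coherence of a frame: max |<f_i, f_j>| over distinct unit-norm columns."""
    F = F / np.linalg.norm(F, axis=0, keepdims=True)   # normalize columns
    G = np.abs(F.conj().T @ F)                         # Gram matrix magnitudes
    np.fill_diagonal(G, 0.0)                           # ignore self inner products
    return G.max()

# Example: a random 8x32 complex frame compared against the Welch lower bound
# sqrt((N - d) / (d (N - 1))) for d = 8, N = 32.
rng = np.random.default_rng(1)
F = rng.normal(size=(8, 32)) + 1j * rng.normal(size=(8, 32))
d, N = F.shape
welch = np.sqrt((N - d) / (d * (N - 1)))
print(f"coherence = {coherence(F):.3f}, Welch bound = {welch:.3f}")
```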
The final problem we investigate is that of coding with constraints, most recently motivated by the demand for ways to encode large amounts of data using error-correcting codes so that any small loss can be recovered from a small set of surviving data. Most often, this involves using a systematic linear error-correcting code in which each parity symbol is constrained to be a function of some subset of the message symbols. We derive bounds on the minimum distance of such a code based on its constraints, and characterize when these bounds can be achieved using subcodes of Reed-Solomon codes.
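To make the constrained setting concrete, here is a toy binary example (an assumption for illustration, not the thesis' Reed-Solomon subcode construction): each parity column of a systematic generator matrix is supported only on its allowed subset of message symbols, and for small codes the resulting minimum distance can be checked by brute force.

```python
import itertools
import numpy as np

def systematic_generator(k, parity_supports):
    """Build a binary systematic generator matrix [I | P] where parity j
    may only depend on the message coordinates in parity_supports[j]."""
    P = np.zeros((k, len(parity_supports)), dtype=int)
    for j, support in enumerate(parity_supports):
        P[list(support), j] = 1          # toy choice: XOR of the allowed symbols
    return np.hstack([np.eye(k, dtype=int), P])

def minimum_distance(G):
    """Brute-force minimum Hamming weight over the nonzero codewords."""
    k = G.shape[0]
    weights = [int(((np.array(m) @ G) % 2).sum())
               for m in itertools.product([0, 1], repeat=k) if any(m)]
    return min(weights)

# Example: k = 4 message bits, two parities constrained to disjoint halves.
G = systematic_generator(4, parity_supports=[{0, 1}, {2, 3}])
print(minimum_distance(G))   # the constraints limit the distance (here 2)
```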
Abstract:
We report a study of a novel technique for adaptive image sequence coding. The number of reference frames and the intervals between them are adjusted to improve the temporal compensability of the input video, and bits are distributed more efficiently among the different frame types according to the temporal and spatial complexity of the scene. Experimental results show that this dynamic group-of-pictures (GOP) structure coding scheme is not only feasible but also better than the conventional fixed-GOP method in terms of perceptual quality and SNR. (C) 1996 Society of Photo-Optical Instrumentation Engineers.
Abstract:
The family Cyprinidae is one of the largest fish families in the world, widely distributed in East Asia, with obvious differences in characteristic body size among species. Phylogenetic analysis of cyprinid taxa based on functionally important genes can help to understand the speciation and functional divergence of the Cyprinidae. The c-myc gene is an important gene regulating individual growth. In the present study, the sequence variations of the cyprinid c-myc gene and their phylogenetic significance were analyzed. Forty-one complete c-myc gene sequences were obtained from cyprinids and outgroups through PCR amplification and cloning. The coding DNA sequences of the c-myc gene were used to infer molecular phylogenetic relationships within the Cyprinidae. Myxocyprinus asiaticus (Catostomidae), Misgurnus anguillicaudatus (Cobitidae) and Hemimyzon sinensis (Homalopteridae) were assigned as outgroup taxa. Phylogenetic analyses using maximum parsimony (MP), maximum likelihood (ML), and Bayesian methods retrieved similar topologies. Within the Cyprinidae, Leuciscini and Barbini each formed a monophyletic lineage with high nodal support. Leuciscini comprises Xenocyprinae, Cultrinae, East Asian species of Leuciscinae and Danioninae, Gobioninae, and Acheilognathinae, while Barbini contains Schizothoracinae, Barbinae, Cyprininae and Labeoninae. Danio rerio, D. myersi and Rasbora trilineata appeared to be separate from Leuciscinae and Barbini, forming another lineage; the positions of some Danioninae species remained unresolved. Analyses of both parsimony-informative amino acid variation and two highly variable regions indicated no correlation between single amino acid variation or the highly variable regions and the characteristic body size of cyprinids. In addition, species with smaller body size were usually found to be basal within clades of the tree, which might be the result of adaptation to primitive ecology and survival pressure.
Abstract:
Recognition of objects in complex visual scenes is greatly simplified by the ability to segment features belonging to different objects while grouping features belonging to the same object. This feature-binding process can be driven by the local relations between visual contours. The standard method for implementing this process with neural networks uses a temporal code to bind features together. I propose a spatial coding alternative for the dynamic binding of visual contours, and demonstrate the spatial coding method for segmenting an image consisting of three overlapping objects.
Abstract:
A new neural network architecture for spatial pattern recognition using multi-scale pyramidal coding is described. The network has an ARTMAP structure with a new class of ART module, called the Hybrid ART module, as its front-end processor. The Hybrid ART module, which has processing modules corresponding to each scale channel of the multi-scale pyramid, employs finer-scale channels only when they are necessary to discriminate a pattern from others. This process is effected by serial match tracking. Parallel match tracking is also used to select the spatial location with the most salient feature and to limit attention to that part.
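As a minimal sketch of the multi-scale pyramid front end only (the Hybrid ART module and match tracking are not reproduced; the average-pooling scheme is an assumption for illustration): each scale channel is a progressively downsampled copy of the input pattern, and the recognizer consults the finer channels only when the coarser ones are ambiguous.

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging non-overlapping 2x2 blocks."""
    h, w = img.shape
    return img[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid(img, levels=3):
    """Multi-scale pyramidal coding: level 0 is the input pattern,
    each subsequent level is a coarser scale channel."""
    channels = [img.astype(float)]
    for _ in range(levels - 1):
        channels.append(downsample(channels[-1]))
    return channels

# Toy 8x8 binary pattern; coarser channels summarize it at lower resolution.
pattern = (np.arange(64).reshape(8, 8) % 5 == 0).astype(float)
for level, channel in enumerate(pyramid(pattern)):
    print(f"scale {level}: shape {channel.shape}")
```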
Abstract:
The main topic of this thesis is the problem of interference cancellation for multi-user systems with distributed antennas. Accordingly, we begin with an overview of the main properties of a distributed antenna system. This description includes an analytical study of the impact of connecting the system's users to additional distributed antennas. The analysis shows that the most important system property for obtaining the maximum gain from connecting additional transmit antennas is spatial symmetry, and that users at the cell edges benefit the most. These results are confirmed by simulation. The multi-user interference cancellation problem is considered both for the one-dimensional case (i.e., without coding) and for the multidimensional case (i.e., with coding). For the one-dimensional case, a nonlinear precoding algorithm is proposed and evaluated, with the objective of minimizing the bit error rate. Both the single-carrier and multi-carrier cases are addressed, as well as the co-located and distributed antenna scenarios. It is shown that the proposed scheme can be viewed as an extension of the well-known zero-forcing scheme, whose performance is proved to be a lower bound for the generalized scheme. The algorithm is evaluated for different scenarios by simulation, which indicates near-optimal performance with low complexity. For the multidimensional case, a scheme for performing binary "dirty paper coding" based on two-layer codes is proposed. In developing this scheme, lossy compression of information is considered as a subproblem. Simulation results indicate reliable transmission close to the Shannon limit.
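A minimal sketch of the zero-forcing baseline that the proposed precoder generalizes (a generic textbook formulation; the antenna configuration, BPSK modulation, and omission of power normalization are simplifications for illustration): the precoder inverts the multi-user channel so that each user receives only its own symbol plus noise.

```python
import numpy as np

rng = np.random.default_rng(2)
n_users, n_antennas = 4, 4

# Flat-fading multi-user downlink: row u of H is user u's channel vector.
H = (rng.normal(size=(n_users, n_antennas))
     + 1j * rng.normal(size=(n_users, n_antennas))) / np.sqrt(2)

# Zero-forcing precoder: right pseudo-inverse of H, so that H @ W = I and
# each user sees only its own symbol (power normalization omitted for brevity).
W = H.conj().T @ np.linalg.inv(H @ H.conj().T)

symbols = rng.choice([-1.0, 1.0], size=n_users)   # one BPSK symbol per user
x = W @ symbols                                   # signal on the (distributed) antennas
noise = 0.05 * (rng.normal(size=n_users) + 1j * rng.normal(size=n_users))
y = H @ x + noise                                 # one received sample per user

print(np.all(np.sign(y.real) == symbols))         # True: interference removed
```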
Abstract:
Recently, several distributed video coding (DVC) solutions based on the distributed source coding (DSC) paradigm have appeared in the literature. Wyner-Ziv (WZ) video coding, a particular case of DVC where side information is made available at the decoder, enables a flexible distribution of the computational complexity between the encoder and decoder, promising to fulfill novel requirements from applications such as video surveillance, sensor networks and mobile camera phones. The quality of the side information at the decoder plays a critical role in determining the WZ video coding rate-distortion (RD) performance, notably in raising it to a level as close as possible to the RD performance of standard predictive video coding schemes. Towards this target, efficient motion search algorithms for powerful frame interpolation are much needed at the decoder. In this paper, the RD performance of a Wyner-Ziv video codec is improved by using novel, advanced motion-compensated frame interpolation techniques to generate the side information. The development of this type of side information estimator is a difficult problem in WZ video coding, especially because the decoder only has some reference, decoded frames available. Based on the regularization of the motion field, novel side information creation techniques are proposed in this paper, along with a new frame interpolation framework able to generate higher-quality side information at the decoder. To illustrate the RD performance improvements, this novel side information creation framework has been integrated in a transform-domain, turbo-coding-based Wyner-Ziv video codec. Experimental results show that the novel side information creation solution leads to better RD performance than available state-of-the-art side information estimators, with improvements of up to 2 dB; moreover, it allows outperforming H.264/AVC Intra by up to 3 dB with a lower encoding complexity.
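A minimal sketch of decoder-side side-information creation by motion-compensated frame interpolation (the block size, search range, and symmetric-motion assumption are illustrative choices, not the paper's regularized estimator): motion is estimated between the two decoded key frames and the Wyner-Ziv frame is predicted at the temporal midpoint.

```python
import numpy as np

def create_side_info(prev, next_, block=8, search=4):
    """Toy side-information creation: for each block of the frame to be
    interpolated, find a symmetric motion vector (dy, dx) such that prev
    shifted by (+dy, +dx) matches next_ shifted by (-dy, -dx), then average
    the two motion-compensated predictions."""
    h, w = prev.shape
    prev, next_ = prev.astype(float), next_.astype(float)
    side_info = np.zeros((h, w))
    for by in range(0, h, block):
        for bx in range(0, w, block):
            best, best_pred = np.inf, None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    y1, x1 = by + dy, bx + dx        # block position in prev
                    y2, x2 = by - dy, bx - dx        # mirrored position in next_
                    if min(y1, x1, y2, x2) < 0 or max(y1, y2) + block > h \
                            or max(x1, x2) + block > w:
                        continue
                    p = prev[y1:y1 + block, x1:x1 + block]
                    n = next_[y2:y2 + block, x2:x2 + block]
                    sad = np.abs(p - n).sum()        # symmetric matching cost
                    if sad < best:
                        best, best_pred = sad, 0.5 * (p + n)
            side_info[by:by + block, bx:bx + block] = best_pred
    return side_info

# Toy key frames: an 8x8 bright square moving 4 pixels to the right in total.
prev = np.zeros((32, 32)); prev[8:16, 4:12] = 255
next_ = np.zeros((32, 32)); next_[8:16, 8:16] = 255
si = create_side_info(prev, next_)
print(si[12, 10], si[12, 20])   # 255.0 inside the interpolated square, 0.0 outside
```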
Abstract:
HLA-E is a non-classical Human Leucocyte Antigen class I gene with immunomodulatory properties. Although HLA-E expression usually occurs at low levels, it is widely distributed amongst human tissues; the molecule can bind self and non-self antigens and interact with NK cells and T lymphocytes, being important for immunosurveillance and also for fighting infections. HLA-E is usually the most conserved locus among all class I genes. However, most previous studies evaluating HLA-E variability sequenced only a few exons or genotyped known polymorphisms. Here we report a strategy to evaluate HLA-E variability by next-generation sequencing (NGS) that might be applied to other HLA loci, and we present the HLA-E haplotype diversity over the segment encoding the entire HLA-E mRNA (including the 5'UTR, introns and the 3'UTR) in two African population samples, Susu from Guinea-Conakry and Lobi from Burkina Faso. Our results indicate that (a) the HLA-E gene is indeed conserved, encoding mainly two different protein molecules; (b) Africans do present several previously unknown HLA-E alleles carrying synonymous mutations; (c) the HLA-E 3'UTR is quite polymorphic; and (d) haplotypes in the HLA-E 3'UTR are in close association with HLA-E coding alleles. NGS has proved to be an important tool for data generation in future studies evaluating variability in non-classical MHC genes.
Abstract:
Background: The mitochondrial DNA of kinetoplastid flagellates is distinctive in the eukaryotic world due to its massive size, complex form and large sequence content. Comprised of catenated maxicircles that contain rRNA and protein-coding genes and thousands of heterogeneous minicircles encoding small guide RNAs, the kinetoplast network has evolved along with an extreme form of mRNA processing in the form of uridine insertion and deletion RNA editing. Many maxicircle-encoded mRNAs cannot be translated without this post-transcriptional sequence modification.

Results: We present the complete sequence and annotation of the Trypanosoma cruzi maxicircles for the CL Brener and Esmeraldo strains. Gene order is syntenic with Trypanosoma brucei and Leishmania tarentolae maxicircles. The non-coding components have strain-specific repetitive regions and a variable region that is unique for each strain with the exception of a conserved sequence element that may serve as an origin of replication, but shows no sequence identity with L. tarentolae or T. brucei. Alternative assemblies of the variable region demonstrate intra-strain heterogeneity of the maxicircle population. The extent of mRNA editing required for particular genes approximates that seen in T. brucei. Extensively edited genes were more divergent among the genera than non-edited and rRNA genes. Esmeraldo contains a unique 236-bp deletion that removes the 5'-ends of ND4 and CR4 and the intergenic region. Esmeraldo shows additional insertions and deletions outside of areas edited in other species in ND5, MURF1, and MURF2, while CL Brener has a distinct insertion in MURF2.

Conclusion: The CL Brener and Esmeraldo maxicircles represent two of three previously defined maxicircle clades and promise utility as taxonomic markers. Restoration of the disrupted reading frames might be accomplished by strain-specific RNA editing. Elements in the non-coding region may be important for replication, transcription, and anchoring of the maxicircle within the kinetoplast network.
Abstract:
Background: Transcription of large numbers of non-coding RNAs originating from intronic regions of human genes has recently been reported, but the mechanisms governing their biosynthesis and biological functions are largely unknown. In this work, we evaluated the existence of a common mechanism of transcription regulation shared by protein-coding mRNAs and intronic RNAs by measuring the effect of androgen on the transcriptional profile of a prostate cancer cell line.

Results: Using a custom-built cDNA microarray enriched in intronic transcribed sequences, we found 39 intronic non-coding RNAs whose levels were significantly regulated by androgen exposure. Orientation-specific reverse transcription-PCR indicated that 10 of the 13 were transcribed in the antisense direction. These transcripts are long (0.5–5 kb), unspliced and apparently do not code for proteins. Interestingly, we found that the relative levels of androgen-regulated intronic transcripts could be correlated with the levels of the corresponding protein-coding gene (asGAS6 and asDNAJC3) or with the alternative usage of exons (asKDELR2 and asITGA6) in the corresponding protein-coding transcripts. Binding of the androgen receptor to a putative regulatory region upstream from asMYO5A, an androgen-regulated antisense intronic transcript, was confirmed by chromatin immunoprecipitation.

Conclusion: Altogether, these results indicate that at least a fraction of naturally transcribed intronic non-coding RNAs may be regulated by common physiological signals such as hormones, and further corroborate the notion that the intronic complement of the transcriptome plays functional roles in the human gene-expression program.