3 resultados para 280103 Information Storage, Retrieval and Management
em CaltechTHESIS
Resumo:
The study of exoplanets is rapidly evolving into an important and exciting field of its own. My investigations over the past half-decade have focused on understanding just a small sliver of what they are trying to tell us. That small sliver is their atmospheres. Atmospheres are the buffer between the bulk planet and the vacuum of space. The atmosphere is an important component of a planet as it is the most readily observable and contains the most information about the physical processes that can occur in a planet. I have focused on two aspects of exoplanetary atmospheres. First, I aimed to understand the chemical mechanisms that control the atmospheric abundances. Second, I focused on interpreting exoplanet atmospheric spectra and what they tell us about the temperatures and compositions through inverse modeling. Finally, I interpreted the retrieved temperature and abundances from inverse modeling in the context of chemical disequilibrium in the planetary atmospheres.
Resumo:
Storage systems are widely used and have played a crucial rule in both consumer and industrial products, for example, personal computers, data centers, and embedded systems. However, such system suffers from issues of cost, restricted-lifetime, and reliability with the emergence of new systems and devices, such as distributed storage and flash memory, respectively. Information theory, on the other hand, provides fundamental bounds and solutions to fully utilize resources such as data density, information I/O and network bandwidth. This thesis bridges these two topics, and proposes to solve challenges in data storage using a variety of coding techniques, so that storage becomes faster, more affordable, and more reliable.
We consider the system level and study the integration of RAID schemes and distributed storage. Erasure-correcting codes are the basis of the ubiquitous RAID schemes for storage systems, where disks correspond to symbols in the code and are located in a (distributed) network. Specifically, RAID schemes are based on MDS (maximum distance separable) array codes that enable optimal storage and efficient encoding and decoding algorithms. With r redundancy symbols an MDS code can sustain r erasures. For example, consider an MDS code that can correct two erasures. It is clear that when two symbols are erased, one needs to access and transmit all the remaining information to rebuild the erasures. However, an interesting and practical question is: What is the smallest fraction of information that one needs to access and transmit in order to correct a single erasure? In Part I we will show that the lower bound of 1/2 is achievable and that the result can be generalized to codes with arbitrary number of parities and optimal rebuilding.
We consider the device level and study coding and modulation techniques for emerging non-volatile memories such as flash memory. In particular, rank modulation is a novel data representation scheme proposed by Jiang et al. for multi-level flash memory cells, in which a set of n cells stores information in the permutation induced by the different charge levels of the individual cells. It eliminates the need for discrete cell levels, as well as overshoot errors, when programming cells. In order to decrease the decoding complexity, we propose two variations of this scheme in Part II: bounded rank modulation where only small sliding windows of cells are sorted to generated permutations, and partial rank modulation where only part of the n cells are used to represent data. We study limits on the capacity of bounded rank modulation and propose encoding and decoding algorithms. We show that overlaps between windows will increase capacity. We present Gray codes spanning all possible partial-rank states and using only ``push-to-the-top'' operations. These Gray codes turn out to solve an open combinatorial problem called universal cycle, which is a sequence of integers generating all possible partial permutations.
Resumo:
Network information theory and channels with memory are two important but difficult frontiers of information theory. In this two-parted dissertation, we study these two areas, each comprising one part. For the first area we study the so-called entropy vectors via finite group theory, and the network codes constructed from finite groups. In particular, we identify the smallest finite group that violates the Ingleton inequality, an inequality respected by all linear network codes, but not satisfied by all entropy vectors. Based on the analysis of this group we generalize it to several families of Ingleton-violating groups, which may be used to design good network codes. Regarding that aspect, we study the network codes constructed with finite groups, and especially show that linear network codes are embedded in the group network codes constructed with these Ingleton-violating families. Furthermore, such codes are strictly more powerful than linear network codes, as they are able to violate the Ingleton inequality while linear network codes cannot. For the second area, we study the impact of memory to the channel capacity through a novel communication system: the energy harvesting channel. Different from traditional communication systems, the transmitter of an energy harvesting channel is powered by an exogenous energy harvesting device and a finite-sized battery. As a consequence, each time the system can only transmit a symbol whose energy consumption is no more than the energy currently available. This new type of power supply introduces an unprecedented input constraint for the channel, which is random, instantaneous, and has memory. Furthermore, naturally, the energy harvesting process is observed causally at the transmitter, but no such information is provided to the receiver. Both of these features pose great challenges for the analysis of the channel capacity. In this work we use techniques from channels with side information, and finite state channels, to obtain lower and upper bounds of the energy harvesting channel. In particular, we study the stationarity and ergodicity conditions of a surrogate channel to compute and optimize the achievable rates for the original channel. In addition, for practical code design of the system we study the pairwise error probabilities of the input sequences.