8 resultados para information storage and retrieval systems

em CaltechTHESIS


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Biological information storage and retrieval is a dynamic process that requires the genome to undergo dramatic structural rearrangements. Recent advances in single-molecule techniques have allowed precise quantification of the nano-mechanical properties of DNA [1, 2], and direct in vivo observation of molecules in action [3]. In this work, we will examine elasticity in protein-mediated DNA looping, whose structural rearrangement is essential for transcriptional regulation in both prokaryotes and eukaryotes. We will look at hydrodynamics in the process of viral DNA ejection, which mediates information transfer and exchange and has prominent implications in evolution. As in the case of Kepler's laws of planetary motion leading to Newton's gravitational theory, and the allometric scaling laws in biology revealing the organizing principles of complex networks [4], experimental data collapse in these biological phenomena has guided much of our studies and urged us to find the underlying physical principles.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Storage systems are widely used and have played a crucial rule in both consumer and industrial products, for example, personal computers, data centers, and embedded systems. However, such system suffers from issues of cost, restricted-lifetime, and reliability with the emergence of new systems and devices, such as distributed storage and flash memory, respectively. Information theory, on the other hand, provides fundamental bounds and solutions to fully utilize resources such as data density, information I/O and network bandwidth. This thesis bridges these two topics, and proposes to solve challenges in data storage using a variety of coding techniques, so that storage becomes faster, more affordable, and more reliable.

We consider the system level and study the integration of RAID schemes and distributed storage. Erasure-correcting codes are the basis of the ubiquitous RAID schemes for storage systems, where disks correspond to symbols in the code and are located in a (distributed) network. Specifically, RAID schemes are based on MDS (maximum distance separable) array codes that enable optimal storage and efficient encoding and decoding algorithms. With r redundancy symbols an MDS code can sustain r erasures. For example, consider an MDS code that can correct two erasures. It is clear that when two symbols are erased, one needs to access and transmit all the remaining information to rebuild the erasures. However, an interesting and practical question is: What is the smallest fraction of information that one needs to access and transmit in order to correct a single erasure? In Part I we will show that the lower bound of 1/2 is achievable and that the result can be generalized to codes with arbitrary number of parities and optimal rebuilding.

We consider the device level and study coding and modulation techniques for emerging non-volatile memories such as flash memory. In particular, rank modulation is a novel data representation scheme proposed by Jiang et al. for multi-level flash memory cells, in which a set of n cells stores information in the permutation induced by the different charge levels of the individual cells. It eliminates the need for discrete cell levels, as well as overshoot errors, when programming cells. In order to decrease the decoding complexity, we propose two variations of this scheme in Part II: bounded rank modulation where only small sliding windows of cells are sorted to generated permutations, and partial rank modulation where only part of the n cells are used to represent data. We study limits on the capacity of bounded rank modulation and propose encoding and decoding algorithms. We show that overlaps between windows will increase capacity. We present Gray codes spanning all possible partial-rank states and using only ``push-to-the-top'' operations. These Gray codes turn out to solve an open combinatorial problem called universal cycle, which is a sequence of integers generating all possible partial permutations.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The first thesis topic is a perturbation method for resonantly coupled nonlinear oscillators. By successive near-identity transformations of the original equations, one obtains new equations with simple structure that describe the long time evolution of the motion. This technique is related to two-timing in that secular terms are suppressed in the transformation equations. The method has some important advantages. Appropriate time scalings are generated naturally by the method, and don't need to be guessed as in two-timing. Furthermore, by continuing the procedure to higher order, one extends (formally) the time scale of valid approximation. Examples illustrate these claims. Using this method, we investigate resonance in conservative, non-conservative and time dependent problems. Each example is chosen to highlight a certain aspect of the method.

The second thesis topic concerns the coupling of nonlinear chemical oscillators. The first problem is the propagation of chemical waves of an oscillating reaction in a diffusive medium. Using two-timing, we derive a nonlinear equation that determines how spatial variations in the phase of the oscillations evolves in time. This result is the key to understanding the propagation of chemical waves. In particular, we use it to account for certain experimental observations on the Belusov-Zhabotinskii reaction.

Next, we analyse the interaction between a pair of coupled chemical oscillators. This time, we derive an equation for the phase shift, which measures how much the oscillators are out of phase. This result is the key to understanding M. Marek's and I. Stuchl's results on coupled reactor systems. In particular, our model accounts for synchronization and its bifurcation into rhythm splitting.

Finally, we analyse large systems of coupled chemical oscillators. Using a continuum approximation, we demonstrate mechanisms that cause auto-synchronization in such systems.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The work presented in this thesis revolves around erasure correction coding, as applied to distributed data storage and real-time streaming communications.

First, we examine the problem of allocating a given storage budget over a set of nodes for maximum reliability. The objective is to find an allocation of the budget that maximizes the probability of successful recovery by a data collector accessing a random subset of the nodes. This optimization problem is challenging in general because of its combinatorial nature, despite its simple formulation. We study several variations of the problem, assuming different allocation models and access models, and determine the optimal allocation and the optimal symmetric allocation (in which all nonempty nodes store the same amount of data) for a variety of cases. Although the optimal allocation can have nonintuitive structure and can be difficult to find in general, our results suggest that, as a simple heuristic, reliable storage can be achieved by spreading the budget maximally over all nodes when the budget is large, and spreading it minimally over a few nodes when it is small. Coding would therefore be beneficial in the former case, while uncoded replication would suffice in the latter case.

Second, we study how distributed storage allocations affect the recovery delay in a mobile setting. Specifically, two recovery delay optimization problems are considered for a network of mobile storage nodes: the maximization of the probability of successful recovery by a given deadline, and the minimization of the expected recovery delay. We show that the first problem is closely related to the earlier allocation problem, and solve the second problem completely for the case of symmetric allocations. It turns out that the optimal allocations for the two problems can be quite different. In a simulation study, we evaluated the performance of a simple data dissemination and storage protocol for mobile delay-tolerant networks, and observed that the choice of allocation can have a significant impact on the recovery delay under a variety of scenarios.

Third, we consider a real-time streaming system where messages created at regular time intervals at a source are encoded for transmission to a receiver over a packet erasure link; the receiver must subsequently decode each message within a given delay from its creation time. For erasure models containing a limited number of erasures per coding window, per sliding window, and containing erasure bursts whose maximum length is sufficiently short or long, we show that a time-invariant intrasession code asymptotically achieves the maximum message size among all codes that allow decoding under all admissible erasure patterns. For the bursty erasure model, we also show that diagonally interleaved codes derived from specific systematic block codes are asymptotically optimal over all codes in certain cases. We also study an i.i.d. erasure model in which each transmitted packet is erased independently with the same probability; the objective is to maximize the decoding probability for a given message size. We derive an upper bound on the decoding probability for any time-invariant code, and show that the gap between this bound and the performance of a family of time-invariant intrasession codes is small when the message size and packet erasure probability are small. In a simulation study, these codes performed well against a family of random time-invariant convolutional codes under a number of scenarios.

Finally, we consider the joint problems of routing and caching for named data networking. We propose a backpressure-based policy that employs virtual interest packets to make routing and caching decisions. In a packet-level simulation, the proposed policy outperformed a basic protocol that combines shortest-path routing with least-recently-used (LRU) cache replacement.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The proliferation of smartphones and other internet-enabled, sensor-equipped consumer devices enables us to sense and act upon the physical environment in unprecedented ways. This thesis considers Community Sense-and-Response (CSR) systems, a new class of web application for acting on sensory data gathered from participants' personal smart devices. The thesis describes how rare events can be reliably detected using a decentralized anomaly detection architecture that performs client-side anomaly detection and server-side event detection. After analyzing this decentralized anomaly detection approach, the thesis describes how weak but spatially structured events can be detected, despite significant noise, when the events have a sparse representation in an alternative basis. Finally, the thesis describes how the statistical models needed for client-side anomaly detection may be learned efficiently, using limited space, via coresets.

The Caltech Community Seismic Network (CSN) is a prototypical example of a CSR system that harnesses accelerometers in volunteers' smartphones and consumer electronics. Using CSN, this thesis presents the systems and algorithmic techniques to design, build and evaluate a scalable network for real-time awareness of spatial phenomena such as dangerous earthquakes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Several patients of P. J. Vogel who had undergone cerebral commissurotomy for the control of intractable epilepsy were tested on a variety of tasks to measure aspects of cerebral organization concerned with lateralization in hemispheric function. From tests involving identification of shapes it was inferred that in the absence of the neocortical commissures, the left hemisphere still has access to certain types of information from the ipsilateral field. The major hemisphere can still make crude differentiations between various left-field stimuli, but is unable to specify exact stimulus properties. Most of the time the major hemisphere, having access to some ipsilateral stimuli, dominated the minor hemisphere in control of the body.

Competition for control of the body between the hemispheres is seen most clearly in tests of minor hemisphere language competency, in which it was determined that though the minor hemisphere does possess some minimal ability to express language, the major hemisphere prevented its expression much of the time. The right hemisphere was superior to the left in tests of perceptual visualization, and the two hemispheres appeared to use different strategies in attempting to solve the problems, namely, analysis for the left hemisphere and synthesis for the right hemisphere.

Analysis of the patients' verbal and performance I.Q.'s, as well as observations made throughout testing, suggest that the corpus callosum plays a critical role in activities that involve functions in which the minor hemisphere normally excels, that the motor expression of these functions may normally come through the major hemisphere by way of the corpus callosum.

Lateral specialization is thought to be an evolutionary adaptation which overcame problems of a functional antagonism between the abilities normally associated with the two hemispheres. The tests of perception suggested that this function lateralized into the mute hemisphere because of an active counteraction by language. This latter idea was confirmed by the finding that left-handers, in whom there is likely to be bilateral language centers, are greatly deficient on tests of perception.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We are at the cusp of a historic transformation of both communication system and electricity system. This creates challenges as well as opportunities for the study of networked systems. Problems of these systems typically involve a huge number of end points that require intelligent coordination in a distributed manner. In this thesis, we develop models, theories, and scalable distributed optimization and control algorithms to overcome these challenges.

This thesis focuses on two specific areas: multi-path TCP (Transmission Control Protocol) and electricity distribution system operation and control. Multi-path TCP (MP-TCP) is a TCP extension that allows a single data stream to be split across multiple paths. MP-TCP has the potential to greatly improve reliability as well as efficiency of communication devices. We propose a fluid model for a large class of MP-TCP algorithms and identify design criteria that guarantee the existence, uniqueness, and stability of system equilibrium. We clarify how algorithm parameters impact TCP-friendliness, responsiveness, and window oscillation and demonstrate an inevitable tradeoff among these properties. We discuss the implications of these properties on the behavior of existing algorithms and motivate a new algorithm Balia (balanced linked adaptation) which generalizes existing algorithms and strikes a good balance among TCP-friendliness, responsiveness, and window oscillation. We have implemented Balia in the Linux kernel. We use our prototype to compare the new proposed algorithm Balia with existing MP-TCP algorithms.

Our second focus is on designing computationally efficient algorithms for electricity distribution system operation and control. First, we develop efficient algorithms for feeder reconfiguration in distribution networks. The feeder reconfiguration problem chooses the on/off status of the switches in a distribution network in order to minimize a certain cost such as power loss. It is a mixed integer nonlinear program and hence hard to solve. We propose a heuristic algorithm that is based on the recently developed convex relaxation of the optimal power flow problem. The algorithm is efficient and can successfully computes an optimal configuration on all networks that we have tested. Moreover we prove that the algorithm solves the feeder reconfiguration problem optimally under certain conditions. We also propose a more efficient algorithm and it incurs a loss in optimality of less than 3% on the test networks.

Second, we develop efficient distributed algorithms that solve the optimal power flow (OPF) problem on distribution networks. The OPF problem determines a network operating point that minimizes a certain objective such as generation cost or power loss. Traditionally OPF is solved in a centralized manner. With increasing penetration of volatile renewable energy resources in distribution systems, we need faster and distributed solutions for real-time feedback control. This is difficult because power flow equations are nonlinear and kirchhoff's law is global. We propose solutions for both balanced and unbalanced radial distribution networks. They exploit recent results that suggest solving for a globally optimal solution of OPF over a radial network through a second-order cone program (SOCP) or semi-definite program (SDP) relaxation. Our distributed algorithms are based on the alternating direction method of multiplier (ADMM), but unlike standard ADMM-based distributed OPF algorithms that require solving optimization subproblems using iterative methods, the proposed solutions exploit the problem structure that greatly reduce the computation time. Specifically, for balanced networks, our decomposition allows us to derive closed form solutions for these subproblems and it speeds up the convergence by 1000x times in simulations. For unbalanced networks, the subproblems reduce to either closed form solutions or eigenvalue problems whose size remains constant as the network scales up and computation time is reduced by 100x compared with iterative methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The synthesis of iodonium salts of the general formula [C6H5IR]+X-, where R is an alkyl group and x- is a stabilizing anion, was attempted. For the choice of R three groups were selected, whose derivatives are known to be sluggish in SN1 and SN2 substitutions: cyclopropyl, 7, 7 -dimethyl-1-norbornyl, and 9 -triptycyl. The synthetic routes followed along classical lines which have been exploited in recent years by Beringer and students. Ultimately, the object of the present study was to study the reactions of the above salts with nucleophiles. In none of the three cases, however, was it possible to isolate a stable salt. A thermodynamic argument suggests that this must be due to kinetic instability rather than thermodynamic instability. Only iodocyclopropane and 1-iodoapocamphane formed isolable iododichlorides.

Several methylated 2, 2-difluoronorbornanes were prepared with the intent of correlating fluorine -19 chemical shifts with geometric features in a rigid system. The effect of a methyl group on the shielding of a β -fluorine is dependent upon the dihedral angle; the maximum effect (an upfield shift of the resonance) occurs at 0° and 180°, whereas almost no effect is felt at a dihedral angle of 120°. The effect of a methyl group on a γ -fluorine is to strongly shift the resonance downfield when fluorine and methyl group are in a 1, 3 - diaxial-like relationship. Molecular orbital calculations of fluorine shielding in a variety of molecules were carried out using the formalism developed by Pople; the results are, at best, in modest agreement with experiment.