3 resultados para Storage condition

em CaltechTHESIS


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Storage systems are widely used and have played a crucial rule in both consumer and industrial products, for example, personal computers, data centers, and embedded systems. However, such system suffers from issues of cost, restricted-lifetime, and reliability with the emergence of new systems and devices, such as distributed storage and flash memory, respectively. Information theory, on the other hand, provides fundamental bounds and solutions to fully utilize resources such as data density, information I/O and network bandwidth. This thesis bridges these two topics, and proposes to solve challenges in data storage using a variety of coding techniques, so that storage becomes faster, more affordable, and more reliable.

We consider the system level and study the integration of RAID schemes and distributed storage. Erasure-correcting codes are the basis of the ubiquitous RAID schemes for storage systems, where disks correspond to symbols in the code and are located in a (distributed) network. Specifically, RAID schemes are based on MDS (maximum distance separable) array codes that enable optimal storage and efficient encoding and decoding algorithms. With r redundancy symbols an MDS code can sustain r erasures. For example, consider an MDS code that can correct two erasures. It is clear that when two symbols are erased, one needs to access and transmit all the remaining information to rebuild the erasures. However, an interesting and practical question is: What is the smallest fraction of information that one needs to access and transmit in order to correct a single erasure? In Part I we will show that the lower bound of 1/2 is achievable and that the result can be generalized to codes with arbitrary number of parities and optimal rebuilding.

We consider the device level and study coding and modulation techniques for emerging non-volatile memories such as flash memory. In particular, rank modulation is a novel data representation scheme proposed by Jiang et al. for multi-level flash memory cells, in which a set of n cells stores information in the permutation induced by the different charge levels of the individual cells. It eliminates the need for discrete cell levels, as well as overshoot errors, when programming cells. In order to decrease the decoding complexity, we propose two variations of this scheme in Part II: bounded rank modulation where only small sliding windows of cells are sorted to generated permutations, and partial rank modulation where only part of the n cells are used to represent data. We study limits on the capacity of bounded rank modulation and propose encoding and decoding algorithms. We show that overlaps between windows will increase capacity. We present Gray codes spanning all possible partial-rank states and using only ``push-to-the-top'' operations. These Gray codes turn out to solve an open combinatorial problem called universal cycle, which is a sequence of integers generating all possible partial permutations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The work presented in this thesis revolves around erasure correction coding, as applied to distributed data storage and real-time streaming communications.

First, we examine the problem of allocating a given storage budget over a set of nodes for maximum reliability. The objective is to find an allocation of the budget that maximizes the probability of successful recovery by a data collector accessing a random subset of the nodes. This optimization problem is challenging in general because of its combinatorial nature, despite its simple formulation. We study several variations of the problem, assuming different allocation models and access models, and determine the optimal allocation and the optimal symmetric allocation (in which all nonempty nodes store the same amount of data) for a variety of cases. Although the optimal allocation can have nonintuitive structure and can be difficult to find in general, our results suggest that, as a simple heuristic, reliable storage can be achieved by spreading the budget maximally over all nodes when the budget is large, and spreading it minimally over a few nodes when it is small. Coding would therefore be beneficial in the former case, while uncoded replication would suffice in the latter case.

Second, we study how distributed storage allocations affect the recovery delay in a mobile setting. Specifically, two recovery delay optimization problems are considered for a network of mobile storage nodes: the maximization of the probability of successful recovery by a given deadline, and the minimization of the expected recovery delay. We show that the first problem is closely related to the earlier allocation problem, and solve the second problem completely for the case of symmetric allocations. It turns out that the optimal allocations for the two problems can be quite different. In a simulation study, we evaluated the performance of a simple data dissemination and storage protocol for mobile delay-tolerant networks, and observed that the choice of allocation can have a significant impact on the recovery delay under a variety of scenarios.

Third, we consider a real-time streaming system where messages created at regular time intervals at a source are encoded for transmission to a receiver over a packet erasure link; the receiver must subsequently decode each message within a given delay from its creation time. For erasure models containing a limited number of erasures per coding window, per sliding window, and containing erasure bursts whose maximum length is sufficiently short or long, we show that a time-invariant intrasession code asymptotically achieves the maximum message size among all codes that allow decoding under all admissible erasure patterns. For the bursty erasure model, we also show that diagonally interleaved codes derived from specific systematic block codes are asymptotically optimal over all codes in certain cases. We also study an i.i.d. erasure model in which each transmitted packet is erased independently with the same probability; the objective is to maximize the decoding probability for a given message size. We derive an upper bound on the decoding probability for any time-invariant code, and show that the gap between this bound and the performance of a family of time-invariant intrasession codes is small when the message size and packet erasure probability are small. In a simulation study, these codes performed well against a family of random time-invariant convolutional codes under a number of scenarios.

Finally, we consider the joint problems of routing and caching for named data networking. We propose a backpressure-based policy that employs virtual interest packets to make routing and caching decisions. In a packet-level simulation, the proposed policy outperformed a basic protocol that combines shortest-path routing with least-recently-used (LRU) cache replacement.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Politically the Colorado river is an interstate as well as an international stream. Physically the basin divides itself distinctly into three sections. The upper section from head waters to the mouth of San Juan comprises about 40 percent of the total of the basin and affords about 87 percent of the total runoff, or an average of about 15 000 000 acre feet per annum. High mountains and cold weather are found in this section. The middle section from the mouth of San Juan to the mouth of the Williams comprises about 35 percent of the total area of the basin and supplies about 7 percent of the annual runoff. Narrow canyons and mild weather prevail in this section. The lower third of the basin is composed of mainly hot arid plains of low altitude. It comprises some 25 percent of the total area of the basin and furnishes about 6 percent of the average annual runoff.

The proposed Diamond Creek reservoir is located in the middle section and is wholly within the boundary of Arizona. The site is at the mouth of Diamond Creek and is only 16 m. from Beach Spring, a station on the Santa Fe railroad. It is solely a power project with a limited storage capacity. The dam which creats the reservoir is of the gravity type to be constructed across the river. The walls and foundation are of granite. For a dam of 290 feet in height, the back water will be about 25 m. up the river.

The power house will be placed right below the dam perpendicular to the axis of the river. It is entirely a concrete structure. The power installation would consist of eighteen 37 500 H.P. vertical, variable head turbines, directly connected to 28 000 kwa. 110 000 v. 3 phase, 60 cycle generators with necessary switching and auxiliary apparatus. Each unit is to be fed by a separate penstock wholly embedded into the masonry.

Concerning the power market, the main electric transmission lines would extend to Prescott, Phoenix, Mesa, Florence etc. The mining regions of the mountains of Arizona would be the most adequate market. The demand of power in the above named places might not be large at present. It will, from the observation of the writer, rapidly increase with the wonderful advancement of all kinds of industrial development.

All these things being comparatively feasible, there is one difficult problem: that is the silt. At the Diamond Creek dam site the average annual silt discharge is about 82 650 acre feet. The geographical conditions, however, will not permit silt deposites right in the reservoir. So this design will be made under the assumption given in Section 4.

The silt condition and the change of lower course of the Colorado are much like those of the Yellow River in China. But one thing is different. On the Colorado most of the canyon walls are of granite, while those on the Yellow are of alluvial loess: so it is very hard, if not impossible, to get a favorable dam site on the lower part. As a visitor to this country, I should like to see the full development of the Colorado: but how about THE YELLOW!