459 results for Parallelism
Abstract:
As the performance gap between microprocessors and memory continues to increase, main memory accesses result in long latencies, which become a factor limiting system performance. Previous studies show that main memory access streams contain significant locality and that SDRAM devices provide parallelism through multiple banks and channels. This locality and parallelism have not been exploited thoroughly by conventional memory controllers. In this thesis, SDRAM address mapping techniques and memory access reordering mechanisms are studied and applied to memory controller design with the goal of reducing observed main memory access latency. The proposed bit-reversal address mapping attempts to distribute main memory accesses evenly in the SDRAM address space to enable bank parallelism. As memory accesses to unique banks are interleaved, the access latencies are partially hidden and therefore reduced. By taking cache conflict misses into account, bit-reversal address mapping is able to direct potential row conflicts to different banks, further improving performance. The proposed burst scheduling is a novel access reordering mechanism, which creates bursts by clustering accesses directed to the same rows of the same banks. Subject to a threshold, reads are allowed to preempt writes, and qualified writes are piggybacked at the end of the bursts. A sophisticated access scheduler selects accesses based on priorities and interleaves accesses to maximize SDRAM data bus utilization. Consequently, burst scheduling reduces the row conflict rate, increasing and exploiting the available row locality. Using revised SimpleScalar and M5 simulators, both techniques are evaluated and compared with existing academic and industrial solutions. With SPEC CPU2000 benchmarks, bit-reversal reduces the execution time by 14% on average over traditional page interleaving address mapping.
Burst scheduling also achieves a 15% reduction in execution time over conventional bank-in-order scheduling. Working constructively together, bit-reversal and burst scheduling successfully achieve a 19% speedup across simulated benchmarks.
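The effect of bit-reversal address mapping on cache conflict misses can be illustrated with a minimal sketch. The field widths (`COL_BITS`, `BANK_BITS`, `ROW_BITS`) and the exact choice of which bits are reversed are assumptions made for illustration; the abstract does not specify them, and the thesis may define the mapping differently. The idea shown is only that reversing the address bits above the column offset moves rarely changing high-order bits into the bank index.

```python
# Hypothetical field widths; the abstract does not give the real ones.
COL_BITS = 10   # column offset within a row
BANK_BITS = 2   # 4 banks
ROW_BITS = 12   # 4096 rows per bank


def reverse_bits(value, width):
    """Return `value` with its lowest `width` bits in reversed order."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def page_interleaving(addr):
    """Baseline layout: address = row | bank | column."""
    col = addr & ((1 << COL_BITS) - 1)
    bank = (addr >> COL_BITS) & ((1 << BANK_BITS) - 1)
    row = (addr >> (COL_BITS + BANK_BITS)) & ((1 << ROW_BITS) - 1)
    return row, bank, col


def bit_reversal(addr):
    """Assumed variant: reverse the bits above the column offset, then
    split the reversed field into row | bank as in the baseline."""
    col = addr & ((1 << COL_BITS) - 1)
    upper_width = BANK_BITS + ROW_BITS
    upper = (addr >> COL_BITS) & ((1 << upper_width) - 1)
    reversed_upper = reverse_bits(upper, upper_width)
    bank = reversed_upper & ((1 << BANK_BITS) - 1)
    row = reversed_upper >> BANK_BITS
    return row, bank, col


# Two addresses that differ only in the top address bit, as blocks that
# conflict in a cache typically do.
a, b = 0, 1 << 23
print(page_interleaving(a), page_interleaving(b))  # same bank, different rows
print(bit_reversal(a), bit_reversal(b))            # different banks
```

Under the baseline layout the two addresses fall in the same bank but different rows, which is a row conflict; under the assumed bit-reversal layout they fall in different banks and can be serviced in parallel.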
Abstract:
Self-stabilization is a property of a distributed system such that, regardless of the legitimacy of its current state, the system behavior shall eventually reach a legitimate state and shall remain legitimate thereafter. The elegance of self-stabilization stems from the fact that it distinguishes distributed systems by a strong fault tolerance property against arbitrary state perturbations. The difficulty of designing and reasoning about self-stabilization has been witnessed by many researchers; most of the existing techniques for the verification and design of self-stabilization are either brute-force, or adopt manual approaches non-amenable to automation. In this dissertation, we first investigate the possibility of automatically designing self-stabilization through global state space exploration. In particular, we develop a set of heuristics for automating the addition of recovery actions to distributed protocols on various network topologies. Our heuristics equally exploit the computational power of a single workstation and the available parallelism on computer clusters. We obtain existing and new stabilizing solutions for classical protocols like maximal matching, ring coloring, mutual exclusion, leader election and agreement. Second, we consider a foundation for local reasoning about self-stabilization; i.e., study the global behavior of the distributed system by exploring the state space of just one of its components. It turns out that local reasoning about deadlocks and livelocks is possible for an interesting class of protocols whose proof of stabilization is otherwise complex. In particular, we provide necessary and sufficient conditions – verifiable in the local state space of every process – for global deadlock- and livelock-freedom of protocols on ring topologies. 
Local reasoning potentially circumvents two fundamental problems that complicate the automated design and verification of distributed protocols: (1) state explosion and (2) partial state information. Moreover, local proofs of convergence are independent of the number of processes in the network, thereby enabling our assertions about deadlocks and livelocks to apply on rings of arbitrary sizes without worrying about state explosion.
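As a concrete picture of self-stabilization on a ring, the sketch below simulates Dijkstra's classical K-state token ring for mutual exclusion. It is offered only as a well-known textbook example of the property; it is not claimed to be one of the protocols synthesized in the dissertation. The function names and the lowest-index scheduler are illustrative choices.

```python
def privileged(state):
    """Indices of processes that currently hold a privilege (token)."""
    n = len(state)
    result = []
    for i in range(n):
        if i == 0:
            # Process 0 is privileged when it equals its ring predecessor.
            if state[0] == state[n - 1]:
                result.append(0)
        elif state[i] != state[i - 1]:
            # Any other process is privileged when it differs from its predecessor.
            result.append(i)
    return result


def step(state, k, i):
    """Let privileged process i make its move; return the new global state."""
    new_state = list(state)
    if i == 0:
        new_state[0] = (state[0] + 1) % k
    else:
        new_state[i] = state[i - 1]
    return new_state


def stabilize(state, k, max_steps=1000):
    """Run a central scheduler (lowest privileged index moves first) until
    exactly one process is privileged, which is the legitimate condition."""
    steps = 0
    while len(privileged(state)) != 1 and steps < max_steps:
        state = step(state, k, privileged(state)[0])
        steps += 1
    return state, steps


# Arbitrary (illegitimate) start state with 5 processes and K = 5.
final_state, steps_taken = stabilize([3, 1, 4, 1, 2], k=5)
print(final_state, steps_taken, privileged(final_state))
```

Starting from a state with four privileged processes, the ring reaches a state with a single privilege and keeps exactly one privilege on every later move, which is the convergence and closure behaviour the abstract describes.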
Abstract:
This thesis develops high-performance real-time signal processing modules for direction of arrival (DOA) estimation in localization systems. It proposes highly parallel algorithms for performing subspace decomposition and polynomial rooting, which are traditionally implemented using sequential algorithms. The proposed algorithms address the emerging need for real-time localization in a wide range of applications. As the antenna array size increases, the complexity of the signal processing algorithms increases, making it increasingly difficult to satisfy real-time constraints. This thesis addresses real-time implementation by proposing parallel algorithms that maintain a considerable improvement over traditional algorithms, especially for systems with a larger number of antenna array elements. Singular value decomposition (SVD) and polynomial rooting are two computationally complex steps and act as the bottleneck to achieving real-time performance. The proposed algorithms are suitable for implementation on field-programmable gate arrays (FPGAs), single instruction multiple data (SIMD) hardware, or application-specific integrated circuits (ASICs), which offer a large number of processing elements that can be exploited for parallel processing. The designs proposed in this thesis are modular, easily expandable, and easy to implement. First, this thesis proposes a fast-converging SVD algorithm. The proposed method reduces the number of iterations needed to converge to the correct singular values, thus coming closer to real-time performance. A general algorithm and a modular system design are provided, making it easy for designers to replicate the design and extend it to larger matrix sizes. Moreover, the method is highly parallel, which can be exploited on the hardware platforms mentioned earlier. A fixed-point implementation of the proposed SVD algorithm is presented.
The FPGA design is pipelined to the maximum extent to increase the maximum achievable frequency of operation. The system was developed with the objective of achieving high throughput. Various modern cores available in FPGAs were used to maximize performance, and these modules are described in detail. Finally, a parallel polynomial rooting technique based on Newton’s method, applicable exclusively to root-MUSIC polynomials, is proposed. Unique characteristics of the complex dynamics of root-MUSIC polynomials were exploited to derive this polynomial rooting method. The technique exhibits parallelism and converges to the desired root within a fixed number of iterations, making it suitable for rooting polynomials of large degree. We believe this is the first time that the complex dynamics of root-MUSIC polynomials have been analyzed to propose an algorithm. In all, the thesis addresses two major bottlenecks in a direction of arrival estimation system by providing simple, high-throughput, parallel algorithms.
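For readers unfamiliar with the building block mentioned here, the following is a minimal sketch of plain Newton iteration on a complex polynomial with a fixed iteration count. It is the generic textbook method, not the root-MUSIC-specific technique of the thesis; the function names, the starting points and the example polynomial are assumptions for illustration. Each starting point is iterated independently, which is what makes this kind of rooting easy to parallelize.

```python
def evaluate_with_derivative(coeffs, z):
    """Horner's rule returning p(z) and p'(z).
    `coeffs` lists coefficients from the highest degree down."""
    p = 0j
    dp = 0j
    for c in coeffs:
        dp = dp * z + p
        p = p * z + c
    return p, dp


def newton_root(coeffs, z0, iterations=20):
    """Run a fixed number of Newton updates z <- z - p(z)/p'(z) from z0."""
    z = complex(z0)
    for _ in range(iterations):
        p, dp = evaluate_with_derivative(coeffs, z)
        if dp == 0:
            break  # flat point: the Newton update is undefined
        z -= p / dp
    return z


# Example polynomial z^3 - 1, iterated from three independent starting points.
coeffs = [1, 0, 0, -1]
starts = [1.2 + 0.1j, -0.5 + 0.9j, -0.5 - 0.9j]
roots = [newton_root(coeffs, z0) for z0 in starts]
for r in roots:
    print(r)
```

Because the iterations for different starting points share no state, the list comprehension could be replaced by one hardware lane or processing element per starting point.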
Abstract:
We present a high-performance yet low-cost system for multi-view rendering in virtual reality (VR) applications. In contrast to complex CAVE installations, which are typically driven by one render client per view, we arrange eight displays in an octagon around the viewer to provide a full 360° projection, and we drive these eight displays with a single PC equipped with multiple graphics processing units (GPUs). In this paper we describe the hardware and software setup, as well as the low-level and high-level optimizations needed to fully exploit the parallelism of this multi-GPU multi-view VR system.
Abstract:
The macaque cortical visual system is hierarchically organized into two streams, the ventral stream for recognizing objects and the dorsal stream for analyzing spatial relationships. The ventral stream extends from striate cortex, or area V1, to inferior temporal cortex (IT) through extra-striate areas V2 and V4. Between V1 and V2, the ventral stream consists of two roughly parallel sub-streams, one extending from the cytochrome oxidase (CO) rich blobs in V1 to the CO rich thin stripes in V2, the other extending from the interblobs in V1 to the interstripes in V2. The blob-dominated sub-stream is thought to analyze surface features such as color, whereas the interblob-dominated one is thought to analyze contour features such as shape. In the current study, the organization of cortical pathways linking V2 thin stripe and interstripe compartments with area V4 was investigated using a combination of physiological and anatomical techniques. Different compartments of V2 were first characterized, in vivo, using optical recording of intrinsic cortical signals. These functionally derived maps of V2 stripe compartments were then used to guide iontophoretic injections of multiple, distinguishable, anterograde tracers into specific V2 compartments. The distribution of labeled axons was analyzed either in horizontal sections through the prelunate gyrus, or in tangentially sectioned portions of physically unfolded cortex containing the lunate sulcus, prelunate gyrus and superior temporal sulcus. When a V2 thin stripe and an adjacent interstripe were injected with distinguishable tracers, a large primary focus and several secondary foci were observed in V4. The primary focus from the thin stripe injection was spatially segregated from the primary focus from the V2 interstripe injection, suggesting a retention of the pattern of compartmentation.
We examined the distribution of retrogradely labeled cells in V1 following the injection of tracers into different V2 compartments, in order to quantify just how parallel the two sub-streams are from V1 to V2. Our results suggest that both blobs and interblobs project to thin stripes in V2, whereas only interblobs project to interstripes. This asymmetrical segregation argues against the original proposal of strict parallelism. (Abstract shortened by UMI.)
Abstract:
When genetic constraints restrict phenotypic evolution, diversification can be predicted to evolve along so-called lines of least resistance. To address the importance of such constraints and their resolution, studies of parallel phenotypic divergence that differ in their age are valuable. Here, we investigate the parapatric evolution of six lake and stream threespine stickleback systems from Iceland and Switzerland, ranging in age from a few decades to several millennia. Using phenotypic data, we test for parallelism in ecotypic divergence between parapatric lake and stream populations and compare the observed patterns to an ancestral-like marine population. We find strong and consistent phenotypic divergence, both among lake and stream populations and between our freshwater populations and the marine population. Interestingly, ecotypic divergence in low-dimensional phenotype space (i.e. single traits) is rapid and seems to be often completed within 100 years. Yet, the dimensionality of ecotypic divergence was highest in our oldest systems and only there parallel evolution of unrelated ecotypes was strong enough to overwrite phylogenetic contingency. Moreover, the dimensionality of divergence in different systems varies between trait complexes, suggesting different constraints and evolutionary pathways to their resolution among freshwater systems.
Abstract:
OBJECTIVE To analytically validate a gas chromatography-mass spectrometry (GC-MS) method for measurement of the concentrations of 6 amino acids in canine serum samples and to assess the stability of each amino acid after sample storage. SAMPLES Surplus serum from 80 canine samples submitted to the Gastrointestinal Laboratory at Texas A&M University and serum samples from 12 healthy dogs. PROCEDURES GC-MS was validated to determine precision, reproducibility, limit of detection, and percentage recovery of known added concentrations of 6 amino acids in surplus serum samples. Amino acid concentrations in serum samples from healthy dogs were measured before (baseline) and after storage in various conditions. RESULTS Intra- and interassay coefficients of variation (10 replicates involving 12 pooled serum samples) were 13.4% and 16.6% for glycine, 9.3% and 12.4% for glutamic acid, 5.1% and 6.3% for methionine, 14.0% and 15.1% for tryptophan, 6.2% and 11.0% for tyrosine, and 7.4% and 12.4% for lysine, respectively. Observed-to-expected concentration ratios in dilutional parallelism tests (6 replicates involving 6 pooled serum samples) were 79.5% to 111.5% for glycine, 80.9% to 123.0% for glutamic acid, 77.8% to 111.0% for methionine, 85.2% to 98.0% for tryptophan, 79.4% to 115.0% for tyrosine, and 79.4% to 110.0% for lysine. No amino acid concentration changed significantly from baseline after serum sample storage at -80°C for ≤ 7 days. CONCLUSIONS AND CLINICAL RELEVANCE GC-MS measurement of the concentrations of 6 amino acids in canine serum samples yielded precise, accurate, and reproducible results. Sample storage at -80°C for 1 week had no effect on GC-MS results.
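The two validation statistics reported above follow standard definitions, sketched below. The function names and the numeric inputs are hypothetical and are not data from the study; the coefficient of variation is taken as the sample standard deviation divided by the mean, and the expected concentration in a dilutional parallelism test is taken as the undiluted concentration divided by the dilution factor.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Coefficient of variation in percent: sample SD divided by the mean."""
    return stdev(values) / mean(values) * 100.0


def observed_to_expected(observed, undiluted, dilution_factor):
    """Observed-to-expected ratio in percent for a diluted sample.
    The expected value is the undiluted concentration over the dilution factor."""
    expected = undiluted / dilution_factor
    return observed / expected * 100.0


# Hypothetical replicate measurements of one pooled sample (arbitrary units).
replicates = [9.0, 10.0, 11.0]
print(coefficient_of_variation(replicates))

# Hypothetical 1:4 dilution of a sample that measured 20.0 undiluted.
print(observed_to_expected(observed=4.6, undiluted=20.0, dilution_factor=4))
```

A ratio near 100% across a dilution series indicates that the diluted samples behave in parallel with the undiluted one, which is what the reported ranges summarize for each amino acid.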
Abstract:
The riddle, a lyrical form of communication and of the art of knowing and entertaining, stands out as a mental and verbal game that has endured for years, uniting poetry with wit. Its traditional form is based on short-line verse (arte menor), octosyllabic quatrains with cross assonant or consonant rhyme, and the use of simile, metaphor, metonymy, allegory, double meaning (dilogía), analogy, and linguistic breakdown. As a form, it does not adhere to canons; rather, it is a free genre that creates its own rules. Like proverbs, riddles belong to the body of rhymes, that is, texts that were not sung but spoken. Their characteristics are brevity, autonomy, rhyme, alliteration, and parallelism. Their structure follows two paths, the syntactic and the rhetorical; both are wrapped in the garb of semantics and turn riddles into a language game, a lesson or punishment, an enigma, and a genuine delight of the lyrical tradition.
Abstract:
Past sea surface temperature (SST) evolution in the Alboran Sea (western Mediterranean) during the last 50,000 years has been inferred from the study of C37 alkenones in International Marine Global Change Studies core MD952043. This record has a time resolution of ~200 years, allowing the study of millennial-scale and even shorter climatic changes. The observed SST curve displays characteristic sequences of extremely rapid warming and cooling events throughout the glacial period. Comparison of this Alboran record with δ18O from Greenland ice (Greenland Ice Sheet Project 2 core) shows a strong parallelism between these SST oscillations and the Dansgaard-Oeschger events. Five prominent cooling episodes standing out in the SST profile are accompanied by an anomalously high abundance of Neogloboquadrina pachyderma sinistral, which is confined to the duration of these cold intervals. These features and the isotopic record reflect drastic changes in the surface hydrography of the Alboran Sea in association with Heinrich events H1-5.
Abstract:
Lipids are used for the evaluation of the different organic matter contributions in the north eastern Norwegian sea (M23258 site; 75ºN, 14ºE) over the last 15,000 years. Development of a mass balance model based on the down core quantification of the C37 alkenones, the odd carbon numbered n-alkanes (Aodd) and the unresolved complex mixture of hydrocarbons (UCM) has allowed three main organic matter inputs involving marine, continental and ancient reworked organic matter to be recognized. The model shows a good agreement between measured and reconstructed TOC values. Similarly, a strong parallelism is observed between predicted components such as marine TOC and carbonate content (CaCO3), which was determined independently. Representation of the model results within a time-scale based on 15 AMS-14C measurements shows that the main changes in organic matter constituents are coincident with the major climatic events of the last 15,000 a. Thus, the predominance of reworked organic matter is characteristic of Termination Ia (up to 70%), continental organic matter was dominant during the Bølling-Allerød (B-A) and Younger Dryas (YD) periods (about 85%) and a strong increase of marine organic matter occurred in the Holocene (between 50 and 75%). This agreement reflects the main hydrographic changes that determined the deposition of sedimentary materials during the period studied: ice-rafted detritus from the Barents continental platform, ice-melting waters from the Arctic fluvial system discharging into the Barents sea and dominance of north Atlantic currents, respectively. In this respect, the high-resolution down core record resulting from the mass balance and lipid measurements allows the identification of millennial-scale events such as the increase of reworked organic matter at the final retreat of the Barents ice sheet at the end of the deglaciation period (Termination Ib).
Abstract:
The late Miocene sediments of the Tyrrhenian ODP Site 654 encompass a deepening sequence which begins with glauconitic shallow-water sands, followed by a rapid transition to deep-water sediments, and culminates with dolomitic mudstones associated with Messinian evaporites. The sequence compares well with the so-called 'Sahelian cycle' and with post-orogenic cycles recognized in peninsular Italy and Sicily. The studied interval, consisting of 55 m of nannofossil oozes, belongs to the Globorotalia suterae subzone and the lower part of the Globorotalia conomiozea Zone, indicating late Tortonian and early Messinian age, respectively. Biomagnetostratigraphic correlation assigns the Tortonian/Messinian boundary an age of 6.44-6.45 Ma. In addition, six main events have been recognized, based on the range of keeled globorotaliids and coiling direction changes of keeled and unkeeled globorotaliids, which have been correlated to the geomagnetic time-scale. Comparison with North Atlantic sites and land sections of the Guadalquivir basin and northern Morocco provides good correlations with the events documented in these areas. In particular, Event IV, which predates the FO of Globorotalia conomiozea, may be used to recognize the Tortonian/Messinian boundary in extra-Mediterranean areas where G. conomiozea is missing. Variations in the distribution of different species of Globigerinoides are related to changes in the surficial marine environment. Although no clear trends can be recognized in the oxygen and carbon isotope records of Globigerinoides obliquus, the parallelism between the occurrence of low salinity species (G. sacculifer) and peaks of low δ18O values, as well as that of normal salinity species (G. obliquus) and peaks of high δ18O values, suggests strong local changes of environmental conditions. The high amplitude of the fluctuations of δ18O values suggests important variations in the salinity of the Tyrrhenian Sea, related to a rapidly changing water budget.
The major feature of the carbon isotope record is a large decrease between 7.0 and 6.95 Ma, which therefore predates the 6.2 Ma global 'carbon shift'.
Abstract:
Plane strain simple shearing of norcamphor (C7H10O) in a see-through deformation rig to a shear strain of γ = 10.5 at a homologous temperature of Th = 0.81 yields a microfabric similar to that of quartz in amphibolite facies mylonite. Synkinematic analysis of the norcamphor microfabric reveals that the development of a steady-state texture is linked to changes in the relative activities of several grain-scale mechanisms. Three stages of textural and microstructural evolution are distinguished: (1) rotation and shearing of the intracrystalline glide planes are accommodated by localized deformation along three sets of anastomosing microshears. A symmetrical c-axis girdle reflects localized pure shear extension along the main microshear set (Sa) oblique to the bulk shear zone boundary (abbreviated as SZB); (2) progressive rotation of the microshears into parallelism with the SZB increases the component of simple shear on the Sa microshears. Grain-boundary migration recrystallization favours the survival of grains with slip systems oriented for easy glide. This is associated with a textural transition towards two stable c-axis point maxima whose skeletal outline is oblique with respect to the Sa microshears and the SZB; and (3) at high shear strains (γ > 8), the microstructure, texture and mechanism assemblage are strain invariant, but strain continues to partition into rotating sets of microshears. Steady state is therefore a dynamic, heterogeneous condition involving the cyclic nucleation, growth and consumption of grains.
Abstract:
Since the early days of logic programming, researchers in the field realized the potential for exploitation of parallelism present in the execution of logic programs. Their high-level nature, the presence of nondeterminism, and their referential transparency, among other characteristics, make logic programs interesting candidates for obtaining speedups through parallel execution. At the same time, the fact that the typical applications of logic programming frequently involve irregular computations, make heavy use of dynamic data structures with logical variables, and involve search and speculation, makes the techniques used in the corresponding parallelizing compilers and run-time systems potentially interesting even outside the field. The objective of this article is to provide a comprehensive survey of the issues arising in parallel execution of logic programming languages along with the most relevant approaches explored to date in the field. Focus is mostly given to the challenges emerging from the parallel execution of Prolog programs. The article describes the major techniques used for shared memory implementation of Or-parallelism, And-parallelism, and combinations of the two. We also explore some related issues, such as memory management, compile-time analysis, and execution visualization.