956 resultados para Dual gratings parallel matched


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Conventional parallel computer architectures do not provide support for non-uniformly distributed objects. In this thesis, I introduce sparsely faceted arrays (SFAs), a new low-level mechanism for naming regions of memory, or facets, on different processors in a distributed, shared memory parallel processing system. Sparsely faceted arrays address the disconnect between the global distributed arrays provided by conventional architectures (e.g. the Cray T3 series), and the requirements of high-level parallel programming methods that wish to use objects that are distributed over only a subset of processing elements. A sparsely faceted array names a virtual globally-distributed array, but actual facets are lazily allocated. By providing simple semantics and making efficient use of memory, SFAs enable efficient implementation of a variety of non-uniformly distributed data structures and related algorithms. I present example applications which use SFAs, and describe and evaluate simple hardware mechanisms for implementing SFAs. Keeping track of which nodes have allocated facets for a particular SFA is an important task that suggests the need for automatic memory management, including garbage collection. To address this need, I first argue that conventional tracing techniques such as mark/sweep and copying GC are inherently unscalable in parallel systems. I then present a parallel memory-management strategy, based on reference-counting, that is capable of garbage collecting sparsely faceted arrays. I also discuss opportunities for hardware support of this garbage collection strategy. I have implemented a high-level hardware/OS simulator featuring hardware support for sparsely faceted arrays and automatic garbage collection. I describe the simulator and outline a few of the numerous details associated with a "real" implementation of SFAs and SFA-aware garbage collection. Simulation results are used throughout this thesis in the evaluation of hardware support mechanisms.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Euterpe is a real-time computer system for the modeling of musical structures. It provides a formalism wherein familiar concepts of musical analysis may be readily expressed. This is verified by its application to the analysis of a wide variety of conventional forms of music: Gregorian chant, Mediaeval polyphony, Back counterpoint, and sonata form. It may be of further assistance in the real-time experiments in various techniques of thematic development. Finally, the system is endowed with sound-synthesis apparatus with which the user may prepare tapes for musical performances.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Huelse, M, Barr, D R W, Dudek, P: Cellular Automata and non-static image processing for embodied robot systems on a massively parallel processor array. In: Adamatzky, A et al. (eds) AUTOMATA 2008, Theory and Applications of Cellular Automata. Luniver Press, 2008, pp. 504-510. Sponsorship: EPSRC

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Wallace, Joanne, et al., 'Body composition and bone mineral density changes during a premier league season as measured by dual-energy X-ray absorptiometry', International Journal of Body Composition Research (2006) 4(2) pp.61-66 RAE2008

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Li, Xing; Lu, Q. M.; Li, B., 'Ion Pickup by Finite Amplitude Parallel Propagating Alfven Waves', The Astrophysical Journal Letters (2007) 661(1) pp.L105-L108 RAE2008

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The proliferation of inexpensive workstations and networks has prompted several researchers to use such distributed systems for parallel computing. Attempts have been made to offer a shared-memory programming model on such distributed memory computers. Most systems provide a shared-memory that is coherent in that all processes that use it agree on the order of all memory events. This dissertation explores the possibility of a significant improvement in the performance of some applications when they use non-coherent memory. First, a new formal model to describe existing non-coherent memories is developed. I use this model to prove that certain problems can be solved using asynchronous iterative algorithms on shared-memory in which the coherence constraints are substantially relaxed. In the course of the development of the model I discovered a new type of non-coherent behavior called Local Consistency. Second, a programming model, Mermera, is proposed. It provides programmers with a choice of hierarchically related non-coherent behaviors along with one coherent behavior. Thus, one can trade-off the ease of programming with coherent memory for improved performance with non-coherent memory. As an example, I present a program to solve a linear system of equations using an asynchronous iterative algorithm. This program uses all the behaviors offered by Mermera. Third, I describe the implementation of Mermera on a BBN Butterfly TC2000 and on a network of workstations. The performance of a version of the equation solving program that uses all the behaviors of Mermera is compared with that of a version that uses coherent behavior only. For a system of 1000 equations the former exhibits at least a 5-fold improvement in convergence time over the latter. The version using coherent behavior only does not benefit from employing more than one workstation to solve the problem while the program using non-coherent behavior continues to achieve improved performance as the number of workstations is increased from 1 to 6. This measurement corroborates our belief that non-coherent shared memory can be a performance boon for some applications.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

For communication-intensive parallel applications, the maximum degree of concurrency achievable is limited by the communication throughput made available by the network. In previous work [HPS94], we showed experimentally that the performance of certain parallel applications running on a workstation network can be improved significantly if a congestion control protocol is used to enhance network performance. In this paper, we characterize and analyze the communication requirements of a large class of supercomputing applications that fall under the category of fixed-point problems, amenable to solution by parallel iterative methods. This results in a set of interface and architectural features sufficient for the efficient implementation of the applications over a large-scale distributed system. In particular, we propose a direct link between the application and network layer, supporting congestion control actions at both ends. This in turn enhances the system's responsiveness to network congestion, improving performance. Measurements are given showing the efficacy of our scheme to support large-scale parallel computations.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Predictability -- the ability to foretell that an implementation will not violate a set of specified reliability and timeliness requirements -- is a crucial, highly desirable property of responsive embedded systems. This paper overviews a development methodology for responsive systems, which enhances predictability by eliminating potential hazards resulting from physically-unsound specifications. The backbone of our methodology is the Time-constrained Reactive Automaton (TRA) formalism, which adopts a fundamental notion of space and time that restricts expressiveness in a way that allows the specification of only reactive, spontaneous, and causal computation. Using the TRA model, unrealistic systems – possessing properties such as clairvoyance, caprice, infinite capacity, or perfect timing -- cannot even be specified. We argue that this "ounce of prevention" at the specification level is likely to spare a lot of time and energy in the development cycle of responsive systems -- not to mention the elimination of potential hazards that would have gone, otherwise, unnoticed. The TRA model is presented to system developers through the Cleopatra programming language. Cleopatra features a C-like imperative syntax for the description of computation, which makes it easier to incorporate in applications already using C. It is event-driven, and thus appropriate for embedded process control applications. It is object-oriented and compositional, thus advocating modularity and reusability. Cleopatra is semantically sound; its objects can be transformed, mechanically and unambiguously, into formal TRA automata for verification purposes, which can be pursued using model-checking or theorem proving techniques. Since 1989, an ancestor of Cleopatra has been in use as a specification and simulation language for embedded time-critical robotic processes.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Programmers of parallel processes that communicate through shared globally distributed data structures (DDS) face a difficult choice. Either they must explicitly program DDS management, by partitioning or replicating it over multiple distributed memory modules, or be content with a high latency coherent (sequentially consistent) memory abstraction that hides the DDS' distribution. We present Mermera, a new formalism and system that enable a smooth spectrum of noncoherent shared memory behaviors to coexist between the above two extremes. Our approach allows us to define known noncoherent memories in a new simple way, to identify new memory behaviors, and to characterize generic mixed-behavior computations. The latter are useful for programming using multiple behaviors that complement each others' advantages. On the practical side, we show that the large class of programs that use asynchronous iterative methods (AIM) can run correctly on slow memory, one of the weakest, and hence most efficient and fault-tolerant, noncoherence conditions. An example AIM program to solve linear equations, is developed to illustrate: (1) the need for concurrently mixing memory behaviors, and, (2) the performance gains attainable via noncoherence. Other program classes tolerate weak memory consistency by synchronizing in such a way as to yield executions indistinguishable from coherent ones. AIM computations on noncoherent memory yield noncoherent, yet correct, computations. We report performance data that exemplifies the potential benefits of noncoherence, in terms of raw memory performance, as well as application speed.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Communication and synchronization stand as the dual bottlenecks in the performance of parallel systems, and especially those that attempt to alleviate the programming burden by incurring overhead in these two domains. We formulate the notions of communicable memory and lazy barriers to help achieve efficient communication and synchronization. These concepts are developed in the context of BSPk, a toolkit library for programming networks of workstations|and other distributed memory architectures in general|based on the Bulk Synchronous Parallel (BSP) model. BSPk emphasizes efficiency in communication by minimizing local memory-to-memory copying, and in barrier synchronization by not forcing a process to wait unless it needs remote data. Both the message passing (MP) and distributed shared memory (DSM) programming styles are supported in BSPk. MP helps processes efficiently exchange short-lived unnamed data values, when the identity of either the sender or receiver is known to the other party. By contrast, DSM supports communication between processes that may be mutually anonymous, so long as they can agree on variable names in which to store shared temporary or long-lived data.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Calligraphic writing presents a rich set of challenges to the human movement control system. These challenges include: initial learning, and recall from memory, of prescribed stroke sequences; critical timing of stroke onsets and durations; fine control of grip and contact forces; and letter-form invariance under voluntary size scaling, which entails fine control of stroke direction and amplitude during recruitment and derecruitment of musculoskeletal degrees of freedom. Experimental and computational studies in behavioral neuroscience have made rapid progress toward explaining the learning, planning and contTOl exercised in tasks that share features with calligraphic writing and drawing. This article summarizes computational neuroscience models and related neurobiological data that reveal critical operations spanning from parallel sequence representations to fine force control. Part one addresses stroke sequencing. It treats competitive queuing (CQ) models of sequence representation, performance, learning, and recall. Part two addresses letter size scaling and motor equivalence. It treats cursive handwriting models together with models in which sensory-motor tmnsformations are performed by circuits that learn inverse differential kinematic mappings. Part three addresses fine-grained control of timing and transient forces, by treating circuit models that learn to solve inverse dynamics problems.