894 results for HPC parallel computer architecture queues fault tolerance programmability ADAM
Abstract:
This paper presents the theoretical background, the architecture (described using the "4+1" model), and the use of AdapLib, a library for executing adaptive devices. The library was designed to be faithful to the theory of adaptive devices and to be easy to extend to the specific needs of solutions that employ this kind of device. As an example, a case study is presented in which the library was used to build a proof of concept that monitors and diagnoses problems in an online news portal.
Abstract:
The XSophe-Sophe-XeprView® computer simulation software suite enables scientists to easily determine spin Hamiltonian parameters from isotropic, randomly oriented and single crystal continuous wave electron paramagnetic resonance (CW EPR) spectra from radicals and isolated paramagnetic metal ion centers or clusters found in metalloproteins, chemical systems and materials science. XSophe provides an X-windows graphical user interface to the Sophe programme and allows: creation of multiple input files, local and remote execution of Sophe, the display of sophelog (output from Sophe) and input parameters/files. Sophe is a sophisticated computer simulation software programme employing a number of innovative technologies including: the Sydney OPera HousE (SOPHE) partition and interpolation schemes, a field segmentation algorithm, the mosaic misorientation linewidth model, parallelization and spectral optimisation. In conjunction with the SOPHE partition scheme and the field segmentation algorithm, the SOPHE interpolation scheme and the mosaic misorientation linewidth model greatly increase the speed of simulations for most spin systems. Employing brute force matrix diagonalization in the simulation of an EPR spectrum from a high spin Cr(III) complex with the spin Hamiltonian parameters g_e = 2.00, D = 0.10 cm⁻¹, E/D = 0.25, A_x = 120.0, A_y = 120.0, A_z = 240.0 × 10⁻⁴ cm⁻¹ requires a SOPHE grid size of N = 400 (to produce a good signal to noise ratio) and takes 229.47 s. In contrast, the use of either the SOPHE interpolation scheme or the mosaic misorientation linewidth model requires a SOPHE grid size of only N = 18 and takes 44.08 and 0.79 s, respectively. Results from Sophe are transferred via the Common Object Request Broker Architecture (CORBA) to XSophe and subsequently to XeprView® where the simulated CW EPR spectra (1D and 2D) can be compared to the experimental spectra.
Energy level diagrams, transition roadmaps and transition surfaces aid the interpretation of complicated randomly oriented CW EPR spectra and can be viewed with a web browser and an OpenInventor scene graph viewer.
Abstract:
One of the challenges in scientific visualization is to generate software libraries suitable for the large-scale data emerging from tera-scale simulations and instruments. We describe the efforts currently under way at SDSC and NPACI to address these challenges. The scope of the SDSC project spans data handling, graphics, visualization, and scientific application domains. Components of the research focus on the following areas: intelligent data storage, layout and handling, using an associated “Floor-Plan” (meta data); performance optimization on parallel architectures; extension of SDSC’s scalable, parallel, direct volume renderer to allow perspective viewing; and interactive rendering of fractional images (“imagelets”), which facilitates the examination of large datasets. These concepts are coordinated within a data-visualization pipeline, which operates on component data blocks sized to fit within the available computing resources. A key feature of the scheme is that the meta data, which tag the data blocks, can be propagated and applied consistently. This is possible at the disk level, in distributing the computations across parallel processors; in “imagelet” composition; and in feature tagging. The work reflects the emerging challenges and opportunities presented by the ongoing progress in high-performance computing (HPC) and the deployment of the data, computational, and visualization Grids.
Abstract:
The cost of spatial join processing can be very high because of the large sizes of spatial objects and the computation-intensive spatial operations. While parallel processing seems a natural solution to this problem, it is not clear how spatial data can be partitioned for this purpose. Various spatial data partitioning methods are examined in this paper. A framework combining the data-partitioning techniques used by most parallel join algorithms in relational databases and the filter-and-refine strategy for spatial operation processing is proposed for parallel spatial join processing. Object duplication caused by multi-assignment in spatial data partitioning can result in extra CPU cost as well as extra communication cost. We find that the key to overcoming this problem is to preserve spatial locality in task decomposition. We show in this paper that a near-optimal speedup can be achieved for parallel spatial join processing using our new algorithms.
Abstract:
Shear deformation of fault gouge or other particulate materials often results in observed strain localization or, more precisely, the localization of measured deformation gradients. In conventional elastic materials strain localization cannot take place; this phenomenon is therefore attributed to special types of non-elastic constitutive behaviour. For particulate materials, however, the Cosserat continuum, which accounts for microrotations independent of displacements, is a more appropriate model. In an elastic Cosserat continuum, localization of displacement gradients is possible under some combinations of the generalized Cosserat elastic moduli. The same combinations of parameters also correspond to a considerable dispersion in shear wave propagation, which can be used for independent experimental verification of the proposed mechanism of apparent strain localization in fault gouge.
Abstract:
Objectives: Lung hyperinflation may be assessed by computed tomography (CT). As shown for patients with emphysema, however, CT image reconstruction affects quantification of hyperinflation. We studied the impact of reconstruction parameters on hyperinflation measurements in mechanically ventilated (MV) patients. Design: Observational analysis. Setting: A University hospital-affiliated research Unit. Patients: The patients were MV patients with injured (n = 5) or normal lungs (n = 6), and spontaneously breathing patients (n = 5). Interventions: None. Measurements and results: Eight image series involving 3, 5, 7, and 10 mm slices and standard and sharp filters were reconstructed from identical CT raw data. Hyperinflated (V-hyper), normally (V-normal), poorly (V-poor), and nonaerated (V-non) volumes were calculated by densitometry as percentage of total lung volume (V-total). V-hyper obtained with the sharp filter systematically exceeded that with the standard filter showing a median (interquartile range) increment of 138 (62-272) ml corresponding to approximately 4% of V-total. In contrast, sharp filtering minimally affected the other subvolumes (V-normal, V-poor, V-non, and V-total). Decreasing slice thickness also increased V-hyper significantly. When changing from 10 to 3 mm thickness, V-hyper increased by a median value of 107 (49-252) ml in parallel with a small and inconsistent increment in V-non of 12 (7-16) ml. Conclusions: Reconstruction parameters significantly affect quantitative CT assessment of V-hyper in MV patients. Our observations suggest that sharp filters are inappropriate for this purpose. Thin slices combined with standard filters and more appropriate thresholds (e.g., -950 HU in normal lungs) might improve the detection of V-hyper. Different studies on V-hyper can only be compared if identical reconstruction parameters were used.
Abstract:
A field matching method is described to analyze a recessed circular cavity radiating into a radial waveguide. Using the wall impedance approach, the analysis is divided into two separate problems of the cavity and its external environment. Based on this analysis, a computer algorithm is developed for determining wall admittances as seen at the edge of the patch in the cavity, the radial admittance matrix for the two-probe feed arrangement, and the input impedance as observed from the coaxial line feeding the cavity. This algorithm is tested against the general-purpose Hewlett-Packard finite-element High Frequency Structure Simulator as well as against measured results. Good agreement in all considered cases is noted.
Abstract:
Developments in computer and three dimensional (3D) digitiser technologies have made it possible to keep track of the broad range of data required to simulate an insect moving around or over the highly heterogeneous habitat of a plant's surface. Properties of plant parts vary within a complex canopy architecture, and insect damage can induce further changes that affect an animal's movements, development and likelihood of survival. Models of plant architectural development based on Lindenmayer systems (L-systems) serve as dynamic platforms for simulation of insect movement, providing an explicit model of the developing 3D structure of a plant as well as allowing physiological processes associated with plant growth and responses to damage to be described and simulated. Simple examples of the use of the L-system formalism to model insect movement, operating at different spatial scales, from insects foraging on an individual plant to insects flying around plants in a field, are presented. Such models can be used to explore questions about the consequences of changes in environmental architecture and configuration on host finding, exploitation and its population consequences. In effect this model is a 'virtual ecosystem' laboratory to address local as well as landscape-level questions pertinent to plant-insect interactions, taking plant architecture into account. (C) 2002 Elsevier Science B.V. All rights reserved.
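The string-rewriting core of the L-system formalism mentioned above can be sketched as follows. This is only an illustration of parallel rewriting, not the plant or insect model from the paper: the function name `rewrite` is hypothetical, and the rules shown are Lindenmayer's textbook "algae" system (A → AB, B → A), chosen here as an assumption because the abstract gives no rules of its own.

```python
def rewrite(axiom, rules, steps):
    """Apply the production rules to every symbol in parallel, `steps` times.

    Symbols without a rule are copied unchanged.
    """
    state = axiom
    for _ in range(steps):
        state = "".join(rules.get(symbol, symbol) for symbol in state)
    return state


# Lindenmayer's algae system, used purely as an illustrative rule set.
ALGAE_RULES = {"A": "AB", "B": "A"}

if __name__ == "__main__":
    for n in range(5):
        print(n, rewrite("A", ALGAE_RULES, n))
```

In a plant model the symbols would stand for structural modules (for example internodes or leaves) and would carry parameters; the rewriting loop itself would keep the same shape.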
Abstract:
Joining efforts of academic and corporate teams, we developed an integration architecture - MULTIS - that enables corporate e-learning managers to use a Learning Management System (LMS) for management of educational activities in virtual worlds. This architecture was then implemented for the Formare LMS. In this paper we present this architecture and concretizations of its implementation for the Second Life Grid/OpenSimulator virtual world platforms. Current systems are focused on activities managed by individual trainers, rather than groups of trainers and large numbers of trainees: they focus on providing the LMS with information about educational activities taking place in a virtual world and/or being able to access within the virtual world some of the information stored in the LMS, and disregard the streamlining of activity setup and data collection in multi-trainer contexts, among other administrative issues. This architecture aims to overcome the limitations of existing systems for organizational management of corporate e-learning activities.
Abstract:
A novel high throughput and scalable unified architecture for the computation of the transform operations in video codecs for advanced standards is presented in this paper. This structure can be used as a hardware accelerator in modern embedded systems to efficiently compute all the two-dimensional 4 x 4 and 2 x 2 transforms of the H.264/AVC standard. Moreover, its highly flexible design and hardware efficiency allow it to be easily scaled in terms of performance and hardware cost to meet the specific requirements of any given video coding application. Experimental results obtained using a Xilinx Virtex-5 FPGA demonstrated the superior performance and hardware efficiency levels provided by the proposed structure, which presents a throughput per unit of area higher than that of other similar recently published designs targeting the H.264/AVC standard. Such results also showed that, when integrated in a multi-core embedded system, this architecture provides speedup factors of about 120x over pure software implementations of the transform algorithms, therefore allowing the computation, in real-time, of all the above mentioned transforms for Ultra High Definition Video (UHDV) sequences (4,320 x 7,680 @ 30 fps).
Abstract:
In this paper we survey the most relevant results for the priority-based schedulability analysis of real-time tasks, both for the fixed and dynamic priority assignment schemes. We give emphasis to the worst-case response time analysis in non-preemptive contexts, which is fundamental for the communication schedulability analysis. We define an architecture to support priority-based scheduling of messages at the application process level of a specific fieldbus communication network, the PROFIBUS. The proposed architecture improves the worst-case messages’ response time, overcoming the limitation of the first-come-first-served (FCFS) PROFIBUS queue implementations.
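The difference between an FCFS queue and a priority-ordered queue can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the PROFIBUS protocol or the architecture proposed in the paper: all messages are assumed to be queued at the same instant, each is transmitted non-preemptively one after another, durations are in arbitrary time units, a lower number means a higher priority, and the function and message names are hypothetical.

```python
import heapq


def completion_times_fcfs(messages):
    """messages: list of (name, priority, duration) in arrival order."""
    clock = 0
    done = {}
    for name, _priority, duration in messages:
        clock += duration  # each message is sent to completion, in arrival order
        done[name] = clock
    return done


def completion_times_priority(messages):
    """Same input, but the most urgent queued message is always sent next."""
    heap = [(priority, index, name, duration)
            for index, (name, priority, duration) in enumerate(messages)]
    heapq.heapify(heap)  # index breaks ties in arrival order
    clock = 0
    done = {}
    while heap:
        _priority, _index, name, duration = heapq.heappop(heap)
        clock += duration
        done[name] = clock
    return done


if __name__ == "__main__":
    queue = [("log", 3, 4), ("status", 2, 3), ("alarm", 1, 1)]
    print(completion_times_fcfs(queue))
    print(completion_times_priority(queue))
```

With this input the urgent `alarm` message completes after 8 time units under FCFS but after 1 under priority ordering, while the total time to drain the queue is unchanged.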
Abstract:
In the past years, Software Architecture has attracted increased attention by academia and industry as the unifying concept to structure the design of complex systems. One particular research area deals with the possibility of reconfiguring architectures to adapt the systems they describe to new requirements. Reconfiguration amounts to adding and removing components and connections, and may have to occur without stopping the execution of the system being reconfigured. This work contributes to the formal description of such a process. Taking as a premise that a single formalism hardly ever satisfies all requirements in every situation, we present three approaches, each one with its own assumptions about the systems it can be applied to and with different advantages and disadvantages. Each approach is based on work of other researchers and has the aesthetic concern of changing as little as possible the original formalism, keeping its spirit. The first approach shows how a given reconfiguration can be specified in the same manner as the system it is applied to and in a way to be efficiently executed. The second approach explores the Chemical Abstract Machine, a formalism for rewriting multisets of terms, to describe architectures, computations, and reconfigurations in a uniform way. The last approach uses a UNITY-like parallel programming design language to describe computations, represents architectures by diagrams in the sense of Category Theory, and specifies reconfigurations by graph transformation rules.
Abstract:
Dynamic parallel scheduling using work-stealing has gained popularity in academia and industry for its good performance, ease of implementation and theoretical bounds on space and time. Cores treat their own double-ended queues (deques) as a stack, pushing and popping threads from the bottom, but treat the deque of another randomly selected busy core as a queue, stealing threads only from the top, whenever they are idle. However, this standard approach cannot be directly applied to real-time systems, where the importance of parallelising tasks is increasing due to the limitations of multiprocessor scheduling theory regarding parallelism. Using one deque per core is obviously a source of priority inversion since high priority tasks may eventually be enqueued after lower priority tasks, possibly leading to deadline misses as in this case the lower priority tasks are the candidates when a stealing operation occurs. Our proposal is to replace the single non-priority deque of work-stealing with ordered per-processor priority deques of ready threads. The scheduling algorithm starts with a single deque per-core, but unlike traditional work-stealing, the total number of deques in the system may now exceed the number of processors. Instead of stealing randomly, cores steal from the highest priority deque.
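The idea of per-core priority deques with priority-aware stealing can be sketched as below. This is a single-threaded illustration under assumptions, not the authors' scheduler: the `Core` class, its method names and the `steal` helper are hypothetical, a lower number means a higher priority, and synchronisation between cores is omitted entirely.

```python
from collections import deque


class Core:
    """A core holding one deque of ready tasks per priority level."""

    def __init__(self, name):
        self.name = name
        self.deques = {}  # priority -> deque of tasks

    def push(self, priority, task):
        # Local pushes go to the bottom of the deque for that priority.
        self.deques.setdefault(priority, deque()).append(task)

    def highest_priority(self):
        nonempty = [p for p, d in self.deques.items() if d]
        return min(nonempty) if nonempty else None

    def pop_local(self):
        # The owner works stack-like (bottom) on its highest-priority deque.
        p = self.highest_priority()
        return None if p is None else (p, self.deques[p].pop())

    def pop_top(self):
        # Thieves take from the top of the highest-priority deque.
        p = self.highest_priority()
        return None if p is None else (p, self.deques[p].popleft())


def steal(idle_core, cores):
    """Steal from the core whose best ready task has the highest priority."""
    victims = [c for c in cores
               if c is not idle_core and c.highest_priority() is not None]
    if not victims:
        return None
    victim = min(victims, key=lambda c: c.highest_priority())
    return victim.pop_top()
```

For example, if core `a` holds only priority-2 tasks and core `b` holds priority-1 and priority-3 tasks, an idle core steals the oldest priority-1 task from `b` instead of picking a victim at random.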
Abstract:
Over the last three decades, computer architects have been able to achieve an increase in performance for single processors by, e.g., increasing clock speed, introducing cache memories and using instruction level parallelism. However, because of power consumption and heat dissipation constraints, this trend is going to cease. In recent times, hardware engineers have instead moved to new chip architectures with multiple processor cores on a single chip. With multi-core processors, applications can complete more total work than with one core alone. To take advantage of multi-core processors, parallel programming models have been proposed as promising solutions for using them more effectively. This paper discusses some of the existing models and frameworks for parallel programming, leading to an outline of a draft parallel programming model for Ada.