894 results for HPC parallel computer architecture queues fault tolerance programmability ADAM
Abstract:
We describe a procedure for the generation of chemically accurate computer-simulation models to study chemical reactions in the condensed phase. The process involves (i) the use of a coupled semiempirical quantum and classical molecular mechanics method to represent solutes and solvent, respectively; (ii) the optimization of semiempirical quantum mechanics (QM) parameters to produce a computationally efficient and chemically accurate QM model; (iii) the calibration of a quantum/classical microsolvation model using ab initio quantum theory; and (iv) the use of statistical mechanical principles and methods to simulate, on massively parallel computers, the thermodynamic properties of chemical reactions in aqueous solution. The utility of this process is demonstrated by the calculation of the enthalpy of reaction in vacuum and free energy change in aqueous solution for a proton transfer involving methanol, methoxide, imidazole, and imidazolium, which are functional groups involved with proton transfers in many biochemical systems. An optimized semiempirical QM model is produced, which results in the calculation of heats of formation of the above chemical species to within 1.0 kcal/mol (1 kcal = 4.18 kJ) of experimental values. The use of the calibrated QM and microsolvation QM/MM (molecular mechanics) models for the simulation of a proton transfer in aqueous solution gives a calculated free energy that is within 1.0 kcal/mol (12.2 calculated vs. 12.8 experimental) of a value estimated from experimental pKa values of the reacting species.
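The closing comparison relies on the standard relation between a pKa difference and a free energy change; a minimal sketch of that conversion, using assumed textbook pKa values rather than values taken from the paper:

```python
import math

R = 1.987e-3  # gas constant, kcal/(mol*K)
T = 298.15    # temperature, K

def delta_g_from_pka(pka_donor, pka_conjugate_acid):
    """Free energy (kcal/mol) for transferring a proton from an acid
    to a base whose conjugate acid has pka_conjugate_acid:
    dG = ln(10) * R * T * (pKa_donor - pKa_conjugate_acid)."""
    return math.log(10) * R * T * (pka_donor - pka_conjugate_acid)

# Assumed textbook values: methanol ~15.5, imidazolium ~7.0.
dG = delta_g_from_pka(15.5, 7.0)
print(f"estimated dG = {dG:.1f} kcal/mol")
```

With these assumed pKa values the estimate lands in the same 11-13 kcal/mol range as the abstract's numbers, which is the consistency the calibration is checking.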
Abstract:
We describe Janus, a massively parallel FPGA-based computer optimized for the simulation of spin glasses, theoretical models for the behavior of glassy materials. FPGAs (as compared to GPUs or many-core processors) provide a complementary approach to massively parallel computing. In particular, our model problem is formulated in terms of binary variables, and floating-point operations can be (almost) completely avoided. The FPGA architecture allows us to run many independent threads with almost no latencies in memory access, thus updating up to 1024 spins per cycle. We describe Janus in detail and we summarize the physics results obtained in four years of operation of this machine; we discuss two types of physics applications: long simulations on very large systems (which mimic and help to understand the experimental non-equilibrium dynamics), and low-temperature equilibrium simulations using an artificial parallel tempering dynamics. The time scale of our non-equilibrium simulations spans eleven orders of magnitude (from picoseconds to a tenth of a second). On the other hand, our equilibrium simulations are unprecedented both for the low temperatures reached and for the large systems that we have brought to equilibrium. A finite-time scaling ansatz emerges from the detailed comparison of the two sets of simulations. Janus has made it possible to perform spin glass simulations that would take several decades on more conventional architectures. The paper ends with an assessment of the potential of possible future versions of the Janus architecture, based on state-of-the-art technology.
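The single-spin-flip dynamics that Janus accelerates can be sketched in plain software; this is a conceptual 2D Ising Metropolis sweep, not the Janus spin-glass code or its multi-spin hardware encoding:

```python
import math
import random

def metropolis_sweep(spins, L, beta):
    """One Metropolis sweep of a 2D +/-1 Ising model with periodic
    boundaries -- the kind of local update Janus performs for up to
    1024 spins per clock cycle in hardware."""
    for _ in range(L * L):
        i, j = random.randrange(L), random.randrange(L)
        nn = (spins[(i + 1) % L][j] + spins[(i - 1) % L][j]
              + spins[i][(j + 1) % L] + spins[i][(j - 1) % L])
        dE = 2 * spins[i][j] * nn  # energy cost of flipping (coupling J = 1)
        if dE <= 0 or random.random() < math.exp(-beta * dE):
            spins[i][j] = -spins[i][j]

L = 16
random.seed(0)
spins = [[random.choice((-1, 1)) for _ in range(L)] for _ in range(L)]
for _ in range(50):
    metropolis_sweep(spins, L, beta=1.0)
m = abs(sum(sum(row) for row in spins)) / (L * L)
print(f"|magnetization| after 50 sweeps at beta=1.0: {m:.2f}")
```

A spin glass replaces the uniform coupling with quenched random couplings, which is precisely what makes equilibration so slow and the hardware acceleration worthwhile.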
Abstract:
This paper describes JANUS, a modular, massively parallel, and reconfigurable FPGA-based computing system. Each JANUS module has a computational core and a host. The computational core is a 4x4 array of FPGA-based processing elements with nearest-neighbor data links. Processors are also directly connected to an I/O node attached to the JANUS host, a conventional PC. JANUS is tailored for, but not limited to, the requirements of a class of hard scientific applications characterized by regular code structure, unconventional data-manipulation instructions, and moderate database size. We discuss the architecture of this configurable machine and focus on its use for Monte Carlo simulations of statistical mechanics. On this class of applications JANUS achieves impressive performance: in some cases one JANUS processing element outperforms high-end PCs by a factor of ≈1000. We also discuss the role of JANUS in other classes of scientific applications.
Abstract:
This research presents the development and implementation of fault-location algorithms for power distribution networks with distributed generation units installed along their feeders. The proposed algorithms locate a fault based on the voltage and current signals recorded by intelligent electronic devices installed at the ends of the feeder sections, the information needed to compute the loads connected to these feeders and their electrical characteristics, and the operating status of the network. In addition, this work presents a study of analytical models of distributed generation and load technologies that could contribute to the performance of the proposed fault-location algorithms. The algorithms were validated through computer simulations, with the network models implemented in ATP and the algorithms implemented in MATLAB.
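Impedance-based methods are a common starting point for fault location from one-end voltage and current phasors; a minimal reactance-method sketch with invented line parameters (not the thesis's algorithm, which also accounts for distributed generation, loads, and network status):

```python
def fault_distance_km(v_phasor, i_phasor, x_per_km):
    """Single-ended reactance method: estimate the distance to a fault
    from the voltage/current phasors measured at one end of a line.
    The imaginary part of V/I approximates the reactance to the fault,
    which is then divided by the per-km line reactance."""
    z_apparent = v_phasor / i_phasor
    return z_apparent.imag / x_per_km

# Hypothetical measurement: fault 10 km away on a 0.35 ohm/km line,
# through a 5-ohm resistive fault path.
v = 1000 + 500j                  # measured voltage phasor (illustrative)
x_line = 0.35                    # line reactance, ohm/km (assumed)
z_fault = 5 + (10 * x_line) * 1j # resistance + reactance to the fault
i = v / z_fault                  # consistent current phasor
print(f"estimated distance: {fault_distance_km(v, i, x_line):.1f} km")
```

The simple method ignores in-feed from distributed generation between the device and the fault, which is exactly the effect the proposed algorithms are designed to handle.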
Abstract:
We describe the hardwired implementation of algorithms for Monte Carlo simulations of a large class of spin models. We have implemented these algorithms as VHDL codes and we have mapped them onto a dedicated processor based on a large FPGA device. The measured performance on one such processor is comparable to O(100) carefully programmed high-end PCs: it turns out to be even better for some selected spin models. We describe here codes that we are currently executing on the IANUS massively parallel FPGA-based system.
Abstract:
A parallel algorithm for image noise removal is proposed. The algorithm is based on the peer-group concept and uses a fuzzy metric. An optimization study on the use of the CUDA platform to remove impulsive noise with this algorithm is presented, together with an implementation of the algorithm on multi-core platforms using OpenMP. Performance is evaluated in terms of execution time, comparing the multi-core implementation, the GPU implementation, and the combination of both. A performance analysis with large images is conducted to determine how many pixels to allocate to the CPU and to the GPU; the observed execution times show that both devices should share the work, with most of it assigned to the GPU. Results show that parallel implementations of denoising filters on GPUs and multi-cores are highly advisable, and they open the door to using such algorithms for real-time processing.
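The peer-group idea can be illustrated on a grayscale image; a minimal sequential sketch with assumed threshold values (the paper's filter is color-oriented, uses a fuzzy metric, and is parallelized with CUDA/OpenMP):

```python
import numpy as np

def peer_group_filter(img, d=40, k=3):
    """Grayscale sketch of peer-group denoising: a pixel is kept if at
    least k of its 8 neighbours lie within distance d of its value;
    otherwise it is declared impulsive noise and replaced by the
    median of its 3x3 window. Thresholds d and k are assumptions."""
    out = img.copy()
    h, w = img.shape
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            win = img[y - 1:y + 2, x - 1:x + 2].astype(int)
            # count neighbours close to the centre (exclude the centre itself)
            peers = np.sum(np.abs(win - int(img[y, x])) <= d) - 1
            if peers < k:
                out[y, x] = np.median(win)
    return out

clean = np.full((16, 16), 128, dtype=np.uint8)
noisy = clean.copy()
noisy[5, 5] = 255  # one impulsive pixel
restored = peer_group_filter(noisy)
print(noisy[5, 5], restored[5, 5])
```

Each pixel's test is independent of the others, which is why the filter maps naturally onto GPU threads and OpenMP loops, and why the image can be split between CPU and GPU by row ranges.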
Abstract:
In this work, we present a multi-camera surveillance system based on the use of self-organizing neural networks to represent events in video. The system processes several tasks in parallel using GPUs (graphics processing units). It addresses multiple vision tasks at various levels, such as segmentation, representation or characterization, analysis, and monitoring of movement. These features allow the construction of a robust representation of the environment and the interpretation of the behavior of mobile agents in the scene. It is also necessary to integrate the vision module into a global system that operates in a complex environment, receiving images from multiple acquisition devices at video frequency. To offer relevant information to higher-level systems and to monitor and make decisions in real time, it must satisfy a set of requirements, such as time constraints, high availability, robustness, high processing speed, and re-configurability. We have built a system able to represent and analyze the motion in video acquired by a multi-camera network and to process multi-source data in parallel on a multi-GPU architecture.
Abstract:
The explosive growth of traffic in computer systems has made it clear that traditional control techniques are not adequate to give users fast access to network resources and to prevent unfair use. In this paper, we present a reconfigurable digital hardware implementation of a specific neural model for intrusion detection. It uses a characterization vector for network packets (the intrusion vector), built from information obtained during the access attempt; this vector is then processed by the system. Our approach is adaptive and detects intrusions using a multilayer perceptron, a well-known artificial intelligence method. The implementation has been developed and tested on reconfigurable hardware (an FPGA) for embedded systems. Finally, the intrusion detection system was tested in a real-world simulation to gauge its effectiveness and real-time response.
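FPGA implementations of multilayer perceptrons typically replace floating point with fixed-point integer arithmetic; a minimal sketch of one such layer with invented Q8.8 weights and sizes (not the paper's trained network):

```python
FRAC_BITS = 8  # Q8.8 fixed-point format, a common choice in FPGA designs

def to_fixed(x):
    """Convert a real number to Q8.8 fixed point."""
    return int(round(x * (1 << FRAC_BITS)))

def fixed_mul(a, b):
    """Multiply two Q8.8 numbers, rescaling back to Q8.8."""
    return (a * b) >> FRAC_BITS

def mlp_layer(inputs, weights, biases):
    """One fully connected layer in integer-only arithmetic with a
    hard-threshold activation, sketching how a multilayer perceptron
    maps onto FPGA logic (all values here are invented)."""
    outs = []
    for w_row, b in zip(weights, biases):
        acc = b
        for x, w in zip(inputs, w_row):
            acc += fixed_mul(x, w)
        outs.append(1 if acc > 0 else 0)  # step activation
    return outs

# Hypothetical 3-feature "intrusion vector" scored by a 2-neuron layer.
vec = [to_fixed(0.9), to_fixed(0.1), to_fixed(0.4)]
W = [[to_fixed(1.5), to_fixed(-2.0), to_fixed(0.5)],
     [to_fixed(-1.0), to_fixed(1.0), to_fixed(1.0)]]
b = [to_fixed(-0.5), to_fixed(0.2)]
print(mlp_layer(vec, W, b))
```

Integer multiply-accumulate chains of this kind map directly onto FPGA DSP blocks, which is what makes the real-time response mentioned in the abstract achievable.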
Abstract:
With the development of embedded applications and driving-assistance systems, it becomes relevant to develop parallel mechanisms to check and diagnose these new systems. In this thesis we focus on one such mechanism, analytical redundancy, for fault diagnosis of an automotive suspension system. We consider a quarter-car passive suspension model and use a parameter-estimation method based on an ARX model to detect faults occurring in the damper and spring of the system. We then deploy a neural-network classifier to isolate the faults and identify where each fault occurs, so that safety measures and redundancies can take effect to prevent system failure. It is shown that the ARX estimator can quickly detect faults online using vertical-acceleration and displacement data from sensors that are common in today's vehicles. The clear divergence in the ARX response makes it easy to set a threshold that raises an alarm to the vehicle's intelligent system, and the neural classifier can quickly indicate where the fault occurred.
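The detection step can be sketched as least-squares ARX identification on healthy data followed by residual monitoring; a minimal example on a synthetic second-order system with an artificial parameter fault (the plant, fault, and threshold are assumptions, not the thesis's suspension model):

```python
import numpy as np

def fit_arx(y, u, na=2, nb=2):
    """Least-squares fit of an ARX model
    y[k] = -a1*y[k-1] - ... - a_na*y[k-na] + b1*u[k-1] + ... + b_nb*u[k-nb]."""
    n = max(na, nb)
    rows, targets = [], []
    for k in range(n, len(y)):
        rows.append([-y[k - i] for i in range(1, na + 1)]
                    + [u[k - i] for i in range(1, nb + 1)])
        targets.append(y[k])
    theta, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return theta

def residuals(y, u, theta, na=2, nb=2):
    """One-step-ahead prediction errors of the identified ARX model."""
    n = max(na, nb)
    res = []
    for k in range(n, len(y)):
        pred = sum(-theta[i - 1] * y[k - i] for i in range(1, na + 1)) \
             + sum(theta[na + i - 1] * u[k - i] for i in range(1, nb + 1))
        res.append(y[k] - pred)
    return np.array(res)

# Synthetic second-order plant; a "fault" halves one coefficient at k=200,
# loosely mimicking a degraded damper or spring.
rng = np.random.default_rng(1)
u = rng.standard_normal(400)
y = np.zeros(400)
for k in range(2, 400):
    b1 = 0.5 if k < 200 else 0.25
    y[k] = 1.2 * y[k - 1] - 0.5 * y[k - 2] + b1 * u[k - 1] + 0.1 * u[k - 2]

theta = fit_arx(y[:200], u[:200])  # identify on healthy data only
r = residuals(y, u, theta)
print("healthy RMS:", np.sqrt(np.mean(r[:150] ** 2)))
print("faulty  RMS:", np.sqrt(np.mean(r[250:] ** 2)))
```

The jump in residual magnitude after the fault is what a simple threshold can turn into an online alarm, with a classifier then attributing the fault to a specific component.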
Abstract:
Software erosion can be controlled by periodically checking for consistency between the de facto architecture and its theoretical counterpart. Studies show that this process is often not automated and that developers still rely heavily on manual reviews, despite the availability of a large number of tools. This is partially due to the high cost involved in setting up and maintaining tool-specific and incompatible test specifications that replicate otherwise documented invariants. To reduce this cost, our approach consists in unifying the functionality provided by existing tools under the umbrella of a common business-readable DSL. By using a declarative language, we are able to write tool-agnostic rules that are simple enough to be understood by non-technical stakeholders and, at the same time, can be interpreted as a rigorous specification for checking architecture conformance.
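The flavor of such a business-readable, tool-agnostic rule can be illustrated with a toy checker; the rule syntax, module names, and dependency data below are invented, not the paper's DSL:

```python
import re

# A hypothetical business-readable rule, in the spirit of a declarative
# architecture-conformance DSL:
RULE = "modules in 'ui' must not depend on modules in 'db'"

# De facto architecture: module -> set of modules it depends on (invented).
DEPENDENCIES = {
    "ui.login":  {"core.auth", "db.users"},   # violates the rule
    "ui.home":   {"core.render"},
    "core.auth": {"db.users"},                # allowed: core may use db
}

def check(rule, deps):
    """Interpret the single rule shape "modules in 'A' must not depend
    on modules in 'B'" against a dependency graph, returning the
    offending (module, dependency) pairs."""
    src, dst = re.findall(r"'([^']+)'", rule)
    return [(m, d)
            for m, targets in deps.items() if m.startswith(src + ".")
            for d in targets if d.startswith(dst + ".")]

violations = check(RULE, DEPENDENCIES)
print(violations)
```

Because the rule text names only architectural concepts, the same specification can be handed to non-technical stakeholders for review and to any back-end tool for enforcement.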
Abstract:
Analogue model experiments using both brittle and viscous materials were performed to investigate the development and interaction of strike-slip faults in zones of distributed shear deformation. At low strain, bulk dextral shear deformation of an initial rectangular model is dominantly accommodated by left-stepping, en echelon strike-slip faults (Riedel shears, R) that form in response to the regional (bulk) stress field. Push-up zones form in the area of interaction between adjacent left-stepping Riedel shears. In cross sections, faults bounding push-up zones have an arcuate shape or merge at depth. Adjacent left-stepping R shears merge by sideways propagation or link by short synthetic shears that strike subparallel to the bulk shear direction. Coalescence of en echelon R shears results in major, through-going fault zones (master faults). Several parallel master faults develop due to the distributed nature of deformation. Spacing between master faults is related to the thickness of the brittle layers overlying the basal viscous layer. Master faults control to a large extent the subsequent fault pattern. With increasing strain, relatively short antithetic and synthetic faults develop mostly between old, but still active master faults. The orientation and evolution of the new faults indicate local modifications of the stress field. In experiments lacking lateral borders, closely spaced parallel antithetic faults (cross faults) define blocks that undergo clockwise rotation about a vertical axis with continuing deformation. Fault development and fault interaction at different stages of shear strain in our models show similarities with natural examples that have undergone distributed shear.
Abstract:
National Highway Traffic Safety Administration, Washington, D.C.
Abstract:
National Highway Traffic Safety Administration, Washington, D.C.
Abstract:
"UILU-ENG 78 1745."
Abstract:
Includes bibliographical references.