778 results for Scientific computing
Abstract:
Dissertation submitted for the degree of Master in Informatics Engineering
Abstract:
This study looks at how increased memory utilisation affects throughput and energy consumption in scientific computing, especially in high-energy physics. Our aim is to minimise the energy consumed by a set of jobs without increasing the processing time. Earlier tests indicated that, especially in data analysis, throughput can increase by over 100% and energy consumption can decrease by 50% when multiple jobs are processed in parallel per CPU core. Since jobs are heterogeneous, it is not possible to find a single optimum value for the number of parallel jobs. A better solution is based on memory utilisation, but finding an optimum memory threshold is not straightforward. Therefore, a fuzzy logic-based algorithm was developed that can dynamically adapt the memory threshold based on the overall load. In this way, it is possible to keep memory consumption stable under different workloads while achieving significantly higher throughput and energy efficiency than traditional approaches that use a fixed number of jobs or a fixed memory threshold.
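The abstract does not give the rule base, so the following is only a minimal sketch of how a fuzzy controller might adapt a memory threshold to the overall load. The membership functions, the three rules, the step size of 0.05 and all function names are assumptions made for illustration, not the authors' algorithm.

```python
def tri(x, a, b, c):
    """Triangular membership function: 0 outside (a, c), 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def threshold_step(load):
    """Map the overall load (0..1) to a threshold change using three fuzzy rules."""
    # Hypothetical rule base: low load -> raise, medium -> hold, high -> lower.
    memberships = {
        "low": tri(load, -0.5, 0.0, 0.5),
        "medium": tri(load, 0.0, 0.5, 1.0),
        "high": tri(load, 0.5, 1.0, 1.5),
    }
    actions = {"low": +0.05, "medium": 0.0, "high": -0.05}
    total = sum(memberships.values())
    if total == 0.0:
        return 0.0
    # Weighted-average (zero-order Sugeno) defuzzification.
    return sum(memberships[k] * actions[k] for k in actions) / total


def adapt_threshold(threshold, load, lo=0.3, hi=0.95):
    """Apply the fuzzy step and clamp the threshold to assumed bounds."""
    return min(hi, max(lo, threshold + threshold_step(load)))


def can_start_job(memory_used_fraction, threshold):
    """Admit a new job only while memory utilisation is below the threshold."""
    return memory_used_fraction < threshold
```

A scheduler could call `adapt_threshold` periodically and consult `can_start_job` before launching another job on a core.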
Abstract:
This paper describes JANUS, a modular, massively parallel and reconfigurable FPGA-based computing system. Each JANUS module has a computational core and a host. The computational core is a 4x4 array of FPGA-based processing elements with nearest-neighbor data links. Processors are also directly connected to an I/O node attached to the JANUS host, a conventional PC. JANUS is tailored for, but not limited to, the requirements of a class of hard scientific applications characterized by regular code structure, unconventional data manipulation instructions and a database size that is not too large. We discuss the architecture of this configurable machine and focus on its use for Monte Carlo simulations in statistical mechanics. On this class of applications JANUS achieves impressive performance: in some cases one JANUS processing element outperforms high-end PCs by a factor of ≈1000. We also discuss the role of JANUS in other classes of scientific applications.
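To illustrate the kind of workload meant by Monte Carlo simulations in statistical mechanics, here is a minimal software sketch of a Metropolis sweep. The choice of the 2D Ising model, the lattice size and the coupling J = 1 are assumptions for illustration; the abstract does not specify the models run on JANUS, and this is ordinary Python, not FPGA code.

```python
import math
import random


def metropolis_sweep(spins, beta, rng):
    """One Metropolis sweep of a 2D Ising lattice with periodic boundaries."""
    n = len(spins)
    for i in range(n):
        for j in range(n):
            # Sum of the four nearest neighbours (regular, local data access).
            neighbours = (
                spins[(i - 1) % n][j] + spins[(i + 1) % n][j]
                + spins[i][(j - 1) % n] + spins[i][(j + 1) % n]
            )
            delta_e = 2 * spins[i][j] * neighbours  # energy change for J = 1
            if delta_e <= 0 or rng.random() < math.exp(-beta * delta_e):
                spins[i][j] = -spins[i][j]


def magnetisation(spins):
    """Mean spin per site."""
    n = len(spins)
    return sum(sum(row) for row in spins) / (n * n)


rng = random.Random(0)
lattice = [[1] * 16 for _ in range(16)]  # start fully ordered
for _ in range(50):
    metropolis_sweep(lattice, beta=1.0, rng=rng)
```

The inner loop touches only nearest neighbours and uses simple integer and bit-like operations, which is the regular structure that maps well onto an array of processing elements with nearest-neighbor links.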
Abstract:
We report on the observed differences in production rates of strange and multistrange baryons in Au+Au collisions at √s_NN = 200 GeV compared to p+p interactions at the same energy. The strange baryon yields in Au+Au collisions, when scaled down by the number of participating nucleons, are enhanced relative to those measured in p+p reactions. The enhancement observed increases with the strangeness content of the baryon, and it increases for all strange baryons with collision centrality. The enhancement is qualitatively similar to that observed at the lower collision energy √s_NN = 17.3 GeV. The previous observations are for the bulk production, while at intermediate p_T, 1 < p_T < 4 GeV/c, the strange baryons even exceed binary scaling from p+p yields.
Abstract:
A method to compute three-dimensional (3D) left ventricle (LV) motion, together with a color-coded visualization scheme for qualitative analysis in SPECT images, is proposed. It is used to investigate some aspects of Cardiac Resynchronization Therapy (CRT). The method was applied to 3D gated-SPECT image sets from normal subjects and from patients with severe Idiopathic Heart Failure, before and after CRT. Color-coded visualization maps representing the LV regional motion showed a significant difference between patients and normal subjects. Moreover, they indicated a difference between the two groups. Numerical results of regional mean values representing the intensity and direction of movement in the radial direction are presented. A difference of one order of magnitude in the intensity of the movement in patients relative to normal subjects was observed. Quantitative and qualitative parameters gave good indications of the potential application of the technique to the diagnosis and follow-up of patients undergoing CRT.
Abstract:
Context. Cluster properties can be more distinctly studied in pairs of clusters, where we expect the effects of interactions to be strong. Aims. We here discuss the properties of the double cluster Abell 1758 at a redshift z ~ 0.279. These clusters show strong evidence for merging. Methods. We analyse the optical properties of the North and South clusters of Abell 1758 based on deep archival imaging obtained with the Megaprime/Megacam camera on the Canada-France-Hawaii Telescope (CFHT) in the g' and r' bands, covering a total region of about 1.05 x 1.16 deg^2, or 16.1 x 17.6 Mpc^2. Our X-ray analysis is based on archive XMM-Newton images. Numerical simulations were performed using an N-body algorithm to treat the dark-matter component, a semi-analytical galaxy-formation model for the evolution of the galaxies and a grid-based hydrodynamic code with a piecewise parabolic method (PPM) scheme for the dynamics of the intra-cluster medium. We computed galaxy luminosity functions (GLFs) and 2D temperature and metallicity maps of the X-ray gas, which we then compared to the results of our numerical simulations. Results. The GLFs of Abell 1758 North are well fit by Schechter functions in the g' and r' bands, but with a small excess of bright galaxies, particularly in the r' band; their faint-end slopes are similar in both bands. In contrast, the GLFs of Abell 1758 South are not well fit by Schechter functions: excesses of bright galaxies are seen in both bands; the faint end of the GLF is not very well defined in g'. The GLF computed from our numerical simulations assuming a halo mass-luminosity relation agrees with those derived from the observations. From the X-ray analysis, the most striking features are structures in the metal distribution. We found two elongated regions of high metallicity in Abell 1758 North with two peaks towards the centre. In contrast, Abell 1758 South shows a deficit of metals in its central regions. 
Comparing observational results to those derived from numerical simulations, we could mimic the most prominent features present in the metallicity map and propose an explanation for the dynamical history of the cluster. We found in particular that in the metal-rich elongated regions of the North cluster, winds had been more efficient than ram-pressure stripping in transporting metal-enriched gas to the outskirts. Conclusions. We confirm the merging structure of the North and South clusters, both at optical and X-ray wavelengths.
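For readers unfamiliar with the Schechter function used to fit the GLFs, a minimal sketch of its standard magnitude form follows. The function and parameter names are chosen here for illustration, and no fitted values from the paper are implied.

```python
import math


def schechter_mag(M, phi_star, M_star, alpha):
    """Schechter luminosity function per unit magnitude.

    phi_star: normalisation, M_star: characteristic magnitude,
    alpha: faint-end slope.
    """
    x = 10.0 ** (0.4 * (M_star - M))  # luminosity in units of L*
    return 0.4 * math.log(10.0) * phi_star * x ** (alpha + 1.0) * math.exp(-x)
```

The exponential factor cuts the function off for galaxies brighter than `M_star`, so an excess of bright galaxies like the one reported above appears as counts lying above this curve at the bright end, while `alpha` sets the faint-end slope.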
Abstract:
Poster presented at the Sixth Annual International Conference on Open Repositories (OR11) held on 6-11th June, Austin, Texas
Abstract:
Advances in computing power today come from parallel processing, made possible by the characteristics of new hardware architectures. Using this hardware well accelerates the algorithms (programs) being executed. However, converting an algorithm properly into its parallel form is complex, and that form is in turn specific to each type of parallel hardware. The most common general-purpose processors today are multicore parallel processors, also called Symmetric Multi-Processors (SMP). It is now hard to find a desktop processor without some SMP-style parallelism, and the development trend is toward processors with an ever larger number of cores. Video processing devices (Graphics Processor Units, GPU) have in turn increased their computing power by incorporating multiple processing units in their electronics, to the point that it is now easy to find GPU boards capable of 200 to 400 parallel processing threads. These processors are very fast and specific to the task for which they were developed, mainly video processing. However, since this kind of processing has much in common with scientific computing, these devices have been reoriented under the name General Processing Graphics Processor Unit (GPGPU). Unlike the SMP processors mentioned above, GPGPUs are not general purpose, and they are difficult to use for general tasks because of the limited amount of memory available on each board and the type of parallel processing needed to use them productively. 
Programmable logic devices (FPGAs) can perform large numbers of operations in parallel, so they can be used to implement specific algorithms that exploit this parallelism. Their drawback is the complexity of programming and testing the algorithm instantiated on the device. Given this diversity of parallel processors, our work aims to analyse the specific characteristics of each one and their impact on the structure of algorithms, so that their use yields processing performance in line with the resources used, and to combine them so that they complement each other beneficially. Specifically, starting from the hardware characteristics, we determine the properties a parallel algorithm must have in order to be accelerated. The characteristics of the parallel algorithms in turn determine which of these new types of hardware are best suited to implement them. In particular, we take into account the degree of data dependency, the need for synchronisation during parallel processing, the size of the data to be processed and the complexity of parallel programming on each type of hardware. Today's advances in high-performance computing are driven by the parallel processing capabilities of available hardware architectures. These architectures enable the acceleration of algorithms when these algorithms are properly parallelized and exploit the specific processing power of the underlying architecture. Most current processors are targeted at general purposes and integrate several processor cores on a single chip, resulting in what is known as a Symmetric Multiprocessing (SMP) unit. Nowadays even desktop computers make use of multicore processors. 
Meanwhile, the industry trend is to increase the number of integrated processor cores as technology matures. On the other hand, Graphics Processor Units (GPU), originally designed to handle only video processing, have emerged as interesting alternatives for algorithm acceleration. Currently available GPUs can run from 200 to 400 threads for parallel processing. Scientific computing can be implemented on this hardware thanks to the programmability of new GPUs, which have been denoted General Processing Graphics Processor Units (GPGPU). However, GPGPUs offer little memory compared with that available to general-purpose processors; thus, the implementation of algorithms needs to be addressed carefully. Finally, Field Programmable Gate Arrays (FPGA) are programmable devices that can implement hardware logic with low latency, high parallelism and deep pipelines. These devices can be used to implement specific algorithms that need to run at very high speeds. However, they are harder to program than software approaches, and debugging is typically time-consuming. In this context, where several alternatives for speeding up algorithms are available, our work aims at determining the main features of these architectures and developing the know-how required to accelerate algorithm execution on them. We look at identifying those algorithms that may fit better on a given architecture as well as compleme
Abstract:
Report on the research stay carried out at the Department of Chemistry, University of North Texas (USA), from September until November 2006. It covers two computational chemistry studies: an experimental and computational study of the intra- and intermolecular hydroarylation of isonitriles, and the development of an improved catalyst for hydrocarbon functionalization.
Abstract:
The rapid growth of multicore systems, and the various approaches they have taken, allow complex processes that previously could run only on supercomputers to be executed today on low-cost solutions, also called "commodity hardware". Such solutions can be built using the processors most in demand in the mass consumer market (Intel and AMD). When these solutions are scaled to scientific computing requirements, it becomes essential to have methods for measuring the performance they offer and how they behave under different workloads. Because of the large number of workload types on the market, and even within scientific computing, it is necessary to establish "typical" measurements that can support the evaluation and procurement of solutions with a high degree of confidence in how they will perform. This research proposes a practical approach to such evaluation and presents the results of tests run on machines with AMD and Intel multicore architectures.
Abstract:
The spatial limits of the active site in the benzylic hydroxylase enzyme of the fungus Mortierella isabellina were investigated. Several molecular probes were used in incubation experiments to determine the acceptability of each compound by this enzyme. The yields of benzylic alcohols provided information on the acceptability of the particular compound in the active site, and the enantiomeric excess values provided information on the "fit" of acceptable substrates. Measurements of the molecular models were made using the Cambridge Scientific Computing Inc. CSC Chem 3D Plus modeling program. The dimensional limits of the aromatic binding pocket of the benzylic hydroxylase were tested using suitably substituted ethyl benzenes. Both the depth (para substituted substrates) and width (ortho and meta substituted substrates) of this region were investigated, with results demonstrating absolute spatial limits in both directions in the plane of the aromatic ring of 7.3 Angstroms for the depth and 7.1 Angstroms for the width. A minimum requirement for the height of this region has also been established at 6.2 Angstroms. The region containing the active oxygen species was also investigated, using a series of alkylphenylmethanes and fused ring systems in indan, 1,2,3,4-tetrahydronaphthalene and benzocycloheptene substrates. A maximum distance of 6.9 Angstroms (including the 1.5 Angstroms from the phenyl substituent to the active center of the heme prosthetic group of the enzyme) has been established extending directly in front of the aromatic binding pocket. The other dimensions in this region of the benzylic hydroxylase active site will require further investigation to establish maximum allowable values. An explanation of the stereochemical distributions in the obtained products has also been put forth that correlates well with the experimental observations.
Abstract:
The source code of the library that was developed accompanies this deposit in the state it was in at that time. A more up-to-date version may be available on GitHub (http://github.com/abergeron).
Abstract:
The basic concepts of digital signal processing are taught to students in engineering and science. The focus of such courses is on linear, time-invariant systems. The question of what happens when the system is governed by a quadratic or cubic equation remains unanswered in the vast majority of the literature on signal processing. Light was shed on this problem when V. John Mathews and Giovanni L. Sicuranza published the book Polynomial Signal Processing. This book opened up a new vista of polynomial systems for signal and image processing. It presented the theory and implementations of both adaptive and non-adaptive FIR and IIR quadratic systems, which offer better performance than conventional linear systems. The theory of quadratic systems is a largely unexplored area of research that involves computationally intensive work. Once the area of research is selected, the next issue is the choice of the software tool to carry out the work. Conventional languages like C and C++ are easily eliminated as they are not interpreted and lack good-quality plotting libraries. MATLAB proved to be very slow, as did Scilab and Octave. The search for a language for scientific computing that was as fast as C, but with a good-quality plotting library, ended with Python, a distant relative of LISP. It proved to be ideal for scientific computing. An account of the use of Python, its scientific computing package scipy and the plotting library pylab is given in the appendix. Initially, the work focused on designing predictors that exploit the polynomial nonlinearities inherent in speech generation mechanisms. Soon, the work was diverted into medical image processing, which offered more potential for the use of quadratic methods. The major focus in this area is on quadratic edge detection methods for retinal images and fingerprints, as well as de-noising raw MRI signals.
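As a concrete picture of a non-adaptive quadratic FIR system, the sketch below implements a second-order truncated Volterra filter, y(n) = Σ h1[i] x(n-i) + Σ Σ h2[i][j] x(n-i) x(n-j). The function name and the example kernels are assumptions for illustration and are not taken from the thesis.

```python
def quadratic_fir(x, h1, h2):
    """Second-order Volterra (quadratic FIR) filter with zero initial state.

    x: input samples, h1: linear kernel of length N, h2: N x N quadratic kernel.
    """
    n_taps = len(h1)
    y = []
    for n in range(len(x)):
        # Delay line x[n], x[n-1], ..., zero-padded before the signal starts.
        window = [x[n - k] if n - k >= 0 else 0.0 for k in range(n_taps)]
        linear = sum(h1[i] * window[i] for i in range(n_taps))
        quadratic = sum(
            h2[i][j] * window[i] * window[j]
            for i in range(n_taps)
            for j in range(n_taps)
        )
        y.append(linear + quadratic)
    return y


# Example: y(n) = x(n) + x(n) * x(n-1)
output = quadratic_fir([1, 2, 3], h1=[1, 0], h2=[[0, 1], [0, 0]])
```

Setting `h2` to all zeros recovers an ordinary linear FIR filter, which makes the quadratic term easy to compare against the linear baseline.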
Abstract:
The goal of this work is the numerical realization of the probe method suggested by Ikehata for the detection of an obstacle D in inverse scattering. The main idea of the method is to use probes in the form of point sources (·, z) with source point z to define an indicator function Î(z), which can be reconstructed from Cauchy data or far field data. The indicator function Î(z) can be shown to blow up when the source point z tends to the boundary ∂D, and this behavior can be used to find D. To study the feasibility of the probe method we will use two equivalent formulations of the indicator function. We will carry out the numerical realization of the functional and show reconstructions of a sound-soft obstacle.