877 results for High-performance computing hyperspectral imaging
Abstract:
This paper focuses on the parallelization of an ocean model applying current multicore processor-based cluster architectures to an irregular computational mesh. The aim is to maximize the efficiency of the computational resources used. To make the best use of the resources offered by these architectures, this parallelization has been addressed at all the hardware levels of modern supercomputers: firstly, exploiting the internal parallelism of the CPU through vectorization; secondly, taking advantage of the multiple cores of each node using OpenMP; and finally, using the cluster nodes to distribute the computational mesh, using MPI for communication between the nodes. The speedup obtained with each parallelization technique as well as the combined overall speedup have been measured for the western Mediterranean Sea for different cluster configurations, achieving a speedup factor of 73.3 using 256 processors. The results also show the efficiency achieved in the different cluster nodes and the advantages obtained by combining OpenMP and MPI versus using only OpenMP or MPI. Finally, the scalability of the model has been analysed by examining computation and communication times as well as the communication and synchronization overhead due to parallelization.
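As a quick reading of the reported figure, parallel efficiency is conventionally defined as the speedup divided by the number of processors. The short Python sketch below applies that standard definition to the 73.3 speedup on 256 processors quoted in the abstract; the function name is illustrative, and the calculation is not taken from the paper itself.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Standard definition: efficiency = speedup / number of processors."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# Figures quoted in the abstract: a speedup of 73.3 on 256 processors.
efficiency = parallel_efficiency(73.3, 256)
print(f"{efficiency:.1%}")  # prints 28.6%
```

An efficiency well below 100% at this scale is consistent with the communication and synchronization overhead the abstract says was analysed.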
Abstract:
We are indebted to Marnix Medema, Paul Straight and Sean Rovito for useful discussions and critical reading of the manuscript, as well as to Alicia Chagolla and Yolanda Rodriguez of the MS Service of Unidad Irapuato, Cinvestav, and Araceli Fernandez for technical support in high-performance computing. This work was funded by Conacyt Mexico (grants No. 179290 and 177568) and FINNOVA Mexico (grant No. 214716) to FBG. PCM was funded by a Conacyt scholarship (No. 28830) and a Cinvestav postdoctoral fellowship. JF and JFK acknowledge funding from the College of Physical Sciences, University of Aberdeen, UK.
Abstract:
Science has made frequent use of computational resources to run experiments and scientific processes, which can be modeled as workflows that handle large volumes of data and perform actions such as selecting, analyzing, and visualizing those data according to a defined procedure. Scientific workflows have been used by scientists in many fields, such as astronomy and bioinformatics, and tend to be computationally intensive and strongly oriented toward handling large volumes of data, which requires high-performance execution platforms such as computer grids or clouds. Running workflows on this kind of platform requires mapping the available computational resources to the workflow's activities, a process known as scheduling. Cloud computing platforms have proven to be a viable alternative for running scientific workflows, but scheduling on such platforms must usually take into account specific constraints such as a limited budget or the type of computational resource to be used. In this context, information such as the estimated execution time or time and cost limits (referred to here as scheduling support information) is important to ensure that scheduling is efficient and that execution achieves the expected results. This work identifies the support information that can be added to scientific workflow models to support scheduling and efficient execution on cloud computing platforms. A classification of this information is proposed, and its use in the main Scientific Workflow Management Systems (SWfMS) is analyzed. To assess the impact of using this information on scheduling, experiments were carried out using scientific workflow models with different support information, scheduled with algorithms that were adapted to take the added information into account. In these experiments, the financial cost of running the workflow in the cloud was reduced by up to 59% and the makespan by up to 8.6%, compared with running the same workflows scheduled without any support information available.
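To make the idea of scheduling support information concrete, the following minimal Python sketch shows how estimated task durations, a deadline, and a budget could steer the choice of a cloud resource type. It is not one of the adapted algorithms evaluated in the work: the `VmType` class, the `choose_vm` function, the VM catalogue, and all prices are hypothetical, and the workflow is assumed to be a simple sequence of tasks running on a single VM type billed by the hour fraction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmType:
    name: str
    speed: float           # speed relative to the reference machine (hypothetical)
    price_per_hour: float  # hypothetical price


def choose_vm(estimated_hours, vm_types, deadline_h, budget):
    """Return (cost, makespan, name) of the cheapest VM type that respects
    both the deadline and the budget, or None if no type is feasible."""
    total_reference_hours = sum(estimated_hours)  # support info: estimated durations
    feasible = []
    for vm in vm_types:
        makespan = total_reference_hours / vm.speed
        cost = makespan * vm.price_per_hour
        if makespan <= deadline_h and cost <= budget:
            feasible.append((cost, makespan, vm.name))
    return min(feasible) if feasible else None


catalogue = [
    VmType("small", 1.0, 0.10),
    VmType("medium", 2.0, 0.25),
    VmType("large", 4.0, 0.80),
]
task_estimates = [2.0, 4.0, 2.0]  # hours on the reference machine

# With a 5-hour deadline and a budget of 1.50, only "medium" qualifies.
print(choose_vm(task_estimates, catalogue, deadline_h=5.0, budget=1.50))
# Without a meaningful deadline, the cheapest type wins.
print(choose_vm(task_estimates, catalogue, deadline_h=float("inf"), budget=1.50))
```

Without the duration estimates, the scheduler in this toy setting could not evaluate either constraint, which is the role the abstract assigns to support information.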
Abstract:
Owing to the growth in the amount of data processed and the increasing need for high-performance computing, significant changes are taking place in the design of computer architectures. As a result, there has been a shift from the sequential to the parallel paradigm, with hundreds or thousands of processing cores on a single chip. In this context, power management becomes increasingly important, especially in embedded systems, which are usually battery powered. According to Moore's Law, the performance of a processor doubles every 18 months, whereas battery capacity doubles only every 10 years. This situation creates a large gap, which can be narrowed by using heterogeneous multi-core architectures. A fundamental challenge that remains open for these architectures is integrating embedded code development, scheduling, and hardware for power management. The overall goal of this doctoral work is to investigate techniques for optimizing the performance/energy-consumption trade-off in single-ISA heterogeneous multi-core architectures implemented on FPGAs. To that end, solutions were sought that achieve the best possible performance at optimal energy consumption. This was done by combining data mining for the analysis of thread-based software with traditional power management techniques, such as dynamic way-shutdown, and a new heterogeneity-aware scheduling policy. The main contributions are the combination of power management techniques at several levels, namely hardware, scheduling, and compilation, and a scheduling policy integrated with a multi-core architecture that is heterogeneous with respect to L1 cache size.
Abstract:
Parallel computing offers a number of advantages for running large-scale applications, and the effective use of parallel computational resources is an important aspect of high-performance computing. This work presents a methodology for the automated execution of parallel applications based on the BSP model with heterogeneous tasks. The adopted model assumes that the computation time of each secondary task does not vary widely from one iteration to the next. The methodology is called ASE and comprises three stages: Acquisition, Scheduling, and Execution. In the Acquisition stage, the processing times of the tasks are obtained; in the Scheduling stage, the methodology seeks the task distribution that maximizes the execution speed of the parallel application while minimizing resource usage, by means of an algorithm developed in this work; and finally, the Execution stage runs the parallel application with the distribution defined in the previous stage. Tools that are applied in the methodology were implemented. A set of tests applying the methodology was carried out, and the results presented show that the goals of the proposal were met.
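The scheduling goal described above (fastest execution with the fewest resources) can be illustrated with a small Python sketch. This is not the algorithm developed in the work, which the abstract does not detail. It is a stand-in that uses the well-known longest-processing-time-first (LPT) greedy heuristic and, assuming the task times from the acquisition stage are already known, returns the smallest number of workers whose superstep time matches the lower bound set by the longest task. The function names and the sample times are hypothetical.

```python
import heapq


def lpt_assign(times, workers):
    """Greedy LPT: give each task, longest first, to the least-loaded worker.
    Returns (makespan, bins), where bins[w] lists the task indices of worker w."""
    heap = [(0.0, w) for w in range(workers)]
    heapq.heapify(heap)
    bins = [[] for _ in range(workers)]
    ordered = sorted(enumerate(times), key=lambda pair: pair[1], reverse=True)
    for task, duration in ordered:
        load, w = heapq.heappop(heap)
        bins[w].append(task)
        heapq.heappush(heap, (load + duration, w))
    return max(load for load, _ in heap), bins


def schedule(times, tolerance=0.0):
    """Smallest worker count whose LPT makespan is within `tolerance` of the
    best possible superstep time (the duration of the longest single task)."""
    best = max(times)
    for workers in range(1, len(times) + 1):
        makespan, bins = lpt_assign(times, workers)
        if makespan <= best * (1 + tolerance):
            return workers, makespan, bins
    raise AssertionError("unreachable: one worker per task always reaches the bound")


# Hypothetical task times (seconds) as measured in an acquisition stage.
measured = [8, 3, 3, 2, 4, 4]
workers, makespan, bins = schedule(measured)
print(workers, makespan, bins)
```

Because LPT is a heuristic, it may occasionally request more workers than strictly necessary; the `tolerance` parameter lets a small slowdown be traded for fewer workers.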
Abstract:
We have studied the role played by cyclic topology in the charge-transfer properties of recently synthesized π-conjugated molecules, namely the set of [n]cycloparaphenylene compounds, where n is the number of phenylene rings forming the curved nanoring. We estimate the charge-transfer rates for hole and electron migration within the array of molecules in their crystalline state. The theoretical calculations suggest that increasing the size of the system would help to obtain higher hole and electron charge-transfer rates and that these materials might show ambipolar behavior in real samples, independently of the different packing modes adopted by the [6]cycloparaphenylene and [12]cycloparaphenylene cases studied.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-06
Abstract:
Despite the insight gained from 2-D particle models, and given that the dynamics of crustal faults occur in 3-D space, the question remains, how do the 3-D fault gouge dynamics differ from those in 2-D? Traditionally, 2-D modeling has been preferred over 3-D simulations because of the computational cost of solving 3-D problems. However, modern high performance computing architectures, combined with a parallel implementation of the Lattice Solid Model (LSM), provide the opportunity to explore 3-D fault micro-mechanics and to advance understanding of effective constitutive relations of fault gouge layers. In this paper, macroscopic friction values from 2-D and 3-D LSM simulations, performed on an SGI Altix 3700 super-cluster, are compared. Two rectangular elastic blocks of bonded particles, with a rough fault plane and separated by a region of randomly sized non-bonded gouge particles, are sheared in opposite directions by normally-loaded driving plates. The results demonstrate that the gouge particles in the 3-D models undergo significant out-of-plane motion during shear. The 3-D models also exhibit a higher mean macroscopic friction than the 2-D models for varying values of interparticle friction. 2-D LSM gouge models have previously been shown to exhibit accelerating energy release in simulated earthquake cycles, supporting the Critical Point hypothesis. The 3-D models are shown to also display accelerating energy release, and good fits of power law time-to-failure functions to the cumulative energy release are obtained.
Abstract:
Atomistic Molecular Dynamics provides powerful and flexible tools for the prediction and analysis of molecular and macromolecular systems. Specifically, it provides a means by which we can measure theoretically that which cannot be measured experimentally: the dynamic time-evolution of complex systems comprising atoms and molecules. It is particularly suitable for the simulation and analysis of the otherwise inaccessible details of MHC-peptide interaction and, on a larger scale, the simulation of the immune synapse. Progress has been relatively tentative, yet the emergence of truly high-performance computing and the development of coarse-grained simulation now offer us the hope of accurately predicting thermodynamic parameters and of simulating not merely a handful of proteins but larger, longer simulations comprising thousands of protein molecules and the cellular-scale structures they form. We exemplify this within the context of immunoinformatics.
Abstract:
In this paper, we study the localization problem in large-scale Underwater Wireless Sensor Networks (UWSNs). Unlike in terrestrial positioning, the Global Positioning System (GPS) cannot work efficiently underwater. The limited bandwidth, the severely impaired channel, and the cost of underwater equipment all make the localization problem very challenging. Most current localization schemes are not well suited to deep underwater environments. We propose a hierarchical localization scheme to address these challenges. The new scheme consists mainly of four types of nodes: surface buoys, Detachable Elevator Transceivers (DETs), anchor nodes, and ordinary nodes. Each surface buoy is assumed to be equipped with GPS on the water surface. A DET is attached to a surface buoy and can rise and descend to broadcast its position. The anchor nodes can compute their positions from the position information broadcast by the DETs and their measured distances to the DETs. The hierarchical localization scheme is scalable and can be used to balance cost against localization accuracy. Initial simulation results show the advantages of our proposed scheme. © 2009 IEEE.
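The step in which an anchor node computes its position from DET broadcasts and distance measurements can be sketched as a standard least-squares multilateration. The abstract does not specify the estimator the authors use, so the Python code below is only an illustration under stated assumptions: distances are noise-free, at least four reference positions are available and are not coplanar, and the coordinates, function names, and sample values are all hypothetical. Each range equation is linearized by subtracting the first one, and the resulting normal equations are solved by Gaussian elimination.

```python
import math


def solve_linear(a, b):
    """Solve the small dense system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate reference geometry (collinear or coplanar)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def locate(refs, dists):
    """Least-squares 3-D position from reference positions and distances.
    Subtracting the first range equation from the others gives, for each i,
    2 (p_i - p_0) . p = |p_i|^2 - |p_0|^2 - d_i^2 + d_0^2."""
    (x0, y0, z0), d0 = refs[0], dists[0]
    rows, rhs = [], []
    for (x, y, z), d in zip(refs[1:], dists[1:]):
        rows.append([2 * (x - x0), 2 * (y - y0), 2 * (z - z0)])
        rhs.append(x * x + y * y + z * z - x0 * x0 - y0 * y0 - z0 * z0
                   - d * d + d0 * d0)
    # Normal equations: (A^T A) p = A^T b
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * v for r, v in zip(rows, rhs)) for i in range(3)]
    return solve_linear(ata, atb)


# Hypothetical DET broadcast positions (metres; z is negative below the surface).
det_positions = [(0, 0, -100), (0, 0, -300), (500, 0, -200),
                 (0, 500, -250), (500, 500, -150)]
true_anchor = (200.0, 150.0, -400.0)
measured = [math.dist(p, true_anchor) for p in det_positions]
estimate = locate(det_positions, measured)
print([round(v, 3) for v in estimate])
```

Broadcasts from a single DET all lie on one vertical line, so in this formulation an anchor needs to hear DETs under more than one buoy; otherwise the solver reports degenerate geometry.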
Abstract:
Large-scale massively parallel molecular dynamics (MD) simulations of the human class I major histocompatibility complex (MHC) protein HLA-A*0201 bound to a decameric tumor-specific antigenic peptide GVYDGREHTV were performed using a scalable MD code on high-performance computing platforms. Such computational capabilities put us in reach of simulations of various scales and complexities. The supercomputing resources available for this study allow us to compare directly differences in the behavior of very large molecular models; in this case, the entire extracellular portion of the peptide–MHC complex vs. the isolated peptide binding domain. Comparison of the results from the partial and the whole system simulations indicates that the peptide is less tightly bound in the partial system than in the whole system. From a detailed study of conformations, solvent-accessible surface area, the nature of the water network structure, and the binding energies, we conclude that, when considering the conformation of the α1–α2 domain, the α3 and β2m domains cannot be neglected. © 2004 Wiley Periodicals, Inc. J Comput Chem 25: 1803–1813, 2004
Abstract:
Fueled by increasing human appetite for high computing performance, semiconductor technology has now marched into the deep sub-micron era. As transistor size keeps shrinking, more and more transistors are integrated into a single chip. This has tremendously increased the power consumption and heat generation of IC chips. The rapidly growing heat dissipation greatly increases packaging and cooling costs and adversely affects the performance and reliability of a computing system. In addition, it reduces the processor's life span and may even crash the entire computing system. Therefore, dynamic thermal management (DTM) is becoming a critical problem in modern computer system design. Extensive theoretical research has been conducted to study the DTM problem. However, most of this work is based on theoretically idealized assumptions or simplified models. While these models and assumptions help to greatly simplify a complex problem and make it theoretically manageable, practical computer systems and applications must deal with many practical factors and details beyond these models or assumptions. The goal of our research was to develop a test platform that can be used to validate theoretical results on DTM under well-controlled conditions, to identify the limitations of existing theoretical results, and also to develop new and practical DTM techniques. This dissertation details the background and our research efforts in this endeavor. Specifically, in our research, we first developed a customized test platform based on an Intel desktop. We then tested a number of related theoretical works and examined their limitations under the practical hardware environment. With these limitations in mind, we developed a new reactive thermal management algorithm for single-core computing systems to optimize the throughput under a peak temperature constraint. We further extended our research to a multicore platform and developed an effective proactive DTM technique for throughput maximization on multicore processors, based on task migration and dynamic voltage and frequency scaling (DVFS). The significance of our research is that it complements the extensive existing theoretical research on increasingly critical thermal problems and helps enable the continued evolution of high-performance computing systems.
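As an illustration of what a reactive thermal management policy looks like in general, the Python sketch below throttles a simulated single core whenever its temperature approaches a peak constraint and speeds it up again once it has cooled. It is not the algorithm developed in the dissertation. The first-order thermal model, the cubic power law, the frequency levels, and every constant are hypothetical values chosen only to keep the toy simulation well behaved.

```python
FREQS_GHZ = [1.0, 1.5, 2.0, 2.5, 3.0]  # hypothetical DVFS levels
T_AMBIENT = 40.0    # deg C
T_MAX = 70.0        # peak temperature constraint, deg C
R_THERMAL = 1.0     # K/W, lumped thermal resistance (assumed)
POWER_COEFF = 2.0   # W/GHz^3, dynamic power modelled as coeff * f^3 (assumed)
ALPHA = 0.1         # time step divided by the thermal time constant (assumed)
GUARD = 3.0         # throttle once within this margin of T_MAX
HYSTERESIS = 2.0    # extra cooling required before speeding up again


def simulate(steps, reactive=True):
    """Return (peak temperature, average frequency) over `steps` intervals."""
    top = len(FREQS_GHZ) - 1
    level = top
    temp = T_AMBIENT
    peak = temp
    work = 0.0
    for _ in range(steps):
        if reactive:
            # React to the last temperature reading only; no prediction.
            if temp >= T_MAX - GUARD and level > 0:
                level -= 1
            elif temp < T_MAX - GUARD - HYSTERESIS and level < top:
                level += 1
        freq = FREQS_GHZ[level]
        power = POWER_COEFF * freq ** 3
        steady_state = T_AMBIENT + R_THERMAL * power
        # First-order model: temperature relaxes toward its steady state.
        temp += ALPHA * (steady_state - temp)
        peak = max(peak, temp)
        work += freq  # throughput proxy: cycles completed in this interval
    return peak, work / steps


reactive_peak, reactive_avg = simulate(200, reactive=True)
unmanaged_peak, unmanaged_avg = simulate(200, reactive=False)
print(f"reactive:  peak {reactive_peak:.1f} C, average {reactive_avg:.2f} GHz")
print(f"unmanaged: peak {unmanaged_peak:.1f} C, average {unmanaged_avg:.2f} GHz")
```

In this toy setting the guard margin is sized so that a single step at the top frequency cannot cross the constraint. A proactive technique, by contrast, would act on a temperature prediction rather than on the last reading.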
Abstract:
Catering to society's demand for high-performance computing, billions of transistors are now integrated on IC chips to deliver unprecedented performance. With increasing transistor density, power consumption and power density are growing exponentially. The increasing power consumption translates directly into high chip temperatures, which not only raise packaging and cooling costs but also degrade the performance, reliability, and life span of computing systems. Moreover, high chip temperature also greatly increases leakage power consumption, which is becoming more and more significant with the continued scaling of transistor size. As the semiconductor industry continues to evolve, power and thermal challenges have become the most critical challenges in the design of new generations of computing systems. In this dissertation, we addressed the power/thermal issues from the system-level perspective. Specifically, we sought to employ real-time scheduling methods to optimize the power/thermal efficiency of real-time computing systems, with leakage/temperature dependency taken into consideration. In our research, we first explored the fundamental principles of how to employ dynamic voltage scaling (DVS) techniques to reduce the peak operating temperature when running a real-time application on a single-core platform. We further proposed a novel real-time scheduling method, “M-Oscillations”, to reduce the peak temperature when scheduling a hard real-time periodic task set. We also developed three checking methods to guarantee the feasibility of a periodic real-time schedule under a peak temperature constraint. We then extended our research from single-core to multi-core platforms. We investigated the energy estimation problem on multi-core platforms and developed a lightweight and accurate method to calculate the energy consumption for a given voltage schedule on a multi-core platform. Finally, we concluded the dissertation with a detailed discussion of future extensions of our research.
Abstract:
Acknowledgements: We wish to express our gratitude to the National Geographic Society and the National Research Foundation of South Africa for funding the discovery, recovery, and analysis of the H. naledi material. The study reported here was also made possible by grants from the Social Sciences and Humanities Research Council of Canada, the Canada Foundation for Innovation, the British Columbia Knowledge Development Fund, the Canada Research Chairs Program, Simon Fraser University, the DST/NRF Centre of Excellence in Palaeosciences (COE-Pal), as well as by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada, a Young Scientist Development Grant from the Paleontological Scientific Trust (PAST), a Baldwin Fellowship from the L.S.B. Leakey Foundation, and a Seed Grant and a Cornerstone Faculty Fellowship from the Texas A&M University College of Liberal Arts. We would like to thank the South African Heritage Resource Agency for the permits necessary to work on the Rising Star site; the Jacobs family for granting access; Wilma Lawrence, Bonita De Klerk, Merrill Van der Walt, and Justin Mukanku for their assistance during all phases of the project; and Lucas Delezene for valuable discussion on the dental characters of H. naledi. We would also like to thank Peter Schmid for the preparation of the Dinaledi fossil material; Yoel Rak for explaining in detail some of the characters used in previous studies; William Kimbel for drawing our attention to the possibility that there might be a problem with Dembo et al.’s (2015) codes for the two characters related to the articular eminence; Will Stein for helpful discussion about the Bayesian analyses; Mike Lee for his comments on this manuscript; John Hawks for his support in organizing the Rising Star workshop; and the associate editor and three anonymous reviewers for their valuable comments. We are grateful to S. Potze and the Ditsong Museum, B. Billings and the School of Anatomical Sciences at the University of the Witwatersrand, and B. Zipfel and the Evolutionary Studies Institute at the University of the Witwatersrand for providing access to the specimens in their care; the University of the Witwatersrand, the Evolutionary Studies Institute, and the South African National Centre of Excellence in PalaeoSciences for hosting a number of the authors while studying the material; and the Western Canada Research Grid for providing access to the high-performance computing facilities for the Bayesian analyses. Last but definitely not least, we thank the head of the Rising Star project, Lee Berger, for his leadership and support, and for encouraging us to pursue the study reported here.