54 resultados para High-Performance Computing
Resumo:
The feasibility to synthesize, in large quantity, pure and non-toxic tetrahedrite compounds using high-energy mechanical-alloying from only elemental precursors is reported in the present paper for the first time. Our processing technique allows a better control of the final product composition and leads to high thermoelectric performances (ZT of 0.75 at 700 K), comparable to that reported on sealed tube synthesis samples. Combined with spark plasma sintering, the production of highly pure and dense samples is achieved in a very short time, at least 8 times shorter than in conventional liquid-solid-vapor synthesis process. The process described in this paper is a promising way to produce high performance tetrahedrite materials for cost-effective and large-scale thermoelectric applications.
Resumo:
Biaxially oriented films produced from semi-crystalline, semi-aromatic polyesters are utilised extensively as components within various applications, including the specialist packaging, flexible electronic and photovoltaic markets. However, the thermal performance of such polyesters, specifically poly(ethylene terephthalate) (PET) and poly(ethylene-2,6-naphthalate) (PEN), is inadequate for several applications that require greater dimensional stability at higher operating temperatures. The work described in this project is therefore primarily focussed upon the copolymerisation of rigid comonomers with PET and PEN, in order to produce novel polyester-based materials that exhibit superior thermomechanical performance, with retention of crystallinity, to achieve biaxial orientation. Rigid biphenyldiimide comonomers were readily incorporated into PEN and poly(butylene-2,6-naphthalate) (PBN) via a melt-polycondensation route. For each copoly(ester-imide) series, retention of semi-crystalline behaviour is observed throughout entire copolymer composition ratios. This phenomenon may be rationalised by cocrystallisation between isomorphic biphenyldiimide and naphthalenedicarboxylate residues, which enables statistically random copolymers to melt-crystallise despite high proportions of imide sub-units being present. In terms of thermal performance, the glass transition temperature, Tg, linearly increases with imide comonomer content for both series. This facilitated the production of several high performance PEN-based biaxially oriented films, which displayed analogous drawing, barrier and optical properties to PEN. Selected PBN copoly(ester-imide)s also possess the ability to either melt-crystallise, or form a mesophase from the isotropic state depending on the applied cooling rate. An equivalent synthetic approach based upon isomorphic comonomer crystallisation was subsequently applied to PET by copolymerisation with rigid diimide and Kevlar®-type amide comonomers, to afford several novel high performance PET-based copoly(ester-imide)s and copoly(ester-amide)s that all exhibited increased Tgs. Retention of crystallinity was achieved in these copolymers by either melt-crystallisation or thermal annealing. The initial production of a semi-crystalline, PET-based biaxially oriented film with a Tg in excess of 100 °C was successful, and this material has obvious scope for further industrial scale-up and process development.
Resumo:
FAMOUS is an ocean-atmosphere general circulation model of low resolution, capable of simulating approximately 120 years of model climate per wallclock day using current high performance computing facilities. It uses most of the same code as HadCM3, a widely used climate model of higher resolution and computational cost, and has been tuned to reproduce the same climate reasonably well. FAMOUS is useful for climate simulations where the computational cost makes the application of HadCM3 unfeasible, either because of the length of simulation or the size of the ensemble desired. We document a number of scientific and technical improvements to the original version of FAMOUS. These improvements include changes to the parameterisations of ozone and sea-ice which alleviate a significant cold bias from high northern latitudes and the upper troposphere, and the elimination of volume-averaged drifts in ocean tracers. A simple model of the marine carbon cycle has also been included. A particular goal of FAMOUS is to conduct millennial-scale paleoclimate simulations of Quaternary ice ages; to this end, a number of useful changes to the model infrastructure have been made.
Resumo:
Europe's widely distributed climate modelling expertise, now organized in the European Network for Earth System Modelling (ENES), is both a strength and a challenge. Recognizing this, the European Union's Program for Integrated Earth System Modelling (PRISM) infrastructure project aims at designing a flexible and friendly user environment to assemble, run and post-process Earth System models. PRISM was started in December 2001 with a duration of three years. This paper presents the major stages of PRISM, including: (1) the definition and promotion of scientific and technical standards to increase component modularity; (2) the development of an end-to-end software environment (graphical user interface, coupling and I/O system, diagnostics, visualization) to launch, monitor and analyse complex Earth system models built around state-of-art community component models (atmosphere, ocean, atmospheric chemistry, ocean bio-chemistry, sea-ice, land-surface); and (3) testing and quality standards to ensure high-performance computing performance on a variety of platforms. PRISM is emerging as a core strategic software infrastructure for building the European research area in Earth system sciences. Copyright (c) 2005 John Wiley & Sons, Ltd.
Resumo:
Three naming strategies are discussed that allow the processes of a distributed application to continue being addressed by their original logical name, along all the migrations they may be forced to undertake because of performance-improvement goals. A simple centralised solution is firstly discussed which showed a software bottleneck with the increase of the number of processes; other two solutions are considered that entail different communication schemes and different communication overheads for the naming protocol. All these strategies are based on the facility that each process is allowed to survive after migration, even in its original site, only to provide a forwarding service to those communications that used its obsolete address.
Resumo:
A full assessment of para-virtualization is important, because without knowledge about the various overheads, users can not understand whether using virtualization is a good idea or not. In this paper we are very interested in assessing the overheads of running various benchmarks on bare-‐metal, as well as on para-‐virtualization. The idea is to see what the overheads of para-‐ virtualization are, as well as looking at the overheads of turning on monitoring and logging. The knowledge from assessing various benchmarks on these different systems will help a range of users understand the use of virtualization systems. In this paper we assess the overheads of using Xen, VMware, KVM and Citrix, see Table 1. These different virtualization systems are used extensively by cloud-‐users. We are using various Netlib1 benchmarks, which have been developed by the University of Tennessee at Knoxville (UTK), and Oak Ridge National Laboratory (ORNL). In order to assess these virtualization systems, we run the benchmarks on bare-‐metal, then on the para-‐virtualization, and finally we turn on monitoring and logging. The later is important as users are interested in Service Level Agreements (SLAs) used by the Cloud providers, and the use of logging is a means of assessing the services bought and used from commercial providers. In this paper we assess the virtualization systems on three different systems. We use the Thamesblue supercomputer, the Hactar cluster and IBM JS20 blade server (see Table 2), which are all servers available at the University of Reading. A functional virtualization system is multi-‐layered and is driven by the privileged components. Virtualization systems can host multiple guest operating systems, which run on its own domain, and the system schedules virtual CPUs and memory within each Virtual Machines (VM) to make the best use of the available resources. The guest-‐operating system schedules each application accordingly. You can deploy virtualization as full virtualization or para-‐virtualization. Full virtualization provides a total abstraction of the underlying physical system and creates a new virtual system, where the guest operating systems can run. No modifications are needed in the guest OS or application, e.g. the guest OS or application is not aware of the virtualized environment and runs normally. Para-‐virualization requires user modification of the guest operating systems, which runs on the virtual machines, e.g. these guest operating systems are aware that they are running on a virtual machine, and provide near-‐native performance. You can deploy both para-‐virtualization and full virtualization across various virtualized systems. Para-‐virtualization is an OS-‐assisted virtualization; where some modifications are made in the guest operating system to enable better performance. In this kind of virtualization, the guest operating system is aware of the fact that it is running on the virtualized hardware and not on the bare hardware. In para-‐virtualization, the device drivers in the guest operating system coordinate the device drivers of host operating system and reduce the performance overheads. The use of para-‐virtualization [0] is intended to avoid the bottleneck associated with slow hardware interrupts that exist when full virtualization is employed. It has revealed [0] that para-‐ virtualization does not impose significant performance overhead in high performance computing, and this in turn this has implications for the use of cloud computing for hosting HPC applications. The “apparent” improvement in virtualization has led us to formulate the hypothesis that certain classes of HPC applications should be able to execute in a cloud environment, with minimal performance degradation. In order to support this hypothesis, first it is necessary to define exactly what is meant by a “class” of application, and secondly it will be necessary to observe application performance, both within a virtual machine and when executing on bare hardware. A further potential complication is associated with the need for Cloud service providers to support Service Level Agreements (SLA), so that system utilisation can be audited.
Resumo:
This paper presents a paralleled Two-Pass Hexagonal (TPA) algorithm constituted by Linear Hashtable Motion Estimation Algorithm (LHMEA) and Hexagonal Search (HEXBS) for motion estimation. In the TPA, Motion Vectors (MV) are generated from the first-pass LHMEA and are used as predictors for second-pass HEXBS motion estimation, which only searches a small number of Macroblocks (MBs). We introduced hashtable into video processing and completed parallel implementation. We propose and evaluate parallel implementations of the LHMEA of TPA on clusters of workstations for real time video compression. It discusses how parallel video coding on load balanced multiprocessor systems can help, especially on motion estimation. The effect of load balancing for improved performance is discussed. The performance of the algorithm is evaluated by using standard video sequences and the results are compared to current algorithms.
Resumo:
Since its introduction in 1993, the Message Passing Interface (MPI) has become a de facto standard for writing High Performance Computing (HPC) applications on clusters and Massively Parallel Processors (MPPs). The recent emergence of multi-core processor systems presents a new challenge for established parallel programming paradigms, including those based on MPI. This paper presents a new Java messaging system called MPJ Express. Using this system, we exploit multiple levels of parallelism - messaging and threading - to improve application performance on multi-core processors. We refer to our approach as nested parallelism. This MPI-like Java library can support nested parallelism by using Java or Java OpenMP (JOMP) threads within an MPJ Express process. Practicality of this approach is assessed by porting to Java a massively parallel structure formation code from Cosmology called Gadget-2. We introduce nested parallelism in the Java version of the simulation code and report good speed-ups. To the best of our knowledge it is the first time this kind of hybrid parallelism is demonstrated in a high performance Java application. (C) 2009 Elsevier Inc. All rights reserved.
Resumo:
Reconfigurable computing is becoming an important new alternative for implementing computations. Field programmable gate arrays (FPGAs) are the ideal integrated circuit technology to experiment with the potential benefits of using different strategies of circuit specialization by reconfiguration. The final form of the reconfiguration strategy is often non-trivial to determine. Consequently, in this paper, we examine strategies for reconfiguration and, based on our experience, propose general guidelines for the tradeoffs using an area-time metric called functional density. Three experiments are set up to explore different reconfiguration strategies for FPGAs applied to a systolic implementation of a scalar quantizer used as a case study. Quantitative results for each experiment are given. The regular nature of the example means that the results can be generalized to a wide class of industry-relevant problems based on arrays.
Resumo:
Recently major processor manufacturers have announced a dramatic shift in their paradigm to increase computing power over the coming years. Instead of focusing on faster clock speeds and more powerful single core CPUs, the trend clearly goes towards multi core systems. This will also result in a paradigm shift for the development of algorithms for computationally expensive tasks, such as data mining applications. Obviously, work on parallel algorithms is not new per se but concentrated efforts in the many application domains are still missing. Multi-core systems, but also clusters of workstations and even large-scale distributed computing infrastructures provide new opportunities and pose new challenges for the design of parallel and distributed algorithms. Since data mining and machine learning systems rely on high performance computing systems, research on the corresponding algorithms must be on the forefront of parallel algorithm research in order to keep pushing data mining and machine learning applications to be more powerful and, especially for the former, interactive. To bring together researchers and practitioners working in this exciting field, a workshop on parallel data mining was organized as part of PKDD/ECML 2006 (Berlin, Germany). The six contributions selected for the program describe various aspects of data mining and machine learning approaches featuring low to high degrees of parallelism: The first contribution focuses the classic problem of distributed association rule mining and focuses on communication efficiency to improve the state of the art. After this a parallelization technique for speeding up decision tree construction by means of thread-level parallelism for shared memory systems is presented. The next paper discusses the design of a parallel approach for dis- tributed memory systems of the frequent subgraphs mining problem. This approach is based on a hierarchical communication topology to solve issues related to multi-domain computational envi- ronments. The forth paper describes the combined use and the customization of software packages to facilitate a top down parallelism in the tuning of Support Vector Machines (SVM) and the next contribution presents an interesting idea concerning parallel training of Conditional Random Fields (CRFs) and motivates their use in labeling sequential data. The last contribution finally focuses on very efficient feature selection. It describes a parallel algorithm for feature selection from random subsets. Selecting the papers included in this volume would not have been possible without the help of an international Program Committee that has provided detailed reviews for each paper. We would like to also thank Matthew Otey who helped with publicity for the workshop.
Resumo:
The impending threat of global climate change and its regional manifestations is among the most important and urgent problems facing humanity. Society needs accurate and reliable estimates of changes in the probability of regional weather variations to develop science-based adaptation and mitigation strategies. Recent advances in weather prediction and in our understanding and ability to model the climate system suggest that it is both necessary and possible to revolutionize climate prediction to meet these societal needs. However, the scientific workforce and the computational capability required to bring about such a revolution is not available in any single nation. Motivated by the success of internationally funded infrastructure in other areas of science, this paper argues that, because of the complexity of the climate system, and because the regional manifestations of climate change are mainly through changes in the statistics of regional weather variations, the scientific and computational requirements to predict its behavior reliably are so enormous that the nations of the world should create a small number of multinational high-performance computing facilities dedicated to the grand challenges of developing the capabilities to predict climate variability and change on both global and regional scales over the coming decades. Such facilities will play a key role in the development of next-generation climate models, build global capacity in climate research, nurture a highly trained workforce, and engage the global user community, policy-makers, and stakeholders. We recommend the creation of a small number of multinational facilities with computer capability at each facility of about 20 peta-flops in the near term, about 200 petaflops within five years, and 1 exaflop by the end of the next decade. Each facility should have sufficient scientific workforce to develop and maintain the software and data analysis infrastructure. Such facilities will enable questions of what resolution, both horizontal and vertical, in atmospheric and ocean models, is necessary for more confident predictions at the regional and local level. Current limitations in computing power have placed severe limitations on such an investigation, which is now badly needed. These facilities will also provide the world's scientists with the computational laboratories for fundamental research on weather–climate interactions using 1-km resolution models and on atmospheric, terrestrial, cryospheric, and oceanic processes at even finer scales. Each facility should have enabling infrastructure including hardware, software, and data analysis support, and scientific capacity to interact with the national centers and other visitors. This will accelerate our understanding of how the climate system works and how to model it. It will ultimately enable the climate community to provide society with climate predictions, which are based on our best knowledge of science and the most advanced technology.
Resumo:
The necessity and benefits for establishing the international Earth-system Prediction Initiative (EPI) are discussed by scientists associated with the World Meteorological Organization (WMO) World Weather Research Programme (WWRP), World Climate Research Programme (WCRP), International Geosphere–Biosphere Programme (IGBP), Global Climate Observing System (GCOS), and natural-hazards and socioeconomic communities. The proposed initiative will provide research and services to accelerate advances in weather, climate, and Earth system prediction and the use of this information by global societies. It will build upon the WMO, the Group on Earth Observations (GEO), the Global Earth Observation System of Systems (GEOSS) and the International Council for Science (ICSU) to coordinate the effort across the weather, climate, Earth system, natural-hazards, and socioeconomic disciplines. It will require (i) advanced high-performance computing facilities, supporting a worldwide network of research and operational modeling centers, and early warning systems; (ii) science, technology, and education projects to enhance knowledge, awareness, and utilization of weather, climate, environmental, and socioeconomic information; (iii) investments in maintaining existing and developing new observational capabilities; and (iv) infrastructure to transition achievements into operational products and services.