980 results for parallel applications


Relevance:

60.00% 60.00%

Publisher:

Abstract:

The InteGrade project is a multi-university effort to build a novel grid computing middleware based on the opportunistic use of resources belonging to user workstations. The InteGrade middleware currently enables the execution of sequential, bag-of-tasks, and parallel applications that follow the BSP or the MPI programming models. This article presents the lessons learned over the last five years of the InteGrade development and describes the solutions achieved concerning the support for robust application execution. The contributions cover the related fields of application scheduling, execution management, and fault tolerance. We present our solutions, describing their implementation principles and evaluation through the analysis of several experimental results. (C) 2010 Elsevier Inc. All rights reserved.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Present operating systems are not built to support parallel computing: they do not provide services to manage parallelism, i.e., to globally manage parallel processes and computational resources. The cluster operating environments that are used to assist the execution of parallel applications do not support both programming paradigms, message passing (MP) and distributed shared memory (DSM); these are mainly offered as separate components implemented at the user level as libraries and independent server processes. Because of this poor operating system support, users must deal with a cluster as a set of independent computers rather than as a single powerful computer. A single system image (SSI) of the cluster is not offered to users. There is a need for an operating system for clusters. We claim and demonstrate in this paper that it is possible to develop a cluster operating system that is able to efficiently manage parallelism; use cluster resources efficiently; support MP in the form of standard MP and PVM, and DSM; offer SSI; and be easy to use. We show that to achieve these aims this operating system should inherit many features of a distributed operating system and provide new services that address the needs of parallel processes, cluster resources, and application developers. To substantiate the claim, the first version of a cluster operating system that manages parallelism and offers SSI, called GENESIS, has been developed.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

The issue of under-estimated job length (for parallel applications) in backfill-based scheduling is ignored in the current literature because users want to avoid having their jobs killed when the requested time expires. Therefore, users prefer to over-estimate the length of their jobs. This paper shows the impact of under-estimated job lengths on execution performance in a system based on EASY-backfill scheduling. We have developed a batch job scheduler for Linux clusters that implements an enhanced EASY-backfilling algorithm in which a job with an under-estimated execution time is not killed unless it would delay other jobs. We have evaluated performance by scheduling static workloads of well-known MPI parallel applications on a real cluster. Our results show that most jobs do not have to be aborted even though their lengths are under-estimated, while job slowdown and system throughput are only slightly degraded.
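
The kill rule described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' scheduler: the `Job` fields, the function names, and the simplified test for "would delay other jobs" (a waiting job that could start now only if the overrunning job released its nodes) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int          # number of nodes the job occupies or requests
    requested: float    # user-estimated run time, in seconds
    start: float = 0.0  # time at which the job started running


def has_overrun(job: Job, now: float) -> bool:
    """True once the job has run past its requested time."""
    return now > job.start + job.requested


def would_delay_others(job: Job, waiting: list, free_nodes: int) -> bool:
    """Simplified (assumed) delay test: some waiting job does not fit in the
    currently free nodes, but would fit if `job` released its nodes."""
    return any(free_nodes < w.nodes <= free_nodes + job.nodes for w in waiting)


def jobs_to_kill(running: list, waiting: list, free_nodes: int, now: float) -> list:
    """Overrunning jobs are killed only when they hold back a waiting job."""
    return [
        j for j in running
        if has_overrun(j, now) and would_delay_others(j, waiting, free_nodes)
    ]
```

With an empty waiting queue, an overrunning job is left to finish; it becomes a kill candidate only once a waiting job needs the nodes it holds.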

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Distributed Shared Memory (DSM) provides programmers with a shared memory environment in systems where memory is not physically shared. Clusters of Workstations (COWs), an often untapped source of computing power, are characterised by a very low cost/performance ratio. The combination of Clusters of Workstations (COWs) with DSM provides an environment in which the programmer can use the well known approaches and methods of programming for physically shared memory systems and parallel processing can be carried out to make full use of the computing power and cost advantages of the COW. The aim of this research is to synthesise and develop a distributed shared memory system as an integral part of an operating system in order to provide application programmers with a convenient environment in which the development and execution of parallel applications can be done easily and efficiently, and which does this in a transparent manner. Furthermore, in order to satisfy our challenging design requirements we want to demonstrate that the operating system into which the DSM system is integrated should be a distributed operating system. In this thesis a study into the synthesis of a DSM system within a microkernel and client-server based distributed operating system which uses both strict and weak consistency models, with a write-invalidate and write-update based approach for consistency maintenance is reported. Furthermore a unique automatic initialisation system which allows the programmer to start the parallel execution of a group of processes with a single library call is reported. The number and location of these processes are determined by the operating system based on system load information. The DSM system proposed has a novel approach in that it provides programmers with a complete programming environment in which they are easily able to develop and run their code or indeed run existing shared memory code. 
A set of demanding DSM system design requirements is presented, and the incentives for placing the DSM system within a distributed operating system, and in particular in the memory management server, are reported. The new DSM system is built around an event-driven set of cooperating and distributed entities, and a detailed description of the events, and the reactions to these events, that make up the operation of the DSM system is then presented. This is followed by a pseudocode form of the detailed design of the main modules and activities of the primitives used in the proposed DSM system. Quantitative results of performance tests and qualitative results showing the ease of programming and use of the RHODOS DSM system are reported. A study of five different applications is given, together with the results of tests carried out on these applications and a discussion of those results. A discussion of how RHODOS’ DSM allows programmers to write shared memory code in an easy-to-use and familiar environment, and a comparative evaluation of RHODOS DSM with other DSM systems, are presented. In particular, the ease of use and transparency of the DSM system are demonstrated through the description of the ease with which a moderately inexperienced undergraduate programmer was able to convert, write and run applications for the testing of the DSM system. Furthermore, the description of the tests performed using physically shared memory shows that the latter is indistinguishable from distributed shared memory; this is further evidence that the DSM system is fully transparent. This study clearly demonstrates that the aim of the research has been achieved: it is possible to develop a programmer-friendly and efficient DSM system fully integrated within a distributed operating system.
This research makes clear that a DSM integrated into a client-server and microkernel based distributed operating system makes shared memory operations transparent and almost completely removes the involvement of the programmer beyond the classical activities needed to deal with shared memory. The conclusion can be drawn that DSM, when implemented within a client-server and microkernel based distributed operating system, is one of the most encouraging approaches to parallel processing since it guarantees performance improvements with minimal programmer involvement.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Shared clusters represent an excellent platform for the execution of parallel applications given their low price/performance ratio and the presence of cluster infrastructure in many organisations. The focus of recent research efforts is on parallelism management, transport and efficient access to resources, and making clusters easy to use. In this thesis, we examine reliable parallel computing on clusters. The aim of this research is to demonstrate the feasibility of developing an operating system facility providing transport fault tolerance using existing, enhanced and newly built operating system services for supporting parallel applications. In particular, we use existing process duplication and process migration services, and synthesise a group communications facility for use in a transparent checkpointing facility. This research is carried out using the methods of experimental computer science. To provide a foundation for the synthesis of the group communications and checkpointing facilities, we survey and review related work in both fields. For group communications, we examine the V Distributed System, the x-kernel and Psync, the ISIS Toolkit, and Horus. We identify a need for services that consider the placement of processes on computers in the cluster. For checkpointing, we examine Manetho, KeyKOS, libckpt, and Diskless Checkpointing. We observe the use of remote computer memories for storing checkpoints, and the use of copy-on-write mechanisms to reduce the time to create a checkpoint of a process. We propose a group communications facility providing two sets of services: user-oriented services and system-oriented services. User-oriented services provide transparency and target applications. System-oriented services supplement the user-oriented services in supporting other operating system services and do not provide transparency. Additional flexibility is achieved by providing delivery and ordering semantics independently.
An operating system facility providing transparent checkpointing is synthesised using coordinated checkpointing. To ensure that the facility generates a consistent set of checkpoints, only non-deterministic events are blocked, instead of blindly blocking the processes of a parallel application. This allows the processes of the parallel application to continue execution during the checkpoint operation. Checkpoints are created by adapting process duplication mechanisms, and checkpoint data is transferred to remote computer memories and disk for storage using the mechanisms of process migration. The services of the group communications facility are used to coordinate the checkpoint operation, and to transport checkpoint data to remote computer memories and disk. Both the group communications facility and the checkpointing facility have been implemented in the GENESIS cluster operating system and provide proof-of-concept. GENESIS uses a microkernel and client-server based operating system architecture, and is demonstrated to provide an appropriate environment for the development of these facilities. We design a number of experiments to test the performance of both the group communications facility and the checkpointing facility, and to provide proof-of-performance. We present our approach to testing, the challenges raised in testing the facilities, and how we overcome them. For group communications, we examine the performance of a number of delivery semantics. Good speed-ups are observed, and system-oriented group communication services are shown to provide significant performance advantages over user-oriented semantics in the presence of packet loss. For checkpointing, we examine the scalability of the facility given different levels of resource usage and a variable number of computers. Low overheads are observed for checkpointing a parallel application.
This research makes clear that the microkernel and client-server based cluster operating system provides an ideal environment for the development of a high-performance group communications facility and a transparent checkpointing facility, which together form a platform for reliable parallel computing on clusters.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Recently, fields with substantial computing requirements have turned to cloud computing for economical, scalable, and on-demand provisioning of required execution environments. However, current cloud offerings focus on providing individual servers while tasks such as application distribution and data preparation are left to cloud users. This article presents a new form of cloud called HPC Hybrid Deakin (H2D) cloud; an experimental hybrid cloud capable of utilising both local and remote computational services for large embarrassingly parallel applications. As well as supporting execution, H2D also provides a new service, called DataVault, that provides transparent data management services so all cloud-hosted clusters have required datasets before commencing execution.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Currently, several psychological and non-psychological tests can be found published without standardized procedures across different areas of psychology, such as intelligence, emotional states, attitudes, social skills, vocation, and preferences. Computerized psychological testing is an extension of traditional psychological testing practices. However, it has its own psychometric qualities, arising either from its adaptation to a computerized environment or from the extensions that can be developed within it. The current research, motivated by the need to study the validity and reliability of a computerized test, designed a methodological structure for parallel applications in several kinds of operational groups, evaluating the influence of time and approach on the computerization process. This validity refers to normative group values, reproducibility in the computerized application process, and data processing. Not every psychological test can be computerized. Therefore, our need to find a good test, with the quality and properties suitable for conversion into a computerized application, led us to use the Millon Personality Inventory, created by Theodore Millon. This inventory assesses personality according to 12 bipolarities distributed across 24 factors, grouped into the categories of motivational styles, cognitive targets, and interpersonal relations. The instrument does not diagnose pathological features, but tests normal and non-adaptive aspects of human personality, in line with Theodore Millon's theory of personality. To support this research in the Brazilian context of psychological testing, we discuss the theme, evaluating the advantages and disadvantages of such practices. We also discuss the current forms of computerized psychological testing and the main criteria specific to this specialized area of psychometrics.
The test was available online, hosted at http://www.planetapsi.com, during 2007 and 2008, together with a questionnaire on social characteristics to be completed before the test. A report was generated from the data entered by each user. The test was applied in a linear way with national coverage across all regions of Brazil, yielding 1508 applications. Nine groups were organized, reaching 180 test-retest applications, separated into three time periods and three forms of retest for the study of online tests. In parallel, we organized an offline multi-application session group of 20 subjects who received the tests by email. The subjects of this study were distributed across the five Brazilian regions and were informed about the test via the Internet. The performance of the traditional and online test groups supports the conclusion that online application provides significant consistency on all validity criteria studied, which justifies its use. The online test results were not only correlated among themselves but also similar to data from pencil-and-paper tests (0.82). The retest results showed correlations between 0.92 and 1, while the multi-session results showed a good correlation in these comparisons. In addition, the adequacy of the operational criteria used was assessed, such as security, user performance, environmental characteristics, organization of the database, operational costs and limitations of this online inventory. Performance was excellent on all five of these items, which also leads to the conclusion that a self-applied psychometric test is possible. The results of this work serve as a guide for questioning and establishing methodological studies for computerized psychological testing software in the country.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

This work presents the concept, design and implementation of an MP-SoC platform named STORM (MP-SoC DirecTory-Based PlatfORM). Currently the platform is composed of the following modules: SPARC V8 processor, GPOP processor, Cache module, Memory module, Directory module and two different models of Network-on-Chip, NoCX4 and Obese Tree. All modules were implemented in SystemC, then simulated and validated, individually or in groups. The modules are described in detail. To program the platform in C, a SPARC assembler was implemented that is fully compatible with the assembly code generated by gcc. For parallel programming, a library for mutex management was implemented, using the appropriate assembler support. A total of 10 simulations of increasing complexity are presented to validate the concepts. The simulations include real parallel applications, such as matrix multiplication, Mergesort, KMP, Motion Estimation and 2D DCT.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

The processing capacity of research institutions has been growing significantly as increasingly powerful processors and workstations appear on the market. Given the improved performance of computer networks and the aim of meeting an ever-growing demand for processing, the idea arose of using independent networked computers as a platform for running parallel applications, giving rise to the field of grid computing. In a network under a single administrative domain, resources such as disks and printers are commonly shared. When the network extends beyond one administrative domain, however, this sharing becomes very limited. The purpose of computing grids is to allow resources to be shared even when they are spread across several administrative domains. This dissertation proposes an architecture for the dynamic establishment of multi-domain connections that uses optical burst switching (OBS) with a GMPLS (Generalized Multiprotocol Label Switching) control plane. The architecture is based on storing information about the grid resources of distinct autonomous systems (AS) in a component called the GOBS (Grid OBS) Root Server, and on using explicit routing to reserve resources along a route that satisfies an application's performance constraints. The proposal is validated through simulations showing that the architecture can guarantee differentiated performance levels according to the application class and provides better utilization of network and computing resources.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Breakthrough advances in microprocessor technology and efficient power management have altered the course of processor development with the emergence of multi-core processor technology, which delivers higher levels of processing. The use of many-core technology has boosted the computing power provided by clusters of workstations or SMPs, providing large computational power at an affordable cost using solely commodity components. Different implementations of message-passing libraries and system software (including operating systems) are installed in such cluster and multi-cluster computing systems. To guarantee correct execution of a message-passing parallel application in a computing environment other than the one for which it was originally developed, the application code must be reviewed. In this paper, a hybrid communication interfacing strategy is proposed to execute a parallel application on a group of computing nodes belonging to different clusters or multi-clusters (which may be running different operating systems and MPI implementations), interconnected with public or private IP addresses, and responding interchangeably to user execution requests. Experimental results, obtained by executing benchmark parallel applications, demonstrate the feasibility and effectiveness of the proposed strategy.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Parallel computing offers a number of advantages for running large-scale applications, and the effective use of parallel computational resources is an important aspect of high-performance computing. This work presents a methodology that provides automated execution of parallel applications based on the BSP model with heterogeneous tasks. The adopted model assumes that the computation time of each secondary task does not vary greatly from one iteration to the next. The methodology is called ASE and consists of three stages: Acquisition, Scheduling, and Execution. In the Acquisition stage, the processing times of the tasks are obtained; in the Scheduling stage, the methodology seeks the distribution of tasks that maximizes the execution speed of the parallel application while minimizing resource usage, by means of an algorithm developed in this work; finally, the Execution stage runs the parallel application with the distribution defined in the previous stage. Tools that are applied in the methodology were implemented. A set of tests applying the methodology was carried out, and the results presented show that the objectives of the proposal were achieved.
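
The goal of the Scheduling stage (fastest execution with the fewest resources) can be illustrated with a small Python sketch. The abstract does not describe the algorithm developed in the work, so this sketch substitutes the well-known longest-processing-time-first (LPT) greedy heuristic; the function names and the `max_procs` parameter are hypothetical.

```python
import heapq


def lpt_makespan(times, procs):
    """Greedy LPT: give each task, longest first, to the least-loaded
    processor, and return the resulting superstep length (makespan)."""
    loads = [0.0] * procs
    heapq.heapify(loads)
    for t in sorted(times, reverse=True):
        least = heapq.heappop(loads)
        heapq.heappush(loads, least + t)
    return max(loads)


def smallest_fast_allocation(times, max_procs):
    """Return (processors, makespan) for the smallest processor count whose
    LPT makespan equals the best makespan found with up to max_procs."""
    spans = {p: lpt_makespan(times, p) for p in range(1, max_procs + 1)}
    best = min(spans.values())
    procs = min(p for p, span in spans.items() if span == best)
    return procs, best
```

For task times `[8, 3, 3, 2]` and up to four processors, the longest task bounds the superstep at 8, and two processors already reach that bound, so adding more would only consume resources.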

Relevance:

60.00% 60.00%

Publisher:

Abstract:

This talk explores how the runtime system and operating system can leverage metrics that express the significance and resilience of application components in order to reduce the energy footprint of parallel applications. We will explore in particular how software can tolerate and indeed exploit higher error rates in future processors and memory technologies that may operate outside their safe margins.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

To cover a wide range of pulsed power applications, this paper proposes a modularity concept to improve the performance and flexibility of the pulsed power supply. The proposed scheme exploits parallel and series configurations of flyback modules to obtain high voltage levels with fast rise time (dv/dt). Prototypes were implemented using 600-V insulated-gate bipolar transistor (IGBT) switches to generate up to 4-kV output pulses with a 1-kHz repetition rate for experimentation. To assess the proposed modular approach with a higher number of modules, prototypes based on ten series modules were implemented using 1700-V IGBT switches and tested up to 20 kV. The experimental results verified the effectiveness of the proposed method.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Advances in solid-state switches and power electronics techniques have led to the development of compact, efficient and more reliable pulsed power systems. Although the power rating and operating speed of the new solid-state switches have increased considerably, their low blocking voltage limits pulsed power operation. This paper proposes parallel and series configurations of pulsed power modules to obtain high voltage levels with fast rise time (dv/dt) using only conventional switches. The proposed configuration is based on two flyback modules. The effectiveness of the proposed approach is verified by numerical simulations, and the advantages of each configuration are indicated in comparison with a single module.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Parallel interleaved converters are finding more applications every day; for example, they are frequently used for VRMs on PC main boards, mainly to obtain better transient response. Parallel interleaved converters can have their inductances uncoupled, directly coupled or inversely coupled, all of which have different applications with associated advantages and disadvantages. Coupled systems offer more control over converter features, such as ripple currents, inductance volume and transient response. To gain an intuitive understanding of which type of parallel interleaved converter, what amount of coupling, what number of levels and how much inductance should be used for different applications, a simple equivalent model is needed. As all phases of an interleaved converter are supposed to be identical, the equivalent model is nothing more than a separate inductance which is common to all phases. Without this simplification, the design of a coupled system is quite daunting. Designing a coupled system involves solving for and understanding the RMS currents of the input, the individual phases (or cells) and the output. A procedure using this equivalent model and a small amount of modulo arithmetic is detailed.
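
As a rough illustration of where modulo arithmetic enters interleaved-converter analysis, the Python sketch below computes the textbook output-ripple cancellation factor of an ideal N-phase interleaved buck converter with uncoupled inductors. This is not the equivalent-model procedure of the paper, which the abstract does not give; the function names are hypothetical, and the formula assumes identical phases, evenly spaced phase shifts and steady-state operation.

```python
def effective_duty(n_phases: int, duty: float) -> float:
    """Duty cycle seen by the summed output current: (N * D) mod 1."""
    return (n_phases * duty) % 1.0


def ripple_cancellation(n_phases: int, duty: float) -> float:
    """Ratio of total output ripple current to single-phase ripple current
    for an ideal N-phase interleaved buck with uncoupled inductors:
    K = d_eff * (1 - d_eff) / (N * D * (1 - D))."""
    if n_phases < 1 or not 0.0 < duty < 1.0:
        raise ValueError("need n_phases >= 1 and 0 < duty < 1")
    d_eff = effective_duty(n_phases, duty)
    return d_eff * (1.0 - d_eff) / (n_phases * duty * (1.0 - duty))
```

Under these assumptions, a two-phase converter at 50% duty cycle has an effective duty of zero and the output ripple cancels completely, while a single phase gives a factor of one.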