990 resultados para grid computing


Relevância:

60.00% 60.00%

Publicador:

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Existing benchmarking methods are time consuming processes as they typically benchmark the entire Virtual Machine (VM) in order to generate accurate performance data, making them less suitable for real-time analytics. The research in this paper is aimed to surmount the above challenge by presenting DocLite - Docker Container-based Lightweight benchmarking tool. DocLite explores lightweight cloud benchmarking methods for rapidly executing benchmarks in near real-time. DocLite is built on the Docker container technology, which allows a user-defined memory size and number of CPU cores of the VM to be benchmarked. The tool incorporates two benchmarking methods - the first referred to as the native method employs containers to benchmark a small portion of the VM and generate performance ranks, and the second uses historic benchmark data along with the native method as a hybrid to generate VM ranks. The proposed methods are evaluated on three use-cases and are observed to be up to 91 times faster than benchmarking the entire VM. In both methods, small containers provide the same quality of rankings as a large container. The native method generates ranks with over 90% and 86% accuracy for sequential and parallel execution of an application compared against benchmarking the whole VM. The hybrid method did not improve the quality of the rankings significantly.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

As tecnologias de informação e comunicação na área da saúde não são só um instrumento para a boa gestão de informação, mas antes um fator estratégico para uma prestação de cuidados mais eficiente e segura. As tecnologias de informação são um pilar para que os sistemas de saúde evoluam em direção a um modelo centrado no cidadão, no qual um conjunto abrangente de informação do doente deve estar automaticamente disponível para as equipas que lhe prestam cuidados, independentemente de onde foi gerada (local geográfico ou sistema). Este tipo de utilização segura e agregada da informação clínica é posta em causa pela fragmentação generalizada das implementações de sistemas de informação em saúde. Várias aproximações têm sido propostas para colmatar as limitações decorrentes das chamadas “ilhas de informação” na saúde, desde a centralização total (um sistema único), à utilização de redes descentralizadas de troca de mensagens clínicas. Neste trabalho, propomos a utilização de uma camada de unificação baseada em serviços, através da federação de fontes de informação heterogéneas. Este agregador de informação clínica fornece a base necessária para desenvolver aplicações com uma lógica regional, que demostrámos com a implementação de um sistema de registo de saúde eletrónico virtual. Ao contrário dos métodos baseados em mensagens clínicas ponto-a-ponto, populares na integração de sistemas em saúde, desenvolvemos um middleware segundo os padrões de arquitetura J2EE, no qual a informação federada é expressa como um modelo de objetos, acessível através de interfaces de programação. A arquitetura proposta foi instanciada na Rede Telemática de Saúde, uma plataforma instalada na região de Aveiro que liga oito instituições parceiras (dois hospitais e seis centros de saúde), cobrindo ~350.000 cidadãos, utilizada por ~350 profissionais registados e que permite acesso a mais de 19.000.000 de episódios. Para além da plataforma colaborativa regional para a saúde (RTSys), introduzimos uma segunda linha de investigação, procurando fazer a ponte entre as redes para a prestação de cuidados e as redes para a computação científica. Neste segundo cenário, propomos a utilização dos modelos de computação Grid para viabilizar a utilização e integração massiva de informação biomédica. A arquitetura proposta (não implementada) permite o acesso a infraestruturas de e-Ciência existentes para criar repositórios de informação clínica para aplicações em saúde.

Relevância:

60.00% 60.00%

Publicador:

Relevância:

60.00% 60.00%

Publicador:

Resumo:

E-scientists want to run their scientific experiments on Distributed Computing Infrastructures (DCI) to be able to access large pools of resources and services. To run experiments on these infrastructures requires specific expertise that e-scientists may not have. Workflows can hide resources and services as a virtualization layer providing a user interface that e-scientists can use. There are many workflow systems used by research communities but they are not interoperable. To learn a workflow system and create workflows in this workflow system may require significant efforts from e-scientists. Considering these efforts it is not reasonable to expect that research communities will learn new workflow systems if they want to run workflows developed in other workflow systems. The solution is to create workflow interoperability solutions to allow workflow sharing. The FP7 Sharing Interoperable Workflow for Large-Scale Scientific Simulation on Available DCIs (SHIWA) project developed two interoperability solutions to support workflow sharing: Coarse-Grained Interoperability (CGI) and Fine-Grained Interoperability (FGI). The project created the SHIWA Simulation Platform (SSP) to implement the Coarse-Grained Interoperability approach as a production-level service for research communities. The paper describes the CGI approach and how it enables sharing and combining existing workflows into complex applications and run them on Distributed Computing Infrastructures. The paper also outlines the architecture, components and usage scenarios of the simulation platform.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Las herramientas ETL (Extract, Transform, Load – extraer, transformar, cargar) permiten modelizar flujos de datos, facilitando la ejecución automática de procesos repetitivos. El intercambio de información entre dos modelos de datos heterogéneos es un claro ejemplo del tipo de tareas que pueden abordarse con software ETL. El proyecto Kettle es una herramienta ETL con licencia LGPL (Library General Public License) que utiliza técnicas de computación grid (ejecución paralela y distribuida) para poder procesar grandes cantidades de datos en un tiempo reducido. Kettle combina una potente ejecución en modo servidor con una intuitiva herramienta de escritorio para modelar los procesos y configurar los parámetros de ejecución. GeoKettle es una extensión de Kettle, que añade la posibilidad de tratar datos con componente geográfica, si bien está limitado a datos vectoriales y a ciertas operaciones espaciales muy concreta. El Centro Temático Europeo de Usos del Suelo e Información Espacial (ETC-LUSI) está impulsando un proyecto complementario, llamado BeETLe, que pretende ampliar drásticamente las capacidades de análisis y transformación espacial de GeoKettle. Para ello se ha elegido el proyecto Sextante, una librería de análisis espacial que incluye más de doscientos algoritmos ráster y vectoriales. La intención del proyecto BeETLe es integrar el conjunto de algoritmos de Sextante en GeoKettle, de forma que estén disponibles como transformaciones de GeoKettle. Las principales características de la herramienta BeETLe incluyen: automatización de procesos de análisis espacial o de transformaciones repetitivas de datos espaciales, ejecución paralela y distribuida (grid computing), capacidad para procesar grandes cantidades de datos sin limitaciones de memoria, y soporte de datos ráster y vectorial. Los usuarios actuales de Sextante descubrirán que BeETLe les propone una forma de trabajo sencilla e intuitiva, que añade a Sextante toda la potencia que ofrecen las herramientas ETL para procesar y transformar información en bases de datos

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Compute grids are used widely in many areas of environmental science, but there has been limited uptake of grid computing by the climate modelling community, partly because the characteristics of many climate models make them difficult to use with popular grid middleware systems. In particular, climate models usually produce large volumes of output data, and running them also involves complicated workflows implemented as shell scripts. A new grid middleware system that is well suited to climate modelling applications is presented in this paper. Grid Remote Execution (G-Rex) allows climate models to be deployed as Web services on remote computer systems and then launched and controlled as if they were running on the user's own computer. Output from the model is transferred back to the user while the run is in progress to prevent it from accumulating on the remote system and to allow the user to monitor the model. G-Rex has a REST architectural style, featuring a Java client program that can easily be incorporated into existing scientific workflow scripts. Some technical details of G-Rex are presented, with examples of its use by climate modellers.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In real world applications sequential algorithms of data mining and data exploration are often unsuitable for datasets with enormous size, high-dimensionality and complex data structure. Grid computing promises unprecedented opportunities for unlimited computing and storage resources. In this context there is the necessity to develop high performance distributed data mining algorithms. However, the computational complexity of the problem and the large amount of data to be explored often make the design of large scale applications particularly challenging. In this paper we present the first distributed formulation of a frequent subgraph mining algorithm for discriminative fragments of molecular compounds. Two distributed approaches have been developed and compared on the well known National Cancer Institute’s HIV-screening dataset. We present experimental results on a small-scale computing environment.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper is addressed to the numerical solving of the rendering equation in realistic image creation. The rendering equation is integral equation describing the light propagation in a scene accordingly to a given illumination model. The used illumination model determines the kernel of the equation under consideration. Nowadays, widely used are the Monte Carlo methods for solving the rendering equation in order to create photorealistic images. In this work we consider the Monte Carlo solving of the rendering equation in the context of the parallel sampling scheme for hemisphere. Our aim is to apply this sampling scheme to stratified Monte Carlo integration method for parallel solving of the rendering equation. The domain for integration of the rendering equation is a hemisphere. We divide the hemispherical domain into a number of equal sub-domains of orthogonal spherical triangles. This domain partitioning allows to solve the rendering equation in parallel. It is known that the Neumann series represent the solution of the integral equation as a infinity sum of integrals. We approximate this sum with a desired truncation error (systematic error) receiving the fixed number of iteration. Then the rendering equation is solved iteratively using Monte Carlo approach. At each iteration we solve multi-dimensional integrals using uniform hemisphere partitioning scheme. An estimate of the rate of convergence is obtained using the stratified Monte Carlo method. This domain partitioning allows easy parallel realization and leads to convergence improvement of the Monte Carlo method. The high performance and Grid computing of the corresponding Monte Carlo scheme are discussed.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Smart healthcare is a complex domain for systems integration due to human and technical factors and heterogeneous data sources involved. As a part of smart city, it is such a complex area where clinical functions require smartness of multi-systems collaborations for effective communications among departments, and radiology is one of the areas highly relies on intelligent information integration and communication. Therefore, it faces many challenges regarding integration and its interoperability such as information collision, heterogeneous data sources, policy obstacles, and procedure mismanagement. The purpose of this study is to conduct an analysis of data, semantic, and pragmatic interoperability of systems integration in radiology department, and to develop a pragmatic interoperability framework for guiding the integration. We select an on-going project at a local hospital for undertaking our case study. The project is to achieve data sharing and interoperability among Radiology Information Systems (RIS), Electronic Patient Record (EPR), and Picture Archiving and Communication Systems (PACS). Qualitative data collection and analysis methods are used. The data sources consisted of documentation including publications and internal working papers, one year of non-participant observations and 37 interviews with radiologists, clinicians, directors of IT services, referring clinicians, radiographers, receptionists and secretary. We identified four primary phases of data analysis process for the case study: requirements and barriers identification, integration approach, interoperability measurements, and knowledge foundations. Each phase is discussed and supported by qualitative data. Through the analysis we also develop a pragmatic interoperability framework that summaries the empirical findings and proposes recommendations for guiding the integration in the radiology context.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The evolution of commodity computing lead to the possibility of efficient usage of interconnected machines to solve computationally-intensive tasks, which were previously solvable only by using expensive supercomputers. This, however, required new methods for process scheduling and distribution, considering the network latency, communication cost, heterogeneous environments and distributed computing constraints. An efficient distribution of processes over such environments requires an adequate scheduling strategy, as the cost of inefficient process allocation is unacceptably high. Therefore, a knowledge and prediction of application behavior is essential to perform effective scheduling. In this paper, we overview the evolution of scheduling approaches, focusing on distributed environments. We also evaluate the current approaches for process behavior extraction and prediction, aiming at selecting an adequate technique for online prediction of application execution. Based on this evaluation, we propose a novel model for application behavior prediction, considering chaotic properties of such behavior and the automatic detection of critical execution points. The proposed model is applied and evaluated for process scheduling in cluster and grid computing environments. The obtained results demonstrate that prediction of the process behavior is essential for efficient scheduling in large-scale and heterogeneous distributed environments, outperforming conventional scheduling policies by a factor of 10, and even more in some cases. Furthermore, the proposed approach proves to be efficient for online predictions due to its low computational cost and good precision. (C) 2009 Elsevier B.V. All rights reserved.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In the present work, the effects of spatial constraints on the efficiency of task execution in systems underlain by geographical complex networks are investigated, where the probability of connection decreases with the distance between the nodes. The investigation considers several configurations of the parameters defining the network connectivity, and the Barabasi-Albert network model is also considered for comparisons. The results show that the effect of connectivity is significant only for shorter tasks, the locality of connection simplied by the spatial constraints reduces efficiency, and the addition of edges can improve the efficiency of the execution, although with increasing locality of the connections the improvement is small.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The InteGrade middleware intends to exploit the idle time of computing resources in computer laboratories. In this work we investigate the performance of running parallel applications with communication among processors on the InteGrade grid. As costly communication on a grid can be prohibitive, we explore the so-called systolic or wavefront paradigm to design the parallel algorithms in which no global communication is used. To evaluate the InteGrade middleware we considered three parallel algorithms that solve the matrix chain product problem, the 0-1 Knapsack Problem, and the local sequence alignment problem, respectively. We show that these three applications running under the InteGrade middleware and MPI take slightly more time than the same applications running on a cluster with only LAM-MPI support. The results can be considered promising and the time difference between the two is not substantial. The overhead of the InteGrade middleware is acceptable, in view of the benefits obtained to facilitate the use of grid computing by the user. These benefits include job submission, checkpointing, security, job migration, etc. Copyright (C) 2009 John Wiley & Sons, Ltd.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper, we propose a scalable and fault-tolerant job scheduling framework for grid computing. The proposed framework loosely couples a dynamic job scheduling approach with the hybrid replications approach to schedule jobs efficiently while at the same time providing fault-tolerance. The novelty of the proposed framework is that it uses passive replication approach under high system load and active replication approach under low system loads. The switch between these two replication methods is also done dynamically and transparently.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In e-Science experiments, it is vital to record the experimental process for later use such as in interpreting results, verifying that the correct process took place or tracing where data came from. The process that led to some data is called the provenance of that data, and a provenance architecture is the software architecture for a system that will provide the necessary functionality to record, store and use process documentation. However, there has been little principled analysis of what is actually required of a provenance architecture, so it is impossible to determine the functionality they would ideally support. In this paper, we present use cases for a provenance architecture from current experiments in biology, chemistry, physics and computer science, and analyse the use cases to determine the technical requirements of a generic, technology and application-independent architecture. We propose an architecture that meets these requirements and evaluate a preliminary implementation by attempting to realise two of the use cases.