196 results for Workflows


Abstract:

Data analytic applications are characterized by large data sets that are subject to a series of processing phases. Some of these phases are executed sequentially, but others can be executed concurrently or in parallel on clusters, grids or clouds. The MapReduce programming model has been applied to process large data sets in cluster and cloud environments. Developing an application with MapReduce requires installing, configuring and accessing specific frameworks such as Apache Hadoop or Elastic MapReduce in the Amazon Cloud. It would be desirable to provide more flexibility in adjusting such configurations according to the application characteristics. Furthermore, the composition of the multiple phases of a data analytic application requires the specification of all the phases and their orchestration. The original MapReduce model and environment lack flexible support for such configuration and composition. Recognizing that scientific workflows have been successfully applied to modeling complex applications, this paper describes our experiments on implementing MapReduce as subworkflows in the AWARD framework (Autonomic Workflow Activities Reconfigurable and Dynamic). A text mining data analytic application is modeled as a complex workflow with multiple phases, where individual workflow nodes support MapReduce computations. As in typical MapReduce environments, the end user only needs to define the application algorithms for input data processing and for the map and reduce functions. In the paper we present experimental results when using the AWARD framework to execute MapReduce workflows deployed over multiple Amazon EC2 (Elastic Compute Cloud) instances.
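
To make the division of labour in the abstract above concrete, here is a minimal sketch of a map function and a reduce function for a word-count style text mining step. It is plain Python, not the AWARD framework or Hadoop; the names map_phase, shuffle, reduce_phase and run_mapreduce are hypothetical and only illustrate what an end user would supply (map and reduce) versus what the environment would handle (grouping by key).

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group intermediate values by key (normally done by the framework)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: sum the counts collected for one word."""
    return key, sum(values)


def run_mapreduce(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    intermediate = []
    for doc in documents:
        intermediate.extend(map_phase(doc))
    grouped = shuffle(intermediate)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

In a real deployment the map calls would run in parallel on separate nodes; this sketch runs them sequentially to keep the data flow visible.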

Abstract:

Workflows have been successfully applied to express the decomposition of complex scientific applications. However, existing tools still lack adequate support for several important aspects, namely decoupling the enactment engine from task specification, decentralizing the control of workflow activities so that their tasks can run in distributed infrastructures, and supporting dynamic workflow reconfigurations. We present the AWARD (Autonomic Workflow Activities Reconfigurable and Dynamic) model of computation, based on Process Networks, where the workflow activities (AWA) are autonomic processes with independent control that can run in parallel on distributed infrastructures. Each AWA executes a task developed as a Java class with a generic interface, allowing end users to code their applications without low-level details. The data-driven coordination of AWA interactions is based on a shared tuple space that also enables dynamic workflow reconfiguration. For evaluation we describe experimental results of AWARD workflow executions in several application scenarios, mapped to the Amazon Elastic Compute Cloud (EC2).
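
The idea of data-driven coordination through a shared tuple space can be sketched as follows. This is a toy, in-memory illustration written in Python for brevity; it is not the AWARD API (which the abstract describes as Java based), and the names TupleSpace, activity and run_pipeline are hypothetical. Each activity runs in its own thread, blocks until a tuple with its input tag appears, and publishes its result under an output tag, so no central engine drives the control flow.

```python
import threading


class TupleSpace:
    """Toy in-memory tuple space: put() adds a tuple, take() blocks for a match."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def put(self, tup):
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def take(self, tag):
        """Remove and return the first tuple whose first field equals tag."""
        with self._cond:
            while True:
                for tup in self._tuples:
                    if tup[0] == tag:
                        self._tuples.remove(tup)
                        return tup
                self._cond.wait()


def activity(space, in_tag, out_tag, task):
    """One autonomous activity: wait for input, run the task, publish output."""
    _, value = space.take(in_tag)
    space.put((out_tag, task(value)))


def run_pipeline(x):
    """Wire two activities together purely through tags in the tuple space."""
    space = TupleSpace()
    workers = [
        threading.Thread(target=activity,
                         args=(space, "raw", "doubled", lambda v: v * 2)),
        threading.Thread(target=activity,
                         args=(space, "doubled", "result", lambda v: v + 1)),
    ]
    for worker in workers:
        worker.start()
    space.put(("raw", x))
    _, result = space.take("result")
    for worker in workers:
        worker.join()
    return result
```

Because activities are linked only by the tags they read and write, changing a tag at run time reroutes the data, which is one simple way to picture dynamic reconfiguration.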

Abstract:

Dissertation submitted to obtain the degree of Master in Informatics Engineering (Engenharia Informática)

Abstract:

Thesis submitted in fulfilment of the requirements for the Degree of Master of Science in Computer Science

Abstract:

Doctoral Program in Computer Science

Abstract:

Workflow technology is expanding rapidly, and new technologies are being employed as it does so. The internet, one such technology, could allow every user within an organization to make use of workflow. This thesis discusses internet-based workflows from both technical and economic points of view. First, as a broader introduction, it presents the basic concepts related to the topic: the workflow concept, processes and workflows, and the workflow management system. The introduction also discusses the XML language and gives an overview of the Web Services stack. The thesis then explains how internet-based workflows work, presenting the architecture of an internet-based enterprise and the flows between web services. Finally, it briefly presents some workflow languages. Based on this knowledge, a sample workflow was implemented.

Abstract:

Presentation by Janet Aucock at the FinELib Consortium Seminar (Aineistopäivä), April 16, 2015, in Helsinki.

Abstract:

Wednesday 23rd April 2014
Speaker(s): Willi Hasselbring
Organiser: Leslie Carr
Time: 23/04/2014 11:00-11:50
Location: B32/3077
File size: 669 Mb

Abstract: For good scientific practice, it is important that research results may be properly checked by reviewers and possibly repeated and extended by other researchers. This is of particular interest for "digital science" i.e. for in-silico experiments. In this talk, I'll discuss some issues of how software systems and services may contribute to good scientific practice. Particularly, I'll present our PubFlow approach to automate publication workflows for scientific data. The PubFlow workflow management system is based on established technology. We integrate institutional repository systems (based on EPrints) and world data centers (in marine science). PubFlow collects provenance data automatically via our monitoring framework Kieker. Provenance information describes the origins and the history of scientific data in its life cycle, and the process by which it arrived. Thus, provenance information is highly relevant to repeatability and trustworthiness of scientific results. In our evaluation in marine science, we collaborate with the GEOMAR Helmholtz Centre for Ocean Research Kiel.

Abstract:

The service-oriented approach to performing distributed scientific research is potentially very powerful but is not yet widely used in many scientific fields. This is partly due to the technical difficulties involved in creating services and workflows and the inefficiency of many workflow systems with regard to handling large datasets. We present the Styx Grid Service, a simple system that wraps command-line programs and allows them to be run over the Internet exactly as if they were local programs. Styx Grid Services are very easy to create and use and can be composed into powerful workflows with simple shell scripts or more sophisticated graphical tools. An important feature of the system is that data can be streamed directly from service to service, significantly increasing the efficiency of workflows that use large data volumes. The status and progress of Styx Grid Services can be monitored asynchronously using a mechanism that places very few demands on firewalls. We show how Styx Grid Services can interoperate with Web Services and WS-Resources using suitable adapters.
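
The benefit of streaming data directly between stages, rather than waiting for each stage to finish, can be pictured with a small generator pipeline. This is only an analogy in plain Python, not Styx Grid Service code; source, transform and run_streaming are hypothetical names, and the log list exists solely to show the order in which the stages touch each record.

```python
def source(records, log):
    """Upstream stage: hand records downstream one at a time."""
    for record in records:
        log.append(f"produced {record}")
        yield record


def transform(stream, log):
    """Downstream stage: process each record as soon as it arrives."""
    for record in stream:
        log.append(f"transformed {record}")
        yield record * 10


def run_streaming(records):
    """Chain the stages and return the results plus the event order."""
    log = []
    results = list(transform(source(records, log), log))
    return results, log
```

The log shows the two stages interleaving record by record, so the downstream stage never has to hold the full dataset in memory before it starts working.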

Abstract:

Scientific workflows are becoming a valuable tool for scientists to capture and automate e-Science procedures. Their success brings the opportunity to publish, share, reuse and repurpose this explicitly captured knowledge. Within the myGrid project, we have identified key resources that can be shared, including complete workflows, fragments of workflows and constituent services. We have examined the alternative ways these can be described by their authors (and subsequent users), and developed a unified descriptive model to support their later discovery. By basing this model on existing standards, we have been able to extend existing Web Service and Semantic Web Service infrastructure whilst still supporting the specific needs of the e-Scientist. myGrid components enable a workflow life-cycle that extends beyond execution, to include discovery of previous relevant designs, reuse of those designs, and subsequent publication. Experience with example groups of scientists indicates that this cycle is valuable. The growing number of workflows and services means more work is needed to support the user in effective ranking of search results, and to support the repurposing process.

Abstract:

This article explores libraries’ technical workflow design and strategic considerations as various e-book business models, mobile devices and their management become a growing part of the information landscape.

Abstract:

The CMS Collaboration conducted a month-long data taking exercise, the Cosmic Run At Four Tesla, during October-November 2008, with the goal of commissioning the experiment for extended operation. With all installed detector systems participating, CMS recorded 270 million cosmic ray events with the solenoid at a magnetic field strength of 3.8 T. This paper describes the data flow from the detector through the various online and offline computing systems, as well as the workflows used for recording the data, for aligning and calibrating the detector, and for analysis of the data. © 2010 IOP Publishing Ltd and SISSA.