598 resultados para Workflows semânticos


Relevância:

10.00% 10.00%

Publicador:

Resumo:

Compute grids are used widely in many areas of environmental science, but there has been limited uptake of grid computing by the climate modelling community, partly because the characteristics of many climate models make them difficult to use with popular grid middleware systems. In particular, climate models usually produce large volumes of output data, and running them also involves complicated workflows implemented as shell scripts. A new grid middleware system that is well suited to climate modelling applications is presented in this paper. Grid Remote Execution (G-Rex) allows climate models to be deployed as Web services on remote computer systems and then launched and controlled as if they were running on the user's own computer. Output from the model is transferred back to the user while the run is in progress to prevent it from accumulating on the remote system and to allow the user to monitor the model. G-Rex has a REST architectural style, featuring a Java client program that can easily be incorporated into existing scientific workflow scripts. Some technical details of G-Rex are presented, with examples of its use by climate modellers.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Compute grids are used widely in many areas of environmental science, but there has been limited uptake of grid computing by the climate modelling community, partly because the characteristics of many climate models make them difficult to use with popular grid middleware systems. In particular, climate models usually produce large volumes of output data, and running them usually involves complicated workflows implemented as shell scripts. For example, NEMO (Smith et al. 2008) is a state-of-the-art ocean model that is used currently for operational ocean forecasting in France, and will soon be used in the UK for both ocean forecasting and climate modelling. On a typical modern cluster, a particular one year global ocean simulation at 1-degree resolution takes about three hours when running on 40 processors, and produces roughly 20 GB of output as 50000 separate files. 50-year simulations are common, during which the model is resubmitted as a new job after each year. Running NEMO relies on a set of complicated shell scripts and command utilities for data pre-processing and post-processing prior to job resubmission. Grid Remote Execution (G-Rex) is a pure Java grid middleware system that allows scientific applications to be deployed as Web services on remote computer systems, and then launched and controlled as if they are running on the user's own computer. Although G-Rex is general purpose middleware it has two key features that make it particularly suitable for remote execution of climate models: (1) Output from the model is transferred back to the user while the run is in progress to prevent it from accumulating on the remote system and to allow the user to monitor the model; (2) The client component is a command-line program that can easily be incorporated into existing model work-flow scripts. G-Rex has a REST (Fielding, 2000) architectural style, which allows client programs to be very simple and lightweight and allows users to interact with model runs using only a basic HTTP client (such as a Web browser or the curl utility) if they wish. This design also allows for new client interfaces to be developed in other programming languages with relatively little effort. The G-Rex server is a standard Web application that runs inside a servlet container such as Apache Tomcat and is therefore easy to install and maintain by system administrators. G-Rex is employed as the middleware for the NERC1 Cluster Grid, a small grid of HPC2 clusters belonging to collaborating NERC research institutes. Currently the NEMO (Smith et al. 2008) and POLCOMS (Holt et al, 2008) ocean models are installed, and there are plans to install the Hadley Centre’s HadCM3 model for use in the decadal climate prediction project GCEP (Haines et al., 2008). The science projects involving NEMO on the Grid have a particular focus on data assimilation (Smith et al. 2008), a technique that involves constraining model simulations with observations. The POLCOMS model will play an important part in the GCOMS project (Holt et al, 2008), which aims to simulate the world’s coastal oceans. A typical use of G-Rex by a scientist to run a climate model on the NERC Cluster Grid proceeds as follows :(1) The scientist prepares input files on his or her local machine. (2) Using information provided by the Grid’s Ganglia3 monitoring system, the scientist selects an appropriate compute resource. (3) The scientist runs the relevant workflow script on his or her local machine. This is unmodified except that calls to run the model (e.g. with “mpirun”) are simply replaced with calls to "GRexRun" (4) The G-Rex middleware automatically handles the uploading of input files to the remote resource, and the downloading of output files back to the user, including their deletion from the remote system, during the run. (5) The scientist monitors the output files, using familiar analysis and visualization tools on his or her own local machine. G-Rex is well suited to climate modelling because it addresses many of the middleware usability issues that have led to limited uptake of grid computing by climate scientists. It is a lightweight, low-impact and easy-to-install solution that is currently designed for use in relatively small grids such as the NERC Cluster Grid. A current topic of research is the use of G-Rex as an easy-to-use front-end to larger-scale Grid resources such as the UK National Grid service.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Grid workflow authoring tools are typically specific to particular workflow engines built into Grid middleware, or are application specific and are designed to interact with specific software implementations. g-Eclipse is a middleware independent Grid workbench that aims to provide a unified abstraction of the Grid and includes a Grid workflow builder to allow users to author and deploy workflows to the Grid. This paper describes the g-Eclipse Workflow Builder and its implementations for two Grid middlewares, gLite and GRIA, and a case study utilizing the Workflow Builder in a Grid user's scientific workflow deployment.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In the Biodiversity World (BDW) project we have created a flexible and extensible Web Services-based Grid environment for biodiversity researchers to solve problems in biodiversity and analyse biodiversity patterns. In this environment, heterogeneous and globally distributed biodiversity-related resources such as data sets and analytical tools are made available to be accessed and assembled by users into workflows to perform complex scientific experiments. One such experiment is bioclimatic modelling of the geographical distribution of individual species using climate variables in order to predict past and future climate-related changes in species distribution. Data sources and analytical tools required for such analysis of species distribution are widely dispersed, available on heterogeneous platforms, present data in different formats and lack interoperability. The BDW system brings all these disparate units together so that the user can combine tools with little thought as to their availability, data formats and interoperability. The current Web Servicesbased Grid environment enables execution of the BDW workflow tasks in remote nodes but with a limited scope. The next step in the evolution of the BDW architecture is to enable workflow tasks to utilise computational resources available within and outside the BDW domain. We describe the present BDW architecture and its transition to a new framework which provides a distributed computational environment for mapping and executing workflows in addition to bringing together heterogeneous resources and analytical tools.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

In the BiodiversityWorld project we are building a GRID to support scientific biodiversity-related research. The requirements associated with such a GRID are somewhat different from other GRIDs, and this has influenced the architecture that we have developed. In this paper we outline these requirements, most notably the need to interoperate over a diverse set of legacy databases and applications in an environment that supports effective resource discovery and use of these resources in complex workflows. Our architecture provides an invocation model that is usable over a wide range of resource types and underlying GRID middleware. However, there is a trade-off between the flexibility provided by our architecture and its performance. We discuss how this affects the inclusion of computationally intensive applications and applications that are highly interactive; we also consider the broader issue of interoperation with other GRIDs.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Background: In many experimental pipelines, clustering of multidimensional biological datasets is used to detect hidden structures in unlabelled input data. Taverna is a popular workflow management system that is used to design and execute scientific workflows and aid in silico experimentation. The availability of fast unsupervised methods for clustering and visualization in the Taverna platform is important to support a data-driven scientific discovery in complex and explorative bioinformatics applications. Results: This work presents a Taverna plugin, the Biological Data Interactive Clustering Explorer (BioDICE), that performs clustering of high-dimensional biological data and provides a nonlinear, topology preserving projection for the visualization of the input data and their similarities. The core algorithm in the BioDICE plugin is Fast Learning Self Organizing Map (FLSOM), which is an improved variant of the Self Organizing Map (SOM) algorithm. The plugin generates an interactive 2D map that allows the visual exploration of multidimensional data and the identification of groups of similar objects. The effectiveness of the plugin is demonstrated on a case study related to chemical compounds. Conclusions: The number and variety of available tools and its extensibility have made Taverna a popular choice for the development of scientific data workflows. This work presents a novel plugin, BioDICE, which adds a data-driven knowledge discovery component to Taverna. BioDICE provides an effective and powerful clustering tool, which can be adopted for the explorative analysis of biological datasets.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Human brain imaging techniques, such as Magnetic Resonance Imaging (MRI) or Diffusion Tensor Imaging (DTI), have been established as scientific and diagnostic tools and their adoption is growing in popularity. Statistical methods, machine learning and data mining algorithms have successfully been adopted to extract predictive and descriptive models from neuroimage data. However, the knowledge discovery process typically requires also the adoption of pre-processing, post-processing and visualisation techniques in complex data workflows. Currently, a main problem for the integrated preprocessing and mining of MRI data is the lack of comprehensive platforms able to avoid the manual invocation of preprocessing and mining tools, that yields to an error-prone and inefficient process. In this work we present K-Surfer, a novel plug-in of the Konstanz Information Miner (KNIME) workbench, that automatizes the preprocessing of brain images and leverages the mining capabilities of KNIME in an integrated way. K-Surfer supports the importing, filtering, merging and pre-processing of neuroimage data from FreeSurfer, a tool for human brain MRI feature extraction and interpretation. K-Surfer automatizes the steps for importing FreeSurfer data, reducing time costs, eliminating human errors and enabling the design of complex analytics workflow for neuroimage data by leveraging the rich functionalities available in the KNIME workbench.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This text extends some ideas presented in a keynote lecture of the 5th Encontro de Tipografia conference, in Barcelos, Portugal, in November 2014. The paper discusses problems of identifying the location and encoding of design decisions, the implications of digital workflows for capturing knowledge generating through design practice, and the consequences of the transformation of production tools into commodities. It concludes with a discussion of the perception of added value in typeface design.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The emergence and development of digital imaging technologies and their impact on mainstream filmmaking is perhaps the most familiar special effects narrative associated with the years 1981-1999. This is in part because some of the questions raised by the rise of the digital still concern us now, but also because key milestone films showcasing advancements in digital imaging technologies appear in this period, including Tron (1982) and its computer generated image elements, the digital morphing in The Abyss (1989) and Terminator 2: Judgment Day (1991), computer animation in Jurassic Park (1993) and Toy Story (1995), digital extras in Titanic (1997), and ‘bullet time’ in The Matrix (1999). As a result it is tempting to characterize 1981-1999 as a ‘transitional period’ in which digital imaging processes grow in prominence and technical sophistication, and what we might call ‘analogue’ special effects processes correspondingly become less common. But such a narrative risks eliding the other practices that also shape effects sequences in this period. Indeed, the 1980s and 1990s are striking for the diverse range of effects practices in evidence in both big budget films and lower budget productions, and for the extent to which analogue practices persist independently of or alongside digital effects work in a range of production and genre contexts. The chapter seeks to document and celebrate this diversity and plurality, this sustaining of earlier traditions of effects practice alongside newer processes, this experimentation with materials and technologies old and new in the service of aesthetic aspirations alongside budgetary and technical constraints. The common characterization of the period as a series of rapid transformations in production workflows, practices and technologies will be interrogated in relation to the persistence of certain key figures as Douglas Trumbull, John Dykstra, and James Cameron, but also through a consideration of the contexts for and influences on creative decision-making. Comparative analyses of the processes used to articulate bodies, space and scale in effects sequences drawn from different generic sites of special effects work, including science fiction, fantasy, and horror, will provide a further frame for the chapter’s mapping of the commonalities and specificities, continuities and variations in effects practices across the period. In the process, the chapter seeks to reclaim analogue processes’ contribution both to moments of explicit spectacle, and to diegetic verisimilitude, in the decades most often associated with the digital’s ‘arrival’.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The aim for this thesis was to develop a proposal of documentation, containing rules and procedures, which JernströmOffset needs to acquire the Certified Graphic Production certification. A fundamental part was to study the materialissued by Sveriges Grafiska Mediaförening, to clarify the requirements that must be met to obtain the certification. Therecommendations found in the CGP material must also be considered.Based on the requirements and recommendations of Certified Graphic Production, a mapping of the workflows atJernström Offset was performed. It was done by interviewing employees from different departments at the company,this to get a clear understanding of the operations carried out throughout the production flow and the final qualityfollow-up.During our reviewing process we found that a number of changes, in terms of working environment and practices,must be made at the company. As a result of this we propose some appropriate actions to be implemented.The documentation was finally written, through the requirements and recommendations of Certified GraphicProduction and then applied to Jernström Offset’s work procedures.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This thesis is about new digital moving image recording technologies and how they augment the distribution of creativity and the flexibility in moving image production systems, but also impose constraints on how images flow through the production system. The central concept developed in this thesis is ‘creative space’ which links quality and efficiency in moving image production to time for creative work, capacity of digital tools, user skills and the constitution of digital moving image material. The empirical evidence of this thesis is primarily based on semi-structured interviews conducted with Swedish film and TV production representatives.This thesis highlights the importance of pre-production technical planning and proposes a design management support tool (MI-FLOW) as a way to leverage functional workflows that is a prerequisite for efficient and cost effective moving image production.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

MyGrid is an e-Science Grid project that aims to help biologists and bioinformaticians to perform workflow-based in silico experiments, and help them to automate the management of such workflows through personalisation, notification of change and publication of experiments. In this paper, we describe the architecture of myGrid and how it will be used by the scientist. We then show how myGrid can benefit from agents technologies. We have identified three key uses of agent technologies in myGrid: user agents, able to customize and personalise data, agent communication languages offering a generic and portable communication medium, and negotiation allowing multiple distributed entities to reach service level agreements.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

As scientific workflows and the data they operate on, grow in size and complexity, the task of defining how those workflows should execute (which resources to use, where the resources must be in readiness for processing etc.) becomes proportionally more difficult. While "workflow compilers", such as Pegasus, reduce this burden, a further problem arises: since specifying details of execution is now automatic, a workflow's results are harder to interpret, as they are partly due to specifics of execution. By automating steps between the experiment design and its results, we lose the connection between them, hindering interpretation of results. To reconnect the scientific data with the original experiment, we argue that scientists should have access to the full provenance of their data, including not only parameters, inputs and intermediary data, but also the abstract experiment, refined into a concrete execution by the "workflow compiler". In this paper, we describe preliminary work on adapting Pegasus to capture the process of workflow refinement in the PASOA provenance system.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Current scientific applications are often structured as workflows and rely on workflow systems to compile abstract experiment designs into enactable workflows that utilise the best available resources. The automation of this step and of the workflow enactment, hides the details of how results have been produced. Knowing how compilation and enactment occurred allows results to be reconnected with the experiment design. We investigate how provenance helps scientists to connect their results with the actual execution that took place, their original experiment and its inputs and parameters.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A description of a data item's provenance can be provided in dierent forms, and which form is best depends on the intended use of that description. Because of this, dierent communities have made quite distinct underlying assumptions in their models for electronically representing provenance. Approaches deriving from the library and archiving communities emphasise agreed vocabulary by which resources can be described and, in particular, assert their attribution (who created the resource, who modied it, where it was stored etc.) The primary purpose here is to provide intuitive metadata by which users can search for and index resources. In comparison, models for representing the results of scientific workflows have been developed with the assumption that each event or piece of intermediary data in a process' execution can and should be documented, to give a full account of the experiment undertaken. These occurrences are connected together by stating where one derived from, triggered, or otherwise caused another, and so form a causal graph. Mapping between the two approaches would be benecial in integrating systems and exploiting the strengths of each. In this paper, we specify such a mapping between Dublin Core and the Open Provenance Model. We further explain the technical issues to overcome and the rationale behind the approach, to allow the same method to apply in mapping similar schemes.