955 resultados para Workflows científicos
Resumo:
A ciência tem feito uso frequente de recursos computacionais para execução de experimentos e processos científicos, que podem ser modelados como workflows que manipulam grandes volumes de dados e executam ações como seleção, análise e visualização desses dados segundo um procedimento determinado. Workflows científicos têm sido usados por cientistas de várias áreas, como astronomia e bioinformática, e tendem a ser computacionalmente intensivos e fortemente voltados à manipulação de grandes volumes de dados, o que requer o uso de plataformas de execução de alto desempenho como grades ou nuvens de computadores. Para execução dos workflows nesse tipo de plataforma é necessário o mapeamento dos recursos computacionais disponíveis para as atividades do workflow, processo conhecido como escalonamento. Plataformas de computação em nuvem têm se mostrado um alternativa viável para a execução de workflows científicos, mas o escalonamento nesse tipo de plataforma geralmente deve considerar restrições específicas como orçamento limitado ou o tipo de recurso computacional a ser utilizado na execução. Nesse contexto, informações como a duração estimada da execução ou limites de tempo e de custo (chamadas aqui de informações de suporte ao escalonamento) são importantes para garantir que o escalonamento seja eficiente e a execução ocorra de forma a atingir os resultados esperados. Este trabalho identifica as informações de suporte que podem ser adicionadas aos modelos de workflows científicos para amparar o escalonamento e a execução eficiente em plataformas de computação em nuvem. É proposta uma classificação dessas informações, e seu uso nos principais Sistemas Gerenciadores de Workflows Científicos (SGWC) é analisado. Para avaliar o impacto do uso das informações no escalonamento foram realizados experimentos utilizando modelos de workflows científicos com diferentes informações de suporte, escalonados com algoritmos que foram adaptados para considerar as informações inseridas. Nos experimentos realizados, observou-se uma redução no custo financeiro de execução do workflow em nuvem de até 59% e redução no makespan chegando a 8,6% se comparados à execução dos mesmos workflows sendo escalonados sem nenhuma informação de suporte disponível.
Resumo:
Queensland University of Technology (QUT) completed an Australian National Data Service (ANDS) funded “Seeding the Commons Project” to contribute metadata to Research Data Australia. The project employed two Research Data Librarians from October 2009 through to July 2010. Technical support for the project was provided by QUT’s High Performance Computing and Research Support Specialists. ---------- The project identified and described QUT’s category 1 (ARC / NHMRC) research datasets. Metadata for the research datasets was stored in QUT’s Research Data Repository (Architecta Mediaflux). Metadata which was suitable for inclusion in Research Data Australia was made available to the Australian Research Data Commons (ARDC) in RIF-CS format. ---------- Several workflows and processes were developed during the project. 195 data interviews took place in connection with 424 separate research activities which resulted in the identification of 492 datasets. ---------- The project had a high level of technical support from QUT High Performance Computing and Research Support Specialists who developed the Research Data Librarian interface to the data repository that enabled manual entry of interview data and dataset metadata, creation of relationships between repository objects. The Research Data Librarians mapped the QUT metadata repository fields to RIF-CS and an application was created by the HPC and Research Support Specialists to generate RIF-CS files for harvest by the Australian Research Data Commons (ARDC). ---------- This poster will focus on the workflows and processes established for the project including: ---------- • Interview processes and instruments • Data Ingest from existing systems (including mapping to RIF-CS) • Data entry and the Data Librarian interface to Mediaflux • Verification processes • Mapping and creation of RIF-CS for the ARDC
Resumo:
As the need for concepts such as cancellation and OR-joins occurs naturally in business scenarios, comprehensive support in a workflow language is desirable. However, there is a clear trade-off between the expressive power of a language (i.e., introducing complex constructs such as cancellation and OR-joins) and ease of verification. When a workflow contains a large number of tasks and involves complex control flow dependencies, verification can take too much time or it may even be impossible. There are a number of different approaches to deal with this complexity. Reducing the size of the workflow, while preserving its essential properties with respect to a particular analysis problem, is one such approach. In this paper, we present a set of reduction rules for workflows with cancellation regions and OR-joins and demonstrate how they can be used to improve the efficiency of verification. Our results are presented in the context of the YAWL workflow language.
Resumo:
Flexible information exchange is critical to successful design-analysis integration, but current top-down, standards-based and model-oriented strategies impose restrictions that contradicts this flexibility. In this article we present a bottom-up, user-controlled and process-oriented approach to linking design and analysis applications that is more responsive to the varied needs of designers and design teams. Drawing on research into scientific workflows, we present a framework for integration that capitalises on advances in cloud computing to connect discrete tools via flexible and distributed process networks. We then discuss how a shared mapping process that is flexible and user friendly supports non-programmers in creating these custom connections. Adopting a services-oriented system architecture, we propose a web-based platform that enables data, semantics and models to be shared on the fly. We then discuss potential challenges and opportunities for its development as a flexible, visual, collaborative, scalable and open system.
Resumo:
Flexible information exchange is critical to successful design integration, but current top-down, standards-based and model-oriented strategies impose restrictions that are contradictory to this flexibility. In this paper we present a bottom-up, user-controlled and process-oriented approach to linking design and analysis applications that is more responsive to the varied needs of designers and design teams. Drawing on research into scientific workflows, we present a framework for integration that capitalises on advances in cloud computing to connect discrete tools via flexible and distributed process networks. Adopting a services-oriented system architecture, we propose a web-based platform that enables data, semantics and models to be shared on the fly. We discuss potential challenges and opportunities for the development thereof as a flexible, visual, collaborative, scalable and open system.
Resumo:
This paper describes the use of property graphs for mapping data between AEC software tools, which are not linked by common data formats and/or other interoperability measures. The intention of introducing this in practice, education and research is to facilitate the use of diverse, non-integrated design and analysis applications by a variety of users who need to create customised digital workflows, including those who are not expert programmers. Data model types are examined by way of supporting the choice of directed, attributed, multi-relational graphs for such data transformation tasks. A brief exemplar design scenario is also presented to illustrate the concepts and methods proposed, and conclusions are drawn regarding the feasibility of this approach and directions for further research.
Resumo:
The configuration of comprehensive Enterprise Systems to meet the specific requirements of an organisation up to today is consuming significant resources. The results of failing implementation projects are severe and may even threaten the organisation’s existence. This paper proposes a method which aims at increasing the efficiency of Enterprise Systems implementations. First, we argue that existing process modelling languages that feature different degrees of abstraction for different user groups exist and are used for different purposes which makes it necessary to integrate them. We describe how to do this using the meta models of the involved languages. Second, we motivate that an integrated process model based on the integrated meta model needs to be configurable and elaborate on the mechanisms by which this model configuration can be achieved. We introduce a business example using SAP modelling techniques to illustrate the proposed method.
Resumo:
Despite advances in the field of workflow flexibility, there is still insufficient support for dealing with unforeseen exceptions. In particular, it is challenging to find a solution which preserves the intent of the process as much as possible when such exceptions are encountered. This challenge can be alleviated by making the connection between a process and its objectives more explicit. This paper presents a demo illustrating the blended workflow approach where two specifications are fused together, a "classic" process model and a goal model. End users are guided by the process model but may deviate from this model whenever unexpected situations are encountered. The two models involved provide views on the process and the demo shows how one can switch between these views and how they are kept consistent by the blended workflow engine. A simple example involving the making of a doctor's appointment illustrates the potential advantages of the proposed approach to both researchers and developers.
Resumo:
Product Lifecycle Management (PLM) systems are widely used in the manufacturing industry. A core feature of such systems is to provide support for versioning of product data. As workflow functionality is increasingly used in PLM systems, the possibility emerges that the versioning transitions for product objects as encapsulated in process models do not comply with the valid version control policies mandated in the objects’ actual lifecycles. In this paper we propose a solution to tackle the (non-)compliance issues between processes and object version control policies. We formally define the notion of compliance between these two artifacts in product lifecycle management and then develop a compliance checking method which employs a well-established workflow analysis technique. This forms the basis of a tool which offers automated support to the proposed approach. By applying the approach to a collection of real-life specifications in a main PLM system, we demonstrate the practical applicability of our solution to the field.
Resumo:
Background: With the advances in DNA sequencer-based technologies, it has become possible to automate several steps of the genotyping process leading to increased throughput. To efficiently handle the large amounts of genotypic data generated and help with quality control, there is a strong need for a software system that can help with the tracking of samples and capture and management of data at different steps of the process. Such systems, while serving to manage the workflow precisely, also encourage good laboratory practice by standardizing protocols, recording and annotating data from every step of the workflow Results: A laboratory information management system (LIMS) has been designed and implemented at the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT) that meets the requirements of a moderately high throughput molecular genotyping facility. The application is designed as modules and is simple to learn and use. The application leads the user through each step of the process from starting an experiment to the storing of output data from the genotype detection step with auto-binning of alleles; thus ensuring that every DNA sample is handled in an identical manner and all the necessary data are captured. The application keeps track of DNA samples and generated data. Data entry into the system is through the use of forms for file uploads. The LIMS provides functions to trace back to the electrophoresis gel files or sample source for any genotypic data and for repeating experiments. The LIMS is being presently used for the capture of high throughput SSR (simple-sequence repeat) genotyping data from the legume (chickpea, groundnut and pigeonpea) and cereal (sorghum and millets) crops of importance in the semi-arid tropics. Conclusions: A laboratory information management system is available that has been found useful in the management of microsatellite genotype data in a moderately high throughput genotyping laboratory. The application with source code is freely available for academic users and can be downloaded from http://www.icrisat.org/bt-software-d-lims.htm