2 resultados para Humanist and Scientist
em Department of Computer Science E-Repository - King's College London, Strand, London
Resumo:
Very large scale computations are now becoming routinely used as a methodology to undertake scientific research. In this context, `provenance systems' are regarded as the equivalent of the scientist's logbook for in silico experimentation: provenance captures the documentation of the process that led to some result. Using a protein compressibility analysis application, we derive a set of generic use cases for a provenance system. In order to support these, we address the following fundamental questions: what is provenance? how to record it? what is the performance impact for grid execution? what is the performance of reasoning? In doing so, we define a technologyindependent notion of provenance that captures interactions between components, internal component information and grouping of interactions, so as to allow us to analyse and reason about the execution of scientific processes. In order to support persistent provenance in heterogeneous applications, we introduce a separate provenance store, in which provenance documentation can be stored, archived and queried independently of the technology used to run the application. Through a series of practical tests, we evaluate the performance impact of such a provenance system. In summary, we demonstrate that provenance recording overhead of our prototype system remains under 10% of execution time, and we show that the recorded information successfully supports our use cases in a performant manner.
Resumo:
Scientific workflows are becoming a valuable tool for scientists to capture and automate e-Science procedures. Their success brings the opportunity to publish, share, reuse and repurpose this explicitly captured knowledge. Within the myGrid project, we have identified key resources that can be shared including complete workflows, fragments of workflows and constituent services. We have examined the alternative ways these can be described by their authors (and subsequent users), and developed a unified descriptive model to support their later discovery. By basing this model on existing standards, we have been able to extend existing Web Service and Semantic Web Service infrastructure whilst still supporting the specific needs of the e-Scientist. myGrid components enable a workflow life-cycle that extends beyond execution, to include discovery of previous relevant designs, reuse of those designs, and subsequent publication. Experience with example groups of scientists indicates that this cycle is valuable. The growing number of workflows and services mean more work is needed to support the user in effective ranking of search results, and to support the repurposing process.