963 results for Scientific data
Abstract:
The constant scientific output of universities and research centers leads these organizations to produce and acquire large amounts of data in a short period of time. Because of this volume of data, research organizations become potentially vulnerable to the effects of information explosions, which can cause chaos in information management. In this context, the development of data catalogues emerges as one possible solution to the problems of (I) data organization and (II) data management. In the scientific domain, data catalogues are implemented using the standard for digital geospatial metadata, which is widely used in cataloguing scientific information. The aim of this work is to present the characteristics of metadata access and storage in database systems in order to improve the description and dissemination of scientific data. It considers relevant aspects that should be analyzed during the planning stage, since they can determine the success of the implementation. The use of data catalogues by research organizations may be a way to promote and facilitate the dissemination of scientific data, avoid duplication of effort, and encourage the use of data that have been collected, processed and stored.
Abstract:
Wednesday 23rd April 2014
Speaker(s): Willi Hasselbring
Organiser: Leslie Carr
Time: 23/04/2014 11:00-11:50
Location: B32/3077
File size: 669 Mb
Abstract: For good scientific practice, it is important that research results can be properly checked by reviewers and possibly repeated and extended by other researchers. This is of particular interest for "digital science", i.e. for in-silico experiments. In this talk, I'll discuss some issues of how software systems and services may contribute to good scientific practice. In particular, I'll present our PubFlow approach to automating publication workflows for scientific data. The PubFlow workflow management system is based on established technology. We integrate institutional repository systems (based on EPrints) and world data centers (in marine science). PubFlow collects provenance data automatically via our monitoring framework Kieker. Provenance information describes the origins and history of scientific data over its life cycle, and the process by which it arrived. Thus, provenance information is highly relevant to the repeatability and trustworthiness of scientific results. In our evaluation in marine science, we collaborate with the GEOMAR Helmholtz Centre for Ocean Research Kiel.
Abstract:
The purpose of this study was to develop an understanding of the current state of scientific data sharing that stakeholders could use to develop and implement effective data sharing strategies and policies. The study developed a conceptual model to describe the process of data sharing, and the drivers, barriers, and enablers that determine stakeholder engagement. The conceptual model was used as a framework to structure discussions and interviews with key members of all stakeholder groups. Analysis of data obtained from interviewees identified a number of themes that highlight key requirements for the development of a mature data sharing culture.
Abstract:
As scientific workflows, and the data they operate on, grow in size and complexity, the task of defining how those workflows should execute (which resources to use, where the resources must be ready for processing, etc.) becomes proportionally more difficult. While "workflow compilers", such as Pegasus, reduce this burden, a further problem arises: since specifying details of execution is now automatic, a workflow's results are harder to interpret, as they are partly due to specifics of execution. By automating steps between the experiment design and its results, we lose the connection between them, hindering interpretation of results. To reconnect the scientific data with the original experiment, we argue that scientists should have access to the full provenance of their data, including not only parameters, inputs and intermediary data, but also the abstract experiment, refined into a concrete execution by the "workflow compiler". In this paper, we describe preliminary work on adapting Pegasus to capture the process of workflow refinement in the PASOA provenance system.
Abstract:
This work was supported in part by the EU project "2nd Generation Open Access Infrastructure for Research in Europe" (OpenAIRE+). The autumn training school Development and Promotion of Open Access to Scientific Information and Research is organized within the framework of the Fourth International Conference on Digital Presentation and Preservation of Cultural and Scientific Heritage, DiPP2014 (September 18–21, 2014, Veliko Tarnovo, Bulgaria, http://dipp2014.math.bas.bg/), held under the patronage of UNESCO. The main organiser is the Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, with the support of the EU project FOSTER (http://www.fosteropenscience.eu/) and the P. R. Slaveykov Regional Public Library in Veliko Tarnovo, Bulgaria.
Abstract:
Almost half of Ireland’s commercial stocks face overexploitation. As traditional species decrease in abundance and become less profitable, the industry is increasingly turning to alternative species. Atlantic saury (Scomberesox saurus saurus (Walbaum)) has been identified as a potential species for exploitation. Very little information is available on its biology or population dynamics, especially for Irish waters. This thesis aims to obtain sound scientific data that will help to ensure that a future Atlantic saury fishery can be sustainably managed. The research has produced valuable data, some of which contradict previous studies. Growth of Atlantic saury measured using otolith microstructure is found to be more than twice that previously calculated from annual structures on scales and otoliths. This results in a significant reduction of the expected life span from five to about two years. Investigation of maturity stage at age indicates that Atlantic saury will reproduce for the first time at age one and will survive for one or at most two reproduction seasons. It is concluded that a future Irish fishery will target mostly fish prior to their first reproduction. Finally, the thesis gives some insights into the population structure of Atlantic saury through analysis of otolith morphometrics. Significant differences are detected between Northeastern Atlantic and western Mediterranean Sea specimens of the 0+ age class (less than one year old). The implications of these results for the management of an emerging fishery are discussed.
Abstract:
Scientific data from family medicine are relevant for the majority of the population. They are therefore essential from an ethical and public health perspective. We need to promote quality research in family medicine despite methodological, financial and logistical barriers. To highlight the strengths and weaknesses of research in family medicine in the French-speaking part of Switzerland, we asked practitioners from this region to share their experience, criticisms and needs in relation to research. This article summarizes their contributions in light of the international literature.
Abstract:
Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014
Abstract:
For years, choosing the right career by monitoring the trends and scope of different career paths has been a necessity for young people all over the world. In this paper we provide a scientific, data mining based method for predicting the job absorption rate, and the waiting time needed for 100% placement, for different engineering courses in India. This will help students in India a great deal in deciding on the right discipline for a bright future. Information about graduates is obtained from the NTMIS (National Technical Manpower Information System) nodal centre in Kochi, India, located at Cochin University of Science and Technology.
Abstract:
We describe ncWMS, an implementation of the Open Geospatial Consortium’s Web Map Service (WMS) specification for multidimensional gridded environmental data. ncWMS can read data in a large number of common scientific data formats – notably the NetCDF format with the Climate and Forecast conventions – then efficiently generate map imagery in thousands of different coordinate reference systems. It is designed to require minimal configuration from the system administrator and, when used in conjunction with a suitable client tool, provides end users with an interactive means for visualizing data without the need to download large files or interpret complex metadata. It is also used as a “bridging” tool providing interoperability between the environmental science community and users of geographic information systems. ncWMS implements a number of extensions to the WMS standard in order to fulfil some common scientific requirements, including the ability to generate plots representing timeseries and vertical sections. We discuss these extensions and their impact upon present and future interoperability. We discuss the conceptual mapping between the WMS data model and the data models used by gridded data formats, highlighting areas in which the mapping is incomplete or ambiguous. We discuss the architecture of the system and particular technical innovations of note, including the algorithms used for fast data reading and image generation. ncWMS has been widely adopted within the environmental data community and we discuss some of the ways in which the software is integrated within data infrastructures and portals.
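As an illustration of the kind of request a WMS server such as ncWMS answers, the following minimal sketch builds a standard WMS 1.3.0 GetMap URL, including the optional TIME and ELEVATION dimension parameters that the WMS specification defines for multidimensional data. The server URL and layer name in the usage note are hypothetical, and none of the ncWMS-specific extensions mentioned in the abstract are shown.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, width, height,
                     crs="CRS:84", time=None, elevation=None,
                     image_format="image/png", style=""):
    """Build a WMS 1.3.0 GetMap URL.

    bbox is (min_x, min_y, max_x, max_y) in the axis order of `crs`;
    for CRS:84 that is longitude first, then latitude.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": style,          # empty string requests the default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": image_format,
    }
    # Optional dimension parameters for time-varying or layered data.
    if time is not None:
        params["TIME"] = time
    if elevation is not None:
        params["ELEVATION"] = str(elevation)
    return base_url + "?" + urlencode(params)
```

For example, `build_getmap_url("https://example.org/wms", "ocean/sst", (-180, -90, 180, 90), 512, 256, time="2010-01-01T00:00:00Z")` (hypothetical endpoint and layer) yields a URL that a client can fetch to obtain a rendered map image rather than the underlying data file.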
Abstract:
Background: In many experimental pipelines, clustering of multidimensional biological datasets is used to detect hidden structures in unlabelled input data. Taverna is a popular workflow management system that is used to design and execute scientific workflows and aid in silico experimentation. The availability of fast unsupervised methods for clustering and visualization in the Taverna platform is important to support a data-driven scientific discovery in complex and explorative bioinformatics applications. Results: This work presents a Taverna plugin, the Biological Data Interactive Clustering Explorer (BioDICE), that performs clustering of high-dimensional biological data and provides a nonlinear, topology preserving projection for the visualization of the input data and their similarities. The core algorithm in the BioDICE plugin is Fast Learning Self Organizing Map (FLSOM), which is an improved variant of the Self Organizing Map (SOM) algorithm. The plugin generates an interactive 2D map that allows the visual exploration of multidimensional data and the identification of groups of similar objects. The effectiveness of the plugin is demonstrated on a case study related to chemical compounds. Conclusions: The number and variety of available tools and its extensibility have made Taverna a popular choice for the development of scientific data workflows. This work presents a novel plugin, BioDICE, which adds a data-driven knowledge discovery component to Taverna. BioDICE provides an effective and powerful clustering tool, which can be adopted for the explorative analysis of biological datasets.
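To make the idea of a topology-preserving projection concrete, here is a minimal sketch of the classic Self Organizing Map (SOM) training loop in plain Python. It is not the FLSOM variant used by BioDICE, whose details the abstract does not give; the function names, the linear decay schedules and the default parameters are all assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinate whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(data, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM on a list of equal-length numeric vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    # One weight vector per grid node, initialised from random samples.
    weights = {(r, c): list(rng.choice(data))
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = data[:]
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                # Gaussian neighbourhood centred on the best matching unit.
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights
```

After training, calling `best_matching_unit(weights, x)` for each input gives its position on the 2D grid, so similar inputs land on the same or neighbouring nodes; this grid position is what an interactive map would display.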
Abstract:
Geospatial information of many kinds, from topographic maps to scientific data, is increasingly being made available through web mapping services. These allow georeferenced map images to be served from data stores and displayed in websites and geographic information systems, where they can be integrated with other geographic information. The Open Geospatial Consortium’s Web Map Service (WMS) standard has been widely adopted in diverse communities for sharing data in this way. However, current services typically provide little or no information about the quality or accuracy of the data they serve. In this paper we will describe the design and implementation of a new “quality-enabled” profile of WMS, which we call “WMS-Q”. This describes how information about data quality can be transmitted to the user through WMS. Such information can exist at many levels, from entire datasets to individual measurements, and includes the many different ways in which data uncertainty can be expressed. We also describe proposed extensions to the Symbology Encoding specification, which include provision for visualizing uncertainty in raster data in a number of different ways, including contours, shading and bivariate colour maps. We shall also describe new open-source implementations of the new specifications, which include both clients and servers.
Abstract:
It can be said that technological progress (the development of new measurement instruments such as software, satellites and computers, as well as the falling cost of storage media) allows organizations to produce and acquire large amounts of data in a short period of time. Because of this volume of data, research organizations become potentially vulnerable to the effects of the information explosion. One solution adopted by some organizations is the use of information system tools to assist in documenting, retrieving and analysing data. In the scientific domain, these tools are developed to store different metadata standards (metadata being data about data). In the development of these tools, the adoption of standards such as the Unified Modeling Language (UML) stands out; its diagrams help model different aspects of the software. The aim of this study is to present an information system tool that assists in documenting organizations' data by means of metadata, and to highlight the software modelling process using UML. The study covers the Digital Geospatial Metadata Standard, widely used for cataloguing data by scientific organizations around the world, and UML's dynamic and static diagrams, such as use case, sequence and class diagrams. Developing information system tools can be a way to promote the organization and dissemination of scientific data. However, the modelling process requires special attention to the development of interfaces that will encourage the use of these tools.