903 results for Metadata Extraction
Abstract:
(Document pdf contains 193 pages)
Executive Summary (pdf, < 0.1 Mb)
1. Introduction (pdf, 0.2 Mb)
  1.1 Data sharing, international boundaries and large marine ecosystems
2. Objectives (pdf, 0.3 Mb)
3. Background (pdf, < 0.1 Mb)
  3.1 North Pacific Ecosystem Metadatabase
  3.2 First federation effort: NPEM and the Korea Oceanographic Data Center
  3.2 Continuing effort: Adding Japan’s Marine Information Research Center
4. Metadata Standards (pdf, < 0.1 Mb)
  4.1 Directory Interchange Format
  4.2 Ecological Metadata Language
  4.3 Dublin Core
    4.3.1. Elements of DC
  4.4 Federal Geographic Data Committee
  4.5 The ISO 19115 Metadata Standard
  4.6 Metadata stylesheets
  4.7 Crosswalks
  4.8 Tools for creating metadata
5. Communication Protocols (pdf, < 0.1 Mb)
  5.1 Z39.50
    5.1.1. What does Z39.50 do?
    5.1.2. Isite
6. Clearinghouses (pdf, < 0.1 Mb)
7. Methodology (pdf, 0.2 Mb)
  7.1 FGDC metadata
    7.1.1. Main sections
    7.1.2. Supporting sections
    7.1.3. Metadata validation
  7.2 Getting a copy of Isite
  7.3 NSDI Clearinghouse
8. Server Configuration and Technical Issues (pdf, 0.4 Mb)
  8.1 Hardware recommendations
  8.2 Operating system – Red Hat Linux Fedora
  8.3 Web services – Apache HTTP Server version 2.2.3
  8.4 Create and validate FGDC-compliant Metadata in XML format
  8.5 Obtaining, installing and configuring Isite for UNIX/Linux
    8.5.1. Download the appropriate Isite software
    8.5.2. Untar the file
    8.5.3. Name your database
    8.5.4. The zserver.ini file
    8.5.5. The sapi.ini file
    8.5.6. Indexing metadata
    8.5.7. Start the Clearinghouse Server process
    8.5.8. Testing the zserver installation
  8.6 Registering with NSDI Clearinghouse
  8.7 Security issues
9. Search Tutorial and Examples (pdf, 1 Mb)
  9.1 Legacy NSDI Clearinghouse search interface
  9.2 New GeoNetwork search interface
10. Challenges (pdf, < 0.1 Mb)
11. Emerging Standards (pdf, < 0.1 Mb)
12. Future Activity (pdf, < 0.1 Mb)
13. Acknowledgments (pdf, < 0.1 Mb)
14. References (pdf, < 0.1 Mb)
15. Acronyms (pdf, < 0.1 Mb)
16. Appendices
  16.1. KODC-NPEM meeting agendas and minutes (pdf, < 0.1 Mb)
    16.1.1. Seattle meeting agenda, August 22–23, 2005
    16.1.2. Seattle meeting minutes, August 22–23, 2005
    16.1.3. Busan meeting agenda, October 10–11, 2005
    16.1.4. Busan meeting minutes, October 10–11, 2005
  16.2. MIRC-NPEM meeting agendas and minutes (pdf, < 0.1 Mb)
    16.2.1. Seattle meeting agenda, August 14–15, 2006
    16.2.2. Seattle meeting minutes, August 14–15, 2006
    16.2.3. Tokyo meeting agenda, October 19–20, 2006
    16.2.4. Tokyo meeting minutes, October 19–20, 2006
  16.3. XML stylesheet conversion crosswalks (pdf, < 0.1 Mb)
    16.3.1. FGDCI to DIF stylesheet converter
    16.3.2. DIF to FGDCI stylesheet converter
    16.3.3. String-modified stylesheet
  16.4. FGDC Metadata Standard (pdf, 0.1 Mb)
    16.4.1. Overall structure
    16.4.2. Section 1: Identification information
    16.4.3. Section 2: Data quality information
    16.4.4. Section 3: Spatial data organization information
    16.4.5. Section 4: Spatial reference information
    16.4.6. Section 5: Entity and attribute information
    16.4.7. Section 6: Distribution information
    16.4.8. Section 7: Metadata reference information
    16.4.9. Sections 8, 9 and 10: Citation information, time period information, and contact information
  16.5. Images of the Isite server directory structure and the files contained in each subdirectory after Isite installation (pdf, 0.2 Mb)
  16.6 Listing of NPEM’s Isite configuration files (pdf, < 0.1 Mb)
    16.6.1. zserver.ini
    16.6.2. sapi.ini
  16.7 Java program to extract records from the NPEM metadatabase and write one XML file for each record (pdf, < 0.1 Mb)
  16.8 Java program to execute the metadata extraction program (pdf, < 0.1 Mb)
A1 Addendum 1: Instructions for Isite for Windows (pdf, 0.6 Mb)
A2 Addendum 2: Instructions for Isite for Windows ADHOST (pdf, 0.3 Mb)
Abstract:
The software architecture and development considerations for an open metadata extraction and processing framework are outlined. Special attention is paid to reliability and fault tolerance. Grid infrastructure is shown to be a useful backend for general-purpose tasks.
Abstract:
Access to Digital Cultural Heritage: Innovative Applications of Automated Metadata Generation
Edited by: Krassimira Ivanova, Milena Dobreva, Peter Stanchev, George Totkov
Authors (in order of appearance): Krassimira Ivanova, Peter Stanchev, George Totkov, Kalina Sotirova, Juliana Peneva, Stanislav Ivanov, Rositza Doneva, Emil Hadjikolev, George Vragov, Elena Somova, Evgenia Velikova, Iliya Mitov, Koen Vanhoof, Benoit Depaire, Dimitar Blagoev
Reviewer: Prof., Dr. Avram Eskenazi
Published by: Plovdiv University Publishing House "Paisii Hilendarski"
ISBN: 978-954-423-722-6
2012, Plovdiv, Bulgaria
First Edition
Abstract:
This article presents the principal results of the Ph.D. thesis "A Novel Method for Content-Based Image Retrieval in Art Image Collections Utilizing Colour Semantics" by Krassimira Ivanova (Institute of Mathematics and Informatics, BAS), successfully defended at Hasselt University in Belgium, Faculty of Science, on 15 November 2011.
Abstract:
Resource discovery is one of the key services in digitised cultural heritage collections. It requires intelligent mining of heterogeneous digital content as well as the ability to perform at large scale, which explains the recent advances in classification methods. Associative classifiers are convenient data mining tools in the field of cultural heritage because they can take into account specific combinations of attribute values. Associative classifiers usually prioritize support over confidence. The proposed classifier, PGN, questions this common approach and focuses on confidence first by retaining only rules with 100% confidence. Classification tasks in the field of cultural heritage usually deal with data sets that have many class labels, a variety that stems from the richness of culture accumulated over the centuries. Comparisons of PGN with other classifiers, such as OneR, JRip and J48, show that PGN is competitive in recognizing multi-class datasets on collections of masterpieces from different West and East European fine art authors and movements.
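To make the "confidence first" idea concrete, here is a minimal Python sketch that mines attribute-value combinations and keeps only those whose matching records all share one class label, i.e. rules with confidence 1.0. It is not the PGN algorithm itself, whose details the abstract does not give; the function name `full_confidence_rules`, the `max_len` cap, and the toy records are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def full_confidence_rules(records, max_len=2):
    """Return {itemset: (class_label, support_count)} for rules with confidence 1.0.

    records: iterable of (set_of_attribute_values, class_label) pairs.
    max_len: largest attribute combination considered (an assumption, to keep it small).
    """
    # For every attribute-value combination, count how often each class occurs.
    class_counts = defaultdict(lambda: defaultdict(int))
    for items, label in records:
        for k in range(1, max_len + 1):
            for combo in combinations(sorted(items), k):
                class_counts[combo][label] += 1

    rules = {}
    for combo, counts in class_counts.items():
        # Exactly one class among the matching records means confidence = 1.0.
        if len(counts) == 1:
            (label, support), = counts.items()
            rules[combo] = (label, support)
    return rules


# Hypothetical toy data: attribute values and class labels are invented.
records = [
    ({"palette=warm", "period=early"}, "author_A"),
    ({"palette=warm", "period=late"}, "author_B"),
    ({"palette=cool", "period=early"}, "author_C"),
    ({"palette=warm", "period=early"}, "author_A"),
]

for itemset, (label, support) in sorted(full_confidence_rules(records).items()):
    print(itemset, "->", label, f"(support={support})")
```

In this toy set, `palette=warm` alone is discarded because it matches two different classes, while the pair `palette=warm, period=early` survives with support 2. Support is used here only as a secondary statistic after the confidence filter.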
Abstract:
Given the anticipated increase in video information, archiving of news is an important activity in the visual media industry. As the volume of archives grows, it will become difficult for journalists to find appropriate content using current search tools. This paper details a study we conducted of the news extraction systems used by different news channels in Kerala. Semantic web technologies can be used effectively, since news archiving shares many of the characteristics and problems of the WWW. Because visual news archives from different media resources follow different metadata standards, interoperability between the resources is also an issue. The World Wide Web Consortium has proposed a draft ontology framework for media resources that addresses these intercompatibility issues. This paper also discusses the W3C-proposed framework and its drawbacks.
Abstract:
Automatic indexing and retrieval of digital data pose major challenges. The main problem arises from the ever-increasing mass of digital media and the lack of efficient methods for indexing and retrieving such data based on semantic content rather than keywords. To enable intelligent web interactions, or even web filtering, we need to be capable of interpreting the information base in an intelligent manner. For a number of years research has been ongoing in the field of ontological engineering with the aim of using ontologies to add such (meta) knowledge to information. In this paper, we describe the architecture of DREAM (Dynamic REtrieval Analysis and semantic metadata Management), a system designed to automatically and intelligently index huge repositories of special effects video clips, based on their semantic content, using a network of scalable ontologies to enable intelligent retrieval. The DREAM Demonstrator has been evaluated as deployed in the film post-production phase to support the storage, indexing and retrieval of large data sets of special effects video clips as an exemplar application domain. This paper provides its performance and usability results and highlights the scope for future enhancements of the DREAM architecture, which has proven successful in its first and possibly most challenging proving ground, namely film production, where it is already in routine use within our test-bed partners' creative processes. © 2009 Published by Elsevier B.V.
Abstract:
DNA extraction was carried out as described on the MICROBIS project pages (http://icomm.mbl.edu/microbis) using a commercially available extraction kit. We amplified the hypervariable regions V4-V6 of archaeal and bacterial 16S rRNA genes using PCR and several sets of forward and reverse primers (http://vamps.mbl.edu/resources/primers.php). Massively parallel tag sequencing of the PCR products was carried out on a 454 Life Sciences GS FLX sequencer at Marine Biological Laboratory, Woods Hole, MA, following the same experimental conditions for all samples. Sequence reads were submitted to a rigorous quality control procedure based on mothur v30 (doi:10.1128/AEM.01541-09), including denoising of the flowgrams using an algorithm based on PyroNoise (doi:10.1038/nmeth.1361), removal of PCR errors, and a chimera check using uchime (doi:10.1093/bioinformatics/btr381). The reads were taxonomically assigned according to the SILVA taxonomy (SSURef v119, 07-2014; doi:10.1093/nar/gks1219) implemented in mothur and clustered at 98% ribosomal RNA gene V4-V6 sequence identity. V4-V6 amplicon sequence abundance tables were standardized to account for unequal sampling effort by drawing 1000 (Archaea) and 2300 (Bacteria) randomly chosen sequences without replacement in mothur, and were then used to calculate inverse Simpson diversity indices and Chao1 richness (doi:10.2307/4615964). Bray-Curtis dissimilarities (doi:10.2307/1942268) between all samples were calculated and used for 2-dimensional non-metric multidimensional scaling (NMDS) ordinations with 20 random starts (doi:10.1007/BF02289694). Stress values below 0.2 indicated that the multidimensional dataset was well represented by the 2D ordination. NMDS ordinations were compared and tested using Procrustes correlation analysis (doi:10.1007/BF02291478).
All analyses were carried out with the R statistical environment and the packages vegan (available at: http://cran.r-project.org/package=vegan) and labdsv (available at: http://cran.r-project.org/package=labdsv), as well as with custom R scripts. Operational taxonomic units at 98% sequence identity (OTU0.03) that occurred only once in the whole dataset were termed absolute single sequence OTUs (SSOabs; doi:10.1038/ismej.2011.132). OTU0.03 sequences that occurred only once in at least one sample, but may occur more often in other samples, were termed relative single sequence OTUs (SSOrel). SSOrel are particularly interesting for community ecology, since they comprise rare organisms that might become abundant when conditions change. 16S rRNA amplicons and metagenomic reads have been stored in the sequence read archive under SRA project accession number SRP042162.
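The diversity statistics named above can be illustrated with a short, self-contained Python sketch. It is not the authors' workflow, which used mothur and R (vegan, labdsv); the function names, the fixed random seed, and the toy abundance tables are assumptions, and the Chao1 form shown is the bias-corrected variant, which may differ from the estimator used in the study.

```python
import random
from collections import Counter


def subsample(counts, n, seed=0):
    """Draw n sequences without replacement from an OTU abundance table."""
    pool = [otu for otu, c in counts.items() for _ in range(c)]
    return Counter(random.Random(seed).sample(pool, n))


def inverse_simpson(counts):
    """Inverse Simpson index: 1 / sum(p_i^2) over relative abundances p_i."""
    total = sum(counts.values())
    return 1.0 / sum((c / total) ** 2 for c in counts.values())


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1))."""
    s_obs = sum(1 for c in counts.values() if c > 0)
    f1 = sum(1 for c in counts.values() if c == 1)  # singletons
    f2 = sum(1 for c in counts.values() if c == 2)  # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i)."""
    otus = set(a) | set(b)
    num = sum(abs(a.get(o, 0) - b.get(o, 0)) for o in otus)
    den = sum(a.get(o, 0) + b.get(o, 0) for o in otus)
    return num / den


# Hypothetical abundance tables (OTU name -> read count).
sample_a = {"otu1": 40, "otu2": 30, "otu3": 2, "otu4": 1}
sample_b = {"otu1": 10, "otu2": 55, "otu5": 1}

# Standardize both samples to the same depth before comparing them.
depth = 50
sub_a, sub_b = subsample(sample_a, depth), subsample(sample_b, depth)
print(inverse_simpson(sub_a), chao1(sub_a), bray_curtis(sub_a, sub_b))
```

Subsampling to a common depth before computing the indices mirrors the standardization step described in the abstract; the depth of 50 is chosen only to fit the toy data.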
Abstract:
Sensor network deployments have become a primary source of big data about the real world that surrounds us, measuring a wide range of physical properties in real time. With such large amounts of heterogeneous data, a key challenge is to describe and annotate sensor data with high-level metadata, using and extending models, for instance with ontologies. However, automating this task requires enriching the sensor metadata using the actual observed measurements and extracting useful meta-information from them. This paper proposes a novel approach to the characterization and extraction of semantic metadata through the analysis of raw sensor observations. The approach consists of using approximations to represent the raw sensor measurements, based on distributions of the observation slopes; building a classification scheme to automatically infer sensor metadata such as the type of observed property; and integrating the semantic analysis results with existing sensor network metadata.
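One way to picture a slope-distribution representation is the Python sketch below: it bins the slope angles between consecutive observations into a normalized histogram and assigns the series to the closest reference profile. The abstract does not specify the approximation or the classifier, so the binning scheme, the nearest-profile rule, and the names `slope_histogram`, `nearest_profile`, and the reference profiles are all assumptions made for illustration.

```python
import math


def slope_histogram(times, values, n_bins=6):
    """Normalized histogram of slope angles between consecutive observations.

    Assumes strictly increasing timestamps, so each angle lies in (-pi/2, pi/2).
    """
    hist = [0] * n_bins
    points = list(zip(times, values))
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        angle = math.atan2(v1 - v0, t1 - t0)
        # Map the angle range (-pi/2, pi/2) onto bin indices 0 .. n_bins - 1.
        idx = min(int((angle + math.pi / 2) / math.pi * n_bins), n_bins - 1)
        hist[idx] += 1
    total = sum(hist)
    return [h / total for h in hist]


def nearest_profile(hist, profiles):
    """Return the name of the reference profile closest in squared distance."""
    return min(
        profiles,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(hist, profiles[name])),
    )


# Hypothetical reference profiles for two invented property types.
profiles = {
    "slow_varying": [0, 0, 0, 1, 0, 0],
    "fast_rising": [0, 0, 0, 0, 0, 1],
}

times = [0, 1, 2, 3]
flat_series = [20.0, 20.0, 20.0, 20.0]
steep_series = [0.0, 100.0, 200.0, 300.0]

print(nearest_profile(slope_histogram(times, flat_series), profiles))
print(nearest_profile(slope_histogram(times, steep_series), profiles))
```

A histogram of slopes discards absolute values and units, which is why a representation of this kind can compare series from heterogeneous sensors; in practice the reference profiles would be learned from labeled sensor data rather than written by hand.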