811 results for Database, Image Retrieval, Browsing, Semantic Concept
Abstract:
Web databases are now pervasive. Such a database can be accessed only via its query interface (usually an HTML query form). Extracting Web query interfaces is a critical step in data integration across multiple Web databases: it creates a formal representation of a query form by extracting the set of query conditions in it. This paper presents a novel approach to extracting Web query interfaces. In this approach, a generic set of query condition rules is created to define query conditions that are semantically equivalent to SQL search conditions. Query condition rules represent the semantic roles that labels and form elements play in query conditions, and how they are hierarchically grouped into constructs of query conditions. To group labels and form elements in a query form, we explore both their structural proximity in the hierarchy of structures in the query form, which is captured by a tree of nested tags in the HTML code of the form, and their semantic similarity, which is captured by various short texts used in labels, form elements and their properties. We have implemented the proposed approach and our experimental results show that the approach is highly effective.
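The structural-proximity idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes well-formed HTML, measures proximity as the number of edges between two nodes in the tag tree, and ignores the semantic-similarity signal and the query condition rules entirely. The names `FormTreeParser`, `tree_distance` and `pair_fields` are hypothetical.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not be pushed on the path.
VOID_TAGS = {"input", "br", "img", "hr", "meta", "link"}


class FormTreeParser(HTMLParser):
    """Records the root-to-node path of every label text and form element."""

    def __init__(self):
        super().__init__()
        self.path = []           # ids of the currently open tags
        self.counter = 0         # gives each tag a unique id
        self.labels = []         # (label text, path)
        self.fields = []         # (field name, path)
        self._label_path = None  # path of the <label> being read, if any

    def handle_starttag(self, tag, attrs):
        self.counter += 1
        node_path = self.path + [self.counter]
        if tag in ("input", "select", "textarea"):
            self.fields.append((dict(attrs).get("name", ""), node_path))
        if tag == "label":
            self._label_path = node_path
        if tag not in VOID_TAGS:
            self.path = node_path

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.path:
            self.path = self.path[:-1]
        if tag == "label":
            self._label_path = None

    def handle_data(self, data):
        if self._label_path is not None and data.strip():
            self.labels.append((data.strip(), self._label_path))


def tree_distance(p, q):
    """Number of edges between two nodes, given their root-to-node paths."""
    common = 0
    for a, b in zip(p, q):
        if a != b:
            break
        common += 1
    return (len(p) - common) + (len(q) - common)


def pair_fields(html):
    """Assign each form element the label that is closest in the tag tree."""
    parser = FormTreeParser()
    parser.feed(html)
    pairs = {}
    for name, field_path in parser.fields:
        text, _ = min(parser.labels, key=lambda l: tree_distance(l[1], field_path))
        pairs[name] = text
    return pairs


FORM = """
<form>
  <div><label>Title</label><input name="title"></div>
  <div><label>Author</label><input name="author"></div>
</form>
"""
print(pair_fields(FORM))
```

In the sample form each input shares a `div` with its own label, so that label is two edges away while the other label is four edges away. A real extractor would combine this distance with text similarity before deciding on a grouping.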
Abstract:
Background: Popular approaches in human tissue-based biomarker discovery include tissue microarrays (TMAs) and DNA Microarrays (DMAs) for protein and gene expression profiling respectively. The data generated by these analytic platforms, together with associated image, clinical and pathological data currently reside on widely different information platforms, making searching and cross-platform analysis difficult. Consequently, there is a strong need to develop a single coherent database capable of correlating all available data types.
Method: This study presents TMAX, a database system to facilitate biomarker discovery tasks. TMAX organises a variety of biomarker discovery-related data into the database. Both TMA and DMA experimental data are integrated in TMAX and connected through common DNA/protein biomarkers. Patient clinical data (including tissue pathological data), computer assisted tissue image and associated analytic data are also included in TMAX to enable the truly high throughput processing of ultra-large digital slides for both TMAs and whole slide tissue digital slides. A comprehensive web front-end was built with embedded XML parser software and predefined SQL queries to enable rapid data exchange in the form of standard XML files.
Results & Conclusion: TMAX represents one of the first attempts to integrate TMA data with public gene expression experiment data. Experiments suggest that TMAX is robust in managing large quantities of data from different sources (clinical, TMA, DMA and image analysis). Its web front-end is user friendly and, most importantly, allows rapid and easy exchange of biomarker discovery data. In conclusion, TMAX is a robust biomarker discovery data repository and research tool, which opens up opportunities for biomarker discovery and further integromics research.
Abstract:
Computational models of meaning trained on naturally occurring text successfully model human performance on tasks involving simple similarity measures, but they characterize meaning in terms of undifferentiated bags of words or topical dimensions. This has led some to question their psychological plausibility (Murphy, 2002; Schunn, 1999). We present here a fully automatic method for extracting a structured and comprehensive set of concept descriptions directly from an English part-of-speech-tagged corpus. Concepts are characterized by weighted properties, enriched with concept-property types that approximate classical relations such as hypernymy and function. Our model outperforms comparable algorithms in cognitive tasks pertaining not only to concept-internal structures (discovering properties of concepts, grouping properties by property type) but also to inter-concept relations (clustering into superordinates), suggesting the empirical validity of the property-based approach. Copyright © 2009 Cognitive Science Society, Inc. All rights reserved.
Abstract:
The Supreme Court of the United States in Feist v. Rural (Feist, 1991) specified that compilations or databases, and other works, must have a minimal degree of creativity to be copyrightable. The significance and global diffusion of the decision is only matched by the difficulties it has posed for interpretation. The judgment does not specify what is to be understood by creativity, although it does give a full account of the negative of creativity, as ‘so mechanical or routine as to require no creativity whatsoever’ (Feist, 1991, p.362). The negative of creativity as highly mechanical has particularly diffused globally.
A recent interpretation has correlated ‘so mechanical’ (Feist, 1991) with an automatic mechanical procedure or computational process, using a rigorous exegesis to fully correlate the two uses of mechanical. The negative of creativity is then understood as an automatic computation and as a highly routine process. Creativity itself is conversely understood as non-computational activity, above a certain level of routinicity (Warner, 2013).
The distinction between the negative of creativity and creativity is strongly analogous to an independently developed distinction between forms of mental labour, between semantic and syntactic labour. Semantic labour is understood as human labour motivated by considerations of meaning and syntactic labour as concerned solely with patterns. Semantic labour is distinctively human while syntactic labour can be directly humanly conducted or delegated to machine, as an automatic computational process (Warner, 2005; 2010, pp.33-41).
The value of the analogy is to greatly increase the intersubjective scope of the distinction between semantic and syntactic mental labour. The global diffusion of the standard for extreme absence of copyrightability embodied in the judgment also indicates the possibility that the distinction fully captures the current transformation in the distribution of mental labour, where syntactic tasks which were previously humanly performed are now increasingly conducted by machine.
The paper has substantive and methodological relevance to the conference themes. Substantively, it is concerned with human creativity and with rationality as not reducible to computation, and it has relevance to the language myth through its indirect endorsement of a non-computable or non-mechanical semantics. These themes are supported by the underlying idea of technology as a human construction. Methodologically, it is rooted in the humanities and conducts critical thinking through exegesis and empirically tested theoretical development.
References
Feist. (1991). Feist Publications, Inc. v. Rural Tel. Service Co., Inc. 499 U.S. 340.
Warner, J. (2005). Labor in information systems. Annual Review of Information Science and Technology. 39, pp.551-573.
Warner, J. (2010). Human Information Retrieval (History and Foundations of Information Science Series). Cambridge, MA: MIT Press.
Warner, J. (2013). Creativity for Feist. Journal of the American Society for Information Science and Technology. 64, 6, pp.1173-1192.
Abstract:
Shoeprint evidence collected from crime scenes can play an important role in forensic investigations. Usually, the analysis of shoeprints is carried out manually and is based on human expertise and knowledge. As well as being error prone, such a manual process can be time consuming, which affects the usability and suitability of shoeprint evidence in a court of law. An automatic system for classification and retrieval of shoeprints therefore has the potential to be a valuable tool. This paper presents a solution for the automatic retrieval of shoeprints which is considerably more robust than existing solutions in the presence of geometric distortions such as scale and rotation. It addresses the issue of classifying partial shoeprints in the presence of rotation, scale and noise distortions and relies on two local point-of-interest detectors whose matching scores are combined. In this work, multiscale Harris and Hessian detectors are used to select corners and blob-like structures in a scale-space representation for scale invariance, while the Scale Invariant Feature Transform (SIFT) descriptor is employed to achieve rotation invariance. The proposed technique combines the matching scores of the two detectors at the score level. Our evaluation has shown that it outperforms both detectors in most of our extended experiments when retrieving partial shoeprints with geometric distortions, and is clearly better than similar work published in the literature. We also demonstrate improved performance in the face of wear and tear. Moreover, good retrieval performance does not depend on acquiring a full print from a crime scene, as a partial print can still attain retrieval results comparable to those obtained with the full print. This gives crime investigators more flexibility in choosing the parts of a print to search for in a database of footwear.
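Score-level fusion of two matchers can be sketched as follows. The abstract does not say which normalisation or combination rule the authors use, so the min-max normalisation, the weighted sum, the function names and the example scores are all assumptions made for illustration.

```python
def min_max(scores):
    """Rescale one detector's scores to [0, 1] (assumed normalisation)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {key: 0.0 for key in scores}
    return {key: (value - lo) / (hi - lo) for key, value in scores.items()}


def fuse_scores(harris_scores, hessian_scores, weight=0.5):
    """Weighted sum of the two normalised score sets (assumed fusion rule).

    Both dicts must score the same set of database prints.
    """
    a, b = min_max(harris_scores), min_max(hessian_scores)
    return {key: weight * a[key] + (1 - weight) * b[key] for key in a}


def rank(fused):
    """Database prints ordered from best to worst fused score."""
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical matching scores of one query print against three database prints.
harris = {"shoe_a": 12, "shoe_b": 30, "shoe_c": 18}
hessian = {"shoe_a": 40, "shoe_b": 35, "shoe_c": 10}
print(rank(fuse_scores(harris, hessian)))
```

Normalising before summing keeps either detector from dominating just because its raw scores lie on a larger scale.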
Abstract:
The aim of this work is to develop a concept for representing the Personennamendatei (PND) in the languages Resource Description Framework (RDF), Resource Description Framework Schema Language (RDFS) and Web Ontology Language (OWL). Following the premise of the Semantic Web that data should be represented and stored in both human-readable and machine-processable form, a structure for personal data is created. The starting point is the existing data and structure in the Pica format. Beyond that, the model must remain extensible and adaptable with regard to future applications and structural changes that may not yet be foreseeable. The modelling is guided by existing standards such as Dublin Core, Friend Of A Friend (FOAF), Functional Requirements for Bibliographic Records (FRBR), Functional Requirements for Authority Data (FRAD) and Resource Description and Access (RDA).
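As a rough illustration of what a machine-processable person record looks like, the sketch below serialises one person as two N-Triples statements using the FOAF vocabulary. It does not reproduce the model developed in the thesis: the function name, the example URI and the choice of properties are assumptions, and only backslashes and double quotes are escaped in the literal.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"


def person_to_ntriples(uri, name):
    """Return N-Triples lines typing `uri` as a foaf:Person with a foaf:name."""
    # Minimal escaping only; control characters such as newlines are not handled.
    escaped = name.replace("\\", "\\\\").replace('"', '\\"')
    return [
        f"<{uri}> <{RDF_TYPE}> <{FOAF}Person> .",
        f'<{uri}> <{FOAF}name> "{escaped}" .',
    ]


# Hypothetical identifier; real PND records use their own numbering scheme.
for line in person_to_ntriples("http://example.org/pnd/0001",
                               "Goethe, Johann Wolfgang von"):
    print(line)
```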
Abstract:
Doctoral thesis, Informatics (Informatics Engineering), Universidade de Lisboa, Faculdade de Ciências, 2015
Abstract:
Description of the annotation files: Annotation files are supplied for each video, for benchmarking. The annotations are ground truths for people's positions in the image plane and for their feet positions, when these were visible. Annotations were made manually, with the aid of code developed by Silva et al. (2014; see the thesis for details). Targets (people or feet) are marked at variable frame intervals and then linearly interpolated.
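The interpolation step can be sketched as below, assuming each marked target is stored as a mapping from frame index to an (x, y) image position. The data layout and the name `interpolate_track` are assumptions; the actual annotation file format is not described here.

```python
def interpolate_track(keyframes):
    """Fill in every frame between manually marked keyframes.

    keyframes: non-empty dict {frame_index: (x, y)} of marked positions.
    Returns a dict with a position for each frame from the first to the
    last keyframe, linearly interpolated between neighbouring marks.
    """
    frames = sorted(keyframes)
    track = {}
    for f0, f1 in zip(frames, frames[1:]):
        (x0, y0), (x1, y1) = keyframes[f0], keyframes[f1]
        for f in range(f0, f1):
            t = (f - f0) / (f1 - f0)  # 0 at f0, approaching 1 at f1
            track[f] = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
    track[frames[-1]] = keyframes[frames[-1]]
    return track


# Hypothetical marks at frames 0, 4 and 6 (variable intervals).
marks = {0: (0.0, 0.0), 4: (8.0, 4.0), 6: (8.0, 10.0)}
track = interpolate_track(marks)
print(track[2], track[5])
```

Frames outside the first and last mark are left undefined instead of being extrapolated.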