988 resultados para Document Representation


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Genetic Algorithms (GAs) make use of an internal representation of a given system in order to perform optimization functions. The actual structural layout of this representation, called a genome, has a crucial impact on the outcome of the optimization process. The purpose of this paper is to study the effects of different internal representations in a GA, which generates neural networks. A second GA was used to optimize the genome structure. This structure produces an optimized system within a shorter time interval.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Photo-mosaicing techniques have become popular for seafloor mapping in various marine science applications. However, the common methods cannot accurately map regions with high relief and topographical variations. Ortho-mosaicing borrowed from photogrammetry is an alternative technique that enables taking into account the 3-D shape of the terrain. A serious bottleneck is the volume of elevation information that needs to be estimated from the video data, fused, and processed for the generation of a composite ortho-photo that covers a relatively large seafloor area. We present a framework that combines the advantages of dense depth-map and 3-D feature estimation techniques based on visual motion cues. The main goal is to identify and reconstruct certain key terrain feature points that adequately represent the surface with minimal complexity in the form of piecewise planar patches. The proposed implementation utilizes local depth maps for feature selection, while tracking over several views enables 3-D reconstruction by bundle adjustment. Experimental results with synthetic and real data validate the effectiveness of the proposed approach

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This article in the peer-reviewed Oxford Bibliographies series, gives an introduction to the literatures on the varieties, origins, and effects of proportional electoral systems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

One way to organize knowledge and make its search and retrieval easier is to create a structural representation divided by hierarchically related topics. Once this structure is built, it is necessary to find labels for each of the obtained clusters. In many cases the labels have to be built using only the terms in the documents of the collection. This paper presents the SeCLAR (Selecting Candidate Labels using Association Rules) method, which explores the use of association rules for the selection of good candidates for labels of hierarchical document clusters. The candidates are processed by a classical method to generate the labels. The idea of the proposed method is to process each parent-child relationship of the nodes as an antecedent-consequent relationship of association rules. The experimental results show that the proposed method can improve the precision and recall of labels obtained by classical methods. © 2010 Springer-Verlag.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

One way to organize knowledge and make its search and retrieval easier is to create a structural representation divided by hierarchically related topics. Once this structure is built, it is necessary to find labels for each of the obtained clusters. In many cases the labels must be built using all the terms in the documents of the collection. This paper presents the SeCLAR method, which explores the use of association rules in the selection of good candidates for labels of hierarchical document clusters. The purpose of this method is to select a subset of terms by exploring the relationship among the terms of each document. Thus, these candidates can be processed by a classical method to generate the labels. An experimental study demonstrates the potential of the proposed approach to improve the precision and recall of labels obtained by classical methods only considering the terms which are potentially more discriminative. © 2012 - IOS Press and the authors. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Since its emergence as a discipline, in the nineteenth century (1889), the theory and practice of Archival Science have focused on the arrangement and description of archival materials as complementary and inseparable nuclear processes that aim to classify, to order, to describe and to give access to records. These processes have their specific goals sharing one in common: the representation of archival knowledge. In the late 1980 a paradigm shift was announced in Archival Science, especially after the appearance of the new forms of document production and information technologies. The discipline was then invited to rethink its theoretical and methodological bases founded in the nineteenth century so it could handle the contemporary archival knowledge production, organization and representation. In this sense, the present paper aims to discuss, under a theoretical perspective, the archival representation, more specifically the archival description facing these changes and proposals, in order to illustrate the challenges faced by Contemporary Archival Science in a new context of production, organization and representation of archival knowledge.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The goal of the present research is to define a Semantic Web framework for precedent modelling, by using knowledge extracted from text, metadata, and rules, while maintaining a strong text-to-knowledge morphism between legal text and legal concepts, in order to fill the gap between legal document and its semantics. The framework is composed of four different models that make use of standard languages from the Semantic Web stack of technologies: a document metadata structure, modelling the main parts of a judgement, and creating a bridge between a text and its semantic annotations of legal concepts; a legal core ontology, modelling abstract legal concepts and institutions contained in a rule of law; a legal domain ontology, modelling the main legal concepts in a specific domain concerned by case-law; an argumentation system, modelling the structure of argumentation. The input to the framework includes metadata associated with judicial concepts, and an ontology library representing the structure of case-law. The research relies on the previous efforts of the community in the field of legal knowledge representation and rule interchange for applications in the legal domain, in order to apply the theory to a set of real legal documents, stressing the OWL axioms definitions as much as possible in order to enable them to provide a semantically powerful representation of the legal document and a solid ground for an argumentation system using a defeasible subset of predicate logics. It appears that some new features of OWL2 unlock useful reasoning features for legal knowledge, especially if combined with defeasible rules and argumentation schemes. The main task is thus to formalize legal concepts and argumentation patterns contained in a judgement, with the following requirement: to check, validate and reuse the discourse of a judge - and the argumentation he produces - as expressed by the judicial text.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This thesis aims at investigating a new approach to document analysis based on the idea of structural patterns in XML vocabularies. My work is founded on the belief that authors do naturally converge to a reasonable use of markup languages and that extreme, yet valid instances are rare and limited. Actual documents, therefore, may be used to derive classes of elements (patterns) persisting across documents and distilling the conceptualization of the documents and their components, and may give ground for automatic tools and services that rely on no background information (such as schemas) at all. The central part of my work consists in introducing from the ground up a formal theory of eight structural patterns (with three sub-patterns) that are able to express the logical organization of any XML document, and verifying their identifiability in a number of different vocabularies. This model is characterized by and validated against three main dimensions: terseness (i.e. the ability to represent the structure of a document with a small number of objects and composition rules), coverage (i.e. the ability to capture any possible situation in any document) and expressiveness (i.e. the ability to make explicit the semantics of structures, relations and dependencies). An algorithm for the automatic recognition of structural patterns is then presented, together with an evaluation of the results of a test performed on a set of more than 1100 documents from eight very different vocabularies. This language-independent analysis confirms the ability of patterns to capture and summarize the guidelines used by the authors in their everyday practice. Finally, I present some systems that work directly on the pattern-based representation of documents. The ability of these tools to cover very different situations and contexts confirms the effectiveness of the model.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Two important characteristics of science are the ?reproducibility? and ?clarity?. By rigorous practices, scientists explore aspects of the world that they can reproduce under carefully controlled experimental conditions. The clarity, complementing reproducibility, provides unambiguous descriptions of results in a mechanical or mathematical form. Both pillars depend on well-structured and accurate descriptions of scientific practices, which are normally recorded in experimental protocols, scientific workflows, etc. Here we present SMART Protocols (SP), our ontology-based approach for representing experimental protocols and our contribution to clarity and reproducibility. SP delivers an unambiguous description of processes by means of which data is produced; by doing so, we argue, it facilitates reproducibility. Moreover, SP is thought to be part of e-science infrastructures. SP results from the analysis of 175 protocols; from this dataset, we extracted common elements. From our analysis, we identified document, workflow and domain-specific aspects in the representation of experimental protocols. The ontology is available at http://purl.org/net/SMARTprotocol

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper reports on the further results of the ongoing research analyzing the impact of a range of commonly used statistical and semantic features in the context of extractive text summarization. The features experimented with include word frequency, inverse sentence and term frequencies, stopwords filtering, word senses, resolved anaphora and textual entailment. The obtained results demonstrate the relative importance of each feature and the limitations of the tools available. It has been shown that the inverse sentence frequency combined with the term frequency yields almost the same results as the latter combined with stopwords filtering that in its turn proved to be a highly competitive baseline. To improve the suboptimal results of anaphora resolution, the system was extended with the second anaphora resolution module. The present paper also describes the first attempts of the internal document data representation.