51 resultados para semantic textual similarity


Relevância:

20.00% 20.00%

Publicador:

Resumo:

A procedure based on quantum molecular similarity measures (QMSM) has been used to compare electron densities obtained from conventional ab initio and density functional methodologies at their respective optimized geometries. This method has been applied to a series of small molecules which have experimentally known properties and molecular bonds of diverse degrees of ionicity and covalency. Results show that in most cases the electron densities obtained from density functional methodologies are of a similar quality than post-Hartree-Fock generalized densities. For molecules where Hartree-Fock methodology yields erroneous results, the density functional methodology is shown to yield usually more accurate densities than those provided by the second order Møller-Plesset perturbation theory

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we present the theoretical and methodologicalfoundations for the development of a multi-agentSelective Dissemination of Information (SDI) servicemodel that applies Semantic Web technologies for specializeddigital libraries. These technologies make possibleachieving more efficient information management,improving agent–user communication processes, andfacilitating accurate access to relevant resources. Othertools used are fuzzy linguistic modelling techniques(which make possible easing the interaction betweenusers and system) and natural language processing(NLP) techniques for semiautomatic thesaurus generation.Also, RSS feeds are used as “current awareness bulletins”to generate personalized bibliographic alerts.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The information provided by the alignment-independent GRid Independent Descriptors (GRIND) can be condensed by the application of principal component analysis, obtaining a small number of principal properties (GRIND-PP), which is more suitable for describing molecular similarity. The objective of the present study is to optimize diverse parameters involved in the obtention of the GRIND-PP and validate their suitability for applications, requiring a biologically relevant description of the molecular similarity. With this aim, GRIND-PP computed with a collection of diverse settings were used to carry out ligand-based virtual screening (LBVS) on standard conditions. The quality of the results obtained was remarkable and comparable with other LBVS methods, and their detailed statistical analysis allowed to identify the method settings more determinant for the quality of the results and their optimum. Remarkably, some of these optimum settings differ significantly from those used in previously published applications, revealing their unexplored potential. Their applicability in large compound database was also explored by comparing the equivalence of the results obtained using either computed or projected principal properties. In general, the results of the study confirm the suitability of the GRIND-PP for practical applications and provide useful hints about how they should be computed for obtaining optimum results.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background: A number of studies have used protein interaction data alone for protein function prediction. Here, we introduce a computational approach for annotation of enzymes, based on the observation that similar protein sequences are more likely to perform the same function if they share similar interacting partners. Results: The method has been tested against the PSI-BLAST program using a set of 3,890 protein sequences from which interaction data was available. For protein sequences that align with at least 40% sequence identity to a known enzyme, the specificity of our method in predicting the first three EC digits increased from 80% to 90% at 80% coverage when compared to PSI-BLAST. Conclusion: Our method can also be used in proteins for which homologous sequences with known interacting partners can be detected. Thus, our method could increase 10% the specificity of genome-wide enzyme predictions based on sequence matching by PSI-BLAST alone.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present a new technique for audio signal comparison based on tonal subsequence alignment and its application to detect cover versions (i.e., different performances of the same underlying musical piece). Cover song identification is a task whose popularity has increased in the Music Information Retrieval (MIR) community along in the past, as it provides a direct and objective way to evaluate music similarity algorithms.This article first presents a series of experiments carried outwith two state-of-the-art methods for cover song identification.We have studied several components of these (such as chroma resolution and similarity, transposition, beat tracking or Dynamic Time Warping constraints), in order to discover which characteristics would be desirable for a competitive cover song identifier. After analyzing many cross-validated results, the importance of these characteristics is discussed, and the best-performing ones are finally applied to the newly proposed method. Multipleevaluations of this one confirm a large increase in identificationaccuracy when comparing it with alternative state-of-the-artapproaches.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we present a description of the role of definitional verbal patterns for the extraction of semantic relations. Several studies show that semantic relations can be extracted from analytic definitions contained in machine-readable dictionaries (MRDs). In addition, definitions found in specialised texts are a good starting point to search for different types of definitions where other semantic relations occur. The extraction of definitional knowledge from specialised corpora represents another interesting approach for the extraction of semantic relations. Here, we present a descriptive analysis of definitional verbal patterns in Spanish and the first steps towards the development of a system for the automatic extraction of definitional knowledge.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Corpus constituido desde 1993 hasta la actualidad en el marco del proyecto Corpus del Institut Universitari de Lingüística Aplicada (IULA) de la Universidad Pompeu Fabra. Este proyecto recopila textos escritos en cinco lenguas diferentes (catalán, castellano, inglés, francés y alemán) de las áreas de especialidad de la economía, el derecho,el medio ambiente, la medicina y la informática.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper a method for extracting semantic informationfrom online music discussion forums is proposed. The semantic relations are inferred from the co-occurrence of musical concepts in forum posts, using network analysis. The method starts by defining a dictionary of common music terms in an art music tradition. Then, it creates a complex network representation of the online forum by matchingsuch dictionary against the forum posts. Once the complex network is built we can study different network measures, including node relevance, node co-occurrence andterm relations via semantically connecting words. Moreover, we can detect communities of concepts inside the forum posts. The rationale is that some music terms are more related to each other than to other terms. All in all, this methodology allows us to obtain meaningful and relevantinformation from forum discussions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Acquiring lexical information is a complex problem, typically approached by relying on a number of contexts to contribute information for classification. One of the first issues to address in this domain is the determination of such contexts. The work presented here proposes the use of automatically obtained FORMAL role descriptors as features used to draw nouns from the same lexical semantic class together in an unsupervised clustering task. We have dealt with three lexical semantic classes (HUMAN, LOCATION and EVENT) in English. The results obtained show that it is possible to discriminate between elements from different lexical semantic classes using only FORMAL role information, hence validating our initial hypothesis. Also, iterating our method accurately accounts for fine-grained distinctions within lexical classes, namely distinctions involving ambiguous expressions. Moreover, a filtering and bootstrapping strategy employed in extracting FORMAL role descriptors proved to minimize effects of sparse data and noise in our task.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The work we present here addresses cue-based noun classification in English and Spanish. Its main objective is to automatically acquire lexical semantic information by classifying nouns into previously known noun lexical classes. This is achieved by using particular aspects of linguistic contexts as cues that identify a specific lexical class. Here we concentrate on the task of identifying such cues and the theoretical background that allows for an assessment of the complexity of the task. The results show that, despite of the a-priori complexity of the task, cue-based classification is a useful tool in the automatic acquisition of lexical semantic classes.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This work briefly analyses the difficulties to adopt the Semantic Web, and in particular proposes systems to know the present level of migration to the different technologies that make up the Semantic Web. It focuses on the presentation and description of two tools, DigiDocSpider and DigiDocMetaEdit, designed with the aim of verifYing, evaluating, and promoting its implementation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A class of composite estimators of small area quantities that exploit spatial (distancerelated)similarity is derived. It is based on a distribution-free model for the areas, but theestimators are aimed to have optimal design-based properties. Composition is applied alsoto estimate some of the global parameters on which the small area estimators depend.It is shown that the commonly adopted assumption of random effects is not necessaryfor exploiting the similarity of the districts (borrowing strength across the districts). Themethods are applied in the estimation of the mean household sizes and the proportions ofsingle-member households in the counties (comarcas) of Catalonia. The simplest version ofthe estimators is more efficient than the established alternatives, even though the extentof spatial similarity is quite modest.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

L’edició completa de les obres de Francesc Eiximenis és una empresa filològica que presenta dificultats objectives, no només perquè les obres tenen un pes material més que considerable, sinó també per la gran quantitat de testimonis manuscrits i impresos que cal col·lacionar. Amb l’objectiu d’ajudar a superar aquestes dificultats, ja fa alguns anys que l’Institut de Llengua i Cultura Catalanes de la Universitat de Girona ha emprès la publicació de les Obres de Francesc Eiximenis, i avui afortunadament podem afirmar que les obres d’Eiximenis, especialment les inèdites, ja tenen equips de treball que s’ocupen de la preparació de les edicions corresponents. Paral·lelament, i en la mateixa línia de treball, s’ha iniciat una nova descripció integral dels testimonis manuscrits i impresos de Francesc Eiximenis, els primers resultats de la qual es podran consultar en línia ben aviat. En el marc que acabo de descriure fa temps que un equip treballa en l’edició crítica del Llibre dels àngels (en endavant LA), escrit a València el 1392

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We show that the statistics of an edge type variable in natural images exhibits self-similarity properties which resemble those of local energy dissipation in turbulent flows. Our results show that self-similarity and extended self-similarity hold remarkably for the statistics of the local edge variance, and that the very same models can be used to predict all of the associated exponents. These results suggest using natural images as a laboratory for testing more elaborate scaling models of interest for the statistical description of turbulent flows. The properties we have exhibited are relevant for the modeling of the early visual system: They should be included in models designed for the prediction of receptive fields.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We demonstrate that the self-similarity of some scale-free networks with respect to a simple degree-thresholding renormalization scheme finds a natural interpretation in the assumption that network nodes exist in hidden metric spaces. Clustering, i.e., cycles of length three, plays a crucial role in this framework as a topological reflection of the triangle inequality in the hidden geometry. We prove that a class of hidden variable models with underlying metric spaces are able to accurately reproduce the self-similarity properties that we measured in the real networks. Our findings indicate that hidden geometries underlying these real networks are a plausible explanation for their observed topologies and, in particular, for their self-similarity with respect to the degree-based renormalization.