41 results for Semantic similarity
Abstract:
Background: A number of studies have used protein interaction data alone for protein function prediction. Here, we introduce a computational approach for annotation of enzymes, based on the observation that similar protein sequences are more likely to perform the same function if they share similar interacting partners. Results: The method has been tested against the PSI-BLAST program using a set of 3,890 protein sequences for which interaction data were available. For protein sequences that align with at least 40% sequence identity to a known enzyme, the specificity of our method in predicting the first three EC digits increased from 80% to 90% at 80% coverage when compared to PSI-BLAST. Conclusion: Our method can also be used for proteins for which homologous sequences with known interacting partners can be detected. Thus, our method could increase by 10% the specificity of genome-wide enzyme predictions based on sequence matching by PSI-BLAST alone.
Abstract:
We present a new technique for audio signal comparison based on tonal subsequence alignment and its application to detect cover versions (i.e., different performances of the same underlying musical piece). Cover song identification is a task whose popularity has increased in the Music Information Retrieval (MIR) community over the past years, as it provides a direct and objective way to evaluate music similarity algorithms. This article first presents a series of experiments carried out with two state-of-the-art methods for cover song identification. We have studied several components of these (such as chroma resolution and similarity, transposition, beat tracking or Dynamic Time Warping constraints), in order to discover which characteristics would be desirable for a competitive cover song identifier. After analyzing many cross-validated results, the importance of these characteristics is discussed, and the best-performing ones are finally applied to the newly proposed method. Multiple evaluations of this method confirm a large increase in identification accuracy when comparing it with alternative state-of-the-art approaches.
Abstract:
In this paper we present a description of the role of definitional verbal patterns for the extraction of semantic relations. Several studies show that semantic relations can be extracted from analytic definitions contained in machine-readable dictionaries (MRDs). In addition, definitions found in specialised texts are a good starting point to search for different types of definitions where other semantic relations occur. The extraction of definitional knowledge from specialised corpora represents another interesting approach for the extraction of semantic relations. Here, we present a descriptive analysis of definitional verbal patterns in Spanish and the first steps towards the development of a system for the automatic extraction of definitional knowledge.
Abstract:
In this paper a method for extracting semantic information from online music discussion forums is proposed. The semantic relations are inferred from the co-occurrence of musical concepts in forum posts, using network analysis. The method starts by defining a dictionary of common music terms in an art music tradition. Then, it creates a complex network representation of the online forum by matching this dictionary against the forum posts. Once the complex network is built, we can study different network measures, including node relevance, node co-occurrence and term relations via semantically connecting words. Moreover, we can detect communities of concepts inside the forum posts. The rationale is that some music terms are more related to each other than to other terms. All in all, this methodology allows us to obtain meaningful and relevant information from forum discussions.
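The dictionary-matching and co-occurrence steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the three-term dictionary and the sample posts are hypothetical, matching is restricted to single lowercase ASCII words, and node relevance is approximated here by weighted degree, which is only one of many possible measures.

```python
import re
from collections import Counter
from itertools import combinations


def cooccurrence_network(posts, dictionary):
    """Count, for each pair of dictionary terms, the posts in which both appear."""
    terms = {t.lower() for t in dictionary}
    edges = Counter()
    for post in posts:
        # Tokenise crudely into lowercase ASCII words (an assumption of this sketch).
        tokens = set(re.findall(r"[a-z]+", post.lower()))
        present = sorted(tokens & terms)
        # Every unordered pair of terms found in the same post gains one unit of weight.
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


def weighted_degree(edges):
    """Sum the edge weights incident to each term (a simple relevance proxy)."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


# Hypothetical data, for illustration only.
posts = [
    "The raga and the tala in this recording are unusual.",
    "That alap introduces the raga slowly.",
    "Which tala is this?",
]
dictionary = ["raga", "tala", "alap"]

edges = cooccurrence_network(posts, dictionary)
relevance = weighted_degree(edges)
print(edges.most_common())
print(relevance.most_common())
```

Community detection, also mentioned in the abstract, would operate on the same weighted edge list and is left out of this sketch.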
Abstract:
Acquiring lexical information is a complex problem, typically approached by relying on a number of contexts to contribute information for classification. One of the first issues to address in this domain is the determination of such contexts. The work presented here proposes the use of automatically obtained FORMAL role descriptors as features used to draw nouns from the same lexical semantic class together in an unsupervised clustering task. We have dealt with three lexical semantic classes (HUMAN, LOCATION and EVENT) in English. The results obtained show that it is possible to discriminate between elements from different lexical semantic classes using only FORMAL role information, hence validating our initial hypothesis. Also, iterating our method accurately accounts for fine-grained distinctions within lexical classes, namely distinctions involving ambiguous expressions. Moreover, a filtering and bootstrapping strategy employed in extracting FORMAL role descriptors proved to minimize effects of sparse data and noise in our task.
Abstract:
The work we present here addresses cue-based noun classification in English and Spanish. Its main objective is to automatically acquire lexical semantic information by classifying nouns into previously known noun lexical classes. This is achieved by using particular aspects of linguistic contexts as cues that identify a specific lexical class. Here we concentrate on the task of identifying such cues and the theoretical background that allows for an assessment of the complexity of the task. The results show that, despite the a priori complexity of the task, cue-based classification is a useful tool in the automatic acquisition of lexical semantic classes.
Abstract:
This work briefly analyses the difficulties of adopting the Semantic Web, and in particular proposes systems for assessing the current level of migration to the different technologies that make up the Semantic Web. It focuses on the presentation and description of two tools, DigiDocSpider and DigiDocMetaEdit, designed with the aim of verifying, evaluating, and promoting its implementation.
Abstract:
A class of composite estimators of small area quantities that exploit spatial (distance-related) similarity is derived. It is based on a distribution-free model for the areas, but the estimators are aimed to have optimal design-based properties. Composition is applied also to estimate some of the global parameters on which the small area estimators depend. It is shown that the commonly adopted assumption of random effects is not necessary for exploiting the similarity of the districts (borrowing strength across the districts). The methods are applied in the estimation of the mean household sizes and the proportions of single-member households in the counties (comarcas) of Catalonia. The simplest version of the estimators is more efficient than the established alternatives, even though the extent of spatial similarity is quite modest.
Abstract:
We show that the statistics of an edge-type variable in natural images exhibit self-similarity properties which resemble those of local energy dissipation in turbulent flows. Our results show that self-similarity and extended self-similarity hold remarkably for the statistics of the local edge variance, and that the very same models can be used to predict all of the associated exponents. These results suggest using natural images as a laboratory for testing more elaborate scaling models of interest for the statistical description of turbulent flows. The properties we have exhibited are relevant for the modeling of the early visual system: they should be included in models designed for the prediction of receptive fields.
Abstract:
We demonstrate that the self-similarity of some scale-free networks with respect to a simple degree-thresholding renormalization scheme finds a natural interpretation in the assumption that network nodes exist in hidden metric spaces. Clustering, i.e., cycles of length three, plays a crucial role in this framework as a topological reflection of the triangle inequality in the hidden geometry. We prove that a class of hidden variable models with underlying metric spaces is able to accurately reproduce the self-similarity properties that we measured in the real networks. Our findings indicate that hidden geometries underlying these real networks are a plausible explanation for their observed topologies and, in particular, for their self-similarity with respect to the degree-based renormalization.
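As a rough illustration of what a degree-thresholding step can look like, the sketch below keeps only the nodes whose degree in the original graph exceeds a threshold and returns the subgraph they induce. This is an assumed reading of the scheme named in the abstract, not the authors' code; the function name, the adjacency-dictionary representation and the toy graph are hypothetical.

```python
def degree_threshold_subgraph(adjacency, k_t):
    """Return the subgraph induced by nodes whose original degree exceeds k_t.

    `adjacency` maps each node to the set of its neighbours (undirected graph).
    """
    keep = {node for node, neighbours in adjacency.items() if len(neighbours) > k_t}
    # Restrict each surviving node's neighbourhood to the surviving nodes.
    return {node: adjacency[node] & keep for node in keep}


# Hypothetical toy graph: a triangle a-b-c plus a pendant node d attached to a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

# With threshold 1, the pendant node d is removed and the triangle remains.
core = degree_threshold_subgraph(graph, 1)
print(core)
```

Comparing statistics such as the degree distribution or clustering of the subgraphs obtained for increasing thresholds is one way to examine the self-similarity the abstract refers to.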
Abstract:
We analyse the use of the ordered weighted average (OWA) in decision-making, giving special attention to business and economic decision-making problems. We present several aggregation techniques that are very useful for decision-making, such as the Hamming distance, the adequacy coefficient and the index of maximum and minimum level. We suggest a new approach by using immediate weights, that is, by using the weighted average and the OWA operator in the same formulation. We further generalize them by using generalized and quasi-arithmetic means. We also analyse the applicability of the OWA operator in business and economics and we see that we can use it instead of the weighted average. We end the paper with an application in a business multi-person decision-making problem regarding production management.
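The difference between the weighted average and the OWA operator mentioned in this abstract can be shown with a short sketch. It uses the standard OWA definition (weights are applied to the values after sorting them in descending order) and does not attempt the paper's immediate-weights formulation; the function names, scores and weights are hypothetical.

```python
def weighted_average(values, weights):
    """Classical weighted average: weight i is tied to argument i."""
    return sum(w * v for w, v in zip(weights, values))


def owa(values, weights):
    """Ordered weighted average: weight j is tied to the j-th largest value."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


# Hypothetical scores of one alternative under three criteria.
scores = [0.6, 0.9, 0.3]
weights = [0.5, 0.3, 0.2]

print(weighted_average(scores, weights))
print(owa(scores, weights))

# Extreme weight vectors recover the maximum and the minimum of the scores.
print(owa(scores, [1.0, 0.0, 0.0]))
print(owa(scores, [0.0, 0.0, 1.0]))
```

Because the OWA weights attach to rank positions rather than to criteria, the choice of weight vector moves the aggregate anywhere between the minimum and the maximum of the arguments.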
Abstract:
In this article we develop a system that automatically categorises the database of images that serve as the starting point for ideation and design in the artistic production of the sculptor M. Planas. The methodology is based on local features. To build a visual vocabulary, we follow a procedure analogous to the one used in automatic text analysis (the 'Bag-of-Words' model, BOW); in the image domain we refer to 'Bag-of-Visual Terms' (BOV) representations. In this approach, images are analysed as a set of regions, describing only their appearance and ignoring their spatial structure. To overcome the drawbacks of polysemy and synonymy associated with this methodology, we use probabilistic latent semantic analysis (PLSA), which detects underlying aspects in the images, that is, formal patterns. The results are promising and, beyond the intrinsic usefulness of automatic image categorisation, this method can give the artist a very interesting auxiliary point of view.
Abstract:
Nowadays, a user planning a tourist route finds it very difficult to work out which places are best to visit. The user has to choose according to his/her preferences from the great quantity of information available on the web, and has to make that selection quickly because the time available for a trip is limited. In the Itiner@ project, we aim to implement Semantic Web technology combined with Geographic Information Systems (GIS) in order to offer personalized tourist routes around a region based on user preferences and time constraints. Using ontologies, it is possible to link, structure and share data, and to obtain the result best suited to the user's preferences and current situation in less time and more precisely than without ontologies. To achieve these objectives we propose a web page combining a GIS server and a tourism ontology. As a further step, we also study how to extend this technology to mobile devices, given the rising interest in and technological progress of these devices and of location-based services, which allow the user to have all the route information at hand during a trip. We design a small application in order to apply the combination of GIS and Semantic Web on a mobile device.