945 resultados para Semantic enrichment


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Natural history collections are an invaluable resource housing a wealth of knowledge with a long tradition of contributing to a wide range of fields such as taxonomy, quarantine, conservation and climate change. It is recognized however [Smith and Blagoderov 2012] that such physical collections are often heavily underutilized as a result of the practical issues of accessibility. The digitization of these collections is a step towards removing these access issues, but other hurdles must be addressed before we truly unlock the potential of this knowledge.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Con questa dissertazione di tesi miro ad illustrare i risultati della mia ricerca nel campo del Semantic Publishing, consistenti nello sviluppo di un insieme di metodologie, strumenti e prototipi, uniti allo studio di un caso d‟uso concreto, finalizzati all‟applicazione ed alla focalizzazione di Lenti Semantiche (Semantic Lenses).

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this work we present a semantic framework suitable of being used as support tool for recommender systems. Our purpose is to use the semantic information provided by a set of integrated resources to enrich texts by conducting different NLP tasks: WSD, domain classification, semantic similarities and sentiment analysis. After obtaining the textual semantic enrichment we would be able to recommend similar content or even to rate texts according to different dimensions. First of all, we describe the main characteristics of the semantic integrated resources with an exhaustive evaluation. Next, we demonstrate the usefulness of our resource in different NLP tasks and campaigns. Moreover, we present a combination of different NLP approaches that provide enough knowledge for being used as support tool for recommender systems. Finally, we illustrate a case of study with information related to movies and TV series to demonstrate that our framework works properly.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This paper describes the implementation of a semantic web search engine on conversation styled transcripts. Our choice of data is Hansard, a publicly available conversation style transcript of parliamentary debates. The current search engine implementation on Hansard is limited to running search queries based on keywords or phrases hence lacks the ability to make semantic inferences from user queries. By making use of knowledge such as the relationship between members of parliament, constituencies, terms of office, as well as topics of debates the search results can be improved in terms of both relevance and coverage. Our contribution is not algorithmic instead we describe how we exploit a collection of external data sources, ontologies, semantic web vocabularies and named entity extraction in the analysis of underlying semantics of user queries as well as the semantic enrichment of the search index thereby improving the quality of results.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

In this paper we use concepts from graph theory and cellular biology represented as ontologies, to carry out semantic mining tasks on signaling pathway networks. Specifically, the paper describes the semantic enrichment of signaling pathway networks. A cell signaling network describes the basic cellular activities and their interactions. The main contribution of this paper is in the signaling pathway research area, it proposes a new technique to analyze and understand how changes in these networks may affect the transmission and flow of information, which produce diseases such as cancer and diabetes. Our approach is based on three concepts from graph theory (modularity, clustering and centrality) frequently used on social networks analysis. Our approach consists into two phases: the first uses the graph theory concepts to determine the cellular groups in the network, which we will call them communities; the second uses ontologies for the semantic enrichment of the cellular communities. The measures used from the graph theory allow us to determine the set of cells that are close (for example, in a disease), and the main cells in each community. We analyze our approach in two cases: TGF-β and the Alzheimer Disease.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Nearest neighbour collaborative filtering (NNCF) algorithms are commonly used in multimedia recommender systems to suggest media items based on the ratings of users with similar preferences. However, the prediction accuracy of NNCF algorithms is affected by the reduced number of items – the subset of items co-rated by both users – typically used to determine the similarity between pairs of users. In this paper, we propose a different approach, which substantially enhances the accuracy of the neighbour selection process – a user-based CF (UbCF) with semantic neighbour discovery (SND). Our neighbour discovery methodology, which assesses pairs of users by taking into account all the items rated at least by one of the users instead of just the set of co-rated items, semantically enriches this enlarged set of items using linked data and, finally, applies the Collinearity and Proximity Similarity metric (CPS), which combines the cosine similarity with Chebyschev distance dissimilarity metric. We tested the proposed SND against the Pearson Correlation neighbour discovery algorithm off-line, using the HetRec data set, and the results show a clear improvement in terms of accuracy and execution time for the predicted recommendations.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

With the widespread of social media websites in the internet, and the huge number of users participating and generating infinite number of contents in these websites, the need for personalisation increases dramatically to become a necessity. One of the major issues in personalisation is building users’ profiles, which depend on many elements; such as the used data, the application domain they aim to serve, the representation method and the construction methodology. Recently, this area of research has been a focus for many researchers, and hence, the proposed methods are increasing very quickly. This survey aims to discuss the available user modelling techniques for social media websites, and to highlight the weakness and strength of these methods and to provide a vision for future work in user modelling in social media websites.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

En este documento se propone un marco de trabajo basado en tecnologías de la Web Semántica para detectar potenciales redes de colaboración, mediante el enriquecimiento semántico de artículos científicos producidos por investigadores que publican con afiliaciones ecuatorianas. El marco de trabajo se describe a través de un ciclo de publicación de datos enlazados. Como alcance se consideraron publicaciones que tienen al menos un autor con afiliación ecuatoriana. Las redes de colaboración detectadas son un insumo importante para fortalecer los esfuerzos del gobierno ecuatoriano y las autoridades universitarias del país, priorizar los esfuerzos y recursos invertidos en investigación y determinar la pertinencia o coherencia de los programas de investigación.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

La visibilidad de una página Web involucra el proceso de mejora de la posición del sitio en los resultados devueltos por motores de búsqueda como Google. Hay muchas empresas que compiten agresivamente para conseguir la primera posición en los motores de búsqueda más populares. Como regla general, los sitios que aparecen más arriba en los resultados suelen obtener más tráfico a sus páginas, y de esta forma, potencialmente más negocios. En este artículo se describe los principales modelos para enriquecer los resultados de las búsquedas con información tales como fechas o localidades; información de tipo clave-valor que permite al usuario interactuar con el contenido de una página Web directamente desde el sitio de resultados de la búsqueda. El aporte fundamental del artículo es mostrar la utilidad de diferentes formatos de marcado para enriquecer fragmentos de una página Web con el fin de ayudar a las empresas que están planeando implementar métodos de enriquecimiento semánticos en la estructuración de sus sitios Web.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This work is concerned with the increasing relationships between two distinct multidisciplinary research fields, Semantic Web technologies and scholarly publishing, that in this context converge into one precise research topic: Semantic Publishing. In the spirit of the original aim of Semantic Publishing, i.e. the improvement of scientific communication by means of semantic technologies, this thesis proposes theories, formalisms and applications for opening up semantic publishing to an effective interaction between scholarly documents (e.g., journal articles) and their related semantic and formal descriptions. In fact, the main aim of this work is to increase the users' comprehension of documents and to allow document enrichment, discovery and linkage to document-related resources and contexts, such as other articles and raw scientific data. In order to achieve these goals, this thesis investigates and proposes solutions for three of the main issues that semantic publishing promises to address, namely: the need of tools for linking document text to a formal representation of its meaning, the lack of complete metadata schemas for describing documents according to the publishing vocabulary, and absence of effective user interfaces for easily acting on semantic publishing models and theories.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The research aims at developing a framework for semantic-based digital survey of architectural heritage. Rooted in knowledge-based modeling which extracts mathematical constraints of geometry from architectural treatises, as-built information of architecture obtained from image-based modeling is integrated with the ideal model in BIM platform. The knowledge-based modeling transforms the geometry and parametric relation of architectural components from 2D printings to 3D digital models, and create large amount variations based on shape grammar in real time thanks to parametric modeling. It also provides prior knowledge for semantically segmenting unorganized survey data. The emergence of SfM (Structure from Motion) provides access to reconstruct large complex architectural scenes with high flexibility, low cost and full automation, but low reliability of metric accuracy. We solve this problem by combing photogrammetric approaches which consists of camera configuration, image enhancement, and bundle adjustment, etc. Experiments show the accuracy of image-based modeling following our workflow is comparable to that from range-based modeling. We also demonstrate positive results of our optimized approach in digital reconstruction of portico where low-texture-vault and dramatical transition of illumination bring huge difficulties in the workflow without optimization. Once the as-built model is obtained, it is integrated with the ideal model in BIM platform which allows multiple data enrichment. In spite of its promising prospect in AEC industry, BIM is developed with limited consideration of reverse-engineering from survey data. Besides representing the architectural heritage in parallel ways (ideal model and as-built model) and comparing their difference, we concern how to create as-built model in BIM software which is still an open area to be addressed. The research is supposed to be fundamental for research of architectural history, documentation and conservation of architectural heritage, and renovation of existing buildings.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper describes a novel architecture to introduce automatic annotation and processing of semantic sensor data within context-aware applications. Based on the well-known state-charts technologies, and represented using W3C SCXML language combined with Semantic Web technologies, our architecture is able to provide enriched higher-level semantic representations of user’s context. This capability to detect and model relevant user situations allows a seamless modeling of the actual interaction situation, which can be integrated during the design of multimodal user interfaces (also based on SCXML) for them to be adequately adapted. Therefore, the final result of this contribution can be described as a flexible context-aware SCXML-based architecture, suitable for both designing a wide range of multimodal context-aware user interfaces, and implementing the automatic enrichment of sensor data, making it available to the entire Semantic Sensor Web

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we present the enrichment of the Integration of Semantic Resources based in WordNet (ISR-WN Enriched). This new proposal improves the previous one where several semantic resources such as SUMO, WordNet Domains and WordNet Affects were related, adding other semantic resources such as Semantic Classes and SentiWordNet. Firstly, the paper describes the architecture of this proposal explaining the particularities of each integrated resource. After that, we analyze some problems related to the mappings of different versions and how we solve them. Moreover, we show the advantages that this kind of tool can provide to different applications of Natural Language Processing. Related to that question, we can demonstrate that the integration of semantic resources allows acquiring a multidimensional vision in the analysis of natural language.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Derivational morphology proposes meaningful connections between words and is largely unrepresented in lexical databases. This thesis presents a project to enrich a lexical database with morphological links and to evaluate their contribution to disambiguation. A lexical database with sense distinctions was required. WordNet was chosen because of its free availability and widespread use. Its suitability was assessed through critical evaluation with respect to specifications and criticisms, using a transparent, extensible model. The identification of serious shortcomings suggested a portable enrichment methodology, applicable to alternative resources. Although 40% of the most frequent words are prepositions, they have been largely ignored by computational linguists, so addition of prepositions was also required. The preferred approach to morphological enrichment was to infer relations from phenomena discovered algorithmically. Both existing databases and existing algorithms can capture regular morphological relations, but cannot capture exceptions correctly; neither of them provide any semantic information. Some morphological analysis algorithms are subject to the fallacy that morphological analysis can be performed simply by segmentation. Morphological rules, grounded in observation and etymology, govern associations between and attachment of suffixes and contribute to defining the meaning of morphological relationships. Specifying character substitutions circumvents the segmentation fallacy. Morphological rules are prone to undergeneration, minimised through a variable lexical validity requirement, and overgeneration, minimised by rule reformulation and restricting monosyllabic output. Rules take into account the morphology of ancestor languages through co-occurrences of morphological patterns. Multiple rules applicable to an input suffix need their precedence established. The resistance of prefixations to segmentation has been addressed by identifying linking vowel exceptions and irregular prefixes. The automatic affix discovery algorithm applies heuristics to identify meaningful affixes and is combined with morphological rules into a hybrid model, fed only with empirical data, collected without supervision. Further algorithms apply the rules optimally to automatically pre-identified suffixes and break words into their component morphemes. To handle exceptions, stoplists were created in response to initial errors and fed back into the model through iterative development, leading to 100% precision, contestable only on lexicographic criteria. Stoplist length is minimised by special treatment of monosyllables and reformulation of rules. 96% of words and phrases are analysed. 218,802 directed derivational links have been encoded in the lexicon rather than the wordnet component of the model because the lexicon provides the optimal clustering of word senses. Both links and analyser are portable to an alternative lexicon. The evaluation uses the extended gloss overlaps disambiguation algorithm. The enriched model outperformed WordNet in terms of recall without loss of precision. Failure of all experiments to outperform disambiguation by frequency reflects on WordNet sense distinctions.