940 results for semantic analysis


Relevance: 30.00%

Abstract:

Previous research into formulaic language has focussed on specialised groups of people (e.g. L1 acquisition by infants and adult L2 acquisition) with ordinary adult native speakers of English receiving less attention. Additionally, whilst some features of formulaic language have been used as evidence of authorship (e.g. the Unabomber’s use of you can’t eat your cake and have it too) there has been no systematic investigation into this as a potential marker of authorship. This thesis reports the first full-scale study into the use of formulaic sequences by individual authors. The theory of formulaic language hypothesises that formulaic sequences contained in the mental lexicon are shaped by experience combined with what each individual has found to be communicatively effective. Each author’s repertoire of formulaic sequences should therefore differ. To test this assertion, three automated approaches to the identification of formulaic sequences are tested on a specially constructed corpus containing 100 short narratives. The first approach explores a limited subset of formulaic sequences using recurrence across a series of texts as the criterion for identification. The second approach focuses on a word which frequently occurs as part of formulaic sequences and also investigates alternative non-formulaic realisations of the same semantic content. Finally, a reference list approach is used. Whilst claiming authority for any reference list can be difficult, the proposed method utilises internet examples derived from lists prepared by others, a procedure which, it is argued, is akin to asking large groups of judges to reach consensus about what is formulaic. The empirical evidence supports the notion that formulaic sequences have potential as a marker of authorship since in some cases a Questioned Document was correctly attributed. 
Although this marker of authorship is not universally applicable, it does promise to become a viable new tool in the forensic linguist’s tool-kit.
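
The recurrence criterion mentioned in the abstract above (a word sequence counts as a candidate when it recurs across a series of texts) can be illustrated with a minimal sketch. The function name `recurrent_ngrams`, the n-gram length `n`, and the threshold `min_texts` are assumptions made for illustration; the thesis's actual identification procedure is not described in this abstract and may differ.

```python
from collections import Counter


def recurrent_ngrams(texts, n=3, min_texts=2):
    """Return word n-grams that occur in at least `min_texts` different texts.

    Illustrative only: tokenisation is a lowercase whitespace split, and
    `n` and `min_texts` are hypothetical parameters.
    """
    doc_freq = Counter()
    for text in texts:
        words = text.lower().split()
        # Use a set so each n-gram is counted once per text.
        grams = {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
        doc_freq.update(grams)
    return {" ".join(g) for g, count in doc_freq.items() if count >= min_texts}


texts = [
    "at the end of the day we left",
    "it was fine at the end of the day",
    "we left early",
]
candidates = recurrent_ngrams(texts, n=3, min_texts=2)
```

Counting each n-gram once per text means the threshold measures recurrence across texts rather than repetition within one text.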

Relevance: 30.00%

Abstract:

While much of a company's knowledge can be found in text repositories, current content management systems have limited capabilities for structuring and interpreting documents. In the emerging Semantic Web, search, interpretation and aggregation can be addressed by ontology-based semantic mark-up. In this paper, we examine semantic annotation, identify a number of requirements, and review the current generation of semantic annotation systems. This analysis shows that, while there is still some way to go before semantic annotation tools will be able to address fully all the knowledge management needs, research in the area is active and making good progress.

Relevance: 30.00%

Abstract:

The aim of our work is to present solutions and methodological support for automated techniques and procedures in domain engineering, in particular for variability modeling. Our approach is based upon Semantic Modeling concepts, for which semantic description, representation patterns and inference mechanisms are defined. Model-driven techniques enriched with semantics will thus allow flexibility and variability in the means of representation, the reasoning power and the analysis depth required for the identification, interpretation and adaptation of artifact properties and qualities.

Relevance: 30.00%

Abstract:

The paper presents an approach to extracting facts from document texts. The approach relies on knowledge of the subject domain, a specialized dictionary, and fact schemes that describe the structure of facts, taking into account both the semantic and the syntactic compatibility of fact elements. Each extracted fact combines into a single structure the dictionary's lexical objects found in the text and matches them against concepts of the subject-domain ontology.

Relevance: 30.00%

Abstract:

Approaches to analysing, at a semantic level, the information resources pertinent to user requirements are determined by the thesauri of the relevant subject domains. Algorithms for building and normalizing a multilingual thesaurus are given, together with methods for comparing such thesauri.

Relevance: 30.00%

Abstract:

* This work was financially supported by RFBF-04-01-00858.

Relevance: 30.00%

Abstract:

A new distance function for comparing arbitrary partitions is proposed. Clusterings of image collections and image segmentations provide the objects to be matched. The proposed metric is intended to combine visual features with metadata analysis in order to bridge the semantic gap between low-level visual features and high-level human concepts.
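
The abstract above does not define its distance function, so the sketch below uses a different, well-known measure to show what comparing two partitions of the same items can look like: the fraction of item pairs on which the partitions disagree (one minus the Rand index). The function name `pair_disagreement_distance` and the label-list encoding are assumptions for illustration, not the paper's metric.

```python
from itertools import combinations


def pair_disagreement_distance(labels_a, labels_b):
    """Fraction of item pairs grouped together in one partition but not the other.

    Each partition is given as a list of cluster labels, one per item.
    This equals 1 - Rand index; it is a stand-in, not the paper's distance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same items")
    n = len(labels_a)
    if n < 2:
        return 0.0
    disagreements = 0
    for i, j in combinations(range(n), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a != same_b:
            disagreements += 1
    return disagreements / (n * (n - 1) / 2)


# Relabelling clusters does not change the partition, so the distance is 0.
d_same = pair_disagreement_distance([0, 0, 1, 1], [1, 1, 0, 0])
# Two "crossed" partitions of four items disagree on 4 of the 6 pairs.
d_cross = pair_disagreement_distance([0, 0, 1, 1], [0, 1, 0, 1])
```

The pairwise formulation is quadratic in the number of items, which is acceptable for a small illustration; a contingency-table computation would be the usual choice for large collections.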

Relevance: 30.00%

Abstract:

One of the ultimate aims of Natural Language Processing is to automate the analysis of the meaning of text. A fundamental step in that direction consists in enabling effective ways to automatically link textual references to their referents, that is, real world objects. The work presented in this paper addresses the problem of attributing a sense to proper names in a given text, i.e., automatically associating words representing Named Entities with their referents. The method for Named Entity Disambiguation proposed here is based on the concept of semantic relatedness, which in this work is obtained via a graph-based model over Wikipedia. We show that, without building the traditional bag of words representation of the text, but instead only considering named entities within the text, the proposed method achieves results competitive with the state-of-the-art on two different datasets.
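
The idea of disambiguating named entities through semantic relatedness, using only the other entities in the text, can be sketched on a toy link graph. Everything here is an assumption for illustration: the graph, the candidate lists, the Jaccard overlap of neighbour sets used as the relatedness score, and the function names. The paper's actual graph-based model over Wikipedia is not specified in this abstract.

```python
def relatedness(graph, a, b):
    """Jaccard overlap of the neighbour sets of two entities (a stand-in measure)."""
    na, nb = graph.get(a, set()), graph.get(b, set())
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


def disambiguate(mentions, candidates, graph):
    """For each mention, pick the candidate most related to the other mentions' candidates."""
    result = {}
    for mention in mentions:
        def score(candidate):
            # Sum, over every other mention, the best relatedness to any of its candidates.
            return sum(
                max((relatedness(graph, candidate, other_c)
                     for other_c in candidates[other]), default=0.0)
                for other in mentions if other != mention
            )
        result[mention] = max(candidates[mention], key=score)
    return result


# Hypothetical link graph: entity -> set of linked pages.
graph = {
    "Jaguar_Cars": {"Automobile", "Coventry", "Ford_Motor_Company"},
    "Jaguar_(animal)": {"Felidae", "Amazon_rainforest"},
    "Ford_Motor_Company": {"Automobile", "Detroit", "Jaguar_Cars"},
}
candidates = {
    "Jaguar": ["Jaguar_Cars", "Jaguar_(animal)"],
    "Ford": ["Ford_Motor_Company"],
}
resolved = disambiguate(["Jaguar", "Ford"], candidates, graph)
```

No bag-of-words representation of the surrounding text is built; the only evidence used is the set of entities mentioned together, which mirrors the setting the abstract describes.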

Relevance: 30.00%

Abstract:

Sentiment lexicons for sentiment analysis offer a simple, yet effective way to obtain the prior sentiment information of opinionated words in texts. However, words’ sentiment orientations and strengths often change throughout various contexts in which the words appear. In this paper, we propose a lexicon adaptation approach that uses the contextual semantics of words to capture their contexts in tweet messages and update their prior sentiment orientations and/or strengths accordingly. We evaluate our approach on one state-of-the-art sentiment lexicon using three different Twitter datasets. Results show that the sentiment lexicons adapted by our approach outperform the original lexicon in accuracy and F-measure in two datasets, but give similar accuracy and slightly lower F-measure in one dataset.

Relevance: 30.00%

Abstract:

This paper presents the main concepts of a project under development that concerns the analysis of scenes containing a large number of objects represented as unstructured point clouds. To achieve what we call the "optimal scene interpretation" (the shortest scene description satisfying the MDL principle), we manage 3-D objects through an ontology-based semantic framework for adding and sharing conceptual knowledge about spatial objects.

Relevance: 30.00%

Abstract:

This paper presents a study of the linguistic structure of knowledge about Bulgarian bells. The idea of building a semantic structure for this knowledge arose during the “Multimedia fund - BellKnow” project, in which a large amount of data about bells was collected: their structure, history, technical data, etc. This is the first attempt at a computational-linguistic account of bell knowledge and at a semantic representation of that knowledge. Based on this research, linguistic components that support different types of analysis of text objects have been implemented in term dictionaries. We thus lay the foundation for linguistic analysis services in these digital dictionaries, aiding research into the kinds, number and frequency of the lexical units that constitute various bell objects.

Relevance: 30.00%

Abstract:

Report published in the Proceedings of the National Conference on "Education and Research in the Information Society", Plovdiv, May, 2014

Relevance: 30.00%

Abstract:

An implementation of Sem-ODB, a database management system based on the Semantic Binary Model, is presented. A metaschema of the Sem-ODB database and the top-level architecture of the database engine are defined. A new benchmarking technique is proposed which allows databases built on different database models to compete fairly. This technique is applied to show that Sem-ODB has excellent efficiency compared to a relational database on a certain class of database applications. A new semantic benchmark is designed which allows evaluation of the performance of the features characteristic of semantic database applications. The application used in the benchmark represents a class of problems requiring databases with sparse data, complex inheritance and many-to-many relations. Such databases can be naturally accommodated by the semantic model. A fixed predefined implementation is not enforced, allowing the database designer to choose the most efficient structures available in the DBMS tested. The results of the benchmark are analyzed.

A new high-level querying model for semantic databases is defined. It is proven adequate to serve as an efficient native semantic database interface, and has several advantages over the existing interfaces. It is optimizable and parallelizable, and it supports the definition of semantic user views and the interoperability of semantic databases with other data sources such as the World Wide Web, relational databases, and object-oriented databases. The query is structured as a semantic database schema graph with interlinking conditionals. The query result is a mini-database, accessible in the same way as the original database. The paradigm supports and utilizes the rich semantics and inherent ergonomics of semantic databases.

The analysis and high-level design are presented of a system that exploits the superiority of the Semantic Database Model over other data models in expressive power and ease of use to allow uniform access, via a common query interface, to heterogeneous data sources such as semantic databases, relational databases, web sites, ASCII files, and others. The Sem-ODB engine is used to control all the data sources combined under a unified semantic schema. A particular application of the system, providing an ODBC interface to the WWW as a data source, is discussed.

Relevance: 30.00%

Abstract:

To carry out their specific roles in the cell, genes and gene products often work together in groups, forming many relationships among themselves and with other molecules. Such relationships include physical protein-protein interactions, regulatory relationships, metabolic relationships, genetic relationships, and many more. With advances in science and technology, high-throughput technologies have been developed that simultaneously detect tens of thousands of pairwise protein-protein interactions and protein-DNA interactions. However, the data generated by high-throughput methods are prone to noise. Furthermore, the technology itself has limitations and cannot detect all kinds of relationships between genes and their products. There is thus a pressing need to investigate all kinds of relationships and their roles in a living system using bioinformatic approaches; doing so is a central challenge in Computational Biology and Systems Biology. This dissertation focuses on exploring relationships between genes and gene products using bioinformatic approaches. Specifically, we consider problems related to regulatory relationships, protein-protein interactions, and semantic relationships between genes. A regulatory element is an important pattern or "signal", often located in the promoter of a gene, which is used in the process of turning a gene "on" or "off". Predicting regulatory elements is a key step in exploring the regulatory relationships between genes and gene products. In this dissertation, we consider the problem of improving the prediction of regulatory elements by using comparative genomics data. With regard to protein-protein interactions, we have developed bioinformatics techniques to estimate support for the data on these interactions.
While protein-protein interactions and regulatory relationships can be detected by high-throughput biological techniques, there is another type of relationship, the semantic relationship, that cannot be detected by a single technique but can be inferred using multiple sources of biological data. The contributions of this thesis involve the development and application of a set of bioinformatic approaches that address the challenges mentioned above. These include (i) an EM-based algorithm that improves the prediction of regulatory elements using comparative genomics data, (ii) an approach for estimating the support of protein-protein interaction data, with application to the functional annotation of genes, (iii) a novel method for inferring a functional network of genes, and (iv) techniques for clustering genes using multi-source data.

Relevance: 30.00%

Abstract:

This thesis describes the design and implementation of a Semantic Geographic Information System (GIS) and the creation of its spatial database. The database schema is designed and created, and all textual and spatial data are loaded into the database with the help of the Semantic DBMS's Binary Database Interface, currently being developed at FIU's High Performance Database Research Center (HPDRC). A friendly graphical user interface is created together with the system's other main components: the display process, data animation, and data retrieval. All these components are tightly integrated to form a novel and practical semantic GIS that has facilitated the interpretation, manipulation, analysis, and display of spatial data such as ocean temperature, ozone (TOMS), and simulated SeaWiFS data. At the same time, this system has played a major role in testing the HPDRC's high-performance, efficient parallel Semantic DBMS.