877 results for Semantic Annotation


Relevance:

20.00%

Publisher:

Abstract:

From a law enforcement standpoint, the ability to search for a person matching a semantic description (e.g. 1.8 m tall, red shirt, jeans) is highly desirable. While a significant research effort has focused on person re-detection (the task of identifying a previously observed individual in surveillance video), these techniques require descriptors to be built from existing image or video observations. As such, person re-detection techniques are not suited to situations where footage of the person of interest is not readily available, such as a witness reporting a recent crime. In this paper, we present a novel framework that is able to search for a person based on a semantic description. The proposed approach uses size and colour cues, and does not require a person detection routine to locate people in the scene, improving utility in crowded conditions. The proposed approach is demonstrated with a new database that will be made available to the research community, and we show that the proposed technique is able to correctly localise a person in a video based on a simple semantic description.

Relevance:

20.00%

Publisher:

Abstract:

This paper presents a combined structure for using real, complex, and binary valued vectors for semantic representation. The theory, implementation, and application of this structure are all significant. For the theory underlying quantum interaction, it is important to develop a core set of mathematical operators that describe systems of information, just as core mathematical operators in quantum mechanics are used to describe the behavior of physical systems. The system described in this paper enables us to compare more traditional quantum mechanical models (which use complex state vectors) with more generalized quantum models that use real and binary vectors. The implementation of such a system presents fundamental computational challenges. For large and sometimes sparse datasets, the demands on time and space are different for real, complex, and binary vectors. To accommodate these demands, the Semantic Vectors package has been carefully adapted and can now switch between different number types comparatively seamlessly. This paper describes the key abstract operations in our semantic vector models, and describes the implementations for real, complex, and binary vectors. We also discuss some of the key questions that arise in the field of quantum interaction and informatics, explaining how the wide availability of modelling options for different number fields will help to investigate some of these questions.

Relevance:

20.00%

Publisher:

Abstract:

This paper outlines a novel approach for modelling semantic relationships within medical documents. Medical terminologies contain a rich source of semantic information critical to a number of techniques in medical informatics, including medical information retrieval. Recent research suggests that corpus-driven approaches are effective at automatically capturing semantic similarities between medical concepts, thus making them an attractive option for accessing semantic information. Most previous corpus-driven methods only considered syntagmatic associations. In this paper, we adapt a recent approach that explicitly models both syntagmatic and paradigmatic associations. We show that the implicit similarity between certain medical concepts can only be modelled using paradigmatic associations. In addition, the inclusion of both types of associations overcomes the sensitivity to the training corpus experienced by previous approaches, making our method both more effective and more robust. This finding may have implications for researchers in the area of medical information retrieval.

Relevance:

20.00%

Publisher:

Abstract:

The aim of this paper is to provide a comparison of various algorithms and parameters to build reduced semantic spaces. The effect of dimension reduction, the stability of the representation and the effect of word order are examined in the context of five algorithms bearing on semantic vectors: random projection (RP), singular value decomposition (SVD), non-negative matrix factorization (NMF), permutations and holographic reduced representations (HRR). The quality of semantic representation was tested by means of a synonym-finding task using the TOEFL test on the TASA corpus. Dimension reduction was found to improve the quality of semantic representation, but it is hard to find the optimal parameter settings. Even though dimension reduction by RP was found to be more generally applicable than SVD, the semantic vectors produced by RP are somewhat unstable. Encoding word order into the semantic vector representation via HRR did not lead to any increase in scores over vectors constructed from word co-occurrence in context information. In this regard, very small context windows resulted in better semantic vectors for the TOEFL test.
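As an illustration of the random projection idea mentioned in this abstract, the following is a minimal sketch and not the paper's implementation. It assumes a dense random sign matrix scaled by 1/sqrt(k); the function names `random_projection_matrix`, `project` and `cosine` are hypothetical and chosen only for this example.

```python
import math
import random


def random_projection_matrix(n_features, n_components, seed=0):
    """Build an n_features x n_components matrix of random +/- 1/sqrt(k) entries."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_components)
    return [
        [rng.choice((-scale, scale)) for _ in range(n_components)]
        for _ in range(n_features)
    ]


def project(vector, matrix):
    """Map a high-dimensional vector to the reduced space (vector times matrix)."""
    n_components = len(matrix[0])
    reduced = [0.0] * n_components
    for value, row in zip(vector, matrix):
        if value:  # skip zero entries, which are common in co-occurrence vectors
            for k in range(n_components):
                reduced[k] += value * row[k]
    return reduced


def cosine(u, v):
    """Cosine similarity, returning 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0
```

Because the matrix depends on the random seed, two runs with different seeds give different reduced vectors, which is one plausible reading of the instability the abstract reports for RP.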

Relevance:

20.00%

Publisher:

Abstract:

Entity-oriented search has become an essential component of modern search engines. It focuses on retrieving a list of entities, or information about specific entities, instead of documents. In this paper, we study the problem of finding entity-related information, referred to as attribute-value pairs, that plays a significant role in searching for target entities. We propose a novel decomposition framework combining reduced relations and a discriminative model, the Conditional Random Field (CRF), for automatically finding entity-related attribute-value pairs in free-text documents. This decomposition framework allows us to locate potential text fragments and identify the hidden semantics, in the form of attribute-value pairs, for user queries. Empirical analysis shows that the decomposition framework outperforms pattern-based approaches because it integrates syntactic and semantic features effectively.

Relevance:

20.00%

Publisher:

Abstract:

Free association norms indicate that words are organized into semantic/associative neighborhoods within a larger network of words and links that bind the net together. We present evidence indicating that memory for a recent word event can depend on implicitly and simultaneously activating related words in its neighborhood. Processing a word during encoding primes its network representation as a function of the density of the links in its neighborhood. Such priming increases recall and recognition and can have long-lasting effects when the word is processed in working memory. Evidence for this phenomenon is reviewed in extralist cuing, primed free association, intralist cuing, and single-item recognition tasks. The findings also show that when a related word is presented to cue the recall of a studied word, the cue activates it in an array of related words that distract and reduce the probability of its selection. The activation of the semantic network produces priming benefits during encoding and search costs during retrieval. In extralist cuing, recall is a negative function of cue-to-distracter strength and a positive function of neighborhood density, cue-to-target strength, and target-to-cue strength. We show how four measures derived from the network can be combined and used to predict memory performance. These measures play different roles in different tasks, indicating that the contribution of the semantic network varies with the context provided by the task. We evaluate spreading activation and quantum-like entanglement explanations for the priming effect produced by neighborhood density.

Relevance:

20.00%

Publisher:

Abstract:

Finding and labelling semantic feature patterns of documents in a large, spatial corpus is a challenging problem. Text documents have characteristics that make semantic labelling difficult, and the rapidly increasing volume of online documents creates a bottleneck in finding meaningful textual patterns. To deal with these issues, we propose an unsupervised document labelling approach based on semantic content and feature patterns. A world ontology with extensive topic coverage is exploited to supply controlled, structured subjects for labelling. An algorithm is also introduced to reduce dimensionality based on the study of ontological structure. The proposed approach was evaluated by comparison with typical machine learning methods, including SVMs, Rocchio, and kNN, with promising results.

Relevance:

20.00%

Publisher:

Abstract:

Modelling how a word is activated in human memory is an important requirement for determining the probability of recall of a word in an extra-list cueing experiment. Previous research assumed a quantum-like model in which the semantic network was modelled as entangled qubits; however, the level of activation was clearly being overestimated. This paper explores three variations of this model, each of which is distinguished by a scaling factor designed to compensate for the overestimation.

Relevance:

20.00%

Publisher:

Abstract:

In this paper we propose a method to generate a large-scale, accurate, dense 3D semantic map of street scenes. A dense 3D semantic model of the environment can significantly improve a number of robotic applications such as autonomous driving, navigation or localisation. Instead of using offline trained classifiers for semantic segmentation, our approach employs a data-driven, nonparametric scene-parsing method that scales easily to large environments and generalises to different scenes. We use stereo image pairs collected from cameras mounted on a moving car to produce dense depth maps, which are combined into a global 3D reconstruction using camera poses from stereo visual odometry. Simultaneously, 2D automatic semantic segmentation using a nonparametric scene parsing method is fused into the 3D model. Furthermore, the resultant 3D semantic model is improved by taking into account moving objects in the scene. We demonstrate our method on the publicly available KITTI dataset and evaluate the performance against manually generated ground truth.

Relevance:

20.00%

Publisher:

Abstract:

Text categorisation is challenging because documents have a complex structure with heterogeneous, changing topics. The performance of text categorisation relies on the quality of samples, the effectiveness of document features, and the topic coverage of categories, depending on the strategy employed: supervised or unsupervised, single-labelled or multi-labelled. To deal with these reliability issues in text categorisation, we propose an unsupervised multi-labelled text categorisation approach that maps the local knowledge in documents to global knowledge in a world ontology to optimise the categorisation result. The conceptual framework of the approach consists of three modules: pattern mining for feature extraction, feature-subject mapping for categorisation, and concept generalisation for optimised categorisation. The approach was evaluated against typical text categorisation methods, using ground truth encoded by human experts, with promising results.

Relevance:

20.00%

Publisher:

Abstract:

Chinese modal particles feature prominently in Chinese people’s daily use of the language, but their pragmatic and semantic functions are elusive, as is commonly recognised by Chinese linguists and teachers of Chinese as a foreign language. This book originates from an extensive and intensive empirical study of the Chinese modal particle a (啊), one of the most frequently used modal particles in Mandarin Chinese. In order to capture all the uses and the underlying meanings of the particle, the author transcribed the first 20 episodes, about 20 hours in length, of the popular Chinese TV drama series Kewang ‘Expectations’, which yielded a corpus of more than 142,000 Chinese characters with a total of 1829 instances of the particle, all used in meaningful communicative situations. Within its context of use, every single occurrence of the particle was analysed in terms of its pragmatic and semantic contributions to the hosting utterance. On this basis, the core meanings were identified, which were seen as constituting the modal nature of the particle.

Relevance:

20.00%

Publisher:

Abstract:

This paper presents a novel framework for the unsupervised alignment of an ensemble of temporal sequences. This approach draws inspiration from the axiom that an ensemble of temporal signals stemming from the same source/class should have lower rank when "aligned" rather than "misaligned". Our approach shares similarities with recent state-of-the-art methods for unsupervised image ensemble alignment (e.g. RASL), which break the problem into a set of image alignment problems that have well-known solutions (e.g. the Lucas-Kanade algorithm). Similarly, we propose a strategy for decomposing the problem of temporal ensemble alignment into a set of independent sequence problems, which we claim can be solved reliably through Dynamic Time Warping (DTW). We demonstrate the utility of our method using the Cohn-Kanade+ dataset to align expression onset across multiple sequences, which allows us to automate the rapid discovery of event annotations.
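For readers unfamiliar with DTW, the pairwise building block named in this abstract can be sketched as below. This is the textbook dynamic-programming formulation for two 1-D sequences with an absolute-difference local cost, not the paper's rank-based ensemble method; the function name `dtw_distance` is an assumption made for this example.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two 1-D numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] matched to an already-used b element
                cost[i][j - 1],      # b[j-1] matched to an already-used a element
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]
```

Two sequences that differ only in timing, such as `[0, 1, 2]` and `[0, 1, 1, 2]`, get a distance of zero under this formulation, which is the property that makes DTW useful for aligning events such as expression onset.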

Relevance:

20.00%

Publisher:

Abstract:

Determining similarity between business process models has recently gained interest in the business process management community. So far, similarity has been addressed separately, either at the semantic or at the structural level of process models. Also, most of the contributions that measure similarity of process models assume an ideal case in which process models are enriched with semantics, that is, a description of the meaning of process model elements. However, in real life this requires a pre-processing phase that consumes heavy human effort and is often not feasible. In this paper we propose an automated approach for querying a business process model repository for structurally and semantically relevant models. Similar to a search on the Internet, a user formulates a BPMN-Q query and receives a list of process models ordered by relevance to the query. We provide a business process model search engine implementation for evaluation of the proposed approach.

Relevance:

20.00%

Publisher:

Abstract:

This thesis developed new search engine models that elicit the meaning behind the words found in documents and queries, rather than simply matching keywords. These new models were applied to searching medical records: an area where search is particularly challenging yet can have significant benefits to our society.

Relevance:

20.00%

Publisher:

Abstract:

Semantic space models of word meaning derived from co-occurrence statistics within a corpus of documents, such as the Hyperspace Analogous to Language (HAL) model, have been proposed in the past. While word similarity can be computed using these models, it is not clear how semantic spaces derived from different sets of documents can be compared. In this paper, we focus on this problem, and we revisit the proposal of using semantic subspace distance measurements [1]. In particular, we outline the research questions that still need to be addressed to investigate and validate these distance measures. Then, we describe our plans for future research.
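A semantic space of the HAL kind is built from windowed co-occurrence counts. The sketch below is a simplified illustration under stated assumptions: it records only preceding words within the window and weights each co-occurrence by `window - distance + 1`, so nearer words count more. The function name `hal_cooccurrence` and the dictionary-of-pairs output format are hypothetical choices for this example, not the model's reference implementation.

```python
from collections import defaultdict


def hal_cooccurrence(tokens, window=2):
    """Distance-weighted counts of words preceding each target within a window."""
    counts = defaultdict(float)
    for i, target in enumerate(tokens):
        for distance in range(1, window + 1):
            j = i - distance
            if j < 0:
                break  # no more preceding words
            # Closer neighbours receive larger weights.
            counts[(target, tokens[j])] += window - distance + 1
    return dict(counts)


# Example: each (target, preceding word) pair maps to its accumulated weight.
space = hal_cooccurrence("the cat sat on the mat".split(), window=2)
```

Spaces built this way from two different document sets have different vocabularies and count distributions, which is why comparing them directly, as the abstract notes, is not straightforward.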