27 resultados para Natural language techniques, Semantic spaces, Random projection, Documents


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Sparse representation based visual tracking approaches have attracted increasing interests in the community in recent years. The main idea is to linearly represent each target candidate using a set of target and trivial templates while imposing a sparsity constraint onto the representation coefficients. After we obtain the coefficients using L1-norm minimization methods, the candidate with the lowest error, when it is reconstructed using only the target templates and the associated coefficients, is considered as the tracking result. In spite of promising system performance widely reported, it is unclear if the performance of these trackers can be maximised. In addition, computational complexity caused by the dimensionality of the feature space limits these algorithms in real-time applications. In this paper, we propose a real-time visual tracking method based on structurally random projection and weighted least squares techniques. In particular, to enhance the discriminative capability of the tracker, we introduce background templates to the linear representation framework. To handle appearance variations over time, we relax the sparsity constraint using a weighed least squares (WLS) method to obtain the representation coefficients. To further reduce the computational complexity, structurally random projection is used to reduce the dimensionality of the feature space while preserving the pairwise distances between the data points in the feature space. Experimental results show that the proposed approach outperforms several state-of-the-art tracking methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present a practical approach to Natural Language Generation (NLG) for spoken dialogue systems. The approach is based on small template fragments (mini-templates). The system’s object architecture facilitates generation of phrases across pre-defined business domains and registers, as well as into different languages. The architecture simplifies NLG in well-understood application contexts, while providing the flexibility for a developer and for the system, to vary linguistic output according to dialogue context, including any intended affective impact. Mini-templates are used with a suite of domain term objects, resulting in an NLG system (MINTGEN – MINi-Template GENerator) whose extensibility and ease of maintenance is enhanced by the sparsity of information devoted to individual domains. The system also avoids the need for specialist linguistic competence on the part of the system maintainer.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper outlines the design and development of a Java-based, unified and flexible natural language dialogue system that enables users to interact using natural language, e.g. speech. A number of software development issues are considered with the aim of designing an architecture that enables different discourse components to be readily and flexibly combined in a manner that permits information to be easily shared. Use of XML schemas assists this component interaction. The paper describes how a range of Java language features were employed to support the development of the architecture, providing an illustration of how a modern programming language makes tractable the development of a complex dialogue system.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present three natural language marking strategies based on fast and reliable shallow parsing techniques, and on widely available lexical resources: lexical substitution, adjective conjunction swaps, and relativiser switching. We test these techniques on a random sample of the British National Corpus. Individual candidate marks are checked for goodness of structural and semantic fit, using both lexical resources, and the web as a corpus. A representative sample of marks is given to 25 human judges to evaluate for acceptability and preservation of meaning. This establishes a correlation between corpus based felicity measures and perceived quality, and makes qualified predictions. Grammatical acceptability correlates with our automatic measure strongly (Pearson's r = 0.795, p = 0.001), allowing us to account for about two thirds of variability in human judgements. A moderate but statistically insignificant (Pearson's r = 0.422, p = 0.356) correlation is found with judgements of meaning preservation, indicating that the contextual window of five content words used for our automatic measure may need to be extended. © 2007 SPIE-IS&T.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In most previous research on distributional semantics, Vector Space Models (VSMs) of words are built either from topical information (e.g., documents in which a word is present), or from syntactic/semantic types of words (e.g., dependency parse links of a word in sentences), but not both. In this paper, we explore the utility of combining these two representations to build VSM for the task of semantic composition of adjective-noun phrases. Through extensive experiments on benchmark datasets, we find that even though a type-based VSM is effective for semantic composition, it is often outperformed by a VSM built using a combination of topic- and type-based statistics. We also introduce a new evaluation task wherein we predict the composed vector representation of a phrase from the brain activity of a human subject reading that phrase. We exploit a large syntactically parsed corpus of 16 billion tokens to build our VSMs, with vectors for both phrases and words, and make them publicly available.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The past decade had witnessed an unprecedented growth in the amount of available digital content, and its volume is expected to continue to grow the next few years. Unstructured text data generated from web and enterprise sources form a large fraction of such content. Many of these contain large volumes of reusable data such as solutions to frequently occurring problems, and general know-how that may be reused in appropriate contexts. In this work, we address issues around leveraging unstructured text data from sources as diverse as the web and the enterprise within the Case-based Reasoning framework. Case-based Reasoning (CBR) provides a framework and methodology for systematic reuse of historical knowledge that is available in the form of problemsolution
pairs, in solving new problems. Here, we consider possibilities of enhancing Textual CBR systems under three main themes: procurement, maintenance and retrieval. We adapt and build upon the stateof-the-art techniques from data mining and natural language processing in addressing various challenges therein. Under procurement, we investigate the problem of extracting cases (i.e., problem-solution pairs) from data sources such as incident/experience
reports. We develop case-base maintenance methods specifically tuned to text targeted towards retaining solutions such that the utility of the filtered case base in solving new problems is maximized. Further, we address the problem of query suggestions for textual case-bases and show that exploiting the problem-solution partition can enhance retrieval effectiveness by prioritizing more useful query suggestions. Additionally, we illustrate interpretable clustering as a tool to drill-down to domain specific text collections (since CBR systems are usually very domain specific) and develop techniques for improved similarity assessment in social media sources such as microblogs. Through extensive empirical evaluations, we illustrate the improvements that we are able to
achieve over the state-of-the-art methods for the respective tasks.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In previous papers, we have presented a logic-based framework based on fusion rules for merging structured news reports. Structured news reports are XML documents, where the textentries are restricted to individual words or simple phrases, such as names and domain-specific terminology, and numbers and units. We assume structured news reports do not require natural language processing. Fusion rules are a form of scripting language that define how structured news reports should be merged. The antecedent of a fusion rule is a call to investigate the information in the structured news reports and the background knowledge, and the consequent of a fusion rule is a formula specifying an action to be undertaken to form a merged report. It is expected that a set of fusion rules is defined for any given application. In this paper we extend the approach to handling probability values, degrees of beliefs, or necessity measures associated with textentries in the news reports. We present the formal definition for each of these types of uncertainty and explain how they can be handled using fusion rules. We also discuss the methods of detecting inconsistencies among sources.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Side-channel attacks (SCA) threaten electronic cryptographic devices and can be carried out by monitoring the physical characteristics of security circuits. Differential Power Analysis (DPA) is one the most widely studied side-channel attacks. Numerous countermeasure techniques, such as Random Delay Insertion (RDI), have been proposed to reduce the risk of DPA attacks against cryptographic devices. The RDI technique was first proposed for microprocessors but it was shown to be unsuccessful when implemented on smartcards as it was vulnerable to a variant of the DPA attack known as the Sliding-Window DPA attack.Previous research by the authors investigated the use of the RDI countermeasure for Field Programmable Gate Array (FPGA) based cryptographic devices. A split-RDI technique wasproposed to improve the security of the RDI countermeasure. A set of critical parameters wasalso proposed that could be utilized in the design stage to optimize a security algorithm designwith RDI in terms of area, speed and power. The authors also showed that RDI is an efficientcountermeasure technique on FPGA in comparison to other countermeasures.In this article, a new RDI logic design is proposed that can be used to cost-efficiently implementRDI on FPGA devices. Sliding-Window DPA and realignment attacks, which were shown to beeffective against RDI implemented on smartcard devices, are performed on the improved RDIFPGA implementation. We demonstrate that these attacks are unsuccessful and we also proposea realignment technique that can be used to demonstrate the weakness of RDI implementations.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper describes a data model for content representation of temporal media in an IP based sensor network. The model is formed by introducing the idea of semantic-role from linguistics into the underlying concepts of formal event representation with the aim of developing a common event model. The architecture of a prototype system for a multi camera surveillance system, based on the proposed model is described. The important aspects of the proposed model are its expressiveness, its ability to model content of temporal media, and its suitability for use with a natural language interface. It also provides a platform for temporal information fusion, as well as organizing sensor annotations by help of ontologies.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Decision making is an important element throughout the life-cycle of large-scale projects. Decisions are critical as they have a direct impact upon the success/outcome of a project and are affected by many factors including the certainty and precision of information. In this paper we present an evidential reasoning framework which applies Dempster-Shafer Theory and its variant Dezert-Smarandache Theory to aid decision makers in making decisions where the knowledge available may be imprecise, conflicting and uncertain. This conceptual framework is novel as natural language based information extraction techniques are utilized in the extraction and estimation of beliefs from diverse textual information sources, rather than assuming these estimations as already given. Furthermore we describe an algorithm to define a set of maximal consistent subsets before fusion occurs in the reasoning framework. This is important as inconsistencies between subsets may produce results which are incorrect/adverse in the decision making process. The proposed framework can be applied to problems involving material selection and a Use Case based in the Engineering domain is presented to illustrate the approach. © 2013 Elsevier B.V. All rights reserved.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the natural world, camouflage is habitually deployed by 'vulnerable' creatures to deceive predators. Such protective strategies have been culturally, socially and technologically translated into human societies, whereby camouflage has been used to mask intentions, actions, feelings and valuable objects or spaces. Through the material presence of such techniques, everyday spaces can become inscribed as places of sanctuary. Focusing on British civil camouflage work of the 1930s and 1940s, this paper explores the historical, cultural and political connotations of camouflage and how the attainment of invisibility, as a 'weapon of the weak', can both physically and affectively protect urban populations. © 2012 Copyright Taylor and Francis Group, LLC.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper contributes a new approach for developing UML software designs from Natural Language (NL), making use of a meta-domain oriented ontology, well established software design principles and Natural Language Processing (NLP) tools. In the approach described here, banks of grammatical rules are used to assign event flows from essential use cases. A domain specific ontology is also constructed, permitting semantic mapping between the NL input and the modeled domain. Rules based on the widely-used General Responsibility Assignment Software Principles (GRASP) are then applied to derive behavioral models.