965 resultados para Natural language processing (Computer science)


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Software corpora facilitate reproducibility of analyses, however, static analysis for an entire corpus still requires considerable effort, often duplicated unnecessarily by multiple users. Moreover, most corpora are designed for single languages increasing the effort for cross-language analysis. To address these aspects we propose Pangea, an infrastructure allowing fast development of static analyses on multi-language corpora. Pangea uses language-independent meta-models stored as object model snapshots that can be directly loaded into memory and queried without any parsing overhead. To reduce the effort of performing static analyses, Pangea provides out-of-the box support for: creating and refining analyses in a dedicated environment, deploying an analysis on an entire corpus, using a runner that supports parallel execution, and exporting results in various formats. In this tool demonstration we introduce Pangea and provide several usage scenarios that illustrate how it reduces the cost of analysis.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The domain of context-free languages has been extensively explored and there exist numerous techniques for parsing (all or a subset of) context-free languages. Unfortunately, some programming languages are not context-free. Using standard context-free parsing techniques to parse a context-sensitive programming language poses a considerable challenge. Im- plementors of programming language parsers have adopted various techniques, such as hand-written parsers, special lex- ers, or post-processing of an ambiguous parser output to deal with that challenge. In this paper we suggest a simple extension of a top-down parser with contextual information. Contrary to the tradi- tional approach that uses only the input stream as an input to a parsing function, we use a parsing context that provides ac- cess to a stream and possibly to other context-sensitive infor- mation. At a same time we keep the context-free formalism so a grammar definition stays simple without mind-blowing context-sensitive rules. We show that our approach can be used for various purposes such as indent-sensitive parsing, a high-precision island parsing or XML (with arbitrary el- ement names) parsing. We demonstrate our solution with PetitParser, a parsing-expression grammar based, top-down, parser combinator framework written in Smalltalk.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a shallow dialogue analysis model, aimed at human-human dialogues in the context of staff or business meetings. Four components of the model are defined, and several machine learning techniques are used to extract features from dialogue transcripts: maximum entropy classifiers for dialogue acts, latent semantic analysis for topic segmentation, or decision tree classifiers for discourse markers. A rule-based approach is proposed for solving cross-modal references to meeting documents. The methods are trained and evaluated thanks to a common data set and annotation format. The integration of the components into an automated shallow dialogue parser opens the way to multimodal meeting processing and retrieval applications.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a conceptual approach to enhance knowledge management by synchronizing mind maps and fuzzy cognitive maps. The use of mind maps allows taking advantage of human creativity, while the application of fuzzy cognitive maps enables to store information expressed in natural language. By applying cognitive computing, it makes possible to gather and extract relevant information out of a data pool. Therefore, this approach is supposed to give a framework that enhances knowledge management. To demonstrate the potential of this framework, a use case concerning the development of a smart city app is presented.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This article presents a multi-agent expert system (SMAF) , that allows the input of incidents which occur in different elements of the telecommunications area. SMAF interacts with experts and general users, and each agent with all the agents? community, recording the incidents and their solutions in a knowledge base, without the analysis of their causes. The incidents are expressed using keywords taken from natural language (originally Spanish) and their main concepts are recorded with their severities as the users express them. Then, there is a search of the best solution for each incident, being helped by a human operator using a distancenotions between them.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper the hardware implementation of an inner hair cell model is presented. Main features of the design are the use of Meddis’ transduction structure and the methodology for Design with Reusability. Which allows future migration to new hardware and design refinements for speech processing and custom-made hearing aids

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A new formalism, called Hiord, for defining type-free higherorder logic programming languages with predicate abstraction is introduced. A model theory, based on partial combinatory algebras, is presented, with respect to which the formalism is shown sound. A programming language built on a subset of Hiord, and its implementation are discussed. A new proposal for defining modules in this framework is considered, along with several examples.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This report addresses speculative parallelism (the assignment of spare processing resources to tasks which are not known to be strictly required for the successful completion of a computation) at the user and application level. At this level, the execution of a program is seen as a (dynamic) tree —a graph, in general. A solution for a problem is a traversal of this graph from the initial state to a node known to be the answer. Speculative parallelism then represents the assignment of resources to múltiple branches of this graph even if they are not positively known to be on the path to a solution. In highly non-deterministic programs the branching factor can be very high and a naive assignment will very soon use up all the resources. This report presents work assignment strategies other than the usual depth-first and breadth-first. Instead, best-first strategies are used. Since their definition is application-dependent, the application language contains primitives that allow the user (or application programmer) to a) indícate when intelligent OR-parallelism should be used; b) provide the functions that define "best," and c) indícate when to use them. An abstract architecture enables those primitives to perform the search in a "speculative" way, using several processors, synchronizing them, killing the siblings of the path leading to the answer, etc. The user is freed from worrying about these interactions. Several search strategies are proposed and their implementation issues are addressed. "Armageddon," a global pruning method, is introduced, together with both a software and a hardware implementation for it. The concepts exposed are applicable to áreas of Artificial Intelligence such as extensive expert systems, planning, game playing, and in general to large search problems. The proposed strategies, although showing promise, have not been evaluated by simulation or experimentation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Ciao Prolog incorporates a module system which allows sepárate compilation and sensible creation of standalone executables. We describe some of the main aspects of the Ciao modular compiler, ciaoc, which takes advantage of the characteristics of the Ciao Prolog module system to automatically perform sepárate and incremental compilation and efficiently build small, standalone executables with competitive run-time performance, ciaoc can also detect statically a larger number of programming errors. We also present a generic code processing library for handling modular programs, which provides an important part of the functionality of ciaoc. This library allows the development of program analysis and transformation tools in a way that is to some extent orthogonal to the details of module system design, and has been used in the implementation of ciaoc and other Ciao system tools. We also describe the different types of executables which can be generated by the Ciao compiler, which offer different tradeoffs between executable size, startup time, and portability, depending, among other factors, on the linking regime used (static, dynamic, lazy, etc.). Finally, we provide experimental data which illustrate these tradeoffs.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Detecting user affect automatically during real-time conversation is the main challenge towards our greater aim of infusing social intelligence into a natural-language mixed-initiative High-Fidelity (Hi-Fi) audio control spoken dialog agent. In recent years, studies on affect detection from voice have moved on to using realistic, non-acted data, which is subtler. However, it is more challenging to perceive subtler emotions and this is demonstrated in tasks such as labelling and machine prediction. This paper attempts to address part of this challenge by considering the role of user satisfaction ratings and also conversational/dialog features in discriminating contentment and frustration, two types of emotions that are known to be prevalent within spoken human-computer interaction. However, given the laboratory constraints, users might be positively biased when rating the system, indirectly making the reliability of the satisfaction data questionable. Machine learning experiments were conducted on two datasets, users and annotators, which were then compared in order to assess the reliability of these datasets. Our results indicated that standard classifiers were significantly more successful in discriminating the abovementioned emotions and their intensities (reflected by user satisfaction ratings) from annotator data than from user data. These results corroborated that: first, satisfaction data could be used directly as an alternative target variable to model affect, and that they could be predicted exclusively by dialog features. Second, these were only true when trying to predict the abovementioned emotions using annotator?s data, suggesting that user bias does exist in a laboratory-led evaluation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We present initial research regarding a system capable of generating novel card games. We furthermore propose a method for com- putationally analysing existing games of the same genre. Ultimately, we present a formalisation of card game rules, and a context-free grammar G cardgame capable of expressing the rules of a large variety of card games. Example derivations are given for the poker variant Texashold?em , Blackjack and UNO. Stochastic simulations are used both to verify the implementation of these well-known games, and to evaluate the results of new game rules derived from the grammar. In future work, this grammar will be used to evolve completely novel card games using a grammar- guided genetic program.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background Gray scale images make the bulk of data in bio-medical image analysis, and hence, the main focus of many image processing tasks lies in the processing of these monochrome images. With ever improving acquisition devices, spatial and temporal image resolution increases, and data sets become very large. Various image processing frameworks exists that make the development of new algorithms easy by using high level programming languages or visual programming. These frameworks are also accessable to researchers that have no background or little in software development because they take care of otherwise complex tasks. Specifically, the management of working memory is taken care of automatically, usually at the price of requiring more it. As a result, processing large data sets with these tools becomes increasingly difficult on work station class computers. One alternative to using these high level processing tools is the development of new algorithms in a languages like C++, that gives the developer full control over how memory is handled, but the resulting workflow for the prototyping of new algorithms is rather time intensive, and also not appropriate for a researcher with little or no knowledge in software development. Another alternative is in using command line tools that run image processing tasks, use the hard disk to store intermediate results, and provide automation by using shell scripts. Although not as convenient as, e.g. visual programming, this approach is still accessable to researchers without a background in computer science. However, only few tools exist that provide this kind of processing interface, they are usually quite task specific, and don’t provide an clear approach when one wants to shape a new command line tool from a prototype shell script. Results The proposed framework, MIA, provides a combination of command line tools, plug-ins, and libraries that make it possible to run image processing tasks interactively in a command shell and to prototype by using the according shell scripting language. Since the hard disk becomes the temporal storage memory management is usually a non-issue in the prototyping phase. By using string-based descriptions for filters, optimizers, and the likes, the transition from shell scripts to full fledged programs implemented in C++ is also made easy. In addition, its design based on atomic plug-ins and single tasks command line tools makes it easy to extend MIA, usually without the requirement to touch or recompile existing code. Conclusion In this article, we describe the general design of MIA, a general purpouse framework for gray scale image processing. We demonstrated the applicability of the software with example applications from three different research scenarios, namely motion compensation in myocardial perfusion imaging, the processing of high resolution image data that arises in virtual anthropology, and retrospective analysis of treatment outcome in orthognathic surgery. With MIA prototyping algorithms by using shell scripts that combine small, single-task command line tools is a viable alternative to the use of high level languages, an approach that is especially useful when large data sets need to be processed.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper describes the application of language translation technologies for generating bus information in Spanish Sign Language (LSE: Lengua de Signos Española). In this work, two main systems have been developed: the first for translating text messages from information panels and the second for translating spoken Spanish into natural conversations at the information point of the bus company. Both systems are made up of a natural language translator (for converting a word sentence into a sequence of LSE signs), and a 3D avatar animation module (for playing back the signs). For the natural language translator, two technological approaches have been analyzed and integrated: an example-based strategy and a statistical translator. When translating spoken utterances, it is also necessary to incorporate a speech recognizer for decoding the spoken utterance into a word sequence, prior to the language translation module. This paper includes a detailed description of the field evaluation carried out in this domain. This evaluation has been carried out at the customer information office in Madrid involving both real bus company employees and deaf people. The evaluation includes objective measurements from the system and information from questionnaires. In the field evaluation, the whole translation presents an SER (Sign Error Rate) of less than 10% and a BLEU greater than 90%.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Everybody has to coordinate several tasks everyday, usually in a manual manner. Recently, the concept of Task Automation Services has been introduced to automate and personalize the task coordination problem. Several user centered platforms and applications have arisen in the last years, that let their users configure their very own automations based on third party services. In this paper, we propose a new system architecture for Task Automation Services in a heterogeneous mobile, smart devices, and cloud services environment. Our architecture is based on the novel idea to employ distributed Complex Event Processing to implement innovative mixed execution profiles. The major advantage of the approach is its ability to incorporate context-awareness and real-time coordination in Task Automation Services.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a Focused Crawler in order to Get Semantic Web Resources (CSR). Structured data web are available in formats such as Extensible Markup Language (XML), Resource Description Framework (RDF) and Ontology Web Language (OWL) that can be used for processing. One of the main challenges for performing a manual search and download semantic web resources is that this task consumes a lot of time. Our research work propose a focused crawler which allow to download these resources automatically and store them on disk in order to have a collection that will be used for data processing. CRS consists of three layers: (a) The User Interface Layer, (b) The Focus Crawler Layer and (c) The Base Crawler Layer. CSR uses as a selection policie the Shark-Search method. CSR was conducted with two experiments. The first one starts on December 15 2012 at 7:11 am and ends on December 16 2012 at 4:01 were obtained 448,123,537 bytes of data. The CSR ends by itself after to analyze 80,4375 seeds with an unlimited depth. CSR got 16,576 semantic resources files where the 89 % was RDF, the 10 % was XML and the 1% was OWL. The second one was based on the Web Data Commons work of the Research Group Data and Web Science at the University of Mannheim and the Institute AIFB at the Karlsruhe Institute of Technology. This began at 4:46 am of June 2 2013 and 1:37 am June 9 2013. After 162.51 hours of execution the result was 285,279 semantic resources where predominated the XML resources with 99 % and OWL and RDF with 1 % each one.