32 results for natural language processing


Relevance:

100.00% 100.00%

Publisher:

Abstract:

We present a practical approach to Natural Language Generation (NLG) for spoken dialogue systems. The approach is based on small template fragments (mini-templates). The system’s object architecture facilitates generation of phrases across pre-defined business domains and registers, as well as into different languages. The architecture simplifies NLG in well-understood application contexts, while giving both the developer and the system the flexibility to vary linguistic output according to dialogue context, including any intended affective impact. Mini-templates are used with a suite of domain term objects, resulting in an NLG system (MINTGEN – MINi-Template GENerator) whose extensibility and ease of maintenance are enhanced by the sparsity of information devoted to individual domains. The system also avoids the need for specialist linguistic competence on the part of the system maintainer.
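
The idea of combining small template fragments with per-domain term objects can be illustrated with a minimal sketch. This is not MINTGEN's actual design: the template table, the domain term dictionaries, and the `generate` function are all hypothetical names chosen for illustration.

```python
# Hypothetical mini-templates, keyed by (dialogue act, register).
MINI_TEMPLATES = {
    ("confirm", "formal"): "I have booked {item} for {date}.",
    ("confirm", "casual"): "OK, {item} is booked for {date}.",
}

# Hypothetical domain term objects: a small vocabulary per business domain.
DOMAIN_TERMS = {
    "cinema": {"item": "your ticket"},
    "hotel": {"item": "your room"},
}


def generate(act, register, domain, **slots):
    """Fill the mini-template for (act, register) with domain terms and slots."""
    template = MINI_TEMPLATES[(act, register)]
    # Dialogue-specific slots override the domain defaults if both are given.
    values = {**DOMAIN_TERMS[domain], **slots}
    return template.format(**values)


print(generate("confirm", "formal", "hotel", date="Friday"))
print(generate("confirm", "casual", "cinema", date="tonight"))
```

In this sketch, adding a domain only requires a new entry in `DOMAIN_TERMS`, which mirrors the abstract's point about the sparsity of per-domain information.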

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper outlines the design and development of a Java-based, unified and flexible natural language dialogue system that enables users to interact using natural language, e.g. speech. A number of software development issues are considered with the aim of designing an architecture that enables different discourse components to be readily and flexibly combined in a manner that permits information to be easily shared. Use of XML schemas assists this component interaction. The paper describes how a range of Java language features were employed to support the development of the architecture, providing an illustration of how a modern programming language makes tractable the development of a complex dialogue system.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In previous papers, we presented a logic-based framework, built on fusion rules, for merging structured news reports. Structured news reports are XML documents in which the text entries are restricted to individual words or simple phrases, such as names and domain-specific terminology, and numbers and units. We assume structured news reports do not require natural language processing. Fusion rules are a form of scripting language that defines how structured news reports should be merged. The antecedent of a fusion rule is a call to investigate the information in the structured news reports and the background knowledge, and the consequent of a fusion rule is a formula specifying an action to be undertaken to form a merged report. It is expected that a set of fusion rules is defined for any given application. In this paper we extend the approach to handle probability values, degrees of belief, or necessity measures associated with text entries in the news reports. We present the formal definition for each of these types of uncertainty and explain how they can be handled using fusion rules. We also discuss methods for detecting inconsistencies among sources.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper contributes a new approach for developing UML software designs from Natural Language (NL), making use of a meta-domain oriented ontology, well-established software design principles and Natural Language Processing (NLP) tools. In the approach described here, banks of grammatical rules are used to assign event flows from essential use cases. A domain-specific ontology is also constructed, permitting semantic mapping between the NL input and the modeled domain. Rules based on the widely used General Responsibility Assignment Software Principles (GRASP) are then applied to derive behavioral models.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Identifying class responsibilities during the object-oriented software design phase is a crucial task. This paper proposes an approach for producing high-quality and robust behavioural diagrams (e.g. Sequence Diagrams) through Class Responsibility Assignment (CRA). GRASP, or General Responsibility Assignment Software Pattern (or Principle), was used to direct the CRA process when deriving behavioural diagrams. A set of tools to support CRA was developed to provide designers and developers with a cognitive toolkit that can be used when analysing and designing object-oriented software. The tool developed is called Use Case Specification to Sequence Diagrams (UC2SD). UC2SD uses a new approach for developing Unified Modelling Language (UML) software designs from Natural Language, making use of a meta-domain oriented ontology, well-established software design principles and established Natural Language Processing (NLP) tools. UC2SD generates well-formed UML sequence diagrams as output.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The past decade has witnessed unprecedented growth in the amount of available digital content, and its volume is expected to continue to grow over the next few years. Unstructured text data generated from web and enterprise sources forms a large fraction of such content. Many of these sources contain large volumes of reusable data, such as solutions to frequently occurring problems and general know-how that may be reused in appropriate contexts. In this work, we address issues around leveraging unstructured text data from sources as diverse as the web and the enterprise within the Case-based Reasoning framework. Case-based Reasoning (CBR) provides a framework and methodology for the systematic reuse of historical knowledge, available in the form of problem-solution pairs, in solving new problems. Here, we consider possibilities for enhancing Textual CBR systems under three main themes: procurement, maintenance and retrieval. We adapt and build upon state-of-the-art techniques from data mining and natural language processing in addressing various challenges therein. Under procurement, we investigate the problem of extracting cases (i.e., problem-solution pairs) from data sources such as incident/experience reports. We develop case-base maintenance methods specifically tuned to text and targeted towards retaining solutions such that the utility of the filtered case base in solving new problems is maximized. Further, we address the problem of query suggestions for textual case-bases and show that exploiting the problem-solution partition can enhance retrieval effectiveness by prioritizing more useful query suggestions. Additionally, we illustrate interpretable clustering as a tool to drill down into domain-specific text collections (since CBR systems are usually very domain specific) and develop techniques for improved similarity assessment in social media sources such as microblogs. Through extensive empirical evaluations, we illustrate the improvements that we are able to achieve over the state-of-the-art methods for the respective tasks.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Community-driven Question Answering (CQA) systems crowdsource experiential information in the form of questions and answers and have accumulated valuable reusable knowledge. Clustering of QA datasets from CQA systems provides a means of organizing the content to ease tasks such as manual curation and tagging. In this paper, we present a clustering method that exploits the two-part question-answer structure in QA datasets to improve clustering quality. Our method, MixKMeans, composes question and answer space similarities in such a way that the space on which the match is higher is allowed to dominate. This construction is motivated by our observation that semantic similarity between question-answer data (QAs) could get localized in either space. We empirically evaluate our method on a variety of real-world labeled datasets. Our results indicate that our method significantly outperforms state-of-the-art clustering methods for the task of clustering question-answer archives.
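
One way to let the better-matching space dominate a combined similarity is a power mean with an exponent greater than 1. The sketch below is an assumption made for illustration only; the abstract does not give the MixKMeans formula, and the function name `dominant_mix` and the parameter `p` are hypothetical.

```python
def dominant_mix(sim_q, sim_a, p=4.0):
    """Combine question-space and answer-space similarities in [0, 1].

    Uses a power mean (an assumed stand-in, not the paper's formula):
    p = 1 gives the plain average, and larger p pulls the result
    towards the larger of the two similarities.
    """
    return ((sim_q ** p + sim_a ** p) / 2.0) ** (1.0 / p)


# A strong match in the question space and a weak one in the answer space.
mixed = dominant_mix(0.9, 0.1)
average = (0.9 + 0.1) / 2.0
print(mixed > average)  # the higher-matching space carries more weight
```

With equal similarities in both spaces the power mean returns that common value, so the composition only departs from the average when the two spaces disagree.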

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Due to its efficiency and simplicity, the finite-difference time-domain method is becoming a popular choice for solving wideband, transient problems in various fields of acoustics. So far, the issue of extracting a binaural response from finite difference simulations has only been discussed in the context of embedding a listener geometry in the grid. In this paper, we propose and study a method for binaural response rendering based on a spatial decomposition of the sound field. The finite difference grid is locally sampled using a volumetric array of receivers, from which a plane wave density function is computed and integrated with free-field head-related transfer functions in the spherical harmonics domain. The volumetric array is studied in terms of numerical robustness and spatial aliasing. Analytic formulas that predict the performance of the array are developed, facilitating spatial resolution analysis and numerical binaural response analysis for a number of finite difference schemes. Particular emphasis is placed on the effects of numerical dispersion on array processing and on the resulting binaural responses. Our method is compared to a binaural simulation based on the image method. Results indicate good spatial and temporal agreement between the two methods.

Relevance:

80.00% 80.00%

Publisher:

Relevance:

80.00% 80.00%

Publisher:

Abstract:

In this paper, a complete method for finite-difference time-domain modeling of rooms in 2-D using compact explicit schemes is presented. A family of interpolated schemes using a rectilinear, nonstaggered grid is reviewed, and the most accurate and isotropic schemes are identified. Frequency-dependent boundaries are modeled using a digital impedance filter formulation that is consistent with locally reacting surface theory. A structurally stable and efficient boundary formulation is constructed by carefully combining the boundary condition with the interpolated scheme. An analytic prediction formula for the effective numerical reflectance is given, and a stability proof is provided. The results indicate that the identified accurate and isotropic schemes are also very accurate in terms of numerical boundary reflectance, and outperform directly related methods such as Yee's scheme and the standard digital waveguide mesh. In addition, one particular scheme, referred to here as the interpolated wideband scheme, is suggested as the best scheme for most applications.
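
For orientation, the basic explicit update on a rectilinear, nonstaggered 2-D grid can be sketched as follows. This is the standard second-order scheme for the 2-D wave equation, not one of the interpolated schemes from the paper, and it assumes fixed zero-pressure edges instead of the frequency-dependent impedance boundaries described above. The names `step` and `simulate` are hypothetical, and `lam2` denotes the squared Courant number, which must not exceed 0.5 for this scheme to be stable in 2-D.

```python
def step(p_prev, p_curr, lam2):
    """Advance the pressure field by one time step.

    Edge nodes are held at zero (an assumed, simplified boundary).
    """
    rows, cols = len(p_curr), len(p_curr[0])
    p_next = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            neighbours = (p_curr[i + 1][j] + p_curr[i - 1][j]
                          + p_curr[i][j + 1] + p_curr[i][j - 1])
            p_next[i][j] = (lam2 * neighbours
                            + 2.0 * (1.0 - 2.0 * lam2) * p_curr[i][j]
                            - p_prev[i][j])
    return p_next


def simulate(size=21, steps=10, lam2=0.5):
    """Propagate a unit impulse placed at the centre of a square grid."""
    p_prev = [[0.0] * size for _ in range(size)]
    p_curr = [[0.0] * size for _ in range(size)]
    centre = size // 2
    p_curr[centre][centre] = 1.0
    for _ in range(steps):
        p_prev, p_curr = p_curr, step(p_prev, p_curr, lam2)
    return p_curr


field = simulate(steps=5)
print(len(field), len(field[0]))
```

Interpolated schemes of the kind reviewed in the abstract extend this stencil with diagonal neighbours; that extension is not shown here.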