929 results for second language processing


Relevance:

80.00% 80.00%

Publisher:

Abstract:

We present a new method for term extraction from a domain-relevant corpus using natural language processing, for the purpose of semi-automatic ontology learning. The literature shows that topical words occur in bursts. We find that the ranking of extracted terms is insensitive to the choice of population model, but calculating frequencies relative to the burst size rather than the document length in words yields significantly different results.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Social streams have proven to be the most up-to-date and inclusive information on current events. In this paper we propose a novel probabilistic modelling framework, called the violence detection model (VDM), which enables the identification of text containing violent content and the extraction of violence-related topics from social media data. The proposed VDM does not require any labeled corpora for training; instead, it only needs the incorporation of word prior knowledge, which captures whether a word indicates violence or not. We propose a novel approach to deriving word prior knowledge using the relative entropy measurement of words, based on the intuition that low-entropy words are indicative of semantically coherent topics and are therefore more informative, while high-entropy words are words whose usage is more topically diverse and are therefore less informative. The proposed VDM has been evaluated on the TREC Microblog 2011 dataset to identify topics related to violence. Experimental results show that deriving word priors using our proposed relative entropy method is more effective than the widely used information gain method. Moreover, VDM gives higher violence classification results and produces more coherent violence-related topics compared to a few competitive baselines.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Yorick Wilks is a central figure in the fields of Natural Language Processing and Artificial Intelligence. His influence extends to many areas of these fields and includes contributions to Machine Translation, word sense disambiguation, dialogue modeling and Information Extraction. This book celebrates the work of Yorick Wilks from the perspective of his peers. It consists of original chapters, each of which analyses an aspect of his work and links it to current thinking in that area. His work has spanned over four decades but is shown to be pertinent to recent developments in language processing such as the Semantic Web. This volume forms a two-part set together with Words and Intelligence I, Selected Works by Yorick Wilks, by the same editors.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This thesis examines the main aim of teaching pronunciation in second language acquisition in the Syrian context. In other words, it investigates the desirable end point: a native-like accent or intelligible pronunciation. It also investigates the factors that affect native-like pronunciation and intelligible accent, analyses English language teaching methods, and examines the currently used English pronunciation course in detail. The aim is to identify the learners’ aim in pronunciation, the best teaching method for achieving that aim, and the most appropriate course book for fulfilling it. To find out the learners’ aim in pronunciation, qualitative research is undertaken. The research draws on some aspects of case study and is supported by a questionnaire to gather data. The result of this research can be regarded as an attempt to bring the Syrian context in line with current trends in the teaching of English pronunciation. The results show that learners are satisfied with intelligible pronunciation. The currently used teaching method (the grammar-translation method) may be better replaced by the communicative approach, which is more appropriate. It would also be more effective to replace the currently used book with a new one that corresponds to that aim. Current theories and issues in teaching English pronunciation that support learners’ intelligibility will be taken into account in the newly proposed course book.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Corpora (large collections of written and/or spoken text stored and accessed electronically) provide the means of investigating language that is of growing importance academically and professionally. Corpora are now routinely used in the following fields: the production of dictionaries and other reference materials; the development of aids to translation; language teaching materials; the investigation of ideologies and cultural assumptions; natural language processing; and the investigation of all aspects of linguistic behaviour, including vocabulary, grammar and pragmatics.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Listening is typically the first language skill to develop in first language (L1) users and has been recognized as a basic and fundamental tool for communication. Despite the importance of listening, aural abilities are often taken for granted, and many people overlook their dependency on listening and the complexities that combine to enable this multi-faceted skill. When second language (L2) students are learning their new language, listening is crucial, as it provides access to oral input and facilitates social interaction. Yet L2 students find listening challenging, and L2 teachers often lack sufficient pedagogy to help learners develop listening abilities that they can use in and beyond the classroom. In an effort to provide a pedagogic alternative to more traditional and limited L2 listening instruction, this thesis investigated the viability of listening strategy instruction (LSI) over three semesters at a private university in Japan through a qualitative action research (AR) intervention. An LSI program was planned and implemented with six classes over the course of three AR phases. Two teachers used the LSI with 121 learners throughout the project. Following each AR phase, student and teacher perceptions of the methodology were investigated via questionnaires and interviews, which were primary data collection methods. Secondary research methods (class observations, pre/post-semester test scores, and a research journal) supplemented the primary methods. Data were analyzed and triangulated for emerging themes related to participants’ perceptions of LSI and the viability thereof. These data showed consistent positive perceptions of LSI on the parts of both learners and teachers, although some aspects of LSI required additional refinement. This project provided insights on LSI specific to the university context in Japan and also produced principles for LSI program planning and implementation that can inform the broader L2 education community.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

The Universal Networking Language (UNL) is an interlingua designed to be the base of several natural language processing systems that aim to support multilinguality on the Internet. One of the main components of the language is the dictionary of Universal Words (UWs), which links the vocabularies of the different languages involved in the project. As with any NLP system, the coverage and accuracy of its lexical resources are crucial for the development of the system. In this paper, the authors describe how a large-coverage UW dictionary was automatically created, based on an existing and well-known resource, the English WordNet. Other aspects, such as implementation details and the evaluation of the final UW set, are also described.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

In this report we summarize the state of the art of speech emotion recognition from the signal processing point of view. On the basis of multi-corpus experiments with machine-learning classifiers, we observe that existing approaches to supervised machine learning lead to database-dependent classifiers, which cannot be applied to multi-language speech emotion recognition without additional training because they discriminate the emotion classes according to the training language used. As there are experimental results showing that humans can perform language-independent categorisation, we drew a parallel between machine recognition and the cognitive process and tried to discover the sources of these divergent results. The analysis suggests that the main difference is that speech perception allows the extraction of language-independent features, although language-dependent features are incorporated at all levels of the speech signal and play a strong discriminative role in human perception. Based on several results in related domains, we suggest that, in addition, the cognitive process of emotion recognition is based on categorisation, assisted by some hierarchical structure of the emotional categories that exists in the cognitive space of all humans. We propose a strategy for developing language-independent machine emotion recognition, related to the identification of language-independent speech features and the use of additional information from visual (expression) features.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

In recent years, learning word vector representations has attracted much interest in Natural Language Processing. Word representations or embeddings learned using unsupervised methods help address the problem of traditional bag-of-words approaches, which fail to capture contextual semantics. In this paper we go beyond vector representations at the word level and propose a novel framework that learns higher-level feature representations of n-grams, phrases and sentences using a deep neural network built from stacked Convolutional Restricted Boltzmann Machines (CRBMs). These representations have been shown to map syntactically and semantically related n-grams to nearby locations in the hidden feature space. We additionally experimented with incorporating these higher-level features into supervised classifier training for two sentiment analysis tasks: subjectivity classification and sentiment classification. Our results demonstrate the success of the proposed framework, with a 4% improvement in accuracy for subjectivity classification and improved results for sentiment classification over models trained without our higher-level features.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Modern technology has moved on and completely changed the way that people can use the telephone or mobile to dialogue with information held on computers. Well-developed “written speech analysis” does not work with “verbal speech”. The main purpose of our article is, firstly, to highlight the problems and, secondly, to show the possible ways to solve these problems.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Linguistic theory and cognitive, information, and mathematical modeling are all useful as we attempt to achieve a better understanding of the Language Faculty (LF). This cross-disciplinary approach will eventually lead to the identification of the key principles applicable in Natural Language Processing systems. The present work concentrates on the syntax-semantics interface. We start from recursive definitions and the application of optimization principles, and gradually develop a formal model of syntactic operations. The result, a Fibonacci-like syntactic tree, is in fact an argument-based variant of natural language syntax. This representation (the argument-centered model, ACM) is derived by a recursive calculus that generates a mode which connects arguments and expresses relations between them. The reiterative operation assigns the primary role to entities as the key components of syntactic structure. We provide experimental evidence in support of the argument-based model. We also show that the mental computation of syntax is influenced by the inter-conceptual relations between the images of entities in a semantic space.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

This work is devoted to the development of a computer-aided system for semantic text analysis of technical specifications. Its purpose is to increase the efficiency of software engineering by automating the semantic analysis of technical specification text. A model for analysing the text of a technical project is proposed and investigated; an attribute grammar of the technical specification, intended to formalize a restricted subset of Russian, is constructed for analysing the sentences of the specification text; the stylistic features of the technical project as a class of documents are considered; and recommendations on preparing technical specification text for automated processing are formulated. The computer-aided system of semantic text analysis of a technical specification is then described. It consists of the following subsystems: preliminary text processing, syntactic and semantic analysis and construction of software models, document storage, and the interface.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

An approach to summarizing natural-language texts using a three-level ontology of associations is considered. The proposed ontology structure improves the coherence of the summary.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

The article considers a formal approach and the main content of the methodology of formalized design.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Ontolinguistic systems are aimed at solving complex natural language processing tasks that require semantic knowledge. The design of ontolinguistic systems is based on processes of coordinated interaction between ontological and linguistic models. The article discusses ontology-based methods for solving linguistic tasks, developed during the design of the specialized ontolinguistic system «ЛоТА», which is intended for analysing specialized technical texts «Логика работы системы... » ("System operation logic...").