911 resultados para Machine Learning,Natural Language Processing,Descriptive Text Mining,POIROT,Transformer


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Existe um problema de representação em processamento de linguagem natural, pois uma vez que o modelo tradicional de bag-of-words representa os documentos e as palavras em uma unica matriz, esta tende a ser completamente esparsa. Para lidar com este problema, surgiram alguns métodos que são capazes de representar as palavras utilizando uma representação distribuída, em um espaço de dimensão menor e mais compacto, inclusive tendo a propriedade de relacionar palavras de forma semântica. Este trabalho tem como objetivo utilizar um conjunto de documentos obtido através do projeto Media Cloud Brasil para aplicar o modelo skip-gram em busca de explorar relações e encontrar padrões que facilitem na compreensão do conteúdo.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Thesis (Master's)--University of Washington, 2016-06

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper summarizes the scientific work presented at the 32nd European Conference on Information Retrieval. It demonstrates that information retrieval (IR) as a research area continues to thrive with progress being made in three complementary sub-fields, namely IR theory and formal methods together with indexing and query representation issues, furthermore Web IR as a primary application area and finally research into evaluation methods and metrics. It is the combination of these areas that gives IR its solid scientific foundations. The paper also illustrates that significant progress has been made in other areas of IR. The keynote speakers addressed three such subject fields, social search engines using personalization and recommendation technologies, the renewed interest in applying natural language processing to IR, and multimedia IR as another fast-growing area.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The main objective of the project is to enhance the already effective health-monitoring system (HUMS) for helicopters by analysing structural vibrations to recognise different flight conditions directly from sensor information. The goal of this paper is to develop a new method to select those sensors and frequency bands that are best for detecting changes in flight conditions. We projected frequency information to a 2-dimensional space in order to visualise flight-condition transitions using the Generative Topographic Mapping (GTM) and a variant which supports simultaneous feature selection. We created an objective measure of the separation between different flight conditions in the visualisation space by calculating the Kullback-Leibler (KL) divergence between Gaussian mixture models (GMMs) fitted to each class: the higher the KL-divergence, the better the interclass separation. To find the optimal combination of sensors, they were considered in pairs, triples and groups of four sensors. The sensor triples provided the best result in terms of KL-divergence. We also found that the use of a variational training algorithm for the GMMs gave more reliable results.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Universal Networking Language (UNL) is an interlingua designed to be the base of several natural language processing systems aiming to support multilinguality in internet. One of the main components of the language is the dictionary of Universal Words (UWs), which links the vocabularies of the different languages involved in the project. As any NLP system, coverage and accuracy in its lexical resources are crucial for the development of the system. In this paper, the authors describes how a large coverage UWs dictionary was automatically created, based on an existent and well known resource like the English WordNet. Other aspects like implementation details and the evaluation of the final UW set are also depicted.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Linguistic theory, cognitive, information, and mathematical modeling are all useful while we attempt to achieve a better understanding of the Language Faculty (LF). This cross-disciplinary approach will eventually lead to the identification of the key principles applicable in the systems of Natural Language Processing. The present work concentrates on the syntax-semantics interface. We start from recursive definitions and application of optimization principles, and gradually develop a formal model of syntactic operations. The result – a Fibonacci- like syntactic tree – is in fact an argument-based variant of the natural language syntax. This representation (argument-centered model, ACM) is derived by a recursive calculus that generates a mode which connects arguments and expresses relations between them. The reiterative operation assigns primary role to entities as the key components of syntactic structure. We provide experimental evidence in support of the argument-based model. We also show that mental computation of syntax is influenced by the inter-conceptual relations between the images of entities in a semantic space.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

В статье рассмотрен формальный подход и основное содержание методологии формализованного проектирования.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Онтолингвистические системы ориентированы на решение сложных задач обработки естественного языка, требующих семантических знаний. В основе проектирования онтолингвистических систем лежат процессы скоординированного взаимодействия онтологических и лингвистических моделей. В статье рассматриваются методы решения лингвистических задач на основе онтологий, разработанные при проектировании специализированной онтолингвистической системы «ЛоТА», предназначенной для анализа специальных технических текстов «Логика работы системы... ».

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Рассматриваются проблемы анализа естественно-языковых объектов (ЕЯО) с точки зрения их представления и обработки в памяти компьютера. Предложена формализация задачи анализа ЕЯО и приведен пример формализованного представления ЕЯО предметной области.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Описывается один из подходов к анализу естественно-языкового текста, который использует толковый словарь естественного языка, локальный словарь анализируемого текста и частотные характеристики слов в этом тексте.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The goal of this paper is to model normal airframe conditions for helicopters in order to detect changes. This is done by inferring the flying state using a selection of sensors and frequency bands that are best for discriminating between different states. We used non-linear state-space models (NLSSM) for modelling flight conditions based on short-time frequency analysis of the vibration data and embedded the models in a switching framework to detect transitions between states. We then created a density model (using a Gaussian mixture model) for the NLSSM innovations: this provides a model for normal operation. To validate our approach, we used data with added synthetic abnormalities which was detected as low-probability periods. The model of normality gave good indications of faults during the flight, in the form of low probabilities under the model, with high accuracy (>92 %). © 2013 IEEE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Storyline detection from news articles aims at summarizing events described under a certain news topic and revealing how those events evolve over time. It is a difficult task because it requires first the detection of events from news articles published in different time periods and then the construction of storylines by linking events into coherent news stories. Moreover, each storyline has different hierarchical structures which are dependent across epochs. Existing approaches often ignore the dependency of hierarchical structures in storyline generation. In this paper, we propose an unsupervised Bayesian model, called dynamic storyline detection model, to extract structured representations and evolution patterns of storylines. The proposed model is evaluated on a large scale news corpus. Experimental results show that our proposed model outperforms several baseline approaches.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This presentation summarizes experience with the automated speech recognition and translation approach realised in the context of the European project EMMA.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Taxonomies have gained a broad usage in a variety of fields due to their extensibility, as well as their use for classification and knowledge organization. Of particular interest is the digital document management domain in which their hierarchical structure can be effectively employed in order to organize documents into content-specific categories. Common or standard taxonomies (e.g., the ACM Computing Classification System) contain concepts that are too general for conceptualizing specific knowledge domains. In this paper we introduce a novel automated approach that combines sub-trees from general taxonomies with specialized seed taxonomies by using specific Natural Language Processing techniques. We provide an extensible and generalizable model for combining taxonomies in the practical context of two very large European research projects. Because the manual combination of taxonomies by domain experts is a highly time consuming task, our model measures the semantic relatedness between concept labels in CBOW or skip-gram Word2vec vector spaces. A preliminary quantitative evaluation of the resulting taxonomies is performed after applying a greedy algorithm with incremental thresholds used for matching and combining topic labels.