144 results for NLP
Abstract:
In this thesis we propose a system that determines, from data posted on microblogs, the events that attract users' interest during a given period and the salient dates of each event. Given its high usage rate and the accessibility of its data, we used Twitter as our data source. This work deals with tweets about Tunisia, most of which were written by Tunisians. The first task of our system was to extract tweets automatically and continuously over 67 days (from 8 February to 15 April 2012). We assumed that an event is represented by several terms whose frequency rises abruptly at one or more moments during the period analysed. The lack of resources needed to identify terms (hashtags in particular) relating to the same topic led us to propose methods for grouping similar terms. To do so, we used phonetic methods, which we adapted to the way Tunisians write, together with statistical methods. To assess the validity of our methods, we asked experts, native speakers of the Tunisian dialect, to evaluate the results they returned. These groups were used to determine the topic of each tweet and/or to extend tweets with new terms. Finally, to select the set of events (EV), we relied on three criteria: frequency, variation and TF-IDF. The results we obtained showed the robustness of our system.
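The assumption that an event shows up as terms whose frequency rises abruptly can be illustrated with a minimal sketch. The z-score rule, the threshold value and the function name `salient_days` are assumptions made for illustration; the abstract does not state how the thesis measures an abrupt rise.

```python
from statistics import mean, pstdev


def salient_days(counts, z_threshold=2.0):
    """Return the indices of days on which a term's frequency spikes.

    counts: daily frequencies of one term (hypothetical input format).
    A day is flagged when its count lies more than z_threshold
    population standard deviations above the mean of the series.
    This z-score rule is an illustrative assumption, not the
    method used in the thesis.
    """
    mu = mean(counts)
    sigma = pstdev(counts)
    if sigma == 0:
        # A perfectly flat series has no bursts.
        return []
    return [i for i, c in enumerate(counts) if (c - mu) / sigma > z_threshold]


# Example: a hashtag that is quiet except for one day.
print(salient_days([2, 3, 2, 4, 3, 40, 3, 2]))
```

A per-term rule like this only finds candidate dates; grouping similar terms and ranking events by frequency, variation and TF-IDF, as the abstract describes, would be separate steps.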
Abstract:
Abstract taken from the publication. Abstract also in English.
Abstract:
With the concepts of anchor and anchoring fully defined within neuro-linguistic programming (NLP), this document describes the ways in which anchors and anchoring, when they occur collectively, alter the behaviour of social groups, and explains why they merit study.
Abstract:
Real-time geoparsing of social media streams (e.g. Twitter, YouTube, Instagram, Flickr, FourSquare) is providing a new 'virtual sensor' capability to end users such as emergency response agencies (e.g. tsunami early warning centres, civil protection authorities) and news agencies (e.g. Deutsche Welle, BBC News). Challenges in this area include scaling up natural language processing (NLP) and information retrieval (IR) approaches to handle real-time traffic volumes, reducing false positives, creating real-time infographic displays useful for effective decision support, and providing support for trust and credibility analysis using geosemantics. In this seminar I will present ongoing work by the IT Innovation Centre over the last 4 years (the TRIDEC and REVEAL FP7 projects) in building such systems, and highlight our research towards improving the trustworthiness and credibility of crisis map displays and real-time analytics for trending topics and influential social networks during major newsworthy events.
Abstract:
Title: Data-Driven Text Generation using Neural Networks
Speaker: Pavlos Vougiouklis, University of Southampton
Abstract: Recent work on neural networks shows their great potential for tackling a wide variety of Natural Language Processing (NLP) tasks. This talk will focus on the Natural Language Generation (NLG) problem and, more specifically, on the extent to which neural network language models can be employed for context-sensitive and data-driven text generation. The talk will also discuss a neural network architecture for response generation in social media, along with the training methods that enable it to capture contextual information and participate effectively in public conversations.
Speaker Bio: Pavlos Vougiouklis obtained his 5-year Diploma in Electrical and Computer Engineering from the Aristotle University of Thessaloniki in 2013. He was awarded an MSc degree in Software Engineering from the University of Southampton in 2014. In 2015, he joined the Web and Internet Science (WAIS) research group of the University of Southampton, where he is currently working towards a PhD in the field of Neural Network Approaches for Natural Language Processing.
Title: Provenance is Complicated and Boring - Is there a solution?
Speaker: Darren Richardson, University of Southampton
Abstract: Paper trails, auditing, and accountability: arguably not the sexiest terms in computer science. But then you discover that you've possibly been eating horse-meat, and the importance of provenance becomes almost palpable. Once we accept that we should be creating provenance-enabled systems, communicating that provenance to casual users is not trivial: users should not need a detailed working knowledge of your system, and they certainly shouldn't be expected to understand the data model. So how, then, do you give users an insight into the provenance without having to build a bespoke system for each and every provenance installation?
Speaker Bio: Darren is a final year Computer Science PhD student. He completed his undergraduate degree in Electronic Engineering at Southampton in 2012.
Abstract:
The aim of this study is to describe quality of life and sleep quality in patients diagnosed with sleep apnea-hypopnea syndrome, using a set of questionnaires to collect demographic data and to assess the perceived degree of daytime sleepiness, perceived sleep quality and perceived health-related quality of life, with each survey in its version validated for Colombia.
Abstract:
Major challenges remain in the automatic indexing and retrieval of multimedia content for very large multimedia corpora. Current indexing and retrieval applications still use keywords to index multimedia content, and those keywords usually provide no knowledge about the semantic content of the data. With the increasing amount of multimedia content, it is inefficient to continue with this approach. In this paper, we describe the project DREAM, which addresses these challenges by proposing a new framework for semi-automatic annotation and retrieval of multimedia based on its semantic content. The framework uses Topic Map technology as a tool to model the knowledge automatically extracted from the multimedia content by an Automatic Labelling Engine. We describe how we acquire knowledge from the content and represent this knowledge, with the support of NLP, to automatically generate Topic Maps. The framework is described in the context of film post-production.
Abstract:
Due to idiosyncrasies in their syntax, semantics or frequency, Multiword Expressions (MWEs) have received special attention from the NLP community, as the methods and techniques developed for the treatment of simplex words are not necessarily suitable for them. This is certainly the case for the automatic acquisition of MWEs from corpora. A lot of effort has been directed to the task of automatically identifying them, with considerable success. In this paper, we propose an approach for the identification of MWEs in a multilingual context, as a by-product of a word alignment process, that not only deals with the identification of possible MWE candidates, but also associates some multiword expressions with semantics. The results obtained indicate the feasibility and low costs in terms of tools and resources demanded by this approach, which could, for example, facilitate and speed up lexicographic work.
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
A branch and bound (B&B) algorithm using the DC model is presented for solving the power system transmission expansion planning problem with electrical losses incorporated in the network model. This is a mixed integer nonlinear programming (MINLP) problem. In this approach, the so-called fathoming tests of the B&B algorithm were redefined, and a nonlinear programming (NLP) problem is solved at each node of the B&B tree using an interior-point method. Pseudocosts were used to manage the development of the B&B tree and to decrease its size and the processing time. There is no guarantee of convergence to the global optimum for the MINLP problem. However, preliminary tests show that the algorithm easily converges towards the best-known solutions or to the optimal solutions for all the tested systems when electrical losses are neglected. When the electrical losses are taken into account, the solution obtained for the Garver system is better than the best one known in the literature.
Abstract:
In this paper, we provide a brief description of the multidisciplinary research domain called Natural Language Processing (NLP), which aims at enabling computers to deal with natural languages. In accordance with this description, NLP is conceived as "human language engineering or technology". NLP therefore requires a consistent description of linguistic facts at every linguistic level: morphological, syntactic, semantic, and even the levels of pragmatics and discourse. In addition to this linguistically motivated conception of NLP, we emphasize the origin of the research field, the place NLP occupies within a multidisciplinary scenario, and its objectives and challenges. Finally, we provide some remarks on the automatic processing of Brazilian Portuguese.
Abstract:
In this paper, a heuristic technique is presented for solving the simultaneous short-term transmission network expansion and reactive power planning problem (TEPRPP) via an AC model. A constructive heuristic algorithm (CHA) aimed at obtaining a high-quality solution to this problem is employed. An interior point method (IPM) is applied to solve TEPRPP as a nonlinear programming (NLP) problem during the solution steps of the algorithm. For each proposed network topology, an indicator is used to identify the weak buses at which reactive power sources should be placed. The objective function of the NLP problem includes the costs of new transmission lines, real power losses and reactive power sources. By allocating reactive power sources at load buses, the circuit capacity may increase while the cost of new lines can be decreased. The proposed methodology is tested on Garver's system, and the results obtained show its capability and the viability of using an AC model for solving such a non-convex optimization problem. © 2011 IEEE.
Abstract:
Non-pressure compensating drip hose is widely used for irrigation of vegetables and orchards. One limitation is that the lateral line must be kept short to maintain uniformity, because of head loss and slope. Any procedure that increases the length is desirable because it lowers the initial cost of the irrigation system. The hypothesis of this research is that the lateral line length can be increased by combining two measures: using a larger spacing between emitters at the beginning of the lateral line and a smaller one after a certain distance; and allowing a higher pressure variation along the lateral line while keeping distribution uniformity at an acceptable value. To evaluate this hypothesis, a nonlinear programming (NLP) model was developed. The input data are: diameter, roughness coefficient, pressure variation, emitter operational pressure, and the relationship between emitter discharge and pressure. The output data are: line length, discharge and length of each section with different spacing between drippers, total discharge in the lateral line, multiple outlet adjustment coefficient, head losses, localized head loss, pressure variation, number of emitters, spacing between emitters, discharge in each emitter, and discharge per linear meter. The lateral line length given by the mathematical model was compared with that obtained from the algebraic solution based on the Darcy-Weisbach equation. The NLP model showed the best results, since it produced the greater gain in lateral line length while keeping the uniformity and the flow variation within acceptable standards. It also had the lower flow variation, so its adoption is feasible and recommended.
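For reference, the Darcy-Weisbach equation mentioned in this abstract is usually written as below. This is the standard textbook form for a pipe without outlets; the abstract does not give the exact formulation or the multiple outlet adjustment used in the paper, so the symbols here are the conventional ones and not necessarily the authors' notation.

```latex
% Standard Darcy-Weisbach head loss for a pipe of length L and diameter D.
% h_f : head loss due to friction (m)
% f   : Darcy friction factor (dimensionless)
% v   : mean flow velocity (m/s)
% g   : gravitational acceleration (m/s^2)
\[
  h_f = f \, \frac{L}{D} \, \frac{v^{2}}{2g}
\]
```

Because the head loss grows with the line length L, it limits how long a lateral line can be before the pressure variation becomes unacceptable.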
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)