940 results for NLP (Natural Language Processing)


Relevance:

100.00%

Abstract:

Web 2.0 has changed how users consume and interact with information, and has introduced a wide range of new textual genres, such as reviews or microblogs, through which users communicate, exchange, and share opinions. The exploitation of all this user-generated content is of great value both for users and companies, in order to assist them in their decision-making processes. Given this context, the analysis and development of automatic methods that can help manage online information more quickly are needed. Therefore, this article proposes and evaluates a novel concept-level approach for ultra-concise abstractive opinion summarization. Our approach is characterized by the integration of syntactic sentence simplification, sentence regeneration and internal concept representation into the summarization process, thus being able to generate abstractive summaries, which is one of the most challenging issues for this task. In order to analyze different settings for our approach, the use of the sentence regeneration module was made optional, leading to two different versions of the system (one with sentence regeneration and one without). To test them, a corpus of 400 English texts, gathered from reviews and tweets belonging to two different domains, was used. Although both versions were shown to be reliable methods for generating this type of summary, the results obtained indicate that the version without sentence regeneration yielded better results, improving on the results of a number of state-of-the-art systems by 9%, whereas the version with sentence regeneration proved to be more robust to noisy data.

Relevance:

100.00%

Abstract:

Ecological models written in a mathematical language L(M), or model language, with a given style or methodology can be considered as texts. It is possible to apply statistical linguistic laws to them, and the experimental results demonstrate that the behaviour of a mathematical model is the same as that of any literary text in any natural language. A text has the following characteristics: (a) the variables, their transformed functions and parameters are the lexical units (LUN) of ecological models; (b) the syllables are constituted by a LUN, or a chain of them, separated by operating or ordering LUNs; (c) the flow equations are words; and (d) the distribution of words (LUM and CLUN) according to their lengths is based on a Poisson distribution, Chebanov's law. It is founded on Vakar's formula, which is calculated in the same way as the linguistic entropy for L(M). We apply these ideas to practical examples using the MARIOLA model. This paper studies the problem of the lengths of the simple lexical units, composed lexical units and words of text models, expressing these lengths in numbers of primitive symbols and syllables. The use of these linguistic laws makes it possible to indicate the degree of information given by an ecological model.

Relevance:

100.00%

Abstract:

There is a growing societal need to address the increasing prevalence of behavioral health issues, such as obesity, alcohol or drug use, and general lack of treatment adherence for a variety of health problems. The statistics, worldwide and in the USA, are daunting. Excessive alcohol use is the third leading preventable cause of death in the United States (with 79,000 deaths annually), and is responsible for a wide range of health and social problems. On the positive side, though, these behavioral health issues (and possible associated diseases) can often be prevented with relatively simple lifestyle changes, such as losing weight through diet and/or physical exercise, or learning how to reduce alcohol consumption. Medicine has therefore started to move toward finding ways of preventively promoting wellness, rather than solely treating already established illness.

Evidence-based, patient-centered Brief Motivational Interviewing (BMI) interventions have been found particularly effective in helping people find intrinsic motivation to change problem behaviors after short counseling sessions, and to maintain healthy lifestyles over the long term. A lack of locally available personnel well trained in BMI, however, often limits access to successful interventions for people in need. To fill this accessibility gap, Computer-Based Interventions (CBIs) have started to emerge. The success of CBIs, however, critically relies on ensuring engagement and retention of CBI users, so that they remain motivated to use these systems and come back to use them over the long term as necessary.

Because of their text-only interfaces, current CBIs can express only limited empathy and rapport, which are the most important factors of health interventions. Fortunately, in the last decade, computer science research has progressed in the design of simulated human characters with anthropomorphic communicative abilities. Virtual characters interact using humans’ innate communication modalities, such as facial expressions, body language, speech, and natural language understanding. By advancing research in Artificial Intelligence (AI), we can improve the ability of artificial agents to help us solve CBI problems.

To facilitate successful communication and social interaction between artificial agents and human partners, it is essential that aspects of human social behavior, especially empathy and rapport, be considered when designing human-computer interfaces. Hence, the goal of the present dissertation is to provide a computational model of rapport to enhance an artificial agent’s social behavior, and to provide an experimental tool for the psychological theories shaping the model. Parts of this thesis were already published in [LYL+12, AYL12, AL13, ALYR13, LAYR13, YALR13, ALY14].

Relevance:

100.00%

Abstract:

This paper presents an application that composes formal poetry in Spanish in a semiautomatic, interactive fashion. JASPER is a forward-reasoning rule-based system that obtains from the user an intended message, the desired metric, a choice of vocabulary, and a corpus of verses; and, by intelligent adaptation of selected examples from this corpus using the given words, carries out a prose-to-poetry translation of the given message. In the composition process, JASPER combines natural language generation and a set of construction heuristics obtained from formal literature on Spanish poetry.

Relevance:

100.00%

Abstract:

Neuroimaging research involves analyses of huge amounts of biological data that might or might not be related to cognition. This relationship is usually approached using univariate methods, and, therefore, correction methods are mandatory for reducing false positives. However, such corrections also increase the probability of false negatives. Multivariate frameworks have been proposed to help ease this trade-off. Here we apply multivariate distance matrix regression for the simultaneous analysis of biological and cognitive data, namely, structural connections among 82 brain regions and several latent factors estimating cognitive performance. We tested whether cognitive differences predict distances among individuals regarding their connectivity pattern. Beginning with 3,321 connections among regions, the 36 edges best predicted by the individuals' cognitive scores were selected. Cognitive scores were related to connectivity distances in both the full (3,321) and reduced (36) connectivity patterns. The selected edges connect regions distributed across the entire brain, and the network defined by these edges supports high-order cognitive processes such as (a) (fluid) executive control, (b) (crystallized) recognition, learning, and language processing, and (c) visuospatial processing. This multivariate study suggests that a widespread but limited number of regions in the human brain supports high-level cognitive ability differences. Hum Brain Mapp, 2016. © 2016 Wiley Periodicals, Inc.

Relevance:

100.00%

Abstract:

This thesis is about young students’ writing in school mathematics and the ways in which this writing is designed, interpreted and understood. Students’ communication can act as a source from which teachers can make inferences regarding students’ mathematical knowledge and understanding. In mathematics education, previous research indicates that teachers assume that the process of interpreting and judging students’ writing is unproblematic. The relationship between what students write and what they know or understand is theoretical as well as empirical. In an era of increased focus on assessment and measurement in education, it is necessary for teachers to know more about the relationship between communication and achievement. To add to this knowledge, the thesis has adopted a broad approach and consists of four studies. The aim of these studies is to reach a deep understanding of writing in school mathematics. Such an understanding is dependent on examining different aspects of writing. The four studies together examine how the concept of communication is described in authoritative texts, how students’ writing is viewed by teachers and how students make use of different communicational resources in their writing. The results of the four studies indicate that students’ writing is more complex than is acknowledged by teachers and authoritative texts in mathematics education. Results point to a sophistication in students’ approach to the merging of the two functions of writing, writing for oneself and writing for others. Results also suggest that students attend, to various extents, to questions regarding how, what and for whom they are writing in school mathematics. The relationship between writing and achievement is dependent on students’ ability to have their writing reflect their knowledge and on teachers’ thorough knowledge of the different features of writing and their awareness of its complexity. 
From a communicational perspective the ability to communicate [in writing] in mathematics can and should be distinguished from other mathematical abilities. By acknowledging that mathematical communication integrates mathematical language and natural language, teachers have an opportunity to turn writing in mathematics into an object of learning. This offers teachers the potential to add to their assessment literacy and offers students the potential to develop their communicational ability in order to write in a way that better reflects their mathematical knowledge.

Relevance:

100.00%

Abstract:

Named Entity Recognition aims to identify and classify entities contained in texts written in natural language, according to certain categories or labels. The Named Entity Recognition system implemented for this dissertation identifies localities in informal texts and, in a first approach to the problem, assigns each identified locality one of the labels "aldeia" (village), "vila" (town) or "cidade" (city). In a second approach, the labels "freguesia" (parish), "concelho" (municipality) and "distrito" (district) were taken into consideration. To classify the entities, a statistical analysis was performed on the number of results returned by the Google Search engine for a query consisting of the entity preceded by a label.

Relevance:

100.00%

Abstract:

Bangla OCR (Optical Character Recognition) is long-deserved software for the Bengali community all over the world. Numerous efforts suggest that, due to the inherently complex nature of the Bangla alphabet and its word formation process, the development of a high-fidelity OCR producing reasonably acceptable output still remains a challenge. One possible route to improvement is post-processing of the OCR's output; algorithms such as Edit Distance and the use of n-gram statistical information have been used to rectify misspelled words in language processing. This work presents the first known approach to use these algorithms to replace misrecognized words produced by Bangla OCR. The assessment is made on a set of fifty documents written in Bangla script and uses a dictionary of 541,167 words. The proposed correction model can correct several words, lowering the recognition error rate by 2.87% and 3.18% for the character-based n-gram and edit distance algorithms respectively. The developed system suggests a list of five alternatives for a misspelled word. In 33.82% of cases, the correct word is the topmost suggestion in the five-word list for the n-gram algorithm, while with the edit distance algorithm the first suggestion is correct in 36.31% of cases. This work is intended to stimulate ideas for possible improvements in character recognition.
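
The edit-distance side of such a post-processing step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `edit_distance` and `suggest`, the alphabetical tie-breaking, and the toy Latin-script dictionary are all assumptions made for illustration, and the sketch compares strings code point by code point, which may not match how Bangla characters are segmented in the original system.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        prev = curr
    return prev[-1]


def suggest(word: str, dictionary: list[str], k: int = 5) -> list[str]:
    """Return the k dictionary words closest to `word` (hypothetical helper).

    Ties are broken alphabetically; this ordering is an assumption,
    not something stated in the abstract.
    """
    ranked = sorted(dictionary, key=lambda w: (edit_distance(word, w), w))
    return ranked[:k]


if __name__ == "__main__":
    toy_dictionary = ["hello", "help", "world", "held", "hero", "halo"]
    print(suggest("helo", toy_dictionary))
```

Scanning the whole dictionary like this is linear in its size, so for a dictionary of several hundred thousand words a real system would likely need an index or pruning step.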

Relevance:

100.00%

Abstract:

Descriptions of tourism products in the hotel, aviation, rent-a-car and holiday package sectors rely mainly on natural language text that is highly heterogeneous, with very different styles, presentations and contents. Because the tourism sector is highly dynamic and its products and offers are constantly changing, manual normalization of all this information is not feasible. Offer descriptions, in the order of thousands, are structured in different ways, possibly in different languages, and may complement and/or overlap one another. This work built a prototype for the automatic classification and extraction of information from tourism product descriptions. The information is first classified by type. The relevant elements of each type are then extracted and turned into easily computable objects. From the extracted objects, the prototype uses text and image templates to automatically generate normalized descriptions targeted at a given market. This versatility enables a new set of services for promoting and selling products that would be impossible to implement with the original information. Although it could be applied to other domains, the prototype was evaluated on the normalization of hotel descriptions. Descriptive sentences about a hotel are classified according to their type (Location, Services and/or Equipment) by a machine learning algorithm that obtains an average recall of 96% and precision of 72%. Recall was considered the most important measure, since maximizing it prevents sentences from being lost to later processing. As a side product, a database of hotels was built and populated, making it possible to search for hotels by their characteristics. This functionality would not be possible using the original contents.
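
To make the two reported measures concrete, the following sketch computes precision and recall for one sentence type. The function name `precision_recall` and the toy labels are hypothetical, and the sketch assumes one label per sentence, whereas the abstract allows a sentence to carry more than one type.

```python
def precision_recall(true_labels, predicted_labels, positive):
    """Precision and recall for the class `positive` (single-label assumption)."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    truth = ["Location", "Services", "Location", "Equipment"]
    predicted = ["Location", "Location", "Location", "Services"]
    print(precision_recall(truth, predicted, "Location"))
```

In the toy data every true "Location" sentence is found (recall 1.0) at the cost of one false alarm (precision 2/3), which is the kind of trade-off the abstract describes when it favours recall.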

Relevance:

100.00%

Abstract:

Twitter is a highly popular social media platform which, on the one hand, allows information to be transmitted in real time and, on the other hand, represents a source of open-access homogeneous text data. We propose an analysis of the most common self-reported COVID symptoms from a dataset of Italian tweets to investigate the evolution of the pandemic in Italy from the end of September 2020 to the end of January 2021. After manually filtering tweets actually describing COVID symptoms from the database - which contains words related to fever, cough and sore throat - we discuss the usefulness of such filtering. We then compare our time series with the daily data on new hospitalisations in Italy, with the aim of building a simple linear regression model that accounts for the delay observed between the tweets mentioning individual symptoms and new hospitalisations. We discuss both the results and the limitations of linear regression, given that our data suggest that the relationship between the time series of symptom tweets and of new hospitalisations changes towards the end of the acquisition period.
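
A simple linear regression with a fixed delay can be sketched as below. This is only an illustration under stated assumptions: the function names `fit_line` and `fit_lagged_regression`, the choice of a single integer lag in days, and the synthetic series are all hypothetical, and the abstract does not say how the delay was estimated.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y ~ slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_lagged_regression(tweets, hospitalisations, lag):
    """Regress hospitalisations on tweet counts observed `lag` days earlier."""
    if len(tweets) != len(hospitalisations):
        raise ValueError("series must have the same length")
    if lag < 0 or lag >= len(tweets) - 1:
        raise ValueError("lag must leave at least two aligned points")
    x = tweets[: len(tweets) - lag]  # tweets on day t
    y = hospitalisations[lag:]       # hospitalisations on day t + lag
    return fit_line(x, y)


if __name__ == "__main__":
    # Synthetic data: hospitalisations follow tweets two days later.
    tweets = [1, 4, 2, 8, 5, 7, 3, 6, 9, 2]
    hospitalisations = [0, 0] + [3 * v + 1 for v in tweets[:-2]]
    print(fit_lagged_regression(tweets, hospitalisations, lag=2))
```

A fixed slope and lag of this kind cannot capture a relationship that changes over time, which is the limitation the abstract points to for the end of the acquisition period.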

Relevance:

100.00%

Abstract:

In the framework of industrial problems, Constrained Optimization is known to have very good overall modeling capability and performance, and it stands as one of the most powerful, explored, and exploited tools for addressing prescriptive tasks. The number of applications is huge, ranging from logistics to transportation, packing, production, telecommunication, scheduling, and much more. The main reason behind this success lies in the remarkable effort made in recent decades by the OR community to develop realistic models and devise exact or approximate methods to solve the largest variety of constrained or combinatorial optimization problems, together with the spread of computational power and easily accessible OR software and resources. On the other hand, technological advancements have led to an unprecedented wealth of data and increasingly push towards methods able to extract useful knowledge from it; among the data-driven methods, Machine Learning techniques appear to be among the most promising, thanks to their successes in domains like Image Recognition, Natural Language Processing and game playing, but also to the amount of research involved. The purpose of the present research is to study how Machine Learning and Constrained Optimization can be used together to achieve systems able to leverage the strengths of both methods: this would open the way to exploiting decades of research on resolution techniques for COPs and to constructing models able to adapt and learn from available data. In the first part of this work, we survey the existing techniques and classify them according to the type, method, or scope of the integration; subsequently, we introduce Moving Target, a novel and general algorithm devised to inject knowledge into learning models through constraints. In the last part of the thesis, two applications stemming from real-world projects and carried out in collaboration with Optit are presented.

Relevance:

100.00%

Abstract:

Values are beliefs or principles that are deemed significant or desirable within a specific society or culture, serving as the fundamental underpinnings for ethical and socio-behavioral norms. The objective of this research is to explore the domain encompassing moral, cultural, and individual values. To achieve this, we employ an ontological approach to formally represent the semantic relations within the value domain. The theoretical framework employed adopts Fillmore’s frame semantics, treating values as semantic frames. A value situation is thus characterized by the co-occurrence of specific semantic roles fulfilled within a given event or circumstance. Given the intricate semantics of values as abstract entities with high social capital, our investigation extends to two interconnected domains. The first domain is embodied cognition, specifically image schemas, which are cognitive patterns derived from sensorimotor experiences that shape our conceptualization of entities in the world. The second domain pertains to emotions, which are inherently intertwined with the realm of values. Consequently, our approach endeavors to formalize the semantics of values within an embodied cognition framework, recognizing values as emotion-laden semantic frames. The primary ontologies proposed in this work are: (i) ValueNet, an ontology network dedicated to the domain of values; (ii) ISAAC, the Image Schema Abstraction And Cognition ontology; and (iii) EmoNet, an ontology for theories of emotions. The knowledge formalization adheres to established modeling practices, including the reuse of semantic web resources such as WordNet, VerbNet, FrameNet, DBpedia, and alignment to foundational ontologies like DOLCE, as well as the utilization of Ontology Design Patterns. These ontological resources are operationalized through the development of a fully explainable frame-based detector capable of identifying values, emotions, and image schemas, generating knowledge graphs from natural language, leveraging the semantic dependencies of a sentence, and allowing non-trivial higher-layer knowledge inferences.

Relevance:

100.00%

Abstract:

Distributed argumentation technology is a computational approach incorporating argumentation reasoning mechanisms within multi-agent systems. For the formal foundations of distributed argumentation technology, in this thesis we conduct a principle-based analysis of structured argumentation as well as abstract multi-agent and abstract bipolar argumentation. The results of the principle-based approach of these theories provide an overview and guideline for further applications of the theories. Moreover, in this thesis we explore distributed argumentation technology using distributed ledgers. We envision an Intelligent Human-input-based Blockchain Oracle (IHiBO), an artificial intelligence tool for storing argumentation reasoning. We propose a decentralized and secure architecture for conducting decision-making, addressing key concerns of trust, transparency, and immutability. We model fund management with agent argumentation in IHiBO and analyze its compliance with European fund management legal frameworks. We illustrate how bipolar argumentation balances pros and cons in legal reasoning in a legal divorce case, and how the strength of arguments in natural language can be represented in structured arguments. Finally, we discuss how distributed argumentation technology can be used to advance risk management, regulatory compliance of distributed ledgers for financial securities, and dialogue techniques.

Relevance:

40.00%

Abstract:

In this study, the influence of the processing conditions and the addition of trans-polyoctenylene rubber (TOR) on Mooney viscosity, tensile properties, hardness, tearing resistance, and resilience of natural rubber/styrene-butadiene rubber blends was investigated. The results obtained are explained in light of dynamic mechanical and morphological analyses. Increasing processing time produced a finer blend morphology, which resulted in an improvement in the mechanical properties. The addition of TOR involved an increase in hardness, a decrease in tear resistance, and no effect on the resilience. It resulted in a large decrease in the Mooney viscosity and a slight decrease in the tensile properties if the components of the compounds were not properly mixed. The results indicate that TOR acted more as a plasticizer than a compatibilizer. (c) 2008 Wiley Periodicals, Inc.

Relevance:

40.00%

Abstract:

The role of GABA in the central processing of complex auditory signals is not fully understood. We have studied the involvement of GABA(A)-mediated inhibition in the processing of birdsong, a learned vocal communication signal requiring intact hearing for its development and maintenance. We focused on caudomedial nidopallium (NCM), an area analogous to parts of the mammalian auditory cortex with selective responses to birdsong. We present evidence that GABA(A)-mediated inhibition plays a pronounced role in NCM's auditory processing of birdsong. Using immunocytochemistry, we show that approximately half of NCM's neurons are GABAergic. Whole cell patch-clamp recordings in a slice preparation demonstrate that, at rest, spontaneously active GABAergic synapses inhibit excitatory inputs onto NCM neurons via GABA(A) receptors. Multi-electrode electrophysiological recordings in awake birds show that local blockade of GABA(A)-mediated inhibition in NCM markedly affects the temporal pattern of song-evoked responses in NCM without modifications in frequency tuning. Surprisingly, this blockade increases the phasic and largely suppresses the tonic response component, reflecting dynamic relationships of inhibitory networks that could include disinhibition. Thus processing of learned natural communication sounds in songbirds, and possibly other vocal learners, may depend on complex interactions of inhibitory networks.