923 resultados para Natural Language Processing,Recommender Systems,Android,Applicazione mobile


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The principal feature of ontology, which is developed for a text processing, is wider knowledge representation of an external world due to introduction of three-level hierarchy. It allows to improve semantic interpretation of natural language texts.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The ongoing rapid expansion of the Internet greatly increases the necessity of effective recommender systems for filtering the abundant information. Extensive research for recommender systems is conducted by a broad range of communities including social and computer scientists, physicists, and interdisciplinary researchers. Despite substantial theoretical and practical achievements, unification and comparison of different approaches are lacking, which impedes further advances. In this article, we review recent developments in recommender systems and discuss the major challenges. We compare and evaluate available algorithms and examine their roles in the future developments. In addition to algorithms, physical aspects are described to illustrate macroscopic behavior of recommender systems. Potential impacts and future directions are discussed. We emphasize that recommendation has great scientific depth and combines diverse research fields which makes it interesting for physicists as well as interdisciplinary researchers. © 2012 Elsevier B.V.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Most research in the area of emotion detection in written text focused on detecting explicit expressions of emotions in text. In this paper, we present a rule-based pipeline approach for detecting implicit emotions in written text without emotion-bearing words based on the OCC Model. We have evaluated our approach on three different datasets with five emotion categories. Our results show that the proposed approach outperforms the lexicon matching method consistently across all the three datasets by a large margin of 17–30% in F-measure and gives competitive performance compared to a supervised classifier. In particular, when dealing with formal text which follows grammatical rules strictly, our approach gives an average F-measure of 82.7% on “Happy”, “Angry-Disgust” and “Sad”, even outperforming the supervised baseline by nearly 17% in F-measure. Our preliminary results show the feasibility of the approach for the task of implicit emotion detection in written text.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Personalized recommender systems aim to assist users in retrieving and accessing interesting items by automatically acquiring user preferences from the historical data and matching items with the preferences. In the last decade, recommendation services have gained great attention due to the problem of information overload. However, despite recent advances of personalization techniques, several critical issues in modern recommender systems have not been well studied. These issues include: (1) understanding the accessing patterns of users (i.e., how to effectively model users' accessing behaviors); (2) understanding the relations between users and other objects (i.e., how to comprehensively assess the complex correlations between users and entities in recommender systems); and (3) understanding the interest change of users (i.e., how to adaptively capture users' preference drift over time). To meet the needs of users in modern recommender systems, it is imperative to provide solutions to address the aforementioned issues and apply the solutions to real-world applications. ^ The major goal of this dissertation is to provide integrated recommendation approaches to tackle the challenges of the current generation of recommender systems. In particular, three user-oriented aspects of recommendation techniques were studied, including understanding accessing patterns, understanding complex relations and understanding temporal dynamics. To this end, we made three research contributions. First, we presented various personalized user profiling algorithms to capture click behaviors of users from both coarse- and fine-grained granularities; second, we proposed graph-based recommendation models to describe the complex correlations in a recommender system; third, we studied temporal recommendation approaches in order to capture the preference changes of users, by considering both long-term and short-term user profiles. In addition, a versatile recommendation framework was proposed, in which the proposed recommendation techniques were seamlessly integrated. Different evaluation criteria were implemented in this framework for evaluating recommendation techniques in real-world recommendation applications. ^ In summary, the frequent changes of user interests and item repository lead to a series of user-centric challenges that are not well addressed in the current generation of recommender systems. My work proposed reasonable solutions to these challenges and provided insights on how to address these challenges using a simple yet effective recommendation framework.^

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This investigation is grounded within the concept of embodied cognition where the mind is considered to be part of a biological system. A first year undergraduate Mechanical Engineering cohort of students was tasked with explaining the behaviour of three balls of different masses being rolled down a ramp. The explanations given by the students highlighted the cognitive conflict between the everyday interpretation of the word energy and its mathematical use. The results showed that even after many years of schooling, students found it challenging to interpret the mathematics they had learned and relied upon pseudo-scientific notions to account for the behaviour of the balls.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Distributed Computing frameworks belong to a class of programming models that allow developers to

launch workloads on large clusters of machines. Due to the dramatic increase in the volume of

data gathered by ubiquitous computing devices, data analytic workloads have become a common

case among distributed computing applications, making Data Science an entire field of

Computer Science. We argue that Data Scientist's concern lays in three main components: a dataset,

a sequence of operations they wish to apply on this dataset, and some constraint they may have

related to their work (performances, QoS, budget, etc). However, it is actually extremely

difficult, without domain expertise, to perform data science. One need to select the right amount

and type of resources, pick up a framework, and configure it. Also, users are often running their

application in shared environments, ruled by schedulers expecting them to specify precisely their resource

needs. Inherent to the distributed and concurrent nature of the cited frameworks, monitoring and

profiling are hard, high dimensional problems that block users from making the right

configuration choices and determining the right amount of resources they need. Paradoxically, the

system is gathering a large amount of monitoring data at runtime, which remains unused.

In the ideal abstraction we envision for data scientists, the system is adaptive, able to exploit

monitoring data to learn about workloads, and process user requests into a tailored execution

context. In this work, we study different techniques that have been used to make steps toward

such system awareness, and explore a new way to do so by implementing machine learning

techniques to recommend a specific subset of system configurations for Apache Spark applications.

Furthermore, we present an in depth study of Apache Spark executors configuration, which highlight

the complexity in choosing the best one for a given workload.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Rhythm analysis of written texts focuses on literary analysis and it mainly considers poetry. In this paper we investigate the relevance of rhythmic features for categorizing texts in prosaic form pertaining to different genres. Our contribution is threefold. First, we define a set of rhythmic features for written texts. Second, we extract these features from three corpora, of speeches, essays, and newspaper articles. Third, we perform feature selection by means of statistical analyses, and determine a subset of features which efficiently discriminates between the three genres. We find that using as little as eight rhythmic features, documents can be adequately assigned to a given genre with an accuracy of around 80 %, significantly higher than the 33 % baseline which results from random assignment.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Opinion mining and sentiment analysis are important research areas of Natural Language Processing (NLP) tools and have become viable alternatives for automatically extracting the affective information found in texts. Our aim is to build an NLP model to analyze gamers’ sentiments and opinions expressed in a corpus of 9750 game reviews. A Principal Component Analysis using sentiment analysis features explained 51.2 % of the variance of the reviews and provides an integrated view of the major sentiment and topic related dimensions expressed in game reviews. A Discriminant Function Analysis based on the emerging components classified game reviews into positive, neutral and negative ratings with a 55 % accuracy.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The large upfront investments required for game development pose a severe barrier for the wider uptake of serious games in education and training. Also, there is a lack of well-established methods and tools that support game developers at preserving and enhancing the games’ pedagogical effectiveness. The RAGE project, which is a Horizon 2020 funded research project on serious games, addresses these issues by making available reusable software components that aim to support the pedagogical qualities of serious games. In order to easily deploy and integrate these game components in a multitude of game engines, platforms and programming languages, RAGE has developed and validated a hybrid component-based software architecture that preserves component portability and interoperability. While a first set of software components is being developed, this paper presents selected examples to explain the overall system’s concept and its practical benefits. First, the Emotion Detection component uses the learners’ webcams for capturing their emotional states from facial expressions. Second, the Performance Statistics component is an add-on for learning analytics data processing, which allows instructors to track and inspect learners’ progress without bothering about the required statistics computations. Third, a set of language processing components accommodate the analysis of textual inputs of learners, facilitating comprehension assessment and prediction. Fourth, the Shared Data Storage component provides a technical solution for data storage - e.g. for player data or game world data - across multiple software components. The presented components are exemplary for the anticipated RAGE library, which will include up to forty reusable software components for serious gaming, addressing diverse pedagogical dimensions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Community-driven Question Answering (CQA) systems that crowdsource experiential information in the form of questions and answers and have accumulated valuable reusable knowledge. Clustering of QA datasets from CQA systems provides a means of organizing the content to ease tasks such as manual curation and tagging. In this paper, we present a clustering method that exploits the two-part question-answer structure in QA datasets to improve clustering quality. Our method, {\it MixKMeans}, composes question and answer space similarities in a way that the space on which the match is higher is allowed to dominate. This construction is motivated by our observation that semantic similarity between question-answer data (QAs) could get localized in either space. We empirically evaluate our method on a variety of real-world labeled datasets. Our results indicate that our method significantly outperforms state-of-the-art clustering methods for the task of clustering question-answer archives.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Thesis (Ph.D.)--University of Washington, 2016-08

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The overwhelming amount and unprecedented speed of publication in the biomedical domain make it difficult for life science researchers to acquire and maintain a broad view of the field and gather all information that would be relevant for their research. As a response to this problem, the BioNLP (Biomedical Natural Language Processing) community of researches has emerged and strives to assist life science researchers by developing modern natural language processing (NLP), information extraction (IE) and information retrieval (IR) methods that can be applied at large-scale, to scan the whole publicly available biomedical literature and extract and aggregate the information found within, while automatically normalizing the variability of natural language statements. Among different tasks, biomedical event extraction has received much attention within BioNLP community recently. Biomedical event extraction constitutes the identification of biological processes and interactions described in biomedical literature, and their representation as a set of recursive event structures. The 2009–2013 series of BioNLP Shared Tasks on Event Extraction have given raise to a number of event extraction systems, several of which have been applied at a large scale (the full set of PubMed abstracts and PubMed Central Open Access full text articles), leading to creation of massive biomedical event databases, each of which containing millions of events. Sinece top-ranking event extraction systems are based on machine-learning approach and are trained on the narrow-domain, carefully selected Shared Task training data, their performance drops when being faced with the topically highly varied PubMed and PubMed Central documents. Specifically, false-positive predictions by these systems lead to generation of incorrect biomolecular events which are spotted by the end-users. This thesis proposes a novel post-processing approach, utilizing a combination of supervised and unsupervised learning techniques, that can automatically identify and filter out a considerable proportion of incorrect events from large-scale event databases, thus increasing the general credibility of those databases. The second part of this thesis is dedicated to a system we developed for hypothesis generation from large-scale event databases, which is able to discover novel biomolecular interactions among genes/gene-products. We cast the hypothesis generation problem as a supervised network topology prediction, i.e predicting new edges in the network, as well as types and directions for these edges, utilizing a set of features that can be extracted from large biomedical event networks. Routine machine learning evaluation results, as well as manual evaluation results suggest that the problem is indeed learnable. This work won the Best Paper Award in The 5th International Symposium on Languages in Biology and Medicine (LBM 2013).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Decision-making is often dependent on uncertain data, e.g. data associated with confidence scores or probabilities. We present a comparison of different informa- tion presentations for uncertain data and, for the first time, measure their effects on human decision-making. We show that the use of Natural Language Genera- tion (NLG) improves decision-making un- der uncertainty, compared to state-of-the- art graphical-based representation meth- ods. In a task-based study with 442 adults, we found that presentations using NLG lead to 24% better decision-making on av- erage than the graphical presentations, and to 44% better decision-making when NLG is combined with graphics. We also show that women achieve significantly better re- sults when presented with NLG output (an 87% increase on average compared to graphical presentations).

Relevância:

100.00% 100.00%

Publicador:

Resumo:

[EU]Testu bat koherente egiten duten arrazoiak ulertzea oso baliagarria da testuaren beraren ulermenerako, koherentzia eta koherentzia-erlazioak testu bat edo gehiago koherente diren ondorioztatzen laguntzen baitigu. Lan honetan gai bera duten testu ezberdinen arteko koherentziazko 3 Cross Document Structure Theory edo CST (Radev, 2000) erlazio aztertu eta sailkatu dira. Hori egin ahal izateko, euskaraz idatziriko gai berari buruzko testuak segmentatzeko eta beraien arteko erlazioak etiketatzeko gidalerroak proposatzen dira. 10 testuz osaturiko corpusa etiketatu da; horietako 3 cluster bi etiketatzailek aztertu dute. Etiketatzaileen arteko adostasunaren berri ematen dugu. Koherentzia-erlazioak garatzea oso garrantzitsua da Hizkuntzaren Prozesamenduko hainbat sistementzat, hala nola, informazioa erauzteko sistementzat, itzulpen automatikoarentzat, galde-erantzun sistementzat eta laburpen automatikoarentzat. Etorkizunean CSTko erlazio guztiak corpus esanguratsuan aztertuko balira, testuen arteko koherentzia- erlazioak euskarazko testuen prozesaketa automatikoa bideratzeko lehenengo pausua litzateke hemen egindakoa.