946 results for Databases as Topic


Relevance: 20.00%

Abstract:

Latent topics derived by topic models such as Latent Dirichlet Allocation (LDA) are the result of hidden thematic structures which provide further insights into the data. The automatic labelling of such topics derived from social media, however, poses new challenges, since topics may characterise novel events happening in the real world. Existing automatic topic labelling approaches which depend on external knowledge sources become less applicable here, since relevant articles/concepts of the extracted topics may not exist in external sources. In this paper we propose to address the problem of automatic labelling of latent topics learned from Twitter as a summarisation problem. We introduce a framework which applies summarisation algorithms to generate topic labels. These algorithms are independent of external sources and rely only on the identification of dominant terms in documents related to the latent topic. We compare the efficiency of existing state-of-the-art summarisation algorithms. Our results suggest that summarisation algorithms generate better topic labels, which capture event-related context, compared to the top-n terms returned by LDA. © 2014 Association for Computational Linguistics.
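As a rough illustration of labelling a topic from dominant terms alone, the sketch below counts term frequencies across the documents assigned to one topic and returns the most frequent terms. It is not one of the summarisation algorithms compared in the paper; the function name `dominant_terms`, the whitespace tokenisation, and the sample tweets are all assumptions made for the example.

```python
from collections import Counter


def dominant_terms(documents, n=3, stopwords=frozenset()):
    """Return the n most frequent non-stopword terms across the documents.

    Hypothetical helper: a plain frequency count, used here only to show
    that a label can be built without any external knowledge source.
    """
    counts = Counter()
    for doc in documents:
        for token in doc.lower().split():
            if token not in stopwords:
                counts[token] += 1
    return [term for term, _ in counts.most_common(n)]


# Made-up documents standing in for tweets assigned to one latent topic.
tweets = [
    "earthquake hits city",
    "city earthquake rescue",
    "earthquake ongoing",
]
label = dominant_terms(tweets, n=2)
```

Here `label` holds the two most frequent terms; a real summarisation algorithm would typically also weigh how specific each term is to the topic.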

Relevance: 20.00%

Abstract:

Mapping-based visualisations of image databases are well suited to users wanting to survey the overall content of a collection. Given the large amount of image data contained within such visualisations, however, this approach has yet to be applied to large image databases stored remotely. In this technical demonstration, we showcase our Web-Based Images Browser (WBIB). Our novel system makes use of image pyramids so that users can interactively explore mapping-based visualisations of large remote image databases. © 2012 Authors.
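The image-pyramid idea can be sketched as a list of progressively halved resolutions, so that a remote client only fetches the level it needs for the current zoom. This is a minimal sketch under stated assumptions: the function name `pyramid_levels` and the 256-pixel tile size are illustrative and are not taken from WBIB.

```python
def pyramid_levels(width, height, tile=256):
    """List (width, height) for each pyramid level.

    Starts at full resolution and halves both dimensions (rounding up)
    until the whole image fits inside a single tile. The tile size is an
    assumed parameter, not a value from the paper.
    """
    levels = [(width, height)]
    while width > tile or height > tile:
        width = max(1, (width + 1) // 2)
        height = max(1, (height + 1) // 2)
        levels.append((width, height))
    return levels


levels = pyramid_levels(1024, 512)
```

For a 1024 x 512 visualisation this yields three levels, the coarsest of which fits in one tile and can be sent first as an overview.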

Relevance: 20.00%

Abstract:

Social media data are produced continuously by a large and uncontrolled number of users. The dynamic nature of such data requires the sentiment and topic analysis model to be dynamically updated as well, capturing the most recent language use of sentiments and topics in text. We propose a dynamic Joint Sentiment-Topic model (dJST) which allows the detection and tracking of views of current and recurrent interests and shifts in topic and sentiment. Both topic and sentiment dynamics are captured by assuming that the current sentiment-topic-specific word distributions are generated according to the word distributions at previous epochs. We study three different ways of accounting for such dependency information: (1) a sliding window, where the current sentiment-topic word distributions are dependent on the previous sentiment-topic-specific word distributions in the last S epochs; (2) a skip model, where historical sentiment-topic word distributions are considered by skipping some epochs in between; and (3) a multiscale model, where previous long- and short-timescale distributions are taken into consideration. We derive efficient online inference procedures to sequentially update the model with newly arrived data and show the effectiveness of our proposed model on the Mozilla add-on reviews crawled between 2007 and 2011. © 2013 ACM 2157-6904/2013/12-ART5 $ 15.00.
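The sliding-window dependency in (1) can be pictured as building a prior for the current epoch from the word distributions of the last S epochs. The sketch below is only an illustration: the abstract does not state how the previous distributions are combined, so the weighted average, the uniform default weights, and the name `sliding_window_prior` are assumptions.

```python
def sliding_window_prior(history, S, weights=None):
    """Combine the word distributions of the last S epochs into one prior.

    history: list of per-epoch word distributions (lists of probabilities
             over the same vocabulary), oldest first.
    weights: one weight per epoch in the window; uniform if omitted
             (an assumption, not the paper's weighting scheme).
    """
    window = history[-S:]
    if weights is None:
        weights = [1.0 / len(window)] * len(window)
    prior = [0.0] * len(window[0])
    for w, dist in zip(weights, window):
        for i, p in enumerate(dist):
            prior[i] += w * p
    return prior


# Three epochs over a two-word vocabulary; only the last two are used.
history = [[0.5, 0.5], [0.7, 0.3], [0.9, 0.1]]
prior = sliding_window_prior(history, S=2)
```

A skip model would pick non-adjacent epochs for `window` instead of the last S, and a multiscale model would average over windows of several lengths.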

Relevance: 20.00%

Abstract:

In this paper, we explore the idea of social role theory (SRT) and propose a novel regularized topic model which incorporates SRT into the generative process of social media content. We assume that a user can play multiple social roles, and each social role serves to fulfil different duties and is associated with a role-driven distribution over latent topics. In particular, we focus on social roles corresponding to the most common social activities on social networks. Our model is instantiated on microblogs (i.e., Twitter) and community question-answering (cQA) sites (i.e., Yahoo! Answers), where social roles on Twitter include "originators" and "propagators", and roles on cQA are "askers" and "answerers". Both explicit and implicit interactions between users are taken into account and modeled as regularization factors. To evaluate the performance of our proposed method, we have conducted extensive experiments on two Twitter datasets and two cQA datasets. Furthermore, we also consider multi-role modeling for scientific papers, where an author's area of research expertise is considered as a social role. A novel application of detecting users' research interests through topical keyword labeling, based on the results of our multi-role model, is presented. The evaluation results show the feasibility and effectiveness of our model.

Relevance: 20.00%

Abstract:

Methods and software have been developed for integrating databases (DBs) on the properties of inorganic materials and substances. The integration of the information systems combines two known approaches: EII (Enterprise Information Integration) and EAI (Enterprise Application Integration). The kernel of the integrated system is the metabase, a special database that stores data on the contents of the integrated DBs. The proposed methods have been applied to create an integrated system of DBs in the field of inorganic chemistry and materials science. An important feature of the developed system is its ability to include DBs created with different DBMSs on substantially different computer platforms, Sun (DB "Diagram") and Intel (other DBs), and on diverse operating systems, Sun Solaris (DB "Diagram") and Microsoft Windows Server (other DBs).

Relevance: 20.00%

Abstract:

This article explores some of the strategies used by international students of English to manage topic shifts in casual conversations with English-speaking peers. It therefore covers aspects of discourse which have been comparatively under-researched, and where research has also tended to focus on the problems rather than the communicative achievements of non-native speakers. A detailed analysis of the conversations under discussion, which were recorded by the participants themselves, showed that they all flowed smoothly, and this was in large measure due to the ways in which topic shifts were managed. The paper will focus on a very distinct type of topic shift, namely that of topic transitions, which enable a smooth flow from one topic to another, but which do not explicitly signal that a shift is taking place. It will examine how the non-native speakers achieved coherence in the topic transitions which they initiated, which strategies or procedures they employed, and show how their initiations were effective in enabling the proposed topic to be understood, taken up and developed. It therefore adds to our understanding of the interactional achievements of international speakers in informal, social contexts. © 2013 Elsevier B.V.

Relevance: 20.00%

Abstract:

The current state of Russian databases on the properties of substances and materials is considered. A brief review of methods for integrating such information systems is given, and an approach to integrating distributed databases based on a metabase is proposed. Implementation details of the proposed integration approach are given for databases on electronics materials. An operating pilot version of the integrated information system, implemented at IMET RAS, is described.

Relevance: 20.00%

Abstract:

The principles of organising a distributed system of databases on the properties of inorganic substances and materials, based on the use of a special reference database, are considered. The latter contains not only information on where data about a particular substance are located in other databases, but also brief information on the most widespread properties of inorganic substances. The proposed principles were successfully implemented in creating the distributed system of databases on properties of inorganic compounds developed by the A.A. Baikov Institute of Metallurgy and Materials Science of the Russian Academy of Sciences.
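The role of such a reference database can be sketched as an index that records, for each substance, which member databases hold data on it, together with a brief property summary. This is a hypothetical in-memory illustration, not the institute's implementation; the class name `Metabase`, its methods, and the database names `db_a` and `db_b` are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Metabase:
    """Hypothetical reference database for a distributed DB system."""

    # substance -> names of the databases that hold data on it
    locations: dict = field(default_factory=dict)
    # substance -> brief summary of widely used properties
    summaries: dict = field(default_factory=dict)

    def register(self, substance, database, summary=None):
        """Record that `database` holds data on `substance`."""
        self.locations.setdefault(substance, set()).add(database)
        if summary:
            self.summaries.setdefault(substance, {}).update(summary)

    def where(self, substance):
        """Return the databases to query for `substance`, sorted by name."""
        return sorted(self.locations.get(substance, set()))


metabase = Metabase()
metabase.register("substance_x", "db_b")
metabase.register("substance_x", "db_a", summary={"some_property": "placeholder"})
targets = metabase.where("substance_x")
```

A client would consult `where` first and then send its query only to the listed databases, which is what lets the member systems stay on different platforms.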

Relevance: 20.00%

Abstract:

Visual information is becoming increasingly important and tools to manage repositories of media collections are highly sought after. In this paper, we focus on image databases and on how to effectively and efficiently access these. In particular, we present effective image browsing systems that are operated on a large multi-touch environment for truly interactive exploration. Not only do image browsers pose a useful alternative to retrieval-based systems, they also provide a visualisation of the whole image collection and let users explore particular parts of the collection. Our systems are based on the idea that visually similar images are located close to each other in the visualisation, that image thumbnails are arranged on a regular lattice (either a regular grid projected on a sphere or a hexagonal lattice), and that large image datasets can be accessed through a hierarchical tree structure. © 2014 International Information Institute.

Relevance: 20.00%

Abstract:

Short text messages, a.k.a. Microposts (e.g. Tweets), have proven to be an effective channel for revealing information about trends and events, ranging from those related to Disaster (e.g. hurricane Sandy) to those related to Violence (e.g. Egyptian revolution). Being informed about such events as they occur could be extremely important to authorities and emergency professionals by allowing such parties to respond immediately. In this work we study the problem of topic classification (TC) of Microposts, which aims to automatically classify short messages based on the subject(s) discussed in them. The accurate TC of Microposts, however, is a challenging task, since the limited number of tokens in a post often implies a lack of sufficient contextual information. In order to provide contextual information to Microposts, we present and evaluate several graph structures surrounding concepts present in linked knowledge sources (KSs). Traditional TC techniques enrich the content of Microposts with features extracted only from the Microposts content. In contrast, our approach relies on the generation of different weighted semantic meta-graphs extracted from linked KSs. We introduce a new semantic graph, called category meta-graph. This novel meta-graph provides a more fine-grained categorisation of concepts, providing a set of novel semantic features. Our findings show that such category meta-graph features effectively improve the performance of a topic classifier of Microposts. Furthermore, our goal is also to understand which semantic feature contributes to the performance of a topic classifier. For this reason we propose an approach for automatic estimation of accuracy loss of a topic classifier on new, unseen Microposts. We introduce and evaluate novel topic similarity measures, which capture the similarity between the KS documents and Microposts at a conceptual level, considering the enriched representation of these documents.
Extensive evaluation in the context of Emergency Response (ER) and Violence Detection (VD) revealed that our approach outperforms previous approaches that use a single KS without linked data, or Twitter data only, by up to 31.4% in terms of F1 measure. Our main findings indicate that the new category graph contains useful information for TC and achieves comparable results to previously used semantic graphs. Furthermore, our results also indicate that the accuracy of a topic classifier can be accurately predicted using the enhanced text representation, outperforming previous approaches considering content-based similarity measures. © 2014 Elsevier B.V. All rights reserved.
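For readers unfamiliar with the F1 measure used to report these results, it is the harmonic mean of precision and recall. The sketch below computes it from raw counts; the function name `f1_score` and the example counts are illustrative and are not figures from the paper.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Made-up counts: 8 posts correctly labelled, 2 wrongly labelled, 2 missed.
score = f1_score(tp=8, fp=2, fn=2)
```

With these counts precision and recall are both 0.8, so the F1 score is 0.8 as well.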

Relevance: 20.00%

Abstract:

IMPORTANCE: Metformin is widely viewed as the best initial pharmacological option to lower glucose concentrations in patients with type 2 diabetes mellitus. However, the drug is contraindicated in many individuals with impaired kidney function because of concerns of lactic acidosis. OBJECTIVE: To assess the risk of lactic acidosis associated with metformin use in individuals with impaired kidney function. EVIDENCE ACQUISITION: In July 2014, we searched the MEDLINE and Cochrane databases for English-language articles pertaining to metformin, kidney disease, and lactic acidosis in humans between 1950 and June 2014. We excluded reviews, letters, editorials, case reports, small case series, and manuscripts that did not directly pertain to the topic area or that met other exclusion criteria. Of an original 818 articles, 65 were included in this review, including pharmacokinetic/metabolic studies, large case series, retrospective studies, meta-analyses, and a clinical trial. RESULTS: Although metformin is renally cleared, drug levels generally remain within the therapeutic range and lactate concentrations are not substantially increased when used in patients with mild to moderate chronic kidney disease (estimated glomerular filtration rates, 30-60 mL/min per 1.73 m²). The overall incidence of lactic acidosis in metformin users varies across studies from approximately 3 per 100 000 person-years to 10 per 100 000 person-years and is generally indistinguishable from the background rate in the overall population with diabetes. Data suggesting an increased risk of lactic acidosis in metformin-treated patients with chronic kidney disease are limited, and no randomized controlled trials have been conducted to test the safety of metformin in patients with significantly impaired kidney function.
Population-based studies demonstrate that metformin may be prescribed counter to prevailing guidelines suggesting a renal risk in up to 1 in 4 patients with type 2 diabetes mellitus, use which, in most reports, has not been associated with increased rates of lactic acidosis. Observational studies suggest a potential benefit from metformin on macrovascular outcomes, even in patients with prevalent renal contraindications for its use. CONCLUSIONS AND RELEVANCE: Available evidence supports cautious expansion of metformin use in patients with mild to moderate chronic kidney disease, as defined by estimated glomerular filtration rate, with appropriate dosage reductions and careful follow-up of kidney function.

Relevance: 20.00%

Abstract:

This paper surveys research in the field of data mining, which is related to discovering the dependencies between attributes in databases. We consider a number of approaches to finding the distribution intervals of association rules, to discovering branching dependencies between a given set of attributes and a given attribute in a database relation, to finding fractional dependencies between a given set of attributes and a given attribute in a database relation, and to collaborative filtering.
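Association rules, the first kind of dependency listed above, are usually described by two standard quantities: the support of an itemset and the confidence of a rule. The sketch below computes both over a toy set of transactions; the function names and the data are illustrative and do not come from the survey.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    joint = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return joint / base if base else 0.0


# Toy transactions over the items a, b and c.
transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
s_ab = support(transactions, {"a", "b"})
c_a_to_b = confidence(transactions, {"a"}, {"b"})
```

Here `{"a", "b"}` appears in two of four transactions, and two of the three transactions containing `a` also contain `b`, so the rule `a -> b` has support 0.5 and confidence 2/3.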

Relevance: 20.00%

Abstract:

Ivanka Marasheva-Delinova - This paper considers the choice of topic when developing projects in classes that do not specialise in mathematics. Criteria for selecting a topic are given. Sample topics and sources for development are proposed, suited to the students' age-related and individual characteristics as well as their shared and individual interests.

Relevance: 20.00%

Abstract:

2000 Mathematics Subject Classification: 62H30

Relevance: 20.00%

Abstract:

Topic classification (TC) of short text messages offers an effective and fast way to reveal events happening around the world ranging from those related to Disaster (e.g. Sandy hurricane) to those related to Violence (e.g. Egypt revolution). Previous approaches to TC have mostly focused on exploiting individual knowledge sources (KS) (e.g. DBpedia or Freebase) without considering the graph structures that surround concepts present in KSs when detecting the topics of Tweets. In this paper we introduce a novel approach for harnessing such graph structures from multiple linked KSs, by: (i) building a conceptual representation of the KSs, (ii) leveraging contextual information about concepts by exploiting semantic concept graphs, and (iii) providing a principled way for the combination of KSs. Experiments evaluating our TC classifier in the context of Violence detection (VD) and Emergency Responses (ER) show promising results that significantly outperform various baseline models including an approach using a single KS without linked data and an approach using only Tweets. Copyright 2013 ACM.