2 resultados para Semantic web, Search engine optimization, Information retrieval, Key concept induction

em DRUM (Digital Repository at the University of Maryland)


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Social network sites (SNS), such as Facebook, Google+ and Twitter, have attracted hundreds of millions of users daily since their appearance. Within SNS, users connect to each other, express their identity, disseminate information and form cooperation by interacting with their connected peers. The increasing popularity and ubiquity of SNS usage and the invaluable user behaviors and connections give birth to many applications and business models. We look into several important problems within the social network ecosystem. The first one is the SNS advertisement allocation problem. The other two are related to trust mechanisms design in social network setting, including local trust inference and global trust evaluation. In SNS advertising, we study the problem of advertisement allocation from the ad platform's angle, and discuss its differences with the advertising model in the search engine setting. By leveraging the connection between social networks and hyperbolic geometry, we propose to solve the problem via approximation using hyperbolic embedding and convex optimization. A hyperbolic embedding method, \hcm, is designed for the SNS ad allocation problem, and several components are introduced to realize the optimization formulation. We show the advantages of our new approach in solving the problem compared to the baseline integer programming (IP) formulation. In studying the problem of trust mechanisms in social networks, we consider the existence of distrust (i.e. negative trust) relationships, and differentiate between the concept of local trust and global trust in social network setting. In the problem of local trust inference, we propose a 2-D trust model. Based on the model, we develop a semiring-based trust inference framework. In global trust evaluation, we consider a general setting with conflicting opinions, and propose a consensus-based approach to solve the complex problem in signed trust networks.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

While news stories are an important traditional medium to broadcast and consume news, microblogging has recently emerged as a place where people can dis- cuss, disseminate, collect or report information about news. However, the massive information in the microblogosphere makes it hard for readers to keep up with these real-time updates. This is especially a problem when it comes to breaking news, where people are more eager to know “what is happening”. Therefore, this dis- sertation is intended as an exploratory effort to investigate computational methods to augment human effort when monitoring the development of breaking news on a given topic from a microblog stream by extractively summarizing the updates in a timely manner. More specifically, given an interest in a topic, either entered as a query or presented as an initial news report, a microblog temporal summarization system is proposed to filter microblog posts from a stream with three primary concerns: topical relevance, novelty, and salience. Considering the relatively high arrival rate of microblog streams, a cascade framework consisting of three stages is proposed to progressively reduce quantity of posts. For each step in the cascade, this dissertation studies methods that improve over current baselines. In the relevance filtering stage, query and document expansion techniques are applied to mitigate sparsity and vocabulary mismatch issues. The use of word embedding as a basis for filtering is also explored, using unsupervised and supervised modeling to characterize lexical and semantic similarity. In the novelty filtering stage, several statistical ways of characterizing novelty are investigated and ensemble learning techniques are used to integrate results from these diverse techniques. These results are compared with a baseline clustering approach using both standard and delay-discounted measures. In the salience filtering stage, because of the real-time prediction requirement a method of learning verb phrase usage from past relevant news reports is used in conjunction with some standard measures for characterizing writing quality. Following a Cranfield-like evaluation paradigm, this dissertation includes a se- ries of experiments to evaluate the proposed methods for each step, and for the end- to-end system. New microblog novelty and salience judgments are created, building on existing relevance judgments from the TREC Microblog track. The results point to future research directions at the intersection of social media, computational jour- nalism, information retrieval, automatic summarization, and machine learning.