5 resultados para Wikipedia, crowdsourcing, traduzione collaborativa
em Indian Institute of Science - Bangalore - Índia
Resumo:
An exciting application of crowdsourcing is to use social networks in complex task execution. In this paper, we address the problem of a planner who needs to incentivize agents within a network in order to seek their help in executing an atomic task as well as in recruiting other agents to execute the task. We study this mechanism design problem under two natural resource optimization settings: (1) cost critical tasks, where the planner's goal is to minimize the total cost, and (2) time critical tasks, where the goal is to minimize the total time elapsed before the task is executed. We identify a set of desirable properties that should ideally be satisfied by a crowdsourcing mechanism. In particular, sybil-proofness and collapse-proofness are two complementary properties in our desiderata. We prove that no mechanism can satisfy all the desirable properties simultaneously. This leads us naturally to explore approximate versions of the critical properties. We focus our attention on approximate sybil-proofness and our exploration leads to a parametrized family of payment mechanisms which satisfy collapse-proofness. We characterize the approximate versions of the desirable properties in cost critical and time critical domain.
Resumo:
In recent times, crowdsourcing over social networks has emerged as an active tool for complex task execution. In this paper, we address the problem faced by a planner to incen-tivize agents in the network to execute a task and also help in recruiting other agents for this purpose. We study this mecha-nism design problem under two natural resource optimization settings: (1) cost critical tasks, where the planner’s goal is to minimize the total cost, and (2) time critical tasks, where the goal is to minimize the total time elapsed before the task is executed. We define a set of fairness properties that should beideally satisfied by a crowdsourcing mechanism. We prove that no mechanism can satisfy all these properties simultane-ously. We relax some of these properties and define their ap-proximate counterparts. Under appropriate approximate fair-ness criteria, we obtain a non-trivial family of payment mech-anisms. Moreover, we provide precise characterizations of cost critical and time critical mechanisms.
Resumo:
In social choice theory, preference aggregation refers to computing an aggregate preference over a set of alternatives given individual preferences of all the agents. In real-world scenarios, it may not be feasible to gather preferences from all the agents. Moreover, determining the aggregate preference is computationally intensive. In this paper, we show that the aggregate preference of the agents in a social network can be computed efficiently and with sufficient accuracy using preferences elicited from a small subset of critical nodes in the network. Our methodology uses a model developed based on real-world data obtained using a survey on human subjects, and exploits network structure and homophily of relationships. Our approach guarantees good performance for aggregation rules that satisfy a property which we call expected weak insensitivity. We demonstrate empirically that many practically relevant aggregation rules satisfy this property. We also show that two natural objective functions in this context satisfy certain properties, which makes our methodology attractive for scalable preference aggregation over large scale social networks. We conclude that our approach is superior to random polling while aggregating preferences related to individualistic metrics, whereas random polling is acceptable in the case of social metrics.
Resumo:
Scatter/Gather systems are increasingly becoming useful in browsing document corpora. Usability of the present-day systems are restricted to monolingual corpora, and their methods for clustering and labeling do not easily extend to the multilingual setting, especially in the absence of dictionaries/machine translation. In this paper, we study the cluster labeling problem for multilingual corpora in the absence of machine translation, but using comparable corpora. Using a variational approach, we show that multilingual topic models can effectively handle the cluster labeling problem, which in turn allows us to design a novel Scatter/Gather system ShoBha. Experimental results on three datasets, namely the Canadian Hansards corpus, the entire overlapping Wikipedia of English, Hindi and Bengali articles, and a trilingual news corpus containing 41,000 articles, confirm the utility of the proposed system.
Resumo:
Identifying translations from comparable corpora is a well-known problem with several applications, e.g. dictionary creation in resource-scarce languages. Scarcity of high quality corpora, especially in Indian languages, makes this problem hard, e.g. state-of-the-art techniques achieve a mean reciprocal rank (MRR) of 0.66 for English-Italian, and a mere 0.187 for Telugu-Kannada. There exist comparable corpora in many Indian languages with other ``auxiliary'' languages. We observe that translations have many topically related words in common in the auxiliary language. To model this, we define the notion of a translingual theme, a set of topically related words from auxiliary language corpora, and present a probabilistic framework for translation induction. Extensive experiments on 35 comparable corpora using English and French as auxiliary languages show that this approach can yield dramatic improvements in performance (e.g. MRR improves by 124% to 0.419 for Telugu-Kannada). A user study on WikiTSu, a system for cross-lingual Wikipedia title suggestion that uses our approach, shows a 20% improvement in the quality of titles suggested.