927 results for Wikipedia, crowdsourcing, collaborative translation
Abstract:
Crowdsourcing has become a popular approach for capitalizing on the potential of large and open crowds of people external to the organization. While crowdsourcing as a phenomenon is studied in a variety of fields, research mostly focuses on isolated aspects and little is known about the integrated design of crowdsourcing efforts. We introduce a socio-technical systems perspective on crowdsourcing, which provides a deeper understanding of the components and relationships in crowdsourcing systems. By considering the function of crowdsourcing systems within their organizational context, we develop a typology of four distinct system archetypes. We analyze the characteristics of each type and derive a number of design requirements for the respective system components. The paper lays a foundation for IS-based crowdsourcing research, channels related academic work, and helps guide the study and design of crowdsourcing information systems.
Abstract:
Building and maintaining software are not easy tasks. However, thanks to advances in web technologies, a new paradigm is emerging in software development. The Service Oriented Architecture (SOA) is a relatively new approach that helps bridge the gap between business and IT and also helps systems remain flexible. However, there are still several challenges with SOA. As the number of available services grows, developers are faced with the problem of discovering the services they need. Public service repositories such as Programmable Web provide only limited search capabilities. Several mechanisms have been proposed to improve web service discovery by using semantics. However, most of these require manually tagging the services with concepts in an ontology. Adding semantic annotations is a non-trivial process that requires a certain skill-set from the annotator and also the availability of domain ontologies that include the concepts related to the topics of the service. These issues have prevented these mechanisms from becoming widespread. This thesis focuses on two main problems. First, to avoid the overhead of manually adding semantics to web services, several automatic methods to include semantics in the discovery process are explored. Although experimentation with some of these strategies has been conducted in the past, the results reported in the literature are mixed. Second, Wikipedia is explored as a general-purpose ontology. The benefit of using it as an ontology is assessed by comparing these semantics-based methods to classic term-based information retrieval approaches. The contribution of this research is significant because, to the best of our knowledge, a comprehensive analysis of the impact of using Wikipedia as a source of semantics in web service discovery does not exist. The main output of this research is a web service discovery engine that implements these methods and a comprehensive analysis of the benefits and trade-offs of these semantics-based discovery approaches.
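As a rough illustration of the classic term-based retrieval baseline mentioned in this abstract, a keyword-matching discovery step could rank service descriptions by TF-IDF cosine similarity to a query. The toy service descriptions, the query, and the use of scikit-learn below are assumptions for illustration only, not the discovery engine actually built in the thesis; a semantics-based variant along the lines of the thesis would instead represent the query and services as vectors of Wikipedia concepts.

# Hypothetical sketch: term-based (TF-IDF) ranking of web service descriptions.
# The services and query are invented toy data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

services = {
    "weather-api": "Returns current weather conditions and forecasts for a city.",
    "geo-coder": "Converts street addresses into latitude and longitude coordinates.",
    "stock-quotes": "Provides delayed stock quotes and daily market summaries.",
}

def rank_services(query, service_texts):
    # Build TF-IDF vectors for the query and every service description,
    # then rank services by cosine similarity to the query.
    names = list(service_texts)
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform([query] + [service_texts[n] for n in names])
    scores = cosine_similarity(matrix[0:1], matrix[1:]).ravel()
    return sorted(zip(names, scores), key=lambda pair: -pair[1])

print(rank_services("weather forecast for London", services))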
Abstract:
This paper presents research that investigated the role of conflict in the editorial process of the online encyclopedia, Wikipedia. The study used a grounded approach to analyzing 147 conversations about quality from the archived history of the Wikipedia article 'Australia'. It found that conflict in Wikipedia is a generative friction, regulated by references to policy as part of a coordinated effort within the community to improve the quality of articles.
Abstract:
For TREC Crowdsourcing 2011 (Stage 2) we propose a network-based approach for assigning an indicative measure of worker trustworthiness in crowdsourced labelling tasks. Workers, the gold standard and worker/gold standard agreements are modelled as a network. For the purpose of worker trustworthiness assignment, a variant of the PageRank algorithm, named TurkRank, is used to adaptively combine evidence that suggests worker trustworthiness, i.e., agreement with other trustworthy co-workers and agreement with the gold standard. A single parameter controls the importance of co-worker agreement versus gold standard agreement. The TurkRank score calculated for each worker is incorporated into a worker-weighted mean label aggregation.
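As a rough sketch only (the abstract does not give the exact update rule), a TurkRank-style computation could blend co-worker agreement with gold-standard agreement in a PageRank-like power iteration, then use the resulting scores as weights when aggregating labels. The matrix construction, parameter names, and mixing scheme below are assumptions, not the authors' exact method.

# Hypothetical sketch of a TurkRank-style trust score and trust-weighted
# label aggregation; numerical details are illustrative.
import numpy as np

def turkrank(coworker_agreement, gold_agreement, alpha=0.5, damping=0.85, iters=50):
    # coworker_agreement: (n, n) array of pairwise label-agreement rates between workers.
    # gold_agreement: (n,) array of each worker's agreement rate with the gold standard.
    # alpha: importance of co-worker agreement versus gold-standard agreement.
    n = len(gold_agreement)
    # Evidence that worker i deserves trust passed on from worker j: their mutual
    # agreement, blended with i's own agreement with the gold standard.
    evidence = alpha * coworker_agreement + (1 - alpha) * gold_agreement[:, None]
    col_sums = evidence.sum(axis=0)
    col_sums[col_sums == 0] = 1.0
    P = evidence / col_sums                  # column-stochastic transition matrix
    trust = np.full(n, 1.0 / n)
    for _ in range(iters):                   # PageRank-style power iteration
        trust = (1 - damping) / n + damping * (P @ trust)
    return trust / trust.sum()

def aggregate_labels(labels, trust):
    # labels: (n, m) array of worker labels in [0, 1], NaN where a worker skipped an item.
    # Returns the trust-weighted mean label for each of the m items.
    mask = ~np.isnan(labels)
    weighted = np.nansum(labels * trust[:, None], axis=0)
    weights = (mask * trust[:, None]).sum(axis=0)
    return weighted / np.where(weights == 0, 1.0, weights)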
Abstract:
In the field of information retrieval (IR), researchers and practitioners are often faced with a demand for valid approaches to evaluate the performance of retrieval systems. The Cranfield experiment paradigm has been dominant for the in-vitro evaluation of IR systems. As an alternative to this paradigm, laboratory-based user studies have been widely used to evaluate interactive information retrieval (IIR) systems, and at the same time investigate users’ information searching behaviours. Major drawbacks of laboratory-based user studies for evaluating IIR systems include the high monetary and temporal costs involved in setting up and running those experiments, the lack of heterogeneity amongst the user population and the limited scale of the experiments, which usually involve a relatively restricted set of users. In this paper, we propose an alternative experimental methodology to laboratory-based user studies. Our novel experimental methodology uses a crowdsourcing platform as a means of engaging study participants. Through crowdsourcing, our experimental methodology can capture user interactions and searching behaviours at a lower cost, with more data, and within a shorter period than traditional laboratory-based user studies, and therefore can be used to assess the performance of IIR systems. In this article, we show the characteristic differences of our approach with respect to traditional IIR experimental and evaluation procedures. We also perform a use case study comparing crowdsourcing-based evaluation with laboratory-based evaluation of IIR systems, which can serve as a tutorial for setting up crowdsourcing-based IIR evaluations.
Abstract:
Wikipedia is often held up as an example of the potential of the internet to foster open, free and non-commercial collaboration. However, such discourses often conflate these values without recognising how they play out in reality in a peer-production community. As Wikipedia is evolving, it is an ideal time to examine these discourses and the tensions that exist between its initial ideals and the reality of commercial activity in the encyclopaedia. Through an analysis of three failed proposals to ban paid advocacy editing in the English language Wikipedia, this paper highlights the shift in values from the early editorial community that forked encyclopaedic content over the threat of commercialisation, to one that today values the freedom that allows anyone to edit the encyclopaedia.
Abstract:
The interest in poverty and the moral sense of 'helping the poor' are a constant topic in Western culture (Mayo 2009). In recent years, multinational corporations (MNCs) have evolved in their understanding of how social issues, such as poverty alleviation, relate to their fundamental purposes. From a business strategy point of view, 'socially responsible' initiatives are generally born with the dual purpose of attaining social visibility (i.e. marketing) and increasing economic returns. Besides addressing social challenges as part of their corporate social responsibility strategies, MNCs have also begun 'selling to the poor' in emerging markets (Prahalad 2004). A few forward-looking companies consider this base of the pyramid (BOP) market also as a source of innovation and have started to co-create with consumers (Simanis and Hart 2008).
Abstract:
Though popular, concepts such as Toffler's 'prosumer' (1970; 1980; 1990) are inherently limited in their ability to accurately describe the makeup and dynamics of current co-creative environments, from fundamentally non-profit initiatives like the Wikipedia to user-industry partnerships that engage in crowdsourcing and the development of collective intelligence. Instead, the success or failure of such projects can be understood best if the traditional producer/consumer divide is dissolved, allowing for the emergence of the produser (Bruns, 2008). A close investigation of leading spaces for produsage makes it possible to extract the key principles which underpin and guide such content co-creation, and to identify how innovative pro-am partnerships between commercial entities and user communities might be structured in order to maximise the benefits that both sides will be able to draw from such collaboration. This chapter will outline these principles, and point to successes and failures in applying them to pro-am initiatives.
Abstract:
Clustering is an important technique in organising and categorising web scale documents. The main challenges faced in clustering the billions of documents available on the web are the processing power required and the sheer size of the datasets available. More importantly, it is nigh impossible to generate the labels for a general web document collection containing billions of documents and a vast taxonomy of topics. However, document clusters are most commonly evaluated by comparison to a ground truth set of labels for documents. This paper presents a clustering and labeling solution where the Wikipedia is clustered and hundreds of millions of web documents in ClueWeb12 are mapped onto those clusters. This solution is based on the assumption that the Wikipedia contains such a wide range of diverse topics that it represents a small scale web. We found that it was possible to perform the web scale document clustering and labeling process on one desktop computer within a couple of days for the Wikipedia clustering solution containing about 1,000 clusters. It takes longer to execute a solution with finer-granularity clusterings, such as 10,000 or 50,000 clusters. These results were evaluated using a set of external data.
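As a toy illustration of the two-stage idea described above (cluster Wikipedia, then label arbitrary web documents by their nearest cluster), the sketch below uses a handful of invented texts and a tiny cluster count; the paper itself works at the scale of all of Wikipedia and ClueWeb12, with roughly 1,000 to 50,000 clusters, and its actual pipeline is not specified here.

# Hypothetical sketch on toy data: cluster "Wikipedia" texts, then map "web"
# documents onto the nearest cluster, whose id serves as the document's label.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import MiniBatchKMeans

wikipedia_texts = [
    "Cricket is a bat-and-ball game played between two teams of eleven players.",
    "Association football is a team sport played with a spherical ball.",
    "A central bank manages a state's currency and monetary policy.",
    "Inflation is a general rise in the price level of goods in an economy.",
]
web_documents = [
    "Last night's football match ended in a goalless draw.",
    "The central bank raised interest rates to curb inflation.",
]

vectorizer = TfidfVectorizer(stop_words="english")
wiki_vectors = vectorizer.fit_transform(wikipedia_texts)

# Two clusters here stand in for the ~1,000-cluster Wikipedia solution.
kmeans = MiniBatchKMeans(n_clusters=2, n_init=10, random_state=0)
kmeans.fit(wiki_vectors)

# Label each web document with its nearest Wikipedia cluster.
web_vectors = vectorizer.transform(web_documents)
print(kmeans.predict(web_vectors))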
Abstract:
This chapter considers the legal ramifications of Wikipedia, and other online media, such as the Encyclopedia of Life. Nathaniel Tkacz (2007) has observed: 'Wikipedia is an ideal entry-point from which to approach the shifting character of knowledge in contemporary society.' He observes: 'Scholarship on Wikipedia from computer science, history, philosophy, pedagogy and media studies has moved beyond speculation regarding its considerable potential, to the task of interpreting - and potentially intervening in - the significance of Wikipedia's impact' (Tkacz 2007). After an introduction, Part II considers the evolution and development of Wikipedia, and the legal troubles that have attended it. It also considers the establishment of rival online encyclopedias - such as Citizendium, set up by Larry Sanger, the co-founder of Wikipedia; and Knol, the mysterious new project of Google. Part III explores the use of mass, collaborative authorship in the field of science. In particular, it looks at the development of the Encyclopedia of Life, which seeks to document the world's biodiversity. This chapter expresses concern that Wiki-based software had to develop in a largely hostile and inimical legal environment. It contends that copyright law and related fields of intellectual property need to be reformed in order better to accommodate users of copyright material (Rimmer 2007). This chapter makes a number of recommendations. First, there is a need to acknowledge and recognize forms of mass, collaborative production and consumption - not just individual authorship. Second, the view of a copyright 'work' and other subject matter as a complete and closed piece of cultural production also should be reconceptualised. Third, the defense of fair use should be expanded to accommodate a wide range of amateur, peer-to-peer production activities - not only in the United States, but in other jurisdictions as well. Fourth, the safe harbor protections accorded to Internet intermediaries, such as Wikipedia, should be strengthened. Fifth, there should be a defense in respect of the use of 'orphan works' - especially in cases of large-scale digitization. Sixth, the innovations of open source licensing should be expressly incorporated and entrenched within the formal framework of copyright laws. Finally, courts should craft judicial remedies to take into account concerns about political censorship and freedom of speech.
Abstract:
This thesis explores how people and technologies work together to coordinate and shape participation through a case study of the online encyclopaedia Wikipedia. The research found participation is shaped by different understandings of openness, where it is constructed either as a libertarian ideal where "anyone" is free to edit the encyclopaedia, or as an inclusive concept that enables "everyone" to participate in the platform. The findings therefore problematise the idea of a single user community, and serve to highlight the different and sometimes competing approaches actors employ to enable and constrain participation in Wikipedia.
Abstract:
Analisi contrastiva delle modalità di traduzione in finnico dei Tempi verbali e delle perifrasi aspettuali dell'italiano (Italian Philology). The topic of this research is a contrastive study of tenses and aspect in Italian and in Finnish. The study aims to develop a research method for analyzing translations and comparable texts (non-translations) written in a target language. Thus, the analysis is based on empirical data consisting of translations of novels from Italian to Finnish and vice versa. In addition, for the section devoted to the solutions adopted in Finnish for translating the Italian tenses Perfetto Semplice and Perfetto Composto, 39 Finnish native speakers were asked to answer questions concerning the choice of Perfekti and Imperfekti in Finnish. The responses given by the Finnish informants were compared to the choices made by translators in the target language; in this way it was possible both to draw on the motivations provided by native speakers for selecting a tense (Imperfekti/Perfekti) in a specific context, compared with the Italian formal equivalents (Perfetto Composto/Perfetto Semplice), and to define the specific features of the Finnish verb tenses. The research aims to develop a qualitative method for the analysis of formal equivalents and translational changes ('shifts'). As the choice of Italian and Finnish progressive forms is optional and related to speaker preferences, besides the qualitative analysis I also considered it necessary to carry out a quantitative one, in order to find out whether the two items share the same degree of correspondence in frequency of use. In this study I explain translation choices in light of cognitive grammar, suggesting that particular translation relationships derive from so-called construal operations. I use the concepts of cognitive linguistics not only to analyze the convergences and divergences of the two aspectual systems, but also to redefine some general procedures related to the phenomenon of translation. For the practical analysis of the corpus, the theoretical categories employed were for the most part those developed in the framework proposed by Pier Marco Bertinetto. Following this approach, the notions of aspect (the morphological or morphosyntactic, subjective level) and actionality (the lexical aspect or objective level, traditionally Aktionsart) are carefully distinguished. This also allowed me to test the applicability of these distinctions to two languages typologically different from each other. The data allowed both the analysis of the semantic and pragmatic features that determine tense and aspect choices in these two languages, and the discovery of the correspondences between the two language systems and the strategies that translators are forced to resort to in particular situations. The research provides not only a detailed and analytically argued inventory of possible solutions for translating Italian tenses and aspectual devices into Finnish, which could be of pedagogical relevance, but also new contributions on the specific uses of temporal-aspectual devices in the two languages in question.