76 results for Wikipedia, crowdsourcing, traduzione collaborativa
Abstract:
We propose a cluster ensemble method that maps the corpus documents into the semantic space embedded in Wikipedia and groups them using multiple types of feature space. A heterogeneous cluster ensemble is constructed from multiple types of relations, i.e., document-term, document-concept and document-category. A final clustering solution is obtained by exploiting associations between document pairs and the hubness of the documents. Empirical analysis with various real data sets reveals that the proposed method outperforms state-of-the-art text clustering approaches.
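The idea of combining several clusterings through associations between document pairs can be illustrated with a generic co-association consensus. This is only a minimal sketch, not the authors' method: it ignores hubness entirely, and the function names, the 0.5 threshold and the three toy partitions (standing in for document-term, document-concept and document-category clusterings) are assumptions made for illustration.

```python
from itertools import combinations


def co_association(partitions):
    """Return a matrix whose (i, j) entry is the fraction of partitions
    in which documents i and j fall into the same cluster."""
    n = len(partitions[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                counts[i][j] += 1
                counts[j][i] += 1
    m = len(partitions)
    return [[1.0 if i == j else counts[i][j] / m for j in range(n)]
            for i in range(n)]


def consensus_clusters(partitions, threshold=0.5):
    """Link document pairs whose co-association exceeds the threshold and
    return one label per document (connected components via union-find)."""
    matrix = co_association(partitions)
    n = len(matrix)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if matrix[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel components 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Hypothetical partitions of five documents, one per feature space.
by_terms = [0, 0, 1, 1, 2]
by_concepts = [0, 0, 0, 1, 1]
by_categories = [1, 1, 0, 0, 0]
print(consensus_clusters([by_terms, by_concepts, by_categories]))
```

With these toy inputs, documents 0 and 1 agree in all three partitions, while documents 2, 3 and 4 are chained together by pairs that agree in two of three.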
Abstract:
Entity-oriented retrieval aims to return a list of relevant entities rather than documents to provide exact answers for user queries. The nature of entity-oriented retrieval requires identifying the semantic intent of user queries, i.e., understanding the semantic role of query terms and determining the semantic categories which indicate the class of target entities. Existing methods are not able to exploit the semantic intent by capturing the semantic relationship between terms in a query and in a document that contains entity-related information. To improve the understanding of the semantic intent of user queries, we propose a concept-based retrieval method that not only automatically identifies the semantic intent of user queries, i.e., Intent Type and Intent Modifier, but also introduces concepts represented by Wikipedia articles to user queries. We evaluate our proposed method on entity profile documents annotated by concepts from Wikipedia category and list structure. Empirical analysis reveals that the proposed method outperforms several state-of-the-art approaches.
Abstract:
This research investigates the extent to which the World Wide Web and the participatory news media culture have contributed to the democratisation of journalism since 1997. It examined the different ways in which public service and commercial news media models use digital platforms to fulfil their obligations as members of the Fourth Estate. The research found that the digital environment provides news organisations with greater scope for transparency, interactivity, collaboration and social networking compared to the traditional print and broadcast platforms.
Abstract:
This article examines manual textual categorisation by human coders with the hypothesis that the law of total probability may be violated for difficult categories. An empirical evaluation was conducted to compare a one-step categorisation task with a two-step categorisation task using crowdsourcing. It was found that the law of total probability was violated. Both quantum and classical probabilistic interpretations of this violation are presented. Further studies are required to resolve whether quantum models are more appropriate for this task.
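For reference, the law of total probability states that P(A) = Σ P(A | B_i) · P(B_i) over a set of mutually exclusive, exhaustive intermediate categories B_i. The sketch below shows how a violation could be measured by comparing a directly judged P(A) (one-step task) with the value predicted from a two-step task. The function names and all probabilities are hypothetical illustrations, not data from the article.

```python
def total_probability(p_b, p_a_given_b):
    """Predict P(A) from a two-step task: sum over B of P(A | B) * P(B)."""
    return sum(p_b[b] * p_a_given_b[b] for b in p_b)


def violation(p_a_direct, p_b, p_a_given_b):
    """Difference between the one-step estimate of P(A) and the two-step
    prediction; zero when the law of total probability holds exactly."""
    return p_a_direct - total_probability(p_b, p_a_given_b)


# Hypothetical proportions of coder judgements (illustrative only).
p_b = {"B1": 0.6, "B2": 0.4}            # first-step category choices
p_a_given_b = {"B1": 0.5, "B2": 0.25}   # second-step choices of A given B
p_a_direct = 0.55                       # one-step choice of A

print(total_probability(p_b, p_a_given_b))
print(violation(p_a_direct, p_b, p_a_given_b))
```

In practice any such difference would need a statistical test before being read as a genuine violation rather than sampling noise.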
Abstract:
The first live appearance of The Apartments after many years was at Brisbane's Pig City, a live music event curated to coincide with the release of Andrew Stafford's book of the same name.
Abstract:
Acoustic sensing is a promising approach to scaling faunal biodiversity monitoring. Scaling the analysis of audio collected by acoustic sensors is a big data problem. Standard approaches for dealing with big acoustic data include automated recognition and crowd-based analysis. Automatic methods are fast at processing but hard to rigorously design, whilst manual methods are accurate but slow at processing. In particular, manual methods of acoustic data analysis are constrained by a 1:1 time relationship between the data and its analysts. This constraint arises from the inherent need to listen to the audio data. This paper demonstrates how the efficiency of crowdsourced sound analysis can be increased by an order of magnitude through the visual inspection of audio visualized as spectrograms. Experimental data suggest that an analysis speedup of 12× is obtainable for suitable types of acoustic analysis, given that only spectrograms are shown.
Abstract:
Over the past decade, social media have gone through a process of legitimation and official adoption, and they are now becoming embedded as part of the official communications apparatus of many commercial and public-sector organisations, in turn providing platforms like Twitter with their own sources of legitimacy. Arguably, the demonstrated utility of social media platforms and tools in times of crisis (from civil unrest and violent crime through to natural disasters like bushfires, earthquakes, and floods) has been a crucial driver of this newfound legitimacy. In the mid-2000s, user-created content and ‘Web 2.0’ platforms were known to play a role in crisis communication; back then, the involvement of extra-institutional actors in providing and sharing information around such events involved distributed, ad hoc, or niche platforms (like Flickr), and was more likely to be framed as ‘citizen journalism’ or ‘crowdsourcing’ (see, for example, Liu, Palen, Sutton, Hughes, & Vieweg, 2008, on the then-emerging role of photo-sharing in disasters). Since then, the dramatically increased take-up of mainstream social media platforms like Facebook and Twitter means that the pool of potential participants in online crisis communication has broadened to include a much larger proportion of the general population, as well as traditional media and official emergency response organisations.
Abstract:
This is the fourth edition of New Media: An Introduction, with the previous editions being published by Oxford University Press in 2002, 2005 and 2008. As the first edition of the book published in the 2010s, every chapter has been comprehensively revised, and there are new chapters on Online News and the Future of Journalism (Chapter 7), New Media and the Transformation of Higher Education (Chapter 10), and Online Activism and Networked Politics (Chapter 12). It has retained popular features of the third edition, including the twenty key concepts in new media (Chapter 2) and illustrative case studies to assist with teaching new media. The case studies in the book cover: the global internet; Wikipedia; transmedia storytelling; Media Studies 2.0; the games industry and exploitation; video games and violence; WikiLeaks; the innovator’s dilemma; massive open online courses (MOOCs); Creative Commons; the Barack Obama Presidential campaigns; and the Arab Spring. Several major changes in the media environment since the publication of the third edition stand out. Of particular importance has been the rise of social media platforms such as Facebook, Twitter and YouTube, which draw out even more strongly the features of the internet as networked and participatory media, with a range of implications across the economy, society and culture. In addition, the political implications of new media have become more apparent with a range of social media-based political campaigns, from Barack Obama’s successful Presidential election campaigns to the Occupy movements and the Arab Spring. At the same time, the subsequent developments of politics in these and other cases have drawn attention to the limitations of thinking about politics or the public sphere in technologically determinist ways. When the first edition of New Media was published in 2002, the concept of new media was seen as being largely about the internet as it was accessed from personal computers.
The subsequent decade has seen a proliferation of platforms and devices: we now access media in all forms from our phones and other mobile platforms, we see television and the internet increasingly converging, and we see a growing uncoupling of digital media content and delivery platforms. While this has a range of implications for media law and policy, from convergent media policy to copyright reform, governments and policy-makers are struggling to adapt to such seismic shifts from mass communications media to convergent social media. The internet is no longer primarily a Western-based medium. Two-thirds of the world’s internet users are now outside of Europe and North America; three-quarters of internet users use languages other than English; and three-quarters of the world’s mobile cellular phone subscriptions are in developing nations. It is also apparent that discussions about how to develop new media technologies can no longer be separated from discussions about their cultural and creative content. Discussions of broadband strategies and the knowledge economy need to be increasingly joined with those concerning the creative industries and the creative economy.
Abstract:
This paper examines the use of crowdfunding platforms to fund academic research. Looking specifically at the use of a Pozible campaign to raise funds for a small pilot research study into home education in Australia, the paper reports on the successes and problems of using the platform. It also examines the crowdsourcing of literature searching as part of the package. The paper looks at the realities of using this type of platform to gain start-up funding for a project and argues that family and friends are likely to be the biggest supporters. This finding is consistent with similar work in the arts communities that are traditionally served by crowdfunding platforms. The paper argues that, with exceptions, these platforms can be a source of income at a time when academics are finding it increasingly difficult to source government funding for projects.
Abstract:
From the Oder floods and Hurricane Sandy to the tsunami and reactor meltdown on Japan's east coast: recent years have unfortunately been rich in natural disasters and other crisis situations that have affected hundreds of thousands of people. Apart from the fact that many of these crises have made the first effects of climate change tangible, they also illustrate another, no less important form of change: the gradual transformation of the media landscape, in which traditional mass media are increasingly complemented, and in part perhaps even replaced, by social media such as Facebook or Twitter.
Abstract:
This paper gives an overview of the INEX 2008 Ad Hoc Track. The main goals of the Ad Hoc Track were two-fold. The first goal was to investigate the value of the internal document structure (as provided by the XML mark-up) for retrieving relevant information. This is a continuation of INEX 2007 and, for this reason, the retrieval results are liberalized to arbitrary passages and measures were chosen to fairly compare systems retrieving elements, ranges of elements, and arbitrary passages. The second goal was to compare focused retrieval to article retrieval more directly than in earlier years. For this reason, standard document retrieval rankings have been derived from all runs, and evaluated with standard measures. In addition, a set of queries targeting Wikipedia has been derived from a proxy log, and the runs are also evaluated against the clicked Wikipedia pages. The INEX 2008 Ad Hoc Track featured three tasks: For the Focused Task a ranked list of non-overlapping results (elements or passages) was needed. For the Relevant in Context Task non-overlapping results (elements or passages) were returned grouped by the article from which they came. For the Best in Context Task a single starting point (element start tag or passage start) for each article was needed. We discuss the results for the three tasks, and examine the relative effectiveness of element and passage retrieval. This is examined in the context of content only (CO, or Keyword) search as well as content and structure (CAS, or structured) search. Finally, we look at the ability of focused retrieval techniques to rank articles, using standard document retrieval techniques, against both the judged topics and the queries and clicks from a proxy log.
Abstract:
INEX investigates focused retrieval from structured documents by providing large test collections of structured documents, uniform evaluation measures, and a forum for organizations to compare their results. This paper reports on the INEX 2014 evaluation campaign, which consisted of three tracks: The Interactive Social Book Search Track investigated user information seeking behavior when interacting with various sources of information, for realistic task scenarios, and how the user interface impacts search and the search experience. The Social Book Search Track investigated the relative value of authoritative metadata and user-generated content for search and recommendation using a test collection with data from Amazon and LibraryThing, including user profiles and personal catalogues. The Tweet Contextualization Track investigated tweet contextualization, helping a user to understand a tweet by providing them with a short, coherent background summary built from relevant Wikipedia passages. INEX 2014 was an exciting year for INEX in which, for the third time, we ran our workshop as part of the CLEF labs. This paper gives an overview of all the INEX 2014 tracks, their aims and tasks, the test collections built, the participants, and an initial analysis of the results.
Abstract:
This project is a step forward in the study of text mining where enhanced text representation with semantic information plays a significant role. It develops effective methods of entity-oriented retrieval, semantic relation identification and text clustering utilizing semantically annotated data. These methods are based on enriched text representation generated by introducing semantic information extracted from Wikipedia into the input text data. The proposed methods are evaluated against several state-of-the-art benchmarking methods on real-life data sets. In particular, this thesis improves the performance of entity-oriented retrieval, identifies different lexical forms for an entity relation and handles clustering documents with multiple feature spaces.
Abstract:
The use of ‘topic’ concepts has shown improved search performance, given a query, by bringing together relevant documents which use different terms to describe a higher level concept. In this paper, we propose a method for discovering and utilizing concepts in indexing and search for a domain-specific document collection being utilized in industry. This approach differs from others in that we only collect focused concepts to build the concept space and that, instead of turning a user’s query into a concept-based query, we experiment with different techniques of combining the original query with a concept query. We apply the proposed approach to a real-world document collection and the results show that in this scenario the use of concept knowledge at index and search time can improve the relevancy of results.
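The abstract does not say which techniques were used to combine the original query with the concept query. One common option, shown here purely as an assumed example, is linear interpolation of the two sets of per-document scores. The function name, the `weight` parameter and the toy scores are hypothetical.

```python
def combine_scores(original, concept, weight=0.7):
    """Rank documents by a weighted sum of their original-query score and
    their concept-query score; a missing score counts as 0.0."""
    docs = set(original) | set(concept)
    combined = {
        d: weight * original.get(d, 0.0) + (1 - weight) * concept.get(d, 0.0)
        for d in docs
    }
    # Highest combined score first; ties broken by document id.
    return sorted(combined, key=lambda d: (-combined[d], d))


# Hypothetical retrieval scores for three documents.
original_scores = {"d1": 0.9, "d2": 0.2}
concept_scores = {"d2": 0.8, "d3": 0.6}
print(combine_scores(original_scores, concept_scores, weight=0.5))
```

A weight of 1.0 reduces to the original query alone and 0.0 to the concept query alone, so the parameter makes the trade-off between the two explicit.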
Abstract:
This chapter analyses the copyright law framework needed to ensure open access to outputs of the Australian academic and research sector such as journal articles and theses. It overviews the new knowledge landscape, the principles of copyright law, the concept of open access to knowledge, the recently developed open content models of copyright licensing and the challenges faced in providing greater access to knowledge and research outputs.