230 results for Wikipedia


Relevance: 10.00%

Abstract:

This paper describes the evaluation methodology used to benchmark the effectiveness of cross-lingual link discovery (CLLD). Cross-lingual link discovery is a way of automatically finding prospective links between documents in different languages, which is particularly helpful for knowledge discovery across different language domains. A CLLD evaluation framework is proposed for benchmarking system performance. The framework includes standard document collections, evaluation metrics, and link assessment and evaluation tools. The evaluation methods described in this paper have been used to quantify system performance at the NTCIR-9 Crosslink task. It is shown that using manual assessment to generate the gold standard delivers a more reliable evaluation result.
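
As an illustration of what such benchmarking involves, the sketch below computes link-level precision, recall and F1 for a system run against a gold standard. The triple format and the toy data are assumptions made for illustration, not the actual NTCIR Crosslink assessment format or toolkit.

```python
# Minimal sketch of link-level evaluation for cross-lingual link discovery (CLLD).
# The data layout (sets of (source_doc, anchor, target_doc) triples) is an
# illustrative assumption, not the actual NTCIR Crosslink assessment format.

def evaluate_links(system_links, gold_links):
    """Compute set-based precision, recall and F1 of discovered links."""
    system_links, gold_links = set(system_links), set(gold_links)
    hits = len(system_links & gold_links)
    precision = hits / len(system_links) if system_links else 0.0
    recall = hits / len(gold_links) if gold_links else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Example: a gold standard from manual assessment vs. a system run.
gold = {("zh:澳大利亚", "悉尼", "en:Sydney"), ("zh:澳大利亚", "堪培拉", "en:Canberra")}
run = {("zh:澳大利亚", "悉尼", "en:Sydney"), ("zh:澳大利亚", "墨尔本", "en:Brisbane")}
print(evaluate_links(run, gold))  # (0.5, 0.5, 0.5)
```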

Relevance: 10.00%

Abstract:

Purpose: The purpose of this paper is to clarify how end-users’ tacit knowledge can be captured and integrated into an overall business process management (BPM) approach. Current approaches to supporting stakeholders’ collaboration in the modelling of business processes envision an egalitarian environment where stakeholders interact in the same context, using the same languages and sharing the same perspectives on the business process. As a result, such stakeholders have to collaborate in the context of process modelling using a language that some of them do not master, and have to integrate their various perspectives. Design/methodology/approach: The paper applies the SECI knowledge management process to analyse the problems of traditional top-down BPM approaches and BPM collaborative modelling tools. In addition, the SECI model is applied to Wikipedia, a successful Web 2.0-based knowledge management environment, to identify how tacit knowledge is captured in a bottom-up approach. Findings: The paper identifies a set of requirements for a hybrid BPM approach, both top-down and bottom-up, and describes a new BPM method based on a stepwise discovery of knowledge. Originality/value: This new approach, Processpedia, enhances collaborative modelling among stakeholders without enforcing egalitarianism. In Processpedia, tacit knowledge is captured and standardised into the organisation’s business processes by fostering an ecological participation of all the stakeholders and capitalising on stakeholders’ distinctive characteristics.

Relevance: 10.00%

Abstract:

Language use has proven to be the most complex and complicating of all Internet features, yet people and institutions invest enormously in language and cross-language features because they are fundamental to the success of the Internet’s past, present and future. The thesis focuses on the development of the latter, features that facilitate and signify linking between or across languages, in both their historical and current contexts. In the theoretical analysis, the conceptual platform of inter-language linking is developed both to accommodate efforts towards a new social complexity model for the co-evolution of languages and language content, and to create an open analytical space for language and cross-language related features of the Internet and beyond. The practised uses of inter-language linking have changed over the last decades. Before and during the first years of the WWW, mechanisms of inter-language linking were at best important elements used to create new institutional or content arrangements, but on a large scale they remained insignificant. This changed as the WWW developed into a web in which content in different languages co-evolves. The thesis traces the inter-language linking mechanisms that facilitated these dynamic changes by analysing what these linking mechanisms are, how their historical as well as current contexts can be understood, and what kinds of cultural-economic innovation they enable and impede. The study discusses this alongside four empirical cases of bilingual or multilingual media use, ranging from television and web services for languages of smaller populations to large-scale web ventures involving multiple languages by the British Broadcasting Corporation, the Special Broadcasting Service Australia, Wikipedia and Google. To sum up, the thesis introduces the concepts of ‘inter-language linking’ and the ‘lateral web’ to model the social complexity and co-evolution of languages online. The resulting model reconsiders existing social complexity models in that it is the first that can explain the emergence of large-scale, networked co-evolution of languages and language content facilitated by the Internet and the WWW. Finally, the thesis argues that the Internet enables an open space for language and cross-language related features and investigates how far this process is facilitated by (1) amateurs and (2) human-algorithmic interaction cultures.

Relevance: 10.00%

Abstract:

Collaborative user-led content creation by online communities, or produsage (Bruns 2008), has generated a variety of useful and important resources and other valuable outcomes, from open source software through Wikipedia to a variety of smaller-scale, specialist projects. These are often seen as standing in inherent opposition to commercial interests, and attempts to develop collaborations between community content creators and commercial partners have had mixed success to date. However, such tension between community and commerce is not inevitable, and there is substantial potential for more fruitful exchanges and collaboration. This article contributes to the development of this understanding by outlining the key underlying principles of such participatory community processes and exploring the potential tensions which could arise between these communities and their potential external partners. It also sketches out potential approaches to resolving those tensions.

Relevance: 10.00%

Abstract:

This article provides an overview of some of the key aspects of the co-evolution of languages and their associated content in the Internet environment. A focus on this co-evolution is pertinent because the evolution of languages in the Internet environment can be better understood if the development of their existing and emerging content, that is, the content in the respective language, is taken into consideration. In doing so, this article examines two related aspects. The first is the governance of languages at critical sites of the Internet environment, including ICANN, Wikipedia and Google Translate. Following on from this examination, the second part outlines how the co-evolution of languages and associated content in the Internet environment extends policy-making related to linguistic pluralism. It is argued that policies which centre on language availability in the Internet environment must shift their focus to the dynamics of available content instead. The notion of language pairs as a new regime of intersection for both languages and content is discussed to introduce an extended understanding of the uses of linguistic pluralism in the Internet environment. The ultimate extrapolation of such an enhanced approach, it is argued, centres less on 6,000 languages than on 36 million language pairs. This article describes how such a powerful resource evolves in the Internet environment.
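
The 36 million figure follows from elementary combinatorics over roughly 6,000 languages. A quick check, under the assumption that pairs are directed (translation from language A to B counts separately from B to A):

```python
# Quick check of the "36 million language pairs" figure, assuming directed
# (ordered) pairs, i.e. translation direction matters.
languages = 6000
ordered_pairs = languages * (languages - 1)         # 35,994,000, roughly 36 million
unordered_pairs = languages * (languages - 1) // 2  # 17,997,000 if direction is ignored
print(ordered_pairs, unordered_pairs)
```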

Relevance: 10.00%

Abstract:

Nowadays people rely heavily on the Internet for information and knowledge. Wikipedia is an online multilingual encyclopaedia that contains a very large number of detailed articles covering most written languages, and it is often considered a treasury of human knowledge. It includes extensive hypertext links between documents of the same language for easy navigation. However, pages in different languages are rarely cross-linked except for direct equivalent pages on the same subject in different languages. This can pose serious difficulties to users seeking information or knowledge from sources in different languages, or where there is no equivalent page in one language or another. In this thesis, a new information retrieval task, cross-lingual link discovery (CLLD), is proposed to tackle the problem of the lack of cross-lingual anchored links in a knowledge base such as Wikipedia. In contrast to traditional information retrieval tasks, cross-lingual link discovery algorithms actively recommend a set of meaningful anchors in a source document and establish links to documents in an alternative language. In other words, cross-lingual link discovery is a way of automatically finding hypertext links between documents in different languages, which is particularly helpful for knowledge discovery across different language domains. This study focuses specifically on Chinese / English link discovery (C/ELD), a special case of the cross-lingual link discovery task that involves natural language processing (NLP), cross-lingual information retrieval (CLIR) and cross-lingual link discovery. To justify the effectiveness of CLLD, a standard evaluation framework is also proposed. The evaluation framework includes topics, document collections, a gold standard dataset, evaluation metrics, and toolkits for run pooling, link assessment and system evaluation. With the evaluation framework, the performance of CLLD approaches and systems can be quantified. This thesis contributes to research on natural language processing and cross-lingual information retrieval in CLLD in three ways: 1) a new simple but effective Chinese segmentation method, n-gram mutual information, is presented for determining the boundaries of Chinese text; 2) a voting mechanism for named entity translation is demonstrated, achieving high precision in English / Chinese machine translation; 3) a link mining approach that mines the existing link structure for anchor probabilities achieves encouraging results in suggesting cross-lingual Chinese / English links in Wikipedia. This approach was examined in experiments on the automatic generation of cross-lingual links carried out as part of the study. The overall major contribution of this thesis is the provision of a standard evaluation framework for cross-lingual link discovery research. This framework is important for CLLD evaluation because it helps in benchmarking the performance of various CLLD systems and in identifying good CLLD approaches. The evaluation methods and the evaluation framework described in this thesis have been used to quantify system performance in the NTCIR-9 Crosslink task, the first information retrieval track of its kind.
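
The link mining contribution rests on the notion of an anchor probability: how often a phrase is used as a link anchor when it occurs in article text. The sketch below illustrates that idea only; the counts, the Chinese-to-English title mapping and the threshold are toy placeholders, not the thesis's actual implementation.

```python
# Minimal sketch of the anchor-probability idea used in link mining: estimate how
# often a phrase is linked when it occurs, then propose cross-lingual links through
# a Chinese-to-English title mapping. All counts, tables and the threshold are
# toy placeholders.

from collections import defaultdict

link_count = defaultdict(int, {"悉尼": 80})    # times the phrase appears as a link anchor
text_count = defaultdict(int, {"悉尼": 100})   # times the phrase appears in article text
target_of = {"悉尼": "悉尼"}                   # most frequent Chinese link target per anchor
zh_to_en = {"悉尼": "Sydney"}                  # Chinese title -> English title (e.g. langlinks)

def anchor_probability(phrase):
    """P(phrase is linked | phrase occurs), estimated from the existing link structure."""
    return link_count[phrase] / text_count[phrase] if text_count[phrase] else 0.0

def suggest_crosslingual_links(document_phrases, threshold=0.1):
    """Recommend (anchor, English target) pairs for phrases that are usually linked."""
    return [(p, zh_to_en[target_of[p]])
            for p in document_phrases
            if anchor_probability(p) >= threshold and target_of.get(p) in zh_to_en]

print(suggest_crosslingual_links(["悉尼", "城市"]))  # [('悉尼', 'Sydney')]
```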

Relevance: 10.00%

Abstract:

By compiling, sorting and updating information, Wikipedia practises a form of news curation. What is special about this is not only that the content is not produced by journalists, but that a collective of "produsers" stands behind it: the user becomes a producer.

Relevance: 10.00%

Abstract:

Cross-Lingual Link Discovery (CLLD) is a new problem in Information Retrieval. The aim is to automatically identify meaningful and relevant hypertext links between documents in different languages. This is particularly helpful in knowledge discovery if a multilingual knowledge base is sparse in one language or another, or if the topical coverage in each language differs; such is the case with Wikipedia. Techniques for identifying new and topically relevant cross-lingual links are a current topic of interest at NTCIR, where the CrossLink task has been running since NTCIR-9 in 2011. This paper presents the evaluation framework used to benchmark cross-lingual link discovery algorithms in the context of NTCIR-9. The framework includes topics, document collections, assessments, metrics, and a toolkit for pooling, assessment, and evaluation. The assessments are further divided into two separate sets: manual assessments performed by human assessors, and automatic assessments based on links extracted from Wikipedia itself. Using this framework we show that manual assessment is more robust than automatic assessment in the context of cross-lingual link discovery.
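
The distinction between the two assessment sets can be made concrete with a small, hedged sketch: the data layout and figures below are illustrative placeholders, not the NTCIR Crosslink toolkit or its actual results.

```python
# Illustrative comparison of manual vs. automatic assessment for the same run.
# Links are modelled as (source_doc, anchor, target_doc) triples; the automatic
# set is imagined as harvested from links already present in Wikipedia, the manual
# set from human assessors. This simplifies the real NTCIR Crosslink format.

def precision_against(run, assessment):
    """Fraction of a run's links accepted by a given assessment set."""
    run = set(run)
    return len(run & set(assessment)) / len(run) if run else 0.0

run = {("zh:长城", "明朝", "en:Ming dynasty"), ("zh:长城", "北京", "en:Beijing")}
manual = {("zh:长城", "明朝", "en:Ming dynasty"), ("zh:长城", "北京", "en:Beijing")}
automatic = {("zh:长城", "明朝", "en:Ming dynasty")}

# The same run can score differently under the two assessment sets, which is
# why their relative robustness is worth comparing.
print(precision_against(run, manual), precision_against(run, automatic))  # 1.0 0.5
```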

Relevance: 10.00%

Abstract:

At NTCIR-10 we participated in the cross-lingual link discovery (CrossLink-2) task. In this paper we describe our systems for discovering cross-lingual links between the Chinese, Japanese, and Korean (CJK) Wikipedias and the English Wikipedia. The evaluation results show that our implementation of the cross-lingual linking method achieved promising results.
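
The abstract does not spell out the linking method. A common baseline for this task, sketched here purely as an assumption rather than as the paper's system, is triangulation: resolve an anchor to an article in the source-language Wikipedia, then follow the existing cross-language title mapping to the English article.

```python
# Hedged sketch of a triangulation baseline for CJK-to-English link discovery:
# resolve an anchor to a source-language article, then follow the existing
# cross-language title mapping to English. This is a generic baseline, not
# necessarily the system described in the paper.

ja_anchor_to_article = {"富士山": "富士山"}   # anchor text -> Japanese article title
ja_to_en_langlink = {"富士山": "Mount Fuji"}  # Japanese title -> English title

def link_anchor_to_english(anchor):
    ja_title = ja_anchor_to_article.get(anchor)
    return ja_to_en_langlink.get(ja_title) if ja_title else None

print(link_anchor_to_english("富士山"))  # Mount Fuji
```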

Relevance: 10.00%

Abstract:

We propose a cluster ensemble method to map corpus documents into the semantic space embedded in Wikipedia and group them using multiple types of feature space. A heterogeneous cluster ensemble is constructed with multiple types of relations, i.e. document-term, document-concept and document-category. A final clustering solution is obtained by exploiting associations between document pairs and the hubness of the documents. Empirical analysis with various real data sets reveals that the proposed method outperforms state-of-the-art text clustering approaches.
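
A compact sketch of the co-association idea behind such an ensemble follows: cluster the documents separately in each feature space (document-term, document-concept, document-category), count how often each document pair lands in the same cluster, and derive the final clustering from that matrix. The hubness weighting described in the abstract is omitted, and the matrices are random stand-ins, so this is only a rough illustration of the general technique.

```python
# Sketch of a heterogeneous cluster ensemble over multiple document representations
# (e.g. document-term, document-concept, document-category matrices). The final
# clustering is derived from pairwise co-association; the hubness weighting used
# in the paper is omitted for brevity.

import numpy as np
from sklearn.cluster import KMeans
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

def ensemble_cluster(feature_spaces, n_clusters):
    n_docs = feature_spaces[0].shape[0]
    coassoc = np.zeros((n_docs, n_docs))
    for X in feature_spaces:                      # one base clustering per feature space
        labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
        coassoc += (labels[:, None] == labels[None, :]).astype(float)
    coassoc /= len(feature_spaces)                # fraction of clusterings agreeing on a pair
    distance = 1.0 - coassoc                      # co-association -> distance
    np.fill_diagonal(distance, 0.0)
    Z = linkage(squareform(distance, checks=False), method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")

# Toy example with random stand-ins for term, concept and category spaces.
rng = np.random.default_rng(0)
spaces = [rng.random((20, 50)), rng.random((20, 30)), rng.random((20, 10))]
print(ensemble_cluster(spaces, n_clusters=3))
```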

Relevance: 10.00%

Abstract:

Entity-oriented retrieval aims to return a list of relevant entities rather than documents, to provide exact answers to user queries. The nature of entity-oriented retrieval requires identifying the semantic intent of user queries, i.e., understanding the semantic role of query terms and determining the semantic categories which indicate the class of target entities. Existing methods are not able to exploit the semantic intent by capturing the semantic relationship between terms in a query and in a document that contains entity-related information. To improve the understanding of the semantic intent of user queries, we propose a concept-based retrieval method that not only automatically identifies the semantic intent of user queries, i.e., the Intent Type and Intent Modifier, but also introduces concepts represented by Wikipedia articles to user queries. We evaluate our proposed method on entity profile documents annotated with concepts from the Wikipedia category and list structure. Empirical analysis reveals that the proposed method outperforms several state-of-the-art approaches.
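
At its core, the approach matches queries and entity profile documents at the level of Wikipedia concepts rather than raw terms. The sketch below shows that concept-overlap idea only; the concept sets are invented examples, and the handling of Intent Type and Intent Modifier is omitted rather than reproduced from the paper.

```python
# Illustrative sketch of concept-level matching for entity-oriented retrieval:
# both the query and each entity profile document are represented by sets of
# Wikipedia concepts (article titles), and entities are ranked by concept overlap.
# The concept sets below are invented examples.

def rank_entities(query_concepts, entity_profiles):
    """entity_profiles: dict mapping entity name -> set of Wikipedia concepts."""
    q = set(query_concepts)
    scores = {entity: len(q & concepts) for entity, concepts in entity_profiles.items()}
    return sorted(scores, key=scores.get, reverse=True)

profiles = {
    "Airbus A380": {"Wide-body aircraft", "Airbus", "Jet airliner"},
    "Boeing 747": {"Wide-body aircraft", "Boeing", "Jet airliner"},
}
print(rank_entities({"Wide-body aircraft", "Airbus"}, profiles))  # ['Airbus A380', 'Boeing 747']
```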

Relevance: 10.00%

Abstract:

The first live appearance of The Apartments after many years was at Brisbane's Pig City, a live music event curated to coincide with the release of Andrew Stafford's book of the same name.

Relevance: 10.00%

Abstract:

This is the fourth edition of New Media: An Introduction, with the previous editions published by Oxford University Press in 2002, 2005 and 2008. As the first edition of the book published in the 2010s, every chapter has been comprehensively revised, and there are new chapters on:
• Online News and the Future of Journalism (Chapter 7)
• New Media and the Transformation of Higher Education (Chapter 10)
• Online Activism and Networked Politics (Chapter 12).
It has retained popular features of the third edition, including the twenty key concepts in new media (Chapter 2) and illustrative case studies to assist with teaching new media. The case studies in the book cover: the global internet; Wikipedia; transmedia storytelling; Media Studies 2.0; the games industry and exploitation; video games and violence; WikiLeaks; the innovator’s dilemma; massive open online courses (MOOCs); Creative Commons; the Barack Obama Presidential campaigns; and the Arab Spring.

Several major changes in the media environment since the publication of the third edition stand out. Of particular importance has been the rise of social media platforms such as Facebook, Twitter and YouTube, which draw out even more strongly the features of the internet as networked and participatory media, with a range of implications across the economy, society and culture. In addition, the political implications of new media have become more apparent with a range of social media-based political campaigns, from Barack Obama’s successful Presidential election campaigns to the Occupy movements and the Arab Spring. At the same time, the subsequent development of politics in these and other cases has drawn attention to the limitations of thinking about politics or the public sphere in technologically determinist ways.

When the first edition of New Media was published in 2002, the concept of new media was seen as being largely about the internet as accessed from personal computers. The subsequent decade has seen a proliferation of platforms and devices: we now access media in all forms from our phones and other mobile platforms, we see television and the internet increasingly converging, and we see a growing uncoupling of digital media content from delivery platforms. While this has a range of implications for media law and policy, from convergent media policy to copyright reform, governments and policy-makers are struggling to adapt to such seismic shifts from mass communications media to convergent social media.

The internet is no longer primarily a Western-based medium. Two-thirds of the world’s internet users are now outside Europe and North America; three-quarters of internet users use languages other than English; and three-quarters of the world’s mobile cellular phone subscriptions are in developing nations. It is also apparent that discussions about how to develop new media technologies and discussions about their cultural and creative content can no longer be separated. Discussions of broadband strategies and the knowledge economy need to be increasingly joined with those concerning the creative industries and the creative economy.

Relevance: 10.00%

Abstract:

This paper gives an overview of the INEX 2008 Ad Hoc Track. The main goals of the Ad Hoc Track were two-fold. The first goal was to investigate the value of the internal document structure (as provided by the XML mark-up) for retrieving relevant information. This is a continuation of INEX 2007 and, for this reason, the retrieval results were liberalized to arbitrary passages and measures were chosen to fairly compare systems retrieving elements, ranges of elements, and arbitrary passages. The second goal was to compare focused retrieval to article retrieval more directly than in earlier years. For this reason, standard document retrieval rankings have been derived from all runs and evaluated with standard measures. In addition, a set of queries targeting Wikipedia has been derived from a proxy log, and the runs are also evaluated against the clicked Wikipedia pages. The INEX 2008 Ad Hoc Track featured three tasks. For the Focused Task, a ranked list of non-overlapping results (elements or passages) was needed. For the Relevant in Context Task, non-overlapping results (elements or passages) were returned, grouped by the article from which they came. For the Best in Context Task, a single starting point (element start tag or passage start) for each article was needed. We discuss the results for the three tasks and examine the relative effectiveness of element and passage retrieval. This is examined in the context of content-only (CO, or keyword) search as well as content-and-structure (CAS, or structured) search. Finally, we look at the ability of focused retrieval techniques to rank articles, using standard document retrieval techniques, both against the judged topics and against queries and clicks from a proxy log.
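
Comparing element, range and passage retrieval relies on measures that credit a system for how much relevant text its results cover. The official INEX measures (e.g. interpolated precision at fixed recall points) are more involved; the sketch below only illustrates the underlying character-overlap idea, with invented offsets.

```python
# Simplified illustration of passage-level evaluation: retrieved and relevant
# passages are character ranges within an article, and precision/recall are
# computed over the characters they cover. The official INEX measures are more
# involved; this only shows the core idea.

def to_char_set(passages):
    """passages: iterable of (start, end) character offsets, end exclusive."""
    chars = set()
    for start, end in passages:
        chars.update(range(start, end))
    return chars

def passage_precision_recall(retrieved, relevant):
    ret, rel = to_char_set(retrieved), to_char_set(relevant)
    overlap = len(ret & rel)
    precision = overlap / len(ret) if ret else 0.0
    recall = overlap / len(rel) if rel else 0.0
    return precision, recall

# A retrieved passage [100, 200) against a relevant passage [150, 250).
print(passage_precision_recall([(100, 200)], [(150, 250)]))  # (0.5, 0.5)
```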

Relevance: 10.00%

Abstract:

INEX investigates focused retrieval from structured documents by providing large test collections of structured documents, uniform evaluation measures, and a forum for organizations to compare their results. This paper reports on the INEX 2014 evaluation campaign, which consisted of three tracks. The Interactive Social Book Search Track investigated user information seeking behavior when interacting with various sources of information for realistic task scenarios, and how the user interface impacts search and the search experience. The Social Book Search Track investigated the relative value of authoritative metadata and user-generated content for search and recommendation, using a test collection with data from Amazon and LibraryThing, including user profiles and personal catalogues. The Tweet Contextualization Track investigated tweet contextualization: helping a user to understand a tweet by providing a short background summary generated by aggregating relevant Wikipedia passages. INEX 2014 was an exciting year for INEX, in which we ran our workshop as part of the CLEF labs for the third time. This paper gives an overview of all the INEX 2014 tracks, their aims and tasks, the test collections built, and the participants, and provides an initial analysis of the results.