Digital Library

15 results for multilinguality

LMM: an OWL-DL metamodel to represent heterogeneous lexical knowledge

Relevance: 10.00%

Abstract:

In this paper we present a Linguistic Meta-Model (LMM) allowing a semiotic-cognitive representation of knowledge. LMM is freely available and integrates the schemata of linguistic knowledge resources, such as WordNet and FrameNet, as well as foundational ontologies, such as DOLCE and its extensions. In addition, LMM is able to deal with multilinguality and to represent individuals and facts in an open domain perspective.

VIKI : a semiotic-based system for multilingual knowledge retrieval

Relevance: 10.00%

Abstract:

Since its creation, the Internet has permeated our daily life. The web is omnipresent for communication, research and organization, and this use has driven the rapid development of the Internet. Nowadays, the Internet is the biggest container of resources. Information databases such as Wikipedia, Dmoz and the open data available on the net offer great informational potential for mankind. Easy and free web access is one of the major features characterizing Internet culture. Ten years ago, the web was completely dominated by English. Today, the web community is no longer only English speaking but is becoming a genuinely multilingual community. The availability of content is intertwined with the availability of logical organizations (ontologies), for which multilinguality plays a fundamental role. In this work we introduce a very high-level logical organization fully based on semiotic assumptions. We present the theoretical foundations as well as the ontology itself, named Linguistic Meta-Model. The most important feature of Linguistic Meta-Model is its ability to support the representation of different knowledge sources developed according to different underlying semiotic theories. This is possible because most knowledge representation schemata, either formal or informal, can be put into the context of the so-called semiotic triangle. In order to show the main characteristics of Linguistic Meta-Model from a practical point of view, we developed VIKI (Virtual Intelligence for Knowledge Induction). VIKI is a work-in-progress system that aims to exploit the Linguistic Meta-Model structure for knowledge expansion. It is a modular system in which each module accomplishes a natural language processing task, from terminology extraction to knowledge retrieval. VIKI is a supporting system for Linguistic Meta-Model, and its main task is to give some empirical evidence regarding the use of Linguistic Meta-Model without claiming to be thorough.

Recent Development in ParaSol: Breadth for Depth and XSLT based web concordancing with CWB

Relevance: 10.00%

Tags and self-organisation: a metadata ecology for learning resources in a multilingual context

Relevance: 10.00%

The sociolinguistic situation of ǂHoan, a moribund 'Khoisan' language of Botswana

Relevance: 10.00%

Abstract:

In 2010, we conducted a sociolinguistic survey on the moribund 'Khoisan' language ǂHoan (Ju-ǂHoan), spoken in Botswana at the fringe of the Kalahari Desert. The survey aimed at investigating language use, degrees of multilingualism and language attitude among the ǂHoan speakers. Data were collected by means of a questionnaire. We found that the positive language attitude of individuals towards ǂHoan often conflicts with the community's attitude towards this language, resulting in a split of actual language use between the family and more formal situations. All ǂHoan speakers are at least bilingual, speaking the local lingua franca Kgalagadi (Bantu) besides ǂHoan. Most of them are in fact trilingual, speaking Gǀui (Khoe-Kwadi) in addition to ǂHoan and Kgalagadi. Most of our results are in line with an earlier sociolinguistic survey on ǂHoan by Batibo (2005a), which was carried out in 2003. In comparing Batibo's results to ours, we point out changes in the sociolinguistic situation of ǂHoan as well as differences between the villages.

Enabling folksonomies for knowledge extraction: A semantic grounding approach

Relevance: 10.00%

Abstract:

Folksonomies emerge as the result of the free tagging activity of a large number of users over a variety of resources. They can be considered valuable sources of emerging vocabularies that can be leveraged in knowledge extraction tasks. However, understanding the meaning of tags in folksonomies raises several problems, mainly related to synonymous and ambiguous tags, especially in the context of multilinguality. The authors aim to turn folksonomies into knowledge structures where tag meanings are identified and relations between them are asserted. For this purpose, they use DBpedia as a general knowledge base and leverage its multilingual capabilities.

Some reflections on the IT challenges for a multilingual semantic web

Relevance: 10.00%

Abstract:

Many attempts have been made to provide multilinguality to the Semantic Web by means of annotation properties in Natural Language (NL), such as RDFS or SKOS labels, and other lexicon-ontology models, such as lemon, but many issues remain to be solved if we want a truly accessible Multilingual Semantic Web (MSW). Among the most relevant problems are the reusability of monolingual resources (ontologies, lexicons, etc.), the accessibility of multilingual resources hindered by many formats, the reliability of ontological sources, disambiguation problems, and the multilingual presentation of all this information to the end user in NL. Unless this NL presentation is achieved, the MSW will be restricted to IT experts, and even then with great dissatisfaction and disenchantment.
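
As a rough illustration of the label-based approach mentioned in this abstract, the sketch below models language-tagged labels for a single concept and picks one for presentation to an end user. It is only an assumption-laden toy: the dictionary layout, the function name `pick_label` and the fallback policy are hypothetical and are not taken from the paper, RDFS, SKOS or lemon.

```python
def candidate_tags(tag):
    """Yield the tag itself, then its primary subtag (e.g. 'pt-BR' -> 'pt')."""
    tag = tag.lower()
    yield tag
    if "-" in tag:
        yield tag.split("-")[0]


def pick_label(labels, preferred, default="en"):
    """Return the label for the first preferred language that is available.

    labels    -- dict mapping lowercase language tags to label strings
                 (a hypothetical stand-in for language-tagged labels)
    preferred -- list of language tags in the user's order of preference
    default   -- fallback language tried last (an assumed policy)
    """
    for tag in list(preferred) + [default]:
        for candidate in candidate_tags(tag):
            if candidate in labels:
                return labels[candidate]
    return None  # no usable label: the concept cannot be shown in NL


# Toy concept with labels in three languages.
river_labels = {"en": "river", "es": "río", "de": "Fluss"}

print(pick_label(river_labels, ["pt-BR", "es"]))
print(pick_label(river_labels, ["fr"]))
```

The `None` case is the situation the abstract warns about: when no label exists in a language the user reads, the information stays out of reach for that user.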

DAEDALUS at RepLab 2014: Detecting RepTrak reputation dimensions on tweets

Relevance: 10.00%

Abstract:

This paper describes our participation in the RepLab 2014 reputation dimensions scenario. Our idea was to evaluate the best strategy for combining a machine learning classifier with a rule-based algorithm based on logical expressions of terms. Results show that our baseline experiment, using just Naive Bayes Multinomial with a term vector model representation of the tweet text, is ranked second among runs from all participants in terms of accuracy.
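
For readers unfamiliar with the baseline named in this abstract, the following is a minimal sketch of a multinomial Naive Bayes classifier over a term-count (bag-of-words) representation. It is not the authors' system: the whitespace tokenizer, the add-one smoothing, the class labels and the four training sentences are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (a real system would treat hashtags, URLs, etc.)."""
    return text.lower().split()


class MultinomialNB:
    """Multinomial Naive Bayes over raw term counts with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant (1.0 = Laplace smoothing)

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.term_counts = defaultdict(Counter)  # term frequencies per class
        for text, label in zip(texts, labels):
            self.term_counts[label].update(tokenize(text))
        self.vocab = {t for counts in self.term_counts.values() for t in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.term_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for term in tokenize(text):
                if term in self.vocab:  # terms unseen in training are ignored
                    score += math.log((counts[term] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data with made-up labels.
train_texts = [
    "great new phone battery",
    "the phone screen is excellent",
    "employees praise the office culture",
    "staff report good working conditions",
]
train_labels = ["products", "products", "workplace", "workplace"]

model = MultinomialNB().fit(train_texts, train_labels)
print(model.predict("new phone screen"))
print(model.predict("office staff culture"))
```

Scores are summed in log space so that long texts do not underflow, and the smoothing term keeps a single unseen term-class pair from zeroing out a class.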

DAEDALUS at PAN 2014: Guessing tweet author's gender and age

Relevance: 10.00%

Abstract:

This paper describes our participation in the PAN 2014 author profiling task. Our idea was to define, develop and evaluate a simple machine learning classifier able to guess the gender and the age of a given user based on his/her texts, which could become part of the solution portfolio of the company. We were interested not in finding the best possible classifier with the highest accuracy, but in finding the optimum balance between performance and throughput using the simplest strategy, with as little dependence on external systems as possible. Results show that our software, using Naive Bayes Multinomial with a term vector model representation of the text, is ranked quite well among the participants in terms of accuracy.

Automatic extraction of syntactic semantic patterns for multilingual resources

Relevance: 10.00%

Abstract:

In this paper we present an automatic system for the extraction of syntactic semantic patterns applied to the development of multilingual processing tools. In order to achieve optimum methods for the automatic treatment of more than one language, we propose the use of syntactic semantic patterns. These patterns are formed by a verbal head and the main arguments, and they are aligned among languages. In this paper we present an automatic system for the extraction and alignment of syntactic semantic patterns from two manually annotated corpora, and evaluate the main linguistic problems that we must deal with in the alignment process.

LEGOLANG: técnicas de deconstrucción aplicadas a las tecnologías del lenguaje humano [LEGOLANG: deconstruction techniques applied to human language technologies]

Relevance: 10.00%

Abstract:

The aim of this project stems from the need to rethink the classical philosophy of human language technologies (HLT; TLH in the Spanish original) so that it suits both the sources currently available (unstructured data with multimodality, multilinguality and varying degrees of formality) and the real needs of end users. Achieving this aim requires integrating both the understanding and the generation of human language into a single model (the LEGOLANG model) based on language deconstruction techniques, independent of its final application and of the variety of human language chosen to express the knowledge.

Automatic Creation of Lexical Resources for an Interlingua-based System

Relevance: 10.00%

Abstract:

The Universal Networking Language (UNL) is an interlingua designed to be the base of several natural language processing systems that aim to support multilinguality on the Internet. One of the main components of the language is the dictionary of Universal Words (UWs), which links the vocabularies of the different languages involved in the project. As in any NLP system, the coverage and accuracy of its lexical resources are crucial for the development of the system. In this paper, the authors describe how a large-coverage UW dictionary was automatically created from an existing, well-known resource, the English WordNet. Other aspects, such as implementation details and the evaluation of the final UW set, are also described.

Bridging the Gap Between Human Language and Computer-Oriented Representations

Relevance: 10.00%

Abstract:

Information can be expressed in many ways according to the different capacities of humans to perceive it. Current systems deal with multimedia, multiformat and multiplatform systems, but another "multi" is still pending to guarantee global access to information, namely multilinguality. Different languages imply different replications of the systems according to the language in question. No solution appears to bridge the human representation (natural language) and a system-oriented representation. In 1997 the United Nations University defined a language to support effective multilingualism on the Internet. In this paper, we describe this language and its possible applications beyond multilingual services, as a possible future standard for different language-independent applications.

Multilingualism and conceptual modelling

Relevance: 10.00%

Abstract:

One of the leading motivations behind the multilingual semantic web is to make resources accessible digitally in an online global multilingual context. Consequently, it is fundamental for knowledge bases to find a way to manage multilingualism and thus be equipped with those procedures for its conceptual modelling. In this context, the goal of this paper is to discuss how common-sense knowledge and cultural knowledge are modelled in a multilingual framework. More particularly, multilingualism and conceptual modelling are dealt with from the perspective of FunGramKB, a lexico-conceptual knowledge base for natural language understanding. This project argues for a clear division between the lexical and the conceptual dimensions of knowledge. Moreover, the conceptual layer is organized into three modules, which result from a strong commitment towards capturing semantic knowledge (Ontology), procedural knowledge (Cognicon) and episodic knowledge (Onomasticon). Cultural mismatches are discussed and formally represented at the three conceptual levels of FunGramKB.

The effect of multilingual facilitation on active participation in MOOCs

Relevance: 10.00%

Abstract:

A new approach for overcoming the language and culture barriers to participation in MOOCs is reported. It is hypothesised that the juxtaposition of English as the language of instruction, used for interacting with course materials, and one’s preferred language as the language of participation, used for interaction with peers and facilitators, is preferable to ‘English only’ for participation in a MOOC. The HANDSON MOOC included seven teams of facilitators, each catering for a different language community. Facilitators were responsible for promoting active participation and peer tutoring. Comparing language groups revealed a series of predictors of intention to learn, some of which were already apparent in the first days of the MOOC. The comparison also uncovered four critical factors that influence participation: facilitation, language of participation, group size, and a pre-existing sense of community. Especially crucial was reaching a sufficient number of active participants during the first week. We conclude that multilingual facilitation activates participation in MOOCs in various ways, and that synergy between the four aforementioned factors is critical for the formation of the learning network that supports the social dynamics of active participation. Our approach suggests future targets for the development of the multilingual and community potential of MOOCs.