989 results for Language representation
Abstract:
This thesis examines foreign language learning, specifically of English, in comprehensive school. The hypothesis is that pupils who already command two languages succeed better in foreign language learning than monolingual pupils. The thesis compares the English skills of bilingual and monolingual pupils at the end of the sixth grade of primary school. Bilingualism can be understood in many ways, and research findings on its effects have often been contradictory. The thesis therefore first defines bilingualism, its types, and the related terminology. It also describes the situation and rights of the bilingual population in Finland, and in the city of Turku in particular, and discusses, on the basis of earlier research, the possible problems and benefits associated with bilingualism. Bilingualism has traditionally attracted many prejudices, such as the fear of semilingualism, which scientific research seeks to refute. Other possible disadvantages do occur, however, such as a smaller vocabulary in both languages compared with monolingual peers of the same age, and longer reaction times. The benefits of bilingualism, in turn, can include creativity, the ability to analyse language, the development of metalinguistic skills, and openness towards other languages and cultures. All of the advantages and disadvantages mentioned also affect success in foreign language studies. Possible positive transfer is likewise taken into account. For the empirical part of the study, two primary schools in Turku were visited, where sixth-grade pupils completed two English-language tasks. One of the schools was Finnish-speaking; its pupils represented the monolingual control group (n=31). A Swedish-speaking school was chosen to represent the bilingual group (n=34), since in Finland, and in cities such as Turku in particular, speakers of the minority language in practice usually also command Finnish. The bilingualism of the Swedish-speaking school's pupils was verified with a language background questionnaire. The bilingual pupils' results in both tasks were slightly better than those of the monolinguals. The standard deviation of the results was also larger in the monolingual group. The bilinguals appeared to be better at analysing language and made fewer grammatical errors, and the effect of positive transfer was visible. On the other hand, their answers contained more spelling errors. No significant differences in English language learning could be established, however.
Abstract:
Despite the variety of available Web services registries specifically aimed at the Life Sciences, their scope is usually restricted to a limited set of well-defined service types. While dedicated registries are generally tied to a particular format, general-purpose ones adhere more closely to standards and usually rely on the Web Service Definition Language (WSDL). Although WSDL is flexible enough to support common types of Web services, its lack of semantic expressiveness has led to various initiatives to describe Web services via ontology languages. Nevertheless, WSDL 2.0 descriptions have gained a standard representation based on the Web Ontology Language (OWL). BioSWR is a novel Web services registry that provides standard Resource Description Framework (RDF) based Web service descriptions along with the traditional WSDL-based ones. The registry provides a Web-based interface for Web service registration, querying, and annotation, and is also accessible programmatically via a Representational State Transfer (REST) API or via the SPARQL Protocol and RDF Query Language. The BioSWR server is located at http://inb.bsc.es/BioSWR/ and its code is available at https://sourceforge.net/projects/bioswr/ under the LGPL license.
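Since the registry exposes its RDF descriptions through SPARQL, service metadata can be retrieved with a few lines of code. The sketch below queries a hypothetical SPARQL endpoint path on the BioSWR server with Python's requests library; the endpoint URL and the use of the W3C WSDL 2.0 RDF vocabulary are assumptions, as the abstract does not specify either.

```python
# Minimal sketch: listing services registered in an RDF-based registry
# via SPARQL over HTTP. The endpoint path below is an assumption; the
# abstract only states that BioSWR is queryable via SPARQL, not where
# the endpoint lives. The wsdl: prefix is the W3C WSDL 2.0 RDF mapping.
import requests

SPARQL_ENDPOINT = "http://inb.bsc.es/BioSWR/sparql"  # hypothetical path

QUERY = """
PREFIX wsdl: <http://www.w3.org/ns/wsdl-rdf#>
SELECT ?service WHERE { ?service a wsdl:Service } LIMIT 10
"""

resp = requests.get(
    SPARQL_ENDPOINT,
    params={"query": QUERY},
    headers={"Accept": "application/sparql-results+json"},
    timeout=30,
)
resp.raise_for_status()
for binding in resp.json()["results"]["bindings"]:
    print(binding["service"]["value"])
```

The same metadata should also be reachable through the REST API mentioned in the abstract; SPARQL is shown here because its JSON result format is standardized across registries.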
Abstract:
In order to spare functional areas during the removal of brain tumours, electrical stimulation mapping was used in 90 patients (77 in the left hemisphere and 13 in the right; 2754 cortical sites tested). Language functions were studied with a special focus on the comprehension of auditory and visual words and on the semantic system. In addition to naming, patients were asked to perform pointing tasks from auditory and visual stimuli (using sets of four different images controlled for familiarity), as well as auditory object (sound recognition) and Token Test tasks. Ninety-two auditory comprehension interference sites were observed. We found that the process of auditory comprehension involved a few fine-grained, sub-centimetre cortical territories. Early stages of speech comprehension seem to relate to two posterior regions in the left superior temporal gyrus. Downstream lexical-semantic speech processing and sound analysis involved two pathways: one along the anterior part of the left superior temporal gyrus, and one posteriorly around the supramarginal and middle temporal gyri. Electrostimulation experimentally dissociated the perceptual consciousness attached to speech comprehension. The initial word discrimination process can be considered an "automatic" stage, since attention feedback was not impaired by stimulation as it was at the lexical-semantic stage. A multimodal organization of the superior temporal gyrus was also detected, since some neurones could be involved in the comprehension of visual material and in naming. These findings demonstrate a finely graded, sub-centimetre cortical representation of speech comprehension processing, mainly in the left superior temporal gyrus, and are in line with dual-stream models of language comprehension.
Abstract:
We present the CLARIN project, a project whose objective is to promote the use of technological tools in research in the Humanities and Social Sciences.
Abstract:
Current-day web search engines (e.g., Google) do not crawl and index a significant portion of the Web and, hence, web users relying only on search engines are unable to discover and access a large amount of information in the non-indexable part of the Web. Specifically, dynamic pages generated from parameters provided by a user via web search forms (or search interfaces) are not indexed by search engines and cannot be found in search results. Such search interfaces provide web users with online access to myriad databases on the Web. To obtain information from a web database of interest, a user issues a query by specifying query terms in a search form and receives the query results: a set of dynamic pages that embed the required information from the database. At the same time, issuing a query via an arbitrary search interface is an extremely complex task for any kind of automatic agent, including web crawlers, which, at least up to the present day, do not even attempt to pass through web forms on a large scale. In this thesis, our primary object of study is the huge portion of the Web (hereafter referred to as the deep Web) hidden behind web search interfaces. We concentrate on three classes of problems around the deep Web: characterizing the deep Web, finding and classifying deep web resources, and querying web databases. Characterizing the deep Web: Though the term deep Web was coined in 2000, a long time ago by the standards of any web-related concept or technology, we still do not know many important characteristics of the deep Web. Another concern is that the surveys of the deep Web conducted so far are predominantly based on the study of deep web sites in English. One can therefore expect the findings of these surveys to be biased, especially given the steady increase in non-English web content. Surveying national segments of the deep Web is thus of interest not only to national communities but to the whole web community as well. In this thesis, we propose two new methods for estimating the main parameters of the deep Web. We use the suggested methods to estimate the scale of one specific national segment of the Web and report our findings. We also build, and make publicly available, a dataset describing more than 200 web databases from this national segment of the Web. Finding deep web resources: The deep Web has been growing at a very fast pace, and it has been estimated that there are hundreds of thousands of deep web sites. Owing to the huge volume of information in the deep Web, there has been significant interest in approaches that allow users and computer applications to leverage this information. Most approaches have assumed that search interfaces to the web databases of interest are already discovered and known to query systems. Such assumptions do not hold, however, mostly because of the large scale of the deep Web: for any given domain of interest there are simply too many web databases with relevant content. Thus, the ability to locate search interfaces to web databases becomes a key requirement for any application accessing the deep Web. In this thesis, we describe the architecture of the I-Crawler, a system for finding and classifying search interfaces. The I-Crawler is intentionally designed to be used in deep Web characterization studies and for constructing directories of deep web resources. Unlike almost all other approaches to the deep Web so far, the I-Crawler is able to recognize and analyze JavaScript-rich and non-HTML searchable forms. Querying web databases: Retrieving information by filling out web search forms is a typical task for a web user, all the more so as the interfaces of conventional search engines are themselves web forms. At present, a user needs to manually provide input values to search interfaces and then extract the required data from the result pages. Manually filling out forms is cumbersome and infeasible for complex queries, yet such queries are essential for many web searches, especially in the area of e-commerce. The automation of querying and retrieving data behind search interfaces is therefore desirable and essential for tasks such as building domain-independent deep web crawlers and automated web agents, searching for domain-specific information (vertical search engines), and extracting and integrating information from various deep web resources. We present a data model for representing search interfaces and discuss techniques for extracting field labels, client-side scripts, and structured data from HTML pages. We also describe a representation of result pages and discuss how to extract and store the results of form queries. Finally, we present a user-friendly and expressive form query language that allows one to retrieve information behind search interfaces and extract useful data from the result pages based on specified conditions, and we implement a prototype system for querying web databases, describing its architecture and component design.
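To make the form-querying task concrete, the following minimal sketch (not part of the thesis) submits a query through a GET-based search form and extracts result records from the returned dynamic page with Python's requests and BeautifulSoup; the form action URL, the field name q, and the result-row selector are hypothetical placeholders.

```python
# Minimal sketch of automated form querying: submit a query through a
# GET-based search form and extract result records from the returned
# dynamic page. The URL, field name and CSS selector are hypothetical
# placeholders, not part of the I-Crawler or the thesis prototype.
import requests
from bs4 import BeautifulSoup

FORM_ACTION = "http://example.org/search"  # hypothetical form action
params = {"q": "digital cameras"}          # value for a text input named "q"

page = requests.get(FORM_ACTION, params=params, timeout=10)
page.raise_for_status()

soup = BeautifulSoup(page.text, "html.parser")
# Result pages embed database records in repeated HTML structures;
# here we assume one record per <div class="result"> element.
for row in soup.select("div.result"):
    print(row.get_text(" ", strip=True))
```

Real deep web sources complicate every step shown here: forms are often POST-based, fields lack usable labels, and results arrive paginated or rendered by JavaScript, which is precisely why the thesis treats interface representation and result extraction as separate problems.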
Abstract:
Objective: STOPP/START is a screening tool to detect potentially inappropriate prescribing in persons aged 65 or older. Its Irish authors recently updated and improved the initial 2008 version. We present the adaptation and validation of this updated tool into French. Methods: STOPP/START.v2 was adapted into French by two experts, confirmed by a translation/back-translation method, and finalised according to the comments of nine French-speaking assessors (geriatricians, pharmacologists, and a general practitioner) from four countries (France, Belgium, Switzerland, and Canada). The validation was completed by an inter-rater reliability (IRR) analysis of the STOPP/START.v2 criteria applied to ten standardized clinical vignettes. Results: Compared with the original English version, the 115 STOPP/START.v2 criteria in French are classified identically, but the presentation has been adjusted (each START.v2 criterion first specifies the clinical condition, followed by an explanation of the inappropriateness of the omission), and the wording of some criteria has been adapted. The French adaptation was validated by means of (i) the translation/back-translation, which showed that the French version preserved the clinical meaning of the original criteria; (ii) the similar screening results obtained when the nine assessors applied the criteria to the ten vignettes; and (iii) the high inter-rater reliability of these nine evaluations, for both STOPP.v2 (IRR 0.849) and START.v2 (IRR 0.921). Conclusion: The French adaptation of the STOPP/START.v2 criteria provides clinicians with a logical, reliable, and easy-to-use screening tool to detect potentially inappropriate prescribing in patients aged 65 and older.
Abstract:
Language diversity has become greatly endangered in the past centuries owing to processes of language shift from indigenous languages to other languages that are seen as socially and economically more advantageous, resulting in the death or doom of minority languages. In this paper, we define a new language competition model that can describe the historical decline of minority languages in competition with more advantageous languages. We then implement this non-spatial model as an interaction term in a reaction-diffusion system to model the evolution of the two competing languages. We use the results to estimate the speed at which the more advantageous language spreads geographically, resulting in the shrinkage of the area of dominance of the minority language. We compare the results from our model with the observed retreat in the area of influence of the Welsh language in the UK, obtaining good agreement between the model and the observed data.
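As a rough illustration of the modelling approach, the sketch below integrates a two-language reaction-diffusion system in one dimension, using an Abrams-Strogatz-style switching term as the interaction; the functional form and all parameter values are illustrative assumptions, not the calibrated model or the Welsh data from the paper.

```python
# Rough numerical sketch: two competing languages on a 1-D domain,
# modelled as a reaction-diffusion equation with an Abrams-Strogatz-
# style switching term. Functional form and parameters are illustrative
# assumptions, not the paper's calibrated model or the Welsh data.
import numpy as np

L, nx, dt, steps = 100.0, 200, 0.01, 20000
dx = L / nx
D = 0.5                  # diffusion coefficient (speaker mobility)
k, s, a = 1.0, 0.6, 1.3  # switching rate, status of language A, volatility

u = np.full(nx, 0.1)     # fraction speaking the advantageous language A
u[: nx // 4] = 0.9       # A initially dominates the left quarter

for _ in range(steps):
    lap = (np.roll(u, 1) - 2 * u + np.roll(u, -1)) / dx**2
    # Speakers of the minority language B (share 1 - u) switch to A at a
    # rate growing with A's share and status, and vice versa for A -> B.
    f = k * ((1 - u) * s * u**a - u * (1 - s) * (1 - u) ** a)
    u = np.clip(u + dt * (D * lap + f), 0.0, 1.0)

# With s > 0.5 the front where u crosses 0.5 advances at a roughly
# constant speed, shrinking the minority language's area of dominance.
print("front position:", dx * int(np.argmin(np.abs(u - 0.5))))
```

The quantity of interest in the paper, the geographic speed at which the advantageous language spreads, corresponds here to tracking that crossing point over time.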
Abstract:
Biomedical research is currently facing a new type of challenge: an excess of information, both in terms of raw data from experiments and in the number of scientific publications describing their results. Mirroring the focus on data mining techniques to address the issues of structured data, there has recently been great interest in the development and application of text mining techniques to make more effective use of the knowledge contained in biomedical scientific publications, accessible only in the form of natural human language. This thesis describes research done within the broader scope of projects aiming to develop methods, tools and techniques for text mining tasks in general and for the biomedical domain in particular. More specifically, the work described here addresses the goal of extracting information from statements concerning relations between biomedical entities, such as protein-protein interactions. The approach taken uses full parsing (syntactic analysis of the entire structure of sentences) and machine learning, aiming to develop reliable methods that can be generalized to other domains as well. The five papers at the core of this thesis describe research on a number of distinct but related topics in text mining. In the first of these studies, we assessed the applicability of two popular general-English parsers to biomedical text mining and, finding their performance limited, identified several specific challenges to accurate parsing of domain text. In a follow-up study focusing on parsing issues related to specialized domain terminology, we evaluated three lexical adaptation methods. We found that the accurate resolution of unknown words can considerably improve parsing performance, and introduced a domain-adapted parser that reduced the error rate of the original by 10% while also roughly halving parsing time. To establish the relative merits of parsers that differ in their formalisms and in the representation given to their syntactic analyses, we also developed evaluation methodology, considering different approaches to producing comparable dependency-based evaluation results. We introduced a methodology for creating highly accurate conversions between different parse representations, demonstrating the feasibility of unifying diverse syntactic schemes under a shared, application-oriented representation. In addition to allowing formalism-neutral evaluation, we argue that such unification can also increase the value of parsers for domain text mining. As a further step in this direction, we analysed the characteristics of publicly available biomedical corpora annotated for protein-protein interactions and created tools for converting them into a shared form, thus contributing also to the unification of text mining resources. The unified corpora allowed us to perform a task-oriented comparative evaluation of biomedical text mining corpora. This evaluation established clear limits on the comparability of results for text mining methods evaluated on different resources, prompting further efforts toward standardization. To support this and other research, we also designed and annotated BioInfer, the first domain corpus of its size to combine annotation of syntax and biomedical entities with a detailed annotation of their relationships. The corpus represents a major design and development effort of the research group, with manual annotation identifying over 6000 entities, 2500 relationships and 28,000 syntactic dependencies in 1100 sentences. In addition to combining these key annotations for a single set of sentences, BioInfer was also the first domain resource to introduce a representation of entity relations that is supported by ontologies and able to capture complex, structured relationships. Part I of this thesis presents a summary of this research in the broader context of a text mining system, and Part II contains reprints of the five included publications.
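To illustrate the kind of dependency-based, formalism-neutral comparison described above, the following minimal sketch scores the agreement between two parsers' analyses of a single sentence once both have been converted to a shared head-dependent-label representation; the sentence and both parses are invented for illustration.

```python
# Minimal sketch: dependency-based comparison of two parsers' analyses
# of one sentence, after conversion to a shared representation. Each
# analysis is a set of (head, dependent, label) triples with 0 as the
# artificial root; the sentence and both parses are invented.
GOLD = {(2, 1, "nsubj"), (0, 2, "root"), (4, 3, "det"), (2, 4, "obj")}
PRED = {(2, 1, "nsubj"), (0, 2, "root"), (2, 3, "det"), (2, 4, "obj")}

def attachment_scores(gold, pred):
    """Labelled (LAS) and unlabelled (UAS) attachment agreement."""
    las = len(gold & pred) / len(gold)
    strip = lambda deps: {(h, d) for h, d, _ in deps}
    uas = len(strip(gold) & strip(pred)) / len(gold)
    return las, uas

las, uas = attachment_scores(GOLD, PRED)
print(f"LAS={las:.2f} UAS={uas:.2f}")  # here both are 0.75
```

Conversions between parse representations matter precisely because scores like these are only meaningful when both analyses use the same dependency scheme.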
Abstract:
In this paper we study student interaction in English and Swedish courses at a Finnish university. We focus on language choices made in task-related activities in small group interaction. Our research interests arose from the change in the teaching curriculum, in which content and language courses were integrated at Tampere University of Technology in 2013. Using conversation analysis, we analysed groups of 4-5 students who worked collaboratively on a task via a video conference programme. The results show how language alternation has different functions in 1) situations where students orient to managing the task, e.g., in transitions into task, or where they orient to technical problems, and 2) situations where students accomplish the task. With the results, we aim to show how language alternation can provide interactional opportunities for language learning. The findings will be useful in designing tasks in the future.