875 results for Corpus annotation


Relevance:

20.00%

Abstract:

In assessing Plutarch's contacts with other contemporary cultures, scholars have not yet reached a consensus on the relationship between the Chaeronean and early Christian literature. A good example of this arises in the motif of the creation of the human soul. The following pages, after an analysis of the Plutarchan texts, examine these possible contacts with NHC, the heresiologists and the Corpus Hermeticum in order to clarify their similarities and differences.

Relevance:

20.00%

Abstract:

The study of lexical combinations according to their degree of fixedness, and their classification into free combinations, collocations and idioms, has so far been carried out from a synchronic perspective. We consider the possibility of applying the guidelines for distinguishing these types of structures to diachronic materials. Specifically, we draw on the documents that make up the Corpus del Español del Reino de Granada (CORDEREGRA) to assess the materials of this historical-linguistic corpus and to test whether the synchronic criteria can be applied to the study of documents from other centuries.

Relevance:

20.00%

Abstract:

When compiling a dictionary of adjectives, whether monolingual or bilingual, the first task facing the lexicographer is to define what an adjective is, a question that has still not been satisfactorily resolved. German has a series of words that have traditionally been described as adjectives with exclusively predicative function, whose status as adjectives is nevertheless questioned by some authors. This article seeks to determine whether these words can really appear only in predicative function, how dictionaries and grammars describe them, and what their main Spanish equivalents are, in order to decide whether they should be included in a corpus intended for the compilation of a German-Spanish syntactic dictionary of adjectives.

Relevance:

20.00%

Abstract:

This paper studies how se structures are represented in 20 verb entries across nine dictionaries of Spanish. These structures are numerous and are problematic for both native and non-native speakers. The verbs analysed are of medium-to-high frequency and, in most cases, highly polysemous, which makes it possible to observe the interconnections between the different se structures and the different meanings of each verb. The data from the lexicographic analysis are cross-checked against a corpus analysis of the same units. The results show wide variation, both between and within dictionaries, in the data each dictionary offers and in the way they are presented. The reasons range from the overall theoretical approach of each project to its practical execution. We conclude that further progress is needed in the dictionary model currently in use in order to present lexico-grammatical phenomena such as se verbs accurately, clearly and exhaustively.

Relevance:

20.00%

Abstract:

The Semantic Annotation component is a software application that provides support for automated text classification, a process grounded in a cohesion-centered representation of discourse that facilitates topic extraction. The component enables the semantic meta-annotation of text resources, including automated classification, thus facilitating information retrieval within the RAGE ecosystem. It is available in the ReaderBench framework (http://readerbench.com/), which integrates advanced Natural Language Processing (NLP) techniques. The component makes use of Cohesion Network Analysis (CNA) to ensure an in-depth representation of discourse, useful for mining keywords and performing automated text categorization. Our component automatically classifies documents into the categories provided by the ACM Computing Classification System (http://dl.acm.org/ccs_flat.cfm), and also into the categories of a high-level serious games categorization provisionally developed by RAGE. English and French are already covered by the provided web service, and the framework as a whole can be extended to support additional languages.

Relevance:

20.00%

Abstract:

The mismatch between human capacity and the acquisition of Big Data such as Earth imagery undermines commitments to the Convention on Biological Diversity (CBD) and the Aichi targets. Artificial intelligence (AI) solutions to Big Data issues are urgently needed, as these could prove to be faster, more accurate, and cheaper. Reducing the costs of managing protected areas in remote deep waters and in the High Seas is of great importance, and this is a realm where autonomous technology will be transformative.

Relevance:

20.00%

Abstract:

The dissertation starts from the premise that changing the German education system on children's-rights and democratic grounds is an essential task for the future. This holds in particular against the background of children's-rights demands such as those arising from Germany's status as a party to the UN Convention on the Rights of the Child and the UN Convention on the Rights of Persons with Disabilities: the correlation between social background and educational success, and the still conceptually underdeveloped approaches to inclusion, are identified as defining the debate. The thesis argues that teachers can play a central role in a necessary process of transforming the German education system and, with emancipatory intent, focuses on teachers as its target group. Given the structure of teacher education, it further argues that, for the sake of sustainability and comprehensive reach, such changes should be anchored in the first, university-based phase of teacher education. The standards for teacher education (Standards für die Lehrerbildung) in the field of educational sciences are named as an instrument that meets the criteria of sustainability and comprehensive relevance, and they are given a theoretical underpinning. Broadly, the perspective of the work spans the structure of the education system, the knowledge and skills of teachers and learners in the sense of competence orientation, and the stance that teachers adopt. It shows that these factors influence one another in many ways. On this basis, the German education system is first described in terms of its structures, and teachers and learners are sketched as participants in education.
An excursive comparison of three European education systems, selected on the basis of children's-rights parameters, also explores the extent to which the identified determinants (structure, knowledge/skills and stance) influence how far an education system is constituted in accordance with children's rights, with the basic principles of the UN Convention on the Rights of the Child serving as the measuring instrument. On this basis, pedagogical stance emerges as a key influencing factor, which in the further course of the work is condensed, in the spirit of critical pedagogy, into a conception of Pedagogical Responsibility. Against this background, current problems in the German education system are examined, following the training priorities identified in the standards for teacher education and again taking the principles and legal requirements of the UN Convention on the Rights of the Child as the yardstick. On the basis of this comprehensive discussion, a subsequent analytical step adds annotations that extend and supplement the content of the wording of the 11 standards for teacher education. In conjunction with a Pedagogical Responsibility committed to the development of critical maturity, the annotated standards are understood and presented as a means of children's-rights-based (self-)evaluation for teachers and as an instrument of teacher education focused on children's-rights considerations.

Relevance:

20.00%

Abstract:

This study is a corpus-based comparison of student essays written in the subject areas of English linguistics and literature at undergraduate level. The corpus consists of 200 Bachelor's degree theses submitted to a variety of university departments (such as English, Language and Literature, Humanities, Social and Intercultural Studies) in Sweden. The comparison concerns the frequencies of core modal verbs and how often they occur together with the subject pronouns I, we and it, and in the structures this/the [essay, study, project, thesis], when students attempt to communicate their personal claims. Quantitative and qualitative analyses of the essays show few similarities in the ways that core modal verbs appear in the two disciplines. The results indicate mainly distinct differences, especially in relation to clusters and variation of performative verbs. Specific patterns in the ways that students use core modal verbs as hedges have also been identified.

Relevance:

20.00%

Abstract:

In linguistics, mainly for English, the Gunning Fog Index is used to determine the readability of a text. The index estimates the years of formal education needed to understand the text on a first reading. An index of 11 years corresponds to a person who has finished school (Gunning, 1973). In this study we analyse how the index varies when the way one of its parameters is obtained is changed. The original formula treats words of three or more syllables as “complex words”. In their place we use “unknown words”, those whose use is unfamiliar according to a corpus built during the research from millions of books digitised by Google and Harvard University. Although the variation in the results will depend on the value assigned to decide whether a word is unknown, the study is a pioneer in using a corpus to calculate the Fog Index.
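
As a rough illustration of the substitution this abstract describes, the sketch below computes the standard Gunning Fog Index, 0.4 × (words per sentence + 100 × hard words / words), with a pluggable test for what counts as a hard word. The syllable counter is a crude vowel-group heuristic, and the frequency table and threshold in `unfamiliar` are hypothetical stand-ins for the corpus the authors built; none of this is taken from their implementation.

```python
import re

def count_syllables(word):
    """Crude English estimate: number of consecutive-vowel groups (assumption)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fog_index(text, is_hard=None):
    """Gunning Fog Index: 0.4 * (words/sentences + 100 * hard_words/words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    if is_hard is None:
        # Original criterion: a word with three or more syllables is "complex".
        is_hard = lambda w: count_syllables(w) >= 3
    hard = sum(1 for w in words if is_hard(w))
    return 0.4 * (len(words) / len(sentences) + 100 * hard / len(words))

def unfamiliar(freq, threshold):
    """Alternative criterion: a word is "unknown" if its corpus count is below
    a threshold. Both `freq` and `threshold` are hypothetical inputs."""
    return lambda w: freq.get(w.lower(), 0) < threshold
```

Passing `is_hard=unfamiliar(freq, threshold)` swaps the syllable criterion for the corpus-based one, which makes the dependence on the chosen threshold easy to explore.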

Relevance:

20.00%

Abstract:

For some years now the Internet and World Wide Web communities have envisaged moving to a next generation of Web technologies by promoting a globally unique, and persistent, identifier for identifying and locating many forms of published objects. These identifiers are called Uniform Resource Names (URNs) and they hold out the prospect of being able to refer to an object by what it is (signified by its URN), rather than by where it is (the current URL technology). One early implementation of URN ideas is the Unicode-based Handle technology, developed at CNRI in Reston, Virginia. The Digital Object Identifier (DOI) is a specific URN naming convention proposed just over 5 years ago and is now administered by the International DOI organisation, founded by a consortium of publishers and based in Washington DC. The DOI is being promoted for managing electronic content and for intellectual rights management of it, either using the published work itself or, increasingly, via metadata descriptors for the work in question. This paper describes the use of the CNRI handle parser to navigate a corpus of papers for the Electronic Publishing journal. These papers are in PDF format and based on our server in Nottingham. For each paper in the corpus a metadata descriptor is prepared for every citation appearing in the References section. The important factor is that the underlying handle is resolved locally in the first instance. In some cases (e.g. cross-citations within the corpus itself and links to known resources elsewhere) the handle can be handed over to CNRI for further resolution. This work shows the encouraging prospect of being able to use persistent URNs not only for intellectual property negotiations but also for search and discovery. In the test domain of this experiment every single resource, referred to within a given paper, can be resolved, at least to the level of metadata about the referred object.
If the Web were to become more fully URN aware then a vast directed graph of linked resources could be accessed, via persistent names. Moreover, if these names delivered embedded metadata when resolved, the way would be open for a new generation of vastly more accurate and intelligent Web search engines.
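
The local-first resolution described in this abstract can be pictured with a minimal sketch. Everything here is hypothetical: the handle strings, the record layout, the `forward` flag and the `remote` callable are illustrative assumptions and do not reflect the CNRI handle parser or the authors' system.

```python
# Hypothetical local table: handle -> metadata descriptor for a cited work.
LOCAL_RECORDS = {
    "example/ep-paper-1": {"title": "Hypothetical corpus paper", "forward": True},
    "example/external-book": {"title": "Hypothetical cited book", "forward": False},
}

def resolve(handle, remote=None, records=LOCAL_RECORDS):
    """Resolve a handle locally first; optionally hand it on for further resolution.

    `remote` is an assumed callable (handle -> location or None) standing in
    for an external resolution service.
    """
    record = records.get(handle)
    if record is None:
        return None  # unknown handle: nothing to return, not even metadata
    if record.get("forward") and remote is not None:
        location = remote(handle)
        if location is not None:
            # Further resolution succeeded: metadata plus a location.
            return {**record, "location": location}
    # Otherwise resolution stops at the level of metadata about the object.
    return record
```

Every known citation resolves at least to its metadata, and only records flagged for forwarding are passed to the external resolver.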

Relevance:

20.00%

Abstract:

Relationships between organisms within an ecosystem are one of the main focuses of the study of ecology and evolution. For instance, host-parasite interactions have long been of close interest to ecology, evolutionary biology and conservation science because of the great variety of strategies and interaction outcomes. The monogenean ecto-parasites make up a significant portion of the flatworms. Gyrodactylus salaris is a monogenean freshwater ecto-parasite of Atlantic salmon (Salmo salar) whose damage can make fish prone to further bacterial and fungal infections. G. salaris is the only such parasite whose genome has been studied so far. The RNA-seq data analyzed in this thesis had already been annotated using LAST. The RNA-seq data were obtained by Illumina sequencing, and the resulting reads were assembled into 15777 transcripts. LAST annotated 46% of the transcripts, and the rest remained unknown. This thesis started from the whole data set and continued the annotation process using PANNZER, CDD and InterProScan. This annotation resulted in 56% of sequences being successfully annotated, with parasite-specific proteins identified. This thesis presents the first monogenean transcriptomic information, which provides an important resource for further research on this species. Additionally, a comparison of annotation methods revealed that description-based and domain-based methods perform better than simple similarity search methods. These tools and databases are therefore recommended for functional annotation. The results also emphasize the need to use multiple methods and databases, and they highlight the need for more genomic information on G. salaris.

Relevance:

20.00%

Abstract:

This thesis investigates the standardisation of Modern Scottish Gaelic orthography from the mid-eighteenth century to the twenty-first. It presents the results of the first corpus-based analysis of Modern Scottish Gaelic orthographic development combined with an analytic approach that places orthographic choices in their sociolinguistic context. The theoretical framework behind the analysis centres on discussion of how the language ideologies of the phonographic ideal, historicism, autonomy, vernacularism and the ideology of the standard itself have shaped orthographic conventions and debates. It argues that current spelling norms reflect an orthography that is the result of compromise, historical factors and pragmatic function. The research uses a digital corpus to examine how three particular features have been used over time: the dialect variation between <eu> and <ia>; variation in s + stop consonant clusters (sd/st, sg/sc, sb/sp); and the use of the grave and acute accents. Evidence is drawn from the Corpas na Gàidhlig electronic corpus created at the University of Glasgow: the sub-corpus used in this study includes 117 published texts representing a period of over 250 years from 1750 to 2007, and a total size of over four and a quarter million words. The results confirm a key period of reform between 1750 and the early nineteenth century, and thereafter a settled norm being established in the early nineteenth century. Since then, some variation has been acceptable although changes and reform of some features have centred on increasing uniformity and regularisation.
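
One of the three features named above, variation in s + stop clusters, lends itself to a simple corpus count. The sketch below is only an illustration under stated assumptions: it counts every occurrence of the six spellings in raw text with a regular expression, which is far cruder than a real orthographic analysis, and the function names and sample strings are invented for the example.

```python
import re
from collections import Counter

# The six spellings named in the abstract, as competing pairs.
PAIRS = (("sd", "st"), ("sg", "sc"), ("sb", "sp"))
CLUSTER = re.compile(r"s[dtgcbp]")

def cluster_counts(text):
    """Count occurrences of each s + stop spelling in a text (case-insensitive)."""
    return Counter(m.group(0) for m in CLUSTER.finditer(text.lower()))

def share(text, variant, rival):
    """Proportion of `variant` among occurrences of `variant` and `rival`."""
    counts = cluster_counts(text)
    total = counts[variant] + counts[rival]
    return counts[variant] / total if total else None
```

Applying `share` to texts grouped by publication date would give a rough picture of how one spelling gains ground over its rival across the period covered by a corpus.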