542 results for corpora allata


Relevance:

10.00% 10.00%

Publisher:

Abstract:

Biomedical research is currently facing a new type of challenge: an excess of information, both in terms of raw data from experiments and in the number of scientific publications describing their results. Mirroring the focus on data mining techniques to address the issues of structured data, there has recently been great interest in the development and application of text mining techniques to make more effective use of the knowledge contained in biomedical scientific publications, accessible only in the form of natural human language. This thesis describes research done in the broader scope of projects aiming to develop methods, tools and techniques for text mining tasks in general and for the biomedical domain in particular. The work described here involves more specifically the goal of extracting information from statements concerning relations of biomedical entities, such as protein-protein interactions. The approach taken is one using full parsing (syntactic analysis of the entire structure of sentences) and machine learning, aiming to develop reliable methods that can further be generalized to apply also to other domains. The five papers at the core of this thesis describe research on a number of distinct but related topics in text mining. In the first of these studies, we assessed the applicability of two popular general English parsers to biomedical text mining and, finding their performance limited, identified several specific challenges to accurate parsing of domain text. In a follow-up study focusing on parsing issues related to specialized domain terminology, we evaluated three lexical adaptation methods. We found that the accurate resolution of unknown words can considerably improve parsing performance and introduced a domain-adapted parser that reduced the error rate of the original by 10% while also roughly halving parsing time.
To establish the relative merits of parsers that differ in the applied formalisms and the representation given to their syntactic analyses, we have also developed evaluation methodology, considering different approaches to establishing comparable dependency-based evaluation results. We introduced a methodology for creating highly accurate conversions between different parse representations, demonstrating the feasibility of unification of diverse syntactic schemes under a shared, application-oriented representation. In addition to allowing formalism-neutral evaluation, we argue that such unification can also increase the value of parsers for domain text mining. As a further step in this direction, we analysed the characteristics of publicly available biomedical corpora annotated for protein-protein interactions and created tools for converting them into a shared form, thus contributing also to the unification of text mining resources. The introduced unified corpora allowed us to perform a task-oriented comparative evaluation of biomedical text mining corpora. This evaluation established clear limits on the comparability of results for text mining methods evaluated on different resources, prompting further efforts toward standardization. To support this and other research, we have also designed and annotated BioInfer, the first domain corpus of its size combining annotation of syntax and biomedical entities with a detailed annotation of their relationships. The corpus represents a major design and development effort of the research group, with manual annotation that identifies over 6000 entities, 2500 relationships and 28,000 syntactic dependencies in 1100 sentences. In addition to combining these key annotations for a single set of sentences, BioInfer was also the first domain resource to introduce a representation of entity relations that is supported by ontologies and able to capture complex, structured relationships.
Part I of this thesis presents a summary of this research in the broader context of a text mining system, and Part II contains reprints of the five included publications.
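
As an illustration of what a dependency-based evaluation can look like, here is a minimal sketch. It assumes each parse is represented as a set of (head, dependent, type) triples; that representation, the function name `dependency_prf` and the toy data are assumptions made for this example and are not taken from the thesis.

```python
from typing import Set, Tuple

# Assumed representation: (head token index, dependent token index, dependency type)
Dependency = Tuple[int, int, str]


def dependency_prf(gold: Set[Dependency],
                   predicted: Set[Dependency]) -> Tuple[float, float, float]:
    """Return precision, recall and F1 of predicted dependencies against gold."""
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical toy parses: the parser attaches token 4 to the wrong head.
gold = {(2, 1, "subj"), (2, 3, "obj"), (3, 4, "mod")}
predicted = {(2, 1, "subj"), (2, 3, "obj"), (2, 4, "mod")}

precision, recall, f1 = dependency_prf(gold, predicted)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Scoring on sets of dependencies rather than on parser-specific tree structures is one way to compare parsers built on different formalisms, provided their output has first been converted to a shared scheme.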

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The aim of this paper is to reflect on the use of computerized corpora. The case we present is linked to an R&D project on the grammaticalization of verbal periphrases (GRAPEVERBA). To carry out this study, we extracted the occurrences from the two Academy corpora, CORDE and CREA. The lack of lemmatization and tagging in both corpora posed a problem that is difficult to solve, since the number of examples obtained is excessively high. Another problem concerns the textual editions of the works included in the Academy's corpora, especially in CORDE. Quite frequently, these editions are not contemporary with the original manuscripts, which seriously compromises the conclusions drawn about the grammaticalization of some verbal periphrases, for example tener + (a/de) + infinitive.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The article describes some concrete problems that were encountered when writing a two-level model of Mari morphology. Mari is an agglutinative Finno-Ugric language spoken in Russia by about 600 000 people. The work was begun in the 1980s on the basis of K. Koskenniemi’s Two-Level Morphology (1983), but in the latest stage R. Beesley’s and L. Karttunen’s Finite State Morphology (2003) was used. Many of the problems described in the article concern the inexplicitness of the rules in Mari grammars and the lack of information about the exact distribution of some suffixes, e.g. enclitics. The Mari grammars usually give complete paradigms for a few unproblematic verb stems, whereas the difficult or unclear forms of certain verbs are only superficially discussed. Another example of phenomena that are poorly described in grammars is the way suffixes with an initial sibilant combine with stems ending in a sibilant. The help of informants and searches of electronic corpora were used to overcome such difficulties in the development of the two-level model of Mari. The variation of the order of plural markers, case suffixes and possessive suffixes is a typical feature of Mari. The morphotactic rules constructed for Mari declensional forms tend to be recursive and their productivity must be limited by some technical device, such as filters. In the present model, certain plural markers were treated like nouns. The positional and functional versatility of the possessive suffixes can be regarded as the most challenging phenomenon in attempts to formalize Mari morphology. Cyrillic orthography, which was used in the model, also caused problems. For instance, a Cyrillic letter may represent a sequence of two sounds, the first being part of the word stem while the other belongs to a suffix. In some cases, letters for voiced consonants are also generalized to represent voiceless consonants.
Such orthographical conventions distance a morphological model based on orthography from the actual (morpho)phonological processes in the language.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

During the process of language development, one of the most important tasks that children must face is that of identifying the grammatical category to which words in their language belong. This is essential in order to be able to form grammatically correct utterances. How do children proceed in order to classify words in their language and assign them to their corresponding grammatical category? The present study investigates the usefulness of phonological information for the categorization of nouns in English, given that phonology is the first source of information that might be available to prelinguistic infants who lack access to semantic information or complex morphosyntactic information. We analyse four different corpora containing linguistic samples of English-speaking mothers addressing their children in order to explore the reliability with which words are represented in mothers’ speech based on several phonological criteria. The results of the analysis confirm the prediction that most of the words to which English-learning infants are exposed during the first two years of life can be accounted for in terms of their phonological resemblance

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This dissertation considers the segmental durations of speech from the viewpoint of speech technology, especially speech synthesis. The idea is that better models of segmental durations lead to higher naturalness and better intelligibility. These features are the key factors for better usability and generality of synthesized speech technology. Even though the studies are based on a Finnish corpus, the approaches apply to other languages as well. This is possible because most of the studies included in this dissertation concern universal effects taking place at utterance boundaries. The methods invented and used here are also suitable for studies of other languages. This study is based on two corpora of news reading speech and sentences read aloud. One corpus is read aloud by a 39-year-old male, whilst the other consists of several speakers in various situations. The purpose of using two corpora is twofold: it allows a comparison of the corpora and gives a broader view on the matters of interest. The dissertation begins with an overview of the phonemes and the quantity system of the Finnish language. In particular, we cover the intrinsic durations of phonemes and phoneme categories, as well as the difference in duration between short and long phonemes. The phoneme categories are presented to alleviate the problem of variability of speech segments. In this dissertation we cover the boundary-adjacent effects on segmental durations. In initial positions of utterances we find that there seems to be initial shortening in Finnish, but the result depends on the level of detail and on the individual phoneme. On the phoneme level we find that the shortening or lengthening only affects the very first phonemes at the beginning of an utterance. However, on the word level, the effect on average seems to shorten the whole first word. We establish the effect of final lengthening in Finnish.
The effect in Finnish has been an open question for a long time, and Finnish has been the last missing piece for it to be considered a universal phenomenon. Final lengthening is studied from various angles, and it is also shown that it is not a mere effect of prominence or of a speech corpus with high inter- and intra-speaker variation. The effect of final lengthening seems to extend from the final to the penultimate word. On the phoneme level it reaches a much wider area than the initial effect. We also present a normalization method suitable for corpus studies on segmental durations. The method uses an utterance-level normalization approach to capture the pattern of segmental durations within each utterance. This prevents various problematic variations within the corpora from affecting the results. The normalization is used in a study on final lengthening to show that the results on the effect are not caused by variation in the material. The dissertation presents an implementation of speech synthesis on a mobile platform and its performance. We find that the rule-based method of speech synthesis runs in real time as a software solution, but the signal generation process slows the system down beyond real time. Future aspects of speech synthesis on limited platforms are discussed. The dissertation also considers ethical issues in the development of speech technology. The main focus is on the development of speech synthesis with high naturalness, but the problems and solutions are applicable to any other speech technology approaches.
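
The abstract does not specify the normalization formula. As a rough sketch of what utterance-level normalization could look like, the example below assumes a z-score computed separately within each utterance; the function name `normalize_by_utterance`, the input layout and the toy durations are all hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List


def normalize_by_utterance(durations: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Z-score each segment duration against its own utterance (an assumed scheme)."""
    normalized: Dict[str, List[float]] = {}
    for utterance_id, values in durations.items():
        mu = mean(values)
        sigma = pstdev(values)
        if sigma == 0:
            # All segments equally long: no within-utterance pattern to express.
            normalized[utterance_id] = [0.0 for _ in values]
        else:
            normalized[utterance_id] = [(v - mu) / sigma for v in values]
    return normalized


# Hypothetical segment durations in milliseconds: utt2 is spoken twice as slowly
# as utt1 but has the same relative duration pattern.
durations_ms = {
    "utt1": [60.0, 80.0, 100.0],
    "utt2": [120.0, 160.0, 200.0],
}
z = normalize_by_utterance(durations_ms)
print(z["utt1"])
print(z["utt2"])
```

Under this assumption the two utterances yield the same normalized pattern, which shows how differences in overall speaking rate between utterances or speakers can be factored out before boundary effects are compared.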

Relevance:

10.00% 10.00%

Publisher:

Abstract:

During the past century, an increasingly diverse world provided us with opportunities for intercultural communication; in particular, the growth of commerce at all levels from domestic to international has made it necessary to combine the theories of intercultural communication and international business. As one of the main beneficiaries of international business in recent years, companies in the airline industry have developed their international markets. For instance, Finnair has developed its Asian strategy, which responds to the increasing market demand for flights from Europe to Asia in the new millennium. The company therefore manages marketing communication in a global environment and is a suitable case for studying the theories of intercultural communication in the context of international marketing. Finnair implemented a large number of international advertisements to promote its Asian routes, in which Asia has been constructed as a number of exotic destinations. Meanwhile, the company itself, as a provider of these destinations, has also been constructed contrastively. Thus, this thesis aims to investigate how Finnair constructs Asia and the company itself in the new millennium, and how these constructions compare with the theories of intercultural communication. This research applied the theories of international marketing, intercultural communication and culture. In order to analyze the collected corpora, namely Finnair’s international advertisements and its annual reports in the new millennium, the methods of content analysis and discourse analysis have been used. As a result, Finnair has purposefully applied the essentialist approach to intercultural communication and constructed Asia as an exotic “Other” due to the company’s market orientation. Meanwhile, Finnair has also constructed two identities for the company itself based on the same approach: as an international airline provider between Europe and Asia, and as a part of Finnish society.
The combination of intercultural communication and international marketing theories, together with the combination of the methods of content analysis and discourse analysis, ensures the originality of this paper.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The Department of French Studies of the University of Turku (Finland) organized an International Bilingual Conference on Crosscultural and Crosslinguistic Perspectives on Academic Discourse from 20 to 22 May 2005. The event hosted specialists on Academic Discourse from Belgium, Finland, France, Germany, Italy, Norway, Spain, and the USA. This book is the first volume in our series of publications on Academic Discourse (AD hereafter). The following pages are composed of selected papers from the conference and focus on different aspects and analytical frameworks of Academic Discourse. One of the motivations behind organizing the conference was to examine and expand research on AD in different languages. Another was to question to what extent academic genres are culture-bound and language specific or primarily field or domain specific. Research on AD has long been concerned mainly with the use of English in different academic settings (mainly written contexts), at the expense of other languages. Alternatively, the academic genre conventions of English and the English-speaking world have served as a basis for comparison with other languages and cultures. We consider this first volume to be a strong contribution to broadening AD research to languages other than English, namely Finnish, French, Italian, Norwegian and Romanian in this book. All the following articles have a strong link with the French language: either French is constitutive of the AD corpora under examination or the article was written in French. The structure of the book suggests and provides evidence that the concept of AD is understood and tackled to varying degrees by different scholars. Our first volume opens up the discussion on what AD is and supports the dissemination, overlapping and expansion of current research questions and methodologies. The book is divided into three parts and contains four articles in English and six articles in French.
The papers in part one and part two cover what we call the prototypical genre of written AD, i.e. the research article. Part one follows up on issues linked to the Research Article (RA hereafter). Kjersti Fløttum asks whether a typical RA exists and concentrates on authors’ voices in RA (self and other dimensions), whereas Didriksen and Gjesdal’s article focuses on individual variation of the author’s voice in RA. The last article in this section is by Nadine Rentel and deals with evaluation in the writing of RA. Part two concentrates on the teaching and learning of AD within foreign language learning, another more or less canonical genre of AD. Two aspects of writing are covered in the first two articles: foreign students’ representations of rhetorical traditions (Hidden) and a contrastive assessment of written exercises in French and Finnish in Higher Education (Suzanne). The last contribution in this section on AD moves away from traditional written forms and looks at how argumentation is constructed in students’ oral presentations (Dervin and Fauveau). The last part of the book continues the extension by featuring four articles written in French exploring institutional and scientific discourses. Institutional discourses under scrutiny include the European Bologna Process (Galatanu) and Romanian reform texts (Moilanen). As for scientific discourses, the next paper in this section deconstructs an ideological discourse on the didactics of French as a foreign language (Pescheux). Finally, the last paper in part three reflects on varied forms of AD at university (Defays). We hope that this book will add some fuel to the continuing discussion of diverse forms of and approaches to AD, in different languages and voices! Needless to say, with the current upsurge in academic mobility, reflection on crosscultural and crosslinguistic AD has only just begun.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

On 21, 22 and 23 September 2006, the Department of French Studies of the University of Turku (Finland) organized an international bilingual (English and French) conference on the theme of academic mobility; the aim of the meeting was to make possible an international and multidisciplinary forum that could host debates among the various actors in academic mobility (that is, students, researchers, teaching and administrative staff, etc.). More than fifty speakers contributed (from fields as varied as linguistics, educational sciences, didactics, anthropology, sociology, psychology, history, geography, etc.), along with five renowned speakers. Most of the themes addressed during the conference covered the following fields: the organization of mobility, the obstacles encountered by candidates for mobility, the integration of exchange students, the development of study programmes, virtual mobility, language learning and teaching, intercultural awareness, the development of competences, and the perception of the academic mobility system and its impact on actual mobility. The interest of the work carried out during the conference lies notably in the fact that it brings together not only the perspectives of international and exchange students (as is the case in most research already conducted on this subject), but also those of other groups: teachers, researchers, etc. The following contribution contains a first corpus of seventeen articles, divided into three sections: 1. Impacts of student mobility; 2. Language training; 3. Improving academic mobility. Like the conference, the publication that follows is bilingual: eight of the articles are written in French and the other nine in English.
Some authors were unable to attend the conference but nevertheless wished to appear in this volume. In the first section of the volume, Sandrine Billaud seeks to bring to light the main obstacles to student mobility in France (housing, organization of universities, administrative procedures) and proposes some avenues for improvement. Next comes an article by Dominique Ulma, who examines academic mobility within the Instituts Universitaires de Formation des Maîtres (IUFM); she focuses in particular on trainees' enthusiasm for mobility and on the benefits that Erasmus mobility brings to this specific type of student. Then, in a third article, Magali Hardoin examines the educational potential of trainee teachers' mobility and seeks to define its impact on the construction of their professional profile. After that comes a group of three articles, all based on observations made in Spanish higher education, which deal respectively with the significance of the triple degree programme in applied European languages for mobile students (Marián Morón-Martín), the consequences of the presence of foreign students in translation classes (Dimitra Tsokaktsidu), and the realities of the integration of American exchange students on a Spanish campus (Guadalupe Soriano Barabino). The last article in the section, based on a study of the situation in Japanese institutions, reports on the state of double degree programmes between Japanese and foreign institutions and attempts to determine the exact impact of such programmes on Japanese institutions (Mihoko Teshigawara, Riichi Murakami and Yoneo Yano).
The second section is devoted to the relationship between language learning and teaching and academic mobility. In a first article, Martine Eisenbeis looks at multimedia modules based on the film "L'auberge espagnole" by Cédric Klapish (2001) and intended for mobile students wishing to learn and/or improve their French through less conventional methods. Next come the articles by Jeanine Gerbault and Sabine Ylönen, which deal with a European project aiming to support student mobility by creating a multimedia programme of linguistic and cultural training for mobile students (the project is called EUROMOBIL). Then an article by Pascal Schaller looks at the different types of activities that students staying abroad experience as part of their language training. Finally, the section closes with a contribution by Patricia Kohler-Bally devoted to a bilingual programme coordinated by the University of Fribourg (Switzerland). The third and final section offers some avenues for reflection aimed at improving the academic mobility of students and teachers; in this context it addresses the questions of equality in access to student mobility, of the preparation that mobility requires, and of intercultural awareness. In a first chapter, Javier Mato and Bego

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The present study focuses on the zero person constructions both in Finnish and Estonian. In the zero person construction, there is no overt subject and the verb is in the 3rd person singular form: Fin. Tammikuussa voi hiihtää Etelä-Suomessakin. Est. Jaanuaris saab suusatada ka Lõuna-Soomes ‘In January one can ski even in southern Finland’. The meaning of the zero construction is usually considered generic and open. However, the zero may be interpreted as indexically open so that the reference can be construed from the context. This study demonstrates how the zero may be interpreted as referring to the speaker, the addressee, or anybody. The zero person construction in Finnish has been contrasted with the generic pronoun constructions in Indo-European languages. For example, the zero person is translated into English as you or one, and into Swedish and German as man. The grammar and semantics of the Finnish zero person construction have been studied earlier to some extent. However, the differences and similarities between Finnish and Estonian, two closely related languages, have not been thoroughly studied before. The present doctoral thesis sheds light on the zero person construction, its use, functions, and interpretation both in Finnish and Estonian. The approach taken is contrastive. The data comes from magazine articles published in Finnish and translated into Estonian. The data consists of Finnish sentences with the zero person and their Estonian translations. In addition, the data includes literary fiction and non-translated Estonian corpus texts. Estonian and Finnish are closely related and in principle the personal system of the two languages is almost identical; nevertheless, there are interesting differences. The present study shows that the zero person construction is not as common in Estonian as it is in Finnish. In my data, a typical sentence with the zero person in both languages is a generic statement which tells us what can or cannot be done.
When making generic statements the two languages are relatively similar, especially when the zero person is used together with a modal verb. The modal verbs (e.g. Fin. voida ‘can’, saada ‘may’, täytyä ‘must’; Est. võima ‘may’, saama ‘can’, tulema ‘must’) are the most common verbs in both Finnish and Estonian zero person constructions. Significant differences appear when a non-modal verb is used. Overall, non-modal verbs are used less frequently in both languages. Verbs with relatively low agentivity or intentionality, such as the perception verb nähdä in Finnish and nägema in Estonian, are used in zero person clauses in both languages to a certain extent. Verbs with more agentive and intentional properties are not used in Estonian zero person clauses; in Finnish their use is restricted to specific contexts. The if–then frame provides a suitable context for the zero person in Finnish, and the Finnish zero person may occur together with any kind of verb in a conditional if-clause. Estonian if-clauses are not suitable contexts for the zero person. Estonian counterparts of Finnish if-clauses with the zero person usually have a da-infinitive, a generic 2nd person singular or a passive form instead. The aim of this study was to analyze motivations for choosing the zero person in certain contexts. In Estonian, the use of the zero person constructions is more limited than in Finnish, and some of the constraints are grammatical. On the other hand, some of the constraints are motivated by the differences in actual language use. Contrasting the two languages reveals interesting differences and similarities between them and shows how these languages may use similar means differently.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Heli Kautonen's presentation at the seminar Epics, Digital Cultural Heritage and Vernacular Languages. Corpora and Databases in Oral Tradition Research, held in Helsinki on 2 March 2013.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Nine cases of renal cell carcinoma were found in a survey of 586 tumors in cattle from 6,706 necropsies performed in this species over a period of 45 years (1964-2008). Six cattle died from complications of the tumor and three cases were incidental findings. Cattle affected by renal cell carcinoma showed the following clinical signs: weight loss (5 cases), palpable abdominal masses (4 cases), respiratory difficulty (4 cases), cough (4 cases), hyporexia (3 cases), anorexia (2 cases), abdominal pain (2 cases) and fever (1 case). The clinical signs observed were related to the impairment induced by metastases, which were observed in all nine cases. Metastases were observed in the abdominal lymph nodes, serosal surfaces, liver and lung. Two cattle had bilateral renal tumors. Microscopically, tubular, solid, mixed solid and tubular, and tubulopapillary patterns were observed. The eosinophilic cell type was predominant; only one solid tumor consisted basically of clear cells. The scirrhous reaction ranged from mild to marked. Corpora amylacea were a common finding. All tumors stained positively for cytokeratin AE1/AE3 with different degrees of intensity. Immunostaining for CD10 was observed in all cases tested. CD10 stained intensely in the clear cell renal cell carcinoma (RCC); in the others, staining was observed in isolated cells and was less intense. Three tumors showed isolated, mild staining for the anti-PAX-2 antibody. Evaluation was negative for cytokeratin 34β12, c-KIT (CD117), S-100, chromogranin A and surfactant apoprotein A. The results indicate that RCC is uncommon in cattle in southern Brazil, with an average of 1.3 cases per thousand necropsies performed, and that the anti-CD10 antibody is useful in the diagnosis of RCC in cattle.
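
The reported rate of 1.3 cases per thousand necropsies can be checked directly from the counts given in the abstract (9 cases, 586 tumors, 6,706 necropsies). The short sketch below only redoes that arithmetic; the variable names are illustrative, and the share of all tumors is a derived figure that the abstract itself does not report.

```python
# Counts taken from the abstract above.
rcc_cases = 9
bovine_tumors = 586
necropsies = 6706

# Cases per thousand necropsies (the abstract reports 1.3).
rate_per_thousand = rcc_cases / necropsies * 1000

# Derived here, not stated in the abstract: share of all bovine tumors.
share_of_tumors_pct = rcc_cases / bovine_tumors * 100

print(f"{rate_per_thousand:.2f} cases per 1000 necropsies")
print(f"{share_of_tumors_pct:.2f}% of tumors in the survey")
```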

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Collared peccaries (Pecari tajacu) are among the most hunted species in Latin America due to the appreciation of their pelt and meat. In order to optimize breeding management of captive-born collared peccaries in semiarid conditions, the objective was to describe and correlate the changes in the ovarian ultrasonographic pattern, hormonal profile, vulvar appearance, and vaginal cytology during the estrous cycle in this species. Over 45 days, females (n=4) underwent blood collection every three days for hormone measurement by enzyme immunoassay (EIA). On the same occasions, evaluation of external genitalia, ovarian ultrasonography and vaginal cytology were conducted. Results are presented as means and standard deviations. Based on the hormone measurements, six estrous cycles were identified, lasting 21.0 ± 5.7 days, with on average 6 days for the estrogenic phase and 15 days for the progesterone phase. Estrogen presented mean peak values of 55.6 ± 20.5 pg/mL. During the luteal phase, the highest values for progesterone were 35.3 ± 4.4 ng/mL. The presence of vaginal mucus, a reddish vaginal mucosa and the separation of the vulvar lips were verified in all animals during the estrogenic peak. Through ultrasonography, ovarian follicles measuring 0.2 ± 0.1 cm were visualized during the estrogen peak. Corpora lutea presented as hyperechoic regions measuring 0.4 ± 0.2 cm, identified during the luteal phase. No significant differences (P>0.05) between proportions of vaginal epithelial cells were identified when comparing estrogenic and progesterone phases. In conclusion, female collared peccaries, captive born in semiarid conditions, have an estrous cycle that lasts 21.0 ± 5.7 days, with signs of estrus characterized by vulvar lip edema and hyperemic vaginal mucosa, coinciding with developed follicles and high estrogen levels.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Biomedical natural language processing (BioNLP) is a subfield of natural language processing, an area of computational linguistics concerned with developing programs that work with natural language: written texts and speech. Biomedical relation extraction concerns the detection of semantic relations such as protein-protein interactions (PPI) from scientific texts. The aim is to enhance information retrieval by detecting relations between concepts, not just individual concepts as with a keyword search. In recent years, events have been proposed as a more detailed alternative for simple pairwise PPI relations. Events provide a systematic, structural representation for annotating the content of natural language texts. Events are characterized by annotated trigger words, directed and typed arguments and the ability to nest other events. For example, the sentence “Protein A causes protein B to bind protein C” can be annotated with the nested event structure CAUSE(A, BIND(B, C)). Converted to such formal representations, the information of natural language texts can be used by computational applications. Biomedical event annotations were introduced by the BioInfer and GENIA corpora, and event extraction was popularized by the BioNLP'09 Shared Task on Event Extraction. In this thesis we present a method for automated event extraction, implemented as the Turku Event Extraction System (TEES). A unified graph format is defined for representing event annotations and the problem of extracting complex event structures is decomposed into a number of independent classification tasks. These classification tasks are solved using SVM and RLS classifiers, utilizing rich feature representations built from full dependency parsing. Building on earlier work on pairwise relation extraction and using a generalized graph representation, the resulting TEES system is capable of detecting binary relations as well as complex event structures. 
We show that this event extraction system performs well, achieving first place in the BioNLP'09 Shared Task on Event Extraction. Subsequently, TEES has achieved several first ranks in the BioNLP'11 and BioNLP'13 Shared Tasks, and has shown competitive performance in the binary relation Drug-Drug Interaction Extraction 2011 and 2013 shared tasks. The Turku Event Extraction System is published as a freely available open-source project, documenting the research in detail as well as making the method available for practical applications. In particular, in this thesis we describe the application of the event extraction method to PubMed-scale text mining, demonstrating that the developed approach not only performs well but is also generalizable and applicable to large-scale real-world text mining projects. Finally, we discuss related literature, summarize the contributions of the work and present some thoughts on future directions for biomedical event extraction. This thesis includes and builds on six original research publications. The first of these introduces the analysis of dependency parses that led to the development of TEES. The entries in the three BioNLP Shared Tasks, as well as in the DDIExtraction 2011 task, are covered in four publications, and the sixth one demonstrates the application of the system to PubMed-scale text mining.
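
The nested event structure mentioned in this abstract, CAUSE(A, BIND(B, C)), can be illustrated with a minimal sketch. This is not the TEES graph format or its API: the class names `Entity` and `Event`, the `render` helper, and the argument role labels "Cause" and "Theme" are all assumptions chosen for illustration. The sketch only shows the three properties the abstract names: a trigger word, directed and typed arguments, and the ability to nest events.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Entity:
    """A named entity such as a protein (hypothetical class)."""
    name: str


@dataclass
class Event:
    """An event with a trigger word and typed arguments (hypothetical class).

    Each argument is a (role, target) pair; the target may be an Entity
    or another Event, which is what allows nesting.
    """
    type: str
    trigger: str
    arguments: List[Tuple[str, Union[Entity, "Event"]]] = field(default_factory=list)


def render(node: Union[Entity, Event]) -> str:
    """Render an entity or event in the TYPE(arg, arg) notation."""
    if isinstance(node, Entity):
        return node.name
    inner = ", ".join(render(target) for _role, target in node.arguments)
    return f"{node.type}({inner})"


# "Protein A causes protein B to bind protein C"
# Role labels ("Cause", "Theme") are illustrative assumptions.
bind = Event("BIND", "bind", [("Theme", Entity("B")), ("Theme", Entity("C"))])
cause = Event("CAUSE", "causes", [("Cause", Entity("A")), ("Theme", bind)])

print(render(cause))
```

Because an argument target can itself be an `Event`, the same structure covers both simple pairwise relations (an event with two entity arguments) and nested events.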

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The crude latex of Crown-of-Thorns (Euphorbia milii var. hislopii) is a potent plant molluscicide and a promising alternative to the synthetic molluscicides used in schistosomiasis control. The present study was undertaken to investigate the embryofeto-toxic potential of E. milii latex. The study is part of a comprehensive safety evaluation of this plant molluscicide. Lyophilized latex (0, 125, 250 and 500 mg/kg body weight) in corn oil was given by gavage to Wistar rats (N = 100) from days 6 to 15 of pregnancy and cesarean sections were performed on day 21 of pregnancy. The numbers of implantation sites, living and dead fetuses, resorptions and corpora lutea were recorded. Fetuses were weighed, examined for external malformations, and fixed for visceral examination, or cleared and stained with Alizarin red S for skeleton evaluation. A reduction of body weight minus uterine weight at term indicated that E. milii latex was maternally toxic over the dose range tested. No latex-induced embryolethality was noted at the lowest dose (125 mg/kg) but the resorption rate was markedly increased at 250 mg/kg (62.5%) and 500 mg/kg (93.4%). A higher frequency of fetuses showing signs of delayed ossification (control: 17.4%; 125 mg/kg: 27.4% and 250 mg/kg: 62.8%; P<0.05 vs control) indicated that fetal growth was retarded at doses ≥ 125 mg latex/kg body weight. No increase in the proportion of fetuses with skeletal anomalies was observed at the lowest dose but the incidence of minor skeletal malformations was higher at 250 mg/kg body weight (control: 13.7%; 125 mg/kg: 14.8%; 250 mg/kg: 45.7%; P<0.05 vs control). Since a higher frequency of minor malformations was noted only at very high doses of latex which are embryolethal and maternally toxic, it is reasonable to conclude that this plant molluscicide poses no teratogenic hazard or, at least, that this possibility is of a considerably low order of magnitude.