980 results for Aligned Corpus
Abstract:
The article briefly reviews the bilingual Slovak-Bulgarian/Bulgarian-Slovak parallel and aligned corpus. The corpus was collected and developed as a result of collaboration within the framework of a joint research project between the Institute of Mathematics and Informatics, Bulgarian Academy of Sciences, and the Ľ. Štúr Institute of Linguistics, Slovak Academy of Sciences. Multilingual corpora are large repositories of language data that play an important role in preserving and supporting the world's cultural heritage, because natural language is an outstanding part of human cultural values and collective memory, and a bridge between cultures. This bilingual corpus will be widely applicable to contrastive studies of the two Slavic languages and will also be a useful resource for language engineering research and development, especially in machine translation.
Abstract:
This article briefly reviews multilingual language resources for Bulgarian developed within several international projects: the first-ever annotated Bulgarian MTE digital lexical resources, the Bulgarian-Polish corpus, the Bulgarian-Slovak parallel and aligned corpus, and the Bulgarian-Polish-Lithuanian corpus. These resources are a valuable multilingual dataset for language engineering research and development for the Bulgarian language. Multilingual corpora are large repositories of language data that play an important role in preserving and supporting the world's cultural heritage, because natural language is an outstanding part of human cultural values and collective memory, and a bridge between cultures.
Abstract:
The paper describes three software packages, the main components of a software system for processing and web presentation of Bulgarian language resources: parallel corpora and bilingual dictionaries. The author briefly presents the current versions of the core components “Dictionary” and “Corpus” as well as the recently developed component “Connection”, which links “Dictionary” and “Corpus”. The components' main functionalities are also described. Some examples of the use of the system’s web applications are included.
Abstract:
In this paper we present ClInt (Clinical Interview), a bilingual Spanish-Catalan spoken corpus that contains 15 hours of clinical interviews. It consists of audio files aligned with multiple-level transcriptions comprising orthographic, phonetic and morphological information, as well as linguistic and extralinguistic encoding. This is a previously non-existent resource for these languages and it offers a wide-ranging exploitation potential in a broad variety of disciplines such as Linguistics, Natural Language Processing and related fields.
Abstract:
Master's thesis digitized by the Division de la gestion de documents et des archives of the Université de Montréal
Abstract:
The work undertaken in this thesis concerns the analysis of terminological equivalence in parallel and comparable corpora. More specifically, we focus on corpora of specialized texts in the field of climate change. One original feature of this study is its analysis of the equivalents of single-word terms. The theoretical foundations on which we rely are textual terminology (Bourigault and Slodzian 1999) and the lexico-semantic approach (L’Homme 2005). The study pursues two objectives. The first is to carry out a comparative analysis of equivalence in the two types of corpora in order to determine whether the terminological equivalence observable in parallel corpora differs from that found in comparable corpora. The second is to compare in detail the equivalents associated with the same English term, in order to describe and inventory them and derive a typology. The detailed analysis of the French equivalents of 343 English terms is carried out using software tools (term extractor, text aligner, etc.) and a rigorous three-part methodology. The first part, common to both research objectives, covers building the corpora, validating the English terms, and identifying the French equivalents in both corpora. The second part describes the criteria we use to compare the equivalents in the two types of corpora. The third part establishes the typology of equivalents associated with the same English term.
The results for the first objective show that, of the 343 English terms analyzed, relatively few (12) have questionable equivalents in both corpora, whereas the number of terms showing similar equivalence across the corpora is very high (272 identical equivalents and 55 unobjectionable equivalents). The comparative analysis described in this chapter confirms our hypothesis that the terminology used in parallel corpora does not differ from that of comparable corpora. The results for the second objective show that many English terms are rendered by several equivalents (70% of the terms analyzed). We also find that the largest group of equivalents consists not of synonyms but of near-synonyms. In addition, equivalents belonging to a different part of speech make up a substantial share of the equivalents. The typology developed in this thesis thus presents mechanisms of terminological equivalence that earlier work has seldom described so systematically.
Abstract:
In this doctoral thesis a methodology was developed to apply the corpus-based approach to simultaneous interpreting research. DIRSI-C is a parallel (Italian-English/English-Italian) and aligned electronic corpus containing transcripts of recordings from international medical conferences mediated by professional simultaneous interpreters. Because the interpreters involved worked both from the foreign language into their mother tongue and vice versa, directionality is the research parameter used to analyze their performance by means of corpus linguistics tools.
Abstract:
Following the internationalization of contemporary higher education, academic institutions based in non-English-speaking countries are increasingly urged to produce content in English to address prospective international students and personnel, as well as to increase their attractiveness. The demand for English translations in the institutional academic domain is consequently increasing at a rate exceeding the capacity of the translation profession. Resources for assisting non-native authors and translators in the production of appropriate texts in L2 are therefore required in order to help academic institutions and professionals streamline their translation workload. Some of these resources include: (i) parallel corpora to train machine translation systems and multilingual authoring tools; and (ii) translation memories for computer-aided tools. The purpose of this study is to create and evaluate reference resources like the ones mentioned in (i) and (ii) through the automatic sentence alignment of a large set of Italian and English as a Lingua Franca (ELF) institutional academic texts given as equivalent but not necessarily parallel (i.e. translated). In this framework, a set of alignment algorithms and alignment tools is examined in order to identify the most profitable one(s) in terms of accuracy and time- and cost-effectiveness. To determine the text pairs to align, a sample is selected according to document length similarity (in characters) and subsequently evaluated in terms of extent of noisiness/parallelism, alignment accuracy and content leverageability. The results of these analyses serve as the basis for the creation of an aligned bilingual corpus of academic course descriptions, which is eventually used to create a translation memory in TMX format.
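The sample selection by document length similarity described above can be illustrated with a minimal sketch. It assumes that similarity is measured as the ratio of the shorter to the longer character count and that pairs at or above a cutoff are kept. The function names, the 0.8 threshold, and the sample texts are hypothetical and are not taken from the study.

```python
from typing import List, Tuple


def length_ratio(text_a: str, text_b: str) -> float:
    """Ratio of the shorter to the longer character count, in [0, 1]."""
    len_a, len_b = len(text_a), len(text_b)
    if max(len_a, len_b) == 0:
        return 1.0  # two empty documents are trivially the same length
    return min(len_a, len_b) / max(len_a, len_b)


def select_candidate_pairs(
    pairs: List[Tuple[str, str]], threshold: float = 0.8
) -> List[Tuple[str, str]]:
    """Keep only the (Italian, English) pairs whose lengths are similar.

    The threshold is an illustrative assumption, not a value from the study.
    """
    return [(it, en) for it, en in pairs if length_ratio(it, en) >= threshold]


# Hypothetical document pairs: the first is plausibly parallel,
# the second differs too much in length to be a translation.
pairs = [
    ("Il corso introduce la linguistica dei corpora.",
     "The course introduces corpus linguistics."),
    ("Programma.",
     "This page lists every module offered by the department this year."),
]
selected = select_candidate_pairs(pairs)
print(len(selected), "of", len(pairs), "pairs kept")
```

A length filter of this kind only narrows the candidate set; the noisiness, alignment accuracy and leverageability checks mentioned in the abstract would still have to be run on the pairs that survive it.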
Abstract:
The paper presents our considerations related to the creation of a digital corpus of Bulgarian dialects. The dialectological archive of the Bulgarian language consists of more than 250 audio tapes. All tapes were recorded between 1955 and 1965 in the course of regular dialectological expeditions throughout the country. The recordings typically contain interviews with inhabitants of small villages in Bulgaria. The topics covered are usually related to such issues as birth, everyday life, marriage, family relationships, death, etc. Only a few tapes contain folk songs from different regions of the country. Taking into account the progressive deterioration of the magnetic media and the realistic prospect of data loss, the Institute for Bulgarian Language at the Academy of Sciences launched a project in 1997 aimed at the restoration and digital preservation of the dialectological archive. Within the framework of this project, more than half of the recordings were digitized, de-noised and stored on digital recording media. Since then, restoration and digitization activities have been carried out in the Institute on a regular basis. As a result, a large collection of sound files has been gathered. Our further efforts are aimed at the creation of a digital corpus of Bulgarian dialects, which will be made available for phonological and linguistic research. Besides the sound files, such corpora typically include two basic elements: a transcription aligned with the sound file, and a set of standardized metadata that defines the corpus. In our work we present considerations on how these tasks could be realized in the case of the corpus of Bulgarian dialects. Our suggestions are based on a comparative analysis of existing methods and techniques for building such corpora and on selecting the ones that best fit the particular needs. Our experience can be used in similar institutions storing folklore archives, history-related spoken records, etc.
Abstract:
We analyzed GFP cells cultivated for 24 h on superhydrophilic vertically aligned carbon nanotube (VACNT) scaffolds. We produced VACNT scaffolds of two different densities on Ti using Ni or Fe catalysts. A simple and fast oxygen plasma treatment made them superhydrophilic. We used five different substrates: as-grown VACNT produced using Ni as catalyst (Ni), as-grown VACNT produced using Fe as catalyst (Fe), VACNT-O produced using Ni as catalyst (NiO), VACNT-O produced using Fe as catalyst (FeO), and Ti (control). The nuclei of the adherent cells cultivated on the five scaffolds were stained with the 4',6-diamidino-2-phenylindole reagent. We used fluorescence microscopy for image acquisition, ImageJ® to count adhered cells, and GraphPad Prism 5® for statistical analysis. Cellular adhesion increased in the order Fe, Ni, NiO, FeO, and Ti. Oxygen treatment combined with high VACNT density (group FeO) gave significantly superior cell adhesion up to 24 h. However, this group did not show significant differences compared with the Ti substrates (control). We demonstrated that all the analyzed substrates were nontoxic. We also propose that density and hydrophilicity influenced the cell adhesion behavior.
Abstract:
The aim was to characterize the relaxation induced by the soluble guanylate cyclase (sGC) activator BAY 60-2770 in rabbit corpus cavernosum. Penises from male New Zealand rabbits were removed and four strips of corpus cavernosum (CC) were obtained. Concentration-response curves to BAY 60-2770 were carried out in the absence and presence of inhibitors of nitric oxide synthase, L-NAME (100 μM), sGC, ODQ (10 μM), and phosphodiesterase type 5, tadalafil (0.1 μM). The potency (pEC50) and maximal response (Emax) values were determined. Second, electrical-field stimulation (EFS)-induced contraction or relaxation was performed in the absence and presence of BAY 60-2770 (0.1 or 1 μM) alone or in combination with ODQ (10 μM). In the case of EFS-induced relaxation, two protocols were used: 1) ODQ (10 μM) was first incubated for 20 min and then BAY 60-2770 (1 μM) was added for another 20 min (ODQ + BAY 60-2770); 2) in different CC strips, BAY 60-2770 was incubated for 20 min followed by another 20 min with ODQ (BAY 60-2770 + ODQ). The intracellular levels of cyclic guanosine monophosphate (cGMP) were also determined. BAY 60-2770 potently relaxed rabbit CC, with pEC50 and Emax values of 7.58 ± 0.19 and 81 ± 4%, respectively. The inhibitors ODQ (n=7) and tadalafil (n=7) produced 4.2- and 6.3-fold leftward shifts, respectively, in BAY 60-2770-induced relaxation without interfering with the Emax values. The intracellular levels of cGMP were augmented after stimulation with BAY 60-2770 (1 μM) alone, whereas its co-incubation with ODQ produced even higher levels of cGMP. The EFS-induced contraction was reduced in the presence of BAY 60-2770 (1 μM), and this inhibition was even greater when BAY 60-2770 was co-incubated with ODQ. The nitrergic stimulation induced CC relaxation, which was abolished in the presence of ODQ. BAY 60-2770 alone increased the amplitude of relaxation. Co-incubation of ODQ and BAY 60-2770 did not alter the relaxation in comparison with ODQ alone.
Interestingly, when BAY 60-2770 was incubated prior to ODQ, EFS-induced relaxation was partly restored in comparison with ODQ alone or ODQ + BAY 60-2770. Considering that the relaxation induced by the sGC activator BAY 60-2770 was increased after sGC oxidation and unaltered in the absence of nitric oxide, this class of substances is advantageous over sGC stimulators or PDE5 inhibitors for treating patients with erectile dysfunction and high endothelial damage. This article is protected by copyright. All rights reserved.
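As a reading aid for the potency value reported above, the following minimal sketch converts a pEC50 into a concentration. It assumes the usual convention that pEC50 is the negative base-10 logarithm of the EC50 expressed in mol/L; the abstract does not state the convention, and the function name is hypothetical.

```python
def pec50_to_ec50_nm(pec50: float) -> float:
    """Convert a pEC50 to an EC50 in nmol/L.

    Assumes pEC50 = -log10(EC50 in mol/L), the usual pharmacological
    convention (an assumption, not stated in the abstract).
    """
    ec50_molar = 10.0 ** (-pec50)
    return ec50_molar * 1e9  # mol/L -> nmol/L


# Central pEC50 value reported for BAY 60-2770 in rabbit corpus cavernosum.
ec50_nm = pec50_to_ec50_nm(7.58)
print(f"EC50 = {ec50_nm:.1f} nM")
```

Under that convention, a pEC50 of 7.58 corresponds to an EC50 in the tens of nanomolar, which is what "potently relaxed" refers to.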
Abstract:
The corpus luteum is a temporary endocrine gland that regulates both the estrous cycle and pregnancy. It is extremely dependent on an adequate blood supply. This work aims to evaluate goat corpus luteum (CL) vascular density (VD) over the estrous cycle. For that purpose, 20 females were submitted to an estrus synchronization/ovulation treatment using a medroxyprogesterone intravaginal sponge as well as intramuscular (IM) application of cloprostenol and equine chorionic gonadotropin (eCG). After sponge removal, estrus was identified at about 72 h. Once treatment was over, the female goats were subdivided into 4 groups (n=5 each) and slaughtered on days 2, 12, 16 and 22 after ovulation (p.o). Ovaries were collected and weighed. The size and area of the CL and ovaries were recorded. Blood samples were collected and plasma progesterone (P4) was measured with commercial RIA kits. The VD was 24.42±6.66, 36.26±5.61, 8.59±2.2 and 3.97±1.12 vessels/mm² for days 2, 12, 16 and 22 p.o, respectively. Progesterone plasma concentrations were 0.49±0.08, 2.63±0.66, 0.61±0.14 and 0.22±0.04 ng/ml for days 2, 12, 16 and 22 p.o, respectively. The studied parameters were affected by the estrous cycle phase, with the highest values observed on day 12 p.o. In the present work we observed that ovulation occurred predominantly in the right ovary (70% of the animals), which in turn presented larger measurements than the contralateral one. There is a meaningful relationship between the weight and size of the ovary and those of the CL (r=0.87 and r=0.70, respectively, p<0.05). It is possible to conclude that the morphology of goat ovaries and the plasma progesterone concentration changed according to the stage of the estrous cycle. We propose that these parameters can be used as indicators of CL functional activity.
Abstract:
We report the detection of CoRoT-18b, a massive hot Jupiter transiting in front of its host star with a period of 1.9000693 ± 0.0000028 days. This planet was discovered thanks to photometric data secured with the CoRoT satellite combined with spectroscopic and photometric ground-based follow-up observations. The planet has a mass M_p = 3.47 ± 0.38 M_Jup, a radius R_p = 1.31 ± 0.18 R_Jup, and a density ρ_p = 2.2 ± 0.8 g cm^-3. It orbits a G9V star with a mass M_* = 0.95 ± 0.15 M_☉, a radius R_* = 1.00 ± 0.13 R_☉, and a rotation period P_rot = 5.4 ± 0.4 days. The age of the system remains uncertain, with stellar evolution models pointing either to a few tens of Ma or to several Ga, while gyrochronology and lithium abundance point towards ages of a few hundred Ma. This mismatch potentially points to a problem in our understanding of the evolution of young stars, with possibly significant implications for stellar physics and the interpretation of inferred sizes of exoplanets around young stars. We detected the Rossiter-McLaughlin anomaly in the CoRoT-18 system thanks to the spectroscopic observation of a transit. We measured the obliquity ψ = 20° ± 20° (sky-projected value λ = -10° ± 20°), indicating that the planet orbits in the same direction as the star rotates and that this prograde orbit is nearly aligned with the stellar equator.
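The quoted planetary density can be cross-checked against the quoted mass and radius with a short calculation. This is a minimal sketch that treats the planet as a sphere and uses nominal Jupiter constants (mass 1.898e30 g, equatorial radius 7.1492e9 cm); those constants and the function name are assumptions, since the abstract does not say which values the authors adopted.

```python
import math

# Assumed nominal Jupiter values; the abstract does not give the ones used.
M_JUP_G = 1.898e30    # Jupiter mass in grams
R_JUP_CM = 7.1492e9   # Jupiter equatorial radius in centimetres


def bulk_density(mass_mjup: float, radius_rjup: float) -> float:
    """Mean density in g/cm^3 of a sphere given in Jupiter units."""
    mass_g = mass_mjup * M_JUP_G
    volume_cm3 = (4.0 / 3.0) * math.pi * (radius_rjup * R_JUP_CM) ** 3
    return mass_g / volume_cm3


# Central values quoted for CoRoT-18b.
rho = bulk_density(3.47, 1.31)
print(f"rho_p = {rho:.2f} g/cm^3")
```

With these assumed constants the central values give a density a little below 2 g cm^-3, inside the quoted 2.2 ± 0.8 g cm^-3. The offset from the quoted central value may come from different adopted constants or from how the uncertainties were propagated, neither of which the abstract specifies.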
Abstract:
The present study aimed to investigate the presence of corpus callosum (CC) volume deficits in a population-based recent-onset psychosis (ROP) sample, and whether CC volume relates to interhemispheric communication deficits. For this purpose, we used voxel-based morphometry comparisons of magnetic resonance imaging data between ROP (n = 122) and healthy control (n = 94) subjects. Subgroups (38 ROP and 39 controls) were investigated for correlations between CC volumes and performance on the Crossed Finger Localization Test (CFLT). Significant CC volume reductions in ROP subjects versus controls emerged after excluding substance misuse and non-right-handedness. CC reductions retained significance in the schizophrenia subgroup but not in affective psychoses subjects. There were significant positive correlations between CC volumes and CFLT scores in ROP subjects, specifically in subtasks involving interhemispheric communication. From these results, we can conclude that CC volume reductions are present in association with ROP. The relationship between such deficits and CFLT performance suggests that interhemispheric communication impairments are directly linked to CC abnormalities in ROP. (C) 2010 Elsevier Ireland Ltd. All rights reserved.
Abstract:
Antiphospholipid syndrome (APS) is a coagulation disorder that causes thrombosis as well as pregnancy-related complications and results from the autoimmune production of antibodies against phospholipids. Full anticoagulation is the cornerstone of therapy in patients with a history of thrombosis, and it can lead to major bleeding. During a 3-year period, 300 primary and secondary APS patients were followed up at the Rheumatology Division of the authors' University Hospital. Of them, 255 (85%) were women and 180 (60%) were of reproductive age. Three of them (1%) had a severe hemorrhagic corpus luteum while receiving long-term anticoagulation treatment and are described in this report. All of them were taking warfarin, had an elevated international normalized ratio (> 4.0), and required prompt blood transfusion and emergency surgery. Therefore, we strongly recommend that all women with APS under anticoagulation have ovulation suppressed with either intramuscular depot-medroxyprogesterone acetate or oral desogestrel. Lupus (2011) 20, 523-526.