992 results for Language Modeling
From technicians to classics: on the rationalization of the Russian language in the USSR (1917-1953)
Abstract:
In this paper we present the theoretical and methodological foundations for the development of a multi-agent Selective Dissemination of Information (SDI) service model that applies Semantic Web technologies to specialized digital libraries. These technologies make it possible to manage information more efficiently, improve agent–user communication processes, and facilitate accurate access to relevant resources. Other tools used are fuzzy linguistic modelling techniques (which ease the interaction between users and the system) and natural language processing (NLP) techniques for semiautomatic thesaurus generation. RSS feeds are also used as “current awareness bulletins” to generate personalized bibliographic alerts.
Abstract:
Background: Germline genetic variation is associated with the differential expression of many human genes. The phenotypic effects of this type of variation may be important when considering susceptibility to common genetic diseases. Three regions at 8q24 have recently been identified to independently confer risk of prostate cancer. Variation at 8q24 has also recently been associated with risk of breast and colorectal cancer. However, none of the risk variants map at or relatively close to known genes, with c-MYC mapping a few hundred kilobases distally. Results: This study identifies cis-regulators of germline c-MYC expression in immortalized lymphocytes of HapMap individuals. Quantitative analysis of c-MYC expression in normal prostate tissues suggests an association between overexpression and variants in Region 1 of prostate cancer risk. Somatic c-MYC overexpression correlates with prostate cancer progression and more aggressive tumor forms, which was also a pathological variable associated with Region 1. Expression profiling analysis and modeling of transcriptional regulatory networks predict a functional association between MYC and the prostate tumor suppressor KLF6. Analysis of MYC/Myc-driven cell transformation and tumorigenesis substantiates a model in which MYC overexpression promotes transformation by down-regulating KLF6. In this model, a feedback loop through E-cadherin down-regulation causes further transactivation of c-MYC. Conclusion: This study proposes that variation at putative 8q24 cis-regulator(s) of transcription can significantly alter germline c-MYC expression levels and, thus, contribute to prostate cancer susceptibility by down-regulating the prostate tumor suppressor KLF6 gene.
Abstract:
The objective of PANACEA is to bring together different advanced tools to build a factory of Language Resources (LRs), a production line that automates the steps involved in the acquisition, production, updating and maintenance of the LRs that Machine Translation and other language technologies need.
Abstract:
Building a personalized model that describes the drug concentration inside the human body for each patient is highly important for clinical practice and places high demands on modeling tools. Instead of using traditional explicit methods, in this paper we propose a machine learning approach to describe the relation between the drug concentration and patients' features. Machine learning has been widely applied to analyze data in various domains, but it is still new to personalized medicine, especially dose individualization. We focus mainly on the prediction of drug concentrations and on the analysis of the influence of different features. Models are built with Support Vector Machines, and the prediction results are compared with those of traditional analytical models.
Abstract:
The objective of PANACEA is to build a factory of LRs that automates the stages involved in the acquisition, production, updating and maintenance of LRs required by MT systems and by other applications based on language technologies, and that simplifies potential issues regarding intellectual property rights. This automation will significantly cut down cost, time and human effort. These reductions in cost and time are the only way to guarantee the continuous supply of LRs that MT and other language technologies will demand in a multilingual Europe.
Abstract:
Language Resources are a critical component of Natural Language Processing applications. Over the years, many resources have been created manually for the same task, but with different granularity and coverage. To create richer resources for a broad range of potential reuses, information from all resources has to be joined into one. The high cost of comparing and merging different resources by hand has been a bottleneck for merging existing resources. With the objective of reducing human intervention, we present a new method for automating the merging of resources. We have addressed the merging of two verb subcategorization frame (SCF) lexica for Spanish. The results achieved, a new lexicon with enriched information and with conflicting information signalled, reinforce our idea that this approach can be applied to other NLP tasks.
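The abstract above does not describe the merging procedure itself. As a rough illustration only of what "enriched information with conflicts signalled" can mean, the following Python sketch merges two toy lexica. The data layout (verb mapped to a feature dictionary), the function name `merge_lexica`, and the sample entries are all hypothetical and are not taken from the paper.

```python
def merge_lexica(lex_a, lex_b):
    """Merge two toy lexica of the form {verb: {feature: value}}.

    Features present in only one lexicon are copied over (enrichment).
    Features present in both with different values are kept as a pair
    and listed under "conflicts" so a human can review them.
    """
    merged = {}
    for verb in sorted(set(lex_a) | set(lex_b)):
        entry_a = lex_a.get(verb, {})
        entry_b = lex_b.get(verb, {})
        features = {}
        conflicts = []
        for feat in sorted(set(entry_a) | set(entry_b)):
            in_both = feat in entry_a and feat in entry_b
            if in_both and entry_a[feat] != entry_b[feat]:
                # Keep both values and signal the disagreement.
                features[feat] = (entry_a[feat], entry_b[feat])
                conflicts.append(feat)
            elif feat in entry_a:
                features[feat] = entry_a[feat]
            else:
                features[feat] = entry_b[feat]
        merged[verb] = {"features": features, "conflicts": conflicts}
    return merged


# Hypothetical sample entries, for illustration only.
lexicon_a = {
    "dar": {"subj": "np", "dobj": "np"},
    "hablar": {"comp": "pp_de"},
}
lexicon_b = {
    "dar": {"dobj": "np", "iobj": "pp_a"},
    "hablar": {"comp": "pp_sobre"},
    "ir": {"subj": "np"},
}
merged = merge_lexica(lexicon_a, lexicon_b)
```

In this toy run, `dar` gains the `iobj` feature from the second lexicon, while `hablar` is flagged because the two sources disagree on `comp`.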
Abstract:
This paper presents the platform developed in the PANACEA project, a distributed factory that automates the stages involved in the acquisition, production, updating and maintenance of Language Resources required by Machine Translation and other Language Technologies. We adopt a set of tools that have been successfully used in the Bioinformatics field; they are adapted to the needs of our field and used to deploy web services, which can be combined to build more complex processing chains (workflows). This paper describes the platform and its different components (web services, registry, workflows, social network and interoperability). We demonstrate the scalability of the platform by carrying out a set of massive data experiments. Finally, a validation of the platform against a set of required criteria shows its usability for different types of users (non-technical users and providers).
Abstract:
The paper presents a competence-based instructional design system and a way to personalize navigation through the course content. The navigation aid tool builds on the competence graph and the student model, which includes the elements of uncertainty in the assessment of students. An individualized navigation graph is constructed for each student, suggesting the competences the student is best prepared to study. We use fuzzy set theory to deal with uncertainty. The marks of the assessment tests are transformed into linguistic terms and used for assigning values to linguistic variables. For each competence, the level of difficulty and the level of knowledge of its prerequisites are calculated from the assessment marks. Using these linguistic variables and approximate reasoning (fuzzy IF-THEN rules), a crisp category is assigned to each competence regarding its level of recommendation.
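The pipeline described in the abstract above (marks converted to linguistic terms, fuzzy IF-THEN rules, one crisp category per competence) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's system: the 0 to 10 mark scale, the triangular membership functions, the rule table, the min/max inference, and the category names are all hypothetical.

```python
def tri(x, a, b, c):
    """Triangular membership with peak at b; a == b or b == c gives a shoulder."""
    if x <= a:
        return 1.0 if a == b else 0.0
    if x >= c:
        return 1.0 if b == c else 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(score):
    """Map a score on an assumed 0-10 scale to degrees of three linguistic terms."""
    return {
        "low": tri(score, 0.0, 0.0, 5.0),
        "medium": tri(score, 2.5, 5.0, 7.5),
        "high": tri(score, 5.0, 10.0, 10.0),
    }


# Hypothetical rule table:
# IF prerequisite knowledge is P AND difficulty is D THEN category is C.
RULES = {
    ("high", "low"): "recommended",
    ("high", "medium"): "recommended",
    ("high", "high"): "possible",
    ("medium", "low"): "recommended",
    ("medium", "medium"): "possible",
    ("medium", "high"): "not recommended",
    ("low", "low"): "possible",
    ("low", "medium"): "not recommended",
    ("low", "high"): "not recommended",
}


def recommend(prereq_score, difficulty_score):
    """Return the crisp category whose rules fire most strongly."""
    prereq = fuzzify(prereq_score)
    difficulty = fuzzify(difficulty_score)
    strength = {}
    for (p_term, d_term), category in RULES.items():
        fired = min(prereq[p_term], difficulty[d_term])  # AND as minimum
        strength[category] = max(strength.get(category, 0.0), fired)  # OR as maximum
    return max(strength, key=strength.get)
```

For example, `recommend(9, 2)` (strong prerequisites, easy competence) yields `"recommended"` under these assumed rules, while `recommend(1, 9)` yields `"not recommended"`.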
Abstract:
Collaborative activities, in which students actively interact with each other, have proved to provide significant learning benefits. In Computer-Supported Collaborative Learning (CSCL), these collaborative activities are assisted by technologies. However, the use of computers does not guarantee collaboration, as free collaboration does not necessarily lead to fruitful learning. Therefore, practitioners need to design CSCL scripts that structure the collaborative settings so that they promote learning. However, not all teachers have the technical and pedagogical background needed to design such scripts. With the aim of assisting teachers in designing effective CSCL scripts, we propose a model to support the selection of reusable good practices (formulated as patterns) so that they can be used as a starting point for their own designs. This model is based on a pattern ontology that computationally represents the knowledge captured in a pattern language for the design of CSCL scripts. A preliminary evaluation of the proposed approach is provided with two examples based on a set of meaningful interrelated patterns computationally represented with the pattern ontology, and a paper prototyping experience carried out with two teachers. The results offer interesting insights towards the implementation of the pattern ontology in software tools.
Abstract:
BACKGROUND: Risks of significant infant drug exposure through breastmilk are poorly defined for many drugs, and large-scale population data are lacking. We used population pharmacokinetics (PK) modeling to predict fluoxetine exposure levels of infants via mother's milk in a simulated population of 1000 mother-infant pairs. METHODS: Using our original data on fluoxetine PK of 25 breastfeeding women, a population PK model was developed with NONMEM and parameters, including milk concentrations, were estimated. An exponential distribution model was used to account for individual variation. Simulation with random and distribution-constrained assignment of doses, dosing time, feeding intervals and milk volume was conducted to generate 1000 mother-infant pairs with characteristics such as the steady-state serum concentrations (Css) and infant dose relative to the maternal weight-adjusted dose (relative infant dose: RID). Full bioavailability and a conservative point estimate of 1-month-old infant CYP2D6 activity to be 20% of the adult value (adjusted by weight) according to a recent study, were assumed for infant Css calculations. RESULTS: A linear 2-compartment model was selected as the best model. Derived parameters, including milk-to-plasma ratios (mean: 0.66; SD: 0.34; range, 0 - 1.1) were consistent with the values reported in the literature. The estimated RID was below 10% in >95% of infants. The model predicted median infant-mother Css ratio was 0.096 (range 0.035 - 0.25); literature reported mean was 0.07 (range 0 - 0.59). Moreover, the predicted incidence of infant-mother Css ratio of >0.2 was less than 1%. CONCLUSION: Our in silico model prediction is consistent with clinical observations, suggesting that substantial systemic fluoxetine exposure in infants through human milk is rare, but further analysis should include active metabolites. Our approach may be valid for other drugs. [supported by CIHR and Swiss National Science Foundation (SNSF)]
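The relative infant dose (RID) mentioned in the abstract above is the infant's weight-adjusted dose received through milk, expressed as a percentage of the mother's weight-adjusted dose. The Python sketch below shows that arithmetic and a toy random assignment of inputs. It is not the study's NONMEM population PK model: the function names, the sampling ranges, and the use of uniform distributions are hypothetical placeholders chosen only to make the example run.

```python
import random


def relative_infant_dose(milk_conc_mg_per_l, milk_intake_l_per_kg_day,
                         maternal_dose_mg_per_day, maternal_weight_kg):
    """Return the RID in percent.

    Infant dose (mg/kg/day)   = milk concentration * daily milk intake per kg.
    Maternal dose (mg/kg/day) = daily dose / maternal body weight.
    """
    infant_dose = milk_conc_mg_per_l * milk_intake_l_per_kg_day
    maternal_dose = maternal_dose_mg_per_day / maternal_weight_kg
    return 100.0 * infant_dose / maternal_dose


def simulate_rid(n_pairs, seed=0):
    """Draw n_pairs hypothetical mother-infant pairs and return their RIDs."""
    rng = random.Random(seed)  # seeded so repeated runs give the same draws
    rids = []
    for _ in range(n_pairs):
        milk_conc = rng.uniform(0.02, 0.30)   # mg/L, hypothetical range
        intake = rng.uniform(0.10, 0.20)      # L/kg/day, hypothetical range
        dose = rng.choice([20.0, 40.0, 60.0])  # mg/day, hypothetical doses
        weight = rng.uniform(50.0, 90.0)      # kg, hypothetical range
        rids.append(relative_infant_dose(milk_conc, intake, dose, weight))
    return rids


rids = simulate_rid(1000)
share_below_10 = sum(r < 10.0 for r in rids) / len(rids)
```

Here `share_below_10` is the fraction of simulated pairs with an RID under 10%. With these placeholder ranges it says nothing about fluoxetine; a real analysis would draw the inputs from a fitted population PK model.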
Abstract:
Nuclear receptors are a major component of signal transduction in animals. They mediate the regulatory activities of many hormones, nutrients and metabolites on the homeostasis and physiology of cells and tissues. It is of high interest to model the corresponding regulatory networks. While molecular and cell biology studies of individual promoters have provided important mechanistic insight, a more complex picture is emerging from genome-wide studies. The regulatory circuitry of nuclear receptor regulated gene expression networks, and their response to cellular signaling, appear highly dynamic, and involve long as well as short range chromatin interactions. We review how progress in understanding the kinetics and regulation of cofactor recruitment, and the development of new genomic methods, provide opportunities but also a major challenge for modeling nuclear receptor mediated regulatory networks.
Abstract:
The present study examines the development of interculturality and changes in beliefs by analyzing 106 compositions produced by 53 advanced-level university students of translation studies at a university in Spain before and shortly after a stay-abroad (SA) period. The study draws on data collected at two different times: before (T1) and after the SA (T3). In addition, we compared the results with the writings produced by a control group of 10 native English speakers who were also on SA. Data were collected by means of a composition designed to elicit the learners’ opinions about maintaining cultural habits. The results reveal significant changes between T1 and T3 in attitudes and intercultural acquisition, both of which improved.
Abstract:
In lateralized Lexical Decision Tasks (LDT), accuracy is commonly higher and reaction times are commonly faster for right visual field (RVF) than left visual field (LVF) presentations. These visual field differences are thought to demonstrate the left hemisphere's dominance for language. Unfortunately, different tasks and words are used across studies and languages, making direct comparisons difficult. For example, high frequency words show a performance advantage over low frequency words. Moreover, demographic variables such as language knowledge (one versus several languages, early versus late acquisition) affect lateralized behavior. We here aim to alleviate some of these obstacles by presenting results from a lateralized LDT for which we selected words between 4 and 6 letters used in five different languages, i.e. English, French, German, Dutch and Italian. In this first study using these words, we compared the performance of right- and left-handed students who were either early or late bilinguals (second language acquired before or after the age of 6 years) from a French-speaking university in Switzerland. Results showed a left hemispheric advantage (accuracy, reaction times) for all groups, with a trend for early as compared to late bilinguals to be less accurate and to take longer in lexical decisions. These results show that the current words yield solid visual field differences, and do so irrespective of how many languages are spoken. While early bilinguals might experience a slight performance disadvantage, this did not affect visual field differences.
Abstract:
Under the Dynamic Model of Multilingualism, multilinguals are especially vulnerable to language attrition. The aim of the present study was to verify whether this is the case and to observe whether the different linguistic skills (receptive vs. productive) and the different linguistic levels (syntactic, lexical, morphological, etc.) are affected equally. Data were gathered longitudinally by means of a language test of the subject’s reading, writing, listening and speaking skills as well as her knowledge of grammar and vocabulary. Although overall accuracy remained intact and no evidence of attrition in the receptive skills was found, the productive skills, mainly fluency, were shown to have suffered from language attrition. This was demonstrated, among other things, by an increase in the number of pauses, hesitations, repetitions and self-corrections, and by a decrease in the percentage of error-free clauses and in clause length, in oral and written fluency respectively.