997 results for Computational Lexical Semantics


Relevance: 20.00%

Abstract:

The purpose of this article is to address a currently much-debated issue: the effects of age on second language learning. To do so, we contrast data collected by our research team from over one thousand seven hundred young and adult learners with four popular beliefs or generalizations which, while deeply rooted in this society, are not always corroborated by our data. Two of these generalizations about Second Language Acquisition (languages spoken in the social context) seem to be widely accepted: (a) older children, adolescents and adults are quicker and more efficient in the first stages of learning than younger learners; (b) in a natural context, children with an early start are more likely to attain higher levels of proficiency. However, in the context of Foreign Language Acquisition, in which we collected our data, this second generalization is difficult to verify owing to the low number of instructional hours (a maximum of some 800) and the low levels of language exposure provided. The design of our research project has allowed us to study differences associated with the age of onset (ranging from 2 to 18+), but in this article we focus on students who began English instruction at the age of 8 (LOGSE Educational System) and those who began at the age of 11 (EGB). We collected data from both groups after 200 instructional hours (Time 1) and 416 instructional hours (Time 2), and we are currently collecting data after 726 instructional hours (Time 3). We designed and administered a variety of tests: tests of English production and reception, both oral and written, within both academic and communicatively oriented approaches; tests of the learners' L1 (Spanish and Catalan); and a questionnaire eliciting personal and sociolinguistic information. The questions we address and the relevant empirical evidence are as follows:

1. "For young children, learning languages is a game. They enjoy it more than adults." Our data show that the situation is not quite so simple. Firstly, at both Primary and Secondary levels, most students have a positive attitude towards learning English (from 70.5% of 11-year-olds to 89% of 14-year-olds). Secondly, the two groups differ in the factors they cite as responsible for their motivation to learn English: the younger students cite intrinsic factors, such as the games they play, the methodology used and the teacher, whereas the older students cite extrinsic factors, such as the role of their knowledge of English in achieving their future professional goals.

2. "Young children have more resources for learning languages." Here our data suggest just the opposite. The ability to employ learning strategies (actions or steps used) increases with age. Older learners' strategies are more varied and cognitively more complex; younger learners, in contrast, depend more on their interlocutor and on external resources, and therefore show a lower level of autonomy in their learning.

3. "Young children don't talk much but understand a lot." This third generalization does seem to be confirmed, at least to a certain extent, by our analysis of age-related differences in productive use of the target language. As noted above, the comparatively slower progress of the younger learners is confirmed, and our analysis of interpersonal receptive abilities likewise demonstrates the advantage of the older learners. Nevertheless, with respect to passive receptive activities (for example, simple recognition of words or sentences), no great differences are observed. Statistical analyses suggest that in this test, in contrast to the others analyzed, the dominance of the subjects' L1s (reflecting a cognitive capacity that grows with age) has no significant influence on the learning process.

4. "The sooner they begin, the better their results in written language." This generalization is not completely confirmed by our research either. First, we observe that certain compensatory strategies disappear with age, but not with the number of instructional hours. Secondly, given an identical number of instructional hours, the older subjects obtain better results. Comparing subjects of the same age (12 years old) but with a different number of instructional hours (200 and 416 respectively, having begun at the ages of 11 and 8), we observe that those who began earlier excel only in lexical fluency.

In conclusion, the faster progress of the older learners appears to be due to their higher level of cognitive development, a factor which allows them to benefit more from formal or explicit instruction in the school context. Younger learners, by contrast, do not receive the quantity and quality of linguistic exposure typical of a natural acquisition context, which would allow them to make use of their implicit learning abilities. It seems clear, then, that the initiative in this country to begin foreign language instruction earlier will have positive effects only if it is combined either with higher levels of exposure to the foreign language or with its use as the language of instruction in other areas of the curriculum.

Relevance: 20.00%

Abstract:

Within the ENCODE Consortium, GENCODE aimed to accurately annotate all protein-coding genes, pseudogenes, and noncoding transcribed loci in the human genome through manual curation and computational methods. Annotated transcript structures were assessed, and less well-supported loci were systematically and experimentally validated. Predicted exon-exon junctions were evaluated by RT-PCR amplification followed by a highly multiplexed sequencing readout, a method we called RT-PCR-seq. Seventy-nine percent of all assessed junctions are confirmed by this evaluation procedure, demonstrating the high quality of the GENCODE gene set. RT-PCR-seq also proved efficient for screening gene models predicted using the Human Body Map (HBM) RNA-seq data. We validated 73% of these predictions, thus confirming 1168 novel genes, mostly noncoding, which will further complement the GENCODE annotation. Our novel experimental validation pipeline is extremely sensitive, far more so than unbiased transcriptome profiling through RNA sequencing, which is becoming the norm. For example, exon-exon junctions unique to GENCODE-annotated transcripts are five times more likely to be corroborated with our targeted approach than with extensive human transcriptome profiling. Data sets such as the HBM and ENCODE RNA-seq data fail to sample transcripts expressed at low levels. Our targeted RT-PCR-seq approach also has the advantage of identifying novel exons of known genes, as we discovered unannotated exons in ~11% of assessed introns. We thus estimate that at least 18% of known loci have yet-unannotated exons. Our work demonstrates that cataloging all of the genic elements encoded in the human genome will necessitate a coordinated effort between unbiased and targeted approaches, such as RNA-seq and RT-PCR-seq.
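As a hedged illustration of the bookkeeping behind such validation rates, the toy function below counts a junction as confirmed when enough RT-PCR-seq reads span it; the read threshold and input layout are invented for illustration and are not the GENCODE pipeline's actual criteria.

```python
def validation_rate(junction_reads, min_spanning_reads=2):
    """Fraction of assessed junctions supported by >= min_spanning_reads reads."""
    confirmed = sum(1 for n in junction_reads.values() if n >= min_spanning_reads)
    return confirmed / len(junction_reads)

# Toy input: junction identifier -> number of RT-PCR-seq reads spanning it.
reads = {"chr1:1000-2000": 57, "chr1:2100-2500": 0, "chr2:500-900": 12}
print(f"{validation_rate(reads):.0%} of junctions confirmed")   # 67%
```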

Relevance: 20.00%

Abstract:

Lipases have received great attention as industrial biocatalysts in areas such as oil and fat processing, detergents, baking, cheese making, surface cleaning and fine chemistry. They can catalyse reactions of insoluble substrates at the lipid-water interface while preserving their catalytic activity in organic solvents. This makes lipases powerful tools for catalysing not only hydrolysis but also various reverse reactions, such as esterification, transesterification, aminolysis and thiotransesterification, in anhydrous organic solvents. Moreover, lipases catalyse reactions with high specificity, regio- and enantioselectivity, and have become the most widely used enzymes in synthetic organic chemistry. They therefore display important advantages over classical catalysts: fewer side products, lower waste-treatment costs, and mild temperature and pressure conditions. Accordingly, the use of lipases holds great promise for green and economical process chemistry.

Relevance: 20.00%

Abstract:

The M-Coffee server is a web server that makes it possible to compute multiple sequence alignments (MSAs) by running several MSA methods and combining their output into one single model. This allows users to run all their methods of choice simultaneously without having to arbitrarily choose one of them. The MSA is delivered along with a local estimation of its consistency with the individual MSAs it was derived from. The computation of the consensus multiple alignment is carried out using a special mode of the T-Coffee package [Notredame, Higgins and Heringa (T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 2000; 302: 205-217); Wallace, O'Sullivan, Higgins and Notredame (M-Coffee: combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Res. 2006; 34: 1692-1699)]. Given a set of sequences (DNA or protein) in FASTA format, M-Coffee delivers a multiple alignment in the most common formats. M-Coffee is a free, open-source package distributed under a GPL license, and it is available both as a standalone package and as a web service from www.tcoffee.org.
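For the standalone package, the consensus alignment is requested through the T-Coffee mcoffee mode. Below is a brief sketch of driving it from Python, assuming a local t_coffee installation on PATH; the input file name is illustrative.

```python
import subprocess

# Run the bundled third-party aligners and combine their outputs into a
# consensus MSA; requested output formats are comma-separated.
subprocess.run(
    ["t_coffee", "sequences.fasta", "-mode", "mcoffee",
     "-output", "clustalw_aln,fasta_aln"],
    check=True,
)
```

Per the package documentation, the local consistency estimate mentioned above can also be requested as a colored alignment by adding score_html to the -output list.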

Relevance: 20.00%

Abstract:

3 Summary

3.1 English

The pharmaceutical industry has been facing several challenges in recent years, and the optimization of its drug discovery pipeline is believed to be the only viable solution. High-throughput techniques contribute actively to this optimization, especially when complemented by computational approaches aiming at rationalizing the enormous amount of information they can produce. In silico techniques, such as virtual screening or rational drug design, are now routinely used to guide drug discovery. Both rely heavily on the prediction of the molecular interaction (docking) occurring between drug-like molecules and a therapeutically relevant target. Several software packages are available to this end, but despite the very promising picture drawn in most benchmarks, they still hold several hidden weaknesses. As pointed out in several recent reviews, the docking problem is far from being solved, and there is now a need for methods able to identify binding modes with high accuracy, which is essential to reliably compute the binding free energy of the ligand. This quantity is directly linked to the ligand's affinity and can be related to its biological activity. Accurate docking algorithms are thus critical for both the discovery and the rational optimization of new drugs. In this thesis, a new docking program aiming at this goal is presented: EADock. It uses a hybrid evolutionary algorithm with two fitness functions, in combination with sophisticated diversity management. EADock is interfaced with the CHARMM package for energy calculations and coordinate handling. A validation was carried out on 37 crystallized protein-ligand complexes featuring 11 different proteins. The search space was defined as a sphere of 15 Å around the center of mass of the ligand position in the crystal structure, and in contrast to other benchmarks, our algorithm was fed with optimized ligand positions up to 10 Å root mean square deviation (RMSD) from the crystal structure. This validation illustrates the efficiency of our sampling heuristic, as correct binding modes, defined by an RMSD to the crystal structure lower than 2 Å, were identified and ranked first for 68% of the complexes. The success rate increases to 78% when considering the five best-ranked clusters, and to 92% when all clusters present in the last generation are taken into account. Most failures in this benchmark could be explained by the presence of crystal contacts in the experimental structure. EADock has been used to understand molecular interactions involved in the regulation of the Na,K-ATPase and in the activation of the nuclear hormone peroxisome proliferator-activated receptor α (PPARα). It also helped to elucidate the action of common pollutants (phthalates) on PPARγ, and the impact of biotransformations of the anticancer drug Imatinib (Gleevec®) on its binding mode to the Bcr-Abl tyrosine kinase. Finally, a fragment-based rational drug design approach using EADock was developed, leading to the successful design of new peptidic ligands for the α5β1 integrin and for human PPARα. In both cases, the designed peptides presented activities comparable to those of well-established ligands, the anticancer drug Cilengitide and Wy14,643, respectively.

3.2 French

The recent difficulties of the pharmaceutical industry seem resolvable only through the optimization of its drug development process. This increasingly involves so-called high-throughput techniques, which are particularly effective when coupled with computational tools for managing the mass of data they produce. In silico approaches such as virtual screening or the rational design of new molecules are now in routine use. Both rest on the ability to predict the details of the molecular interaction between a drug-like molecule and a therapeutically relevant target protein. Benchmarks of the software tackling this prediction are flattering, but several problems persist. The recent literature tends to question their reliability, asserting an emerging need for more accurate approaches to the binding mode. This accuracy is essential for computing the binding free energy, which is directly linked to the affinity of the potential drug for the target protein and indirectly linked to its biological activity. Accurate prediction is of particular importance for the discovery and optimization of new active molecules. This thesis presents a new program, EADock, built around such accuracy. This hybrid evolutionary algorithm uses two selection pressures combined with sophisticated diversity management. EADock relies on CHARMM for energy calculations and the handling of atomic coordinates. Its validation was carried out on 37 crystallized protein-ligand complexes, including 11 different proteins. The search space was extended to a sphere of 15 Å radius around the center of mass of the crystallized ligand, and contrary to the usual benchmarks, the algorithm started from optimized solutions with an RMSD of up to 10 Å from the crystal structure. This validation demonstrated the efficiency of our search heuristic, as binding modes with an RMSD below 2 Å from the crystal structure were ranked first for 68% of the complexes. When the five best solutions are taken into account, the success rate climbs to 78%, and to 92% when the whole of the last generation is considered. Most prediction errors are attributable to the presence of crystal contacts. Since then, EADock has been used to understand the molecular mechanisms involved in the regulation of the Na,K-ATPase and in the activation of the peroxisome proliferator-activated receptor α (PPARα). It has also served to describe the interaction of commonly encountered pollutants with PPARγ, as well as the influence of the metabolization of Imatinib (an anticancer drug) on its binding to the Bcr-Abl kinase. An approach based on predicting the interactions of molecular fragments with a target protein is also proposed. It led to the discovery of new peptide ligands for PPARα and for the α5β1 integrin. In both cases, the activity of these new peptides is comparable to that of well-established ligands, such as Wy14,643 for the former and Cilengitide (an anticancer drug) for the latter.
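The benchmark criterion used above (a binding mode counts as correct when its RMSD to the crystal ligand is below 2 Å, scored over the best-ranked clusters) can be written down in a few lines. This is an illustrative re-implementation of the standard metric, not EADock code; poses are assumed to be pre-aligned N x 3 coordinate arrays.

```python
import numpy as np

def rmsd(pose, reference):
    """Root mean square deviation between two pre-aligned N x 3 coordinate arrays."""
    return float(np.sqrt(np.mean(np.sum((pose - reference) ** 2, axis=1))))

def success_rate(ranked_poses_per_complex, references, top_n=1, cutoff=2.0):
    """Fraction of complexes with a pose under `cutoff` Å among the top_n ranked poses."""
    hits = sum(
        any(rmsd(p, ref) < cutoff for p in poses[:top_n])
        for poses, ref in zip(ranked_poses_per_complex, references)
    )
    return hits / len(references)
```

With top_n=1 this reproduces the first-ranked criterion (68% in the benchmark), and raising top_n to 5 corresponds to the five best-ranked clusters (78%).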

Relevance: 20.00%

Abstract:

Background: PPP1R6 is a protein phosphatase 1 glycogen-targeting subunit (PP1-GTS) abundant in skeletal muscle with an undefined role in metabolic control. Here the effects of PPP1R6 on myotube glycogen metabolism, particle size and subcellular distribution are examined and compared with PPP1R3C/PTG and PPP1R3A/GM. Results: PPP1R6 overexpression activates glycogen synthase (GS), reduces its phosphorylation at Ser-641/0 and increases the extracted and cytochemically stained glycogen content, less than PTG but more than GM. PPP1R6 does not change glycogen phosphorylase activity. All tested PP1-GTS cells have more glycogen particles than controls, as found by electron microscopy of myotube sections. Glycogen particle size is distributed over a continuous range for all cell types, but PPP1R6 forms smaller particles (mean diameter 14.4 nm) than PTG (36.9 nm) and GM (28.3 nm) or those in control cells (29.2 nm). Both PPP1R6- and GM-derived glycogen particles are found in the cytosol associated with cellular structures; PTG-derived glycogen is found in membrane- and organelle-devoid cytosolic glycogen-rich areas; and glycogen particles are dispersed throughout the cytosol in control cells. PPP1R6 tagged at the C-terminus with EGFP shows a diffuse cytosolic pattern in glucose-replete and -depleted cells and a punctate pattern surrounding the nucleus in glucose-depleted cells, which colocalizes with RFP fused to the Golgi-targeting domain of β-1,4-galactosyltransferase, in agreement with a computational prediction of a Golgi location for PPP1R6. Conclusions: PPP1R6 exerts a powerful glycogenic effect in cultured muscle cells, greater than GM and smaller than PTG. PPP1R6 protein translocates from a Golgi to a cytosolic location in response to glucose. The molecular size and subcellular location of myotube glycogen particles are determined by the PPP1R6, PTG and GM scaffolds.

Relevance: 20.00%

Abstract:

We present a novel numerical algorithm for the simulation of seismic wave propagation in porous media, which is particularly suitable for the accurate modelling of surface-wave-type phenomena. The differential equations of motion are based on Biot's theory of poro-elasticity and solved with a pseudospectral approach using Fourier and Chebyshev methods to compute the spatial derivatives along the horizontal and vertical directions, respectively. The time solver is a splitting algorithm that accounts for the stiffness of the differential equations. Owing to the Chebyshev operator, the grid spacing in the vertical direction is non-uniform and characterized by a denser spatial sampling in the vicinity of interfaces, which allows for a numerically stable and accurate evaluation of higher-order surface-wave modes. We stretch the grid in the vertical direction to increase the minimum grid spacing and reduce the computational cost. The free-surface boundary conditions are implemented with a characteristics approach, in which the characteristic variables are evaluated at zero viscosity. The same procedure is used to model seismic wave propagation at the interface between a fluid and a porous medium. In this case, each medium is represented by a different grid and the two grids are combined through a domain-decomposition method. This wavefield-decomposition method accounts for the discontinuity of the variables and is crucial for an accurate interface treatment. We simulate seismic wave propagation with open-pore and sealed-pore boundary conditions and verify the validity and accuracy of the algorithm by comparing the numerical simulations to analytical solutions, based on zero viscosity, obtained with the Cagniard-de Hoop method. Finally, we illustrate the suitability of our algorithm for more complex models of porous media involving viscous pore fluids and strongly heterogeneous distributions of the elastic and hydraulic material properties.
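The Chebyshev discretization mentioned above is what clusters vertical grid points near boundaries and interfaces. A minimal numpy sketch of the textbook Chebyshev-Gauss-Lobatto differentiation matrix (the standard construction, as in Trefethen's cheb routine, not the authors' implementation) shows both the operator and the non-uniform point distribution:

```python
import numpy as np

def cheb(n):
    """Chebyshev differentiation matrix D and Gauss-Lobatto points x on [-1, 1]."""
    if n == 0:
        return np.zeros((1, 1)), np.ones(1)
    x = np.cos(np.pi * np.arange(n + 1) / n)         # points cluster near +/-1
    c = np.hstack([2.0, np.ones(n - 1), 2.0]) * (-1.0) ** np.arange(n + 1)
    dX = x[:, None] - x[None, :]
    D = np.outer(c, 1.0 / c) / (dX + np.eye(n + 1))  # off-diagonal entries
    D -= np.diag(D.sum(axis=1))                      # diagonal: negative row sums
    return D, x

D, x = cheb(16)
print(np.max(np.abs(D @ np.exp(x) - np.exp(x))))     # spectral accuracy, ~1e-13
```

Because the edge spacing shrinks as O(1/n^2) while the interior spacing is O(1/n), a stretching map along the vertical coordinate, as the authors describe, relaxes the time-step restriction imposed by the smallest cell.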

Relevance: 20.00%

Abstract:

The model plant Arabidopsis thaliana was studied in a search for new metabolites involved in wound signalling. Diverse LC approaches were evaluated in terms of efficiency and analysis time, and a 7-min gradient on a UPLC-TOF-MS system with a short column was chosen for metabolite fingerprinting. This screening step was designed to allow the comparison of a high number of samples over a wide range of time points after stress induction, in positive and negative ionisation modes. After data treatment, clear discrimination was obtained, providing lists of potential stress-induced ions. In a second step, the fingerprinting conditions were transferred to a longer column providing a higher peak capacity, which demonstrated the presence of isomers among the highlighted compounds.
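The data-treatment step behind this discrimination is typically an alignment of features across samples followed by an unsupervised projection such as PCA. The sketch below is a hedged illustration on synthetic intensities, not the authors' actual processing chain; the peak picking, alignment and scaling choices are all assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Hypothetical aligned feature table: rows = samples, columns = (m/z, RT) features.
control = rng.lognormal(mean=2.0, sigma=0.3, size=(10, 500))
wounded = control * rng.lognormal(mean=0.0, sigma=0.3, size=(10, 500))
wounded[:, :25] *= 5.0                      # pretend 25 features are stress-induced

X = np.log2(np.vstack([control, wounded]))  # log-transform intensities
scores = PCA(n_components=2).fit_transform(X)
# Separation of wounded vs. control samples along PC1 flags candidate
# stress-induced ions, which are then ranked via the component loadings.
```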

Relevance: 20.00%

Abstract:

The European Space Agency's Gaia mission will create the largest and most precise three-dimensional chart of our galaxy (the Milky Way) by providing unprecedented position, parallax, proper motion, and radial velocity measurements for about one billion stars. The resulting catalogue will be made available to the scientific community and will be analyzed in many different ways, including the production of a variety of statistics. The latter will often entail the generation of multidimensional histograms and hypercubes as part of the precomputed statistics for each data release, or for scientific analysis involving either the final data products or the raw data coming from the satellite instruments. In this paper we present and analyze a generic framework that allows the hypercube generation to be easily done within a MapReduce infrastructure, providing all the advantages of the new Big Data analysis paradigm but without dealing with any specific interface to the lower-level distributed system implementation (Hadoop). Furthermore, we show how executing the framework for different data storage model configurations (i.e. row- or column-oriented) and compression techniques can considerably improve the response time of this type of workload for the currently available simulated data of the mission. In addition, we put forward the advantages and shortcomings of deploying the framework on a public cloud provider, benchmark it against other popular available solutions (which are not always the best for such ad hoc applications), and describe some user experiences with the framework, which was employed for a number of dedicated workshops on astronomical data analysis techniques.
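As a sketch of the hypercube-generation pattern wrapped by such a framework, the pure-Python map and reduce stand-ins below bin star records into a two-dimensional histogram; the record fields and bin widths are invented for illustration and are not the Gaia data model.

```python
from collections import Counter
from itertools import chain

def mapper(record, bin_width=(1.0, 1.0)):
    """Emit one ((bin_ra, bin_dec), 1) pair per star record."""
    ra, dec = record
    yield (int(ra // bin_width[0]), int(dec // bin_width[1])), 1

def reducer(pairs):
    """Sum the counts falling into each hypercube cell."""
    counts = Counter()
    for key, value in pairs:
        counts[key] += value
    return counts

records = [(10.2, -3.1), (10.7, -3.9), (42.0, 7.5)]   # toy (ra, dec) tuples
hypercube = reducer(chain.from_iterable(mapper(r) for r in records))
print(hypercube)   # Counter({(10, -4): 2, (42, 7): 1})
```

In a real MapReduce deployment the map stage runs in parallel over data splits and the shuffle groups keys before reduction; these two functions are the level of interface the framework exposes while hiding the Hadoop specifics.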

Relevance: 20.00%

Abstract:

Land plants have had the reputation of being problematic for DNA barcoding for two general reasons: (i) the standard DNA regions used in algae, animals and fungi have exceedingly low levels of variability, and (ii) the typically used land plant plastid phylogenetic markers (e.g. rbcL, trnL-F) appear to have too little variation. However, no one has assessed how well current phylogenetic resources might work in the context of identification (versus phylogeny reconstruction). In this paper, we make such an assessment, particularly with two of the markers commonly sequenced in land plant phylogenetic studies, plastid rbcL and the internal transcribed spacers of the large subunits of nuclear ribosomal DNA (ITS), and find that both of these DNA regions perform well, even though the data currently available in GenBank/EBI were not produced to be used as barcodes and BLAST searches are not an ideal tool for this purpose. These results bode well for the use of even more variable regions of plastid DNA (such as psbA-trnH) as barcodes, once they have been widely sequenced. In the short term, efforts to bring land plant barcoding up to the standards now being used in other organisms should make swift progress. There are two categories of DNA barcode users: scientists in fields other than taxonomy, and taxonomists. For the former, the use of mitochondrial and plastid DNA, the two most easily assessed genomes, is at least in the short term a useful tool that permits them to get on with their studies, which depend on knowing roughly which species or species groups they are dealing with; but these same DNA regions have important drawbacks for use in taxonomic studies (i.e. studies designed to elucidate species limits). For these purposes, DNA markers from uniparentally (usually maternally) inherited genomes can provide only half of the story required to improve the taxonomic standards being used in DNA barcoding. In the long term, we will need to develop more sophisticated barcoding tools: multiple, low-copy nuclear markers with sufficient genetic variability and PCR reliability. These would permit the detection of hybrids and allow researchers to identify the 'genetic gaps' that are useful in assessing species limits.
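For context, the identification workflow being assessed boils down to BLASTing a marker sequence against GenBank and inspecting the top hits. A hedged Biopython sketch follows; the query file name is hypothetical, and remote qblast calls are rate-limited by NCBI.

```python
from Bio.Blast import NCBIWWW, NCBIXML

# Query a plastid rbcL sequence (hypothetical file) against the nt database.
with open("query_rbcL.fasta") as handle:
    query = handle.read()

result_handle = NCBIWWW.qblast("blastn", "nt", query)   # remote BLAST at NCBI
record = NCBIXML.read(result_handle)

for alignment in record.alignments[:5]:
    best_hsp = alignment.hsps[0]
    identity = 100.0 * best_hsp.identities / best_hsp.align_length
    print(f"{alignment.title[:60]}  {identity:.1f}% identity")
```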

Relevance: 20.00%

Abstract:

The "one-gene, one-protein" rule, coined by Beadle and Tatum, has been fundamental to molecular biology. The rule implies that the genetic complexity of an organism depends essentially on its gene number. The discovery, however, that alternative gene splicing and transcription are widespread phenomena dramatically altered our understanding of the genetic complexity of higher eukaryotic organisms; in these, a limited number of genes may potentially encode a much larger number of proteins. Here we investigate yet another phenomenon that may contribute to generate additional protein diversity. Indeed, by relying on both computational and experimental analysis, we estimate that at least 4%-5% of the tandem gene pairs in the human genome can be eventually transcribed into a single RNA sequence encoding a putative chimeric protein. While the functional significance of most of these chimeric transcripts remains to be determined, we provide strong evidence that this phenomenon does not correspond to mere technical artifacts and that it is a common mechanism with the potential of generating hundreds of additional proteins in the human genome.

Relevance: 20.00%

Abstract:

Low-copy-number molecules are involved in many cellular functions. The intrinsic fluctuations of these numbers can enable stochastic switching between multiple steady states, inducing phenotypic variability. Here we present a theoretical and computational study, based on Master Equation, Fokker-Planck and Langevin descriptions, of stochastic switching in a genetic circuit with autoactivation. We show that in this circuit the intrinsic fluctuations arising from low copy numbers, which are inherently state-dependent, drive asymmetric switching. These theoretical results are consistent with experimental data reported for the bistable galactose signaling network in yeast. Our study reveals that intrinsic fluctuations, while not required to describe bistability, are fundamental to understanding stochastic switching and the dynamical relative stability of multiple states.
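A minimal Gillespie (stochastic simulation algorithm) sketch of an autoactivating circuit makes the switching concrete; the Hill-type production propensity and all rate constants are illustrative choices, not the parameters of the study.

```python
import numpy as np

def gillespie_autoactivation(x0=5, t_end=2000.0,
                             k_basal=2.0, k_act=40.0, K=30.0, n=4, gamma=1.0,
                             seed=0):
    """Exact stochastic simulation of birth-death dynamics with autoactivation."""
    rng = np.random.default_rng(seed)
    t, x = 0.0, x0
    times, copies = [t], [x]
    while t < t_end:
        production = k_basal + k_act * x**n / (K**n + x**n)  # autoactivation
        degradation = gamma * x
        total = production + degradation
        t += rng.exponential(1.0 / total)                    # time to next event
        x += 1 if rng.random() < production / total else -1  # which event fired
        times.append(t)
        copies.append(x)
    return np.array(times), np.array(copies)

t, x = gillespie_autoactivation()
# Long runs switch stochastically between the low (~k_basal/gamma) and
# high (~(k_basal + k_act)/gamma) expression states; because the noise is
# state-dependent, residence times in the two states are asymmetric.
```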