997 results for Computational Lexical Semantics


Relevance:

20.00%

Abstract:

The ability to determine the location and relative strength of all transcription-factor binding sites in a genome is important both for a comprehensive understanding of gene regulation and for effective promoter engineering in biotechnological applications. Here we present a bioinformatically driven experimental method to accurately define the DNA-binding sequence specificity of transcription factors. A generalized profile was used as a predictive quantitative model for binding sites, and its parameters were estimated from in vitro-selected ligands using standard hidden Markov model training algorithms. Computer simulations showed that several thousand low- to medium-affinity sequences are required to generate a profile of desired accuracy. To produce data on this scale, we applied high-throughput genomics methods to the biochemical problem addressed here. A method combining systematic evolution of ligands by exponential enrichment (SELEX) and serial analysis of gene expression (SAGE) protocols was coupled to an automated quality-controlled sequence extraction procedure based on Phred quality scores. This allowed the sequencing of a database of more than 10,000 potential DNA ligands for the CTF/NFI transcription factor. The resulting binding-site model defines the sequence specificity of this protein with a high degree of accuracy not achieved earlier and thereby makes it possible to identify previously unknown regulatory sequences in genomic DNA. A covariance analysis of the selected sites revealed non-independent base preferences at different nucleotide positions, providing insight into the binding mechanism.
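The idea of a quantitative binding-site model estimated from selected ligands can be illustrated with a much simpler stand-in for the generalized profile described above: a log-odds position weight matrix. This is only a sketch under stated assumptions. The function names, the pseudocount, the uniform background, and the toy sequences are all hypothetical and are not taken from the study, which trained its profile with hidden Markov model algorithms.

```python
import math

BASES = "ACGT"

def log_odds_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds matrix from equal-length aligned sites.

    Assumption: uniform background and a fixed pseudocount (illustrative only).
    """
    width = len(sites[0])
    matrix = []
    for pos in range(width):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {base: math.log2((counts[base] / total) / background) for base in BASES}
        )
    return matrix

def best_hit(matrix, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(matrix)
    scored = []
    for start in range(len(sequence) - width + 1):
        score = sum(matrix[i][sequence[start + i]] for i in range(width))
        scored.append((score, start))
    return max(scored)

# Toy aligned sites (hypothetical, not data from the study).
toy_sites = ["TTGGC", "TTGGC", "TTGGA", "CTGGC"]
toy_matrix = log_odds_matrix(toy_sites)
toy_score, toy_start = best_hit(toy_matrix, "AAATTGGCAAA")
```

A matrix of this kind treats positions independently, so it cannot express the non-independent base preferences that the covariance analysis in the abstract reports.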

Relevance:

20.00%

Abstract:

For the last two decades, supertree reconstruction has been an active field of research and has seen the development of a large number of major algorithms. Because of the growing popularity of supertree methods, it has become necessary to evaluate the performance of these algorithms to determine which are the best options (especially with regard to the supermatrix approach that is widely used). In this study, seven of the most commonly used supertree methods are investigated by using a large empirical data set (in terms of number of taxa and molecular markers) from the worldwide flowering plant family Sapindaceae. Supertree methods were evaluated using several criteria: similarity of the supertrees with the input trees, similarity between the supertrees and the total evidence tree, level of resolution of the supertree, and computational time required by the algorithm. Additional analyses were also conducted on a reduced data set to test whether the performance levels were affected by the heuristic searches rather than the algorithms themselves. Based on our results, two main groups of supertree methods were identified: the matrix representation with parsimony (MRP), MinFlip, and MinCut methods performed well according to our criteria, whereas the average consensus, split fit, and most similar supertree methods showed a poorer performance or at least did not behave the same way as the total evidence tree. Results for the super distance matrix, that is, the most recent approach tested here, were promising, with at least one derived method performing as well as MRP, MinFlip, and MinCut. The output of each method was only slightly improved when applied to the reduced data set, suggesting a correct behavior of the heuristic searches and a relatively low sensitivity of the algorithms to data set sizes and missing data.
Results also showed that the MRP analyses could reach a high level of quality even when using a simple heuristic search strategy, with the exception of MRP with the Purvis coding scheme and reversible parsimony. The future of supertrees lies in the implementation of a standardized heuristic search for all methods and in the increase in computing power to handle large data sets. The latter would prove particularly useful for promising approaches such as the maximum quartet fit method, which still requires substantial computing power.
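For readers unfamiliar with matrix representation with parsimony, the coding step can be sketched as follows. This is a minimal illustration of the standard clade-based coding (1 for taxa inside a clade, 0 for taxa in the same input tree but outside the clade, ? for taxa absent from that tree), assuming input trees are given as nested tuples of taxon names. The function names and toy trees are hypothetical, and the Purvis coding variant mentioned in the abstract is not shown.

```python
def leaves(tree):
    """Return the set of taxon names under a node (a str leaf or a tuple)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))

def clades(tree, is_root=True):
    """Yield the leaf set of every internal node except the root."""
    if isinstance(tree, str):
        return
    if not is_root:
        yield leaves(tree)
    for child in tree:
        yield from clades(child, is_root=False)

def mrp_matrix(trees):
    """Code each non-root clade of each input tree as one binary character."""
    taxa = sorted(set().union(*(leaves(tree) for tree in trees)))
    rows = {taxon: [] for taxon in taxa}
    for tree in trees:
        present = leaves(tree)
        for clade in clades(tree):
            for taxon in taxa:
                if taxon not in present:
                    rows[taxon].append("?")   # taxon missing from this tree
                elif taxon in clade:
                    rows[taxon].append("1")   # taxon inside the clade
                else:
                    rows[taxon].append("0")   # taxon in the tree, outside the clade
    return {taxon: "".join(states) for taxon, states in rows.items()}

# Two toy input trees with partially overlapping taxa (hypothetical).
toy_trees = [(("A", "B"), "C"), (("B", "C"), "D")]
toy_matrix = mrp_matrix(toy_trees)
```

The resulting character matrix would then be passed to a parsimony search, which is the heuristic step whose influence the study evaluates.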

Relevance:

20.00%

Abstract:

Amplified Fragment Length Polymorphisms (AFLPs) are a cheap and efficient protocol for generating large sets of genetic markers. This technique has become increasingly used during the last decade in various fields of biology, including population genomics, phylogeography, and genome mapping. Here, we present RawGeno, an R library dedicated to the automated scoring of AFLPs (i.e., the coding of electropherogram signals into ready-to-use datasets). Our program includes a complete suite of tools for binning, editing, visualizing, and exporting results obtained from AFLP experiments. RawGeno can either be used with command lines and program analysis routines or through a user-friendly graphical user interface. We describe the whole RawGeno pipeline along with recommendations for (a) setting the analysis of electropherograms in combination with PeakScanner, a program freely distributed by Applied Biosystems; (b) performing quality checks; (c) defining bins and proceeding to scoring; (d) filtering nonoptimal bins; and (e) exporting results in different formats.

Relevance:

20.00%

Abstract:

The human body is composed of a huge number of cells acting together in a concerted manner. The current understanding is that proteins perform most of the necessary activities in keeping a cell alive. The DNA, on the other hand, stores the information on how to produce the different proteins in the genome. Regulating gene transcription is the first important step that can thus affect the life of a cell, modify its functions and its responses to the environment. Regulation is a complex operation that involves specialized proteins, the transcription factors. Transcription factors (TFs) can bind to DNA and activate the processes leading to the expression of genes into new proteins. Errors in this process may lead to diseases. In particular, some transcription factors have been associated with a lethal pathological state, commonly known as cancer, associated with uncontrolled cellular proliferation, invasiveness of healthy tissues and abnormal responses to stimuli. Understanding cancer-related regulatory programs is a difficult task, often involving several TFs interacting together and influencing each other's activity. This thesis presents new computational methodologies to study gene regulation. In addition, we present applications of our methods to the understanding of cancer-related regulatory programs. The understanding of transcriptional regulation is a major challenge. We address this difficult question by combining computational approaches with large collections of heterogeneous experimental data. In detail, we design signal processing tools to recover transcription factor binding sites on the DNA from genome-wide surveys such as chromatin immunoprecipitation assays on tiling arrays (ChIP-chip). We then use the localization information about the binding of TFs to explain expression levels of regulated genes. In this way we identify a regulatory synergy between two TFs, the oncogene C-MYC and SP1.
C-MYC and SP1 bind preferentially at promoters, and when SP1 binds next to C-MYC on the DNA, the nearby gene is strongly expressed. The association between the two TFs at promoters is reflected by the conservation of the binding sites across mammals and by the permissive underlying chromatin states; it represents an important control mechanism involved in cellular proliferation, and thereby in cancer. Secondly, we identify the characteristics of TF estrogen receptor alpha (hERa) target genes and we study the influence of hERa in regulating transcription. hERa, upon hormone estrogen signaling, binds to DNA to regulate transcription of its targets in concert with its co-factors. To overcome the scarce experimental data about the binding sites of other TFs that may interact with hERa, we conduct in silico analysis of the sequences underlying the ChIP sites using the collection of position weight matrices (PWMs) of hERa partners, TFs FOXA1 and SP1. We combine ChIP-chip and ChIP-paired-end-diTags (ChIP-pet) data about hERa binding on DNA with the sequence information to explain gene expression levels in a large collection of cancer tissue samples and also in studies about the response of cells to estrogen. We confirm that hERa binding sites are distributed anywhere on the genome. However, we distinguish between binding sites near promoters and binding sites along the transcripts. The first group shows weak binding of hERa and high occurrence of SP1 motifs, in particular near estrogen-responsive genes. The second group shows strong binding of hERa and significant correlation between the number of binding sites along a gene and the strength of gene induction in the presence of estrogen. Some binding sites of the second group also show the presence of FOXA1, but the role of this TF still needs to be investigated. Different mechanisms have been proposed to explain hERa-mediated induction of gene expression.
Our work supports the model of hERa activating gene expression from distal binding sites by interacting with promoter-bound TFs, like SP1. hERa has been associated with survival rates of breast cancer patients, though explanatory models are still incomplete: this result is important to better understand how hERa can control gene expression. Thirdly, we address the difficult question of regulatory network inference. We tackle this problem by analyzing time series of biological measurements such as quantification of mRNA levels or protein concentrations. Our approach uses the well-established penalized linear regression models, in which we impose sparseness on the connectivity of the regulatory network. We extend this method by enforcing the coherence of the regulatory dependencies: a TF must coherently behave as an activator or as a repressor on all its targets. This requirement is implemented as constraints on the signs of the regressed coefficients in the penalized linear regression model. Our approach is better at reconstructing meaningful biological networks than previous methods based on penalized regression. The method is tested on the DREAM2 challenge of reconstructing a five-gene/TF regulatory network, obtaining the best performance in the "undirected signed excitatory" category. Thus, these bioinformatics methods, which are reliable, interpretable and fast enough to cover large biological datasets, have enabled us to better understand gene regulation in humans.
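The sign-constrained penalized regression mentioned above can be sketched in a simplified form. The example below assumes that the sign of each regulator is fixed in advance (+1 for an activator, -1 for a repressor) and fits an L1-penalized least-squares model for a single target by proximal gradient descent. The thesis does not state its solver or how signs are chosen, so the function name, the fixed signs, the step-size rule, and the toy data are all assumptions made for illustration.

```python
def sign_constrained_lasso(X, y, signs, lam=0.01, n_iter=2000):
    """Minimize (1/2n)||y - Xb||^2 + lam*||b||_1 subject to signs[j]*b[j] >= 0.

    Assumption: signs are given a priori; solved by proximal gradient steps.
    """
    n, p = len(X), len(X[0])
    # Safe step size: squared Frobenius norm bounds the largest eigenvalue of X^T X.
    step = n / sum(v * v for row in X for v in row)
    beta = [0.0] * p
    for _ in range(n_iter):
        residual = [
            sum(X[i][j] * beta[j] for j in range(p)) - y[i] for i in range(n)
        ]
        grad = [sum(X[i][j] * residual[i] for i in range(n)) / n for j in range(p)]
        for j in range(p):
            z = beta[j] - step * grad[j]
            if signs[j] > 0:
                # Activator: soft-threshold, then clip to non-negative values.
                beta[j] = max(z - step * lam, 0.0)
            else:
                # Repressor: soft-threshold, then clip to non-positive values.
                beta[j] = min(z + step * lam, 0.0)
    return beta

# Toy data (hypothetical): the target equals 2*regulator0 - 1*regulator1.
X_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y_toy = [2.0, -1.0, 1.0, 3.0]

beta_ok = sign_constrained_lasso(X_toy, y_toy, signs=[+1, -1])
beta_forced = sign_constrained_lasso(X_toy, y_toy, signs=[+1, +1])
```

With the correct signs the fit stays close to the generating coefficients, shrunk slightly by the penalty. When the second regulator is wrongly forced to be an activator, its coefficient cannot go negative, which is how a coherence constraint prunes sign-inconsistent edges.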

Relevance:

20.00%

Abstract:

A precise and simple computational model to generate well-behaved two-dimensional turbulent flows is presented. The whole approach rests on the use of stochastic differential equations and is general enough to reproduce a variety of energy spectra and spatiotemporal correlation functions. Analytical expressions for both the continuous and the discrete versions, together with simulation algorithms, are derived. Results for two relevant spectra, covering distinct ranges of wave numbers, are given.

Relevance:

20.00%

Abstract:

We show that the dispersal routes reconstruction problem can be stated as an instance of a graph theoretical problem known as the minimum cost arborescence problem, for which there exist efficient algorithms. Furthermore, we derive some theoretical results, in a simplified setting, on the possible optimal values that can be obtained for this problem. With this, we place the dispersal routes reconstruction problem on solid theoretical grounds, establishing it as a tractable problem that also lends itself to formal mathematical and computational analysis. Finally, we present an insightful example of how this framework can be applied to real data. We propose that our computational method can be used to define the most parsimonious dispersal (or invasion) scenarios, which can then be tested using complementary methods such as genetic analysis.
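To make the minimum cost arborescence formulation concrete, here is a compact sketch of the Chu-Liu/Edmonds algorithm, one standard efficient algorithm for this problem. The abstract does not say which algorithm the authors used. Reading nodes as localities, the root as the presumed origin, and edge weights as dispersal costs is an assumption made here for illustration, and the function name and toy graph are hypothetical. This version returns only the optimal total cost.

```python
def min_arborescence_cost(n, root, edges):
    """Chu-Liu/Edmonds: cost of the cheapest arborescence rooted at `root`.

    `edges` is a list of (source, target, weight) over nodes 0..n-1.
    Returns None when some node cannot be reached from the root.
    """
    inf = float("inf")
    total = 0
    while True:
        # 1. Pick the cheapest incoming edge for every node.
        in_w = [inf] * n
        pre = [-1] * n
        for u, v, w in edges:
            if u != v and w < in_w[v]:
                in_w[v] = w
                pre[v] = u
        for v in range(n):
            if v != root and in_w[v] == inf:
                return None
        in_w[root] = 0
        # 2. Detect cycles among the chosen edges.
        comp = [-1] * n
        visited = [-1] * n
        n_comp = 0
        for i in range(n):
            total += in_w[i]
            v = i
            while visited[v] != i and comp[v] == -1 and v != root:
                visited[v] = i
                v = pre[v]
            if v != root and comp[v] == -1:
                u = pre[v]
                while u != v:
                    comp[u] = n_comp
                    u = pre[u]
                comp[v] = n_comp
                n_comp += 1
        if n_comp == 0:
            return total  # no cycle: the chosen edges form the arborescence
        # 3. Contract each cycle into one node and reweight the edges.
        for i in range(n):
            if comp[i] == -1:
                comp[i] = n_comp
                n_comp += 1
        edges = [
            (comp[u], comp[v], w - in_w[v])
            for u, v, w in edges
            if comp[u] != comp[v]
        ]
        root = comp[root]
        n = n_comp

# Toy network (hypothetical): node 0 is the origin; nodes 1 and 2 form a cheap cycle.
toy_edges = [(0, 1, 10), (1, 2, 1), (2, 1, 1), (0, 2, 10)]
toy_cost = min_arborescence_cost(3, 0, toy_edges)
```

In the toy network the cheapest incoming edges of nodes 1 and 2 form a cycle, so the algorithm contracts it and pays 10 to enter the cycle from the origin plus 1 inside it.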

Relevance:

20.00%

Abstract:

Schizotypy is a multidimensional personality construct representing the extension of psychosis-like traits into the general population. Schizotypy has been associated with attenuated expressions of many of the same neuropsychological abnormalities as schizophrenia, including an atypical pattern of functional hemispheric asymmetry. Unfortunately, the previous literature on links between schizotypy and hemispheric asymmetry is inconsistent, with different studies indicating that elevated schizotypy is associated with relative right-over-left hemisphere shifts, left-over-right hemisphere shifts, bilateral impairments, or no hemispheric differences at all. This inconsistency may result from different methodologies, scales, and/or sex proportions between studies. In a within-participant design, we tested for the four possible links between laterality and schizotypy by comparing the relationship between two common self-report measures of multidimensional schizotypy (the O-LIFE questionnaire, and two Chapman scales, magical ideation and physical anhedonia) and performance in two computerized lateralised hemifield paradigms (lexical decision, chimeric face processing) in 80 men and 79 women. Results for the two scales and two tasks did not unequivocally support any of the four possible links. We discuss the possibilities that a link between schizotypy and laterality 1) exists but is subtle and probably fluctuating, and cannot be assessed by the traditional methodologies used here; 2) does not exist; or 3) is indirect, mediated by other factors (e.g. stress-responsiveness, handedness, drug use) whose influences need further exploration.

Relevance:

20.00%

Abstract:

Brain fluctuations at rest are not random but are structured in spatial patterns of correlated activity across different brain areas. The question of how resting-state functional connectivity (FC) emerges from the brain's anatomical connections has motivated several experimental and computational studies to understand structure-function relationships. However, the mechanistic origin of resting state is obscured by large-scale models' complexity, and a close structure-function relation is still an open problem. Thus, a realistic but simple enough description of relevant brain dynamics is needed. Here, we derived a dynamic mean field model that consistently summarizes the realistic dynamics of a detailed spiking and conductance-based synaptic large-scale network, in which connectivity is constrained by diffusion imaging data from human subjects. The dynamic mean field approximates the ensemble dynamics, whose temporal evolution is dominated by the longest time scale of the system. With this reduction, we demonstrated that FC emerges as structured linear fluctuations around a stable low firing activity state close to destabilization. Moreover, the model can be further and crucially simplified into a set of motion equations for statistical moments, providing a direct analytical link between anatomical structure, neural network dynamics, and FC. Our study suggests that FC arises from noise propagation and dynamical slowing down of fluctuations in an anatomically constrained dynamical system. Altogether, the reduction from spiking models to statistical moments presented here provides a new framework to explicitly understand the building up of FC through neuronal dynamics underpinned by anatomical connections and to drive hypotheses in task-evoked studies and for clinical applications.

Relevance:

20.00%

Abstract:

The optical-absorption spectrum of a cationic Ag0 atom in a KCl crystal has been studied theoretically by means of a series of cluster models of increasing size. Excitation energies have been determined by means of a multiconfigurational self-consistent field procedure followed by a second-order perturbation correlation treatment. Results obtained within the density-functional framework are also reported. The calculations confirm the assignment of bands I and IV to transitions of the Ag-5s electron into delocalized states with mainly K-4s,4p character. Bands II and III have been assigned to internal transitions on the Ag atom, which correspond to the atomic Ag-4d to Ag-5s transition. We also determine the lowest charge transfer (CT) excitation energy and confirm the assignment of band VI to such a transition. The study of the variation of the CT excitation energy with the Ag-Cl distance R gives additional support to a large displacement of the Cl ions due to the presence of the Ag0 impurity. Moreover, from the present results, it is predicted that on passing to NaCl:Ag0 the CT onset would be out of the optical range, while the 5s-5p transition would undergo a redshift of 0.3 eV. These conclusions, which underline the different character of the orbitals involved, are consistent with experimental findings. The existence of a CT transition in the optical range for an atom inside an ionic host is explained by a simple model, which also accounts for the differences with the more common 3d systems. The present study also sheds some light on the R dependence of the s2-sp transitions due to s2 ions like Tl+.

Relevance:

20.00%

Abstract:

Aggregates of oxygen vacancies (F centers) represent a particular form of point defects in ionic crystals. In this study we have considered the combination of two oxygen vacancies, the M center, in the bulk and on the surface of MgO by means of cluster model calculations. Both neutral and charged forms of the defect M and M+ have been taken into account. The ground state of the M center is characterized by the presence of two doubly occupied impurity levels in the gap of the material; in M+ centers the highest level is singly occupied. For the ground-state properties we used a gradient corrected density functional theory approach. The dipole-allowed singlet-to-singlet and doublet-to-doublet electronic transitions have been determined by means of explicitly correlated multireference second-order perturbation theory calculations. These have been compared with optical transitions determined with the time-dependent density functional theory formalism. The results show that bulk M and M+ centers give rise to intense absorptions at about 4.4 and 4.0 eV, respectively. Another less intense transition at 1.3 eV has also been found for the M+ center. On the surface the transitions occur at 1.6 eV (M+) and 2 eV (M). The results are compared with recently reported electron energy loss spectroscopy spectra on MgO thin films.

Relevance:

20.00%

Abstract:

The Multiscale Finite Volume (MsFV) method has been developed to efficiently solve reservoir-scale problems while conserving fine-scale details. The method employs two grid levels: a fine grid and a coarse grid. The latter is used to calculate a coarse solution to the original problem, which is interpolated to the fine mesh. The coarse system is constructed from the fine-scale problem using restriction and prolongation operators that are obtained by introducing appropriate localization assumptions. Through a successive reconstruction step, the MsFV method is able to provide an approximate but fully conservative fine-scale velocity field. For very large problems (e.g. a one-billion-cell model), a two-level algorithm can remain computationally expensive. Depending on the upscaling factor, the computational expense comes either from the costs associated with the solution of the coarse problem or from the construction of the local interpolators (basis functions). To ensure numerical efficiency in the former case, the MsFV concept can be reapplied to the coarse problem, leading to a new, coarser level of discretization. One challenge in the use of a multilevel MsFV technique is to find an efficient reconstruction step to obtain a conservative fine-scale velocity field. In this work, we introduce a three-level Multiscale Finite Volume method (MlMsFV) and give a detailed description of the reconstruction step. Complexity analyses of the original MsFV method and the new MlMsFV method are discussed, and their performances in terms of accuracy and efficiency are compared.

Relevance:

20.00%

Abstract:

Molecular chaperones are central to cellular protein homeostasis. In mammals, protein misfolding diseases and aging cause inflammation and progressive tissue loss, in correlation with the accumulation of toxic protein aggregates and the defective expression of chaperone genes. Bacteria and non-diseased, non-aged eukaryotic cells effectively respond to heat shock by inducing the accumulation of heat-shock proteins (HSPs), many of which are molecular chaperones involved in protein homeostasis, in reducing stress damage, and in promoting cellular recovery and thermotolerance. We performed a meta-analysis of published microarray data and compared expression profiles of HSP genes from mammalian and plant cells in response to heat or isothermal treatments with drugs. The differences and overlaps between HSP and chaperone genes were analyzed, and expression patterns were clustered and organized in a network. HSPs and chaperones only partly overlapped. Heat shock induced a subset of chaperones primarily targeted to the cytoplasm and organelles but not to the endoplasmic reticulum, which organized into a network with a central core of Hsp90s, Hsp70s, and sHSPs. Heat was best mimicked by isothermal treatments with Hsp90 inhibitors, whereas less toxic drugs, some of which are non-steroidal anti-inflammatory drugs, weakly expressed different subsets of Hsp chaperones. This type of analysis may uncover new HSP-inducing drugs to improve protein homeostasis in misfolding and aging diseases.