64 results for Bioinformàtica


Relevance:

10.00% 10.00%

Publisher:

Abstract:

The observation that real complex networks have internal structure has important implications for dynamic processes occurring on such topologies. Here we investigate the impact of community structure on a model of information transfer able to deal with both search and congestion simultaneously. We show that networks with fuzzy community structure are more efficient in terms of packet delivery than those with pronounced community structure. We also propose an alternative packet routing algorithm which takes advantage of the knowledge of communities to improve information transfer and show that in the context of the model an intermediate level of community structure is optimal. Finally, we show that in a hierarchical network setting, providing knowledge of communities at the level of highest modularity will improve network capacity by the largest amount.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

BACKGROUND: DNA sequence polymorphism analysis can provide valuable information on the evolutionary forces shaping nucleotide variation, and provides an insight into the functional significance of genomic regions. The ongoing genome projects will radically improve our capabilities to detect specific genomic regions shaped by natural selection. Currently available methods and software, however, are unsatisfactory for such genome-wide analysis. RESULTS: We have developed methods for the analysis of DNA sequence polymorphisms at the genome-wide scale. These methods, which have been tested on coalescent-simulated and actual data files from mouse and human, have been implemented in the VariScan software package version 2.0. Additionally, we have incorporated a graphical user interface. The main features of this software are: i) exhaustive population-genetic analyses including those based on the coalescent theory; ii) analysis adapted to the shallow data generated by the high-throughput genome projects; iii) use of genome annotations to conduct comprehensive analyses separately for different functional regions; iv) identification of relevant genomic regions by the sliding-window and wavelet-multiresolution approaches; v) visualization of the results integrated with current genome annotations in commonly available genome browsers. CONCLUSION: VariScan is a powerful and flexible suite of software for the analysis of DNA polymorphisms. The current version implements new algorithms, methods, and capabilities, providing an important tool for an exhaustive exploratory analysis of genome-wide DNA polymorphism data.
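
The sliding-window approach mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not VariScan's implementation: the function names, the choice of nucleotide diversity (average pairwise differences per site) as the statistic, and the assumption of gap-free aligned sequences of equal length are all illustrative.

```python
from itertools import combinations


def pairwise_differences(a, b):
    """Count positions at which two aligned sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site (pi).

    Assumes at least two aligned, gap-free sequences of equal length;
    returns 0.0 when the input is too small to compare.
    """
    n = len(seqs)
    length = len(seqs[0]) if seqs else 0
    if n < 2 or length == 0:
        return 0.0
    total = sum(pairwise_differences(a, b) for a, b in combinations(seqs, 2))
    n_pairs = n * (n - 1) // 2
    return total / n_pairs / length


def sliding_window_pi(seqs, window, step):
    """Return (window_start, pi) for each full window along the alignment."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start, nucleotide_diversity(chunk)))
    return results


if __name__ == "__main__":
    # Hypothetical toy alignment, not data from the paper.
    alignment = ["AAAAAAAA", "AAAATAAA", "AAAATAAC"]
    for start, pi in sliding_window_pi(alignment, window=4, step=4):
        print(start, round(pi, 4))
```

Windows whose statistic departs strongly from the genome-wide background would be the candidate regions for closer inspection; choosing the window and step sizes is left to the analyst in this sketch.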

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Advances in the foundations of theoretical methods and the spectacular growth in computing power have made enormous progress possible toward the dream of the founders of chemistry, namely being able to study the full range of chemical processes with computational methods. Theoretical chemistry is now completing its latest advance: attempting to become the newest tool for understanding the chemical nature of living beings. This review aims to show how the methods of theoretical chemistry, originally developed to examine small molecules in the gas phase, have evolved to achieve the complex description of biological systems.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Bioinformatics is a rapidly evolving research field dedicated to analyzing and managing biological data with computational resources. This paper aims to overview some of the processes and applications currently implemented at CCiT-UB's Bioinformatics Unit, focusing mainly on the areas of Genomics, Transcriptomics and Proteomics.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Information about the genomic coordinates and the sequence of experimentally identified transcription factor binding sites is scattered across a variety of formats. The availability of standard collections of such high-quality data is important to design, evaluate and improve novel computational approaches to identify binding motifs on promoter sequences from related genes. ABS (http://genome.imim.es/datasets/abs2005/index.html) is a public database of known binding sites identified in promoters of orthologous vertebrate genes that have been manually curated from the literature. We have annotated 650 experimental binding sites from 68 transcription factors and 100 orthologous target genes in human, mouse, rat or chicken genome sequences. Computational predictions and promoter alignment information are also provided for each entry. A simple and easy-to-use web interface facilitates data retrieval, allowing different views of the information. In addition, release 1.0 of ABS includes a customizable generator of artificial datasets based on the known sites contained in the collection and an evaluation tool to aid during the training and the assessment of motif-finding programs.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Background: The arrangement of regulatory motifs in gene promoters, or promoter architecture, is the result of mutation and selection processes that have operated over many millions of years. In mammals, tissue-specific transcriptional regulation is related to the presence of specific protein-interacting DNA motifs in gene promoters. However, little is known about the relative location and spacing of these motifs. To fill this gap, we have performed a systematic search for motifs that show significant bias at specific promoter locations in a large collection of housekeeping and tissue-specific genes. Results: We observe that promoters driving housekeeping gene expression are enriched in particular motifs with strong positional bias, such as YY1, which are of little relevance in promoters driving tissue-specific expression. We also identify a large number of motifs that show positional bias in genes expressed in a highly tissue-specific manner. They include well-known tissue-specific motifs, such as HNF1 and HNF4 motifs in liver, kidney and small intestine, or RFX motifs in testis, as well as many potentially novel regulatory motifs. Based on this analysis, we provide predictions for 559 tissue-specific motifs in mouse gene promoters. Conclusion: The study shows that motif positional bias is an important feature of mammalian proximal promoters and that it affects both general and tissue-specific motifs. Motif positional constraints define very distinct promoter architectures depending on breadth of expression and type of tissue.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In past years, comprehensive representations of cell signalling pathways have been developed by manual curation from literature, which requires huge effort and would benefit from information stored in databases and from automatic retrieval and integration methods. Once a reconstruction of the network of interactions is achieved, analysis of its structural features and its dynamic behaviour can take place. Mathematical modelling techniques are used to simulate the complex behaviour of cell signalling networks, which ultimately sheds light on the mechanisms leading to complex diseases or helps in the identification of drug targets. A variety of databases containing information on cell signalling pathways have been developed in conjunction with methodologies to access and analyse the data. In principle, the scenario is prepared to make the most of this information for the analysis of the dynamics of signalling pathways. However, are the knowledge repositories of signalling pathways ready to realize the systems biology promise? In this article we aim to initiate this discussion and to provide some insights on this issue.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The number of existing protein sequences spans a very small fraction of sequence space. Natural proteins have overcome a strong negative selective pressure to avoid the formation of insoluble aggregates. Stably folded globular proteins and intrinsically disordered proteins (IDPs) use alternative solutions to the aggregation problem. While in globular proteins folding minimizes the access to aggregation-prone regions, IDPs on average display large exposed contact areas. Here, we introduce the concept of average meta-structure correlation map to analyze sequence space. Using this novel conceptual view we show that representative ensembles of folded and ID proteins show distinct characteristics and respond differently to sequence randomization. By studying the way evolutionary constraints act on IDPs to disable a negative function (aggregation) we might gain insight into the mechanisms by which function-enabling information is encoded in IDPs.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We propose a novel multifactor dimensionality reduction method for epistasis detection in small or extended pedigrees, FAM-MDR. It combines features of the Genome-wide Rapid Association using Mixed Model And Regression approach (GRAMMAR) with Model-Based MDR (MB-MDR). We focus on continuous traits, although the method is general and can be used for outcomes of any type, including binary and censored traits. When comparing FAM-MDR with Pedigree-based Generalized MDR (PGMDR), which is a generalization of Multifactor Dimensionality Reduction (MDR) to continuous traits and related individuals, FAM-MDR was found to outperform PGMDR in terms of power, in most of the considered simulated scenarios. Additional simulations revealed that PGMDR does not appropriately deal with multiple testing and consequently gives rise to overly optimistic results. FAM-MDR adequately deals with multiple testing in epistasis screens and is in contrast rather conservative, by construction. Furthermore, simulations show that correcting for lower order (main) effects is of utmost importance when claiming epistasis. As Type 2 Diabetes Mellitus (T2DM) is a complex phenotype likely influenced by gene-gene interactions, we applied FAM-MDR to examine data on glucose area-under-the-curve (GAUC), an endophenotype of T2DM for which multiple independent genetic associations have been observed, in the Amish Family Diabetes Study (AFDS). This application reveals that FAM-MDR makes more efficient use of the available data than PGMDR and can deal with multi-generational pedigrees more easily. In conclusion, we have validated FAM-MDR and compared it to PGMDR, the current state-of-the-art MDR method for family data, using both simulations and a practical dataset. FAM-MDR is found to outperform PGMDR in that it handles the multiple testing issue more correctly, has increased power, and efficiently uses all available information.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Assessing the contribution of promoters and coding sequences to gene evolution is an important step toward discovering the major genetic determinants of human evolution. Many specific examples have revealed the evolutionary importance of cis-regulatory regions. However, the relative contribution of regulatory and coding regions to the evolutionary process and whether systemic factors differentially influence their evolution remains unclear. To address these questions, we carried out an analysis at the genome scale to identify signatures of positive selection in human proximal promoters. Next, we examined whether genes with positively selected promoters (Prom+ genes) show systemic differences with respect to a set of genes with positively selected protein-coding regions (Cod+ genes). We found that the number of genes in each set was not significantly different (8.1% and 8.5%, respectively). Furthermore, a functional analysis showed that, in both cases, positive selection affects almost all biological processes and only a few genes of each group are located in enriched categories, indicating that promoters and coding regions are not evolutionarily specialized with respect to gene function. On the other hand, we show that the topology of the human protein network has a different influence on the molecular evolution of proximal promoters and coding regions. Notably, Prom+ genes have an unexpectedly high centrality when compared with a reference distribution (P = 0.008, for Eigenvalue centrality). Moreover, the frequency of Prom+ genes increases from the periphery to the center of the protein network (P = 0.02, for the logistic regression coefficient). This means that gene centrality does not constrain the evolution of proximal promoters, unlike the case with coding regions, and further indicates that the evolution of proximal promoters is more efficient in the center of the protein network than in the periphery. 
These results show that proximal promoters have had a systemic contribution to human evolution by increasing the participation of central genes in the evolutionary process.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Chromosomal anomalies, such as Robertsonian and reciprocal translocations, represent a major problem in cattle breeding because their presence induces a well-documented fertility reduction in carrier subjects. In cattle, reciprocal translocations (RCPs, a chromosome abnormality caused by an exchange of material between nonhomologous chromosomes) are considered rare, as to date only 19 reciprocal translocations have been described. It is commonly accepted that Robertsonian translocations are the most common cytogenetic anomalies in cattle, probably owing to the existence of the endemic 1;29 Robertsonian translocation. However, these considerations are based on data obtained using techniques that are unable to identify all reciprocal translocations, and thus their frequency is clearly underestimated. The purpose of this work is to provide a first realistic estimate of the impact of RCPs in the cattle population studied, trying to eliminate the factors which have caused an underestimation of their frequency so far. We performed this work using a mathematical as well as a simulation approach and, as biological data, we considered the cytogenetic results obtained in the last 15 years. The results obtained show that only 16% of reciprocal translocations can be detected using simple Giemsa techniques and consequently they could be present in no less than 0.14% of cattle subjects, a frequency five times higher than that shown by de novo Robertsonian translocations. These data are useful to open a debate about the need to introduce a more efficient method to identify RCPs in cattle.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Background: The CWxP motif of transmembrane helix 6 (x: any residue) is highly conserved in class A GPCRs. Within this motif, W6.48 is a big star in the theory of the global “toggle switch” because of its key role in the activation mechanism of GPCRs upon ligand binding. With all footlights focused on W6.48, the reason why the preceding residue, C6.47, is largely conserved is still unknown. The present study aims to fill this gap by characterizing the role of C6.47 of the CWxP motif. Results: A complete analysis of available crystal structures has been made alongside molecular dynamics simulations of model peptides to explore a possible structural role for C6.47. Conclusions: We conclude that C6.47 does not modulate the conformation of the TM6 proline kink and propose that C6.47 participates in the rearrangement of the TM6 and TM7 interface accompanying activation.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Background: Research in epistasis or gene-gene interaction detection for human complex traits has grown over the last few years. It has been marked by promising methodological developments, improved translation efforts of statistical epistasis to biological epistasis and attempts to integrate different omics information sources into the epistasis screening to enhance power. The quest for gene-gene interactions poses severe multiple-testing problems. In this context, the maxT algorithm is one technique to control the false-positive rate. However, the memory needed by this algorithm rises linearly with the number of hypothesis tests. Gene-gene interaction studies will require memory proportional to the squared number of SNPs. A genome-wide epistasis search would therefore require terabytes of memory. Hence, cache problems are likely to occur, increasing the computation time. In this work we present a new version of maxT, requiring an amount of memory independent of the number of genetic effects to be investigated. This algorithm was implemented in C++ in our epistasis screening software MBMDR-3.0.3. We evaluate the new implementation in terms of memory efficiency and speed using simulated data. The software is illustrated on real-life data for Crohn’s disease. Results: In the case of a binary (affected/unaffected) trait, the parallel workflow of MBMDR-3.0.3 analyzes all gene-gene interactions with a dataset of 100,000 SNPs typed on 1000 individuals within 4 days and 9 hours, using 999 permutations of the trait to assess statistical significance, on a cluster composed of 10 blades, each containing four Quad-Core AMD Opteron(tm) Processor 2352 2.1 GHz. In the case of a continuous trait, a similar run takes 9 days. Our program found 14 SNP-SNP interactions with a multiple-testing corrected p-value of less than 0.05 on real-life Crohn’s disease (CD) data.
Conclusions: Our software is the first implementation of the MB-MDR methodology able to solve large-scale SNP-SNP interactions problems within a few days, without using much memory, while adequately controlling the type I error rates. A new implementation to reach genome-wide epistasis screening is under construction. In the context of Crohn’s disease, MBMDR-3.0.3 could identify epistasis involving regions that are well known in the field and could be explained from a biological point of view. This demonstrates the power of our software to find relevant phenotype-genotype higher-order associations.
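
The idea of a permutation-based maxT correction whose memory does not grow with the number of tests can be sketched in Python as follows. This is a simplified single-step variant written for illustration, not the algorithm implemented in MBMDR-3.0.3: the function names, the `top_k` parameter, the lazily evaluated `stat_stream` callback and the toy test statistic are all assumptions.

```python
import heapq
import random


def max_t_adjusted_pvalues(stat_stream, trait, n_perm, top_k, seed=0):
    """Single-step maxT adjusted p-values for the top_k observed statistics.

    stat_stream(trait) must lazily yield (test_id, statistic) for every
    test, so only top_k observed statistics and their counters are stored.
    """
    rng = random.Random(seed)
    # nlargest consumes the generator while holding at most top_k items.
    top = heapq.nlargest(top_k, stat_stream(trait), key=lambda item: item[1])
    exceed = [0] * len(top)
    permuted = list(trait)
    for _ in range(n_perm):
        rng.shuffle(permuted)
        # Only the maximum statistic over all tests is kept per permutation.
        perm_max = max(stat for _, stat in stat_stream(permuted))
        for i, (_, observed) in enumerate(top):
            if perm_max >= observed:
                exceed[i] += 1
    # The +1 terms count the observed data as one of the permutations.
    return [(test_id, observed, (count + 1) / (n_perm + 1))
            for (test_id, observed), count in zip(top, exceed)]


def make_stream(genotypes):
    """Hypothetical toy statistic: |mean trait of carriers - non-carriers|."""
    def stream(trait):
        for snp_id, column in genotypes.items():
            carriers = [t for g, t in zip(column, trait) if g > 0]
            others = [t for g, t in zip(column, trait) if g == 0]
            if carriers and others:
                diff = sum(carriers) / len(carriers) - sum(others) / len(others)
                yield snp_id, abs(diff)
    return stream


if __name__ == "__main__":
    # Toy data, not taken from the paper.
    genotypes = {"snp1": [0, 0, 0, 1, 1, 1], "snp2": [0, 1, 0, 1, 0, 1]}
    trait = [0, 0, 0, 1, 1, 1]
    for row in max_t_adjusted_pvalues(make_stream(genotypes), trait, 999, 2):
        print(row)
```

In this sketch the cost of bounded memory is that every permutation recomputes all test statistics, and adjusted p-values are only available for the `top_k` strongest observed signals.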

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The relationship between inflammation and cancer is well established in several tumor types, including bladder cancer. We performed an association study between 886 inflammatory-gene variants and bladder cancer risk in 1,047 cases and 988 controls from the Spanish Bladder Cancer (SBC)/EPICURO Study. A preliminary exploration with the widely used univariate logistic regression approach did not identify any significant SNP after correcting for multiple testing. We further applied two more comprehensive methods to capture the complexity of bladder cancer genetic susceptibility: Bayesian Threshold LASSO (BTL), a regularized regression method, and AUC-Random Forest (AUC-RF), a machine-learning algorithm. Both approaches explore the joint effect of markers. BTL analysis identified a signature of 37 SNPs in 34 genes showing an association with bladder cancer. AUC-RF detected an optimal predictive subset of 56 SNPs. Thirteen SNPs were identified by both methods in the total population. Using resources from the Texas Bladder Cancer study we were able to replicate 30% of the SNPs assessed. The associations between inflammatory SNPs and bladder cancer were reexamined among non-smokers to eliminate the effect of tobacco, one of the strongest and most prevalent environmental risk factors for this tumor. A 9-SNP signature was detected by BTL. Here we report, for the first time, a set of SNPs in inflammatory genes jointly associated with bladder cancer risk. These results highlight the importance of the complex structure of genetic susceptibility associated with cancer risk.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We have developed numerical simulations of three-dimensional suspensions of active particles to characterize the capabilities of the hydrodynamic stresses induced by active swimmers to promote global order and emergent structures in active suspensions. We have considered squirmer suspensions embedded in a fluid modeled under a Lattice Boltzmann scheme. We have found that active stresses play a central role in decorrelating the collective motion of squirmers and that contractile squirmers develop significant aggregates.