922 results for DNA-microarray data
Abstract:
A number of experimental methods have been reported for estimating the number of genes in a genome, or the closely related coding density of a genome, defined as the fraction of base pairs in codons. Recently, DNA sequence data representative of the genome as a whole have become available for several organisms, making the problem of estimating coding density amenable to sequence analytic methods. Estimates of coding density for a single genome vary widely, so that methods with characterized error bounds have become increasingly desirable. We present a method to estimate the protein coding density in a corpus of DNA sequence data, in which a ‘coding statistic’ is calculated for a large number of windows of the sequence under study, and the distribution of the statistic is decomposed into two normal distributions, assumed to be the distributions of the coding statistic in the coding and noncoding fractions of the sequence windows. The accuracy of the method is evaluated using known data, and the method is applied to the yeast chromosome III sequence and to C. elegans cosmid sequences. It can also be applied to fragmentary data, for example a collection of short sequences determined in the course of STS mapping.
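The decomposition step described above can be illustrated with a small expectation-maximization (EM) fit of a two-component normal mixture. This is only a sketch under stated assumptions: the abstract does not say how the decomposition is fitted, EM is swapped in here as one common choice, and the simulated statistic values (means 0 and 4, unit variance) and all function names are hypothetical. The mixing weight of the higher-mean component plays the role of the estimated coding fraction of the windows.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_normals(values, n_iter=200):
    """EM fit of a two-component normal mixture.

    Returns a list of (weight, mean, sd) tuples, one per component.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Crude initialisation: quartiles as means, a quarter of the range as sd.
    means = [ordered[n // 4], ordered[(3 * n) // 4]]
    spread = (ordered[-1] - ordered[0]) / 4.0 or 1.0
    sds = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each window.
        resp = []
        for x in values:
            p = [weights[k] * normal_pdf(x, means[k], sds[k]) for k in range(2)]
            total = (p[0] + p[1]) or 1e-300
            resp.append((p[0] / total, p[1] / total))
        # M-step: re-estimate weight, mean and sd of each component.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            sds[k] = max(math.sqrt(var), 1e-6)
    return [(weights[k], means[k], sds[k]) for k in range(2)]


def simulate_windows(n_noncoding, n_coding, seed=0):
    """Hypothetical coding-statistic values: noncoding ~ N(0, 1), coding ~ N(4, 1)."""
    rng = random.Random(seed)
    values = [rng.gauss(0.0, 1.0) for _ in range(n_noncoding)]
    values += [rng.gauss(4.0, 1.0) for _ in range(n_coding)]
    return values
```

Calling `fit_two_normals(simulate_windows(700, 300))` and taking the component with the larger mean gives an estimate of the fraction of coding windows in the simulated data.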
Abstract:
Shrews of the genus Sorex are characterized by a Holarctic distribution, and relationships among extant taxa have never been fully resolved. Phylogenies have been proposed based on morphological, karyological, and biochemical comparisons, but these analyses often produced controversial and contradictory results. Phylogenetic analyses of partial mitochondrial cytochrome b gene sequences (1011 bp) were used to examine the relationships among 27 Sorex species. The molecular data suggest that Sorex comprises two major monophyletic lineages, one restricted mostly to the New World and one with a primarily Palearctic distribution. Furthermore, several sister-species relationships are revealed by the analysis. Based on the split between the Soricinae and Crocidurinae subfamilies, we used a 95% confidence interval for both the calibration of a molecular clock and the subsequent calculation of major diversification events within the genus Sorex. Our analysis does not support an unambiguous acceleration of the molecular clock in shrews, the estimated rate being similar to other estimates of mammalian mitochondrial clocks. In addition, the data presented here indicate that estimates from the fossil record greatly underestimate divergence dates among Sorex taxa.
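The clock calibration described above rests on simple arithmetic: a dated split fixes a substitution rate, and that rate converts other pairwise distances into divergence times. The sketch below assumes a strict clock and propagates only lower and upper bounds on the calibration age; it is not the authors' confidence-interval procedure, and the function names and any numbers passed to them are hypothetical.

```python
def calibrate_rate(pairwise_distance, split_age_myr):
    """Substitutions per site per Myr per lineage, from a dated split."""
    return pairwise_distance / (2.0 * split_age_myr)


def divergence_time_myr(pairwise_distance, rate):
    """Divergence time in Myr under a strict clock with the given per-lineage rate."""
    return pairwise_distance / (2.0 * rate)


def divergence_interval(pairwise_distance, calib_distance, age_low_myr, age_high_myr):
    """Date range implied by lower and upper bounds on the calibration age."""
    fast = calibrate_rate(calib_distance, age_low_myr)   # younger split implies a faster rate
    slow = calibrate_rate(calib_distance, age_high_myr)  # older split implies a slower rate
    return (divergence_time_myr(pairwise_distance, fast),
            divergence_time_myr(pairwise_distance, slow))
```

The factor of 2 reflects that the pairwise distance accumulates along both lineages since the split.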
Abstract:
BACKGROUND: Little information is available on resistance to anti-malarial drugs in the Solomon Islands (SI). The analysis of single nucleotide polymorphisms (SNPs) in drug-resistance-associated parasite genes is a potential alternative to classical time- and resource-consuming in vivo studies to monitor drug resistance. Mutations in pfmdr1 and pfcrt were shown to indicate chloroquine (CQ) resistance, mutations in pfdhfr and pfdhps indicate sulphadoxine-pyrimethamine (SP) resistance, and mutations in pfATPase6 indicate resistance to artemisinin derivatives. METHODS: The relationship between the rate of treatment failure among 25 symptomatic Plasmodium falciparum-infected patients presenting at the clinic and the pattern of resistance-associated SNPs in P. falciparum infecting 76 asymptomatic individuals from the surrounding population was investigated. The study was conducted in the SI in 2004. Patients presenting at a local clinic with microscopically confirmed P. falciparum malaria were recruited and treated with CQ+SP. Rates of treatment failure were estimated during a 28-day follow-up period. In parallel, DNA microarray technology was used to analyse mutations associated with resistance to CQ, SP, and artemisinin derivatives among samples from the asymptomatic community. Mutation and haplotype frequencies were determined, as well as the multiplicity of infection. RESULTS: The in vivo study showed an efficacy of 88% for CQ+SP in treating P. falciparum infections. DNA microarray analyses indicated a low diversity in the parasite population, with one major haplotype present in 98.7% of the cases. It was composed of fixed mutations at position 86 in pfmdr1, positions 72, 75, 76, 220, 326 and 356 in pfcrt, and positions 59 and 108 in pfdhfr. No mutation was observed in pfdhps or in pfATPase6. The mean multiplicity of infection was 1.39. CONCLUSION: This work provides the first insight into drug resistance markers of P. falciparum in the SI. The results indicated the presence of a very homogeneous P. falciparum population circulating in the community. Although CQ+SP could still clear most infections, seven fixed mutations associated with CQ resistance and two fixed mutations related to SP resistance were observed. Whether the absence of mutations in pfATPase6 indicates the efficacy of artemisinin derivatives remains to be proven.
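As an illustration of how a mean multiplicity of infection can be computed from genotyping calls, the sketch below uses one common estimator: the minimum number of clones in a sample is taken as the largest number of distinct alleles seen at any single locus. The abstract does not state which estimator the study used, so this is an assumption, and the locus keys, allele letters and function names are illustrative only.

```python
def sample_moi(alleles_by_locus):
    """Minimum number of clones consistent with one sample.

    alleles_by_locus maps a locus name to the list of alleles detected there;
    the estimate is the largest count of distinct alleles at any locus.
    """
    return max(len(set(alleles)) for alleles in alleles_by_locus.values())


def mean_moi(samples):
    """Mean multiplicity of infection over a list of genotyped samples."""
    return sum(sample_moi(s) for s in samples) / len(samples)
```

A sample in which every locus shows a single allele contributes 1, while a sample with two alleles at any locus contributes at least 2.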
Abstract:
In order to contribute to the debate about southern glacial refugia used by temperate species and more northern refugia used by boreal or cold-temperate species, we examined the phylogeography of a widespread snake species (Vipera berus) inhabiting Europe up to the Arctic Circle. The analysis of the mitochondrial DNA (mtDNA) sequence variation in 1043 bp of the cytochrome b gene and in 918 bp of the noncoding control region was performed with phylogenetic approaches. Our results suggest that both the duplicated control region and cytochrome b evolve at a similar rate in this species. Phylogenetic analysis showed that V. berus is divided into three major mitochondrial lineages, probably resulting from an Italian, a Balkan and a Northern (from France to Russia) refugial area in Eastern Europe, near the Carpathian Mountains. In addition, the Northern clade presents an important substructure, suggesting two sequential colonization events in Europe. First, the continent was colonized from the three main refugial areas mentioned above during the Lower-Mid Pleistocene. Second, recolonization of most of Europe most likely originated from several refugia located outside of the Mediterranean peninsulas (Carpathian region, east of the Carpathians, France and possibly Hungary) during the Mid-Late Pleistocene, while populations within the Italian and Balkan Peninsulas fluctuated only slightly in distribution range, with larger lowland populations during glacial times and with refugial mountain populations during interglacials, as in the present time. The phylogeographical structure revealed in our study suggests complex recolonization dynamics of the European continent by V. berus, characterized by latitudinal as well as altitudinal range shifts, driven by both climatic changes and competition with related species.
Abstract:
DnaSP is a software package for a comprehensive analysis of DNA polymorphism data. Version 5 implements a number of new features and analytical methods allowing extensive DNA polymorphism analyses on large datasets. Among other features, the newly implemented methods allow for: (i) analyses on multiple data files; (ii) haplotype phasing; (iii) analyses on insertion/deletion polymorphism data; (iv) visualizing sliding window results integrated with available genome annotations in the UCSC browser.
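To make the idea of a sliding-window polymorphism analysis concrete, the sketch below computes nucleotide diversity (the average number of pairwise differences per site) in successive windows of an alignment. It is not DnaSP code and does not use any DnaSP interface; it assumes equal-length aligned sequences with no gaps or missing data, and the function names are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site for equal-length aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    if not pairs or length == 0:
        return 0.0
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def sliding_pi(seqs, window, step):
    """Return (window start, nucleotide diversity) for each full window."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start, nucleotide_diversity(chunk)))
    return results
```

Windows that do not fit entirely within the alignment are skipped, so a trailing partial window is ignored.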
Abstract:
In this work, we propose a copula-based method to generate synthetic gene expression data that account for marginal and joint probability distributions features captured from real data. Our method allows us to implant significant genes in the synthetic dataset in a controlled manner, giving the possibility of testing new detection algorithms under more realistic environments.
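A copula separates each gene's marginal distribution from the dependence between genes, which is what allows both to be carried over into synthetic data. The sketch below uses a Gaussian copula with empirical marginals as one possible instance; the abstract does not say which copula family the authors use, so that choice, along with every function name here, is an assumption. It also omits the step of implanting significant genes.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def normal_scores(column):
    """Map values to standard-normal scores through their ranks."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def correlation(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def sample_gaussian_copula(genes, n_samples, seed=0):
    """Draw synthetic samples; genes is a list of columns, one per gene."""
    rng = random.Random(seed)
    d = len(genes)
    scores = [normal_scores(col) for col in genes]
    corr = [[1.0 if i == j else correlation(scores[i], scores[j])
             for j in range(d)] for i in range(d)]
    lower = cholesky(corr)
    sorted_cols = [sorted(col) for col in genes]
    samples = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Correlated normals, then uniforms, then empirical quantiles per gene.
        correlated = [sum(lower[i][k] * z[k] for k in range(i + 1)) for i in range(d)]
        row = []
        for i in range(d):
            u = STD_NORMAL.cdf(correlated[i])
            idx = min(int(u * len(sorted_cols[i])), len(sorted_cols[i]) - 1)
            row.append(sorted_cols[i][idx])
        samples.append(row)
    return samples
```

Because the marginals are empirical, every synthetic value is one that was observed for that gene; a smoothed or parametric marginal would be needed to generate unseen values.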
Abstract:
Background: We use an approach based on Factor Analysis to analyze datasets generated for transcriptional profiling. The method groups samples into biologically relevant categories and enables the identification of the genes and pathways most significantly associated with each phenotypic group, while allowing a given gene to participate in more than one cluster. Genes assigned to each cluster are used to detect the pathways predominantly activated in that cluster by finding statistically significant associated GO terms. We tested the approach with a published dataset of microarray experiments in yeast. Upon validation with the yeast dataset, we applied the technique to a prostate cancer dataset. Results: Two major pathways are shown to be activated in organ-confined, non-metastatic prostate cancer: those regulated by the androgen receptor and by receptor tyrosine kinases. A number of gene markers (HER3, IQGAP2 and POR1) highlighted by the software and related to the latter pathway have been validated experimentally a posteriori on independent samples. Conclusion: Using a new microarray analysis tool followed by a posteriori experimental validation of the results, we have confirmed several putative markers of malignancy associated with peptide growth factor signalling in prostate cancer and revealed others, most notably ERBB3 (HER3). Our study suggests that, in primary prostate cancer, HER3, with or without HER4, rather than receptor complexes involving HER2, could play an important role in the biology of these tumors. These results provide new evidence for the role of receptor tyrosine kinases in the establishment and progression of prostate cancer.
Abstract:
Paracoccidioides brasiliensis is a thermally dimorphic fungus, and causes the most prevalent systemic mycosis in Latin America. Infection is initiated by inhalation of conidia or mycelial fragments by the host, followed by further differentiation into the yeast form. Information regarding gene expression by either form has rarely been addressed with respect to multiple time points of growth in culture. Here, we report on the construction of a genomic DNA microarray, covering approximately 25% of the genome of the organism, and its utilization in identifying genes and gene expression patterns during growth in vitro. Cloned, amplified inserts from randomly sheared genomic DNA (gDNA) and known control genes were printed onto glass slides to generate a microarray of over 12 000 elements. To examine gene expression, mRNA was extracted and amplified from mycelial or yeast cultures grown in semi-defined medium for 5, 8 and 14 days. Principal components analysis and hierarchical clustering indicated that yeast gene expression profiles differed greatly from those of mycelia, especially at earlier time points, and that mycelial gene expression changed less than gene expression in yeasts over time. Genes upregulated in yeasts were found to encode proteins shown to be involved in methionine/cysteine metabolism, respiratory and metabolic processes (of sugars, amino acids, proteins and lipids), transporters (small peptides, sugars, ions and toxins), regulatory proteins and transcription factors. Mycelial genes involved in processes such as cell division, protein catabolism, nucleotide biosynthesis and toxin and sugar transport showed differential expression. Sequenced clones were compared with Histoplasma capsulatum and Coccidioides posadasii genome sequences to assess potentially common pathways across species, such as sulfur and lipid metabolism, amino acid transporters, transcription factors and genes possibly related to virulence. 
We also analysed gene expression with time in culture and found that while transposable elements and components of respiratory pathways tended to increase in expression with time, genes encoding ribosomal structural proteins and protein catabolism tended to sharply decrease in expression over time, particularly in yeast. These findings expand our knowledge of the different morphological forms of P. brasiliensis during growth in culture.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Background: The search for enriched (also known as over-represented or enhanced) ontology terms in a list of genes obtained from microarray experiments is becoming a standard procedure for system-level analysis. This procedure summarizes the information by focusing on classification schemes such as Gene Ontology and KEGG pathways instead of on individual genes. Although it is well known in statistics that association and significance are distinct concepts, only the latter approach has been used to deal with the ontology term enrichment problem. Results: BayGO implements a Bayesian approach to search for enriched terms from microarray data. The R source code is freely available at http://blasto.iq.usp.br/~tkoide/BayGO in three versions: Linux, which can be easily incorporated into pre-existing pipelines; Windows, to be controlled interactively; and as a web tool. The software was validated using a bacterial heat shock response dataset, since this stress triggers known system-level responses. Conclusion: The Bayesian model accounts for the fact that not all the genes from a given category may be observable in microarray data, owing to low signal intensity, quality filters, genes that were not spotted, and so on. Moreover, BayGO allows one to measure the statistical association between generic ontology terms and differential expression, instead of working only with the common significance analysis.
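The difference between association and significance can be seen with a 2x2 table of category membership against differential expression. The sketch below is not BayGO's Bayesian model; it simply contrasts a standard hypergeometric tail probability (significance) with an odds ratio (one measure of association), and the function names and parameter letters are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k): chance of at least k category genes among n selected genes,
    when K of the N genes on the array belong to the category."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def odds_ratio(k, K, n, N):
    """Odds ratio of the 2x2 table (category membership vs. differential expression)."""
    a = k              # in category, differentially expressed
    b = K - k          # in category, not differentially expressed
    c = n - k          # outside category, differentially expressed
    d = N - K - n + k  # outside category, not differentially expressed
    return (a * d) / (b * c)
```

With `k=3, K=5, n=5, N=20` and with every count multiplied by ten, the odds ratio is the same while the tail probability shrinks sharply, which is why a p-value alone does not measure the strength of association.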
Abstract:
Background: Smallpox is a lethal disease that was endemic in many parts of the world until it was eradicated by massive immunization. Because of its lethality, there are serious concerns about its use as a bioweapon. Here we analyze publicly available microarray data to further understand the survival of smallpox-infected macaques, using systems biology approaches. Our goal is to improve knowledge of the progression of this disease. Results: We used KEGG pathway annotations to define groups of genes (or modules), and subsequently compared them to macaque survival times. This technique provided additional insights into the host response to this disease, such as increased expression of cytokines and ECM receptors in the individuals with longer survival times. These results suggest that these gene groups could influence an effective host response to smallpox. Conclusion: Macaques with longer survival times clearly express some specific pathways not previously identified using regular gene-by-gene approaches. Our work also shows how third-party analysis of public datasets can be important in supporting new hypotheses for relevant biological problems.