998 results for CT-DNA
Abstract:
One of the first useful products from the human genome will be a set of predicted genes. Besides its intrinsic scientific interest, the accuracy and completeness of this data set are of considerable importance for human health and medicine. Though progress has been made on computational gene identification in terms of both methods and accuracy evaluation measures, most of the sequence sets in which the programs are tested are short genomic sequences, and there is concern that these accuracy measures may not extrapolate well to larger, more challenging data sets. Given the absence of experimentally verified large genomic data sets, we constructed a semiartificial test set comprising a number of short single-gene genomic sequences with randomly generated intergenic regions. This test set, which should still present an easier problem than real human genomic sequence, mimics the approximately 200 kb long BACs being sequenced. In our experiments with these longer genomic sequences, the accuracy of GENSCAN, one of the most accurate ab initio gene prediction programs, dropped significantly, although its sensitivity remained high. Conversely, the accuracy of similarity-based programs, such as GENEWISE, PROCRUSTES, and BLASTX, was not affected significantly by the presence of random intergenic sequence, but depended on the strength of the similarity to the protein homolog. As expected, the accuracy dropped if the models were built using more distant homologs, and we were able to quantitatively estimate this decline. However, the specificities of these techniques are still rather good even when the similarity is weak, which is a desirable characteristic for driving expensive follow-up experiments.
Our experiments suggest that though gene prediction will improve with every new protein that is discovered and through improvements in the current set of tools, we still have a long way to go before we can decipher the precise exonic structure of every gene in the human genome using purely computational methodology.
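The contrast reported above (sensitivity staying high while overall accuracy drops on long sequences) can be illustrated with a minimal sketch. It assumes the nucleotide-level definitions that are common in the gene-finding literature, where sensitivity is the fraction of annotated coding nucleotides that are predicted and specificity is the fraction of predicted coding nucleotides that are annotated. The abstract does not state which definitions were used, and the function names and coordinates below are hypothetical.

```python
def covered_positions(exons):
    """Return the set of positions covered by (start, end) exons, 1-based inclusive."""
    positions = set()
    for start, end in exons:
        positions.update(range(start, end + 1))
    return positions


def nucleotide_accuracy(annotated_exons, predicted_exons):
    """Return (sensitivity, specificity) at the nucleotide level.

    Assumed definitions (not taken from the abstract):
      sensitivity = TP / (TP + FN), specificity = TP / (TP + FP).
    """
    truth = covered_positions(annotated_exons)
    predicted = covered_positions(predicted_exons)
    true_positives = len(truth & predicted)
    sensitivity = true_positives / len(truth) if truth else 0.0
    specificity = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical case: the real exon is found, but a spurious exon is
    # also predicted in the intergenic region.
    annotated = [(101, 200)]
    predicted = [(101, 200), (501, 600)]
    print(nucleotide_accuracy(annotated, predicted))
```

In this toy case every annotated nucleotide is recovered, but half of the predicted nucleotides fall in intergenic sequence, so sensitivity stays at its maximum while specificity is halved.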
Abstract:
A number of experimental methods have been reported for estimating the number of genes in a genome, or the closely related coding density of a genome, defined as the fraction of base pairs in codons. Recently, DNA sequence data representative of the genome as a whole have become available for several organisms, making the problem of estimating coding density amenable to sequence analytic methods. Estimates of coding density for a single genome vary widely, so that methods with characterized error bounds have become increasingly desirable. We present a method to estimate the protein coding density in a corpus of DNA sequence data, in which a ‘coding statistic’ is calculated for a large number of windows of the sequence under study, and the distribution of the statistic is decomposed into two normal distributions, assumed to be the distributions of the coding statistic in the coding and noncoding fractions of the sequence windows. The accuracy of the method is evaluated using known data and application is made to the yeast chromosome III sequence and to C. elegans cosmid sequences. It can also be applied to fragmentary data, for example a collection of short sequences determined in the course of STS mapping.
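The decomposition step described above can be sketched as a two-component normal mixture fit. The abstract does not say how the decomposition is computed; expectation-maximisation (EM) is used here as one common choice, not as the authors' procedure. The sketch also assumes the per-window coding statistic has already been computed and that higher values indicate coding windows. All names are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_normals(values, iterations=200):
    """Fit a two-component normal mixture by EM.

    Returns (weight_hi, (mean_lo, sd_lo), (mean_hi, sd_hi)), where
    weight_hi is the estimated fraction of windows in the upper component.
    """
    ordered = sorted(values)
    n = len(ordered)
    # Crude initialisation: lower and upper halves of the sorted data.
    lower, upper = ordered[: n // 2], ordered[n // 2:]
    mean_lo = sum(lower) / len(lower)
    mean_hi = sum(upper) / len(upper)
    sd_lo = sd_hi = max(1e-6, (ordered[-1] - ordered[0]) / 4.0)
    weight_hi = 0.5
    for _ in range(iterations):
        # E-step: probability that each window belongs to the upper component.
        resp = []
        for x in values:
            p_hi = weight_hi * normal_pdf(x, mean_hi, sd_hi)
            p_lo = (1.0 - weight_hi) * normal_pdf(x, mean_lo, sd_lo)
            total = p_hi + p_lo
            resp.append(p_hi / total if total > 0.0 else 0.5)
        # M-step: re-estimate weights, means, and standard deviations.
        sum_hi = max(sum(resp), 1e-12)
        sum_lo = max(n - sum(resp), 1e-12)
        mean_hi = sum(r * x for r, x in zip(resp, values)) / sum_hi
        mean_lo = sum((1.0 - r) * x for r, x in zip(resp, values)) / sum_lo
        var_hi = sum(r * (x - mean_hi) ** 2 for r, x in zip(resp, values)) / sum_hi
        var_lo = sum((1.0 - r) * (x - mean_lo) ** 2 for r, x in zip(resp, values)) / sum_lo
        sd_hi = max(1e-6, math.sqrt(var_hi))
        sd_lo = max(1e-6, math.sqrt(var_lo))
        weight_hi = sum_hi / n
    return weight_hi, (mean_lo, sd_lo), (mean_hi, sd_hi)


if __name__ == "__main__":
    # Synthetic statistic values: 30% "coding" windows, 70% "noncoding".
    rng = random.Random(0)
    windows = [rng.gauss(2.0, 0.5) for _ in range(300)]
    windows += [rng.gauss(0.0, 0.5) for _ in range(700)]
    print(fit_two_normals(windows))
```

The fitted weight estimates the fraction of windows assigned to the coding component. Turning that into a coding density in base pairs would need the window length and a treatment of windows that are only partly coding, neither of which the abstract describes.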
Abstract:
The vast majority of the biology of a newly sequenced genome is inferred from the set of encoded proteins. Predicting this set is therefore invariably the first step after the completion of the genome DNA sequence. Here we review the main computational pipelines used to generate the human reference protein-coding gene sets.
Abstract:
Background: Despite the continuous production of genome sequence for a number of organisms, reliable, comprehensive, and cost-effective gene prediction remains problematic. This is particularly true for genomes for which there is not a large collection of known gene sequences, such as the recently published chicken genome. We used the chicken sequence to test comparative and homology-based gene-finding methods followed by experimental validation as an effective genome annotation method. Results: We performed experimental evaluation by RT-PCR of three different computational gene finders, Ensembl, SGP2 and TWINSCAN, applied to the chicken genome. A Venn diagram was computed and each component of it was evaluated. The results showed that de novo comparative methods can identify up to about 700 chicken genes with no previous evidence of expression, and can correctly extend about 40% of homology-based predictions at the 5' end. Conclusions: De novo comparative gene prediction followed by experimental verification is effective at enhancing the annotation of the newly sequenced genomes provided by standard homology-based methods.
Abstract:
GLUT2 expression is reduced in the pancreatic beta-cells of several diabetic animals. The transcriptional control of the gene in beta-cells involves at least two islet-specific DNA-binding proteins, GTIIa and PDX-1, which also transactivates the insulin, somatostatin and glucokinase genes. In this report, we assessed the DNA-binding activities of GTIIa and PDX-1 to their respective cis-elements of the GLUT2 promoter using nuclear extracts prepared from pancreatic islets of 12-week-old db/db diabetic mice. We show that the decreased GLUT2 mRNA expression correlates with a decrease of the GTIIa DNA-binding activity, whereas the PDX-1 binding activity is increased. In these diabetic animals, insulin mRNA expression remains normal. The addition of dexamethasone to isolated pancreatic islets, a treatment previously shown to decrease PDX-1 expression in the insulin-secreting HIT-T15 cells, has no effect on the GTIIa and PDX-1 DNA-binding activities. These data suggest that the decreased activity of GTIIa, in contrast to PDX-1, may be a major initial step in the development of the beta-cell dysfunction in this model of diabetes.
Abstract:
FANCM binds and remodels replication fork structures in vitro. We report that in vivo, FANCM controls DNA chain elongation in an ATPase-dependent manner. In the presence of replication inhibitors that do not damage DNA, FANCM counteracts fork movement, possibly by remodelling fork structures. Conversely, through damaged DNA, FANCM promotes replication and recovers stalled forks. Hence, the impact of FANCM on fork progression depends on the underlying hindrance. We further report that signalling through the checkpoint effector kinase Chk1 prevents FANCM from degradation by the proteasome after exposure to DNA damage. FANCM also acts in a feedback loop to stabilize Chk1. We propose that FANCM is a ringmaster in the response to replication stress by physically altering replication fork structures and by providing a tight link to S-phase checkpoint signalling.
Abstract:
MicroRNAs (miRNA) are recognized posttranscriptional gene repressors involved in the control of almost every biological process. Allelic variants in these regions may be an important source of phenotypic diversity and contribute to disease susceptibility. We analyzed the genomic organization of 325 human miRNAs (release 7.1, miRBase) to construct a panel of 768 single-nucleotide polymorphisms (SNPs) covering approximately 1 Mb of genomic DNA, including 131 isolated miRNAs (40%) and 194 miRNAs arranged in 48 miRNA clusters, as well as their 5-kb flanking regions. Of these miRNAs, 37% were inside known protein-coding genes, which were significantly associated with biological functions regarding neurological, psychological or nutritional disorders. SNP coverage analysis revealed a lower SNP density in miRNAs compared with the average of the genome, with only 24 SNPs located in the 325 miRNAs studied. Further genotyping of 340 unrelated Spanish individuals showed that more than half of the SNPs in miRNAs were either rare or monomorphic, in agreement with the reported selective constraint on human miRNAs. A comparison of the minor allele frequencies between Spanish and HapMap population samples confirmed the applicability of this SNP panel to the study of complex disorders among the Spanish population, and revealed two miRNA regions, hsa-mir-26a-2 in the CTDSP2 gene and hsa-mir-128-1 in the R3HDM1 gene, showing geographical allelic frequency variation among the four HapMap populations, probably because of differences in natural selection. The designed miRNA SNP panel could help to identify still hidden links between miRNAs and human disease.
Abstract:
Background: There is increasing evidence that impairment of mitochondrial energy metabolism plays an important role in the pathophysiology of autism spectrum disorders (ASD; OMIM number: 209850). A significant proportion of ASD cases display biochemical alterations suggestive of mitochondrial dysfunction and several studies have reported that mutations in the mitochondrial DNA (mtDNA) molecule could be involved in the disease phenotype. Methods: We analysed a cohort of 148 patients with idiopathic ASD for a number of mutations proposed in the literature as pathogenic in ASD. We also carried out a case-control association study for the most common European haplogroups (hgs) and their diagnostic single nucleotide polymorphisms (SNPs) by comparing cases with 753 healthy and ethnically matched controls. Results: We did not find statistical support for an association between mtDNA mutations or polymorphisms and ASD. Conclusions: Our results are compatible with the idea that mtDNA mutations are not a relevant cause of ASD and the frequent observation of concomitant mitochondrial dysfunction and ASD could be due to nuclear factors influencing mitochondrion functions or to a more complex interplay between the nucleus and the mitochondrion/mtDNA.
Abstract:
Patient-specific simulations of the hemodynamics in intracranial aneurysms can be constructed by using image-based vascular models and CFD techniques. This work evaluates the impact of the choice of imaging technique on these simulations.
Abstract:
Shrews of the genus Sorex are characterized by a Holarctic distribution, and relationships among extant taxa have never been fully resolved. Phylogenies have been proposed based on morphological, karyological, and biochemical comparisons, but these analyses often produced controversial and contradictory results. Phylogenetic analyses of partial mitochondrial cytochrome b gene sequences (1011 bp) were used to examine the relationships among 27 Sorex species. The molecular data suggest that Sorex comprises two major monophyletic lineages, one restricted mostly to the New World and one with a primarily Palearctic distribution. Furthermore, several sister-species relationships are revealed by the analysis. Based on the split between the Soricinae and Crocidurinae subfamilies, we used a 95% confidence interval for both the calibration of a molecular clock and the subsequent calculation of major diversification events within the genus Sorex. Our analysis does not support an unambiguous acceleration of the molecular clock in shrews, the estimated rate being similar to other estimates of mammalian mitochondrial clocks. In addition, the data presented here indicate that estimates from the fossil record greatly underestimate divergence dates among Sorex taxa.
Abstract:
Urine samples from 20 male volunteers of European Caucasian origin were stored at 4 °C over a 4-month period in order to compare the identification potential of nuclear DNA (nDNA) and mitochondrial DNA (mtDNA) markers. The amount of nDNA recovered from urines dramatically declined over time. Consequently, nDNA likelihood ratios (LRs) greater than 1,000 were obtained for 100, 70 and 55% of the urines analysed after 6, 60 and 120 days, respectively. For the mtDNA, HVI and HVII sequences were obtained for all samples tested, whatever the period considered. Nevertheless, the highest mtDNA LR of 435 was relatively low compared to its nDNA equivalent. Indeed, LRs obtained with only three nDNA loci could easily exceed this value and are considerably easier to obtain. Overall, the joint use of nDNA and mtDNA markers enabled the 20 urine samples to be identified, even after the 4-month period.