51 results for whole genome duplication


Relevance:

20.00% 20.00%

Publisher:

Abstract:

Genomic plasticity of human chromosome 8p23.1 region is highly influenced by two groups of complex segmental duplications (SDs), termed REPD and REPP, that mediate different kinds of rearrangements. Part of the difficulty to explain the wide range of phenotypes associated with 8p23.1 rearrangements is that REPP and REPD are not yet well characterized, probably due to their polymorphic status. Here, we describe a novel primate-specific gene family, named FAM90A (family with sequence similarity 90), found within these SDs. According to the current human reference sequence assembly, the FAM90A family includes 24 members along 8p23.1 region plus a single member on chromosome 12p13.31, showing copy number variation (CNV) between individuals. These genes can be classified into subfamilies I and II, which differ in their upstream and 5′-untranslated region sequences, but both share the same open reading frame and are ubiquitously expressed. Sequence analysis and comparative fluorescence in situ hybridization studies showed that FAM90A subfamily II suffered a big expansion in the hominoid lineage, whereas subfamily I members were likely generated sometime around the divergence of orangutan and African great apes by a fusion process. In addition, the analysis of the Ka/Ks ratios provides evidence of functional constraint of some FAM90A genes in all species. The characterization of the FAM90A gene family contributes to a better understanding of the structural polymorphism of the human 8p23.1 region and constitutes a good example of how SDs, CNVs and rearrangements within themselves can promote the formation of new gene sequences with potential functional consequences.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Background: We present the results of EGASP, a community experiment to assess the state-of-the-art in genome annotation within the ENCODE regions, which span 1% of the human genome sequence. The experiment had two major goals: the assessment of the accuracy of computational methods to predict protein coding genes; and the overall assessment of the completeness of the current human genome annotations as represented in the ENCODE regions. For the computational prediction assessment, eighteen groups contributed gene predictions. We evaluated these submissions against each other based on a ‘reference set’ of annotations generated as part of the GENCODE project. These annotations were not available to the prediction groups prior to the submission deadline, so that their predictions were blind and an external advisory committee could perform a fair assessment. Results: The best methods had at least one gene transcript correctly predicted for close to 70% of the annotated genes. Nevertheless, the multiple transcript accuracy, taking into account alternative splicing, reached only approximately 40% to 50% accuracy. At the coding nucleotide level, the best programs reached an accuracy of 90% in both sensitivity and specificity. Programs relying on mRNA and protein sequences were the most accurate in reproducing the manually curated annotations. Experimental validation shows that only a very small percentage (3.2%) of the selected 221 computationally predicted exons outside of the existing annotation could be verified. Conclusions: This is the first such experiment in human DNA, and we have followed the standards established in a similar experiment, GASP1, in Drosophila melanogaster. We believe the results presented here contribute to the value of ongoing large-scale annotation projects and should guide further experimental methods when being scaled up to the entire human genome sequence.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A number of experimental methods have been reported for estimating the number of genes in a genome, or the closely related coding density of a genome, defined as the fraction of base pairs in codons. Recently, DNA sequence data representative of the genome as a whole have become available for several organisms, making the problem of estimating coding density amenable to sequence analytic methods. Estimates of coding density for a single genome vary widely, so that methods with characterized error bounds have become increasingly desirable. We present a method to estimate the protein coding density in a corpus of DNA sequence data, in which a ‘coding statistic’ is calculated for a large number of windows of the sequence under study, and the distribution of the statistic is decomposed into two normal distributions, assumed to be the distributions of the coding statistic in the coding and noncoding fractions of the sequence windows. The accuracy of the method is evaluated using known data and application is made to the yeast chromosome III sequence and to C.elegans cosmid sequences. It can also be applied to fragmentary data, for example a collection of short sequences determined in the course of STS mapping.
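
The decomposition step described in this abstract can be illustrated with a small sketch. The abstract does not say which coding statistic is used or how the two normal distributions are fitted, so everything below is an assumption made for illustration: the window scores are simulated rather than computed from DNA, the fit uses a plain expectation-maximization loop, and the names simulate_window_scores and fit_two_normals are hypothetical. The estimated weight of the higher-scoring component plays the role of the coding density.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def simulate_window_scores(n_windows, coding_fraction, seed=0):
    """Hypothetical stand-in for a real coding statistic.

    Noncoding windows score around 0 and coding windows around 5 (both sd 1).
    These values are arbitrary and only serve to exercise the fit.
    """
    rng = random.Random(seed)
    n_coding = round(n_windows * coding_fraction)
    scores = [rng.gauss(5.0, 1.0) for _ in range(n_coding)]
    scores += [rng.gauss(0.0, 1.0) for _ in range(n_windows - n_coding)]
    rng.shuffle(scores)
    return scores


def fit_two_normals(values, iterations=200):
    """Fit a two-component normal mixture by expectation-maximization.

    Returns (weight_hi, (mean_lo, sd_lo), (mean_hi, sd_hi)), where weight_hi
    is the estimated fraction of windows in the higher-scoring component.
    """
    n = len(values)
    ordered = sorted(values)
    # Start the two means at the lower and upper quartiles.
    mean_lo, mean_hi = ordered[n // 4], ordered[(3 * n) // 4]
    overall_mean = sum(values) / n
    spread = math.sqrt(sum((v - overall_mean) ** 2 for v in values) / n) or 1.0
    sd_lo = sd_hi = spread
    weight_hi = 0.5

    for _ in range(iterations):
        # E-step: probability that each window belongs to the high component.
        resp = []
        for v in values:
            p_hi = weight_hi * normal_pdf(v, mean_hi, sd_hi)
            p_lo = (1.0 - weight_hi) * normal_pdf(v, mean_lo, sd_lo)
            total = p_hi + p_lo
            resp.append(p_hi / total if total > 0 else 0.5)

        # M-step: re-estimate the weight, means and standard deviations.
        sum_hi = sum(resp)
        sum_lo = n - sum_hi
        if sum_hi < 1e-9 or sum_lo < 1e-9:
            break  # one component has vanished; stop rather than divide by zero
        mean_hi = sum(r * v for r, v in zip(resp, values)) / sum_hi
        mean_lo = sum((1.0 - r) * v for r, v in zip(resp, values)) / sum_lo
        var_hi = sum(r * (v - mean_hi) ** 2 for r, v in zip(resp, values)) / sum_hi
        var_lo = sum((1.0 - r) * (v - mean_lo) ** 2 for r, v in zip(resp, values)) / sum_lo
        sd_hi = max(math.sqrt(var_hi), 1e-6)
        sd_lo = max(math.sqrt(var_lo), 1e-6)
        weight_hi = sum_hi / n

    return weight_hi, (mean_lo, sd_lo), (mean_hi, sd_hi)
```

Calling fit_two_normals(simulate_window_scores(2000, 0.3)) should return a weight close to the simulated coding fraction. With real sequence data the two components overlap far more than in this toy setup, which is why the abstract stresses characterized error bounds.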

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Selenocysteine (Sec) is co-translationally inserted into selenoproteins in response to codon UGA with the help of the selenocysteine insertion sequence (SECIS) element. The number of selenoproteins in animals varies, with humans having 25 and mice having 24 selenoproteins. To date, however, only one selenoprotein, thioredoxin reductase, has been detected in Caenorhabditis elegans, and this enzyme contains only one Sec. Here, we characterize the selenoproteomes of C.elegans and Caenorhabditis briggsae with three independent algorithms, one searching for pairs of homologous nematode SECIS elements, another searching for Cys- or Sec-containing homologs of potential nematode selenoprotein genes and the third identifying Sec-containing homologs of annotated nematode proteins. These methods suggest that thioredoxin reductase is the only Sec-containing protein in the C.elegans and C.briggsae genomes. In contrast, we identified additional selenoproteins in other nematodes. Assuming that Sec insertion mechanisms are conserved between nematodes and other eukaryotes, the data suggest that nematode selenoproteomes were reduced during evolution, and that in an extreme reduction case Sec insertion systems probably decode only a single UGA codon in C.elegans and C.briggsae genomes. In addition, all detected genes had a rare form of SECIS element containing a guanosine in place of a conserved adenosine present in most other SECIS structures, suggesting that in organisms with small selenoproteomes SECIS elements may change rapidly.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Background: Despite the continuous production of genome sequence for a number of organisms, reliable, comprehensive, and cost effective gene prediction remains problematic. This is particularly true for genomes for which there is not a large collection of known gene sequences, such as the recently published chicken genome. We used the chicken sequence to test comparative and homology-based gene-finding methods followed by experimental validation as an effective genome annotation method. Results: We performed experimental evaluation by RT-PCR of three different computational gene finders, Ensembl, SGP2 and TWINSCAN, applied to the chicken genome. A Venn diagram was computed and each component of it was evaluated. The results showed that de novo comparative methods can identify up to about 700 chicken genes with no previous evidence of expression, and can correctly extend about 40% of homology-based predictions at the 5' end. Conclusions: De novo comparative gene prediction followed by experimental verification is effective at enhancing the annotation of the newly sequenced genomes provided by standard homology-based methods.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Background: An excess of caffeine is cytotoxic to all eukaryotic cell types. We aim to study how cells become tolerant to a toxic dose of this drug, and the relationship between caffeine and oxidative stress pathways. Methodology/Principal Findings: We searched for Schizosaccharomyces pombe mutants with inhibited growth on caffeine-containing plates. We screened a collection of 2,700 haploid mutant cells, of which 98 were sensitive to caffeine. The genes mutated in these sensitive clones were involved in a number of cellular roles including the H2O2-induced Pap1 and Sty1 stress pathways, the integrity and calcineurin pathways, cell morphology and chromatin remodeling. We have investigated the role of the oxidative stress pathways in sensing and promoting survival to caffeine. The Pap1 and the Sty1 pathways are both required for normal tolerance to caffeine, but only the Sty1 pathway is activated by the drug. Cells lacking Pap1 are sensitive to caffeine due to the decreased expression of the efflux pump Hba2. Indeed, Δhba2 cells are sensitive to caffeine, and constitutive activation of the Pap1 pathway enhances resistance to caffeine in an Hba2-dependent manner. Conclusions/Significance: With our caffeine-sensitive, genome-wide screen of an S. pombe deletion collection, we have demonstrated the importance of some oxidative stress pathway components on wild-type tolerance to the drug.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Poor understanding of the spliceosomal mechanisms to select intronic 3' ends (3'ss) is a major obstacle to deciphering eukaryotic genomes. Here, we discern the rules for global 3'ss selection in yeast. We show that, in contrast to the uniformity of yeast splicing, the spliceosome uses all available 3'ss within a distance window from the intronic branch site (BS), and that in 70% of all possible 3'ss this is likely to be mediated by pre-mRNA structures. Our results reveal that one of these RNA folds acts as an RNA thermosensor, modulating alternative splicing in response to heat shock by controlling alternate 3'ss availability. Thus, our data point to a deeper role for the pre-mRNA in the control of its own fate, and to a simple mechanism for some alternative splicing.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this work we describe the usage of bilinear statistical models as a means of factoring the shape variability into two components attributed to inter-subject variation and to the intrinsic dynamics of the human heart. We show that it is feasible to reconstruct the shape of the heart at discrete points in the cardiac cycle. Provided we are given a small number of shape instances representing the same heart at different points in the same cycle, we can use the bilinear model to establish this. Using a temporal and a spatial alignment step in the preprocessing of the shapes, around half of the reconstruction errors were on the order of the axial image resolution of 2 mm, and over 90% was within 3.5 mm. From this, we conclude that the dynamics were indeed separated from the inter-subject variability in our dataset.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Background: Kabuki syndrome (KS) is a multiple congenital anomaly syndrome characterized by specific facial features, mild to moderate mental retardation, postnatal growth delay, skeletal abnormalities, and unusual dermatoglyphic patterns with prominent fingertip pads. A 3.5 Mb duplication at 8p23.1-p22 was once reported as a specific alteration in KS but has not been confirmed in other patients. The molecular basis of KS remains unknown. Methods: We have studied 16 Spanish patients with a clinical diagnosis of KS or KS-like to search for genomic imbalances using genome-wide array technologies. All putative rearrangements were confirmed by FISH, microsatellite markers and/or MLPA assays, which also determined whether the imbalance was de novo or inherited. Results: No duplication at 8p23.1-p22 was observed in our patients. We detected complex rearrangements involving 2q in two patients with Kabuki-like features: 1) a de novo inverted duplication of 11 Mb with a 4.5 Mb terminal deletion, and 2) a de novo 7.2 Mb-terminal deletion in a patient with an additional de novo 0.5 Mb interstitial deletion in 16p. Additional copy number variations (CNV), either inherited or reported in normal controls, were identified and interpreted as polymorphic variants. No specific CNV was significantly increased in the KS group. Conclusion: Our results further confirmed that genomic duplications of 8p23 region are not a common cause of KS and failed to detect other recurrent rearrangement causing this disorder. The detection of two patients with 2q37 deletions suggests that there is a phenotypic overlap between the two conditions, and screening this region in the Kabuki-like patients should be considered.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Several features that can be extracted from digital images of the sky and that can be useful for cloud-type classification of such images are presented. Some features are statistical measurements of image texture, some are based on the Fourier transform of the image and, finally, others are computed from the image where cloudy pixels are distinguished from clear-sky pixels. The use of the most suitable features in an automatic classification algorithm is also shown and discussed. Both the features and the classifier are developed over images taken by two different camera devices, namely, a total sky imager (TSI) and a whole sky imager (WSC), which are placed in two different areas of the world (Toowoomba, Australia; and Girona, Spain, respectively). The performance of the classifier is assessed by comparing its image classification with an a priori classification carried out by visual inspection of more than 200 images from each camera. The index of agreement is 76% when five different sky conditions are considered: clear, low cumuliform clouds, stratiform clouds (overcast), cirriform clouds, and mottled clouds (altocumulus, cirrocumulus). Discussion on the future directions of this research is also presented, regarding both the use of other features and the use of other classification techniques.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Tomato (Solanum lycopersicum) is a major crop plant and a model system for fruit development. Solanum is one of the largest angiosperm genera and includes annual and perennial plants from diverse habitats. Here we present a high-quality genome sequence of domesticated tomato, a draft sequence of its closest wild relative, Solanum pimpinellifolium, and compare them to each other and to the potato genome (Solanum tuberosum). The two tomato genomes show only 0.6% nucleotide divergence and signs of recent admixture, but show more than 8% divergence from potato, with nine large and several smaller inversions. In contrast to Arabidopsis, but similar to soybean, tomato and potato small RNAs map predominantly to gene-rich chromosomal regions, including gene promoters. The Solanum lineage has experienced two consecutive genome triplications: one that is ancient and shared with rosids, and a more recent one. These triplications set the stage for the neofunctionalization of genes controlling fruit characteristics, such as colour and fleshiness.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

BACKGROUND: DNA sequence polymorphisms analysis can provide valuable information on the evolutionary forces shaping nucleotide variation, and provides an insight into the functional significance of genomic regions. The recent ongoing genome projects will radically improve our capabilities to detect specific genomic regions shaped by natural selection. Current available methods and software, however, are unsatisfactory for such genome-wide analysis. RESULTS: We have developed methods for the analysis of DNA sequence polymorphisms at the genome-wide scale. These methods, which have been tested on a coalescent-simulated and actual data files from mouse and human, have been implemented in the VariScan software package version 2.0. Additionally, we have also incorporated a graphical-user interface. The main features of this software are: i) exhaustive population-genetic analyses including those based on the coalescent theory; ii) analysis adapted to the shallow data generated by the high-throughput genome projects; iii) use of genome annotations to conduct a comprehensive analyses separately for different functional regions; iv) identification of relevant genomic regions by the sliding-window and wavelet-multiresolution approaches; v) visualization of the results integrated with current genome annotations in commonly available genome browsers. CONCLUSION: VariScan is a powerful and flexible suite of software for the analysis of DNA polymorphisms. The current version implements new algorithms, methods, and capabilities, providing an important tool for an exhaustive exploratory analysis of genome-wide DNA polymorphism data.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Background: Chemoreception is a widespread mechanism that is involved in critical biologic processes, including individual and social behavior. The insect peripheral olfactory system comprises three major multigene families: the olfactory receptor (Or), the gustatory receptor (Gr), and the odorant-binding protein (OBP) families. Members of the latter family establish the first contact with the odorants, and thus constitute the first step in the chemosensory transduction pathway. Results: Comparative analysis of the OBP family in 12 Drosophila genomes allowed the identification of 595 genes that encode putative functional and nonfunctional members in extant species, with 43 gene gains and 28 gene losses (15 deletions and 13 pseudogenization events). The evolution of this family shows tandem gene duplication events, progressive divergence in DNA and amino acid sequence, and prevalence of pseudogenization events in external branches of the phylogenetic tree. We observed that the OBP arrangement in clusters is maintained across the Drosophila species and that purifying selection governs the evolution of the family; nevertheless, OBP genes differ in their functional constraints levels. Finally, we detect that the OBP repertoire evolves more rapidly in the specialist lineages of the Drosophila melanogaster group (D. sechellia and D. erecta) than in their closest generalists. Conclusion: Overall, the evolution of the OBP multigene family is consistent with the birth-and-death model. We also found that members of this family exhibit different functional constraints, which is indicative of some functional divergence, and that they might be involved in some of the specialization processes that occurred through the diversification of the Drosophila genus.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Background: Non-long terminal repeat (non-LTR) retrotransposons have contributed to shaping the structure and function of genomes. In silico and experimental approaches have been used to identify the non-LTR elements of the urochordate Ciona intestinalis. Knowledge of the types and abundance of non-LTR elements in urochordates is a key step in understanding their contribution to the structure and function of vertebrate genomes. Results: Consensus elements phylogenetically related to the I, LINE1, LINE2, LOA and R2 elements of the 14 eukaryotic non-LTR clades are described from C. intestinalis. The ascidian elements showed conservation of both the reverse transcriptase coding sequence and the overall structural organization seen in each clade. The apurinic/apyrimidinic endonuclease and nucleic-acid-binding domains encoded upstream of the reverse transcriptase, and the RNase H and the restriction enzyme-like endonuclease motifs encoded downstream of the reverse transcriptase were identified in the corresponding Ciona families. Conclusions: The genome of C. intestinalis harbors representatives of at least five clades of non-LTR retrotransposons. The copy number per haploid genome of each element is low, less than 100, far below the values reported for vertebrate counterparts but within the range for protostomes. Genomic and sequence analysis shows that the ascidian non-LTR elements are unmethylated and flanked by genomic segments with a gene density lower than average for the genome. The analysis provides valuable data for understanding the evolution of early chordate genomes and enlarges the view on the distribution of the non-LTR retrotransposons in eukaryotes.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The nucleoid-associated proteins Hha and YdgT repress the expression of the toxin α-hemolysin. An Escherichia coli mutant lacking these proteins overexpresses the toxin α-hemolysin encoded in the multicopy recombinant plasmid pANN202-312R. Unexpectedly, we observed that this mutant generated clones that no longer produced hemolysin (Hly-). Generation of Hly- clones was dependent upon the presence in the culture medium of the antibiotic kanamycin (km), a marker of the hha allele (hha::Tn5). Detailed analysis of different Hly- clones showed that recombination between partial IS91 sequences that flank the hly operon had occurred. A fluctuation test showed that the presence of km in the culture medium underlay the generation of these clones. A decrease of the km concentration from 25 mg/l to 12.5 mg/l abolished the appearance of Hly- derivatives. We considered as a working hypothesis that, when producing high levels of the toxin (combination of the hha ydgT mutations with the presence of the multicopy hemolytic plasmid pANN202-312R), the km concentration of 25 mg/l was subinhibitory and stimulated the recombination between adjacent IS91 flanking sequences. To further test this hypothesis, we analyzed the effect of subinhibitory km concentrations in the wild type E. coli strain MG1655 harboring the parental low copy number plasmid pHly152. At a km concentration of 5 mg/l, subinhibitory for strain MG1655 (pHly152), generation of Hly- clones could be readily detected. Similar results were also obtained when, instead of km, ampicillin was used. IS91 flanks several virulence determinants in different enteric bacterial pathogenic strains from E. coli and Shigella. The results presented here show that stress generated by exposure to subinhibitory antibiotic concentrations may result in rearrangements of the bacterial genome. Whereas some of these rearrangements may be deleterious, others may generate genotypes with increased virulence, which may resume infection.