3 results for Submarine micro-geomorphic data

in University of Queensland eSpace - Australia


Relevance:

100.00%

Publisher:

Abstract:

Background: Current methods to find significantly under- and over-represented gene ontology (GO) terms in a set of genes consider the genes as equally probable balls in a bag, as may be appropriate for transcripts in microarray data. However, due to the varying length of genes and intergenic regions, that approach is inappropriate for deciding if any GO terms are correlated with a set of genomic positions. Results: We present an algorithm - GONOME - that can determine which GO terms are significantly associated with a set of genomic positions given a genome annotated with (at least) the starts and ends of genes. We show that certain GO terms may appear to be significantly associated with a set of randomly chosen positions in the human genome if gene lengths are not considered, and that these same terms have been reported as significantly over-represented in a number of recent papers. This apparent over-representation disappears when gene lengths are considered, as GONOME does. For example, we show that, when gene length is taken into account, the term 'development' is not significantly enriched in genes associated with human CpG islands, in contradiction to a previous report. We further demonstrate the efficacy of GONOME by showing that occurrences of the proteasome-associated control element (PACE) upstream activating sequence in the S. cerevisiae genome associate significantly with appropriate GO terms. An extension of this approach yields a whole-genome motif discovery algorithm that allows identification of many other promoter sequences linked to different types of genes, including a large group of previously unknown motifs significantly associated with the terms 'translation' and 'translational elongation'. Conclusion: GONOME is an algorithm that correctly extracts over-represented GO terms from a set of genomic positions. By explicitly considering gene size, GONOME avoids a systematic bias toward GO terms linked to large genes. Inappropriate use of existing algorithms that do not take gene size into account has led to erroneous or suspect conclusions. Reciprocally, GONOME may be used to identify new features in genomes that are significantly associated with particular categories of genes.
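The gene-length bias described in this abstract can be illustrated with a small sketch. This is not the GONOME algorithm itself, whose details the abstract does not give. It is a toy comparison under stated assumptions: the gene names, lengths, and annotations are hypothetical, only positions that fall inside genes are considered, genes do not overlap, and a one-sided binomial test stands in for whatever statistic GONOME actually uses. The "balls in a bag" null gives every gene the same chance of being hit, while the length-aware null gives each gene a chance proportional to its length.

```python
from math import comb

def binom_sf(k, n, p):
    """Upper tail P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Hypothetical toy annotation: gene -> (length in bp, set of GO terms).
GENES = {
    "geneA": (90_000, {"development"}),
    "geneB": (5_000, {"translation"}),
    "geneC": (3_000, {"translation"}),
    "geneD": (2_000, {"translation"}),
}

def term_probabilities(genes, term):
    """Chance that one random position hits a gene carrying `term`.

    Returns (p_count, p_length): the equal-weight "balls in a bag"
    probability and the length-weighted probability.
    """
    total_len = sum(length for length, _ in genes.values())
    lengths_with_term = [
        length for length, terms in genes.values() if term in terms
    ]
    p_count = len(lengths_with_term) / len(genes)
    p_length = sum(lengths_with_term) / total_len
    return p_count, p_length

def enrichment_pvalues(genes, term, hits, n_positions):
    """One-sided p-values for `hits` of `n_positions` landing on `term`."""
    p_count, p_length = term_probabilities(genes, term)
    return (
        binom_sf(hits, n_positions, p_count),
        binom_sf(hits, n_positions, p_length),
    )

# 18 of 20 random positions fall in the single long 'development' gene.
naive_p, length_aware_p = enrichment_pvalues(GENES, "development", 18, 20)
print(f"naive p = {naive_p:.3g}, length-aware p = {length_aware_p:.3g}")
```

In this toy genome the long gene covers 90% of the genic sequence, so 18 hits out of 20 is unremarkable under the length-aware null but looks strongly enriched when every gene is weighted equally.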

Relevance:

30.00%

Publisher:

Abstract:

The proteome of bovine milk is dominated by just six gene products that constitute approximately 95% of milk protein. Nonetheless, over 150 protein spots can be readily detected following two-dimensional electrophoresis of whole milk. Many of these represent isoforms of the major gene products produced through extensive posttranslational modification. Peptide mass fingerprinting of in-gel tryptic digests (using matrix-assisted laser desorption/ionization-time of flight mass spectrometry (MALDI-TOF MS) in reflectron mode with alpha-cyano-4-hydroxycinnamic acid as the matrix) identified 10 forms of κ-casein with isoelectric point (pI) values from 4.47 to 5.81, but could not distinguish between them. MALDI-TOF MS in linear mode, using sinapinic acid as the matrix, revealed a large tryptic peptide (mass > 5990 Da) derived from the C-terminus that contained all the known sites of genetic variance, phosphorylation and glycosylation. Two genetic variants present as singly or doubly phosphorylated forms could be distinguished using mass data alone. Glycoforms containing a single acidic tetrasaccharide were also identified. The differences in electrophoretic mobility of these isoforms were consistent with the addition of the acidic groups. While more extensively glycosylated forms were also observed, substantial loss of N-acetylneuraminic acid from the glycosyl group was evident in the MALDI spectra such that ions corresponding to the intact glycopeptide were not observed and assignment of the glycoforms was not possible. However, by analysing the pI shifts observed on the two-dimensional gels in conjunction with the MS data, the number of N-acetylneuraminic acid residues, and hence the glycoforms present, could be determined.
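The mass arithmetic behind distinguishing phosphorylated and sialylated forms can be sketched as follows. This is an illustration only, not the authors' procedure: the base peptide mass of 6000.0 Da is a hypothetical placeholder (the abstract states only that the peptide exceeds 5990 Da), and the increments are approximate average masses, about 79.98 Da per phosphate group (HPO3) and about 291.26 Da per N-acetylneuraminic acid residue.

```python
# Approximate average mass increments in Da (assumed values for illustration).
PHOSPHATE_DA = 79.98   # one phosphorylation (HPO3)
NEU5AC_DA = 291.26     # one N-acetylneuraminic acid residue

def expected_masses(base_mass, max_phospho=2, max_neu5ac=2):
    """Expected peptide mass for each (phosphates, Neu5Ac residues) pair."""
    return {
        (p, s): round(base_mass + p * PHOSPHATE_DA + s * NEU5AC_DA, 2)
        for p in range(max_phospho + 1)
        for s in range(max_neu5ac + 1)
    }

# Hypothetical unmodified peptide mass; the real value is not in the abstract.
for (p, s), mass in sorted(expected_masses(6000.0).items()):
    print(f"{p} phosphate(s), {s} Neu5Ac: {mass:.2f} Da")
```

Each added phosphate shifts the peptide by roughly 80 Da, which is why singly and doubly phosphorylated forms separate on mass alone. If N-acetylneuraminic acid is lost during ionisation, the observed mass no longer reports the residue count, which is consistent with the abstract's reliance on pI shifts for that assignment.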