6 results for Frankel, Zacharias
in University of Queensland eSpace - Australia
Abstract:
The study of continuously varying, quantitative traits is important in evolutionary biology, agriculture, and medicine. Variation in such traits is attributable to many, possibly interacting, genes whose expression may be sensitive to the environment, which makes their dissection into underlying causative factors difficult. An important population parameter for quantitative traits is heritability, the proportion of total variance that is due to genetic factors. Response to artificial and natural selection and the degree of resemblance between relatives are all a function of this parameter. Following the classic paper by R. A. Fisher in 1918, the estimation of additive and dominance genetic variance and heritability in populations is based upon the expected proportion of genes shared between different types of relatives, and explicit, often controversial and untestable models of genetic and non-genetic causes of family resemblance. With genome-wide coverage of genetic markers it is now possible to estimate such parameters solely within families using the actual degree of identity-by-descent sharing between relatives. Using genome scans on 4,401 quasi-independent sib pairs of which 3,375 pairs had phenotypes, we estimated the heritability of height from empirical genome-wide identity-by-descent sharing, which varied from 0.374 to 0.617 (mean 0.498, standard deviation 0.036). The variance in identity-by-descent sharing per chromosome and per genome was consistent with theory. The maximum likelihood estimate of the heritability for height was 0.80 with no evidence for non-genetic causes of sib resemblance, consistent with results from independent twin and family studies but using an entirely separate source of information. Our application shows that it is feasible to estimate genetic variance solely from within-family segregation and provides an independent validation of previously untestable assumptions. Given sufficient data, our new paradigm will allow the estimation of genetic variation for disease susceptibility and quantitative traits that is free from confounding with non-genetic factors and will allow partitioning of genetic variation into additive and non-additive components.
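The idea of estimating heritability from realized identity-by-descent (IBD) sharing can be illustrated with a small sketch. The abstract reports a maximum likelihood estimate; the code below swaps in a simpler Haseman-Elston-style regression, so it is an illustration of the principle and not the authors' method. It assumes a purely additive model in which the sib covariance is pi * sigma_A^2 + sigma_C^2, so the expected squared sib difference is 2 * (sigma_P^2 - sigma_C^2) - 2 * sigma_A^2 * pi, and the regression slope on pi is -2 * sigma_A^2. The function name `he_heritability` and its parameters are hypothetical.

```python
from typing import Sequence


def he_heritability(
    ibd_sharing: Sequence[float],
    squared_differences: Sequence[float],
    phenotypic_variance: float = 1.0,
) -> float:
    """Estimate heritability by regressing squared sib-pair phenotype
    differences on genome-wide IBD sharing (hypothetical helper).

    Assumes an additive model, so the ordinary least squares slope equals
    -2 * sigma_A^2 and h^2 = -slope / (2 * phenotypic_variance).
    """
    n = len(ibd_sharing)
    if n != len(squared_differences) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 pairs")
    if phenotypic_variance <= 0:
        raise ValueError("phenotypic_variance must be positive")

    mean_x = sum(ibd_sharing) / n
    mean_y = sum(squared_differences) / n

    # Sum of squares of IBD sharing around its mean.
    sxx = sum((x - mean_x) ** 2 for x in ibd_sharing)
    if sxx == 0:
        raise ValueError("IBD sharing does not vary; slope is undefined")

    # Cross-product between IBD sharing and squared differences.
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(ibd_sharing, squared_differences)
    )

    slope = sxy / sxx
    return -slope / (2.0 * phenotypic_variance)
```

Because realized sharing among sibs varies only narrowly around 0.5 (standard deviation 0.036 in the abstract), an estimate of this kind is noisy unless thousands of pairs are available.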
Abstract:
Membrane organization describes the orientation of a protein with respect to the membrane and can be determined by the presence, or absence, and organization within the protein sequence of two features: endoplasmic reticulum signal peptides and alpha-helical transmembrane domains. These features allow protein sequences to be classified into one of five membrane organization categories: soluble intracellular proteins, soluble secreted proteins, type I membrane proteins, type II membrane proteins, and multi-spanning membrane proteins. Generation of protein isoforms with variable membrane organizations can change a protein's subcellular localization or association with the membrane. Application of MemO, a membrane organization annotation pipeline, to the FANTOM3 Isoform Protein Sequence mouse protein set revealed that within the 8,032 transcriptional units (TUs) with multiple protein isoforms, 573 had variation in their use of signal peptides, 1,527 had variation in their use of transmembrane domains, and 615 generated protein isoforms from distinct membrane organization classes. The mechanisms underlying these transcript variations were analyzed. While TUs were identified encoding all pairwise combinations of membrane organization categories, the most common was conversion of membrane proteins to soluble proteins. Observed within our high-confidence set were 156 TUs predicted to generate both extracellular soluble and membrane proteins, and 217 TUs generating both intracellular soluble and membrane proteins. The differential use of endoplasmic reticulum signal peptides and transmembrane domains is a common occurrence within the variable protein output of TUs. The generation of protein isoforms that are targeted to multiple subcellular locations represents a major functional consequence of transcript variation within the mouse transcriptome.
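The five-category scheme described above can be sketched as a small decision rule. The abstract does not spell out the exact mapping, so this sketch assumes the conventional one: a signal peptide with no transmembrane domain is secreted, a signal peptide with one transmembrane domain is type I, a single transmembrane domain without a signal peptide is type II, and two or more transmembrane domains are multi-spanning. The function names and inputs are hypothetical and are not MemO's actual interface.

```python
from typing import Iterable, Tuple


def classify_membrane_organization(has_signal_peptide: bool, n_tm_domains: int) -> str:
    """Assign one of five membrane organization categories from two
    predicted features (assumed conventional mapping, not MemO's code)."""
    if n_tm_domains < 0:
        raise ValueError("n_tm_domains must be non-negative")
    if n_tm_domains >= 2:
        return "multi-spanning membrane protein"
    if n_tm_domains == 1:
        # One transmembrane helix: the signal peptide decides the type.
        if has_signal_peptide:
            return "type I membrane protein"
        return "type II membrane protein"
    # No transmembrane helix: the protein is soluble.
    if has_signal_peptide:
        return "soluble secreted protein"
    return "soluble intracellular protein"


def has_variable_organization(isoforms: Iterable[Tuple[bool, int]]) -> bool:
    """Return True if the isoforms of one transcriptional unit, given as
    (has_signal_peptide, n_tm_domains) pairs, span more than one category."""
    categories = {classify_membrane_organization(sp, tm) for sp, tm in isoforms}
    return len(categories) > 1
```

For example, a transcriptional unit with one isoform carrying a signal peptide and one transmembrane domain, and a second isoform carrying only the signal peptide, would be flagged as producing both a membrane and a secreted form.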
Abstract:
Several pathogenic strains of Escherichia coli exploit type III secretion to inject effector proteins into human cells, which then subvert eukaryotic cell biology to the bacterium's advantage. We have exploited bioinformatics and experimental approaches to establish that the effector repertoire in the Sakai strain of enterohemorrhagic E. coli (EHEC) O157:H7 is much larger than previously thought. Homology searches led to the identification of > 60 putative effector genes. Thirteen of these were judged to be likely pseudogenes, whereas 49 were judged to be potentially functional. In total, 39 proteins were confirmed experimentally as effectors: 31 through proteomics and 28 through translocation assays. At the protein level, the EHEC effector sequences fall into > 20 families. The largest family, the NleG family, contains 14 members in the Sakai strain alone. EHEC also harbors functional homologs of effectors from plant pathogens (HopPtoH, HopW, AvrA) and from Shigella (OspD, OspE, OspG), and two additional members of the Map/IpgB family. Genes encoding proven or predicted effectors occur in > 20 exchangeable effector loci scattered throughout the chromosome. Crucially, the majority of functional effector genes are encoded by nine exchangeable effector loci that lie within lambdoid prophages. Thus, type III secretion in E. coli is linked to a vast phage metagenome, acting as a crucible for the evolution of pathogenicity.
Abstract:
Non-protein-coding RNAs (ncRNAs) are increasingly being recognized as having important regulatory roles. Although much recent attention has focused on tiny 22- to 25-nucleotide microRNAs, several functional ncRNAs are orders of magnitude larger in size. Examples of such macro ncRNAs include Xist and Air, which in mouse are 18 and 108 kilobases (Kb), respectively. We surveyed the 102,801 FANTOM3 mouse cDNA clones and found that Air and Xist were present not as single, full-length transcripts but as a cluster of multiple, shorter cDNAs, which were unspliced, had little coding potential, and were most likely primed from internal adenine-rich regions within longer parental transcripts. We therefore conducted a genome-wide search for regional clusters of such cDNAs to find novel macro ncRNA candidates. Sixty-six regions were identified, each of which mapped outside known protein-coding loci and which had a mean length of 92 Kb. We detected several known long ncRNAs within these regions, supporting the basic rationale of our approach. In silico analysis showed that many regions had evidence of imprinting and/or antisense transcription. These regions were significantly associated with microRNAs and transcripts from the central nervous system. We selected eight novel regions for experimental validation by northern blot and RT-PCR and found that the majority represent previously unrecognized noncoding transcripts that are at least 10 Kb in size and predominantly localized in the nucleus. Taken together, the data not only identify multiple new ncRNAs but also suggest the existence of many more macro ncRNAs like Xist and Air.
Abstract:
Short proteins play key roles in cell signalling and other processes, but their abundance in the mammalian proteome is unknown. Current catalogues of mammalian proteins exhibit an artefactual discontinuity at a length of 100 aa, so that protein abundance peaks just above this length and falls off sharply below it. To clarify the abundance of short proteins, we identify proteins in the FANTOM collection of mouse cDNAs by analysing synonymous and nonsynonymous substitutions with the computer program CRITICA. This analysis confirms that there is no real discontinuity at length 100. Roughly 10% of mouse proteins are shorter than 100 aa, although the majority of these are variants of proteins longer than 100 aa. We identify many novel short proteins, including a "dark matter" subset containing ones that lack detectable homology to other known proteins. Translation assays confirm that some of these novel proteins can be translated and localised to the secretory pathway.