14 results for PROTEIN SEQUENCES
in CentAUR: Central Archive University of Reading - UK
Abstract:
Figs and fig wasps form a peculiar closed community in which the Ficus tree provides a compact syconium (inflorescence) habitat for the lives of a complex assemblage of Chalcidoid insects. These diverse fig wasp species have intimate ecological relationships within the closed world of the fig syconia. Previous surveys of Wolbachia, maternally inherited endosymbiotic bacteria that infect vast numbers of arthropod hosts, showed that fig wasps have some of the highest known incidences of Wolbachia amongst all insects. We ask whether the evolutionary patterns of Wolbachia sequences in this closed syconium community are different from those in the outside world. In the present study, we sampled all 17 fig wasp species living on Ficus benjamina, covering 4 families, 6 subfamilies, and 8 genera of wasps. We made a thorough survey of Wolbachia infection patterns and studied evolutionary patterns in wsp (Wolbachia Surface Protein) sequences. We find evidence for high infection incidences, frequent recombination between Wolbachia strains, and considerable horizontal transfer, suggesting rapid evolution of Wolbachia sequences within the syconium community. Though the fig wasps have relatively limited contact with the outside world, Wolbachia may be introduced to the syconium community via horizontal transmission by fig wasp species that have winged males and visit the syconia earlier.
Abstract:
BACKGROUND: In order to maintain the most comprehensive structural annotation databases we must carry out regular updates for each proteome using the latest profile-profile fold recognition methods. The ability to carry out these updates on demand is necessary to keep pace with the regular updates of sequence and structure databases. Providing the highest quality structural models requires the most intensive profile-profile fold recognition methods running with the very latest available sequence databases and fold libraries. However, running these methods on such a regular basis for every sequenced proteome requires large amounts of processing power. In this paper we describe and benchmark the JYDE (Job Yield Distribution Environment) system, which is a meta-scheduler designed to work above cluster schedulers, such as Sun Grid Engine (SGE) or Condor. We demonstrate the ability of JYDE to distribute the load of genomic-scale fold recognition across multiple independent Grid domains. We use the most recent profile-profile version of our mGenTHREADER software in order to annotate the latest version of the Human proteome against the latest sequence and structure databases in as short a time as possible. RESULTS: We show that our JYDE system is able to scale to large numbers of intensive fold recognition jobs running across several independent computer clusters. Using our JYDE system we have been able to annotate 99.9% of the protein sequences within the Human proteome in less than 24 hours, by harnessing over 500 CPUs from 3 independent Grid domains. CONCLUSION: This study clearly demonstrates the feasibility of carrying out on-demand, high-quality structural annotations for the proteomes of major eukaryotic organisms. Specifically, we have shown that it is now possible to provide complete regular updates of profile-profile based fold recognition models for entire eukaryotic proteomes, through the use of Grid middleware such as JYDE.
Abstract:
Protein sequences from characterized type III secretion (TTS) systems were used as probes in silico to identify several TTS gene homologs in the genome sequence of Brucella suis biovar 1 strain 1330. Four of the genes, named flhB, fliP, fliR, and fliF on the basis of greatest homologies to known flagellar apparatus proteins, were targeted in PCR and hybridization assays to determine their distribution among other Brucella nomen species and biovars. The results indicated that flhB, fliP, fliR and fliF are present in Brucella melitensis, Brucella ovis, and Brucella suis biovars 1, 2 and 3. Similar homologs have been reported previously in Brucella abortus. Using RT-PCR assays, we were unable to detect any expression of these genes. It is not yet known whether the genes are the cryptic remnants of a flagellar system or are actively involved in a process contributing to pathogenicity or previously undetected motility, but they are distributed widely in Brucella and merit further study to determine their role.
Abstract:
IntFOLD is an independent web server that integrates our leading methods for structure and function prediction. The server provides a simple unified interface that aims to make complex protein modelling data more accessible to life scientists. The server web interface is designed to be intuitive and integrates a complex set of quantitative data, so that 3D modelling results can be viewed on a single page and interpreted by non-expert modellers at a glance. The only required input to the server is an amino acid sequence for the target protein. Here we describe major performance and user interface updates to the server, which comprises an integrated pipeline of methods for: tertiary structure prediction, global and local 3D model quality assessment, disorder prediction, structural domain prediction, function prediction and modelling of protein-ligand interactions. The server has been independently validated during numerous CASP (Critical Assessment of Techniques for Protein Structure Prediction) experiments, as well as being continuously evaluated by the CAMEO (Continuous Automated Model Evaluation) project. The IntFOLD server is available at: http://www.reading.ac.uk/bioinf/IntFOLD/
Abstract:
AtTRB1, 2 and 3 are members of the SMH (single Myb histone) protein family, which comprises double-stranded DNA-binding proteins that are specific to higher plants. They are structurally conserved, containing a Myb domain at the N-terminus, a central H1/H5-like domain and a C-terminally located coiled-coil domain. AtTRB1, 2 and 3 interact through their Myb domain specifically with telomeric double-stranded DNA in vitro, while the central H1/H5-like domain interacts non-specifically with DNA sequences and mediates protein–protein interactions. Here we show that AtTRB1, 2 and 3 preferentially localize to the nucleus and nucleolus during interphase. Both the central H1/H5-like domain and the Myb domain from AtTRB1 can direct a GFP fusion protein to the nucleus and nucleolus. AtTRB1–GFP localization is cell cycle-regulated, as the level of nuclear-associated GFP diminishes during mitotic entry and GFP progressively re-associates with chromatin during anaphase/telophase. Using fluorescence recovery after photobleaching and fluorescence loss in photobleaching, we determined the dynamics of AtTRB1 interactions in vivo. The results reveal that AtTRB1 interaction with chromatin is regulated at two levels at least, one of which is coupled with cell-cycle progression, with the other involving rapid exchange.
Abstract:
The objective was to determine the presence or absence of transgenic and endogenous plant DNA in ruminal fluid, duodenal digesta, milk, blood, and feces, and if found, to determine fragment size. Six multiparous lactating Holstein cows fitted with ruminal and duodenal cannulas received a total mixed ration. There were two treatments (T). In T1, the concentrate contained genetically modified (GM) soybean meal (cp4epsps gene) and GM corn grain (cry1a[b] gene), whereas T2 contained the near isogenic non-GM counterparts. Polymerase chain reaction analysis was used to determine the presence or absence of DNA sequences. Primers were selected to amplify small fragments from single-copy genes (soy lectin and corn high-mobility protein and cp4epsps and cry1a[b] genes from the GM crops) and multicopy genes (bovine mitochondrial cytochrome b and rubisco). Single-copy genes were only detected in the solid phase of rumen and duodenal digesta. In contrast, fragments of the rubisco gene were detected in the majority of samples analyzed in both the liquid and solid phases of ruminal and duodenal digesta, milk, and feces, but rarely in blood. The size of the rubisco gene fragments detected decreased from 1176 bp in ruminal and duodenal digesta to 351 bp in fecal samples.
Abstract:
Most newly sequenced proteins are likely to adopt a similar structure to one which has already been experimentally determined. For this reason, the most successful approaches to protein structure prediction have been template-based methods. Such prediction methods attempt to identify and model the folds of unknown structures by aligning the target sequences to a set of representative template structures within a fold library. In this chapter, I discuss the development of template-based approaches to fold prediction, from the traditional techniques to the recent state-of-the-art methods. I also discuss the recent development of structural annotation databases, which contain models built by aligning the sequences from entire proteomes against known structures. Finally, I run through a practical step-by-step guide for aligning target sequences to known structures and contemplate the future direction of template-based structure prediction.
Abstract:
A well defined structure is available for the carboxyl half of the cellular prion protein (PrPc), while the structure of the amino terminal half of the molecule remains ill defined. The unstructured nature of the polypeptide has meant that relatively few of the many antibodies generated against PrPc recognise this region. To circumvent this problem, we have used a previously characterised and well expressed fragment derived from the amino terminus of PrPc as bait for panning a single chain antibody phage (scFv-P) library. Using this approach, we identified and characterised 1 predominant and 3 additional scFv-Ps that contained different V-H and V-L sequences and that bound specifically to the PrPc target. Epitope mapping revealed that all scFv-Ps recognised linear epitopes between PrPc residues 76 and 156. When compared with existing monoclonal antibodies (MAb), the binding of the scFvs was significantly different in that high level binding was evident on truncated forms of PrPc that reacted poorly or not at all with several pre-existing MAbs. These data suggest that the isolated scFv-Ps bind to novel epitopes within the aminocentral region of PrPc. In addition, the binding of MAbs to known linear epitopes within PrPc depends strongly on the endpoints of the target PrPc fragment used. (c) 2006 Elsevier Inc. All rights reserved.
Abstract:
Proteins are commonly identified through enzymatic digestion and generation of short sequence tags or fingerprints of peptide masses by mass spectrometry. Separation methods, such as liquid chromatography and electrophoresis, are often used to fractionate complex protein or peptide mixtures and these separations also provide information on the different species, such as molecular weight and isoelectric point from electrophoresis and hydrophobicity in reversed-phase chromatography. These are also properties that can be predicted from amino acid sequences derived from genomic sequences and used in protein identification. This chapter reviews recently introduced methods based on retention time prediction to extract information from chromatographic separations and the applications to protein identification in organisms with small and large genomes. Novel data on retention time prediction of posttranslationally modified peptides is also presented.
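As a rough illustration of predicting a separation-relevant property directly from an amino acid sequence, the sketch below computes the mean Kyte-Doolittle hydropathy (GRAVY) of a peptide. This is only a crude proxy for hydrophobicity and is not the retention time prediction method reviewed in the chapter; the function name `gravy` is a hypothetical choice made for this example.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(peptide: str) -> float:
    """Return the mean hydropathy of a peptide (hypothetical helper).

    Higher values indicate a more hydrophobic peptide, which would be
    expected, very roughly, to be retained longer in reversed-phase
    chromatography.
    """
    peptide = peptide.upper()
    unknown = set(peptide) - set(KYTE_DOOLITTLE)
    if not peptide or unknown:
        raise ValueError(f"empty peptide or non-standard residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


if __name__ == "__main__":
    for pep in ("ILVF", "KDEN", "GASP"):
        print(pep, round(gravy(pep), 2))
```

Ranking candidate peptides by such a score and comparing the order with observed elution order is one simple way a predicted property could support or contradict an identification; practical retention models are fitted to experimental data rather than taken from a fixed scale.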
Abstract:
Motivation: There is a frequent need to apply a large range of local or remote prediction and annotation tools to one or more sequences. We have created a tool able to dispatch one or more sequences to assorted services by defining a consistent XML format for data and annotations. Results: By analyzing annotation tools, we have determined that annotations can be described using one or more of the six forms of data: numeric or textual annotation of residues, domains (residue ranges) or whole sequences. With this in mind, XML DTDs have been designed to store the input and output of any server. Plug-in wrappers to a number of services have been written which are called from a master script. The resulting APATML is then formatted for display in HTML. Alternatively further tools may be written to perform post-analysis.
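The six forms of annotation described above (numeric or textual, attached to a residue, a domain, or a whole sequence) can be sketched as a small XML builder. The element and attribute names below (`annotation`, `scope`, `kind`, `position`, `start`, `end`) are assumptions made for illustration only and are not taken from the actual APATML DTDs.

```python
import xml.etree.ElementTree as ET

SCOPES = ("residue", "domain", "sequence")  # what the annotation is attached to
KINDS = ("numeric", "text")                 # what type of value it carries


def make_annotation(scope, kind, value, start=None, end=None):
    """Build one annotation element (hypothetical schema, not APATML)."""
    if scope not in SCOPES or kind not in KINDS:
        raise ValueError("unknown scope or kind")
    if kind == "numeric":
        value = float(value)  # reject non-numeric values early
    elem = ET.Element("annotation", {"scope": scope, "kind": kind})
    if scope == "residue":
        if start is None:
            raise ValueError("a residue annotation needs a position")
        elem.set("position", str(start))
    elif scope == "domain":
        if start is None or end is None or start > end:
            raise ValueError("a domain annotation needs a valid residue range")
        elem.set("start", str(start))
        elem.set("end", str(end))
    elem.text = str(value)
    return elem


if __name__ == "__main__":
    root = ET.Element("sequence", {"id": "example"})
    root.append(make_annotation("residue", "numeric", 0.87, start=42))
    root.append(make_annotation("domain", "text", "kinase-like", start=10, end=250))
    root.append(make_annotation("sequence", "text", "putative membrane protein"))
    print(ET.tostring(root, encoding="unicode"))
```

Keeping scope and value type as two independent attributes means a single wrapper format can carry the output of quite different tools, which is the property the abstract relies on.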
Abstract:
As an obligatory parasite of humans, the body louse (Pediculus humanus humanus) is an important vector for human diseases, including epidemic typhus, relapsing fever, and trench fever. Here, we present genome sequences of the body louse and its primary bacterial endosymbiont Candidatus Riesia pediculicola. The body louse has the smallest known insect genome, spanning 108 Mb. Despite its status as an obligate parasite, it retains a remarkably complete basal insect repertoire of 10,773 protein-coding genes and 57 microRNAs. Representing hemimetabolous insects, the genome of the body louse thus provides a reference for studies of holometabolous insects. Compared with other insect genomes, the body louse genome contains significantly fewer genes associated with environmental sensing and response, including odorant and gustatory receptors and detoxifying enzymes. The unique architecture of the 18 minicircular mitochondrial chromosomes of the body louse may be linked to the loss of the gene encoding the mitochondrial single-stranded DNA binding protein. The genome of the obligatory louse endosymbiont Candidatus Riesia pediculicola encodes less than 600 genes on a short, linear chromosome and a circular plasmid. The plasmid harbors a unique arrangement of genes required for the synthesis of pantothenate, an essential vitamin deficient in the louse diet. The human body louse, its primary endosymbiont, and the bacterial pathogens that it vectors all possess genomes reduced in size compared with their free-living close relatives. Thus, the body louse genome project offers unique information and tools to use in advancing understanding of coevolution among vectors, symbionts, and pathogens.
Abstract:
The elucidation of the domain content of a given protein sequence in the absence of determined structure or significant sequence homology to known domains is an important problem in structural biology. Here we address how successfully the delineation of continuous domains can be accomplished in the absence of sequence homology using simple baseline methods, an existing prediction algorithm (Domain Guess by Size), and a newly developed method (DomSSEA). The study was undertaken with a view to measuring the usefulness of these prediction methods in terms of their application to fully automatic domain assignment. Thus, the sensitivity of each domain assignment method was measured by calculating the number of correctly assigned top scoring predictions. We have implemented a new continuous domain identification method using the alignment of predicted secondary structures of target sequences against observed secondary structures of chains with known domain boundaries as assigned by Class Architecture Topology Homology (CATH). Taking top predictions only, the success rate of the method in correctly assigning domain number to the representative chain set is 73.3%. The top prediction for domain number and location of domain boundaries was correct for 24% of the multidomain set (±20 residues). These results have been put into context in relation to the results obtained from the other prediction methods assessed.
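The ±20 residue criterion mentioned above can be made concrete with a minimal scoring sketch. It assumes that each chain's assignment is given as a list of boundary positions (so a chain with n continuous domains has n - 1 boundaries) and that boundaries are compared in sorted order; the function names and this representation are hypothetical and may differ from the evaluation actually used in the study.

```python
from typing import Iterable, Sequence, Tuple


def assignment_correct(predicted: Sequence[int],
                       observed: Sequence[int],
                       tolerance: int = 20) -> bool:
    """True if the domain number matches and every predicted boundary
    lies within `tolerance` residues of the corresponding observed one."""
    if len(predicted) != len(observed):
        return False  # different boundary count means wrong domain number
    return all(abs(p - o) <= tolerance
               for p, o in zip(sorted(predicted), sorted(observed)))


def success_rate(pairs: Iterable[Tuple[Sequence[int], Sequence[int]]]) -> float:
    """Percentage of chains whose top prediction is correct."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    hits = sum(assignment_correct(pred, obs) for pred, obs in pairs)
    return 100.0 * hits / len(pairs)


if __name__ == "__main__":
    # Made-up boundaries for three chains, for illustration only.
    chains = [([120], [135]), ([120], [145]), ([90, 210], [100, 200])]
    print(success_rate(chains))
```

Under this representation a single-domain chain is an empty boundary list, so scoring domain number alone reduces to comparing list lengths.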
Abstract:
Protein structure prediction methods aim to predict the structures of proteins from their amino acid sequences, utilizing various computational algorithms. Structural genome annotation is the process of attaching biological information to every protein encoded within a genome via the production of three-dimensional protein models.
Abstract:
The human ROCO proteins are a family of multi-domain proteins sharing a conserved ROC-COR supra-domain. The family has four members: leucine-rich repeat kinase 1 (LRRK1), leucine-rich repeat kinase 2 (LRRK2), death-associated protein kinase 1 (DAPK1) and malignant fibrous histiocytoma amplified sequences with leucine-rich tandem repeats 1 (MASL1). Previous studies of LRRK1/2 and DAPK1 have shown that the ROC (Ras of complex proteins) domain can bind and hydrolyse GTP, but the cellular consequences of this activity are still unclear. Here, the first biochemical characterization of MASL1 and the impact of GTP binding on MASL1 complex formation are reported. The results demonstrate that MASL1, similar to other ROCO proteins, can bind guanosine nucleotides via its ROC domain. Furthermore, MASL1 exists in two distinct cellular complexes associated with heat shock protein 60, and the formation of a low molecular weight pool of MASL1 is modulated by GTP binding. Finally, loss of GTP enhances MASL1 toxicity in cells. Taken together, these data point to a central role for the ROC/GTPase domain of MASL1 in the regulation of its cellular function.