67 results for Sequence Alignment
Abstract:
Conventionally, protein structure prediction via threading relies on some nonoptimal method to align a protein sequence to each member of a library of known structures. We show how a score function (force field) can be modified so as to allow the direct application of a dynamic programming algorithm to the problem. This involves an approximation whose damage can be minimized by an optimization process during score function parameter determination. The method is compared to sequence to structure alignments using a more conventional pair-wise score function and the frozen approximation. The new method produces results comparable to the frozen approximation, but is faster and has fewer adjustable parameters. It is also free of memory of the template's original amino acid sequence, and does not suffer from a problem of nonconvergence, which can be shown to occur with the frozen approximation. Alignments generated by the simplified score function can then be ranked using a second score function with the approximations removed. (C) 1999 John Wiley & Sons, Inc.
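The point that a suitably modified score function permits direct dynamic programming can be illustrated with a minimal sketch. It assumes the score for placing sequence residue i at template site j has already been reduced to a single table entry that does not depend on the rest of the alignment. The `scores` table and the linear `gap` penalty are hypothetical placeholders, not the score function or gap model of the paper.

```python
from typing import List, Tuple


def align(scores: List[List[float]], gap: float = -1.0) -> Tuple[float, List[Tuple[int, int]]]:
    """Global (Needleman-Wunsch style) alignment of a sequence to template sites.

    scores[i][j] is the assumed score for placing sequence residue i at
    template site j; gap is a hypothetical linear gap penalty.
    Returns the best total score and the list of aligned (residue, site) pairs.
    """
    n = len(scores)
    m = len(scores[0]) if n else 0

    # best[i][j] = best score aligning the first i residues to the first j sites
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = i * gap
    for j in range(1, m + 1):
        best[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j - 1] + scores[i - 1][j - 1],  # residue i on site j
                best[i - 1][j] + gap,                       # residue i unaligned
                best[i][j - 1] + gap,                       # site j unoccupied
            )

    # Trace back from the corner to recover the aligned pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + scores[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif best[i][j] == best[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs
```

With a true pair-wise score function, the value of placing residue i at site j would depend on which residues occupy neighbouring sites, so a fixed table like `scores` would not exist and this recursion would not apply directly.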
Abstract:
Liver samples from rabbits killed by RHDV, collected from five States in Australia in 1996 and 1997, were analysed by RT-PCR. A 398 bp fragment of the capsid protein (VP60) gene was amplified by PCR and directly sequenced. The alignment of the nucleotide and amino acid sequences and their comparison with the original strain of the virus released in Australia indicated that genetic changes after two years have been small, with 98.2% to 100% identity. The constructed phylogenetic tree suggests slight differences in nucleotide substitutions in various States, but there is no clear evidence of clustering of sequences according to their geographic origin. In practical terms, sequencing of viral RNA provides a means of testing the efficacy of further releases and subsequent spread of the virus if such a strategy is employed as a means of enhancing RHD as a biological control of the wild rabbit in Australia.
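As a rough illustration of the identity figures quoted above, the sketch below computes percent identity between two pre-aligned sequences. The denominator used here (aligned columns with no gap in either sequence) is one common convention and an assumption; the abstract does not say which definition was used.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Columns containing a gap ('-') in either sequence are skipped
    (an assumed convention; other definitions exist).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

Under this convention, 7 substitutions over a 398 bp fragment give 391/398, or about 98.2% identity, which matches the lower end of the reported range.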
Abstract:
Wurst is a protein threading program with an emphasis on high quality sequence to structure alignments (http://www.zbh.uni-hamburg.de/wurst). Submitted sequences are aligned to each of about 3000 templates with a conventional dynamic programming algorithm, but using a score function with sophisticated structure and sequence terms. The structure terms are a log-odds probability of sequence to structure fragment compatibility, obtained from a Bayesian classification procedure. A simplex optimization was used to optimize the sequence-based terms for the goal of alignment and model quality and to balance the sequence and structural contributions against each other. Both sequence and structural terms operate with sequence profiles.
Abstract:
In this study, we propose a novel method to predict the solvent accessible surface areas of transmembrane residues. For both transmembrane alpha-helix and beta-barrel residues, the correlation coefficients between the predicted and observed accessible surface areas are around 0.65. On the basis of predicted accessible surface areas, residues exposed to the lipid environment or buried inside a protein can be identified by using certain cutoff thresholds. We have extensively examined our approach based on different definitions of accessible surface areas and a variety of sets of control parameters. Given that experimentally determining the structures of membrane proteins is very difficult and membrane proteins are actually abundant in nature, our approach is useful for theoretically modeling membrane protein tertiary structures, particularly for modeling the assembly of transmembrane domains. This approach can be used to annotate the membrane proteins in proteomes to provide extra structural and functional information.
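The two evaluation steps mentioned above, correlating predicted with observed accessible surface areas and labelling residues by a cutoff, can be sketched as follows. The cutoff value and the example numbers are hypothetical; the abstract does not state which thresholds were used.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def classify_exposure(predicted_asa: Sequence[float], cutoff: float) -> List[str]:
    """Label each residue 'exposed' if its predicted ASA reaches the cutoff,
    otherwise 'buried'. The cutoff is a user-chosen, hypothetical value."""
    return ["exposed" if value >= cutoff else "buried" for value in predicted_asa]
```

For example, `classify_exposure([5.0, 40.0, 80.0], cutoff=30.0)` labels the first residue as buried and the other two as exposed.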
Abstract:
Despite many successes of conventional DNA sequencing methods, some DNAs remain difficult or impossible to sequence. Unsequenceable regions occur in the genomes of many biologically important organisms, including the human genome. Such regions range in length from tens to millions of bases, and may contain valuable information such as the sequences of important genes. The authors have recently developed a technique that renders a wide range of problematic DNAs amenable to sequencing. The technique is known as sequence analysis via mutagenesis (SAM). This paper presents a number of algorithms for analysing and interpreting data generated by this technique.
Abstract:
Despite the success of conventional Sanger sequencing, significant regions of many genomes still present major obstacles to sequencing. Here we propose a novel approach with the potential to alleviate a wide range of sequencing difficulties. The technique involves extracting target DNA sequence from variants generated by introduction of random mutations. The introduction of mutations does not destroy original sequence information, but distributes it amongst multiple variants. Some of these variants lack problematic features of the target and are more amenable to conventional sequencing. The technique has been successfully demonstrated with mutation levels up to an average 18% base substitution and has been used to read previously intractable poly(A), AT-rich and GC-rich motifs.
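The idea that mutation distributes the original information across variants rather than destroying it can be shown with a toy sketch. It assumes the variant reads are already aligned, have equal length, and differ from the target only by substitutions. A simple column-wise majority vote stands in for the authors' reconstruction algorithms, which the abstract does not describe.

```python
from collections import Counter
from typing import Sequence


def majority_consensus(variants: Sequence[str]) -> str:
    """Column-wise majority vote over pre-aligned, equal-length variants.

    A simplification for illustration: real data would need alignment
    and a statistical treatment of ties and mutation bias.
    """
    if not variants:
        raise ValueError("no variants given")
    length = len(variants[0])
    if any(len(v) != length for v in variants):
        raise ValueError("variants must be aligned to the same length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*variants)
    )


# Hypothetical target and five mutated variants; none equals the target.
target = "GATTACAGGC"
variants = [
    "CATTACAGGC",
    "GATGACAGGC",
    "GATTACTGGC",
    "GTTTACAGGA",
    "GATTCCAGGC",
]
recovered = majority_consensus(variants)
```

Because each position is mutated in at most one of the five variants, the majority base in every column is the original one, so `recovered` equals `target` even though no single variant does.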
Abstract:
Bellerophon is a program for detecting chimeric sequences in multiple sequence datasets by an adaption of partial treeing analysis. Bellerophon was specifically developed to detect 16S rRNA gene chimeras in PCR-clone libraries of environmental samples but can be applied to other nucleotide sequence alignments.
Abstract:
We sequenced cDNAs coding for chicken cellular nucleic acid binding protein (CNBP). Two slightly different variations of the open reading frame were found, each of which translates into a protein with seven zinc finger domains. The longest transcript contains an in-frame insert of 3 bp. The sequence conservation between chick CNBP cDNAs and human, rat and mouse CNBP cDNAs is extreme, especially in the coding region, where the deduced amino acid sequence identity with human, rat and mouse CNBP is 99%. CNBP-like transcripts were also found in various tissues from insect, shrimp, fish and lizard. Regions with remarkable nucleotide conservation were also found in the 3' untranslated region, indicating important functions for these regions. Quantitative reverse transcription polymerase chain reaction (RT-PCR) indicated that in the chick, CNBP is present in all tissues examined in approximately equal ratios to total RNA. RT-PCR of total RNA isolated from different phyla indicates that CNBP-like proteins are widespread throughout the animal kingdom. The extraordinary level of conservation suggests an important physiological role for CNBP. (C) 1997 Elsevier Science Inc.
Abstract:
The complete nucleotide sequence of the genomic RNA from the insect picorna-like virus Drosophila C virus (DCV) was determined. The DCV sequence predicts a genome organization different to that of other RNA virus families whose sequences are known. The single-stranded positive-sense genomic RNA is 9264 nucleotides in length and contains two large open reading frames (ORFs) which are separated by 191 nucleotides. The 5' ORF contains regions of similarities with the RNA-dependent RNA polymerase, helicase and protease domains of viruses from the picornavirus, comovirus and sequivirus families. The 3' ORF encodes the capsid proteins as confirmed by N-terminal sequence analysis of these proteins. The capsid protein coding region is unusual in two ways: firstly the cistron appears to lack an initiating methionine and secondly no subgenomic RNA is produced, suggesting that the proteins may be translated through internal initiation of translation from the genomic length RNA. The finding of this novel genome organization for DCV shows that this virus is not a member of the Picornaviridae as previously thought, but belongs to a distinct and hitherto unrecognized virus family.
Abstract:
The nifH gene sequence of the nitrogen-fixing bacterium Acetobacter diazotrophicus was determined with the use of the polymerase chain reaction and universal degenerate oligonucleotide primers. The gene shows highest pair-wise similarity to the nifH gene of Azospirillum brasilense. The phylogenetic relationships of the nifH gene sequences were compared with those inferred from 16S rRNA gene sequences. Knowledge of the sequence of the nifH gene contributes to the growing database of nifH gene sequences, and will allow the detection of Acet. diazotrophicus from environmental samples with nifH gene-based primers.
Abstract:
A clone encoding ovine preprogastrin was isolated from a sheep genomic library. The deduced 104 amino acid sequence of ovine preprogastrin was 92% and 68% identical to the sequences of bovine and human preprogastrin, respectively. While the similarity was greatest in the gastrin-17 sequence, an unexpected similarity was also observed in the N-terminus of mature progastrin.
Abstract:
Segregation of mRNAs in the cytoplasm of polar cells has been demonstrated for proteins involved in Xenopus and Drosophila oogenesis, and for some proteins in somatic cells. It is assumed that vectorial transport of the messages is generally responsible for this localization. The mRNA encoding the basic protein of central nervous system myelin is selectively transported to the distal ends of the processes of oligodendrocytes, where it is anchored to the myelin membrane and translated. This transport is dependent on a 21-nucleotide cis-acting segment of the 3'-untranslated region (RTS). Proteins that bind to this cis-acting segment have now been isolated from extracts of rat brain. A group of six 35-42-kDa proteins bind to a 35-base oligoribonucleotide incorporating the RTS, but not to several oligoribonucleotides with the same composition but randomized sequences, thus establishing specificity for the base sequence in the RTS. The most abundant of these proteins has been identified, by Edman sequencing of tryptic peptides and mass spectroscopy, as heterogeneous nuclear ribonucleoprotein (hnRNP) A2, a 36-kDa member of a family of proteins that are primarily, but not solely, intranuclear. This protein was most abundant in samples from rat brain and testis, with lower amounts in other tissues. It was separated from the other polypeptides by using reverse-phase HPLC and shown to retain preferential association with the RTS. In cultured oligodendrocytes, hnRNP A2 was demonstrated by confocal microscopy to be distributed throughout the nucleus, cell soma, and processes.
Abstract:
Parkinson's disease (PD) is a neurodegenerative movement disorder primarily due to basal ganglia dysfunction. While much research has been conducted on Parkinsonian deficits in the traditional arena of musculoskeletal limb movement, research in other functional motor tasks is lacking. The present study examined articulation in PD with increasingly complex sequences of articulatory movement. Of interest was whether dysfunction would affect articulation in the same manner as in limb-movement impairment. In particular, since very similar (homogeneous) articulatory sequences (the tongue twister effect) are more difficult for healthy individuals to achieve than dissimilar (heterogeneous) gestures, while the reverse may apply for skeletal movements in PD, we asked which factor would dominate when PD patients articulated various grades of artificial tongue twisters: the influence of disease or a possible difference between the two motor systems. Execution was especially impaired when articulation involved a sequence of motor programs heterogeneous in terms of place of articulation. The results are suggestive of a hypokinesic tendency in complex sequential articulatory movement as in limb movement. It appears that PD patients do show abnormalities in articulatory movement which are similar to those of the musculoskeletal system. The present study suggests that an underlying disease effect modulates movement impairment across different functional motor systems. (C) 1998 Academic Press.
Abstract:
Phosphorylation of the tumor suppressor p53 is generally thought to modify the properties of the protein in four of its five independent domains. We used synthetic peptides to directly study the effects of phosphorylation on the non-sequence-specific DNA binding and conformation of the C-terminal, basic domain. The peptides corresponded to amino acids 361-393 and were either nonphosphorylated or phosphorylated at the protein kinase C (PKC) site, Ser378, or the casein kinase II (CKII) site, Ser392, or bis-phosphorylated on both the PKC and the CKII sites. A fluorescence polarization analysis revealed that either the recombinant p53 protein or the synthetic peptides bound to two unrelated target DNA fragments. Phosphorylation of the peptide at the PKC or the CKII sites clearly decreased DNA binding, and addition of a second phosphate group almost completely abolished binding. Circular dichroism spectroscopy showed that the peptides assumed identical unordered structures in aqueous solutions. The unmodified peptide, unlike the Ser378 phosphorylated peptide, changed conformation in the presence of DNA. The inherent ability of the peptides to form an alpha-helix could be detected when circular dichroism and nuclear magnetic resonance spectra were taken in trifluoroethanol-water mixtures. A single or double phosphorylation destabilized the helix around the phosphorylated Ser378 residue but stabilized the helix downstream in the sequence.
Abstract:
Monocrotaline is a pyrrolizidine alkaloid known to cause toxicity in humans and animals. Its mechanism of biological action is still unclear, although DNA crosslinking has been suggested to play a role in its activity. In this study we found that an active metabolite of monocrotaline, dehydromonocrotaline (DHM), alkylates guanines at the N7 position of DNA with a preference for 5'-GG and 5'-GA sequences. In addition, it generates piperidine- and heat-resistant multiple DNA crosslinks, as confirmed by electrophoresis and electron microscopy. On the basis of these findings, we propose that DHM undergoes rapid polymerization to a structure which is able to crosslink several fragments of DNA.