60 results for recurrent sequence
in University of Queensland eSpace - Australia
Abstract:
Motivation: Targeting peptides direct nascent proteins to their specific subcellular compartment. Knowledge of targeting signals enables informed drug design and reliable annotation of gene products. However, due to the low similarity of such sequences and the dynamical nature of the sorting process, the computational prediction of subcellular localization of proteins is challenging. Results: We contrast the use of feed forward models as employed by the popular TargetP/SignalP predictors with a sequence-biased recurrent network model. The models are evaluated in terms of performance at the residue level and at the sequence level, and demonstrate that recurrent networks improve the overall prediction performance. Compared to the original results reported for TargetP, an ensemble of the tested models increases the accuracy by 6 and 5% on non-plant and plant data, respectively.
Abstract:
Selection of machine learning techniques requires a certain sensitivity to the requirements of the problem. In particular, the problem can be made more tractable by deliberately using algorithms that are biased toward solutions of the requisite kind. In this paper, we argue that recurrent neural networks have a natural bias toward a problem domain of which biological sequence analysis tasks are a subset. We use experiments with synthetic data to illustrate this bias. We then demonstrate that this bias can be exploited using a data set of protein sequences containing several classes of subcellular localization targeting peptides. The results show that, compared with feed forward networks, recurrent neural networks will generally perform better on sequence analysis tasks. Furthermore, as the patterns within the sequence become more ambiguous, the choice of specific recurrent architecture becomes more critical.
Abstract:
We studied the internal transcribed spacer 2 (ITS2) in twenty-two species of ticks from the subfamily Rhipicephalinae. A 104-109 base pair (bp) region was imperfectly repeated in most ticks studied. Mapping the number of repeat copies onto a phylogeny from the ITS2 showed that there have been many independent gains and losses of repeats. Comparison of the sequences of the repeat copies indicated that in most taxa concerted evolution had played little if any role in the evolution of these regions, as the copies clustered by sequence position rather than species. In our putative secondary structure, each repeat copy can fold into a distinct and almost identical stem-loop complex; a gain or loss of a repeat copy apparently does not impair the function of the ITS2 in these ticks.
Abstract:
Despite many successes of conventional DNA sequencing methods, some DNAs remain difficult or impossible to sequence. Unsequenceable regions occur in the genomes of many biologically important organisms, including the human genome. Such regions range in length from tens to millions of bases, and may contain valuable information such as the sequences of important genes. The authors have recently developed a technique that renders a wide range of problematic DNAs amenable to sequencing. The technique is known as sequence analysis via mutagenesis (SAM). This paper presents a number of algorithms for analysing and interpreting data generated by this technique.
Abstract:
Despite the success of conventional Sanger sequencing, significant regions of many genomes still present major obstacles to sequencing. Here we propose a novel approach with the potential to alleviate a wide range of sequencing difficulties. The technique involves extracting target DNA sequence from variants generated by introduction of random mutations. The introduction of mutations does not destroy original sequence information, but distributes it amongst multiple variants. Some of these variants lack problematic features of the target and are more amenable to conventional sequencing. The technique has been successfully demonstrated with mutation levels up to an average 18% base substitution and has been used to read previously intractable poly(A), AT-rich and GC-rich motifs.
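The idea that random mutation spreads the original sequence information across variants, rather than destroying it, can be illustrated with a toy simulation. The sketch below is not the authors' reconstruction method: it assumes substitution-only mutations, variants that are already aligned and of equal length, and a simple per-position majority vote. The function names `mutate` and `consensus`, the example target sequence, and the choice of 50 variants are all hypothetical; only the 18% substitution rate comes from the abstract.

```python
import random
from collections import Counter

BASES = "ACGT"

def mutate(seq, rate, rng):
    """Return a copy of seq where each base is substituted with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            # Substitute with one of the three other bases (assumption: no indels).
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def consensus(variants):
    """Per-position majority vote across equal-length, pre-aligned variants."""
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*variants))

rng = random.Random(0)                      # fixed seed so the run is repeatable
target = "AAAAAAAAAAGGCCGGCCATATATATAT"     # hypothetical poly(A), GC-rich, AT-rich stretch
variants = [mutate(target, 0.18, rng) for _ in range(50)]
recovered = consensus(variants)
```

Each individual variant differs from the target, yet the column-wise vote over many variants recovers it, because at every position the original base remains far more common than any single substitute.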
Abstract:
Bellerophon is a program for detecting chimeric sequences in multiple sequence datasets by an adaption of partial treeing analysis. Bellerophon was specifically developed to detect 16S rRNA gene chimeras in PCR-clone libraries of environmental samples but can be applied to other nucleotide sequence alignments.
Abstract:
We sequenced cDNAs coding for chicken cellular nucleic acid binding protein (CNBP). Two slightly different variations of the open reading frame were found, each of which translates into a protein with seven zinc finger domains. The longest transcript contains an in-frame insert of 3 bp. The sequence conservation between chick CNBP cDNAs and human, rat and mouse CNBP cDNAs is extreme, especially in the coding region, where the deduced amino acid sequence identity with human, rat and mouse CNBP is 99%. CNBP-like transcripts were also found in various tissues from insect, shrimp, fish and lizard. Regions with remarkable nucleotide conservation were also found in the 3' untranslated region, indicating important functions for these regions. Quantitative reverse transcription polymerase chain reaction (RT-PCR) indicated that in the chick, CNBP is present in all tissues examined in approximately equal ratios to total RNA. RT-PCR of total RNA isolated from different phyla indicates that CNBP-like proteins are widespread throughout the animal kingdom. The extraordinary level of conservation suggests an important physiological role for CNBP. (C) 1997 Elsevier Science Inc.
Abstract:
The nifH gene sequence of the nitrogen-fixing bacterium Acetobacter diazotrophicus was determined with the use of the polymerase chain reaction and universal degenerate oligonucleotide primers. The gene shows highest pair-wise similarity to the nifH gene of Azospirillum brasilense. The phylogenetic relationships of the nifH gene sequences were compared with those inferred from 16S rRNA gene sequences. Knowledge of the sequence of the nifH gene contributes to the growing database of nifH gene sequences, and will allow the detection of Acet. diazotrophicus from environmental samples with nifH gene-based primers.
Abstract:
A clone encoding ovine preprogastrin was isolated from a sheep genomic library. The deduced 104 amino acid sequence of ovine preprogastrin was 92% and 68% identical to the sequences of bovine and human preprogastrin, respectively. While the similarity was greatest in the gastrin-17 sequence, an unexpected similarity was also observed in the N-terminus of mature progastrin.
Abstract:
Segregation of mRNAs in the cytoplasm of polar cells has been demonstrated for proteins involved in Xenopus and Drosophila oogenesis, and for some proteins in somatic cells. It is assumed that vectorial transport of the messages is generally responsible for this localization. The mRNA encoding the basic protein of central nervous system myelin is selectively transported to the distal ends of the processes of oligodendrocytes, where it is anchored to the myelin membrane and translated. This transport is dependent on a 21-nucleotide cis-acting segment of the 3'-untranslated region (RTS). Proteins that bind to this cis-acting segment have now been isolated from extracts of rat brain. A group of six 35-42-kDa proteins bind to a 35-base oligoribonucleotide incorporating the RTS, but not to several oligoribonucleotides with the same composition but randomized sequences, thus establishing specificity for the base sequence in the RTS. The most abundant of these proteins has been identified, by Edman sequencing of tryptic peptides and mass spectroscopy, as heterogeneous nuclear ribonucleoprotein (hnRNP) A2, a 36-kDa member of a family of proteins that are primarily, but not solely, intranuclear. This protein was most abundant in samples from rat brain and testis, with lower amounts in other tissues. It was separated from the other polypeptides by using reverse-phase HPLC and shown to retain preferential association with the RTS. In cultured oligodendrocytes, hnRNP A2 was demonstrated by confocal microscopy to be distributed throughout the nucleus, cell soma, and processes.
Abstract:
Parkinson's disease (PD) is a neurodegenerative movement disorder primarily due to basal ganglia dysfunction. While much research has been conducted on Parkinsonian deficits in the traditional arena of musculoskeletal limb movement, research in other functional motor tasks is lacking. The present study examined articulation in PD with increasingly complex sequences of articulatory movement. Of interest was whether dysfunction would affect articulation in the same manner as in limb-movement impairment. In particular, since very similar (homogeneous) articulatory sequences (the tongue twister effect) are more difficult for healthy individuals to achieve than dissimilar (heterogeneous) gestures, while the reverse may apply for skeletal movements in PD, we asked which factor would dominate when PD patients articulated various grades of artificial tongue twisters: the influence of disease or a possible difference between the two motor systems. Execution was especially impaired when articulation involved a sequence of motor programs heterogeneous in terms of place of articulation. The results are suggestive of a hypokinesic tendency in complex sequential articulatory movement as in limb movement. It appears that PD patients do show abnormalities in articulatory movement which are similar to those of the musculoskeletal system. The present study suggests that an underlying disease effect modulates movement impairment across different functional motor systems. (C) 1998 Academic Press.
Abstract:
Phosphorylation of the tumor suppressor p53 is generally thought to modify the properties of the protein in four of its five independent domains. We used synthetic peptides to directly study the effects of phosphorylation on the non-sequence-specific DNA binding and conformation of the C-terminal, basic domain. The peptides corresponded to amino acids 361-393 and were either nonphosphorylated or phosphorylated at the protein kinase C (PKC) site, Ser378, or the casein kinase II (CKII) site, Ser392, or bis-phosphorylated on both the PKC and the CKII sites. A fluorescence polarization analysis revealed that either the recombinant p53 protein or the synthetic peptides bound to two unrelated target DNA fragments. Phosphorylation of the peptide at the PKC or the CKII sites clearly decreased DNA binding, and addition of a second phosphate group almost completely abolished binding. Circular dichroism spectroscopy showed that the peptides assumed identical unordered structures in aqueous solutions. The unmodified peptide, unlike the Ser378 phosphorylated peptide, changed conformation in the presence of DNA. The inherent ability of the peptides to form an alpha-helix could be detected when circular dichroism and nuclear magnetic resonance spectra were taken in trifluoroethanol-water mixtures. A single or double phosphorylation destabilized the helix around the phosphorylated Ser378 residue but stabilized the helix downstream in the sequence.
Abstract:
Background and Purpose: Few community-based studies have examined the long-term risk of recurrent stroke after an acute first-ever stroke. This study aimed to determine the absolute and relative risks of a first recurrent stroke over the first 5 years after a first-ever stroke and the predictors of such recurrence in a population-based series of people with first-ever stroke in Perth, Western Australia. Methods: Between February 1989 and August 1990, all people with a suspected acute stroke or transient ischemic attack of the brain who were resident in a geographically defined region of Perth, Western Australia, with a population of 138 708 people, were registered prospectively and assessed according to standardized diagnostic criteria. Patients were followed up prospectively at 4 months, 12 months, and 5 years after the index event. Results: Three hundred seventy patients with a first-ever stroke were registered, of whom 351 survived >2 days. Data were available for 98% of the cohort at 5 years, by which time 199 patients (58%) had died and 52 (15%) had experienced a recurrent stroke, 12 (23%) of which were fatal within 28 days. The 5-year cumulative risk of first recurrent stroke was 22.5% (95% confidence limits [CL], 16.8%, 28.1%). The risk of recurrent stroke was greatest in the first 6 months after stroke, at 8.8% (95% CL, 5.4%, 12.1%). After adjustment for age and sex, the prognostic factors for recurrent stroke were advanced, but not extreme, age (75 to 84 years) (hazard ratio [HR], 2.6; 95% CL, 1.1, 6.2), hemorrhagic index stroke (HR, 2.1; 95% CL, 0.98, 4.4), and diabetes mellitus (HR, 2.1; 95% CL, 0.95, 4.4). Conclusions: Approximately 1 in 6 survivors (15%) of a first-ever stroke experience a recurrent stroke over the next 5 years, of which 25% are fatal within 28 days. The pathological subtype of the recurrent stroke is the same as that of the index stroke in 88% of cases. The predictors of first recurrent stroke in this study were advanced age, hemorrhagic index stroke, and diabetes mellitus, but numbers of recurrent events were modest. Because the risk of recurrent stroke is highest (8.8%) in the first 6 months after stroke, strategies for secondary prevention should be initiated as soon as possible after the index event.
Abstract:
Monocrotaline is a pyrrolizidine alkaloid known to cause toxicity in humans and animals. Its mechanism of biological action is still unclear, although DNA crosslinking has been suggested to play a role in its activity. In this study we found that an active metabolite of monocrotaline, dehydromonocrotaline (DHM), alkylates guanines at the N7 position of DNA with a preference for 5'-GG and 5'-GA sequences. In addition, it generates piperidine- and heat-resistant multiple DNA crosslinks, as confirmed by electrophoresis and electron microscopy. On the basis of these findings, we propose that DHM undergoes rapid polymerization to a structure which is able to crosslink several fragments of DNA.
Abstract:
Liver samples from rabbits killed by RHDV, collected from five States in Australia in 1996 and 1997 were analysed by RT-PCR. A 398 bp fragment of the capsid protein (VP60) gene was amplified by PCR and directly sequenced. The alignment of the nucleotide and amino acid sequences and their comparison with the original strain of the virus released in Australia indicated genetic changes after two years have been small with 98.2% to 100% identity. The constructed phylogenetic tree suggests slight differences in nucleotide substitutions in various States but there is no clear evidence of clustering of sequences according to their geographic origin. In practical terms, sequencing of viral RNA provides a means of testing the efficacy of further releases and subsequent spread of the virus if such a strategy is employed as a means of enhancing RHD as a biological control of the wild rabbit in Australia.