Digital Library

62 results for sequence based alignments

Better prediction of protein contact number using a support vector regression analysis of amino acid sequence

Relevance:

30.00%

Publisher:

Abstract:

Background: Protein tertiary structure can be partly characterized via each amino acid's contact number measuring how residues are spatially arranged. The contact number of a residue in a folded protein is a measure of its exposure to the local environment, and is defined as the number of C-beta atoms in other residues within a sphere around the C-beta atom of the residue of interest. Contact number is partly conserved between protein folds and thus is useful for protein fold and structure prediction. In turn, each residue's contact number can be partially predicted from primary amino acid sequence, assisting tertiary fold analysis from sequence data. In this study, we provide a more accurate contact number prediction method from protein primary sequence. Results: We predict contact number from protein sequence using a novel support vector regression algorithm. Using protein local sequences with multiple sequence alignments (PSI-BLAST profiles), we demonstrate a correlation coefficient between predicted and observed contact numbers of 0.70, which outperforms previously achieved accuracies. Including additional information about sequence weight and amino acid composition further improves prediction accuracies significantly with the correlation coefficient reaching 0.73. If residues are classified as being either contacted or non-contacted, the prediction accuracies are all greater than 77%, regardless of the choice of classification thresholds. Conclusion: The successful application of support vector regression to the prediction of protein contact number reported here, together with previous applications of this approach to the prediction of protein accessible surface area and B-factor profile, suggests that a support vector regression approach may be very useful for determining the structure-function relation between primary sequence and higher order consecutive protein structural and functional properties.
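The contact number definition in the abstract above can be illustrated with a minimal sketch. It assumes C-beta coordinates are already available as an N x 3 array; the 12 Å radius and the function names are illustrative assumptions, since the abstract does not state the sphere radius, and the support vector regression itself is not reproduced here.

```python
import numpy as np

def contact_numbers(cb_coords, radius=12.0):
    """Count, for each residue, the C-beta atoms of other residues within `radius`.

    `radius` (in angstroms) is an assumed value, not taken from the abstract.
    """
    coords = np.asarray(cb_coords, dtype=float)
    # Pairwise Euclidean distances between all C-beta atoms.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    within = dist <= radius
    # A residue's own C-beta atom does not count as a contact.
    np.fill_diagonal(within, False)
    return within.sum(axis=1)

def pearson_cc(predicted, observed):
    """Pearson correlation coefficient between predicted and observed values."""
    return float(np.corrcoef(predicted, observed)[0, 1])
```

The correlation helper corresponds to the kind of figure the abstract reports (0.70 and 0.73) when applied to predicted and observed contact numbers.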

SeqDoC: rapid SNP and mutation detection by direct comparison of DNA sequence chromatograms

Relevance:

30.00%

Publisher:

Abstract:

Background: This paper describes SeqDoC, a simple, web-based tool to carry out direct comparison of ABI sequence chromatograms. This allows the rapid identification of single nucleotide polymorphisms (SNPs) and point mutations without the need to install or learn more complicated analysis software. Results: SeqDoC produces a subtracted trace showing differences between a reference and test chromatogram, and is optimised to emphasise those characteristic of single base changes. It automatically aligns sequences, and produces straightforward graphical output. The use of direct comparison of the sequence chromatograms means that artefacts introduced by automatic base-calling software are avoided. Homozygous and heterozygous substitutions and insertion/deletion events are all readily identified. SeqDoC successfully highlights nucleotide changes missed by the Staden package 'tracediff' program. Conclusion: SeqDoC is ideal for small-scale SNP identification, for identification of changes in random mutagenesis screens, and for verification of PCR amplification fidelity. Differences are highlighted, not interpreted, allowing the investigator to make the ultimate decision on the nature of the change.
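The idea of a subtracted trace can be sketched as follows. This is not SeqDoC's implementation: it assumes the two traces for one base channel are already aligned and equally long, normalises each by its peak, and flags large differences. The function names and the 0.5 threshold are hypothetical.

```python
import numpy as np

def difference_trace(reference, test):
    """Subtract two aligned, equally long traces after peak normalisation."""
    ref = np.asarray(reference, dtype=float)
    tst = np.asarray(test, dtype=float)
    if ref.shape != tst.shape:
        raise ValueError("traces must be aligned to the same length")
    # Scale each trace so that its highest peak equals 1 (assumes a non-zero peak).
    ref = ref / ref.max()
    tst = tst / tst.max()
    return ref - tst

def flag_positions(diff, threshold=0.5):
    """Return indices where the absolute difference reaches the threshold."""
    return np.flatnonzero(np.abs(diff) >= threshold)
```

As in the abstract, a sketch like this only highlights differences; deciding what a flagged position means is left to the investigator.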

A new algorithm for the detection of intercluster galaxy filaments using galaxy orientation alignments

Relevance:

30.00%

Publisher:

Abstract:

We present a new algorithm for detecting intercluster galaxy filaments based upon the assumption that the orientations of constituent galaxies along such filaments are non-isotropic. We apply the algorithm to the 2dF Galaxy Redshift Survey catalogue and find that it readily detects many straight filaments between close cluster pairs. At large intercluster separations (> 15 h^-1 Mpc), we find that the detection efficiency falls quickly, as it also does with more complex filament morphologies. We explore the underlying assumptions and suggest that it is only in the case of close cluster pairs that we can expect galaxy orientations to be significantly correlated with filament direction.
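One way to quantify non-isotropic orientations is sketched below. The abstract does not specify the statistic used, so this is a stand-in: it averages cos(2Δ), where Δ is the angle between each galaxy's position angle and an assumed filament direction. The function name and inputs are hypothetical.

```python
import numpy as np

def alignment_statistic(position_angles_deg, filament_angle_deg):
    """Mean of cos(2 * delta) over galaxies.

    The factor 2 makes the measure periodic over 180 degrees, as suits
    orientations without a head or tail. Values near 0 suggest isotropy,
    values near +1 alignment with the filament, near -1 perpendicularity.
    """
    angles = np.asarray(position_angles_deg, dtype=float)
    delta = np.radians(angles - filament_angle_deg)
    return float(np.mean(np.cos(2.0 * delta)))
```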

A new method for identification of protein (sub)families in a set of proteins based on hydropathy distribution in proteins

Relevance:

30.00%

Publisher:

Abstract:

Structural similarity among proteins is reflected in the distribution of hydropathicity along the amino acids in the protein sequence. Similarities in the hydropathy distributions are obvious for homologous proteins within a protein family. They also were observed for proteins with related structures, even when sequence similarities were undetectable. Here we present a novel method that employs the hydropathy distribution in proteins for identification of (sub)families in a set of (homologous) proteins. We represent proteins as points in a generalized hydropathy space, represented by vectors of specifically defined features. The features are derived from hydropathy of the individual amino acids. Projection of this space onto principal axes reveals groups of proteins with related hydropathy distributions. The groups identified correspond well to families of structurally and functionally related proteins. We found that this method accurately identifies protein families in a set of proteins, or subfamilies in a set of homologous proteins. Our results show that protein families can be identified by the analysis of hydropathy distribution, without the need for sequence alignment. (C) 2005 Wiley-Liss, Inc.
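The pipeline described above (hydropathy features, then projection onto principal axes) can be sketched under stated assumptions. The abstract does not define its features or name a hydropathy scale; the sketch below uses the Kyte-Doolittle scale and segment-wise mean hydropathy as illustrative choices, and every name in it is hypothetical.

```python
import numpy as np

# Kyte-Doolittle hydropathy values (an assumed choice of scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_features(sequence, n_bins=4):
    """Mean hydropathy in `n_bins` consecutive segments of the sequence.

    Assumes the sequence has at least `n_bins` residues.
    """
    values = np.array([KYTE_DOOLITTLE[aa] for aa in sequence])
    return np.array([seg.mean() for seg in np.array_split(values, n_bins)])

def project_onto_principal_axes(features, n_axes=2):
    """Project row-wise feature vectors onto their leading principal axes."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal axes, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_axes].T
```

Proteins with similar hydropathy distributions would land close together in the projected coordinates, which is the grouping effect the abstract describes; no sequence alignment is involved.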

Measurement-based teleportation along quantum spin chains

Relevance:

30.00%

Publisher:

Abstract:

We examine the teleportation of an unknown spin-1/2 quantum state along a quantum spin chain with an even number of sites. Our protocol, using a sequence of Bell measurements, may be viewed as an iterated version of the 2-qubit protocol of C. H. Bennett et al. [Phys. Rev. Lett. 70, 1895 (1993)]. A decomposition of the Hilbert space of the spin chain into 4 vector spaces, called Bell subspaces, is given. It is established that any state from a Bell subspace may be used as a channel to perform unit fidelity teleportation. The space of all spin-0 many-body states, which includes the ground states of many known antiferromagnetic systems, belongs to a common Bell subspace. A channel-dependent teleportation parameter O is introduced, and a bound on the teleportation fidelity is given in terms of O.
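The single-step protocol of Bennett et al. that the abstract iterates can be checked numerically. The sketch below is not the spin-chain protocol: it assumes a three-qubit system with the channel in the Bell state Phi+, projects the first two qubits onto each Bell state, and applies the standard Pauli correction to the third. All names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

S = 1.0 / np.sqrt(2.0)
BELL = {
    "Phi+": S * np.array([1, 0, 0, 1]),
    "Phi-": S * np.array([1, 0, 0, -1]),
    "Psi+": S * np.array([0, 1, 1, 0]),
    "Psi-": S * np.array([0, 1, -1, 0]),
}
# Correction applied to the receiver's qubit for each measurement outcome.
CORRECTION = {"Phi+": I2, "Phi-": Z, "Psi+": X, "Psi-": Z @ X}

def teleport(psi):
    """Return {outcome: (probability, fidelity)} for teleporting `psi`."""
    psi = np.asarray(psi, dtype=complex)
    # Qubit 0 holds psi; qubits 1 and 2 share the Phi+ channel.
    total = np.kron(psi, BELL["Phi+"]).reshape(4, 2)  # rows: qubits 0,1; cols: qubit 2
    results = {}
    for name, bell in BELL.items():
        unnormalised = bell.conj() @ total  # state of qubit 2 after projection
        prob = np.vdot(unnormalised, unnormalised).real
        out = CORRECTION[name] @ (unnormalised / np.sqrt(prob))
        results[name] = (prob, abs(np.vdot(psi, out)) ** 2)
    return results
```

For a normalised input state, each of the four outcomes occurs with probability 1/4 and the corrected output has unit fidelity, which is the single-link analogue of the unit-fidelity statement in the abstract.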

Development of an oligonucleotide-based SNP detection method on lateral flow strips using hexapet tags

Relevance:

30.00%

Publisher:

Abstract:

Detection of point mutations or single nucleotide polymorphisms (SNPs) is important in relation to disease susceptibility or detection in pathogens of mutations determining drug resistance or host range. There is an emergent need for rapid detection methods amenable to point-of-care applications. The purpose of this study was to reduce to practice a novel method for SNP detection and to demonstrate that this technology can be used downstream of nucleic acid amplification. The authors used a model system to develop an oligonucleotide-based SNP detection system on nitrocellulose lateral flow strips. To optimize the assay they used cloned sequences of the herpes simplex virus-1 (HSV-1) DNA polymerase gene into which they introduced a point mutation. The assay system uses chimeric polymerase chain reaction (PCR) primers that incorporate hexameric repeat tags ("hexapet tags"). The chimeric sequences allow capture of amplified products to predefined positions on a lateral flow strip. These "hexapet" sequences have minimal cross-reactivity and allow specific hybridization-based capture of the PCR products at room temperature onto lateral flow strips that have been striped with complementary hexapet tags. The allele-specific amplification was carried out with both mutant and wild-type primer sets present in the PCR mix ("competitive" format). The resulting PCR products carried a hexapet tag that corresponded with either a wild-type or mutant sequence. The lateral flow strips are dropped into the PCR reaction tube, and mutant sequence and wild-type sequences diffuse along the strip and are captured at the corresponding position on the strip. A red line indicative of a positive reaction is visible after 1 minute. Unlike other systems that require separate reactions and strips for each target sequence, this system allows multiplex PCR reactions and multiplex detection on a single strip or other suitable substrates. 
Unambiguous visual discrimination of a point mutation under room temperature hybridization conditions was achieved with this model system in 10 minutes after PCR. The authors have developed a capture-based hybridization method for the detection and discrimination of HSV-1 DNA polymerase genes that contain a single nucleotide change. It has been demonstrated that the hexapet oligonucleotides can be adapted for hybridization on the lateral flow strip platform for discrimination of SNPs. This is the first step in demonstrating SNP detection on lateral flow using the hexapet oligonucleotide capture system. It is anticipated that this novel system can be widely used in point-of-care settings.

Development of a DNA-based method for detection and identification of Phytophthora species

Relevance:

30.00%

Publisher:

Abstract:

Phytophthora diseases cause major losses to agricultural and horticultural production in Australia and worldwide. Most Phytophthora diseases are soilborne and difficult to control, making disease prevention an important component of many disease management strategies. Detection and identification of the causal agent, therefore, is an essential part of effective disease management. This paper describes the development and validation of a DNA-based diagnostic assay that can detect and identify 27 different Phytophthora species. We have designed PCR primers that are specific to the genus Phytophthora. The resulting amplicon after PCR is subjected to digestion by restriction enzymes to yield a specific restriction pattern or fingerprint unique to each species. The restriction patterns are compared with a key comprising restriction patterns of type specimens or representative isolates of 27 different Phytophthora species. A number of fundamental issues, such as genetic diversity within and among species which underpin the development and validation of DNA-based diagnostic assays, are addressed in this paper.
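The amplify, digest, and compare-to-key workflow can be sketched in silico. This is an illustration only: the sequence, the key, and the species labels below are made up, and the paper's primers, enzymes, and reference patterns are not given in the abstract. The only real-world fact used is that EcoRI cuts G^AATTC, one base into its recognition site.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths (longest first) after cutting a linear
    sequence at every occurrence of `site`, `cut_offset` bases into the site."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)

def identify(fragments, key, tolerance=0):
    """Return the entries of `key` whose reference pattern matches `fragments`.

    `key` maps a label to fragment lengths sorted longest first;
    `tolerance` is the allowed size difference per fragment, in bases.
    """
    matches = []
    for species, pattern in key.items():
        same_count = len(pattern) == len(fragments)
        if same_count and all(abs(a - b) <= tolerance for a, b in zip(pattern, fragments)):
            matches.append(species)
    return matches
```

For example, `digest("AAAAGAATTCCCCCCC", "GAATTC", 1)` cuts once and yields two fragments, which `identify` would then compare against a hypothetical key such as `{"species_A": [11, 5]}`.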

Effect of sequence variation in Plasmodium falciparum Histidine-Rich protein 2 on binding of specific monoclonal antibodies: Implications for rapid diagnostic tests for malaria

Relevance:

30.00%

Publisher:

Abstract:

Journal of Clinical Microbiology, August 2006, p. 2773-2778, Vol. 44, No. 8. doi:10.1128/JCM.02557-05. Copyright © 2006, American Society for Microbiology. All Rights Reserved.

Nelson Lee (1,2), Joanne Baker (2), Kathy T. Andrews (1), Michelle L. Gatton (1,3), David Bell (4), Qin Cheng (2,3), and James McCarthy (1). Affiliations: (1) Australian Centre for International and Tropical Health and Nutrition, Queensland Institute of Medical Research and School of Population Health, University of Queensland, Queensland, Australia; (2) Department of Drug Resistance and Diagnostics, Australian Army Malaria Institute, Brisbane, Australia; (3) Malaria Drug Resistance and Chemotherapy, Queensland Institute of Medical Research, Queensland, Australia; (4) World Health Organization, Regional Office for the Western Pacific, Manila, Philippines. Received 8 December 2005; returned for modification 23 February 2006; accepted 26 May 2006.

The ability to accurately diagnose malaria infections, particularly in settings where laboratory facilities are not well developed, is of key importance in the control of this disease. Rapid diagnostic tests (RDTs) offer great potential to address this need. Reports of significant variation in the field performance of RDTs based on the detection of Plasmodium falciparum histidine-rich protein 2 (HRP2) (PfHRP2) and of significant sequence polymorphism in PfHRP2 led us to evaluate the binding of four HRP2-specific monoclonal antibodies (MABs) to parasite proteins from geographically distinct P. falciparum isolates, define the epitopes recognized by these MABs, and relate the copy number of the epitopes to MAB reactivity. We observed a significant difference in the reactivity of the same MAB to different isolates and between different MABs tested with single isolates. When the target epitopes of three of the MABs were determined and mapped onto the peptide sequences of the field isolates, significant variability in the frequency of these epitopes was observed. These findings support the role of sequence variation as an explanation for variations in the performance of HRP2-based RDTs and point toward possible approaches to improve their diagnostic sensitivities.
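Mapping epitopes onto isolate sequences and tallying their copy number can be sketched as a simple motif count. The abstract does not list the epitope sequences, so any motifs and isolate sequences passed to these hypothetical functions are placeholders, not the ones determined in the study.

```python
def count_epitope(sequence, epitope):
    """Count occurrences of `epitope` in `sequence`, allowing overlaps."""
    count = 0
    start = sequence.find(epitope)
    while start != -1:
        count += 1
        start = sequence.find(epitope, start + 1)
    return count

def epitope_frequencies(isolates, epitopes):
    """Map each isolate name to a {epitope: copy number} dictionary."""
    return {
        name: {ep: count_epitope(seq, ep) for ep in epitopes}
        for name, seq in isolates.items()
    }
```

Comparing the resulting tables across isolates is one way to expose the variability in epitope frequency that the abstract relates to antibody reactivity.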

Predicting residue-wise contact orders in proteins by support vector regression

Relevance:

30.00%

Publisher:

Abstract:

Background: The residue-wise contact order (RWCO) describes the sequence separations between the residue of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable information for reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. Results: We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences. 
Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance with the CC to 0.57 and an RMSE of 0.79. In addition, combining the predicted secondary structure by PSIPRED was found to significantly improve the prediction performance and could yield the best prediction accuracy with a CC of 0.60 and RMSE of 0.78, which provided at least comparable performance compared with the other existing methods. Conclusion: The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
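A rough sketch of how an RWCO profile and the reported RMSE might be computed follows. The abstract does not give the formula, so the definition used here (sum of sequence separations to contacting residues, divided by chain length), the 8 Å contact radius, and the function names are all assumptions.

```python
import numpy as np

def residue_wise_contact_order(cb_coords, radius=8.0):
    """Assumed RWCO: for residue i, sum |i - j| over residues j whose
    C-beta atom lies within `radius`, divided by the chain length."""
    coords = np.asarray(cb_coords, dtype=float)
    n = len(coords)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    in_contact = dist <= radius
    np.fill_diagonal(in_contact, False)
    idx = np.arange(n)
    separation = np.abs(idx[:, None] - idx[None, :])
    return (separation * in_contact).sum(axis=1) / n

def rmse(predicted, observed):
    """Root mean square error between predicted and observed values."""
    p = np.asarray(predicted, dtype=float)
    o = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((p - o) ** 2)))
```

Unlike a plain contact count, this profile weights each contact by how far apart the two residues are in sequence, which is why the abstract describes RWCO as capturing long-range contacts.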

Predicting the solvent accessibility of transmembrane residues from protein sequence

Relevance:

30.00%

Publisher:

Abstract:

In this study, we propose a novel method to predict the solvent accessible surface areas of transmembrane residues. For both transmembrane alpha-helix and beta-barrel residues, the correlation coefficients between the predicted and observed accessible surface areas are around 0.65. On the basis of predicted accessible surface areas, residues exposed to the lipid environment or buried inside a protein can be identified by using certain cutoff thresholds. We have extensively examined our approach based on different definitions of accessible surface areas and a variety of sets of control parameters. Given that experimentally determining the structures of membrane proteins is very difficult and membrane proteins are actually abundant in nature, our approach is useful for theoretically modeling membrane protein tertiary structures, particularly for modeling the assembly of transmembrane domains. This approach can be used to annotate the membrane proteins in proteomes to provide extra structural and functional information.
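The thresholding step, in which predicted accessible surface areas are turned into buried or exposed labels, can be sketched as below. The 0.2 cutoff on a relative (0 to 1) accessibility scale is an assumed value, since the abstract only says that certain cutoff thresholds are used; the function names are hypothetical.

```python
def classify_exposure(predicted_asa, cutoff=0.2):
    """Label a residue 'exposed' if its predicted relative accessible
    surface area is at or above `cutoff`, otherwise 'buried'."""
    return ["exposed" if value >= cutoff else "buried" for value in predicted_asa]

def agreement(predicted_labels, observed_labels):
    """Fraction of residues whose predicted label matches the observed one."""
    pairs = list(zip(predicted_labels, observed_labels))
    return sum(p == o for p, o in pairs) / len(pairs)
```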

Quantitative analysis of DNA-protein interactions using double-labeled native gel electrophoresis and fluorescence-based imaging

Relevance:

30.00%

Publisher:

Abstract:

We have developed a sensitive, non-radioactive method to assess the interaction of transcription factors/DNA-binding proteins with DNA. We have modified the traditional radiolabeled DNA gel mobility shift assay to incorporate a DNA probe end-labeled with a Texas-red fluorophore and a DNA-binding protein tagged with the green fluorescent protein to monitor precisely DNA-protein complexation by native gel electrophoresis. We have applied this method to the DNA-binding proteins telomere release factor-1 and the sex-determining region-Y, demonstrating that the method is sensitive (able to detect 100 fmol of fluorescently labeled DNA), permits direct visualization of both the DNA probe and the DNA-binding protein, and enables quantitative analysis of DNA and protein complexation, and thereby an estimation of the stoichiometry of protein-DNA binding.

Transcript annotation in FANTOM3: Mouse gene catalog based on physical cDNAs

Relevance:

30.00%

Publisher:

Abstract:

The international FANTOM consortium aims to produce a comprehensive picture of the mammalian transcriptome, based upon an extensive cDNA collection and functional annotation of full-length enriched cDNAs. The previous dataset, FANTOM2, comprised 60,770 full-length enriched cDNAs. Functional annotation revealed that this cDNA dataset contained only about half of the estimated number of mouse protein-coding genes, indicating that a number of cDNAs still remained to be collected and identified. To pursue the complete gene catalog that covers all predicted mouse genes, cloning and sequencing of full-length enriched cDNAs has been continued since FANTOM2. In FANTOM3, 42,031 newly isolated cDNAs were subjected to functional annotation, and the annotation of 4,347 FANTOM2 cDNAs was updated. To accomplish accurate functional annotation, we improved our automated annotation pipeline by introducing new coding sequence prediction programs and developed a Web-based annotation interface for simplifying the annotation procedures to reduce manual annotation errors. Automated coding sequence and function prediction was followed by manual curation and review by expert curators. A total of 102,801 full-length enriched mouse cDNAs were annotated. Out of 102,801 transcripts, 56,722 were functionally annotated as protein coding (including partial or truncated transcripts), providing to our knowledge the greatest current coverage of the mouse proteome by full-length cDNAs. The total number of distinct non-protein-coding transcripts increased to 34,030. The FANTOM3 annotation system, consisting of automated computational prediction, manual curation, and final expert curation, facilitated the comprehensive characterization of the mouse transcriptome, and could be applied to the transcriptomes of other species.
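The counts quoted in the abstract can be cross-checked with a few lines of arithmetic. The variable names are illustrative; the observation that the FANTOM2 set plus the newly isolated cDNAs adds up to the reported total is a consistency check on the quoted figures, not a statement of how the authors derived the total.

```python
fantom2_cdnas = 60_770       # FANTOM2 full-length enriched cDNAs
new_cdnas = 42_031           # newly isolated cDNAs annotated in FANTOM3
total_annotated = 102_801    # total reported in the abstract
protein_coding = 56_722      # annotated as protein coding
non_coding = 34_030          # distinct non-protein-coding transcripts

# The two datasets together match the reported total.
combined = fantom2_cdnas + new_cdnas

# Shares of the annotated set, as fractions of the reported total.
coding_fraction = protein_coding / total_annotated
non_coding_fraction = non_coding / total_annotated

print(combined == total_annotated)
print(f"{coding_fraction:.1%} protein coding, {non_coding_fraction:.1%} non-coding")
```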

Translation of the flavivirus Kunjin NS3 gene in cis but not its RNA sequence or secondary structure is essential for efficient RNA packaging

Relevance:

30.00%

Publisher:

Abstract:

Our previous studies using trans-complementation analysis of Kunjin virus (KUN) full-length cDNA clones harboring in-frame deletions in the NS3 gene demonstrated the inability of these defective complemented RNAs to be packaged into virus particles (W. J. Liu, P. L. Sedlak, N. Kondratieva, and A. A. Khromykh, J. Virol. 76:10766-10775). In this study we aimed to establish whether this requirement for NS3 in RNA packaging is determined by the secondary RNA structure of the NS3 gene or by the essential role of the translated NS3 gene product. Multiple silent mutations of three computer-predicted stable RNA structures in the NS3 coding region of KUN replicon RNA aimed at disrupting RNA secondary structure without affecting amino acid sequence did not affect RNA replication and packaging into virus-like particles in the packaging cell line, thus demonstrating that the predicted conserved RNA structures in the NS3 gene do not play a role in RNA replication and/or packaging. In contrast, double frameshift mutations in the NS3 coding region of full-length KUN RNA, producing scrambled NS3 protein but retaining secondary RNA structure, resulted in the loss of ability of these defective RNAs to be packaged into virus particles in complementation experiments in KUN replicon-expressing cells. Furthermore, the more robust complementation-packaging system based on established stable cell lines producing large amounts of complemented replicating NS3-deficient replicon RNAs and infection with KUN virus to provide structural proteins also failed to detect any secreted virus-like particles containing packaged NS3-deficient replicon RNAs. These results have now firmly established the requirement of KUN NS3 protein translated in cis for genome packaging into virus particles.
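The distinction between silent mutations (RNA changed, protein unchanged) and frameshifts (protein scrambled) can be made concrete with a small translation sketch. It uses the standard genetic code on DNA-alphabet sequences; the function names and any sequences passed to them are hypothetical, not the KUN NS3 constructs.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(seq):
    """Translate a coding sequence in frame 0, ignoring trailing bases."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))

def is_silent(original, mutated):
    """True if the nucleotide sequences differ but encode the same protein."""
    return (
        len(original) == len(mutated)
        and original != mutated
        and translate(original) == translate(mutated)
    )
```

As an illustration, `ATGGCTGAAACTGGT` translates to `MAETG`. Deleting the G at position 4 and inserting an A before the final codon gives `ATGCTGAAACTAGGT`, which translates to `MLKLG`: the residues between the two frameshifts are scrambled while the downstream frame is restored, mirroring the double-frameshift design in the abstract.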

Rationalization of taro germplasm collections in the Pacific Island region using simple sequence repeat (SSR) markers

Relevance:

30.00%

Publisher:

Abstract:

A regional (Oceania) core collection for taro germplasm has been developed based on phenotypic and molecular characterization. In total, 2199 accessions of taro germplasm have been collected by TaroGen (Taro Genetic Resources: Conservation and Utilisation) from 10 countries in Oceania: Papua New Guinea, Solomon Islands, Vanuatu, New Caledonia, Fiji, Palau, Niue, Tonga, Cook Islands and Samoa. Our objective was to select 10% from each country to contribute to a regional core. The larger collections from Papua New Guinea, Vanuatu and New Caledonia were analysed based on phenotypic characters, and a diverse subset representing 20% of these collections was fingerprinted. A diverse 20% subsample was also taken from the Solomon Islands. All accessions from the other six countries were fingerprinted. In total, 515 accessions were genotyped (23.4% overall) using taro specific simple sequence repeat (SSR) markers. DNA fingerprint data showed that great allelic diversity existed in Papua New Guinea and the Solomon Islands. Interestingly, rare alleles were identified in taros from the Solomon Islands province of Choiseul which were not observed in any of the other collections. Overall, 211 accessions were recommended for inclusion in the final regional core collection based on the phenotypic and molecular characterization.

Exploiting sequence dependencies in the prediction of peroxisomal proteins

Relevance:

30.00%

Publisher:

Abstract:

Prediction of peroxisomal matrix proteins generally depends on the presence of one of two distinct motifs at the end of the amino acid sequence. PTS1 peroxisomal proteins have a well conserved tripeptide at the C-terminal end. However, the preceding residues in the sequence arguably play a crucial role in targeting the protein to the peroxisome. Previous work in applying machine learning to the prediction of peroxisomal matrix proteins has failed to capitalize on the full extent of these dependencies. We benchmark a range of machine learning algorithms, and show that a classifier based on the Support Vector Machine produces more accurate results when dependencies between the conserved motif and the preceding section are exploited. We publish an updated and rigorously curated data set that results in increased prediction accuracy of most tested models.
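How the C-terminal tripeptide and its preceding residues might be turned into classifier input is sketched below. The abstract does not describe the encoding or the window size, so the nine-residue context, the one-hot encoding, and the function names are assumptions; the classifier itself is not shown.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def c_terminal_window(sequence, motif_len=3, context_len=9):
    """Split the C-terminal end into (preceding context, terminal motif).

    `context_len` is an assumed window size, not taken from the abstract.
    """
    tail = sequence[-(motif_len + context_len):]
    return tail[:-motif_len], tail[-motif_len:]

def one_hot(peptide):
    """Encode each residue as a 20-element indicator vector, concatenated."""
    vector = []
    for aa in peptide:
        vector.extend(1 if aa == ref else 0 for ref in AMINO_ACIDS)
    return vector
```

Feeding the context and motif encodings to a classifier together, rather than the tripeptide alone, is one way to let it exploit the dependencies between the conserved motif and the preceding residues that the abstract emphasises.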
