62 results for sequence based alignments
in University of Queensland eSpace - Australia
Abstract:
This article investigates the expression patterns of 160 genes that are expressed during early mouse development. The cDNAs were isolated from 7.5 days postcoitum (dpc) endoderm, a region that comprises visceral endoderm (VE), definitive endoderm, and the node, tissues that are required for the initial steps of axial specification and tissue patterning in the mouse. To avoid examining the same gene more than once, and to exclude potentially ubiquitously expressed housekeeping genes, cDNA sequence was derived from 1978 clones of the Endoderm library. These yielded 1440 distinct cDNAs, of which 123 proved to be novel in the mouse. In situ hybridization analysis was carried out on 160 of the cDNAs, and of these, 29 (18%) proved to have restricted expression patterns.
Abstract:
Wurst is a protein threading program with an emphasis on high quality sequence to structure alignments (http://www.zbh.uni-hamburg.de/wurst). Submitted sequences are aligned to each of about 3000 templates with a conventional dynamic programming algorithm, but using a score function with sophisticated structure and sequence terms. The structure terms are a log-odds probability of sequence to structure fragment compatibility, obtained from a Bayesian classification procedure. A simplex optimization was used to optimize the sequence-based terms for the goal of alignment and model quality and to balance the sequence and structural contributions against each other. Both sequence and structural terms operate with sequence profiles.
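The alignment step described above can be illustrated with a minimal sketch of global dynamic programming over a precomputed score matrix. This is not Wurst's score function: the names `combined_score` and `align`, the linear gap penalty, and the single `weight` that balances the sequence and structure terms are all assumptions made for illustration.

```python
def combined_score(seq_term, struct_term, weight=0.5):
    """Blend two score matrices; `weight` is a hypothetical balance parameter."""
    return [
        [weight * s + (1.0 - weight) * t for s, t in zip(row_s, row_t)]
        for row_s, row_t in zip(seq_term, struct_term)
    ]


def align(score, gap=-1.0):
    """Global alignment by dynamic programming.

    score[i][j] is the score for pairing sequence position i with
    template position j. Returns (total score, list of aligned index pairs).
    A linear gap penalty is assumed for simplicity.
    """
    n = len(score)
    m = len(score[0]) if n else 0
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = i * gap
    for j in range(1, m + 1):
        best[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j - 1] + score[i - 1][j - 1],  # pair i with j
                best[i - 1][j] + gap,                      # gap in template
                best[i][j - 1] + gap,                      # gap in sequence
            )
    # Trace back from the bottom-right corner to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + score[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif best[i][j] == best[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs
```

In a threading setting, the same routine would be run once per template, with the score matrix rebuilt for each sequence-template pair.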
Abstract:
OBJECTIVE: Although little studied in developing countries, multidrug-resistant tuberculosis (MDR-TB) is considered a major threat. We report the molecular epidemiology, clinical features and outcome of an emerging MDR-TB epidemic. METHODS: In 1996 all tuberculosis suspects in the rural Hlabisa district, South Africa, had sputum cultured, and drug susceptibility patterns of mycobacterial isolates were determined. Isolates with MDR-TB (resistant to both isoniazid and rifampicin) were DNA fingerprinted by restriction fragment length polymorphism (RFLP) using IS6110 and polymorphic guanine-cytosine-rich sequence-based (PGRS) probes. Patients with MDR-TB were traced to determine outcome. Data were compared with results from a survey of drug susceptibility done in 1994. RESULTS: The rate of MDR-TB among smear-positive patients increased six-fold from 0.36% (1/275) in 1994 to 2.3% (13/561) in 1996 (P = 0.04). A further eight smear-negative cases were identified in 1996 from culture, six of whom had not been diagnosed with tuberculosis. MDR disease was clinically suspected in only five of the 21 cases (24%). Prevalence of primary and acquired MDR-TB was 1.8% and 4.1%, respectively. Twelve MDR-TB cases (67%) were in five RFLP-defined clusters. Among 20 traced patients, 10 (50%) had died, five had active disease (25%) and five (25%) were apparently cured. CONCLUSIONS: The rate of MDR-TB has risen rapidly in Hlabisa, apparently due to both reactivation disease and recent transmission. Many patients were not diagnosed with tuberculosis and many were not suspected of drug-resistant disease, and outcome was poor.
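The "six-fold" increase quoted in the results can be checked directly from the reported counts. The short sketch below only reproduces that arithmetic; the helper name `rate` is illustrative, and the reported P value is not recomputed here.

```python
def rate(cases, total):
    """Proportion of smear-positive patients with MDR-TB."""
    return cases / total


rate_1994 = rate(1, 275)    # counts reported for the 1994 survey
rate_1996 = rate(13, 561)   # counts reported for 1996
fold_increase = rate_1996 / rate_1994

print(f"1994: {rate_1994:.2%}, 1996: {rate_1996:.2%}, fold: {fold_increase:.1f}")
```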
Abstract:
Sausage is a protein sequence threading program, but with remarkable run-time flexibility. Using different scripts, it can calculate protein sequence-structure alignments, search structure libraries, swap force fields, create models from alignments, convert file formats and analyse results. There are several different force fields which might be classed as knowledge-based, although they do not rely on Boltzmann statistics. Different force fields are used for alignment calculations and subsequent ranking of calculated models.
Abstract:
In this study, the suitability of two repetitive-element-based PCR (rep-PCR) assays, enterobacterial repetitive intergenic consensus (ERIC)-PCR and BOX-PCR, to rapidly characterize Pseudomonas aeruginosa strains isolated from patients with cystic fibrosis (CF) was examined. ERIC-PCR utilizes paired sequence-specific primers and BOX-PCR a single primer that target highly conserved repetitive elements in the P. aeruginosa genome. Using these rep-PCR assays, 163 P. aeruginosa isolates cultured from sputa collected from 50 patients attending an adult CF clinic and 50 children attending a paediatric CF clinic were typed. The results of the rep-PCR assays were compared to the results of PFGE. All three assays revealed the presence of six major clonal groups shared by multiple patients attending either of the CF clinics, with the dominant clonal group infecting 38% of all patients. This dominant clonal group was not related to the dominant clonal group detected in Sydney or Melbourne (pulsotype 1), nor was it related to the dominant groups detected in the UK. In all, PFGE and rep-PCR identified 58 distinct clonal groups, with only three of these shared between the two clinics. The results of this study showed that both ERIC-PCR and BOX-PCR are rapid, highly discriminatory and reproducible assays that proved to be powerful surveillance screening tools for the typing of clinical P. aeruginosa isolates recovered from patients with CF.
Abstract:
Patterns of population subdivision and the relationship between gene flow and geographical distance in the tropical estuarine fish Lates calcarifer (Centropomidae) were investigated using mtDNA control region sequences. Sixty-three putative haplotypes were resolved from a total of 270 individuals from nine localities within three geographical regions spanning the north Australian coastline. Despite a continuous estuarine distribution throughout the sampled range, no haplotypes were shared among regions. However, within regions, common haplotypes were often shared among localities. Both sequence-based (average Phi(ST)=0.328) and haplotype-based (average Phi(ST)=0.182) population subdivision analyses indicated strong geographical structuring. Depending on the method of calculation, geographical distance explained either 79 per cent (sequence-based) or 23 per cent (haplotype-based) of the variation in mitochondrial gene flow. Such relationships suggest that genetic differentiation of L. calcarifer has been generated via isolation-by-distance, possibly in a stepping-stone fashion. This pattern of genetic structure is concordant with expectations based on the life history of L. calcarifer and direct studies of its dispersal patterns. Mitochondrial DNA variation, although generally in agreement with patterns of allozyme variation, detected population subdivision at smaller spatial scales. Our analysis of mtDNA variation in L. calcarifer confirms that population genetic models can detect population structure of not only evolutionary significance but also of demographic significance. Further, it demonstrates the power of inferring such structure from hypervariable markers, which correspond to small effective population sizes.
Abstract:
Sound application of molecular epidemiological principles requires working knowledge of both molecular biological and epidemiological methods. Molecular tools have become an increasingly important part of studying the epidemiology of infectious agents. Molecular tools have allowed the aetiological agent within a population to be diagnosed with a greater degree of efficiency and accuracy than conventional diagnostic tools. They have increased the understanding of the pathogenicity, virulence, and host-parasite relationships of the aetiological agent, provided information on the genetic structure and taxonomy of the parasite and allowed the zoonotic potential of previously unidentified agents to be determined. This review describes the concept of epidemiology and proper study design, describes the array of currently available molecular biological tools and provides examples of studies that have integrated both disciplines to successfully unravel zoonotic relationships that would otherwise be impossible to resolve using conventional diagnostic tools. The current limitations of applying these tools, including cautions that need to be addressed during their application, are also discussed. (c) 2005 Australian Society for Parasitology Inc. Published by Elsevier Ltd. All rights reserved.
Abstract:
The polypeptide backbones and side chains of proteins are constantly moving due to thermal motion and the kinetic energy of the atoms. The B-factors of protein crystal structures reflect the fluctuation of atoms about their average positions and provide important information about protein dynamics. Computational approaches to predict thermal motion are useful for analyzing the dynamic properties of proteins with unknown structures. In this article, we utilize a novel support vector regression (SVR) approach to predict the B-factor distribution (B-factor profile) of a protein from its sequence. We explore schemes for encoding sequences and various settings for the parameters used in SVR. Based on a large dataset of high-resolution proteins, our method predicts the B-factor distribution with a Pearson correlation coefficient (CC) of 0.53. In addition, our method predicts the B-factor profile with a CC of at least 0.56 for more than half of the proteins. Our method also performs well for classifying residues (rigid vs. flexible). For almost all predicted B-factor thresholds, prediction accuracies (percent of correctly predicted residues) are greater than 70%. These results exceed the best results of other sequence-based prediction methods. (C) 2005 Wiley-Liss, Inc.
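The two evaluation measures quoted above, the Pearson correlation coefficient between predicted and observed B-factor profiles and the accuracy of rigid versus flexible classification, can be sketched as follows. This is only an illustration of the metrics, not the authors' SVR pipeline; the function names and the idea of applying one shared threshold to both profiles are assumptions.

```python
import math


def pearson_cc(predicted, observed):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_p * var_o)


def classification_accuracy(predicted, observed, threshold):
    """Fraction of residues whose rigid/flexible label is predicted correctly.

    A residue is called flexible when its value exceeds `threshold`
    (an assumed convention, applied to both profiles).
    """
    correct = sum(
        (p > threshold) == (o > threshold) for p, o in zip(predicted, observed)
    )
    return correct / len(predicted)
```

Sweeping `threshold` over a range of values would give an accuracy curve of the kind summarised in the abstract.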
Abstract:
Motivation: While processing of MHC class II antigens for presentation to helper T-cells is essential for normal immune response, it is also implicated in the pathogenesis of autoimmune disorders and hypersensitivity reactions. Sequence-based computational techniques for predicting HLA-DQ binding peptides have encountered limited success, with few prediction techniques developed using three-dimensional models. Methods: We describe a structure-based prediction model for modeling peptide-DQ3.2 beta complexes. We have developed a rapid and accurate protocol for docking candidate peptides into the DQ3.2 beta receptor and a scoring function to discriminate binders from the background. The scoring function was rigorously trained, tested and validated using experimentally verified DQ3.2 beta binding and non-binding peptides obtained from biochemical and functional studies. Results: Our model predicts DQ3.2 beta binding peptides with high accuracy [area under the receiver operating characteristic (ROC) curve A(ROC) > 0.90], compared with experimental data. We investigated the binding patterns of DQ3.2 beta peptides and illustrate that several registers exist within a candidate binding peptide. Further analysis reveals that peptides with multiple registers occur predominantly for high-affinity binders.
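The area under the ROC curve used above to summarise discrimination between binders and non-binders can be computed from raw scores without drawing the curve. The sketch below uses the rank-based (Mann-Whitney) formulation and assumes that a higher score means a more likely binder; the function name and the example scores are hypothetical, and this is not the authors' scoring function.

```python
def roc_auc(binder_scores, nonbinder_scores):
    """Probability that a random binder outscores a random non-binder.

    Ties count as half a win, which makes this equal to the area under
    the ROC curve. Assumes higher scores indicate binders.
    """
    if not binder_scores or not nonbinder_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for b in binder_scores:
        for n in nonbinder_scores:
            if b > n:
                wins += 1.0
            elif b == n:
                wins += 0.5
    return wins / (len(binder_scores) * len(nonbinder_scores))


# Hypothetical scores, for illustration only.
example_auc = roc_auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.1])
```

If the scoring function instead assigns lower (more favourable) values to binders, the scores would need to be negated before calling `roc_auc`.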
Abstract:
Pattern discovery in a long temporal event sequence is of great importance in many application domains. Most previous work focuses on identifying positive associations among time-stamped event types. In this paper, we introduce the problem of defining and discovering negative associations that, like positive rules, may also serve as a source of knowledge discovery. In general, an event-oriented pattern is a pattern that associates with a selected type of event, called a target event. As a counterpart to previous research, we identify patterns that have a negative relationship with the target events. A set of criteria is defined to evaluate the interestingness of patterns associated with such negative relationships. In the process of counting the frequency of a pattern, we propose a new approach, called unique minimal occurrence, which guarantees that the Apriori property holds for all patterns in a long sequence. Based on the interestingness measures, algorithms are proposed to discover potentially interesting patterns for this negative rule problem. Finally, an experiment is carried out on a real application.
Abstract:
Pattern discovery in temporal event sequences is of great importance in many application domains, such as telecommunication network fault analysis. In reality, not every type of event has an accurate timestamp. Some of them, defined as inaccurate events, may only have an interval as the possible time of occurrence. The existence of inaccurate events may cause uncertainty in event ordering. The traditional support model cannot deal with this uncertainty, which would cause some interesting patterns to be missed. A new concept, precise support, is introduced to evaluate the probability that a pattern is contained in a sequence. Based on this new metric, we define the uncertainty model and present an algorithm to discover interesting patterns in a sequence database that has one type of inaccurate event. In our model, the number of types of inaccurate events can readily be extended to k, although at the cost of increased computational complexity.
Abstract:
Bellerophon is a program for detecting chimeric sequences in multiple sequence datasets by an adaptation of partial treeing analysis. Bellerophon was specifically developed to detect 16S rRNA gene chimeras in PCR-clone libraries of environmental samples but can be applied to other nucleotide sequence alignments.
Abstract:
Conventionally, protein structure prediction via threading relies on some nonoptimal method to align a protein sequence to each member of a library of known structures. We show how a score function (force field) can be modified so as to allow the direct application of a dynamic programming algorithm to the problem. This involves an approximation whose damage can be minimized by an optimization process during score function parameter determination. The method is compared to sequence to structure alignments using a more conventional pair-wise score function and the frozen approximation. The new method produces results comparable to the frozen approximation, but is faster and has fewer adjustable parameters. It is also free of memory of the template's original amino acid sequence, and does not suffer from a problem of nonconvergence, which can be shown to occur with the frozen approximation. Alignments generated by the simplified score function can then be ranked using a second score function with the approximations removed. (C) 1999 John Wiley & Sons, Inc.
Abstract:
Objective: To describe and evaluate the performance of a new real-time seizure detection algorithm in the newborn infant. Methods: The algorithm includes parallel fragmentation of the EEG signal into waves; wave-feature extraction and averaging; and elementary, preliminary and final detection. The algorithm detects EEG waves with heightened regularity, using wave intervals, amplitudes and shapes. The performance of the algorithm was assessed using event-based and liberal and conservative time-based approaches and compared with the performance of Gotman's and Liu's algorithms. Results: The algorithm was assessed on multi-channel EEG records of 55 neonates, including 17 with seizures. The algorithm showed sensitivities ranging from 83% to 95% with positive predictive values (PPV) of 48-77%. There were 2.0 false positive detections per hour. In comparison, Gotman's algorithm (with a 30 s gap-closing procedure) displayed sensitivities of 45-88% and PPV of 29-56%, with 7.4 false positives per hour, and Liu's algorithm displayed sensitivities of 96-99% and PPV of 10-25%, with 15.7 false positives per hour. Conclusions: The wave-sequence analysis based algorithm displayed higher sensitivity, higher PPV and a substantially lower level of false positives than two previously published algorithms. Significance: The proposed algorithm provides a basis for major improvements in neonatal seizure detection and monitoring. Published by Elsevier Ireland Ltd. on behalf of International Federation of Clinical Neurophysiology.
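The performance figures above (sensitivity, PPV and false positives per hour) follow from simple counts of detections. A minimal sketch of those definitions is given below; the function name and the example counts are hypothetical and are not taken from the study, which also distinguishes event-based from time-based scoring.

```python
def detection_metrics(true_pos, false_neg, false_pos, record_hours):
    """Standard detection metrics from raw counts.

    sensitivity = TP / (TP + FN)
    PPV         = TP / (TP + FP)
    false positives per hour = FP / hours of recording
    """
    if true_pos + false_neg == 0 or true_pos + false_pos == 0 or record_hours <= 0:
        raise ValueError("counts and recording time must allow a defined ratio")
    return {
        "sensitivity": true_pos / (true_pos + false_neg),
        "ppv": true_pos / (true_pos + false_pos),
        "fp_per_hour": false_pos / record_hours,
    }


# Hypothetical counts, for illustration only.
example = detection_metrics(true_pos=40, false_neg=10, false_pos=20, record_hours=10.0)
```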