23 results for sequence based alignments
in Aston University Research Archive
Abstract:
Based on Bayesian Networks, methods were created that address protein sequence-based bacterial subcellular location prediction. Distinct predictive algorithms for the eight bacterial subcellular locations were created. Several variant methods were explored. These variations included differences in the number of residues considered within the query sequence - which ranged from the N-terminal 10 residues to the whole sequence - and residue representation - which took the form of amino acid composition, percentage amino acid composition, or normalised amino acid composition. The accuracies of the best performing networks were then compared to PSORTB. All individual location methods outperform PSORTB except for the Gram+ cytoplasmic protein predictor, for which accuracies were essentially equal, and for outer membrane protein prediction, where PSORTB outperforms the binary predictor. The method described here is an important new approach to method development for subcellular location prediction. It is also a new, potentially valuable tool for candidate subunit vaccine selection.
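The residue representations mentioned above can be illustrated with a minimal Python sketch. It shows amino acid counts and percentage composition over either the N-terminal residues or the whole sequence. The function names and the `n_terminal` parameter are hypothetical, and the "normalised" variant is left out because the abstract does not define it.

```python
from collections import Counter

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(sequence, n_terminal=None):
    """Count each standard amino acid in the first n_terminal residues
    (or in the whole sequence when n_terminal is None)."""
    window = sequence if n_terminal is None else sequence[:n_terminal]
    counts = Counter(window.upper())
    return {aa: counts.get(aa, 0) for aa in AMINO_ACIDS}

def percentage_composition(sequence, n_terminal=None):
    """Express the counts as percentages of the standard residues counted."""
    counts = composition(sequence, n_terminal)
    total = sum(counts.values())
    if total == 0:
        return {aa: 0.0 for aa in AMINO_ACIDS}
    return {aa: 100.0 * c / total for aa, c in counts.items()}
```

For example, `composition("MKKLLA", n_terminal=4)` counts only the residues `MKKL`, which corresponds to restricting the predictor's input to the N-terminal part of the query.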
Abstract:
As torrents of new data now emerge from microbial genomics, bioinformatic prediction of immunogenic epitopes remains challenging but vital. In silico methods often produce paradoxically inconsistent results: good prediction rates on certain test sets but not others. The inherent complexity of immune presentation and recognition processes complicates epitope prediction. Two encouraging developments – data driven artificial intelligence sequence-based methods for epitope prediction and molecular modeling methods based on three-dimensional protein structures – offer hope for the future.
Abstract:
Switched mode power supplies (SMPSs) are essential components in many applications, and electromagnetic interference is an important consideration in SMPS design. Spread spectrum based PWM strategies have been used in SMPS designs to reduce the switching harmonics. This paper proposes a novel method to integrate a communication function into a spread spectrum based PWM strategy without extra hardware costs. Direct sequence spread spectrum (DSSS) and phase shift keying (PSK) data modulation are applied to the PWM of the SMPS, so that it has reduced switching harmonics and the input and output power line voltage ripples carry data. A data demodulation algorithm has been developed for receivers, and the code division multiple access (CDMA) concept is employed as the communication method for a system with multiple SMPSs. The proposed method has been implemented in both Buck and Boost converters. The experimental results validated the proposed DSSS based PWM strategy for both harmonic reduction and communication.
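The DSSS, PSK and CDMA ideas named above can be sketched generically in Python. This is a baseband illustration only, not the paper's PWM implementation or its demodulation algorithm: binary PSK symbols are spread by a chip code and recovered by correlation. The function names and the use of short orthogonal (Walsh-style) codes are assumptions made for the example.

```python
def spread(bits, code):
    """Map each bit to a +1/-1 symbol (binary PSK) and multiply it by the chip code."""
    chips = []
    for bit in bits:
        symbol = 1 if bit else -1
        chips.extend(symbol * c for c in code)
    return chips

def despread(chips, code):
    """Correlate each code-length block with the code and decide the bit by sign."""
    n = len(code)
    bits = []
    for i in range(0, len(chips) - n + 1, n):
        corr = sum(x * c for x, c in zip(chips[i:i + n], code))
        bits.append(1 if corr > 0 else 0)
    return bits

# Two orthogonal codes (assumed for illustration): their dot product is zero,
# so two transmitters can share the same line, as in CDMA.
CODE_A = [1, -1, 1, -1]
CODE_B = [1, 1, -1, -1]

def superpose(signal_a, signal_b):
    """Add two chip streams sample by sample, as they would add on a shared line."""
    return [a + b for a, b in zip(signal_a, signal_b)]
```

With `line = superpose(spread([1, 0, 1], CODE_A), spread([0, 0, 1], CODE_B))`, calling `despread(line, CODE_A)` and `despread(line, CODE_B)` recovers each transmitter's bits, because the correlation with the other code cancels.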
Abstract:
The evidence that cochlear implant listeners routinely experience stream segregation is limited and equivocal. Streaming in these listeners was explored using tone sequences matched to the center frequencies of the implant’s 22 electrodes. Experiment 1 measured temporal discrimination for short (ABA triplet) and longer (12 AB cycles) sequences (tone/silence durations = 60/40 ms). Tone A stimulated electrode 11; tone B stimulated one of 14 electrodes. On each trial, one sequence remained isochronous, and tone B was delayed in the other; listeners had to identify the anisochronous interval. The delay was introduced in the second half of the longer sequences. Prior build-up of streaming should cause thresholds to rise more steeply with increasing electrode separation, but no interaction with sequence length was found. Experiment 2 required listeners to identify which of two target sequences was present when interleaved with distractors (tone/silence durations = 120/80 ms). Accuracy was high for isolated targets, but most listeners performed near chance when loudness-matched distractors were added, even when remote from the target. Only a substantial reduction in distractor level improved performance, and this effect did not interact with target-distractor separation. These results indicate that implantees often do not achieve stream segregation, even in relatively unchallenging tasks.
Abstract:
Using an optical biosensor based on a dual-peak long-period fiber grating, we have demonstrated the detection of interactions between biomolecules in real time. Silanization of the grating surface was successfully realized for the covalent immobilization of probe DNA, which was subsequently hybridized with the complementary target DNA sequence. It is interesting to note that the DNA biosensor was reusable after the hybridized target DNA had been stripped from the grating surface, demonstrating that it can be used multiple times.
Abstract:
Affinity purification of plasmid DNA is an attractive option for the biomanufacture of therapeutic plasmids, which are strictly controlled for levels of host protein, DNA, RNA, and endotoxin. Plasmid vectors are considered to be a safer alternative to viruses for gene therapy, but milligram quantities of DNA are required per dose. Previous affinity approaches have involved triplex DNA formation and a sequence-specific zinc finger protein. We present a more generically applicable protein-based approach, which exploits the lac operator, present in a wide diversity of plasmids, as a target sequence. We used a GFP/His-tagged LacI protein, which is precomplexed with the plasmid, and the resulting complex was immobilized on a solid support (TALON resin). Ensuing elution gives plasmid DNA in good yield (>80% based on recovered starting material, 35-50% overall process), free from detectable RNA and protein and with minimal genomic DNA contamination. Such an affinity-based process should enhance plasmid purity and ultimately, after appropriate development, may simplify the biomanufacturing process of therapeutic plasmids.
Abstract:
The genome of Salmonella enterica serovar Enteritidis was shown to possess three IS3-like insertion elements, designated IS1230A, B and C; each was cloned and its deoxynucleotide sequence determined. Mutations in elements IS1230A and B resulted in frameshifts in the open reading frames, rendering the encoded putative transposase inactive. IS1230C was truncated at nucleotide 774 relative to IS1230B and therefore did not possess the 3' terminal inverted repeat. The three IS1230 derivatives were closely related to each other based on nucleotide sequence similarity. IS1230A was located adjacent to the sef operon encoding SEF14 fimbriae located at minute 97 of the genome of S. Enteritidis. IS1230B was located adjacent to the umuDC operon at minute 42.5 on the genome, itself located near one terminus of an 815-kb genome inversion of S. Enteritidis relative to S. Typhimurium. IS1230C was located next to attB, the bacteriophage P22 attachment site, and proB, encoding gamma-glutamyl phosphate reductase. A truncated 3' remnant of IS1230, designated IS1230T, was identified in a clinical isolate of S. Typhimurium DT193 strain 2391. This element was located next to attB, adjacent to which were bacteriophage P22-like sequences. Southern hybridisation of total genomic DNA from eighteen phage types of S. Enteritidis and eighteen definitive types of S. Typhimurium showed similar, if not identical, restriction fragment profiles in the respective serovars when probed with IS1230A.
Abstract:
Although techniques such as biopanning rely heavily upon the screening of randomized gene libraries, there is surprisingly little information available on the construction of those libraries. In general, it is based on the cloning of 'randomized' synthetic oligonucleotides, in which given position(s) contain an equal mixture of all four bases. Yet, many supposedly 'randomized' libraries contain significant elements of bias and/or omission. Here, we report the development and validation of a new, PCR-based assay that enables rapid examination of library composition both prior to and after cloning. By using our assay to analyse model libraries, we demonstrate that the cloning of a given distribution of sequences does not necessarily result in a similarly composed library of clones. Thus, while bias in randomized synthetic oligonucleotide mixtures can be virtually eliminated by using unequal ratios of the four phosphoramidites, the use of such mixtures does not ensure retrieval of a truly randomized library. We propose that in the absence of a technique to control cloning frequencies, the ability to analyse the composition of libraries after cloning will enhance significantly the quality of information derived from those libraries. (C) 2000 Published by Elsevier Science B.V. All rights reserved.
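The notion of bias at a randomized position can be illustrated in silico. The Python sketch below is not the PCR-based assay described above; it assumes, hypothetically, that a set of cloned sequences has already been read, and it measures how far each randomized position departs from an equal mixture of the four bases. All names are illustrative.

```python
from collections import Counter

BASES = "ACGT"

def position_frequencies(sequences, positions):
    """Return, for each listed position, the observed frequency of each base."""
    result = {}
    for pos in positions:
        counts = Counter(seq[pos] for seq in sequences)
        total = sum(counts[b] for b in BASES)
        result[pos] = {b: (counts[b] / total if total else 0.0) for b in BASES}
    return result

def max_bias(freqs):
    """Largest absolute departure from the ideal 25% share at one position."""
    return max(abs(f - 0.25) for f in freqs.values())
```

Comparing `max_bias` for the oligonucleotide pool with the value for the cloned library would show whether cloning changed the composition, which is the kind of discrepancy the abstract reports.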
Abstract:
The G-protein coupled receptors, or GPCRs, comprise simultaneously one of the largest and one of the most multi-functional protein families known to modern-day molecular bioscience. From a drug discovery and pharmaceutical industry perspective, the GPCRs constitute one of the most commercially and economically important groups of proteins known. The GPCRs undertake numerous vital metabolic functions and interact with a hugely diverse range of small and large ligands. Many different methodologies have been developed to efficiently and accurately classify the GPCRs. These range from motif-based techniques to machine learning as well as a variety of alignment-free techniques based on the physicochemical properties of sequences. We review here the available methodologies for the classification of GPCRs. Part of this work focuses on how we have tried to build the intrinsically hierarchical nature of sequence relations, implicit within the family, into an adaptive approach to classification. Importantly, we also allude to some of the key innate problems in developing an effective approach to classifying the GPCRs: the lack of sequence similarity between the six classes that comprise the GPCR family and the low sequence similarity to other family members evinced by many newly revealed members of the family.
Abstract:
In this work we propose the hypothesis that replacing the current system of representing the chemical entities known as amino acids using Latin letters with one of several possible alternative symbolic representations will bring significant benefits to the human construction, modification, and analysis of multiple protein sequence alignments. We propose ways in which this might be done without prescribing the choice of actual scripts used. Specifically we propose and explore three ways to encode amino acid texts using novel symbolic alphabets free from precedents. Primary orthographic encoding is the direct substitution of a new alphabet for the standard, Latin-based amino acid code. Secondary encoding imposes static residue groupings onto the orthography of the alphabet by manipulating the shape and/or orientation of amino acid symbols. Tertiary encoding renders each residue as a composite symbol; each such symbol thus representing several alternative amino acid groupings simultaneously. We also propose that the use of a new group-focussed alphabet will free the colouring of amino acid residues often used as a tool to facilitate the representation or construction of multiple alignments for other purposes, possibly to indicate dynamic properties of an alignment such as position-wise residue conservation.
Abstract:
We have developed a novel multilocus sequence typing (MLST) scheme and database (http://pubmlst.org/pacnes/) for Propionibacterium acnes based on the analysis of seven core housekeeping genes. The scheme, which was validated against previously described antibody, single locus and random amplification of polymorphic DNA typing methods, displayed excellent resolution and differentiated 123 isolates into 37 sequence types (STs). An overall clonal population structure was detected with six eBURST groups representing the major clades I, II and III, along with two singletons. Two highly successful and global clonal lineages, ST6 (type IA) and ST10 (type IB1), representing 64% of this current MLST isolate collection were identified. The ST6 clone and closely related single locus variants, which comprise a large clonal complex CC6, dominated isolates from patients with acne, and were also significantly associated with ophthalmic infections. Our data therefore support an association between acne and P. acnes strains from the type IA cluster and highlight the role of a widely disseminated clonal genotype in this condition. Characterization of type I cell surface-associated antigens that are not detected in ST10 or strains of type II and III identified two dermatan-sulphate-binding proteins with putative phase/antigenic variation signatures. We propose that the expression of these proteins by type IA organisms contributes to their role in the pathophysiology of acne and helps explain the recurrent nature of the disease. The MLST scheme and database described in this study should provide a valuable platform for future epidemiological and evolutionary studies of P. acnes.
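The general MLST idea, in which each isolate is described by its allele numbers at seven loci and identical profiles share a sequence type, can be sketched in Python as follows. The numbering here is assigned in order of first appearance purely for illustration; in a real scheme the ST numbers come from the curated database, and all function names and data are hypothetical.

```python
def assign_sequence_types(profiles):
    """Give each distinct allelic profile an illustrative ST number.

    profiles maps an isolate name to its allele numbers at the typed loci.
    """
    st_of_profile = {}
    assignments = {}
    for isolate, profile in profiles.items():
        key = tuple(profile)
        if key not in st_of_profile:
            st_of_profile[key] = len(st_of_profile) + 1
        assignments[isolate] = st_of_profile[key]
    return assignments

def is_single_locus_variant(profile_a, profile_b):
    """True when two profiles of equal length differ at exactly one locus."""
    if len(profile_a) != len(profile_b):
        return False
    return sum(a != b for a, b in zip(profile_a, profile_b)) == 1
```

Single locus variants of a founder profile are what eBURST-style analyses group into a clonal complex, such as the CC6 complex mentioned above.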
Abstract:
We investigate full-field detection-based maximum-likelihood sequence estimation (MLSE) for chromatic dispersion compensation in 10 Gbit/s OOK optical communication systems. Important design criteria are identified to optimize the system performance. It is confirmed that approximately 50% improvement in transmission reach can be achieved compared to conventional direct-detection MLSE at both 4 and 16 states. It is also shown that full-field MLSE is more robust to the noise and the associated noise amplifications in full-field reconstruction, and consequently exhibits better tolerance to nonoptimized system parameters than a full-field feedforward equalizer. Experiments over 124 km spans of field-installed single-mode fiber without optical dispersion compensation using full-field MLSE verify the theoretically predicted performance benefits.
Abstract:
We describe the linear and nonlinear transfer characteristics of a multi-resonance optical device consisting of two ring resonators coupled to one another and to a waveguide. The propagation effects displayed by the device are compared with those of a sequence of waveguide-coupled fundamental ring resonators.
Abstract:
Background - Modelling the interaction between potentially antigenic peptides and Major Histocompatibility Complex (MHC) molecules is a key step in identifying potential T-cell epitopes. For Class II MHC alleles, the binding groove is open at both ends, causing ambiguity in the positional alignment between the groove and peptide, as well as creating uncertainty as to what parts of the peptide interact with the MHC. Moreover, the antigenic peptides have variable lengths, making naive modelling methods difficult to apply. This paper introduces a kernel method that can handle variable length peptides effectively by quantifying similarities between peptide sequences and integrating these into the kernel. Results - The kernel approach presented here shows increased prediction accuracy with a significantly higher number of true positives and negatives on multiple MHC class II alleles, when testing data sets from MHCPEP [1], MHCBN [2], and MHCBench [3]. Evaluation by cross validation, when segregating binders and non-binders, produced an average of 0.824 AROC for the MHCBench data sets (up from 0.756), and an average of 0.96 AROC for multiple alleles of the MHCPEP database. Conclusion - The method improves performance over existing state-of-the-art methods of MHC class II peptide binding predictions by using a custom, knowledge-based representation of peptides. Similarity scores, in contrast to a fixed-length, pocket-specific representation of amino acids, provide a flexible and powerful way of modelling MHC binding, and can easily be applied to other dynamic sequence problems.
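To show how a kernel can compare peptides of different lengths without a fixed positional alignment, the sketch below uses a standard k-mer spectrum kernel in Python. This is a swapped-in, generic technique chosen for illustration, not the knowledge-based kernel of the paper; the function names and the default `k=3` are assumptions.

```python
from collections import Counter
from math import sqrt

def kmer_counts(peptide, k=3):
    """Count every overlapping k-mer in the peptide (empty if shorter than k)."""
    return Counter(peptide[i:i + k] for i in range(len(peptide) - k + 1))

def spectrum_kernel(p, q, k=3):
    """Inner product of the two k-mer count vectors."""
    a, b = kmer_counts(p, k), kmer_counts(q, k)
    return sum(a[m] * b[m] for m in a if m in b)

def normalized_kernel(p, q, k=3):
    """Scale the kernel to [0, 1] so that peptide length alone does not dominate."""
    denom = sqrt(spectrum_kernel(p, p, k) * spectrum_kernel(q, q, k))
    return spectrum_kernel(p, q, k) / denom if denom else 0.0
```

Because the score depends only on shared subsequences, a short peptide and a longer one containing the same core still score as similar, which is the property that makes similarity-based kernels attractive for open-ended class II binding grooves.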
Abstract:
A statistics-based method using genetic algorithms for predicting discrete sequences is presented. The prediction of the next value is based upon a fixed number of previous values and the statistics offered by the training data. According to the statistics, in similar past cases different values occurred next. If these values are considered with the appropriate weights, the forecast is successful. Weights are generated by genetic algorithms.
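One possible reading of this scheme is sketched below in Python. The statistics are tables recording which values followed each recent context in the training data, and the forecast is a weighted vote over those tables. The choice of one weight per context length is an assumption, the weights are simply passed in (a genetic algorithm could search for them, which is not shown), and all names are hypothetical.

```python
from collections import Counter, defaultdict

def build_statistics(training, max_order):
    """For every context of length 1..max_order, count which value came next."""
    stats = defaultdict(Counter)
    for order in range(1, max_order + 1):
        for i in range(order, len(training)):
            context = tuple(training[i - order:i])
            stats[context][training[i]] += 1
    return stats

def predict_next(history, stats, weights):
    """Weighted vote: weights[k-1] scales the evidence from the length-k context."""
    scores = Counter()
    for order, weight in enumerate(weights, start=1):
        if order > len(history):
            break
        followers = stats.get(tuple(history[-order:]))
        if not followers:
            continue
        total = sum(followers.values())
        for value, count in followers.items():
            scores[value] += weight * count / total
    return max(scores, key=scores.get) if scores else None
```

For the training sequence `[1, 2, 3, 1, 2, 3, 1, 2, 4]`, the context `(1, 2)` was followed by 3 twice and by 4 once, so `predict_next([1, 2], stats, [0.5, 0.5])` favours 3.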