19 results for Sequence Analysis, Protein

in University of Queensland eSpace - Australia


Relevance:

100.00% 100.00%

Abstract:

Selecting a machine learning technique requires sensitivity to the requirements of the problem. In particular, the problem can be made more tractable by deliberately using algorithms that are biased toward solutions of the requisite kind. In this paper, we argue that recurrent neural networks have a natural bias toward a problem domain of which biological sequence analysis tasks are a subset. We use experiments with synthetic data to illustrate this bias. We then demonstrate that this bias can be exploited, using a data set of protein sequences containing several classes of subcellular localization targeting peptides. The results show that, compared with feed-forward networks, recurrent neural networks will generally perform better on sequence analysis tasks. Furthermore, as the patterns within the sequence become more ambiguous, the choice of specific recurrent architecture becomes more critical.
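
The structural difference between the two network families can be sketched in a few lines. This is only an illustration under stated assumptions, not the architecture used in the paper: the simple (Elman-style) recurrent layer, the one-hot encoding, the four hidden units and the fixed toy weights from `make_weights` are all hypothetical choices. The point is that a recurrent layer applies the same weights at every position and carries context forward, so it accepts sequences of any length, whereas a feed-forward network needs a fixed-size input window.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(residue):
    """Encode one amino acid as a 20-element indicator vector."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def make_weights(rows, cols, scale=0.1):
    """Deterministic toy weights (hypothetical; a real model would learn these)."""
    return [[scale * math.sin(1.0 + r * cols + c) for c in range(cols)]
            for r in range(rows)]


def recurrent_forward(sequence, w_in, w_rec):
    """Run a simple recurrent layer over a whole sequence.

    The same w_in and w_rec are reused at every position, and `hidden`
    carries information from earlier residues to later ones.
    """
    hidden = [0.0] * len(w_rec)
    for residue in sequence:
        x = one_hot(residue)
        hidden = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hi for w, hi in zip(w_rec[j], hidden))
            )
            for j in range(len(hidden))
        ]
    return hidden


HIDDEN_UNITS = 4  # arbitrary size chosen for the sketch
W_IN = make_weights(HIDDEN_UNITS, len(AMINO_ACIDS))
W_REC = make_weights(HIDDEN_UNITS, HIDDEN_UNITS)

# Sequences of different lengths map to a state of the same size.
short_state = recurrent_forward("MKT", W_IN, W_REC)
long_state = recurrent_forward("MKTLLVAAGRSS", W_IN, W_REC)
```

Both calls return a four-element state that a classifier could consume, regardless of input length.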

Relevance:

100.00% 100.00%

Abstract:

Aim: The aim of this study was to characterize the bacterial community adhering to the mucosa of the terminal ileum, and proximal and distal colon of the human digestive tract. Methods and Results: Pinch samples of the terminal ileum, proximal colon and distal colon were taken from a healthy 35-year-old subject and a 68-year-old subject with mild diverticulosis. The 16S rDNA genes were amplified using a low number of PCR cycles, cloned, and sequenced. In total, 361 sequences were obtained comprising 70 operational taxonomic units (OTU), with a calculated coverage of 82.6%. Twenty-three per cent of OTU were common to the terminal ileum, proximal colon and distal colon, but 14% of OTU were only found in the terminal ileum, and 43% were only associated with the proximal or distal colon. The most frequently represented clones were from the Clostridium group XIVa (24.7%), and the Bacteroidetes (Cytophaga-Flavobacteria-Bacteroides) cluster (27.7%). Conclusion: Comparison of 16S rDNA clone libraries of the hindgut across mammalian species confirms that the distribution of phylogenetic groups is similar irrespective of the host species. Lesser site-related differences within groups or clusters of organisms are probable. Significance and Impact: This study provides further evidence of the distribution of the bacteria on the mucosal surfaces of the human hindgut. Data contribute to the benchmarking of the microbial composition of the human digestive tract.

Relevance:

100.00% 100.00%

Abstract:

The nucleotide sequence for pituitary prolactin cDNA from the marsupial bandicoot (Isoodon macrourus) was determined by reverse transcription-polymerase chain reaction and 5'/3' rapid amplification of cDNA ends. The deduced amino acid sequence showed high sequence identity with brushtail possum prolactin (95%) and all of the expected structural features of a quadruped prolactin. A prolactin gene tree was constructed and rates of evolution calculated for bandicoot, possum, opossum and several mammalian and non-mammalian prolactins. Bootstrap analysis provided strong support for marsupials as a sister group with eutherian mammals and weak support for opossum and bandicoot as an independent grouping from the brushtail possum. The rates of molecular evolution for marsupial prolactins were comparable to the slow rate seen in the majority of quadruped prolactins that have been sequenced. (c) 2005 Elsevier Inc. All rights reserved.
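
A sequence identity figure such as the 95% quoted above is typically computed from a pairwise alignment. The sketch below shows one common convention, assumed here because the abstract does not say which one the authors used: identity is the share of matching residues among alignment columns in which neither sequence has a gap. The function name and the toy sequences are hypothetical and are not prolactin data.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over gap-free columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the columns where both sequences have a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Toy aligned fragments (hypothetical): 3 of 4 gap-free columns match.
identity = percent_identity("ACDE-", "ACDFK")
```

Other conventions divide by the full alignment length or by the shorter sequence length, so reported values are only comparable when the denominator is stated.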

Relevance:

100.00% 100.00%

Abstract:

Objective: To describe and evaluate the performance of a new real-time seizure detection algorithm for the newborn infant. Methods: The algorithm includes parallel fragmentation of the EEG signal into waves; wave-feature extraction and averaging; and elementary, preliminary and final detection. The algorithm detects EEG waves with heightened regularity, using wave intervals, amplitudes and shapes. Its performance was assessed using event-based and liberal and conservative time-based approaches and compared with the performance of Gotman's and Liu's algorithms. Results: The algorithm was assessed on multi-channel EEG records of 55 neonates, including 17 with seizures. The algorithm showed sensitivities ranging from 83 to 95% with positive predictive values (PPV) of 48-77%. There were 2.0 false positive detections per hour. In comparison, Gotman's algorithm (with a 30 s gap-closing procedure) displayed sensitivities of 45-88% and PPV of 29-56%, with 7.4 false positives per hour, and Liu's algorithm displayed sensitivities of 96-99% and PPV of 10-25%, with 15.7 false positives per hour. Conclusions: The wave-sequence analysis based algorithm displayed higher sensitivity, higher PPV and a substantially lower level of false positives than two previously published algorithms. Significance: The proposed algorithm provides a basis for major improvements in neonatal seizure detection and monitoring. Published by Elsevier Ireland Ltd. on behalf of International Federation of Clinical Neurophysiology.
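
The three figures of merit quoted in this abstract follow from simple event counts. The sketch below uses the standard definitions (sensitivity = TP / (TP + FN), PPV = TP / (TP + FP), false positives per hour = FP / recording hours). The function name and the counts are hypothetical and are not taken from the study.

```python
def detection_metrics(true_positives, false_negatives, false_positives,
                      record_hours):
    """Return (sensitivity, PPV, false positives per hour) from event counts."""
    sensitivity = true_positives / (true_positives + false_negatives)
    ppv = true_positives / (true_positives + false_positives)
    fp_per_hour = false_positives / record_hours
    return sensitivity, ppv, fp_per_hour


# Hypothetical counts: 90 detected seizures, 10 missed, 40 false alarms
# over 20 hours of EEG.
sens, ppv, fp_rate = detection_metrics(
    true_positives=90, false_negatives=10, false_positives=40,
    record_hours=20.0,
)
```

With these made-up counts the detector has a sensitivity of 0.90, a PPV of about 0.69 and 2.0 false positives per hour.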

Relevance:

100.00% 100.00%

Abstract:

In this study, we propose a novel method to predict the solvent accessible surface areas of transmembrane residues. For both transmembrane alpha-helix and beta-barrel residues, the correlation coefficients between the predicted and observed accessible surface areas are around 0.65. On the basis of predicted accessible surface areas, residues exposed to the lipid environment or buried inside a protein can be identified by using certain cutoff thresholds. We have extensively examined our approach based on different definitions of accessible surface areas and a variety of sets of control parameters. Given that experimentally determining the structures of membrane proteins is very difficult and membrane proteins are actually abundant in nature, our approach is useful for theoretically modeling membrane protein tertiary structures, particularly for modeling the assembly of transmembrane domains. This approach can be used to annotate the membrane proteins in proteomes to provide extra structural and functional information.
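
The cutoff step described in this abstract can be sketched as follows. This is a minimal illustration only: the function name, the relative accessibility values and the 0.25 cutoff are hypothetical, since the abstract does not give the thresholds or the accessible surface area definition the authors settled on.

```python
def classify_exposure(predicted_asa, cutoff):
    """Label each residue 'exposed' or 'buried' from its predicted ASA."""
    return ["exposed" if value >= cutoff else "buried"
            for value in predicted_asa]


# Hypothetical predicted relative accessible surface areas (0 to 1)
# for five transmembrane residues, with a hypothetical cutoff of 0.25.
predicted = [0.05, 0.40, 0.25, 0.10, 0.62]
labels = classify_exposure(predicted, cutoff=0.25)
```

Sweeping the cutoff over a range of values is one way to explore how sensitive the buried/exposed assignment is to the threshold.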

Relevance:

100.00% 100.00%

Abstract:

Classic Hodgkin's lymphoma (HL) tissue contains a small population of morphologically distinct malignant cells called Hodgkin and Reed-Sternberg (HRS) cells, associated with the development of HL. Using 3'-rapid amplification of cDNA ends (RACE) we identified an alternative mRNA for the DEC-205 multilectin receptor in the HRS cell line L428. Sequence analysis revealed that the mRNA encodes a fusion protein between DEC-205 and a novel C-type lectin DCL-1. Although the 7.5-kb DEC-205 and 4.2-kb DCL-1 mRNA were expressed independently in myeloid and B lymphoid cell lines, the DEC-205/DCL-1 fusion mRNA (9.5 kb) predominated in the HRS cell lines (L428, KM-H2, and HDLM-2). The DEC-205 and DCL-1 genes comprising 35 and 6 exons, respectively, are juxtaposed on chromosome band 2q24 and separated by only 5.4 kb. We determined the DCL-1 transcription initiation site within the intervening sequence by 5'-RACE, confirming that DCL-1 is an independent gene. Two DEC-205/DCL-1 fusion mRNA variants may result from cotranscription of DEC-205 and DCL-1, followed by splicing DEC-205 exon 35 or 34-35 along with DCL-1 exon 1. The resulting reading frames encode the DEC-205 ectodomain plus the DCL-1 ectodomain, the transmembrane, and the cytoplasmic domain. Using DCL-1 cytoplasmic domain-specific polyclonal and DEC-205 monoclonal antibodies for immunoprecipitation/Western blot analysis, we showed that the fusion mRNA is translated into a DEC-205/DCL-1 fusion protein, expressed in the HRS cell lines. These results imply an unusual transcriptional control mechanism in HRS cells, which cotranscribe an mRNA containing DEC-205 and DCL-1 prior to generating the intergenically spliced mRNA to produce a DEC-205/DCL-1 fusion protein.

Relevance:

100.00% 100.00%

Abstract:

The polypeptide backbones and side chains of proteins are constantly moving due to thermal motion and the kinetic energy of the atoms. The B-factors of protein crystal structures reflect the fluctuation of atoms about their average positions and provide important information about protein dynamics. Computational approaches to predict thermal motion are useful for analyzing the dynamic properties of proteins with unknown structures. In this article, we utilize a novel support vector regression (SVR) approach to predict the B-factor distribution (B-factor profile) of a protein from its sequence. We explore schemes for encoding sequences and various settings for the parameters used in SVR. Based on a large dataset of high-resolution proteins, our method predicts the B-factor distribution with a Pearson correlation coefficient (CC) of 0.53. In addition, our method predicts the B-factor profile with a CC of at least 0.56 for more than half of the proteins. Our method also performs well for classifying residues (rigid vs. flexible). For almost all predicted B-factor thresholds, prediction accuracies (percent of correctly predicted residues) are greater than 70%. These results exceed the best results of other sequence-based prediction methods. (C) 2005 Wiley-Liss, Inc.
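
The Pearson correlation coefficient (CC) used here to score predicted B-factor profiles is the covariance of the two profiles divided by the product of their standard deviations. The sketch below computes it for one protein; the function name and the toy profiles are hypothetical, and nothing in it reproduces the SVR model itself.

```python
import math


def pearson_cc(predicted, observed):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o)
              for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_p * var_o)


# Toy per-residue profiles (hypothetical values, arbitrary units).
predicted_b = [12.0, 15.0, 30.0, 22.0, 18.0]
observed_b = [10.0, 17.0, 28.0, 25.0, 14.0]
cc = pearson_cc(predicted_b, observed_b)
```

Because the coefficient is invariant to linear rescaling, it compares the shape of the predicted and observed profiles rather than their absolute B-factor values.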

Relevance:

100.00% 100.00%

Abstract:

Full-length genome sequences of five virulent and five avirulent strains of Newcastle disease virus (NDV) isolated between 1998 and 2002 in Victoria and New South Wales, Australia, were determined. Comparisons between these strains revealed that the coding sequences of the haemagglutinin-neuraminidase (HN), matrix (M) and phosphoprotein (P) genes appeared to be more variable than those of the fusion (F), nucleocapsid (N) and RNA-dependent RNA replicase (L) genes. Sequence analysis of a number of other isolates made during the recent virulent NDV outbreaks also identified a number of variants with altered F gene cleavage sites, which resulted in altered biological properties of those viruses. Quasispecies analysis of a number of field isolates indicated the presence of virulent virus in one particular isolate. Gene sequence analysis of the progenitor virus isolated in 1998 showed very little sequence variation when compared with that of a progenitor-like virus isolated in 2001, demonstrating that, in the field, viral genome sequence variation appears to be biologically restricted to that of a consensus sequence. (c) 2005 Elsevier B.V. All rights reserved.

Relevance:

100.00% 100.00%

Abstract:

A single-tube RT-PCR technique generated a 387 bp or 300 bp cDNA amplicon covering the F-0 cleavage site or the carboxyl (C)-terminus of the HN gene, respectively, of Newcastle disease virus (NDV) strain I-2. Sequence analysis was used to deduce the amino acid sequences of the cleavage site of F protein and the C-terminus of HN protein, which were then compared with sequences for other NDV strains. The cleavage site of NDV strain I-2 had a sequence motif of (112)RKQGRLIG(119), consistent with an avirulent phenotype. Nucleotide sequencing and deduction of amino acids at the C-terminus of HN revealed that strain I-2 had a 7-amino-acid extension (VEILKDGVREARSSR). This differs from the virulent viruses that caused outbreaks of Newcastle disease in Australia in the 1930s and 1990s, which have HN extensions of 0 and 9 amino acids, respectively. Amino acid sequence analyses of the F and HN genes of strain I-2 confirmed its avirulent nature and its Australian origin.
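
Reading a cleavage-site motif as virulent or avirulent can be sketched with a widely used molecular rule of thumb: virulent strains carry at least three basic residues (arginine or lysine) between positions 113 and 116 of the F protein together with phenylalanine at position 117. That criterion is an assumption here, since the abstract does not state which rule the authors applied, and the function name is hypothetical. The second test motif is one commonly reported for virulent strains and is included only for contrast.

```python
BASIC_RESIDUES = {"R", "K"}


def looks_virulent(motif_112_to_119):
    """Apply the rule of thumb to the 8 residues at F positions 112-119."""
    if len(motif_112_to_119) != 8:
        raise ValueError("expected the 8 residues spanning positions 112-119")
    residues_113_to_116 = motif_112_to_119[1:5]
    basic_count = sum(1 for aa in residues_113_to_116
                      if aa in BASIC_RESIDUES)
    residue_117 = motif_112_to_119[5]
    return basic_count >= 3 and residue_117 == "F"


# Motif reported in the abstract: two basic residues at 113-116, Leu at 117.
reported_motif_virulent = looks_virulent("RKQGRLIG")

# Illustrative virulent-type motif: three basic residues at 113-116, Phe at 117.
contrast_motif_virulent = looks_virulent("RRQKRFIG")
```

Under this rule the reported motif is classified as not virulent, which agrees with the avirulent phenotype described in the abstract.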

Relevance:

90.00% 90.00%

Abstract:

The intestinal spirochaete Brachyspira pilosicoli causes colitis in a wide variety of host species. Little is known about the structure or protein constituents of the B. pilosicoli outer membrane (OM). To identify surface-exposed proteins in this species, membrane vesicles were isolated from B. pilosicoli strain 95-1000 cells by osmotic lysis in dH2O followed by isopycnic centrifugation in sucrose density gradients. The membrane vesicles were separated into a high-density fraction (HDMV; ρ = 1.18 g cm⁻³) and a low-density fraction (LDMV; ρ = 1.12 g cm⁻³). Both fractions were free of flagella and soluble protein contamination. LDMV contained predominantly OM markers (lipo-oligosaccharide and a 29 kDa B. pilosicoli OM protein) and was used as a source of antigens to produce mAbs. Five B. pilosicoli-specific mAbs reacting with proteins with molecular masses of 23, 24, 35, 61 and 79 kDa were characterized. The 23 kDa protein was only partially soluble in Triton X-114, whereas the 24 and 35 kDa proteins were enriched in the detergent phase, implying that they were integral membrane proteins or lipoproteins. All three proteins were localized to the B. pilosicoli OM by immunogold labelling using specific mAbs. The gene encoding the abundant, surface-exposed 23 kDa protein was identified by screening a B. pilosicoli 95-1000 genome library with the mAb and was expressed in Escherichia coli. Sequence analysis showed that it encoded a unique lipoprotein, designated BmpC. Recombinant BmpC partitioned predominantly in the OM fraction of E. coli strain SOLR. The mAb to BmpC was used to screen a collection of 13 genetically heterogeneous strains of B. pilosicoli isolated from five different host species. Interestingly, only strain 95-1000 was reactive with the mAb, indicating that either the surface-exposed epitope on BmpC is variable between strains or that the protein is restricted in its distribution within B. pilosicoli.

Relevance:

90.00% 90.00%

Abstract:

Emiliania huxleyi (Lohm.) Hay and Mohler is a ubiquitous unicellular marine alga surrounded by an elaborate covering of calcite platelets called coccoliths. It is an important primary producer involved in oceanic biogeochemistry and climate regulation. Currently, E. huxleyi is separated into five morphotypes based on morphometric, physiological, biochemical, and immunological differences. However, a genetic marker has yet to be found to characterize these morphotypes. With the use of sequence analysis and denaturing gradient gel electrophoresis, we discovered a genetic marker that correlates significantly with the separation of the most widely recognized A and B morphotypes. Furthermore, we reveal that the A morphotype is composed of a number of distinct genotypes. This marker lies within the 3' untranslated region of a coccolith-associated protein mRNA, which is implicated in regulating coccolith calcification. Consequently, we tentatively termed this marker the coccolith morphology motif.