15 results for Sequence motif analysis

in University of Queensland eSpace - Australia


Relevance:

100.00%

Publisher:

Abstract:

The tropical abalone, Haliotis asinina, is an ideal species to investigate the molecular mechanisms that control development, growth, reproduction and shell formation in all cultured haliotids. Here we describe the analysis of 232 expressed sequence tags (EST) obtained from a developmental H. asinina cDNA library intended for future microarray studies. From this data set we identified 183 unique gene clusters. Of these, 90 clusters showed significant homology with sequences lodged in GenBank, ranging in function from general housekeeping to signal transduction, gene regulation and cell-cell communication. Seventy-one clusters possessed completely novel ORFs greater than 50 codons in length, highlighting the paucity of sequence data from molluscs and other lophotrochozoans. This study of developmental gene expression in H. asinina provides the foundation for further detailed analyses of abalone growth, development and reproduction.

Relevance:

90.00%

Publisher:

Abstract:

The planctomycetes are a phylum of bacteria that have a unique cell compartmentalisation and yeast-like budding cell division and peptidoglycan-less proteinaceous cell walls. We wished to further our understanding of these unique organisms at the molecular level by searching for conserved amino acid sequence motifs and domains in the proteins encoded by Rhodopirellula baltica. Using BLAST and single-linkage clustering, we have discovered several new protein domains and sequence motifs in this planctomycete. R. baltica has multiple members of the newly discovered GEFGR protein family and the ASPIC C-terminal domain family, whilst most other organisms for which whole genome sequence is available have no more than one. Many of the domains and motifs appear to be restricted to the planctomycetes. It is possible that these protein domains and motifs may have been lost or replaced in other phyla, or they may have undergone multiple duplication events in the planctomycete lineage. One of the novel motifs probably represents a novel N-terminal export signal peptide. With their unique cell biology, it may be that the planctomycete cell compartmentalisation plan in particular needs special membrane transport mechanisms. The discovery of these new domains and motifs, many of which are associated with secretion and cell-surface functions, will help to stimulate experimental work and thus enhance further understanding of this fascinating group of organisms. (C) 2004 Federation of European Microbiological Societies. Published by Elsevier B.V. All rights reserved.
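As a rough illustration of the single-linkage clustering step mentioned in this abstract, the sketch below groups proteins so that any two connected by a chain of pairwise hits end up in the same cluster. The protein identifiers and the hit list are hypothetical, and the abstract does not describe how BLAST hits were thresholded, so the input is simply assumed to be a list of pairs already judged significant.

```python
from collections import defaultdict


def single_linkage_clusters(items, hits):
    """Group items so that any two joined by a chain of hits share a cluster."""
    # Each item starts as its own cluster representative (union-find).
    parent = {item: item for item in items}

    def find(x):
        # Walk up to the representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # A single significant hit is enough to merge two clusters.
    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(list)
    for item in items:
        clusters[find(item)].append(item)
    # Largest clusters first, ties broken alphabetically.
    return sorted((sorted(m) for m in clusters.values()),
                  key=lambda c: (-len(c), c))


# Hypothetical identifiers and pairwise hits (not data from the study).
proteins = ["p1", "p2", "p3", "p4", "p5"]
significant_hits = [("p1", "p2"), ("p2", "p3"), ("p4", "p5")]
clusters = single_linkage_clusters(proteins, significant_hits)
print(clusters)
```

Note that p1 and p3 share a cluster even though they have no direct hit; this chaining behaviour is what distinguishes single linkage from stricter clustering criteria.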

Relevance:

90.00%

Publisher:

Abstract:

A single-tube RT-PCR technique generated a 387 bp or 300 bp cDNA amplicon covering the F0 cleavage site or the carboxyl (C)-terminus of the HN gene, respectively, of Newcastle disease virus (NDV) strain I-2. Sequence analysis was used to deduce the amino acid sequences of the cleavage site of F protein and the C-terminus of HN protein, which were then compared with sequences for other NDV strains. The cleavage site of NDV strain I-2 had a sequence motif of (112)RKQGRLIG(119), consistent with an avirulent phenotype. Nucleotide sequencing and deduction of amino acids at the C-terminus of HN revealed that strain I-2 had a 7-amino-acid extension (VEILKDGVREARSSR). This differs from the virulent viruses that caused outbreaks of Newcastle disease in Australia in the 1930s and 1990s, which have HN extensions of 0 and 9 amino acids, respectively. Amino acid sequence analyses of the F and HN genes of strain I-2 confirmed its avirulent nature and its Australian origin.

Relevance:

80.00%

Publisher:

Abstract:

In a first step toward understanding the molecular basis of pineapple fruit development, a sequencing project was initiated to survey a range of expressed sequences from green unripe and yellow ripe fruit tissue. A highly abundant metallothionein transcript was identified during library construction, and was estimated to account for up to 50% of all EST library clones. Library clones with metallothionein subtracted were sequenced, and 408 unripe green and 1140 ripe yellow edited EST clone sequences were retrieved. Clone redundancy was high, with the combined 1548 clone sequences clustering into just 634 contigs comprising 191 consensus sequences and 443 singletons. Half of the EST clone sequences clustered within 13.5% and 9.3% of contigs from green unripe and yellow ripe libraries, respectively, indicating that a small subset of genes dominates the majority of the transcriptome. Furthermore, sequence cluster analysis, northern analysis, and functional classification revealed major differences between genes expressed in the unripe green and ripe yellow fruit tissues. Abundant genes identified from the green fruit include a fruit bromelain and a bromelain inhibitor. Abundant genes identified in the yellow fruit library include a MADS box gene, and several genes normally associated with protein synthesis, including homologues of ribosomal L10 and the translation factors SUI1 and eIF5A. Both the green unripe and yellow ripe libraries contained high proportions of clones associated with oxidative stress responses and the detoxification of free radicals.

Relevance:

50.00%

Publisher:

Abstract:

Selection of machine learning techniques requires a certain sensitivity to the requirements of the problem. In particular, the problem can be made more tractable by deliberately using algorithms that are biased toward solutions of the requisite kind. In this paper, we argue that recurrent neural networks have a natural bias toward a problem domain of which biological sequence analysis tasks are a subset. We use experiments with synthetic data to illustrate this bias. We then demonstrate that this bias can be exploited using a data set of protein sequences containing several classes of subcellular localization targeting peptides. The results show that, compared with feed-forward networks, recurrent neural networks will generally perform better on sequence analysis tasks. Furthermore, as the patterns within the sequence become more ambiguous, the choice of specific recurrent architecture becomes more critical.

Relevance:

40.00%

Publisher:

Abstract:

Aim: The aim of this study was to characterize the bacterial community adhering to the mucosa of the terminal ileum, and proximal and distal colon of the human digestive tract. Methods and Results: Pinch samples of the terminal ileum, proximal and distal colon were taken from a healthy 35-year-old subject and a 68-year-old subject with mild diverticulosis. The 16S rDNA genes were amplified using a low number of PCR cycles, cloned, and sequenced. In total, 361 sequences were obtained comprising 70 operational taxonomic units (OTU), with a calculated coverage of 82.6%. Twenty-three per cent of OTU were common to the terminal ileum, proximal colon and distal colon, but 14% of OTU were found only in the terminal ileum, and 43% were associated only with the proximal or distal colon. The most frequently represented clones were from the Clostridium group XIVa (24.7%), and the Bacteroidetes (Cytophaga-Flavobacteria-Bacteroides) cluster (27.7%). Conclusion: Comparison of 16S rDNA clone libraries of the hindgut across mammalian species confirms that the distribution of phylogenetic groups is similar irrespective of the host species. Lesser site-related differences within groups or clusters of organisms are probable. Significance and Impact: This study provides further evidence of the distribution of the bacteria on the mucosal surfaces of the human hindgut. Data contribute to the benchmarking of the microbial composition of the human digestive tract.
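The abstract reports a calculated coverage for the clone library but does not say which estimator was used. One common choice for clone libraries is Good's coverage, C = 1 - n1/N, where n1 is the number of OTUs seen exactly once and N is the total number of sequences. The sketch below assumes that estimator and uses made-up OTU counts, not the study's data.

```python
def goods_coverage(otu_counts):
    """Good's coverage: 1 - (singleton OTUs / total sequences).

    otu_counts holds the number of sequences assigned to each OTU.
    Using this estimator here is an assumption; the abstract does not
    name the method behind its coverage figure.
    """
    total = sum(otu_counts)
    if total == 0:
        raise ValueError("no sequences to estimate coverage from")
    singletons = sum(1 for count in otu_counts if count == 1)
    return 1.0 - singletons / total


# Hypothetical library: four OTUs, ten sequences, two singletons.
example_counts = [5, 3, 1, 1]
coverage = goods_coverage(example_counts)
print(f"{coverage:.1%}")
```

A library in which most OTUs are singletons gives low coverage, which signals that further sequencing would probably still uncover new OTUs.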

Relevance:

40.00%

Publisher:

Abstract:

Chemokine (C-C motif) ligand-2 (CCL2) is a chemoattractant and activator of macrophages and is a key determinant of the macrophage infiltrate into tumours. We demonstrate here that CCL2 is expressed in normal human ovarian surface epithelium (HOSE) cells and is silenced in most ovarian cancer cell lines, and silenced or downregulated in the majority of primary ovarian adenocarcinomas. Analysis of the CCL2 locus at 17q11.2-q12 showed loss of heterozygosity (LOH) in 70% of primary tumours, and this was significantly more common in tumours of advanced stage or grade. However, we did not detect any mutations in the CCL2 coding sequence in 94 primary ovarian adenocarcinomas. These data support the hypothesis that CCL2 may play a role in the pathobiology of ovarian cancers, but additional studies will be required to evaluate this possibility.

Relevance:

40.00%

Publisher:

Abstract:

Background: Protein tertiary structure can be partly characterized via each amino acid's contact number measuring how residues are spatially arranged. The contact number of a residue in a folded protein is a measure of its exposure to the local environment, and is defined as the number of C-beta atoms in other residues within a sphere around the C-beta atom of the residue of interest. Contact number is partly conserved between protein folds and thus is useful for protein fold and structure prediction. In turn, each residue's contact number can be partially predicted from primary amino acid sequence, assisting tertiary fold analysis from sequence data. In this study, we provide a more accurate contact number prediction method from protein primary sequence. Results: We predict contact number from protein sequence using a novel support vector regression algorithm. Using protein local sequences with multiple sequence alignments (PSI-BLAST profiles), we demonstrate a correlation coefficient between predicted and observed contact numbers of 0.70, which outperforms previously achieved accuracies. Including additional information about sequence weight and amino acid composition further improves prediction accuracies significantly with the correlation coefficient reaching 0.73. If residues are classified as being either contacted or non-contacted, the prediction accuracies are all greater than 77%, regardless of the choice of classification thresholds. Conclusion: The successful application of support vector regression to the prediction of protein contact number reported here, together with previous applications of this approach to the prediction of protein accessible surface area and B-factor profile, suggests that a support vector regression approach may be very useful for determining the structure-function relation between primary sequence and higher order consecutive protein structural and functional properties.
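The contact number definition given in this abstract can be sketched directly from C-beta coordinates. The example below only computes observed contact numbers; it does not attempt the support vector regression prediction. The sphere radius is not stated in the abstract, so the `radius` parameter and the toy coordinates are assumptions for illustration.

```python
import math


def contact_numbers(cb_coords, radius):
    """Count, for each residue, the other residues whose C-beta atom
    lies within `radius` of its own C-beta atom.

    cb_coords is a list of (x, y, z) tuples, one per residue.
    The radius is a free parameter here; the abstract does not give one.
    """
    counts = []
    for i, centre in enumerate(cb_coords):
        neighbours = 0
        for j, other in enumerate(cb_coords):
            # Skip the residue itself; count every other residue in range.
            if i != j and math.dist(centre, other) <= radius:
                neighbours += 1
        counts.append(neighbours)
    return counts


# Toy coordinates (hypothetical, in arbitrary length units).
toy_coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (6.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
toy_contacts = contact_numbers(toy_coords, radius=5.0)
print(toy_contacts)
```

This brute-force version is quadratic in the number of residues, which is fine for single proteins; a spatial index would be the usual choice for large structures.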

Relevance:

40.00%

Publisher:

Abstract:

The nucleotide sequence for pituitary prolactin cDNA from the marsupial bandicoot (Isoodon macrourus) was determined by reverse transcription-polymerase chain reaction and 5'/3' rapid amplification of cDNA ends. The deduced amino acid sequence showed high sequence identity with brushtail possum prolactin (95%) and all of the expected structural features of a quadruped prolactin. A prolactin gene tree was constructed and rates of evolution calculated for bandicoot, possum, opossum and several mammalian and non-mammalian prolactins. Bootstrap analysis provided strong support for marsupials as a sister group with eutherian mammals and weak support for opossum and bandicoot as an independent grouping from the brushtail possum. The rates of molecular evolution for marsupial prolactins were comparable to the slow rate seen in the majority of quadruped prolactins that have been sequenced. (c) 2005 Elsevier Inc. All rights reserved.

Relevance:

40.00%

Publisher:

Abstract:

We completed the genome sequence of Lettuce necrotic yellows virus (LNYV) by determining the nucleotide sequences of the 4a (putative phosphoprotein), 4b, M (matrix protein), G (glycoprotein) and L (polymerase) genes. The genome consists of 12,807 nucleotides and encodes six genes in the order 3' leader-N-4a(P)-4b-M-G-L-5' trailer. Sequences were derived from clones of a cDNA library from LNYV genomic RNA and from fragments amplified using reverse transcription-polymerase chain reaction. The 4a protein has a low isoelectric point characteristic of rhabdovirus phosphoproteins. The 4b protein has significant sequence similarities with the movement proteins of capillo- and trichoviruses and may be involved in cell-to-cell movement. The putative G protein sequence contains a predicted 25-amino-acid signal peptide and endopeptidase cleavage site, three predicted glycosylation sites and a putative transmembrane domain. The deduced L protein sequence shows similarities with the L proteins of other plant rhabdoviruses and contains polymerase module motifs characteristic of RNA-dependent RNA polymerases of negative-strand RNA viruses. Phylogenetic analysis of this motif among rhabdoviruses placed LNYV in a group with other sequenced cytorhabdoviruses, most closely related to Strawberry crinkle virus. (c) 2005 Elsevier B.V. All rights reserved.

Relevance:

40.00%

Publisher:

Abstract:

Objective: The description and evaluation of the performance of a new real-time seizure detection algorithm in the newborn infant. Methods: The algorithm includes parallel fragmentation of the EEG signal into waves; wave-feature extraction and averaging; and elementary, preliminary and final detection. The algorithm detects EEG waves with heightened regularity, using wave intervals, amplitudes and shapes. The performance of the algorithm was assessed using event-based and both liberal and conservative time-based approaches, and compared with the performance of Gotman's and Liu's algorithms. Results: The algorithm was assessed on multi-channel EEG records of 55 neonates including 17 with seizures. The algorithm showed sensitivities ranging from 83% to 95% with positive predictive values (PPV) of 48-77%. There were 2.0 false positive detections per hour. In comparison, Gotman's algorithm (with a 30 s gap-closing procedure) displayed sensitivities of 45-88% and PPV of 29-56%, with 7.4 false positives per hour, and Liu's algorithm displayed sensitivities of 96-99% and PPV of 10-25%, with 15.7 false positives per hour. Conclusions: The wave-sequence analysis based algorithm displayed higher sensitivity, higher PPV and a substantially lower level of false positives than two previously published algorithms. Significance: The proposed algorithm provides a basis for major improvements in neonatal seizure detection and monitoring. Published by Elsevier Ireland Ltd. on behalf of International Federation of Clinical Neurophysiology.
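The three performance figures quoted in this abstract (sensitivity, PPV and false positives per hour) follow from simple counts of true positives, false negatives and false positives. The sketch below shows the standard definitions; the counts and recording duration are hypothetical and are not taken from the study, whose event-based and time-based scoring rules are not detailed in the abstract.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual seizures that were detected."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos, false_pos):
    """Fraction of detections that were actual seizures."""
    return true_pos / (true_pos + false_pos)


def false_positives_per_hour(false_pos, recording_hours):
    """False detection rate normalised by recording length."""
    return false_pos / recording_hours


# Hypothetical counts for one set of recordings (not the study's data).
tp, fn, fp, hours = 45, 5, 15, 10.0
print(f"sensitivity = {sensitivity(tp, fn):.0%}")
print(f"PPV = {positive_predictive_value(tp, fp):.0%}")
print(f"false positives/hour = {false_positives_per_hour(fp, hours):.1f}")
```

Reporting all three together matters because a detector can reach very high sensitivity simply by flagging more events, at the cost of a lower PPV and a higher false positive rate.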