969 results for Sequence analysis
Abstract:
The gene encoding 2-methyl-3-hydroxypyridine-5-carboxylic acid oxygenase (MHPCO; EC 1.14.12.4) was cloned by using an oligonucleotide probe corresponding to the N terminus of the enzyme to screen a DNA library of Pseudomonas sp. MA-1. The gene encodes a protein of 379 amino acid residues corresponding to a molecular mass of 41.7 kDa, the same as that previously estimated for MHPCO. MHPCO was expressed in Escherichia coli and found to have the same properties as the native enzyme from Pseudomonas sp. MA-1. This study shows that MHPCO is a homotetrameric protein with one flavin adenine dinucleotide bound per subunit. Sequence comparison of the enzyme with other hydroxylases reveals regions that are conserved among aromatic flavoprotein hydroxylases.
Abstract:
The intensely studied MHC has become the paradigm for understanding the architectural evolution of vertebrate multigene families. The 4-Mb human MHC (also known as the HLA complex) encodes genes critically involved in the immune response, graft rejection, and disease susceptibility. Here we report the continuous 1,796,938-bp genomic sequence of the HLA class I region, linking genes between MICB and HLA-F. A total of 127 genes or potentially coding sequences were recognized within the analyzed sequence, establishing a high gene density of one gene per 14.1 kb. The identification of 758 microsatellites provides tools for high-resolution mapping of HLA class I-associated disease genes. Most importantly, we establish that the repeated duplication and subsequent diversification of a minimal building block, MIC-HCGIX-3.8–1-P5-HCGIV-HLA class I-HCGII, engendered the present-day MHC. That the currently nonessential HLA-F and MICE genes have acted as progenitors to today’s immune-competent HLA-ABC and MICA/B genes provides experimental evidence for evolution by “birth and death,” which has general relevance to our understanding of the evolutionary forces driving vertebrate multigene families.
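The gene density quoted in this abstract follows directly from the two figures it reports. A minimal sketch of the arithmetic (the function name `gene_density_kb` is illustrative and does not come from the paper):

```python
def gene_density_kb(region_bp: int, n_genes: int) -> float:
    """Average number of kilobases per gene in a sequenced region."""
    if n_genes <= 0:
        raise ValueError("n_genes must be positive")
    return region_bp / n_genes / 1000


# Figures quoted in the abstract: a 1,796,938-bp region containing 127 genes
# or potentially coding sequences.
density = gene_density_kb(1_796_938, 127)
print(f"{density:.1f} kb per gene")
```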
Abstract:
The cell matrix adhesion regulator (CMAR) gene has been suggested to be a signal transduction molecule influencing cell adhesion to collagen and, through this, possibly involved in tumor suppression. The originally reported CMAR cDNA was 464 bp long with a tyrosine phosphorylation site at the extreme 3′ end, which mutagenesis studies had shown to be central to the function of this gene. Since the discovery of a 4-bp insertion polymorphism within the originally reported coding region, further sequence information has been obtained. The cDNA has been extended 5′ by ≈2 kb revealing a 559-bp region showing strong homology to the proposed 5′ untranslated sequence of a murine protein kinase receptor family member, variant in kinase (vik). CMAR genomic sequencing has shown the presence of an intron, the intron/exon boundary lying within this region of homology. An RNA transcript for CMAR of ≈2.5 kb has also been identified. The data suggest complex mechanisms for control of expression of two closely associated genes, CMAR and the vik-associated sequence.
Abstract:
The HIV Reverse Transcriptase and Protease Sequence Database is an on-line relational database that catalogs evolutionary and drug-related sequence variation in the human immunodeficiency virus (HIV) reverse transcriptase (RT) and protease enzymes, the molecular targets of anti-HIV therapy (http://hivdb.stanford.edu). The database contains a compilation of nearly all published HIV RT and protease sequences, including submissions from International Collaboration databases and sequences published in journal articles. Sequences are linked to data about the source of the sequence sample and the antiretroviral drug treatment history of the individual from whom the isolate was obtained. During the past year 3500 sequences have been added and the data model has been expanded to include drug susceptibility data on sequenced isolates. Database content has also been integrated with didactic text and the output of two sequence analysis programs.
Abstract:
The human prion gene contains five copies of a 24 nt repeat that is highly conserved among species. An analysis of folding free energies of the human prion mRNA, in particular in the repeat region, suggested biased codon selection and the presence of RNA patterns. In particular, pseudoknots, similar to the one predicted by Wills in the human prion mRNA, were identified in the repeat region of all prion mRNAs available in GenBank, except those of birds and the red slider turtle. An alignment of these mRNAs, which share low sequence homology, shows several co-variations that maintain the pseudoknot pattern. The presence of pseudoknots in yeast Sup35p and Rnq1 suggests acquisition in the prokaryotic era. Computer generated three-dimensional structures of the human prion pseudoknot highlight protein and RNA interaction domains, which suggest a possible effect in prion protein translation. The role of pseudoknots in prion diseases is discussed as individuals with extra copies of the 24 nt repeat develop the familial form of Creutzfeldt–Jakob disease.
Abstract:
Streptomyces lavendulae produces complestatin, a cyclic peptide natural product that antagonizes pharmacologically relevant protein–protein interactions including formation of the C4b,2b complex in the complement cascade and gp120-CD4 binding in the HIV life cycle. Complestatin, a member of the vancomycin group of natural products, consists of an α-ketoacyl hexapeptide backbone modified by oxidative phenolic couplings and halogenations. The entire complestatin biosynthetic and regulatory gene cluster spanning ca. 50 kb was cloned and sequenced. It consisted of 16 ORFs, encoding proteins homologous to nonribosomal peptide synthetases, cytochrome P450-related oxidases, ferredoxins, nonheme halogenases, four enzymes involved in 4-hydroxyphenylglycine (Hpg) biosynthesis, transcriptional regulators, and ABC transporters. The nonribosomal peptide synthetase consisted of a priming module, six extending modules, and a terminal thioesterase; their arrangement and domain content were entirely consistent with functions required for the biosynthesis of a heptapeptide or α-ketoacyl hexapeptide backbone. Two oxidase genes were proposed to be responsible for the construction of the unique aryl-ether-aryl-aryl linkage on the linear heptapeptide intermediate. Hpg, 3,5-dichloro-Hpg, and 3,5-dichloro-hydroxybenzoylformate are unusual building blocks that represent five of the seven requisite monomers in the complestatin peptide. Heterologous expression and biochemical analysis of 4-hydroxyphenylglycine transaminase confirmed its role as an aminotransferase responsible for formation of all three precursors. The close similarity but functional divergence between complestatin and chloroeremomycin biosynthetic genes also presents a unique opportunity for the construction of hybrid vancomycin-type antibiotics.
Abstract:
Plectin, a 500-kDa intermediate filament binding protein, has been proposed to provide mechanical strength to cells and tissues by acting as a cross-linking element of the cytoskeleton. To set the basis for future studies on gene regulation, tissue-specific expression, and pathological conditions involving this protein, we have cloned the human plectin gene, determined its coding sequence, and established its genomic organization. The coding sequence contains 32 exons that extend over 32 kb of the human genome. Most of the introns reside within a region encoding the globular N-terminal domain of the molecule, whereas the entire central rod domain and the entire C-terminal globular domain were found to be encoded by single exons of remarkable length, >3 kb and >6 kb, respectively. Overall, the organization of the human plectin gene was strikingly similar to that of human bullous pemphigoid antigen 1 (BPAG1), confirming that both proteins belong to the same gene family. Comparison of the deduced protein sequences for human and rat plectin revealed that they were 93% identical. By using fluorescence in situ hybridization, we have mapped the plectin gene to the long arm of chromosome 8 within the telomeric region. This gene locus (8q24) has previously been implicated in the human blistering skin disease epidermolysis bullosa simplex Ogna. Detailed knowledge of the structure of the plectin gene and its chromosome localization will aid in the elucidation of whether this or any other pathological conditions are linked to alterations in the plectin gene.
Abstract:
Competing hypotheses seek to explain the evolution of oxygenic and anoxygenic processes of photosynthesis. Since chlorophyll is less reduced and precedes bacteriochlorophyll on the modern biosynthetic pathway, it has been proposed that chlorophyll preceded bacteriochlorophyll in its evolution. However, recent analyses of nucleotide sequences that encode chlorophyll and bacteriochlorophyll biosynthetic enzymes appear to provide support for an alternative hypothesis. This is that the evolution of bacteriochlorophyll occurred earlier than the evolution of chlorophyll. Here we demonstrate that the presence of invariant sites in sequence datasets leads to inconsistency in tree building (including maximum-likelihood methods). Homologous sequences with different biological functions often share invariant sites at the same nucleotide positions. However, different constraints can also result in additional invariant sites unique to the genes, which have specific and different biological functions. Consequently, the distribution of these sites can be uneven between the different types of homologous genes. The presence of invariant sites, shared by related biosynthetic genes as well as those unique to only some of these genes, has misled the recent evolutionary analysis of oxygenic and anoxygenic photosynthetic pigments. We evaluate an alternative scheme for the evolution of chlorophyll and bacteriochlorophyll.
Abstract:
Expansins are unusual proteins discovered by virtue of their ability to mediate cell wall extension in plants. We identified cDNA clones for two cucumber expansins on the basis of peptide sequences of proteins purified from cucumber hypocotyls. The expansin cDNAs encode related proteins with signal peptides predicted to direct protein secretion to the cell wall. Northern blot analysis showed moderate transcript abundance in the growing region of the hypocotyl and no detectable transcripts in the nongrowing region. Rice and Arabidopsis expansin cDNAs were identified from collections of anonymous cDNAs (expressed sequence tags). Sequence comparisons indicate at least four distinct expansin cDNAs in rice and at least six in Arabidopsis. Expansins are highly conserved in size and sequence (60-87% amino acid sequence identity and 75-95% similarity between any pairwise comparison), and phylogenetic trees indicate that this multigene family formed before the evolutionary divergence of monocotyledons and dicotyledons. Sequence and motif analyses show no similarities to known functional domains that might account for expansin action on wall extension. A series of highly conserved tryptophans may function in expansin binding to cellulose or other glycans. The high conservation of this multigene family indicates that the mechanism by which expansins promote wall extension tolerates little variation in protein structure.
Abstract:
The bithorax complex (BX-C) of Drosophila, one of two complexes that act as master regulators of the body plan of the fly, has now been entirely sequenced and comprises approximately 315,000 bp, only 1.4% of which codes for protein. Analysis of this sequence reveals significantly overrepresented DNA motifs of unknown, as well as known, functions in the non-protein-coding portion of the sequence. The following types of motifs in that portion are analyzed: (i) concatamers of mono-, di-, and trinucleotides; (ii) tightly clustered hexanucleotides (spaced ≤ 5 bases apart); (iii) direct and reverse repeats longer than 20 bp; and (iv) a number of motifs known from biochemical studies to play a role in the regulation of the BX-C. The hexanucleotide AGATAC is remarkably overrepresented and is surmised to play a role in chromosome pairing. The positions of sites of highly overrepresented motifs are plotted for those that occur at more than five sites in the sequence, when < 0.5 case is expected. Expected values are based on a third-order Markov chain, which is the optimal order for representing the BXCALL sequence.
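The abstract states that expected motif counts come from a third-order Markov chain but gives no implementation details. The sketch below shows one common way to obtain such an expectation, assuming the transition probabilities are estimated from k-mer counts of the analyzed sequence itself. That estimator, the function names, and the toy sequence are illustrative assumptions, not the authors' code.

```python
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def expected_count(seq: str, word: str, order: int = 3) -> float:
    """Expected occurrences of `word` under an order-`order` Markov chain
    whose transition probabilities are estimated from `seq` itself.

    For a hexamer w1..w6 and order 3 this evaluates
    N(w1..w4) * N(w2..w5)/N(w2..w4) * N(w3..w6)/N(w3..w5).
    Edge effects at the end of the sequence are ignored.
    """
    if len(word) <= order:
        raise ValueError("word must be longer than the Markov order")
    contexts = kmer_counts(seq, order)        # counts of k-letter contexts
    extended = kmer_counts(seq, order + 1)    # counts of context + next letter
    expected = float(extended[word[:order + 1]])
    for i in range(1, len(word) - order):
        context = word[i:i + order]
        if contexts[context] == 0:
            return 0.0
        # Multiply by the estimated P(next letter | previous `order` letters).
        expected *= extended[word[i:i + order + 1]] / contexts[context]
    return expected


# Toy usage on a made-up sequence (not BX-C data): compare observed and
# expected counts for the hexanucleotide mentioned in the abstract.
toy = "AGATACGGTAGATACCTAGATACAGTTAGATACG" * 5
observed = kmer_counts(toy, 6)["AGATAC"]
print(observed, expected_count(toy, "AGATAC"))
```

A motif would be flagged as overrepresented when its observed count greatly exceeds this expectation, for example more than five sites where fewer than 0.5 are expected, as in the abstract.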
Abstract:
Mannitol is the most abundant sugar alcohol in nature, occurring in bacteria, fungi, lichens, and many species of vascular plants. Celery (Apium graveolens L.), a plant that forms mannitol photosynthetically, has high photosynthetic rates thought to result from intrinsic differences in the biosynthesis of hexitols vs. sugars. Celery also exhibits high salt tolerance due to the function of mannitol as an osmoprotectant. A mannitol catabolic enzyme that oxidizes mannitol to mannose (mannitol dehydrogenase, MTD) has been identified. In celery plants, MTD activity and tissue mannitol concentration are inversely related. MTD provides the initial step by which translocated mannitol is committed to central metabolism and, by regulating mannitol pool size, is important in regulating salt tolerance at the cellular level. We have now isolated, sequenced, and characterized a Mtd cDNA from celery. Analyses showed that Mtd RNA was more abundant in cells grown on mannitol and less abundant in salt-stressed cells. A protein database search revealed that the previously described ELI3 pathogenesis-related proteins from parsley and Arabidopsis are MTDs. Treatment of celery cells with salicylic acid resulted in increased MTD activity and RNA. Increased MTD activity results in an increased ability to utilize mannitol. Among other effects, this may provide an additional source of carbon and energy for response to pathogen attack. These responses of the primary enzyme controlling mannitol pool size reflect the importance of mannitol metabolism in plant responses to divergent types of environmental stress.
Abstract:
Aim: The aim of this study was to characterize the bacterial community adhering to the mucosa of the terminal ileum, and proximal and distal colon of the human digestive tract. Methods and Results: Pinch samples of the terminal ileum, proximal and distal colon were taken from a healthy 35-year-old, and a 68-year-old subject with mild diverticulosis. The 16S rDNA genes were amplified using a low number of PCR cycles, cloned, and sequenced. In total, 361 sequences were obtained comprising 70 operational taxonomic units (OTU), with a calculated coverage of 82.6%. Twenty-three per cent of OTU were common to the terminal ileum, proximal colon and distal colon, but 14% of OTU were only found in the terminal ileum, and 43% were only associated with the proximal or distal colon. The most frequently represented clones were from the Clostridium group XIVa (24.7%), and the Bacteroidetes (Cytophaga-Flavobacteria-Bacteroides) cluster (27.7%). Conclusion: Comparison of 16S rDNA clone libraries of the hindgut across mammalian species confirms that the distribution of phylogenetic groups is similar irrespective of the host species. Lesser site-related differences within groups or clusters of organisms are probable. Significance and Impact: This study provides further evidence of the distribution of the bacteria on the mucosal surfaces of the human hindgut. Data contribute to the benchmarking of the microbial composition of the human digestive tract.
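The abstract does not say how the library coverage was calculated. One estimator often used for clone libraries is Good's coverage, C = 1 - n1/N, where n1 is the number of OTU seen exactly once and N is the total number of sequences. The sketch below assumes that estimator and uses made-up OTU labels. It illustrates the idea only and is not meant to reproduce the 82.6% figure.

```python
from collections import Counter
from typing import Iterable


def goods_coverage(otu_labels: Iterable[str]) -> float:
    """Good's coverage: 1 - (number of singleton OTU) / (number of sequences).

    `otu_labels` holds one OTU assignment per sequenced clone.
    """
    counts = Counter(otu_labels)
    n_sequences = sum(counts.values())
    if n_sequences == 0:
        raise ValueError("at least one sequence is required")
    singletons = sum(1 for c in counts.values() if c == 1)
    return 1 - singletons / n_sequences


# Hypothetical clone library: seven clones assigned to four OTU,
# two of which (otu_b and otu_d) were observed only once.
library = ["otu_a", "otu_a", "otu_b", "otu_c", "otu_c", "otu_c", "otu_d"]
print(f"coverage = {goods_coverage(library):.3f}")
```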
Abstract:
Selection of machine learning techniques requires a certain sensitivity to the requirements of the problem. In particular, the problem can be made more tractable by deliberately using algorithms that are biased toward solutions of the requisite kind. In this paper, we argue that recurrent neural networks have a natural bias toward a problem domain of which biological sequence analysis tasks are a subset. We use experiments with synthetic data to illustrate this bias. We then demonstrate that this bias can be exploited, using a data set of protein sequences containing several classes of subcellular localization targeting peptides. The results show that recurrent neural networks will generally perform better than feed-forward networks on sequence analysis tasks. Furthermore, as the patterns within the sequence become more ambiguous, the choice of specific recurrent architecture becomes more critical.
Abstract:
The nucleotide sequence for pituitary prolactin cDNA from the marsupial bandicoot (Isoodon macrourus) was determined by reverse transcription-polymerase chain reaction and 5'/3' rapid amplification of cDNA ends. The deduced amino acid sequence showed high sequence identity with brushtail possum prolactin (95%) and all of the expected structural features of a quadruped prolactin. A prolactin gene tree was constructed and rates of evolution calculated for bandicoot, possum, opossum and several mammalian and non-mammalian prolactins. Bootstrap analysis provided strong support for marsupials as a sister group with eutherian mammals and weak support for opossum and bandicoot as an independent grouping from the brushtail possum. The rates of molecular evolution for marsupial prolactins were comparable to the slow rate seen in the majority of quadruped prolactins that have been sequenced. (c) 2005 Elsevier Inc. All rights reserved.
Abstract:
Objective: To describe and evaluate the performance of a new real-time seizure detection algorithm in the newborn infant. Methods: The algorithm includes parallel fragmentation of EEG signal into waves; wave-feature extraction and averaging; elementary, preliminary and final detection. The algorithm detects EEG waves with heightened regularity, using wave intervals, amplitudes and shapes. The performance of the algorithm was assessed with the use of event-based and liberal and conservative time-based approaches and compared with the performance of Gotman's and Liu's algorithms. Results: The algorithm was assessed on multi-channel EEG records of 55 neonates including 17 with seizures. The algorithm showed sensitivities ranging from 83 to 95% with positive predictive values (PPV) of 48-77%. There were 2.0 false positive detections per hour. In comparison, Gotman's algorithm (with 30 s gap-closing procedure) displayed sensitivities of 45-88% and PPV of 29-56%, with 7.4 false positives per hour, and Liu's algorithm displayed sensitivities of 96-99% and PPV of 10-25%, with 15.7 false positives per hour. Conclusions: The wave-sequence analysis based algorithm displayed higher sensitivity, higher PPV and a substantially lower level of false positives than two previously published algorithms. Significance: The proposed algorithm provides a basis for major improvements in neonatal seizure detection and monitoring. Published by Elsevier Ireland Ltd. on behalf of International Federation of Clinical Neurophysiology.
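The three performance measures reported in this abstract (sensitivity, PPV, and false positives per hour) can be computed from detection counts as sketched below. The counts in the usage example are hypothetical and chosen only to illustrate the formulas. The abstract does not give the underlying true-positive, false-positive, or false-negative counts, nor the recording duration.

```python
from typing import NamedTuple


class DetectionMetrics(NamedTuple):
    sensitivity: float          # TP / (TP + FN)
    ppv: float                  # TP / (TP + FP)
    false_pos_per_hour: float   # FP / recording hours


def detection_metrics(tp: int, fp: int, fn: int, hours: float) -> DetectionMetrics:
    """Summarize a detector from true positives, false positives,
    false negatives, and total recording time in hours."""
    if tp + fn == 0 or tp + fp == 0 or hours <= 0:
        raise ValueError("need at least one seizure, one detection, and hours > 0")
    return DetectionMetrics(
        sensitivity=tp / (tp + fn),
        ppv=tp / (tp + fp),
        false_pos_per_hour=fp / hours,
    )


# Hypothetical counts, not taken from the study.
m = detection_metrics(tp=90, fp=30, fn=10, hours=15.0)
print(f"sensitivity={m.sensitivity:.0%}  PPV={m.ppv:.0%}  FP/h={m.false_pos_per_hour:.1f}")
```

In an event-based assessment the counts refer to whole seizure events, while in a time-based assessment they refer to time segments, so the same recording can yield different values under each approach.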