963 resultados para Multiple sequence alignment


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Facile modification of oligodeoxyribonucleotides is required for efficient immobilization to a pre-activated glass surface. This report presents an oligodeoxyribonucleotide which contains a hairpin stem–loop structure with multiple phosphorothioate moieties in the loop. These moieties are used to anchor the oligo to glass slides that are pre-activated with bromoacetamidopropylsilane. The efficiency of the attachment reaction was improved by increasing the number of phosphorothioates in the loop, as shown in the remarkable enhancement of template hybridization and single base extension through catalysis by DNA polymerase. The loop and stem presumably serve as lateral spacers between neighboring oligodeoxyribonucleotides and as a linker arm between the glass surface and the single-stranded sequence of interest. The oligodeoxyribonucleotides of this hairpin stem–loop architecture with multiple phosphorothioate moieties have broad application in DNA chip-based gene analysis.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The distribution of optimal local alignment scores of random sequences plays a vital role in evaluating the statistical significance of sequence alignments. These scores can be well described by an extreme-value distribution. The distribution’s parameters depend upon the scoring system employed and the random letter frequencies; in general they cannot be derived analytically, but must be estimated by curve fitting. For obtaining accurate parameter estimates, a form of the recently described ‘island’ method has several advantages. We describe this method in detail, and use it to investigate the functional dependence of these parameters on finite-length edge effects.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This report documents the error rate in a commercially distributed subset of the IMAGE Consortium mouse cDNA clone collection. After isolation of plasmid DNA from 1189 bacterial stock cultures, only 62.2% were uncontaminated and contained cDNA inserts that had significant sequence identity to published data for the ordered clones. An agarose gel electrophoresis pre-screening strategy identified 361 stock cultures that appeared to contain two or more plasmid species. Isolation of individual colonies from these stocks demonstrated that 7.1% of the original 1189 stocks contained both a correct and an incorrect plasmid. 5.9% of the original 1189 stocks contained multiple, distinct, incorrect plasmids, indicating the likelihood of multiple contaminating events. While only 739 of the stocks purchased contained the desired cDNA clone, agarose gel pre-screening, colony isolation and similarity searching of dbEST allowed for the identification of an additional 420 clones that would have otherwise been discarded. Considering the high error rate in this subset of the IMAGE cDNA clone set, the use of sequence verified clones for cDNA microarray construction is warranted. When this is not possible, pre-screening non-sequence verified clones with agarose gel electrophoresis provides an inexpensive and efficient method to eliminate contaminated clones from the probe set.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The IMGT/HLA Database (www.ebi.ac.uk/imgt/hla/) specialises in sequences of polymorphic genes of the HLA system, the human major histocompatibility complex (MHC). The HLA complex is located within the 6p21.3 region on the short arm of human chromosome 6 and contains more than 220 genes of diverse function. Many of the genes encode proteins of the immune system and these include the 21 highly polymorphic HLA genes, which influence the outcome of clinical transplantation and confer susceptibility to a wide range of non-infectious diseases. The database contains sequences for all HLA alleles officially recognised by the WHO Nomenclature Committee for Factors of the HLA System and provides users with online tools and facilities for their retrieval and analysis. These include allele reports, alignment tools and detailed descriptions of the source cells. The online IMGT/HLA submission tool allows both new and confirmatory sequences to be submitted directly to the WHO Nomenclature Committee. The latest version (release 1.7.0 July 2000) contains 1220 HLA alleles derived from over 2700 component sequences from the EMBL/GenBank/DDBJ databases. The HLA database provides a model which will be extended to provide specialist databases for polymorphic MHC genes of other species.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The database reported here is derived using the Combinatorial Extension (CE) algorithm which compares pairs of protein polypeptide chains and provides a list of structurally similar proteins along with their structure alignments. Using CE, structure–structure alignments can provide insights into biological function. When a protein of known function is shown to be structurally similar to a protein of unknown function, a relationship might be inferred; a relationship not necessarily detectable from sequence comparison alone. Establishing structure–structure relationships in this way is of great importance as we enter an era of structural genomics where there is a likelihood of an increasing number of structures with unknown functions being determined. Thus the CE database is an example of a useful tool in the annotation of protein structures of unknown function. Comparisons can be performed on the complete PDB or on a structurally representative subset of proteins. The source protein(s) can be from the PDB (updated monthly) or uploaded by the user. CE provides sequence alignments resulting from structural alignments and Cartesian coordinates for the aligned structures, which may be analyzed using the supplied Compare3D Java applet, or downloaded for further local analysis. Searches can be run from the CE web site, http://cl.sdsc.edu/ce.html, or the database and software downloaded from the site for local use.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The human prion gene contains five copies of a 24 nt repeat that is highly conserved among species. An analysis of folding free energies of the human prion mRNA, in particular in the repeat region, suggested biased codon selection and the presence of RNA patterns. In particular, pseudoknots, similar to the one predicted by Wills in the human prion mRNA, were identified in the repeat region of all available prion mRNAs available in GenBank, but not those of birds and the red slider turtle. An alignment of these mRNAs, which share low sequence homology, shows several co-variations that maintain the pseudoknot pattern. The presence of pseudoknots in yeast Sup35p and Rnq1 suggests acquisition in the prokaryotic era. Computer generated three-dimensional structures of the human prion pseudoknot highlight protein and RNA interaction domains, which suggest a possible effect in prion protein translation. The role of pseudoknots in prion diseases is discussed as individuals with extra copies of the 24 nt repeat develop the familial form of Creutzfeldt–Jakob disease.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

SF3b155 is an essential spliceosomal protein, highly conserved during evolution. It has been identified as a subunit of splicing factor SF3b, which, together with a second multimeric complex termed SF3a, interacts specifically with the 12S U2 snRNP and converts it into the active 17S form. The protein displays a characteristic intranuclear localization. It is diffusely distributed in the nucleoplasm but highly concentrated in defined intranuclear structures termed “speckles,” a subnuclear compartment enriched in small ribonucleoprotein particles and various splicing factors. The primary sequence of SF3b155 suggests a multidomain structure, different from those of other nuclear speckles components. To identify which part of SF3b155 determines its specific intranuclear localization, we have constructed expression vectors encoding a series of epitope-tagged SF3b155 deletion mutants as well as chimeric combinations of SF3b155 sequences with the soluble cytoplasmic protein pyruvate kinase. Following transfection of cultured mammalian cells, we have identified (i) a functional nuclear localization signal of the monopartite type (KRKRR, amino acids 196–200) and (ii) a molecular segment with multiple threonine-proline repeats (amino acids 208–513), which is essential and sufficient to confer a specific accumulation in nuclear speckles. This latter sequence element, in particular amino acids 208–440, is required for correct subcellular localization of SF3b155 and is also sufficient to target a reporter protein to nuclear speckles. Moreover, this “speckle-targeting sequence” transfers the capacity for interaction with other U2 snRNP components.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The complete genome sequence of Caulobacter crescentus was determined to be 4,016,942 base pairs in a single circular chromosome encoding 3,767 genes. This organism, which grows in a dilute aquatic environment, coordinates the cell division cycle and multiple cell differentiation events. With the annotated genome sequence, a full description of the genetic network that controls bacterial differentiation, cell growth, and cell cycle progression is within reach. Two-component signal transduction proteins are known to play a significant role in cell cycle progression. Genome analysis revealed that the C. crescentus genome encodes a significantly higher number of these signaling proteins (105) than any bacterial genome sequenced thus far. Another regulatory mechanism involved in cell cycle progression is DNA methylation. The occurrence of the recognition sequence for an essential DNA methylating enzyme that is required for cell cycle regulation is severely limited and shows a bias to intergenic regions. The genome contains multiple clusters of genes encoding proteins essential for survival in a nutrient poor habitat. Included are those involved in chemotaxis, outer membrane channel function, degradation of aromatic ring compounds, and the breakdown of plant-derived carbon sources, in addition to many extracytoplasmic function sigma factors, providing the organism with the ability to respond to a wide range of environmental fluctuations. C. crescentus is, to our knowledge, the first free-living α-class proteobacterium to be sequenced and will serve as a foundation for exploring the biology of this group of bacteria, which includes the obligate endosymbiont and human pathogen Rickettsia prowazekii, the plant pathogen Agrobacterium tumefaciens, and the bovine and human pathogen Brucella abortus.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Trehalose (α-d-glucopyranosyl-1,1-α-d-glucopyranoside), a disaccharide widespread among microbes and lower invertebrates, is generally believed to be nonexistent in higher plants. However, the recent discovery of Arabidopsis genes whose products are involved in trehalose synthesis has renewed interest in the possibility of a function of trehalose in higher plants. We previously showed that trehalase, the enzyme that degrades trehalose, is present in nodules of soybean (Glycine max [L.] Merr.), and we characterized the enzyme as an apoplastic glycoprotein. Here we describe the purification of this trehalase to homogeneity and the cloning of a full-length cDNA encoding this enzyme, named GMTRE1 (G. max trehalase 1). The amino acid sequence derived from the open reading frame of GMTRE1 shows strong homology to known trehalases from bacteria, fungi, and animals. GMTRE1 is a single-copy gene and is expressed at a low but constant level in many tissues.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

ADAM 3 is a sperm surface glycoprotein that has been implicated in sperm-egg adhesion. Because little is known about the adhesive activity of ADAMs, we investigated the interaction of ADAM 3 disintegrin domains, made in bacteria and in insect cells, with murine eggs. Both recombinant proteins inhibited sperm-egg binding and fusion with potencies similar to that which we recently reported for the ADAM 2 disintegrin domain. Alanine scanning mutagenesis revealed a critical importance for the glutamine at position 7 of the disintegrin loop. Fluorescent beads coated with the ADAM 3 disintegrin domain bound to the egg surface. Bead binding was inhibited by an authentic, but not by a scrambled, peptide analog of the disintegrin loop. Bead binding was also inhibited by the function-blocking anti-α6 monoclonal antibody (mAb) GoH3, but not by a nonfunction blocking anti-α6 mAb, or by mAbs against either the αv or β3 integrin subunits. We also present evidence that in addition to the tetraspanin CD9, two other β1-integrin-associated proteins, the tetraspanin CD81 as well as the single pass transmembrane protein CD98 are expressed on murine eggs. Antibodies to CD9 and CD98 inhibited in vitro fertilization and binding of the ADAM 3 disintegrin domain. Our findings are discussed in terms of the involvement of multiple sperm ADAMs and multiple egg β1 integrin-associated proteins in sperm-egg binding and fusion. We propose that an egg surface “tetraspan web” facilitates fertilization and that it may do so by fostering ADAM–integrin interactions.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

It has been suggested that delayed DNA replication underlies fragility at common human fragile sites, but specific sequences responsible for expression of these inducible fragile sites have not been identified. One approach to identify such cis-acting sequences within the large nonexonic regions of fragile sites would be to identify conserved functional elements within orthologous fragile sites by interspecies sequence comparison. This study describes a comparison of orthologous fragile regions, the human FRA3B/FHIT and the murine Fra14A2/Fhit locus. We sequenced over 600 kbp of the mouse Fra14A2, covering the region orthologous to the fragile epicenter of FRA3B, and determined the Fhit deletion break points in a mouse kidney cancer cell line (RENCA). The murine Fra14A2 locus, like the human FRA3B, was characterized by a high AT content. Alignment of the two sequences showed that this fragile region was stable in evolution despite its susceptibility to mitotic recombination on inhibition of DNA replication. There were also several unusual highly conserved regions (HCRs). The positions of predicted matrix attachment regions (MARs), possibly related to replication origins, were not conserved. Of known fragile region landmarks, five cancer cell break points, one viral integration site, and one aphidicolin break cluster were located within or near HCRs. Thus, comparison of orthologous fragile regions has identified highly conserved sequences with possible functional roles in maintenance of fragility.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Polymorphisms in the prion protein gene are known to affect prion disease incubation times and susceptibility in humans and mice. However, studies with inbred lines of mice show that large differences in incubation times occur even with the same amino acid sequence of the prion protein, suggesting that other genes may contribute to the observed variation. To identify these loci we analyzed 1,009 animals from an F2 intercross between two strains of mice, CAST/Ei and NZW/OlaHSd, with significantly different incubation periods when challenged with RML scrapie prions. Interval mapping identified three highly significantly linked regions on chromosomes 2, 11, and 12; composite interval mapping suggests that each of these regions includes multiple linked quantitative trait loci. Suggestive evidence for linkage was obtained on chromosomes 6 and 7. The sequence conservation between the mouse and human genome suggests that identification of mouse prion susceptibility alleles may have direct relevance to understanding human susceptibility to bovine spongiform encephalopathy (BSE) infection, as well as identifying key factors in the molecular pathways of prion pathogenesis. However, the demonstration of other major genetic effects on incubation period suggests the need for extreme caution in interpreting estimates of variant Creutzfeldt–Jakob disease epidemic size utilizing existing epidemiological models.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present a method for discovering conserved sequence motifs from families of aligned protein sequences. The method has been implemented as a computer program called emotif (http://motif.stanford.edu/emotif). Given an aligned set of protein sequences, emotif generates a set of motifs with a wide range of specificities and sensitivities. emotif also can generate motifs that describe possible subfamilies of a protein superfamily. A disjunction of such motifs often can represent the entire superfamily with high specificity and sensitivity. We have used emotif to generate sets of motifs from all 7,000 protein alignments in the blocks and prints databases. The resulting database, called identify (http://motif.stanford.edu/identify), contains more than 50,000 motifs. For each alignment, the database contains several motifs having a probability of matching a false positive that range from 10−10 to 10−5. Highly specific motifs are well suited for searching entire proteomes, while generating very few false predictions. identify assigns biological functions to 25–30% of all proteins encoded by the Saccharomyces cerevisiae genome and by several bacterial genomes. In particular, identify assigned functions to 172 of proteins of unknown function in the yeast genome.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present an approach for assessing the significance of sequence and structure comparisons by using nearly identical statistical formalisms for both sequence and structure. Doing so involves an all-vs.-all comparison of protein domains [taken here from the Structural Classification of Proteins (scop) database] and then fitting a simple distribution function to the observed scores. By using this distribution, we can attach a statistical significance to each comparison score in the form of a P value, the probability that a better score would occur by chance. As expected, we find that the scores for sequence matching follow an extreme-value distribution. The agreement, moreover, between the P values that we derive from this distribution and those reported by standard programs (e.g., blast and fasta validates our approach. Structure comparison scores also follow an extreme-value distribution when the statistics are expressed in terms of a structural alignment score (essentially the sum of reciprocated distances between aligned atoms minus gap penalties). We find that the traditional metric of structural similarity, the rms deviation in atom positions after fitting aligned atoms, follows a different distribution of scores and does not perform as well as the structural alignment score. Comparison of the sequence and structure statistics for pairs of proteins known to be related distantly shows that structural comparison is able to detect approximately twice as many distant relationships as sequence comparison at the same error rate. The comparison also indicates that there are very few pairs with significant similarity in terms of sequence but not structure whereas many pairs have significant similarity in terms of structure but not sequence.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A whole genome cattle-hamster radiation hybrid cell panel was used to construct a map of 54 markers located on bovine chromosome 5 (BTA5). Of the 54 markers, 34 are microsatellites selected from the cattle linkage map and 20 are genes. Among the 20 mapped genes, 10 are new assignments that were made by using the comparative mapping by annotation and sequence similarity strategy. A LOD-3 radiation hybrid framework map consisting of 21 markers was constructed. The relatively low retention frequency of markers on this chromosome (19%) prevented unambiguous ordering of the other 33 markers. The length of the map is 398.7 cR, corresponding to a ratio of ≈2.8 cR5,000/cM. Type I genes were binned for comparison of gene order among cattle, humans, and mice. Multiple internal rearrangements within conserved syntenic groups were apparent upon comparison of gene order on BTA5 and HSA12 and HSA22. A similarly high number of rearrangements were observed between BTA5 and MMU6, MMU10, and MMU15. The detailed comparative map of BTA5 should facilitate identification of genes affecting economically important traits that have been mapped to this chromosome and should contribute to our understanding of mammalian chromosome evolution.