12 results for local sequence alignment problem

in DigitalCommons@The Texas Medical Center


Relevance:

100.00%

Publisher:

Abstract:

Historically, morphological features were used as the primary means to classify organisms. However, the age of molecular genetics has allowed us to approach this field from the perspective of the organism's genetic code. Early work used highly conserved sequences, such as ribosomal RNA. The increasing number of complete genomes in the public data repositories provides the opportunity to look not only at a single gene, but at organisms' entire parts list.
Here the Sequence Comparison Index (SCI) and the Organism Comparison Index (OCI), algorithms and methods to compare proteins and proteomes, are presented. The complete proteomes of 104 sequenced organisms were compared. Over 280 million full Smith-Waterman alignments were performed on sequence pairs which had a reasonable expectation of being related. From these alignments a whole proteome phylogenetic tree was constructed. This method was also used to compare the small subunit (SSU) rRNA from each organism, and a tree was constructed from these results. The SSU rRNA tree by the SCI/OCI method looks very much like accepted SSU rRNA trees from sources such as the Ribosomal Database Project, thus validating the method. The SCI/OCI proteome tree showed a number of small but significant differences when compared to the SSU rRNA tree and proteome trees constructed by other methods. Horizontal gene transfer does not appear to affect the SCI/OCI trees until the transferred genes make up a large portion of the proteome.
As part of this work, the Database of Related Local Alignments (DaRLA) was created; it contains over 81 million rows of sequence alignment information. DaRLA, while primarily used to build the whole proteome trees, can also be applied to shared gene content analysis, gene order analysis, and the creation of individual protein trees.
Finally, the standard BLAST method for analyzing shared gene content was compared to the SCI method using 4 spirochetes. The SCI system performed flawlessly, finding all proteins from one organism against itself and finding all the ribosomal proteins between organisms. The BLAST system missed some proteins from its respective organism and failed to detect small ribosomal proteins between organisms.
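
The abstract above relies on full Smith-Waterman local alignments. The following is a minimal sketch of that recurrence, not the thesis's implementation: the function name and the match, mismatch, and linear gap scores are illustrative assumptions, and protein comparisons normally use a substitution matrix with affine gap penalties instead.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Scores are hypothetical defaults chosen for illustration only.
    """
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1]
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero, so an
            # alignment can start anywhere in either sequence.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Keeping only two rows of the matrix is enough when just the score is needed; recovering the aligned residues would require storing the full matrix and tracing back from the best cell.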

Relevance:

100.00%

Publisher:

Abstract:

A pivotal mediator of actin dynamics is the protein cofilin, which promotes filament severing and depolymerization, facilitating the breakdown of existing filaments, and the enhancement of filament growth from newly created barbed ends. It does so in concert with actin interacting protein 1 (Aip1), which serves to accelerate cofilin's activity. While progress has been made in understanding its biochemical functions, the physiologic processes the cofilin/Aip1 complex regulates, particularly in higher organisms, are yet to be determined. We have generated an allelic series for WD40 repeat protein 1 (Wdr1), the mammalian homolog of Aip1, and report that reductions in Wdr1 function produce a dramatic phenotype gradient. While severe loss of function at the Wdr1 locus causes embryonic lethality, macrothrombocytopenia and autoinflammatory disease develop in mice carrying hypomorphic alleles. Macrothrombocytopenia is the result of megakaryocyte maturation defects, which lead to a failure of normal platelet shedding. Autoinflammatory disease, which is bone marrow-derived yet nonlymphoid in origin, is characterized by a massive infiltration of neutrophils into inflammatory lesions. Cytoskeletal responses are impaired in Wdr1 mutant neutrophils. These studies establish an essential requirement for Wdr1 in megakaryocytes and neutrophils, indicating that cofilin-mediated actin dynamics are critically important to the development and function of both cell types.

Relevance:

100.00%

Publisher:

Abstract:

POLN is a nuclear A-family DNA polymerase encoded in vertebrate genomes. POLN has unusual fidelity and DNA lesion bypass properties, including strong strand displacement activity, low fidelity favoring incorporation of T for template G and accurate translesion synthesis past a 5S-thymine glycol (5S-Tg). We searched for conserved features of the polymerase domain that distinguish it from prokaryotic pol I-type DNA polymerases. A Lys residue (679 in human POLN) of particular interest was identified in the conserved 'O-helix' of motif 4 in the fingers sub-domain. The corresponding residue is one of the most important for controlling fidelity of prokaryotic pol I and is a nonpolar Ala or Thr in those enzymes. Kinetic measurements show that K679A or K679T POLN mutant DNA polymerases have full activity on nondamaged templates, but poorly incorporate T opposite template G and do not bypass 5S-Tg efficiently. We also found that a conserved Tyr residue in the same motif not only affects sensitivity to dideoxynucleotides, but also greatly influences enzyme activity, fidelity and bypass. Protein sequence alignment reveals that POLN has three specific insertions in the DNA polymerase domain. The results demonstrate that residues have been strictly retained during evolution that confer unique bypass and fidelity properties on POLN.

Relevance:

100.00%

Publisher:

Abstract:

Cells must rapidly sense and respond to a wide variety of potentially cytotoxic external stressors to survive in a constantly changing environment. In a search for novel genes required for stress tolerance in Saccharomyces cerevisiae, we identified the uncharacterized open reading frame YER139C as a gene required for growth at 37 °C in the presence of the heat shock mimetic formamide. YER139C encodes the closest yeast homolog of the human RPAP2 protein, recently identified as a novel RNA polymerase II (RNAPII)-associated factor. Multiple lines of evidence support a role for this gene family in transcription, prompting us to rename YER139C RTR1 (regulator of transcription). The core RNAPII subunits RPB5, RPB7, and RPB9 were isolated as potent high-copy-number suppressors of the rtr1Δ temperature-sensitive growth phenotype, and deletion of the nonessential subunits RPB4 and RPB9 hypersensitized cells to RTR1 overexpression. Disruption of RTR1 resulted in mycophenolic acid sensitivity and synthetic genetic interactions with a number of genes involved in multiple phases of transcription. Consistent with this, rtr1Δ cells are defective in inducible transcription from the GAL1 promoter. Rtr1 constitutively shuttles between the cytoplasm and nucleus, where it physically associates with an active RNAPII transcriptional complex. Taken together, our data reveal a role for members of the RTR1/RPAP2 family as regulators of core RNAPII function.

Relevance:

100.00%

Publisher:

Abstract:

A gain-of-function R620W polymorphism in the PTPN22 gene, encoding the lymphoid tyrosine phosphatase LYP, has recently emerged as an important risk factor for human autoimmunity. Here we report that another missense substitution (R263Q) within the catalytic domain of LYP leads to reduced phosphatase activity. High-resolution structural analysis revealed the molecular basis for this loss of function. Furthermore, the Q263 variant conferred protection against human systemic lupus erythematosus, reinforcing the proposal that inhibition of LYP activity could be beneficial in human autoimmunity.

Relevance:

100.00%

Publisher:

Abstract:

Transcription enhancer factor 1 is essential for cardiac, skeletal, and smooth muscle development and uses its N-terminal TEA domain (TEAD) to bind M-CAT elements. Here, we present the first structure of TEAD and show that it is a three-helix bundle with a homeodomain fold. Structural data reveal how TEAD binds DNA. Using structure-function correlations, we find that the L1 loop is essential for cooperative loading of TEAD molecules onto tandemly duplicated M-CAT sites. Furthermore, using a microarray chip-based assay, we establish that known binding sites of the full-length protein are only a subset of DNA elements recognized by TEAD. Our results provide a model for understanding the regulation of genome-wide gene expression during development by the TEA/ATTS family of transcription factors.

Relevance:

100.00%

Publisher:

Abstract:

Previous studies have demonstrated that ribbon synapses in the retina do not contain the t-SNARE (target-soluble N-ethylmaleimide-sensitive factor attachment protein receptor) syntaxin 1A that is found in conventional synapses of the nervous system. In contrast, ribbon synapses of the retina contain the related isoform syntaxin 3. In addition to its localization in ribbon synapses, syntaxin 3 is also found in nonneuronal cells, where it has been implicated in the trafficking of transport vesicles to the apical plasma membrane of polarized cells. The syntaxin 3 gene codes for four different splice forms, syntaxins 3A, 3B, 3C, and 3D. We demonstrate here, using analysis of EST databases, RT-PCR, in situ hybridization, and Northern blot analysis, that cells in the mouse retina express only syntaxin 3B. In contrast, nonneuronal tissues, such as kidney, express only syntaxin 3A. The two major syntaxin isoforms (3A and 3B) have an identical N-terminal domain but differ in the C-terminal half of the SNARE domain and the C-terminal transmembrane domain. These two domains are thought to be directly involved in synaptic vesicle fusion. The interaction of syntaxin 1A and syntaxin 3B with other synaptic proteins was examined. We found that both proteins bind Munc18/N-sec1 with similar affinity. In contrast, syntaxin 3B had a much lower binding affinity for the t-SNARE SNAP25 compared with syntaxin 1A. Using an in vitro fusion assay, we show that vesicles containing syntaxin 3B and SNAP25 fuse with vesicles containing synaptobrevin2/VAMP2, demonstrating that syntaxin 3B can function as a t-SNARE.

Relevance:

100.00%

Publisher:

Abstract:

Cytochromes P450 are a superfamily of heme-thiolate proteins that function in concert with another protein, cytochrome P450 reductase, as terminal oxidases of an enzymatic system catalyzing the metabolism of a variety of foreign compounds and endogenous substrates. To better understand the catalytic mechanism and substrate specificity of P450s, information about the structure of the active site is necessary. Given the lack of a crystal structure of mammalian P450, other methods have been used to elucidate the substrate recognition and binding site structure in the active center. In this project I utilized the photoaffinity labeling technique and site-directed mutagenesis approach to gain further structural insight into the active site of mammalian cytochrome P4501A1 and examine the role of surface residues in the interaction of P4501A1 with the reductase.
Four crosslinked peptides were identified by photoaffinity labeling using diazido benzphetamine as a substrate analog. Alignment of the primary structure of cytochrome P4501A1 with that of bacterial cytochrome P450102 (the crystal structure of which is known) revealed that two of the isolated crosslinked peptides can be placed in the vicinity of heme (in the L helix region and β10-β11 sheet region of cytochrome P450102) and could be involved in substrate binding. The other two peptides were located on the surface of the protein with the label bound specifically to Lys residues that were proposed to be involved in reductase-P450 interaction.
Alternatively, it has been shown that some organic hydroperoxides can support P450-catalyzed reactions in the absence of NADPH, O2 and reductase. By means of photoaffinity labeling, the cumene hydroperoxide binding region was identified. Using azidocumene as the photoaffinity label, the tripeptide T501-L502-K503 was shown to be the site where azidocumene covalently binds to P4501A1. The sequence alignment of cytochrome P4501A1 with cytochrome P450102 predicts that this region might correspond to a β-sheet structure localized on the distal side of the heme ring near the I helix and the oxygen binding pocket. The role of Thr501 in cumene hydroperoxide binding was confirmed by mutations of this residue and kinetic analysis of the effects of the mutations.
In addition, the role of two lysine residues, Lys271 and Lys279, in the interaction with reductase was examined by means of site-directed mutagenesis. The lysine residues were substituted with isoleucine, and the enzymatic activities of the wild type and the mutants were compared in reductase- and cumene hydroperoxide-supported systems. Lysine 279 was shown to play a critical role in the P4501A1-reductase interaction.

Relevance:

100.00%

Publisher:

Abstract:

My dissertation focuses on two aspects of RNA sequencing technology. The first is the methodology for modeling the overdispersion inherent in RNA-seq data for differential expression analysis. This aspect is addressed in three sections. The second aspect is the application of RNA-seq data to identify the CpG island methylator phenotype (CIMP) by integrating datasets of mRNA expression level and DNA methylation status.
Section 1: The cost of DNA sequencing has fallen dramatically in the past decade. Consequently, genomic research increasingly depends on sequencing technology. However, it remains unclear how sequencing capacity influences the accuracy of mRNA expression measurement. We observe that accuracy improves with increasing sequencing depth. To model the overdispersion, we use the beta-binomial distribution with a new parameter indicating the dependency between overdispersion and sequencing depth. Our modified beta-binomial model performs better than the binomial or the pure beta-binomial model, with a lower false discovery rate.
Section 2: Although a number of methods have been proposed to accurately analyze differential RNA expression at the gene level, modeling at the base pair level is required. Here, we find that the overdispersion rate decreases as the sequencing depth increases at the base pair level. We also propose four models and compare them with each other. As expected, our beta-binomial model with a dynamic overdispersion rate is shown to be superior.
Section 3: We investigate biases in RNA-seq by exploring the measurement of the external control, spike-in RNA. This study is based on two datasets with spike-in controls obtained from a recent study. We observe a previously unreported bias in the measurement of the spike-in transcripts that arises from the influence of the sample transcripts in RNA-seq. We also find that this influence is related to the local sequence of the random hexamer that is used in priming. We suggest a model of the inequality between samples to correct this type of bias.
Section 4: The expression of a gene can be turned off when its promoter is highly methylated. Several studies have reported that a clear threshold effect exists in gene silencing that is mediated by DNA methylation. It is reasonable to assume the thresholds are specific for each gene. It is also intriguing to investigate genes that are largely controlled by DNA methylation. These genes are called "L-shaped" genes. We develop a method to determine the DNA methylation threshold and identify a new CIMP of BRCA.
In conclusion, we provide a detailed understanding of the relationship between the overdispersion rate and sequencing depth. We also reveal a new bias in RNA-seq and characterize its relationship to the local sequence. Finally, we develop a powerful method to dichotomize methylation status and consequently identify a new CIMP of breast cancer with a distinct classification of molecular characteristics and clinical features.
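
Sections 1 and 2 of the abstract above model overdispersion with a beta-binomial distribution. The sketch below shows only the standard variance formulas, for orientation. The function names and the parameterization by an intra-class correlation rho = 1 / (alpha + beta + 1) are assumptions chosen for illustration; the dissertation's additional depth-dependent parameter is not specified in the abstract and is not reproduced here.

```python
def binomial_variance(n, p):
    """Variance of a binomial count with n trials and success probability p."""
    return n * p * (1 - p)


def beta_binomial_variance(n, p, rho):
    """Variance of a beta-binomial count with mean n * p.

    rho is the intra-class correlation, 1 / (alpha + beta + 1), with
    0 <= rho < 1. Setting rho = 0 recovers the binomial variance.
    (Hypothetical helper for illustration, not the author's model.)
    """
    return binomial_variance(n, p) * (1 + (n - 1) * rho)


if __name__ == "__main__":
    # Compare the two variances at increasing depth n for a fixed rho.
    for depth in (10, 100, 1000):
        print(depth, binomial_variance(depth, 0.5),
              beta_binomial_variance(depth, 0.5, 0.05))
```

With rho held fixed, the variance inflation factor 1 + (n - 1) * rho grows with depth n, which is one reason a model that lets the overdispersion rate vary with sequencing depth, as the abstract describes, can fit differently from the pure beta-binomial.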

Relevance:

100.00%

Publisher:

Abstract:

Transglutaminases are a family of enzymes that catalyze the covalent cross-linking of proteins through the formation of ε-(γ-glutaminyl)-lysyl isopeptide bonds. Tissue transglutaminase (Tgase) is an intracellular enzyme which is expressed in terminally differentiated and senescent cells and also in cells undergoing apoptotic cell death. To characterize this enzyme and examine its relationship with other members of the transglutaminase family, cDNAs, the first two exons of the gene and 2 kb of the 5′ flanking region, including the promoter, were isolated. The full length Tgase transcript consists of 66 bp of 5′-UTR (untranslated) sequence, an open reading frame which encodes 686 amino acids and 1400 bp of 3′-UTR sequence. Alignment of the deduced Tgase protein sequence with that of other transglutaminases revealed regions of strong homology, particularly in the active site region.
The Tgase cDNA was used to isolate and characterize a genomic clone encompassing the 5′ end of the mouse Tgase gene. The transcription start site was defined using genomic and cDNA clones coupled with S1 protection analysis and anchored PCR. This clone includes 2.3 kb upstream of the transcription start site and two exons that contain the first 256 nucleotides of the mouse Tgase cDNA sequence. The exon-intron boundaries have been mapped and compared with the exon-intron boundaries of three members of the transglutaminase family: human factor XIIIa, the human keratinocyte transglutaminase and human erythrocyte band 4.1. Tissue Tgase exon II is similar to comparable exons of these genes. However, exon I bears no resemblance to any of the other transglutaminase amino terminus exons.
Previous work in our laboratory has shown that the transcription of the Tgase gene is directly controlled by retinoic acid and retinoic acid receptors. To identify the region of the Tgase gene responsible for regulating its expression, fragments of the Tgase promoter and 5′-flanking region were cloned into chloramphenicol acetyl transferase (CAT) reporter constructs. Transient transfection experiments with these constructs demonstrated that the upstream region of Tgase is a functional promoter which contains a retinoid response element within a 1573 nucleotide region spanning nucleotides −252 to −1825.

Relevance:

30.00%

Publisher:

Abstract:

This study is a secondary analysis of a survey developed by Dr. Jimmy Perkins and administered by San Antonio/Bexar County Metropolitan Health District. The survey was developed subsequent to the implementation of the city smoking ordinance effective January 1, 2004. The survey had a multi-purpose plan: to establish the number of restaurants having smoke free status prior to and following the ordinance, to determine compliance as it relates to a necessary smoking section and proper signage, and to expose the rationale for restaurants to become smoke free. The data resulting from the survey were presented to the San Antonio/Bexar County Metropolitan Health District. The summary presented the types of establishments surveyed, smoking status of the establishment, reasons for the establishment becoming smoke free, compliance with smoking sections, compliance with signage requirements, awareness of ordinance, and chain status of the establishment.
The results of this study display the relationships among the variables previously mentioned. These relationships were examined and each was tested for significance. The analysis indicates that knowledge translates into compliance with signage regulations, which in turn translates into ordinance compliance. Size matters as it relates to an establishment's number of employees and seating capacity: the smaller the establishment, the more likely it is to have become smoke free before the ordinance went into effect. Restaurants, rather than fast food establishments, most commonly cited compliance with the ordinance as their reason for becoming smoke free, and only ten percent of restaurants gave policy as the main reason for becoming smoke free.
This study is important for public health because the negative health effects of environmental tobacco smoke (ETS) are still an overwhelming problem in the United States (3). ETS is a Known Human Group A Carcinogen (5). The Environmental Protection Agency (EPA) has estimated that around 3,000 non-smoking Americans die every year from lung cancer caused by ETS (6). This information illustrates the importance of providing smoke free establishments, especially to non-smoking patrons.

Relevance:

30.00%

Publisher:

Abstract:

Microarray technology is a high-throughput method for genotyping and gene expression profiling. Limited sensitivity and specificity are among the essential problems of this technology. Most existing methods of microarray data analysis have an apparent limitation: they deal only with the numerical part of microarray data and make little use of gene sequence information. Because it is the gene sequences that precisely define the physical objects being measured by a microarray, it is natural to make the gene sequences an essential part of the data analysis. This dissertation focused on the development of free energy models to integrate sequence information in microarray data analysis. The models were used to characterize the mechanism of hybridization on microarrays and enhance sensitivity and specificity of microarray measurements.
Cross-hybridization is a major obstacle to the sensitivity and specificity of microarray measurements. In this dissertation, we evaluated the scope of the cross-hybridization problem on short-oligo microarrays. The results showed that cross-hybridization on arrays is mostly caused by oligo fragments with a run of 10 to 16 nucleotides complementary to the probes. Furthermore, a free-energy based model was proposed to quantify the amount of cross-hybridization signal on each probe. This model treats cross-hybridization as an integral effect of the interactions between a probe and various off-target oligo fragments. Using public spike-in datasets, the model showed high accuracy in predicting the cross-hybridization signals on those probes whose intended targets are absent in the sample.
Several prospective models were proposed to improve the Positional Dependent Nearest-Neighbor (PDNN) model for better quantification of gene expression and cross-hybridization.
The problem addressed in this dissertation is fundamental to microarray technology. We expect that this study will help us to understand the detailed mechanism that determines sensitivity and specificity on the microarrays. Consequently, this research will have a wide impact on how microarrays are designed and how the data are interpreted.
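
The abstract above attributes most cross-hybridization to fragments carrying a run of 10 to 16 nucleotides complementary to a probe. As a rough sketch of how such a run could be measured (this is not the dissertation's free-energy model), the code below finds the longest perfectly complementary, antiparallel stretch between a probe and a fragment. All function names are hypothetical, and only the unambiguous DNA letters A, C, G, T are assumed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_complementary_run(probe, fragment):
    """Length of the longest stretch of `fragment` that can base-pair
    perfectly (antiparallel) with a contiguous stretch of `probe`."""
    # A fragment pairs with the probe wherever its reverse complement
    # shares a substring with the probe, so this reduces to the
    # longest-common-substring problem.
    target = reverse_complement(fragment)
    best = 0
    prev = [0] * (len(target) + 1)
    for i in range(1, len(probe) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if probe[i - 1] == target[j - 1]:
                curr[j] = prev[j - 1] + 1  # extend the matching run
                best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    probe = "GATTACAGATTACA"
    print(longest_complementary_run(probe, "CTGTAA"))
```

A screen based on this helper could flag fragments whose run length falls in the 10 to 16 range the abstract reports, but run length alone ignores base composition and position, which a free-energy model would take into account.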