987 results for sequence homology
Abstract:
Cystic Fibrosis (CF) is an autosomal recessive monogenic disorder caused by mutations in the cystic fibrosis transmembrane conductance regulator (CFTR) gene with the ΔF508 mutation accounting for approximately 70% of all CF cases worldwide. This thesis investigates whether existing zinc finger nucleases designed in this lab and CRISPR/gRNAs designed in this thesis can mediate efficient homology-directed repair (HDR) with appropriate donor repair plasmids to correct CF-causing mutations in a CF cell line. Firstly, the most common mutation, ΔF508, was corrected using a pair of existing ZFNs, which cleave in intron 9, and the donor repair plasmid pITR-donor-XC, which contains the correct CTT sequence and two unique restriction sites. HDR was initially determined to be <1% but further analysis by next generation sequencing (NGS) revealed HDR occurred at a level of 2%. This relatively low level of repair was determined to be a consequence of distance from the cut site to the mutation and so rather than designing a new pair of ZFNs, the position of the existing intron 9 ZFNs was exploited and attempts made to correct >80% of CF-causing mutations. The ZFN cut site was used as the site for HDR of a mini-gene construct comprising exons 10-24 from CFTR cDNA (with appropriate splice acceptor and poly A sites) to allow production of full length corrected CFTR mRNA. Finally, the ability to cleave closer to the mutation and mediate repair of CFTR using the latest gene editing tool CRISPR/Cas9 was explored. Two CRISPR gRNAs were tested; CRISPR ex10 was shown to cleave at an efficiency of 15% and CRISPR in9 cleaved at 3%. Both CRISPR gRNAs mediated HDR with appropriate donor plasmids at a rate of ~1% as determined by NGS. This is the first evidence of CRISPR induced HDR in CF cell lines.
Abstract:
Lactococcus lactis is used extensively world-wide for the production of fermented dairy products. Bacteriophages (phages) infecting L. lactis can result in slow or incomplete fermentations, or may even cause total fermentation failure. Therefore, bacteriophages disrupting L. lactis fermentation are of economic concern. This thesis employed a multifaceted approach to investigate various molecular aspects of phage-host interaction in L. lactis. The genome sequence of an Irish dairy starter strain, the prophage-cured L. lactis subsp. cremoris UC509.9, was studied. The 2,250,427 bp circular chromosome represents the smallest among its sequenced lactococcal equivalents. The genome displays clear genetic adaptation to the dairy niche in the form of extensive reductive evolution. Gene prediction identified 2066 protein-encoding genes, including 104 which showed significant homology to transposase-specifying genes. Over 9% of the identified genes appear to be inactivated through stop codons or frame shift mutations. Many pseudogenes were found in genes that are assigned to carbohydrate and amino acid transport and metabolism orthologous groups, reflecting L. lactis UC509.9’s adaptation to the lactose and casein-rich dairy environment. Sequence analysis of the eight plasmids of L. lactis revealed extensive adaptation to the dairy environment. Key industrial phenotypes were mapped and novel lactococcal plasmid-associated genes highlighted. In addition to chromosomally-encoded bacteriophage resistance systems, six such functional systems were identified, including two abortive infection systems, AbiB and AbiD1, explaining the observed phage resistance of L. lactis UC509.9. Molecular analysis suggests that the constitutive expression of AbiB is not lethal to cells, suggesting the protein is expressed in an un/inactivated form.
Analysis of 936 species phage sk1-escape mutants of AbiB revealed that all such mutants harbour mutations in orf6, which encodes the major capsid protein. Results suggest that the major capsid protein is required for activation of the AbiB system, although this requires further investigation. Temporal transcriptomes of L. lactis UC509.9 undergoing lytic infection with either one of two distinct bacteriophages, Tuc2009 and c2, were determined and compared to the transcriptome of uninfected UC509.9 cells. Whole genome microarrays performed at various time-points post-infection demonstrated a rather modest impact on host transcription. Alterations in the UC509.9 transcriptome during lytic infection appear phage-specific, with a relatively small number of differentially transcribed genes shared between infection with either Tuc2009 or c2. Transcriptional profiles of both bacteriophages during lytic infection were shown to generally correlate with previous studies and allowed the confirmation of previously predicted promoter sequences. Bioinformatic analysis of genomic regions encoding the presumed cell wall polysaccharide (CW PS) biosynthesis gene cluster of several strains of L. lactis was performed. Results demonstrate the presence of three dominant genetic types of this gene cluster, termed type A, B and C. These regions were used for the development of a multiplex PCR to identify CW PS genotype of various lactococcal strains. Analysis of 936 species phage receptor binding protein (RBP) phylogeny and CW PS genotype revealed an apparent correlation between RBP phylogeny and CW PS type, thereby providing a partial explanation for the observed narrow host range of 936 phages. Further analysis of the genetic locus encompassing the presumed CW PS biosynthesis operon of eight strains identified as belonging to the CW PS C (geno)type, revealed the presence of a variable region among the examined strains.
The obtained comparative analysis allowed for the identification of five subgroups of the C type, named C1 to C5. We purified an acidic polysaccharide from the cell wall of L. lactis 3107 (C2 subtype) and confirmed that it is structurally different from the CW PS of the C1 subtype L. lactis MG1363. Combinations of genes from the variable region of C2 subtype were amplified from L. lactis 3107 and introduced into a mutant of the C1 subtype L. lactis NZ9000 (a direct derivative of MG1363) deficient in CW PS biosynthesis. The resulting recombinant mutant synthesized a CW PS with a composition characteristic for that of the C2 subtype L. lactis 3107 and not the wildtype C1 L. lactis NZ9000. The recombinant mutant exhibited a changed phage resistance/sensitivity profile consistent with that of L. lactis 3107, which unambiguously demonstrated that L. lactis 3107 CW PS is the host cell surface receptor of two bacteriophages belonging to the P335 species as well as phages that are members of the 936 species. The research presented in this thesis has significantly advanced our understanding of L. lactis bacteriophage-host interactions in several ways. Firstly, the examination of plasmid-encoded bacteriophage resistance systems has allowed inferences to be made regarding the mode of action of AbiB, thereby providing a platform for further elucidation of the molecular trigger of this system. Secondly, the phage infection transcriptome data presented, in addition to previous work, has made L. lactis a model organism in terms of transcriptomic studies of bacteriophage-host interactions. And finally, the research described in this thesis has for the first time explicitly revealed the nature of a carbohydrate bacteriophage receptor in L. lactis, while also providing a logical explanation for the observed narrow host ranges exhibited by 936 and P335 phages. Future research in discerning the structures of other L. lactis CW PS, combined with the determination of the molecular interplay between receptor binding proteins of these phages and CW PS will allow an in depth understanding of the mechanism by which the most prevalent lactococcal phages identify and adsorb to their specific host.
Abstract:
Eukaryotic genomes are mostly composed of noncoding DNA whose role is still poorly understood. Studies in several organisms have shown correlations between the length of the intergenic and genic sequences of a gene and the expression of its corresponding mRNA transcript. Some studies have found a positive relationship between intergenic sequence length and expression diversity between tissues, and concluded that genes under greater regulatory control require more regulatory information in their intergenic sequences. Other reports found a negative relationship between expression level and gene length and the interpretation was that there is selection pressure for highly expressed genes to remain small. However, a correlation between gene sequence length and expression diversity, opposite to that observed for intergenic sequences, has also been reported, and to date there is no testable explanation for this observation. To shed light on these varied and sometimes conflicting results, we performed a thorough study of the relationships between sequence length and gene expression using cell-type (tissue) specific microarray data in Arabidopsis thaliana. We measured median gene expression across tissues (expression level), expression variability between tissues (expression pattern uniformity), and expression variability between replicates (expression noise). We found that intergenic (upstream and downstream) and genic (coding and noncoding) sequences have generally opposite relationships with respect to expression, whether it is tissue variability, median, or expression noise. To explain these results we propose a model, in which the lengths of the intergenic and genic sequences have opposite effects on the ability of the transcribed region of the gene to be epigenetically regulated for differential expression. These findings could shed light on the role and influence of noncoding sequences on gene expression.
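The three expression measures named in this abstract (level, uniformity between tissues, noise between replicates) can be illustrated with a minimal sketch. The abstract does not give the exact statistics used, so the choice of the coefficient of variation, the function names, and the input layout below are assumptions made purely for illustration.

```python
from statistics import mean, median, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean (0.0 if the mean is 0)."""
    m = mean(values)
    return pstdev(values) / m if m else 0.0


def expression_summary(replicates_by_tissue):
    """Summarise one gene from a hypothetical {tissue: [replicate values]} mapping.

    Returns (level, tissue_variability, noise):
      level              median of the per-tissue mean expression
      tissue_variability CV of the per-tissue means (assumed uniformity measure)
      noise              mean within-tissue CV across replicates (assumed noise measure)
    """
    tissue_means = [mean(reps) for reps in replicates_by_tissue.values()]
    level = median(tissue_means)
    tissue_variability = coefficient_of_variation(tissue_means)
    noise = mean(
        coefficient_of_variation(reps) for reps in replicates_by_tissue.values()
    )
    return level, tissue_variability, noise


# Made-up values for a single gene, two replicates per tissue.
gene = {"root": [10.0, 10.0], "leaf": [20.0, 20.0], "flower": [30.0, 30.0]}
level, tissue_variability, noise = expression_summary(gene)
```

Each of these per-gene summaries could then be compared against intergenic and genic sequence lengths, for example with a rank correlation.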
Abstract:
In addition to modulating the function and stability of cellular mRNAs, microRNAs can profoundly affect the life cycles of viruses bearing sequence complementary targets, a finding recently exploited to ameliorate toxicities of vaccines and oncolytic viruses. To elucidate the mechanisms underlying microRNA-mediated antiviral activity, we modified the 3' untranslated region (3'UTR) of Coxsackievirus A21 to incorporate targets with varying degrees of homology to endogenous microRNAs. We show that microRNAs can interrupt the picornavirus life-cycle at multiple levels, including catalytic degradation of the viral RNA genome, suppression of cap-independent mRNA translation, and interference with genome encapsidation. In addition, we have examined the extent to which endogenous microRNAs can suppress viral replication in vivo and how viruses can overcome this inhibition by microRNA saturation in mouse cancer models.
Abstract:
Light-dependent deactivation of rhodopsin as well as homologous desensitization of beta-adrenergic receptors involves receptor phosphorylation that is mediated by the highly specific protein kinases rhodopsin kinase (RK) and beta-adrenergic receptor kinase (beta ARK), respectively. We report here the cloning of a complementary DNA for RK. The deduced amino acid sequence shows a high degree of homology to beta ARK. In a phylogenetic tree constructed by comparing the catalytic domains of several protein kinases, RK and beta ARK are located on a branch close to, but separate from the cyclic nucleotide-dependent protein kinase and protein kinase C subfamilies. From the common structural features we conclude that both RK and beta ARK are members of a newly delineated gene family of guanine nucleotide-binding protein (G protein)-coupled receptor kinases that may function in diverse pathways to regulate the function of such receptors.
Abstract:
Many of the biochemical reactions of apoptotic cell death, including mitochondrial cytochrome c release and caspase activation, can be reconstituted in cell-free extracts derived from Xenopus eggs. In addition, because caspase activation does not occur until the egg extract has been incubated for several hours on the bench, upstream signaling processes occurring before full apoptosis are rendered accessible to biochemical manipulation. We reported previously that the adaptor protein Crk is required for apoptotic signaling in egg extracts (Evans, E.K., W. Lu, S.L. Strum, B.J. Mayer, and S. Kornbluth. 1997. EMBO (Eur. Mol. Biol. Organ.) J. 16:230-241). Moreover, we demonstrated that removal of Crk Src homology (SH)2 or SH3 interactors from the extracts prevented apoptosis. We now report the finding that the relevant Crk SH2-interacting protein, important for apoptotic signaling in the extract, is the well-known cell cycle regulator, Wee1. We have demonstrated a specific interaction between tyrosine-phosphorylated Wee1 and the Crk SH2 domain and have shown that recombinant Wee1 can restore apoptosis to an extract depleted of SH2 interactors. Moreover, exogenous Wee1 accelerated apoptosis in egg extracts, and this acceleration was largely dependent on the presence of endogenous Crk protein. As other Cdk inhibitors, such as roscovitine and Myt1, did not act like Wee1 to accelerate apoptosis, we propose that Wee1-Crk complexes signal in a novel apoptotic pathway, which may be unrelated to Wee1's role as a cell cycle regulator.
Abstract:
BACKGROUND: The evolutionary relationships of modern birds are among the most challenging to understand in systematic biology and have been debated for centuries. To address this challenge, we assembled or collected the genomes of 48 avian species spanning most orders of birds, including all Neognathae and two of the five Palaeognathae orders, and used the genomes to construct a genome-scale avian phylogenetic tree and perform comparative genomics analyses (Jarvis et al. in press; Zhang et al. in press). Here we release assemblies and datasets associated with the comparative genome analyses, which include 38 newly sequenced avian genomes plus previously released or simultaneously released genomes of Chicken, Zebra finch, Turkey, Pigeon, Peregrine falcon, Duck, Budgerigar, Adelie penguin, Emperor penguin and the Medium Ground Finch. We hope that this resource will serve future efforts in phylogenomics and comparative genomics. FINDINGS: The 38 bird genomes were sequenced using the Illumina HiSeq 2000 platform and assembled using a whole genome shotgun strategy. The 48 genomes were categorized into two groups according to the N50 scaffold size of the assemblies: a high depth group comprising 23 species sequenced at high coverage (>50X) with multiple insert size libraries resulting in N50 scaffold sizes greater than 1 Mb (except the White-throated Tinamou and Bald Eagle); and a low depth group comprising 25 species sequenced at a low coverage (~30X) with two insert size libraries resulting in an average N50 scaffold size of about 50 kb. Repetitive elements comprised 4%-22% of the bird genomes. The assembled scaffolds allowed the homology-based annotation of 13,000 to 17,000 protein-coding genes in each avian genome relative to chicken, zebra finch and human, as well as comparative and sequence conservation analyses.
CONCLUSIONS: Here we release full genome assemblies of 38 newly sequenced avian species, link genome assembly downloads for 7 of the remaining 10 species, and provide a guideline of genomic data that has been generated and used in our Avian Phylogenomics Project. To the best of our knowledge, the Avian Phylogenomics Project is the biggest vertebrate comparative genomics project to date. The genomic data presented here is expected to accelerate further analyses in many fields, including phylogenetics, comparative genomics, evolution, neurobiology, developmental biology, and other related areas.
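The N50 scaffold size used above to group the assemblies is the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the total assembly length. A minimal sketch of that calculation follows; the function name and the toy lengths are illustrative and not taken from the project's own tooling.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are sorted from longest to shortest and accumulated; the N50 is
    the length of the scaffold at which the running sum first reaches at least
    half of the total assembly length.
    """
    if not lengths:
        raise ValueError("N50 is undefined for an empty assembly")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy assembly: total length 54, so the half-way point is 27.
# 10 + 9 + 8 = 27 reaches it, giving an N50 of 8.
toy_scaffolds = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_scaffolds)
```

Under this definition, an assembly in the high depth group would return a value above 1,000,000 when lengths are given in base pairs.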
Abstract:
BACKGROUND: Epigenetic alterations have been implicated in the pathogenesis of solid tumors; however, proto-oncogenes activated by promoter demethylation have been only sporadically reported. We used an integrative method to analyze expression in primary head and neck squamous cell carcinoma (HNSCC) and pharmacologically demethylated cell lines to identify aberrantly demethylated and expressed candidate proto-oncogenes and cancer testes antigens in HNSCC. METHODOLOGY/PRINCIPAL FINDINGS: We noted coordinated promoter demethylation and simultaneous transcriptional upregulation of proto-oncogene candidates with promoter homology, and phylogenetic footprinting of these promoters demonstrated potential recognition sites for the transcription factor BORIS. Aberrant BORIS expression correlated with upregulation of candidate proto-oncogenes in multiple human malignancies including primary non-small cell lung cancers and HNSCC, induced coordinated proto-oncogene specific promoter demethylation and expression in non-tumorigenic cells, and transformed NIH3T3 cells. CONCLUSIONS/SIGNIFICANCE: Coordinated, epigenetic unmasking of multiple genes with growth promoting activity occurs in aerodigestive cancers, and BORIS is implicated in the coordinated promoter demethylation and reactivation of epigenetically silenced genes in human cancers.
Abstract:
Staphylococcal protein A (SpA) is an important virulence factor from Staphylococcus aureus responsible for the bacterium's evasion of the host immune system. SpA includes five small three-helix-bundle domains that can each bind with high affinity to many host proteins such as antibodies. The interaction between a SpA domain and the Fc fragment of IgG was partially elucidated previously in the crystal structure 1FC2. Although informative, the previous structure was not properly folded and left many substantial questions unanswered, such as a detailed description of the tertiary structure of SpA domains in complex with Fc and the structural changes that take place upon binding. Here we report the 2.3-Å structure of a fully folded SpA domain in complex with Fc. Our structure indicates that there are extensive structural rearrangements necessary for binding Fc, including a general reduction in SpA conformational heterogeneity, freezing out of polyrotameric interfacial residues, and displacement of a SpA side chain by an Fc side chain in a molecular-recognition pocket. Such a loss of conformational heterogeneity upon formation of the protein-protein interface may occur when SpA binds its multiple binding partners. Suppression of conformational heterogeneity may be an important structural paradigm in functionally plastic proteins.
Abstract:
Cellular stresses activate the tumor suppressor p53 protein leading to selective binding to DNA response elements (REs) and gene transactivation from a large pool of potential p53 REs (p53REs). To elucidate how p53RE sequences and local chromatin context interact to affect p53 binding and gene transactivation, we mapped genome-wide binding localizations of p53 and H3K4me3 in untreated and doxorubicin (DXR)-treated human lymphoblastoid cells. We examined the relationships among p53 occupancy, gene expression, H3K4me3, chromatin accessibility (DNase 1 hypersensitivity, DHS), ENCODE chromatin states, p53RE sequence, and evolutionary conservation. We observed that the inducible expression of p53-regulated genes was associated with the steady-state chromatin status of the cell. Most highly inducible p53-regulated genes were suppressed at baseline and marked by repressive histone modifications or displayed CTCF binding. Comparison of p53RE sequences residing in different chromatin contexts demonstrated that weaker p53REs resided in open promoters, while stronger p53REs were located within enhancers and repressed chromatin. p53 occupancy was strongly correlated with similarity of the target DNA sequences to the p53RE consensus, but surprisingly, inversely correlated with pre-existing nucleosome accessibility (DHS) and evolutionary conservation at the p53RE. Occupancy by p53 of REs that overlapped transposable element (TE) repeats was significantly higher (p<10⁻⁷) and correlated with stronger p53RE sequences (p<10⁻¹¹⁰) relative to nonTE-associated p53REs, particularly for MLT1H, LTR10B, and Mer61 TEs. However, binding at these elements was generally not associated with transactivation of adjacent genes. Occupied p53REs located in L2-like TEs were unique in displaying highly negative PhyloP scores (predicted fast-evolving) and being associated with altered H3K4me3 and DHS levels.
These results underscore the systematic interaction between chromatin status and p53RE context in the induced transactivation response. This p53 regulated response appears to have been tuned via evolutionary processes that may have led to repression and/or utilization of p53REs originating from primate-specific transposon elements.
Abstract:
DNaseI footprinting is an established assay for identifying transcription factor (TF)-DNA interactions with single base pair resolution. High-throughput DNase-seq assays have recently been used to detect in vivo DNase footprints across the genome. Multiple computational approaches have been developed to identify DNase-seq footprints as predictors of TF binding. However, recent studies have pointed to a substantial cleavage bias of DNase and its negative impact on predictive performance of footprinting. To assess the potential for using DNase-seq to identify individual binding sites, we performed DNase-seq on deproteinized genomic DNA and determined sequence cleavage bias. This allowed us to build bias corrected and TF-specific footprint models. The predictive performance of these models demonstrated that predicted footprints corresponded to high-confidence TF-DNA interactions. DNase-seq footprints were absent under a fraction of ChIP-seq peaks, which we show to be indicative of weaker binding, indirect TF-DNA interactions or possible ChIP artifacts. The modeling approach was also able to detect variation in the consensus motifs that TFs bind to. Finally, cell type specific footprints were detected within DNase hypersensitive sites that are present in multiple cell types, further supporting that footprints can identify changes in TF binding that are not detectable using other strategies.
Abstract:
Associating genetic variation with quantitative measures of gene regulation offers a way to bridge the gap between genotype and complex phenotypes. In order to identify quantitative trait loci (QTLs) that influence the binding of a transcription factor in humans, we measured binding of the multifunctional transcription and chromatin factor CTCF in 51 HapMap cell lines. We identified thousands of QTLs in which genotype differences were associated with differences in CTCF binding strength, hundreds of them confirmed by directly observable allele-specific binding bias. The majority of QTLs were either within 1 kb of the CTCF binding motif, or in linkage disequilibrium with a variant within 1 kb of the motif. On the X chromosome we observed three classes of binding sites: a minority class bound only to the active copy of the X chromosome, the majority class bound to both the active and inactive X, and a small set of female-specific CTCF sites associated with two non-coding RNA genes. In sum, our data reveal extensive genetic effects on CTCF binding, both direct and indirect, and identify a diversity of patterns of CTCF binding on the X chromosome.
Abstract:
New representations of tree-structured data objects, using ideas from topological data analysis, enable improved statistical analyses of a population of brain artery trees. A number of representations of each data tree arise from persistence diagrams that quantify branching and looping of vessels at multiple scales. Novel approaches to the statistical analysis, through various summaries of the persistence diagrams, lead to heightened correlations with covariates such as age and sex, relative to earlier analyses of this data set. The correlation with age continues to be significant even after controlling for correlations from earlier significant summaries.
Abstract:
The paper considers the open shop scheduling problem to minimize the makespan, provided that one of the machines has to process the jobs according to a given sequence. We show that in the preemptive case the problem is polynomially solvable for an arbitrary number of machines. If preemption is not allowed, the problem is NP-hard in the strong sense if the number of machines is variable, and is NP-hard in the ordinary sense in the case of two machines. For the latter case we give a heuristic algorithm that runs in linear time and produces a schedule with a makespan that is at most 5/4 times the optimal value. We also show that the two-machine problem in the nonpreemptive case is solvable in pseudopolynomial time by a dynamic programming algorithm, and that the algorithm can be converted into a fully polynomial approximation scheme. © 1998 John Wiley & Sons, Inc. Naval Research Logistics 45: 705–731, 1998
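For orientation on the two-machine case, the makespan of any open shop schedule is bounded below by the larger machine load and by the longest total processing time of a single job. The sketch below computes that standard lower bound only. It is not the paper's heuristic or its dynamic program, which the abstract does not describe, and the function name and job data are hypothetical.

```python
def makespan_lower_bound(jobs):
    """Standard lower bound on the makespan of a two-machine open shop.

    jobs is a list of (a_j, b_j) pairs: processing times of job j on machines
    A and B. No schedule can finish before either machine has done all of its
    work, nor before the longest job has completed both of its operations.
    Fixing the job sequence on one machine can only raise the optimum, so the
    bound stays valid (though possibly not tight) for the constrained problem.
    """
    load_a = sum(a for a, _ in jobs)
    load_b = sum(b for _, b in jobs)
    longest_job = max((a + b for a, b in jobs), default=0)
    return max(load_a, load_b, longest_job)


# Three hypothetical jobs: loads are 6 on A and 8 on B, longest job takes 5.
example_jobs = [(3, 2), (1, 4), (2, 2)]
bound = makespan_lower_bound(example_jobs)
```

A bound of this kind is a convenient reference point when judging how far a heuristic schedule is from the best possible one.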
Abstract:
The chromosomal genotype, as judged by multilocus sequence typing, and the episomal genotype, as judged by plasmid profile and cry gene content, were analyzed for a collection of strains of Bacillus thuringiensis. These had been recovered in vegetative form over a period of several months from the leaves of a small plot of clover (Trifolium hybridum). A clonal population structure was indicated, although greater variation in sequence types (STs) was discovered than in previous collections of B. cereus/B. thuringiensis. Isolates taken at the same time had quite different genotypes, whereas those of identical genotypes were recovered at different times. The profiles of plasmid content and cry genes generally bore no relation to each other nor to the STs. Evidently, although relatively little recombination was occurring in the seven chromosomal genes analyzed, a great deal of conjugal transfer, and perhaps recombination, was occurring involving plasmids. A clinical diarrheal isolate of B. cereus and the commercial biopesticide strain HD-1 of B. thuringiensis, both included as out-groups, were found to have very similar STs. This further emphasizes the role of episomal elements in the characteristics and differentiation of these two species.