248 results for Regulatory Elements, Transcriptional

in Queensland University of Technology - ePrints Archive


Relevance:

90.00% 90.00%

Publisher:

Abstract:

Light plays a unique role for plants as it is both a source of energy for growth and a signal for development. Light captured by the pigments in the light harvesting complexes is used to drive the synthesis of the chemical energy required for carbon assimilation. The light perceived by photoreceptors activates effectors, such as transcription factors (TFs), which modulate the expression of light-responsive genes. Recently, it has been speculated that increasing the photosynthetic rate could further improve the yield potential of three carbon (C3) crops such as wheat. However, little is currently known about the transcriptional regulation of photosynthesis genes, particularly in crop species. Nuclear factor Y (NF-Y) TF is a functionally diverse regulator of growth and development in the model plant species, with demonstrated roles in embryo development, stress response, flowering time and chloroplast biogenesis. Furthermore, a light-responsive NF-Y binding site (CCAAT-box) is present in the promoter of a spinach photosynthesis gene. As photosynthesis genes are co-regulated by light and co-regulated genes typically have similar regulatory elements in their promoters, it seems likely that other photosynthesis genes would also have light-responsive CCAAT-boxes. This provided the impetus to investigate the NF-Y TF in bread wheat. This thesis is focussed on wheat NF-Y members that have roles in light-mediated gene regulation with an emphasis on their involvement in the regulation of photosynthesis genes. NF-Y is a heterotrimeric complex, comprised of the three subunits NF-YA, NF-YB and NF-YC. Unlike the mammalian and yeast counterparts, each of the three subunits is encoded by multiple genes in Arabidopsis. The initial step taken in this study was the identification of the wheat NF-Y family (Chapter 3). A search of the current wheat nucleotide sequence databases identified 37 NF-Y genes (10 NF-YA, 11 NF-YB, 14 NF-YC & 2 Dr1). 
Phylogenetic analysis revealed that each of the three wheat NF-Y (TaNF-Y) subunit families could be divided into 4-5 clades based on their conserved core regions. Outside of the core regions, eleven motifs were identified to be conserved between Arabidopsis, rice and wheat NF-Y subunit members. The expression profiles of TaNF-Y genes were constructed using quantitative real-time polymerase chain reaction (RT-PCR). Some TaNF-Y subunit members had little variation in their transcript levels among the organs, while others displayed organ-predominant expression profiles, including those expressed mainly in the photosynthetic organs. To investigate their potential role in light-mediated gene regulation, the light responsiveness of the TaNF-Y genes was examined (Chapters 4 and 5). Two TaNF-YB and five TaNF-YC members were markedly upregulated by light in both the wheat leaves and seedling shoots. To identify the potential target genes of the light-upregulated NF-Y subunit members, a gene expression correlation analysis was conducted using publicly available Affymetrix Wheat Genome Array datasets. This analysis revealed that the transcript expression levels of TaNF-YB3 and TaNF-YC11 were significantly correlated with those of photosynthesis genes. These correlated expression profiles were also observed in the quantitative RT-PCR dataset from wheat plants grown under light and dark conditions. Sequence analysis of the promoters of these wheat photosynthesis genes revealed that they were enriched with potential NF-Y binding sites (CCAAT-box). The potential role of TaNF-YB3 in the regulation of photosynthetic genes was further investigated using a transgenic approach (Chapter 5).
Transgenic wheat lines constitutively expressing TaNF-YB3 were found to have significantly increased expression levels of photosynthesis genes, including those encoding light harvesting chlorophyll a/b-binding proteins, photosystem I reaction centre subunits, a chloroplast ATP synthase subunit and glutamyl-tRNA reductase (GluTR). GluTR is a rate-limiting enzyme in the chlorophyll biosynthesis pathway. In association with the increased expression of the photosynthesis genes, the transgenic lines had a higher leaf chlorophyll content, an increased photosynthetic rate and a more rapid early growth rate compared to the wild-type wheat. In addition to the effect on photosynthesis genes, TaNF-YB3 overexpression lines flowered on average 2 days earlier than the wild type (Chapter 6). Quantitative RT-PCR analysis showed that there was a 13-fold increase in the expression level of the floral integrator, TaFT. The transcript levels of other downstream genes (TaFT2 and TaVRN1) were also increased in the transgenic lines. Furthermore, the transcript levels of TaNF-YB3 were significantly correlated with those of constans (CO), constans-like (COL) and timing of chlorophyll a/b-binding (CAB) expression 1 [TOC1; (CCT)] domain-containing proteins known to be involved in the regulation of flowering time. To summarise the key findings of this study, 37 NF-Y genes were identified in the crop species wheat. An in-depth analysis of TaNF-Y gene expression profiles revealed that the potential role of some light-upregulated members was in the regulation of photosynthetic genes. The involvement of TaNF-YB3 in the regulation of photosynthesis genes was supported by data obtained from transgenic wheat lines with increased constitutive expression of TaNF-YB3. The overexpression of TaNF-YB3 in the transgenic lines revealed that this NF-YB member is also involved in the fine-tuning of flowering time.
These data suggest that the NF-Y TF plays an important role in light-mediated gene regulation in wheat.
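The promoter analysis described in this abstract looks for potential NF-Y binding sites (CCAAT-boxes). As a minimal sketch of that kind of scan, and not the method used in the thesis, the following Python counts exact CCAAT matches on both strands of a sequence. The function names and the toy promoter string are hypothetical and chosen only for illustration.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_ccaat_boxes(promoter: str, motif: str = "CCAAT"):
    """List (position, strand) for each exact motif hit.

    Positions are 0-based on the forward strand. A hit on the reverse
    strand appears on the forward strand as the motif's reverse
    complement (ATTGG for CCAAT).
    """
    seq = promoter.upper()
    rc_motif = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc_motif:
            hits.append((i, "-"))
    return hits


# Hypothetical toy sequence, not a real wheat promoter.
toy_promoter = "GGACCAATCATTTGATTGGCTA"
print(find_ccaat_boxes(toy_promoter))
```

An enrichment claim would additionally need a background set of promoters to compare hit counts against; the sketch covers only the motif scan itself.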

Relevance:

90.00% 90.00%

Publisher:

Abstract:

Body fat distribution is a heritable trait and a well-established predictor of adverse metabolic outcomes, independent of overall adiposity. To increase our understanding of the genetic basis of body fat distribution and its molecular links to cardiometabolic traits, here we conduct genome-wide association meta-analyses of traits related to waist and hip circumferences in up to 224,459 individuals. We identify 49 loci (33 new) associated with waist-to-hip ratio adjusted for body mass index (BMI), and an additional 19 loci newly associated with related waist and hip circumference measures (P < 5 × 10⁻⁸). In total, 20 of the 49 waist-to-hip ratio adjusted for BMI loci show significant sexual dimorphism, 19 of which display a stronger effect in women. The identified loci were enriched for genes expressed in adipose tissue and for putative regulatory elements in adipocytes. Pathway analyses implicated adipogenesis, angiogenesis, transcriptional regulation and insulin resistance as processes affecting fat distribution, providing insight into potential pathophysiological mechanisms.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

While much of the genetic variation in RNA viruses arises because of the error-prone nature of their RNA-dependent RNA polymerases, much larger changes may occur as a result of recombination. An extreme example of genetic change is found in defective interfering (DI) viral particles, where large sections of the genome of a parental virus have been deleted and the residual sub-genome fragment is replicated by complementation by co-infecting functional viruses. While most reports of DI particles have referred to studies in vitro, there is some evidence for the presence of DI particles in chronic viral infections in vivo. In this study, short fragments of dengue virus (DENV) RNA containing only key regulatory elements at the 3' and 5' ends of the genome were recovered from the sera of patients infected with any of the four DENV serotypes. Identical RNA fragments were detected in the supernatant from cultures of Aedes mosquito cells that were infected by the addition of sera from dengue patients, suggesting that the sub-genomic RNA might be transmitted between human and mosquito hosts in DI viral particles. In vitro transcribed sub-genomic RNA corresponding to that detected in vivo could be packaged in virus-like particles in the presence of wild-type virus and transmitted for at least three passages in cell culture. DENV preparations enriched for these putative DI particles reduced the yield of wild-type dengue virus following co-infections of C6/36 cells. This is the first report of DI particles in an acute arboviral infection in nature. The internal genomic deletions described here are the most extensive defects observed in DENV and may be part of a much broader disease-attenuating process that is mediated by defective viruses.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

KLK15 over-expression is reported to be a significant predictor of reduced progression-free survival and overall survival in ovarian cancer. Our aim was to analyse the KLK15 gene for putative functional single nucleotide polymorphisms (SNPs) and assess the association of these and KLK15 HapMap tag SNPs with ovarian cancer survival. Results: In silico analysis was performed to identify KLK15 regulatory elements and to classify potentially functional SNPs in these regions. After SNP validation and identification by DNA sequencing of ovarian cancer cell lines and aggressive ovarian cancer patients, 9 SNPs were shortlisted and genotyped using the Sequenom iPLEX Mass Array platform in a cohort of Australian ovarian cancer patients (N = 319). In the Australian dataset we observed significantly worse survival for the KLK15 rs266851 SNP in a dominant model (Hazard Ratio (HR) 1.42, 95% CI 1.02-1.96). This association was observed in the same direction in two independent datasets, with a combined HR for the three studies of 1.16 (1.00-1.34). This SNP lies 15 bp downstream of a novel exon and is predicted to be involved in mRNA splicing. The mutant allele is also predicted to abrogate an HSF-2 binding site. Conclusions: We provide evidence of association for the SNP rs266851 with ovarian cancer survival. Our results provide the impetus for downstream functional assays and additional independent validation studies to assess the role of KLK15 regulatory SNPs and KLK15 isoforms with alternative intracellular functional roles in ovarian cancer survival.
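The abstract reports hazard ratios with 95% confidence intervals but no p-values. As a rough sketch, assuming the interval is a standard Wald interval that is symmetric on the log scale (an assumption, since the abstract does not state how it was computed), the standard error and an approximate two-sided p-value can be back-calculated from the reported numbers. The function name below is hypothetical.

```python
import math


def wald_stats_from_hr(hr: float, ci_low: float, ci_high: float,
                       z_crit: float = 1.959964):
    """Back-calculate log(HR), its standard error, z and a two-sided p-value.

    Assumes the CI was built as exp(log_hr +/- z_crit * se), i.e. a 95%
    Wald interval when z_crit is about 1.96.
    """
    log_hr = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_hr / se
    # Two-sided p-value under a standard normal: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return log_hr, se, z, p


# Values reported for rs266851 in the Australian dataset (dominant model).
log_hr, se, z, p = wald_stats_from_hr(1.42, 1.02, 1.96)
print(f"log HR = {log_hr:.3f}, SE = {se:.3f}, z = {z:.2f}, p = {p:.3f}")
```

Under this assumption the Australian estimate corresponds to a p-value of roughly 0.035. Because the published bounds are rounded to two decimals, any such back-calculation is approximate, and it is especially imprecise for the combined estimate, whose lower bound is reported as 1.00.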

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Endometrial cancer is one of the most common female diseases in developed nations and is the most commonly diagnosed gynaecological cancer in Australia. The disease is commonly classified by histology: endometrioid or non-endometrioid endometrial cancer. While non-endometrioid endometrial cancers are accepted to be high-grade, aggressive cancers, endometrioid cancers (comprising 80% of all endometrial cancers diagnosed) generally carry a favourable patient prognosis. However, endometrioid endometrial cancer patients endure significant morbidity due to surgery and radiotherapy used for disease treatment, and patients with recurrent disease have a 5-year survival rate of less than 50%. Genetic analysis of women with endometrial cancer could uncover novel markers associated with disease risk and/or prognosis, which could then be used to identify women at high risk and to guide the use of specialised treatments. Proteases are widely accepted to play an important role in the development and progression of cancer. This PhD project hypothesised that SNPs from two protease gene families, the matrix metalloproteases (MMPs, including their tissue inhibitors, TIMPs) and the tissue kallikrein-related peptidases (KLKs), would be associated with endometrial cancer susceptibility and/or prognosis. In the first part of this study, optimisation of the genotyping techniques was performed. Validation of results from previously published endometrial cancer genetic association studies was attempted in a large, multicentre replication set (maximum cases n = 2,888, controls n = 4,483, 3 studies). The rs11224561 progesterone receptor SNP (PGR, A/G) was observed to be associated with increased endometrial cancer risk (per A allele OR 1.31, 95% CI 1.12-1.53; p-trend = 0.001), a result which was initially reported among a Chinese sample set.
Previously reported associations for the remaining 8 SNPs investigated for this section of the PhD study were not confirmed, thereby reinforcing the importance of validation of genetic association studies. To examine the effect of SNPs from the MMP and KLK families on endometrial cancer risk, we selected the most significantly associated MMP and KLK SNPs from genome-wide association study analysis (GWAS) to be genotyped in the GWAS replication set (cases n = 4,725, controls n = 9,803, 13 studies). The significance of the MMP24 rs932562 SNP was unchanged after incorporation of the stage 2 samples (Stage 1 per allele OR 1.18, p = 0.002; Combined Stage 1 and 2 OR 1.09, p = 0.002). The rs10426 SNP, located 3' to KLK10, was predicted by bioinformatic analysis to affect miRNA binding. This SNP was observed in the GWAS stage 1 result to exhibit a recessive effect on endometrial cancer risk, a result which was not validated in the stage 2 sample set (Stage 1 OR 1.44, p = 0.007; Combined Stage 1 and 2 OR 1.14, p = 0.08). Investigation of the regions imputed surrounding the MMP, TIMP and KLK genes did not reveal any significant targets for further analysis. Analysis of the case data from the endometrial cancer GWAS to identify genetic variation associated with cancer grade did not reveal SNPs from the MMP, TIMP or KLK genes to be statistically significant. However, the representation of SNPs from the MMP, TIMP and KLK families by the GWAS genotyping platform used in this PhD project was examined and observed to be very low, with the genetic variation of four genes (MMP23A, MMP23B, MMP28 and TIMP1) not captured at all by this technique. This suggests that comprehensive candidate gene association studies will be required to assess the role of SNPs from these genes in endometrial cancer risk and prognosis.
Meta-analysis of gene expression microarray datasets curated as part of this PhD study identified a number of MMP, TIMP and KLK genes that display differential expression by endometrial cancer status (MMP2, MMP10, MMP11, MMP13, MMP19, MMP25 and KLK1) and histology (MMP2, MMP11, MMP12, MMP26, MMP28, TIMP2, TIMP3, KLK6, KLK7, KLK11 and KLK12). In light of these findings, these genes should be prioritised for future targeted genetic association studies. Two SNPs located 43.5 Mb apart on chromosome 15 were observed from the GWAS analysis to be associated with increased endometrial cancer grade, results that were validated in silico in two independent datasets. One of these SNPs, rs8035725, is located in the 5' untranslated region of a MYC promoter binding protein, DENND4A (Stage 1 OR 1.15, p = 9.85 × 10⁻⁵; combined Stage 1 and in silico validation OR 1.13, p = 5.24 × 10⁻⁶). This SNP has previously been reported to alter the expression of PTPLAD1, a gene involved in the synthesis of very long fatty acid chains and in the Rac1 signaling pathway. Meta-analysis of gene expression microarray data found PTPLAD1 to display increased expression in the aggressive non-endometrioid histology compared with endometrioid endometrial cancer, suggesting that the causal SNP underlying the observed genetic association may influence expression of this gene. Neither rs8035725 nor significant SNPs identified by imputation were predicted bioinformatically to affect transcription factor binding sites, indicating that further studies are required to assess their potential effect on other regulatory elements. The other grade-associated SNP, rs6606792, is located upstream of an inferred pseudogene, ELMO2P1 (Stage 1 OR 1.12, p = 5 × 10⁻⁵; combined Stage 1 and in silico validation OR 1.09, p = 3.56 × 10⁻⁵).
Imputation of the ±1 Mb region surrounding this SNP revealed a cluster of significantly associated variants which are predicted to abolish various transcription factor binding sites, and would be expected to decrease gene expression. ELMO2P1 was not included on the microarray platforms collected for this PhD, and so its expression could not be investigated. However, the high sequence homology of ELMO2P1 with ELMO2, a gene important to cell motility, indicates that ELMO2 could be the parent gene for ELMO2P1 and as such, ELMO2P1 could function to regulate the expression of ELMO2. Increased expression of ELMO2 was seen to be associated with increasing endometrial cancer grade, as well as with aggressive endometrial cancer histological subtypes by microarray meta-analysis. Thus, it is hypothesised that SNPs in linkage disequilibrium with rs6606792 decrease the transcription of ELMO2P1, reducing the regulatory effect of ELMO2P1 on ELMO2 expression. Consequently, ELMO2 expression is increased and cell motility is enhanced, leading to an aggressive endometrial cancer phenotype. In summary, these findings have identified several areas of research for further study. The results presented in this thesis provide evidence that a SNP in PGR is associated with risk of developing endometrial cancer. This PhD study also reports two independent loci on chromosome 15 to be associated with increased endometrial cancer grade, and furthermore, genes associated with these SNPs to be differentially expressed in aggressive subtypes and/or by grade. The studies reported in this thesis support the need for comprehensive SNP association studies on prioritised MMP, TIMP and KLK genes in large sample sets. Until these studies are performed, the role of MMP, TIMP and KLK genetic variation remains unclear. Overall, this PhD study has contributed to the understanding of the involvement of genetic variation in endometrial cancer susceptibility and prognosis.
Importantly, the genetic regions highlighted in this study could lead to the identification of novel gene targets to better understand the biology of endometrial cancer and also aid in the development of therapeutics directed at treating this disease.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Bladder cancer is associated with high recurrence and mortality rates due to metastasis. The elucidation of metastasis suppressors may offer therapeutic opportunities if their mechanisms of action can be elucidated and tractably exploited. In this study, we investigated the clinical and functional significance of the transcription factor activating transcription factor 3 (ATF3) in bladder cancer metastasis. Gene expression analysis revealed that decreased ATF3 was associated with bladder cancer progression and reduced survival of patients with bladder cancer. Correspondingly, ATF3 overexpression in highly metastatic bladder cancer cells decreased migration in vitro and experimental metastasis in vivo. Conversely, ATF3 silencing increased the migration of bladder cancer cells with limited metastatic capability in the absence of any effect on proliferation. In keeping with their increased motility, metastatic bladder cancer cells had increased numbers of actin filaments. Moreover, ATF3 expression correlated with expression of the actin filament severing protein gelsolin (GSN). Mechanistic studies revealed that ATF3 upregulated GSN, whereas ATF3 silencing reduced GSN levels, concomitant with alterations in the actin cytoskeleton. We identified six ATF3 regulatory elements in the first intron of the GSN gene confirmed by chromatin immunoprecipitation analysis. Critically, GSN expression reversed the metastatic capacity of bladder cancer cells with diminished levels of ATF3. Taken together, our results indicate that ATF3 suppresses metastasis of bladder cancer cells, at least in part through the upregulation of GSN-mediated actin remodeling. These findings suggest ATF3 coupled with GSN as prognostic markers for bladder cancer metastasis.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Computational epigenetics is a new area of research focused on exploring how DNA methylation patterns affect transcription factor binding, which in turn affects gene expression patterns. The aim of this study was to produce a new protocol for the detection of DNA methylation patterns using computational analysis, which can be further confirmed by bisulfite PCR with serial pyrosequencing. The upstream regulatory element and pre-initiation complex relative to CpG islets within the methylenetetrahydrofolate reductase gene were determined via computational analysis and online databases. A 1,104 bp long CpG island located near to or at the alternative promoter site of the methylenetetrahydrofolate reductase gene was identified. The CpG plot indicated that CpG islets A and B, within the island, had GC contents of 62 and 75% and CpG ratios of 0.70 and 0.80–0.95, respectively. Further exploration of the CpG islets A and B indicates that the transcription start sites were GGC which were absent from the TATA boxes. In addition, although six PROSITE motifs were identified in CpG B, no motifs were detected in CpG A. A number of cis-regulatory elements were found in different regions within the CpGs A and B. Transcription factors were predicted to bind to CpGs A and B with varying affinities depending on the DNA methylation status. In addition, transcription factor binding may influence the expression patterns of the methylenetetrahydrofolate reductase gene by recruiting chromatin condensation inducing factors. These results have significant implications for the understanding of the architecture of transcription factor binding at CpG islets as well as DNA methylation patterns that affect chromatin structure.
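The GC content and CpG ratio figures in this abstract are the two quantities a CpG plot usually reports. The abstract does not say which tool or formula was used, so the following is a sketch under the assumption that the ratio is the conventional observed/expected CpG ratio, (number of CG dinucleotides × sequence length) / (number of C × number of G). The function names and the toy sequences are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Percentage of bases that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G).

    Returns 0.0 when the sequence has no C or no G, since the
    expected count is then zero and the ratio is undefined.
    """
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count gives the dinucleotide count.
    n_cpg = seq.count("CG")
    return n_cpg * len(seq) / (n_c * n_g)


# Hypothetical toy sequences, not taken from the MTHFR locus.
for toy in ("ATCGATCG", "CGCGCG", "AACCGGTT"):
    print(toy, gc_content(toy), cpg_obs_exp(toy))
```

Island-calling tools typically evaluate both quantities in a sliding window and apply thresholds to each; the window size and thresholds used in the study are not given in the abstract, so they are left out of the sketch.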

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Candidate gene studies have reported CYP19A1 variants to be associated with endometrial cancer and with estradiol (E2) concentrations. We analyzed 2937 single nucleotide polymorphisms (SNPs) in 6608 endometrial cancer cases and 37 925 controls and report the first genome-wide significant association between endometrial cancer and a CYP19A1 SNP (rs727479 in intron 2, P = 4.8 × 10⁻¹¹). SNP rs727479 was also among those most strongly associated with circulating E2 concentrations in 2767 post-menopausal controls (P = 7.4 × 10⁻⁸). The observed endometrial cancer odds ratio per rs727479 A-allele (1.15, CI = 1.11-1.21) is compatible with that predicted by the observed effect on E2 concentrations (1.09, CI = 1.03-1.21), consistent with the hypothesis that endometrial cancer risk is driven by E2. From 28 candidate-causal SNPs, 12 co-located with three putative gene-regulatory elements and their risk alleles associated with higher CYP19A1 expression in bioinformatical analyses. For both phenotypes, the associations with rs727479 were stronger among women with a higher BMI (P-interaction = 0.034 and 0.066 respectively), suggesting a biologically plausible gene-environment interaction.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Thirteen common susceptibility loci have been reproducibly associated with cutaneous malignant melanoma (CMM). We report the results of an international 2-stage meta-analysis of CMM genome-wide association studies (GWAS). This meta-analysis combines 11 GWAS (5 previously unpublished) and a further three stage 2 data sets, totaling 15,990 CMM cases and 26,409 controls. Five loci not previously associated with CMM risk reached genome-wide significance (P < 5 × 10⁻⁸), as did 2 previously reported but unreplicated loci and all 13 established loci. Newly associated SNPs fall within putative melanocyte regulatory elements, and bioinformatic and expression quantitative trait locus (eQTL) data highlight candidate genes in the associated regions, including one involved in telomere biology.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

The clinical overlap between monogenic Familial Hemiplegic Migraine (FHM) and common migraine subtypes, and the fact that all three FHM genes are involved in the transport of ions, suggest that ion transport genes may underlie susceptibility to common forms of migraine. To test this leading hypothesis, we examined common variation in 155 ion transport genes using 5257 single nucleotide polymorphisms (SNPs) in a Finnish sample of 841 unrelated migraine with aura cases and 884 unrelated non-migraine controls. The top signals were then tested for replication in four independent migraine case-control samples from the Netherlands, Germany and Australia, totalling 2835 unrelated migraine cases and 2740 unrelated controls. SNPs within 12 genes (KCNB2, KCNQ3, CLIC5, ATP2C2, CACNA1E, CACNB2, KCNE2, KCNK12, KCNK2, KCNS3, SCN5A and SCN9A) with promising nominal association (0.00041 < P < 0.005) in the Finnish sample were selected for replication. Although no variant remained significant after adjusting for multiple testing nor produced consistent evidence for association across all cohorts, a significant epistatic interaction between KCNB2 SNP rs1431656 (chromosome 8q13.3) and CACNB2 SNP rs7076100 (chromosome 10p12.33) (pointwise P = 0.00002; global P = 0.02) was observed in the Finnish case-control sample. We conclude that common variants of moderate effect size in ion transport genes do not play a major role in susceptibility to common migraine within these European populations, although there is some evidence for epistatic interaction between potassium and calcium channel genes, KCNB2 and CACNB2. Multiple rare variants or trans-regulatory elements of these genes are not ruled out.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Genomic and proteomic analyses have attracted a great deal of interest in biological research in recent years. Many methods have been applied to discover useful information contained in the enormous databases of genomic sequences and amino acid sequences. The results of these investigations inspire further research in biological fields in return. These biological sequences, which may be considered as multiscale sequences, have some specific features which need further efforts to characterise using more refined methods. This project aims to study some of these biological challenges with multiscale analysis methods and a stochastic modelling approach. The first part of the thesis aims to cluster some unknown proteins, and classify their families as well as their structural classes. A development in proteomic analysis is concerned with the determination of protein functions. The first step in this development is to classify proteins and predict their families. This motivates us to study some unknown proteins from specific families, and to cluster them into families and structural classes. We select a large number of proteins from the same families or superfamilies, and link them to simulate some unknown large proteins from these families. We use multifractal analysis and the wavelet method to capture the characteristics of these linked proteins. The simulation results show that the method is valid for the classification of large proteins. The second part of the thesis aims to explore the relationship of proteins based on a layered comparison with their components. Many methods are based on homology of proteins because the resemblance at the protein sequence level normally indicates the similarity of functions and structures. However, some proteins may have similar functions with low sequential identity. We consider protein sequences at a detailed level to investigate the problem of comparison of proteins.
The comparison is based on the empirical mode decomposition (EMD), and protein sequences are detected with the intrinsic mode functions. A measure of similarity is introduced with a new cross-correlation formula. The similarity results show that the EMD is useful for detection of functional relationships of proteins. The third part of the thesis aims to investigate the transcriptional regulatory network of the yeast cell cycle via stochastic differential equations. As the investigation of genome-wide gene expression has become a focus in genomic analysis, researchers have tried to understand the mechanisms of the yeast genome for many years. How cells control gene expression still needs further investigation. We use a stochastic differential equation to model the expression profile of a target gene. We modify the model with a Gaussian membership function. For each target gene, a transcriptional rate is obtained, and the estimated transcriptional rate is also calculated with the information from five possible transcriptional regulators. Some regulators of these target genes are verified with the related references. With these results, we construct a transcriptional regulatory network for the genes from the yeast Saccharomyces cerevisiae. The construction of a transcriptional regulatory network is useful for detecting more mechanisms of the yeast cell cycle.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Exponential growth of genomic data in the last two decades has made manual analyses impractical for all but trial studies. As genomic analyses have become more sophisticated, and move toward comparisons across large datasets, computational approaches have become essential. One of the most important biological questions is to understand the mechanisms underlying gene regulation. Genetic regulation is commonly investigated and modelled through the use of transcriptional regulatory network (TRN) structures. These model the regulatory interactions between two key components: transcription factors (TFs) and the target genes (TGs) they regulate. Transcriptional regulatory networks have proven to be invaluable scientific tools in Bioinformatics. When used in conjunction with comparative genomics, they have provided substantial insights into the evolution of regulatory interactions. Current approaches to regulatory network inference, however, omit two additional key entities: promoters and transcription factor binding sites (TFBSs). In this study, we attempted to explore the relationships among these regulatory components in bacteria. Our primary goal was to identify relationships that can assist in reducing the high false positive rates associated with transcription factor binding site predictions and thereupon enhance the reliability of the inferred transcription regulatory networks. In our preliminary exploration of relationships between the key regulatory components in Escherichia coli transcription, we discovered a number of potentially useful features. The combination of location score and sequence dissimilarity scores increased de novo binding site prediction accuracy by 13.6%. Another important observation made was with regards to the relationship between transcription factors grouped by their regulatory role and corresponding promoter strength. 
Our study of E.coli ��70 promoters, found support at the 0.1 significance level for our hypothesis | that weak promoters are preferentially associated with activator binding sites to enhance gene expression, whilst strong promoters have more repressor binding sites to repress or inhibit gene transcription. Although the observations were specific to �70, they nevertheless strongly encourage additional investigations when more experimentally confirmed data are available. In our preliminary exploration of relationships between the key regulatory components in E.coli transcription, we discovered a number of potentially useful features { some of which proved successful in reducing the number of false positives when applied to re-evaluate binding site predictions. Of chief interest was the relationship observed between promoter strength and TFs with respect to their regulatory role. Based on the common assumption, where promoter homology positively correlates with transcription rate, we hypothesised that weak promoters would have more transcription factors that enhance gene expression, whilst strong promoters would have more repressor binding sites. The t-tests assessed for E.coli �70 promoters returned a p-value of 0.072, which at 0.1 significance level suggested support for our (alternative) hypothesis; albeit this trend may only be present for promoters where corresponding TFBSs are either all repressors or all activators. Nevertheless, such suggestive results strongly encourage additional investigations when more experimentally confirmed data will become available. Much of the remainder of the thesis concerns a machine learning study of binding site prediction, using the SVM and kernel methods, principally the spectrum kernel. 
Spectrum kernels have been successfully applied in previous studies of protein classification [91, 92], as well as the related problem of promoter prediction [59], and we have here successfully applied the technique to refining TFBS predictions. The advantages provided by the SVM classifier were best seen in 'moderately' conserved transcription factor binding sites, as represented by our E. coli CRP case study. Inclusion of additional position feature attributes further increased accuracy by 9.1%, but more notable was the considerable decrease in false positive rate from 0.8 to 0.5 while retaining 0.9 sensitivity. Improved prediction of transcription factor binding sites is in turn extremely valuable in improving inference of regulatory relationships, a problem notoriously prone to false positive predictions. Here, the number of false regulatory interactions inferred using the conventional two-component model was substantially reduced when we integrated de novo transcription factor binding site predictions as an additional criterion for acceptance in a case study of inference in the Fur regulon. This initial work was extended to a comparative study of the iron regulatory system across 20 Yersinia strains. This work revealed interesting, strain-specific differences, especially between pathogenic and non-pathogenic strains. Such differences were made clear through interactive visualisations using the TRNDiff software developed as part of this work, and would have remained undetected using conventional methods. This approach led to the nomination of the Yfe iron-uptake system as a candidate for further wet-lab experimentation due to its potential active functionality in non-pathogens and its known participation in full virulence of the bubonic plague strain. Building on this work, we introduced novel structures we have labelled 'regulatory trees', inspired by the phylogenetic tree concept. 
Instead of using gene or protein sequence similarity, the regulatory trees were constructed based on the number of similar regulatory interactions. While common phylogenetic trees convey information regarding changes in gene repertoire, which we might regard as analogous to 'hardware', the regulatory tree informs us of the changes in regulatory circuitry, in some respects analogous to 'software'. In this context, we explored the 'pan-regulatory network' for the Fur system, the entire set of regulatory interactions found for the Fur transcription factor across a group of genomes. In the pan-regulatory network, emphasis is placed on how the regulatory network for each target genome is inferred from multiple sources instead of a single source, as is the common approach. The benefit of using multiple reference networks is a more comprehensive survey of the relationships and increased confidence in the regulatory interactions predicted. In the present study, we distinguish between relationships found across the full set of genomes, the 'core-regulatory-set', and interactions found only in a subset of the genomes explored, the 'sub-regulatory-set'. We found nine Fur target gene clusters present across the four genomes studied, this core set potentially identifying basic regulatory processes essential for survival. Species-level differences are seen at the sub-regulatory-set level; for example, the known virulence factors YbtA and PchR were found in Y. pestis and P. aeruginosa respectively, but were absent from both E. coli and B. subtilis. Such factors, and the iron-uptake systems they regulate, are ideal candidates for wet-lab investigation to determine whether or not they are pathogen-specific. In this study, we employed a broad range of approaches to address our goals and assessed these methods using the Fur regulon as our initial case study. 
We identified a set of promising feature attributes, demonstrated their success in increasing transcription factor binding site prediction specificity while retaining sensitivity, and showed the importance of binding site predictions in enhancing the reliability of regulatory interaction inferences. Most importantly, these outcomes led to the introduction of a range of visualisations and techniques that are applicable across the entire bacterial spectrum and can be utilised in studies beyond the understanding of transcriptional regulatory networks.
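
The distinction drawn in the abstract above between the core-regulatory-set and the sub-regulatory-set can be illustrated with plain set operations. The following is only a minimal sketch, not the thesis's method: the function name `split_regulatory_sets` and the placeholder targets `cluster_A` and `cluster_B` are hypothetical, and the toy data merely mirror the statement that YbtA and PchR were found in Y. pestis and P. aeruginosa but not in E. coli or B. subtilis.

```python
def split_regulatory_sets(networks):
    """Split inferred targets of one TF into core and sub regulatory sets.

    networks: dict mapping genome name -> set of target names inferred
    for the transcription factor in that genome (hypothetical input format).
    """
    target_sets = list(networks.values())
    if not target_sets:
        return set(), set()
    # Core set: targets found in every genome.
    core = set.intersection(*target_sets)
    # Pan-regulatory network: every target found in at least one genome.
    pan = set.union(*target_sets)
    # Sub set: targets found in only some of the genomes.
    return core, pan - core


# Toy data for illustration only; not results from the thesis.
toy_networks = {
    "E. coli": {"cluster_A", "cluster_B"},
    "B. subtilis": {"cluster_A", "cluster_B"},
    "Y. pestis": {"cluster_A", "cluster_B", "YbtA"},
    "P. aeruginosa": {"cluster_A", "cluster_B", "PchR"},
}

core, sub = split_regulatory_sets(toy_networks)
print("core-regulatory-set:", sorted(core))
print("sub-regulatory-set:", sorted(sub))
```

Under these assumptions, targets shared by all four genomes land in the core set, while genome-specific factors such as YbtA and PchR fall into the sub set.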

Relevance:

30.00% 30.00%

Publisher:

Abstract:

It is widely held that strong relationships exist between housing, economic status, and well-being. Therefore, recent events emerging from the United States, culminating in widespread housing stock surpluses in that country and others, threaten to destabilise many aspects related to individuals and community. However, despite the global impact, the position of housing demand and supply is not consistent. The Australian position provides a strong contrast, whereby continued strong housing demand generally remains a critical issue affecting the socio-economic landscape. Underpinned by strong levels of immigration, and further buoyed by sustained historically low interest rates, increasing income levels, and increased government assistance for first home buyers, this strong housing demand ensures that elements related to housing affordability continue to gain prominence. A significant but less visible factor impacting housing affordability, particularly for new housing development, relates to holding costs. These costs are in many ways "hidden" and cannot always be easily identified. Although holding costs are only one contributor, the nature and extent of their impact require elucidation. In its simplest form, the analysis commences with a calculation of the interest or opportunity cost of land holding. However, there is significantly more complexity for major new developments, particularly greenfield development. Analysis suggests that even small shifts in the primary factors impacting holding costs can appreciably affect housing affordability. The factors of greatest significance include not only interest rates and the rate of inflation but also less apparent factors such as the regulatory assessment period. These are not just theoretical concepts but real, measurable price drivers. Ultimately, the real impact is felt by the one market segment that can typically least afford it: new home, first home buyers. They can be easily pushed out of affordability. 
This paper suggests that the stability and sustainability of growing, new communities require this problem to be acknowledged and accurately identified if the well-being of such communities is to be achieved.
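
The simplest form of the holding cost mentioned in the abstract above, the interest or opportunity cost of holding land, can be sketched as compound interest over the holding period. This is only an illustrative sketch under stated assumptions: the function name `land_holding_cost`, the annual compounding convention, and all figures (land value, rate, holding periods) are hypothetical and are not taken from the paper.

```python
def land_holding_cost(land_value, annual_rate, holding_years):
    """Opportunity cost of holding land, assuming annual compounding.

    land_value: purchase value of the land (hypothetical figure)
    annual_rate: interest or opportunity-cost rate, e.g. 0.07 for 7%
    holding_years: time the land is held, including any assessment period
    """
    return land_value * ((1 + annual_rate) ** holding_years - 1)


# Hypothetical scenario: the same parcel held for two years versus three,
# for example because the regulatory assessment period is one year longer.
base = land_holding_cost(200_000, 0.07, 2.0)
longer = land_holding_cost(200_000, 0.07, 3.0)

print(f"holding cost over 2 years: {base:,.2f}")
print(f"holding cost over 3 years: {longer:,.2f}")
print(f"extra cost of the additional year: {longer - base:,.2f}")
```

Varying `annual_rate` or `holding_years` in this sketch shows how shifts in interest rates or in the assessment period feed through to the cost that must ultimately be recovered in the sale price.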

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Plants have been identified as promising expression systems for the commercial production of recombinant proteins. Plant-based protein production, or "biofarming", offers a number of advantages over traditional expression systems in terms of scale of production, the capacity for post-translational processing, a product free of contaminants, and cost-effectiveness. A number of pharmaceutically important and commercially valuable proteins, such as antibodies, biopharmaceuticals and industrial enzymes, are currently being produced in plant expression systems. However, several challenges remain in improving recombinant protein yield with no ill effect on the host plant. The ability of transgenic plants to produce foreign proteins at commercially viable levels can be directly related to the level and cell specificity of the selected promoter driving the transgene. The accumulation of recombinant proteins may be controlled by a tissue-specific, developmentally regulated or chemically inducible promoter such that expression of recombinant proteins can be spatially or temporally controlled. The strict control of gene expression is particularly useful for proteins that are considered toxic and whose expression is likely to have a detrimental effect on plant growth. To date, the most commonly used promoter in plant biotechnology is the cauliflower mosaic virus (CaMV) 35S promoter, which is used to drive strong, constitutive transgene expression in most organs of transgenic plants. Of particular interest to researchers in the Centre for Tropical Crops and Biocommodities at QUT are tissue-specific promoters for the accumulation of foreign proteins in the roots, seeds and fruit of various plant species, including tobacco, banana and sugarcane. Therefore, this Masters project aimed to isolate and characterise root- and seed-specific promoters for the control of genes encoding recombinant proteins in plant-based expression systems. 
Additionally, the effects of matching cognate terminators with their respective gene promoters were assessed. The Arabidopsis root promoters ARSK1 and EIR1 were selected from the literature based on their reported limited root expression profiles. Both promoters were analysed using the PlantCARE database to identify putative motifs or cis-acting elements that may be associated with this activity. A number of motifs were identified in the ARSK1 promoter region, including WUN (wound-inducible), MBS (MYB binding site), Skn-1 and an RY core element (seed-specific), and in the EIR1 promoter region, including Skn-1 (seed-specific), Box-W1 (fungal elicitor), Aux-RR core (auxin response) and ABRE (ABA response). However, no previously reported root-specific cis-acting elements were observed in either promoter region. To confirm root specificity, both promoters, and truncated versions, were fused to the GUS reporter gene and the expression cassette was introduced into Arabidopsis via Agrobacterium-mediated transformation. Despite the reported tissue-specific nature of these promoters, both upstream regulatory regions directed constitutive GUS expression in all transgenic plants. Further, the level of GUS expression directed by the ARSK1 promoter was similar to that directed by the control CaMV 35S promoter. The truncated version of the EIR1 promoter (1.2 kb) showed some differences in the level of GUS expression compared to the 2.2 kb promoter. This suggests that an enhancer element is contained in the 2.2 kb upstream region that increases transgene expression. The Arabidopsis seed-specific genes ATS1 and ATS3 were selected from the literature based on their seed-specific expression profiles, and their expression was confirmed as seed-specific in this study by RT-PCR analysis. The selected promoter regions were analysed using the PlantCARE database in order to identify any putative cis elements. 
The seed-specific motifs GCN4 and Skn-1, which are associated with elevated expression levels in the endosperm, were identified in both promoter regions. Additionally, the seed-specific RY element and the ABRE were located in the ATS1 promoter. Both promoters were fused to the GUS reporter gene and used to transform Arabidopsis plants. GUS expression from the putative promoters was constitutive in all transgenic Arabidopsis tissue tested. Importantly, the positive control FAE1 seed-specific promoter also directed constitutive GUS expression throughout transgenic Arabidopsis plants. The constitutive nature seen in all of the promoters used in this study was not anticipated. While variations in promoter activity can be caused by a number of influencing factors, the variation in promoter activity observed here would imply a major contributing factor common to all plant expression cassettes tested. All promoter constructs generated in this study were based on the binary vector pCAMBIA2300. This vector contains the plant selection gene (NPTII) under the transcriptional control of the duplicated CaMV 35S promoter. This CaMV 35S promoter contains two enhancer domains that confer strong, constitutive expression of the selection gene and is located immediately upstream of the promoter-GUS fusion. During the course of this project, Yoo et al. (2005) reported that transgene expression is significantly affected when the expression cassette is located on the same T-DNA as the 35S enhancer. It was concluded that the trans-acting effects of the enhancer activate and control transgene expression, causing irregular expression patterns. This phenomenon seems the most plausible reason for the constitutive expression profiles observed with the root- and seed-specific promoters assessed in this study. The expression from some promoters can be influenced by their cognate terminator sequences. 
Therefore, the Arabidopsis ARSK1, EIR1, ATS1 and ATS3 terminator sequences were isolated and incorporated into expression cassettes containing the GUS reporter gene under the control of their cognate promoters. Again, unrestricted GUS activity was displayed throughout transgenic plants transformed with these reporter gene fusions. As previously discussed, constitutive GUS expression was most likely due to the trans-acting effect of the upstream CaMV 35S promoter in the selection cassette located on the same T-DNA. The results obtained in this study make it impossible to assess the influence that matching terminators with their cognate promoters has on transgene expression profiles. The obvious future direction of research continuing from this study would be to transform pBIN-based promoter-GUS fusions (i.e. constructs containing no CaMV 35S promoter driving the plant selection gene) into Arabidopsis in order to determine the true tissue specificity of these promoters and evaluate the effects of their cognate 3’ terminator sequences. Further, promoter truncations based around the cis-elements identified here may assist in determining whether these motifs are in fact involved in the overall activity of the promoter.