916 results for Genome-specific Sequence
Abstract:
The proportion of functional sequence in the human genome is currently a subject of debate. The most widely accepted figure is that approximately 5% is under purifying selection. In Drosophila, estimates are an order of magnitude higher, though this corresponds to a similar quantity of sequence. These estimates depend on the difference between the distribution of genomewide evolutionary rates and that observed in a subset of sequences presumed to be neutrally evolving. Motivated by the widening gap between these estimates and experimental evidence of genome function, especially in mammals, we developed a sensitive technique for evaluating such distributions and found that they are much more complex than previously apparent. We found strong evidence for at least nine well-resolved evolutionary rate classes in an alignment of four Drosophila species and at least seven classes in an alignment of four mammals, including human. We also identified at least three rate classes in human ancestral repeats. By positing that the largest of these ancestral repeat classes is neutrally evolving, we estimate that the proportion of nonneutrally evolving sequence is 30% of human ancestral repeats and 45% of the aligned portion of the genome. However, we also question whether any of the classes represent neutrally evolving sequences and argue that a plausible alternative is that they reflect variable structure-function constraints operating throughout the genomes of complex organisms.
Abstract:
Vitamin A deficiency (VAD) is a serious problem in developing countries, affecting approximately 127 million children of preschool age and 7.2 million pregnant women each year. However, this deficiency is readily treated and prevented through adequate nutrition. This can potentially be achieved through genetically engineered biofortification of staple food crops to enhance provitamin A (pVA) carotenoid content. Bananas are the fourth most important food crop with an annual production of 100 million tonnes and are widely consumed in areas affected by VAD. However, the fruit pVA content of most widely consumed banana cultivars is low (~0.2 to 0.5 µg/g dry weight). This includes cultivars such as the East African highland banana (EAHB), the staple crop in countries such as Uganda, where annual banana consumption is approximately 250 kg per person. This fact, in addition to the agronomic properties of staple banana cultivars such as vegetative reproduction and continuous cropping, makes bananas an ideal target for pVA enhancement through genetic engineering. Interestingly, there are banana varieties known with high fruit pVA content (up to 27.8 µg/g dry weight), although they are not widely consumed due to factors such as cultural preference and availability. The genes involved in carotenoid accumulation during banana fruit ripening have not been well studied and an understanding of the molecular basis for the differential capacity of bananas to accumulate carotenoids may impact on the effective production of genetically engineered high pVA bananas. The production of phytoene by the enzyme phytoene synthase (PSY) has been shown to be an important rate limiting determinant of pVA accumulation in crop systems such as maize and rice. Manipulation of this gene in rice has been used successfully to produce Golden Rice, which exhibits higher seed endosperm pVA levels than wild type plants. 
Therefore, it was hypothesised that differences between high and low pVA accumulating bananas could be due either to differences in PSY enzyme activity or factors regulating the expression of the psy gene. Therefore, the aim of this thesis was to investigate the role of PSY in accumulation of pVA in banana fruit of representative high (Asupina) and low (Cavendish) pVA banana cultivars by comparing the nucleic acid and encoded amino acid sequences of the banana psy genes, in vivo enzyme activity of PSY in rice callus and expression of PSY through analysis of promoter activity and mRNA levels. Initially, partial sequences of the psy coding region from five banana cultivars were obtained using reverse transcriptase (RT)-PCR with degenerate primers designed to conserved amino acids in the coding region of available psy sequences from other plants. Based on phylogenetic analysis and comparison to maize psy sequences, it was found that in banana, psy occurs as a gene family of at least three members (psy1, psy2a and psy2b). Subsequent analysis of the complete coding regions of these genes from Asupina and Cavendish suggested that they were all capable of producing functional proteins due to high conservation in the catalytic domain. However, inability to obtain the complete mRNA sequences of Cavendish psy2a, and isolation of two non-functional Cavendish psy2a coding region variants, suggested that psy2a expression may be impaired in Cavendish. Sequence analysis indicated that these Cavendish psy2a coding region variants may have resulted from alternate splicing. Evidence of alternate splicing was also observed in one Asupina psy1 coding region variant, which was predicted to produce a functional PSY1 isoform. The complete mRNA sequence of the psy2b coding regions could not be isolated from either cultivar. Interestingly, psy1 was cloned predominantly from leaf while psy2 was obtained preferentially from fruit, suggesting some level of tissue-specific expression. 
The Asupina and Cavendish psy1 and psy2a coding regions were subsequently expressed in rice callus and the activity of the enzymes compared in vivo through visual observation and quantitative measurement of carotenoid accumulation. The maize B73 psy1 coding region was included as a positive control. After several weeks on selection, regenerating calli showed a range of colours from white to dark orange representing various levels of carotenoid accumulation. These results confirmed that the banana psy coding regions were all capable of producing functional enzymes. No statistically significant differences in levels of activity were observed between banana PSYs, suggesting that differences in PSY activity were not responsible for differences in the fruit pVA content of Asupina and Cavendish. The psy1 and psy2a promoter sequences were isolated from Asupina and Cavendish gDNA using a PCR-based genome walking strategy. Interestingly, three Cavendish psy2a promoter clones of different sizes, representing possible allelic variants, were identified while only single promoter sequences were obtained for the other Asupina and Cavendish psy genes. Bioinformatic analysis of these sequences identified motifs that were previously characterised in the Arabidopsis psy promoter. Notably, an ATCTA motif associated with basal expression in Arabidopsis was identified in all promoters with the exception of two of the Cavendish psy2a promoter clones (Cpsy2apr2 and Cpsy2apr3). G1 and G2 motifs, linked to light-regulated responses in Arabidopsis, appeared to be differentially distributed between psy1 and psy2a promoters. In the untranscribed regulatory regions, the G1 motifs were found only in psy1 promoters, while the G2 motifs were found only in psy2a. Interestingly, both ATCTA and G2 motifs were identified in the 5’ UTRs of Asupina and Cavendish psy1. 
Consistent with other monocot promoters, introns were present in the Asupina and Cavendish psy1 5’ UTRs, while none were observed in the psy2a 5’ UTRs. Promoters were cloned into expression constructs, driving the β-glucuronidase (GUS) reporter gene. Transient expression of the Asupina and Cavendish psy1 and psy2a promoters in both Cavendish embryogenic cells and Cavendish fruit demonstrated that all promoters were active, except Cpsy2apr2 and Cpsy2apr3. The functional Cavendish psy2a promoter (Cpsy2apr1) appeared to have activity similar to the Asupina psy2a promoter. The activities of the Asupina and Cavendish psy1 promoters were similar to each other, and comparable to those of the functional psy2a promoters. Semi-quantitative PCR analysis of Asupina and Cavendish psy1 and psy2a transcripts showed that psy2a levels were high in green fruit and decreased during ripening, reinforcing the hypothesis that fruit pVA levels were largely dependent on levels of psy2a expression. Additionally, semi-quantitative PCR using intron-spanning primers indicated that high levels of unprocessed psy2a and psy2b mRNA were present in the ripe fruit of Cavendish but not in Asupina. This raised the possibility that differences in intron processing may influence pVA accumulation in Asupina and Cavendish. In this study the role of PSY in banana pVA accumulation was analysed at a number of different levels. Both mRNA accumulation and promoter activity of psy genes studied were very similar between Asupina and Cavendish. However, in several experiments there was evidence of cryptic or alternate splicing that differed in Cavendish compared to Asupina, although these differences were not conclusively linked to the differences in fruit pVA accumulation between Asupina and Cavendish. Therefore, other carotenoid biosynthetic genes or regulatory mechanisms may be involved in determining pVA levels in these cultivars. 
This study has contributed to an increased understanding of the role of PSY in the production of pVA carotenoids in banana fruit, corroborating the importance of this enzyme in regulating carotenoid production. Ultimately, this work may serve to inform future research into pVA accumulation in important crop varieties such as the EAHB and the discovery of avenues to improve such crops through genetic modification.
Abstract:
Bananas are one of the world's most important food crops, providing sustenance and income for millions of people in developing countries and supporting large export industries. Viruses are considered major constraints to banana production, germplasm multiplication and exchange, and to genetic improvement of banana through traditional breeding. In Africa, the two most important virus diseases are bunchy top, caused by Banana bunchy top virus (BBTV), and banana streak disease, caused by Banana streak virus (BSV). BBTV is a serious production constraint in a number of countries within/bordering East Africa, such as Burundi, Democratic Republic of Congo, Malawi, Mozambique, Rwanda and Zambia, but is not present in Kenya, Tanzania and Uganda. Additionally, epidemics of banana streak disease are occurring in Kenya and Uganda. The rapidly growing tissue culture (TC) industry within East Africa, aiming to provide planting material to banana farmers, has stimulated discussion about the need for virus indexing to certify planting material as virus-free. Diagnostic methods for BBTV and BSV have been reported and, for BBTV, PCR-based assays are reliable and relatively straightforward. However for BSV, high levels of serological and genetic variability and the presence of endogenous virus sequences within the banana genome complicate diagnosis. Uganda has been shown to contain the greatest diversity in BSV isolates found anywhere in the world. A broad-spectrum diagnostic test for BSV detection, which can discriminate between endogenous and episomal BSV sequences, is a priority. This PhD project aimed to establish diagnostic methods for banana viruses, with a particular focus on the development of novel methods for BSV detection, and to use these diagnostic methods for the detection and characterisation of banana viruses in East Africa. A novel rolling-circle amplification (RCA) method was developed for the detection of BSV. 
Using samples of Banana streak MY virus (BSMYV) and Banana streak OL virus (BSOLV) from Australia, this method was shown to distinguish between endogenous and episomal BSV sequences in banana plants. The RCA assay was used to screen a collection of 56 banana samples from south-west Uganda for BSV. RCA detected at least five distinct BSV isolates in these samples, including BSOLV and Banana streak GF virus (BSGFV) as well as three BSV isolates (Banana streak Uganda-I, -L and -M virus) for which only partial sequences had been previously reported. These latter three BSV had only been detected using immuno-capture (IC)-PCR and thus were possible endogenous sequences. In addition to its ability to detect BSV, the RCA protocol was also demonstrated to detect other viruses within the family Caulimoviridae, including Sugar cane bacilliform virus, and Cauliflower mosaic virus. Using the novel RCA method, three distinct BSV isolates from both Kenya and Uganda were identified and characterised. The complete genome of these isolates was sequenced and annotated. All six isolates were shown to have a characteristic badnavirus genome organisation with three open reading frames (ORFs) and the large polyprotein encoded by ORF 3 was shown to contain conserved amino acid motifs for movement, aspartic protease, reverse transcriptase and ribonuclease H activities. As well, several sequences important for expression and replication of the virus genome were identified including the conserved tRNAmet primer binding site present in the intergenic region of all badnaviruses. Based on the International Committee on Taxonomy of Viruses (ICTV) guidelines for species demarcation in the genus Badnavirus, these six isolates were proposed as distinct species, and named Banana streak UA virus (BSUAV), Banana streak UI virus (BSUIV), Banana streak UL virus (BSULV), Banana streak UM virus (BSUMV), Banana streak CA virus (BSCAV) and Banana streak IM virus (BSIMV). 
Using PCR with species-specific primers designed to each isolate, a genotypically diverse collection of 12 virus-free banana cultivars was tested for the presence of endogenous sequences. For five of the BSV no amplification was observed in any cultivar tested, while for BSIMV, four positive samples were identified in cultivars with a B-genome component. During field visits to Kenya, Tanzania and Uganda, 143 samples were collected and assayed for BSV. PCR using nine sets of species-specific primers, and RCA, were compared for BSV detection. For five BSV species with no known endogenous counterpart (namely BSCAV, BSUAV, BSUIV, BSULV and BSUMV), PCR was used to detect 30 infections from the 143 samples. Using RCA, 96.4% of these samples were considered positive, with one additional sample detected using RCA which was not positive using PCR. For these five BSV, PCR and RCA were both useful for identifying infected samples, irrespective of the host cultivar genotype (Musa A- or B-genome components). For four additional BSV with known endogenous counterparts in the M. balbisiana genome (BSOLV, BSGFV, BSMYV and BSIMV), PCR was shown to detect 75 infections from the 143 samples. In 30 samples from cultivars with an A-only genome component there was 96.3% agreement between PCR positive samples and detection using RCA, again demonstrating that either PCR or RCA is a suitable method for detection. However, in 45 samples from cultivars with some B-genome component, the level of agreement between PCR positive samples and RCA positive samples was 70.5%. This suggests that, in cultivars with some B-genome component, many of the infections detected using PCR were the result of amplification of endogenous sequences. In these latter cases, RCA or another method which discriminates between endogenous and episomal sequences, such as immuno-capture PCR, is needed to diagnose episomal BSV infection. 
Field visits were made to Malawi and Rwanda to collect local isolates of BBTV for validation of a PCR-based diagnostic assay. The presence of BBTV in samples of bananas with bunchy top disease was confirmed in 28 out of 39 samples from Malawi and all nine samples collected in Rwanda, using PCR and RCA. For three isolates, one from Malawi and two from Rwanda, the complete nucleotide sequences were determined and shown to have a similar genome organisation to previously published BBTV isolates. The two isolates from Rwanda had at least 98.1% nucleotide sequence identity between each of the six DNA components, while the similarity between isolates from Rwanda and Malawi was between 96.2% and 99.4% depending on the DNA component. At the amino acid level, similarities in the putative proteins encoded by DNA-R, -S, -M, -C and -N were found to range from 98.8% to 100%. In a phylogenetic analysis, the three East African isolates clustered together within the South Pacific subgroup of BBTV isolates. Nucleotide sequence comparison to isolates of BBTV from outside Africa identified India as the possible origin of East African isolates of BBTV.
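The nucleotide identity percentages quoted above can be illustrated with a minimal sketch. It assumes the two sequences are already aligned to equal length and that gapped columns are skipped; the abstract does not say how identity was computed, so the function name `percent_identity` and the gap handling are illustrative assumptions rather than the authors' method.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not BBTV data): 3 of 4 columns match.
toy_identity = percent_identity("ACGT", "ACGA")
```

Other conventions, such as counting gaps as mismatches, would give different values for the same alignment.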
The use of virtual prototyping to rehearse the sequence of construction work involving mobile cranes
Abstract:
Purpose – Rehearsing practical site operations is without doubt one of the most effective methods for minimising planning mistakes, because of the learning that takes place during the rehearsal activity. However, real rehearsal is not a practical solution for on-site construction activities, as it not only involves a considerable amount of cost but can also have adverse environmental implications. One approach to overcoming this is by the use of virtual rehearsals. The purpose of this paper is to investigate an approach to simulation of the motion of cranes in order to test the feasibility of associated construction sequencing and generate construction schedules for review and visualisation. Design/methodology/approach – The paper describes a system involving two technologies, virtual prototyping (VP) and four-dimensional (4D) simulation, to assist construction planners in testing the sequence of construction activities when mobile cranes are involved. The system consists of five modules, comprising input, database, equipment, process and output, and is capable of detecting potential collisions. A real-world trial is described in which the system was tested and validated. Findings – Feedback from the planners involved in the trial indicated that they found the system to be useful in its present form and that they would welcome its further development into a fully automated platform for validating construction sequencing decisions. Research limitations/implications – The tool has the potential to provide a cost-effective means of improving construction planning. However, it is limited at present to the specific case of crane movement under special consideration. Originality/value – This paper presents a large-scale, real life case of applying VP technology in planning construction processes and activities.
Abstract:
Bananas are one of the world's most important crops, serving as a staple food and an important source of income for millions of people in the subtropics. Pests and diseases are a major constraint to banana production. To prevent the spread of pests and disease, farmers are encouraged to use disease- and insect-free planting material obtained by micropropagation. This option, however, does not always exclude viruses and concern remains on the quality of planting material. Therefore, there is a demand for effective and reliable virus indexing procedures for tissue culture (TC) material. Reliable diagnostic tests are currently available for all of the economically important viruses of bananas with the exception of Banana streak viruses (BSV, Caulimoviridae, Badnavirus). Development of a reliable diagnostic test for BSV is complicated by the significant serological and genetic variation reported for BSV isolates, and the presence of endogenous BSV (eBSV). Current PCR- and serological-based diagnostic methods for BSV may not detect all species of BSV, and PCR-based methods may give false positives because of the presence of eBSV. Rolling circle amplification (RCA) has been reported as a technique to detect BSV which can also discriminate between episomal and endogenous BSV sequences. However, the method is too expensive for large scale screening of samples in developing countries, and little information is available regarding its sensitivity. Therefore the development of reliable PCR-based assays is still considered the most appropriate option for large scale screening of banana plants for BSV. 
This MSc project aimed to refine and optimise the protocols for BSV detection, with a particular focus on developing reliable PCR-based diagnostics. Initially, the appropriateness and reliability of PCR and RCA as diagnostic tests for BSV detection were assessed by testing 45 field samples of banana collected from nine districts in the Eastern region of Uganda in February 2010. This research was also aimed at investigating the diversity of BSV in eastern Uganda, identifying the BSV species present and characterising any new BSV species. Out of the 45 samples tested, 38 and 40 samples were considered positive by PCR and RCA, respectively. Six different species of BSV, namely Banana streak IM virus (BSIMV), Banana streak MY virus (BSMYV), Banana streak OL virus (BSOLV), Banana streak UA virus (BSUAV), Banana streak UL virus (BSULV) and Banana streak UM virus (BSUMV), were detected by PCR and confirmed by RCA and sequencing. No new species were detected, but this was the first report of BSMYV in Uganda. Although RCA was demonstrated to be suitable for broad-range detection of BSV, it proved time-consuming and laborious for identification in field samples. Due to the disadvantages associated with RCA, attempts were made to develop a reliable PCR-based assay for the specific detection of episomal BSOLV, Banana streak GF virus (BSGFV), BSMYV and BSIMV. For BSOLV and BSGFV, the integrated sequences exist in rearranged, repeated and partially inverted portions at their site of integration. Therefore, for these two viruses, primer sets were designed by mapping previously published sequences of their endogenous counterparts onto published sequences of the episomal genomes. For BSOLV, two primer sets were designed while, for BSGFV, a single primer set was designed. 
The episomal specificity of these primer sets was assessed by testing 106 plant samples collected during surveys in Kenya and Uganda, and 33 leaf samples from a wide range of banana cultivars maintained in TC at the Maroochy Research Station of the Department of Employment, Economic Development and Innovation (DEEDI), Queensland. All of these samples had previously been tested for episomal BSV by RCA and for both BSOLV and BSGFV by PCR using published primer sets. The outcome from these analyses was that the newly designed primer sets for BSOLV and BSGFV were able to distinguish between episomal BSV and eBSV in most cultivars with some B-genome component. In some samples, however, amplification was observed using the putative episomal-specific primer sets where episomal BSV was not identified using RCA. This may reflect a difference in the sensitivity of PCR compared to RCA, or possibly the presence of an eBSV sequence of different conformation. Since the sequences of the respective eBSV for BSMYV and BSIMV in the M. balbisiana genome are not available, a series of random primer combinations were tested in an attempt to find potential episomal-specific primer sets for BSMYV and BSIMV. Of an initial 20 primer combinations screened for BSMYV detection on a small number of control samples, 11 primer sets appeared to be episomal-specific. However, subsequent testing of two of these primer combinations on a larger number of control samples resulted in some inconsistent results which will require further investigation. Testing of the 25 primer combinations for episomal-specific detection of BSIMV on a number of control samples showed that none were able to discriminate between episomal and endogenous BSIMV. The final component of this research project was the development of an infectious clone of a BSV endemic in Australia, namely BSMYV. This was considered important to enable the generation of large amounts of diseased plant material needed for further research. 
A terminally redundant fragment (~1.3 × BSMYV genome) was cloned and transformed into Agrobacterium tumefaciens strain AGL1, and used to inoculate 12 healthy banana plants of the cultivar Cavendish (Williams) by three different methods. At 12 weeks post-inoculation, (i) four of the five banana plants inoculated by corm injection showed characteristic BSV symptoms while the remaining plant was wilting/dying, (ii) three of the five banana plants inoculated by needle-pricking of the stem showed BSV symptoms, one plant was symptomless while the remaining plant had died and (iii) both banana plants inoculated by leaf infiltration were symptomless. When banana leaf samples were tested for BSMYV by PCR and RCA, BSMYV was confirmed in all banana plants showing symptoms, including those that were wilting and/or dying. The results from this research have provided several avenues for further research. By completely sequencing all variants of eBSOLV and eBSGFV and fully sequencing the eBSIMV and eBSMYV regions, episomal BSV-specific primer sets for all eBSVs could potentially be designed that could avoid all integrants of that particular BSV species. Furthermore, the development of an infectious BSV clone will enable large numbers of BSV-infected plants to be generated for the further testing of the sensitivity of RCA compared to other more established assays such as PCR. The development of infectious clones also opens the possibility for virus induced gene silencing studies in banana.
Abstract:
Background. Recent reports have indicated that single-stranded DNA (ssDNA) viruses in the taxonomic families Geminiviridae, Parvoviridae and Anellovirus may be evolving at rates of ∼10⁻⁴ substitutions per site per year (subs/site/year). These evolution rates are similar to those of RNA viruses and are surprisingly high given that ssDNA virus replication involves host DNA polymerases with fidelities approximately 10 000 times greater than those of error-prone viral RNA polymerases. Although high ssDNA virus evolution rates were first suggested in evolution experiments involving the geminivirus maize streak virus (MSV), the evolution rate of this virus has never been accurately measured. Also, questions regarding both the mechanistic basis and adaptive value of high geminivirus mutation rates remain unanswered. Results. We determined the short-term evolution rate of MSV using full genome analysis of virus populations initiated from cloned genomes. Three wild type viruses and three defective artificial chimaeric viruses were maintained in planta for up to five years and displayed evolution rates of between 7.4 × 10⁻⁴ and 7.9 × 10⁻⁴ subs/site/year. Conclusion. These MSV evolution rates are within the ranges observed for other ssDNA viruses and RNA viruses. Although no obvious evidence of positive selection was detected, the uneven distribution of mutations within the defective virus genomes suggests that some of the changes may have been adaptive. We also observed inter-strand nucleotide substitution imbalances that are consistent with a recent proposal that high mutation rates in geminiviruses (and possibly ssDNA viruses in general) may be due to mutagenic processes acting specifically on ssDNA molecules. © 2008 Walt et al; licensee BioMed Central Ltd.
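The unit subs/site/year used above can be made concrete with a minimal sketch: observed substitutions divided by the product of genome length and elapsed time. The inputs below (10 substitutions, a 2700 nt genome, 5 years) are hypothetical values chosen only to show the arithmetic; they are not taken from the study, and the study's own estimation procedure may well differ from this simple ratio.

```python
def substitution_rate(substitutions: int, genome_length: int, years: float) -> float:
    """Crude evolution rate in substitutions per site per year."""
    if genome_length <= 0 or years <= 0:
        raise ValueError("genome_length and years must be positive")
    return substitutions / (genome_length * years)


# Hypothetical inputs for illustration only (not data from the paper).
example_rate = substitution_rate(substitutions=10, genome_length=2700, years=5.0)
```

With these illustrative inputs the result is of the order of 10⁻⁴ subs/site/year, the same order of magnitude the abstract reports.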
Abstract:
The main cis-acting control regions for replication of the single-stranded DNA genome of maize streak virus (MSV) are believed to reside within an approximately 310 nt long intergenic region (LIR). However, neither the minimum LIR sequence required nor the sequence determinants of replication specificity have been determined experimentally. There are iterated sequences, or iterons, both within the conserved inverted-repeat sequences with the potential to form a stem-loop structure at the origin of virion-strand replication, and upstream of the rep gene TATA box (the rep-proximal iteron or RPI). Based on experimental analyses of similar iterons in viruses from other geminivirus genera and their proximity to known Rep-binding sites in the distantly related mastrevirus wheat dwarf virus, it has been hypothesized that the iterons may be Rep-binding and/or -recognition sequences. Here, a series of LIR deletion mutants was used to define the upper bounds of the LIR sequence required for replication. After identifying MSV strains and distinct mastreviruses with incompatible replication-specificity determinants (RSDs), LIR chimaeras were used to map the primary MSV RSD to a 67 nt sequence containing the RPI. Although the results generally support the prevailing hypothesis that MSV iterons are functional analogues of those found in other geminivirus genera, it is demonstrated that neither the inverted-repeat nor RPI sequences are absolute determinants of replication specificity. Moreover, widely divergent mastreviruses can trans-replicate one another. These results also suggest that sequences in the 67 nt region surrounding the RPI interact in a sequence-specific manner with those of the inverted repeat.
Abstract:
Exponential growth of genomic data in the last two decades has made manual analyses impractical for all but trial studies. As genomic analyses have become more sophisticated, and move toward comparisons across large datasets, computational approaches have become essential. One of the most important biological questions is to understand the mechanisms underlying gene regulation. Genetic regulation is commonly investigated and modelled through the use of transcriptional regulatory network (TRN) structures. These model the regulatory interactions between two key components: transcription factors (TFs) and the target genes (TGs) they regulate. Transcriptional regulatory networks have proven to be invaluable scientific tools in Bioinformatics. When used in conjunction with comparative genomics, they have provided substantial insights into the evolution of regulatory interactions. Current approaches to regulatory network inference, however, omit two additional key entities: promoters and transcription factor binding sites (TFBSs). In this study, we attempted to explore the relationships among these regulatory components in bacteria. Our primary goal was to identify relationships that can assist in reducing the high false positive rates associated with transcription factor binding site predictions and thereupon enhance the reliability of the inferred transcription regulatory networks. In our preliminary exploration of relationships between the key regulatory components in Escherichia coli transcription, we discovered a number of potentially useful features. The combination of location score and sequence dissimilarity scores increased de novo binding site prediction accuracy by 13.6%. Another important observation made was with regards to the relationship between transcription factors grouped by their regulatory role and corresponding promoter strength. 
Our study of E.coli ��70 promoters, found support at the 0.1 significance level for our hypothesis | that weak promoters are preferentially associated with activator binding sites to enhance gene expression, whilst strong promoters have more repressor binding sites to repress or inhibit gene transcription. Although the observations were specific to �70, they nevertheless strongly encourage additional investigations when more experimentally confirmed data are available. In our preliminary exploration of relationships between the key regulatory components in E.coli transcription, we discovered a number of potentially useful features { some of which proved successful in reducing the number of false positives when applied to re-evaluate binding site predictions. Of chief interest was the relationship observed between promoter strength and TFs with respect to their regulatory role. Based on the common assumption, where promoter homology positively correlates with transcription rate, we hypothesised that weak promoters would have more transcription factors that enhance gene expression, whilst strong promoters would have more repressor binding sites. The t-tests assessed for E.coli �70 promoters returned a p-value of 0.072, which at 0.1 significance level suggested support for our (alternative) hypothesis; albeit this trend may only be present for promoters where corresponding TFBSs are either all repressors or all activators. Nevertheless, such suggestive results strongly encourage additional investigations when more experimentally confirmed data will become available. Much of the remainder of the thesis concerns a machine learning study of binding site prediction, using the SVM and kernel methods, principally the spectrum kernel. 
Spectrum kernels have been successfully applied in previous studies of protein classification [91, 92], as well as the related problem of promoter prediction [59], and we have here successfully applied the technique to refining TFBS predictions. The advantages provided by the SVM classifier were best seen in 'moderately' conserved transcription factor binding sites, as represented by our E. coli CRP case study. Inclusion of additional position feature attributes further increased accuracy by 9.1%, but more notable was the considerable decrease in false positive rate from 0.8 to 0.5 while retaining 0.9 sensitivity. Improved prediction of transcription factor binding sites is in turn extremely valuable in improving inference of regulatory relationships, a problem notoriously prone to false positive predictions. Here, the number of false regulatory interactions inferred using the conventional two-component model was substantially reduced when we integrated de novo transcription factor binding site predictions as an additional criterion for acceptance in a case study of inference in the Fur regulon. This initial work was extended to a comparative study of the iron regulatory system across 20 Yersinia strains. This work revealed interesting, strain-specific differences, especially between pathogenic and non-pathogenic strains. Such differences were made clear through interactive visualisations using the TRNDiff software developed as part of this work, and would have remained undetected using conventional methods. This approach led to the nomination of the Yfe iron-uptake system as a candidate for further wet-lab experimentation due to its potential active functionality in non-pathogens and its known participation in full virulence of the bubonic plague strain. Building on this work, we introduced novel structures we have labelled as 'regulatory trees', inspired by the phylogenetic tree concept. 
Instead of using gene or protein sequence similarity, the regulatory trees were constructed based on the number of similar regulatory interactions. While the common phylogenetic trees convey information regarding changes in gene repertoire, which we might regard as analogous to 'hardware', the regulatory tree informs us of the changes in regulatory circuitry, in some respects analogous to 'software'. In this context, we explored the 'pan-regulatory network' for the Fur system, the entire set of regulatory interactions found for the Fur transcription factor across a group of genomes. In the pan-regulatory network, emphasis is placed on how the regulatory network for each target genome is inferred from multiple sources instead of a single source, as is the common approach. The benefit of using multiple reference networks is a more comprehensive survey of the relationships, and increased confidence in the regulatory interactions predicted. In the present study, we distinguish between relationships found across the full set of genomes as the 'core-regulatory-set', and interactions found only in a subset of genomes explored as the 'sub-regulatory-set'. We found nine Fur target gene clusters present across the four genomes studied, this core set potentially identifying basic regulatory processes essential for survival. Species-level differences are seen at the sub-regulatory-set level; for example, the known virulence factors YbtA and PchR were found in Y. pestis and P. aeruginosa respectively, but were absent from both E. coli and B. subtilis. Such factors, and the iron-uptake systems they regulate, are ideal candidates for wet-lab investigation to determine whether or not they are pathogen-specific. In this study, we employed a broad range of approaches to address our goals and assessed these methods using the Fur regulon as our initial case study. 
We identified a set of promising feature attributes, demonstrated their success in increasing transcription factor binding site prediction specificity while retaining sensitivity, and showed the importance of binding site predictions in enhancing the reliability of regulatory interaction inferences. Most importantly, these outcomes led to the introduction of a range of visualisations and techniques, which are applicable across the entire bacterial spectrum and can be utilised in studies beyond the understanding of transcriptional regulatory networks.
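The distinction drawn above between the core-regulatory-set and the sub-regulatory-set, and the idea of comparing genomes by the regulatory interactions they share, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the genome names, gene names and interactions below are made-up placeholders, and the Jaccard distance is an assumed choice of dissimilarity, since the abstract states only that regulatory trees were built from the number of similar regulatory interactions.

```python
from itertools import combinations

# Toy placeholder networks (NOT results from the study): each genome maps to
# the set of (transcription factor, target gene) interactions inferred for it.
networks = {
    "genome_1": {("Fur", "geneA"), ("Fur", "geneB"), ("Fur", "geneC")},
    "genome_2": {("Fur", "geneA"), ("Fur", "geneB"), ("Fur", "geneD")},
    "genome_3": {("Fur", "geneA"), ("Fur", "geneD"), ("Fur", "geneE")},
}


def core_regulatory_set(nets):
    """Interactions found in every genome."""
    return set.intersection(*nets.values())


def sub_regulatory_set(nets):
    """Interactions found in at least one genome but not in all of them."""
    return set.union(*nets.values()) - core_regulatory_set(nets)


def shared_interaction_counts(nets):
    """Number of interactions shared by each pair of genomes."""
    return {
        (a, b): len(nets[a] & nets[b])
        for a, b in combinations(sorted(nets), 2)
    }


def jaccard_distances(nets):
    """Assumed dissimilarity: 1 - |shared| / |union| for each genome pair."""
    return {
        (a, b): 1.0 - len(nets[a] & nets[b]) / len(nets[a] | nets[b])
        for a, b in combinations(sorted(nets), 2)
    }


if __name__ == "__main__":
    print("core:", sorted(core_regulatory_set(networks)))
    print("sub:", sorted(sub_regulatory_set(networks)))
    print("shared counts:", shared_interaction_counts(networks))
    print("distances:", jaccard_distances(networks))
```

A pairwise distance table of this kind could then be passed to any standard hierarchical clustering routine to draw a tree; which clustering method the original work used is not stated in the abstract.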
Resumo:
In this issue of Cancer Discovery, Guagnano and colleagues use a large and diverse annotated collection of cancer cell lines, the Cancer Cell Line Encyclopedia, to correlate whole-genome expression and genomic alteration datasets with cell line sensitivity data to the novel pan-fibroblast growth factor receptor (FGFR) inhibitor NVP-BGJ398. Their findings underscore not only the preclinical use of such cell line panels in identifying predictive biomarkers, but also the emergence of the FGFRs as valid therapeutic targets, across an increasingly broad range of malignancies.
Resumo:
Migraine is a common neurological disease with a complex genetic aetiology. The disease affects ~12% of the Caucasian population and females are three times more likely than males to be diagnosed. In an effort to identify loci involved in migraine susceptibility, we performed a pedigree-based genome-wide association study of the isolated population of Norfolk Island, which has a high prevalence of migraine. This unique population originates from a small number of British and Polynesian founders who are descendants of the Bounty mutiny and forms a very large multigenerational pedigree (Bellis et al.; Human Genetics, 124(5):543-5542, 2008). These population genetic features may facilitate disease gene mapping strategies (Peltonen et al.; Nat Rev Genet, 1(3):182-90, 2000). In this study, we identified a high heritability of migraine in the Norfolk Island population (h² = 0.53, P = 0.016). We performed a pedigree-based GWAS and utilised a statistical and pathological prioritisation approach to implicate a number of variants in migraine. An SNP located in the zinc finger protein 555 (ZNF555) gene (rs4807347) showed evidence of statistical association in our Norfolk Island pedigree (P = 9.6 × 10⁻⁶) as well as replication in a large independent and unrelated cohort with >500 migraineurs. In addition, we utilised a biological prioritisation to implicate four SNPs within the ADARB2 gene, two SNPs within the GRM7 gene and a single SNP in close proximity to the HTR7 gene. Association of SNPs within these neurotransmitter-related genes suggests a disrupted serotoninergic system that is perhaps specific to the Norfolk Island pedigree, but that might provide clues to understanding migraine more generally.
Resumo:
The high risk of metabolic disease traits in Polynesians may be partly explained by elevated prevalence of genetic variants involved in energy metabolism. The genetics of Polynesian populations has been shaped by island-hopping migration events which have possibly favoured thrifty genes. The aim of this study was to sequence the mitochondrial genome in a group of Maoris in an effort to characterise genome variation in this Polynesian population for use in future disease association studies. We sequenced the complete mitochondrial genomes of 20 non-admixed Maori subjects using Affymetrix technology. DNA diversity analyses showed the Maori group exhibited reduced mitochondrial genome diversity compared to other worldwide populations, which is consistent with historical bottleneck and founder effects. Global phylogenetic analysis positioned these Maori subjects specifically within mitochondrial haplogroup B4a1a1. Interestingly, we identified several novel variants that collectively form new and unique Maori motifs: B4a1a1c, B4a1a1a3 and B4a1a1a5. Compared to ancestral populations, we observed an increased frequency of non-synonymous coding variants of several mitochondrial genes in the Maori group, which may be a result of positive selection and/or genetic drift effects. In conclusion, this study reports the first complete mitochondrial genome sequence data for a Maori population. Overall, these new data reveal novel mitochondrial genome signatures in this Polynesian population and enhance the phylogenetic picture of maternal ancestry in Oceania. The increased frequency of several mitochondrial coding variants makes them good candidates for future studies aimed at assessment of metabolic disease risk in Polynesian populations.
Resumo:
Migraine is a common, heterogeneous and heritable neurological disorder. Its pathophysiology is incompletely understood, and its genetic influences at the population level are unknown. In a population-based genome-wide analysis including 5,122 migraineurs and 18,108 non-migraineurs, rs2651899 (1p36.32, PRDM16), rs10166942 (2q37.1, TRPM8) and rs11172113 (12q13.3, LRP1) were among the top seven associations (P < 5 × 10⁻⁶) with migraine. These SNPs were significant in a meta-analysis among three replication cohorts and met genome-wide significance in a meta-analysis combining the discovery and replication cohorts (rs2651899, odds ratio (OR) = 1.11, P = 3.8 × 10⁻⁹; rs10166942, OR = 0.85, P = 5.5 × 10⁻¹²; and rs11172113, OR = 0.90, P = 4.3 × 10⁻⁹). The associations at rs2651899 and rs10166942 were specific for migraine compared with non-migraine headache. None of the three SNP associations was preferential for migraine with aura or without aura, nor were any associations specific for migraine features. TRPM8 has been the focus of neuropathic pain models, whereas LRP1 modulates neuronal glutamate signaling, plausibly linking both genes to migraine pathophysiology.
Resumo:
The transient leaf assay in Nicotiana benthamiana is widely used in plant sciences, with one application being the rapid assembly of complex multigene pathways that produce new fatty acid profiles. This rapid and facile assay would be further improved if it were possible to simultaneously overexpress transgenes while accurately silencing endogenes. Here, we report a draft genome resource for N. benthamiana spanning over 75% of the 3.1 Gb haploid genome. This resource revealed a two-member NbFAD2 family, NbFAD2.1 and NbFAD2.2, and quantitative RT-PCR (qRT-PCR) confirmed their expression in leaves. FAD2 activities were silenced using hairpin RNAi as monitored by qRT-PCR and biochemical assays. Silencing of endogenous FAD2 activities was combined with overexpression of transgenes via the use of the alternative viral silencing-suppressor protein, V2, from Tomato yellow leaf curl virus. We show that V2 permits maximal overexpression of transgenes but, crucially, also allows hairpin RNAi to operate unimpeded. To illustrate the efficacy of the V2-based leaf assay system, endogenous lipids were shunted from the desaturation of 18:1 to elongation reactions beginning with 18:1 as substrate. These V2-based leaf assays produced ~50% more elongated fatty acid products than p19-based assays. Analyses of small RNA populations generated from hairpin RNAi against NbFAD2 confirm that the siRNA population is dominated by 21 and 22 nt species derived from the hairpin. Collectively, these new tools expand the range of uses and possibilities for metabolic engineering in transient leaf assays. © 2012 Naim et al.