928 results for genomics
Abstract:
Background: Tuberculosis remains one of the deadliest infectious diseases, warranting the identification of newer targets and drugs. Identification and validation of appropriate targets for designing drugs are critical steps in drug discovery and are at present major bottlenecks. A majority of drugs in current clinical use for many diseases have been designed without knowledge of the targets, perhaps because standard methodologies to identify such targets in a high-throughput fashion do not really exist. With different kinds of 'omics' data that are now available, computational approaches can be powerful means of obtaining short-lists of possible targets for further experimental validation. Results: We report a comprehensive in silico target identification pipeline, targetTB, for Mycobacterium tuberculosis. The pipeline incorporates a network analysis of the protein-protein interactome, a flux balance analysis of the reactome, experimentally derived phenotype essentiality data, sequence analyses and a structural assessment of targetability, using novel algorithms recently developed by us. Using flux balance analysis and network analysis, proteins critical for survival of M. tuberculosis are first identified, followed by comparative genomics with the host, finally incorporating a novel structural analysis of the binding sites to assess the feasibility of a protein as a target. Further analyses include correlation with expression data and non-similarity to gut flora proteins as well as 'anti-targets' in the host, leading to the identification of 451 high-confidence targets. Through phylogenetic profiling against 228 pathogen genomes, shortlisted targets have been further explored to identify broad-spectrum antibiotic targets, while also identifying those specific to tuberculosis. Targets that address mycobacterial persistence and drug resistance mechanisms are also analysed.
Conclusion: The pipeline developed provides a rational schema for drug target identification that is likely to have a high rate of success and is expected to save enormous amounts of money, resources and time in the drug discovery process. A thorough comparison with previously suggested targets in the literature demonstrates the usefulness of the integrated approach used in our study, highlighting the importance of systems-level analyses in particular. The method has the potential to be used as a general strategy for target identification and validation and hence to significantly impact most drug discovery programmes.
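The staged filtering described in this abstract (essentiality first, then comparison with the host, then structural targetability) can be pictured as a funnel of successive filters. The following is only a minimal sketch under stated assumptions: the protein identifiers, the set contents and the `shortlist` helper are all hypothetical and are not taken from targetTB.

```python
# Illustrative funnel of target filters. All identifiers and set contents are
# hypothetical; this is not the targetTB code or data.

def shortlist(candidates, filters):
    """Apply (name, predicate) filters in order and record survivors per stage."""
    survivors = list(candidates)
    history = []
    for name, keep in filters:
        survivors = [p for p in survivors if keep(p)]
        history.append((name, list(survivors)))
    return survivors, history

essential = {"P1", "P2", "P3", "P4"}   # assumed output of flux balance / network analysis
host_homologs = {"P2"}                 # assumed to resemble a host protein
targetable_site = {"P1", "P3", "P5"}   # assumed to pass the structural assessment

filters = [
    ("essential for survival", lambda p: p in essential),
    ("no close host homolog", lambda p: p not in host_homologs),
    ("targetable binding site", lambda p: p in targetable_site),
]

targets, history = shortlist(["P1", "P2", "P3", "P4", "P5"], filters)
for stage, remaining in history:
    print(f"{stage}: {remaining}")
```

Recording the survivors after every stage makes it easy to see which filter removes most candidates, which is the kind of attrition a multi-stage pipeline is expected to show.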
Abstract:
Background: Regulation of gene expression in Plasmodium falciparum (Pf) remains poorly understood. While over half the genes are estimated to be regulated at the transcriptional level, few regulatory motifs and transcription regulators have been found. Results: The study seeks to identify putative regulatory motifs in the upstream regions of 13 functional groups of genes expressed in the intraerythrocytic developmental cycle of Pf. Three motif-discovery programs were used for the purpose, and motifs were searched for only on the gene coding strand. Four motifs – the 'G-rich', the 'C-rich', the 'TGTG' and the 'CACA' motifs – were identified, and zero to all four of these occur in the 13 sets of upstream regions. The 'CACA motif' was absent in functional groups expressed during the ring to early trophozoite transition. For functional groups expressed in each transition, the motifs tended to be similar. Upstream motifs in some functional groups showed 'positional conservation' by occurring at similar positions relative to the translational start site (TLS); this increases their significance as regulatory motifs. In the ribonucleotide synthesis, mitochondrial, proteasome and organellar translation machinery genes, G-rich, C-rich, CACA and TGTG motifs, respectively, occur with striking positional conservation. In the organellar translation machinery group, G-rich motifs occur close to the TLS. The same motifs were sometimes identified for multiple functional groups; differences in location and abundance of the motifs appear to ensure different modes of action. Conclusion: The identification of positionally conserved over-represented upstream motifs throws light on putative regulatory elements for transcription in Pf.
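The idea of 'positional conservation' (a motif occurring at similar distances from the translational start site across genes) can be illustrated with a small scan of coding-strand upstream sequences. This is a minimal sketch: the sequences and gene names are invented, and exact string matching stands in for the motif-discovery programs used in the study, which are not named in the abstract.

```python
# Toy scan for short motifs in upstream regions, reporting positions relative to
# the translational start site (TLS). Sequences and gene names are made up.

def motif_offsets(upstream, motif):
    """Return the offset of each motif occurrence relative to the TLS.

    `upstream` is the coding-strand sequence ending immediately before the TLS,
    so its last base sits at offset -1 and all offsets are negative.
    """
    offsets = []
    start = upstream.find(motif)
    while start != -1:
        offsets.append(start - len(upstream))
        start = upstream.find(motif, start + 1)  # step by 1 to allow overlapping matches
    return offsets

upstream_regions = {
    "geneA": "AATGTGAAAACACATT",
    "geneB": "TTTGTGTGAAAAAAAA",
}

for gene, seq in upstream_regions.items():
    for motif in ("TGTG", "CACA"):
        print(gene, motif, motif_offsets(seq, motif))
```

If the offsets for one motif cluster around the same value across the genes of a functional group, that is the pattern the abstract calls positional conservation.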
Abstract:
The development of innovative methods of stock assessment is a priority for State and Commonwealth fisheries agencies. It is driven by the need to facilitate sustainable exploitation of naturally occurring fisheries resources for the current and future economic, social and environmental well-being of Australia. This project was initiated in this context and took advantage of considerable recent achievements in genomics that are shaping our comprehension of the DNA of humans and animals. The basic idea behind this project was that genetic estimates of effective population size, which can be made from empirical measurements of genetic drift, are equivalent to estimates of the number of successful spawners, which is an important parameter in the process of fisheries stock assessment. The broad objectives of this study were to 1. Critically evaluate a variety of mathematical methods of calculating effective spawner numbers (Ne) by a. conducting comprehensive computer simulations, and by b. analysis of empirical data collected from the Moreton Bay population of tiger prawns (P. esculentus). 2. Lay the groundwork for the application of the technology in the northern prawn fishery (NPF). 3. Produce software for the calculation of Ne, and to make it widely available. The project pulled together a range of mathematical models for estimating current effective population size from diverse sources. Some of them had been recently implemented with the latest statistical methods (e.g. the Bayesian framework of Berthier, Beaumont et al. 2002), while others had lower profiles (e.g. Pudovkin, Zaykin et al. 1996; Rousset and Raymond 1995). Computer code and later software with a user-friendly interface (NeEstimator) were produced to implement the methods. This was used as a basis for simulation experiments to evaluate the performance of the methods with an individual-based model of a prawn population.
Following the guidelines suggested by computer simulations, the tiger prawn population in Moreton Bay (south-east Queensland) was sampled for genetic analysis with eight microsatellite loci in three successive spring spawning seasons in 2001, 2002 and 2003. As predicted by the simulations, the estimates had non-infinite upper confidence limits, which is a major achievement for the application of the method to a naturally-occurring, short generation, highly fecund invertebrate species. The genetic estimate of the number of successful spawners was around 1000 individuals in two consecutive years. This contrasts with about 500,000 prawns participating in spawning. It is not possible to distinguish successful from non-successful spawners so we suggest a high level of protection for the entire spawning population. We interpret the difference in numbers between successful and non-successful spawners as a large variation in the number of offspring per family that survive – a large number of families have no surviving offspring, while a few have a large number. We explored various ways in which Ne can be useful in fisheries management. It can be a surrogate for spawning population size, assuming the ratio between Ne and spawning population size has been previously calculated for that species. Alternatively, it can be a surrogate for recruitment, again assuming that the ratio between Ne and recruitment has been previously determined. The number of species that can be analysed in this way, however, is likely to be small because of species-specific life history requirements that need to be satisfied for accuracy. The most universal approach would be to integrate Ne with spawning stock-recruitment models, so that these models are more accurate when applied to fisheries populations. A pathway to achieve this was established in this project, which we predict will significantly improve fisheries sustainability in the future. 
Regardless of the success of integrating Ne into spawning stock-recruitment models, Ne could be used as a fisheries monitoring tool. Declines in spawning stock size or increases in natural or harvest mortality would be reflected by a decline in Ne. This would be useful for data-poor fisheries and would provide fishery-independent information; however, we suggest a species-by-species approach. Some species may be too numerous or experiencing too much migration for the method to work. During the project two important theoretical studies of the simultaneous estimation of effective population size and migration were published (Vitalis and Couvet 2001b; Wang and Whitlock 2003). These methods, combined with collection of preliminary genetic data from the tiger prawn population in the southern Gulf of Carpentaria and a computer simulation study that evaluated the effect of differing reproductive strategies on genetic estimates, suggest that this technology could make an important contribution to the stock assessment process in the northern prawn fishery (NPF). Advances in the genomics world are rapid and already a cheaper, more reliable substitute for microsatellite loci in this technology is available. Digital data from single nucleotide polymorphisms (SNPs) are likely to supersede ‘analogue’ microsatellite data, making it cheaper and easier to apply the method to species with large population sizes.
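The link between genetic drift and effective population size can be illustrated with one classical moment-based temporal estimator: a standardized variance of allele-frequency change (Nei and Tajima's Fc) corrected for sampling noise in the manner of Waples (1989). This is a minimal sketch under those assumptions; it is not claimed to be how NeEstimator or any method cited in the abstract is implemented, and the allele frequencies and sample sizes are invented.

```python
# Moment-based temporal estimate of effective population size (Ne) from allele
# frequencies sampled at two time points. All input values are invented.

def temporal_fc(freqs_0, freqs_t):
    """Standardized variance of allele-frequency change (Fc), averaged over alleles."""
    terms = []
    for x, y in zip(freqs_0, freqs_t):
        denom = (x + y) / 2 - x * y
        if denom > 0:  # skip alleles absent (or fixed) in both samples
            terms.append((x - y) ** 2 / denom)
    if not terms:
        raise ValueError("no informative alleles")
    return sum(terms) / len(terms)

def temporal_ne(fc, generations, n_sampled_0, n_sampled_t):
    """Ne after removing the expected contribution of sampling noise from Fc.

    Returns infinity when the observed change is no larger than sampling alone
    would produce, i.e. when no drift signal is detectable.
    """
    drift = fc - 1 / (2 * n_sampled_0) - 1 / (2 * n_sampled_t)
    if drift <= 0:
        return float("inf")
    return generations / (2 * drift)

fc = temporal_fc([0.50, 0.30, 0.20], [0.42, 0.36, 0.22])
print(temporal_ne(fc, generations=1, n_sampled_0=100, n_sampled_t=100))
```

The infinite return value shows why finite upper confidence limits matter: when drift is weak relative to sampling noise, as expected in very large populations, an estimator of this kind cannot bound Ne from above.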
Abstract:
Tick resistant cattle could provide a potentially sustainable and environmentally sound method of controlling cattle ticks. Advances in genomics and the availability of the bovine genome sequence open up opportunities to identify useful and selectable genes controlling cattle tick resistance. Using quantitative real-time PCR and the Affymetrix bovine array platform, differences in gene expression of skin biopsies from tick resistant Bos indicus (Brahman) and tick susceptible Bos taurus (Holstein-Friesian) cattle following tick challenge were examined. We identified 138 significant differentially-expressed genes, including several immunological/host defence genes, extracellular matrix proteins, and transcription factors as well as genes involved in lipid metabolism. Three key pathways, represented by genes differentially expressed in resistant Brahmans, were identified: the development of the cell-mediated immune response, structural integrity of the dermis and intracellular Ca2+ levels. Ca2+, which is implicated in host responses to microbial stimuli, may be required for the enhancement or fine-tuning of transcriptional activation of Ca2+-dependent host defence signalling pathways. Animal Genomics for Animal Health International Symposium, Paris, October 2007: (Proceedings)
Abstract:
White clover (Trifolium repens L.) is an obligate outbreeding allotetraploid forage legume. Gene-associated SNPs provide the optimum genetic system for improvement of such crop species. An EST resource obtained from multiple cDNA libraries constructed from numerous genotypes of a single cultivar has been used for in silico SNP discovery and validation. A total of 58 from 236 selected sequence clusters (24.5%) were fully validated as containing polymorphic SNPs by genotypic analysis across the parents and progeny of several two-way pseudo-testcross mapping families. The clusters include genes belonging to a broad range of predicted functional categories. Polymorphic SNP-containing ESTs have also been used for comparative genomic analysis by comparison with whole genome data from model legume species, as well as Arabidopsis thaliana. A total of 29 (50%) of the 58 clusters detected putative ortholoci with known chromosomal locations in Medicago truncatula, which is closely related to white clover within the Trifolieae tribe of the Fabaceae. This analysis provides access to translational data from model species. The efficiency of in silico SNP discovery in white clover is limited by paralogous and homoeologous gene duplication effects, which are resolved unambiguously by the transmission test. This approach will also be applicable to other agronomically important cross-pollinating allopolyploid plant species.
Abstract:
Background Fusion transcripts are found in many tissues and have the potential to create novel functional products. Here, we investigate the genomic sequences around fusion junctions to better understand the transcriptional mechanisms mediating fusion transcription/splicing. We analyzed data from prostate (cancer) cells as previous studies have shown extensively that these cells readily undergo fusion transcription. Results We used the FusionMap program to identify high-confidence fusion transcripts from RNAseq data. The RNAseq datasets were from our (N = 8) and other (N = 14) clinical prostate tumors with adjacent non-cancer cells, and from the LNCaP prostate cancer cell line that was mock-, androgen- (DHT), and anti-androgen- (bicalutamide, enzalutamide) treated. In total, 185 fusion transcripts were identified from all RNAseq datasets. The majority (76 %) of these fusion transcripts were ‘read-through chimeras’ derived from adjacent genes in the genome. Characterization of sequences at fusion loci was carried out using a combination of the FusionMap program, custom Perl scripts, and the RNAfold program. Our computational analysis indicated that most fusion junctions (76 %) use the consensus GT-AG intron donor-acceptor splice site, and most fusion transcripts (85 %) maintained the open reading frame. We assessed whether the parental genes of fusion transcripts have the potential to form complementary base pairs with each other, which might bring them into physical proximity. Our computational analysis of sequences flanking fusion junctions at parental loci indicates that these loci have a similar propensity as non-fusion loci to hybridize. The abundance of repetitive sequences at fusion and non-fusion loci was also investigated given that SINE repeats are involved in aberrant gene transcription. We found few instances of repetitive sequences at both fusion and non-fusion junctions.
Finally, RT-qPCR was performed on RNA from both clinical prostate tumors and adjacent non-cancer cells (N = 7), and LNCaP cells treated as above to validate the expression of seven fusion transcripts and their respective parental genes. We reveal that fusion transcript expression is similar to the expression of parental genes. Conclusions Fusion transcripts maintain the open reading frame, and likely use the same transcriptional machinery as non-fusion transcripts as they share many genomic features at splice/fusion junctions.
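Two of the junction properties reported in this abstract, use of the consensus GT-AG splice site and maintenance of the open reading frame, reduce to simple checks once the junction coordinates are known. This is a minimal sketch, not the authors' Perl scripts: the function names, the example sequences and the simplified frame rule (which ignores premature stop codons) are assumptions made for illustration.

```python
# Simplified junction checks for a fusion transcript. Inputs are hypothetical.

def uses_gt_ag(intron_seq):
    """True if the spliced-out sequence starts with GT (donor) and ends with AG (acceptor)."""
    seq = intron_seq.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")

def keeps_reading_frame(upstream_coding_bases, downstream_coding_bases_skipped):
    """True if the 3' partner is read in its own frame after the junction.

    upstream_coding_bases: coding bases contributed by the 5' gene, counted
        from its start codon up to the junction.
    downstream_coding_bases_skipped: coding bases of the 3' gene that lie
        before the junction and are therefore missing from the fusion.
    The frame is kept when the two counts are equal modulo 3.
    """
    return (upstream_coding_bases - downstream_coding_bases_skipped) % 3 == 0

print(uses_gt_ag("GTAAGTTTTCAG"))
print(keeps_reading_frame(300, 150))
print(keeps_reading_frame(301, 150))
```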
Abstract:
Background: Molecular marker technologies are undergoing a transition from largely serial assays measuring DNA fragment sizes to hybridization-based technologies with high multiplexing levels. Diversity Arrays Technology (DArT) is a hybridization-based technology that is increasingly being adopted by barley researchers. There is a need to integrate the information generated by DArT with previous data produced with gel-based marker technologies. The goal of this study was to build a high-density consensus linkage map from the combined datasets of ten populations, most of which were simultaneously typed with DArT and Simple Sequence Repeat (SSR), Restriction Fragment Length Polymorphism (RFLP) and/or Sequence Tagged Site (STS) markers. Results: The consensus map, built using a combination of JoinMap 3.0 software and several purpose-built Perl scripts, comprised 2,935 loci (2,085 DArT, 850 other loci) and spanned 1,161 cM. It contained a total of 1,629 'bins' (unique loci), with an average inter-bin distance of 0.7 ± 1.0 cM (median = 0.3 cM). More than 98% of the map could be covered with a single DArT assay. The arrangement of loci was very similar to, and almost as optimal as, the arrangement of loci in component maps built for individual populations. The locus order of a synthetic map derived from merging the component maps without considering the segregation data was only slightly inferior. The distribution of loci along chromosomes indicated centromeric suppression of recombination in all chromosomes except 5H. DArT markers appeared to have a moderate tendency toward hypomethylated, gene-rich regions in distal chromosome areas. On average, 14 ± 9 DArT loci were identified within 5 cM on either side of SSR, RFLP or STS loci previously identified as linked to agricultural traits.
Conclusion: Our barley consensus map provides a framework for transferring genetic information between different marker systems and for deploying DArT markers in molecular breeding schemes. The study also highlights the need for improved software for building consensus maps from high-density segregation data of multiple populations.
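The 'bins' and inter-bin distances reported in this abstract can be illustrated by collapsing loci that share a map position and measuring the gaps between neighbouring positions. This is a minimal sketch with invented positions for a single hypothetical chromosome; the function names are assumptions and it does not reproduce the JoinMap-based workflow used in the study.

```python
# Collapse co-located loci into bins and summarise the gaps between bins.
# The positions below are invented and describe one hypothetical chromosome.
from statistics import mean, median

def bin_positions(positions_cm, tol=1e-9):
    """Return the sorted unique map positions (bins), merging loci within `tol` cM."""
    bins = []
    for pos in sorted(positions_cm):
        if not bins or pos - bins[-1] > tol:
            bins.append(pos)
    return bins

def inter_bin_distances(bins):
    """Distances in cM between consecutive bins."""
    return [b - a for a, b in zip(bins, bins[1:])]

loci = [0.0, 0.0, 0.3, 0.3, 1.0, 2.5]
bins = bin_positions(loci)
gaps = inter_bin_distances(bins)
print(len(loci), "loci in", len(bins), "bins")
print("mean gap:", round(mean(gaps), 2), "cM; median gap:", round(median(gaps), 2), "cM")
```

A mean gap well above the median, as in the abstract's 0.7 cM versus 0.3 cM, indicates that a few large gaps dominate the average while most bins sit close together.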
Abstract:
Using an established genetic map, a single gene conditioning covered smut resistance, Ruh.7H, was mapped to the telomere region of chromosome 7HS in an Alexis/Sloop doubled haploid barley population. The closest marker to Ruh.7H, abg704, was 7.5 cM away. Thirteen loci on the distal end of 7HS with potential to contain single nucleotide polymorphisms (SNPs) were identified by applying a comparative genomics approach using rice sequence data. Of these, one locus produced polymorphic co-dominant bands of different size while two further loci contained SNPs that were identified using the recently developed high resolution melting (HRM) technique. Two of these markers flanked Ruh.7H with the proximal marker located 3.8 cM and the distal marker 2.7 cM away. This is the first report on the application of the HRM technique to SNP detection and to rapid scoring of known cleaved amplified polymorphic sequence (CAPS) markers in plants. This simple, precise post-PCR technique should find widespread use in the fine-mapping of genetic regions of interest in complex cereal and other plant genomes.
Abstract:
The sequential nature of gel-based marker systems entails low throughput and high costs per assay. Commonly used marker systems such as SSR and SNP are also dependent on sequence information. These limitations result in high cost per data point and significantly limit the capacity of breeding programs to obtain sufficient return on investment to justify the routine use of marker-assisted breeding for many traits and particularly quantitative traits. Diversity Arrays Technology (DArT™) is a cost-effective hybridisation-based marker technology that offers a high multiplexing level while being independent of sequence information. This technology offers sorghum breeding programs an alternative approach to whole-genome profiling. We report on the development, application, mapping and utility of DArT™ markers for sorghum germplasm. Results: A genotyping array was developed representing approximately 12,000 genomic clones using PstI+BanII complexity with a subset of clones obtained through the suppression subtractive hybridisation (SSH) method. The genotyping array was used to analyse a diverse set of sorghum genotypes and to screen a Recombinant Inbred Lines (RIL) mapping population. Over 500 markers detected variation among 90 accessions used in a diversity analysis. Cluster analysis discriminated well between all 90 genotypes. To confirm that the sorghum DArT markers behave in a Mendelian manner, we constructed a genetic linkage map for a cross between R931945-2-2 and IS 8525 integrating DArT and other marker types. In total, 596 markers could be placed on the integrated linkage map, which spanned 1431.6 cM. The genetic linkage map had an average marker density of 1/2.39 cM, with an average DArT marker density of 1/3.9 cM. Conclusion: We have successfully developed DArT markers for Sorghum bicolor and have demonstrated that DArT provides high quality markers that can be used for diversity analyses and to construct medium-density genetic linkage maps.
The high number of DArT markers generated in a single assay not only provides a precise estimate of genetic relationships among genotypes, but also their even distribution over the genome offers real advantages for a range of molecular breeding and genomics applications.
Differential expression profiling of components associated with exoskeletal hardening in crustaceans
Abstract:
Background: Exoskeletal hardening in crustaceans can be attributed to mineralization and sclerotization of the organic matrix. Glycoproteins have been implicated in the calcification process of many matrices. Sclerotization, on the other hand, is catalysed by phenoloxidases, which also play a role in melanization and the immunological response in arthropods. Custom cDNA microarrays from Portunus pelagicus were used to identify genes possibly associated with the activation pathways involved in these processes. Results: Two genes potentially involved in the recognition of glycosylation, the C-type lectin receptor and the mannose-binding protein, were found to display molt cycle-related differential expression profiles. C-type lectin receptor up-regulation was found to coincide with periods associated with new uncalcified cuticle formation, while the up-regulation of mannose-binding protein occurred only in the post-molt stage, during which calcification takes place, implicating both in the regulation of calcification. Genes presumed to be involved in the phenoloxidase activation pathway that facilitates sclerotization also displayed molt cycle-related differential expression profiles. Members of the serine protease superfamily, trypsin-like and chymotrypsin-like, were up-regulated in the intermolt stage when compared to post-molt, while trypsin-like was also up-regulated in pre-molt compared to ecdysis. Additionally, up-regulation in pre- and intermolt stages was observed for transcripts encoding other phenoloxidase activators including the putative antibacterial protein carcinin-like, and clotting protein precursor-like. Furthermore, hemocyanin, itself with phenoloxidase activity, displayed an identical expression pattern to that of the phenoloxidase activators, i.e. up-regulation in pre- and intermolt. Conclusion: Cuticle hardening in crustaceans is a complex process that is precisely timed to occur in the post-molt stage of the molt cycle.
We have identified differential expression patterns of several genes that are believed to be involved in biomineralization and sclerotization and propose possible regulatory mechanisms for these processes based on their expression profiles, such as the potential involvement of C-type lectin receptors and mannose binding protein in the regulation of calcification.
Abstract:
Improving the genetic base of cultivars that underpin commercial mango production is generally recognized as necessary for long term industry stability. Genetic improvement can take many approaches to improve cultivars, each with their own advantages and disadvantages. This paper will discuss several approaches used in the genetic improvement of mangoes in Australia, including varietal introductions, selection of monoembryonic progeny, selection within polyembryonic populations, assisted open pollination and controlled closed pollination. The current activities of the Australian National Mango Breeding Program will be outlined, and the analysis and use of hybrid phenotype data from the project for selection of next generation parents will be discussed. Some of the important traits that will enhance the competitiveness of future cultivars will be introduced and the challenges in achieving them discussed. The use of a genomics approach and its impact on future mango breeding is examined.
Abstract:
Background: The hot dog fold has been found in more than sixty proteins since the first report of its existence about a decade ago. The fold appears to have a strong association with fatty acid biosynthesis, its regulation and metabolism, as the proteins with this fold are predominantly coenzyme A-binding enzymes with a variety of substrates located at their active sites. Results: We have analyzed the structural features and sequences of proteins having the hot dog fold. This study reveals that though the basic architecture of the fold is well conserved in these proteins, significant differences exist in their sequence, nature of substrate and oligomerization. Segments with certain conserved sequence motifs seem to play crucial structural and functional roles in various classes of these proteins. Conclusion: The analysis led to predictions regarding the functional classification and identification of possible catalytic residues of a number of hot dog fold-containing hypothetical proteins whose structures were determined in high throughput structural genomics projects.
Abstract:
Most psychiatric disorders are moderately to highly heritable. The degree to which genetic variation is unique to individual disorders or shared across disorders is unclear. To examine shared genetic etiology, we use genome-wide genotype data from the Psychiatric Genomics Consortium (PGC) for cases and controls in schizophrenia, bipolar disorder, major depressive disorder, autism spectrum disorders (ASD) and attention-deficit/hyperactivity disorder (ADHD). We apply univariate and bivariate methods for the estimation of genetic variation within and covariation between disorders. SNPs explained 17-29% of the variance in liability. The genetic correlation calculated using common SNPs was high between schizophrenia and bipolar disorder (0.68 +/- 0.04 s.e.), moderate between schizophrenia and major depressive disorder (0.43 +/- 0.06 s.e.), bipolar disorder and major depressive disorder (0.47 +/- 0.06 s.e.), and ADHD and major depressive disorder (0.32 +/- 0.07 s.e.), low between schizophrenia and ASD (0.16 +/- 0.06 s.e.) and non-significant for other pairs of disorders as well as between psychiatric disorders and the negative control of Crohn's disease. This empirical evidence of shared genetic etiology for psychiatric disorders can inform nosology and encourages the investigation of common pathophysiologies for related disorders.
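The genetic correlations quoted in this abstract follow the standard definition: the genetic covariance between two traits divided by the square root of the product of their genetic variances. The sketch below shows that formula and a rough estimate-to-standard-error ratio for some of the reported values. The ratio is only an informal guide and is not claimed to be the significance test used in the study; the function names are illustrative.

```python
# Genetic correlation from (co)variance components, plus a rough
# estimate / standard-error ratio for values quoted in the abstract.
import math

def genetic_correlation(cov_g, var_g1, var_g2):
    """r_g = cov_g / sqrt(var_g1 * var_g2)."""
    return cov_g / math.sqrt(var_g1 * var_g2)

def estimate_to_se_ratio(estimate, se):
    """How many standard errors the estimate lies from zero."""
    return estimate / se

reported = {
    "schizophrenia / bipolar disorder": (0.68, 0.04),
    "schizophrenia / major depressive disorder": (0.43, 0.06),
    "schizophrenia / ASD": (0.16, 0.06),
}

for pair, (r_g, se) in reported.items():
    print(f"{pair}: r_g = {r_g}, ratio = {estimate_to_se_ratio(r_g, se):.1f}")
```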
Abstract:
The number of genetic factors associated with common human traits and disease is increasing rapidly, and the general public is utilizing affordable, direct-to-consumer genetic tests. The results of these tests are often in the public domain. A combination of factors has increased the potential for the indirect estimation of an individual's risk for a particular trait. Here we explain the basic principles underlying risk estimation which allowed us to test the ability to make an indirect risk estimation from genetic data by imputing Dr. James Watson's redacted apolipoprotein E gene (APOE) information. The principles underlying risk prediction from genetic data have been well known and applied for many decades; however, the recent increase in genomic knowledge, and advances in mathematical and statistical techniques and computational power, make it relatively easy to make an accurate but indirect estimation of risk. There is a current hazard for indirect risk estimation that is relevant not only to the subject but also to individuals related to the subject; this risk will likely increase as more detailed genomic data and better computational tools become available.
Abstract:
There is evidence across several species for genetic control of phenotypic variation of complex traits1, 2, 3, 4, such that the variance among phenotypes is genotype dependent. Understanding genetic control of variability is important in evolutionary biology, agricultural selection programmes and human medicine, yet for complex traits, no individual genetic variants associated with variance, as opposed to the mean, have been identified. Here we perform a meta-analysis of genome-wide association studies of phenotypic variation using ~170,000 samples on height and body mass index (BMI) in human populations. We report evidence that the single nucleotide polymorphism (SNP) rs7202116 at the FTO gene locus, which is known to be associated with obesity (as measured by mean BMI for each rs7202116 genotype)5, 6, 7, is also associated with phenotypic variability. We show that the results are not due to scale effects or other artefacts, and find no other experiment-wise significant evidence for effects on variability, either at loci other than FTO for BMI or at any locus for height. The difference in variance for BMI among individuals with opposite homozygous genotypes at the FTO locus is approximately 7%, corresponding to a difference of ~0.5 kilograms in the standard deviation of weight. Our results indicate that genetic variants can be discovered that are associated with variability, and that between-person variability in obesity can partly be explained by the genotype at the FTO locus. The results are consistent with reported FTO by environment interactions for BMI8, possibly mediated by DNA methylation9, 10. Our BMI results for other SNPs and our height results for all SNPs suggest that most genetic variants, including those that influence mean height or mean BMI, are not associated with phenotypic variance, or that their effects on variability are too small to detect even with sample sizes greater than 100,000.