118 results for DNA Mutational Analysis
Abstract:
Over the last few years, investigations of human epigenetic profiles have identified the key elements of change to be histone modifications, stable and heritable DNA methylation, and chromatin remodeling. These factors determine gene expression levels and characterise conditions leading to disease. In order to extract information embedded in long DNA sequences, data mining and pattern recognition tools are widely used, but efforts to date have been limited with respect to analyzing epigenetic changes and their role as catalysts in disease onset. Useful insight, however, can be gained by investigating the associated dinucleotide distributions. The focus of this paper is to explore specific dinucleotide frequencies across defined regions within the human genome, and to identify new patterns between epigenetic mechanisms and DNA content. Signal processing methods, including Fourier and wavelet transformations, are employed and principal results are reported.
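As a rough illustration of the kind of analysis this abstract describes, the sketch below counts overlapping dinucleotides and computes a naive Fourier power spectrum of a binary dinucleotide indicator signal. The function names, the choice of CG as the default dinucleotide, and the indicator encoding are assumptions made for illustration; the abstract does not specify how the authors encoded sequences or which transform implementation they used.

```python
import cmath
from collections import Counter


def dinucleotide_frequencies(seq):
    """Relative frequency of each overlapping dinucleotide (AA, AC, ..., TT)."""
    seq = seq.upper()
    pairs = (seq[i:i + 2] for i in range(len(seq) - 1))
    # Ignore pairs containing ambiguous bases such as N.
    counts = Counter(p for p in pairs if set(p) <= set("ACGT"))
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()} if total else {}


def indicator_signal(seq, dinucleotide="CG"):
    """Binary signal with 1 at every position where `dinucleotide` starts."""
    seq = seq.upper()
    return [1 if seq[i:i + 2] == dinucleotide else 0 for i in range(len(seq) - 1)]


def power_spectrum(signal):
    """Naive O(N^2) discrete Fourier transform power spectrum (hypothetical helper)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal))) ** 2
        for k in range(n)
    ]
```

For genome-scale regions a fast Fourier transform from a numerical library would replace the quadratic loop; the naive version is kept here only so the sketch has no dependencies.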
Abstract:
Synchronous fluorescence spectroscopy (SFS) was applied for the investigation of interactions of the antibiotic, tetracycline (TC), with DNA in the presence of aluminium ions (Al3+). The study was facilitated by the use of the Methylene Blue (MB) dye probe, and the interpretation of the spectral data with the aid of the chemometrics method, parallel factor analysis (PARAFAC). Three-way synchronous fluorescence analysis extracted the important optimum constant wavelength differences, Δλ, and showed that for the TC–Al3+–DNA, TC–Al3+ and MB dye systems, the associated Δλ values were different (Δλ = 80, 75 and 30 nm, respectively). Subsequent PARAFAC analysis demonstrated the extraction of the equilibrium concentration profiles for the TC–Al3+, TC–Al3+–DNA and MB probe systems. This information is unobtainable by conventional means of data interpretation. The results indicated that the MB dye interacted with the TC–Al3+–DNA surface complex, presumably via a reaction intermediate, TC–Al3+–DNA–MB, leading to the displacement of the TC–Al3+ by the incoming MB dye probe.
Abstract:
The highly variable flagellin-encoding flaA gene has long been used for genotyping Campylobacter jejuni and Campylobacter coli. High-resolution melting (HRM) analysis is emerging as an efficient and robust method for discriminating DNA sequence variants. The objective of this study was to apply HRM analysis to flaA-based genotyping. The initial aim was to identify a suitable flaA fragment. It was found that the PCR primers commonly used to amplify the flaA short variable repeat (SVR) yielded a mixed PCR product unsuitable for HRM analysis. However, a PCR primer set composed of the upstream primer used to amplify the fragment used for flaA restriction fragment length polymorphism (RFLP) analysis and the downstream primer used for flaA SVR amplification generated a very pure PCR product, and this primer set was used for the remainder of the study. Eighty-seven C. jejuni and 15 C. coli isolates were analyzed by flaA HRM and also partial flaA sequencing. There were 47 flaA sequence variants, and all were resolved by HRM analysis. The isolates used had previously also been genotyped using single-nucleotide polymorphisms (SNPs), binary markers, CRISPR HRM, and flaA RFLP. flaA HRM analysis provided resolving power multiplicative to the SNPs, binary markers, and CRISPR HRM and largely concordant with the flaA RFLP. It was concluded that HRM analysis is a promising approach to genotyping based on highly variable genes.
Abstract:
A protein-truncating variant of CHEK2, 1100delC, is associated with a moderate increase in breast cancer risk. We have determined the prevalence of this allele in index cases from 300 Australian multiple-case breast cancer families, 95% of which had been found to be negative for mutations in BRCA1 and BRCA2. Only two (0.6%) index cases heterozygous for the CHEK2 mutation were identified. All available relatives in these two families were genotyped, but there was no evidence of co-segregation between the CHEK2 variant and breast cancer. Lymphoblastoid cell lines established from a heterozygous carrier contained approximately 20% of the CHEK2 1100delC mRNA relative to wild-type CHEK2 transcript. However, no truncated CHK2 protein was detectable. Analyses of expression and phosphorylation of wild-type CHK2 suggest that the variant is likely to act by haploinsufficiency. Analysis of CDC25A degradation, a downstream target of CHK2, suggests that some compensation occurs to allow normal degradation of CDC25A. Such compensation of the 1100delC defect in CHEK2 might explain the rather low breast cancer risk associated with the CHEK2 variant, compared to that associated with truncating mutations in BRCA1 or BRCA2.
Abstract:
hSSB1 is a recently discovered single-stranded DNA binding protein that is essential for efficient repair of DNA double-strand breaks (DSBs) by the homologous recombination pathway. hSSB1 is required for the efficient recruitment of the MRN complex to sites of DSBs and for the efficient initiation of ATM dependent signalling. Here we explore the interplay between hSSB1 and MRN. We demonstrate that hSSB1 binds directly to NBS1, a component of the MRN complex, in a DNA damage independent manner. Consistent with the direct interaction, we observe that hSSB1 greatly stimulates the endo-nuclease activity of the MRN complex, a process that requires the C-terminal tail of hSSB1. Interestingly, analysis of two point mutations in NBS1, associated with Nijmegen breakage syndrome, revealed weaker binding to hSSB1, suggesting a possible disease mechanism.
Abstract:
Nuclear Factor Y (NF-Y) is a trimeric complex that binds to the CCAAT box, a ubiquitous eukaryotic promoter element. The three subunits NF-YA, NF-YB and NF-YC are represented by single genes in yeast and mammals. However, in model plant species (Arabidopsis and rice) multiple genes encode each subunit providing the impetus for the investigation of the NF-Y transcription factor family in wheat. A total of 37 NF-Y and Dr1 genes (10 NF-YA, 11 NF-YB, 14 NF-YC and 2 Dr1) in Triticum aestivum were identified in the global DNA databases by computational analysis in this study. Each of the wheat NF-Y subunit families could be further divided into 4-5 clades based on their conserved core region sequences. Several conserved motifs outside of the NF-Y core regions were also identified by comparison of NF-Y members from wheat, rice and Arabidopsis. Quantitative RT-PCR analysis revealed that some of the wheat NF-Y genes were expressed ubiquitously, while others were expressed in an organ-specific manner. In particular, each TaNF-Y subunit family had members that were expressed predominantly in the endosperm. The expression of nine NF-Y and two Dr1 genes in wheat leaves appeared to be responsive to drought stress. Three of these genes were up-regulated under drought conditions, indicating that these members of the NF-Y and Dr1 families are potentially involved in plant drought adaptation. The combined expression and phylogenetic analyses revealed that members within the same phylogenetic clade generally shared a similar expression profile. Organ-specific expression and differential response to drought indicate a plant-specific biological role for various members of this transcription factor family.
Abstract:
The interaction of 10-hydroxycamptothecine (HCPT) with DNA under pseudo-physiological conditions (Tris-HCl buffer of pH 7.4), using ethidium bromide (EB) dye as a probe, was investigated with the use of spectrofluorimetry, UV-vis spectrometry and viscosity measurement. The binding constant and binding number for HCPT with DNA were evaluated as (7.1 ± 0.5) × 10⁴ M⁻¹ and 1.1, respectively, by multivariate curve resolution-alternating least squares (MCR-ALS). Moreover, parallel factor analysis (PARAFAC) was applied to resolve the three-way fluorescence data obtained from the interaction system, and the concentration information for the three components of the system at equilibrium was simultaneously obtained. It was found that there was a cooperative interaction between the HCPT-DNA complex and EB, which produced a ternary complex of HCPT-DNA-EB. © 2011 Elsevier B.V.
Abstract:
With the identification of common single locus point mutations as risk factors for thrombophilia, many DNA testing methodologies have been described for detecting these variations. Traditionally, functional or immunological testing methods have been used to investigate quantitative anticoagulant deficiencies. However, with the emergence of the genetic variations, factor V Leiden, prothrombin 20210 and, to a lesser extent, the methylene tetrahydrofolate reductase (MTHFR677) and factor V HR2 haplotype, traditional testing methodologies have proved to be less useful and instead DNA technology is more commonly employed in diagnostics. This review considers many of the DNA techniques that have proved to be useful in the detection of common genetic variants that predispose to thrombophilia. Techniques involving gel analysis are used to detect the presence or absence of restriction sites, electrophoretic mobility shifts, as in single strand conformation polymorphism or denaturing gradient gel electrophoresis, and product formation in allele-specific amplification. Such techniques may be sensitive, but are unwieldy and often need to be validated objectively. In order to overcome some of the limitations of gel analysis, especially when dealing with larger sample numbers, many alternative detection formats, such as closed tube systems, microplates and microarrays (minisequencing, real-time polymerase chain reaction, and oligonucleotide ligation assays) have been developed. In addition, many of the emerging technologies take advantage of colourimetric or fluorescence detection (including energy transfer) that allows qualitative and quantitative interpretation of results. With the large variety of DNA technologies available, the choice of methodology will depend on several factors including cost and the need for speed, simplicity and robustness. © 2000 Lippincott Williams & Wilkins.
Abstract:
We have previously reported the use of a novel mini-sequencing protocol for detection of the factor V Leiden variant, the first nucleotide change (FNC) technology. This technology is based on a single nucleotide extension of a primer, which is hybridized immediately adjacent to the site of mutation. The extended nucleotide that carries a reporter molecule (fluorescein) has the power to discriminate the genotype at the site of mutation. More recently, the prothrombin 20210 and thermolabile methylene tetrahydrofolate reductase (MTHFR) 677 variants have been identified as possible risk factors associated with thrombophilia. This study describes the use of the FNC technology in a combined assay to detect factor V, prothrombin and MTHFR variants in a population of Australian blood donors, and describes the objective numerical methodology used to determine genotype cut-off values for each genetic variation. Using FNC to test 500 normal blood donors, the incidence of Factor V Leiden was 3.6% (all heterozygous), that of prothrombin 20210 was 2.8% (all heterozygous) and that of MTHFR was 10% (homozygous). The combined FNC technology offers a simple, rapid, automatable DNA-based test for the detection of these three important mutations that are associated with familial thrombophilia. (C) 2000 Lippincott Williams and Wilkins.
Abstract:
Bioinformatics involves analyses of biological data such as DNA sequences, microarrays and protein-protein interaction (PPI) networks. Its two main objectives are the identification of genes or proteins and the prediction of their functions. Biological data often contain uncertain and imprecise information. Fuzzy theory provides useful tools to deal with this type of information, hence has played an important role in analyses of biological data. In this thesis, we aim to develop some new fuzzy techniques and apply them on DNA microarrays and PPI networks. We will focus on three problems: (1) clustering of microarrays; (2) identification of disease-associated genes in microarrays; and (3) identification of protein complexes in PPI networks. The first part of the thesis aims to detect, by the fuzzy C-means (FCM) method, clustering structures in DNA microarrays corrupted by noise. Because of the presence of noise, some clustering structures found in random data may not have any biological significance. In this part, we propose to combine the FCM with the empirical mode decomposition (EMD) for clustering microarray data. The purpose of EMD is to reduce, preferably to remove, the effect of noise, resulting in what is known as denoised data. We call this method the fuzzy C-means method with empirical mode decomposition (FCM-EMD). We applied this method on yeast and serum microarrays, and the silhouette values are used for assessment of the quality of clustering. The results indicate that the clustering structures of denoised data are more reasonable, implying that genes have tighter association with their clusters. Furthermore we found that the estimation of the fuzzy parameter m, which is a difficult step, can be avoided to some extent by analysing denoised microarray data. The second part aims to identify disease-associated genes from DNA microarray data which are generated under different conditions, e.g., patients and normal people. 
We developed a type-2 fuzzy membership (FM) function for identification of disease-associated genes. This approach is applied to diabetes and lung cancer data, and a comparison with the original FM test was carried out. Among the ten best-ranked genes of diabetes identified by the type-2 FM test, seven genes have been confirmed as diabetes-associated genes according to gene description information in GenBank and the published literature. An additional gene is further identified. Among the ten best-ranked genes identified in lung cancer data, seven are confirmed to be associated with lung cancer or its treatment. The type-2 FM-d values are significantly different, which makes the identifications more convincing than the original FM test. The third part of the thesis aims to identify protein complexes in large interaction networks. Identification of protein complexes is crucial to understand the principles of cellular organisation and to predict protein functions. In this part, we proposed a novel method which combines the fuzzy clustering method and interaction probability to identify the overlapping and non-overlapping community structures in PPI networks, then to detect protein complexes in these sub-networks. Our method is based on both the fuzzy relation model and the graph model. We applied the method to several PPI networks and compared it with a popular protein complex identification method, the clique percolation method. For the same data, we detected more protein complexes. We also applied our method to two social networks. The results showed our method works well for detecting sub-networks and gives a reasonable understanding of these communities.
Abstract:
Complex networks have been studied extensively due to their relevance to many real-world systems such as the world-wide web, the internet, biological and social systems. During the past two decades, studies of such networks in different fields have produced many significant results concerning their structures, topological properties, and dynamics. Three well-known properties of complex networks are scale-free degree distribution, small-world effect and self-similarity. The search for additional meaningful properties and the relationships among these properties is an active area of current research. This thesis investigates a newer aspect of complex networks, namely their multifractality, which is an extension of the concept of self-similarity. The first part of the thesis aims to confirm that the study of properties of complex networks can be expanded to a wider field including more complex weighted networks. Those real networks that have been shown to possess the self-similarity property in the existing literature are all unweighted networks. We use the protein-protein interaction (PPI) networks as a key example to show that their weighted networks inherit the self-similarity from the original unweighted networks. Firstly, we confirm that the random sequential box-covering algorithm is an effective tool to compute the fractal dimension of complex networks. This is demonstrated on the Homo sapiens and E. coli PPI networks as well as their skeletons. Our results verify that the fractal dimension of the skeleton is smaller than that of the original network because the shortest distance between nodes is larger in the skeleton, hence for a fixed box-size more boxes will be needed to cover the skeleton. Then we adopt the iterative scoring method to generate weighted PPI networks of five species, namely Homo sapiens, E. coli, yeast, C. elegans and Arabidopsis thaliana.
By using the random sequential box-covering algorithm, we calculate the fractal dimensions for both the original unweighted PPI networks and the generated weighted networks. The results show that self-similarity is still present in generated weighted PPI networks. This implication will be useful for our treatment of the networks in the third part of the thesis. The second part of the thesis aims to explore the multifractal behavior of different complex networks. Fractals such as the Cantor set, the Koch curve and the Sierpinski gasket are homogeneous since these fractals consist of a geometrical figure which repeats on an ever-reduced scale. Fractal analysis is a useful method for their study. However, real-world fractals are not homogeneous; there is rarely an identical motif repeated on all scales. Their singularity may vary on different subsets, implying that these objects are multifractal. Multifractal analysis is a useful way to systematically characterize the spatial heterogeneity of both theoretical and experimental fractal patterns. However, the tools for multifractal analysis of objects in Euclidean space are not suitable for complex networks. In this thesis, we propose a new box covering algorithm for multifractal analysis of complex networks. This algorithm is demonstrated in the computation of the generalized fractal dimensions of some theoretical networks, namely scale-free networks, small-world networks and random networks, and of one kind of real network, namely PPI networks of different species. Our main finding is the existence of multifractality in scale-free networks and PPI networks, while the multifractal behaviour is not confirmed for small-world networks and random networks. As another application, we generate gene interaction networks for patients and healthy people using the correlation coefficients between microarrays of different genes. Our results confirm the existence of multifractality in gene interaction networks.
This multifractal analysis then provides a potentially useful tool for gene clustering and identification. The third part of the thesis aims to investigate the topological properties of networks constructed from time series. Characterizing complicated dynamics from time series is a fundamental problem of continuing interest in a wide variety of fields. Recent works indicate that complex network theory can be a powerful tool to analyse time series. Many existing methods for transforming time series into complex networks share a common feature: they define the connectivity of a complex network by the mutual proximity of different parts (e.g., individual states, state vectors, or cycles) of a single trajectory. In this thesis, we propose a new method to construct networks of time series: we define nodes by vectors of a certain length in the time series, and weight of edges between any two nodes by the Euclidean distance between the corresponding two vectors. We apply this method to build networks for fractional Brownian motions, whose long-range dependence is characterised by their Hurst exponent. We verify the validity of this method by showing that time series with stronger correlation, hence larger Hurst exponent, tend to have smaller fractal dimension, hence smoother sample paths. We then construct networks via the technique of horizontal visibility graph (HVG), which has been widely used recently. We confirm a known linear relationship between the Hurst exponent of fractional Brownian motion and the fractal dimension of the corresponding HVG network. In the first application, we apply our newly developed box-covering algorithm to calculate the generalized fractal dimensions of the HVG networks of fractional Brownian motions as well as those for binomial cascades and five bacterial genomes. The results confirm the monoscaling of fractional Brownian motion and the multifractality of the rest. 
As an additional application, we discuss the resilience of networks constructed from time series via two different approaches: visibility graph and horizontal visibility graph. Our finding is that the degree distribution of VG networks of fractional Brownian motions is scale-free (i.e., having a power law) meaning that one needs to destroy a large percentage of nodes before the network collapses into isolated parts; while for HVG networks of fractional Brownian motions, the degree distribution has exponential tails, implying that HVG networks would not survive the same kind of attack.
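The horizontal visibility graph (HVG) construction used in the third part of this thesis can be sketched as below, using the standard HVG criterion: two time points i < j are linked when every value strictly between them lies below both endpoints. The function names and the edge-set representation are assumptions for illustration, and the sketch covers neither the box-covering algorithm nor the fractal dimension estimates.

```python
from collections import Counter


def horizontal_visibility_graph(series):
    """Return the HVG of a time series as a set of edges (i, j) with i < j."""
    edges = set()
    n = len(series)
    for i in range(n - 1):
        highest_between = float("-inf")  # max of series[i+1 .. j-1]
        for j in range(i + 1, n):
            if highest_between < min(series[i], series[j]):
                edges.add((i, j))
            highest_between = max(highest_between, series[j])
            if highest_between >= series[i]:
                # Nothing further to the right can be seen from i.
                break
    return edges


def degree_distribution(edges):
    """Map each degree k to the number of nodes that have degree k."""
    degree = Counter()
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    return Counter(degree.values())
```

Consecutive points are always connected, so the HVG of any series with at least two points is a connected graph; the shape of its degree distribution is what the resilience comparison in the abstract relies on.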
Abstract:
Background Bactrocera dorsalis s.s. is a pestiferous tephritid fruit fly distributed from Pakistan to the Pacific, with the Thai/Malay peninsula its southern limit. Sister pest taxa, B. papayae and B. philippinensis, occur in the southeast Asian archipelago and the Philippines, respectively. The relationship among these species is unclear due to their high molecular and morphological similarity. This study analysed population structure of these three species within a southeast Asian biogeographical context to assess potential dispersal patterns and the validity of their current taxonomic status. Results Geometric morphometric results generated from 15 landmarks for wings of 169 flies revealed significant differences in wing shape between almost all sites following canonical variate analysis. For the combined data set there was a greater isolation-by-distance (IBD) effect under a ‘non-Euclidean’ scenario which used geographical distances within a biogeographical ‘Sundaland context’ (r2 = 0.772, P < 0.0001) as compared to a ‘Euclidean’ scenario for which direct geographic distances between sample sites were used (r2 = 0.217, P < 0.01). COI sequence data were obtained for 156 individuals and yielded 83 unique haplotypes with no correlation to current taxonomic designations via a minimum spanning network. BEAST analysis provided a root age and location of 540kya in northern Thailand, with migration of B. dorsalis s.l. into Malaysia 470kya and Sumatra 270kya. Two migration events into the Philippines are inferred. Sequence data revealed a weak but significant IBD effect under the ‘non-Euclidean’ scenario (r2 = 0.110, P < 0.05), with no historical migration evident between Taiwan and the Philippines. Results are consistent with those expected at the intra-specific level. Conclusions Bactrocera dorsalis s.s., B. papayae and B. philippinensis likely represent one species structured around the South China Sea, having migrated from northern Thailand into the southeast Asian archipelago and across into the Philippines. No migration is apparent between the Philippines and Taiwan. This information has implications for quarantine, trade and pest management.
Abstract:
Genetic variation at allozyme and mitochondrial DNA loci was investigated in the Australian lungfish, Neoceratodus forsteri Krefft 1870. Tissue samples for genetic analysis were taken non-lethally from 278 individuals representing two spatially distinct endemic populations (Mary and Burnett rivers), as well as one population thought to be derived from an anthropogenic translocation in the 1890s (Brisbane river). Two of 24 allozyme loci resolved from muscle tissue were polymorphic. Mitochondrial DNA nucleotide sequence diversity estimated across 2,235 base pairs in each of 40 individuals ranged between 0.000423 and 0.001470 per river. Low genetic variation at allozyme and mitochondrial loci could be attributed to population bottlenecks, possibly induced by Pleistocene aridity. Limited genetic differentiation was detected among rivers using nuclear and mitochondrial markers, suggesting that admixture may have occurred between the endemic Mary and Burnett populations during periods of low sea level when the drainages may have converged before reaching the ocean. Genetic data were consistent with the explanation that lungfish were introduced to the Brisbane river from the Mary river. Further research using more variable genetic loci is needed before the conservation status of populations can be determined, particularly as anthropogenic demands on lungfish habitat are increasing. In the interim we recommend a management strategy aimed at conserving existing genetic variation within and between rivers.
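Nucleotide sequence diversity figures such as those quoted above are commonly computed as the average number of pairwise differences per site among aligned sequences. The sketch below shows that calculation under the assumption of equal-length, gap-free alignments; the abstract does not state which estimator or software the authors used, and the function name is hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs):
    """Average pairwise differences per site across aligned sequences."""
    if len(aligned_seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned_seqs[0])
    if length == 0 or any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to the same non-zero length")
    # Count mismatching sites for every unordered pair of sequences.
    pair_diffs = [
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(aligned_seqs, 2)
    ]
    return sum(pair_diffs) / (len(pair_diffs) * length)
```

With values on the order of 0.0004 to 0.0015 over 2,235 base pairs, two randomly chosen individuals from the same river would differ at roughly one to three sites on average.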
Abstract:
Siamese mud carp (Henicorhynchus siamensis) is a freshwater teleost of high economic importance in the Mekong River Basin. However, genetic data relevant for delineating wild stocks for management purposes currently are limited for this species. Here, we used 454 pyrosequencing to generate a partial genome survey sequence (GSS) dataset to develop simple sequence repeat (SSR) markers from H. siamensis genomic DNA. Data generated included a total of 65,954 sequence reads with average length of 264 nucleotides, of which 2.79% contain SSR motifs. Based on GSS-BLASTx results, 10.5% of contigs and 8.1% of singletons possessed significant similarity (E value < 10⁻⁵) with the majority matching well to reported fish sequences. KEGG analysis identified several metabolic pathways that provide insights into specific potential roles and functions of sequences involved in molecular processes in H. siamensis. Top protein domains detected included reverse transcriptase and the top putative functional transcript identified was an ORF2-encoded protein. One thousand eight hundred and thirty seven sequences containing SSR motifs were identified, of which 422 qualified for primer design and eight polymorphic loci have been tested with average observed and expected heterozygosity estimated at 0.75 and 0.83, respectively. Regardless of their relative levels of polymorphism and heterozygosity, microsatellite loci developed here are suitable for further population genetic studies in H. siamensis and may also be applicable to other related taxa.
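Two steps mentioned in this abstract lend themselves to a short sketch: scanning reads for SSR motifs and computing observed and expected heterozygosity at a locus. The motif length range (2 to 6 bp), the minimum of five repeats, and the function names are assumptions for illustration; the abstract does not give the authors' search criteria or the software they used, and the expected heterozygosity here is the simple 1 minus the sum of squared allele frequencies, without a small-sample correction.

```python
import re
from collections import Counter


def find_ssrs(seq, min_repeats=5):
    """Return (start, motif, repeat_count) for perfect 2-6 bp tandem repeats."""
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


def heterozygosity(genotypes):
    """Observed and expected heterozygosity from diploid (allele, allele) pairs."""
    n = len(genotypes)
    observed = sum(a != b for a, b in genotypes) / n
    allele_counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    expected = 1.0 - sum((c / total) ** 2 for c in allele_counts.values())
    return observed, expected
```

A dedicated microsatellite finder would also handle imperfect and compound repeats; the regular expression above only detects perfect tandem repeats.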
Abstract:
Background Human papillomavirus (HPV) is the aetiological agent for cervical cancer and genital warts. Concurrent HPV and HIV infection in the South African population is high. HIV positive (+) women are often infected with multiple, rare and undetermined HPV types. Data on HPV incidence and genotype distribution are based on commercial HPV detection kits, but these kits may not detect all HPV types in HIV+ women. The objectives of this study were to (i) identify the HPV types not detected by commercial genotyping kits present in a cervical specimen from an HIV positive South African woman using next generation sequencing, and (ii) determine if these types were prevalent in a cohort of HIV-infected South African women. Methods Total DNA was isolated from 109 cervical specimens from South African HIV+ women. A specimen within this cohort representing a complex multiple HPV infection, with 12 HPV genotypes detected by the Roche Linear Array HPV genotyping (LA) kit, was selected for next generation sequencing analysis. All HPV types present in this cervical specimen were identified by Illumina sequencing of the extracted DNA following rolling circle amplification. The prevalence of the HPV types identified by sequencing, but not included in the Roche LA, was then determined in the 109 HIV positive South African women by type-specific PCR. Results Illumina sequencing identified a total of 16 HPV genotypes in the selected specimen, with four genotypes (HPV-30, 74, 86 and 90) not included in the commercial kit. The prevalences of HPV-30, 74, 86 and 90 in 109 HIV positive South African women were found to be 14.6 %, 12.8 %, 4.6 % and 8.3 % respectively. Conclusions Our results indicate that there are HPV types, with substantial prevalence, in HIV positive women not being detected in molecular epidemiology studies using commercial kits. The significance of these types in relation to cervical disease remains to be investigated.