40 results for microarrays
in Queensland University of Technology - ePrints Archive
Abstract:
Bioinformatics involves analyses of biological data such as DNA sequences, microarrays and protein-protein interaction (PPI) networks. Its two main objectives are the identification of genes or proteins and the prediction of their functions. Biological data often contain uncertain and imprecise information. Fuzzy theory provides useful tools to deal with this type of information, hence has played an important role in analyses of biological data. In this thesis, we aim to develop some new fuzzy techniques and apply them on DNA microarrays and PPI networks. We will focus on three problems: (1) clustering of microarrays; (2) identification of disease-associated genes in microarrays; and (3) identification of protein complexes in PPI networks. The first part of the thesis aims to detect, by the fuzzy C-means (FCM) method, clustering structures in DNA microarrays corrupted by noise. Because of the presence of noise, some clustering structures found in random data may not have any biological significance. In this part, we propose to combine the FCM with the empirical mode decomposition (EMD) for clustering microarray data. The purpose of EMD is to reduce, preferably to remove, the effect of noise, resulting in what is known as denoised data. We call this method the fuzzy C-means method with empirical mode decomposition (FCM-EMD). We applied this method on yeast and serum microarrays, and the silhouette values are used for assessment of the quality of clustering. The results indicate that the clustering structures of denoised data are more reasonable, implying that genes have tighter association with their clusters. Furthermore we found that the estimation of the fuzzy parameter m, which is a difficult step, can be avoided to some extent by analysing denoised microarray data. The second part aims to identify disease-associated genes from DNA microarray data which are generated under different conditions, e.g., patients and normal people. 
We developed a type-2 fuzzy membership (FM) function for identification of disease-associated genes. This approach is applied to diabetes and lung cancer data, and a comparison with the original FM test was carried out. Among the ten best-ranked genes of diabetes identified by the type-2 FM test, seven genes have been confirmed as diabetes-associated genes according to gene description information in Gene Bank and the published literature. An additional gene is further identified. Among the ten best-ranked genes identified in lung cancer data, seven are confirmed to be associated with lung cancer or its treatment. The type-2 FM-d values are significantly different, which makes the identifications more convincing than those of the original FM test. The third part of the thesis aims to identify protein complexes in large interaction networks. Identification of protein complexes is crucial to understand the principles of cellular organisation and to predict protein functions. In this part, we proposed a novel method which combines the fuzzy clustering method and interaction probability to identify the overlapping and non-overlapping community structures in PPI networks, and then to detect protein complexes in these sub-networks. Our method is based on both the fuzzy relation model and the graph model. We applied the method to several PPI networks and compared it with a popular protein complex identification method, the clique percolation method. For the same data, we detected more protein complexes. We also applied our method to two social networks. The results showed that our method works well for detecting sub-networks and gives a reasonable understanding of these communities.
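The fuzzy C-means step mentioned in the abstract above can be illustrated with a minimal sketch of the standard FCM update rules. This is not the thesis's FCM-EMD method (no empirical mode decomposition or denoising is performed); the function name `fuzzy_c_means`, its parameters and the defaults are assumptions chosen for illustration, with `m` playing the role of the fuzzy parameter m.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means on a list of equal-length numeric tuples.

    Hypothetical helper for illustration only. Returns (centers, U),
    where U[i][k] is the membership of point i in cluster k.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships; each row is normalised to sum to 1.
    U = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        U.append([u / total for u in row])
    centers = []
    for _ in range(n_iter):
        # Centres are means weighted by membership raised to the power m.
        centers = []
        for k in range(c):
            w = [U[i][k] ** m for i in range(len(points))]
            total = sum(w)
            centers.append(
                [sum(wi * p[j] for wi, p in zip(w, points)) / total
                 for j in range(dim)]
            )
        # Memberships are proportional to distance^(-2 / (m - 1)).
        new_U = []
        for p in points:
            inv = [max(math.dist(p, ctr), 1e-12) ** (-2.0 / (m - 1.0))
                   for ctr in centers]
            total = sum(inv)
            new_U.append([v / total for v in inv])
        change = max(abs(a - b)
                     for ra, rb in zip(new_U, U) for a, b in zip(ra, rb))
        U = new_U
        if change < tol:
            break
    return centers, U
```

Each row of `U` sums to 1, so a gene can belong partially to several clusters; taking the largest membership in each row gives a hard assignment that could then be scored, for example with silhouette values.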
Abstract:
Cell line array (CMA) and tissue microarray (TMA) technologies are high-throughput methods for analysing both the abundance and distribution of gene expression in a panel of cell lines or multiple tissue specimens in an efficient and cost-effective manner. The process is based on Kononen's method of extracting a cylindrical core of paraffin-embedded donor tissue and inserting it into a recipient paraffin block. Donor tissue from surgically resected paraffin-embedded tissue blocks, frozen needle biopsies or cell line pellets can all be arrayed in the recipient block. The representative area of interest is identified and circled on a haematoxylin and eosin (H&E)-stained section of the donor block. Using a predesigned map showing a precise spacing pattern, a high density array of up to 1,000 cores of cell pellets and/or donor tissue can be embedded into the recipient block using a tissue arrayer from Beecher Instruments. Depending on the depth of the cell line/tissue removed from the donor block 100-300 consecutive sections can be cut from each CMA/TMA block. Sections can be stained for in situ detection of protein, DNA or RNA targets using immunohistochemistry (IHC), fluorescent in situ hybridisation (FISH) or mRNA in situ hybridisation (RNA-ISH), respectively. This chapter provides detailed methods for CMA/TMA design, construction and analysis with in-depth notes on all technical aspects including tips to deal with common pitfalls the user may encounter. © Springer Science+Business Media, LLC 2011.
Abstract:
The on-demand printing of living cells using inkjet technologies has recently been demonstrated and allows for the controlled deposition of cells in microarrays. Here, we show that such arrays can be interrogated directly by robot-controlled liquid microextraction coupled with chip-based nanoelectrospray mass spectrometry. Such automated analyses generate a profile of abundant membrane lipids that are characteristic of cell type. Significantly, the spatial control in both deposition and extraction steps combined with the sensitivity of the mass spectrometric detection allows for robust molecular profiling of individual cells. © 2012 American Chemical Society.
Abstract:
The practice of medicine has always aimed at individualized treatment of disease. The relationship between patient and physician has always been a personal one, and the physician's choice of treatment has been intended to be the best fit for the patient's needs. The necessary pooling/grouping of disease families and their assignment to a number of drugs or treatment methods has, consequently, led to an increase in the number of effective therapies. However, given the heterogeneity of most human diseases, and cancer specifically, it is currently impossible for the treating clinician to effectively predict a patient's response and outcome based on current technologies, much less the idiosyncratic resistances and adverse effects associated with the limited therapeutic options.
Abstract:
While genomics provides important information about the somatic genetic changes, and RNA transcript profiling can reveal important expression changes that correlate with outcome and response to therapy, it is the proteins that do the work in the cell. At a functional level, derangements within the proteome, driven by post-translational and epigenetic modifications, such as phosphorylation, are the cause of a vast majority of human diseases. Cancer, for instance, is a manifestation of deranged cellular protein molecular networks and cell signaling pathways that are based on genetic changes at the DNA level. Importantly, the protein pathways contain the drug targets in signaling networks that govern overall cellular survival, proliferation, invasion and cell death. Consequently, the promise of proteomics resides in the ability to extend analysis beyond correlation to causality. A critical gap in the information knowledge base of molecular profiling is an understanding of the ongoing activity of protein signaling in human tissue: what is activated and “in use” within the human body at any given point in time. To address this gap, we have invented a new technology, called reverse phase protein microarrays, that can generate a functional read-out of cell signaling networks or pathways for an individual patient obtained directly from a biopsy specimen. This “wiring diagram” can serve as the basis for both selection of a therapy and patient stratification.
Abstract:
Cancer can be defined as a deregulation or hyperactivity in the ongoing network of intracellular and extracellular signaling events. Reverse phase protein microarray technology may offer a new opportunity to measure and profile these signaling pathways, providing data on post-translational phosphorylation events not obtainable by gene microarray analysis. Treatment of ovarian epithelial carcinoma almost always takes place in a metastatic setting since unfortunately the disease is often not detected until later stages. Thus, in addition to elucidation of the molecular network within a tumor specimen, critical questions are to what extent do signaling changes occur upon metastasis and are there common pathway elements that arise in the metastatic microenvironment. For individualized combinatorial therapy, ideal therapeutic selection based on proteomic mapping of phosphorylation end points may require evaluation of the patient's metastatic tissue. Extending these findings to the bedside will require the development of optimized protocols and reference standards. We have developed a reference standard based on a mixture of phosphorylated peptides to begin to address this challenge.
Abstract:
Melanoma is one of the most aggressive cancers affecting humans. Although early melanomas are curable with surgical excision, metastatic melanomas are associated with high mortality. The mechanism of melanoma development, progression, and metastasis is largely unknown. In order to uncover genes unique to melanoma cells, we used high-density DNA microarrays to examine the gene expression profiles of metastatic melanoma nodules using benign nevi as controls. Over 190 genes were significantly overexpressed in metastatic melanomas compared with normal nevi by at least 2-fold. One of the most abundantly expressed genes in metastatic melanoma nodules is osteopontin (OPN). Immunohistochemistry staining on tissue microarrays and individual skin biopsies representing different stages of melanoma progression revealed that OPN expression is first acquired at the step of melanoma tissue invasion. In addition, blocking of OPN expression by RNA interference reduced melanoma cell numbers in vitro. Our observations suggest that OPN may be acquired early in melanoma development and progression, and may enhance tumor cell growth in invasive melanoma.
Abstract:
Background: The aim of this study is to seek an association between markers of metastatic potential, drug resistance-related protein and monocarboxylate transporters in prostate cancer (CaP). Methods: We evaluated the expression of invasive markers (CD147, CD44v3-10), drug-resistance protein (MDR1) and monocarboxylate transporters (MCT1 and MCT4) in CaP metastatic cell lines and CaP tissue microarrays (n=140) by immunostaining. The co-expression of CD147 and CD44v3-10 with that of MDR1, MCT1 and MCT4 in CaP cell lines was evaluated using confocal microscopy. The relationship between the expression of CD147 and CD44v3-10 and the sensitivity (IC50) to docetaxel in CaP cell lines was assessed using MTT assay. The relationship between expression of CD44v3-10, MDR1 and MCT4 and various clinicopathological CaP progression parameters was examined. Results: CD147 and CD44v3-10 were co-expressed with MDR1, MCT1 and MCT4 in primary and metastatic CaP cells. Both CD147 and CD44v3-10 expression levels were inversely related to docetaxel sensitivity (IC50) in metastatic CaP cell lines. Overexpression of CD44v3-10, MDR1 and MCT4 was found in most primary CaP tissues, and was significantly associated with CaP progression. Conclusions: Our results suggest that the overexpression of CD147, CD44v3-10, MDR1 and MCT4 is associated with CaP progression. Expression of both CD147 and CD44v3-10 is correlated with drug resistance during CaP metastasis and could be a useful potential therapeutic target in advanced disease.
Abstract:
With the identification of common single locus point mutations as risk factors for thrombophilia, many DNA testing methodologies have been described for detecting these variations. Traditionally, functional or immunological testing methods have been used to investigate quantitative anticoagulant deficiencies. However, with the emergence of the genetic variations, factor V Leiden, prothrombin 20210 and, to a lesser extent, the methylene tetrahydrofolate reductase (MTHFR677) and factor V HR2 haplotype, traditional testing methodologies have proved to be less useful and instead DNA technology is more commonly employed in diagnostics. This review considers many of the DNA techniques that have proved to be useful in the detection of common genetic variants that predispose to thrombophilia. Techniques involving gel analysis are used to detect the presence or absence of restriction sites, electrophoretic mobility shifts, as in single strand conformation polymorphism or denaturing gradient gel electrophoresis, and product formation in allele-specific amplification. Such techniques may be sensitive, but are unwieldy and often need to be validated objectively. In order to overcome some of the limitations of gel analysis, especially when dealing with larger sample numbers, many alternative detection formats, such as closed tube systems, microplates and microarrays (minisequencing, real-time polymerase chain reaction, and oligonucleotide ligation assays) have been developed. In addition, many of the emerging technologies take advantage of colourimetric or fluorescence detection (including energy transfer) that allows qualitative and quantitative interpretation of results. With the large variety of DNA technologies available, the choice of methodology will depend on several factors including cost and the need for speed, simplicity and robustness. © 2000 Lippincott Williams & Wilkins.
Abstract:
The tumor suppressor PTEN antagonizes phosphatidylinositol 3-kinase (PI3K), which contributes to tumorigenesis in many cancer types. While PTEN mutations occur in some melanomas, their precise mechanistic consequences have yet to be elucidated. We sought to identify novel downstream effectors of PI3K using a combination of genomic and functional tests. Microarray analysis of 53 melanoma cell lines identified 610 genes differentially expressed (P<0.05) between wild-type lines and those with PTEN aberrations. Many of these genes are known to be involved in the PI3K pathway and other signaling pathways influenced by PTEN. Validation of differential gene expression by qRT-PCR was performed in the original 53 cell lines and an independent set of 18 melanoma lines with known PTEN status. Osteopontin (OPN), a secreted glycophosphoprotein that contributes to tumor progression, was more abundant at both the mRNA and protein level in PTEN mutants. The inverse correlation between OPN and PTEN expression was validated (P<0.02) by immunohistochemistry using melanoma tissue microarrays. Finally, treatment of cell lines with the PI3K inhibitor LY294002 caused a reduction in expression of OPN. These data indicate that OPN acts downstream of PI3K in melanoma and provide insight into how PTEN loss contributes to melanoma development.
Abstract:
Although germline mutations in CDKN2A are present in approximately 25% of large multicase melanoma families, germline mutations are much rarer in the smaller melanoma families that make up most individuals reporting a family history of this disease. In addition, only three families worldwide have been reported with germline mutations in a gene other than CDKN2A (i.e., CDK4). Accordingly, current genomewide scans underway at the National Human Genome Research Institute hope to reveal linkage to one or more chromosomal regions, and ultimately lead to the identification of novel genes involved in melanoma predisposition. Both CDKN2A and PTEN have been identified as genes involved in sporadic melanoma development; however, mutations are more common in cell lines than uncultured tumors. A combination of cytogenetic, molecular, and functional studies suggests that additional genes involved in melanoma development are located to chromosomal regions 1p, 6q, 7p, 11q, and possibly also 9p and 10q. With the near completion of the human genome sequencing effort, combined with the advent of high throughput mutation analyses and new techniques including cDNA and tissue microarrays, the identification and characterization of additional genes involved in melanoma pathogenesis seem likely in the near future.
Abstract:
Complex networks have been studied extensively due to their relevance to many real-world systems such as the world-wide web, the internet, biological and social systems. During the past two decades, studies of such networks in different fields have produced many significant results concerning their structures, topological properties, and dynamics. Three well-known properties of complex networks are scale-free degree distribution, small-world effect and self-similarity. The search for additional meaningful properties and the relationships among these properties is an active area of current research. This thesis investigates a newer aspect of complex networks, namely their multifractality, which is an extension of the concept of self-similarity. The first part of the thesis aims to confirm that the study of properties of complex networks can be expanded to a wider field including more complex weighted networks. Those real networks that have been shown to possess the self-similarity property in the existing literature are all unweighted networks. We use the protein-protein interaction (PPI) networks as a key example to show that their weighted networks inherit the self-similarity from the original unweighted networks. Firstly, we confirm that the random sequential box-covering algorithm is an effective tool to compute the fractal dimension of complex networks. This is demonstrated on the Homo sapiens and E. coli PPI networks as well as their skeletons. Our results verify that the fractal dimension of the skeleton is smaller than that of the original network because the shortest distance between nodes is larger in the skeleton, hence for a fixed box-size more boxes will be needed to cover the skeleton. Then we adopt the iterative scoring method to generate weighted PPI networks of five species, namely Homo sapiens, E. coli, yeast, C. elegans and Arabidopsis thaliana.
By using the random sequential box-covering algorithm, we calculate the fractal dimensions for both the original unweighted PPI networks and the generated weighted networks. The results show that self-similarity is still present in generated weighted PPI networks. This implication will be useful for our treatment of the networks in the third part of the thesis. The second part of the thesis aims to explore the multifractal behavior of different complex networks. Fractals such as the Cantor set, the Koch curve and the Sierpinski gasket are homogeneous since these fractals consist of a geometrical figure which repeats on an ever-reduced scale. Fractal analysis is a useful method for their study. However, real-world fractals are not homogeneous; there is rarely an identical motif repeated on all scales. Their singularity may vary on different subsets, implying that these objects are multifractal. Multifractal analysis is a useful way to systematically characterize the spatial heterogeneity of both theoretical and experimental fractal patterns. However, the tools for multifractal analysis of objects in Euclidean space are not suitable for complex networks. In this thesis, we propose a new box-covering algorithm for multifractal analysis of complex networks. This algorithm is demonstrated in the computation of the generalized fractal dimensions of some theoretical networks, namely scale-free networks, small-world networks and random networks, and of one kind of real network, namely PPI networks of different species. Our main finding is the existence of multifractality in scale-free networks and PPI networks, while the multifractal behaviour is not confirmed for small-world networks and random networks. As another application, we generate gene interaction networks for patients and healthy people using the correlation coefficients between microarrays of different genes. Our results confirm the existence of multifractality in gene interaction networks.
This multifractal analysis then provides a potentially useful tool for gene clustering and identification. The third part of the thesis aims to investigate the topological properties of networks constructed from time series. Characterizing complicated dynamics from time series is a fundamental problem of continuing interest in a wide variety of fields. Recent works indicate that complex network theory can be a powerful tool to analyse time series. Many existing methods for transforming time series into complex networks share a common feature: they define the connectivity of a complex network by the mutual proximity of different parts (e.g., individual states, state vectors, or cycles) of a single trajectory. In this thesis, we propose a new method to construct networks of time series: we define nodes by vectors of a certain length in the time series, and weight of edges between any two nodes by the Euclidean distance between the corresponding two vectors. We apply this method to build networks for fractional Brownian motions, whose long-range dependence is characterised by their Hurst exponent. We verify the validity of this method by showing that time series with stronger correlation, hence larger Hurst exponent, tend to have smaller fractal dimension, hence smoother sample paths. We then construct networks via the technique of horizontal visibility graph (HVG), which has been widely used recently. We confirm a known linear relationship between the Hurst exponent of fractional Brownian motion and the fractal dimension of the corresponding HVG network. In the first application, we apply our newly developed box-covering algorithm to calculate the generalized fractal dimensions of the HVG networks of fractional Brownian motions as well as those for binomial cascades and five bacterial genomes. The results confirm the monoscaling of fractional Brownian motion and the multifractality of the rest. 
As an additional application, we discuss the resilience of networks constructed from time series via two different approaches: visibility graph and horizontal visibility graph. Our finding is that the degree distribution of VG networks of fractional Brownian motions is scale-free (i.e., having a power law) meaning that one needs to destroy a large percentage of nodes before the network collapses into isolated parts; while for HVG networks of fractional Brownian motions, the degree distribution has exponential tails, implying that HVG networks would not survive the same kind of attack.
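The horizontal visibility graph (HVG) construction used in the abstract above can be sketched as follows, assuming the usual definition in which two time points are linked when every value strictly between them is lower than both. The function names `horizontal_visibility_graph` and `degrees` are hypothetical, and this is only an illustration of the graph construction, not the thesis's box-covering or resilience analysis.

```python
def horizontal_visibility_graph(series):
    """Return the HVG of a numeric sequence as a set of index pairs (i, j), i < j.

    Nodes i and j are linked when every value strictly between them is
    lower than both series[i] and series[j].
    """
    edges = set()
    n = len(series)
    for i in range(n - 1):
        highest_between = float("-inf")
        for j in range(i + 1, n):
            if highest_between < min(series[i], series[j]):
                edges.add((i, j))
            highest_between = max(highest_between, series[j])
            if highest_between >= series[i]:
                break  # node i cannot see past a value at least as high as itself
    return edges


def degrees(edges, n):
    """Count the degree of each of the n nodes in an undirected edge set."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg


example = [3, 1, 2, 4]
example_edges = horizontal_visibility_graph(example)
print(sorted(example_edges))  # [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]
print(degrees(example_edges, len(example)))  # [3, 2, 3, 2]
```

The degree list is the quantity whose distribution the abstract describes as having exponential tails for HVG networks of fractional Brownian motions; a histogram of `degrees(...)` over a long series would be the natural next step.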