542 results for Microarrays
Abstract:
Bioinformatics involves analyses of biological data such as DNA sequences, microarrays and protein-protein interaction (PPI) networks. Its two main objectives are the identification of genes or proteins and the prediction of their functions. Biological data often contain uncertain and imprecise information. Fuzzy theory provides useful tools to deal with this type of information and has therefore played an important role in analyses of biological data. In this thesis, we aim to develop some new fuzzy techniques and apply them to DNA microarrays and PPI networks. We focus on three problems: (1) clustering of microarrays; (2) identification of disease-associated genes in microarrays; and (3) identification of protein complexes in PPI networks. The first part of the thesis aims to detect, by the fuzzy C-means (FCM) method, clustering structures in DNA microarrays corrupted by noise. Because of the presence of noise, some clustering structures found in random data may not have any biological significance. In this part, we propose to combine the FCM with the empirical mode decomposition (EMD) for clustering microarray data. The purpose of EMD is to reduce, preferably to remove, the effect of noise, resulting in what is known as denoised data. We call this method the fuzzy C-means method with empirical mode decomposition (FCM-EMD). We applied this method to yeast and serum microarrays, and silhouette values were used to assess the quality of clustering. The results indicate that the clustering structures of denoised data are more reasonable, implying that genes have tighter association with their clusters. Furthermore, we found that the estimation of the fuzzy parameter m, which is a difficult step, can be avoided to some extent by analysing denoised microarray data. The second part aims to identify disease-associated genes from DNA microarray data generated under different conditions, e.g., patients and normal people.
We developed a type-2 fuzzy membership (FM) function for identification of disease-associated genes. This approach was applied to diabetes and lung cancer data, and a comparison with the original FM test was carried out. Among the ten best-ranked diabetes genes identified by the type-2 FM test, seven have been confirmed as diabetes-associated genes according to gene description information in Gene Bank and the published literature. An additional gene is further identified. Among the ten best-ranked genes identified in lung cancer data, seven are confirmed to be associated with lung cancer or its treatment. The type-2 FM-d values are significantly different, which makes the identifications more convincing than those of the original FM test. The third part of the thesis aims to identify protein complexes in large interaction networks. Identification of protein complexes is crucial to understand the principles of cellular organisation and to predict protein functions. In this part, we propose a novel method which combines the fuzzy clustering method and interaction probability to identify the overlapping and non-overlapping community structures in PPI networks, and then to detect protein complexes in these sub-networks. Our method is based on both the fuzzy relation model and the graph model. We applied the method to several PPI networks and compared it with a popular protein complex identification method, the clique percolation method. For the same data, we detected more protein complexes. We also applied our method to two social networks. The results showed that our method works well for detecting sub-networks and gives a reasonable understanding of these communities.
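The fuzzy C-means step mentioned in the first part of this abstract can be illustrated with a minimal sketch. This is the textbook FCM update only, not the thesis's FCM-EMD method: the EMD denoising step is omitted, and the function name `fuzzy_c_means`, its parameters and the default fuzzy parameter `m=2.0` are assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy C-means (hypothetical helper, not the thesis code).

    X is an (n_samples, n_features) array; returns (centers, memberships).
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the samples.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distance from every sample to every center (floored to avoid 1/0).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)
        # Membership update: u_ik = d_ik^(-2/(m-1)) / sum_j d_ij^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    # Recompute centers so they match the final memberships.
    W = U ** m
    centers = (W.T @ X) / W.sum(axis=0)[:, None]
    return centers, U
```

Each row of the returned membership matrix sums to 1, and taking the largest membership per row gives a hard cluster assignment. That assignment could then be scored with silhouette values, as the abstract describes.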
Abstract:
Cell line microarray (CMA) and tissue microarray (TMA) technologies are high-throughput methods for analysing both the abundance and distribution of gene expression in a panel of cell lines or multiple tissue specimens in an efficient and cost-effective manner. The process is based on Kononen's method of extracting a cylindrical core of paraffin-embedded donor tissue and inserting it into a recipient paraffin block. Donor tissue from surgically resected paraffin-embedded tissue blocks, frozen needle biopsies or cell line pellets can all be arrayed in the recipient block. The representative area of interest is identified and circled on a haematoxylin and eosin (H&E)-stained section of the donor block. Using a predesigned map showing a precise spacing pattern, a high-density array of up to 1,000 cores of cell pellets and/or donor tissue can be embedded into the recipient block using a tissue arrayer from Beecher Instruments. Depending on the depth of the cell line/tissue removed from the donor block, 100-300 consecutive sections can be cut from each CMA/TMA block. Sections can be stained for in situ detection of protein, DNA or RNA targets using immunohistochemistry (IHC), fluorescent in situ hybridisation (FISH) or mRNA in situ hybridisation (RNA-ISH), respectively. This chapter provides detailed methods for CMA/TMA design, construction and analysis, with in-depth notes on all technical aspects, including tips for dealing with common pitfalls the user may encounter. © Springer Science+Business Media, LLC 2011.
Abstract:
The on-demand printing of living cells using inkjet technologies has recently been demonstrated and allows for the controlled deposition of cells in microarrays. Here, we show that such arrays can be interrogated directly by robot-controlled liquid microextraction coupled with chip-based nanoelectrospray mass spectrometry. Such automated analyses generate a profile of abundant membrane lipids that are characteristic of cell type. Significantly, the spatial control in both deposition and extraction steps combined with the sensitivity of the mass spectrometric detection allows for robust molecular profiling of individual cells. © 2012 American Chemical Society.
Abstract:
The practice of medicine has always aimed at individualized treatment of disease. The relationship between patient and physician has always been a personal one, and the physician's choice of treatment has been intended to be the best fit for the patient's needs. The necessary pooling/grouping of disease families and their assignment to a number of drugs or treatment methods has, consequently, led to an increase in the number of effective therapies. However, given the heterogeneity of most human diseases, and cancer specifically, it is currently impossible for the treating clinician to effectively predict a patient's response and outcome based on current technologies, much less the idiosyncratic resistances and adverse effects associated with the limited therapeutic options.
Abstract:
While genomics provides important information about somatic genetic changes, and RNA transcript profiling can reveal important expression changes that correlate with outcome and response to therapy, it is the proteins that do the work in the cell. At a functional level, derangements within the proteome, driven by post-translational and epigenetic modifications such as phosphorylation, are the cause of a vast majority of human diseases. Cancer, for instance, is a manifestation of deranged cellular protein molecular networks and cell signaling pathways that are based on genetic changes at the DNA level. Importantly, the protein pathways contain the drug targets in signaling networks that govern overall cellular survival, proliferation, invasion and cell death. Consequently, the promise of proteomics resides in the ability to extend analysis beyond correlation to causality. A critical gap in the information knowledge base of molecular profiling is an understanding of the ongoing activity of protein signaling in human tissue: what is activated and “in use” within the human body at any given point in time. To address this gap, we have invented a new technology, called reverse phase protein microarrays, that can generate a functional read-out of cell signaling networks or pathways for an individual patient, obtained directly from a biopsy specimen. This “wiring diagram” can serve as the basis for both selection of a therapy and patient stratification.
Abstract:
Cancer can be defined as a deregulation or hyperactivity in the ongoing network of intracellular and extracellular signaling events. Reverse phase protein microarray technology may offer a new opportunity to measure and profile these signaling pathways, providing data on post-translational phosphorylation events not obtainable by gene microarray analysis. Treatment of ovarian epithelial carcinoma almost always takes place in a metastatic setting, since the disease is often not detected until later stages. Thus, in addition to elucidation of the molecular network within a tumor specimen, the critical questions are to what extent signaling changes occur upon metastasis and whether common pathway elements arise in the metastatic microenvironment. For individualized combinatorial therapy, ideal therapeutic selection based on proteomic mapping of phosphorylation end points may require evaluation of the patient's metastatic tissue. Extending these findings to the bedside will require the development of optimized protocols and reference standards. We have developed a reference standard based on a mixture of phosphorylated peptides to begin to address this challenge.
Abstract:
Microarrays have a wide range of applications in the biomedical field. From the beginning, arrays have mostly been utilized in cancer research, including classification of tumors into different subgroups and identification of clinical associations. In the microarray format, a collection of small features, such as different oligonucleotides, is attached to a solid support. The advantage of microarray technology is the ability to simultaneously measure changes in the levels of multiple biomolecules. Because many diseases, including cancer, are complex, involving an interplay between various genes and environmental factors, the detection of only a single marker molecule is usually insufficient for determining disease status. Thus, a technique that simultaneously collects information on multiple molecules allows better insights into a complex disease. Since microarrays can be custom-manufactured or obtained from a number of commercial providers, understanding data quality and comparability between different platforms is important to enable the use of the technology in areas beyond basic research. When standardized, integrated array data could ultimately help to offer a complete profile of the disease, illuminating mechanisms and genes behind disorders as well as facilitating disease diagnostics. In the first part of this work, we aimed to elucidate the comparability of gene expression measurements from different oligonucleotide and cDNA microarray platforms. We compared three different gene expression microarrays: one was a commercial oligonucleotide microarray, and the others were commercial and custom-made cDNA microarrays. The filtered gene expression data from the commercial platforms correlated better across experiments (r=0.78-0.86) than the expression data between the custom-made and either of the two commercial platforms (r=0.62-0.76).
Although the results from different platforms correlated reasonably well, combining and comparing the measurements were not straightforward. The clone errors on the custom-made array and annotation and technical differences between the platforms introduced variability in the data. In conclusion, the different gene expression microarray platforms provided results sufficiently concordant for the research setting, but the variability represents a challenge for developing diagnostic applications for the microarrays. In the second part of the work, we performed an integrated high-resolution microarray analysis of gene copy number and expression in 38 laryngeal and oral tongue squamous cell carcinoma cell lines and primary tumors. Our aim was to pinpoint genes for which expression was impacted by changes in copy number. The data revealed that amplifications in particular had a clear impact on gene expression. Across the genome, 14-32% of genes in the highly amplified regions (copy number ratio >2.5) had associated overexpression. The impact of decreased copy number on gene underexpression was less clear. Using statistical analysis across the samples, we systematically identified hundreds of genes for which an increased copy number was associated with increased expression. For example, our data implied that FADD and PPFIA1 were frequently overexpressed at the 11q13 amplicon in HNSCC. The 11q13 amplicon, including known oncogenes such as CCND1 and CTTN, is well characterized in different types of cancer, but the roles of FADD and PPFIA1 remain obscure. Taken together, the integrated microarray analysis revealed a number of known as well as novel target genes in altered regions in HNSCC. The identified genes provide a basis for functional validation and may eventually lead to the identification of novel candidates for targeted therapy in HNSCC.
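The figure quoted above, the share of genes in highly amplified regions that also show overexpression, can be illustrated with a small per-sample calculation. This is a sketch only, not the study's statistical analysis across samples. The copy number cutoff of 2.5 comes from the abstract, while the function name and the expression cutoff of 2.0 are assumptions chosen for illustration.

```python
import numpy as np

def amplified_overexpressed_fraction(copy_ratio, expr_ratio,
                                     amp_cutoff=2.5, expr_cutoff=2.0):
    """Fraction of highly amplified genes that are also overexpressed.

    copy_ratio and expr_ratio are per-gene ratios for one sample.
    expr_cutoff is a hypothetical threshold, not taken from the source.
    """
    copy_ratio = np.asarray(copy_ratio, dtype=float)
    expr_ratio = np.asarray(expr_ratio, dtype=float)
    # Genes whose copy number ratio exceeds the amplification cutoff.
    amplified = copy_ratio > amp_cutoff
    if not amplified.any():
        return float("nan")  # no amplified genes, so the fraction is undefined
    # Share of those genes whose expression ratio exceeds its cutoff.
    return float((expr_ratio[amplified] > expr_cutoff).mean())
```

With toy inputs `copy_ratio=[3.0, 2.6, 1.0, 4.0]` and `expr_ratio=[2.5, 1.0, 5.0, 3.0]`, three genes count as amplified and two of them as overexpressed, so the function returns two thirds.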
Abstract:
Chromosomal alterations in leukemia have been shown to have prognostic and predictive significance and are also important minimal residual disease (MRD) markers in the follow-up of leukemia patients. Although specific oncogenes and tumor suppressors have been discovered in some of the chromosomal alterations, the role and target genes of many alterations in leukemia remain unknown. In addition, a number of leukemia patients have a normal karyotype by standard cytogenetics, but have variability in clinical course and are often molecularly heterogeneous. Cytogenetic methods traditionally used in leukemia analysis and diagnostics, namely G-banding, various fluorescence in situ hybridization (FISH) techniques, and chromosomal comparative genomic hybridization (cCGH), have enormously increased knowledge about the leukemia genome, but have limitations in resolution or in genomic coverage. In the last decade, the development of microarray comparative genomic hybridization (array-CGH, aCGH) for DNA copy number analysis and the SNP microarray (SNP-array) method for simultaneous copy number and loss of heterozygosity (LOH) analysis has enabled investigation of chromosomal and gene alterations genome-wide with high resolution and high throughput. In these studies, genetic alterations were analyzed in acute myeloid leukemia (AML) and chronic lymphocytic leukemia (CLL). The aim was to screen and characterize genomic alterations that could play a role in leukemia pathogenesis by using aCGH and SNP-arrays. One of the most important goals was to screen cryptic alterations in karyotypically normal leukemia patients. In addition, chromosomal changes were evaluated to narrow the target regions, to find new markers, and to obtain tumor suppressor and oncogene candidates.
The work presented here shows the capability of aCGH to detect submicroscopic copy number alterations in leukemia, with information about breakpoints and genes involved in the alterations, and that genome-wide microarray analyses with aCGH and SNP-array are advantageous methods in the research and diagnosis of leukemia. The most important findings were the cryptic changes detected with aCGH in karyotypically normal AML and CLL, characterization of amplified genes in 11q marker chromosomes, detection of deletion-based mechanisms of MLL-ARHGEF12 fusion gene formation, and detection of LOH without copy number alteration in karyotypically normal AML. These alterations harbor candidate oncogenes and tumor suppressors for further studies.
Abstract:
Background: The Ewing sarcoma family of tumors (ESFT) are rare but highly malignant neoplasms that occur mainly in bone but also in soft tissue. ESFT typically affects patients in their second decade of life, whereby children and adolescents bear the heaviest incidence burden. Despite recent advances in the clinical management of ESFT patients, their prognosis and survival are still disappointingly poor, especially in cases with metastasis. No targeted therapy for ESFT patients is currently available. Moreover, based merely on current clinical and biological characteristics, accurate classification of ESFT patients often fails at the time of diagnosis. Therefore, there is a constant need for novel molecular biomarkers to be applied in tandem with conventional parameters to further intensify ESFT risk-stratification and treatment selection, and ultimately to develop novel targeted therapies. In this context, a greater understanding of the genetics and immune characteristics of ESFT is needed. Aims: This study sought to gain novel insights into gene copy number changes and gene expression in ESFT and, further, to shed light on the role of inflammation in ESFT. For this purpose, microarrays were used to provide gene-level information on a genome-wide scale. In addition, this study focused on screening of 9p21.3 deletion sizes and frequencies in ESFT and in another pediatric cancer, acute lymphocytic leukemia (ALL), in order to define more exact criteria for high-risk patient selection and to provide data for developing a more reliable diagnostic method to detect CDKN2A deletions. Results: In study I, 20 novel ESFT-associated suppressor genes and oncogenes were pinpointed using combined array CGH and expression analysis. In addition, interesting chromosomal rearrangements were identified: (1) Duplication of derivative chromosome der(22)(11;22) was detected in three ESFT patients.
This duplication included the EWSR1-FLI1 fusion gene, leading to an increase in its copy number; (2) Cryptic amplifications on chromosomes 20 and 22 were detected, suggesting a novel translocation between chromosomes 20 and 22, which most probably produces a fusion between EWSR1 and NFATC2. In study II, bioinformatic analysis of ESFT expression profiles showed that inflammatory gene activation is detectable in ESFT patient samples and that the activation is characterized by macrophage gene expression. Most interestingly, ESFT patient samples were shown to express certain inflammatory genes that were prognostically significant. High local expression of C5 and JAK1 at the tumor site was shown to associate with favorable clinical outcome, whereas high local expression of IL8 was shown to be detrimental. Studies III and IV showed that the smallest overlapping region of deletion in 9p21.3 includes CDKN2A in all cases and that the length of this region is 12.2 kb in both Ewing sarcoma and ALL. Furthermore, our results showed that the most widely used commercial CDKN2A FISH probe creates false negative results in the narrowest microdeletion cases (<190 kb). Therefore, more accurate methods should be developed for the detection of deletions in the CDKN2A locus. Conclusions: This study provides novel insights into the genetic changes involved in the biology of ESFT, in the interaction between ESFT cells and the immune system, and in the inactivation of CDKN2A. Novel ESFT biomarker genes identified in this study serve as a useful resource for future studies and in developing novel therapeutic strategies to improve the survival of patients with ESFT.
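The "smallest overlapping region of deletion" reported in studies III and IV can be read as the interval shared by every patient's deletion. The sketch below shows that interval intersection under the assumption that each deletion is a single contiguous `(start, end)` span on the same chromosome. The function name and any coordinates passed to it are hypothetical, not data from the study.

```python
def smallest_overlapping_region(deletions):
    """Return the (start, end) interval common to all deletions, or None.

    deletions is a list of (start, end) pairs with start < end, all on the
    same chromosome; this is an illustrative helper, not the study's code.
    """
    if not deletions:
        return None
    # The shared region begins at the right-most start ...
    start = max(s for s, _ in deletions)
    # ... and ends at the left-most end.
    end = min(e for _, e in deletions)
    # If the bounds cross, the deletions have no region in common.
    return (start, end) if start < end else None
```

For made-up spans `[(100, 500), (200, 450), (150, 600)]` the shared region is `(200, 450)`, whose length is `end - start`.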
Abstract:
Microarrays are high-throughput biological assays that allow the screening of thousands of genes for their expression. The main idea behind microarrays is to compute for each gene a unique signal that is directly proportional to the quantity of mRNA that was hybridized on the chip. A large number of steps and errors associated with each step make the generated expression signal noisy. As a result, microarray data need to be carefully pre-processed before their analysis can be assumed to lead to reliable and biologically relevant conclusions. This thesis focuses on developing methods for improving gene signal and further utilizing this improved signal for higher level analysis. To achieve this, first, approaches for designing microarray experiments using various optimality criteria, considering both biological and technical replicates, are described. A carefully designed experiment leads to signal with low noise, as the effect of unwanted variations is minimized and the precision of the estimates of the parameters of interest is maximized. Second, a system for improving the gene signal by using three scans at varying scanner sensitivities is developed. A novel Bayesian latent intensity model is then applied to these three sets of expression values, corresponding to the three scans, to estimate the suitably calibrated true signal of genes. Third, a novel image segmentation approach that segregates the fluorescent signal from the undesired noise is developed using an additional dye, SYBR green RNA II. This technique helped in identifying signal only with respect to the hybridized DNA, while signal arising from dust, scratches, dye spills, and other noise is excluded. Fourth, an integrated statistical model is developed, where signal correction, systematic array effects, dye effects, and differential expression are modelled jointly as opposed to a sequential application of several methods of analysis.
The methods described here have been tested only for cDNA microarrays, but can also, with some modifications, be applied to other high-throughput technologies. Keywords: High-throughput technology, microarray, cDNA, multiple scans, Bayesian hierarchical models, image analysis, experimental design, MCMC, WinBUGS.
Abstract:
Currently, there are nine known human herpesviruses, and these viruses appear to have been a very common companion of humans throughout the millennia. Of the human herpesviruses, herpes simplex viruses 1 and 2 (HSV-1, HSV-2), causative agents of herpes labialis and genital herpes, and varicella-zoster virus (VZV), causative agent of chicken pox, are also common causes of central nervous system (CNS) infections. In addition, human cytomegalovirus (CMV), Epstein-Barr virus (EBV) and human herpesviruses 6A, 6B, and 7 (HHV-6A, HHV-6B, HHV-7), all members of the herpesvirus family, can also be associated with encephalitis and meningitis. Accurate diagnostics and fast treatment are essential for patient recovery in CNS infections, and therefore sensitive and effective diagnostic methods are needed. The aim of this thesis was to develop new potential detection methods for diagnosing human herpesvirus infections, especially in immunocompetent patients, using the microarray technique. Therefore, methods based on microarrays were developed for simultaneous detection of HSV-1, HSV-2, VZV, CMV, EBV, HHV-6A, HHV-6B, and HHV-7 nucleic acids, and for HSV-1, HSV-2, VZV, and CMV antibodies from various clinical samples. The microarray methods developed showed potential for efficiently and accurately detecting human herpesvirus DNAs, especially in CNS infections, and for simultaneous detection of DNAs or antibodies for multiple different human herpesviruses from clinical samples. In fact, the microarray method revealed several previously unrecognized co-infections. The microarray methods developed were sensitive and provided rapid detection of human herpesvirus DNA, and therefore could be applied to routine diagnostics. The microarrays might also be considered an economical tool for diagnosing human herpesvirus infections.