6 results for MICROARRAY DATA
in Doria (National Library of Finland DSpace Services) - National Library of Finland, Finland
Abstract:
Prostate cancers form a heterogeneous group of diseases, and there is a need for novel biomarkers and for more efficient and targeted methods of treatment. In this thesis, microarray data, RNA interference (RNAi) and compound screens were used to identify novel biomarkers, drug targets and drugs for future personalized prostate cancer therapeutics. First, a bioinformatic mRNA expression analysis covering 9873 human tissue and cell samples, including 349 prostate cancer and 147 normal prostate samples, was used to distinguish in silico prevalidated putative prostate cancer biomarkers and drug targets. Second, RNAi-based high-throughput (HT) functional profiling of 295 prostate and prostate cancer tissue-specific genes was performed in cultured prostate cancer cells. Third, an HT compound screen approach using a library of 4910 drugs and drug-like molecules was exploited to identify potential drugs inhibiting prostate cancer cell growth. Nine candidate drug targets with biomarker potential and one cancer-selective compound were validated in vitro and in vivo. In addition to androgen receptor (AR) signaling, endoplasmic reticulum (ER) function, the arachidonic acid (AA) pathway, redox homeostasis and mitosis were identified as vital processes in prostate cancer cells. ERG oncogene-positive cancer cells exhibited sensitivity to induction of oxidative and ER stress, whereas advanced and castrate-resistant prostate cancer (CRPC) could potentially be targeted through AR signaling and mitosis. In conclusion, this thesis illustrates the power of systems biological data analysis in the discovery of potential vulnerabilities present in prostate cancer cells, as well as novel options for personalized cancer management.
Abstract:
The amount of biological data has grown exponentially in recent decades. Modern biotechnologies, such as microarrays and next-generation sequencing, are capable of producing massive amounts of biomedical data in a single experiment. As the amount of data grows rapidly, there is an urgent need for reliable computational methods for analyzing and visualizing it. This thesis addresses this need by studying how to efficiently and reliably analyze and visualize high-dimensional data, especially data obtained from gene expression microarray experiments. First, we studied ways to improve the quality of microarray data by replacing (imputing) the missing data entries with estimated values. Missing value imputation is commonly used to make originally incomplete data complete, which makes the data easier to analyze with statistical and computational methods. Our novel approach was to use curated external biological information as a guide for the missing value imputation. Second, we studied the effect of missing value imputation on downstream data analysis methods such as clustering. We compared multiple recent imputation algorithms on 8 publicly available microarray data sets. It was observed that missing value imputation is indeed a rational way to improve the quality of biological data. The research revealed differences between the clustering results obtained with different imputation methods. On most data sets, the simple and fast k-NN imputation was good enough, but some data sets called for more advanced imputation methods, such as Bayesian Principal Component Algorithm (BPCA). Finally, we studied the visualization of biological network data. Biological interaction networks are examples of the outcome of multiple biological experiments, such as those using gene microarray techniques.
Such networks are typically very large and highly connected, so there is a need for fast algorithms that produce visually pleasing layouts. A computationally efficient way to produce layouts of large biological interaction networks was developed. The algorithm uses multilevel optimization within the regular force-directed graph layout algorithm.
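As an illustration of the k-NN imputation mentioned in the abstract above, the following is a minimal sketch and not the thesis's implementation. It assumes a gene-by-sample matrix stored as a list of rows with `None` marking missing entries; the function name `knn_impute`, the default `k`, and the distance measure (root-mean-square difference over the columns observed in both rows) are assumptions chosen for the example.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries using the k most similar rows (hypothetical helper)."""

    def distance(a, b):
        # Root-mean-square difference over columns observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate neighbours: other rows with column j observed
            # and at least one column in common with the current row.
            candidates = [
                (distance(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            ]
            candidates = [c for c in candidates if c[0] != math.inf]
            candidates.sort(key=lambda c: c[0])
            nearest = candidates[:k]
            if nearest:
                # Unweighted mean of the neighbours' values in column j.
                imputed[i][j] = sum(v for _, v in nearest) / len(nearest)
    return imputed


expression = [
    [1.0, 2.0, 3.0],
    [1.0, 2.0, None],   # missing value to be imputed
    [10.0, 20.0, 30.0],
    [1.1, 2.1, 3.2],
]
completed = knn_impute(expression, k=2)
```

Here the missing entry is filled from the two rows closest to it, while the distant third row is ignored. Entries with no usable neighbour are left as `None`. Published k-NN imputation variants often weight neighbours by distance, which this sketch does not do.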
Abstract:
The recent rapid development of biotechnological approaches has enabled the production of large whole genome level biological data sets. In order to handle these data sets, reliable and efficient automated tools and methods for data processing and result interpretation are required. Bioinformatics, as the field of studying and processing biological data, tries to answer this need by combining methods and approaches across computer science, statistics, mathematics and engineering to study and process biological data. The need is also increasing for tools that can be used by the biological researchers themselves who may not have a strong statistical or computational background, which requires creating tools and pipelines with intuitive user interfaces, robust analysis workflows and strong emphasis on result reporting and visualization. Within this thesis, several data analysis tools and methods have been developed for analyzing high-throughput biological data sets. These approaches, covering several aspects of high-throughput data analysis, are specifically aimed for gene expression and genotyping data although in principle they are suitable for analyzing other data types as well. Coherent handling of the data across the various data analysis steps is highly important in order to ensure robust and reliable results. Thus, robust data analysis workflows are also described, putting the developed tools and methods into a wider context. The choice of the correct analysis method may also depend on the properties of the specific data set and therefore guidelines for choosing an optimal method are given. The data analysis tools, methods and workflows developed within this thesis have been applied to several research studies, of which two representative examples are included in the thesis. The first study focuses on spermatogenesis in murine testis and the second one examines cell lineage specification in mouse embryonic stem cells.
Abstract:
Currently, numerous high-throughput technologies are available for the study of human carcinomas. In the literature, many variations of these techniques have been described. The common denominator for these methodologies is the large amount of data obtained in a single experiment, in a short time period, and at a fairly low cost. However, these methods are also reported to have several problems and limitations. The purpose of this study was to test the applicability of two selected high-throughput methods, cDNA and tissue microarrays (TMA), in cancer research. Two common human malignancies, breast and colorectal cancer, were used as examples. This thesis aims to present some practical considerations that need to be addressed when applying these techniques. cDNA microarrays were applied to screen aberrant gene expression in breast and colon cancers. Immunohistochemistry was used to validate the results and to evaluate the association of selected novel tumour markers with the outcome of the patients. The type of histological material used in immunohistochemistry was evaluated, especially considering the applicability of whole tissue sections and different types of TMAs. Special attention was paid to the methodological details in the cDNA microarray and TMA experiments. In conclusion, many potential tumour markers were identified in the cDNA microarray analyses. Immunohistochemistry could be applied to validate the observed gene expression changes of selected markers and to associate their expression change with patient outcome. In the current experiments, both TMAs and whole tissue sections could be used for this purpose. This study showed for the first time that securin and p120 catenin protein expression predict breast cancer outcome and that the immunopositivity of carbonic anhydrase IX is associated with the outcome of rectal cancer.
The predictive value of these proteins was statistically evident also in multivariate analyses, with up to a 13.1-fold risk for cancer-specific death in a specific subgroup of patients.
Abstract:
High-throughput screening of cellular effects of RNA interference (RNAi) libraries is now being increasingly applied to explore the role of genes in specific cell biological processes and disease states. However, the technology is still limited to specialty laboratories, due to the requirements for robotic infrastructure, access to expensive reagent libraries, and expertise in high-throughput screening assay development, standardization, data analysis and applications. In the future, alternative screening platforms will be required to expand functional large-scale experiments to include more RNAi constructs and to allow combinatorial loss-of-function analyses (e.g. gene-gene or gene-drug interaction), gain-of-function screens, multi-parametric phenotypic readouts or comparative analysis of many different cell types. Such comprehensive perturbation of gene networks in cells will require a major increase in the flexibility and throughput of the screening platforms and a reduction of costs. As an alternative to the conventional multi-well based high-throughput screening platforms, the development of a novel cell spot microarray method for production of high-density siRNA reverse transfection arrays is described here. The cell spot microarray platform is distinguished from the majority of other transfection cell microarray techniques by its spatially confined array layout, which allows highly parallel screening of large-scale RNAi reagent libraries with assays otherwise difficult or not applicable to high-throughput screening. This study describes the development of the cell spot microarray method along with biological application examples of high-content immunofluorescence and phenotype-based cancer cell biological analyses focusing on the regulation of prostate cancer cell growth, maintenance of genomic integrity in breast cancer cells, and functional analysis of integrin protein-protein interactions in situ.
Abstract:
Mass spectrometry (MS)-based proteomics has seen significant technical advances during the past two decades, and mass spectrometry has become a central tool in many biosciences. Despite the popularity of MS-based methods, the handling of systematic non-biological variation in the data remains a common problem. This biasing variation can result from several sources, ranging from sample handling to differences caused by the instrumentation. Normalization is the procedure that aims to account for this biasing variation and make samples comparable. Many normalization methods commonly used in proteomics have been adapted from the DNA microarray world. Studies comparing normalization methods on proteomics data sets using some variability measures exist. However, a more thorough comparison that looks at the quantitative and qualitative differences in the performance of the different normalization methods, and at their ability to preserve the true differential expression signal of proteins, is lacking. In this thesis, several popular and widely used normalization methods representing different normalization strategies (linear regression normalization, local regression normalization, variance stabilizing normalization, quantile normalization, median central tendency normalization, and variants of some of the aforementioned methods) are compared and evaluated with a benchmark spike-in proteomics data set. The performance of the normalization methods is evaluated qualitatively and quantitatively on a global scale and in pairwise comparisons of sample groups. In addition, it is investigated whether performing the normalization globally on the whole data or pairwise for the comparison pairs examined affects the performance of the normalization method in normalizing the data and preserving the true differential expression signal.
In this thesis, both major and minor differences in the performance of the different normalization methods were found. Also, the way in which the normalization was performed (global normalization of the whole data or pairwise normalization of the comparison pair) affected the performance of some of the methods in pairwise comparisons. Differences among variants of the same methods were also observed.
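Two of the normalization strategies named in the abstract above, median central tendency normalization and quantile normalization, can be sketched as follows. This is a minimal illustration under stated assumptions and not the thesis's code: each sample is a list of log-transformed intensities of equal length with no missing values, the function names are hypothetical, the common target median is taken as the mean of the sample medians, and ties in the quantile step are broken by position instead of being averaged.

```python
from statistics import median


def median_normalize(samples):
    """Shift each sample so that all samples share the same median."""
    medians = [median(s) for s in samples]
    # Assumption: the common target is the mean of the sample medians.
    target = sum(medians) / len(medians)
    return [[x - m + target for x in s] for s, m in zip(samples, medians)]


def quantile_normalize(samples):
    """Give every sample the same empirical distribution of values."""
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean of the values at each rank across samples.
    reference = [
        sum(s[rank] for s in sorted_samples) / len(samples) for rank in range(n)
    ]
    result = []
    for s in samples:
        # Indices of the sample's values from smallest to largest.
        order = sorted(range(n), key=lambda idx: s[idx])
        normalized = [0.0] * n
        for rank, idx in enumerate(order):
            normalized[idx] = reference[rank]
        result.append(normalized)
    return result


intensities = [
    [5.0, 2.0, 3.0],  # sample A
    [4.0, 1.0, 6.0],  # sample B
]
by_median = median_normalize(intensities)
by_quantile = quantile_normalize(intensities)
```

Median normalization only shifts each sample, so within-sample differences are preserved. Quantile normalization replaces the values themselves and keeps only their ranks, which is one reason the two strategies can differ in how well they preserve a true differential expression signal.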