5 results for DNA-microarray data
in Aston University Research Archive
Abstract:
Clustering techniques such as k-means and hierarchical clustering are commonly used to analyze gene expression data derived from DNA microarrays. However, the interactions between the processes underlying cell activity suggest that the complexity of the microarray data structure may not be fully represented by discrete clustering methods.
Abstract:
This thesis is a study of low-dimensional visualisation methods for data visualisation under uncertainty of the input data. It focuses on the two main feed-forward neural network algorithms, NeuroScale and Generative Topographic Mapping (GTM), and seeks to make both able to accommodate that uncertainty. The two models are shown not to work well under high levels of noise within the data and need to be modified. The modifications of both models, NeuroScale and GTM, are verified using synthetic data to show their ability to accommodate the noise. The thesis then addresses the controversy surrounding the non-uniqueness of predictive gene lists (PGL) for predicting the prognosis outcome of breast cancer patients, as available in DNA microarray experiments. Many of these studies have ignored the uncertainty issue, resulting in random correlations of sparse model selection in high dimensional spaces. The visualisation techniques are used to confirm that the patients involved in such medical studies are intrinsically unclassifiable on the basis of the provided PGL evidence. This additional category of ‘unclassifiable’ should be accommodated within medical decision support systems if serious errors and unnecessary adjuvant therapy are to be avoided.
Abstract:
Oral drug delivery is considered the most popular route of delivery because of the ease of administration, the availability of a wide range of dosage forms and the large surface area for drug absorption via the intestinal membrane. However, besides the unfavourable biopharmaceutical properties of the therapeutic agents, efflux transporters such as P-glycoprotein (P-gp) and multiple resistance proteins (MRP) decrease the overall drug uptake by extruding the drug from the cells. Although prodrugs have been investigated to improve drug partitioning by covalently masking the polar groups with pro-moieties that promote increased uptake, they present significant challenges, including reduced solubility and increased toxicity. The current work investigates the use of amino acids as ion-pairs for three model drugs, indomethacin (weak acid), trimethoprim (weak base) and ciprofloxacin (zwitterion), in an attempt to improve both solubility and uptake. Solubility enhancement was pursued through salt formation, while the rationale for enhancing absorption was to create new routes for uptake across the membranes via amino acid transporter proteins or dipeptidyl transporters. New salts of the model drugs with the oppositely charged amino acids were prepared by freeze drying and characterised using FTIR, 1H NMR, DSC, SEM, pH solubility profile, solubility and dissolution. Permeability profiles were assessed using an in vitro cell-based method (Caco-2 cells), and the genetic changes occurring across the transporter genes and the various pathways involved in cellular activities were studied using DNA microarrays. Solubility data showed a significant increase in drug solubility upon preparing the new salts with the oppositely charged counter ions (the ciprofloxacin glutamate salt exhibited a 2.9 × 10³-fold enhancement compared to the free drug).
Moreover, permeability studies showed a 3-fold increase in trimethoprim and indomethacin permeabilities upon ion-pairing with amino acids, and a more than 10-fold increase when the zwitterionic drug was paired with glutamic acid. Microarray data revealed that trimethoprim was absorbed actively via OCTN1 transporters, while MRP7 is the main transporter gene that mediates its efflux. The absorption of trimethoprim from trimethoprim glutamic acid ion-paired formulations was affected by the ratio of glutamic acid in the formulation, which was inversely proportional to the degree of expression of OCTN1. Interestingly, ciprofloxacin glutamic acid ion-pairs were found to decrease the up-regulation of ciprofloxacin efflux proteins (P-gp and MRP4) and to over-express two solute carrier transporters (PEPT2 and SLCO1A2), suggesting that a high aqueous binding constant (K11aq) enables the ion-paired formulations to be absorbed as one entity. In conclusion, the formation of ion-pairs with amino acids can positively influence the solubility, transfer and gene expression effects of drugs.
Abstract:
Background: The controversy surrounding the non-uniqueness of predictive gene lists (PGL) of small selected subsets of genes from very large potential candidates, as available in DNA microarray experiments, is now widely acknowledged [1]. Many of these studies have focused on constructing discriminative semi-parametric models and as such are also subject to the issue of random correlations of sparse model selection in high dimensional spaces. In this work we outline a different approach based around an unsupervised patient-specific nonlinear topographic projection in predictive gene lists. Methods: We construct nonlinear topographic projection maps based on inter-patient gene-list relative dissimilarities. The Neuroscale, Stochastic Neighbor Embedding (SNE) and Locally Linear Embedding (LLE) techniques have been used to construct two-dimensional projective visualisation plots of the 70-dimensional PGLs per patient. Classifiers are also constructed to identify the prognosis indicator of each patient using the resulting projections from those visualisation techniques, and to investigate whether, a posteriori, two prognosis groups are separable on the evidence of the gene lists. A literature-proposed predictive gene list for breast cancer is benchmarked against a separate gene list using the above methods. Generalisation ability is investigated by using the mapping capability of Neuroscale to visualise the follow-up study, but based on the projections derived from the original dataset. Results: The results indicate that small subsets of patient-specific PGLs have insufficient prognostic dissimilarity to permit a distinction between patients in the two prognosis groups. Uncertainty and diversity across multiple gene expressions prevent unambiguous or even confident patient grouping. Comparative projections across different PGLs provide similar results.
Conclusion: The random correlation effect to an arbitrary outcome induced by small subset selection from very high dimensional interrelated gene expression profiles leads to an outcome with associated uncertainty. This continuum and uncertainty preclude any attempts at constructing discriminative classifiers. However, a patient's gene expression profile could possibly be used in treatment planning, based on knowledge of other patients' responses. We conclude that many of the patients involved in such medical studies are intrinsically unclassifiable on the basis of the provided PGL evidence. This additional category of 'unclassifiable' should be accommodated within medical decision support systems if serious errors and unnecessary adjuvant therapy are to be avoided.
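The Methods in the abstract above rest on inter-patient dissimilarities and on two-dimensional projections that try to preserve them. As a minimal sketch only, the following Python computes pairwise Euclidean dissimilarities between patient profiles and a simple stress value (the sum of squared differences between original and projected distances). The Euclidean metric, this particular stress form, the function names and the toy data are all assumptions made for illustration; the abstract does not state which dissimilarity or objective the authors used.

```python
import math
from itertools import combinations


def pairwise_dissimilarity(profiles):
    """Map each patient pair (i, j) to the Euclidean distance between profiles."""
    return {
        (i, j): math.dist(profiles[i], profiles[j])
        for i, j in combinations(range(len(profiles)), 2)
    }


def stress(original, projected):
    """Sum of squared differences between original and projected distances."""
    d_high = pairwise_dissimilarity(original)
    d_low = pairwise_dissimilarity(projected)
    return sum((d_high[pair] - d_low[pair]) ** 2 for pair in d_high)


# Hypothetical toy data: 3 patients with 4 expression values each.
# (The abstract's gene lists are 70-dimensional; 4 is used for brevity.)
patients = [
    [0.0, 0.0, 0.0, 0.0],
    [3.0, 4.0, 0.0, 0.0],
    [0.0, 0.0, 6.0, 8.0],
]
# A hand-made 2-D layout that happens to preserve all pairwise distances.
projection = [[0.0, 0.0], [5.0, 0.0], [0.0, 10.0]]

print(stress(patients, projection))
```

A stress near zero means the 2-D layout reproduces the original dissimilarities well; a projection method of this family would adjust the 2-D coordinates to reduce that value.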
Abstract:
To capture the genomic profiles for histone modification, chromatin immunoprecipitation (ChIP) is combined with next generation sequencing, a technique known as ChIP-seq. However, enriched regions generated from the ChIP-seq data are only evaluated on the limited knowledge acquired from manually examining the relevant biological literature. This paper proposes a novel framework which integrates multiple knowledge sources such as biological literature, Gene Ontology, and microarray data. In order to precisely analyze ChIP-seq data for histone modification, knowledge integration is based on a unified probabilistic model. The model is employed to re-rank the enriched regions generated from peak-finding algorithms. By filtering the re-ranked enriched regions using a predefined threshold, more reliable and precise results could be generated. The combination of the multiple knowledge sources with the peak-finding algorithm produces a new paradigm for ChIP-seq data analysis. © (2012) Trans Tech Publications, Switzerland.
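The re-rank-then-filter step described in the abstract above can be illustrated with a small Python sketch. It is not the paper's model: the abstract does not specify how the knowledge sources are combined, so the code assumes each source yields a score in [0, 1] and, as a simplifying assumption, multiplies them as if they were independent. The `Region` class, its field names, the scores and the threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    peak_score: float      # normalised peak-finder score in [0, 1] (assumed)
    literature: float      # support from biological literature (assumed)
    gene_ontology: float   # support from Gene Ontology (assumed)
    microarray: float      # support from microarray data (assumed)


def combined_score(region):
    # Assumption: sources are treated as independent, so scores multiply.
    return (region.peak_score * region.literature
            * region.gene_ontology * region.microarray)


def rerank(regions, threshold):
    """Sort regions by combined score (highest first), keep those >= threshold."""
    ranked = sorted(regions, key=combined_score, reverse=True)
    return [r for r in ranked if combined_score(r) >= threshold]


regions = [
    Region("region_a", 0.9, 0.2, 0.5, 0.5),
    Region("region_b", 0.6, 0.9, 0.8, 0.9),
    Region("region_c", 0.7, 0.8, 0.5, 0.5),
]
kept = rerank(regions, threshold=0.1)
print([r.name for r in kept])
```

In this toy input, region_a has the strongest peak score but weak literature support, so it falls below the threshold once the other sources are taken into account.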