916 results for EXPRESSION DATA
Abstract:
Chronic myelomonocytic leukaemia (CMML) is a heterogeneous haematopoietic disorder characterized by myeloproliferative or myelodysplastic features. At present, the pathogenesis of this malignancy is not completely understood. In this study, we sought to analyse gene expression profiles of CMML in order to characterize new molecular outcome predictors. A learning set of 32 untreated CMML patients at diagnosis was available for TaqMan low-density array gene expression analysis. From 93 selected genes related to cancer and cell cycle, we built a five-gene prognostic index after multiplicity correction. Using this index, we characterized two categories of patients with distinct overall survival (94% vs. 19% for good and poor overall survival, respectively; P = 0.007) and we successfully validated its strength on an independent cohort of 21 CMML patients with Affymetrix gene expression data. We found no specific patterns of association with traditional prognostic stratification parameters in the learning cohort. However, the poor survival group strongly correlated with high-risk treated patients and transformation to acute myeloid leukaemia. We report here a new multigene prognostic index for CMML, independent of the gene expression measurement method, which could be used as a powerful tool to predict clinical outcome and help physicians to evaluate criteria for treatments.
Abstract:
Overexpression of MN1, ERG, BAALC, and EVI1 (MEBE) genes in cytogenetically normal acute myeloid leukemia (AML) patients is associated with poor prognosis, but their prognostic effect in patients with myelodysplastic syndromes (MDS) has not been studied systematically. Expression data of the four genes from 140 MDS patients were combined in an additive score, which was validated in an independent patient cohort of 110 MDS patients. A high MEBE score, defined as high expression of at least two of the four genes, predicted a significantly shorter overall survival (OS) (HR 2.29, 95% CI 1.3-4.09, P = .005) and time to AML progression (HR 4.83, 95% CI 2.01-11.57, P?
Abstract:
Identifying differential expression of genes in psoriatic and healthy skin by microarray data analysis is a key approach to understand the pathogenesis of psoriasis. Analysis of more than one dataset to identify genes commonly upregulated reduces the likelihood of false positives and narrows down the possible signature genes. Genes controlling the critical balance between T helper 17 and regulatory T cells are of special interest in psoriasis. Our objectives were to identify genes that are consistently upregulated in lesional skin from three published microarray datasets. We carried out a reanalysis of gene expression data extracted from three experiments on samples from psoriatic and nonlesional skin using the same stringency threshold and software and further compared the expression levels of 92 genes related to the T helper 17 and regulatory T cell signaling pathways. We found 73 probe sets representing 57 genes commonly upregulated in lesional skin from all datasets. These included 26 probe sets representing 20 genes that have no previous link to the etiopathogenesis of psoriasis. These genes may represent novel therapeutic targets and surely need more rigorous experimental testing to be validated. Our analysis also identified 12 of 92 genes known to be related to the T helper 17 and regulatory T cell signaling pathways, and these were found to be differentially expressed in the lesional skin samples.
Abstract:
The inference of gene regulatory networks has gained considerable interest in the biology and biomedical community in recent years. The purpose of this paper is to investigate the influence that environmental conditions can exert on the performance of network inference algorithms. Specifically, we study five network inference methods, Aracne, BC3NET, CLR, C3NET and MRNET, and compare the results for three different conditions: (I) observational gene expression data: normal environmental condition, (II) interventional gene expression data: growth in rich media, (III) interventional gene expression data: normal environmental condition interrupted by a positive spike-in stimulation. Overall, we find that different statistical inference methods lead to comparable, but condition-specific results. Further, our results suggest that non-steady-state data enhance the inferability of regulatory networks.
Abstract:
Model selection between competing models is a key consideration in the discovery of prognostic multigene signatures. The use of appropriate statistical performance measures as well as verification of biological significance of the signatures is imperative to maximise the chance of external validation of the generated signatures. Current approaches in time-to-event studies often use only a single measure of performance in model selection, such as logrank test p-values, or dichotomise the follow-up times at some phase of the study to facilitate signature discovery. In this study we improve the prognostic signature discovery process through the application of the multivariate partial Cox model combined with the concordance index, hazard ratio of predictions, independence from available clinical covariates and biological enrichment as measures of signature performance. The proposed framework was applied to discover prognostic multigene signatures from early breast cancer data. The partial Cox model combined with the multiple performance measures were used in both guiding the selection of the optimal panel of prognostic genes and prediction of risk within cross validation without dichotomising the follow-up times at any stage. The signatures were successfully externally cross validated in independent breast cancer datasets, yielding a hazard ratio of 2.55 [1.44, 4.51] for the top ranking signature.
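The concordance index named in the abstract above can be illustrated with a minimal sketch. This is a plain pairwise version of Harrell's C-index, not the authors' implementation; the function name and the toy follow-up data are hypothetical and chosen only for illustration.

```python
def concordance_index(times, events, risk_scores):
    """Pairwise concordance between predicted risk and observed survival.

    times: follow-up times; events: 1 if the event was observed, 0 if censored;
    risk_scores: higher score means higher predicted risk.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0  # earlier event had the higher risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]
c_index = concordance_index(times, events, risk)
```

Because the measure works directly on pairs of follow-up times, no dichotomisation of the follow-up is needed, which is the property the abstract emphasises.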
Abstract:
One of the major challenges in systems biology is to understand the complex responses of a biological system to external perturbations or internal signalling depending on its biological conditions. Genome-wide transcriptomic profiling of cellular systems under various chemical perturbations allows the manifestation of certain features of the chemicals through their transcriptomic expression profiles. The insights obtained may help to establish the connections between human diseases, associated genes and therapeutic drugs. The main objective of this study was to systematically analyse cellular gene expression data under various drug treatments to elucidate drug-feature specific transcriptomic signatures. We first extracted drug-related information (drug features) from the collected textual description of DrugBank entries using text-mining techniques. A novel statistical method employing orthogonal least square learning was proposed to obtain drug-feature-specific signatures by integrating gene expression with DrugBank data. To obtain robust signatures from noisy input datasets, a stringent ensemble approach was applied with the combination of three techniques: resampling, leave-one-out cross validation, and aggregation. The validation experiments showed that the proposed method is capable of extracting biologically meaningful drug-feature-specific gene expression signatures. It was also shown that most of the signature genes are connected with common hub genes by regulatory network analysis. The common hub genes were further shown to be related to general drug metabolism by Gene Ontology analysis. Each set of genes has relatively few interactions with other sets, indicating the modular nature of each signature and its drug-feature-specificity. Based on Gene Ontology analysis, we also found that each set of drug feature (DF)-specific genes was indeed enriched in biological processes related to the drug feature. The results of these experiments demonstrated the potential of the method for predicting certain features of new drugs using their transcriptomic profiles, providing a useful methodological framework and a valuable resource for drug development and characterization.
Abstract:
Quantile normalization (QN) is a technique for microarray data processing and is the default normalization method in the Robust Multi-array Average (RMA) procedure, which was primarily designed for analysing gene expression data from Affymetrix arrays. Given the abundance of Affymetrix microarrays and the popularity of the RMA method, it is crucially important that the normalization procedure is applied appropriately. In this study we carried out simulation experiments and also analysed real microarray data to investigate the suitability of RMA when it is applied to datasets with different groups of biological samples. From our experiments, we showed that RMA with QN does not preserve the biological signal included in each group, but rather mixes the signals between the groups. We also showed that the Median Polish method in the summarization step of RMA has a similar mixing effect. RMA is one of the most widely used methods in microarray data processing and has been applied to a vast volume of data in biomedical research. The problematic behaviour of this method suggests that previous studies employing RMA could have been misadvised or adversely affected. Therefore we think it is crucially important that the research community recognizes the issue and starts to address it. The two core elements of the RMA method, quantile normalization and Median Polish, both have the undesirable effects of mixing biological signals between different sample groups, which can be detrimental to drawing valid biological conclusions and to any subsequent analyses. Based on the evidence presented here and that in the literature, we recommend exercising caution when using RMA as a method of processing microarray gene expression data, particularly in situations where there are likely to be unknown subgroups of samples.
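The mixing effect described in the abstract above can be seen in a small sketch of the standard quantile normalization procedure (sort each sample, average across samples at each rank, map the averages back by rank). This is not the authors' simulation; the function name and the toy values for the two groups are hypothetical.

```python
from statistics import mean


def quantile_normalize(samples):
    """Quantile-normalize equal-length samples (one list of values per array)."""
    n = len(samples[0])
    # Sort each sample, then average across samples at each rank.
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [mean(s[i] for s in sorted_samples) for i in range(n)]
    normalized = []
    for s in samples:
        # Gene indices in ascending order of expression (ties keep input order).
        order = sorted(range(n), key=lambda i: s[i])
        out = [0.0] * n
        for rank, gene in enumerate(order):
            out[gene] = rank_means[rank]
        normalized.append(out)
    return normalized


# Hypothetical data: three genes per sample; gene 0 is high only in group B.
group_a = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
group_b = [[9.0, 2.0, 3.0], [9.0, 2.0, 3.0]]
normalized = quantile_normalize(group_a + group_b)
```

In this toy case the group A samples become `[1.5, 2.5, 6.0]`: their highest-ranked gene is pulled up to 6.0 only because group B contains a high value, which is the kind of between-group signal mixing the abstract warns about.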
Abstract:
Angiogenesis is important in cancer progression. Promising results in clinical trials have indicated that targeting vascular endothelial growth factor (VEGF) signaling may prolong lung cancer patient survival. In particular, various studies have implicated VEGFA as a potential prognostic marker in lung cancer, although prognostication using the expression of VEGF receptors (VEGFRs), such as fms-related tyrosine kinase 1 (FLT1; also known as VEGFR1) and kinase insert domain receptor (KDR; also known as VEGFR2), has produced varied results in different lung cancer studies. The present study aimed to investigate the prognostic significance of these three factors, alone or in combination. mRNA expression data were extracted from four independent lung cancer cohorts totaling 583 patients, and the association between mRNA expression and survival was investigated by performing statistical analyses. When VEGFA, FLT1 and KDR expression were considered alone, only VEGFA demonstrated a significant association with patient survival consistently across all four datasets (P<0.05). Patients with a high expression of VEGFA and one of the two receptors were associated with significantly worse survival than patients expressing low levels of VEGFA and the particular receptor (P<0.05). Notably, patients with a high level expression of all three genes in their tumor specimens were associated with a significantly shorter survival time compared with patients exhibiting a low level expression of one, two or all three genes (P<0.05). The results indicate that a high level of VEGFA expression and its receptors may be required for cancer progression. Therefore, these three factors should be considered together as a prognostic indicator for lung cancer patients.
Abstract:
Urothelial cancer (UC) is highly recurrent and can progress from non-invasive (NMIUC) to a more aggressive muscle-invasive (MIUC) subtype that invades the muscle tissue layer of the bladder. We present a proof of principle study that network-based features of gene pairs can be used to improve classifier performance and the functional analysis of urothelial cancer gene expression data. In the first step of our procedure each individual sample of a UC gene expression dataset is inflated by gene pair expression ratios that are defined based on a given network structure. In the second step an elastic net feature selection procedure for network-based signatures is applied to discriminate between NMIUC and MIUC samples. We performed a repeated random subsampling cross validation in three independent datasets. The network signatures were characterized by a functional enrichment analysis and studied for the enrichment of known cancer genes. We observed that the network-based gene signatures from meta collections of protein-protein interaction (PPI) databases such as CPDB and the PPI databases HPRD and BioGrid improved the classification performance compared to single gene based signatures. The network based signatures that were derived from PPI databases showed a prominent enrichment of cancer genes (e.g., TP53, TRIM27 and HNRNPA2B1). We provide a novel integrative approach for large-scale gene expression analysis for the identification and development of novel diagnostic targets in bladder cancer. Further, our method allowed us to link cancer gene associations to network-based expression signatures that are not observed in gene-based expression signatures.
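The first step described in the abstract above, extending each sample with expression ratios for gene pairs taken from a network, might look roughly like the following. The abstract does not specify how the ratios are defined, so the plain ratio, the function name, the gene identifiers and the edge list are all assumptions made for illustration only.

```python
def add_pair_ratios(sample, edges):
    """Return a copy of `sample` extended with one ratio feature per network edge.

    sample: dict mapping gene name -> expression value.
    edges: iterable of (gene_a, gene_b) pairs from a network (hypothetical here).
    """
    features = dict(sample)
    for gene_a, gene_b in edges:
        # Skip edges whose genes are not measured or whose denominator is zero.
        if gene_a in sample and gene_b in sample and sample[gene_b] != 0:
            features[f"{gene_a}/{gene_b}"] = sample[gene_a] / sample[gene_b]
    return features


# Placeholder gene names and edges, not taken from any real PPI database.
sample = {"GENE_A": 8.0, "GENE_B": 4.0, "GENE_C": 2.0}
edges = [("GENE_A", "GENE_B"), ("GENE_B", "GENE_C"), ("GENE_A", "GENE_D")]
augmented = add_pair_ratios(sample, edges)
```

The augmented feature dictionary keeps the single-gene values and adds the pair features, so a downstream feature selection step (elastic net in the abstract) could choose among both kinds.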
Abstract:
Ovarian follicle development is primarily regulated by an interplay between the pituitary gonadotrophins, LH and FSH, and ovary-derived steroids. Increasing evidence implicates regulatory roles of transforming growth factor-beta (TGF beta) superfamily members, including inhibins and activins. The aim of this study was to identify the expression of mRNAs encoding key receptors of the inhibin/activin system in ovarian follicles ranging from 4 mm in diameter to the dominant F1 follicle (~40 mm). Ovaries were collected (n=16) from mid-sequence hens maintained on a long-day photoschedule (16 h of light:8 h of darkness). All follicles removed were dissected into individual granulosa and thecal layers. RNA was extracted and cDNA synthesized. Real-time quantitative PCR was used to quantify the expression of mRNA encoding betaglycan, activin receptor (ActR) subtypes (type-I, -IIA and -IIB) and glyceraldehyde-3-phosphate dehydrogenase (GAPDH); receptor expression data were normalized to GAPDH expression. Detectable levels of ActRI, -IIA and -IIB and the inhibin co-receptor (betaglycan) expression were found in all granulosa and thecal layers analysed. Granulosa ActRI mRNA peaked (P < 0.05) in 8-9.9 mm follicles, whereas ActRIIA rose significantly from 6-7.9 mm to 8-9.9 mm, before falling to F3/2; levels then rose sharply (3-fold) to F1 levels. Granulosa betaglycan mRNA expression rose 3-fold from 4-5.9 mm to 8-9.9 mm, before falling 4-fold to F3/2; levels then rose sharply (4-fold) to F1 levels. ActRIIB levels did not vary significantly during follicular development. Thecal ActRI mRNA expression was similar from 4-7.9 mm then decreased significantly to a nadir at the F4 position, before increasing 2-fold to the F1 (P < 0.05). Although thecal ActRIIB and -IIA expression did not vary significantly from 4 mm to F3, ActRIIB expression increased significantly (2-fold) from F3 to F1 and ActRIIA increased 22-fold from F2 to F1 (P < 0.05). Thecal betaglycan fell to a nadir at F6 after follicle selection; levels then increased significantly to F2, before falling ~50% in the F1. In all follicles studied expression of betaglycan and ActRI (granulosa: r=0.65, P < 0.001, n=144/group; theca: r=0.49, P < 0.001, n=144/group) was well correlated. No significant correlations were identified between betaglycan and ActRIIA or -IIB. Considering all follicles analysed, granulosa mRNA expression of betaglycan, ActRI, ActRIIA and ActRIIB were all significantly lower than in corresponding thecal tissue (betaglycan, 11.4-fold; ActRIIB, 5.1-fold; ActRI, 3.8-fold; ActRIIA, 2.8-fold). The co-localization of type-I and -II activin receptors and betaglycan on granulosa and thecal cells are consistent with a local auto/paracrine role of inhibins and activins in modulating ovarian follicle development, selection and progression in the domestic fowl.
Abstract:
A number of strategies are emerging for the high throughput (HTP) expression of recombinant proteins to enable structural and functional study. Here we describe a workable HTP strategy based on parallel protein expression in E. coli and insect cells. Using this system we provide comparative expression data for five proteins derived from the Autographa californica polyhedrosis virus genome that vary in amino acid composition and in molecular weight. Although the proteins are part of a set of factors known to be required for viral late gene expression, the precise function of three of the five, late expression factors (lefs) 6, 7 and 10, is unknown. Rapid expression and characterisation has allowed the determination of their ability to bind DNA and shown a cellular location consistent with their properties. Our data point to the utility of a parallel expression strategy to rapidly obtain workable protein expression levels from many open reading frames (ORFs).
Abstract:
This paper is concerned with the computational efficiency of fuzzy clustering algorithms when the data set to be clustered is described by a proximity matrix only (relational data) and the number of clusters must be automatically estimated from such data. A fuzzy variant of an evolutionary algorithm for relational clustering is derived and compared against two systematic (pseudo-exhaustive) approaches that can also be used to automatically estimate the number of fuzzy clusters in relational data. An extensive collection of experiments involving 18 artificial and two real data sets is reported and analyzed. (C) 2011 Elsevier B.V. All rights reserved.
Abstract:
A large amount of biological data has been produced in recent years. Important knowledge can be extracted from these data by the use of data analysis techniques. Clustering plays an important role in data analysis, by organizing similar objects from a dataset into meaningful groups. Several clustering algorithms have been proposed in the literature. However, each algorithm has its bias, being more adequate for particular datasets. This paper presents a mathematical formulation to support the creation of consistent clusters for biological data. Moreover, it presents a clustering algorithm that solves this formulation using GRASP (Greedy Randomized Adaptive Search Procedure). We compared the proposed algorithm with three other known algorithms. The proposed algorithm produced the best clustering results, as confirmed statistically. (C) 2009 Elsevier Ltd. All rights reserved.
Abstract:
Eutherian mammals share a common ancestor that evolved into two main placental types, i.e., hemotrophic (e.g., human and mouse) and histiotrophic (e.g., farm animals), which differ in invasiveness. Pregnancies initiated with assisted reproductive techniques (ART) in farm animals are at increased risk of failure; these losses were associated with placental defects, perhaps due to altered gene expression. Developmentally regulated genes in the placenta seem highly phylogenetically conserved, whereas those expressed later in pregnancy are more species-specific. To elucidate differences between hemotrophic and epitheliochorial placentae, gene expression data were compiled from microarray studies of bovine placental tissues at various stages of pregnancy. Moreover, an in silico subtractive library was constructed based on homology of bovine genes to the database of zebrafish - a nonplacental vertebrate. In addition, the list of placental preferentially expressed genes for the human and mouse were collected using bioinformatics tools (Tissue-specific Gene Expression and Regulation [TiGER] - for humans, and tissue-specific genes database (TiSGeD) - for mice and humans). Humans, mice, and cattle shared 93 genes expressed in their placentae. Most of these were related to immune function (based on analysis of gene ontology). Cattle and women shared expression of 23 genes, mostly related to hormonal activity, whereas mice and women shared 16 genes (primarily sexual differentiation and glycoprotein biology). Because the number of genes expressed by the placentae of both cattle and mice were similar (based on cluster analysis), we concluded that both cattle and mice were suitable models to study the biology of the human placenta. (C) 2011 Elsevier B.V. All rights reserved.