969 resultados para SNP microarray
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Resumo:
Haplotypes formed by polymorphisms (T-786C, rs2070744; a variable number of tandem repeats in intron 4, and Glu298Asp, rs1799983) of the eNOS gene were associated previously with gestational hypertension (GH) and preeclampsia (PE). However, no study has explored the Tag SNPs rs743506 and rs7830 in these disorders. The aim of the current study was to compare the distribution of the genotypes and haplotypes formed by the five eNOS polymorphisms mentioned among healthy pregnant (HP, n = 122), GH (n = 138), and PE (n = 157). The haplotype formed by "C b G G C" was more frequent in HP compared to GH and PE (p = 0.0071), which is supported by previous findings that demonstrated the association of the combination "C b G" with a higher level of nitrite (NO marker). Our results suggest a protective effect of the haplotype "C b G G C" against the development of hypertensive disorders of pregnancy.
Resumo:
Background: The Interleukin 28B (IL28B) rs12979860 polymorphisms was recently reported to be associated with the human T-cell leukemia virus type 1 (HTLV-1) proviral load (PvL) and the development of the HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP). Methods: In an attempt to examine this hypothesis, we assessed the association of the rs12979860 genotypes with HTLV-1 PvL levels and clinical status in 112 unrelated Brazilian subjects (81 HTLV-1 asymptomatic carriers, 24 individuals with HAM/TSP and 7 with Adult T cell Leukemia/Lymphoma (ATLL)). Results: All 112 samples were successfully genotyped and their PvLs compared. Neither the homozygote TT nor the heterozygote CT mutations nor the combination genotypes (TT/CT) were associated with a greater PvL. We also observed no significant difference in allele distribution between asymptomatic carriers and patients with HTLV-1 associated HAM/TSP. Conclusions: Our study failed to support the previously reported positive association between the IL28B rs12979860 polymorphisms and an increased risk of developing HAM/TSP in the Brazilian population.
Resumo:
The sera of a retrospective cohort (n = 41) composed of children with well characterized cow's milk allergy collected from multiple visits were analyzed using a protein microarray system measuring four classes of immunoglobulins. The frequency of the visits, age and gender distribution reflected real situation faced by the clinicians at a pediatric reference center for food allergy in 530 Paulo, Brazil. The profiling array results have shown that total IgG and IgA share similar specificity whilst IgM and in particular IgE are distantly related. The correlation of specificity of IgE and IgA is variable amongst the patients and this relationship cannot be used to predict atopy or the onset of tolerance to milk. The array profiling technique has corroborated the clinical selection criteria for this cohort albeit it clearly suggested that 4 out of the 41 patients might have allergies other than milk origin. There was also a good correlation between the array data and ImmunoCAP results, casein in particular. By using qualitative and quantitative multivariate analysis routines it was possible to produce validated statistical models to predict with reasonable accuracy the onset of tolerance to milk proteins. If expanded to larger study groups, the array profiling in combination with the multivariate techniques show potential to improve the prognostic of milk allergic patients. (C) 2012 Elsevier B.V. All rights reserved.
Resumo:
Trichoepithelioma is a benign neoplasm that shares both clinical and histological features with basal cell carcinoma. It is important to distinguish these neoplasms because they require different clinical behavior and therapeutic planning. Many studies have addressed the use of immunohistochemistry to improve the differential diagnosis of these tumors. These studies present conflicting results when addressing the same markers, probably owing to the small number of basaloid tumors that comprised their studies, which generally did not exceed 50 cases. We built a tissue microarray with 162 trichoepithelioma and 328 basal cell carcinoma biopsies and tested a panel of immune markers composed of CD34, CD10, epithelial membrane antigen, Bcl-2, cytokeratins 15 and 20 and D2-40. The results were analyzed using multiple linear and logistic regression models. This analysis revealed a model that could differentiate trichoepithelioma from basal cell carcinoma in 36% of the cases. The panel of immunohistochemical markers required to differentiate between these tumors was composed of CD10, cytokeratin 15, cytokeratin 20 and D2-40. The results obtained in this work were generated from a large number of biopsies and resulted in the confirmation of overlapping epithelial and stromal immunohistochemical profiles from these basaloid tumors. The results also corroborate the point of view that trichoepithelioma and basal cell carcinoma tumors represent two different points in the differentiation of a single cell type. Despite the use of panels of immune markers, histopathological criteria associated with clinical data certainly remain the best guideline for the differential diagnosis of trichoepithelioma and basal cell carcinoma. Modern Pathology (2012) 25, 1345-1353; doi: 10.1038/modpathol.2012.96; published online 8 June 2012
Resumo:
Abstract Background With the development of DNA hybridization microarray technologies, nowadays it is possible to simultaneously assess the expression levels of thousands to tens of thousands of genes. Quantitative comparison of microarrays uncovers distinct patterns of gene expression, which define different cellular phenotypes or cellular responses to drugs. Due to technical biases, normalization of the intensity levels is a pre-requisite to performing further statistical analyses. Therefore, choosing a suitable approach for normalization can be critical, deserving judicious consideration. Results Here, we considered three commonly used normalization approaches, namely: Loess, Splines and Wavelets, and two non-parametric regression methods, which have yet to be used for normalization, namely, the Kernel smoothing and Support Vector Regression. The results obtained were compared using artificial microarray data and benchmark studies. The results indicate that the Support Vector Regression is the most robust to outliers and that Kernel is the worst normalization technique, while no practical differences were observed between Loess, Splines and Wavelets. Conclusion In face of our results, the Support Vector Regression is favored for microarray normalization due to its superiority when compared to the other methods for its robustness in estimating the normalization curve.
Resumo:
Abstract Background The search for enriched (aka over-represented or enhanced) ontology terms in a list of genes obtained from microarray experiments is becoming a standard procedure for a system-level analysis. This procedure tries to summarize the information focussing on classification designs such as Gene Ontology, KEGG pathways, and so on, instead of focussing on individual genes. Although it is well known in statistics that association and significance are distinct concepts, only the former approach has been used to deal with the ontology term enrichment problem. Results BayGO implements a Bayesian approach to search for enriched terms from microarray data. The R source-code is freely available at http://blasto.iq.usp.br/~tkoide/BayGO in three versions: Linux, which can be easily incorporated into pre-existent pipelines; Windows, to be controlled interactively; and as a web-tool. The software was validated using a bacterial heat shock response dataset, since this stress triggers known system-level responses. Conclusion The Bayesian model accounts for the fact that, eventually, not all the genes from a given category are observable in microarray data due to low intensity signal, quality filters, genes that were not spotted and so on. Moreover, BayGO allows one to measure the statistical association between generic ontology terms and differential expression, instead of working only with the common significance analysis.
Resumo:
Abstract Background Smallpox is a lethal disease that was endemic in many parts of the world until eradicated by massive immunization. Due to its lethality, there are serious concerns about its use as a bioweapon. Here we analyze publicly available microarray data to further understand survival of smallpox infected macaques, using systems biology approaches. Our goal is to improve the knowledge about the progression of this disease. Results We used KEGG pathways annotations to define groups of genes (or modules), and subsequently compared them to macaque survival times. This technique provided additional insights about the host response to this disease, such as increased expression of the cytokines and ECM receptors in the individuals with higher survival times. These results could indicate that these gene groups could influence an effective response from the host to smallpox. Conclusion Macaques with higher survival times clearly express some specific pathways previously unidentified using regular gene-by-gene approaches. Our work also shows how third party analysis of public datasets can be important to support new hypotheses to relevant biological problems.
Resumo:
Abstract Background From shotgun libraries used for the genomic sequencing of the phytopathogenic bacterium Xanthomonas axonopodis pv. citri (XAC), clones that were representative of the largest possible number of coding sequences (CDSs) were selected to create a DNA microarray platform on glass slides (XACarray). The creation of the XACarray allowed for the establishment of a tool that is capable of providing data for the analysis of global genome expression in this organism. Findings The inserts from the selected clones were amplified by PCR with the universal oligonucleotide primers M13R and M13F. The obtained products were purified and fixed in duplicate on glass slides specific for use in DNA microarrays. The number of spots on the microarray totaled 6,144 and included 768 positive controls and 624 negative controls per slide. Validation of the platform was performed through hybridization of total DNA probes from XAC labeled with different fluorophores, Cy3 and Cy5. In this validation assay, 86% of all PCR products fixed on the glass slides were confirmed to present a hybridization signal greater than twice the standard deviation of the deviation of the global median signal-to-noise ration. Conclusions Our validation of the XACArray platform using DNA-DNA hybridization revealed that it can be used to evaluate the expression of 2,365 individual CDSs from all major functional categories, which corresponds to 52.7% of the annotated CDSs of the XAC genome. As a proof of concept, we used this platform in a previously work to verify the absence of genomic regions that could not be detected by sequencing in related strains of Xanthomonas.
Resumo:
Abstract Background Papaya (Carica papaya L.) is a commercially important crop that produces climacteric fruits with a soft and sweet pulp that contain a wide range of health promoting phytochemicals. Despite its importance, little is known about transcriptional modifications during papaya fruit ripening and their control. In this study we report the analysis of ripe papaya transcriptome by using a cross-species (XSpecies) microarray technique based on the phylogenetic proximity between papaya and Arabidopsis thaliana. Results Papaya transcriptome analyses resulted in the identification of 414 ripening-related genes with some having their expression validated by qPCR. The transcription profile was compared with that from ripening tomato and grape. There were many similarities between papaya and tomato especially with respect to the expression of genes encoding proteins involved in primary metabolism, regulation of transcription, biotic and abiotic stress and cell wall metabolism. XSpecies microarray data indicated that transcription factors (TFs) of the MADS-box, NAC and AP2/ERF gene families were involved in the control of papaya ripening and revealed that cell wall-related gene expression in papaya had similarities to the expression profiles seen in Arabidopsis during hypocotyl development. Conclusion The cross-species array experiment identified a ripening-related set of genes in papaya allowing the comparison of transcription control between papaya and other fruit bearing taxa during the ripening process.
Resumo:
Financial support. Brazilian Ministry of Science and Technology (CNPq Grant 577047-2008-6), FAP-DF NEXTREE Grant 193.000.570/2009 and EMBRAPA Macroprogram 2 project grant 02.07.01.004.
Resumo:
The main aim of this Ph.D. dissertation is the study of clustering dependent data by means of copula functions with particular emphasis on microarray data. Copula functions are a popular multivariate modeling tool in each field where the multivariate dependence is of great interest and their use in clustering has not been still investigated. The first part of this work contains the review of the literature of clustering methods, copula functions and microarray experiments. The attention focuses on the K–means (Hartigan, 1975; Hartigan and Wong, 1979), the hierarchical (Everitt, 1974) and the model–based (Fraley and Raftery, 1998, 1999, 2000, 2007) clustering techniques because their performance is compared. Then, the probabilistic interpretation of the Sklar’s theorem (Sklar’s, 1959), the estimation methods for copulas like the Inference for Margins (Joe and Xu, 1996) and the Archimedean and Elliptical copula families are presented. In the end, applications of clustering methods and copulas to the genetic and microarray experiments are highlighted. The second part contains the original contribution proposed. A simulation study is performed in order to evaluate the performance of the K–means and the hierarchical bottom–up clustering methods in identifying clusters according to the dependence structure of the data generating process. Different simulations are performed by varying different conditions (e.g., the kind of margins (distinct, overlapping and nested) and the value of the dependence parameter ) and the results are evaluated by means of different measures of performance. In light of the simulation results and of the limits of the two investigated clustering methods, a new clustering algorithm based on copula functions (‘CoClust’ in brief) is proposed. The basic idea, the iterative procedure of the CoClust and the description of the written R functions with their output are given. The CoClust algorithm is tested on simulated data (by varying the number of clusters, the copula models, the dependence parameter value and the degree of overlap of margins) and is compared with the performance of model–based clustering by using different measures of performance, like the percentage of well–identified number of clusters and the not rejection percentage of H0 on . It is shown that the CoClust algorithm allows to overcome all observed limits of the other investigated clustering techniques and is able to identify clusters according to the dependence structure of the data independently of the degree of overlap of margins and the strength of the dependence. The CoClust uses a criterion based on the maximized log–likelihood function of the copula and can virtually account for any possible dependence relationship between observations. Many peculiar characteristics are shown for the CoClust, e.g. its capability of identifying the true number of clusters and the fact that it does not require a starting classification. Finally, the CoClust algorithm is applied to the real microarray data of Hedenfalk et al. (2001) both to the gene expressions observed in three different cancer samples and to the columns (tumor samples) of the whole data matrix.
Resumo:
In the past decade, the advent of efficient genome sequencing tools and high-throughput experimental biotechnology has lead to enormous progress in the life science. Among the most important innovations is the microarray tecnology. It allows to quantify the expression for thousands of genes simultaneously by measurin the hybridization from a tissue of interest to probes on a small glass or plastic slide. The characteristics of these data include a fair amount of random noise, a predictor dimension in the thousand, and a sample noise in the dozens. One of the most exciting areas to which microarray technology has been applied is the challenge of deciphering complex disease such as cancer. In these studies, samples are taken from two or more groups of individuals with heterogeneous phenotypes, pathologies, or clinical outcomes. these samples are hybridized to microarrays in an effort to find a small number of genes which are strongly correlated with the group of individuals. Eventhough today methods to analyse the data are welle developed and close to reach a standard organization (through the effort of preposed International project like Microarray Gene Expression Data -MGED- Society [1]) it is not unfrequant to stumble in a clinician's question that do not have a compelling statistical method that could permit to answer it.The contribution of this dissertation in deciphering disease regards the development of new approaches aiming at handle open problems posed by clinicians in handle specific experimental designs. In Chapter 1 starting from a biological necessary introduction, we revise the microarray tecnologies and all the important steps that involve an experiment from the production of the array, to the quality controls ending with preprocessing steps that will be used into the data analysis in the rest of the dissertation. While in Chapter 2 a critical review of standard analysis methods are provided stressing most of problems that In Chapter 3 is introduced a method to adress the issue of unbalanced design of miacroarray experiments. In microarray experiments, experimental design is a crucial starting-point for obtaining reasonable results. In a two-class problem, an equal or similar number of samples it should be collected between the two classes. However in some cases, e.g. rare pathologies, the approach to be taken is less evident. We propose to address this issue by applying a modified version of SAM [2]. MultiSAM consists in a reiterated application of a SAM analysis, comparing the less populated class (LPC) with 1,000 random samplings of the same size from the more populated class (MPC) A list of the differentially expressed genes is generated for each SAM application. After 1,000 reiterations, each single probe given a "score" ranging from 0 to 1,000 based on its recurrence in the 1,000 lists as differentially expressed. The performance of MultiSAM was compared to the performance of SAM and LIMMA [3] over two simulated data sets via beta and exponential distribution. The results of all three algorithms over low- noise data sets seems acceptable However, on a real unbalanced two-channel data set reagardin Chronic Lymphocitic Leukemia, LIMMA finds no significant probe, SAM finds 23 significantly changed probes but cannot separate the two classes, while MultiSAM finds 122 probes with score >300 and separates the data into two clusters by hierarchical clustering. We also report extra-assay validation in terms of differentially expressed genes Although standard algorithms perform well over low-noise simulated data sets, multi-SAM seems to be the only one able to reveal subtle differences in gene expression profiles on real unbalanced data. In Chapter 4 a method to adress similarities evaluation in a three-class prblem by means of Relevance Vector Machine [4] is described. In fact, looking at microarray data in a prognostic and diagnostic clinical framework, not only differences could have a crucial role. In some cases similarities can give useful and, sometimes even more, important information. The goal, given three classes, could be to establish, with a certain level of confidence, if the third one is similar to the first or the second one. In this work we show that Relevance Vector Machine (RVM) [2] could be a possible solutions to the limitation of standard supervised classification. In fact, RVM offers many advantages compared, for example, with his well-known precursor (Support Vector Machine - SVM [3]). Among these advantages, the estimate of posterior probability of class membership represents a key feature to address the similarity issue. This is a highly important, but often overlooked, option of any practical pattern recognition system. We focused on Tumor-Grade-three-class problem, so we have 67 samples of grade I (G1), 54 samples of grade 3 (G3) and 100 samples of grade 2 (G2). The goal is to find a model able to separate G1 from G3, then evaluate the third class G2 as test-set to obtain the probability for samples of G2 to be member of class G1 or class G3. The analysis showed that breast cancer samples of grade II have a molecular profile more similar to breast cancer samples of grade I. Looking at the literature this result have been guessed, but no measure of significance was gived before.