969 results for SNP microarray
Abstract:
Use of microarray technology often leads to high-dimensional, low-sample-size data settings. Over the past several years, a variety of novel approaches have been proposed for variable selection in this context. However, only a small number of these have been adapted for time-to-event data where censoring is present. Among standard variable selection methods shown both to have good predictive accuracy and to be computationally efficient is the elastic net penalization approach. In this paper, adaptation of the elastic net approach is presented for variable selection both under the Cox proportional hazards model and under an accelerated failure time (AFT) model. Assessment of the two methods is conducted through simulation studies and through analysis of microarray data obtained from a set of patients with diffuse large B-cell lymphoma where time to survival is of interest. The approaches are shown to match or exceed the predictive performance of a Cox-based and an AFT-based variable selection method. The methods are moreover shown to be much more computationally efficient than their respective Cox- and AFT-based counterparts.
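As a rough illustration of what an elastic net penalty looks like when attached to the Cox model, the following minimal sketch evaluates a penalized negative partial log-likelihood. It is not the paper's algorithm: the function name, the `lam`/`alpha` parameterization (borrowed from the common convention in which `alpha` mixes the L1 and L2 terms), the scaling by sample size, and the assumption of no tied event times are all choices made here for illustration.

```python
import math

def cox_elastic_net_objective(beta, X, time, event, lam=0.1, alpha=0.5):
    """Penalized negative Cox partial log-likelihood (hypothetical helper).

    beta  : list of coefficients, one per covariate
    X     : list of covariate rows, one per subject
    time  : observed follow-up times
    event : 1 if the event was observed, 0 if censored
    Assumes no tied event times.
    """
    n = len(time)
    # Linear predictor for every subject.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]

    loglik = 0.0
    for i in range(n):
        if event[i]:
            # Risk set: subjects still under observation at time[i].
            risk = sum(math.exp(eta[j]) for j in range(n) if time[j] >= time[i])
            loglik += eta[i] - math.log(risk)

    # Elastic net penalty: a mix of the L1 (lasso) and squared L2 (ridge) norms.
    l1 = sum(abs(b) for b in beta)
    l2 = sum(b * b for b in beta)
    penalty = lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2)

    return -loglik / n + penalty


# Toy data: three subjects, one covariate, all events observed.
X = [[0.0], [1.0], [2.0]]
time = [1.0, 2.0, 3.0]
event = [1, 1, 1]
print(cox_elastic_net_objective([0.0], X, time, event))
```

Minimizing an objective of this kind over `beta` shrinks coefficients toward zero, and the L1 part can set some exactly to zero, which is what makes the penalty usable for variable selection when covariates far outnumber samples.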
Abstract:
Submicroscopic changes in chromosomal DNA copy number dosage are common and have been implicated in many heritable diseases and cancers. Recent high-throughput technologies have a resolution that permits the detection of segmental changes in DNA copy number that span thousands of basepairs across the genome. Genome-wide association studies (GWAS) may simultaneously screen for copy number-phenotype and SNP-phenotype associations as part of the analytic strategy. However, genome-wide array analyses are particularly susceptible to batch effects as the logistics of preparing DNA and processing thousands of arrays often involves multiple laboratories and technicians, or changes over calendar time to the reagents and laboratory equipment. Failure to adjust for batch effects can lead to incorrect inference and requires inefficient post-hoc quality control procedures that exclude regions that are associated with batch. Our work extends previous model-based approaches for copy number estimation by explicitly modeling batch effects and using shrinkage to improve locus-specific estimates of copy number uncertainty. Key features of this approach include the use of diallelic genotype calls from experimental data to estimate batch- and locus-specific parameters of background and signal without the requirement of training data. We illustrate these ideas using a study of bipolar disease and a study of chromosome 21 trisomy. The former has batch effects that dominate much of the observed variation in quantile-normalized intensities, while the latter illustrates the robustness of our approach to datasets where as many as 25% of the samples have altered copy number. Locus-specific estimates of copy number can be plotted on the copy-number scale to investigate mosaicism and guide the choice of appropriate downstream approaches for smoothing the copy number as a function of physical position. 
The software is open source and implemented in the R package CRLMM available at Bioconductor (http://www.bioconductor.org).
Abstract:
Simulation-based assessment is a popular and frequently necessary approach to evaluation of statistical procedures. Sometimes overlooked is the ability to take advantage of underlying mathematical relations and we focus on this aspect. We show how to take advantage of large-sample theory when conducting a simulation using the analysis of genomic data as a motivating example. The approach uses convergence results to provide an approximation to smaller-sample results, results that are available only by simulation. We consider evaluating and comparing a variety of ranking-based methods for identifying the most highly associated SNPs in a genome-wide association study, derive integral equation representations of the pre-posterior distribution of percentiles produced by three ranking methods, and provide examples comparing performance. These results are of interest in their own right and set the framework for a more extensive set of comparisons.
Abstract:
Amplifications and deletions of chromosomal DNA, as well as copy-neutral loss of heterozygosity, have been associated with disease processes. High-throughput single nucleotide polymorphism (SNP) arrays are useful for making genome-wide estimates of copy number and genotype calls. Because neighboring SNPs in high-throughput SNP arrays are likely to have dependent copy number and genotype due to the underlying haplotype structure and linkage disequilibrium, hidden Markov models (HMMs) may be useful for improving genotype calls and copy number estimates that do not incorporate information from nearby SNPs. We improve previous approaches that utilize an HMM framework for inference in high-throughput SNP arrays by integrating copy number, genotype calls, and the corresponding confidence scores when available. Using simulated data, we demonstrate how confidence scores control smoothing in a probabilistic framework. Software for fitting HMMs to SNP array data is available in the R package ICE.
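To make the idea of HMM smoothing along the genome concrete, here is a minimal Viterbi sketch over integer copy-number states. It is an illustration only and does not reproduce the model in the ICE package: the Gaussian emissions centered on the state value, the single `stay` probability, and the use of a per-locus standard deviation as a stand-in for a confidence score (larger value means less confidence) are all assumptions made here.

```python
import math

def viterbi_copy_number(obs, sds, states=(1, 2, 3), stay=0.99):
    """Most likely copy-number path for per-locus estimates (hypothetical helper).

    obs    : per-locus copy-number estimates on the copy-number scale
    sds    : per-locus standard deviations (assumed proxy for confidence)
    states : hidden copy-number states
    stay   : probability of keeping the same state between adjacent loci
    """
    k = len(states)
    switch = (1.0 - stay) / (k - 1)
    log_trans = [[math.log(stay if a == b else switch) for b in range(k)]
                 for a in range(k)]

    def log_emit(x, mean, sd):
        # Log density of a Gaussian centered on the state's copy number.
        return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))

    # Uniform initial distribution over states.
    score = [math.log(1.0 / k) + log_emit(obs[0], s, sds[0]) for s in states]
    back = []
    for x, sd in zip(obs[1:], sds[1:]):
        new_score, pointers = [], []
        for b in range(k):
            best_a = max(range(k), key=lambda a: score[a] + log_trans[a][b])
            pointers.append(best_a)
            new_score.append(score[best_a] + log_trans[best_a][b]
                             + log_emit(x, states[b], sd))
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    path = [max(range(k), key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return [states[i] for i in path]


obs = [2.0, 2.0, 1.0, 2.0, 2.0]
print(viterbi_copy_number(obs, [0.2] * 5))                  # confident dip at locus 3
print(viterbi_copy_number(obs, [0.2, 0.2, 1.0, 0.2, 0.2]))  # same dip, low confidence
```

In this toy setup a confident single-locus dip is called as a one-copy state, while the same dip with a large standard deviation is smoothed away, which is one simple way that a confidence measure can govern how much neighboring loci influence a call.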
Abstract:
Coat color dilution in several breeds of dog is characterized by a specific pigmentation phenotype and sometimes accompanied by hair loss and recurrent skin inflammation, the so-called color dilution alopecia or black hair follicular dysplasia. Coat color dilution (d) is inherited as a Mendelian autosomal recessive trait. In a previous study, MLPH polymorphisms showed perfect cosegregation with the dilute phenotype within breeds. However, different dilute haplotypes were found in different breeds, and no single polymorphism was identified in the coding sequence that was likely to be causative for the dilute phenotype. We resequenced the 5'-region of the canine MLPH gene and identified a strong candidate single nucleotide polymorphism within the nontranslated exon 1, which showed perfect association with the dilute phenotype in 65 dilute dogs from 7 different breeds. The A/G polymorphism is located at the last nucleotide of exon 1 and the mutant A-allele is predicted to reduce splicing efficiency 8-fold. An MLPH mRNA expression study using quantitative reverse transcriptase-polymerase chain reaction confirmed that dd animals had only about 25% of the MLPH transcript compared with DD animals. These results provide preliminary evidence that the reported regulatory MLPH mutation might represent a causal mutation for coat color dilution in dogs.
Abstract:
Classical antibody-based serotyping of Escherichia coli is an important method in diagnostic microbiology for epidemiological purposes, as well as for a rough virulence assessment. However, serotyping is so tedious that its use is restricted to a few reference laboratories. To improve this situation we developed and validated a genetic approach for serotyping based on the microarray technology. The genes encoding the O-antigen flippase (wzx) and the O-antigen polymerase (wzy) were selected as target sequences for the O antigen, whereas fliC and related genes, which code for the flagellar monomer, were chosen as representatives for the H phenotype. Starting with a detailed bioinformatic analysis and oligonucleotide design, an ArrayTube-based assay was established: a fast and robust DNA extraction method was coupled with a site-specific, linear multiplex labeling procedure and hybridization analysis of the biotinylated amplicons. The microarray contained oligonucleotide DNA probes, each in duplicate, representing 24 of the epidemiologically most relevant of the over 180 known O antigens (O antigens 4, 6 to 9, 15, 26, 52, 53, 55, 79, 86, 91, 101, 103, 104, 111, 113, 114, 121, 128, 145, 157, and 172) as well as 47 of the 53 different H antigens (H antigens 1 to 12, 14 to 16, 18 to 21, 23 to 34, 37 to 43, 45, 46, 48, 49, 51 to 54, and 56). Evaluation of the microarray with a set of defined strains representing all O and H serotypes covered revealed that it has a high sensitivity and a high specificity. All of the conventionally typed 24 O groups and all of the 47 H serotypes were correctly identified. Moreover, strains which were nonmotile or nontypeable by previous serotyping assays yielded unequivocal results with the novel ArrayTube assay, which proved to be a valuable alternative to classical serotyping, allowing processing of single colonies within a single working day.
Abstract:
Staphylococcus aureus is a common pathogen which can colonise and infect not only man, but also domestic animals. Infection of cattle is of particularly high economic relevance, as S. aureus is an important causal agent of bovine mastitis. In the present contribution, a DNA microarray was applied for the study of 144 different gene targets, including resistance genes and genes encoding exotoxins, in S. aureus isolated from cows. One hundred and twenty-eight isolates from Germany and Switzerland were tested. These isolates were assigned to 20 different strains and nine clonal complexes. The majority of isolates either belonged to the apparently closely related clonal complexes 8, 25, and 97 (together 34.4%) or were related to the sequenced bovine strain RF122 (48.4%). Notable characteristics of S. aureus of bovine origin are the carriage of intact haemolysin beta (in 82% of isolates tested), the absence of staphylokinase (in 89.1%), the presence of allelic variants of several exotoxins such as toxic shock syndrome toxin and enterotoxin N, and the occurrence of the leukocidin lukF-P83/lukM (in 53.1%). Two isolates were methicillin-resistant S. aureus (MRSA). One of them was a clonal complex 8 MRSA related to the epidemic MRSA strain Irish 01. The other one belonged to ST398/spa-type 34, resembling a newly emerging MRSA strain which has been described to occur in humans as well as in domestic animals. The presence of these two strains highlights the possibility of transfers of S. aureus strains between different host species.
Abstract:
A nonfluorescent low-cost, low-density oligonucleotide array was designed for detecting the whole coronavirus genus after reverse transcription (RT)-PCR. The limit of detection was 15.7 copies/reaction. The clinical detection limit in patients with severe acute respiratory syndrome was 100 copies/sample. In 39 children suffering from coronavirus 229E, NL63, OC43, or HKU1, the sensitivity was equal to that of individual real-time RT-PCRs.
Abstract:
Cholangiocarcinoma is the second most common malignant tumor of the liver. We analyzed, immunohistochemically, the significance of cell cycle- and apoptosis-related markers in 128 cholangiocarcinomas (42 intrahepatic, 70 extrahepatic, and 16 gallbladder carcinomas) combined in a tissue microarray. Follow-up was available for 57 patients (44.5%). In comparison with normal tissue (29 specimens), cholangiocarcinomas expressed p53, bcl-2, bax, and COX-2 significantly more frequently (P < .05). Intrahepatic tumors were significantly more frequently bcl-2+ and p16+, whereas extrahepatic tumors were more often p53+ (P < .05). Loss of p16 expression was associated with reduced survival of patients. Our data show that p53, bcl-2, bax, and COX-2 have an important role in the pathogenesis of cholangiocarcinomas. The differential expression of p16, bcl-2, and p53 between intrahepatic and extrahepatic tumors demonstrates that there are location-related differences in the phenotype and the genetic profiles of these tumors. Moreover, p16 was identified as an important prognostic marker in cholangiocarcinomas.
Abstract:
Nitrogen and water are essential for plant growth and development. In this study, we designed experiments to produce gene expression data from poplar roots under nitrogen starvation and water deprivation. We found that a low concentration of nitrogen led first to increased root elongation, followed by lateral root proliferation and eventually increased root biomass. To identify genes regulating root growth and development under nitrogen starvation and water deprivation, we designed a series of data analysis procedures through which we identified biologically important genes. Differentially expressed gene (DEG) analysis identified the genes that are differentially expressed under nitrogen starvation or drought. Protein domain enrichment analysis identified enriched themes (in the same domains) that are highly interactive during the treatment. Gene Ontology (GO) enrichment analysis allowed us to identify biological processes that changed during nitrogen starvation. Based on the above analyses, we examined the local gene regulatory network (GRN) and identified a number of transcription factors. After testing, one of them proved to be a hierarchically high-ranked transcription factor that affects root growth under nitrogen starvation. Analyzing gene expression data manually is tedious and time-consuming. To avoid this, we have begun automating a computational pipeline, which can now be used for identification of DEGs and protein domain analysis in a single run. It is implemented in Perl and R scripts.
Abstract:
Transcriptomics could contribute significantly to the early and specific diagnosis of rejection episodes by defining 'molecular Banff' signatures. Recently, the description of pathogenesis-based transcript sets offered a new opportunity for objective and quantitative diagnosis. Generating high-quality transcript panels is thus critical to define high-performance diagnostic classifiers. In this study, a comparative analysis was performed across four different microarray datasets of heterogeneous sample collections: two published clinical datasets and two of our own datasets, including biopsies for clinical indication and samples from nonhuman primates. We characterized a common transcriptional profile of 70 genes, defined as the acute rejection transcript set (ARTS). ARTS expression is significantly up-regulated in all AR samples as compared with stable allografts or healthy kidneys, and strongly correlates with the severity of Banff AR types. Similarly, ARTS were tested as a classifier in a large collection of 143 independent biopsies recently published by the University of Alberta. Results demonstrate that the 'in silico' approach applied in this study is able to identify a robust and reliable molecular signature for AR, supporting a specific and sensitive molecular diagnostic approach for renal transplant monitoring.
Abstract:
BACKGROUND: Gene expression analysis has emerged as a major biological research area, with real-time quantitative reverse transcription PCR (RT-QPCR) being one of the most accurate and widely used techniques for expression profiling of selected genes. In order to obtain results that are comparable across assays, a stable normalization strategy is required. In general, the normalization of PCR measurements between different samples uses one to several control genes (e.g. housekeeping genes), from which a baseline reference level is constructed. Thus, the choice of the control genes is of utmost importance, yet there is not a generally accepted standard technique for screening a large number of candidates and identifying the best ones. RESULTS: We propose a novel approach for scoring and ranking candidate genes for their suitability as control genes. Our approach relies on publicly available microarray data and allows the combination of multiple data sets originating from different platforms and/or representing different pathologies. The use of microarray data allows the screening of tens of thousands of genes, producing very comprehensive lists of candidates. We also provide two lists of candidate control genes: one which is breast cancer-specific and one with more general applicability. Two genes from the breast cancer list which had not been previously used as control genes are identified and validated by RT-QPCR. Open source R functions are available at http://www.isrec.isb-sib.ch/~vpopovic/research/ CONCLUSION: We proposed a new method for identifying candidate control genes for RT-QPCR which was able to rank thousands of genes according to some predefined suitability criteria and we applied it to the case of breast cancer. We also empirically showed that translating the results from microarray to PCR platform was achievable.
Abstract:
A disposable microarray was developed for detection of up to 90 antibiotic resistance genes in gram-positive bacteria by hybridization. Each antibiotic resistance gene is represented by two specific oligonucleotides chosen from consensus sequences of gene families, except for nine genes for which only one specific oligonucleotide could be developed. A total of 137 oligonucleotides (26 to 33 nucleotides in length with similar physicochemical parameters) were spotted onto the microarray. The microarrays (ArrayTubes) were hybridized with 36 strains carrying specific antibiotic resistance genes that allowed testing of the sensitivity and specificity of 125 oligonucleotides. Among these were well-characterized multidrug-resistant strains of Enterococcus faecalis, Enterococcus faecium, and Lactococcus lactis and an avirulent strain of Bacillus anthracis harboring the broad-host-range resistance plasmid pRE25. Analysis of two multidrug-resistant field strains allowed the detection of 12 different antibiotic resistance genes in a Staphylococcus haemolyticus strain isolated from mastitis milk and 6 resistance genes in a Clostridium perfringens strain isolated from a calf. In both cases, the microarray genotyping corresponded to the phenotype of the strains. The ArrayTube platform presents the advantage of rapidly screening bacteria for the presence of antibiotic resistance genes known in gram-positive bacteria. This technology has a large potential for applications in basic research, food safety, and surveillance programs for antimicrobial resistance.
Abstract:
Horses were domesticated from the Eurasian steppes 5,000-6,000 years ago. Since then, the use of horses for transportation, warfare, and agriculture, as well as selection for desired traits and fitness, has resulted in diverse populations distributed across the world, many of which have become or are in the process of becoming formally organized into closed breeding populations (breeds). This report describes the use of a genome-wide set of autosomal SNPs and 814 horses from 36 breeds to provide the first detailed description of equine breed diversity. F(ST) calculations, parsimony, and distance analysis demonstrated relationships among the breeds that largely reflect geographic origins and known breed histories. Low levels of population divergence were observed between breeds that are relatively early on in the process of breed development, and between those with high levels of within-breed diversity, whether due to large population size, ongoing outcrossing, or large within-breed phenotypic diversity. Populations with low within-breed diversity included those which have experienced population bottlenecks, have been under intense selective pressure, or are closed populations with long breed histories. These results provide new insights into the relationships among and the diversity within breeds of horses. In addition, these results will facilitate future genome-wide association studies and investigations into genomic targets of selection.
Abstract:
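For readers unfamiliar with F(ST), the sketch below computes a basic single-SNP version from allele frequencies in several populations, using the classical definition based on expected heterozygosity. The abstract does not say which estimator the authors used, so the equal weighting of populations, the absence of any sample-size correction, and the function name are assumptions made for illustration.

```python
def fst_biallelic(freqs):
    """Single-locus F_ST from per-population allele frequencies (hypothetical helper).

    freqs : frequency of one allele of a biallelic SNP in each population
    Uses F_ST = (H_T - H_S) / H_T with equal population weights.
    """
    n = len(freqs)
    # Mean expected heterozygosity within populations.
    h_s = sum(2 * p * (1 - p) for p in freqs) / n
    # Expected heterozygosity of the pooled population.
    p_bar = sum(freqs) / n
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return 0.0  # locus is monomorphic everywhere; no divergence to measure
    return (h_t - h_s) / h_t


print(fst_biallelic([0.5, 0.5]))  # identical frequencies
print(fst_biallelic([0.2, 0.8]))  # moderately diverged
print(fst_biallelic([0.0, 1.0]))  # fixed for different alleles
```

Values near 0 indicate that the populations share similar allele frequencies, and values near 1 indicate strong divergence. A genome-wide summary would combine such terms across many SNPs, typically with an estimator that corrects for sample size.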
Background Tissue microarray (TMA) technology revolutionized the investigation of potential biomarkers from paraffin-embedded tissues. However, conventional TMA construction is laborious, time-consuming and imprecise. Next-generation tissue microarrays (ngTMA) combine histological expertise with digital pathology and automated tissue microarraying. The aim of this study was to test the feasibility of ngTMA for the investigation of biomarkers within the tumor microenvironment (tumor center and invasion front) of six tumor types, using CD3, CD8 and CD45RO as an example. Methods Ten cases each of malignant melanoma, lung, breast, gastric, prostate and colorectal cancers were reviewed. The most representative H&E slide was scanned and uploaded onto a digital slide management platform. Slides were viewed and seven TMA annotations of 1 mm in diameter were placed directly onto the digital slide. Different colors were used to identify the exact regions in normal tissue (n = 1), tumor center (n = 2), tumor front (n = 2), and tumor microenvironment at invasion front (n = 2) for subsequent punching. Donor blocks were loaded into an automated tissue microarrayer. Images of the donor block were superimposed with annotated digital slides. Exact annotated regions were punched out of each donor block and transferred into a TMA block. 420 tissue cores created two ngTMA blocks. H&E staining and immunohistochemistry for CD3, CD8 and CD45RO were performed. Results All 60 slides were scanned automatically (total time < 10 hours), uploaded and viewed. Annotation time was 1 hour. The 60 donor blocks were loaded into the tissue microarrayer, simultaneously. Alignment of donor block images and digital slides was possible in less than 2 minutes/case. Automated punching of tissue cores and transfer took 12 seconds/core. Total ngTMA construction time was 1.4 hours. 
Stains for H&E and CD3, CD8 and CD45RO highlighted the precision with which ngTMA could capture regions of tumor-stroma interaction of each cancer and the T-lymphocytic immune reaction within the tumor microenvironment. Conclusion Based on manual selection criteria, ngTMA is able to capture histological zones or cell types of interest precisely and accurately, aiding the pathological study of the tumor microenvironment. This approach would be advantageous for visualizing proteins, DNA, mRNA and microRNAs in specific cell types using in situ hybridization techniques.
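The core count and construction time reported in the Methods and Results of this abstract can be cross-checked with simple arithmetic. The short sketch below uses only figures stated in the abstract; the variable names are illustrative.

```python
# Figures taken from the abstract.
tumor_types = 6
cases_per_type = 10
annotations_per_case = 1 + 2 + 2 + 2   # normal, tumor center, tumor front, microenvironment
seconds_per_core = 12

cases = tumor_types * cases_per_type
cores = cases * annotations_per_case
construction_hours = cores * seconds_per_core / 3600

print(cases, cores, construction_hours)
```

The result agrees with the 60 slides, 420 tissue cores and 1.4 hours of total construction time quoted in the abstract, which suggests that the construction time refers to automated punching and transfer alone.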