972 results for Oligonucleotide Array Sequence Analysis
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Objective: The present study aimed at evaluating the PROP1 and HESX1 genes in a group of patients with septo-optic dysplasia (SOD) and pituitary hormone deficiency (combined – CPHD; isolated GH deficiency – GHD). Eleven patients with a clinical and biochemical presentation consistent with CPHD, GHD or SOD were evaluated. Subjects and methods: In all patients, the HESX1 gene was analyzed by direct sequence analysis and in cases of CPHD the PROP1 gene was also sequenced. Results: A polymorphism (1772 A > G; N125S) was identified in a patient with SOD. We found three patients carrying the allelic variants 27 T > C; A9A and 59 A > G; N20S in exon 1 of the PROP1 gene. Mutations in the PROP1 and HESX1 genes were not identified in these patients with sporadic GHD, CPHD and SOD. Conclusion: Genetic alterations in one or several other genes, or non-genetic mechanisms, must be implicated in the pathogenic process.
Abstract:
Intron splicing is one of the most important steps involved in the maturation process of a pre-mRNA. Although the sequence profiles around the splice sites have been studied extensively, the levels of sequence identity between the exonic sequences preceding the donor sites and the intronic sequences preceding the acceptor sites have not been examined as thoroughly. In this study we investigated identity patterns between the last 15 nucleotides of the exonic sequence preceding the 5' splice site and the intronic sequence preceding the 3' splice site in a set of human protein-coding genes that do not exhibit intron retention. We found that almost 60% of consecutive exons and introns in human protein-coding genes share at least two identical nucleotides at their 3' ends and, on average, the sequence identity length is 2.47 nucleotides. Based on our findings we conclude that the 3' ends of exons and introns tend to have longer identical sequences when taken from the same gene than when taken from different genes. Our results hold even if the pairs are non-consecutive in the transcription order. (C) 2012 Elsevier Ltd. All rights reserved.
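The identity measure described in this abstract can be illustrated with a minimal sketch. It assumes that "sequence identity length" means the number of contiguous matching nucleotides counted backwards from the 3' end of each sequence; the abstract does not spell out the definition, and the function name and example sequences below are hypothetical.

```python
def shared_3prime_identity(exon_tail: str, intron_tail: str) -> int:
    """Count contiguous identical nucleotides at the 3' ends of two sequences.

    Assumption: identity is measured as a contiguous run starting at the
    last base of each sequence; the abstract may use a different definition.
    """
    length = 0
    # Walk both sequences from their 3' ends towards their 5' ends.
    for exon_base, intron_base in zip(reversed(exon_tail.upper()),
                                      reversed(intron_tail.upper())):
        if exon_base != intron_base:
            break
        length += 1
    return length


# Hypothetical tails: the exon ends in ...AG and the intron ends in ...AG.
exon_tail = "GGCAAG"
intron_tail = "TTTCAG"
print(shared_3prime_identity(exon_tail, intron_tail))  # 2
```

Applying such a function to the last 15 nucleotides of each exon and intron pair, then averaging, would give a statistic of the same kind as the 2.47-nucleotide mean reported above.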
Abstract:
Surveys were conducted in Brazil, Benin and Tanzania to collect predatory mites as candidates for control of the coconut mite Aceria guerreronis Keifer, a serious pest of coconut fruits. At all locations surveyed, one of the most dominant predators on infested coconut fruits was identified as Neoseiulus baraki Athias-Henriot, based on morphological similarity with regard to taxonomically relevant characters. However, scrutiny of our own and published descriptions suggests that consistent morphological differences may exist between the Benin population and those from the other geographic origins. In this study, we combined three methods to assess whether these populations belong to one species or a few distinct, yet closely related species. First, multivariate analysis of 32 morphological characters showed that the Benin population differed from the other three populations. Second, DNA sequence analysis based on the mitochondrial cytochrome oxidase subunit I (COI) showed the same difference between these populations. Third, cross-breeding between populations was unsuccessful in all combinations. These data provide evidence for the existence of cryptic species. Subsequent morphological research showed that the Benin population can be distinguished from the others by a new character (not included in the multivariate analysis), viz. the number of teeth on the fixed digit of the female chelicera.
Abstract:
Peptides derived from cytosolic, mitochondrial, and nuclear proteins have been detected in extracts of animal tissues and cell lines. To test whether the proteasome is involved in their formation, HEK293T cells were treated with epoxomicin (0.2 or 2 µM) for 1 h and quantitative peptidomics analysis was performed. Altogether, 147 unique peptides were identified by mass spectrometry sequence analysis. Epoxomicin treatment decreased the levels of the majority of intracellular peptides, consistent with inhibition of the proteasome beta-2 and beta-5 subunits. Treatment with the higher concentration of epoxomicin elevated the levels of some peptides. Most of the elevated peptides resulted from cleavages at acidic residues, suggesting that epoxomicin increased the processing of proteins through the beta-1 subunit. Interestingly, some of the peptides that were elevated by the epoxomicin treatment had hydrophobic residues in P1 cleavage sites. Taken together, these findings suggest that, while the proteasome is the major source of intracellular peptides, other peptide-generating mechanisms exist. Because intracellular peptides are likely to perform intracellular functions, studies using proteasome inhibitors need to be interpreted with caution, as it is possible that the effects of these inhibitors are due to a change in the peptide levels rather than inhibition of protein degradation.
Abstract:
Background: A large number of probabilistic models used in sequence analysis assign non-zero probability values to most input sequences. The most common way to decide whether a given probability is sufficient is Bayesian binary classification, in which the probability under the model characterizing the sequence family of interest is compared with that under an alternative probability model. A null model can serve as the alternative model. This is the scoring technique used by sequence analysis tools such as HMMER, SAM and INFERNAL. The most prevalent null models are position-independent residue distributions that include the uniform distribution, the genomic distribution, the family-specific distribution and the target sequence distribution. This paper presents a study to evaluate the impact of the choice of a null model on the final result of classifications. In particular, we are interested in minimizing the number of false predictions in a classification. This is crucial for reducing the costs of biological validation. Results: For all the tests, the target null model presented the lowest number of false positives when random sequences were used as a test. The study was performed on DNA sequences using GC content as the measure of content bias, but the results should also be valid for protein sequences. To broaden the application of the results, the study was performed using randomly generated sequences. Previous studies were performed on amino acid sequences, using only one probabilistic model (HMM) and a specific benchmark, and lack more general conclusions about the performance of null models. Finally, a benchmark test with P. falciparum confirmed these results. Conclusions: Of the evaluated models, the best suited for classification are the uniform model and the target model. However, the use of the uniform model presents a GC bias that can cause more false positives for candidate sequences with extreme compositional bias, a characteristic not described in previous studies. 
In these cases the target model is more dependable for biological validation due to its higher specificity.
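The scoring scheme in this abstract can be sketched as a log-odds ratio between a family model and a null model. The sketch below is a simplification under stated assumptions: the family model is a hypothetical position-independent GC-rich distribution rather than a profile HMM, the pseudocount is an arbitrary choice, and none of the function names come from HMMER, SAM or INFERNAL.

```python
import math
from collections import Counter


def uniform_null(alphabet: str = "ACGT") -> dict:
    """Uniform null model: every residue is equally likely."""
    return {base: 1.0 / len(alphabet) for base in alphabet}


def target_null(seq: str, alphabet: str = "ACGT", pseudocount: float = 1.0) -> dict:
    """Target null model: residue frequencies of the scored sequence itself.

    The pseudocount (an assumption of this sketch) avoids zero probabilities.
    """
    counts = Counter(seq)
    total = len(seq) + pseudocount * len(alphabet)
    return {base: (counts[base] + pseudocount) / total for base in alphabet}


def log_odds(seq: str, model: dict, null: dict) -> float:
    """Log2 odds of the sequence under the family model versus the null model."""
    return sum(math.log2(model[base] / null[base]) for base in seq)


# Hypothetical GC-rich family model and a GC-only candidate sequence.
gc_rich_model = {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}
candidate = "GCGCGCGCGC"

# Positive score: the uniform null rewards the candidate for being GC-rich.
print(log_odds(candidate, gc_rich_model, uniform_null()))
# Negative score: the target null absorbs the compositional bias.
print(log_odds(candidate, gc_rich_model, target_null(candidate)))
```

With a decision threshold at zero, the uniform null would accept this compositionally biased candidate while the target null would reject it, which is the kind of GC-driven false positive the conclusions describe.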
Abstract:
Background: Sugarcane is an increasingly economically and environmentally important C4 grass, used for the production of sugar and bioethanol, a low-carbon emission fuel. Sugarcane originated from crosses of Saccharum species and is noted for its unique capacity to accumulate high amounts of sucrose in its stems. Environmental stresses greatly limit sugarcane productivity worldwide. To investigate transcriptome changes in response to environmental inputs that alter yield, we used cDNA microarrays to profile the expression of 1,545 genes in plants subjected to drought, phosphate starvation, herbivory and N2-fixing endophytic bacteria. We also investigated the response to phytohormones (abscisic acid and methyl jasmonate). The arrayed elements correspond mostly to genes involved in signal transduction, hormone biosynthesis, transcription factors, novel genes and genes corresponding to unknown proteins. Results: Using an outlier-search method, 179 genes with strikingly different expression levels were identified as differentially expressed in at least one of the treatments analysed. Self-Organizing Maps were used to cluster the expression profiles of 695 genes that showed a highly correlated expression pattern among replicates. The expression data for 22 genes were evaluated for 36 experimental data points by quantitative RT-PCR, indicating a validation rate of 80.5% using three biological experimental replicates. The SUCAST database was created to provide public access to the data described in this work, linked to tissue expression profiling and the SUCAST gene category and sequence analysis. The SUCAST database also includes a categorization of the sugarcane kinome based on a phylogenetic grouping that included 182 undefined kinases. Conclusion: An extensive study on the sugarcane transcriptome was performed. Sugarcane genes responsive to phytohormones and to challenges sugarcane commonly deals with in the field were identified. 
Additionally, the protein kinases were annotated based on a phylogenetic approach. The experimental design and statistical analysis applied proved robust to unravel genes associated with a diverse array of conditions attributing novel functions to previously unknown or undefined genes. The data consolidated in the SUCAST database resource can guide further studies and be useful for the development of improved sugarcane varieties.
Abstract:
A comparative genomic sequence analysis of a region in human chromosome 11p15.3 and its homologous segment in mouse chromosome 7, between the ST5 and LMO1 genes, has been performed. A total of 158,201 bases were sequenced in the mouse and compared with the syntenic region in human, which is partially available in the public databases. The analysed region exhibits the typical eukaryotic genomic structure and, compared with the closely neighbouring regions, strikingly reflects the mosaic distribution pattern of (G+C) and repeat content despite its relatively short size. Within this region the novel gene STK33 (Stk33 in the mouse), which codes for a serine/threonine kinase, was discovered. The finding of this gene is an excellent example of the strength of the comparative sequencing approach. Poor gene predictions in the mouse genomic sequence were corrected and improved by comparison with the publicly available unordered data from the human genomic sequence. Phylogenetic analysis suggests that STK33 belongs to the calcium/calmodulin-dependent protein kinase group and seems to be a novelty in the chordate lineage. The gene as a whole seems to evolve under purifying selection, whereas some regions appear to be under strong positive selection. Both the human and mouse versions of serine/threonine kinase 33 consist of seventeen exons that are highly conserved in the coding regions, particularly in those coding for the core protein kinase domain. The exon/intron structure in the coding regions of the gene is also conserved between human and mouse. The existence and functionality of the gene are supported by the presence of entries in the EST databases and were fully confirmed in vivo by isolating specific transcripts from human uterus total RNA and from several mouse tissues. Strong evidence for alternative splicing was found, which may result in tissue-specific starting points of transcription and, to some extent, different protein N-termini. 
RT-PCR and hybridisation experiments suggest that STK33/Stk33 is differentially expressed in a few tissues and at relatively low levels. STK33 has been shown to be reproducibly down-regulated in tumor tissues, particularly in ovarian tumors. RNA in-situ hybridisation experiments using mouse Stk33-specific probes showed expression in dividing cells from lung and germinal epithelium and possibly also in macrophages from kidney and lungs. Preliminary experimentation with antibodies designed in this work, performed in parallel to the preparation of this manuscript, seems to confirm this expression pattern. The fact that the chromosomal region 11p15, in which STK33 is located, may be associated with several human diseases, including tumor development, suggests that further investigation is necessary to establish the role of STK33 in human health.
Abstract:
Digital signal processing (DSP) techniques for biological sequence analysis continue to grow in popularity due to the inherent digital nature of these sequences. DSP methods have demonstrated early success in detecting coding regions in a gene. More recently, these methods have been used to establish DNA gene similarity. We present the inter-coefficient difference (ICD) transformation, a novel extension of the discrete Fourier transformation, which can be applied to any DNA sequence. The ICD method is a mathematical, alignment-free DNA comparison method that generates a genetic signature for any DNA sequence; the signature is then used to generate relative measures of similarity among DNA sequences. We demonstrate our method on a set of insulin genes obtained from an evolutionarily wide range of species, and on a set of avian influenza viral sequences, which represents a set of highly similar sequences. We compare phylogenetic trees generated using our technique against trees generated using traditional alignment techniques for similarity and demonstrate that the ICD method produces a highly accurate tree without requiring an alignment prior to establishing sequence similarity.
Abstract:
To date, investigations of genetic diversity and the origins of domestication in sheep have utilised autosomal microsatellites and variation in the mitochondrial genome. We present the first analysis of both domestic and wild sheep using genetic markers residing on the ovine Y chromosome. Analysis of a single nucleotide polymorphism (oY1) in the SRY promoter region revealed that allele A-oY1 was present in all wild bighorn sheep (Ovis canadensis), two subspecies of thinhorn sheep (Ovis dalli), European Mouflon (Ovis musimon) and the Barbary sheep (Ammotragus lervia). A-oY1 also had the highest frequency (71.4%) within 458 domestic sheep drawn from 65 breeds sampled from Africa, Asia, Australia, the Caribbean, Europe, the Middle East and Central Asia. Sequence analysis of a second locus, microsatellite SRYM18, revealed a compound repeat array displaying fixed differences, which identified bighorn and thinhorn sheep as distinct from the European Mouflon and domestic animals. Combined genotypic data identified 11 male-specific haplotypes that represented at least two separate lineages. Investigation of the geographical distribution of each haplotype revealed that one (H6) was both very common and widespread in the global sample of domestic breeds. The remaining haplotypes each displayed more restricted and informative distributions. For example, H5 was likely founded following the domestication of European breeds and was used to trace the recent transportation of animals to both the Caribbean and Australia. A high rate of Y chromosomal dispersal appears to have taken place during the development of domestic sheep, as only 12.9% of the total observed variation was partitioned between major geographical regions.
Abstract:
Genotyping platforms such as Affymetrix can be used to assess genotype-phenotype as well as copy number-phenotype associations at millions of markers. While genotyping algorithms are largely concordant when assessed on HapMap samples, tools to assess copy number changes are more variable and often discordant. One explanation for the discordance is that copy number estimates are susceptible to systematic differences between groups of samples that were processed at different times or by different labs. Analysis algorithms that do not adjust for batch effects are prone to spurious measures of association. The R package crlmm implements a multilevel model that adjusts for batch effects and provides allele-specific estimates of copy number. This paper illustrates a workflow for the estimation of allele-specific copy number, develops marker- and study-level summaries of batch effects, and demonstrates how the marker-level estimates can be integrated with complementary Bioconductor software for inferring regions of copy number gain or loss. All analyses are performed in the statistical environment R. A compendium for reproducing the analysis is available from the author’s website (http://www.biostat.jhsph.edu/~rscharpf/crlmmCompendium/index.html).
Abstract:
Echicetin, a heterodimeric protein from the venom of Echis carinatus, binds to platelet glycoprotein Ib (GPIb) and so inhibits platelet aggregation or agglutination induced by various platelet agonists acting via GPIb. The amino acid sequence of the beta subunit of echicetin has been reported and found to belong to the recently identified snake venom subclass of the C-type lectin protein family. Echicetin alpha and beta subunits were purified. N-terminal sequence analysis provided direct evidence that the protein purified was echicetin. The paper presents the complete amino acid sequence of the alpha subunit and computer models of the alpha and beta subunits. The sequence of alpha echicetin is highly similar to the alpha and beta chains of various heterodimeric and homodimeric C-type lectins. Neither of the fully reduced and alkylated alpha or beta subunits of echicetin inhibited the platelet agglutination induced by von Willebrand factor-ristocetin or alpha-thrombin. Earlier reports about the inhibitory activity of reduced and alkylated echicetin beta subunit might have been due to partial reduction of the protein.
Abstract:
Background: Chronic obstructive pulmonary disease (COPD) is a respiratory inflammatory condition with autoimmune features including IgG autoantibodies. In this study we analyze the complexity of the autoantibody response and reveal the nature of the antigens that are recognized by autoantibodies in COPD patients. Methods: An array of 1827 gridded immunogenic peptide clones was established and screened with 17 sera of COPD patients and 60 healthy controls. Protein arrays were evaluated both by visual inspection and by a recently developed computer-aided image analysis technique. With this technique we computed the intensity values for each peptide clone and each serum and calculated, for each clone, the area under the receiver operating characteristic curve (AUC) for the separation of COPD sera from control sera. Results: By visual evaluation we detected 381 peptide clones that reacted with autoantibodies of COPD patients, including 17 clones that reacted with more than 60% of the COPD sera and seven clones that reacted with more than 90% of the COPD sera. The comparison of COPD sera and controls by the automated image analysis system identified 212 peptide clones with informative AUC values. By in silico sequence analysis we found an enrichment of sequence motifs previously associated with immunogenicity. Conclusion: The identification of a rather complex humoral immune response in COPD patients supports the idea of COPD as a disease with strong autoimmune features. The identification of novel immunogenic antigens is a first step towards a better understanding of the autoimmune component of COPD.
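The per-clone AUC mentioned in the methods can be sketched with the rank-based (Mann-Whitney) formulation: the AUC equals the probability that a randomly chosen patient serum gives a higher intensity than a randomly chosen control serum, with ties counted as one half. This is a generic illustration, not the authors' image analysis pipeline, and the intensity values are made up.

```python
def auc(case_values, control_values):
    """Area under the ROC curve for separating cases from controls.

    Computed as the fraction of (case, control) pairs in which the case
    value is higher, with ties counted as half a win.
    """
    if not case_values or not control_values:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical spot intensities for one peptide clone.
copd_intensities = [3.0, 4.0, 5.0]
control_intensities = [1.0, 2.0, 3.0]
print(auc(copd_intensities, control_intensities))  # 8.5 of 9 pairs, about 0.944
```

An AUC near 0.5 means the clone does not separate the two groups, while values near 1 or 0 mark clones whose reactivity differs strongly between patients and controls.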
Abstract:
Dichelobacter nodosus, the etiological agent of ovine footrot, exists both as virulent and as benign strains, which differ in virulence mainly due to subtle differences in the three subtilisin-like proteases AprV2, AprV5 and BprV found in virulent, and AprB2, AprB5 and BprB in benign strains of D. nodosus. Our objective was a molecular genetic epidemiological analysis of the genes of these proteases by direct sequence analysis from clinical material of sheep from herds with and without history of footrot from 4 different European countries. The data reveal the two proteases known as virulent AprV2 and benign AprB2 to correlate fully to the clinical status of the individuals or the footrot history of the herd. In samples taken from affected herds, the aprV2 gene was found as a single allele whereas in samples from unaffected herds several alleles with minor modifications of the aprB2 gene were detected. The different alleles of aprB2 were related to the herds. The aprV5 and aprB5 genes were found in the form of several alleles scattered without distinction between affected and non-affected herds. However, all different alleles of aprV5 and aprB5 encode the same amino acid sequences, indicating the existence of a single protease isoenzyme 5 in both benign and virulent strains. The genes of the basic proteases BprV and BprB also exist as various alleles. However, differences found in samples from affected versus non-affected herds do not reflect the currently known epitopes that are attributed to differences in biochemical activity. The data of the study confirm the prominent role of AprV2 in the virulence of D. nodosus and shed a new light on the presence of the other protease genes and their allelic variants in clinical samples.
Abstract:
Musculoskeletal infections are infections of the bone and surrounding tissues. They are currently diagnosed based on culture analysis, which is the gold standard for pathogen identification. However, these clinical laboratory methods are frequently inadequate for the identification of the causative agents, because a large percentage (25-50%) of confirmed musculoskeletal infections are false negatives in which no pathogen is identified in culture. My data support these results. The goal of this project was to use PCR amplification of a portion of the 16S rRNA gene to test an alternative approach for the identification of these pathogens and to assess the diversity of the bacteria involved. The advantages of this alternative method are that it should increase sample sensitivity and the speed of detection. In addition, bacteria that are non-culturable or in low abundance can be detected using this molecular technique. However, a complication of this approach is that the majority of musculoskeletal infections are polymicrobial, which prohibits direct identification from the infected tissue by DNA sequencing of the initial 16S rDNA amplification products. One way to solve this problem is to use denaturing gradient gel electrophoresis (DGGE) to separate the PCR products before DNA sequencing. DGGE separates DNA molecules based on their melting point, which is determined by their DNA sequence. This analytical technique allows a mixture of PCR products of the same length, which electrophoreses through agarose gels as one band, to be separated into different bands and then used for DNA sequence analysis. In this way, DGGE allows for the identification of individual bacterial species in polymicrobial-infected tissue, which is critical for improving clinical outcomes. By combining the 16S rDNA amplification and DGGE techniques, an alternative approach for identification has been developed. 
The 16S rRNA gene PCR-DGGE method includes several critical steps: DNA extraction from tissue biopsies, amplification of the bacterial DNA, PCR product separation by DGGE, amplification of the gel-extracted DNA, and DNA sequencing and analysis. Each step of the method was optimized to increase its sensitivity and for rapid detection of the bacteria present in human tissue samples. The limit of detection for the DNA extraction from tissue was at least 20 Staphylococcus aureus cells and the limit of detection for PCR was at least 0.05 pg of template DNA. The conditions for DGGE electrophoresis were optimized by using a double gradient of acrylamide (6-10%) and denaturant (30-70%), which increased the separation between distinct PCR products. The use of GelRed (Biotium) improved the DNA visualization in the DGGE gel. To recover the DNA from the DGGE gels, the gel slices were excised and shredded in a bead beater, and the DNA was allowed to diffuse into sterile water overnight. The use of primers containing specific linkers allowed the entire amplified PCR product to be sequenced and then analyzed. The optimized 16S rRNA gene PCR-DGGE method was used to analyze 50 tissue biopsy samples chosen randomly from our collection. The results were compared to those of the Memorial Hermann Hospital Clinical Microbiology Laboratory for the same samples. The molecular method was congruent for 10 of the 17 (59%) culture-negative tissue samples. In 7 of the 17 (41%) culture-negative samples, the molecular method identified a bacterium. The molecular method was congruent with the culture identification for 7 of the 33 (21%) culture-positive tissue samples. However, in 8 of the 33 (24%) the molecular method identified more organisms. In 13 of the 15 (87%) polymicrobial cultured tissue samples, the molecular method identified at least one organism that was also identified by culture techniques. 
Overall, the DGGE analysis of 16S rDNA is an effective method to identify bacteria not identified by culture analysis.