888 results for Dataset
Abstract:
Congenital muscular dystrophy with laminin α2 chain deficiency (MDC1A) is one of the most severe forms of muscular disease and is characterized by severe muscle weakness and delayed motor milestones. The genetic basis of MDC1A is well known, yet the secondary mechanisms ultimately leading to muscle degeneration and subsequent connective tissue infiltration are not fully understood. In order to obtain new insights into the molecular mechanisms underlying MDC1A, we performed a comparative proteomic analysis of affected muscles (diaphragm and gastrocnemius) from laminin α2 chain-deficient dy(3K)/dy(3K) mice, using multidimensional protein identification technology combined with tandem mass tags. Out of the approximately 700 identified proteins, 113 and 101 proteins, respectively, were differentially expressed in the diseased gastrocnemius and diaphragm muscles compared with normal muscles. A large portion of these proteins are involved in different metabolic processes, bind calcium, or are expressed in the extracellular matrix. Our findings suggest that metabolic alterations and calcium dysregulation could be novel mechanisms that underlie MDC1A and might be targets that should be explored for therapy. Also, detailed knowledge of the composition of fibrotic tissue, rich in extracellular matrix proteins, in laminin α2 chain-deficient muscle might help in the design of future anti-fibrotic treatments. All MS data have been deposited in the ProteomeXchange with identifier PXD000978 (http://proteomecentral.proteomexchange.org/dataset/PXD000978).
Abstract:
Hevea brasiliensis (Willd. Ex Adr. Juss.) Muell.-Arg., which is native to the Amazon rainforest, is the primary source of natural rubber. The singular properties of natural rubber make it superior to and competitive with synthetic rubber for use in several applications. Here, we performed RNA sequencing (RNA-seq) of H. brasiliensis bark on the Illumina GAIIx platform, which generated 179,326,804 raw reads. A total of 50,384 contigs that were over 400 bp in size were obtained and subjected to further analyses. A similarity search against the non-redundant (nr) protein database returned 32,018 (63%) positive BLASTx hits. The transcriptome was annotated using the clusters of orthologous groups (COG), gene ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Pfam databases. A search for putative molecular markers was performed to identify simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs). In total, 17,927 SSRs and 404,114 SNPs were detected. Finally, we selected sequences that were identified as belonging to the mevalonate (MVA) and 2-C-methyl-D-erythritol 4-phosphate (MEP) pathways, which are involved in rubber biosynthesis, to validate the SNP markers. A total of 78 SNPs were validated in 36 genotypes of H. brasiliensis. This new dataset represents a powerful information source for rubber tree bark genes and will be an important tool for the development of microsatellite and SNP markers for use in future genetic analyses such as genetic linkage mapping, quantitative trait loci identification, investigations of linkage disequilibrium and marker-assisted selection.
Abstract:
Often in biomedical research, we deal with continuous (clustered) proportion responses ranging between zero and one quantifying the disease status of the cluster units. Interestingly, the study population might also consist of relatively disease-free as well as highly diseased subjects, contributing to proportion values in the interval [0, 1]. Regression on a variety of parametric densities with support lying in (0, 1), such as beta regression, can assess important covariate effects. However, they are deemed inappropriate due to the presence of zeros and/or ones. To evade this, we introduce a class of general proportion density, and further augment the probabilities of zero and one to this general proportion density, controlling for the clustering. Our approach is Bayesian and presents a computationally convenient framework amenable to available freeware. Bayesian case-deletion influence diagnostics based on q-divergence measures are automatic from the Markov chain Monte Carlo output. The methodology is illustrated using both simulation studies and application to a real dataset from a clinical periodontology study.
Abstract:
With the huge number of printed documents in circulation today, identifying their source is useful for criminal investigations and for authenticating digital copies of a document. In this paper, we propose novel techniques for laser printer attribution. Our solutions do not need very high resolution scanning of the investigated document and explore the multidirectional, multiscale and low-level gradient texture patterns yielded by printing devices. The main contributions of this work are: (1) the description of printed areas using multidirectional and multiscale co-occurring texture patterns; (2) the description of texture in low-level gradient areas by a convolution texture gradient filter that emphasizes textures in specific transition areas; and (3) the analysis of printer patterns in segments of interest, which we call frames, instead of whole documents or only printed letters. We show by experiments on a well-documented dataset that the proposed methods outperform techniques described in the literature and achieve near-perfect classification accuracy, which makes them very promising for deployment in real-world forensic investigations.
Abstract:
A fosmid metagenomic library was constructed with total community DNA obtained from a municipal wastewater treatment plant (MWWTP), with the aim of identifying new FeFe-hydrogenase genes encoding the enzymes most important for hydrogen metabolism. The dataset generated by pyrosequencing of the fosmid library was mined to identify environmental gene tags (EGTs) assigned to FeFe-hydrogenase. The majority of EGTs representing FeFe-hydrogenase genes were affiliated with the class Clostridia, suggesting that this group is the main hydrogen producer in the MWWTP analyzed. From the assembled sequences, three FeFe-hydrogenase genes were predicted by detecting the L2 motif (MPCxxKxxE) in the encoded gene product, which confirmed them as true FeFe-hydrogenase sequences. These sequences were used to design specific primers to detect fosmids encoding the FeFe-hydrogenase genes predicted from the dataset. Three identified fosmids were completely sequenced. The cloned genomic fragments within these fosmids are closely related to members of the Spirochaetaceae, Bacteroidales and Firmicutes, and their FeFe-hydrogenase sequences are characterized by the structure type M3, which is common to clostridial enzymes. The FeFe-hydrogenase sequences found in this study represent hitherto undetected sequences, indicating the high genetic diversity of these enzymes in MWWTPs. The results suggest that MWWTPs should be considered reservoirs of new FeFe-hydrogenase genes.
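The motif-based screening step in the abstract above can be illustrated with a minimal sketch. It assumes that "x" in MPCxxKxxE stands for any single residue and that the input is an already translated protein sequence; the function name `find_l2_motif` and the example sequence are hypothetical, and the sketch is not the authors' pipeline.

```python
import re

# L2 motif as written in the abstract: MPCxxKxxE, with x read as "any residue" (assumption).
L2_MOTIF = re.compile(r"MPC..K..E")


def find_l2_motif(protein: str) -> list[tuple[int, str]]:
    """Return (0-based start position, matched text) for each L2 motif hit."""
    return [(m.start(), m.group()) for m in L2_MOTIF.finditer(protein.upper())]


# Hypothetical sequence, for illustration only.
example = "AAMPCAAKGGEAA"
print(find_l2_motif(example))
```

A plain regular expression is enough here because the motif has a fixed length; a profile-based search would be the more usual choice for divergent sequences.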
Abstract:
OBJECTIVES: to evaluate the expression of erbB-2 and of the estrogen and progesterone hormone receptors (ER/PR) in the transition regions between the in situ and invasive fractions of ductal breast neoplasms (DCIS and IDC, respectively). METHODS: eighty-five cases of breast neoplasms containing contiguous regions of DCIS and IDC were selected. Histological specimens from the DCIS and IDC areas were obtained using the tissue microarray (TMA) technique. Expression of erbB-2 and ER/PR was evaluated by conventional immunohistochemistry. Expression of erbB-2 and ER/PR in the in situ and invasive fractions was compared using McNemar's test. The significance level was set at 5% (p = 0.05). Intraclass correlation coefficients (ICC) were calculated to assess agreement in the cross-tabulation of erbB-2 and ER/PR expression in the DCIS and IDC fractions. RESULTS: erbB-2 expression did not differ between the DCIS and IDC areas (p = 0.38). Comparing the DCIS and IDC areas case by case, there was good agreement in the expression of erbB-2 (ICC = 0.64), PR (ICC = 0.71) and ER (ICC = 0.64). Considering only tumors whose in situ component had areas of necrosis (comedo), the ICC for erbB-2 was 0.4, compared with 0.6 in the complete set of cases. For ER/PR, the ICCs did not differ substantially from those obtained with the complete set of specimens: for ER, ICC = 0.7 (versus 0.7 in the complete set), and for PR, ICC = 0.7 (versus 0.6 in the complete set). CONCLUSIONS: our findings suggest that erbB-2 and ER/PR expression does not differ between the contiguous in situ and invasive components of ductal breast tumors.
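The paired comparison described in the abstract above relies on McNemar's test, which uses only the discordant pairs (cases positive in one tumor fraction and negative in the other). The following is a minimal sketch of the exact (binomial) form of the test; the abstract does not say which variant was used, and the counts in the example are invented for illustration rather than taken from the study.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b: pairs positive in the first fraction only
    c: pairs positive in the second fraction only
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts, for illustration only.
print(mcnemar_exact(6, 9))
```

Concordant pairs do not enter the statistic, which is why the function takes only the two discordant counts.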
Abstract:
Below-cloud scavenging processes have been investigated using a numerical simulation, local atmospheric conditions and particulate matter (PM) concentrations at different sites in Germany. The below-cloud scavenging model was coupled with the bulk particulate matter counter TSI (Trust Portacounter) dataset to predict the variability of particulate air concentrations during selected rain events. The TSI samples and meteorological parameters were obtained during three winter campaigns: Deuselbach, March 1994, comprising three different events; Sylt, April 1994; and Freiburg, March 1995. The results show good agreement between modeled and observed air concentrations, emphasizing the quality of the conceptual model used in the below-cloud scavenging numerical modeling. Modeled and observed data also showed high, statistically significant squared Pearson correlation coefficients (above 0.7), except for the Freiburg campaign event. The differences between the numerical simulations and the observed dataset are explained by changes in wind direction and, perhaps, by the absence of mass advection terms in the model. These results validate previous works based on the same conceptual model.
Abstract:
Non-coding RNAs (ncRNAs) have recently received much more attention owing to technical advances in sequencing, which have expanded the characterization of transcriptomes in different organisms. ncRNAs have different lengths (22 nt to >1,000 nt) and mechanisms of action that essentially comprise a sophisticated gene expression regulation network. Recent publication of schistosome genomes and transcriptomes has increased the description and characterization of a large number of parasite genes. Here we review the number of predicted genes and the coverage of genomic bases in light of the publicly available EST dataset, including a critical appraisal of the evidence for and characterization of ncRNAs in schistosomes. We show expression data for ncRNAs in Schistosoma mansoni. We analyze three different microarray experiment datasets: (1) large-scale expression measurements in adult worms; (2) differentially expressed S. mansoni genes regulated by a human cytokine (TNF-α) in a parasite culture; and (3) stage-specific expression of ncRNAs. All these data point to ncRNAs being involved in different biological processes and physiological responses, which suggests that these new players are functional in the parasite's biology. Exploring this world is a challenge for scientists working from a new molecular perspective on host-parasite interactions and parasite development.
Abstract:
OBJECTIVE: To estimate the prevalence of congenital defects (CD) in a cohort of live births (LB) by linking the databases of the Mortality Information System (SIM) and the Live Birth Information System (SINASC). METHODS: Descriptive study to evaluate live birth certificates as a source of information on CD. The study population is a cohort of hospital live births to resident mothers that occurred in the Municipality of São Paulo in the first half of 2006 (01/01/2006 to 30/06/2006), obtained by linking the databases of live birth certificates and of neonatal deaths arising from the cohort. RESULTS: The most prevalent CD according to SINASC were congenital malformations (CM) and deformities of the musculoskeletal system (44.7%), CM of the nervous system (10.0%) and chromosomal anomalies (8.6%). After linkage, 80.0% of individuals with CD of the circulatory system, 73.3% with CD of the respiratory system and 62.5% with CD of the digestive system were recovered. SINASC accounted for 55.2% of CD notifications and SIM for 44.8%, showing that SIM is important for recovering information on CD. According to SINASC, the prevalence rate of CD in the cohort was 75.4 per 10,000 LB; with the data linked to SIM, this rate rose to 86.2 per 10,000 LB. CONCLUSIONS: The additional data obtained by linking SIM and SINASC provide a more realistic profile of CD prevalence than the one recorded by SINASC alone: SINASC identifies the most visible CD, whereas SIM identifies the most lethal ones, which shows the importance of using the two data sources together.
Abstract:
Background: Polypodium hydriforme is a parasite with an unusual life cycle and peculiar morphology, both of which have made its systematic position uncertain. Polypodium has traditionally been considered a cnidarian because it possesses nematocysts, the stinging structures characteristic of this phylum. However, recent molecular phylogenetic studies using 18S rDNA sequence data have challenged this interpretation, and have shown that Polypodium is a close relative of myxozoans and that together they share a closer affinity to bilaterians than to cnidarians. Due to the variable rates of 18S rDNA sequences, these results have been suggested to be an artifact of long-branch attraction (LBA). A recent study using multiple protein-coding markers shows that the myxozoan Buddenbrockia is nested within cnidarians. Polypodium was not included in this study. To further investigate the phylogenetic placement of Polypodium, we have performed phylogenetic analyses of metazoans with 18S and partial 28S rDNA sequences in a large dataset that includes Polypodium and a comprehensive sampling of cnidarian taxa. Results: Analyses of a combined dataset of 18S and partial 28S sequences, and of partial 28S alone, support the placement of Polypodium within Cnidaria. Removal of the long-branched myxozoans from the 18S dataset also results in Polypodium being nested within Cnidaria. These results suggest that previous reports showing that Polypodium and Myxozoa form a sister group to Bilateria were an artifact of long-branch attraction. Conclusion: By including 28S rDNA sequences and a comprehensive sampling of cnidarian taxa, we demonstrate that previously conflicting hypotheses concerning the phylogenetic placement of Polypodium can be reconciled. Specifically, the data presented provide evidence that Polypodium is indeed a cnidarian and is either the sister taxon to Hydrozoa, or part of the hydrozoan clade, Leptothecata. The former hypothesis is consistent with the traditional view that Polypodium should be placed in its own cnidarian class, Polypodiozoa.
Abstract:
Gene clustering is a useful exploratory technique to group together genes with similar expression levels under distinct cell cycle phases or distinct conditions. It helps the biologist to identify potentially meaningful relationships between genes. In this study, we propose a clustering method based on multivariate normal mixture models, where the number of clusters is predicted via sequential hypothesis tests: at each step, the method considers a mixture model of m components (m = 2 in the first step) and tests if in fact it should be m - 1. If the hypothesis is rejected, m is increased and a new test is carried out. The method continues (increasing m) until the hypothesis is accepted. The theoretical core of the method is the full Bayesian significance test, an intuitive Bayesian approach, which needs no model complexity penalization nor positive probabilities for sharp hypotheses. Numerical experiments were based on a cDNA microarray dataset consisting of expression levels of 205 genes belonging to four functional categories, for 10 distinct strains of Saccharomyces cerevisiae. To analyze the method's sensitivity to data dimension, we performed principal components analysis on the original dataset and predicted the number of classes using 2 to 10 principal components. Compared to Mclust (model-based clustering), our method shows more consistent results.
Abstract:
Macro- and microarrays are well-established technologies to determine gene functions through repeated measurements of transcript abundance. We constructed a chicken skeletal muscle-associated array based on a muscle-specific EST database, which was used to generate a tissue expression dataset of ~4500 chicken genes across 5 adult tissues (skeletal muscle, heart, liver, brain, and skin). Only a small number of ESTs were sufficiently well characterized by BLAST searches to determine their probable cellular functions. Evidence of a particular tissue-characteristic expression can be considered an indication that the transcript is likely to be functionally significant. The skeletal muscle macroarray platform was first used to search for evidence of tissue-specific expression, focusing on the biological function of genes/transcripts, since gene expression profiles generated across tissues were found to be reliable and consistent. Hierarchical clustering analysis revealed consistent clustering among genes assigned to 'developmental growth', such as the ontology genes and germ layers. The accuracy of the expression data was supported by comparing information from known transcripts, and the tissue from which each transcript was derived, with the macroarray data. Hybridization assays resulted in consistent tissue expression profiles, which will be useful to dissect tissue-regulatory networks and to predict functions of novel genes identified after extensive sequencing of the genomes of model organisms. Screening our skeletal-muscle platform using 5 adult chicken tissues allowed us to identify 43 'tissue-specific' transcripts, and 112 co-expressed uncharacterized transcripts with 62 putative motifs. This platform also represents an important tool for the functional investigation of novel genes, for determining expression patterns according to developmental stage, for evaluating differences in muscular growth potential between chicken lines, and for identifying tissue-specific genes.
Abstract:
Background: Prostate cancer cells in primary tumors have been typed CD10(-)/CD13(-)/CD24(hi)/CD26(+)/CD38(lo)/CD44(-)/CD104(-). This CD phenotype suggests a lineage relationship between cancer cells and luminal cells. The Gleason grade of a tumor is a descriptor of its glandular differentiation. Higher Gleason scores are associated with treatment failure. Methods: CD26(+) cancer cells were isolated from Gleason 3+3 (G3) and Gleason 4+4 (G4) tumors by cell sorting, and their gene expression or transcriptome was determined by Affymetrix DNA array analysis. Dataset analysis was used to determine gene expression similarities and differences between G3 and G4, as well as relative to prostate cancer cell lines and histologically normal prostate luminal cells. Results: The G3 and G4 transcriptomes were compared to those of non-cancerous prostatic cell types, which included luminal, basal, stromal fibromuscular, and endothelial cells. A principal components analysis of the various transcriptome datasets indicated a closer relationship between luminal and G3 than between luminal and G4. Dataset comparison also showed that the cancer transcriptomes differed substantially from those of prostate cancer cell lines. Conclusions: Genes differentially expressed in cancer are potential biomarkers for cancer detection, and those differentially expressed between G3 and G4 are potential biomarkers for disease stratification given that G4 cancer is associated with poor outcomes. Differentially expressed genes likely contribute to the prostate cancer phenotype and constitute the signatures of these particular cancer cell types.
Abstract:
Background: The prostate stroma is a key mediator of epithelial differentiation and development, and potentially plays a role in the initiation and progression of prostate cancer. The tumor-associated stroma is marked by increased expression of CD90/THY1. Isolation and characterization of these stromal cells could provide valuable insight into the biology of the tumor microenvironment. Methods: Prostate CD90(+) stromal fibromuscular cells from tumor specimens were isolated by cell sorting and analyzed by DNA microarray. Dataset analysis was used to compare gene expression between histologically normal and tumor-associated stromal cells. For comparison, stromal cells were also isolated and analyzed from the urinary bladder. Results: The tumor-associated stromal cells were found to have decreased expression of genes involved in smooth muscle differentiation, and those detected in prostate but not bladder. Other differential expression between the stromal cell types included that of the CXC-chemokine genes. Conclusion: CD90(+) prostate tumor-associated stromal cells differed from their normal counterparts in the expression of multiple genes, some of which are potentially involved in organ development.
Abstract:
The dengue virus has a single-stranded positive-sense RNA genome of ~10,700 nucleotides with a single open reading frame that encodes three structural (C, prM, and E) and seven nonstructural (NS1, NS2A, NS2B, NS3, NS4A, NS4B, and NS5) proteins. It possesses four antigenically distinct serotypes (DENV 1-4). Many phylogenetic studies address particularities of the different serotypes using convenience samples that are not conducive to a spatio-temporal analysis in a single urban setting. We describe the pattern of spread of distinct lineages of DENV-3 circulating in Sao Jose do Rio Preto, Brazil, during 2006. Blood samples from patients presenting dengue-like symptoms were collected for DENV testing. We performed M-N-PCR using primers based on NS5 for virus detection and identification. The fragments were purified from PCR mixtures and sequenced. The positive dengue cases were geo-coded. To type the sequenced samples, 52 reference sequences were aligned. The dataset generated was used for iterative phylogenetic reconstruction with the maximum likelihood criterion. The best demographic model, the rate of growth, the rate of evolutionary change, and the Time to Most Recent Common Ancestor (TMRCA) were estimated. The basic reproductive rate during the epidemic was estimated. We obtained sequences from 82 patients among 174 blood samples. We were able to geo-code 46 sequences. The alignment generated a 399-nucleotide-long dataset with 134 taxa. The phylogenetic analysis indicated that all samples were of DENV-3 and related to strains circulating on the island of Martinique in 2000-2001. Sixty DENV-3 from Sao Jose do Rio Preto formed a monophyletic group (lineage 1), closely related to the remaining 22 isolates (lineage 2). We assumed that these lineages appeared before 2006 on different occasions. By transforming the inferred exponential growth rates into the basic reproductive rate, we obtained values for lineage 1 of R(0) = 1.53 and values for lineage 2 of R(0) = 1.13. Under the exponential model, the TMRCA of lineage 1 dated to 1 year and that of lineage 2 to 3.4 years before the last sampling. The possibility of inferring spatio-temporal dynamics from genetic data has generally been little explored, and it may shed light on DENV circulation. The use of both geographically and temporally structured phylogenetic data provided a detailed view of the spread of at least two dengue viral strains in a populated urban area.
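The abstract above mentions transforming an exponential growth rate into a basic reproductive rate but does not state the formula used. As a rough sketch only, one common approximation for a simple epidemic model is R0 = 1 + r·D, where r is the exponential growth rate and D is the mean infectious period in the same time units. The relation, the function name `r0_from_growth` and the input values below are assumptions for illustration; the study may have used a different, vector-specific expression.

```python
def r0_from_growth(r: float, infectious_period: float) -> float:
    """Approximate R0 from an exponential growth rate.

    Assumes the simple relation R0 = 1 + r * D (not necessarily the
    formula used in the study), with r and D in matching time units.
    """
    if infectious_period <= 0:
        raise ValueError("infectious_period must be positive")
    return 1.0 + r * infectious_period


# Hypothetical inputs: growth rate per year and infectious period in years.
print(r0_from_growth(r=20.0, infectious_period=7 / 365))
```

Under this relation, a positive growth rate always gives R0 above 1 and a negative one gives R0 below 1, which matches the usual interpretation of R0 as the threshold for epidemic growth.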