9 results for Text similarity analysis
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Abstract:
Multidimensional visualization techniques are invaluable tools for analysis of structured and unstructured data with variable dimensionality. This paper introduces PEx-Image (Projection Explorer for Images), a tool aimed at supporting analysis of image collections. The tool supports a methodology that employs interactive visualizations to aid user-driven feature detection and classification tasks, thus offering improved analysis and exploration capabilities. The visual mappings employ similarity-based multidimensional projections and point placement to lay out the data on a plane for visual exploration. In addition to its application to image databases, we also illustrate how the proposed approach can be successfully employed in simultaneous analysis of different data types, such as text and images, offering a common visual representation for data expressed in different modalities.
Abstract:
Background: Microarray techniques have become an important tool for the investigation of genetic relationships and the assignment of different phenotypes. Since microarrays are still very expensive, most experiments are performed with small samples. This paper introduces a method to quantify dependency between data series composed of few sample points. The method is used to construct gene co-expression subnetworks of highly significant edges. Results: The results shown here are for an adapted subset of a Saccharomyces cerevisiae gene expression data set with low temporal resolution and poor statistics. The method reveals common transcription factors with a high confidence level and allows the construction of subnetworks with high biological relevance that reveal characteristic features of the processes driving the organism's adaptations to specific environmental conditions. Conclusion: Our method allows a reliable and sophisticated analysis of microarray data even under severe constraints. The utilization of systems biology improves the biologist's ability to elucidate the mechanisms underlying cellular processes and to formulate new hypotheses.
Abstract:
In this study, 222 genome survey sequences were generated for Trypanosoma rangeli strain P07 isolated from an opossum (Didelphis albiventris) in Minas Gerais State, Brazil. T. rangeli sequences were compared by BLASTX (Basic Local Alignment Search Tool X) analysis with the assembled contigs of Leishmania braziliensis, Leishmania infantum, Leishmania major, Trypanosoma brucei, and Trypanosoma cruzi. Results revealed that 82% (182/222) of the sequences were associated with predicted proteins described, whereas 18% (40/222) of the sequences did not show significant identity with sequences deposited in databases, suggesting that they may represent T. rangeli-specific sequences. Among the 182 predicted sequences, 179 (80.6%) had the highest similarity with T. cruzi, 2 (0.9%) with T. brucei, and 1 (0.5%) with L. braziliensis. Computer analysis permitted the identification of members of various gene families described for trypanosomatids in the genome of T. rangeli, such as trans-sialidases, mucin-associated surface proteins, and major surface proteases (MSP or gp63). This is the first report identifying sequences of the MSP family in T. rangeli. Multiple sequence alignments showed that the predicted MSP of T. rangeli presented the typical characteristics of metalloproteases, such as the presence of the HEXXH motif, which corresponds to a region previously associated with the catalytic site of the enzyme, and various cysteine and proline residues, which are conserved among MSPs of different trypanosomatid species. Reverse transcriptase-polymerase chain reaction analysis revealed the presence of MSP transcripts in epimastigote forms of T. rangeli.
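As a rough illustration of the motif check mentioned in this abstract, the HEXXH pattern (His, Glu, any two residues, His) can be located in a protein sequence with a regular expression. This is only a sketch: the function name `find_hexxh` and the toy sequence are hypothetical and are not taken from the study, which does not state which software was used.

```python
import re

# HEXXH: histidine, glutamate, any two residues, histidine.
# The lookahead lets overlapping occurrences be reported as well.
HEXXH = re.compile(r"(?=(HE..H))")

def find_hexxh(protein: str) -> list[tuple[int, str]]:
    """Return (0-based position, matched motif) for every HEXXH occurrence."""
    return [(m.start(), m.group(1)) for m in HEXXH.finditer(protein.upper())]

# Toy sequence, invented for illustration only (not a real MSP sequence).
toy = "MKTAYHEAGHALGLSV"
print(find_hexxh(toy))
```

A real analysis would run such a scan over the aligned MSP sequences and then inspect the conserved cysteine and proline positions separately.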
Abstract:
Phosphoric acid is generally obtained from an aqueous process starting with the reaction between phosphate rock and sulphuric acid. Due to their chemical similarity, uranium is usually associated with phosphate rock and, during chemical processing, is partitioned to phosphoric acid. Uranium determination in this matrix is a very important task because its ingestion could lead to a radiological impact on the population. Therefore, a procedure was developed using an initial precipitation with calcium hydroxide and evaporation, followed by instrumental neutron activation analysis (INAA). The procedure was applied to analyse fourteen uranium-enriched phosphoric acid samples.
Abstract:
Crotalus durissus rattlesnakes are responsible for the most lethal cases of snakebites in Brazil. The subspecies Crotalus durissus collilineatus is associated with a great number of accidents in the Southeast and Central West regions, but few studies on its venom composition have been carried out to date. In an attempt to describe the transcriptional profile of the C. durissus collilineatus venom gland, we generated a cDNA library and identified the sequences obtained by similarity searches on existing databases. Out of 673 expressed sequence tags (ESTs), 489 produced readable sequences comprising 201 singletons and 47 clusters of two or more ESTs. One hundred and fifty reads (60.5%) produced significant hits to known sequences. The results showed a predominance of toxin-coding ESTs rather than transcripts coding for proteins involved in all cellular functions. The most frequent toxin was crotoxin, comprising 88% of toxin-coding sequences. Crotoxin B, a basic phospholipase A(2) (PLA(2)) subunit of crotoxin, was represented in more variable forms compared to the non-enzymatic subunit (crotoxin A), and most sequences coding this molecule were identified as the CB1 isoform from Crotalus durissus terrificus venom. Four percent of toxin-related sequences in this study were identified as growth factors, comprising five sequences for vascular endothelial growth factor (VEGF) and one for nerve growth factor (NGF) that showed 100% identity with C. durissus terrificus NGF. We also identified two clusters for metalloprotease from the PII class comprising 3% of the toxins, and two for serine proteases, including gyroxin (2.5%). The remaining 2.5% of toxin-coding ESTs represent singletons identified as homologue sequences to cardiotoxin, convulxin, angiotensin-converting enzyme inhibitor and C-type natriuretic peptide, Ohanin, crotamin and PLA(2) inhibitor. These results allowed the identification of the most common classes of toxins in C. durissus collilineatus snake venom, also showing some unknown classes for this subspecies and even for C. durissus species, such as cardiotoxins and VEGF. (C) 2009 Published by Elsevier Masson SAS.
Abstract:
Two members of the low density lipoprotein receptor (LDLR) family were identified as putative orthologs for a vitellogenin receptor (Amvgr) and a lipophorin receptor (Amlpr) in the Apis mellifera genome. Both receptor sequences have the structural motifs characteristic of LDLR family members and show a high degree of similarity with sequences of other insects. RT-PCR analysis of Amvgr and Amlpr expression detected the presence of both transcripts in different tissues of adult female (ovary, fat body, midgut, head and specifically hypopharyngeal gland), as well as in embryos. In the head RNA samples we found two variant forms of AmLpR: a full length one and a shorter one lacking 29 amino acids in the O-linked sugar domain. In ovaries the expression levels of the two honey bee LDLR members showed opposing trends: whereas Amvgr expression was upregulated as the ovaries became activated, Amlpr transcript levels gradually declined. In situ hybridization analysis performed on ovaries detected Amvgr mRNA exclusively in germ line cells and corroborated the qPCR results showing an increase in Amvgr gene expression concomitant with follicle growth. (C) 2008 Elsevier Ltd. All rights reserved.
Abstract:
The current taxonomy of two poorly known hermit crab species, Pagurus forceps H. Milne Edwards, 1836 and Pagurus comptus White, 1847, from temperate Pacific and Atlantic coastlines of South America is based only on adult morphology. Past studies have questioned the separation of these two very similar species, which occur sympatrically. We included specimens morphologically assignable to P. forceps and P. comptus in a phylogenetic analysis, along with other selected anomuran decapods, based on 16S ribosomal gene sequences. Differences between samples putatively assigned to either P. forceps or P. comptus were moderate, with sequence similarity ranging from 98.2 to 99.4% for the fragments analyzed. Our comparison of mitochondrial DNA sequences (16S rRNA) revealed diagnostic differences between the two putative species, suggesting that P. forceps and P. comptus are indeed phylogenetically close but different species, with no genetic justification to support their synonymization. The polyphyly of Pagurus is not corroborated here among the represented Atlantic species, despite obviously complex relationships among the members of the genus.
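A similarity figure such as the 98.2 to 99.4% range reported here is typically a percent identity over aligned positions. The sketch below shows one common way to compute it; the abstract does not state how the authors calculated similarity, so the function name `percent_identity` and the choice to skip gapped columns are assumptions made for illustration.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned columns with identical residues.

    Assumption: both inputs are already aligned (equal length, '-' for gaps),
    and columns with a gap in either sequence are excluded from the count.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)

# Toy aligned fragments, invented for illustration (not real 16S data).
print(percent_identity("ACGTACGT", "ACGTACGA"))
```

Other conventions (for example, counting gaps as mismatches) give slightly different values, so the convention should be stated alongside any reported percentage.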
Abstract:
Presenilins (PS) are integral membrane proteins involved, among other functions, in regulated intramembrane proteolysis. In this study, we report the identification and characterization of a complementary DNA from Schistosoma mansoni exhibiting significant homology to human and nonvertebrate presenilins. The S. mansoni cDNA contained a 1,485 bp open reading frame encoding a predicted protein of 494 amino acids. Alignment of the predicted amino acid sequence of S. mansoni PS (SmPS) with PS from other species revealed up to 40% similarity shared among the investigated organisms. In addition, phylogenetic analyses demonstrated that SmPS is closely related to its orthologues found in Schistosoma japonicum and Caenorhabditis elegans. Expression analysis of SmPS using quantitative real-time PCR revealed that the transcript is up-regulated in the egg stage. We hypothesize that the high level of SmPS in the S. mansoni embryo correlates with an important role during cellular signaling associated with larval development. To our knowledge, this study represents the first attempt to investigate the existence and abundance of PS from a helminth parasite.
Abstract:
To identify novel genes involved in the molecular pathogenesis of chronic lymphocytic leukemia (CLL), we performed a serial analysis of gene expression (SAGE) in CLL cells and compared this with healthy B cells (nCD19(+)). We found a high level of similarity among CLL subtypes, but a comparison of CLL versus nCD19(+) libraries revealed 55 genes that were over-represented and 49 genes that were down-regulated in CLL. A gene ontology analysis revealed that TOSO, which plays a functional role upstream of the Fas extrinsic apoptosis pathway, was over-expressed in CLL cells. This finding was confirmed by real-time reverse transcription-polymerase chain reaction in 78 CLL and 12 nCD19(+) cases (P < .001). We validated expression using flow cytometry and tissue microarray and demonstrated a 5.6-fold increase of TOSO protein in circulating CLL cells (P = .013) and lymph nodes (P = .006). Our SAGE results have demonstrated that TOSO is a novel overexpressed antiapoptotic gene in CLL.