22 results for similarity search

in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)


Relevance:

70.00%

Abstract:

Cytochrome P450 (CYP450) is a class of enzymes for which substrate identification is particularly important. It would help medicinal chemists to design drugs with fewer side effects due to drug-drug interactions and to extensive genetic polymorphism. Herein, we discuss the application of 2D and 3D similarity searches in identifying reference structures with higher capacity to retrieve substrates of three important CYP enzymes (CYP2C9, CYP2D6, and CYP3A4). On the basis of the complementarities of multiple reference structures selected by different similarity search methods, we proposed the fusion of their individual Tanimoto scores into a consensus Tanimoto score (T(consensus)). Using this new score, true positive rates of 63% (CYP2C9) and 81% (CYP2D6) were achieved with false positive rates of 4% for the CYP2C9-CYP2D6 data set. Extended similarity searches were carried out on a validation data set, and the results showed that by using the T(consensus) score, not only did the area under the ROC curve increase, but more substrates were also recovered at the beginning of a ranked list.
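The fusion of per-reference Tanimoto scores described in this abstract can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so the MAX rule used here is an assumption, as are the set-of-bit-indices fingerprint representation and the names `tanimoto`, `consensus_score`, `references` and `library`.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def consensus_score(candidate: set, references: list) -> float:
    """Fuse the per-reference scores into one value.

    ASSUMPTION: the MAX rule is used for illustration only; the abstract
    does not say how the individual scores are combined.
    """
    return max(tanimoto(candidate, ref) for ref in references)


# Hypothetical toy fingerprints (not real molecules).
references = [{1, 2, 3, 4}, {10, 11, 12}]
library = {
    "mol_a": {1, 2, 3, 9},
    "mol_b": {10, 11, 12, 13},
    "mol_c": {20, 21},
}

# Rank the library so that the most reference-like candidates come first.
ranked = sorted(
    library,
    key=lambda name: consensus_score(library[name], references),
    reverse=True,
)
print(ranked)
```

With several complementary references, a candidate only needs to resemble one of them to rank highly, which is one plausible reading of why a fused score can recover more substrates early in the list.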

Relevance:

60.00%

Abstract:

Motivation: DNA assembly programs classically perform an all-against-all comparison of reads to identify overlaps, followed by a multiple sequence alignment and generation of a consensus sequence. If the aim is to assemble a particular segment, instead of a whole genome or transcriptome, a target-specific assembly is a more sensible approach. GenSeed is a Perl program that implements a seed-driven recursive assembly consisting of cycles comprising a similarity search, read selection and assembly. The iterative process results in a progressive extension of the original seed sequence. GenSeed was tested and validated on many applications, including the reconstruction of nuclear genes or segments, full-length transcripts, and extrachromosomal genomes. The robustness of the method was confirmed through the use of a variety of DNA and protein seeds, including short sequences derived from SAGE and proteome projects.
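The cycle of similarity search, read selection and assembly can be illustrated with a toy sketch. It is written in Python rather than Perl and is not GenSeed's implementation: exact suffix-prefix matching stands in for the similarity search, extension is to the right only, and the names `extend_right`, `seed_assembly` and `min_overlap` are hypothetical.

```python
def extend_right(contig: str, reads: list, min_overlap: int = 4) -> str:
    """Run one cycle: find reads overlapping the contig end and keep the longest extension."""
    best = contig
    for read in reads:
        # Try the longest overlap first: suffix of contig == prefix of read.
        for k in range(min(len(contig), len(read)), min_overlap - 1, -1):
            if contig.endswith(read[:k]):
                candidate = contig + read[k:]
                if len(candidate) > len(best):
                    best = candidate
                break
    return best


def seed_assembly(seed: str, reads: list, min_overlap: int = 4, max_cycles: int = 100) -> str:
    """Repeat the cycle until the contig stops growing (progressive seed extension)."""
    contig = seed
    for _ in range(max_cycles):
        extended = extend_right(contig, reads, min_overlap)
        if extended == contig:
            break
        contig = extended
    return contig


# Hypothetical reads; each cycle uses the previous contig as the new seed.
reads = ["ACGTTGCA", "TGCAGGCT", "GGCTAAAT"]
contig = seed_assembly("ACGTTG", reads)
print(contig)
```

Because only reads that match the current contig are ever considered, the work scales with the target segment rather than with an all-against-all comparison of the whole read set.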

Relevance:

60.00%

Abstract:

Searching in a dataset for elements that are similar to a given query element is a core problem in applications that manage complex data, and has been aided by metric access methods (MAMs). A growing number of applications require indices that must be built faster and repeatedly, also providing faster response for similarity queries. The increase in the main memory capacity and its lowering costs also motivate using memory-based MAMs. In this paper, we propose the Onion-tree, a new and robust dynamic memory-based MAM that slices the metric space into disjoint subspaces to provide quick indexing of complex data. It introduces three major characteristics: (i) a partitioning method that controls the number of disjoint subspaces generated at each node; (ii) a replacement technique that can change the leaf node pivots in insertion operations; and (iii) range and k-NN extended query algorithms to support the new partitioning method, including a new visit order of the subspaces in k-NN queries. Performance tests with both real-world and synthetic datasets showed that the Onion-tree is very compact. Comparisons of the Onion-tree with the MM-tree and a memory-based version of the Slim-tree showed that the Onion-tree was always faster to build the index. The experiments also showed that the Onion-tree significantly improved range and k-NN query processing performance and was the most efficient MAM, followed by the MM-tree, which in turn outperformed the Slim-tree in almost all the tests. (C) 2010 Elsevier B.V. All rights reserved.
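The general principle behind pivot-based metric access methods, pruning with the triangle inequality, can be sketched as follows. This is a generic flat pivot table, not the Onion-tree, the MM-tree or the Slim-tree; the names `build_pivot_table` and `range_query` and the one-dimensional toy data are assumptions for illustration.

```python
def build_pivot_table(data, pivots, distance):
    """Precompute the distance from every element to every pivot."""
    return [[distance(x, p) for p in pivots] for x in data]


def range_query(data, table, pivots, distance, query, radius):
    """Return the elements within `radius` of `query` and the number of distances computed."""
    q_to_pivots = [distance(query, p) for p in pivots]
    result, computed = [], 0
    for x, row in zip(data, table):
        # Triangle inequality: |d(q, p) - d(x, p)| <= d(q, x).
        # If that lower bound already exceeds the radius, x cannot qualify.
        if any(abs(qp - xp) > radius for qp, xp in zip(q_to_pivots, row)):
            continue
        computed += 1
        if distance(query, x) <= radius:
            result.append(x)
    return result, computed


def absolute_difference(a, b):
    return abs(a - b)


# Hypothetical 1-D data with a single pivot at 0.
data = [1, 2, 10, 11, 20]
pivots = [0]
table = build_pivot_table(data, pivots, absolute_difference)
hits, computed = range_query(data, table, pivots, absolute_difference, 10.5, 1)
print(hits, computed)
```

Here only two of the five elements require a real distance computation. Tree-structured MAMs apply the same bound to whole subspaces at once instead of element by element.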

Relevance:

60.00%

Abstract:

A myriad of methods are available for virtual screening of small organic compound databases. In this study we have successfully applied a quantitative model of consensus measurements, using a combination of 3D similarity searches (ROCS and EON), Hologram Quantitative Structure Activity Relationships (HQSAR) and docking (FRED, FlexX, Glide and AutoDock Vina), to retrieve cruzain inhibitors from collected databases. All methods were assessed individually and then combined in a Ligand-Based Virtual Screening (LBVS) and Target-Based Virtual Screening (TBVS) consensus scoring, using Receiver Operating Characteristic (ROC) curves to evaluate their performance. Three consensus strategies were used: scaled-rank-by-number, rank-by-rank and rank-by-vote. The scaled-rank-by-number strategy performed best: its steep ROC curve indicated a higher enrichment power at early retrieval of active compounds from the database. The ligand-based method provided access to a robust and predictive HQSAR model that was developed to show superior discrimination between active and inactive compounds, which was also better than ROCS and EON procedures. Overall, the integration of fast computational techniques based on ligand and target structures resulted in a more efficient retrieval of cruzain inhibitors with desired pharmacological profiles that may be useful to advance the discovery of new trypanocidal agents.
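The three consensus strategies named in this abstract can be sketched under common definitions, which are assumptions here since the abstract does not define them: scaled-rank-by-number averages min-max scaled scores, rank-by-rank averages per-method ranks, and rank-by-vote counts how many methods place a compound in their top fraction. The sketch also assumes that higher scores are better (scores where lower is better, as with many docking programs, would need to be negated first) and that every method scores the same set of compounds.

```python
def ranks(scores: dict) -> dict:
    """Rank compounds 1..n within one method, best (highest score) first."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(ordered)}


def scale(scores: dict) -> dict:
    """Min-max scale one method's scores to [0, 1] so methods become comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {n: (s - lo) / span if span else 0.0 for n, s in scores.items()}


def scaled_rank_by_number(methods: list) -> dict:
    """Average of the scaled scores across methods (higher is better)."""
    scaled = [scale(m) for m in methods]
    return {n: sum(s[n] for s in scaled) / len(scaled) for n in scaled[0]}


def rank_by_rank(methods: list) -> dict:
    """Average of the per-method ranks (lower is better)."""
    all_ranks = [ranks(m) for m in methods]
    return {n: sum(r[n] for r in all_ranks) / len(all_ranks) for n in all_ranks[0]}


def rank_by_vote(methods: list, top_fraction: float = 0.5) -> dict:
    """Number of methods that place the compound in their top fraction."""
    all_ranks = [ranks(m) for m in methods]
    cutoff = max(1, int(len(all_ranks[0]) * top_fraction))
    return {n: sum(1 for r in all_ranks if r[n] <= cutoff) for n in all_ranks[0]}


# Hypothetical scores from two methods for four compounds.
methods = [
    {"c1": 0.9, "c2": 0.5, "c3": 0.1, "c4": 0.3},
    {"c1": 7.0, "c2": 9.0, "c3": 1.0, "c4": 5.0},
]
print(scaled_rank_by_number(methods))
print(rank_by_rank(methods))
print(rank_by_vote(methods))
```

In this toy case rank-by-rank and rank-by-vote cannot separate c1 from c2, while the scaled scores can, because scaling preserves how far apart the compounds are within each method.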

Relevance:

30.00%

Abstract:

An important feature of a database management system (DBMS) is its client/server architecture, where managing shared memory among the clients and the server is always a tough issue. Similarity queries are especially sensitive to this kind of architecture, since the answer sizes vary widely. Usually, the answer to a similarity query is fully processed and sent in full to the user, who is often interested in just part of the answer, e.g. just a few elements closer to or farther from the query reference. Compelling the DBMS to retrieve the full answer, only to ignore most of it, is at least a waste of server processing power. Paging the answer is a technique that splits the answer into several pages, following client requests. Despite the success of paging in traditional queries, little work has been done to support it in similarity queries. In this work, we present a technique that not only provides paging in similarity range or k-nearest neighbor queries, but also supports them in two variations: the forward similarity query and the backward similarity query. They return elements either increasingly farther from or increasingly closer to the query reference. The reported experiments show that, depending on the proportion of the interesting part over the full answer, both techniques allow answering queries much faster than in the non-paged way. (C) 2010 Elsevier Inc. All rights reserved.
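The idea of serving a similarity answer page by page can be sketched with a lazy generator. This is not the technique from the paper: all distances are still computed up front and only the ordering is deferred, whereas an index-based method could avoid much of that work. The names `forward_query`, `backward_query` and `next_page` are hypothetical.

```python
import heapq
from itertools import islice


def forward_query(data, query, distance):
    """Yield (distance, element) pairs from nearest to farthest, one at a time."""
    heap = [(distance(query, x), i, x) for i, x in enumerate(data)]
    heapq.heapify(heap)  # linear time; each later pop costs O(log n)
    while heap:
        d, _, x = heapq.heappop(heap)
        yield d, x


def backward_query(data, query, distance):
    """Yield (distance, element) pairs from farthest to nearest, one at a time."""
    heap = [(-distance(query, x), i, x) for i, x in enumerate(data)]
    heapq.heapify(heap)
    while heap:
        neg_d, _, x = heapq.heappop(heap)
        yield -neg_d, x


def next_page(cursor, page_size):
    """Fetch the next page from an open cursor; an empty list means the answer is exhausted."""
    return list(islice(cursor, page_size))


def absolute_difference(a, b):
    return abs(a - b)


# Hypothetical 1-D data; the client asks for pages of two elements.
data = [1, 8, 3, 10, 5]
cursor = forward_query(data, 4, absolute_difference)
page1 = next_page(cursor, 2)
page2 = next_page(cursor, 2)
page3 = next_page(cursor, 2)
print(page1, page2, page3)
```

A client that stops after the first page never pays for sorting the rest of the answer, which mirrors the abstract's point that the benefit depends on how small the interesting part is relative to the full answer.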

Relevance:

30.00%

Abstract:

Schistosomiasis affects more than 200 million people worldwide; another 600 million are at risk of infection. The schistosomulum stage is believed to be the target of protective immunity in the attenuated cercaria vaccine model. In an attempt to identify genes up-regulated in the schistosomulum stage in relation to cercaria, we explored the Schistosoma mansoni transcriptome by looking at the relative frequency of reads in EST libraries from both stages. The 400 genes potentially up-regulated in schistosomula were analyzed as to their Gene Ontology categorization, and we have focused on those encoding predicted proteins with no similarity to proteins of other organisms, assuming they could be parasite-specific proteins important for survival in the host. Up-regulation in schistosomulum relative to cercaria was validated with real-time reverse transcription polymerase chain reaction (RT-PCR) for five out of nine selected genes (56%). We tested their protective potential in mice through immunization with DNA vaccines followed by a parasite challenge. Worm burden reductions of 16-17% were observed for one of them, indicating its protective potential. Our results demonstrate the value and caveats of using stage-associated frequency of ESTs as an indication of differential expression coupled to DNA vaccine screening in the identification of novel proteins to be further investigated as potential vaccine candidates.

Relevance:

20.00%

Abstract:

Primordial quark nuggets, remnants of the quark-hadron phase transition that may be hiding most of the baryon number in superdense chunks, have been discussed for years, always from a theoretical point of view. While they originally seemed fragile at intermediate cosmological temperatures, it became increasingly clear that they may survive due to a variety of effects affecting their evaporation (surface and volume) rates. A search for these objects has never been attempted to elucidate their existence. We discuss in this note how to search directly for cosmological fossil nuggets among the small asteroids approaching Earth. "Asteroids" with a high visible-to-infrared flux ratio, constant lightcurves and no spectral features are signals of a possible nugget nature. A viable search for very definite primordial quark nugget features can be conducted as a spinoff of the ongoing/forthcoming NEA observation programmes.

Relevance:

20.00%

Abstract:

Analysis of floristic similarity relationships between plant communities can detect patterns of species occurrence and also explain conditioning factors. Searching for such patterns, floristic similarity relationships among Atlantic Forest sites situated at Ibiuna Plateau, Sao Paulo state, Brazil, were analyzed by multivariate techniques. Twenty-one forest fragments and six sites within a continuous Forest Reserve were included in the analyses. Floristic composition and structure of the tree community (minimum dbh 5 cm) were assessed using the point-centered quarter method. Two methods were used for multivariate analysis: Detrended Correspondence Analysis (DCA) and Two-Way Indicator Species Analysis (TWINSPAN). Similarity relationships among the study areas were based on the successional stage of the community and also on spatial proximity. The more similar the successional stage of the communities, the higher the floristic similarity between them, especially if the communities are geographically close. A floristic gradient from north to south was observed, suggesting a transition between biomes, since northern indicator species are mostly heliophytes, occurring also in cerrado vegetation and seasonal semideciduous forest, while southern indicator species are mostly typical ombrophilous and climax species from typical dense evergreen Atlantic Forest.

Relevance:

20.00%

Abstract:

Background and aim: Knowledge about the genetic factors responsible for noise-induced hearing loss (NIHL) is still limited. This study investigated whether genetic factors are associated with susceptibility to NIHL. Subjects and methods: The family history and genotypes were studied for candidate genes in 107 individuals with NIHL, 44 with other causes of hearing impairment and 104 controls. Mutations frequently found among deaf individuals were investigated (35delG, 167delT in GJB2, Delta(GJB6-D13S1830), Delta(GJB6-D13S1854) in GJB6 and A1555G in MT-RNR1 genes); allelic and genotypic frequencies were also determined for the SNP rs877098 in DFNB1, for deletions of GSTM1 and GSTT1, and for sequence variants in both MTRNR1 and MTTS1 genes, as well as for mitochondrial haplogroups. Results: When those with NIHL were compared with the control group, a significant increase was detected in the number of relatives affected by hearing impairment, of the genotype corresponding to the presence of both GSTM1 and GSTT1 enzymes and of cases with mitochondrial haplogroup L1. Conclusion: The findings suggest effects of familial history of hearing loss, of GSTT1 and GSTM1 enzymes and of mitochondrial haplogroup L1 on the risk of NIHL. This study also described novel sequence variants of MTRNR1 and MTTS1 genes.

Relevance:

20.00%

Abstract:

Only a small fraction of spectra acquired in LC-MS/MS runs matches peptides from target proteins upon database searches. The remaining spectra, operationally termed background, originate from a variety of poorly controlled sources and affect the throughput and confidence of database searches. Here, we report an algorithm and its software implementation that rapidly removes background spectra, regardless of their precise origin. The method estimates the dissimilarity distance between screened MS/MS spectra and unannotated spectra from a partially redundant background library compiled from several control and blank runs. Filtering MS/MS queries enhanced the protein identification capacity when searches lacked spectrum-to-sequence matching specificity. In sequence-similarity searches it reduced by, on average, 30-fold the number of orphan hits, which were not explicitly related to background protein contaminants and required manual validation. Removing high-quality background MS/MS spectra, while preserving in the data set the genuine spectra from target proteins, decreased the false positive rate of stringent database searches and improved the identification of low-abundance proteins.
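Filtering query spectra against a background library by a dissimilarity distance can be sketched as below. The abstract does not specify the distance, so the cosine distance on binned peak intensities, the bin width, the threshold and the names `bin_spectrum`, `cosine_distance` and `remove_background` are all assumptions for illustration, not the published algorithm.

```python
import math


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (ASSUMED representation)."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine_distance(a, b):
    """1 - cosine similarity of two binned spectra; 1.0 if either is empty."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def remove_background(queries, background, threshold=0.1):
    """Keep only the query spectra that are far from every background spectrum."""
    library = [bin_spectrum(s) for s in background]
    kept = []
    for spectrum in queries:
        binned = bin_spectrum(spectrum)
        if all(cosine_distance(binned, ref) > threshold for ref in library):
            kept.append(spectrum)
    return kept


# Hypothetical spectra as lists of (m/z, intensity) peaks.
background = [[(100.2, 10.0), (200.4, 5.0)]]
queries = [
    [(100.3, 20.0), (200.1, 10.0)],  # same peak pattern as the background
    [(150.0, 8.0), (300.0, 4.0)],    # unrelated to the background
]
kept = remove_background(queries, background)
print(len(kept))
```

Because the comparison never consults a sequence database, a filter of this kind works regardless of where the background spectra come from, which matches the abstract's emphasis on origin-independent removal.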

Relevance:

20.00%

Abstract:

We analyzed ostriches from an equipped farm located in the Brazilian southeast region for the presence of Salmonella spp. This bacterium was investigated in 80 samples of ostrich droppings, 90 eggs, 30 samples of feed and 30 samples of droppings from rodents. Additionally, at the slaughterhouse this bacterium was investigated in droppings, caecal content, spleen, liver and carcasses from 90 slaughtered ostriches from the studied farm. Also, blood serum from those animals was collected and subjected to serum plate agglutination using commercial Salmonella Pullorum antigen. No Salmonella spp. was detected in any eggs, caecal content, liver, spleen, carcass or droppings from ostriches and rodents. However, Salmonella Javiana and Salmonella enterica subsp. enterica 4,12:i:- were isolated from some samples of feed. The serologic test was negative for all samples. Good sanitary farming management and the application of HACCP principles and GMP during the slaughtering process could explain the absence of Salmonella spp. in the tested samples.

Relevance:

20.00%

Abstract:

A novel technique for selecting the poles of orthonormal basis functions (OBF) in Volterra models of any order is presented. It is well-known that the usual large number of parameters required to describe the Volterra kernels can be significantly reduced by representing each kernel using an appropriate basis of orthonormal functions. Such a representation results in the so-called OBF Volterra model, which has a Wiener structure consisting of a linear dynamic generated by the orthonormal basis followed by a nonlinear static mapping given by the Volterra polynomial series. Aiming at optimizing the poles that fully parameterize the orthonormal bases, the exact gradients of the outputs of the orthonormal filters with respect to their poles are computed analytically by using a back-propagation-through-time technique. The expressions relative to the Kautz basis and to generalized orthonormal bases of functions (GOBF) are addressed; the ones related to the Laguerre basis follow straightforwardly as a particular case. The main innovation here is that the dynamic nature of the OBF filters is fully considered in the gradient computations. These gradients provide exact search directions for optimizing the poles of a given orthonormal basis. Such search directions can, in turn, be used as part of an optimization procedure to locate the minimum of a cost-function that takes into account the error of estimation of the system output. The Levenberg-Marquardt algorithm is adopted here as the optimization procedure. Unlike previous related work, the proposed approach relies solely on input-output data measured from the system to be modeled, i.e., no information about the Volterra kernels is required. Examples are presented to illustrate the application of this approach to the modeling of dynamic systems, including a real magnetic levitation system with nonlinear oscillatory behavior.

Relevance:

20.00%

Abstract:

Modern database applications are increasingly employing database management systems (DBMS) to store multimedia and other complex data. To adequately support the queries required to retrieve these kinds of data, the DBMS needs to answer similarity queries. However, the standard structured query language (SQL) does not provide effective support for such queries. This paper proposes an extension to SQL that seamlessly integrates syntactical constructions to express similarity predicates into the existing SQL syntax and describes the implementation of a similarity retrieval engine that allows posing similarity queries using the language extension in a relational DBMS. The engine allows the evaluation of every aspect of the proposed extension, including the data definition language and data manipulation language statements, and employs metric access methods to accelerate the queries. Copyright (c) 2008 John Wiley & Sons, Ltd.

Relevance:

20.00%

Abstract:

In the course of our research program to discover novel antileishmanial agents, a biological screening of natural products against Leishmania major promastigotes allowed the identification of a furoquinoline alkaloid (1) and a furanocoumarin (2) as new hits. Subsequently, an integrated ligand-based virtual screening approach was employed to search for new antileishmanial compounds using these naturally occurring molecules as templates. Fourteen out of 40 compounds selected from a database of about 800,000 compounds (extracted from ZINC, a free database for virtual screening) were experimentally confirmed to possess significant in vitro antileishmanial properties. The application of ligand-based virtual screening as a complementary approach to experimental natural product screening was a useful strategy to facilitate the identification of new promising lead candidates.

Relevance:

20.00%

Abstract:

We present the results of searches for dipolar-type anisotropies in different energy ranges above 2.5 × 10^17 eV with the surface detector array of the Pierre Auger Observatory, reporting on both the phase and the amplitude measurements of the first harmonic modulation in the right-ascension distribution. Upper limits on the amplitudes are obtained, which provide the most stringent bounds at present, being below 2% at 99% C.L. for EeV energies. We also compare our results to those of previous experiments as well as with some theoretical expectations. (C) 2011 Elsevier B.V. All rights reserved.