10 results for Large database

in Repositório Institucional UNESP - Universidade Estadual Paulista "Julio de Mesquita Filho"


Relevance:

70.00%

Publisher:

Abstract:

Multi-relational data mining enables pattern mining from multiple tables. Existing multi-relational association rule mining algorithms cannot process large volumes of data because the memory they require exceeds the amount available. The proposed algorithm, MR-Radix, presents a framework that optimizes memory usage. It also uses the concept of partitioning to handle large volumes of data. The original contribution of this proposal is to enable superior performance compared with related algorithms and, moreover, to successfully complete the task of mining association rules in large databases, bypassing the problem of available memory. One of the tests showed that MR-Radix uses fourteen times less memory than GFP-growth. © 2011 IEEE.
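The partitioning idea mentioned in this abstract can be illustrated with a minimal sketch. This is not MR-Radix itself, whose data structures the abstract does not describe; it is the generic two-pass partition approach to frequent itemset mining, in which only one partition needs to be held in memory at a time. The function names, the `max_size` limit, and the brute-force local counting are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import ceil


def local_frequent(partition, min_frac, max_size):
    """Itemsets of up to max_size items that meet min_frac within one partition."""
    counts = Counter()
    for transaction in partition:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    threshold = ceil(min_frac * len(partition))
    return {itemset for itemset, n in counts.items() if n >= threshold}


def partitioned_frequent_itemsets(transactions, min_frac, n_partitions, max_size=2):
    """Two-pass partition mining (illustrative, not the MR-Radix algorithm)."""
    size = max(1, ceil(len(transactions) / n_partitions))
    partitions = [transactions[i:i + size] for i in range(0, len(transactions), size)]

    # Pass 1: an itemset that is frequent overall must be frequent in at
    # least one partition, so the union of local results is a safe candidate set.
    candidates = set()
    for part in partitions:
        candidates |= local_frequent(part, min_frac, max_size)

    # Pass 2: count only the candidates over the full data set.
    global_counts = Counter()
    for transaction in transactions:
        items = set(transaction)
        for candidate in candidates:
            if items.issuperset(candidate):
                global_counts[candidate] += 1

    threshold = ceil(min_frac * len(transactions))
    return {c: n for c, n in global_counts.items() if n >= threshold}
```

In a real setting each partition would be read from disk in turn rather than sliced from an in-memory list; the list is used here only to keep the sketch self-contained.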

Relevance:

60.00%

Publisher:

Abstract:

The isotypes of RAR and RXR are retinoic acid receptors and retinoid X receptors, respectively, whose ligand-binding domain contains the ligand-dependent activation function. They are distinct pharmacological targets for retinoids, which are involved in the treatment of various cancers and skin diseases. Because cancer treatment and cure remain a major challenge for the international scientific community after many decades, there is currently considerable interest in new ligands with increased bioactivity. We have focused on the retinoic acid receptor, which is considered an interesting target for drug design. In this work, we carried out density functional geometry optimizations and different docking procedures. We performed screening in a large database (hundreds of thousands of molecules, which we optimized at the AM1 level), yielding a set of potential bioactive ligands. A new ligand was selected and optimized at the B3LYP/6-31G* level. A flexible docking program was used to investigate the interactions between the receptor and the new ligand. The result of this work is compared with several crystallographic ligands of RAR. Our new ligand, theoretically more bioactive, shows stronger and more numerous hydrogen bonds, as well as hydrophobic interactions, with the receptor. (c) 2005 Wiley Periodicals, Inc.

Relevance:

60.00%

Publisher:

Abstract:

In this paper we propose a novel method for shape analysis called HTS (Hough Transform Statistics), which uses statistics from Hough Transform space to characterize the shape of objects in digital images. Experimental results showed that the HTS descriptor is robust and more accurate than some traditional shape description methods. Furthermore, the HTS algorithm has linear complexity, which is an important requirement for content-based image retrieval from large databases. © 2013 IEEE.

Relevance:

60.00%

Publisher:

Abstract:

Graduate Program in Agronomy (Plant Production) - FCAV

Relevance:

40.00%

Publisher:

Abstract:

Most biometric researchers focus on the accuracy of matching using biometric databases, including iris databases, while scalability and speed issues have been neglected. In applications such as identification in airports and at borders, it is critical for the identification system to have a low response time. In this paper, a graph-based framework for pattern recognition, called Optimum-Path Forest (OPF), is used as a classifier in a pre-developed iris recognition system. The aim of this paper is to verify the effectiveness of OPF in the field of iris recognition and its performance on iris databases of various scales. This paper investigates several classifiers that are widely used in iris recognition papers, comparing their response time along with their accuracy. The existing Gauss-Laguerre wavelet based iris coding scheme, which shows perfect discrimination with a rotary Hamming distance classifier, is used for iris coding. The performance of the classifiers is compared using small, medium, and large scale databases. This comparison shows that OPF has a faster response for large scale databases, thus performing better than the more accurate but slower Bayesian classifier.
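The abstract does not define the rotary Hamming distance. A common reading in iris recognition is the minimum normalized Hamming distance over circular shifts of one code, which compensates for eye rotation. The sketch below follows that assumption; the function names, the bit-list representation, and the `max_shift` parameter are illustrative and not taken from the paper.

```python
def hamming_fraction(code_a, code_b):
    """Fraction of positions at which two equal-length bit sequences differ."""
    if len(code_a) != len(code_b) or not code_a:
        raise ValueError("codes must be non-empty and of equal length")
    return sum(x != y for x, y in zip(code_a, code_b)) / len(code_a)


def rotary_hamming(code_a, code_b, max_shift):
    """Smallest Hamming fraction over circular shifts of code_b.

    Shifts from -max_shift to +max_shift positions are tried (an assumed
    search window; the source does not specify one).
    """
    best = 1.0
    for shift in range(-max_shift, max_shift + 1):
        k = shift % len(code_b)
        rotated = code_b[k:] + code_b[:k]
        best = min(best, hamming_fraction(code_a, rotated))
    return best
```

A classifier built on this distance would assign a probe code to the enrolled identity with the smallest distance, which requires comparing against every enrolled code. That linear scan is one reason response time grows with database size.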

Relevance:

30.00%

Publisher:

Abstract:

Tick-borne zoonoses (TBZ) are emerging diseases worldwide. A large amount of information (e.g. case reports, results of epidemiological surveillance, etc.) is dispersed through various reference sources (ISI and non-ISI journals, conference proceedings, technical reports, etc.). An integrated database, derived from the ICTTD-3 project (http://www.icttd.nl), was developed in order to gather TBZ records in the (sub-)tropics, collected both by the authors and by collaborators worldwide. A dedicated website (http://www.tickbornezoonoses.org) was created to promote collaboration and circulate information. The data collected are made freely available to researchers for analysis by spatial methods, integrating mapped ecological factors for predicting TBZ risk. The authors present the assembly process of the TBZ database: the compilation of an updated list of TBZ relevant for the (sub-)tropics, the database design and its structure, the method of bibliographic search, and the assessment of the spatial precision of geo-referenced records. At the time of writing, 725 records extracted from 337 publications related to 59 countries in the (sub-)tropics have been entered in the database. TBZ distribution maps were also produced. Imported cases have also been accounted for. The most important datasets with geo-referenced records were those on Spotted Fever Group rickettsiosis in Latin America and Crimean-Congo Haemorrhagic Fever in Africa. The authors stress the need for international collaboration in data collection to update and improve the database. Supervision of the data entered always remains necessary. Means to foster collaboration are discussed. The paper is also intended to describe the challenges encountered in assembling spatial data from various sources and to help develop similar data collections.

Relevance:

30.00%

Publisher:

Abstract:

Background: The functional and structural characterisation of enzymes that belong to microbial metabolic pathways is very important for structure-based drug design. The main interest in studying shikimate pathway enzymes involves the fact that they are essential for bacteria but do not occur in humans, making them selective targets for design of drugs that do not directly impact humans. Description: The ShiKimate Pathway DataBase (SKPDB) is a relational database applied to the study of shikimate pathway enzymes in microorganisms and plants. The current database is updated regularly with the addition of new data; there are currently 8902 enzymes of the shikimate pathway from different sources. The database contains extensive information on each enzyme, including detailed descriptions about sequence, references, and structural and functional studies. All files (primary sequence, atomic coordinates and quality scores) are available for downloading. The modeled structures can be viewed using the Jmol program. Conclusions: The SKPDB provides a large number of structural models to be used in docking simulations, virtual screening initiatives and drug design. It is freely accessible at http://lsbzix.rc.unesp.br/skpdb/. © 2010 Arcuri et al; licensee BioMed Central Ltd.

Relevance:

30.00%

Publisher:

Abstract:

A significant body of information stored in different databases around the world can be shared through peer-to-peer databases. This yields a large knowledge base without the need for large investments, because existing databases and the infrastructure already in place are used. However, the structural characteristics of peer-to-peer networks make the process of finding such information complex. Moreover, these databases are often heterogeneous in their schemas but semantically similar in their content. A good peer-to-peer database system should allow the user to access information from databases scattered across the network and to receive only the information that really relates to the topic of interest. This paper proposes the use of ontologies in peer-to-peer database queries to represent the semantics inherent in the data. The main contributions of this work are to enable integration between heterogeneous databases, to improve the performance of such queries, and to use the Ant Colony optimization algorithm to solve the problem of locating information on peer-to-peer networks, which yields an improvement of 18% in results. © 2011 IEEE.

Relevance:

30.00%

Publisher:

Abstract:

Landscape fires show large variability in the amount of biomass or fuel consumed per unit area burned. Fuel consumption (FC) depends on the biomass available to burn and the fraction of the biomass that is actually combusted, and can be combined with estimates of area burned to assess emissions. While burned area can be detected from space and estimates are becoming more reliable due to improved algorithms and sensors, FC is usually modeled or taken selectively from the literature. We compiled the peer-reviewed literature on FC for various biomes and fuel categories to understand FC and its variability better, and to provide a database that can be used to constrain biogeochemical models with fire modules. We compiled in total 77 studies covering 11 biomes including savanna (15 studies, average FC of 4.6 t DM (dry matter) ha⁻¹ with a standard deviation of 2.2), tropical forest (n = 19, FC = 126 +/- 77), temperate forest (n = 12, FC = 58 +/- 72), boreal forest (n = 16, FC = 35 +/- 24), pasture (n = 4, FC = 28 +/- 9.3), shifting cultivation (n = 2, FC = 23, with a range of 4.0-43), crop residue (n = 4, FC = 6.5 +/- 9.0), chaparral (n = 3, FC = 27 +/- 19), tropical peatland (n = 4, FC = 314 +/- 196), boreal peatland (n = 2, FC = 42 [42-43]), and tundra (n = 1, FC = 40). Within biomes the regional variability in the number of measurements was sometimes large, with, e.g., only three measurement locations in boreal Russia and 35 sites in North America. Substantial regional differences in FC were found within the defined biomes: for example, FC of temperate pine forests in the USA was 37% lower than that of Australian forests dominated by eucalypt trees. Besides showing the differences between biomes, FC estimates were also grouped into different fuel classes. Our results highlight the large variability in FC, not only between biomes but also within biomes and fuel classes. This implies that substantial uncertainties are associated with using biome-averaged values to represent FC for whole biomes. Comparing the compiled FC values with co-located Global Fire Emissions Database version 3 (GFED3) FC indicates that modeling studies that aim to represent variability in FC within biomes as well still require improvements, as they have difficulty in representing the dynamics governing FC.
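The statement that FC can be combined with burned area to assess emissions can be made concrete with a small arithmetic sketch. The abstract gives no formula, so the one used here is an assumption: dry matter burned is area times FC, and emissions of a species are dry matter burned times an emission factor. The function names, the example area, and the emission factor value are hypothetical; only the savanna FC of 4.6 t DM per hectare comes from the abstract.

```python
def dry_matter_burned(area_burned_ha, fc_t_dm_per_ha):
    """Dry matter consumed (t DM) = area burned (ha) x fuel consumption (t DM/ha)."""
    if area_burned_ha < 0 or fc_t_dm_per_ha < 0:
        raise ValueError("area and fuel consumption must be non-negative")
    return area_burned_ha * fc_t_dm_per_ha


def emissions_t(area_burned_ha, fc_t_dm_per_ha, emission_factor_g_per_kg):
    """Emitted mass of one species in tonnes.

    An emission factor in g per kg DM is numerically equal to kg per t DM,
    so multiplying by t DM gives kg, and dividing by 1000 gives tonnes.
    """
    burned = dry_matter_burned(area_burned_ha, fc_t_dm_per_ha)
    return burned * emission_factor_g_per_kg / 1000.0


# Savanna FC from the abstract; the area and emission factor are hypothetical.
savanna_burned = dry_matter_burned(area_burned_ha=1000.0, fc_t_dm_per_ha=4.6)
savanna_emitted = emissions_t(1000.0, 4.6, emission_factor_g_per_kg=2.0)
```

Because the result scales linearly with FC, the within-biome spread reported in the abstract (for example a standard deviation of 2.2 on a savanna mean of 4.6) carries over proportionally into any emission estimate built on a biome-averaged value.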