957 results for DATABASES


Relevance:

20.00%

Publisher:

Abstract:

Multi-relational data mining enables pattern mining across multiple tables. Existing multi-relational association rule mining algorithms cannot process large volumes of data because the memory they require exceeds the memory available. The proposed algorithm, MR-Radix, presents a framework that optimizes memory usage. It also uses partitioning to handle large volumes of data. The original contribution of this proposal is that it enables superior performance compared with other related algorithms and, moreover, successfully completes the task of mining association rules in large databases, bypassing the problem of available memory. One of the tests showed that MR-Radix uses fourteen times less memory than GFP-growth. © 2011 IEEE.

Relevance:

20.00%

Publisher:

Abstract:

Includes bibliography

Relevance:

20.00%

Publisher:

Abstract:

The unavailability of data to inform policy planning and formulation has been repeatedly cited as the main challenge to economic and social progress in the Caribbean. Furthermore, even in instances when data is produced, broader gaps exist between its production and eventual use for evidence-based policy formulation. Owing to those challenges, this report explores the use of databases of social and gender statistics in the development of policies and programmes in the Caribbean subregion. The report offers a general appraisal of databases against two main considerations: (i) maximizing the use of existing databases in relevant policies and programmes; and (ii) bridging the gaps in data availability of relevant statistical databases and their analyses. The assessment entailed an inventory of social and gender databases maintained by data producers in the region and an analysis of the extent to which the databases are used for policy formulation. To that end, a literature search and consultations with a number of knowledgeable persons active in the field of statistics and data provision were conducted. Based on the review, a set of recommendations was produced to improve current practices within the region with respect to evidence-based policy formulation.

Relevance:

20.00%

Publisher:

Abstract:

Given a large image set in which very few images have labels, how can we guess labels for the remaining majority? How can we spot images that need brand new labels different from the predefined ones? How can we summarize these data to route the user's attention to what really matters? Here we answer all these questions. Specifically, we propose QuMinS, a fast, scalable solution to two problems: (i) Low-labor labeling (LLL): given an image set in which very few images have labels, find the most appropriate labels for the rest; and (ii) Mining and attention routing: in the same setting, find clusters, the top-N_O outlier images, and the N_R images that best represent the data. Experiments on satellite images spanning up to 2.25 GB show that, in contrast to state-of-the-art labeling techniques, QuMinS scales linearly with the data size and is up to 40 times faster than top competitors (GCap) while still achieving better or equal accuracy; it spots images that potentially require unpredicted labels; and it works even with tiny initial label sets, i.e., nearly five examples. We also report a case study of our method's practical usage to show that QuMinS is a viable tool for automatic coffee crop detection from remote sensing images.

Relevance:

20.00%

Publisher:

Abstract:

[EN] In this paper, we address the challenge of gender classification using large databases of images with two goals. The first objective is to evaluate whether the error rate decreases compared to smaller databases. The second goal is to determine whether the classifier that provides the best classification rate for one database improves the classification results for other databases, that is, the cross-database performance.