19 results for databases and data mining
Abstract:
Multi-relational data mining enables pattern mining across multiple tables. Existing multi-relational association rule mining algorithms cannot process large volumes of data, because the memory they require exceeds the memory available. The proposed algorithm, MR-Radix, provides a framework that optimizes memory usage. It also uses partitioning to handle large volumes of data. The original contribution of this proposal is superior performance compared with related algorithms and, moreover, successful completion of association rule mining in large databases, bypassing the limit of available memory. In one of the tests, MR-Radix used fourteen times less memory than GFP-growth. © 2011 IEEE.
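The partitioning idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes the classic two-pass partition scheme for frequent itemsets and is not the MR-Radix algorithm itself, whose tree structures the abstract does not describe. The function names, the `max_size` limit, and the sample transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_frequent(partition, min_ratio, max_size=2):
    """Return itemsets whose support inside one partition reaches min_ratio."""
    counts = Counter()
    for transaction in partition:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    threshold = min_ratio * len(partition)
    return {itemset for itemset, n in counts.items() if n >= threshold}


def partitioned_frequent(transactions, min_ratio, partition_size, max_size=2):
    """Two-pass mining: local candidates per partition, then a global recount."""
    # Pass 1: only one partition's counters exist at a time (assumed memory bound).
    candidates = set()
    for start in range(0, len(transactions), partition_size):
        chunk = transactions[start:start + partition_size]
        candidates |= local_frequent(chunk, min_ratio, max_size)
    # Pass 2: count global support for the candidate itemsets only.
    counts = Counter()
    for transaction in transactions:
        items = set(transaction)
        for itemset in candidates:
            if items.issuperset(itemset):
                counts[itemset] += 1
    threshold = min_ratio * len(transactions)
    return {itemset: n for itemset, n in counts.items() if n >= threshold}


# Hypothetical sample data, for illustration only.
transactions = [
    ["a", "b"], ["a", "b", "c"], ["a", "c"],
    ["b", "c"], ["a", "b"], ["d"],
]
result = partitioned_frequent(transactions, min_ratio=0.5, partition_size=3)
```

The scheme is safe because an itemset that is frequent over the whole dataset must be frequent in at least one partition, so the first pass never discards a true result.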
Abstract:
Data mining of the Eucalyptus ESTs genome found four clusters (EGCEST2257E11.g, EGBGRT3213F11.g, and EGCCFB1223H11.g) from the highly conserved 14-3-3 protein family, which modulates a wide variety of cellular processes. Multiple alignments were built from twenty-four 14-3-3 protein sequences retrieved from the GenBank databases and from the four pools of the Eucalyptus genome programs. The alignment showed two highly conserved regions corresponding to protein phosphorylation motifs and nine highly conserved regions corresponding to the linkage regions of the alpha-helix structure, based on the three-dimensional structure of the functional dimer. Amino acid differences within the structural and functional domains of plant 14-3-3 proteins were identified and can explain the functional diversity of the different isoforms. Phylogenetic protein trees were built by the maximum parsimony and neighbor-joining procedures, using Clustal X alignments and the PAUP software for phylogenetic analysis.
Abstract:
In a peer-to-peer network, the nodes interact with each other by sharing resources, services, and information. Many applications have been developed on such networks, and peer-to-peer databases are one class of them. Peer-to-peer database systems allow the sharing of unstructured data and can integrate data from several sources without large investments, because they use existing repositories. However, the high flexibility and dynamicity of the network, as well as the absence of centralized management of information, make locating information among the various participants in the network a complex process. In this context, this paper presents original contributions through a proposed architecture for a routing system that uses the Ant Colony algorithm to optimize the search for desired information, supported by ontologies that add semantics to the shared data. This enables integration among heterogeneous databases while seeking to reduce message traffic on the network without reducing the number of responses, as confirmed by a 22.5% improvement in that number. © 2011 IEEE.
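As a rough illustration of how ant-colony-style routing can steer queries toward peers that have answered before, the sketch below keeps a pheromone level per neighbour. It is only an assumption about the general technique, not the architecture proposed in the paper, and it leaves out the ontology layer entirely. The class name, parameters, and peer identifiers are hypothetical.

```python
import random


class PheromoneTable:
    """Per-peer pheromone levels used to pick the next hop for a query."""

    def __init__(self, neighbours, initial=1.0, evaporation=0.1):
        self.levels = {n: initial for n in neighbours}
        self.evaporation = evaporation  # fraction lost on every update

    def probabilities(self):
        """Forwarding probability of each neighbour, proportional to its level."""
        total = sum(self.levels.values())
        return {n: level / total for n, level in self.levels.items()}

    def choose(self, rng):
        """Pick one neighbour at random, weighted by pheromone level."""
        neighbours = list(self.levels)
        weights = [self.levels[n] for n in neighbours]
        return rng.choices(neighbours, weights=weights, k=1)[0]

    def update(self, neighbour, answers_found):
        """Evaporate all trails, then reinforce the neighbour that returned answers."""
        for n in self.levels:
            self.levels[n] *= 1.0 - self.evaporation
        self.levels[neighbour] += answers_found


# Hypothetical peers, for illustration only.
table = PheromoneTable(["p1", "p2", "p3"])
table.update("p2", answers_found=2)
next_hop = table.choose(random.Random(0))
```

Forwarding a query to a few weighted choices instead of flooding every neighbour is what would reduce message traffic in a scheme like this, while evaporation lets the table adapt when peers leave or their content changes.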
Abstract:
The multi-relational data mining approach has emerged as an alternative for the analysis of structured data, such as relational databases. Unlike traditional algorithms, multi-relational proposals mine multiple tables directly, avoiding costly join operations. This paper presents a comparative study of the traditional Patricia Mine algorithm and its multi-relational counterpart, MR-Radix, in order to evaluate the performance of the two approaches for mining association rules in relational databases. The study presents two original contributions: the proposal of a multi-relational algorithm, MR-Radix, which is efficient for relational databases in terms of both execution time and memory usage; and empirical evidence that the multi-relational approach has a performance advantage over several tables, since it avoids costly join operations across multiple tables. © 2011 IEEE.