3 results for Reduct

at Brock University, Canada


Relevance:

20.00%

Publisher:

Abstract:

Rough Set Data Analysis (RSDA) is a non-invasive data analysis approach that relies solely on the data to find patterns and decision rules. Despite its non-invasive nature and its ability to generate human-readable rules, classical RSDA has not been successfully used in commercial data mining and rule-generating engines. The reason is scalability: classical RSDA slows down considerably on larger data sets and takes much longer to generate rules. This research aims to address the scalability of rough sets by improving the performance of the attribute reduction step of classical RSDA, which is the root cause of its slow performance. We propose moving the entire attribute reduction process into the database. We define a new schema to store the initial data set, and then define SQL queries on this schema that find the attribute reducts correctly and faster than the traditional RSDA approach. We tested our technique on two typical data sets and compared the results with the traditional RSDA approach to attribute reduction. Finally, we highlight some issues with the proposed approach that could lead to future research.
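
The idea of pushing the reduct search into the database can be illustrated with a small sketch. This is not the schema or the queries from the thesis, which the abstract does not give. It assumes a hypothetical single table `decision_table` with condition attributes `a`, `b`, `c` and a decision column `d`, assumes the full table is consistent, and uses a plain brute-force search over attribute subsets. The only work done in SQL is the consistency test: a subset of attributes is acceptable when no group of rows that agree on those attributes contains more than one decision value.

```python
import sqlite3
from itertools import combinations

def is_consistent(conn, table, attrs, decision):
    """True if no two rows agree on `attrs` but differ on `decision`.

    Table and column names are interpolated directly, so they must be trusted.
    """
    if not attrs:
        sql = f"SELECT COUNT(DISTINCT {decision}) FROM {table}"
        return conn.execute(sql).fetchone()[0] <= 1
    cols = ", ".join(attrs)
    sql = (f"SELECT 1 FROM {table} GROUP BY {cols} "
           f"HAVING COUNT(DISTINCT {decision}) > 1 LIMIT 1")
    return conn.execute(sql).fetchone() is None

def find_reducts(conn, table, attrs, decision):
    """Brute force: minimal attribute subsets that keep the table consistent."""
    reducts = []
    for k in range(len(attrs) + 1):
        for subset in combinations(attrs, k):
            # A superset of a known reduct cannot be minimal.
            if any(set(r) <= set(subset) for r in reducts):
                continue
            if is_consistent(conn, table, subset, decision):
                reducts.append(subset)
    return reducts

# Hypothetical toy data: attribute c duplicates attribute a.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE decision_table (a INTEGER, b INTEGER, c INTEGER, d TEXT)")
conn.executemany(
    "INSERT INTO decision_table VALUES (?, ?, ?, ?)",
    [(0, 0, 0, "n"), (0, 1, 0, "y"), (1, 0, 1, "y"), (1, 1, 1, "y")],
)
reducts = find_reducts(conn, "decision_table", ["a", "b", "c"], "d")
print(reducts)  # [('a', 'b'), ('b', 'c')]
```

The subset enumeration is exponential in the number of attributes, so this only shows where the grouping work could be delegated to the database engine; it makes no claim about the performance gains reported in the abstract.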

Relevance:

10.00%

Publisher:

Abstract:

This project aimed to determine the protein profiles and concentration in honeys, the effect of storage conditions on the protein content, and the interaction between proteins and polyphenols. Thirteen honeys from different botanical origins were analyzed for their protein profiles using SDS-PAGE, and for protein concentration and phenolic content using the Pierce Protein Assay and Folin-Ciocalteu methods, respectively. Protein-polyphenol interactions were analyzed by a combination of the extraction of honeys with solvents of different polarities followed by LC/MS analysis of the obtained fractions. Results demonstrated a different protein content in the tested honeys, with buckwheat honey possessing the highest protein concentration. We have shown that the reduction of proteins during honey storage was caused, partially, by protein complexation with phenolics. The LC/MS analysis of the peak eluting at a retention time of 10 to 14 min demonstrated that these phenolics included flavonoids such as pinobanksin, pinobanksin acetate, apigenin, kaempferol and myricetin, and also cinnamic acid.

Relevance:

10.00%

Publisher:

Abstract:

Feature selection plays an important role in knowledge discovery and data mining. In traditional rough set theory, feature selection using a reduct (a minimal discerning set of attributes) is an important area. Nevertheless, the original definition of a reduct is restrictive, so a previous study proposed taking into account not only the horizontal reduction of information by feature selection, but also a vertical reduction that considers suitable subsets of the original set of objects. Following that work, a new approach to generating bireducts using a multi-objective genetic algorithm was proposed. Although genetic algorithms have been used to calculate reducts in some previous works, we did not find any work in which genetic algorithms were adopted to calculate bireducts. Compared to earlier work in this area, the proposed method has less randomness in generating bireducts. The genetic algorithm system estimated the quality of each bireduct by the values of two objective functions as evolution progressed, so that a set of bireducts with optimized values of these objectives was obtained. Different fitness evaluation methods and genetic operators, such as crossover and mutation, were applied and the prediction accuracies were compared. Five datasets were used to test the proposed method and two datasets were used to perform a comparison study. Statistical analysis using the one-way ANOVA test was performed to determine whether the differences between the results were significant. The experiment showed that the proposed method was able to reduce the number of bireducts necessary to achieve good prediction accuracy. The influence of different genetic operators and fitness evaluation strategies on the prediction accuracy was also analyzed. It was shown that the prediction accuracies of the proposed method are comparable with the best results in the machine learning literature, and some of them outperformed those results.
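
To make the horizontal and vertical reduction concrete, the sketch below checks whether a pair (attribute subset, object subset) is a bireduct. The abstract does not state the formal definition, so the code assumes the one commonly given in the rough set literature: the attributes must discern every pair of chosen objects with different decisions, no attribute can be dropped, and no object can be added without breaking that property. The function names and the toy table are hypothetical, and the genetic algorithm itself is not reproduced here.

```python
from itertools import combinations

def discerns(rows, decisions, attrs, objs):
    """True if every pair in `objs` with different decisions differs on some attribute in `attrs`."""
    for i, j in combinations(sorted(objs), 2):
        if decisions[i] != decisions[j] and all(rows[i][a] == rows[j][a] for a in attrs):
            return False
    return True

def is_bireduct(rows, decisions, attrs, objs):
    """Check the assumed bireduct conditions for the pair (attrs, objs)."""
    attrs, objs = set(attrs), set(objs)
    if not discerns(rows, decisions, attrs, objs):
        return False
    # Attribute minimality: removing any single attribute must break discernibility.
    # Single removals suffice because discernibility is monotone in the attribute set.
    if any(discerns(rows, decisions, attrs - {a}, objs) for a in attrs):
        return False
    # Object maximality: adding any single outside object must break discernibility.
    outside = set(range(len(rows))) - objs
    return not any(discerns(rows, decisions, attrs, objs | {o}) for o in outside)

# Hypothetical toy table; object 3 conflicts with object 0 (same values, different decision).
rows = [
    {"a": 0, "b": 0},
    {"a": 0, "b": 1},
    {"a": 1, "b": 0},
    {"a": 0, "b": 0},
]
decisions = ["n", "y", "y", "y"]

print(is_bireduct(rows, decisions, {"a", "b"}, {0, 1, 2}))  # True
print(is_bireduct(rows, decisions, {"a", "b"}, {1, 2}))     # False: no attribute is needed here
print(is_bireduct(rows, decisions, set(), {1, 2, 3}))       # True: all chosen objects share a decision
```

A search procedure such as the genetic algorithm described above would need a validity check of this kind alongside its objective functions; which two objectives the thesis optimizes is not stated in the abstract.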