3 resultados para Data Mining, Rough Sets, Multi-Dimension, Association Rules, Constraint

em Brock University, Canada


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Rough Set Data Analysis (RSDA) is a non-invasive data analysis approach that solely relies on the data to find patterns and decision rules. Despite its noninvasive approach and ability to generate human readable rules, classical RSDA has not been successfully used in commercial data mining and rule generating engines. The reason is its scalability. Classical RSDA slows down a great deal with the larger data sets and takes much longer times to generate the rules. This research is aimed to address the issue of scalability in rough sets by improving the performance of the attribute reduction step of the classical RSDA - which is the root cause of its slow performance. We propose to move the entire attribute reduction process into the database. We defined a new schema to store the initial data set. We then defined SOL queries on this new schema to find the attribute reducts correctly and faster than the traditional RSDA approach. We tested our technique on two typical data sets and compared our results with the traditional RSDA approach for attribute reduction. In the end we also highlighted some of the issues with our proposed approach which could lead to future research.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Feature selection plays an important role in knowledge discovery and data mining nowadays. In traditional rough set theory, feature selection using reduct - the minimal discerning set of attributes - is an important area. Nevertheless, the original definition of a reduct is restrictive, so in one of the previous research it was proposed to take into account not only the horizontal reduction of information by feature selection, but also a vertical reduction considering suitable subsets of the original set of objects. Following the work mentioned above, a new approach to generate bireducts using a multi--objective genetic algorithm was proposed. Although the genetic algorithms were used to calculate reduct in some previous works, we did not find any work where genetic algorithms were adopted to calculate bireducts. Compared to the works done before in this area, the proposed method has less randomness in generating bireducts. The genetic algorithm system estimated a quality of each bireduct by values of two objective functions as evolution progresses, so consequently a set of bireducts with optimized values of these objectives was obtained. Different fitness evaluation methods and genetic operators, such as crossover and mutation, were applied and the prediction accuracies were compared. Five datasets were used to test the proposed method and two datasets were used to perform a comparison study. Statistical analysis using the one-way ANOVA test was performed to determine the significant difference between the results. The experiment showed that the proposed method was able to reduce the number of bireducts necessary in order to receive a good prediction accuracy. Also, the influence of different genetic operators and fitness evaluation strategies on the prediction accuracy was analyzed. It was shown that the prediction accuracies of the proposed method are comparable with the best results in machine learning literature, and some of them outperformed it.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This study examines the efficiency of search engine advertising strategies employed by firms. The research setting is the online retailing industry, which is characterized by extensive use of Web technologies and high competition for market share and profitability. For Internet retailers, search engines are increasingly serving as an information gateway for many decision-making tasks. In particular, Search engine advertising (SEA) has opened a new marketing channel for retailers to attract new customers and improve their performance. In addition to natural (organic) search marketing strategies, search engine advertisers compete for top advertisement slots provided by search brokers such as Google and Yahoo! through keyword auctions. The rationale being that greater visibility on a search engine during a keyword search will capture customers' interest in a business and its product or service offerings. Search engines account for most online activities today. Compared with the slow growth of traditional marketing channels, online search volumes continue to grow at a steady rate. According to the Search Engine Marketing Professional Organization, spending on search engine marketing by North American firms in 2008 was estimated at $13.5 billion. Despite the significant role SEA plays in Web retailing, scholarly research on the topic is limited. Prior studies in SEA have focused on search engine auction mechanism design. In contrast, research on the business value of SEA has been limited by the lack of empirical data on search advertising practices. Recent advances in search and retail technologies have created datarich environments that enable new research opportunities at the interface of marketing and information technology. This research uses extensive data from Web retailing and Google-based search advertising and evaluates Web retailers' use of resources, search advertising techniques, and other relevant factors that contribute to business performance across different metrics. The methods used include Data Envelopment Analysis (DEA), data mining, and multivariate statistics. This research contributes to empirical research by analyzing several Web retail firms in different industry sectors and product categories. One of the key findings is that the dynamics of sponsored search advertising vary between multi-channel and Web-only retailers. While the key performance metrics for multi-channel retailers include measures such as online sales, conversion rate (CR), c1ick-through-rate (CTR), and impressions, the key performance metrics for Web-only retailers focus on organic and sponsored ad ranks. These results provide a useful contribution to our organizational level understanding of search engine advertising strategies, both for multi-channel and Web-only retailers. These results also contribute to current knowledge in technology-driven marketing strategies and provide managers with a better understanding of sponsored search advertising and its impact on various performance metrics in Web retailing.