949 results for rough sets analysis


Relevance: 100.00%

Abstract:

This paper introduces a new technique in the investigation of limited-dependent variable models. This paper illustrates that variable precision rough set theory (VPRS), allied with the use of a modern method of classification, or discretisation of data, can out-perform the more standard approaches that are employed in economics, such as a probit model. These approaches and certain inductive decision tree methods are compared (through a Monte Carlo simulation approach) in the analysis of the decisions reached by the UK Monopolies and Mergers Committee. We show that, particularly in small samples, the VPRS model can improve on more traditional models, both in-sample, and particularly in out-of-sample prediction. A similar improvement in out-of-sample prediction over the decision tree methods is also shown.
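
The central idea of VPRS can be sketched in a few lines. This is a toy illustration and not the paper's model: the attribute names (`share`, `entry`, `blocked`) and the data are invented, and `beta` is taken here as the minimum fraction of an equivalence class that must share the target decision, which is one of the conventions used for VPRS.

```python
from collections import defaultdict

def beta_lower_approximation(rows, condition_attrs, decision_attr, target, beta=0.8):
    """Indices of objects whose equivalence class contains the target
    decision in at least a fraction `beta` of its members."""
    classes = defaultdict(list)
    for i, row in enumerate(rows):
        # Objects with identical condition values are indiscernible.
        classes[tuple(row[a] for a in condition_attrs)].append(i)
    lower = set()
    for members in classes.values():
        hits = sum(1 for i in members if rows[i][decision_attr] == target)
        if hits / len(members) >= beta:
            lower.update(members)
    return lower

# Hypothetical merger cases (invented for illustration only).
rows = [
    {"share": "high", "entry": "low", "blocked": "yes"},
    {"share": "high", "entry": "low", "blocked": "yes"},
    {"share": "high", "entry": "low", "blocked": "yes"},
    {"share": "high", "entry": "low", "blocked": "no"},
    {"share": "low", "entry": "low", "blocked": "no"},
    {"share": "low", "entry": "low", "blocked": "no"},
]

relaxed = beta_lower_approximation(rows, ["share", "entry"], "blocked", "yes", beta=0.7)
strict = beta_lower_approximation(rows, ["share", "entry"], "blocked", "yes", beta=1.0)
```

With `beta=1.0` the function reduces to the classical lower approximation, so the single dissenting case leaves the "yes" region empty; relaxing `beta` to 0.7 admits the whole (high, low) class, which is 75% "yes".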

Relevance: 100.00%

Abstract:

This paper presents a novel two-stage information filtering model which combines the merits of term-based and pattern-based approaches to effectively filter a sheer volume of information. In particular, the first filtering stage is supported by a novel rough analysis model which efficiently removes a large number of irrelevant documents, thereby addressing the overload problem. The second filtering stage is empowered by a semantically rich pattern taxonomy mining model which effectively fetches incoming documents according to the specific information needs of a user, thereby addressing the mismatch problem. Experiments were conducted to compare the proposed two-stage filtering (T-SM) model with other possible "term-based + pattern-based" or "term-based + term-based" IF models. The results based on the RCV1 corpus show that the T-SM model significantly outperforms other types of "two-stage" IF models.

Relevance: 100.00%

Abstract:

Q. Shen and R. Jensen, 'Rough sets, their extensions and applications,' International Journal of Automation and Computing (IJAC), vol. 4, no. 3, pp. 217-218, 2007.

Relevance: 100.00%

Abstract:

Rough Set Data Analysis (RSDA) is a non-invasive data analysis approach that relies solely on the data to find patterns and decision rules. Despite its non-invasive approach and ability to generate human-readable rules, classical RSDA has not been successfully used in commercial data mining and rule-generating engines. The reason is its scalability: classical RSDA slows down considerably on larger data sets and takes much longer to generate the rules. This research aims to address the issue of scalability in rough sets by improving the performance of the attribute reduction step of classical RSDA, which is the root cause of its slow performance. We propose to move the entire attribute reduction process into the database. We defined a new schema to store the initial data set. We then defined SQL queries on this new schema to find the attribute reducts correctly and faster than the traditional RSDA approach. We tested our technique on two typical data sets and compared our results with the traditional RSDA approach for attribute reduction. Finally, we highlight some of the issues with our proposed approach, which could lead to future research.
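
To make the idea of pushing attribute reduction into the database concrete, here is a minimal sketch using Python's built-in `sqlite3`. It does not reproduce the schema or queries of the paper, which are not given in the abstract; the table `t`, its columns `a`, `b`, `c`, `d` and the helper `inconsistent_groups` are all hypothetical. The query counts groups of objects that agree on the chosen condition attributes but disagree on the decision; a subset with zero such groups preserves the consistency of the table and is therefore a candidate reduct.

```python
import sqlite3

def inconsistent_groups(conn, table, attrs, decision):
    """Count condition-value groups that map to more than one decision.
    Identifiers are interpolated directly, so they must come from
    trusted code, never from user input."""
    cols = ", ".join(attrs)
    sql = (
        f"SELECT COUNT(*) FROM ("
        f"SELECT {cols} FROM {table} "
        f"GROUP BY {cols} HAVING COUNT(DISTINCT {decision}) > 1)"
    )
    return conn.execute(sql).fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER, d TEXT)")
# Toy decision table in which d depends on a and b together.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (0, 0, 0, "no"),
        (0, 1, 1, "yes"),
        (1, 0, 1, "yes"),
        (1, 1, 0, "yes"),
        (0, 0, 1, "no"),
    ],
)

full = inconsistent_groups(conn, "t", ["a", "b", "c"], "d")   # 0: table is consistent
without_c = inconsistent_groups(conn, "t", ["a", "b"], "d")   # 0: c is redundant
only_a = inconsistent_groups(conn, "t", ["a"], "d")           # 1: a alone is not enough
```

Because the grouping and counting happen inside the database engine, only a single integer crosses back into the application for each candidate subset.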

Relevance: 100.00%

Abstract:

This paper highlights the prediction of learning disabilities (LD) in school-age children using rough set theory (RST), with an emphasis on the application of data mining. In rough sets, data analysis starts from a data table called an information system, which contains data about objects of interest, characterized in terms of attributes. These attributes consist of the properties of learning disabilities. By finding the relationship between these attributes, the redundant attributes can be eliminated and the core attributes determined. Rule mining is also performed in rough sets using the LEM1 algorithm. LD is predicted accurately using Rosetta, the rough set toolkit for data analysis. The result obtained from this study is compared with the output of a similar study conducted by us using a Support Vector Machine (SVM) with the Sequential Minimal Optimisation (SMO) algorithm. It is found that, using the concepts of reduct and global covering, we can easily predict learning disabilities in children.

Relevance: 100.00%

Abstract:

This paper describes the application of a new technique, rough clustering, to the problem of market segmentation. Rough clustering produces different solutions to k-means analysis because of the possibility of multiple cluster membership of objects. Traditional clustering methods generate extensional descriptions of groups, which show which objects are members of each cluster. Clustering techniques based on rough set theory generate intensional descriptions, which outline the main characteristics of each cluster. In this study, a rough cluster analysis was conducted on a sample of 437 responses from a larger study of the relationship between shopping orientation (the general predisposition of consumers toward the act of shopping) and intention to purchase products via the Internet. The cluster analysis was based on five measures of shopping orientation: enjoyment, personalization, convenience, loyalty, and price. The rough clusters obtained provide interpretations of different shopping orientations present in the data without the restriction of attempting to fit each object into only one segment. Such descriptions can be an aid to marketers attempting to identify potential segments of consumers.
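
Multiple cluster membership can be illustrated with a small assignment step. This is a sketch under stated assumptions, not the procedure used in the study: the function `rough_assign`, the distance-ratio rule and the `ratio` parameter are illustrative choices in the spirit of rough k-means. An object joins the upper approximation of every cluster whose centroid is almost as close as the nearest one, and joins a lower approximation only when exactly one cluster qualifies.

```python
import math

def rough_assign(points, centroids, ratio=1.3):
    """Return (lower, upper): per-cluster sets of point indices.
    `ratio` is a hypothetical tolerance on distance to the nearest centroid."""
    lower = [set() for _ in centroids]
    upper = [set() for _ in centroids]
    for i, p in enumerate(points):
        dists = [math.dist(p, c) for c in centroids]
        nearest = min(dists)
        close = [k for k, d in enumerate(dists) if d <= ratio * nearest]
        for k in close:
            upper[k].add(i)           # possible member
        if len(close) == 1:
            lower[close[0]].add(i)    # certain member
    return lower, upper

points = [(0, 0), (1, 0), (5, 0), (10, 0), (9, 0)]
centroids = [(0, 0), (10, 0)]
lower, upper = rough_assign(points, centroids)
```

The point at (5, 0) is equidistant from both centroids, so it appears in both upper approximations and in neither lower approximation, which is the "not forced into one segment" behaviour the abstract describes.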

Relevance: 100.00%

Abstract:

In data mining, an important goal is to generate an abstraction of the data. Such an abstraction helps in reducing the space and search time requirements of the overall decision making process. Further, it is important that the abstraction is generated from the data with a small number of disk scans. We propose a novel data structure, pattern count tree (PC-tree), that can be built by scanning the database only once. PC-tree is a minimal size complete representation of the data and it can be used to represent dynamic databases with the help of knowledge that is either static or changing. We show that further compactness can be achieved by constructing the PC-tree on segmented patterns. We exploit the flexibility offered by rough sets to realize a rough PC-tree and use it for efficient and effective rough classification. To be consistent with the sizes of the branches of the PC-tree, we use upper and lower approximations of feature sets in a manner different from the conventional rough set theory. We conducted experiments using the proposed classification scheme on a large-scale hand-written digit data set. We use the experimental results to establish the efficacy of the proposed approach. (C) 2002 Elsevier Science B.V. All rights reserved.
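
The single-scan construction can be sketched as a prefix tree with counts. This is a simplified illustration, not the node layout of the paper: the class `Node`, the functions `build_pc_tree` and `count_nodes`, and the encoding of each pattern as an ordered tuple of feature identifiers are assumptions made for the example.

```python
class Node:
    """One tree node: how many patterns pass through it, plus its children."""
    def __init__(self):
        self.count = 0
        self.children = {}

def build_pc_tree(patterns):
    """Build the tree in one pass; patterns sharing a prefix share nodes."""
    root = Node()
    for pattern in patterns:            # each pattern is read exactly once
        node = root
        for feature in pattern:
            node = node.children.setdefault(feature, Node())
            node.count += 1
    return root

def count_nodes(node):
    """Number of nodes below `node` (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())

# Three toy patterns, two of them identical.
patterns = [(1, 2, 3), (1, 2, 4), (1, 2, 3)]
root = build_pc_tree(patterns)
total_nodes = count_nodes(root)   # 4 nodes store 9 feature occurrences
```

Shared prefixes are stored once and duplicates only increment counts, which is where the compactness comes from.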

Relevance: 100.00%

Abstract:

R. Jensen and Q. Shen. Fuzzy-Rough Sets Assisted Attribute Selection. IEEE Transactions on Fuzzy Systems, vol. 15, no. 1, pp. 73-89, 2007.

Relevance: 100.00%

Abstract:

X. Wang, J. Yang, X. Teng, W. Xia, and R. Jensen. Feature Selection based on Rough Sets and Particle Swarm Optimization. Pattern Recognition Letters, vol. 28, no. 4, pp. 459-471, 2007.

Relevance: 100.00%

Abstract:

Q. Shen and R. Jensen, 'Selecting Informative Features with Fuzzy-Rough Sets and its Application for Complex Systems Monitoring,' Pattern Recognition, vol. 37, no. 7, pp. 1351-1363, 2004.

Relevance: 100.00%

Abstract:

R. Jensen, Q. Shen, Data Reduction with Rough Sets, In: Encyclopedia of Data Warehousing and Mining - 2nd Edition, Vol. II, 2008.

Relevance: 100.00%

Abstract:

This paper proposes an optimal strategy for extracting probabilistic rules from databases. Two inductive learning-based statistical measures, accuracy and coverage, are introduced together with their rough set-based definitions. The simplicity of a rule, which is emphasized in this paper, has previously been ignored in the discovery of probabilistic rules. To avoid the high computational complexity of the rough set approach, rough set terminology, rather than the approach itself, is used to represent the probabilistic rules. A genetic algorithm is exploited to find the optimal probabilistic rules, which have the highest accuracy and coverage and the shortest length. Some heuristic genetic operators are also used to make the global search and the evolution of rules more efficient. Experimental results show that the approach runs more efficiently than traditional classification methods and generates probabilistic classification rules of the same integrity.
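
The two measures can be computed directly from a decision table. In the usual rough set formulation, for a rule "if R then D", accuracy is the share of objects matching R that also belong to D, and coverage is the share of objects in D that match R. The sketch below assumes that formulation; the function name, the attribute names (`fever`, `flu`) and the data are invented for illustration.

```python
def accuracy_and_coverage(rows, condition, decision_attr, decision_value):
    """Return (accuracy, coverage) of the rule `condition -> decision_value`.
    `condition` maps attribute names to required values."""
    matched = [r for r in rows if all(r[a] == v for a, v in condition.items())]
    positives = [r for r in rows if r[decision_attr] == decision_value]
    both = [r for r in matched if r[decision_attr] == decision_value]
    accuracy = len(both) / len(matched) if matched else 0.0
    coverage = len(both) / len(positives) if positives else 0.0
    return accuracy, coverage

# Invented toy table.
rows = [
    {"fever": "yes", "flu": "yes"},
    {"fever": "yes", "flu": "yes"},
    {"fever": "yes", "flu": "no"},
    {"fever": "no", "flu": "yes"},
    {"fever": "no", "flu": "yes"},
]

acc, cov = accuracy_and_coverage(rows, {"fever": "yes"}, "flu", "yes")
# Rule "fever = yes -> flu = yes": accuracy 2/3, coverage 2/4.
```

A search procedure such as a genetic algorithm could use these two numbers, together with the number of conditions in the rule, as components of a fitness function.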

Relevance: 100.00%

Abstract:

Selecting a set of features which is optimal for a given task plays an important role in a wide variety of contexts, including pattern recognition, image understanding and machine learning. The concept of reduction of the decision table based on the rough set is very useful for feature selection. In this paper, a genetic algorithm-based approach is presented to search for the relative reduct of the rough set decision table. This approach can accommodate multiple criteria, such as accuracy and cost of classification, in the feature selection process, and it finds an effective feature subset for texture classification. On the basis of the selected feature subset, this paper presents a method to extract objects which are higher than their surroundings, such as trees or forest, in color aerial images. The experimental results show that the selected feature subset and the object extraction method presented in this paper are practical and effective.

Relevance: 100.00%

Abstract:

The rough set is a new mathematical approach to imprecision, vagueness and uncertainty. The concept of reduction of the decision table based on rough sets is very useful for feature selection. The paper describes an application of the rough set method to feature selection and reduction in texture image recognition. The methods applied include continuous data discretization based on fuzzy c-means, and a rough set method for feature selection and reduction. The methods were applied to the extraction of trees in aerial images. The experiments show that the methods presented in this paper are practical and effective.