2 resultados para Rough Set

em Aston University Research Archive


Relevância:

70.00% 70.00%

Publicador:

Resumo:

This paper introduces a new technique in the investigation of limited-dependent variable models. This paper illustrates that variable precision rough set theory (VPRS), allied with the use of a modern method of classification, or discretisation of data, can out-perform the more standard approaches that are employed in economics, such as a probit model. These approaches and certain inductive decision tree methods are compared (through a Monte Carlo simulation approach) in the analysis of the decisions reached by the UK Monopolies and Mergers Committee. We show that, particularly in small samples, the VPRS model can improve on more traditional models, both in-sample, and particularly in out-of-sample prediction. A similar improvement in out-of-sample prediction over the decision tree methods is also shown.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Web document cluster analysis plays an important role in information retrieval by organizing large amounts of documents into a small number of meaningful clusters. Traditional web document clustering is based on the Vector Space Model (VSM), which takes into account only two-level (document and term) knowledge granularity but ignores the bridging paragraph granularity. However, this two-level granularity may lead to unsatisfactory clustering results with “false correlation”. In order to deal with the problem, a Hierarchical Representation Model with Multi-granularity (HRMM), which consists of five-layer representation of data and a twophase clustering process is proposed based on granular computing and article structure theory. To deal with the zero-valued similarity problemresulted from the sparse term-paragraphmatrix, an ontology based strategy and a tolerance-rough-set based strategy are introduced into HRMM. By using granular computing, structural knowledge hidden in documents can be more efficiently and effectively captured in HRMM and thus web document clusters with higher quality can be generated. Extensive experiments show that HRMM, HRMM with tolerancerough-set strategy, and HRMM with ontology all outperform VSM and a representative non VSM-based algorithm, WFP, significantly in terms of the F-Score.