988 results for Document Representation
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-06
Abstract:
Web document cluster analysis plays an important role in information retrieval by organizing large numbers of documents into a small number of meaningful clusters. Traditional web document clustering is based on the Vector Space Model (VSM), which takes into account only two levels of knowledge granularity (document and term) and ignores the bridging paragraph granularity. This two-level granularity may lead to unsatisfactory clustering results with “false correlation”. To deal with this problem, a Hierarchical Representation Model with Multi-granularity (HRMM), which consists of a five-layer representation of data and a two-phase clustering process, is proposed based on granular computing and article structure theory. To deal with the zero-valued similarity problem resulting from the sparse term-paragraph matrix, an ontology-based strategy and a tolerance-rough-set-based strategy are introduced into HRMM. By using granular computing, structural knowledge hidden in documents can be captured more efficiently and effectively in HRMM, and thus web document clusters of higher quality can be generated. Extensive experiments show that HRMM, HRMM with the tolerance-rough-set strategy, and HRMM with the ontology strategy all significantly outperform VSM and a representative non-VSM-based algorithm, WFP, in terms of the F-Score.
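The zero-valued similarity problem mentioned in this abstract can be illustrated with a minimal sketch. This is not the HRMM method or either of its two strategies; it is plain term-level cosine similarity in the spirit of VSM, and the toy documents, function names, and whitespace tokenization are all assumptions made for illustration.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a sparse bag-of-words vector (raw term counts) for one text."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy texts: doc_a and doc_b are about the same topic
# but share no surface terms.
doc_a = term_vector("car engine repair")
doc_b = term_vector("automobile motor maintenance")
doc_c = term_vector("car engine maintenance")

sim_ab = cosine_similarity(doc_a, doc_b)  # no shared terms
sim_ac = cosine_similarity(doc_a, doc_c)  # two shared terms

print(sim_ab, sim_ac)
```

Here `sim_ab` is exactly zero even though the two texts are topically related, because they have no term in common. Short units such as paragraphs make this more likely, which is presumably why the abstract brings in ontology-based and tolerance-rough-set-based strategies to relate terms that do not literally match.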
Abstract:
Term dependence is a natural consequence of language use. Its successful representation has been a long-standing goal of Information Retrieval research. We present a methodology for the construction of a concept hierarchy that takes into account the three basic dimensions of term dependence. We also introduce a document evaluation function that allows the concept hierarchy to be used as a user profile for Information Filtering. Initial experimental results indicate that this is a promising approach for incorporating term dependence into the way documents are filtered.
Abstract:
Giovanni Sartori famously wrote that political parties do not need to be mini-republics, yet today parties in many parliamentary democracies are moving in this direction by giving their members direct votes over important decisions, including selecting party leaders and settling policy issues. This paper explores some of the implications of these changes. It asks whether the addition of membership rights affects the types of members who are attracted: do we find a bigger gap between the preferences of party members and of party voters in parties that are more plebiscitary, as the literature on members' motivations might lead us to expect? The paper examines this question both cross-sectionally and longitudinally using opinion data from the European Social Survey and newly available party organizational data from the Political Party Database project.