62 results for K-anonymization
in Queensland University of Technology - ePrints Archive
Abstract:
The Raman spectra at 77 K of the hydroxyl stretching of kaolinite were obtained along the three axes perpendicular to the crystal faces. Raman bands were observed at 3616, 3658 and 3677 cm−1 together with a distinct band observed at 3691 cm−1 and a broad profile between 3695 and 3715 cm−1. The band at 3616 cm−1 is assigned to the inner hydroxyl. The bands at 3658 and 3677 cm−1 are attributed to the out-of-phase vibrations of the inner surface hydroxyls. The Raman spectra of the in-phase vibrations of the inner-surface hydroxyl-stretching region are described in terms of transverse and longitudinal optic splitting. The band at 3691 cm−1 is assigned to the transverse optic and the broad profile to the longitudinal optic mode. This splitting remained even at liquid nitrogen temperature. The transverse optic vibration may be curve resolved into two or three bands, which are attributed to different types of hydroxyl groups in the kaolinite.
Abstract:
We introduce K-tree in an information retrieval context. It is an efficient approximation of the k-means clustering algorithm. Unlike k-means, it forms a hierarchy of clusters. It has been extended to address issues with sparse representations. We compare performance and quality to CLUTO using document collections. The K-tree has a low time complexity that is suitable for large document collections. This tree structure allows for efficient disk-based implementations where space requirements exceed main memory.
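For orientation, the baseline that the abstract says K-tree approximates can be sketched as follows. This is a minimal illustration of plain k-means (Lloyd's algorithm), not of K-tree itself. The function name `kmeans`, the naive choice of the first `k` points as initial centroids, and the fixed iteration count are assumptions made for this example.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means (Lloyd's algorithm) on small in-memory data.

    Assumption: the first k points serve as initial centroids, which is
    a naive choice made only to keep the sketch deterministic.
    """
    centroids = [list(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        for i, point in enumerate(points):
            assignments[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignments, centroids


points = [(0, 0), (10, 0), (0, 1), (10, 1)]
assignments, centroids = kmeans(points, k=2)
print(assignments, centroids)
```

Each iteration compares every point with every centroid, and the result is a flat partition. The abstract's claim is that K-tree instead arranges clusters in a hierarchy, which is what makes it suitable for large collections.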
Abstract:
The K-Adv has been developed around the concept that it comprises three elements: an ICT enabling infrastructure, encompassing ICT hardware and software facilities together with an enabling ICT support system; a leadership infrastructure support system that provides the vision for its implementation and the capacity to realise that vision; and the necessary people infrastructure, comprising the people capabilities and capacities supported by organisational processes that facilitate the mobilisation of this resource.
Abstract:
This paper describes the approach taken to the XML Mining track at INEX 2008 by a group at the Queensland University of Technology. We introduce the K-tree clustering algorithm in an Information Retrieval context by adapting it for document clustering. Many large-scale problems exist in document clustering. K-tree scales well with large inputs due to its low complexity. It offers promising results both in terms of efficiency and quality. Document classification was completed using Support Vector Machines.
Abstract:
Although timber plantations and forests are classified as forms of agricultural production, the ownership of this land classification is not limited to rural producers. Timber plantations and forests are now regarded as a long-term investment with both institutional and absentee owners. While the NCREIF property indices have been the benchmarks for the measurement of the performance of the commercial property market in the U.K. for many years, the IPD timberland index has recently emerged as the U.K. forest and timberland performance indicator. The IPD Forest index incorporates 126 properties over five regions in the U.K. This paper will utilise the IPD Forestry Index to examine the performance of U.K. timber plantations and forests over the period 1981-2004. In particular, issues to be critically assessed include plantation and forest performance analysis, comparative investment analysis, the role of plantations and forests in investment portfolios, the risk reduction and portfolio benefits of plantations and forests in mixed-asset portfolios, and the strategic investment significance of U.K. timberlands.
Abstract:
Random Indexing K-tree is the combination of two algorithms suited for large-scale document clustering.
Abstract:
This paper describes the approach taken to the clustering task at INEX 2009 by a group at the Queensland University of Technology. The Random Indexing (RI) K-tree has been used with a representation that is based on the semantic markup available in the INEX 2009 Wikipedia collection. The RI K-tree is a scalable approach to clustering large document collections. This approach has produced quality clustering when evaluated using two different methodologies.
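The Random Indexing half of the RI K-tree can be illustrated with a short sketch. It follows one common formulation of Random Indexing, in which each term receives a sparse random vector of +1/-1 entries and a document vector is the sum of its terms' vectors. That formulation, the helper names `index_vector` and `document_vector`, and the parameters `dim`, `nonzero` and `seed` are assumptions for illustration; the abstract does not state the configuration used at INEX 2009.

```python
import random


def index_vector(term, dim=64, nonzero=4, seed=0):
    """Sparse ternary index vector for a term (hypothetical helper).

    The generator is seeded with the term itself, so the same term
    always maps to the same vector without storing a lookup table.
    """
    rng = random.Random(f"{seed}:{term}")
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1, 1))
    return vec


def document_vector(tokens, dim=64, nonzero=4):
    """Reduced document vector: the sum of the index vectors of its terms."""
    doc = [0] * dim
    for token in tokens:
        for i, value in enumerate(index_vector(token, dim, nonzero)):
            doc[i] += value
    return doc


vec = document_vector(["wikipedia", "article", "wikipedia"])
print(len(vec))
```

The document vector has a fixed length `dim` regardless of vocabulary size, which is the property that makes this kind of representation attractive as input to a scalable clustering algorithm.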
Abstract:
Digital collections are growing exponentially in size as the information age takes a firm grip on all aspects of society. As a result, Information Retrieval (IR) has become an increasingly important area of research. It promises to provide new and more effective ways for users to find information relevant to their search intentions. Document clustering is one of the many tools in the IR toolbox and is far from being perfected. It groups documents that share common features. This grouping allows a user to quickly identify relevant information. If these groups are misleading then valuable information can accidentally be ignored. Therefore, the study and analysis of the quality of document clustering is important. With more and more digital information available, the performance of these algorithms is also of interest. An algorithm with a time complexity of O(n²) can quickly become impractical when clustering a corpus containing millions of documents. Therefore, the investigation of algorithms and data structures to perform clustering in an efficient manner is vital to its success as an IR tool. Document classification is another tool frequently used in the IR field. It predicts categories of new documents based on an existing database of (document, category) pairs. Support Vector Machines (SVM) have been found to be effective when classifying text documents. As the algorithms for classification are both efficient and of high quality, the largest gains can be made from improvements to representation. Document representations are vital for both clustering and classification. Representations exploit the content and structure of documents. Dimensionality reduction can improve the effectiveness of existing representations in terms of quality and run-time performance. Research into these areas is another way to improve the efficiency and quality of clustering and classification results. Evaluating document clustering is a difficult task.
Intrinsic measures of quality such as distortion only indicate how well an algorithm minimised a similarity function in a particular vector space. Intrinsic comparisons are inherently limited by the given representation and are not comparable between different representations. Extrinsic measures of quality compare a clustering solution to a “ground truth” solution. This allows comparison between different approaches. As the “ground truth” is created by humans it can suffer from the fact that not every human interprets a topic in the same manner. Whether a document belongs to a particular topic or not can be subjective.
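The distinction between intrinsic and extrinsic measures can be made concrete with a small sketch. It assumes distortion is the sum of squared Euclidean distances from each point to its assigned centroid, and it uses purity as one example of an extrinsic measure. Purity is not named in the abstract; it is chosen here only for illustration, and the function names and toy data are hypothetical.

```python
from collections import Counter


def distortion(points, assignments, centroids):
    """Intrinsic: sum of squared distances from each point to its centroid."""
    total = 0.0
    for point, cluster in zip(points, assignments):
        centre = centroids[cluster]
        total += sum((p - c) ** 2 for p, c in zip(point, centre))
    return total


def purity(assignments, labels):
    """Extrinsic: share of documents carrying their cluster's majority label."""
    by_cluster = {}
    for cluster, label in zip(assignments, labels):
        by_cluster.setdefault(cluster, Counter())[label] += 1
    majority = sum(max(counts.values()) for counts in by_cluster.values())
    return majority / len(labels)


# Toy data: four documents in a two-dimensional space, two clusters.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
assignments = [0, 0, 1, 1]
centroids = {0: (0.0, 1.0), 1: (10.0, 1.0)}
labels = ["sport", "sport", "sport", "politics"]  # human-made ground truth

print(distortion(points, assignments, centroids))  # 4.0
print(purity(assignments, labels))                 # 0.75
```

The distortion value depends on the vector space the points live in, so it cannot be compared across representations. The purity value depends only on cluster membership and the human-assigned labels, so it can be compared across approaches, subject to the subjectivity of those labels.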