984 results for Taxonomy information
Abstract:
Information overload has become a serious issue for web users. Personalisation can provide effective solutions to this problem, and recommender systems are one popular personalisation tool that helps users deal with it. As the basis of personalisation, the accuracy and efficiency of web user profiling greatly affect the performance of recommender systems and other personalisation systems. In Web 2.0, emerging user information provides new ways to profile users. Folksonomy, or tag information, is a typical kind of Web 2.0 information. Folksonomy implies users' topic interests and opinions, and has become another important source of user information for profiling users and making recommendations. However, since tags are arbitrary words given by users, folksonomy contains a lot of noise such as tag synonyms, semantic ambiguities and personal tags. Such noise makes it difficult to profile users accurately or to make quality recommendations. This thesis investigates the distinctive features and multiple relationships of folksonomy and explores novel approaches to solve the tag quality problem and profile users accurately. Harvesting the wisdom of crowds and experts, three new user profiling approaches are proposed: a folksonomy-based approach, a taxonomy-based approach, and a hybrid approach based on both folksonomy and taxonomy. The proposed user profiling approaches are applied to recommender systems to improve their performance. Based on the generated user profiles, user-based and item-based collaborative filtering approaches, combined with content filtering methods, are proposed to make recommendations. The proposed user profiling and recommendation approaches have been evaluated through extensive experiments. The effectiveness evaluation experiments were conducted on two real-world datasets collected from the Amazon.com and CiteULike websites.
The experimental results demonstrate that the proposed user profiling and recommendation approaches outperform related state-of-the-art approaches. In addition, this thesis proposes a parallel, scalable user profiling implementation approach based on advanced cloud computing techniques such as Hadoop, MapReduce and Cascading. The scalability evaluation experiments were conducted on a large-scale dataset collected from the Del.icio.us website. This thesis contributes to the effective use of the wisdom of crowds and experts to help users solve information overload issues by providing more accurate, effective and efficient user profiling and recommendation approaches. It also contributes to better use of the taxonomy information given by experts and the folksonomy information contributed by users in Web 2.0.
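The abstract above does not give the algorithms themselves, so the following is only a minimal sketch of the general idea of folksonomy-based profiling combined with user-based collaborative filtering. It assumes that a user profile is a plain tag-frequency vector and that users are compared by cosine similarity; the function names (`build_tag_profiles`, `cosine`, `recommend`) and the sample data are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from math import sqrt


def build_tag_profiles(assignments):
    """Map each user to a Counter of (lower-cased) tag frequencies.

    `assignments` is an iterable of (user, item, tag) triples.
    """
    profiles = {}
    for user, _item, tag in assignments:
        profiles.setdefault(user, Counter())[tag.lower()] += 1
    return profiles


def cosine(a, b):
    """Cosine similarity between two tag-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(target, assignments, k=2, n=3):
    """Recommend up to n unseen items from the k most similar users."""
    assignments = list(assignments)
    profiles = build_tag_profiles(assignments)
    items_by_user = {}
    for user, item, _tag in assignments:
        items_by_user.setdefault(user, set()).add(item)

    target_profile = profiles.get(target, Counter())
    # Rank all other users by profile similarity and keep the top k.
    neighbours = sorted(
        ((cosine(target_profile, profiles[u]), u) for u in profiles if u != target),
        reverse=True,
    )[:k]

    seen = items_by_user.get(target, set())
    scores = Counter()
    for similarity, user in neighbours:
        if similarity <= 0:
            continue  # ignore users with no tags in common
        for item in items_by_user[user] - seen:
            scores[item] += similarity
    return [item for item, _score in scores.most_common(n)]


# Hypothetical sample data, for illustration only.
sample = [
    ("alice", "book1", "python"), ("alice", "book2", "data"),
    ("bob", "book1", "Python"), ("bob", "book3", "python"),
    ("carol", "book4", "cooking"),
]
print(recommend("alice", sample))
```

Lower-casing tags is the only noise handling in this sketch; the synonym, ambiguity and personal-tag problems that the abstract describes would need considerably more than that.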
Abstract:
Item folksonomy, or tag information, is a typical and prevalent kind of Web 2.0 information. Item folksonomy contains rich information about users' opinions on item classifications and descriptions. It can be used as another important information source for opinion mining. On the other hand, each item is associated with taxonomy information that reflects the viewpoints of experts. In this paper, we propose to mine users' opinions on items based on the item taxonomy developed by experts and the folksonomy contributed by users. In addition, we explore how to make personalized item recommendations based on users' opinions. The experiments conducted on real-world datasets collected from Amazon.com and CiteULike demonstrated the effectiveness of the proposed approaches.
Abstract:
Recommender systems assist users in finding what they want. The challenge is how to efficiently acquire user preferences or user information needs in order to build personalized recommender systems. This research explores the acquisition of user preferences using data taxonomy information to enhance personalized recommendations and alleviate the cold-start problem. A concept hierarchy model is proposed, which provides a two-dimensional hierarchy for acquiring user preferences. The language model is also extended for the proposed hierarchy in order to generate an effective recommender algorithm. Both Amazon.com book and music datasets are used to evaluate the proposed approach, and the experimental results show that the proposed approach is promising.
Abstract:
Perfect information is seldom available to man or machines due to uncertainties inherent in real-world problems. Uncertainties in geographic information systems (GIS) stem from either vague/ambiguous or imprecise/inaccurate/incomplete information, and GIS need tools and techniques to manage these uncertainties. There is widespread agreement in the GIS community that although GIS has the potential to support a wide range of spatial data analysis problems, this potential is often hindered by a lack of consistency and uniformity. Uncertainties come in many shapes and forms, and processing uncertain spatial data requires a practical taxonomy to aid decision makers in choosing the most suitable data modeling and analysis method. In this paper, we: (1) review important developments in handling uncertainties when working with spatial data and GIS applications; (2) propose a taxonomy of models for dealing with uncertainties in GIS; and (3) identify current challenges and future research directions in spatial data analysis and GIS for managing uncertainties.
Abstract:
Research suggests that those suspected of sexual offending might be more willing to reveal information about their crimes if interviewers display empathic behaviour. However, the literature concerning investigative empathy is in its infancy, and so investigative empathy is not yet well understood. This study explores empathy in a sample of real-life interviews conducted by police officers in England with suspected sex offenders. Using qualitative methodology, the presence and type of empathic verbal behaviours displayed were examined. The resulting categories were quantitatively analysed to investigate their occurrence overall and across interviewer gender. We identified four distinct types of empathy, some of which were used significantly more often than others. Female interviewers displayed more empathic behaviour per se by a considerable margin.
Abstract:
The explosive growth of the World Wide Web and the emergence of ecommerce are the two major factors that have led to the development of recommender systems (Resnick and Varian, 1997). The main task of recommender systems is to learn from users and recommend items (e.g. information, products or books) that match the users' personal preferences. Recommender systems have been an active research area for more than a decade. Many different techniques and systems with distinct strengths have been developed to generate better quality recommendations. One of the main factors that affect recommendation quality is the amount of information resources available to the recommenders. The main feature of recommender systems is their ability to make personalised recommendations for different individuals. However, many ecommerce sites find it difficult to obtain sufficient knowledge about their users. Hence, the recommendations they provide to their users are often poor and not personalised. This information insufficiency problem is commonly referred to as the cold-start problem. Most existing research on recommender systems focuses on developing techniques to better utilise the available information resources to achieve better recommendation quality. However, while the amount of available data and information remains insufficient, these techniques can only provide limited improvements to the overall recommendation quality. In this thesis, a novel and intuitive approach towards improving recommendation quality and alleviating the cold-start problem is attempted: enriching the information resources. It can be easily observed that when there is a sufficient information and knowledge base to support recommendation making, even the simplest recommender systems can outperform sophisticated ones with limited information resources.
Two possible strategies are suggested in this thesis to achieve the proposed information enrichment for recommenders:
• The first strategy suggests that information resources can be enriched by considering other information or data facets. Specifically, a taxonomy-based recommender, Hybrid Taxonomy Recommender (HTR), is presented in this thesis. HTR exploits the relationship between users' taxonomic preferences and item preferences, derived from the combination of widely available product taxonomic information and existing user rating data, and then utilises this relation between taxonomic preference and item preference to generate high quality recommendations.
• The second strategy suggests that information resources can be enriched simply by obtaining information resources from other parties. In this thesis, a distributed recommender framework, Ecommerce-oriented Distributed Recommender System (EDRS), is proposed. The proposed EDRS allows multiple recommenders from different parties (i.e. organisations or ecommerce sites) to share recommendations and information resources with each other in order to improve their recommendation quality.
Based on the results obtained from the experiments conducted in this thesis, the proposed systems and techniques have achieved great improvement in both making quality recommendations and alleviating the cold-start problem.
Abstract:
An information filtering (IF) system monitors an incoming document stream to find the documents that match the information needs specified by the user profiles. Learning to use the user profiles effectively is one of the most challenging tasks when developing an IF system. With the document selection criteria better defined based on the users' needs, filtering large streams of information can be more efficient and effective. To learn the user profiles, term-based approaches have been widely used in the IF community because of their simplicity and directness. Term-based approaches are relatively well established. However, these approaches have problems when dealing with polysemy and synonymy, which often lead to an information overload problem. Recently, pattern-based approaches (or Pattern Taxonomy Models (PTM) [160]) have been proposed for IF by the data mining community. These approaches are better at capturing semantic information and have shown encouraging results for improving the effectiveness of the IF system. On the other hand, pattern discovery from large data streams is not computationally efficient. These approaches also have to deal with low-frequency pattern issues. The measures used by data mining techniques (for example, "support" and "confidence") to learn the profile have turned out to be unsuitable for filtering, as they can lead to a mismatch problem. This thesis uses rough set-based reasoning (term-based) and a pattern mining approach as a unified framework for information filtering to overcome the aforementioned problems. This system consists of two stages: topic filtering and pattern mining. The topic filtering stage is intended to minimize information overload by filtering out the most likely irrelevant information based on the user profiles. A novel user-profile learning method and a theoretical model of the threshold setting have been developed by using rough set decision theory.
The second stage (pattern mining) aims at solving the problem of information mismatch. This stage is precision-oriented. A new document-ranking function has been derived by exploiting the patterns in the pattern taxonomy. The most likely relevant documents are assigned higher scores by the ranking function. Because a relatively small number of documents is left after the first stage, the computational cost is markedly reduced; at the same time, pattern discovery yields more accurate results. The overall performance of the system was improved significantly. The new two-stage information filtering model has been evaluated by extensive experiments. Tests were based on well-known IR benchmarking processes, using the latest version of the Reuters dataset, namely the Reuters Corpus Volume 1 (RCV1). The performance of the new two-stage model was compared with both term-based and data mining-based IF models. The results demonstrate that the proposed information filtering system significantly outperforms other IF systems, including the traditional Rocchio IF model, state-of-the-art term-based models such as BM25 and Support Vector Machines (SVM), and the Pattern Taxonomy Model (PTM).
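The two-stage structure described above can be illustrated with a minimal sketch. It is not the thesis's method: a simple summed term weight with a fixed threshold stands in for the rough set-based topic filter, and a sum of weights of matched term-set patterns stands in for the derived ranking function. All names (`topic_filter`, `pattern_rank`, `two_stage_filter`), weights and documents are hypothetical.

```python
def topic_filter(docs, term_weights, threshold):
    """Stage 1 (recall-oriented): drop documents that are likely irrelevant.

    A document is kept when the summed weight of the profile terms it
    contains reaches `threshold`. This is an assumed stand-in for the
    rough set-based threshold model, not a reproduction of it.
    """
    kept = []
    for doc_id, text in docs:
        terms = set(text.lower().split())
        score = sum(w for term, w in term_weights.items() if term in terms)
        if score >= threshold:
            kept.append((doc_id, terms))
    return kept


def pattern_rank(candidates, patterns):
    """Stage 2 (precision-oriented): rank the surviving documents.

    `patterns` maps a frozenset of terms to a weight; a pattern counts
    only when all of its terms occur in the document.
    """
    ranked = []
    for doc_id, terms in candidates:
        score = sum(w for pattern, w in patterns.items() if pattern <= terms)
        ranked.append((doc_id, score))
    # Highest score first; ties are broken by document id for determinism.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


def two_stage_filter(docs, term_weights, threshold, patterns):
    """Run the cheap term-based stage first, then the pattern-based stage."""
    return pattern_rank(topic_filter(docs, term_weights, threshold), patterns)


# Hypothetical profile and documents, for illustration only.
docs = [
    ("d1", "oil prices rise sharply"),
    ("d2", "football match report"),
    ("d3", "crude oil export prices"),
]
term_weights = {"oil": 1.0, "prices": 0.5, "crude": 0.5}
patterns = {
    frozenset({"oil", "prices"}): 2.0,
    frozenset({"crude", "oil"}): 1.5,
}
print(two_stage_filter(docs, term_weights, 1.0, patterns))
```

Because the second stage only sees what the first stage keeps, the more expensive pattern matching runs on fewer documents, which is the cost argument the abstract makes.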
Abstract:
Social tags in Web 2.0 are becoming another important information source for profiling users' interests and preferences in order to make personalized recommendations. Information sharing is low because of the free-style vocabulary of tags and the long tails of the tag and item distributions. To solve this problem, this paper proposes an approach that integrates the social tags given by users with the item taxonomy provided by experts, which has a standard vocabulary and hierarchical structure, to make personalized recommendations. The experimental results show that the proposed approach can effectively improve information sharing and recommendation accuracy.
Abstract:
Opiine wasps (Hymenoptera: Braconidae: Opiinae) are parasitoids of dacine fruit flies (Diptera: Tephritidae: Dacinae), the primary horticultural pests of Australia and the South Pacific. Effective use of opiines for biological control of fruit flies is limited by poor taxonomy and identification difficulties. To overcome these problems, this thesis had two aims: (i) to carry out traditional taxonomic research on the fruit fly-infesting opiine braconids of Australia and the South Pacific; and (ii) to transfer the results of the taxonomic research into user-friendly diagnostic tools. Curated wasp material was borrowed from all major Australian museum collections holding specimens. This was supplemented by a large body of material gathered as part of a major fruit fly project in Papua New Guinea: nearly 4000 specimens were examined and identified. Each wasp species was illustrated using traditional scientific drawings, full colour photomicroscopy and scanning electron microscopy. An electronic identification key was developed using Lucid software and diagnostic images were loaded on the web-based Pest and Diseases Image Library (PaDIL). A taxonomic synopsis and distribution and host records for each of the 15 species of dacine-parasitising opiine braconids found in the South Pacific are presented. Biosteres illusorius Fischer (1971) was formally transferred to the genus Fopius and a new species, Fopius ferrari Carmichael and Wharton (2005), was described. Other species dealt with were Diachasmimorpha hageni (Fullaway, 1952), D. kraussii (Fullaway, 1951), D. longicaudata (Ashmead, 1905), D. tryoni (Cameron, 1911), Fopius arisanus (Sonan, 1932), F. deeralensis (Fullaway, 1950), F. schlingeri Wharton (1999), Opius froggatti Fullaway (195), Psyttalia fijiensis (Fullaway, 1936), P. muesebecki (Fischer, 1963), P. novaguineensis (Szépligeti, 1900) and Utetes perkinsi (Fullaway, 1950).
This taxonomic component of the thesis has been formally published in the scientific literature. An interactive diagnostics package ("OpiineID") was developed, the centre of which is a Lucid-based multi-access key. Because the diagnostics package is computer based and free of the space limitations of journal publication, there is no pictorial limit in OpiineID, and so it is comprehensively illustrated with SEM photographs, full colour photographs, line drawings and fully rendered illustrations. The identification key is only one small component of OpiineID and is supported by fact sheets with morphological descriptions, host associations, geographical information and images. Each species contained within the OpiineID package has also been uploaded onto the PaDIL website (www.padil.gov.au). Because the identification of fruit fly parasitoids is largely of concern to fruit fly workers, rather than braconid specialists, this thesis deals directly with an area of growing importance to many areas of pure and applied biology: the nexus between taxonomy and diagnostics. The Discussion chapter focuses on this area, particularly the opportunities offered by new communication and information tools as new ways of delivering the outputs of taxonomic science.
Abstract:
Many data mining techniques have been proposed for mining useful patterns in databases. However, how to effectively utilize discovered patterns is still an open research issue, especially in the domain of text mining. Most existing methods adopt term-based approaches. However, they all suffer from the problems of polysemy and synonymy. This paper presents an innovative technique, pattern taxonomy mining, to improve the effectiveness of using discovered patterns for finding useful information. Substantial experiments on RCV1 demonstrate that the proposed solution achieves encouraging performance.
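The abstract does not spell out how a pattern taxonomy is built, so the sketch below only illustrates the general idea under stated assumptions: patterns are term sets, support is the number of paragraphs containing a term set, and the "taxonomy" links each frequent pattern to its frequent sub-patterns that are one term shorter. The enumeration is brute force rather than an optimised miner, and the names `frequent_patterns` and `taxonomy_edges` are hypothetical.

```python
from itertools import combinations


def frequent_patterns(paragraphs, min_support, max_size=3):
    """Return {pattern: support} for term sets found in enough paragraphs.

    `paragraphs` is a list of term lists. Every term set of up to
    `max_size` terms is enumerated per paragraph (brute force, which is
    fine for short paragraphs but not for large inputs).
    """
    counts = {}
    for paragraph in paragraphs:
        terms = sorted(set(paragraph))
        for size in range(1, max_size + 1):
            for combo in combinations(terms, size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {p: c for p, c in counts.items() if c >= min_support}


def taxonomy_edges(patterns):
    """Link each frequent pattern to frequent sub-patterns one term shorter."""
    edges = []
    for longer in patterns:
        for shorter in patterns:
            if len(shorter) == len(longer) - 1 and shorter < longer:
                edges.append((longer, shorter))
    return edges


# Hypothetical paragraphs, already tokenised, for illustration only.
paragraphs = [
    ["oil", "price", "rise"],
    ["oil", "price"],
    ["oil", "export"],
]
patterns = frequent_patterns(paragraphs, min_support=2)
for longer, shorter in taxonomy_edges(patterns):
    print(sorted(longer), "->", sorted(shorter))
```

With these inputs, only `{oil}`, `{price}` and `{oil, price}` reach the minimum support of 2, so the resulting structure has a single two-term pattern sitting above its two one-term sub-patterns.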
Abstract:
Information overload and mismatch are two fundamental problems affecting the effectiveness of information filtering systems. Even though both term-based and pattern-based approaches have been proposed to address the problems of overload and mismatch, neither approach alone provides a satisfactory solution. This paper presents a novel two-stage information filtering model which combines the merits of term-based and pattern-based approaches to effectively filter the sheer volume of information. In particular, the first filtering stage is supported by a novel rough analysis model which efficiently removes a large number of irrelevant documents, thereby addressing the overload problem. The second filtering stage is empowered by a semantically rich pattern taxonomy mining model which effectively fetches incoming documents according to the specific information needs of a user, thereby addressing the mismatch problem. The experimental results based on the RCV1 corpus show that the proposed two-stage filtering model significantly outperforms both the term-based and pattern-based information filtering models.
Abstract:
Delegation, from the technical point of view, is widely considered as a potential approach in addressing the problem of providing dynamic access control decisions in activities with a high level of collaboration, either within a single security domain or across multiple security domains. Although delegation continues to attract significant attention from the research community, presently, there is no published work that presents a taxonomy of delegation concepts and models. This paper intends to address this gap by presenting a set of taxonomic criteria relevant to the concept of delegation and applying the taxonomy to a selection of significant delegation models published in the literature.
Abstract:
Delegation, from a technical point of view, is widely considered as a potential approach in addressing the problem of providing dynamic access control decisions in activities with a high level of collaboration, either within a single security domain or across multiple security domains. Although delegation continues to attract significant attention from the research community, presently, there is no published work that presents a taxonomy of delegation concepts and models. This article intends to address this gap by presenting a set of taxonomic criteria relevant to the concept of delegation. This article also applies the taxonomy to a selection of significant delegation models published in the literature.