906 resultados para Knowledge retrieval, Ontology, User information needs, User profiles, Information retrieval


Relevância:

50.00% 50.00%

Publicador:

Resumo:

Product recommender systems are often deployed by e-commerce websites to improve user experience and increase sales. However, recommendation is limited by the product information hosted in those e-commerce sites and is only triggered when users are performing e-commerce activities. In this paper, we develop a novel product recommender system called METIS, a MErchanT Intelligence recommender System, which detects users' purchase intents from their microblogs in near real-time and makes product recommendation based on matching the users' demographic information extracted from their public profiles with product demographics learned from microblogs and online reviews. METIS distinguishes itself from traditional product recommender systems in the following aspects: 1) METIS was developed based on a microblogging service platform. As such, it is not limited by the information available in any specific e-commerce website. In addition, METIS is able to track users' purchase intents in near real-time and make recommendations accordingly. 2) In METIS, product recommendation is framed as a learning to rank problem. Users' characteristics extracted from their public profiles in microblogs and products' demographics learned from both online product reviews and microblogs are fed into learning to rank algorithms for product recommendation. We have evaluated our system in a large dataset crawled from Sina Weibo. The experimental results have verified the feasibility and effectiveness of our system. We have also made a demo version of our system publicly available and have implemented a live system which allows registered users to receive recommendations in real time. © 2014 ACM.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

With the advent of GPS enabled smartphones, an increasing number of users is actively sharing their location through a variety of applications and services. Along with the continuing growth of Location-Based Social Networks (LBSNs), security experts have increasingly warned the public of the dangers of exposing sensitive information such as personal location data. Most importantly, in addition to the geographical coordinates of the user’s location, LBSNs allow easy access to an additional set of characteristics of that location, such as the venue type or popularity. In this paper, we investigate the role of location semantics in the identification of LBSN users. We simulate a scenario in which the attacker’s goal is to reveal the identity of a set of LBSN users by observing their check-in activity. We then propose to answer the following question: what are the types of venues that a malicious user has to monitor to maximize the probability of success? Conversely, when should a user decide whether to make his/her check-in to a location public or not? We perform our study on more than 1 million check-ins distributed over 17 urban regions of the United States. Our analysis shows that different types of venues display different discriminative power in terms of user identity, with most of the venues in the “Residence” category providing the highest re-identification success across the urban regions. Interestingly, we also find that users with a high entropy of their check-ins distribution are not necessarily the hardest to identify, suggesting that it is the collective behaviour of the users’ population that determines the complexity of the identification task, rather than the individual behaviour.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

One of the greatest concerns related to the popularity of GPS-enabled devices and applications is the increasing availability of the personal location information generated by them and shared with application and service providers. Moreover, people tend to have regular routines and be characterized by a set of “significant places”, thus making it possible to identify a user from his/her mobility data. In this paper we present a series of techniques for identifying individuals from their GPS movements. More specifically, we study the uniqueness of GPS information for three popular datasets, and we provide a detailed analysis of the discriminatory power of speed, direction and distance of travel. Most importantly, we present a simple yet effective technique for the identification of users from location information that are not included in the original dataset used for training, thus raising important privacy concerns for the management of location datasets.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

People manage a spectrum of identities in cyber domains. Profiling individuals and assigning them to distinct groups or classes have potential applications in targeted services, online fraud detection, extensive social sorting, and cyber-security. This paper presents the Uncertainty of Identity Toolset, a framework for the identification and profiling of users from their social media accounts and e-mail addresses. More specifically, in this paper we discuss the design and implementation of two tools of the framework. The Twitter Geographic Profiler tool builds a map of the ethno-cultural communities of a person's friends on Twitter social media service. The E-mail Address Profiler tool identifies the probable identities of individuals from their e-mail addresses and maps their geographical distribution across the UK. To this end, this paper presents a framework for profiling the digital traces of individuals.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

Monitoring is essential for conservation of sites, but capacity to undertake it in the field is often limited. Data collected by remote sensing has been identified as a partial solution to this problem, and is becoming a feasible option, since increasing quantities of satellite data in particular are becoming available to conservationists. When suitably classified, satellite imagery can be used to delineate land cover types such as forest, and to identify any changes over time. However, the conservation community lacks (a) a simple tool appropriate to the needs for monitoring change in all types of land cover (e.g. not just forest), and (b) an easily accessible information system which allows for simple land cover change analysis and data sharing to reduce duplication of effort. To meet these needs, we developed a web-based information system which allows users to assess land cover dynamics in and around protected areas (or other sites of conservation importance) from multi-temporal medium resolution satellite imagery. The system is based around an open access toolbox that pre-processes and classifies Landsat-type imagery, and then allows users to interactively verify the classification. These data are then open for others to utilize through the online information system. We first explain imagery processing and data accessibility features, and then demonstrate the toolbox and the value of user verification using a case study on Nakuru National Park, Kenya. Monitoring and detection of disturbances can support implementation of effective protection, assist the work of park managers and conservation scientists, and thus contribute to conservation planning, priority assessment and potentially to meeting monitoring needs for Aichi target 11.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

In recent years, the boundaries between e-commerce and social networking have become increasingly blurred. Many e-commerce websites support the mechanism of social login where users can sign on the websites using their social network identities such as their Facebook or Twitter accounts. Users can also post their newly purchased products on microblogs with links to the e-commerce product web pages. In this paper, we propose a novel solution for cross-site cold-start product recommendation, which aims to recommend products from e-commerce websites to users at social networking sites in 'cold-start' situations, a problem which has rarely been explored before. A major challenge is how to leverage knowledge extracted from social networking sites for cross-site cold-start product recommendation. We propose to use the linked users across social networking sites and e-commerce websites (users who have social networking accounts and have made purchases on e-commerce websites) as a bridge to map users' social networking features to another feature representation for product recommendation. In specific, we propose learning both users' and products' feature representations (called user embeddings and product embeddings, respectively) from data collected from e-commerce websites using recurrent neural networks and then apply a modified gradient boosting trees method to transform users' social networking features into user embeddings. We then develop a feature-based matrix factorization approach which can leverage the learnt user embeddings for cold-start product recommendation. Experimental results on a large dataset constructed from the largest Chinese microblogging service Sina Weibo and the largest Chinese B2C e-commerce website JingDong have shown the effectiveness of our proposed framework.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

We present in this article an automated framework that extracts product adopter information from online reviews and incorporates the extracted information into feature-based matrix factorization formore effective product recommendation. In specific, we propose a bootstrapping approach for the extraction of product adopters from review text and categorize them into a number of different demographic categories. The aggregated demographic information of many product adopters can be used to characterize both products and users in the form of distributions over different demographic categories. We further propose a graphbased method to iteratively update user- and product-related distributions more reliably in a heterogeneous user-product graph and incorporate them as features into the matrix factorization approach for product recommendation. Our experimental results on a large dataset crawled from JINGDONG, the largest B2C e-commerce website in China, show that our proposed framework outperforms a number of competitive baselines for product recommendation.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

In product reviews, it is observed that the distribution of polarity ratings over reviews written by different users or evaluated based on different products are often skewed in the real world. As such, incorporating user and product information would be helpful for the task of sentiment classification of reviews. However, existing approaches ignored the temporal nature of reviews posted by the same user or evaluated on the same product. We argue that the temporal relations of reviews might be potentially useful for learning user and product embedding and thus propose employing a sequence model to embed these temporal relations into user and product representations so as to improve the performance of document-level sentiment analysis. Specifically, we first learn a distributed representation of each review by a one-dimensional convolutional neural network. Then, taking these representations as pretrained vectors, we use a recurrent neural network with gated recurrent units to learn distributed representations of users and products. Finally, we feed the user, product and review representations into a machine learning classifier for sentiment classification. Our approach has been evaluated on three large-scale review datasets from the IMDB and Yelp. Experimental results show that: (1) sequence modeling for the purposes of distributed user and product representation learning can improve the performance of document-level sentiment classification; (2) the proposed approach achieves state-of-The-Art results on these benchmark datasets.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

This thesis addressed the problem of risk analysis in mental healthcare, with respect to the GRiST project at Aston University. That project provides a risk-screening tool based on the knowledge of 46 experts, captured as mind maps that describe relationships between risks and patterns of behavioural cues. Mind mapping, though, fails to impose control over content, and is not considered to formally represent knowledge. In contrast, this thesis treated GRiSTs mind maps as a rich knowledge base in need of refinement; that process drew on existing techniques for designing databases and knowledge bases. Identifying well-defined mind map concepts, though, was hindered by spelling mistakes, and by ambiguity and lack of coverage in the tools used for researching words. A novel use of the Edit Distance overcame those problems, by assessing similarities between mind map texts, and between spelling mistakes and suggested corrections. That algorithm further identified stems, the shortest text string found in related word-forms. As opposed to existing approaches’ reliance on built-in linguistic knowledge, this thesis devised a novel, more flexible text-based technique. An additional tool, Correspondence Analysis, found patterns in word usage that allowed machines to determine likely intended meanings for ambiguous words. Correspondence Analysis further produced clusters of related concepts, which in turn drove the automatic generation of novel mind maps. Such maps underpinned adjuncts to the mind mapping software used by GRiST; one such new facility generated novel mind maps, to reflect the collected expert knowledge on any specified concept. Mind maps from GRiST are stored as XML, which suggested storing them in an XML database. In fact, the entire approach here is ”XML-centric”, in that all stages rely on XML as far as possible. A XML-based query language allows user to retrieve information from the mind map knowledge base. The approach, it was concluded, will prove valuable to mind mapping in general, and to detecting patterns in any type of digital information.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

Competition between Higher Education Institutions is increasing at an alarming rate, while changes of the surrounding environment and demands of labour market are frequent and substantial. Universities must meet the requirements of both the national and European legislation environment. The Bologna Declaration aims at providing guidelines and solutions for these problems and challenges of European Higher Education. One of its main goals is the introduction of a common framework of transparent and comparable degrees that ensures the recognition of knowledge and qualifications of citizens all across the European Union. This paper will discuss a knowledge management approach that highlights the importance of such knowledge representation tools as ontologies. The discussed ontology-based model supports the creation of transparent curricula content (Educational Ontology) and the promotion of reliable knowledge testing (Adaptive Knowledge Testing System).

Relevância:

50.00% 50.00%

Publicador:

Resumo:

Starting from the Schumpeterian producer-driven understanding of innovation, followed by user-generated solutions and understanding of collaborative forms of co-creation, scholars investigated the drivers and the nature of interactions underpinning success in various ways. Innovation literature has gone a long way, where open innovation has attracted researchers to investigate problems like compatibilities of external resources, networks of innovation, or open source collaboration. Openness itself has gained various shades in the different strands of literature. In this paper the author provides with an overview and a draft evaluation of the different models of open innovation, illustrated with some empirical findings from various fields drawn from the literature. She points to the relevance of transaction costs affecting viable forms of (open) innovation strategies of firms, and the importance to define the locus of innovation for further analyses of different firm and interaction level formations.

Relevância:

50.00% 50.00%

Publicador:

Resumo:

Methods for accessing data on the Web have been the focus of active research over the past few years. In this thesis we propose a method for representing Web sites as data sources. We designed a Data Extractor data retrieval solution that allows us to define queries to Web sites and process resulting data sets. Data Extractor is being integrated into the MSemODB heterogeneous database management system. With its help database queries can be distributed over both local and Web data sources within MSemODB framework. ^ Data Extractor treats Web sites as data sources, controlling query execution and data retrieval. It works as an intermediary between the applications and the sites. Data Extractor utilizes a twofold “custom wrapper” approach for information retrieval. Wrappers for the majority of sites are easily built using a powerful and expressive scripting language, while complex cases are processed using Java-based wrappers that utilize specially designed library of data retrieval, parsing and Web access routines. In addition to wrapper development we thoroughly investigate issues associated with Web site selection, analysis and processing. ^ Data Extractor is designed to act as a data retrieval server, as well as an embedded data retrieval solution. We also use it to create mobile agents that are shipped over the Internet to the client's computer to perform data retrieval on behalf of the user. This approach allows Data Extractor to distribute and scale well. ^ This study confirms feasibility of building custom wrappers for Web sites. This approach provides accuracy of data retrieval, and power and flexibility in handling of complex cases. ^

Relevância:

50.00% 50.00%

Publicador:

Resumo:

The main challenges of multimedia data retrieval lie in the effective mapping between low-level features and high-level concepts, and in the individual users' subjective perceptions of multimedia content. ^ The objectives of this dissertation are to develop an integrated multimedia indexing and retrieval framework with the aim to bridge the gap between semantic concepts and low-level features. To achieve this goal, a set of core techniques have been developed, including image segmentation, content-based image retrieval, object tracking, video indexing, and video event detection. These core techniques are integrated in a systematic way to enable the semantic search for images/videos, and can be tailored to solve the problems in other multimedia related domains. In image retrieval, two new methods of bridging the semantic gap are proposed: (1) for general content-based image retrieval, a stochastic mechanism is utilized to enable the long-term learning of high-level concepts from a set of training data, such as user access frequencies and access patterns of images. (2) In addition to whole-image retrieval, a novel multiple instance learning framework is proposed for object-based image retrieval, by which a user is allowed to more effectively search for images that contain multiple objects of interest. An enhanced image segmentation algorithm is developed to extract the object information from images. This segmentation algorithm is further used in video indexing and retrieval, by which a robust video shot/scene segmentation method is developed based on low-level visual feature comparison, object tracking, and audio analysis. Based on shot boundaries, a novel data mining framework is further proposed to detect events in soccer videos, while fully utilizing the multi-modality features and object information obtained through video shot/scene detection. ^ Another contribution of this dissertation is the potential of the above techniques to be tailored and applied to other multimedia applications. This is demonstrated by their utilization in traffic video surveillance applications. The enhanced image segmentation algorithm, coupled with an adaptive background learning algorithm, improves the performance of vehicle identification. A sophisticated object tracking algorithm is proposed to track individual vehicles, while the spatial and temporal relationships of vehicle objects are modeled by an abstract semantic model. ^

Relevância:

50.00% 50.00%

Publicador:

Resumo:

This research pursued the conceptualization and real-time verification of a system that allows a computer user to control the cursor of a computer interface without using his/her hands. The target user groups for this system are individuals who are unable to use their hands due to spinal dysfunction or other afflictions, and individuals who must use their hands for higher priority tasks while still requiring interaction with a computer. ^ The system receives two forms of input from the user: Electromyogram (EMG) signals from muscles in the face and point-of-gaze coordinates produced by an Eye Gaze Tracking (EGT) system. In order to produce reliable cursor control from the two forms of user input, the development of this EMG/EGT system addressed three key requirements: an algorithm was created to accurately translate EMG signals due to facial movements into cursor actions, a separate algorithm was created that recognized an eye gaze fixation and provided an estimate of the associated eye gaze position, and an information fusion protocol was devised to efficiently integrate the outputs of these algorithms. ^ Experiments were conducted to compare the performance of EMG/EGT cursor control to EGT-only control and mouse control. These experiments took the form of two different types of point-and-click trials. The data produced by these experiments were evaluated using statistical analysis, Fitts' Law analysis and target re-entry (TRE) analysis. ^ The experimental results revealed that though EMG/EGT control was slower than EGT-only and mouse control, it provided effective hands-free control of the cursor without a spatial accuracy limitation, and it also facilitated a reliable click operation. This combination of qualities is not possessed by either EGT-only or mouse control, making EMG/EGT cursor control a unique and practical alternative for a user's cursor control needs. ^

Relevância:

50.00% 50.00%

Publicador:

Resumo:

Graph-structured databases are widely prevalent, and the problem of effective search and retrieval from such graphs has been receiving much attention recently. For example, the Web can be naturally viewed as a graph. Likewise, a relational database can be viewed as a graph where tuples are modeled as vertices connected via foreign-key relationships. Keyword search querying has emerged as one of the most effective paradigms for information discovery, especially over HTML documents in the World Wide Web. One of the key advantages of keyword search querying is its simplicity—users do not have to learn a complex query language, and can issue queries without any prior knowledge about the structure of the underlying data. The purpose of this dissertation was to develop techniques for user-friendly, high quality and efficient searching of graph structured databases. Several ranked search methods on data graphs have been studied in the recent years. Given a top-k keyword search query on a graph and some ranking criteria, a keyword proximity search finds the top-k answers where each answer is a substructure of the graph containing all query keywords, which illustrates the relationship between the keyword present in the graph. We applied keyword proximity search on the web and the page graph of web documents to find top-k answers that satisfy user’s information need and increase user satisfaction. Another effective ranking mechanism applied on data graphs is the authority flow based ranking mechanism. Given a top- k keyword search query on a graph, an authority-flow based search finds the top-k answers where each answer is a node in the graph ranked according to its relevance and importance to the query. We developed techniques that improved the authority flow based search on data graphs by creating a framework to explain and reformulate them taking in to consideration user preferences and feedback. We also applied the proposed graph search techniques for Information Discovery over biological databases. Our algorithms were experimentally evaluated for performance and quality. The quality of our method was compared to current approaches by using user surveys.