216 results for Filtering


Relevance:

10.00%

Abstract:

Tag recommendation is the task of recommending metadata (tags) for a web resource (item) during the user annotation process. In this context, the sparsity problem refers to the situation where tags must be produced for items with few annotations or for users who tag few items. Most state-of-the-art tag recommendation approaches are rarely evaluated in, or perform poorly under, this situation. This paper presents a combined method for mitigating the sparsity problem in tag recommendation that expands and ranks candidate tags based on similar items' tags and an existing tag ontology. We evaluated the approach on two public social bookmarking datasets. The experimental results show better recommendation accuracy under sparsity than several state-of-the-art methods.
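As a rough illustration of the idea (not the paper's exact algorithm), the following Python sketch expands candidate tags from similar items' annotations and an ontology of related tags, then ranks them by accumulated score; the data structures, weights, and toy data are all assumptions.

```python
from collections import defaultdict

def recommend_tags(item, item_tags, item_similarity, ontology_related, k=5):
    """Sketch of candidate-tag expansion and ranking for a sparsely
    annotated item: candidates are drawn from similar items' tags and
    expanded with ontology neighbours, then ranked by accumulated score."""
    scores = defaultdict(float)

    # Collect tags from similar items, weighted by item similarity.
    for other, sim in item_similarity.get(item, {}).items():
        for tag in item_tags.get(other, []):
            scores[tag] += sim

    # Expand candidates with ontologically related tags (weaker weight).
    for tag in list(scores):
        for related in ontology_related.get(tag, []):
            scores[related] += 0.5 * scores[tag]

    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy usage with hypothetical data
item_tags = {"a": ["python", "web"], "b": ["python", "flask"]}
item_similarity = {"new": {"a": 0.8, "b": 0.6}}
ontology_related = {"python": ["programming"]}
print(recommend_tags("new", item_tags, item_similarity, ontology_related))
```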

Relevance:

10.00%

Abstract:

Bluetooth technology is increasingly being used to track vehicles throughout their trips, within urban networks and across freeway stretches. One important opportunity offered by this type of data is the measurement of Origin-Destination patterns, which emerge from the aggregation and clustering of individual trips. In order to obtain accurate estimations, however, a number of issues need to be addressed through data filtering and correction techniques. These issues stem mainly from the use of Bluetooth technology amongst drivers and from the physical properties of the Bluetooth sensors themselves. First, not all cars are equipped with discoverable Bluetooth devices, and the Bluetooth-enabled vehicles may belong to a small set of socio-economic groups of users. Second, Bluetooth datasets include data from various transport modes, such as pedestrians, bicycles, cars, taxis, buses and trains. Third, Bluetooth sensors may fail to detect all of the nearby Bluetooth-enabled vehicles; as a consequence, the exact journey of some vehicles becomes a latent pattern that needs to be extracted from the data. Finally, sensors in close proximity to each other may have overlapping detection areas, making the task of retrieving the correct travelled path even more challenging. The aim of this paper is twofold. We first give a comprehensive overview of the aforementioned issues; we then propose a methodology for cleansing, correcting and aggregating Bluetooth data. We postulate that the methods introduced in this paper are the first crucial steps towards computing accurate Origin-Destination matrices in urban road networks.
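As an illustration of the aggregation step only (the filtering and correction issues discussed above are not addressed here), the following Python sketch splits per-device Bluetooth detections into trips using a hypothetical time-gap threshold and counts origin-destination pairs; the record format and the threshold are assumptions.

```python
from collections import defaultdict

def od_matrix(detections, gap_minutes=30):
    """Sketch of OD aggregation from Bluetooth detections.
    `detections` is a list of (device_id, sensor_id, timestamp) tuples,
    timestamps in minutes; a gap longer than `gap_minutes` between two
    consecutive detections of the same device starts a new trip."""
    by_device = defaultdict(list)
    for device, sensor, t in detections:
        by_device[device].append((t, sensor))

    matrix = defaultdict(int)
    for device, obs in by_device.items():
        obs.sort()
        trip = [obs[0]]
        for prev, cur in zip(obs, obs[1:]):
            if cur[0] - prev[0] > gap_minutes:
                if len(trip) > 1:                      # close the previous trip
                    matrix[(trip[0][1], trip[-1][1])] += 1
                trip = []
            trip.append(cur)
        if len(trip) > 1:
            matrix[(trip[0][1], trip[-1][1])] += 1
    return dict(matrix)

# Hypothetical detections: one device travelling S1 -> S3, then later S3 -> S1
dets = [("d1", "S1", 0), ("d1", "S3", 12), ("d1", "S3", 300), ("d1", "S1", 315)]
print(od_matrix(dets))   # {('S1', 'S3'): 1, ('S3', 'S1'): 1}
```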

Relevance:

10.00%

Abstract:

The problem of estimating pseudobearing rate information of an airborne target based on measurements from a vision sensor is considered. Novel image speed and heading angle estimators are presented that exploit image morphology, hidden Markov model (HMM) filtering, and relative entropy rate (RER) concepts to allow pseudobearing rate information to be determined before (or whilst) the target track is being estimated from vision information.
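For readers unfamiliar with HMM filtering, the following minimal Python sketch shows the generic discrete-state filter recursion (predict with the transition matrix, update with the observation likelihood, normalise); it is not the estimator proposed in the paper, and all numbers are toy values.

```python
import numpy as np

def hmm_filter(transition, emission_likelihoods, prior):
    """Minimal discrete HMM filter: recursively computes the posterior
    over hidden states given a sequence of per-observation likelihoods.
    `transition` is an N x N matrix, `emission_likelihoods` a T x N array
    of p(y_t | state), `prior` the initial state distribution."""
    belief = np.asarray(prior, dtype=float)
    posteriors = []
    for likelihood in emission_likelihoods:
        belief = likelihood * (transition.T @ belief)   # predict, then update
        belief /= belief.sum()                          # normalise
        posteriors.append(belief.copy())
    return np.array(posteriors)

# Two-state toy example with hypothetical numbers
A = np.array([[0.9, 0.1], [0.2, 0.8]])
likelihoods = np.array([[0.7, 0.2], [0.6, 0.3], [0.1, 0.8]])
print(hmm_filter(A, likelihoods, prior=[0.5, 0.5]))
```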

Relevance:

10.00%

Abstract:

A "self-exciting" market is one in which the probability of observing a crash increases in response to the occurrence of a crash. It essentially describes cases where the initial crash serves to weaken the system to some extent, making subsequent crashes more likely. This thesis investigates if equity markets possess this property. A self-exciting extension of the well-known jump-based Bates (1996) model is used as the workhorse model for this thesis, and a particle-filtering algorithm is used to facilitate estimation by means of maximum likelihood. The estimation method is developed so that option prices are easily included in the dataset, leading to higher quality estimates. Equilibrium arguments are used to price the risks associated with the time-varying crash probability, and in turn to motivate a risk-neutral system for use in option pricing. The option pricing function for the model is obtained via the application of widely-used Fourier techniques. An application to S&P500 index returns and a panel of S&P500 index option prices reveals evidence of self excitation.

Relevance:

10.00%

Abstract:

Over the last decade, the majority of existing search techniques have been either keyword-based or category-based, resulting in unsatisfactory effectiveness. Meanwhile, studies have shown that more than 80% of users prefer personalized search results. As a result, a great deal of effort (referred to as collaborative filtering) has been invested in personalized notions for enhancing retrieval performance. One of the fundamental yet most challenging steps is to capture precise user information needs. Most Web users are inexperienced or lack the capability to express their needs properly, whereas existing retrieval systems are highly sensitive to vocabulary. Researchers have increasingly proposed the use of ontology-based techniques to improve current mining approaches. These techniques are able not only to refine search intentions within specific generic domains, but also to access new knowledge by tracking semantic relations. In recent years, some researchers have attempted to build ontological user profiles from discovered user background knowledge. This knowledge is drawn from both global and local analyses, which aim to produce tailored ontologies from a group of concepts. However, a key problem that has not been addressed is how to accurately match diverse local information to universal global knowledge. This research conducts a theoretical study on the use of personalized ontologies to enhance text mining performance. The objective is to understand user information needs through a "bag of concepts" rather than a "bag of words". The concepts are gathered from a general world knowledge base, the Library of Congress Subject Headings. To return desirable search results, a novel ontology-based mining approach is introduced to discover accurate search intentions and to learn personalized ontologies as user profiles. The approach can not only pinpoint users' individual intentions in a rough hierarchical structure, but also interpret their needs through a set of acknowledged concepts. Along with the global and local analyses, a concept matching approach is developed to address the mismatch between local information and world knowledge. Relevance features produced by the Relevance Feature Discovery model are used as representatives of local information; these features have been shown to be the best alternative to user queries for avoiding ambiguity, and they consistently outperform the features extracted by other filtering models. The two proposed approaches are evaluated in a scientific evaluation with the standard Reuters Corpus Volume 1 testing set, with a comprehensive comparison against a number of state-of-the-art baseline models, including TF-IDF, Rocchio, Okapi BM25, the Pattern Taxonomy Model, and an ontology-based model. The results indicate that top precision can be improved remarkably with the proposed ontology mining approach, and that the matching approach achieves significant improvements in most information filtering measurements. This research contributes to the fields of ontological filtering, user profiling, and knowledge representation. The related outputs are critical when systems are expected to return proper mining results and provide personalized services. The scientific findings have the potential to facilitate the design of advanced preference mining models that impact people's daily lives.
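As a toy illustration of the concept matching idea (not the thesis's actual matching approach), the following Python sketch maps local relevance features onto global subject headings by token overlap; the feature and heading lists are hypothetical.

```python
def match_concepts(local_terms, subject_headings):
    """Toy illustration of mapping local relevance features onto global
    concepts: each term is matched to the subject headings whose tokens
    overlap with it, scored by the size of the overlap."""
    matches = {}
    for term in local_terms:
        term_tokens = set(term.lower().split())
        scored = []
        for heading in subject_headings:
            overlap = term_tokens & set(heading.lower().split())
            if overlap:
                scored.append((heading, len(overlap)))
        matches[term] = sorted(scored, key=lambda x: -x[1])
    return matches

# Hypothetical local features and subject headings
features = ["machine learning", "digital libraries"]
headings = ["Machine learning", "Libraries", "Digital libraries"]
print(match_concepts(features, headings))
```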

Relevance:

10.00%

Abstract:

Success with molecular-based targeted drugs in the treatment of cancer has ignited extensive research efforts within the field of personalized therapeutics. However, successful application of such therapies depends on the presence or absence of mutations within the patient's tumor that can confer clinical efficacy or drug resistance. Building on these findings, we developed a high-throughput mutation panel for the identification of frequently occurring and clinically relevant mutations in melanoma. An extensive literature search and interrogation of the Catalogue of Somatic Mutations in Cancer database identified more than 1,000 melanoma mutations. Applying a filtering strategy to focus on mutations amenable to the development of targeted drugs, we initially screened 120 known mutations in 271 samples using the Sequenom MassARRAY system. A total of 252 mutations were detected in 17 genes; the highest frequencies occurred in BRAF (n = 154, 57%), NRAS (n = 55, 20%), CDK4 (n = 8, 3%), PTK2B (n = 7, 2.5%), and ERBB4 (n = 5, 2%). Based on this initial discovery screen, a total of 46 assays interrogating 39 mutations in 20 genes were designed to develop a melanoma-specific panel. These assays were distributed in multiplexes over 8 wells using strict assay design parameters optimized for sensitive mutation detection. The final melanoma-specific mutation panel is a cost-effective, sensitive, high-throughput approach for identifying mutations of clinical relevance to molecular-based therapeutics for the treatment of melanoma. When used in a clinical research setting, the panel may rapidly and accurately identify potentially effective treatment strategies using novel or existing molecularly targeted drugs.

Relevance:

10.00%

Abstract:

This thesis establishes performance properties for approximate filters and controllers that are designed on the basis of approximate dynamic system representations. These performance properties provide a theoretical justification for the widespread application of approximate filters and controllers in the common situation where system models are not known with complete certainty. This research also provides useful tools for approximate filter designs, which are applied to hybrid filtering of uncertain nonlinear systems. As a contribution towards applications, this thesis also investigates air traffic separation control in the presence of measurement uncertainties.

Relevance:

10.00%

Abstract:

Online dating websites enable a specific form of social networking, and their efficiency can be increased by supporting proactive recommendations based on participants' preferences through data mining. This research develops two-way, people-to-people recommendation methods for large online social networks such as online dating networks. It identifies the characteristics of online dating networks and exploits these characteristics in developing efficient people-to-people recommendation methods. The methods developed improve recommendation accuracy, handle the data sparsity that often accompanies large data sets, and scale to online networks with a large number of users.

Relevance:

10.00%

Abstract:

In many applications where encrypted traffic flows from an open (public) domain to a protected (private) domain, there exists a gateway that bridges the two domains and faithfully forwards the incoming traffic to the receiver. We observe that indistinguishability against (adaptive) chosen-ciphertext attacks (IND-CCA), which is a mandatory goal in the face of active attacks in a public domain, can essentially be relaxed to indistinguishability against chosen-plaintext attacks (IND-CPA) for ciphertexts once they pass the gateway, which acts as an IND-CCA/CPA filter: it first checks the validity of an incoming IND-CCA ciphertext, then transforms it (if valid) into an IND-CPA ciphertext, and forwards the latter to the recipient in the private domain. "Non-trivial filtering" can result in reduced decryption costs on the receivers' side. We identify a class of encryption schemes with publicly verifiable ciphertexts that admit generic constructions of (non-trivial) IND-CCA/CPA filters. These schemes are characterized by the existence of public algorithms that can distinguish between valid and invalid ciphertexts. To this end, we formally define (non-trivial) public verifiability of ciphertexts for general encryption schemes, key encapsulation mechanisms, and hybrid encryption schemes, encompassing public-key, identity-based, and tag-based encryption flavours. We further analyze the security impact of public verifiability and discuss generic transformations and concrete constructions that enjoy this property.
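The following Python sketch shows only the structural role of such a filter at the gateway; the `verify` and `transform` callables are hypothetical placeholders for a concrete scheme's public validity check and ciphertext transformation, and the toy stand-ins at the end have no cryptographic meaning.

```python
from typing import Callable, Optional

def make_gateway_filter(verify: Callable[[bytes], bool],
                        transform: Callable[[bytes], bytes]):
    """Structural sketch of an IND-CCA/CPA filter at a gateway:
    `verify` publicly checks the validity of an incoming IND-CCA
    ciphertext, and `transform` turns a valid ciphertext into its
    IND-CPA counterpart forwarded to the private domain."""
    def gateway(ciphertext: bytes) -> Optional[bytes]:
        if not verify(ciphertext):       # reject invalid ciphertexts
            return None
        return transform(ciphertext)     # forward the weaker ciphertext
    return gateway

# Toy stand-ins: "validity" is a trailing tag, and the transform strips it.
verify = lambda c: c.endswith(b"|tag")
transform = lambda c: c[:-4]
gw = make_gateway_filter(verify, transform)
print(gw(b"payload|tag"), gw(b"payload|bad"))
```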

Relevance:

10.00%

Abstract:

Process-Aware Information Systems (PAISs) support the execution of operational processes that involve people, resources, and software applications on the basis of process models. Process models describe vast, often infinite, collections of process instances, i.e., workflows supported by the systems. With the increasing adoption of PAISs, large process model repositories have emerged in companies and public organizations. These repositories constitute significant information resources. Accurate and efficient retrieval of process models and/or process instances from such repositories is of interest for multiple reasons, e.g., searching for similar models/instances, filtering, reuse, standardization, process compliance checking, and verification of formal properties. This paper proposes a technique for indexing process models that relies on their alternative representations, called untanglings. We show the use of untanglings for retrieval of process models based on the process instances they specify, via a solution to the total executability problem. Experiments with industrial process models show that the proposed retrieval approach is up to three orders of magnitude faster than the state of the art.

Relevance:

10.00%

Abstract:

In people-to-people matching systems, filtering is widely applied to find the most suitable matches. The results returned are either too numerous, when the search is generic, or too few, when it is specific, so a more sophisticated recommendation approach becomes necessary. Traditionally, the object of recommendation is an inanimate item. In online dating systems, reciprocal recommendation is required: a partner should be suggested only when both the user and the recommended candidate are satisfied. In this paper, an innovative reciprocal collaborative method is developed based on the idea of similarity and common neighbors, utilizing relevance feedback and feature importance information. Extensive experiments are carried out using data gathered from a real online dating service. Compared with benchmark methods, our results show that the proposed method achieves noticeably better performance.
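A common way to make a recommendation reciprocal (shown here purely as an illustration, not as the paper's exact method) is to combine the two one-directional preference scores with a harmonic mean and to generate candidate scores from common neighbors, as in the following Python sketch with hypothetical interaction data.

```python
def reciprocal_score(score_a_to_b, score_b_to_a):
    """Combine two one-directional preference scores with a harmonic mean,
    so a match is only strong when both sides are satisfied (either score
    near zero drags the combined result down)."""
    if score_a_to_b <= 0 or score_b_to_a <= 0:
        return 0.0
    return 2 * score_a_to_b * score_b_to_a / (score_a_to_b + score_b_to_a)

def common_neighbor_score(user, candidate, contacted_by):
    """Score a candidate for a user by the number of common neighbors,
    i.e. users who have shown interest in both of them."""
    return len(contacted_by.get(user, set()) & contacted_by.get(candidate, set()))

# Hypothetical interaction data: who has been contacted by whom
contacted_by = {"u1": {"a", "b", "c"}, "u2": {"b", "c", "d"}}
s12 = common_neighbor_score("u1", "u2", contacted_by)
s21 = common_neighbor_score("u2", "u1", contacted_by)
print(reciprocal_score(s12, s21))
```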

Relevance:

10.00%

Abstract:

Cyclostationary models for the diagnostic signals measured on faulty rotating machinery have proved successful in many laboratory tests and industrial applications. The squared envelope spectrum has been pointed out as the most efficient indicator for the assessment of second-order cyclostationary symptoms of damage, which are typical, for instance, of rolling element bearing faults. In an attempt to foster the spread of rotating machinery diagnostics, the current trend in the field is towards higher levels of automation of condition monitoring systems. For this purpose, statistical tests for the presence of cyclostationarity have been proposed in recent years. The statistical thresholds proposed in the past for the identification of cyclostationary components were obtained under the hypothesis that the signal is white noise when the component is healthy. This hypothesis, coupled with the non-white nature of real signals, implies the need to pre-whiten the signal or filter it in optimal narrow bands, increasing the complexity of the algorithm and the risk of losing diagnostic information or introducing biases in the result. In this paper, the authors introduce an original analytical derivation of the statistical tests for cyclostationarity in the squared envelope spectrum, dropping the hypothesis of white noise from the outset. The effect of first-order and second-order cyclostationary components on the distribution of the squared envelope spectrum is quantified, and the effectiveness of the newly proposed threshold is verified, providing a sound theoretical basis and a practical starting point for efficient automated diagnostics of machine components such as rolling element bearings. The analytical results are verified by means of numerical simulations and by using experimental vibration data from rolling element bearings.
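For reference, the squared envelope spectrum itself can be computed from the analytic signal as in the following Python sketch (using NumPy and SciPy); this is the standard indicator mentioned above, not the statistical test derived in the paper, and the simulated signal is a toy amplitude-modulated carrier.

```python
import numpy as np
from scipy.signal import hilbert

def squared_envelope_spectrum(x, fs):
    """Compute the squared envelope spectrum of a vibration signal:
    the magnitude spectrum of the squared envelope obtained from the
    analytic signal. Returns (frequencies, spectrum)."""
    envelope_sq = np.abs(hilbert(x)) ** 2        # squared envelope
    envelope_sq -= envelope_sq.mean()            # drop the DC component
    spectrum = np.abs(np.fft.rfft(envelope_sq)) / len(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return freqs, spectrum

# Toy signal: a 2 kHz carrier amplitude-modulated at a 30 Hz "fault" frequency
fs = 10_000
t = np.arange(0, 1, 1 / fs)
x = (1 + 0.5 * np.cos(2 * np.pi * 30 * t)) * np.cos(2 * np.pi * 2_000 * t)
freqs, ses = squared_envelope_spectrum(x, fs)
print(freqs[np.argmax(ses)])   # peak near the modulation frequency (30 Hz)
```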

Relevance:

10.00%

Abstract:

Bluetooth technology is being increasingly used, among the Automated Vehicle Identification Systems, to retrieve important information about urban networks. Because the movement of Bluetooth-equipped vehicles can be monitored throughout the network of Bluetooth sensors, this technology represents an effective means of acquiring accurate, time-dependent Origin-Destination information. In order to obtain reliable estimations, however, a number of issues need to be addressed through data filtering and correction techniques. The main challenges inherent to Bluetooth data are, first, that Bluetooth sensors may fail to detect all of the nearby Bluetooth-enabled vehicles; as a consequence, the exact journey of some vehicles becomes a latent pattern that needs to be estimated. Second, sensors that are in close proximity to each other may have overlapping detection areas, making the task of retrieving the correct travelled path even more challenging. The aim of this paper is twofold: to give an overview of the issues inherent to the Bluetooth technology, through the analysis of the data available from the Bluetooth sensors in Brisbane; and to propose a method for retrieving the itineraries of individual Bluetooth-equipped vehicles. We argue that estimating these latent itineraries accurately is a crucial step toward the retrieval of accurate dynamic Origin-Destination matrices.
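As a toy illustration of what recovering a latent itinerary can look like (not the method proposed in the paper), the following Python sketch inserts the sensors on the shortest path of a hypothetical sensor graph between consecutive detections of a vehicle.

```python
from collections import deque

def fill_itinerary(detected_sensors, adjacency):
    """Toy reconstruction of a latent itinerary: between each pair of
    consecutive detections, insert the sensors on the shortest path of
    the sensor graph (breadth-first search on an unweighted graph)."""
    def shortest_path(src, dst):
        queue, seen = deque([[src]]), {src}
        while queue:
            path = queue.popleft()
            if path[-1] == dst:
                return path
            for nxt in adjacency.get(path[-1], []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])
        return [src, dst]   # fall back to a direct hop if disconnected

    itinerary = [detected_sensors[0]]
    for src, dst in zip(detected_sensors, detected_sensors[1:]):
        itinerary += shortest_path(src, dst)[1:]
    return itinerary

# Hypothetical sensor graph; the vehicle was missed at sensor "B"
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
print(fill_itinerary(["A", "C", "D"], adjacency))   # ['A', 'B', 'C', 'D']
```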

Relevance:

10.00%

Abstract:

Whole-image descriptors such as GIST have been used successfully for persistent place recognition when combined with temporal or sequential filtering techniques. However, whole-image descriptor localization systems often apply a heuristic rather than a probabilistic approach to place recognition, requiring substantial environment-specific tuning prior to deployment. In this paper we present a novel online solution that uses statistical approaches to calculate place recognition likelihoods for whole-image descriptors, without requiring either environment-specific tuning or pre-training. Using a real-world benchmark dataset, we show that this method creates distributions appropriate to a specific environment in an online manner. Our method performs comparably to FAB-MAP in raw place recognition performance, and integrates into a state-of-the-art probabilistic mapping system to provide superior performance to whole-image methods that are not based on true probability distributions. The method provides a principled means of combining the powerful change-invariant properties of whole-image descriptors with probabilistic back-end mapping systems, without the need for prior training or system tuning.
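As a simplified illustration of turning descriptor distances into likelihood-like scores without environment-specific thresholds (not the statistical approach developed in the paper), the following Python sketch standardises the distances against their own statistics and maps them through a softmax-style weighting; the descriptors are toy values.

```python
import numpy as np

def place_likelihoods(query_desc, stored_descs):
    """Sketch of converting whole-image descriptor distances into
    pseudo-likelihoods: distances to all stored places are standardised
    against their own online statistics, so no environment-specific
    threshold is needed, and then normalised into a distribution."""
    dists = np.linalg.norm(stored_descs - query_desc, axis=1)
    z = (dists - dists.mean()) / (dists.std() + 1e-9)   # online statistics
    weights = np.exp(-z)                                 # smaller distance, larger weight
    return weights / weights.sum()

# Toy 4-dimensional "GIST-like" descriptors for three stored places
stored = np.array([[0.1, 0.2, 0.3, 0.4],
                   [0.9, 0.8, 0.7, 0.6],
                   [0.1, 0.25, 0.3, 0.45]])
query = np.array([0.1, 0.2, 0.35, 0.4])
print(place_likelihoods(query, stored))
```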

Relevance:

10.00%

Abstract:

Most recommender systems use collaborative filtering, content-based filtering or a hybrid approach to recommend items to new users. Collaborative filtering recommends items to new users based on their similar neighbours, while content-based filtering recommends items that are similar to new users' profiles. The fundamental issues include how to profile new users and how to deal with over-specialization in content-based recommender systems. The terms used to describe items can be organised into a concept hierarchy; therefore, we aim to describe user profiles, or information needs, using concept vectors. This paper presents a new method for acquiring user information needs that allows new users to describe their preferences on a concept hierarchy rather than by rating items. It also develops a new ranking function to recommend items to new users based on their information needs. The proposed approach is evaluated on Amazon book datasets. The experimental results demonstrate that the proposed approach can substantially improve the effectiveness of recommender systems.
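As a minimal illustration of ranking items against a concept-based profile (not the paper's exact ranking function), the following Python sketch scores items by the cosine similarity between sparse concept vectors; the concept weights and catalogue are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse concept vectors (dicts)."""
    dot = sum(u[c] * v[c] for c in u if c in v)
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_items(user_concepts, item_concepts):
    """Rank items for a new user by the similarity of their concept
    vectors to the user's stated preferences on the concept hierarchy."""
    scores = {item: cosine(user_concepts, vec) for item, vec in item_concepts.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical concept weights elicited from the user and catalogue items
user = {"science fiction": 0.9, "space opera": 0.6}
items = {"book_a": {"science fiction": 1.0, "romance": 0.2},
         "book_b": {"cookery": 1.0}}
print(rank_items(user, items))   # book_a ranked first
```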