247 results for GRASP filtering


Relevance: 10.00%

Abstract:

The problem of estimating pseudobearing rate information of an airborne target based on measurements from a vision sensor is considered. Novel image speed and heading angle estimators are presented that exploit image morphology, hidden Markov model (HMM) filtering, and relative entropy rate (RER) concepts to allow pseudobearing rate information to be determined before (or whilst) the target track is being estimated from vision information.
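The abstract leaves the filter unspecified beyond "HMM filtering"; as a rough illustration of that ingredient, here is the standard discrete-state HMM forward filter in Python. The two-state setup, transition matrix, and per-frame likelihoods are invented for illustration and are not the estimators from the paper.

    import numpy as np

    def hmm_filter(transition, emission_lik, prior):
        """Standard HMM forward filter: posterior state probabilities
        after each observation.

        transition   : (S, S), transition[i, j] = P(x_t = j | x_{t-1} = i)
        emission_lik : (T, S), emission_lik[t, s] = p(y_t | x_t = s)
        prior        : (S,)   initial state distribution
        """
        T, S = emission_lik.shape
        posterior = np.zeros((T, S))
        belief = prior
        for t in range(T):
            predicted = belief @ transition        # time update
            belief = predicted * emission_lik[t]   # measurement update
            belief /= belief.sum()                 # normalise
            posterior[t] = belief
        return posterior

    # Illustrative two-state example (say, "target absent" / "target present").
    A = np.array([[0.95, 0.05],
                  [0.10, 0.90]])
    liks = np.array([[0.8, 0.3],
                     [0.4, 0.7],
                     [0.2, 0.9]])   # hypothetical per-frame likelihoods
    print(hmm_filter(A, liks, np.array([0.9, 0.1])))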

Relevance: 10.00%

Abstract:

A "self-exciting" market is one in which the probability of observing a crash increases in response to the occurrence of a crash. It essentially describes cases where the initial crash serves to weaken the system to some extent, making subsequent crashes more likely. This thesis investigates if equity markets possess this property. A self-exciting extension of the well-known jump-based Bates (1996) model is used as the workhorse model for this thesis, and a particle-filtering algorithm is used to facilitate estimation by means of maximum likelihood. The estimation method is developed so that option prices are easily included in the dataset, leading to higher quality estimates. Equilibrium arguments are used to price the risks associated with the time-varying crash probability, and in turn to motivate a risk-neutral system for use in option pricing. The option pricing function for the model is obtained via the application of widely-used Fourier techniques. An application to S&P500 index returns and a panel of S&P500 index option prices reveals evidence of self excitation.

Relevance: 10.00%

Abstract:

Over the last decade, the majority of existing search techniques have been either keyword-based or category-based, with unsatisfactory effectiveness. Meanwhile, studies have shown that more than 80% of users prefer personalized search results. As a result, much research effort (commonly gathered under the banner of collaborative filtering) has been devoted to personalization for enhancing retrieval performance. One of the fundamental yet most challenging steps is to capture precise user information needs. Most Web users are inexperienced or lack the ability to express their needs properly, whereas existing retrieval systems are highly sensitive to vocabulary.

Researchers have increasingly proposed ontology-based techniques to improve current mining approaches. These techniques can not only refine search intentions within specific generic domains, but also access new knowledge by tracking semantic relations. In recent years, some researchers have attempted to build ontological user profiles from discovered user background knowledge, combining global and local analyses to produce tailored ontologies from a group of concepts. However, a key problem that has not been addressed is how to accurately match diverse local information to universal global knowledge.

This research conducts a theoretical study on the use of personalized ontologies to enhance text mining performance. The objective is to understand user information needs as a "bag of concepts" rather than a bag of words. The concepts are gathered from a general world knowledge base, the Library of Congress Subject Headings. To return desirable search results, a novel ontology-based mining approach is introduced to discover accurate search intentions and to learn personalized ontologies as user profiles. The approach can not only pinpoint users' individual intentions within a rough hierarchical structure, but also interpret their needs through a set of acknowledged concepts. Alongside the global and local analyses, a concept matching approach is developed to address the mismatch between local information and world knowledge. Relevance features produced by the Relevance Feature Discovery model serve as representatives of local information; these features have been shown to be the best alternative to user queries for avoiding ambiguity, and they consistently outperform features extracted by other filtering models.

The two proposed approaches are evaluated on the standard Reuters Corpus Volume 1 test collection, with a comprehensive comparison against a number of state-of-the-art baseline models, including TF-IDF, Rocchio, Okapi BM25, the Pattern Taxonomy Model, and an ontology-based model. The results indicate that top precision can be improved remarkably with the proposed ontology mining approach, and that the matching approach achieves significant improvements on most information filtering measures. This research contributes to the fields of ontological filtering, user profiling, and knowledge representation. Its outputs are critical when systems are expected to return proper mining results and provide personalized services, and the findings have the potential to inform the design of advanced preference mining models with impact on people's daily lives.
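The matching step is described only at a high level; a toy sketch of scoring documents against a "bag-of-concepts" user profile might look as follows. The concept labels and weights are invented for illustration and do not come from the thesis.

    def concept_score(profile, doc_concepts):
        """Score a document by weighted overlap between the user's
        bag-of-concepts profile and the concepts found in the document."""
        return sum(w for c, w in profile.items() if c in doc_concepts)

    # Hypothetical profile built from Library of Congress Subject Headings.
    profile = {"Machine learning": 0.6, "Text mining": 0.3, "Ontologies": 0.1}
    docs = {
        "d1": {"Machine learning", "Ontologies"},
        "d2": {"Sports", "Economics"},
    }
    ranked = sorted(docs, key=lambda d: concept_score(profile, docs[d]),
                    reverse=True)
    print(ranked)  # ['d1', 'd2']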

Relevance: 10.00%

Abstract:

Success with molecular-based targeted drugs in the treatment of cancer has ignited extensive research efforts within the field of personalized therapeutics. However, successful application of such therapies depends on the presence or absence of mutations within the patient's tumor that can confer clinical efficacy or drug resistance. Building on these findings, we developed a high-throughput mutation panel for identifying frequently occurring and clinically relevant mutations in melanoma. An extensive literature search and interrogation of the Catalogue of Somatic Mutations in Cancer database identified more than 1,000 melanoma mutations. Applying a filtering strategy to focus on mutations amenable to the development of targeted drugs, we initially screened 120 known mutations in 271 samples using the Sequenom MassARRAY system. A total of 252 mutations were detected in 17 genes; the highest frequencies occurred in BRAF (n = 154, 57%), NRAS (n = 55, 20%), CDK4 (n = 8, 3%), PTK2B (n = 7, 2.5%), and ERBB4 (n = 5, 2%). Based on this initial discovery screen, 46 assays interrogating 39 mutations in 20 genes were designed to form a melanoma-specific panel. These assays were distributed in multiplexes over 8 wells using strict assay design parameters optimized for sensitive mutation detection. The final melanoma-specific mutation panel is a cost-effective, sensitive, high-throughput approach for identifying mutations of clinical relevance to molecular-based therapeutics for the treatment of melanoma. When used in a clinical research setting, the panel may rapidly and accurately identify potentially effective treatment strategies using novel or existing molecularly targeted drugs.

Relevance: 10.00%

Abstract:

This thesis establishes performance properties for approximate filters and controllers that are designed on the basis of approximate dynamic system representations. These performance properties provide a theoretical justification for the widespread application of approximate filters and controllers in the common situation where system models are not known with complete certainty. This research also provides useful tools for approximate filter designs, which are applied to hybrid filtering of uncertain nonlinear systems. As a contribution towards applications, this thesis also investigates air traffic separation control in the presence of measurement uncertainties.

Relevance: 10.00%

Abstract:

Online dating websites enable a specific form of social networking, and their efficiency can be increased by supporting proactive recommendations based on participants' preferences through data mining. This research develops two-way, people-to-people recommendation methods for large online social networks such as online dating networks. It identifies the characteristics of online dating networks and exploits these characteristics in developing efficient people-to-people recommendation methods. The methods developed improve recommendation accuracy, handle the data sparsity that often comes with large datasets, and scale to online networks with large numbers of users.

Relevance: 10.00%

Abstract:

In many applications where encrypted traffic flows from an open (public) domain to a protected (private) domain, there exists a gateway that bridges the two domains and faithfully forwards the incoming traffic to the receiver. We observe that indistinguishability against (adaptive) chosen-ciphertext attacks (IND-CCA), which is a mandatory goal in the face of active attacks in a public domain, can be essentially relaxed to indistinguishability against chosen-plaintext attacks (IND-CPA) for ciphertexts once they pass the gateway, which acts as an IND-CCA/CPA filter by first checking the validity of an incoming IND-CCA ciphertext, then transforming it (if valid) into an IND-CPA ciphertext, and forwarding the latter to the recipient in the private domain. "Non-trivial" filtering can result in reduced decryption costs on the receivers' side. We identify a class of encryption schemes with publicly verifiable ciphertexts that admit generic constructions of (non-trivial) IND-CCA/CPA filters; these schemes are characterized by the existence of public algorithms that can distinguish valid from invalid ciphertexts. To this end, we formally define (non-trivial) public verifiability of ciphertexts for general encryption schemes, key encapsulation mechanisms, and hybrid encryption schemes, encompassing public-key, identity-based, and tag-based encryption flavours. We further analyze the security impact of public verifiability and discuss generic transformations and concrete constructions that enjoy this property.
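A minimal sketch of the gateway's filtering logic, assuming a hypothetical scheme that supplies a public validity check and a CCA-to-CPA transformation; both callables below are stand-ins for illustration, not an API from the paper.

    from typing import Callable, Optional

    def make_cca_cpa_filter(publicly_verify: Callable[[bytes], bool],
                            strip_to_cpa: Callable[[bytes], bytes]):
        """Build an IND-CCA/CPA filter: reject invalid IND-CCA ciphertexts
        at the gateway, and transform valid ones into (cheaper-to-decrypt)
        IND-CPA ciphertexts for the receiver in the private domain.

        `publicly_verify` and `strip_to_cpa` are hypothetical hooks that a
        concrete scheme with publicly verifiable ciphertexts would supply.
        """
        def gateway(ciphertext: bytes) -> Optional[bytes]:
            if not publicly_verify(ciphertext):   # public validity check
                return None                       # drop active-attack garbage
            return strip_to_cpa(ciphertext)       # forward the CPA ciphertext
        return gateway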

Relevance: 10.00%

Abstract:

Process-Aware Information Systems (PAISs) support the execution of operational processes involving people, resources, and software applications on the basis of process models. Process models describe vast, often infinite, sets of process instances, i.e., workflows supported by the systems. With the increasing adoption of PAISs, large process model repositories have emerged in companies and public organizations, and these repositories constitute significant information resources. Accurate and efficient retrieval of process models and/or process instances from such repositories is valuable for multiple reasons, e.g., searching for similar models or instances, filtering, reuse, standardization, process compliance checking, and verification of formal properties. This paper proposes a technique for indexing process models that relies on alternative representations of the models, called untanglings. We show how untanglings can be used to retrieve process models based on the process instances they specify, via a solution to the total executability problem. Experiments with industrial process models show that the proposed retrieval approach is up to three orders of magnitude faster than the state of the art.
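Untanglings are beyond a short sketch, but the general indexing-for-retrieval idea can be illustrated with a plain inverted index from activity labels to the process models containing them; this generic stand-in is not the untangling-based index of the paper.

    from collections import defaultdict

    def build_index(models):
        """Map each activity label to the set of process models containing
        it, so candidate models for an instance can be found without
        scanning the whole repository."""
        index = defaultdict(set)
        for name, activities in models.items():
            for a in activities:
                index[a].add(name)
        return index

    # Hypothetical repository of process models and their activities.
    models = {"order2cash": {"receive order", "ship", "invoice"},
              "procure2pay": {"order goods", "receive goods", "invoice"}}
    index = build_index(models)
    # Models that could support an instance containing these activities:
    trace = {"receive order", "invoice"}
    print(set.intersection(*(index[a] for a in trace)))  # {'order2cash'}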

Relevance: 10.00%

Abstract:

In people-to-people matching systems, filtering is widely applied to find the most suitable matches, but the results returned are either too numerous (when the search is generic) or too few (when it is specific), so a more sophisticated recommendation approach becomes necessary. Traditionally, the object of recommendation is an inanimate item. In online dating systems, reciprocal recommendation is required: a partner should be suggested only when both the user and the recommended candidate are satisfied. In this paper, an innovative reciprocal collaborative method is developed based on the idea of similarity and common neighbors, utilizing relevance feedback and feature importance information. Extensive experiments are carried out using data gathered from a real online dating service; compared with benchmark methods, the results show that the proposed method achieves noticeably better performance.
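A minimal sketch of reciprocal scoring in the spirit described here: each direction of preference is scored separately and the two scores are combined with a harmonic mean, so a pair is recommended only when both sides score well. The preference representation is an assumption for illustration; the paper's method additionally uses common neighbors, relevance feedback, and feature importance.

    def one_way_score(seeker_prefs, candidate_profile):
        """Fraction of the seeker's stated preferences the candidate meets."""
        if not seeker_prefs:
            return 0.0
        hits = sum(1 for k, v in seeker_prefs.items()
                   if candidate_profile.get(k) == v)
        return hits / len(seeker_prefs)

    def reciprocal_score(a_prefs, a_profile, b_prefs, b_profile):
        """Harmonic mean: high only when BOTH directions are satisfied."""
        s_ab = one_way_score(a_prefs, b_profile)
        s_ba = one_way_score(b_prefs, a_profile)
        if s_ab == 0 or s_ba == 0:
            return 0.0
        return 2 * s_ab * s_ba / (s_ab + s_ba)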

Relevance: 10.00%

Abstract:

Cyclostationary models for the diagnostic signals measured on faulty rotating machinery have proved successful in many laboratory tests and industrial applications. The squared envelope spectrum has been identified as the most efficient indicator for assessing second-order cyclostationary symptoms of damage, which are typical, for instance, of rolling element bearing faults. In an attempt to foster the spread of rotating machinery diagnostics, the current trend in the field is toward higher levels of automation in condition monitoring systems, and statistical tests for the presence of cyclostationarity have been proposed in recent years for this purpose. The statistical thresholds proposed in the past for identifying cyclostationary components were obtained under the hypothesis that the signal is white noise when the component is healthy. This assumption, coupled with the non-white nature of real signals, implies the need to pre-whiten the signal or filter it in optimal narrow bands, increasing the complexity of the algorithm and the risk of losing diagnostic information or biasing the result. In this paper, the authors introduce an original analytical derivation of statistical tests for cyclostationarity in the squared envelope spectrum, dropping the white-noise hypothesis from the outset. The effect of first-order and second-order cyclostationary components on the distribution of the squared envelope spectrum is quantified, and the effectiveness of the newly proposed threshold is verified, providing a sound theoretical basis and a practical starting point for efficient automated diagnostics of machine components such as rolling element bearings. The analytical results are verified by means of numerical simulations and experimental vibration data from rolling element bearings.
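The squared envelope spectrum itself is a standard computation (analytic signal via the Hilbert transform, squared magnitude, then the spectrum). The sketch below shows that pipeline only; the statistical thresholding that is the paper's actual contribution is omitted.

    import numpy as np
    from scipy.signal import hilbert

    def squared_envelope_spectrum(x, fs):
        """Squared envelope spectrum of a vibration signal.

        x  : 1-D vibration signal
        fs : sampling frequency in Hz
        Returns (cyclic frequencies in Hz, magnitude of the SES).
        """
        envelope_sq = np.abs(hilbert(x)) ** 2    # squared envelope
        envelope_sq -= envelope_sq.mean()        # remove the DC component
        ses = np.abs(np.fft.rfft(envelope_sq)) / len(x)
        freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
        return freqs, ses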

Relevance: 10.00%

Abstract:

Bluetooth technology is increasingly used, among Automated Vehicle Identification systems, to retrieve important information about urban networks. Because the movement of Bluetooth-equipped vehicles can be monitored throughout the network of Bluetooth sensors, this technology represents an effective means of acquiring accurate time-dependent Origin-Destination information. To obtain reliable estimates, however, a number of issues need to be addressed through data filtering and correction techniques. Chief among the challenges inherent to Bluetooth data is that sensors may fail to detect all nearby Bluetooth-enabled vehicles; as a consequence, the exact journey of some vehicles becomes a latent pattern that must be estimated. In addition, sensors in close proximity to each other may have overlapping detection areas, making the task of retrieving the correct travelled path even more challenging. The aim of this paper is twofold: to give an overview of the issues inherent to Bluetooth technology, through analysis of the data available from the Bluetooth sensors in Brisbane; and to propose a method for retrieving the itineraries of individual Bluetooth-equipped vehicles. We argue that accurately estimating these latent itineraries is a crucial step toward retrieving accurate dynamic Origin-Destination matrices.
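As one illustrative filtering step of the kind the paper alludes to, repeated detections produced while a device lingers in a sensor's range can be collapsed into single passage events. The 60-second gap threshold below is an invented assumption.

    def collapse_detections(records, max_gap=60.0):
        """Collapse repeated (sensor, timestamp) detections of one device
        into single passage events: detections at the same sensor closer
        than `max_gap` seconds apart are treated as the same passage."""
        passages = []
        for sensor, t in sorted(records, key=lambda r: r[1]):
            if (passages and passages[-1][0] == sensor
                    and t - passages[-1][1] <= max_gap):
                passages[-1] = (sensor, t)   # extend the current passage
            else:
                passages.append((sensor, t))
        return passages

    print(collapse_detections([("A", 0), ("A", 10), ("B", 300), ("A", 500)]))
    # [('A', 10), ('B', 300), ('A', 500)]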

Relevance: 10.00%

Abstract:

Whole-image descriptors such as GIST have been used successfully for persistent place recognition when combined with temporal or sequential filtering techniques. However, whole-image descriptor localization systems often apply a heuristic rather than a probabilistic approach to place recognition, requiring substantial environment-specific tuning prior to deployment. In this paper we present a novel online solution that uses statistical approaches to calculate place recognition likelihoods for whole-image descriptors, without requiring either environmental tuning or pre-training. Using a real-world benchmark dataset, we show that this method creates distributions appropriate to a specific environment in an online manner. Our method performs comparably to FAB-MAP in raw place recognition performance, and it integrates into a state-of-the-art probabilistic mapping system to deliver performance superior to whole-image methods that are not based on true probability distributions. The method provides a principled means of combining the powerful change-invariant properties of whole-image descriptors with probabilistic back-end mapping systems, without the need for prior training or system tuning.
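A toy version of the core idea: descriptor distances are standardised against their own observed distribution and turned into a normalised likelihood, instead of being compared against a hand-tuned, environment-specific threshold. The exponential mapping is an illustrative stand-in for whatever distribution the paper actually fits online.

    import numpy as np

    def match_likelihoods(query_desc, map_descs):
        """Turn whole-image descriptor distances into relative place-match
        likelihoods by standardising against the observed distance
        distribution itself, rather than a hand-tuned threshold."""
        dists = np.linalg.norm(map_descs - query_desc, axis=1)
        z = (dists - dists.mean()) / (dists.std() + 1e-9)
        lik = np.exp(-z)              # smaller distance -> larger likelihood
        return lik / lik.sum()        # normalise over candidate places

    # Hypothetical 512-D whole-image descriptors for five mapped places.
    descs = np.random.rand(5, 512)
    print(match_likelihoods(descs[2], descs))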

Relevance: 10.00%

Abstract:

Most recommender systems use collaborative filtering, content-based filtering, or a hybrid approach to recommend items to new users. Collaborative filtering recommends items to new users based on their similar neighbours, while content-based filtering tries to recommend items similar to the new users' profiles. The fundamental issues are how to profile new users and how to deal with over-specialization in content-based recommender systems. The terms used to describe items can be organized into a concept hierarchy, so we aim to describe user profiles, or information needs, using concept vectors. This paper presents a new method for acquiring user information needs that allows new users to describe their preferences on a concept hierarchy rather than by rating items. It also develops a new ranking function to recommend items to new users based on their information needs. The proposed approach is evaluated on Amazon book datasets; the experimental results demonstrate that it can largely improve the effectiveness of recommender systems.
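The ranking function is not spelled out in the abstract; as a sketch, items can be ranked by cosine similarity between a concept vector elicited from the new user and the items' concept vectors. Flattening the concept hierarchy to weighted concepts, and all data below, are simplifications for illustration.

    import math

    def cosine(u, v):
        """Cosine similarity between two sparse concept vectors (dicts)."""
        dot = sum(w * v.get(c, 0.0) for c, w in u.items())
        nu = math.sqrt(sum(w * w for w in u.values()))
        nv = math.sqrt(sum(w * w for w in v.values()))
        return dot / (nu * nv) if nu and nv else 0.0

    # Hypothetical concept vectors: a new user's preferences and two books.
    user = {"science fiction": 0.7, "space opera": 0.3}
    books = {"b1": {"science fiction": 0.9, "romance": 0.1},
             "b2": {"cooking": 1.0}}
    print(sorted(books, key=lambda b: cosine(user, books[b]), reverse=True))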

Relevance: 10.00%

Abstract:

Big Data presents many challenges related to volume, whether one is interested in studying past datasets or, even more problematically, attempting to work with live streams of data. The most obvious challenge, in a ‘noisy’ environment such as contemporary social media, is to collect the pertinent information: information for a specific study, tweets which can inform emergency services or other responders to an ongoing crisis, or an advantage for those involved in prediction markets. Often such a process is iterative, with keywords and hashtags changing over time, and both collection and analytic methodologies need to be continually adapted to respond to this changing information. While many of the datasets collected and analyzed are preformed, that is, built around a particular keyword, hashtag, or set of authors, they still contain a large volume of information, much of which is unnecessary for the current purpose and/or potentially useful for future projects. Accordingly, this panel considers methods for separating and combining data to optimize big data research and report findings to stakeholders.

The first paper considers possible coding mechanisms for incoming tweets during a crisis: taking a large stream of incoming tweets and selecting those that need to be immediately placed in front of responders for manual filtering and possible action. The paper suggests two solutions, content analysis and user profiling. In the former, aspects of the tweet are assigned a score assessing its likely relationship to the topic at hand and the urgency of the information; the latter attempts to identify users who either serve as amplifiers of information or are known as authoritative sources (a toy version of such content scoring is sketched after this panel description). Through these techniques, the information contained in a large dataset can be filtered down to match the expected capacity of emergency responders, while knowledge of the core keywords and hashtags relating to the current event is constantly refined for future data collection.

The second paper is also concerned with identifying significant tweets, but in this case tweets relevant to a particular prediction market: tennis betting. As increasing numbers of professional sportspeople create Twitter accounts to communicate with their fans, information is being shared regarding injuries, form, and emotions that has the potential to affect future results. As has already been demonstrated with leading US sports, such information is extremely valuable. Tennis, like American Football (NFL) and Baseball (MLB), has paid subscription services which manually filter incoming news sources, including tweets, for information valuable to gamblers, gambling operators, and fantasy sports players. However, while such services are still niche operations, much of the value of the information is lost by the time it reaches one of them. The paper thus considers how information could be filtered from Twitter user lists and hashtag or keyword monitoring, assessing the value of the source, the information, and the prediction markets to which it may relate.

The third paper examines methods for collecting Twitter data and following changes in an ongoing, dynamic social movement, such as the Occupy Wall Street movement. It involves the development of technical infrastructure to collect the tweets and make them available for exploration and analysis. A strategy for responding to changes in the social movement is also required, or the resulting tweets will reflect only the discussions and strategies the movement used at the time the keyword list was created; in a way, keyword creation is part strategy and part art. The paper describes strategies for the creation of a social media archive, specifically of tweets related to the Occupy Wall Street movement, and methods for continuing to adapt data collection strategies as the movement's presence on Twitter changes over time. It also discusses the opportunities and methods for extracting smaller slices of data from an archive of social media data to support a multitude of research projects across multiple fields of study.

The common theme among these papers is that of constructing a dataset, filtering it for a specific purpose, and then using the resulting information to aid future data collection. The intention is that, through the papers presented and the subsequent discussion, the panel will inform the wider research community not only about the objectives and limitations of data collection, live analytics, and filtering, but also about current and in-development methodologies that could be adopted by those working with such datasets, and how such approaches could be customized depending on the project stakeholders.
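As referenced in the first paper's description above, a toy version of content-analysis scoring for incoming tweets might look as follows; the keyword lists, weights, and authority boost are invented for illustration.

    TOPIC_TERMS = {"flood": 2.0, "evacuate": 3.0, "road closed": 2.5}  # hypothetical
    URGENT_TERMS = {"help": 3.0, "trapped": 4.0, "urgent": 2.0}        # hypothetical

    def score_tweet(text, authority_boost=0.0):
        """Crude content-analysis score: topical terms plus urgency terms,
        plus a boost when the author is a known authoritative source."""
        t = text.lower()
        topical = sum(w for term, w in TOPIC_TERMS.items() if term in t)
        urgency = sum(w for term, w in URGENT_TERMS.items() if term in t)
        return topical + urgency + authority_boost

    queue = ["Trapped on roof, need help #flood", "Nice weather today"]
    for tweet in sorted(queue, key=score_tweet, reverse=True):
        print(score_tweet(tweet), tweet)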

Relevance: 10.00%

Abstract:

Topic modelling has been widely used in information retrieval, text mining, machine learning, and related fields. In this paper, we propose a novel model, the Pattern Enhanced Topic Model (PETM), which improves topic modelling by semantically representing topics with discriminative patterns, and which makes an innovative contribution to information filtering by using PETM to determine document relevance based on topic distributions and the maximum matched patterns proposed in this paper. Extensive experiments evaluating the effectiveness of PETM are conducted on the TREC data collection Reuters Corpus Volume 1. The results show that the proposed model significantly outperforms both state-of-the-art term-based models and pattern-based models.
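As a rough sketch of the scoring idea, document relevance can blend a topic-distribution similarity with a bonus for matched discriminative patterns. The combination rule, weights, and data below are illustrative guesses, not the PETM formulation.

    import numpy as np

    def relevance(doc_topics, query_topics, doc_terms, topic_patterns,
                  alpha=0.5):
        """Blend topic-distribution similarity with pattern matching:
        score = alpha * topic similarity + (1 - alpha) * weight of the
        best matched pattern, where a pattern matches if all of its
        terms occur in the document."""
        topic_sim = float(np.dot(doc_topics, query_topics))
        matched = [w for pattern, w in topic_patterns
                   if set(pattern) <= doc_terms]
        pattern_score = max(matched, default=0.0)
        return alpha * topic_sim + (1 - alpha) * pattern_score

    doc_topics = np.array([0.7, 0.2, 0.1])     # hypothetical topic mixture
    query_topics = np.array([0.6, 0.3, 0.1])
    patterns = [(("interest", "rate"), 0.9), (("match", "report"), 0.4)]
    print(relevance(doc_topics, query_topics,
                    {"interest", "rate", "cut"}, patterns))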