36 resultados para probabilistic ranking
Resumo:
The focus of this thesis is the extension of topographic visualisation mappings to allow for the incorporation of uncertainty. Few visualisation algorithms in the literature are capable of mapping uncertain data with fewer able to represent observation uncertainties in visualisations. As such, modifications are made to NeuroScale, Locally Linear Embedding, Isomap and Laplacian Eigenmaps to incorporate uncertainty in the observation and visualisation spaces. The proposed mappings are then called Normally-distributed NeuroScale (N-NS), T-distributed NeuroScale (T-NS), Probabilistic LLE (PLLE), Probabilistic Isomap (PIso) and Probabilistic Weighted Neighbourhood Mapping (PWNM). These algorithms generate a probabilistic visualisation space with each latent visualised point transformed to a multivariate Gaussian or T-distribution, using a feed-forward RBF network. Two types of uncertainty are then characterised dependent on the data and mapping procedure. Data dependent uncertainty is the inherent observation uncertainty. Whereas, mapping uncertainty is defined by the Fisher Information of a visualised distribution. This indicates how well the data has been interpolated, offering a level of ‘surprise’ for each observation. These new probabilistic mappings are tested on three datasets of vectorial observations and three datasets of real world time series observations for anomaly detection. In order to visualise the time series data, a method for analysing observed signals and noise distributions, Residual Modelling, is introduced. The performance of the new algorithms on the tested datasets is compared qualitatively with the latent space generated by the Gaussian Process Latent Variable Model (GPLVM). A quantitative comparison using existing evaluation measures from the literature allows performance of each mapping function to be compared. Finally, the mapping uncertainty measure is combined with NeuroScale to build a deep learning classifier, the Cascading RBF. This new structure is tested on the MNist dataset achieving world record performance whilst avoiding the flaws seen in other Deep Learning Machines.
Resumo:
Cloud computing is a new technological paradigm offering computing infrastructure, software and platforms as a pay-as-you-go, subscription-based service. Many potential customers of cloud services require essential cost assessments to be undertaken before transitioning to the cloud. Current assessment techniques are imprecise as they rely on simplified specifications of resource requirements that fail to account for probabilistic variations in usage. In this paper, we address these problems and propose a new probabilistic pattern modelling (PPM) approach to cloud costing and resource usage verification. Our approach is based on a concise expression of probabilistic resource usage patterns translated to Markov decision processes (MDPs). Key costing and usage queries are identified and expressed in a probabilistic variant of temporal logic and calculated to a high degree of precision using quantitative verification techniques. The PPM cost assessment approach has been implemented as a Java library and validated with a case study and scalability experiments. © 2012 Springer-Verlag Berlin Heidelberg.
Resumo:
The traditional use of global and centralised control methods, fails for large, complex, noisy and highly connected systems, which typify many real world industrial and commercial systems. This paper provides an efficient bottom up design of distributed control in which many simple components communicate and cooperate to achieve a joint system goal. Each component acts individually so as to maximise personal utility whilst obtaining probabilistic information on the global system merely through local message-passing. This leads to an implied scalable and collective control strategy for complex dynamical systems, without the problems of global centralised control. Robustness is addressed by employing a fully probabilistic design, which can cope with inherent uncertainties, can be implemented adaptively and opens a systematic rich way to information sharing. This paper opens the foreseen direction and inspects the proposed design on a linearised version of coupled map lattice with spatiotemporal chaos. A version close to linear quadratic design gives an initial insight into possible behaviours of such networks.
Resumo:
It is important to help researchers find valuable papers from a large literature collection. To this end, many graph-based ranking algorithms have been proposed. However, most of these algorithms suffer from the problem of ranking bias. Ranking bias hurts the usefulness of a ranking algorithm because it returns a ranking list with an undesirable time distribution. This paper is a focused study on how to alleviate ranking bias by leveraging the heterogeneous network structure of the literature collection. We propose a new graph-based ranking algorithm, MutualRank, that integrates mutual reinforcement relationships among networks of papers, researchers, and venues to achieve a more synthetic, accurate, and less-biased ranking than previous methods. MutualRank provides a unified model that involves both intra- and inter-network information for ranking papers, researchers, and venues simultaneously. We use the ACL Anthology Network as the benchmark data set and construct the gold standard from computer linguistics course websites of well-known universities and two well-known textbooks. The experimental results show that MutualRank greatly outperforms the state-of-the-art competitors, including PageRank, HITS, CoRank, Future Rank, and P-Rank, in ranking papers in both improving ranking effectiveness and alleviating ranking bias. Rankings of researchers and venues by MutualRank are also quite reasonable.
Resumo:
To compare the accuracy of different forecasting approaches an error measure is required. Many error measures have been proposed in the literature, however in practice there are some situations where different measures yield different decisions on forecasting approach selection and there is no agreement on which approach should be used. Generally forecasting measures represent ratios or percentages providing an overall image of how well fitted the forecasting technique is to the observations. This paper proposes a multiplicative Data Envelopment Analysis (DEA) model in order to rank several forecasting techniques. We demonstrate the proposed model by applying it to the set of yearly time series of the M3 competition. The usefulness of the proposed approach has been tested using the M3-competition where five error measures have been applied in and aggregated to a single DEA score.
Resumo:
Event extraction from texts aims to detect structured information such as what has happened, to whom, where and when. Event extraction and visualization are typically considered as two different tasks. In this paper, we propose a novel approach based on probabilistic modelling to jointly extract and visualize events from tweets where both tasks benefit from each other. We model each event as a joint distribution over named entities, a date, a location and event-related keywords. Moreover, both tweets and event instances are associated with coordinates in the visualization space. The manifold assumption that the intrinsic geometry of tweets is a low-rank, non-linear manifold within the high-dimensional space is incorporated into the learning framework using a regularization. Experimental results show that the proposed approach can effectively deal with both event extraction and visualization and performs remarkably better than both the state-of-the-art event extraction method and a pipeline approach for event extraction and visualization.