39 results for variable data printing

in Aston University Research Archive


Relevance:

40.00%

Abstract:

Visualization has proven to be a powerful and widely applicable tool for the analysis and interpretation of data. Most visualization algorithms aim to find a projection from the data space down to a two-dimensional visualization space. However, for complex data sets living in a high-dimensional space it is unlikely that a single two-dimensional projection can reveal all of the interesting structure. We therefore introduce a hierarchical visualization algorithm which allows the complete data set to be visualized at the top level, with clusters and sub-clusters of data points visualized at deeper levels. The algorithm is based on a hierarchical mixture of latent variable models, whose parameters are estimated using the expectation-maximization algorithm. We demonstrate the principle of the approach first on a toy data set, and then apply the algorithm to the visualization of a synthetic data set in 12 dimensions obtained from a simulation of multi-phase flows in oil pipelines and to data in 36 dimensions derived from satellite images.
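
As a rough illustration of the hierarchical idea (a stand-in using off-the-shelf tools, not the paper's hierarchical mixture of latent variable models), the Python sketch below builds a top-level 2D view of the whole data set and separate 2D views of clusters found at the next level; the data set and parameter choices are invented.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

# Illustrative data: 500 points in 12 dimensions (a stand-in for the pipeline data).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 12))

# Top level: a single 2D projection of the complete data set.
top_view = PCA(n_components=2).fit_transform(X)

# Second level: cluster the data, then give each cluster its own 2D projection,
# so structure hidden in the top-level view can emerge within clusters.
labels = GaussianMixture(n_components=3, random_state=0).fit_predict(X)
sub_views = {k: PCA(n_components=2).fit_transform(X[labels == k]) for k in range(3)}

print(top_view.shape, {k: v.shape for k, v in sub_views.items()})
```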

Relevance:

40.00%

Abstract:

There may be circumstances where it is necessary for microbiologists to compare variances rather than means, e.g., in analysing data from experiments to determine whether a particular treatment alters the degree of variability, or in testing the assumption of homogeneity of variance prior to other statistical tests. All of the tests described in this Statnote have their limitations. Bartlett’s test may be too sensitive, but Levene’s and the Brown-Forsythe tests also have problems. We would recommend the use of the variance-ratio test to compare two variances and the careful application of Bartlett’s test if there are more than two groups. Considering that these tests are not particularly robust, it should be remembered that the homogeneity of variance assumption is usually the least important of those considered when carrying out an ANOVA. If there is concern about this assumption, and especially if the other assumptions of the analysis are also unlikely to be met (e.g., lack of normality or non-additivity of treatment effects), then it may be better either to transform the data or to carry out a non-parametric test on the data.
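
For readers who want to try the tests discussed above, the Python sketch below applies them via scipy.stats to three invented groups of measurements; the data and group sizes are illustrative only.

```python
import numpy as np
from scipy import stats

# Three invented groups of measurements (e.g. counts under three treatments).
rng = np.random.default_rng(42)
a = rng.normal(10, 1.0, size=20)
b = rng.normal(10, 1.5, size=20)
c = rng.normal(10, 3.0, size=20)

# Variance-ratio (F) test for exactly two groups: larger variance over smaller.
f = max(a.var(ddof=1), b.var(ddof=1)) / min(a.var(ddof=1), b.var(ddof=1))
p_two_sided = 2 * (1 - stats.f.cdf(f, len(a) - 1, len(b) - 1))
print("variance-ratio test:", f, p_two_sided)

# Bartlett's test (sensitive to non-normality) and Levene-type tests for >2 groups.
print("Bartlett:", stats.bartlett(a, b, c))
print("Levene (mean-centred):", stats.levene(a, b, c, center='mean'))
print("Brown-Forsythe (median-centred Levene):", stats.levene(a, b, c, center='median'))
```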

Relevance:

40.00%

Abstract:

In a Data Envelopment Analysis model, some of the weights used to compute the efficiency of a unit can have zero or negligible value despite the importance of the corresponding input or output. This paper offers an approach to preventing inputs and outputs from being ignored in the DEA assessment in the multiple-input, multiple-output VRS environment, building on an approach introduced in Allen and Thanassoulis (2004) for single-input, multiple-output CRS cases. The proposed method is based on the idea of introducing unobserved DMUs, created by adjusting the input and output levels of certain observed, relatively efficient DMUs in a manner which reflects a combination of technical information and the decision maker's value judgements. In contrast to many alternative techniques used to constrain weights and/or improve envelopment in DEA, this approach allows one to impose local information on production trade-offs which are in line with the general VRS technology. The suggested procedure is illustrated using real data. © 2011 Elsevier B.V. All rights reserved.
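
For context, the sketch below sets up the standard input-oriented VRS (BCC) envelopment model with scipy.optimize.linprog. It is the textbook model only, not the paper's procedure for adding unobserved DMUs, and the data are invented.

```python
import numpy as np
from scipy.optimize import linprog

def bcc_input_efficiency(X, Y, j0):
    """Input-oriented VRS (BCC) envelopment model for DMU j0.
    X: inputs (m x n), Y: outputs (s x n); columns are DMUs."""
    m, n = X.shape
    s = Y.shape[0]
    c = np.r_[1.0, np.zeros(n)]                        # minimise theta
    A_in = np.hstack([-X[:, [j0]], X])                 # sum_j l_j x_ij - theta x_i,j0 <= 0
    A_out = np.hstack([np.zeros((s, 1)), -Y])          # -sum_j l_j y_rj <= -y_r,j0
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.r_[np.zeros(m), -Y[:, j0]]
    A_eq = np.r_[0.0, np.ones(n)].reshape(1, -1)       # sum_j lambda_j = 1 (VRS)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(None, None)] + [(0, None)] * n, method="highs")
    return res.x[0]

# Invented example: 2 inputs, 1 output, 5 DMUs.
X = np.array([[2.0, 3.0, 6.0, 4.0, 5.0],
              [3.0, 2.0, 1.0, 4.0, 5.0]])
Y = np.array([[1.0, 1.0, 1.0, 1.0, 1.0]])
print([round(bcc_input_efficiency(X, Y, j), 3) for j in range(X.shape[1])])
```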

Relevance:

30.00%

Abstract:

There is currently considerable interest in developing general non-linear density models based on latent, or hidden, variables. Such models have the ability to discover the presence of a relatively small number of underlying 'causes' which, acting in combination, give rise to the apparent complexity of the observed data set. Unfortunately, to train such models generally requires large computational effort. In this paper we introduce a novel latent variable algorithm which retains the general non-linear capabilities of previous models but which uses a training procedure based on the EM algorithm. We demonstrate the performance of the model on a toy problem and on data from flow diagnostics for a multi-phase oil pipeline.
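
To make the EM-based training idea concrete, here is a minimal EM loop for a linear-Gaussian latent variable model (probabilistic PCA). It is far simpler than the non-linear model in the paper, but the E-step/M-step structure is analogous; the data and dimensions are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D, Q = 300, 12, 2                       # samples, data dim, latent dim (illustrative)
X = rng.normal(size=(N, D))
Xc = X - X.mean(axis=0)

W = rng.normal(size=(D, Q))                # factor loadings
sigma2 = 1.0                               # isotropic noise variance

for _ in range(50):
    # E-step: posterior moments of the latent variables z_n.
    M = W.T @ W + sigma2 * np.eye(Q)
    Minv = np.linalg.inv(M)
    Ez = Xc @ W @ Minv                     # N x Q, E[z_n]
    Ezz = N * sigma2 * Minv + Ez.T @ Ez    # sum_n E[z_n z_n^T]

    # M-step: re-estimate the loadings and the noise variance.
    W = Xc.T @ Ez @ np.linalg.inv(Ezz)
    sigma2 = (np.sum(Xc**2)
              - 2 * np.sum((Xc @ W) * Ez)
              + np.trace(Ezz @ W.T @ W)) / (N * D)

print("estimated noise variance:", sigma2)
```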

Relevance:

30.00%

Abstract:

Obtaining wind vectors over the ocean is important for weather forecasting and ocean modelling. Several satellite systems used operationally by meteorological agencies utilise scatterometers to infer wind vectors over the oceans. In this paper we present the results of using novel neural network based techniques to estimate wind vectors from such data. The problem is partitioned into estimating wind speed and wind direction. Wind speed is modelled using a multi-layer perceptron (MLP) and a sum of squares error function. Wind direction is a periodic variable and a multi-valued function for a given set of inputs; a conventional MLP fails at this task, and so we model the full periodic probability density of direction conditioned on the satellite derived inputs using a Mixture Density Network (MDN) with periodic kernel functions. A committee of the resulting MDNs is shown to improve the results.
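
A minimal sketch of the direction model described above (a mixture density network with periodic, von Mises kernels) is given below in PyTorch; layer sizes, names and training details are illustrative assumptions, not the authors' architecture.

```python
import math
import torch
import torch.nn as nn

class PeriodicMDN(nn.Module):
    """MDN whose predicted density over wind direction is a mixture of
    von Mises (periodic) kernels conditioned on the network input."""
    def __init__(self, n_inputs, n_hidden=32, n_kernels=4):
        super().__init__()
        self.n_kernels = n_kernels
        self.body = nn.Sequential(nn.Linear(n_inputs, n_hidden), nn.Tanh())
        self.logits = nn.Linear(n_hidden, n_kernels)          # mixing coefficients
        self.centre_xy = nn.Linear(n_hidden, 2 * n_kernels)   # kernel centres as (cos, sin)
        self.log_kappa = nn.Linear(n_hidden, n_kernels)       # kernel concentrations

    def forward(self, x):
        h = self.body(x)
        log_pi = torch.log_softmax(self.logits(h), dim=-1)
        xy = self.centre_xy(h).view(-1, 2, self.n_kernels)
        mu = torch.atan2(xy[:, 1], xy[:, 0])                  # centre angles in (-pi, pi]
        kappa = torch.exp(self.log_kappa(h)).clamp(max=50.0)  # avoid Bessel overflow
        return log_pi, mu, kappa

def negative_log_likelihood(model, x, theta):
    """theta: observed directions in radians, shape (batch,)."""
    log_pi, mu, kappa = model(x)
    # log von Mises density: kappa*cos(theta - mu) - log(2*pi*I0(kappa))
    log_k = kappa * torch.cos(theta.unsqueeze(-1) - mu) \
            - torch.log(2 * math.pi * torch.special.i0(kappa))
    return -torch.logsumexp(log_pi + log_k, dim=-1).mean()

# A committee, as in the abstract, would train several such networks and
# average their predicted densities for each input.
```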

Relevance:

30.00%

Abstract:

Obtaining wind vectors over the ocean is important for weather forecasting and ocean modelling. Several satellite systems used operationally by meteorological agencies utilise scatterometers to infer wind vectors over the oceans. In this paper we present the results of using novel neural network based techniques to estimate wind vectors from such data. The problem is partitioned into estimating wind speed and wind direction. Wind speed is modelled using a multi-layer perceptron (MLP) and a sum of squares error function. Wind direction is a periodic variable and a multi-valued function for a given set of inputs; a conventional MLP fails at this task, and so we model the full periodic probability density of direction conditioned on the satellite derived inputs using a Mixture Density Network (MDN) with periodic kernel functions. A committee of the resulting MDNs is shown to improve the results.

Relevance:

30.00%

Abstract:

Purpose: To assess repeatability and reproducibility, to determine normative data, and to investigate the effect of age-related macular disease, compared with normal eyes, on photostress recovery time measured using the Eger Macular Stressometer (EMS). Method: The study population comprised 49 healthy eyes of 49 participants. Four EMS measurements were taken in two sessions separated by 1 h by two practitioners, with reversal of order in the second session. EMS readings were also taken from 17 eyes affected by age-related maculopathy (ARM) and 12 eyes affected by age-related macular degeneration (AMD). Results: EMS readings are repeatable to within ±7 s. There is a statistically significant difference between controls and ARM-affected eyes (t = 2.169, p = 0.045), and between controls and AMD-affected eyes (t = 2.817, p = 0.016). The EMS is highly specific and demonstrates a sensitivity of 29% for ARM and 50% for AMD. Conclusions: The EMS may be a useful screening test for ARM; however, direct illumination of the macula of greater intensity and longer duration may yield less variable results. © 2004 The College of Optometrists.

Relevance:

30.00%

Abstract:

This paper introduces a new technique for the investigation of limited-dependent variable models. It illustrates that variable precision rough set theory (VPRS), allied with the use of a modern method of classification, or discretisation, of data, can outperform the more standard approaches employed in economics, such as a probit model. These approaches and certain inductive decision tree methods are compared (through a Monte Carlo simulation approach) in the analysis of the decisions reached by the UK Monopolies and Mergers Commission. We show that, particularly in small samples, the VPRS model can improve on more traditional models, both in-sample and, in particular, in out-of-sample prediction. A similar improvement in out-of-sample prediction over the decision tree methods is also shown.
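
As an illustration of the style of in-sample/out-of-sample comparison described (not of VPRS itself, for which no standard library implementation is assumed here), the Python sketch below fits a probit model and a decision tree to invented binary-outcome data and compares hold-out accuracy.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

# Invented data: 8 explanatory variables, binary decision outcome.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 8))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=200) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=1)

# Probit model (the standard econometric benchmark mentioned in the abstract).
probit = sm.Probit(y_tr, sm.add_constant(X_tr)).fit(disp=0)
probit_pred = (probit.predict(sm.add_constant(X_te)) > 0.5).astype(int)

# Inductive decision tree, as one of the compared classifiers.
tree = DecisionTreeClassifier(max_depth=3, random_state=1).fit(X_tr, y_tr)

print("probit out-of-sample accuracy:", (probit_pred == y_te).mean())
print("tree   out-of-sample accuracy:", (tree.predict(X_te) == y_te).mean())
```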

Relevance:

30.00%

Abstract:

Data envelopment analysis (DEA) is defined on observed units, with the efficiency of each unit found from its distance to the boundary of the estimated production possibility set (PPS). Convexity is one of the underlying assumptions of the PPS. This paper shows some difficulties that arise when standard DEA models are used in the presence of input ratios and/or output ratios. The paper defines a new convexity assumption for data that include a ratio variable, and then proposes a series of modified DEA models capable of rectifying this problem.
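
To see why ratio data conflict with the standard convexity assumption, consider the following generic illustration (not taken from the paper): a convex combination of two observed ratio values is, in general, not the ratio attained by the correspondingly combined unit.

```latex
% Standard VRS production possibility set built from observed units (x_j, y_j):
T = \Bigl\{ (x,y) : x \ge \textstyle\sum_j \lambda_j x_j,\;
    y \le \textstyle\sum_j \lambda_j y_j,\; \textstyle\sum_j \lambda_j = 1,\; \lambda_j \ge 0 \Bigr\}

% If an output is a ratio y_j = v_j / w_j, the convex combination of ratio values
% generally differs from the ratio of the combined volumes:
\lambda \frac{v_1}{w_1} + (1-\lambda)\frac{v_2}{w_2}
  \;\ne\; \frac{\lambda v_1 + (1-\lambda) v_2}{\lambda w_1 + (1-\lambda) w_2}

% Numerical instance with \lambda = 1/2, (v_1, w_1) = (1, 1), (v_2, w_2) = (1, 3):
\tfrac12\cdot\tfrac{1}{1} + \tfrac12\cdot\tfrac{1}{3} = \tfrac{2}{3}
  \qquad\text{but}\qquad
  \frac{\tfrac12\cdot 1 + \tfrac12\cdot 1}{\tfrac12\cdot 1 + \tfrac12\cdot 3} = \tfrac{1}{2}
```

This gap between the point implied by the standard convexity assumption and what is actually producible is what the modified assumption addresses.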

Relevance:

30.00%

Abstract:

The book aims to introduce the reader to DEA in the most accessible manner possible. It is specifically aimed at those who have had no prior exposure to DEA and wish to learn its essentials, how it works, its key uses, and the mechanics of using it. The latter will include using DEA software. Students on degree or training courses will find the book especially helpful. The same is true of practitioners engaging in comparative efficiency assessments and performance management within their organisation. Examples are used throughout the book to help the reader consolidate the concepts covered. Table of contents:
List of Tables. List of Figures. Preface. Abbreviations.
1. Introduction to Performance Measurement.
2. Definitions of Efficiency and Related Measures.
3. Data Envelopment Analysis under Constant Returns to Scale: Basic Principles.
4. Data Envelopment Analysis under Constant Returns to Scale: General Models.
5. Using Data Envelopment Analysis in Practice.
6. Data Envelopment Analysis under Variable Returns to Scale.
7. Assessing Policy Effectiveness and Productivity Change Using DEA.
8. Incorporating Value Judgements in DEA Assessments.
9. Extensions to Basic DEA Models.
10. A Limited User Guide for Warwick DEA Software.
Author Index. Topic Index. References.

Relevance:

30.00%

Abstract:

The ability to distinguish one visual stimulus from another slightly different one depends on the variability of their internal representations. In a recent paper on human visual-contrast discrimination, Kontsevich et al (2002, Vision Research 42, 1771-1784) re-considered the long-standing question whether the internal noise that limits discrimination is fixed (contrast-invariant) or variable (contrast-dependent). They tested discrimination performance for 3 cycles/deg gratings over a wide range of incremental contrast levels at three masking contrasts, and showed that a simple model with an expansive response function and response-dependent noise could fit the data very well. Their conclusion - that noise in visual-discrimination tasks increases markedly with contrast - has profound implications for our understanding and modelling of vision. Here, however, we re-analyse their data, and report that a standard gain-control model with a compressive response function and fixed additive noise can also fit the data remarkably well. Thus these experimental data do not allow us to decide between the two models. The question remains open. [Supported by EPSRC grant GR/S74515/01]
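
For reference, generic textbook forms of the two competing accounts are sketched below; the parameterisations are illustrative, not necessarily those used by Kontsevich et al. or in the re-analysis.

```latex
% (a) Expansive transducer with response-dependent noise:
r(c) = c^{\,p}, \qquad \sigma(c) = \sigma_0 + k\, r(c), \qquad p > 1

% (b) Compressive gain-control transducer (Naka-Rushton form) with fixed additive noise:
r(c) = \frac{c^{\,p}}{c^{\,q} + z}, \qquad \sigma(c) = \sigma_0

% In either case, discrimination of contrasts c and c + \Delta c is limited by
d' = \frac{r(c + \Delta c) - r(c)}{\sigma(c)}
```

Because a compressive transducer with fixed noise and an expansive transducer with rising noise can generate very similar threshold-versus-contrast curves, discrimination data alone may not separate the two schemes, which is the point made above.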

Relevance:

30.00%

Abstract:

Different types of numerical data can be collected in a scientific investigation and the choice of statistical analysis will often depend on the distribution of the data. A basic distinction between variables is whether they are ‘parametric’ or ‘non-parametric’. When a variable is parametric, the data come from a symmetrically shaped distribution known as the ‘Gaussian’ or ‘normal distribution’ whereas non-parametric variables may have a distribution which deviates markedly in shape from normal. This article describes several aspects of the problem of non-normality including: (1) how to test for two common types of deviation from a normal distribution, viz., ‘skew’ and ‘kurtosis’, (2) how to fit the normal distribution to a sample of data, (3) the transformation of non-normally distributed data and scores, and (4) commonly used ‘non-parametric’ statistics which can be used in a variety of circumstances.
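
The four steps listed above map onto standard library calls; the Python sketch below uses scipy.stats with an invented, right-skewed sample and illustrative choices of transformation and non-parametric test.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = rng.lognormal(mean=0.0, sigma=0.8, size=60)    # invented, right-skewed sample

# (1) Test for skew and kurtosis, and for overall departure from normality.
print("skew test:    ", stats.skewtest(x))
print("kurtosis test:", stats.kurtosistest(x))
print("D'Agostino K2:", stats.normaltest(x))        # combines skew and kurtosis

# (2) Fit a normal distribution to the sample (maximum-likelihood estimates).
mu, sd = stats.norm.fit(x)
print("fitted normal: mean", mu, "sd", sd)

# (3) Transform skewed data; a log transform often normalises positive, skewed scores.
print("after log, K2:", stats.normaltest(np.log(x)))

# (4) A non-parametric alternative when normality cannot be assumed,
#     e.g. comparing two independent groups with the Mann-Whitney U test.
y = rng.lognormal(mean=0.3, sigma=0.8, size=60)
print("Mann-Whitney: ", stats.mannwhitneyu(x, y))
```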

Relevance:

30.00%

Abstract:

A local area network that can support both voice and data packets offers economic advantages, since only a single network is needed for both types of traffic, gives greater flexibility in responding to changing user demands, and enables efficient use to be made of the transmission capacity. The latter aspect is very important in local broadcast networks where the capacity is a scarce resource, for example mobile radio. This research has examined two types of local broadcast network: the Ethernet-type bus local area network and a mobile radio network with a central base station.

With such contention networks, medium access control (MAC) protocols are required to gain access to the channel. MAC protocols must provide efficient scheduling on the channel between the distributed population of stations who want to transmit. No access scheme can exceed the performance of a single server queue, due to the spatial distribution of the stations. Stations cannot in general form a queue without using part of the channel capacity to exchange protocol information.

In this research, several medium access protocols have been examined and developed in order to increase the channel throughput compared to existing protocols. However, the established performance measures of average packet time delay and throughput cannot adequately characterise protocol performance for packet voice. Rather, the percentage of bits delivered within a given time bound becomes the relevant performance measure. Performance evaluation of the protocols has been carried out using discrete event simulation and, in some cases, also by mathematical modelling.

All the protocols use either implicit or explicit reservation schemes, with their efficiency dependent on the fact that many voice packets are generated periodically within a talkspurt. Two of the protocols are based on the existing 'Reservation Virtual Time CSMA/CD' protocol, which forms a distributed queue through implicit reservations. This protocol has been improved, firstly, by utilising two channels: a packet transmission channel and a packet contention channel. Packet contention is then performed in parallel with a packet transmission to increase throughput. The second protocol uses variable-length packets to reduce the contention time between transmissions on a single channel. A third protocol, based on contention for explicit reservations, has also been developed. Once a station has achieved a reservation, it maintains this effective queue position for the remainder of the talkspurt and transmits after it has sensed the transmission from the preceding station within the queue.

In the mobile radio environment, adaptations to the protocols were necessary so that their operation was robust to signal fading. This was achieved through centralised control at a base station, unlike the local area network versions, where control was distributed among the stations. The results show an improvement in throughput compared to some previous protocols. Further work includes subjective testing to validate the protocols' effectiveness.
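
To make the delay-bound performance measure concrete, here is a minimal discrete-event sketch (Python) of the single-server FIFO queue that, as noted above, upper-bounds any access scheme; it reports the percentage of packets delivered within a delay bound. The arrival rate, service time and bound are invented values, not figures from the thesis.

```python
import random

# Single-server FIFO queue as the theoretical bound on any MAC scheme.
# All parameter values below are illustrative assumptions.
random.seed(1)
arrival_rate = 0.8      # packets per ms (Poisson arrivals, assumed)
service_time = 1.0      # ms per fixed-length packet (assumed)
delay_bound = 5.0       # ms voice delay bound (assumed)
n_packets = 100_000

t = 0.0                 # arrival clock
server_free_at = 0.0    # time the server next becomes idle
within_bound = 0
for _ in range(n_packets):
    t += random.expovariate(arrival_rate)    # next arrival
    start = max(t, server_free_at)           # wait if the server is busy
    finish = start + service_time
    server_free_at = finish
    if finish - t <= delay_bound:
        within_bound += 1

print(f"{100 * within_bound / n_packets:.1f}% of packets delivered within {delay_bound} ms")
```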

Relevance:

30.00%

Abstract:

Visualization of high-dimensional data has always been a challenging task. Here we discuss and propose variants of non-linear data projection methods (Generative Topographic Mapping (GTM) and GTM with simultaneous feature saliency (GTM-FS)) that are adapted to be effective on very high-dimensional data. The adaptations use log-space values at certain steps of the Expectation Maximization (EM) algorithm and during the visualization process. We have tested the proposed algorithms by visualizing electrostatic potential data for Major Histocompatibility Complex (MHC) class-I proteins. The experiments show that the variants of the original GTM and GTM-FS worked successfully with data of more than 2000 dimensions, and we compare the results with other linear/nonlinear projection methods: Principal Component Analysis (PCA), Neuroscale (NSC) and Gaussian Process Latent Variable Model (GPLVM).
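
The log-space adaptation can be illustrated with the responsibility (E-step) calculation. The sketch below uses scipy's logsumexp on invented log-likelihood values; it is not the authors' GTM/GTM-FS code.

```python
import numpy as np
from scipy.special import logsumexp

# Invented log-likelihoods log p(x_n | component k) for N points and K mixture
# components; with thousands of dimensions these values are hugely negative,
# so a naive exp() underflows to zero.
rng = np.random.default_rng(0)
log_lik = -2000.0 + 5.0 * rng.normal(size=(100, 20))   # N=100, K=20 (illustrative)
log_prior = np.full(20, -np.log(20))                    # equal mixing coefficients

# Naive responsibilities underflow:
naive = np.exp(log_lik + log_prior)
print("naive row sums (all zero):", naive.sum(axis=1)[:3])

# Log-space responsibilities stay well defined:
log_resp = log_lik + log_prior - logsumexp(log_lik + log_prior, axis=1, keepdims=True)
resp = np.exp(log_resp)
print("log-space row sums (all one):", resp.sum(axis=1)[:3])
```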

Relevance:

30.00%

Abstract:

Analyzing geographical patterns by collocating events, objects or their attributes has a long history in surveillance and monitoring, and is particularly applied in environmental contexts, such as ecology or epidemiology. The identification of patterns or structures at some scales can be addressed using spatial statistics, particularly marked point process methodologies. Classification and regression trees are also related to this goal of finding "patterns" by deducing the hierarchy of influence of variables on a dependent outcome. Such variable selection methods have been applied to spatial data, but often without explicitly acknowledging the spatial dependence. Many methods routinely used in exploratory point pattern analysis are second-order statistics, used in a univariate context, though there is also a wide literature on modelling methods for multivariate point pattern processes. This paper proposes an exploratory approach for multivariate spatial data using higher-order statistics built from co-occurrences of events or marks given by the point processes. A spatial entropy measure, derived from these multinomial distributions of co-occurrences at a given order, constitutes the basis of the proposed exploratory methods. © 2010 Elsevier Ltd.
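
A pairwise (second-order) version of a co-occurrence entropy can be sketched as follows; the marked point pattern, the radius and the choice of order are illustrative, not values taken from the paper.

```python
import numpy as np
from collections import Counter
from scipy.spatial import cKDTree

def cooccurrence_entropy(coords, marks, radius):
    """Entropy of the multinomial distribution of unordered mark pairs that
    co-occur within `radius` (a second-order co-occurrence statistic)."""
    tree = cKDTree(coords)
    pairs = tree.query_pairs(radius)
    counts = Counter(tuple(sorted((marks[i], marks[j]))) for i, j in pairs)
    total = sum(counts.values())
    p = np.array([c / total for c in counts.values()])
    return float(-np.sum(p * np.log(p)))

# Invented marked point pattern: 300 events with one of three categorical marks.
rng = np.random.default_rng(3)
coords = rng.uniform(0, 10, size=(300, 2))
marks = rng.choice(["A", "B", "C"], size=300)
print("spatial co-occurrence entropy:", cooccurrence_entropy(coords, marks, radius=0.7))
```

Higher-order versions of the same idea would count co-occurring triples or larger tuples of marks rather than pairs.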