337 results for outliers


Relevance:

10.00%

Publisher:

Abstract:

Collisions between pedestrians and vehicles continue to be a major problem throughout the world. Pedestrians trying to cross roads and railway tracks without any caution are often highly susceptible to collisions with vehicles and trains. Continuous financial, human and other losses have prompted transport-related organizations to come up with various solutions addressing this issue. However, the quest for new and significant improvements in this area is still ongoing. This work addresses this issue by building a general framework using computer vision techniques to automatically monitor pedestrian movements in such high-risk areas, to enable better analysis of activity and the creation of future alerting strategies. As a result of rapid development in the electronics and semiconductor industry, there is extensive deployment of CCTV cameras in public places to capture video footage. This footage can then be used to analyse crowd activities in those particular places. This work seeks to identify the abnormal behaviour of individuals in video footage. In this work we propose using a Semi-2D Hidden Markov Model (HMM), a Full-2D HMM and a Spatial HMM to model the normal activities of people. The outliers of the model (i.e. those observations with insufficient likelihood) are identified as abnormal activities. Location features, flow features and optical flow textures are used as the features for the model. The proposed approaches are evaluated using the publicly available UCSD datasets, and we demonstrate improved performance using a Semi-2D Hidden Markov Model compared to other state-of-the-art methods. Further, we illustrate how our proposed methods can be applied to detect anomalous events at rail level crossings.
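As a rough sketch of the likelihood-thresholding idea in this abstract (a plain GaussianHMM from the hmmlearn library stands in for the authors' Semi-2D/Full-2D/Spatial HMMs; the features, window length and threshold below are all hypothetical):

```python
# Minimal sketch: flag low-likelihood observation windows as anomalies.
# Assumes hmmlearn is installed; features and threshold are hypothetical.
import numpy as np
from hmmlearn import hmm

rng = np.random.default_rng(0)

# Training data: feature vectors (e.g. location / optical-flow features)
# extracted from video of "normal" pedestrian activity.
normal_features = rng.normal(0.0, 1.0, size=(2000, 4))

model = hmm.GaussianHMM(n_components=5, covariance_type="diag", n_iter=50)
model.fit(normal_features)

def is_abnormal(window, threshold=-8.0):
    """Return True if the per-frame log-likelihood of the window
    falls below a (hypothetical) threshold learnt from normal data."""
    loglik = model.score(window) / len(window)
    return loglik < threshold

# Test: a window of normal-looking features vs. a shifted (abnormal) one.
normal_window = rng.normal(0.0, 1.0, size=(30, 4))
abnormal_window = rng.normal(6.0, 1.0, size=(30, 4))
print(is_abnormal(normal_window), is_abnormal(abnormal_window))
```

In practice the synthetic arrays would be replaced by windows of real location/optical-flow features, and the threshold would be chosen from the likelihood distribution of held-out normal footage.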

Relevance:

10.00%

Publisher:

Abstract:

This paper addresses one of the foundational components of beginning inference, namely variation, with five classes of Year 4 students undertaking a measurement activity using scaled instruments in two contexts: all students measuring one person's arm span and recording the values obtained, and each student having his/her own arm span measured and recorded. The results included documentation of students' explicit appreciation of the variety of ways in which variation can occur, including outliers, and their ability to create and describe valid representations of their data.

Relevance:

10.00%

Publisher:

Abstract:

Introduction: This study investigated the sensitivity of calculated stereotactic radiotherapy and radiosurgery doses to the accuracy of the beam data used by the treatment planning system.
Methods: Two sets of field output factors were acquired using fields smaller than approximately 1 cm², for inclusion in the beam data used by the iPlan treatment planning system (Brainlab, Feldkirchen, Germany). One set of output factors was measured using an Exradin A16 ion chamber (Standard Imaging, Middleton, USA). Although this chamber has a relatively small collecting volume (0.007 cm³), measurements made in small fields using this chamber are subject to the effects of volume averaging, electronic disequilibrium and chamber perturbations. The second, more accurate, set of measurements was obtained by applying perturbation correction factors, calculated using Monte Carlo simulations according to a method recommended by Cranmer-Sargison et al. [1], to measurements made using a 60017 unshielded electron diode (PTW, Freiburg, Germany). A series of 12 sample patient treatments was used to investigate the effects of beam data accuracy on the resulting planned dose. These treatments, which involved 135 fields, were planned for delivery via static conformal arcs and 3DCRT techniques, to targets ranging from prostates (up to 8 cm across) to meningiomas (usually more than 2 cm across) to arteriovenous malformations, acoustic neuromas and brain metastases (often less than 2 cm across). Isocentre doses were calculated for all of these fields using iPlan, and the results of using the two different sets of beam data were evaluated.
Results: While the isocentre doses for many fields are identical (difference = 0.0 %), there is a general trend for the doses calculated using the data obtained from corrected diode measurements to exceed the doses calculated using the less-accurate Exradin ion chamber measurements (difference < 0.0 %). There are several alarming outliers (circled in Fig. 1) where doses differ by more than 3 %, in beams from sample treatments planned for volumes up to 2 cm across.
Discussion and conclusions: These results demonstrate that treatment planning dose calculations for SRT/SRS treatments can be substantially affected when beam data for fields smaller than approximately 1 cm² are measured inaccurately, even when treatment volumes are up to 2 cm across.
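The correction step described for the diode measurements amounts to scaling measured reading ratios by Monte Carlo-derived output correction factors. The sketch below illustrates that arithmetic only; every number in it is invented, not the study's data.

```python
# Hypothetical illustration: correcting small-field detector readings
# with Monte Carlo-derived output correction factors (values invented).
reference_reading = 10.00   # detector reading in the reference field (nC)
small_field_readings = {    # detector readings in small fields (nC)
    "0.6 cm": 6.62,
    "0.8 cm": 7.48,
    "1.0 cm": 8.05,
}
correction_factors = {      # hypothetical MC-derived correction factors
    "0.6 cm": 1.012,
    "0.8 cm": 1.006,
    "1.0 cm": 1.003,
}

for field, reading in small_field_readings.items():
    uncorrected = reading / reference_reading
    corrected = uncorrected * correction_factors[field]
    print(f"{field}: uncorrected OF = {uncorrected:.3f}, "
          f"corrected OF = {corrected:.3f}")
```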

Relevance:

10.00%

Publisher:

Abstract:

This study presents a general approach to identifying dominant oscillation modes in bulk power systems using a wide-area measurement system. To identify the dominant modes automatically, without manual intervention, the spectral characteristics of power system oscillation modes are used to distinguish the electromechanical oscillation modes calculated by a stochastic subspace method, and a proposed mode matching pursuit discriminates the dominant modes from the trivial ones; a stepwise-refinement scheme then removes outliers from the dominant-mode estimates, yielding highly accurate identification of the dominant modes. The method is applied to the China Southern Power Grid, one of the largest parallel AC/DC grids in the world. Simulation data and field-measurement data are used to demonstrate the high accuracy and robustness of the dominant-mode identification approach.
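The stepwise-refinement idea of discarding outlying mode estimates can be illustrated with a generic robust filter over repeated frequency estimates (a median/MAD rule, not the paper's exact scheme; the estimates below are synthetic):

```python
# Sketch of a stepwise-refinement step: given repeated frequency estimates
# of one oscillation mode (e.g. from successive measurement windows),
# iteratively discard outliers using a median/MAD rule.
import numpy as np

def refine(estimates, n_sigma=3.0, max_iter=10):
    est = np.asarray(estimates, dtype=float)
    for _ in range(max_iter):
        med = np.median(est)
        mad = np.median(np.abs(est - med))
        if mad == 0:
            break
        keep = np.abs(est - med) <= n_sigma * 1.4826 * mad  # MAD -> sigma scale
        if keep.all():
            break
        est = est[keep]
    return est

# Synthetic frequency estimates (Hz) for one inter-area mode,
# contaminated by two spurious values.
freqs = np.array([0.62, 0.63, 0.61, 0.62, 0.95, 0.63, 0.62, 0.31, 0.62])
refined = refine(freqs)
print(refined.mean())   # robust estimate of the mode frequency
```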

Relevance:

10.00%

Publisher:

Abstract:

This study aims to assess the accuracy of a Digital Elevation Model (DEM) generated using Toutin's model. Toutin's model was run using OrthoEngineSE of PCI Geomatics 10.3. The along-track stereo images of the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) sensor, with 15 m resolution, were used to produce a DEM of an area of low, near-Mean Sea Level (MSL) elevation in Johor, Malaysia. Despite the satisfactory pre-processing results, visual assessment of the DEM generated from Toutin's model showed that it contained many outliers and incorrect values. The failure of Toutin's model may be due mostly to the inaccuracy and insufficiency of ASTER ephemeris data for low terrains, as well as the large water body in the stereo images.

Relevance:

10.00%

Publisher:

Abstract:

High-Order Co-Clustering (HOCC) methods have attracted considerable attention in recent years because of their ability to cluster multiple types of objects simultaneously using all available information. During the clustering process, HOCC methods exploit object co-occurrence information, i.e., inter-type relationships amongst different types of objects, as well as object affinity information, i.e., intra-type relationships amongst objects of the same type. However, it is difficult to learn accurate intra-type relationships in the presence of noise and outliers. Existing HOCC methods consider the p nearest neighbours based on Euclidean distance for the intra-type relationships, which leads to incomplete and inaccurate intra-type relationships. In this paper, we propose a novel HOCC method that incorporates multiple subspace learning with a heterogeneous manifold ensemble to learn complete and accurate intra-type relationships. Multiple subspace learning reconstructs the similarity between any pair of objects that belong to the same subspace. The heterogeneous manifold ensemble is created based on two types of intra-type relationships, learnt using a p-nearest-neighbour graph and multiple subspace learning. Moreover, to ensure the robustness of the clustering process, we introduce a sparse error matrix into the matrix decomposition and develop a novel iterative algorithm. Empirical experiments show that the proposed method achieves improved results over state-of-the-art HOCC methods in terms of FScore and NMI.

Relevance:

10.00%

Publisher:

Abstract:

A long-held assumption in entrepreneurship research is that normal (i.e., Gaussian) distributions characterize variables of interest for both theory and practice. We challenge this assumption by examining more than 12,000 nascent, young, and hyper-growth firms. Results reveal that variables which play central roles in resource-, cognition-, action-, and environment-based entrepreneurship theories exhibit highly skewed power law distributions, where a few outliers account for a disproportionate amount of the distribution's total output. Our results call for the development of new theory to explain and predict the mechanisms that generate these distributions and the outliers therein. We offer a research agenda, including a description of non-traditional methodological approaches, to answer this call.
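For readers wanting to probe their own data for such heavy tails, the sketch below fits a power-law tail exponent by maximum likelihood (the standard Clauset-style estimator) and reports how much of the total the top 1% accounts for; it uses synthetic data and is not the authors' analysis.

```python
# Sketch: estimate a power-law tail exponent by maximum likelihood and
# show how a few outliers dominate the total. Synthetic data only.
import numpy as np

rng = np.random.default_rng(1)

# Draw synthetic "firm outcomes" from a Pareto (power-law) distribution.
alpha_true, x_min = 2.1, 1.0
outcomes = x_min * (1.0 - rng.random(12_000)) ** (-1.0 / (alpha_true - 1.0))

# Maximum-likelihood estimate of the tail exponent above x_min
# (continuous case): alpha_hat = 1 + n / sum(log(x / x_min)).
tail = outcomes[outcomes >= x_min]
alpha_hat = 1.0 + len(tail) / np.sum(np.log(tail / x_min))

# Share of the total "output" accounted for by the top 1% of firms.
top_1pct = np.sort(outcomes)[-len(outcomes) // 100:]
share = top_1pct.sum() / outcomes.sum()

print(f"estimated alpha = {alpha_hat:.2f}")
print(f"share of total held by the top 1% of firms = {share:.1%}")
```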

Relevance:

10.00%

Publisher:

Abstract:

The purpose of this article is to assess the viability of blanket sustainability policies, such as Building Rating Systems, in achieving energy efficiency in university campus buildings. We analyzed the energy consumption trends of 10 LEED-certified buildings and 14 non-LEED-certified buildings at a major university in the US. The mean Energy Use Intensity (EUI) of the LEED buildings was significantly higher (EUI_LEED = 331.20 kBtu/sf/yr) than that of the non-LEED buildings (EUI_non-LEED = 222.70 kBtu/sf/yr); however, the median EUI values were comparable (EUI_LEED = 172.64 and EUI_non-LEED = 178.16). Because the distributions of EUI values in this dataset were non-symmetrical, both measures should be considered in energy comparisons; this was also evident when the EUI computations excluded outliers (EUI_LEED = 171.82 and EUI_non-LEED = 195.41). Additional analyses were conducted to further explore the impact of LEED certification on the energy performance of university campus buildings. No statistically significant differences were observed between certified and non-certified buildings across a range of robust comparison criteria. These findings were then leveraged to devise strategies for sustainable energy policies for university campus buildings and to identify potential issues with portfolio-level building energy performance comparisons.
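A toy illustration of why the mean and median EUI diverge for skewed building portfolios (invented numbers, not the study's building data):

```python
# Generic illustration: for right-skewed EUI data, a few energy-intensive
# buildings (e.g. labs) pull the mean far above the median.
# All values below are invented, not the study's measurements.
import statistics

eui = [120, 135, 150, 160, 170, 175, 180, 190, 210, 950]  # kBtu/sf/yr

print("mean  :", round(statistics.mean(eui), 1))    # dominated by the outlier
print("median:", round(statistics.median(eui), 1))  # robust to the outlier

eui_no_outliers = [x for x in eui if x < 500]        # crude outlier exclusion
print("mean without outlier:", round(statistics.mean(eui_no_outliers), 1))
```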

Relevance:

10.00%

Publisher:

Abstract:

Background: There is a need for a better understanding of the dispersion of classification-related variables in order to develop an evidence-based classification for athletes with a disability participating in stationary throwing events.
Objectives: The purposes of this study are (A) to describe tools designed to comprehend and represent the dispersion of performance between successive classes, and (B) to present this dispersion for the elite male and female stationary shot-putters who participated in the Beijing 2008 Paralympic Games.
Study design: Retrospective study.
Methods: This study analysed a total of 479 attempts performed by 114 male and female stationary shot-putters in three F30s (F32-F34) and six F50s (F52-F58) classes during eight events at the Beijing 2008 Paralympic Games.
Results: The average differences in best performance were 1.46 ± 0.46 m for males between the F54 and F58 classes and 1.06 ± 1.18 m for females between the F55 and F58 classes. The results demonstrated a linear relationship between best performance and classification, while revealing that the two male Gold Medallists in the F33 and F52 classes were outliers.
Conclusions: This study confirms the benefits of comparative matrices, the performance continuum and dispersion plots for comprehending classification-related variables. The work presented here represents a stepping stone towards biomechanical analyses of stationary throwers, particularly on the eve of the London 2012 Paralympic Games, where new evidence could be gathered.

Relevance:

10.00%

Publisher:

Abstract:

Some statistical procedures already available in the literature are employed in developing the water quality index, WQI. The complexity and interdependency that occur in the physical and chemical processes of water could be explained more easily if statistical approaches were applied to water quality indexing. The most popular statistical method used in developing a WQI is principal component analysis (PCA). In the literature, WQI development based on classical PCA has mostly used water quality data that have been transformed and normalized. Outliers may be retained in or eliminated from the analysis. However, the classical mean and sample covariance matrix used in the classical PCA methodology are not reliable if outliers exist in the data. Since the presence of outliers may affect the computation of the principal components, robust principal component analysis (RPCA) should be used. Focusing on the Langat River, the RPCA-WQI was introduced for the first time in this study to re-calculate the DOE-WQI. Results show that the RPCA-WQI is able to capture a distribution similar to that of the existing DOE-WQI.
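One common way to make PCA robust is to replace the classical covariance with a Minimum Covariance Determinant (MCD) estimate before extracting components. The sketch below contrasts the two on synthetic data; it is a generic illustration, not the authors' RPCA-WQI procedure or the Langat River data.

```python
# Sketch: principal components from the classical covariance vs. a robust
# (Minimum Covariance Determinant) covariance, on synthetic data with outliers.
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(2)

# Synthetic "water quality" measurements: correlated clean data plus outliers.
cov = [[1.0, 0.8, 0.6], [0.8, 1.0, 0.7], [0.6, 0.7, 1.0]]
clean = rng.multivariate_normal([0, 0, 0], cov, 300)
outliers = rng.uniform(5, 10, size=(15, 3))
X = np.vstack([clean, outliers])

classical_cov = np.cov(X, rowvar=False)
robust_cov = MinCovDet(random_state=0).fit(X).covariance_

def first_pc(c):
    """Leading principal component (eigenvector of the largest eigenvalue)."""
    vals, vecs = np.linalg.eigh(c)
    return vecs[:, -1]

print("classical PC1:", np.round(first_pc(classical_cov), 2))
print("robust    PC1:", np.round(first_pc(robust_cov), 2))
```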

Relevance:

10.00%

Publisher:

Abstract:

Ordinal qualitative data are often collected for phenotypical measurements in plant pathology and other biological sciences. Statistical methods such as t tests or analysis of variance are usually used to analyze ordinal data when comparing two or more groups. However, the underlying assumptions, such as normality and homogeneous variances, are often violated for qualitative data. We therefore investigated an alternative methodology, rank regression, for analyzing ordinal data. Rank-based methods are essentially based on pairwise comparisons and can therefore deal with qualitative data naturally; they require neither a normality assumption nor data transformation. Apart from robustness against outliers and high efficiency, rank regression can also incorporate covariate effects in the same way as ordinary regression. By reanalyzing a data set from a wheat Fusarium crown rot study, we illustrated the use of the rank regression methodology and demonstrated that rank regression models appear to be more appropriate and sensible for analyzing nonnormal data and data with outliers.
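A minimal sketch of rank (Wilcoxon) regression via Jaeckel's dispersion criterion, one standard formulation of rank regression (generic synthetic data, not the wheat Fusarium dataset):

```python
# Sketch: rank (Wilcoxon) regression by minimising Jaeckel's dispersion.
# The dispersion identifies only the slope; the intercept is taken as the
# median of the residuals. Generic illustration on synthetic data.
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import rankdata

def jaeckel_dispersion(slope, x, y):
    """Wilcoxon-score dispersion of the residuals y - slope * x."""
    resid = y - slope * x
    n = len(resid)
    scores = np.sqrt(12.0) * (rankdata(resid) / (n + 1.0) - 0.5)
    return np.sum(scores * resid)

rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.standard_t(df=3, size=n)  # heavy-tailed errors
y[:5] += 25.0                                     # a few gross outliers

slope = minimize_scalar(jaeckel_dispersion, args=(x, y),
                        bounds=(-10, 10), method="bounded").x
intercept = np.median(y - slope * x)

ols_slope = np.polyfit(x, y, 1)[0]
print(f"rank-regression fit: y = {intercept:.2f} + {slope:.2f} x")
print(f"least-squares slope: {ols_slope:.2f}")
```

The least-squares slope is dragged toward the contaminated points, while the rank-based slope stays close to the value used to generate the clean data.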

Relevance:

10.00%

Publisher:

Abstract:

Rank-based inference is widely used because of its robustness. This article provides optimal rank-based estimating functions for the analysis of clustered data with random cluster effects. Extensive simulation studies carried out to evaluate the performance of the proposed method demonstrate that it is robust to outliers and highly efficient in the presence of strong cluster correlations. The performance of the proposed method is satisfactory even when the correlation structure is misspecified or when heteroscedasticity is present. Finally, a real dataset is analyzed for illustration.

Relevance:

10.00%

Publisher:

Abstract:

With a growing population and fast urbanization in Australia, it is a challenging task to maintain our water quality. It is essential to develop an appropriate statistical methodology for analyzing water quality data in order to draw valid conclusions and hence provide useful advice on water management. This paper develops robust rank-based procedures for analyzing nonnormally distributed data collected over time at different sites. To take account of the temporal correlations of the observations within sites, we consider the optimally combined estimating functions proposed by Wang and Zhu (Biometrika, 93:459-464, 2006), which lead to more efficient parameter estimation. Furthermore, we apply the induced smoothing method to reduce the computational burden. Smoothing leads to easy calculation of the parameter estimates and their variance-covariance matrix. Analysis of water quality data for Total Iron and Total Cyanophytes shows the differences between traditional generalized linear mixed models and rank regression models. Our analysis also demonstrates the advantages of rank regression models for analyzing nonnormal data.

Relevance:

10.00%

Publisher:

Abstract:

Power calculation and sample size determination are critical in designing environmental monitoring programs. The traditional approach, based on comparing mean values, may become statistically inappropriate and even invalid when substantial proportions of the response values are below the detection limits or censored, because strong distributional assumptions have to be made about the censored observations when implementing the traditional procedures. In this paper, we propose a quantile methodology that is robust to outliers and can also handle data with a substantial proportion of below-detection-limit observations without the need to impute the censored values. As a demonstration, we applied the methods to a nutrient monitoring project, which is part of the Perth Long-Term Ocean Outlet Monitoring Program. In this example, the sample size required by our quantile methodology is, in fact, smaller than that required by the traditional t-test, illustrating the merit of our method.
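The key point, that an upper quantile can be compared without imputing below-detection-limit values as long as the censored fraction sits below that quantile, can be sketched with a simple bootstrap on synthetic data (not the Perth monitoring data or the paper's power calculation):

```python
# Sketch: comparing the 80th percentile of two monitoring datasets that
# contain below-detection-limit (BDL) values. Because the censored fraction
# here is well below 80%, the placeholder assigned to BDL values cannot
# change the 80th percentile. Synthetic data only.
import numpy as np

rng = np.random.default_rng(4)
detection_limit = 0.05

def censor(x):
    # Any placeholder below the detection limit would give the same quantile.
    return np.where(x < detection_limit, detection_limit / 2, x)

site_a = censor(rng.lognormal(mean=-3.0, sigma=1.0, size=120))
site_b = censor(rng.lognormal(mean=-2.6, sigma=1.0, size=120))

obs_diff = np.quantile(site_b, 0.8) - np.quantile(site_a, 0.8)

# Bootstrap confidence interval for the difference in 80th percentiles.
boot = []
for _ in range(2000):
    a = rng.choice(site_a, size=len(site_a), replace=True)
    b = rng.choice(site_b, size=len(site_b), replace=True)
    boot.append(np.quantile(b, 0.8) - np.quantile(a, 0.8))
ci_lo, ci_hi = np.percentile(boot, [2.5, 97.5])

print(f"difference in 80th percentiles: {obs_diff:.3f}  "
      f"(95% CI {ci_lo:.3f}, {ci_hi:.3f})")
```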

Relevance:

10.00%

Publisher:

Abstract:

We consider rank-based regression models for clustered data analysis. A weighted Wilcoxon rank method is proposed to take account of within-cluster correlations and varying cluster sizes. The asymptotic normality of the resulting estimators is established. A method to estimate the covariance of the estimators is also given, which bypasses estimation of the density function. Simulation studies are carried out to compare different estimators under a number of scenarios for the correlation structure, the presence or absence of outliers, and different correlation values. The proposed methods appear to perform well; in particular, the one incorporating the correlation in the weighting achieves the highest efficiency and robustness against misspecification of the correlation structure and against outliers. A real example is provided for illustration.