966 results for Crowd density estimation
Abstract:
This paper describes the crowd image analysis challenge that forms part of the PETS 2009 workshop. The aim of this challenge is to use new or existing systems for i) crowd count and density estimation, ii) tracking of individual(s) within a crowd, and iii) detection of separate flows and specific crowd events, in a real-world environment. The dataset scenarios were filmed from multiple cameras and involve multiple actors.
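For reference, a common baseline for the density estimation part of this task is to convolve annotated head positions with a Gaussian kernel, so that the resulting map sums to the crowd count. A minimal sketch, assuming synthetic point annotations; the coordinates and kernel width below are illustrative, not taken from the PETS 2009 dataset:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map(head_points, shape, sigma=4.0):
    """Build a crowd density map by placing a unit impulse at each
    annotated head position and smoothing with a Gaussian kernel.
    The map sums (approximately) to the crowd count."""
    dm = np.zeros(shape, dtype=np.float64)
    for x, y in head_points:
        dm[int(y), int(x)] += 1.0
    return gaussian_filter(dm, sigma=sigma)

# Toy example: 5 annotated heads in a 120x160 frame.
points = [(20, 30), (25, 32), (80, 60), (100, 90), (140, 100)]
dm = density_map(points, shape=(120, 160))
print(f"estimated count: {dm.sum():.2f}")  # ~5.0
```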
Abstract:
This paper presents an efficient construction algorithm for obtaining sparse kernel density estimates based on a regression approach that directly optimizes model generalization capability. Computational efficiency of the density construction is ensured using orthogonal forward regression, and the algorithm incrementally minimizes the leave-one-out test score. A local regularization method is incorporated naturally into the density construction process to further enforce sparsity. An additional advantage of the proposed algorithm is that it is fully automatic: the user is not required to specify any criterion to terminate the density construction procedure. This is in contrast to an existing state-of-the-art kernel density estimation method based on the support vector machine (SVM), where the user must specify some critical algorithm parameters. Several examples demonstrate the ability of the proposed algorithm to construct a very sparse kernel density estimate with accuracy comparable to that of the full-sample optimized Parzen window density estimate. Our experimental results also show that the proposed algorithm compares favorably with the SVM method, in terms of both test accuracy and sparsity, for constructing kernel density estimates.
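The paper's orthogonal forward regression with leave-one-out scoring is not reproduced here, but the underlying idea of building a sparse kernel density estimate by regression can be sketched with a simplified greedy forward selection against a full Parzen window target; the data, bandwidth, and number of selected centres below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 0.5, 150), rng.normal(1, 1.0, 150)])
n, h = len(x), 0.3  # kernel width (fixed here; the paper tunes such choices)

def gauss(u, c, h):
    # n x m matrix of Gaussian kernels centred at c, evaluated at u
    return np.exp(-0.5 * ((u[:, None] - c[None, :]) / h) ** 2) / (h * np.sqrt(2 * np.pi))

K = gauss(x, x, h)               # n x n kernel design matrix
target = K.mean(axis=1)          # full Parzen estimate at the data points

selected, residual = [], target.copy()
for _ in range(8):               # greedily pick a small number of centres
    scores = np.abs(K.T @ residual)
    scores[selected] = -np.inf   # never re-select a centre
    j = int(np.argmax(scores))
    selected.append(j)
    w, *_ = np.linalg.lstsq(K[:, selected], target, rcond=None)
    residual = target - K[:, selected] @ w

w = np.clip(w, 0, None); w /= w.sum()   # enforce a valid mixture density
print("centres:", np.round(x[selected], 2), "weights:", np.round(w, 3))
```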
Abstract:
Mixtures of polynomials (MoPs) are a non-parametric density estimation technique especially designed for hybrid Bayesian networks with continuous and discrete variables. Algorithms to learn one- and multi-dimensional (marginal) MoPs from data have recently been proposed. In this paper we introduce two methods for learning MoP approximations of conditional densities from data. Both approaches are based on learning MoP approximations of the joint density and the marginal density of the conditioning variables, but they differ as to how the MoP approximation of the quotient of the two densities is found. We illustrate and study the methods using data sampled from known parametric distributions, and we demonstrate their applicability by learning models based on real neuroscience data. Finally, we compare the performance of the proposed methods with an approach for learning mixtures of truncated basis functions (MoTBFs). The empirical results show that the proposed methods generally yield models that are comparable to or significantly better than those found using the MoTBF-based method.
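The quotient construction the paper describes, f(y|x) = f(x,y) / f(x), can be illustrated with Gaussian KDEs standing in for the MoP approximations of the joint and marginal densities; a sketch on synthetic data, where MoPs would replace the KDE calls with piecewise-polynomial fits:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
x = rng.normal(0, 1, 1000)
y = 2 * x + rng.normal(0, 0.5, 1000)     # known linear relation

joint = gaussian_kde(np.vstack([x, y]))  # estimate of f(x, y)
marg = gaussian_kde(x)                   # estimate of f(x)

def conditional(y_grid, x0):
    """f(y | x0) approximated as f(x0, y) / f(x0)."""
    pts = np.vstack([np.full_like(y_grid, x0), y_grid])
    return joint(pts) / marg(x0)

y_grid = np.linspace(-6, 6, 200)
dens = conditional(y_grid, x0=1.0)
print("mode of f(y | x=1):", y_grid[np.argmax(dens)])  # near 2.0
```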
Abstract:
The identification of disease clusters in space or space-time is of vital importance for public health policy and action. In the case of methicillin-resistant Staphylococcus aureus (MRSA), it is particularly important to distinguish between community and health care-associated infections, and to identify reservoirs of infection. A total of 832 cases of MRSA in the West Midlands (UK) were tested for clustering and evidence of community transmission, after being geo-located to the centroids of UK unit postcodes (postal areas roughly equivalent to ZIP+4 code areas). An age-stratified analysis was also carried out at the coarser spatial resolution of UK Census Output Areas. Stochastic simulation and kernel density estimation were combined to identify significant local clusters of MRSA (p < 0.025), which were supported by SaTScan spatial and spatio-temporal scans. To investigate local sampling effort, a spatial 'random labelling' approach was used, with MRSA as cases and MSSA (methicillin-sensitive S. aureus) as controls. Heavy sampling was generally a response to MRSA outbreaks, which in turn appeared to be associated with medical care environments. The significance of clusters identified by kernel estimation was independently supported by information on the locations and client groups of nursing homes, and by preliminary molecular typing of isolates. In the absence of occupational/lifestyle data on patients, the assumption was made that an individual's location, and consequent risk, is adequately represented by their residential postcode. The problems of this assumption are discussed, with recommendations for future data collection.
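The combination of kernel density estimation with stochastic simulation under random labelling can be sketched as follows. The data are synthetic stand-ins for the MRSA/MSSA locations, and the pointwise Monte Carlo p-values are a simplification of the cluster tests actually used:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(2)
cases = rng.normal([0, 0], 0.5, size=(80, 2))      # MRSA-like cluster (synthetic)
controls = rng.uniform(-3, 3, size=(400, 2))       # MSSA-like background (synthetic)

def risk_surface(cases, controls, grid):
    """Log ratio of case to control kernel density estimates."""
    return np.log(gaussian_kde(cases.T)(grid) / gaussian_kde(controls.T)(grid))

# Random labelling: reassign case/control labels to the pooled locations
# and rebuild the surface to get a null distribution at each grid point.
pooled = np.vstack([cases, controls])
gx, gy = np.meshgrid(np.linspace(-3, 3, 40), np.linspace(-3, 3, 40))
grid = np.vstack([gx.ravel(), gy.ravel()])

observed = risk_surface(cases, controls, grid)
null = np.array([
    risk_surface(p[:len(cases)], p[len(cases):], grid)
    for p in (rng.permutation(pooled) for _ in range(99))
])
pvals = (null >= observed).mean(axis=0)            # pointwise Monte Carlo p-values
print("grid cells with p < 0.025:", int((pvals < 0.025).sum()))
```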
Abstract:
Quantile regression (QR) was first introduced by Roger Koenker and Gilbert Bassett in 1978. It is robust to outliers, which can heavily affect the least squares estimator in linear regression. Instead of modeling the mean of the response, QR provides an alternative way to model the relationship between quantiles of the response and covariates. QR can therefore be widely used to solve problems in econometrics, environmental sciences and health sciences. Sample size is an important factor in the planning stage of experimental designs and observational studies. In ordinary linear regression, sample size may be determined based on either precision analysis or power analysis with closed-form formulas. There are also methods that calculate sample size for QR based on precision analysis, such as C. Jennen-Steinmetz and S. Wellek (2005). A method to estimate sample size for QR based on power analysis was proposed by Shao and Wang (2009). In this paper, a new method is proposed to calculate sample size based on power analysis under a hypothesis test of covariate effects. Even though an error distribution assumption is not necessary for QR analysis itself, researchers have to make assumptions about the error distribution and covariate structure in the planning stage of a study to obtain a reasonable estimate of the sample size. In this project, both parametric and nonparametric methods are provided to estimate the error distribution. Since the proposed method is implemented in R, the user can choose either a parametric distribution or nonparametric kernel density estimation for the error distribution. The user also needs to specify the covariate structure and effect size to carry out the sample size and power calculation. The performance of the proposed method is further evaluated using numerical simulation. The results suggest that the sample sizes obtained from our method provide empirical powers that are close to the nominal power level, for example, 80%.
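A simulation-based version of this kind of power calculation can be sketched in Python (the paper's method is implemented in R; the t-distributed errors, effect size, and grid of sample sizes below are illustrative assumptions, not the paper's procedure):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.regression.quantile_regression import QuantReg

def qr_power(n, beta=0.5, q=0.5, alpha=0.05, n_sim=200, seed=0):
    """Empirical power of the slope test in median regression,
    estimated by simulating data sets of size n."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sim):
        x = rng.normal(0, 1, n)
        y = beta * x + rng.standard_t(3, n)     # assumed error distribution
        res = QuantReg(y, sm.add_constant(x)).fit(q=q)
        rejections += res.pvalues[1] < alpha    # test of the covariate effect
    return rejections / n_sim

# Search a coarse grid for the smallest n reaching 80% power.
for n in range(40, 201, 20):
    p = qr_power(n)
    print(n, round(p, 2))
    if p >= 0.80:
        break
```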
Abstract:
This thesis investigates profiling and differentiating customers through the use of statistical data mining techniques. The business application of our work centres on examining individuals' seldom studied yet critical consumption behaviour over an extensive time period within the context of the wireless telecommunication industry; consumption behaviour (as opposed to purchasing behaviour) is behaviour that has been performed so frequently that it becomes habitual and involves minimal intention or decision making. Key variables investigated are the activity initialisation timestamp and cell tower location, as well as the activity type and usage quantity (e.g., voice call with duration in seconds); the research focuses on customers' spatial and temporal usage behaviour. The main methodological emphasis is on the development of clustering models based on Gaussian mixture models (GMMs), which are fitted using the recently developed variational Bayesian (VB) method. VB is an efficient deterministic alternative to the popular but computationally demanding Markov chain Monte Carlo (MCMC) methods. The standard VB-GMM algorithm is extended by allowing component splitting, so that it is robust to initial parameter choices and can automatically and efficiently determine the number of components. The new algorithm we propose allows more effective modelling of individuals' highly heterogeneous and spiky spatial usage behaviour, or more generally human mobility patterns; the term spiky describes data patterns with large areas of low probability mixed with small areas of high probability. Customers are then characterised and segmented based on the fitted GMM, which corresponds to how each of them uses the products/services spatially in their daily lives; this essentially captures their likely lifestyle and occupational traits. Other significant research contributions include fitting GMMs using VB to circular data, i.e., the temporal usage behaviour, and developing clustering algorithms suitable for high-dimensional data based on the use of VB-GMM.
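The thesis's split-based VB-GMM algorithm cannot be reconstructed from the abstract alone, but variational Bayesian fitting of a GMM that automatically suppresses unused components can be illustrated with scikit-learn's BayesianGaussianMixture on synthetic "spiky" spatial data (the hotspot locations and priors below are assumptions):

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(3)
# Synthetic "spiky" spatial usage: a few tight hotspots plus diffuse noise.
hotspots = np.vstack([rng.normal(c, 0.05, size=(100, 2))
                      for c in ([0, 0], [1, 2], [-2, 1])])
noise = rng.uniform(-3, 3, size=(60, 2))
X = np.vstack([hotspots, noise])

# Variational Bayes GMM: start with more components than needed and let
# the concentration prior shrink the weights of unused ones.
vb = BayesianGaussianMixture(n_components=10, weight_concentration_prior=0.01,
                             covariance_type="full", max_iter=500, random_state=0)
vb.fit(X)
active = vb.weights_ > 0.02
print(f"effective components: {active.sum()}")
print(np.round(vb.means_[active], 2))
```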
Abstract:
This poster summarises the current findings from STRC's Integrated Traveller Information research domain, which aims for accurate and reliable travel time prediction and optimisation of multimodal trips. The three selected discussions are:
a) Fundamental understanding of the use of Bluetooth MAC Scanner (BMS) data for travel time estimation (see the sketch below)
b) Integration of multiple sources (loops and Bluetooth) for travel time and density estimation
c) Architecture for an online and predictive multimodal trip planner
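A toy illustration of point a): travel time estimation from Bluetooth MAC detections amounts to matching MAC addresses seen at two scanners and differencing the timestamps. The MACs, times, and the 30-minute matching window below are hypothetical:

```python
from datetime import datetime, timedelta

# Toy detection logs: MAC -> timestamp at two Bluetooth scanners (hypothetical).
upstream = {"aa:01": datetime(2024, 1, 1, 8, 0, 5),
            "bb:02": datetime(2024, 1, 1, 8, 1, 10),
            "cc:03": datetime(2024, 1, 1, 8, 2, 0)}
downstream = {"aa:01": datetime(2024, 1, 1, 8, 4, 35),
              "cc:03": datetime(2024, 1, 1, 8, 7, 30)}

# Match MACs seen at both scanners; the timestamp difference is that
# device's travel time over the link between the two sites.
travel_times = [(mac, (downstream[mac] - upstream[mac]).total_seconds())
                for mac in upstream.keys() & downstream.keys()
                if timedelta(0) < downstream[mac] - upstream[mac] < timedelta(minutes=30)]

for mac, tt in sorted(travel_times):
    print(mac, f"{tt:.0f} s")
mean_tt = sum(tt for _, tt in travel_times) / len(travel_times)
print(f"link mean travel time: {mean_tt:.0f} s")
```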
Abstract:
The Macroscopic Fundamental Diagram (MFD) relates space-mean density and flow; its existence, together with its dynamic features, was confirmed in a congested urban network using real data sets from loop detectors and taxi probes. Since the MFD represents area-wide network traffic performance, it provides a foundation for perimeter control strategies and for area traffic state estimation enabling area-based network control. However, few real-world examples from signalised arterial networks have been reported. This paper fuses data from multiple sources (Bluetooth, loops and signals) and develops a framework for constructing the MFD for Brisbane. The existence of the MFD in the Brisbane network is confirmed. Different MFDs (from the whole network and several sub-regions) are evaluated to investigate spatial partitioning in network performance representation.
Abstract:
The Macroscopic Fundamental Diagram (MFD) relates space-mean density and flow; its existence, together with its dynamic features, was confirmed with a real data set from a congested urban network in downtown Yokohama. Since the MFD represents area-wide network traffic performance, studies on perimeter control strategies and area traffic state estimation utilizing the MFD concept have been reported. However, few real-world examples from signalised arterial networks have been reported. This paper fuses data from multiple sources (Bluetooth, loops and signals) and develops a framework for constructing the MFD for Brisbane, Australia. The existence of the MFD in the Brisbane arterial network is confirmed. Different MFDs (from the whole network and several sub-regions) are evaluated to investigate spatial partitioning in network performance representation. The findings confirm the usefulness of appropriate network partitioning for traffic monitoring and incident detection. The discussion addresses future research directions.
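One way to assemble an MFD from fused per-link measurements is to take length-weighted (space-mean) averages of link density and flow in each time interval, giving one diagram point per interval. A sketch on synthetic data standing in for the fused Bluetooth/loop/signal measurements:

```python
import numpy as np

# Synthetic per-link records: for each 5-min interval, each link reports
# flow (veh/h) and density (veh/km); link lengths (km) are assumed known.
rng = np.random.default_rng(4)
n_links, n_intervals = 50, 100
lengths = rng.uniform(0.2, 1.0, n_links)
density = rng.uniform(0, 60, (n_intervals, n_links))
# A concave flow-density relation per link, plus noise, mimics congestion.
flow = 1800 * (density / 60) * (1 - density / 120) + rng.normal(0, 30, density.shape)

# Length-weighted network (space-mean) averages: one MFD point per interval.
w = lengths / lengths.sum()
net_density = density @ w          # veh/km, space-mean over the network
net_flow = flow @ w                # veh/h, space-mean over the network

order = np.argsort(net_density)
print("network density -> flow (sampled):")
for i in order[::20]:
    print(f"{net_density[i]:6.1f} veh/km  {net_flow[i]:7.1f} veh/h")
```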
Abstract:
An important aspect of decision support systems involves applying sophisticated and flexible statistical models to real datasets and communicating the results to decision makers in interpretable ways. An important class of problem is the modelling of incidence, such as fire or disease. Models of incidence known as point processes or Cox processes are particularly challenging because they are 'doubly stochastic', i.e., obtaining the probability mass function of incidents requires two integrals to be evaluated. Existing approaches either use simple models that obtain predictions from plug-in point estimates and do not distinguish between Cox processes and density estimation, but do use sophisticated 3D visualization for interpretation; or they employ sophisticated non-parametric Bayesian Cox process models but do not use visualization to render interpretable, complex spatio-temporal forecasts. The contribution here is to fill this gap by inferring predictive distributions of log-Gaussian Cox processes and rendering them using state-of-the-art 3D visualization techniques. This requires performing inference on an approximation of the model on a large-scale discretized grid and adapting an existing spatial-diurnal kernel to the log-Gaussian Cox process context.
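The 'doubly stochastic' structure on a discretized grid can be made concrete by forward simulation: draw a latent Gaussian process, exponentiate it to get an intensity, then draw Poisson counts per cell. A 1D sketch with assumed hyperparameters; the paper's contribution is inference and visualization, not simulation:

```python
import numpy as np

rng = np.random.default_rng(5)

# Discretized 1D spatial grid; LGCP: intensity = exp(Gaussian process).
n = 100
s = np.linspace(0, 10, n)
cell = s[1] - s[0]

# Squared-exponential covariance for the latent GP (assumed hyperparameters).
C = 1.0**2 * np.exp(-0.5 * (s[:, None] - s[None, :])**2 / 1.5**2)
L = np.linalg.cholesky(C + 1e-8 * np.eye(n))   # jitter for numerical stability
gp = 1.0 + L @ rng.standard_normal(n)          # latent log-intensity, mean 1

# Doubly stochastic: Poisson counts per cell given the realized intensity.
intensity = np.exp(gp)
counts = rng.poisson(intensity * cell)

print("total events:", counts.sum())
print("locations of highest intensity:", s[np.argsort(intensity)[-3:]].round(2))
```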
Abstract:
This thesis explored traffic characteristics at the aggregate level for area-wide traffic monitoring of a large urban area. It focused on three aspects: understanding macroscopic network performance under real-time traffic information provision, measuring the traffic performance of a signalised arterial network using available data sets, and discussing network zoning for monitoring purposes in the case of Brisbane, Australia. The work presented the use of probe vehicle data for estimating traffic state variables and illustrated the dynamic features of regional traffic performance in Brisbane. The results confirmed the viability and effectiveness of area-wide traffic monitoring.