958 resultados para SIFT background model


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Several pixel-based people counting methods have been developed over the years. Among these the product of scale-weighted pixel sums and a linear correlation coefficient is a popular people counting approach. However most approaches have paid little attention to resolving the true background and instead take all foreground pixels into account. With large crowds moving at varying speeds and with the presence of other moving objects such as vehicles this approach is prone to problems. In this paper we present a method which concentrates on determining the true-foreground, i.e. human-image pixels only. To do this we have proposed, implemented and comparatively evaluated a human detection layer to make people counting more robust in the presence of noise and lack of empty background sequences. We show the effect of combining human detection with a pixel-map based algorithm to i) count only human-classified pixels and ii) prevent foreground pixels belonging to humans from being absorbed into the background model. We evaluate the performance of this approach on the PETS 2009 dataset using various configurations of the proposed methods. Our evaluation demonstrates that the basic benchmark method we implemented can achieve an accuracy of up to 87% on sequence ¿S1.L1 13-57 View 001¿ and our proposed approach can achieve up to 82% on sequence ¿S1.L3 14-33 View 001¿ where the crowd stops and the benchmark accuracy falls to 64%.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we present a novel person detection system for public transport buses tackling the problem of changing illumination conditions. Our approach integrates a stable SIFT (Scale Invariant Feature Transform) background seat modeling mechanism with a human shape model into a weighted Bayesian framework to detect passengers on-board buses. SIFT background modeling extracts local stable features on the pre-annotated background seat areas and tracks these features over time to build a global statistical background model for each seat. Since SIFT features are partially invariant to lighting, this background model can be used robustly to detect the seat occupancy status even under severe lighting changes. The human shape model further confirms the existence of a passenger when a seat is occupied. This constructs a robust passenger monitoring system which is resilient to illumination changes. We evaluate the performance of our proposed system on a number of challenging video datasets obtained from bus cameras and the experimental results show that it is superior to state-of-art people detection systems.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In the measurement of the Higgs Boson decaying into two photons the parametrization of an appropriate background model is essential for fitting the Higgs signal mass peak over a continuous background. This diphoton background modeling is crucial in the statistical process of calculating exclusion limits and the significance of observations in comparison to a background-only hypothesis. It is therefore ideal to obtain knowledge of the physical shape for the background mass distribution as the use of an improper function can lead to biases in the observed limits. Using an Information-Theoretic (I-T) approach for valid inference we apply Akaike Information Criterion (AIC) as a measure of the separation for a fitting model from the data. We then implement a multi-model inference ranking method to build a fit-model that closest represents the Standard Model background in 2013 diphoton data recorded by the Compact Muon Solenoid (CMS) experiment at the Large Hadron Collider (LHC). Potential applications and extensions of this model-selection technique are discussed with reference to CMS detector performance measurements as well as in potential physics analyses at future detectors.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper proposes an optimisation of the adaptive Gaussian mixture background model that allows the deployment of the method on processors with low memory capacity. The effect of the granularity of the Gaussian mean-value and variance in an integer-based implementation is investigated and novel updating rules of the mixture weights are described. Based on the proposed framework, an implementation for a very low power consumption micro-controller is presented. Results show that the proposed method operates in real time on the micro-controller and has similar performance to the original model. © 2012 Springer-Verlag.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

We present a method for foreground/background separation of audio using a background modelling technique. The technique models the background in an online, unsupervised, and adaptive fashion, and is designed for application to long term surveillance and monitoring problems. The background is determined using a statistical method to model the states of the audio over time. In addition, three methods are used to increase the accuracy of background modelling in complex audio environments. Such environments can cause the failure of the statistical model to accurately capture the background states. An entropy-based approach is used to unify background representations fragmented over multiple states of the statistical model. The approach successfully unifies such background states, resulting in a more robust background model. We adaptively adjust the number of states considered background according to background complexity, resulting in the more accurate classification of background models. Finally, we use an auxiliary model cache to retain potential background states in the system. This prevents the deletion of such states due to a rapid influx of observed states that can occur for highly dynamic sections of the audio signal. The separation algorithm was successfully applied to a number of audio environments representing monitoring applications.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this paper, we present a system for pedestrian detection involving scenes captured by mobile bus surveillance cameras in busy city streets. Our approach integrates scene localization, foreground and background separation, and pedestrian detection modules into a unified detection framework. The scene localization module performs a two stage clustering of the video data. In the first stage, SIFT Homography is applied to cluster frames in terms of their structural similarities and second stage further clusters these aligned frames in terms of lighting. This produces clusters of images which are differential in viewpoint and lighting. A kernel density estimation (KDE) method for colour and gradient foreground-background separation are then used to construct background model for each image cluster which is subsequently used to detect all foreground pixels. Finally, using a hierarchical template matching approach, pedestrians can be identified. We have tested our system on a set of real bus video datasets and the experimental results verify that our system works well in practice.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In this paper we extend an existing audio background modelling technique, leading to a more robust application to complex audio environments. The determination of background audio is used as an initial stage in the analysis of audio for surveillance and monitoring applications. Knowledge of the background serves to highlight unusual or infrequent sounds. An existing modelling approach uses an online, adaptive Gaussian Mixture model technique that uses multiple distributions to model variations in the background. The method used to determine the background distributions of the GMM leads to a failure mode of the existing technique when applied to complex audio. We propose a method incorporating further information, the proximity of distributions determined using entropy, to determine a more complete background model. The method was successful in more robustly modelling the background for complex audio scenes.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The quality of temperature and humidity retrievals from the infrared SEVIRI sensors on the geostationary Meteosat Second Generation (MSG) satellites is assessed by means of a one dimensional variational algorithm. The study is performed with the aim of improving the spatial and temporal resolution of available observations to feed analysis systems designed for high resolution regional scale numerical weather prediction (NWP) models. The non-hydrostatic forecast model COSMO (COnsortium for Small scale MOdelling) in the ARPA-SIM operational configuration is used to provide background fields. Only clear sky observations over sea are processed. An optimised 1D–VAR set-up comprising of the two water vapour and the three window channels is selected. It maximises the reduction of errors in the model backgrounds while ensuring ease of operational implementation through accurate bias correction procedures and correct radiative transfer simulations. The 1D–VAR retrieval quality is firstly quantified in relative terms employing statistics to estimate the reduction in the background model errors. Additionally the absolute retrieval accuracy is assessed comparing the analysis with independent radiosonde and satellite observations. The inclusion of satellite data brings a substantial reduction in the warm and dry biases present in the forecast model. Moreover it is shown that the retrieval profiles generated by the 1D–VAR are well correlated with the radiosonde measurements. Subsequently the 1D–VAR technique is applied to two three–dimensional case–studies: a false alarm case–study occurred in Friuli–Venezia–Giulia on the 8th of July 2004 and a heavy precipitation case occurred in Emilia–Romagna region between 9th and 12th of April 2005. The impact of satellite data for these two events is evaluated in terms of increments in the integrated water vapour and saturation water vapour over the column, in the 2 meters temperature and specific humidity and in the surface temperature. To improve the 1D–VAR technique a method to calculate flow–dependent model error covariance matrices is also assessed. The approach employs members from an ensemble forecast system generated by perturbing physical parameterisation schemes inside the model. The improved set–up applied to the case of 8th of July 2004 shows a substantial neutral impact.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Keyword Spotting is the task of detecting keywords of interest within continu- ous speech. The applications of this technology range from call centre dialogue systems to covert speech surveillance devices. Keyword spotting is particularly well suited to data mining tasks such as real-time keyword monitoring and unre- stricted vocabulary audio document indexing. However, to date, many keyword spotting approaches have su®ered from poor detection rates, high false alarm rates, or slow execution times, thus reducing their commercial viability. This work investigates the application of keyword spotting to data mining tasks. The thesis makes a number of major contributions to the ¯eld of keyword spotting. The ¯rst major contribution is the development of a novel keyword veri¯cation method named Cohort Word Veri¯cation. This method combines high level lin- guistic information with cohort-based veri¯cation techniques to obtain dramatic improvements in veri¯cation performance, in particular for the problematic short duration target word class. The second major contribution is the development of a novel audio document indexing technique named Dynamic Match Lattice Spotting. This technique aug- ments lattice-based audio indexing principles with dynamic sequence matching techniques to provide robustness to erroneous lattice realisations. The resulting algorithm obtains signi¯cant improvement in detection rate over lattice-based audio document indexing while still maintaining extremely fast search speeds. The third major contribution is the study of multiple veri¯er fusion for the task of keyword veri¯cation. The reported experiments demonstrate that substantial improvements in veri¯cation performance can be obtained through the fusion of multiple keyword veri¯ers. The research focuses on combinations of speech background model based veri¯ers and cohort word veri¯ers. The ¯nal major contribution is a comprehensive study of the e®ects of limited training data for keyword spotting. This study is performed with consideration as to how these e®ects impact the immediate development and deployment of speech technologies for non-English languages.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Automatic spoken Language Identi¯cation (LID) is the process of identifying the language spoken within an utterance. The challenge that this task presents is that no prior information is available indicating the content of the utterance or the identity of the speaker. The trend of globalization and the pervasive popularity of the Internet will amplify the need for the capabilities spoken language identi¯ca- tion systems provide. A prominent application arises in call centers dealing with speakers speaking di®erent languages. Another important application is to index or search huge speech data archives and corpora that contain multiple languages. The aim of this research is to develop techniques targeted at producing a fast and more accurate automatic spoken LID system compared to the previous National Institute of Standards and Technology (NIST) Language Recognition Evaluation. Acoustic and phonetic speech information are targeted as the most suitable fea- tures for representing the characteristics of a language. To model the acoustic speech features a Gaussian Mixture Model based approach is employed. Pho- netic speech information is extracted using existing speech recognition technol- ogy. Various techniques to improve LID accuracy are also studied. One approach examined is the employment of Vocal Tract Length Normalization to reduce the speech variation caused by di®erent speakers. A linear data fusion technique is adopted to combine the various aspects of information extracted from speech. As a result of this research, a LID system was implemented and presented for evaluation in the 2003 Language Recognition Evaluation conducted by the NIST.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper presents visual detection and classification of light vehicles and personnel on a mine site.We capitalise on the rapid advances of ConvNet based object recognition but highlight that a naive black box approach results in a significant number of false positives. In particular, the lack of domain specific training data and the unique landscape in a mine site causes a high rate of errors. We exploit the abundance of background-only images to train a k-means classifier to complement the ConvNet. Furthermore, localisation of objects of interest and a reduction in computation is enabled through region proposals. Our system is tested on over 10km of real mine site data and we were able to detect both light vehicles and personnel. We show that the introduction of our background model can reduce the false positive rate by an order of magnitude.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

We present a Bayesian-odds-ratio-based algorithm for detecting stellar flares in light-curve data. We assume flares are described by a model in which there is a rapid rise with a half-Gaussian profile, followed by an exponential decay. Our signal model also contains a polynomial background model required to fit underlying light-curve variations in the data, which could otherwise partially mimic a flare. We characterize the false alarm probability and efficiency of this method under the assumption that any unmodelled noise in the data is Gaussian, and compare it with a simpler thresholding method based on that used in Walkowicz et al. We find our method has a significant increase in detection efficiency for low signal-to-noise ratio (S/N) flares. For a conservative false alarm probability our method can detect 95 per cent of flares with S/N less than 20, as compared to S/N of 25 for the simpler method. We also test how well the assumption of Gaussian noise holds by applying the method to a selection of 'quiet' Kepler stars. As an example we have applied our method to a selection of stars in Kepler Quarter 1 data. The method finds 687 flaring stars with a total of 1873 flares after vetos have been applied. For these flares we have made preliminary characterizations of their durations and and S/N.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

This paper investigated using lip movements as a behavioural biometric for person authentication. The system was trained, evaluated and tested using the XM2VTS dataset, following the Lausanne Protocol configuration II. Features were selected from the DCT coefficients of the greyscale lip image. This paper investigated the number of DCT coefficients selected, the selection process, and static and dynamic feature combinations. Using a Gaussian Mixture Model - Universal Background Model framework an Equal Error Rate of 2.20% was achieved during evaluation and on an unseen test set a False Acceptance Rate of 1.7% and False Rejection Rate of 3.0% was achieved. This compares favourably with face authentication results on the same dataset whilst not being susceptible to spoofing attacks.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Actualmente tem-se observado um aumento do volume de sinais de fala em diversas aplicações, que reforçam a necessidade de um processamento automático dos ficheiros. No campo do processamento automático destacam-se as aplicações de “diarização de orador”, que permitem catalogar os ficheiros de fala com a identidade de oradores e limites temporais de fala de cada um, através de um processo de segmentação e agrupamento. No contexto de agrupamento, este trabalho visa dar continuidade ao trabalho intitulado “Detecção do Orador”, com o desenvolvimento de um algoritmo de “agrupamento multi-orador” capaz de identificar e agrupar correctamente os oradores, sem conhecimento prévio do número ou da identidade dos oradores presentes no ficheiro de fala. O sistema utiliza os coeficientes “Mel Line Spectrum Frequencies” (MLSF) como característica acústica de fala, uma segmentação de fala baseada na energia e uma estrutura do tipo “Universal Background Model - Gaussian Mixture Model” (UBM-GMM) adaptado com o classificador “Support Vector Machine” (SVM). No trabalho foram analisadas três métricas de discriminação dos modelos SVM e a avaliação dos resultados foi feita através da taxa de erro “Speaker Error Rate” (SER), que quantifica percentualmente o número de segmentos “fala” mal classificados. O algoritmo implementado foi ajustado às características da língua portuguesa através de um corpus com 14 ficheiros de treino e 30 ficheiros de teste. Os ficheiros de treino dos modelos e classificação final, enquanto os ficheiros de foram utilizados para avaliar o desempenho do algoritmo. A interacção com o algoritmo foi dinamizada com a criação de uma interface gráfica que permite receber o ficheiro de teste, processá-lo, listar os resultados ou gerar um vídeo para o utilizador confrontar o sinal de fala com os resultados de classificação.