664 resultados para Information Security, Safe Behavior, Users’ behavior, Brazilian users, threats


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The use of visual features in the form of lip movements to improve the performance of acoustic speech recognition has been shown to work well, particularly in noisy acoustic conditions. However, whether this technique can outperform speech recognition incorporating well-known acoustic enhancement techniques, such as spectral subtraction, or multi-channel beamforming is not known. This is an important question to be answered especially in an automotive environment, for the design of an efficient human-vehicle computer interface. We perform a variety of speech recognition experiments on a challenging automotive speech dataset and results show that synchronous HMM-based audio-visual fusion can outperform traditional single as well as multi-channel acoustic speech enhancement techniques. We also show that further improvement in recognition performance can be obtained by fusing speech-enhanced audio with the visual modality, demonstrating the complementary nature of the two robust speech recognition approaches.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Distributed Denial-of-Service (DDoS) attacks continue to be one of the most pernicious threats to the delivery of services over the Internet. Not only are DDoS attacks present in many guises, they are also continuously evolving as new vulnerabilities are exploited. Hence accurate detection of these attacks still remains a challenging problem and a necessity for ensuring high-end network security. An intrinsic challenge in addressing this problem is to effectively distinguish these Denial-of-Service attacks from similar looking Flash Events (FEs) created by legitimate clients. A considerable overlap between the general characteristics of FEs and DDoS attacks makes it difficult to precisely separate these two classes of Internet activity. In this paper we propose parameters which can be used to explicitly distinguish FEs from DDoS attacks and analyse two real-world publicly available datasets to validate our proposal. Our analysis shows that even though FEs appear very similar to DDoS attacks, there are several subtle dissimilarities which can be exploited to separate these two classes of events.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The construction of timelines of computer activity is a part of many digital investigations. These timelines of events are composed of traces of historical activity drawn from system logs and potentially from evidence of events found in the computer file system. A potential problem with the use of such information is that some of it may be inconsistent and contradictory thus compromising its value. This work introduces a software tool (CAT Detect) for the detection of inconsistency within timelines of computer activity. We examine the impact of deliberate tampering through experiments conducted with our prototype software tool. Based on the results of these experiments, we discuss techniques which can be employed to deal with such temporal inconsistencies.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

CCTV and surveillance networks are increasingly being used for operational as well as security tasks. One emerging area of technology that lends itself to operational analytics is soft biometrics. Soft biometrics can be used to describe a person and detect them throughout a sparse multi-camera network. This enables them to be used to perform tasks such as determining the time taken to get from point to point, and the paths taken through an environment by detecting and matching people across disjoint views. However, in a busy environment where there are 100's if not 1000's of people such as an airport, attempting to monitor everyone is highly unrealistic. In this paper we propose an average soft biometric, that can be used to identity people who look distinct, and are thus suitable for monitoring through a large, sparse camera network. We demonstrate how an average soft biometric can be used to identify unique people to calculate operational measures such as the time taken to travel from point to point.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Unusual event detection in crowded scenes remains challenging because of the diversity of events and noise. In this paper, we present a novel approach for unusual event detection via sparse reconstruction of dynamic textures over an overcomplete basis set, with the dynamic texture described by local binary patterns from three orthogonal planes (LBPTOP). The overcomplete basis set is learnt from the training data where only the normal items observed. In the detection process, given a new observation, we compute the sparse coefficients using the Dantzig Selector algorithm which was proposed in the literature of compressed sensing. Then the reconstruction errors are computed, based on which we detect the abnormal items. Our application can be used to detect both local and global abnormal events. We evaluate our algorithm on UCSD Abnormality Datasets for local anomaly detection, which is shown to outperform current state-of-the-art approaches, and we also get promising results for rapid escape detection using the PETS2009 dataset.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper describes a scene invariant crowd counting algorithm that uses local features to monitor crowd size. Unlike previous algorithms that require each camera to be trained separately, the proposed method uses camera calibration to scale between viewpoints, allowing a system to be trained and tested on different scenes. A pre-trained system could therefore be used as a turn-key solution for crowd counting across a wide range of environments. The use of local features allows the proposed algorithm to calculate local occupancy statistics, and Gaussian process regression is used to scale to conditions which are unseen in the training data, also providing confidence intervals for the crowd size estimate. A new crowd counting database is introduced to the computer vision community to enable a wider evaluation over multiple scenes, and the proposed algorithm is tested on seven datasets to demonstrate scene invariance and high accuracy. To the authors' knowledge this is the first system of its kind due to its ability to scale between different scenes and viewpoints.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper proposes the use of eigenvoice modeling techniques with the Cross Likelihood Ratio (CLR) as a criterion for speaker clustering within a speaker diarization system. The CLR has previously been shown to be a robust decision criterion for speaker clustering using Gaussian Mixture Models. Recently, eigenvoice modeling techniques have become increasingly popular, due to its ability to adequately represent a speaker based on sparse training data, as well as an improved capture of differences in speaker characteristics. This paper hence proposes that it would be beneficial to capitalize on the advantages of eigenvoice modeling in a CLR framework. Results obtained on the 2002 Rich Transcription (RT-02) Evaluation dataset show an improved clustering performance, resulting in a 35.1% relative improvement in the overall Diarization Error Rate (DER) compared to the baseline system.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In automatic facial expression recognition, an increasing number of techniques had been proposed for in the literature that exploits the temporal nature of facial expressions. As all facial expressions are known to evolve over time, it is crucially important for a classifier to be capable of modelling their dynamics. We establish that the method of sparse representation (SR) classifiers proves to be a suitable candidate for this purpose, and subsequently propose a framework for expression dynamics to be efficiently incorporated into its current formulation. We additionally show that for the SR method to be applied effectively, then a certain threshold on image dimensionality must be enforced (unlike in facial recognition problems). Thirdly, we determined that recognition rates may be significantly influenced by the size of the projection matrix \Phi. To demonstrate these, a battery of experiments had been conducted on the CK+ dataset for the recognition of the seven prototypic expressions - anger, contempt, disgust, fear, happiness, sadness and surprise - and comparisons have been made between the proposed temporal-SR against the static-SR framework and state-of-the-art support vector machine.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Probabilistic topic models have recently been used for activity analysis in video processing, due to their strong capacity to model both local activities and interactions in crowded scenes. In those applications, a video sequence is divided into a collection of uniform non-overlaping video clips, and the high dimensional continuous inputs are quantized into a bag of discrete visual words. The hard division of video clips, and hard assignment of visual words leads to problems when an activity is split over multiple clips, or the most appropriate visual word for quantization is unclear. In this paper, we propose a novel algorithm, which makes use of a soft histogram technique to compensate for the loss of information in the quantization process; and a soft cut technique in the temporal domain to overcome problems caused by separating an activity into two video clips. In the detection process, we also apply a soft decision strategy to detect unusual events.We show that the proposed soft decision approach outperforms its hard decision counterpart in both local and global activity modelling.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Modelling events in densely crowded environments remains challenging, due to the diversity of events and the noise in the scene. We propose a novel approach for anomalous event detection in crowded scenes using dynamic textures described by the Local Binary Patterns from Three Orthogonal Planes (LBP-TOP) descriptor. The scene is divided into spatio-temporal patches where LBP-TOP based dynamic textures are extracted. We apply hierarchical Bayesian models to detect the patches containing unusual events. Our method is an unsupervised approach, and it does not rely on object tracking or background subtraction. We show that our approach outperforms existing state of the art algorithms for anomalous event detection in UCSD dataset.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Robust speaker verification on short utterances remains a key consideration when deploying automatic speaker recognition, as many real world applications often have access to only limited duration speech data. This paper explores how the recent technologies focused around total variability modeling behave when training and testing utterance lengths are reduced. Results are presented which provide a comparison of Joint Factor Analysis (JFA) and i-vector based systems including various compensation techniques; Within-Class Covariance Normalization (WCCN), LDA, Scatter Difference Nuisance Attribute Projection (SDNAP) and Gaussian Probabilistic Linear Discriminant Analysis (GPLDA). Speaker verification performance for utterances with as little as 2 sec of data taken from the NIST Speaker Recognition Evaluations are presented to provide a clearer picture of the current performance characteristics of these techniques in short utterance conditions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Compressive Sensing (CS) is a popular signal processing technique, that can exactly reconstruct a signal given a small number of random projections of the original signal, provided that the signal is sufficiently sparse. We demonstrate the applicability of CS in the field of gait recognition as a very effective dimensionality reduction technique, using the gait energy image (GEI) as the feature extraction process. We compare the CS based approach to the principal component analysis (PCA) and show that the proposed method outperforms this baseline, particularly under situations where there are appearance changes in the subject. Applying CS to the gait features also avoids the need to train the models, by using a generalised random projection.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Visual activity detection of lip movements can be used to overcome the poor performance of voice activity detection based solely in the audio domain, particularly in noisy acoustic conditions. However, most of the research conducted in visual voice activity detection (VVAD) has neglected addressing variabilities in the visual domain such as viewpoint variation. In this paper we investigate the effectiveness of the visual information from the speaker’s frontal and profile views (i.e left and right side views) for the task of VVAD. As far as we are aware, our work constitutes the first real attempt to study this problem. We describe our visual front end approach and the Gaussian mixture model (GMM) based VVAD framework, and report the experimental results using the freely available CUAVE database. The experimental results show that VVAD is indeed possible from profile views and we give a quantitative comparison of VVAD based on frontal and profile views The results presented are useful in the development of multi-modal Human Machine Interaction (HMI) using a single camera, where the speaker’s face may not always be frontal.