952 results for Robust methods
Abstract:
The paper presents a fast and robust stereo object recognition method. The method is currently unable to identify the rotation of objects. This makes it very good at locating spheres, which are rotationally independent. Approximate methods for locating non-spherical objects have been developed. Fundamental to the method is that the correspondence problem is solved using information about the dimensions of the object being located. This is in contrast to previous stereo object recognition systems, where the scene is first reconstructed by point matching techniques. The method is suitable for real-time application on low-power devices.
Abstract:
Traditional speech enhancement methods optimise signal-level criteria such as signal-to-noise ratio, but these approaches are sub-optimal for noise-robust speech recognition. Likelihood-maximising (LIMA) frameworks are an alternative that optimises the parameters of enhancement algorithms based on state sequences generated for utterances with known transcriptions. Previous reports of LIMA frameworks have shown significant promise for improving speech recognition accuracies under additive background noise for a range of speech enhancement techniques. In this paper we discuss the drawbacks of the LIMA approach when multiple layers of acoustic mismatch are present – namely background noise and speaker accent. Experimentation using LIMA-based Mel-filterbank noise subtraction on American and Australian English in-car speech databases supports this discussion, demonstrating that inferior speech recognition performance occurs when a second layer of mismatch is seen during evaluation.
Abstract:
Traditional speech enhancement methods optimise signal-level criteria such as signal-to-noise ratio, but such approaches are sub-optimal for noise-robust speech recognition. Likelihood-maximising (LIMA) frameworks, on the other hand, optimise the parameters of speech enhancement algorithms based on state sequences generated by a speech recogniser for utterances of known transcriptions. Previous applications of LIMA frameworks have generated a set of global enhancement parameters for all model states without taking into account the distribution of model occurrence, making optimisation susceptible to favouring frequently occurring models, in particular silence. In this paper, we demonstrate the existence of highly disproportionate phonetic distributions on two corpora with distinct speech tasks, and propose to normalise the influence of each phone based on a priori occurrence probabilities. Likelihood analysis and speech recognition experiments verify this approach for improving ASR performance in noisy environments.
Abstract:
This paper demonstrates the application of a robust form of pose estimation and scene reconstruction using data from camera images. We demonstrate results that suggest the ability of the algorithm to rival methods of RANSAC-based pose estimation polished by bundle adjustment in terms of solution robustness, speed and accuracy, even when given poor initialisations. Our simulated results show the behaviour of the algorithm in a number of novel scenarios reflective of real-world cases, and demonstrate its ability to handle large observation noise and difficult reconstruction scenes. These results have a number of implications for the vision and robotics community, and show that the application of visual motion estimation on robotic platforms in an online fashion is approaching real-world feasibility.
Abstract:
This paper presents a method of voice activity detection (VAD) suitable for high noise scenarios, based on the fusion of two complementary systems. The first system uses a proposed non-Gaussianity score (NGS) feature based on normal probability testing. The second system employs a histogram distance score (HDS) feature that detects changes in the signal by applying a template-based similarity measure between adjacent frames. The decision outputs of the two systems are then merged using an open-by-reconstruction fusion stage. Accuracy of the proposed method was compared to several baseline VAD methods on a database created using real recordings of a variety of high-noise environments.
Abstract:
This paper presents a method of voice activity detection (VAD) for high noise scenarios, using a noise robust voiced speech detection feature. The developed method is based on the fusion of two systems. The first system utilises the maximum peak of the normalised time-domain autocorrelation function (MaxPeak). The second system uses a novel combination of cross-correlation and zero-crossing rate of the normalised autocorrelation to approximate a measure of signal pitch and periodicity (CrossCorr) that is hypothesised to be noise robust. The score outputs of the two systems are then merged using weighted sum fusion to create the proposed autocorrelation zero-crossing rate (AZR) VAD. Accuracy of AZR was compared to state-of-the-art and standardised VAD methods and was shown to outperform the best performing system with an average relative improvement of 24.8% in half-total error rate (HTER) on the QUT-NOISE-TIMIT database created using real recordings from high-noise environments.
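The MaxPeak feature named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 8 kHz sample rate, the 50 to 400 Hz pitch search range, and the mean removal step are all assumptions made for illustration, and the CrossCorr feature and the weighted sum fusion are not reproduced.

```python
import math
import random


def normalised_autocorrelation(frame):
    """Autocorrelation of a mean-removed frame, divided by the lag-0 energy.

    For a non-silent frame the value at lag 0 is 1. A silent frame returns zeros.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return [0.0] * n  # silent frame: no periodicity to measure
    return [
        sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        for lag in range(n)
    ]


def max_peak(frame, min_lag, max_lag):
    """Largest normalised autocorrelation value inside an assumed pitch-lag range."""
    r = normalised_autocorrelation(frame)
    return max(r[min_lag:max_lag + 1])


# Illustrative comparison (assumed parameters): a periodic frame versus noise.
sample_rate = 8000
min_lag = sample_rate // 400  # shortest pitch period searched (400 Hz)
max_lag = sample_rate // 50   # longest pitch period searched (50 Hz)

voiced = [math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(400)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(400)]

voiced_score = max_peak(voiced, min_lag, max_lag)
noise_score = max_peak(noise, min_lag, max_lag)
print(voiced_score, noise_score)
```

A periodic frame produces a strong peak at the lag of its period, whereas uncorrelated noise does not, which is the intuition behind using the peak height as a voicing score.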
Abstract:
A number of game strategies have been developed in past decades and used in the fields of economics, engineering, computer science, and biology due to their efficiency in solving design optimization problems. In addition, research in multiobjective and multidisciplinary design optimization has focused on developing a robust and efficient optimization method so it can produce a set of high quality solutions with less computational time. In this paper, two optimization techniques are considered; the first optimization method uses multifidelity hierarchical Pareto-optimality. The second optimization method uses the combination of game strategies Nash-equilibrium and Pareto-optimality. This paper shows how game strategies can be coupled to multiobjective evolutionary algorithms and robust design techniques to produce a set of high quality solutions. Numerical results obtained from both optimization methods are compared in terms of computational expense and model quality. The benefits of using Hybrid and non-Hybrid-Game strategies are demonstrated.
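Pareto-optimality, which both methods in the abstract above rely on, can be shown with a small non-dominance filter. This is only a sketch of the general concept for minimisation problems: the function names and the sample objective values are hypothetical, and the Nash-equilibrium coupling and the evolutionary algorithm are not reproduced.

```python
def dominates(a, b):
    """True if design a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (objective 1, objective 2) values for five candidate designs.
designs = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
front = pareto_front(designs)
print(front)  # [(1.0, 5.0), (2.0, 3.0), (4.0, 1.0)]
```

The designs (3.0, 4.0) and (2.5, 3.5) are dropped because (2.0, 3.0) is better in both objectives. The remaining three are mutually non-dominated trade-offs.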
Abstract:
The paper investigates a detailed Active Shock Control Bump Design Optimisation on a Natural Laminar Flow (NLF) aerofoil, the RAE 5243, to reduce cruise drag at transonic flow conditions using Evolutionary Algorithms (EAs) coupled to a robust design approach. For the uncertainty design parameters, the positions of boundary layer transition (xtr) and the coefficient of lift (Cl) are considered (250 stochastic samples in total). In this paper, two robust design methods are considered; the first approach uses a standard robust design method, which evaluates one design model at 250 stochastic conditions for uncertainty. The second approach is the combination of a standard robust design method and the concept of hierarchical (multi-population) sampling (250, 50, 15) for uncertainty. Numerical results show that the evolutionary optimization method coupled to uncertainty design techniques produces useful and reliable Pareto optimal SCB shapes which have low sensitivity and high aerodynamic performance while having significant total drag reduction. In addition, it also shows the benefit of using a hierarchical robust method for detailed uncertainty design optimization.
Abstract:
The chief challenge facing persistent robotic navigation using vision sensors is the recognition of previously visited locations under different lighting and illumination conditions. The majority of successful approaches to outdoor robot navigation use active sensors such as LIDAR, but the associated weight and power draw of these systems makes them unsuitable for widespread deployment on mobile robots. In this paper we investigate methods to combine representations for visible and long-wave infrared (LWIR) thermal images with time information to combat the time-of-day-based limitations of each sensing modality. We calculate appearance-based match likelihoods using the state-of-the-art FAB-MAP [1] algorithm to analyse loop closure detection reliability across different times of day. We present preliminary results on a dataset of 10 successive traverses of a combined urban-parkland environment, recorded in 2-hour intervals from before dawn to after dusk. Improved location recognition throughout an entire day is demonstrated using the combined system compared with methods which use visible or thermal sensing alone.
Abstract:
Most unsignalised intersection capacity calculation procedures are based on gap acceptance models. Accuracy of critical gap estimation affects accuracy of capacity and delay estimation. Several methods have been published to estimate drivers’ sample mean critical gap, with the Maximum Likelihood Estimation (MLE) technique regarded as the most accurate. This study assesses three novel methods, the Average Central Gap (ACG) method, the Strength Weighted Central Gap (SWCG) method, and the Mode Central Gap (MCG) method, against MLE for their fidelity in rendering true sample mean critical gaps. A Monte Carlo event-based simulation model was used to draw the maximum rejected gap and accepted gap for each of a sample of 300 drivers across 32 simulation runs. Simulation mean critical gap is varied between 3 s and 8 s, while offered gap rate is varied between 0.05 veh/s and 0.55 veh/s. This study affirms that MLE provides a close to perfect fit to simulation mean critical gaps across a broad range of conditions. The MCG method also provides an almost perfect fit and has superior computational simplicity and efficiency to the MLE. The SWCG method performs robustly under high flows but poorly under low to moderate flows. Further research is recommended using field traffic data, under a variety of minor stream and major stream flow conditions for a variety of minor stream movement types, to compare critical gap estimates using MLE against MCG. Should the MCG method prove as robust as MLE, serious consideration should be given to its adoption to estimate critical gap parameters in guidelines.
Abstract:
Despite their ecological significance as decomposers and their evolutionary significance as the most speciose eusocial insect group outside the Hymenoptera, termite (Blattodea: Termitoidae or Isoptera) evolutionary relationships have yet to be well resolved. Previous morphological and molecular analyses strongly conflict at the family level and are marked by poor support for backbone nodes. A mitochondrial (mt) genome phylogeny of termites was produced to test relationships between the recognised termite families, improve nodal support and test the phylogenetic utility of rare genomic changes found in the termite mt genome. Complete mt genomes were sequenced for 7 of the 9 extant termite families with additional representatives of each of the two most speciose families Rhinotermitidae (3 of 7 subfamilies) and Termitidae (3 of 8 subfamilies). The mt genome of the well supported sister group of termites, the subsocial cockroach Cryptocercus, was also sequenced. A highly supported tree of termite relationships was produced by all analytical methods and data treatment approaches; however, the relationship of the termites + Cryptocercus clade to other cockroach lineages was highly affected by the strong nucleotide compositional bias found in termites relative to other dictyopterans. The phylogeny supports previously proposed suprafamilial termite lineages, the Euisoptera and Neoisoptera, a later derived Kalotermitidae as sister group of the Neoisoptera and a monophyletic clade of dampwood (Stolotermitidae, Archotermopsidae) and harvester termites (Hodotermitidae). In contrast to previous termite phylogenetic studies, nodal supports were very high for family-level relationships within termites. Two rare genomic changes in the mt genome control region were found to be molecular synapomorphies for major clades.
An elongated stem-loop structure defined the clade Polyphagidae + (Cryptocercus + termites), and a further series of compensatory base changes in this stem loop is synapomorphic for the Neoisoptera. The complicated repeat structures first identified in Reticulitermes, composed of short (A-type) and long (B-type) repeats, define the clade Heterotermitinae + Termitidae, while the secondary loss of A-type repeats is synapomorphic for the non-macrotermitine Termitidae.
Abstract:
Many methods currently exist for deformable face fitting. A drawback of nearly all these approaches is that (i) they are noisy in terms of landmark positions, and (ii) the noise is biased across frames (i.e. the misalignment is toward common directions across all frames). In this paper we propose a grouped $\mathcal{L}_1$-norm anchored method for simultaneously aligning an ensemble of deformable face images stemming from the same subject, given noisy heterogeneous landmark estimates. Impressive alignment performance improvement and refinement are obtained using very weak initialization as "anchors".
Abstract:
In the field of face recognition, Sparse Representation (SR) has received considerable attention during the past few years. Most of the relevant literature focuses on holistic descriptors in closed-set identification applications. The underlying assumption in SR-based methods is that each class in the gallery has sufficient samples and the query lies on the subspace spanned by the gallery of the same class. Unfortunately, such an assumption is easily violated in the more challenging face verification scenario, where an algorithm is required to determine if two faces (where one or both have not been seen before) belong to the same person. In this paper, we first discuss why previous attempts with SR might not be applicable to verification problems. We then propose an alternative approach to face verification via SR. Specifically, we propose to use explicit SR encoding on local image patches rather than the entire face. The obtained sparse signals are pooled via averaging to form multiple region descriptors, which are then concatenated to form an overall face descriptor. Due to the deliberate loss of spatial relations within each region (caused by averaging), the resulting descriptor is robust to misalignment and various image deformations. Within the proposed framework, we evaluate several SR encoding techniques: l1-minimisation, Sparse Autoencoder Neural Network (SANN), and an implicit probabilistic technique based on Gaussian Mixture Models. Thorough experiments on AR, FERET, exYaleB, BANCA and ChokePoint datasets show that the proposed local SR approach obtains considerably better and more robust performance than several previous state-of-the-art holistic SR methods, in both verification and closed-set identification problems. The experiments also show that l1-minimisation based encoding has a considerably higher computational cost than the other techniques, but leads to higher recognition rates.
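The pooling step described in the abstract above (average the per-patch codes within each region, then concatenate the region averages) can be sketched as follows. The grid layout, the function names and the toy code vectors are assumptions for illustration. The sparse encoding itself (l1-minimisation, SANN or GMM-based) is not reproduced and is represented here by ready-made vectors.

```python
def average(vectors):
    """Element-wise mean of equally sized code vectors."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]


def face_descriptor(patch_codes, regions_per_side=2):
    """Pool per-patch codes into region descriptors and concatenate them.

    patch_codes[r][c] is the code vector of the patch at grid row r, column c.
    The grid is split into regions_per_side x regions_per_side regions
    (assumes at least that many patch rows and columns).
    """
    rows, cols = len(patch_codes), len(patch_codes[0])
    descriptor = []
    for i in range(regions_per_side):
        r0 = i * rows // regions_per_side
        r1 = (i + 1) * rows // regions_per_side
        for j in range(regions_per_side):
            c0 = j * cols // regions_per_side
            c1 = (j + 1) * cols // regions_per_side
            region = [patch_codes[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            descriptor.extend(average(region))  # averaging discards patch order
    return descriptor


# Toy 4x4 grid of 3-dimensional codes standing in for sparse codes.
codes = [[[float(r), float(c), float(r * c)] for c in range(4)] for r in range(4)]

# Swap two patches that fall inside the same region (top-left 2x2 block).
shuffled = [row[:] for row in codes]
shuffled[0][0], shuffled[1][1] = shuffled[1][1], shuffled[0][0]

original = face_descriptor(codes)
moved = face_descriptor(shuffled)
print(len(original), original == moved)  # 12 True
```

Because the average ignores where each patch sits inside its region, rearranging patches within a region leaves the descriptor unchanged, which mirrors the robustness to misalignment that the abstract attributes to the deliberate loss of spatial relations.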
Abstract:
Iris based identity verification is highly reliable but it can also be subject to attacks. Pupil dilation or constriction stimulated by the application of drugs are examples of sample presentation security attacks which can lead to higher false rejection rates. Suspects on a watch list can potentially circumvent the iris based system using such methods. This paper investigates a new approach using multiple parts of the iris (instances) and multiple iris samples in a sequential decision fusion framework that can yield robust performance. Results are presented and compared with the standard full iris based approach for a number of iris degradations. An advantage of the proposed fusion scheme is that the trade-off between detection errors can be controlled by setting parameters such as the number of instances and the number of samples used in the system. The system can then be operated to match security threat levels. It is shown that for optimal values of these parameters, the fused system also has a lower total error rate.
Abstract:
Robust hashing is an emerging field that can be used to hash certain data types in applications unsuitable for traditional cryptographic hashing methods. Traditional hashing functions have been used extensively for data/message integrity, data/message authentication, efficient file identification and password verification. These applications are possible because the hashing process is compressive, allowing for efficient comparisons in the hash domain, but non-invertible, meaning hashes can be used without revealing the original data. These techniques were developed with deterministic (non-changing) inputs such as files and passwords. For such data types a 1-bit or one-character change can be significant; as a result, the hashing process is sensitive to any change in the input. Unfortunately, there are certain applications where input data are not perfectly deterministic and minor changes cannot be avoided. Digital images and biometric features are two types of data where such changes exist but do not alter the meaning or appearance of the input. For such data types cryptographic hash functions cannot be usefully applied. In light of this, robust hashing has been developed as an alternative to cryptographic hashing and is designed to be robust to minor changes in the input. Although similar in name, robust hashing is fundamentally different from cryptographic hashing. Current robust hashing techniques are not based on cryptographic methods, but instead on pattern recognition techniques. Modern robust hashing algorithms consist of feature extraction followed by a randomization stage that introduces non-invertibility and compression, followed by quantization and binary encoding to produce a binary hash output. In order to preserve robustness of the extracted features, most randomization methods are linear and this is detrimental to the security aspects required of hash functions.
Furthermore, the quantization and encoding stages used to binarize real-valued features require the learning of appropriate quantization thresholds. How these thresholds are learnt has an important effect on hashing accuracy, and the mere presence of such thresholds is a source of information leakage that can reduce hashing security. This dissertation outlines a systematic investigation of the quantization and encoding stages of robust hash functions. While existing literature has focused on the importance of the quantization scheme, this research is the first to emphasise the importance of the quantizer training on both hashing accuracy and hashing security. The quantizer training process is presented in a statistical framework which allows a theoretical analysis of the effects of quantizer training on hashing performance. This is experimentally verified using a number of baseline robust image hashing algorithms over a large database of real world images. This dissertation also proposes a new randomization method for robust image hashing based on Higher Order Spectra (HOS) and Radon projections. The method is non-linear and this is an essential requirement for non-invertibility. The method is also designed to produce features more suited for quantization and encoding. The system can operate without the need for quantizer training, is more easily encoded and displays improved hashing performance when compared to existing robust image hashing algorithms. The dissertation also shows how the HOS method can be adapted to work with biometric features obtained from 2D and 3D face images.
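The generic pipeline outlined in this abstract (feature extraction, randomization, quantization, binary encoding) can be sketched in a few lines. Everything specific here is an assumption for illustration: block-mean features, a key-seeded Gaussian linear projection, and sign quantization with a fixed zero threshold. The projection is exactly the kind of linear randomization the abstract describes as weak for security, and the sketch is not the HOS and Radon method proposed in the dissertation.

```python
import hashlib
import random


def block_means(pixels, block):
    """Feature extraction (assumed): mean intensity of each block x block tile."""
    rows, cols = len(pixels), len(pixels[0])
    feats = []
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            tile = [pixels[i][j]
                    for i in range(r, min(r + block, rows))
                    for j in range(c, min(c + block, cols))]
            feats.append(sum(tile) / len(tile))
    return feats


def robust_hash(pixels, key, n_bits=32, block=4):
    """Key-seeded linear projection of centred features, quantized by sign."""
    feats = block_means(pixels, block)
    mean = sum(feats) / len(feats)
    centred = [f - mean for f in feats]
    rng = random.Random(key)  # the key makes the projection reproducible
    bits = []
    for _ in range(n_bits):
        weights = [rng.gauss(0.0, 1.0) for _ in centred]
        projection = sum(w * f for w, f in zip(weights, centred))
        bits.append(1 if projection >= 0.0 else 0)  # 1-bit quantization
    return bits


def hamming(a, b):
    """Number of positions at which two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# Synthetic 16x16 "image" and a copy with a single pixel changed by one level.
image = [[(r * 16 + c * 7) % 256 for c in range(16)] for r in range(16)]
tweaked = [row[:] for row in image]
tweaked[5][5] = (tweaked[5][5] + 1) % 256

robust_distance = hamming(robust_hash(image, key=42), robust_hash(tweaked, key=42))
sha_a = hashlib.sha256(bytes(v for row in image for v in row)).hexdigest()
sha_b = hashlib.sha256(bytes(v for row in tweaked for v in row)).hexdigest()
print(robust_distance, sha_a == sha_b)
```

The SHA-256 digests of the two images differ, as expected of a cryptographic hash, while the robust hashes are compared by Hamming distance, which stays small when the extracted features barely move.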