954 results for Digit speech recognition


Relevance: 20.00%

Abstract:

A method of improving the security of biometric templates is presented which satisfies desirable properties such as (a) irreversibility of the template, (b) revocability and assignment of a new template to the same biometric input, and (c) matching in the secure transformed domain. It makes use of an iterative procedure based on the bispectrum that serves as an irreversible transformation for biometric features, because signal phase is discarded at each iteration. Unlike a conventional hash function, this transformation preserves closeness in the transformed domain for similar biometric inputs. A number of such templates can be generated from the same input. These properties are illustrated using synthetic data and applied to images from the FRGC 3D database with Gabor features. Verification can be successfully performed using these secure templates with an EER of 5.85%.
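As a loose illustration of the phase-discarding idea, the sketch below repeatedly keeps only the Fourier magnitude of a feature vector. The paper's actual transform is bispectrum-based; `discard_phase_template` and its parameters are hypothetical stand-ins, not the published algorithm.

```python
import numpy as np

def discard_phase_template(features, iterations=3):
    """Hypothetical sketch of an irreversible template transform.

    The paper uses an iterative bispectrum procedure; here we use the
    plain Fourier magnitude to illustrate the key property: the phase
    needed to invert the transform is destroyed at every iteration.
    """
    x = np.asarray(features, dtype=float)
    for _ in range(iterations):
        x = np.abs(np.fft.fft(x))  # keep magnitude only; phase is lost
    return x
```

Because the magnitude spectrum is unchanged by a circular shift (a pure phase change), nearby inputs map to nearby templates while inversion back to the original features is not possible.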

Relevance: 20.00%

Abstract:

Secondary tasks such as cell phone calls or interaction with automated speech dialog systems (SDSs) increase the driver’s cognitive load as well as the probability of driving errors. This study analyzes speech production variations due to cognitive load and emotional state of drivers in real driving conditions. Speech samples were acquired from 24 female and 17 male subjects (approximately 8.5 h of data) while talking to a co-driver and communicating with two automated call centers, with emotional states (neutral, negative) and the number of necessary SDS query repetitions also labeled. A consistent shift in a number of speech production parameters (pitch, first formant center frequency, spectral center of gravity, spectral energy spread, and duration of voiced segments) was observed when comparing SDS interaction against co-driver interaction; further increases were observed when considering negative emotion segments and the number of requested SDS query repetitions. A mel-frequency cepstral coefficient based Gaussian mixture classifier trained on 10 male and 10 female sessions provided 91% accuracy in the open test set task of distinguishing co-driver interactions from SDS interactions, suggesting, together with the acoustic analysis, that it is possible to monitor the level of driver distraction directly from their speech.
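The abstract specifies the classifier only as "MFCC-based Gaussian mixture". As a hedged simplification, the sketch below fits a single diagonal-covariance Gaussian per class (a one-component mixture) and picks the higher-likelihood label; the class name, data and labels are purely illustrative.

```python
import numpy as np

class SingleGaussianClassifier:
    """One diagonal Gaussian per class: a stripped-down stand-in for
    the GMM classifier described in the study (illustrative only)."""

    def fit(self, X, y):
        self.params_ = {}
        for c in sorted(set(y)):
            Xc = np.asarray([x for x, lab in zip(X, y) if lab == c], float)
            # per-dimension mean and variance (small floor for stability)
            self.params_[c] = (Xc.mean(axis=0), Xc.var(axis=0) + 1e-6)
        return self

    def predict(self, X):
        out = []
        for x in np.asarray(X, float):
            # diagonal-Gaussian log-likelihood for each class
            scores = {c: -0.5 * np.sum(np.log(2 * np.pi * v) + (x - m) ** 2 / v)
                      for c, (m, v) in self.params_.items()}
            out.append(max(scores, key=scores.get))
        return out
```

In the real system each class would be a full Gaussian mixture over MFCC frames, but the decision rule (maximum likelihood over class models) is the same.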

Relevance: 20.00%

Abstract:

Purpose: The classic study of Sumby and Pollack (1954, JASA, 26(2), 212-215) demonstrated that visual information aided speech intelligibility under noisy auditory conditions. Their work showed that visual information is especially useful under low signal-to-noise conditions where the auditory signal leaves greater margins for improvement. We investigated whether simulated cataracts interfered with the ability of participants to use visual cues to help disambiguate the auditory signal in the presence of auditory noise. Methods: Participants in the study were screened to ensure normal visual acuity (mean of 20/20) and normal hearing (auditory threshold ≤ 20 dB HL). Speech intelligibility was tested under an auditory only condition and two visual conditions: normal vision and simulated cataracts. The light scattering effects of cataracts were imitated using cataract-simulating filters. Participants wore blacked-out glasses in the auditory only condition and lens-free frames in the normal auditory-visual condition. Individual sentences were spoken by a live speaker in the presence of prerecorded four-person background babble set to a speech-to-noise ratio (SNR) of -16 dB. The SNR was determined in a preliminary experiment to support 50% correct identification of sentences under the auditory only condition. The speaker was trained to match the rate, intensity and inflections of a prerecorded audio track of everyday speech sentences. The speaker was blind to the visual conditions of the participant to control for bias. Participants’ speech intelligibility was measured by comparing the accuracy of their written account of what they believed the speaker to have said to the actual spoken sentence. Results: Relative to the normal vision condition, speech intelligibility was significantly poorer when participants wore simulated cataracts. Conclusions: The results suggest that cataracts may interfere with the acquisition of visual cues to speech perception.
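The -16 dB speech-to-noise ratio can be made concrete with the standard power-ratio definition. The helpers below use hypothetical names (they are not from the study) and show the gain needed to scale a noise source to a target SNR.

```python
import math

def snr_db(signal_power, noise_power):
    """Standard SNR definition: 10 * log10(P_signal / P_noise)."""
    return 10.0 * math.log10(signal_power / noise_power)

def noise_gain_for_target_snr(signal_power, noise_power, target_db):
    """Amplitude gain g for the noise track so the mix sits at target_db
    (e.g. the -16 dB used in the study). Power scales with g squared."""
    desired_noise_power = signal_power / (10 ** (target_db / 10))
    return math.sqrt(desired_noise_power / noise_power)
```

At -16 dB the babble carries roughly 40 times the power of the speech, which is why the auditory-only baseline was pinned near 50% intelligibility.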

Relevance: 20.00%

Abstract:

Identifying an individual from surveillance video is a difficult, time-consuming and labour-intensive process. The proposed system aims to streamline this process by filtering out unwanted scenes and enhancing an individual's face through super-resolution. An automatic face recognition system is then used to identify the subject or present the human operator with likely matches from a database. A person tracker is used to speed up the subject detection and super-resolution process by tracking moving subjects and cropping a region of interest around the subject's face, reducing the number and size of the image frames to be super-resolved. In this paper, experiments have been conducted to demonstrate how the optical flow super-resolution method used improves surveillance imagery for visual inspection as well as automatic face recognition on an Eigenface and Elastic Bunch Graph Matching system. The optical flow based method has also been benchmarked against the "hallucination" algorithm, interpolation methods and the original low-resolution images. Results show that both super-resolution algorithms improved recognition rates significantly. Although the hallucination method resulted in slightly higher recognition rates, the optical flow method produced fewer artifacts and more visually correct images suitable for human consumption.
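As a rough sketch of the tracker's cropping step, the function below clamps a square window around a tracked face position in a frame stored as nested lists, so that only this small region (rather than the full frame) would be passed to super-resolution. The function name and window convention are assumptions, not the paper's API.

```python
def crop_roi(frame, center, half_size):
    """Clamp-and-crop a (2*half_size+1)-sided window around `center`
    (row, col), staying inside the frame bounds. Illustrative only."""
    rows, cols = len(frame), len(frame[0])
    r0 = max(0, center[0] - half_size)
    r1 = min(rows, center[0] + half_size + 1)
    c0 = max(0, center[1] - half_size)
    c1 = min(cols, center[1] + half_size + 1)
    return [row[c0:c1] for row in frame[r0:r1]]
```

Cropping before super-resolution cuts both the number of pixels per frame and the cost of the optical-flow registration between frames.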

Relevance: 20.00%

Abstract:

Automatic recognition of people is an active field of research with important forensic and security applications. In these applications, it is not always possible for the subject to be in close proximity to the system. Voice represents a human behavioural trait which can be used to recognise people in such situations. Automatic Speaker Verification (ASV) is the process of verifying a person's identity through the analysis of their speech, and enables recognition of a subject at a distance over a telephone channel, wired or wireless. A significant amount of research has focussed on the application of Gaussian mixture model (GMM) techniques to speaker verification systems, providing state-of-the-art performance. GMMs are a type of generative classifier trained to model the probability distribution of the features used to represent a speaker. Recently introduced to the field of ASV research is the support vector machine (SVM). An SVM is a discriminative classifier requiring examples from both positive and negative classes to train a speaker model. The SVM is based on margin maximisation, whereby a hyperplane attempts to separate classes in a high dimensional space. SVMs applied to the task of speaker verification have shown high potential, particularly when used to complement current GMM-based techniques in hybrid systems. This work aims to improve the performance of ASV systems using novel and innovative SVM-based techniques. Research was divided into three main themes: session variability compensation for SVMs; unsupervised model adaptation; and impostor dataset selection. The first theme investigated the differences between the GMM and SVM domains for the modelling of session variability, an aspect crucial for robust speaker verification. Techniques developed to improve the robustness of GMM-based classification were shown to bring about similar benefits to discriminative SVM classification through their integration in the hybrid GMM mean supervector SVM classifier.
Further, the domains for the modelling of session variation were contrasted to find a number of common factors; however, the SVM domain consistently provided marginally better session variation compensation. Minimal complementary information was found between the techniques due to the similarities in how they achieved their objectives. The second theme saw the proposal of a novel model for the purpose of session variation compensation in ASV systems. Continuous progressive model adaptation attempts to improve speaker models by retraining them after exploiting all encountered test utterances during normal use of the system. The introduction of the weight-based factor analysis model provided significant performance improvements of over 60% in an unsupervised scenario. SVM-based classification was then integrated into the progressive system, providing further benefits in performance over the GMM counterpart. Analysis demonstrated that SVMs also hold several characteristics beneficial to the task of unsupervised model adaptation, prompting further research in the area. In pursuing the final theme, an innovative background dataset selection technique was developed. This technique selects the most appropriate subset of examples from a large and diverse set of candidate impostor observations for use as the SVM background by exploiting the SVM training process. This selection was performed on a per-observation basis so as to overcome the shortcoming of the traditional heuristic-based approach to dataset selection. Results demonstrate that the approach provides performance improvements over both the use of the complete candidate dataset and the best heuristically-selected dataset, whilst being only a fraction of the size. The refined dataset was also shown to generalise well to unseen corpora and to be highly applicable to the selection of impostor cohorts required in alternate techniques for speaker verification.
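The GMM mean supervector representation mentioned above is simply the concatenation of the per-component mean vectors of an adapted GMM, which a linear SVM then scores. The minimal sketch below assumes the component means and the SVM weights are already available; the function names are illustrative.

```python
import numpy as np

def mean_supervector(component_means):
    """Stack per-component GMM mean vectors into one long supervector,
    the feature fed to the SVM in hybrid GMM-SVM speaker verification."""
    return np.concatenate([np.asarray(m, dtype=float).ravel()
                           for m in component_means])

def svm_score(weights, bias, supervector):
    """Linear SVM decision value for a test supervector: w . x + b."""
    return float(np.dot(weights, supervector) + bias)
```

A GMM with C components of dimension D thus yields a fixed-length C*D vector per utterance, regardless of the utterance's duration, which is what makes SVM training over utterances possible.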

Relevance: 20.00%

Abstract:

Principal Topic: Project structures are often created by entrepreneurs and large corporate organizations to develop new products. Since new product development projects (NPDP) are more often situated within a larger organization, intrapreneurship or corporate entrepreneurship plays an important role in bringing these projects to fruition. Since NPDP often involves the development of a new product using immature technology, we describe the development of one such immature technology. The Joint Strike Fighter (JSF) F-35 aircraft is being developed by the U.S. Department of Defense and eight allied nations. In 2001 Lockheed Martin won a $19 billion contract to develop an affordable, stealthy and supersonic all-weather strike fighter designed to replace a wide range of aging fighter aircraft. In this research we define a complex project as one that demonstrates a number of sources of uncertainty to a degree, or level of severity, that makes it extremely difficult to predict project outcomes or to control or manage the project (Remington & Zolin, Forthcoming). Project complexity has been conceptualized by Remington and Pollack (2007) in terms of four major sources of complexity: temporal, directional, structural and technological complexity (see Figure 1). Temporal complexity exists when projects experience significant environmental change outside the direct influence or control of the project. The Global Economic Crisis of 2008-2009 is a good example of the type of environmental change that can make a project complex, as in the JSF project, where project managers attempt to respond to changes in interest rates, international currency exchange rates, commodity prices, etc. Directional complexity exists in a project where stakeholders' goals are unclear or undefined, where progress is hindered by unknown political agendas, or where stakeholders disagree or misunderstand project goals.
In the JSF project all the services and all partner countries have to agree to the specifications of the three variants of the aircraft: Conventional Take Off and Landing (CTOL), Short Take Off/Vertical Landing (STOVL) and the Carrier Variant (CV). Because the Navy requires a plane that can take off and land on an aircraft carrier, a special variant of the aircraft design was required, adding complexity to the project. Technical complexity occurs in a project using technology that is immature or where design characteristics are unknown or untried. Developing a plane that can take off on a very short runway and land vertically created many highly interdependent technological challenges: correctly locating, directing and balancing the lift fans, modulating the airflow, and providing an equivalent amount of thrust from the downward-vectored rear exhaust to lift the aircraft while at the same time controlling engine temperatures. These technological challenges make costing and scheduling equally challenging. Structural complexity in a project comes from the sheer number of elements, such as the number of people, teams or organizations involved, ambiguity regarding the elements, and the massive degree of interconnectedness between them. While Lockheed Martin is the prime contractor, they are assisted in major aspects of the JSF development by Northrop Grumman, BAE Systems, Pratt & Whitney, the GE/Rolls-Royce Fighter Engine Team and innumerable subcontractors. In addition to identifying opportunities to achieve project goals, complex projects also need to identify and exploit opportunities to increase agility in response to changing stakeholder demands or to reduce project risks. Complexity Leadership Theory contends that in complex environments adaptive and enabling leadership are needed (Uhl-Bien, Marion and McKelvey, 2007).
Adaptive leadership facilitates creativity, learning and adaptability, while enabling leadership handles the conflicts that inevitably arise between adaptive leadership and traditional administrative leadership (Uhl-Bien and Marion, 2007). Hence, adaptive leadership involves the recognition of opportunities to adapt, while enabling leadership involves the exploitation of these opportunities. Our research questions revolve around the type or source of complexity and its relationship to opportunity recognition and exploitation. For example, is it only external environmental complexity that creates the need for entrepreneurial behaviours such as opportunity recognition and opportunity exploitation? Do the internal dimensions of project complexity, such as technological and structural complexity, also create the need for opportunity recognition and opportunity exploitation? The Kropp, Zolin and Lindsay model (2009) describes a relationship between entrepreneurial orientation (EO), opportunity recognition (OR), and opportunity exploitation (OX) in complex projects, with environmental and organizational contextual variables as moderators. We extend their model by defining the effects of external complexity and internal complexity on OR and OX. ---------- Methodology/Key Propositions: When the environment is complex, EO is more likely to result in OR because project members will be actively looking for solutions to problems created by environmental change. But in projects that are technologically or structurally complex, project leaders and members may try to make the minimum changes possible to reduce the risk of creating new problems due to delays or schedule changes. In projects with environmental or technological complexity, project leaders who encourage the innovativeness dimension of EO will increase OR.
But in projects with technical or structural complexity, innovativeness will not necessarily result in the recognition and exploitation of opportunities, due to the over-riding importance of maintaining stability in the highly intricate and interconnected project structure. We propose that in projects with environmental complexity creating the need for change and innovation, project leaders who are willing to accept and manage risk are more likely to identify opportunities to increase project effectiveness and efficiency. In contrast, in projects with internal complexity a much higher willingness to accept risk will be necessary to trigger opportunity recognition. In structurally complex projects we predict it will be less likely to find a relationship between risk taking and OR. When the environment is complex and a project has autonomy, project members will be motivated to execute opportunities to improve the project's performance. In contrast, when the project has high internal complexity, they will be more cautious in execution. Likewise, when a project experiences high competitive aggressiveness and its environment is complex, project leaders will be motivated to execute opportunities to improve the project's performance, but will be more cautious in execution when the project has high internal complexity. This paper reports the first stage of a three-year study into the behaviours of managers, leaders and team members of complex projects. We conduct a qualitative study involving a group discussion with experienced project leaders. The objective is to determine how leaders of large and potentially complex projects perceive that external and internal complexity will influence the effects of EO on OR. ---------- Results and Implications: These results will help identify and distinguish the impact of external and internal complexity on entrepreneurial behaviours in NPDP.
Project managers will be better able to quickly decide how and when to respond to changes in the environment and internal project events.

Relevance: 20.00%

Abstract:

The recently proposed data-driven background dataset refinement technique provides a means of selecting an informative background for support vector machine (SVM)-based speaker verification systems. This paper investigates the characteristics of the impostor examples in such highly-informative background datasets. Data-driven dataset refinement individually evaluates the suitability of candidate impostor examples for the SVM background prior to selecting the highest-ranking examples as a refined background dataset. Further, the characteristics of the refined dataset were analysed to investigate the desired traits of an informative SVM background. The most informative examples of the refined dataset were found to consist of large amounts of active speech and distinctive language characteristics. The data-driven refinement technique was shown to filter the set of candidate impostor examples to produce a more dispersed representation of the impostor population in the SVM kernel space, thereby reducing the number of redundant and less-informative examples in the background dataset. Furthermore, data-driven refinement was shown to provide performance gains when applied to the difficult task of refining a small candidate dataset that was mismatched to the evaluation conditions.
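In outline, data-driven refinement scores every candidate impostor example for its suitability as background material and keeps only the top-ranked subset. The sketch below assumes a suitability function is supplied externally; the names are hypothetical, not the paper's API.

```python
def refine_background(candidates, suitability, keep):
    """Data-driven refinement sketch: rank candidate impostor examples
    by a supplied suitability score and keep the `keep` highest-ranked
    as the refined SVM background dataset."""
    ranked = sorted(candidates, key=suitability, reverse=True)
    return ranked[:keep]
```

The per-example ranking is what lets the refined set stay a fraction of the candidate set's size while dropping redundant, less-informative impostors.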

Relevance: 20.00%

Abstract:

This study assesses the recently proposed data-driven background dataset refinement technique for speaker verification using SVM feature sets other than the GMM supervector features for which it was originally designed. The performance improvements brought about in each trialled SVM configuration demonstrate the versatility of background dataset refinement. This work also extends the originally proposed technique to exploit support vector coefficients as an impostor suitability metric in the data-driven selection process. Using support vector coefficients improved the performance of the refined datasets in the evaluation of unseen data. Further, attempts are made to exploit the differences in impostor example suitability measures from varying feature spaces to provide added robustness.
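The support-vector-coefficient metric can be sketched as ranking candidates by the magnitude of the coefficient each example received during a background SVM training run; a large coefficient marks an example that actively shapes the decision boundary. The coefficients are assumed precomputed here, and all names are illustrative.

```python
def rank_by_sv_coefficient(examples, alphas, keep):
    """Sketch of the coefficient-based suitability metric: keep the
    `keep` candidate impostors whose (assumed precomputed) support
    vector coefficients have the largest magnitude."""
    order = sorted(range(len(examples)),
                   key=lambda i: abs(alphas[i]), reverse=True)
    return [examples[i] for i in order[:keep]]
```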

Relevance: 20.00%

Abstract:

Information fusion in biometrics has received considerable attention. The architecture proposed here is based on the sequential integration of multi-instance and multi-sample fusion schemes. This method is analytically shown to improve the performance and allow a controlled trade-off between false alarms and false rejects when the classifier decisions are statistically independent. Equations developed for detection error rates are experimentally evaluated by considering the proposed architecture for text-dependent speaker verification using HMM-based digit-dependent speaker models. The tuning of parameters, n classifiers and m attempts/samples, is investigated and the resultant detection error trade-off performance is evaluated on individual digits. Results show that performance improvement can be achieved even for weaker classifiers (FRR = 19.6%, FAR = 16.7%). The architectures investigated apply to speaker verification from spoken digit strings such as credit card numbers in telephone, VoIP or internet-based applications.
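Under the statistical-independence assumption stated above, the textbook error-rate forms for "all n classifiers must accept" (multi-instance, AND rule) and "any of m attempts may be accepted" (multi-sample, OR rule) fusion are easy to state. The paper's exact equations are not reproduced in the abstract, so treat the functions below as the standard versions, not the published derivation.

```python
def and_fusion(far, frr, n):
    """AND rule over n independent classifiers: an impostor must fool
    all n (FAR falls as far**n), while a genuine user is rejected if
    any one classifier rejects (FRR rises)."""
    return far ** n, 1.0 - (1.0 - frr) ** n

def or_fusion(far, frr, m):
    """OR rule over m independent attempts: any single acceptance is
    enough, so FAR rises while FRR falls as frr**m."""
    return 1.0 - (1.0 - far) ** m, frr ** m
```

Cascading the two rules (n classifiers per attempt, m attempts) gives the controlled false-alarm/false-reject trade-off the abstract describes, e.g. starting from the weak classifier figures FAR = 16.7%, FRR = 19.6%.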

Relevance: 20.00%

Abstract:

Recovering position from sensor information is an important problem in mobile robotics, known as localisation. Localisation requires a map or some other description of the environment to provide the robot with a context to interpret sensor data. The mobile robot system under discussion uses an artificial neural representation of position. Building a geometrical map of the environment with a single camera and artificial neural networks is difficult. Instead, it would be simpler to learn position as a function of the visual input. Usually when learning images, an intermediate representation is employed. An appropriate starting point for biologically plausible image representation is the complex cells of the visual cortex, which have invariance properties that appear useful for localisation. The effectiveness for localisation of two different complex cell models is evaluated. Finally, the ability of a simple neural network with single-shot learning to recognise these representations and localise a robot is examined.
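The invariance the abstract alludes to is captured by the classic energy model of a complex cell: a quadrature pair of filters whose squared responses are summed, making the output insensitive to stimulus phase. The 1-D sketch below is purely illustrative and is not one of the two specific models the paper evaluates.

```python
import numpy as np

def complex_cell_response(signal, freq):
    """Energy-model sketch of a complex cell: project the signal onto a
    cosine/sine quadrature pair at `freq` and combine the two squared
    responses, yielding a phase-invariant output."""
    n = np.arange(len(signal))
    even = np.sum(signal * np.cos(2 * np.pi * freq * n))
    odd = np.sum(signal * np.sin(2 * np.pi * freq * n))
    return float(np.hypot(even, odd))
```

Phase invariance means small image shifts barely change the representation, which is exactly the kind of stability a position-recognition network benefits from.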

Relevance: 20.00%

Abstract:

The paper presents a fast and robust stereo object recognition method. The method is currently unable to identify the rotation of objects. This makes it very good at locating spheres, which are rotationally independent. Approximate methods for locating non-spherical objects have been developed. Fundamental to the method is that the correspondence problem is solved using information about the dimensions of the object being located. This is in contrast to previous stereo object recognition systems, where the scene is first reconstructed by point matching techniques. The method is suitable for real-time application on low-power devices.
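The dimensional constraint can be illustrated with the standard pinhole-camera relations: the stereo disparity of a candidate left/right pairing implies one depth estimate, and the object's known physical size against its apparent pixel size implies another, so a pairing is plausible only when the two agree. The functions below are a hedged sketch under these textbook relations, not the paper's implementation.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Pinhole stereo: depth Z = f * B / d for focal length f (pixels),
    baseline B (metres) and disparity d (pixels)."""
    return focal_px * baseline_m / disparity_px

def depth_from_size(focal_px, true_diameter_m, pixel_diameter):
    """Known-dimension cue: an object of true diameter D appearing
    p pixels wide lies at depth Z = f * D / p."""
    return focal_px * true_diameter_m / pixel_diameter

def correspondence_plausible(z_disparity, z_size, tol_m=0.1):
    """Accept a left/right pairing only when both depth cues agree."""
    return abs(z_disparity - z_size) <= tol_m
```

Rejecting pairings whose two depth estimates disagree avoids the full point-matching reconstruction, which is what keeps the method fast enough for low-power devices.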

Relevance: 20.00%

Abstract:

Several approaches have been proposed to recognize handwritten Bengali characters using different curve fitting algorithms and curvature analysis. In this paper, a new curve-fitting algorithm to identify the various strokes of a handwritten character is developed. The curve-fitting algorithm helps recognize strokes of different patterns (line, quadratic curve) precisely, substantially reducing the burden of error elimination. Implementation of this modified syntactic method demonstrates significant improvement in the recognition of Bengali handwritten characters.
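A minimal version of the line-versus-quadratic stroke test: fit a quadratic to the stroke points by least squares and call the stroke a line when the quadratic coefficient is negligible. The threshold and function name are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def classify_stroke(xs, ys, curve_tol=1e-3):
    """Fit y = a*x**2 + b*x + c by least squares; if the curvature
    term |a| is negligible the stroke is treated as a line, otherwise
    as a quadratic curve. Sketch of the idea only."""
    a, b, c = np.polyfit(xs, ys, 2)
    return "line" if abs(a) < curve_tol else "quadratic"
```

Classifying each stroke into a small pattern vocabulary like this is what lets a syntactic recognizer assemble whole characters from stroke labels.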