38 results for LIP
in Queensland University of Technology - ePrints Archive
Abstract:
Acoustically, car cabins are extremely noisy, and as a consequence audio-only in-car voice recognition systems perform poorly. As the visual modality is immune to acoustic noise, using visual lip information from the driver is seen as a viable strategy for circumventing this problem through audio-visual automatic speech recognition (AVASR). However, implementing AVASR requires a system that can accurately locate and track the driver’s face and lip area in real time. In this paper we present such an approach using the Viola-Jones algorithm. Using the AVICAR [1] in-car database, we show that the Viola-Jones approach is a suitable method of locating and tracking the driver’s lips for an audio-visual speech recognition system, despite the visual variability of illumination and head pose.
Abstract:
The performance of automatic speech recognition systems deteriorates in the presence of noise. One known solution is to incorporate video information with an existing acoustic speech recognition system. We investigate the performance of the individual acoustic and visual sub-systems and then examine different ways in which the integration of the two systems may be performed. The system is to be implemented in real time on a Texas Instruments' TMS320C80 DSP.
An approach to statistical lip modelling for speaker identification via chromatic feature extraction
Abstract:
This paper presents a novel technique for tracking moving lips for the purpose of speaker identification. In our system, a model of the lip contour is formed directly from chromatic information in the lip region. Iterative refinement of contour point estimates is not required. Colour features are extracted from the lips via concatenated profiles taken around the lip contour. The dimensionality of the lip features is reduced via principal component analysis (PCA) followed by linear discriminant analysis (LDA). Statistical speaker models are built from the lip features based on the Gaussian mixture model (GMM). Identification experiments performed on the M2VTS1 database show encouraging results.
Abstract:
This paper investigates the use of lip information, in conjunction with speech information, for robust speaker verification in the presence of background noise. It has been previously shown in our own work, and in the work of others, that features extracted from a speaker's moving lips hold speaker dependencies which are complementary to speech features. We demonstrate that the fusion of lip and speech information allows for a highly robust speaker verification system which outperforms either sub-system. We present a new technique for determining the weighting to be applied to each modality so as to optimize the performance of the fused system. Given a correct weighting, lip information is shown to be highly effective for reducing the false acceptance and false rejection error rates in the presence of background noise.
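The abstract does not describe the weighting technique itself. As a rough illustration of what fusing two modality scores with a tunable weight can look like, the sketch below uses a plain weighted sum and a grid search over the weight. This is an assumption for illustration, not the authors' method; the function names, the toy scores and the decision threshold are all hypothetical.

```python
from typing import List, Tuple

# Each trial: (speech_score, lip_score, is_genuine). Toy values, not real data.
Trial = Tuple[float, float, bool]


def fuse(speech_score: float, lip_score: float, alpha: float) -> float:
    """Weighted sum of two modality scores; alpha is the speech weight."""
    return alpha * speech_score + (1.0 - alpha) * lip_score


def error_rates(trials: List[Trial], alpha: float, threshold: float) -> Tuple[float, float]:
    """Return (false acceptance rate, false rejection rate).

    Assumes the trial list contains at least one genuine and one impostor trial.
    """
    fa = fr = genuine = impostor = 0
    for speech, lip, is_genuine in trials:
        accepted = fuse(speech, lip, alpha) >= threshold
        if is_genuine:
            genuine += 1
            fr += not accepted  # genuine speaker rejected
        else:
            impostor += 1
            fa += accepted  # impostor accepted
    return fa / impostor, fr / genuine


def best_weight(trials: List[Trial], threshold: float, steps: int = 10) -> float:
    """Grid-search alpha in [0, 1], minimising the sum of the two error rates."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda a: sum(error_rates(trials, a, threshold)))


# Hypothetical trials in which each modality alone misclassifies half the cases.
TRIALS: List[Trial] = [
    (0.9, 0.3, True),
    (0.3, 0.9, True),
    (0.6, 0.1, False),
    (0.1, 0.6, False),
]

if __name__ == "__main__":
    w = best_weight(TRIALS, threshold=0.5)
    print("speech weight:", w, "error rates:", error_rates(TRIALS, w, 0.5))
```

In this toy setup either modality on its own yields a 50% false acceptance and false rejection rate, while an intermediate weight separates all four trials, which mirrors the abstract's claim that a correctly weighted fusion can outperform either sub-system.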
Abstract:
A new technique is proposed for learning the dynamic characteristics of a deformable object, applied in particular to the problem of lip-tracking. Experimental results demonstrate that the use of dynamic models allows the system to track more robustly under adverse conditions and to correct spurious, poorly tracked frames.
Abstract:
Investigates the use of temporal lip information, in conjunction with speech information, for robust, text-dependent speaker identification. We propose that significant speaker-dependent information can be obtained from moving lips, enabling speaker recognition systems to be highly robust in the presence of noise. The fusion structure for the audio and visual information is based around the use of multi-stream hidden Markov models (MSHMM), with audio and visual features forming two independent data streams. Recent work with multi-modal MSHMMs has been performed successfully for the task of speech recognition. Temporal lip information has been used for speaker identification previously (T.J. Wark et al., 1998); however, this has been restricted to output fusion via single-stream HMMs. We present an extension to this previous work, and show that an MSHMM is a valid structure for multi-modal speaker identification.
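For readers unfamiliar with the structure, a commonly used form of the multi-stream HMM emission likelihood combines the per-stream likelihoods through exponent weights. The notation below (state $j$, audio and visual observations $o_t^{A}$ and $o_t^{V}$, stream weights $\lambda_A$ and $\lambda_V$) is assumed here for illustration; the abstract does not give the authors' own formulation.

```latex
b_j(o_t) = \left[ b_j^{A}\!\left(o_t^{A}\right) \right]^{\lambda_A}
           \left[ b_j^{V}\!\left(o_t^{V}\right) \right]^{\lambda_V},
\qquad \lambda_A, \lambda_V \ge 0
```

In the log domain this becomes a weighted sum of the two stream log-likelihoods, and the weights are often constrained to sum to one so that they express the relative reliability of the audio and visual streams.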
Abstract:
Investigates the use of lip information, in conjunction with speech information, for robust speaker verification in the presence of background noise. We have previously shown (Int. Conf. on Acoustics, Speech and Signal Proc., vol. 6, pp. 3693-3696, May 1998) that features extracted from a speaker's moving lips hold speaker dependencies which are complementary to speech features. We demonstrate that the fusion of lip and speech information allows for a highly robust speaker verification system which outperforms either subsystem individually. We present a new technique for determining the weighting to be applied to each modality so as to optimize the performance of the fused system. Given a correct weighting, lip information is shown to be highly effective for reducing the false acceptance and false rejection error rates in the presence of background noise.
Abstract:
The Early–mid Cretaceous marks the confluence of three major continental-scale events in eastern Gondwana: (1) the emplacement of a Silicic Large Igneous Province (LIP) near the continental margin; (2) the volcaniclastic fill, transgression and regression of a major epicontinental seaway developed over at least a quarter of the Australian continent; and (3) epeirogenic uplift, exhumation and continental rupturing culminating in the opening of the Tasman Basin c. 84 Ma. The Whitsunday Silicic LIP event had widespread impact, producing both substantial extrusive volumes of dominantly silicic pyroclastic material and coeval first-cycle volcanogenic sediment that accumulated within many eastern Australian sedimentary basins, principally in the Great Australian Basin system (>2 Mkm³ combined volume). The final pulse of volcanism and volcanogenic sedimentation at c. 105–95 Ma coincided with epicontinental seaway regression, which shows a lack of correspondence with the global sea-level curve and instead records a wider, continental-scale effect of volcanism and rift tectonism. Widespread igneous underplating related to this LIP event is evident from high paleogeothermal gradients and regional hydrothermal fluid flow detectable in the shallow crust over a broad region. Enhanced CO₂ fluxing through sedimentary basins also indirectly records large-scale, LIP-related mafic underplating. A discrete episode of rapid crustal cooling and exhumation began c. 100–90 Ma along the length of the eastern Australian margin, related to an enhanced phase of continental rifting that was largely amagmatic, and probably a switch from wide to more narrow rift modes. Along-margin variations in detachment fault architecture produced narrow (SE Australia) and wide continental margins with marginal, submerged continental plateaux (NE Australia). Long-lived NE-trending cross-orogen lineaments controlled the switch from narrow to wide continental margin geometries.
Abstract:
LIP emplacement is linked to the timing and evolution of supercontinental break-up. LIP-related break-up produces volcanic rifted margins, new and large (up to 10⁸ km²) ocean basins, and new, smaller continents that undergo dispersal and potentially reassembly (e.g., India). However, not all continental LIPs lead to continental rupture. We analysed the <330 Ma continental LIP record (following final assembly of Pangea) to find relationships between LIP event attributes (e.g., igneous volume, extent, distance from pre-existing continental margin) and ocean basin attributes (e.g., length of new ocean basin/rifted margin) and how these varied during the progressive break-up of Pangea. No correlation exists between LIP magnitude and size of the subsequent ocean basin or rifted margin. Our review suggests a three-phased break-up history of Pangea: 1) “Preconditioning” phase (∼330–200 Ma): LIP events (n=7) occurred largely around the supercontinental margin, clustering today in Asia, with a low (<20%) rifting success rate. The Panjal Traps at ∼280 Ma may represent the first continental rupturing event of Pangea, resulting in continental ribboning along the Tethyan margin; 2) “Main Break-up” phase (∼200–100 Ma): numerous large LIP events (n=10) in the supercontinent interior, resulting in highly successful fragmentation (90%) and large, new ocean basins (e.g., Central/South Atlantic, Indian, >3000 km long); 3) “Waning” phase (∼100–0 Ma): declining LIP magnitudes (n=6) and greater proximity to continental margins (e.g., Madagascar, North Atlantic, Afro-Arabia, Sierra Madre), producing smaller ocean basins (<2600 km long). How Pangea broke up may thus have implications for earlier supercontinent reconstructions and the LIP record.
Abstract:
The epidemic of obesity is impacting an increasing proportion of children, adolescents and adults with a common feature being low levels of physical activity (PA). Despite having more knowledge than ever before about the benefits of PA for health and the growth and development of youngsters, we are only paying lip-service to the development of motor skills in children. Fun, enjoyment and basic skills are the essential underpinnings of meaningful participation in PA. A concurrent problem is the reported increase in sitting time with the most common sedentary behaviors being TV viewing and other screen-based games. Limitations of time have contributed to a displacement of active behaviors with inactive pursuits, which has contributed to reductions in activity energy expenditure. To redress the energy imbalance in overweight and obese children, we urgently need out-of-the-box multisectoral solutions. There is little to be gained from a shame and blame mentality where individuals, their parents, teachers and other groups are singled out as causes of the problem. Such an approach does little more than shift attention from the main game of prevention and management of the condition, which requires a concerted, whole-of-government approach (in each country). The failure to support and encourage all young people to participate in regular PA will increase the chance that our children will live shorter and less healthy lives than their parents. In short, we need novel environmental approaches to foster a systematic increase in PA. This paper provides examples of opportunities and challenges for PA strategies to prevent obesity with a particular emphasis on the school and home settings.
Abstract:
Acoustically, vehicles are extremely noisy environments, and as a consequence audio-only in-car voice recognition systems perform very poorly. Since the visual modality is immune to acoustic noise, using visual lip information from the driver is seen as a viable strategy for circumventing this problem. However, implementing such an approach requires a system that can accurately locate and track the driver’s face and facial features in real time. In this paper we present such an approach using the Viola-Jones algorithm. Our results show that the Viola-Jones approach is a suitable method of locating and tracking the driver’s lips despite the visual variability of illumination and head pose.
Abstract:
Acoustically, car cabins are extremely noisy, and as a consequence existing audio-only speech recognition systems for voice-based control of vehicle functions, such as the GPS-based navigator, perform poorly. Audio-only speech recognition systems fail to make use of the visual modality of speech (e.g., lip movements). As the visual modality is immune to acoustic noise, utilising this visual information in conjunction with an audio-only speech recognition system has the potential to improve the accuracy of the system. The field of recognising speech using both auditory and visual inputs is known as Audio Visual Speech Recognition (AVSR). Research in the AVASR field has been ongoing for the past twenty-five years, with notable progress being made. However, the practical deployment of AVASR systems in a variety of real-world applications has not yet emerged. The main reason is that most research to date has neglected to address variabilities in the visual domain, such as illumination and viewpoint, in the design of the visual front-end of the AVSR system. In this paper we present an AVASR system in a real-world car environment using the AVICAR database [1], which is a publicly available in-car database, and we show that using visual speech in conjunction with the audio modality improves the robustness and effectiveness of voice-only recognition systems in car cabin environments.
Abstract:
The study aimed to evaluate the suitability of Escherichia coli, enterococci and C. perfringens for assessing the microbiological quality of roof-harvested rainwater, and to assess whether the concentrations of these faecal indicators can be used to predict the presence or absence of specific zoonotic bacterial or protozoan pathogens. From a total of 100 samples tested, 58%, 83% and 46% were found to be positive for E. coli, enterococci and C. perfringens spores, respectively, as determined by traditional culture-based methods. Additionally, 7%, 19%, 1%, 8%, 17% and 15% of the samples tested were PCR positive for the A. hydrophila lip, C. coli ceuE, C. jejuni mapA, L. pneumophila mip, Salmonella invA and G. lamblia β-giardin genes, respectively. However, none of the samples was positive for E. coli O157 LPS, VT1, VT2 and C. parvum COWP genes. The presence or absence of these potential pathogens did not correlate with any of the faecal indicator bacterial concentrations as determined by a binary logistic regression model. The roof-harvested rainwater samples tested in this study appear to be of poor microbiological quality, and no significant correlation was found between the concentration of faecal indicators and pathogenic microorganisms. These results raise questions about the reliability of faecal indicator bacteria in assessing the microbiological quality of water, particularly given their poor correlation with pathogenic microorganisms. The presence of one or more zoonotic pathogens suggests that microbiological analysis of the water should be performed and appropriate treatment measures undertaken, especially in tanks where the water is used for drinking.