999 results for Session Variation
Abstract:
Object classification is plagued by the issue of session variation. Session variation describes any variation that makes one instance of an object look different to another, for instance due to pose or illumination variation. Recent work in the challenging task of face verification has shown that session variability modelling provides a mechanism to overcome some of these limitations. However, for computer vision purposes, it has only been applied in the limited setting of face verification. In this paper we propose a local region-based inter-session variability (ISV) modelling approach, termed Local ISV, so that local session variations can be modelled, and apply it to challenging real-world data. We then demonstrate the efficacy of this technique on a challenging real-world fish image database which includes images taken underwater, providing significant real-world session variations. This Local ISV approach provides a relative performance improvement of, on average, 23% on the challenging MOBIO, Multi-PIE and SCface face databases. It also provides a relative performance improvement of 35% on our challenging fish image dataset.
Abstract:
Automatic recognition of people is an active field of research with important forensic and security applications. In these applications, it is not always possible for the subject to be in close proximity to the system. Voice represents a human behavioural trait which can be used to recognise people in such situations. Automatic Speaker Verification (ASV) is the process of verifying a person's identity through the analysis of their speech and enables recognition of a subject at a distance over a telephone channel, wired or wireless. A significant amount of research has focussed on the application of Gaussian mixture model (GMM) techniques to speaker verification systems providing state-of-the-art performance. GMMs are a type of generative classifier trained to model the probability distribution of the features used to represent a speaker. Recently introduced to the field of ASV research is the support vector machine (SVM). An SVM is a discriminative classifier requiring examples from both positive and negative classes to train a speaker model. The SVM is based on margin maximisation whereby a hyperplane attempts to separate classes in a high dimensional space. SVMs applied to the task of speaker verification have shown high potential, particularly when used to complement current GMM-based techniques in hybrid systems. This work aims to improve the performance of ASV systems using novel and innovative SVM-based techniques. Research was divided into three main themes: session variability compensation for SVMs; unsupervised model adaptation; and impostor dataset selection. The first theme investigated the differences between the GMM and SVM domains for the modelling of session variability, an aspect crucial for robust speaker verification. Techniques developed to improve the robustness of GMM-based classification were shown to bring about similar benefits to discriminative SVM classification through their integration in the hybrid GMM mean supervector SVM classifier.
Further, the domains for the modelling of session variation were contrasted to find a number of common factors; however, the SVM domain consistently provided marginally better session variation compensation. Minimal complementary information was found between the techniques due to the similarities in how they achieved their objectives. The second theme saw the proposal of a novel model for the purpose of session variation compensation in ASV systems. Continuous progressive model adaptation attempts to improve speaker models by retraining them after exploiting all encountered test utterances during normal use of the system. The introduction of the weight-based factor analysis model provided significant performance improvements of over 60% in an unsupervised scenario. SVM-based classification was then integrated into the progressive system providing further benefits in performance over the GMM counterpart. Analysis demonstrated that SVMs also hold several beneficial characteristics for the task of unsupervised model adaptation, prompting further research in the area. In pursuing the final theme, an innovative background dataset selection technique was developed. This technique selects the most appropriate subset of examples from a large and diverse set of candidate impostor observations for use as the SVM background by exploiting the SVM training process. This selection was performed on a per-observation basis so as to overcome the shortcoming of the traditional heuristic-based approach to dataset selection. Results demonstrate that the approach provides performance improvements over both the use of the complete candidate dataset and the best heuristically-selected dataset whilst being only a fraction of the size. The refined dataset was also shown to generalise well to unseen corpora and be highly applicable to the selection of impostor cohorts required in alternate techniques for speaker verification.
Abstract:
A significant amount of speech is typically required for speaker verification system development and evaluation, especially in the presence of large intersession variability. This paper introduces source and utterance-duration normalized linear discriminant analysis (SUN-LDA) approaches to compensate for session variability in short-utterance i-vector speaker verification systems. Two variations of SUN-LDA are proposed where normalization techniques are used to capture source variation from both short and full-length development i-vectors, one based upon pooling (SUN-LDA-pooled) and the other on concatenation (SUN-LDA-concat) across the duration and source-dependent session variation. Both the SUN-LDA-pooled and SUN-LDA-concat techniques are shown to provide improvement over traditional LDA on NIST 08 truncated 10sec-10sec evaluation conditions, with the highest improvement obtained with the SUN-LDA-concat technique achieving a relative improvement of 8% in EER for mismatched conditions and over 3% for matched conditions over traditional LDA approaches.
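The relative EER improvements quoted in this abstract presumably follow the usual definition: the reduction in error rate divided by the baseline error rate. A minimal Python sketch of that arithmetic is given here; the function name and the example numbers are hypothetical and are not taken from the paper.

```python
def relative_improvement(baseline_eer: float, new_eer: float) -> float:
    """Return the relative reduction in error rate, as a percentage.

    Both arguments are error rates in the same unit (e.g. percent EER).
    """
    if baseline_eer <= 0:
        raise ValueError("baseline_eer must be positive")
    return 100.0 * (baseline_eer - new_eer) / baseline_eer


# Hypothetical values for illustration only (not results from the paper):
# a baseline EER of 20.0% reduced to 18.4%.
print(round(relative_improvement(20.0, 18.4), 1))
```

Note that a relative improvement of 8% is not the same as an absolute drop of 8 percentage points; in the hypothetical case above the absolute drop is 1.6 points.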
Abstract:
This paper proposes techniques to improve the performance of i-vector based speaker verification systems when only short utterances are available. Short-length utterance i-vectors vary with speaker, session variations, and the phonetic content of the utterance. Well established methods such as linear discriminant analysis (LDA), source-normalized LDA (SN-LDA) and within-class covariance normalisation (WCCN) exist for compensating the session variation, but we have identified the variability introduced by phonetic content due to utterance variation as an additional source of degradation when short-duration utterances are used. To compensate for utterance variations in short i-vector speaker verification systems using cosine similarity scoring (CSS), we have introduced a short utterance variance normalization (SUVN) technique and a short utterance variance (SUV) modelling approach at the i-vector feature level. A combination of SUVN with LDA and SN-LDA is proposed to compensate for the session and utterance variations and is shown to provide improvement in performance over the traditional approach of using LDA and/or SN-LDA followed by WCCN. An alternative approach is also introduced using a probabilistic linear discriminant analysis (PLDA) approach to directly model the SUV. The combination of SUVN, LDA and SN-LDA followed by SUV PLDA modelling provides an improvement over the baseline PLDA approach. We also show that for this combination of techniques, the utterance variation information needs to be artificially added to full-length i-vectors for PLDA modelling.
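Cosine similarity scoring, as mentioned in this abstract, compares a target-speaker vector with a test vector by the cosine of the angle between them. The sketch below shows only that scoring step in plain Python. It assumes the i-vectors have already been extracted and projected (for example by LDA and WCCN), which is not shown. The function names and the threshold are hypothetical.

```python
import math
from typing import Sequence


def cosine_score(target: Sequence[float], test: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(target) != len(test):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(target, test))
    norm_target = math.sqrt(sum(a * a for a in target))
    norm_test = math.sqrt(sum(b * b for b in test))
    if norm_target == 0.0 or norm_test == 0.0:
        raise ValueError("vectors must be non-zero")
    return dot / (norm_target * norm_test)


def accept(target: Sequence[float], test: Sequence[float], threshold: float) -> bool:
    """Accept the identity claim when the score reaches the threshold.

    The threshold is a tuning parameter; its value here is an assumption.
    """
    return cosine_score(target, test) >= threshold
```

Because the score depends only on direction, scaling either vector leaves it unchanged, which is one reason length differences between i-vectors do not affect this kind of scoring.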
Abstract:
This paper proposes a combination of source-normalized weighted linear discriminant analysis (SN-WLDA) and short utterance variance (SUV) PLDA modelling to improve short-utterance PLDA speaker verification. As short-length utterance i-vectors vary with the speaker, session variations and phonetic content of the utterance (utterance variation), a combined approach of SN-WLDA projection and SUV PLDA modelling is used to compensate for the session and utterance variations. Experimental studies have found that the combined SN-WLDA and SUV PLDA modelling approach shows an improvement over the baseline system (WCCN[LDA]-projected Gaussian PLDA (GPLDA)), as this approach effectively compensates for the session and utterance variations.
Abstract:
This paper proposes the addition of a weighted median Fisher discriminator (WMFD) projection prior to length-normalised Gaussian probabilistic linear discriminant analysis (GPLDA) modelling in order to compensate for the additional session variation. In limited microphone data conditions, a linear-weighted approach is introduced to increase the influence of the microphone speech dataset. The linear-weighted WMFD-projected GPLDA system shows improvements in EER and DCF values over the pooled LDA- and WMFD-projected GPLDA systems in the interview-interview condition, as WMFD projection extracts more speaker discriminant information with a limited number of sessions/speaker data, and the linear-weighted GPLDA approach estimates reliable model parameters with limited microphone data.
Abstract:
This thesis has investigated how to cluster a large number of faces within a multi-media corpus in the presence of large session variation. Quality metrics are used to select the best faces to represent a sequence of faces; and session variation modelling improves clustering performance in the presence of wide variations across videos. Findings from this thesis contribute to improving the performance of both face verification systems and the fully automated clustering of faces from a large video corpus.
Abstract:
Clustering identities in a video is a useful task to aid in video search, annotation and retrieval, and cast identification. However, reliably clustering faces across multiple videos is a challenging task due to variations in the appearance of the faces, as videos are captured in an uncontrolled environment. A person's appearance may vary due to session variations including: lighting and background changes, occlusions, changes in expression and make-up. In this paper we propose the novel Local Total Variability Modelling (Local TVM) approach to cluster faces across a news video corpus; and incorporate this into a novel two-stage video clustering system. We first cluster faces within a single video using colour, spatial and temporal cues; after which we use face track modelling and hierarchical agglomerative clustering to cluster faces across the entire corpus. We compare different face recognition approaches within this framework. Experiments on a news video database show that the Local TVM technique is able to effectively model the session variation observed in the data, resulting in improved clustering performance, with much greater computational efficiency than other methods.
Abstract:
Purpose: To investigate associations between the diurnal variation in a range of corneal parameters, including anterior and posterior corneal topography, and regional corneal thickness. ----- Methods: Fifteen subjects had their corneas measured using a rotating Scheimpflug camera (Pentacam) every 3-7 hours over a 24-hour period. Anterior and posterior corneal axial curvature, pachymetry and anterior chamber depth were analysed. The best fitting corneal sphero-cylinder from the axial curvature, and the average corneal thickness for a series of different corneal regions were calculated. Intraocular pressure and axial length were also measured at each measurement session. Repeated measures ANOVA was used to investigate diurnal change in these parameters. Analysis of covariance was used to examine associations between the measured ocular parameters. ----- Results: Significant diurnal variation was found to occur in both the anterior and posterior corneal curvature and in the regional corneal thickness. Flattening of the anterior corneal best sphere was observed at the early morning measurement (p < 0.0001). The posterior cornea also underwent a significant steepening (p < 0.0001) and change in astigmatism 90/180° at this time. A significant swelling of the cornea (p < 0.0001) was also found to occur immediately after waking. Highly significant associations were found between the diurnal variation in corneal thickness and the changes in corneal curvature. ----- Conclusions: Significant diurnal variation occurs in the regional thickness and the shape of the anterior and posterior cornea. The largest changes in the cornea were typically evident upon waking. The observed non-uniform regional corneal thickness changes resulted in a steepening of the posterior cornea and a flattening of the anterior cornea at this time.
Abstract:
Purpose. To investigate whether diurnal variation occurs in retinal thickness measures derived from spectral domain optical coherence tomography (SD-OCT). Methods. Twelve healthy adult subjects had retinal thickness measured with SD-OCT every 2 h over a 10 h period. At each measurement session, three average B-scan images were derived from a series of multiple B-scans (each from a 5 mm horizontal raster scan along the fovea, containing 1500 A-scans/B-scan) and analyzed to determine the thickness of the total retina, as well as the thickness of the outer retinal layers. Average thickness values were calculated at the foveal center, at the 0.5 mm diameter foveal region, and for the temporal parafovea (1.5 mm from foveal center) and nasal parafovea (1.5 mm from foveal center). Results. Total retinal thickness did not exhibit significant diurnal variation in any of the considered retinal regions (p > 0.05). Evidence of significant diurnal variation was found in the thickness of the outer retinal layers (p < 0.05), with the most prominent changes observed in the photoreceptor layers at the foveal center. The photoreceptor inner and outer segment layer thickness exhibited mean amplitude (peak to trough) of daily change of 7 ± 3 μm at the foveal center. The peak in thickness was typically observed at the third measurement session (mean measurement time, 13:06). Conclusions. The total retinal thickness measured with SD-OCT does not exhibit evidence of significant variation over the course of the day. However, small but significant diurnal variation occurs in the thickness of the foveal outer retinal layers.
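The "mean amplitude (peak to trough) of daily change" reported in this abstract can be read as the difference between the largest and smallest thickness measured for a subject across the day, averaged over subjects. A minimal Python sketch of that reading follows; the function names are hypothetical and the sample values are invented for illustration, not data from the study.

```python
from statistics import mean
from typing import Sequence


def peak_to_trough(thicknesses: Sequence[float]) -> float:
    """Amplitude of change for one subject: maximum minus minimum."""
    if not thicknesses:
        raise ValueError("at least one measurement is required")
    return max(thicknesses) - min(thicknesses)


def mean_amplitude(per_subject: Sequence[Sequence[float]]) -> float:
    """Average the peak-to-trough amplitude over all subjects."""
    return mean(peak_to_trough(series) for series in per_subject)


# Invented thickness values in micrometres, one list per subject and one
# value per measurement session (illustration only).
sessions = [
    [50.0, 52.0, 55.0, 53.0, 51.0, 50.0],
    [48.0, 49.0, 57.0, 54.0, 50.0, 48.0],
]
print(mean_amplitude(sessions))
```

Taking the amplitude per subject before averaging matters: averaging the curves first would shrink the result whenever subjects peak at different sessions.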
Abstract:
Introduction: Evidence concerning the alteration of knee function during landing suffers from a lack of consensus. This uncertainty can be attributed to methodological flaws, particularly in relation to the statistical analysis of variable human movement data. Aim: The aim of this study was to compare single-subject and group analysis in quantifying alterations in the magnitude and within-participant variability of knee mechanics during a step landing task. Methods: A group of healthy men (N = 12) stepped down from a knee-high platform for 60 consecutive trials, each trial separated by a 1-minute rest. The magnitude and within-participant variability of sagittal knee stiffness and coordination of the landing leg during the immediate postimpact period were evaluated. Coordination of the knee was quantified in the sagittal plane by calculating the mean absolute relative phase of sagittal shank and thigh motion (MARP1) and between knee rotation and knee flexion (MARP2). Changes across trials were compared between group and single-subject statistical analyses. Results: The group analysis detected significant reductions in MARP1 magnitude. However, the single-subject analyses detected changes in all dependent variables, which included increases in variability with task repetition. Between-individual variation was also present in the timing, size and direction of alterations in response to task repetition. Conclusion: The results have important implications for the interpretation of existing information regarding the adaptation of knee mechanics to interventions such as fatigue, footwear or landing height. It is proposed that a familiarisation session be incorporated in future experiments on a single-subject basis prior to an intervention.
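Mean absolute relative phase (MARP) is commonly computed by taking the difference between the phase angles of two segments at each time sample, then averaging the absolute values. The Python sketch below covers only that final averaging step. It assumes phase angles in degrees have already been derived for each segment, and it wraps differences into the range -180 to 180 degrees, which is an assumption rather than a detail given in the abstract. The function name is hypothetical.

```python
from typing import Sequence


def mean_absolute_relative_phase(
    distal_phase: Sequence[float], proximal_phase: Sequence[float]
) -> float:
    """Mean of |relative phase| over time, with phase angles in degrees.

    Each difference is wrapped into [-180, 180) before taking the absolute
    value, so 350 degrees versus 10 degrees counts as 20, not 340.
    """
    if len(distal_phase) != len(proximal_phase) or not distal_phase:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for distal, proximal in zip(distal_phase, proximal_phase):
        wrapped = ((distal - proximal + 180.0) % 360.0) - 180.0
        total += abs(wrapped)
    return total / len(distal_phase)
```

A value near 0 indicates the two segments move in phase, while a value near 180 indicates anti-phase motion.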
Abstract:
Purpose: To examine whether anterior scleral and conjunctival thickness undergoes significant diurnal variation over a 24-hour period. Methods: Nineteen healthy young adults (mean age 22 ± 2 years) with minimal refractive error (mean spherical equivalent refraction -0.08 ± 0.39 D) had measures of anterior scleral and conjunctival thickness collected using anterior segment optical coherence tomography (AS-OCT) at seven measurement sessions over a 24-hour period. The thickness of the temporal anterior sclera and conjunctiva was determined at 6 locations (each separated by 0.5 mm) at varying distances from the scleral spur for each subject at each measurement session. Results: Both the anterior sclera and conjunctiva were found to undergo significant diurnal variations in thickness over a 24-hour period (both p < 0.01). The sclera and conjunctiva exhibited a similar pattern of diurnal change, with a small magnitude thinning observed close to midday, and a larger magnitude thickening observed in the early morning immediately after waking. The amplitude of diurnal thickness change was larger in the conjunctiva (mean amplitude 69 ± 29 μm) compared to the sclera (21 ± 8 μm). The conjunctiva exhibited its smallest magnitude of change at the scleral spur location (mean amplitude 56 ± 17 μm) whereas the sclera exhibited its largest magnitude of change at this location (52 ± 21 μm). Conclusions: This study provides the first evidence of diurnal variations occurring in the thickness of the anterior sclera and conjunctiva. Studies requiring precise measures of these anatomical layers should therefore take time of day into consideration. The majority of the observed changes occurred in the early morning immediately after waking and were of larger magnitude in the conjunctiva compared to the sclera. Thickness changes at other times of the day were of smaller magnitude and generally not statistically significant.
Modeling pronunciation variation using context-dependent weighting and B/S refined acoustic modeling
Abstract:
The purpose of this study was to investigate differences between abstracts of posters presented at the 79th (2002) and 80th (2003) Annual Session & Exhibition of the American Dental Education Association (ADEA) and the published full-length articles resulting from the same studies. The abstracts for poster presentation sessions were downloaded, and basic characteristics of the abstracts and their authors were determined. A PubMed search was then performed to identify the publication of full-length articles based on those abstracts in a peer-reviewed journal. The differences between the abstract and the article were examined and categorized as major and minor differences. Differences identified included authorship, title, materials and methods, results, conclusions, and funding. Data were analyzed with both descriptive and analytic statistics. Overall, 89 percent of the abstracts had at least one variation from its corresponding article, and 65 percent and 76 percent of the abstracts had at least one major and minor variation, respectively, from its corresponding article. The most prevalent major variation was in study results, and the most prevalent minor variation was change in the number of authors. The discussion speculates on some possible reasons for these differences.