374 results for Optical character recognition
Abstract:
Facial expression is an important channel of human social communication. Facial expression recognition (FER) aims to perceive and understand emotional states of humans based on information in the face. Building robust and high-performance FER systems that can work in real-world video is still a challenging task, due to the various unpredictable facial variations and complicated exterior environmental conditions, as well as the difficulty of choosing a suitable type of feature descriptor for extracting discriminative facial information. Facial variations caused by factors such as pose, age, gender, race and occlusion can exert a profound influence on robustness, while a suitable feature descriptor largely determines performance. Most current work on FER addresses variations in pose and illumination. No approach has been reported for handling face localization errors and relatively few for overcoming facial occlusions, although the significant impact of these two variations on performance has been demonstrated and highlighted in many previous studies. Many texture and geometric features have been previously proposed for FER. However, few comparison studies have been conducted to explore the performance differences between different features and to examine the performance improvement arising from the fusion of texture and geometry, especially on data with spontaneous emotions. The majority of existing approaches are evaluated on databases with posed or induced facial expressions collected in laboratory environments, whereas little attention has been paid to recognizing naturalistic facial expressions in real-world data. This thesis investigates techniques for building robust and high-performance FER systems based on a number of established feature sets. It comprises contributions towards three main objectives: (1) Robustness to face localization errors and facial occlusions.
An approach is proposed to handle face localization errors and facial occlusions using Gabor-based templates. Template extraction algorithms are designed to collect a pool of local template features, and template matching is then performed to convert these templates into distances, which are robust to localization errors and occlusions. (2) Improvement of performance through feature comparison, selection and fusion. A comparative framework is presented to compare the performance of different features and different feature selection algorithms, and to examine the performance improvement arising from the fusion of texture and geometry. The framework is evaluated for both discrete and dimensional expression recognition on spontaneous data. (3) Evaluation of performance in the context of real-world applications. A system is selected and applied to discriminating posed versus spontaneous expressions and recognizing naturalistic facial expressions. A database is collected from real-world recordings and is used to explore feature differences between standard database images and real-world images, as well as between real-world images and real-world video frames. The performance evaluations are based on the JAFFE, CK, Feedtum, NVIE, Semaine and self-collected QUT databases. The results demonstrate high robustness of the proposed approach to the simulated localization errors and occlusions. Texture and geometry make different contributions to the performance of discrete and dimensional expression recognition, as well as to posed versus spontaneous emotion discrimination. These investigations provide useful insights into enhancing the robustness and performance of FER systems, and into putting them into real-world applications.
Abstract:
This study was part of an integrated project developed in response to concerns regarding current and future land practices affecting water quality within coastal catchments and adjacent marine environments. Two forested coastal catchments on the Fraser Coast, Australia, were chosen as examples of low-modification areas with similar geomorphological and land-use characteristics to many other coastal zones in southeast Queensland. For this component of the overall project, organic, physico-chemical (Eh, pH and DO), ionic (Fe2+, Fe3+), and isotopic (δ13C-DIC, δ15N-DIN, δ34S-SO4) data were used to characterise waters and identify sources and processes contributing to the concentrations and form of dissolved Fe, C, N and S within the ground and surface waters of these coastal catchments. Three sites with elevated Fe concentrations are discussed in detail. These included a shallow pool with intermittent interaction with the surface water drainage system, a monitoring well within a semi-confined alluvial aquifer, and a monitoring well within the fresh/saline water mixing zone adjacent to an estuary. Conceptual models of processes occurring in these environments are presented. The primary factors influencing Fe transport were: microbial reduction of Fe3+ oxyhydroxides in groundwaters and in the hyporheic zone of surface drainage systems, organic input available for microbial reduction and Fe3+ complexation, bacterial activity for reduction and oxidation, iron curtain effects where saline/fresh water mixing occurs, and variation in redox conditions with depth in ground and surface water columns. Data indicated that, during periods of low rainfall, groundwater seepage via tidal flux appears to be a more likely source of Fe to coastal waters. The drainage system is ephemeral and contributes little discharge to marine waters. However, data collected during a high rainfall event indicated that considerable Fe loads can be transported to the estuary mouth from the catchment.
Abstract:
Purpose: To investigate the correlations of the global flash multifocal electroretinogram (MOFO mfERG) with common clinical visual assessments – Humphrey perimetry and Stratus circumpapillary retinal nerve fiber layer (RNFL) thickness measurement in type II diabetic patients. Methods: Forty-two diabetic patients participated in the study: ten were free from diabetic retinopathy (DR) while the remainder suffered from mild to moderate non-proliferative diabetic retinopathy (NPDR). Fourteen age-matched controls were recruited for comparison. MOFO mfERG measurements were made under high and low contrast conditions. Humphrey central 30-2 perimetry and Stratus OCT circumpapillary RNFL thickness measurements were also performed. Correlations between local values of implicit time and amplitude of the mfERG components (direct component (DC) and induced component (IC)), and perimetric sensitivity and RNFL thickness were evaluated by mapping the localized responses for the three subject groups. Results: MOFO mfERG was superior to perimetry and RNFL assessments in showing differences between the diabetic groups (with and without DR) and the controls. All the MOFO mfERG amplitudes (except IC amplitude at high contrast) correlated better with perimetry findings (Pearson’s r ranged from 0.23 to 0.36, p<0.01) than did the mfERG implicit time at both high and low contrasts across all subject groups. No consistent correlation was found between the mfERG and RNFL assessments for any group or contrast conditions. The responses of the local MOFO mfERG correlated with local perimetric sensitivity but not with RNFL thickness. Conclusion: Early functional changes in the diabetic retina seem to occur before morphological changes in the RNFL.
Abstract:
Quality-based frame selection is a crucial task in video face recognition, both to improve the recognition rate and to reduce the computational cost. In this paper we present a framework that uses a variety of cues (face symmetry, sharpness, contrast, closeness of mouth, brightness and openness of the eye) to select the highest quality facial images available in a video sequence for recognition. Normalized feature scores are fused using a neural network, and frames with high quality scores are used in a Local Gabor Binary Pattern Histogram Sequence based face recognition system. Experiments on the Honda/UCSD database show that the proposed method selects the best quality face images in the video sequence, resulting in improved recognition performance.
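The selection step described above can be illustrated with a minimal sketch. It is not the authors' method: the abstract fuses normalized cue scores with a neural network, whereas this sketch substitutes a plain weighted sum so that it stays self-contained. The cue names, weights and function names are hypothetical.

```python
from typing import Dict, List


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale raw cue scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def select_best_frames(
    cues: Dict[str, List[float]], weights: Dict[str, float], k: int
) -> List[int]:
    """Return the indices of the k frames with the highest fused quality score.

    Assumption: fusion is a weighted sum, standing in for the neural
    network mentioned in the abstract.
    """
    normalized = {name: min_max_normalize(scores) for name, scores in cues.items()}
    n_frames = len(next(iter(normalized.values())))
    fused = [
        sum(weights[name] * normalized[name][i] for name in normalized)
        for i in range(n_frames)
    ]
    ranked = sorted(range(n_frames), key=lambda i: fused[i], reverse=True)
    return ranked[:k]


# Hypothetical per-frame cue scores for a four-frame sequence.
cues = {
    "sharpness": [0.2, 0.9, 0.5, 0.7],
    "symmetry": [0.4, 0.8, 0.6, 0.9],
}
weights = {"sharpness": 0.5, "symmetry": 0.5}
best = select_best_frames(cues, weights, k=2)
```

Only the selected frames would then be passed to the recognition stage, which is where the reduction in computational cost comes from.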
Abstract:
How do you identify "good" teaching practice in the complexity of a real classroom? How do you know that beginning teachers can recognise effective digital pedagogy when they see it? How can teacher educators see through their students’ eyes? The study in this paper has arisen from our interest in what pre-service teachers “see” when observing effective classroom practice and how this might reveal their own technological, pedagogical and content knowledge. We asked 104 pre-service teachers from Early Years, Primary and Secondary cohorts to watch and comment upon selected exemplary videos of teachers using ICT (information and communication technologies) in Science. The pre-service teachers recorded their observations using a simple PMI (plus, minus, interesting) matrix; these observations were then coded using the SOLO Taxonomy to look for evidence of their familiarity with and judgements of digital pedagogies. From this, we determined that the majority of pre-service teachers we surveyed were using a descriptive rather than a reflective strategy, that is, not extending beyond what was demonstrated in the teaching exemplar or differentiating between action and purpose. We also determined that this method warrants wider trialling as a means of evaluating students’ understandings of the complexity of the digital classroom.
Abstract:
Purpose: To determine whether there is a difference in neuroretinal function and in macular pigment optical density between persons with high- and low-risk gene variants for age-related macular degeneration (AMD) and no ophthalmoscopic signs of AMD, and to compare the results on neuroretinal function to patients with manifest early AMD. Methods and Participants: Neuroretinal function was assessed with the multifocal electroretinogram (mfERG) for 32 participants (22 healthy persons with no AMD and 10 early AMD patients). The 22 healthy participants with no AMD had high- or low-risk genotypes for either CFH (rs380390) and/or ARMS2 (rs10490924). Trough-to-peak response densities and peak-implicit times were analyzed in 5 concentric rings. Macular pigment optical densitometry was assessed by customized heterochromatic flicker photometry. Results: Trough-to-peak response densities for concentric rings 1 to 3 were, on average, significantly greater in participants with high-risk genotypes than in participants with low-risk genotypes and in persons with early AMD after correction for age and smoking (p<0.05). The group peak-implicit times for ring 1 were, on average, delayed in the patients with early AMD compared with the participants with high- or low-risk genotypes, although these differences were not significant. There was no significant correlation between genotypes and macular pigment optical density. Conclusion: Increased neuroretinal activity in persons who carry high-risk AMD genotypes may be due to genetically determined subclinical inflammatory and/or histological changes in the retina. Neuroretinal function in healthy persons genetically susceptible to AMD may be a useful additional early biomarker (in combination with genetics) before there is clinical manifestation.
Abstract:
Purpose. To evaluate the use of optical coherence tomography (OCT) to assess the effect of different soft contact lenses on corneoscleral morphology. Methods. Ten subjects had anterior segment OCT B-scans taken in the morning and again after six hours of soft contact lens wear. For each subject, three different contact lenses were used in the right eye on non-consecutive days, including a hydrogel sphere, a silicone hydrogel sphere and a silicone hydrogel toric. After image registration and layer segmentation, analyses were performed of the first hyper-reflective layer (HRL), the epithelial basement membrane (EBL) and the epithelial thickness (HRL to EBL). A root mean square difference (RMSD) of the layer profiles and the thickness change between the morning and afternoon measurements, was used to assess the effect of the contact lens on the corneoscleral morphology. Results. The soft contact lenses had a statistically significant effect on the morphology of the anterior segment layers (p <0.001). The average amounts of change for the three lenses (average RMSD values) for the corneal region were lower (3.93±1.95 µm for the HRL and 4.02±2.14 µm for the EBL) than those measured in the limbal/scleral region (11.24±6.21 µm for the HRL and 12.61±6.42 µm for the EBL). Similarly, averaged across the three lenses, the RMSD in epithelial thickness was lower in the cornea (2.84±0.84 µm) than the limbal/scleral (5.47±1.71 µm) region. Post-hoc analysis showed that ocular surface changes were significantly smaller with the silicone hydrogel sphere lens than both the silicone hydrogel toric (p<0.005) and hydrogel sphere (p<0.02) for the combined HRL and EBL data. Conclusions. In this preliminary study, we have shown that soft contact lenses can produce small but significant changes in the morphology of the limbal/scleral region and that OCT technology is useful in assessing these changes. The clinical significance of these changes is yet to be determined.
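The root mean square difference used above to compare morning and afternoon layer profiles can be written as RMSD = sqrt((1/n) Σ (a_i − b_i)²). The following is a minimal sketch, assuming the two profiles have already been registered and sampled at the same positions; the sample values are invented for illustration and are not data from the study.

```python
import math
from typing import Sequence


def rmsd(profile_a: Sequence[float], profile_b: Sequence[float]) -> float:
    """Root mean square difference between two equally sampled profiles."""
    if len(profile_a) != len(profile_b) or len(profile_a) == 0:
        raise ValueError("profiles must be non-empty and of equal length")
    squared = [(a - b) ** 2 for a, b in zip(profile_a, profile_b)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical layer heights (in micrometres) at four registered positions.
morning = [100.0, 102.0, 101.0, 99.0]
afternoon = [103.0, 98.0, 101.0, 99.0]
change = rmsd(morning, afternoon)  # sqrt((9 + 16 + 0 + 0) / 4) = 2.5
```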
Abstract:
Introduction: Delirium is a serious issue associated with high morbidity and mortality in older hospitalised people. Early recognition enables diagnosis and treatment of underlying cause/s, which can lead to improved patient outcomes. However, research shows that nurses' knowledge and accurate recognition of delirium is poor, and lack of education appears to be a key issue related to this problem. Thus, the purpose of this randomised controlled trial (RCT) was to evaluate, in a sample of registered nurses, the usability and effectiveness of a web-based learning site, designed using constructivist learning principles, to improve acute care nurse knowledge and recognition of delirium. Prior to undertaking the RCT, preliminary phases involving validation of vignettes, video-taping of five of the validated vignettes, website development and pilot testing were completed. Methods: The cluster RCT involved consenting registered nurse participants (N = 175) from twelve clinical areas within three acute health care facilities in Queensland, Australia. Data were collected through a variety of measures and instruments. Primary outcomes were improved ability of nurses to recognise delirium using written validated vignettes and improved knowledge of delirium using a delirium knowledge questionnaire. The secondary outcomes were aimed at determining nurse satisfaction and usability of the website. Primary outcome measures were taken at baseline (T1), directly after the intervention (T2) and two months later (T3). The secondary outcomes were measured at T2 by participants in the intervention group. Following baseline data collection, remaining participants were assigned to either the intervention (n=75) or control (n=72) group. Participants in the intervention group were given access to the learning intervention, while the control group continued to work in their clinical area and, at that time, did not receive access to the learning intervention.
Data from the primary outcome measures were examined in mixed model analyses. Results: Overall, the effect of the online learning intervention over time, comparing the intervention group and the control group, was positive. The intervention group's scores were higher and the change over time results were statistically significant [T3 and T1 (t=3.78, p<0.001) and T2 and T1 baseline (t=5.83, p<0.001)]. Statistically significant improvements were also seen for delirium recognition when comparing T2 and T1 results (t=2.58, p=0.012) between the control and intervention group, but not for changes in delirium recognition scores between the two groups from T3 and T1 (t=1.80, p=0.074). The majority of the participants rated the website highly on the visual, functional and content elements. Additionally, nearly 80% of the participants liked the overall website features and there were self-reported improvements in delirium knowledge and recognition by the registered nurses in the intervention group. Discussion: Findings from this study support the concept that online learning is an effective and satisfying method of information delivery. Embedded within a constructivist learning environment, the site produced a high level of satisfaction and usability for the registered nurse end-users. Additionally, the results showed that the website significantly improved delirium knowledge and recognition scores, and the improvement in delirium knowledge was retained at a two-month follow-up. Given the strong effect of the intervention, the online delirium intervention should be utilised as a way of providing information to registered nurses. It is envisaged that this knowledge would lead to improved recognition of delirium as well as improvement in patient outcomes; however, translation of this knowledge attainment into clinical practice was outside the scope of this study.
A critical next step is demonstrating the effect of the intervention in changing clinical behaviour, and improving patient health outcomes.
Abstract:
Emergence has the potential to effect complex, creative or open-ended interactions and novel game-play. We report on research into an emergent interactive system. This investigates emergent user behaviors and experience through the creation and evaluation of an interactive system. The system is +-NOW, an augmented reality, tangible, interactive art system. The paper briefly describes the qualities of emergence and +-NOW before focusing on its evaluation. This was a qualitative study with 30 participants conducted in context. Data analysis followed Grounded Theory Methods. Coding schemes, induced from data and external literature, are presented. Findings show that emergence occurred in over half of the participants. The nature of these emergent behaviors is discussed along with examples from the data. Other findings indicate that participants found interaction with the work satisfactory. Design strategies for facilitating satisfactory experience despite the often unpredictable character of emergence are briefly reviewed and potential application areas for emergence are discussed.
Abstract:
This paper investigates the use of mel-frequency delta-phase (MFDP) features in comparison to, and in fusion with, traditional mel-frequency cepstral coefficient (MFCC) features within joint factor analysis (JFA) speaker verification. MFCC features, commonly used in speaker recognition systems, are derived purely from the magnitude spectrum, with the phase spectrum completely discarded. In this paper, we investigate whether features derived from the phase spectrum can provide additional speaker discriminant information to the traditional MFCC approach in a JFA-based speaker verification system. Results are presented which provide a comparison of MFCC-only, MFDP-only and score fusion of the two approaches within a JFA speaker verification approach. Based upon the results presented using the NIST 2008 Speaker Recognition Evaluation (SRE) dataset, we believe that, while MFDP features alone cannot compete with MFCC features, MFDP can provide complementary information that results in improved speaker verification performance when both approaches are combined in score fusion, particularly in the case of shorter utterances.
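Score fusion of two verification systems, as referred to above, can be sketched as follows. The abstract does not say how the scores are normalized or weighted, so this example assumes z-normalization followed by a linear weighted sum; the weight of 0.7 and all function names are hypothetical.

```python
import statistics
from typing import List


def z_normalize(scores: List[float]) -> List[float]:
    """Shift scores to zero mean and unit (population) standard deviation."""
    mean = statistics.mean(scores)
    std = statistics.pstdev(scores)
    if std == 0:
        return [0.0 for _ in scores]
    return [(s - mean) / std for s in scores]


def fuse_scores(
    mfcc_scores: List[float], mfdp_scores: List[float], mfcc_weight: float = 0.7
) -> List[float]:
    """Linearly combine per-trial scores from two systems.

    Assumption: the weight and the normalization are illustrative choices,
    not values taken from the paper.
    """
    a = z_normalize(mfcc_scores)
    b = z_normalize(mfdp_scores)
    return [mfcc_weight * x + (1.0 - mfcc_weight) * y for x, y in zip(a, b)]


# Hypothetical verification scores for three trials from each system.
mfcc = [2.0, 0.0, -2.0]
mfdp = [1.0, -1.0, 0.0]
fused = fuse_scores(mfcc, mfdp)
```

Normalizing first keeps one system's score range from dominating the sum; in practice the weight would be tuned on held-out development data.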
Abstract:
Automatic Call Recognition is vital for environmental monitoring. Pattern recognition has been applied in automatic species recognition for years. However, few studies have applied formal syntactic methods to species call structure analysis. This paper introduces a novel method that adopts timed and probabilistic automata for automatic species recognition, using acoustic components as the primitives. We demonstrate this on an Australian bird species, the Eastern Yellow Robin.
The backfilled GEI: a cross-capture modality gait feature for frontal and side-view gait recognition
Abstract:
In this paper, we propose a novel direction for gait recognition research by proposing a new capture-modality independent, appearance-based feature which we call the Back-filled Gait Energy Image (BGEI). It can be constructed from both frontal depth images and the more commonly used side-view silhouettes, allowing the feature to be applied across these two differing capturing systems using the same enrolled database. To evaluate this new feature, a frontally captured depth-based gait dataset was created containing 37 unique subjects, a subset of which also contained sequences captured from the side. The results demonstrate that the BGEI can effectively be used to identify subjects through their gait across these two differing input devices, achieving a rank-1 match rate of 100% in our experiments. We also compare the BGEI against the GEI and GEV in their respective domains, using the CASIA dataset and our depth dataset, showing that it compares favourably against them. The experiments were performed using a sparse representation based classifier with a locally discriminating input feature space, which shows significant improvement in performance over other classifiers used in the gait recognition literature, achieving state-of-the-art results with the GEI on the CASIA dataset.
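For context, the conventional Gait Energy Image (GEI) that the BGEI is compared against is the pixel-wise average of aligned binary silhouettes over a gait sequence. The sketch below shows only that baseline; the back-filling step of the BGEI is not described in the abstract and is not reproduced here. The tiny 2x2 frames are hypothetical.

```python
from typing import List

Frame = List[List[int]]  # binary silhouette: 1 = subject, 0 = background


def gait_energy_image(silhouettes: List[Frame]) -> List[List[float]]:
    """Average aligned, equally sized binary silhouettes pixel by pixel."""
    if not silhouettes:
        raise ValueError("at least one silhouette is required")
    n = len(silhouettes)
    rows, cols = len(silhouettes[0]), len(silhouettes[0][0])
    return [
        [sum(frame[r][c] for frame in silhouettes) / n for c in range(cols)]
        for r in range(rows)
    ]


# Two hypothetical 2x2 silhouettes from one gait cycle.
frames = [
    [[1, 0], [1, 1]],
    [[1, 1], [0, 1]],
]
gei = gait_energy_image(frames)
```

Pixels close to 1.0 are covered by the body in most frames, while intermediate values mark regions that move during the gait cycle.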
Abstract:
Spatio-temporal interest points are the most popular feature representation in the field of action recognition. A variety of methods have been proposed to detect and describe local patches in video, with several techniques reporting state-of-the-art performance for action recognition. However, the reported results are obtained under different experimental settings with different datasets, making it difficult to compare the various approaches. We therefore seek to comprehensively evaluate state-of-the-art spatio-temporal features under a common evaluation framework with popular benchmark datasets (KTH, Weizmann) and more challenging datasets such as Hollywood2. The purpose of this work is to provide guidance for researchers when selecting features for different applications with different environmental conditions. In this work we evaluate four popular descriptors (HOG, HOF, HOG/HOF, HOG3D) using a popular bag of visual features representation, and Support Vector Machines (SVM) for classification. Moreover, we provide an in-depth analysis of local feature descriptors and optimize the codebook sizes for different datasets with different descriptors. In this paper, we demonstrate that motion-based features offer better performance than those that rely solely on spatial information, while features that combine both types of data are more consistent across a variety of conditions, but typically require a larger codebook for optimal performance.
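The bag of visual features representation mentioned above can be sketched as follows: each local descriptor is assigned to its nearest codeword, and the video is represented by the normalized histogram of assignments. This is a generic illustration under stated assumptions, not the paper's pipeline. The codebook here is fixed by hand, whereas in practice it is learned from training descriptors (commonly by clustering), and the two-dimensional descriptors are hypothetical stand-ins for HOG or HOF vectors.

```python
from typing import List

Vector = List[float]


def nearest_codeword(descriptor: Vector, codebook: List[Vector]) -> int:
    """Index of the codeword with the smallest squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((d - c) ** 2 for d, c in zip(descriptor, codebook[k])),
    )


def bag_of_features(descriptors: List[Vector], codebook: List[Vector]) -> List[float]:
    """Normalized histogram of codeword assignments for one video."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [count / total for count in counts]


# Hypothetical two-word codebook and four local descriptors.
codebook = [[0.0, 0.0], [1.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.2], [1.0, 0.8], [0.2, 0.1]]
histogram = bag_of_features(descriptors, codebook)
```

The histogram length equals the codebook size, which is why the codebook size is a parameter worth tuning per dataset and descriptor; the resulting vectors are what a classifier such as an SVM would be trained on.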
Abstract:
Our contemporary public sphere has seen the 'emergence of new political rituals, which are concerned with the stains of the past, with self disclosure, and with ways of remembering once taboo and traumatic events' (Misztal, 2005). A recent case of this phenomenon occurred in Australia in 2009 with the apology to the 'Forgotten Australians': a group who suffered abuse and neglect after being removed from their parents – either in Australia or in the UK - and placed in Church and State run institutions in Australia between 1930 and 1970. This campaign for recognition by a profoundly marginalized group coincides with the decade in which the opportunities of Web 2.0 were seen to be diffusing throughout different social groups, and were considered a tool for social inclusion. This paper examines the case of the Forgotten Australians as an opportunity to investigate the role of the internet in cultural trauma and public apology. As such, it adds to recent scholarship on the role of digital web based technologies in commemoration and memorials (Arthur, 2009; Haskins, 2007; Cohen and Willis, 2004), and on digital storytelling in the context of trauma (Klaebe, 2011) by locating their role in a broader and emerging domain of social responsibility and political action (Alexander, 2004).
Abstract:
Modelling video sequences by subspaces has recently shown promise for recognising human actions. Subspaces are able to accommodate the effects of various image variations and can capture the dynamic properties of actions. Subspaces form a non-Euclidean and curved Riemannian manifold known as a Grassmann manifold. Inference on manifold spaces usually is achieved by embedding the manifolds in higher dimensional Euclidean spaces. In this paper, we instead propose to embed the Grassmann manifolds into reproducing kernel Hilbert spaces and then tackle the problem of discriminant analysis on such manifolds. To achieve efficient machinery, we propose graph-based local discriminant analysis that utilises within-class and between-class similarity graphs to characterise intra-class compactness and inter-class separability, respectively. Experiments on KTH, UCF Sports, and Ballet datasets show that the proposed approach obtains marked improvements in discrimination accuracy in comparison to several state-of-the-art methods, such as the kernel version of affine hull image-set distance, tensor canonical correlation analysis, spatial-temporal words and hierarchy of discriminative space-time neighbourhood features.
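To make the idea of a kernel on a Grassmann manifold concrete, the sketch below computes the projection kernel k(X, Y) = ||XᵀY||²_F between two subspaces given by orthonormal bases. The abstract does not state which kernel the authors use, so the choice of the projection kernel is an assumption made for illustration, as are the function names and the toy subspaces.

```python
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def projection_kernel(basis_x: List[Vector], basis_y: List[Vector]) -> float:
    """Squared Frobenius norm of X^T Y.

    Each basis is a list of orthonormal vectors spanning one subspace.
    Assumption: the inputs are already orthonormal; no check is performed.
    """
    return sum(dot(x, y) ** 2 for x in basis_x for y in basis_y)


# Two hypothetical 2-dimensional subspaces of R^3 sharing one direction.
subspace_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
subspace_b = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
k_ab = projection_kernel(subspace_a, subspace_b)
k_aa = projection_kernel(subspace_a, subspace_a)
```

For p-dimensional subspaces the value ranges from 0 (orthogonal subspaces) to p (identical subspaces). A matrix of such kernel values between video subspaces is the kind of input a kernel-based discriminant analysis operates on.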