336 resultados para RECOGNITION MEMORY
Resumo:
The chief challenge facing persistent robotic navigation using vision sensors is the recognition of previously visited locations under different lighting and illumination conditions. The majority of successful approaches to outdoor robot navigation use active sensors such as LIDAR, but the associated weight and power draw of these systems makes them unsuitable for widespread deployment on mobile robots. In this paper we investigate methods to combine representations for visible and long-wave infrared (LWIR) thermal images with time information to combat the time-of-day-based limitations of each sensing modality. We calculate appearance-based match likelihoods using the state-of-the-art FAB-MAP [1] algorithm to analyse loop closure detection reliability across different times of day. We present preliminary results on a dataset of 10 successive traverses of a combined urban-parkland environment, recorded in 2-hour intervals from before dawn to after dusk. Improved location recognition throughout an entire day is demonstrated using the combined system compared with methods which use visible or thermal sensing alone.
Resumo:
Studies of orthographic skills transfer between languages focus mostly on working memory (WM) ability in alphabetic first language (L1) speakers when learning another, often alphabetically congruent, language. We report two studies that, instead, explored the transferability of L1 orthographic processing skills in WM in logographic-L1 and alphabetic-L1 speakers. English-French bilingual and English monolingual (alphabetic-L1) speakers, and Chinese-English (logographic-L1) speakers, learned a set of artificial logographs and associated meanings (Study 1). The logographs were used in WM tasks with and without concurrent articulatory or visuo-spatial suppression. The logographic-L1 bilinguals were markedly less affected by articulatory suppression than alphabetic-L1 monolinguals (who did not differ from their bilingual peers). Bilinguals overall were less affected by spatial interference, reflecting superior phonological processing skills or, conceivably, greater executive control. A comparison of span sizes for meaningful and meaningless logographs (Study 2) replicated these findings. However, the logographic-L1 bilinguals’ spans in L1 were measurably greater than those of their alphabetic-L1 (bilingual and monolingual) peers; a finding unaccounted for by faster articulation rates or differences in general intelligence. The overall pattern of results suggests an advantage (possibly perceptual) for logographic-L1 speakers, over and above the bilingual advantage also seen elsewhere in third language (L3) acquisition.
Resumo:
Audio-visualspeechrecognition, or the combination of visual lip-reading with traditional acoustic speechrecognition, has been previously shown to provide a considerable improvement over acoustic-only approaches in noisy environments, such as that present in an automotive cabin. The research presented in this paper will extend upon the established audio-visualspeechrecognition literature to show that further improvements in speechrecognition accuracy can be obtained when multiple frontal or near-frontal views of a speaker's face are available. A series of visualspeechrecognition experiments using a four-stream visual synchronous hidden Markov model (SHMM) are conducted on the four-camera AVICAR automotiveaudio-visualspeech database. We study the relative contribution between the side and central orientated cameras in improving visualspeechrecognition accuracy. Finally combination of the four visual streams with a single audio stream in a five-stream SHMM demonstrates a relative improvement of over 56% in word recognition accuracy when compared to the acoustic-only approach in the noisiest conditions of the AVICAR database.
Resumo:
Different archives of television material construct different versions of Australian national identity. There exists a Pro-Am archive of Australian television history materials consisting of many individual collections. This archive is not centrally located nor clearly bounded. The collections are not all linked to each other, nor are they aware of each other, and they do not claim to have a single common project. Pro-Am collections tend not to address Australian television as a whole, rather addressing particular genres, programs or production companies. Their vision of Australia is 'ordinary' and everyday. The boundaries of 'Australia' in the Pro-Am archive are porous, allowing non-Australians to contribute material, and also including non-Australian material and this causes little sense of anxiety.
Resumo:
Facial expression is an important channel of human social communication. Facial expression recognition (FER) aims to perceive and understand emotional states of humans based on information in the face. Building robust and high performance FER systems that can work in real-world video is still a challenging task, due to the various unpredictable facial variations and complicated exterior environmental conditions, as well as the difficulty of choosing a suitable type of feature descriptor for extracting discriminative facial information. Facial variations caused by factors such as pose, age, gender, race and occlusion, can exert profound influence on the robustness, while a suitable feature descriptor largely determines the performance. Most present attention on FER has been paid to addressing variations in pose and illumination. No approach has been reported on handling face localization errors and relatively few on overcoming facial occlusions, although the significant impact of these two variations on the performance has been proved and highlighted in many previous studies. Many texture and geometric features have been previously proposed for FER. However, few comparison studies have been conducted to explore the performance differences between different features and examine the performance improvement arisen from fusion of texture and geometry, especially on data with spontaneous emotions. The majority of existing approaches are evaluated on databases with posed or induced facial expressions collected in laboratory environments, whereas little attention has been paid on recognizing naturalistic facial expressions on real-world data. This thesis investigates techniques for building robust and high performance FER systems based on a number of established feature sets. It comprises of contributions towards three main objectives: (1) Robustness to face localization errors and facial occlusions. An approach is proposed to handle face localization errors and facial occlusions using Gabor based templates. Template extraction algorithms are designed to collect a pool of local template features and template matching is then performed to covert these templates into distances, which are robust to localization errors and occlusions. (2) Improvement of performance through feature comparison, selection and fusion. A comparative framework is presented to compare the performance between different features and different feature selection algorithms, and examine the performance improvement arising from fusion of texture and geometry. The framework is evaluated for both discrete and dimensional expression recognition on spontaneous data. (3) Evaluation of performance in the context of real-world applications. A system is selected and applied into discriminating posed versus spontaneous expressions and recognizing naturalistic facial expressions. A database is collected from real-world recordings and is used to explore feature differences between standard database images and real-world images, as well as between real-world images and real-world video frames. The performance evaluations are based on the JAFFE, CK, Feedtum, NVIE, Semaine and self-collected QUT databases. The results demonstrate high robustness of the proposed approach to the simulated localization errors and occlusions. Texture and geometry have different contributions to the performance of discrete and dimensional expression recognition, as well as posed versus spontaneous emotion discrimination. These investigations provide useful insights into enhancing robustness and achieving high performance of FER systems, and putting them into real-world applications.
Resumo:
Quality based frame selection is a crucial task in video face recognition, to both improve the recognition rate and to reduce the computational cost. In this paper we present a framework that uses a variety of cues (face symmetry, sharpness, contrast, closeness of mouth, brightness and openness of the eye) to select the highest quality facial images available in a video sequence for recognition. Normalized feature scores are fused using a neural network and frames with high quality scores are used in a Local Gabor Binary Pattern Histogram Sequence based face recognition system. Experiments on the Honda/UCSD database shows that the proposed method selects the best quality face images in the video sequence, resulting in improved recognition performance.
Resumo:
How do you identify "good" teaching practice in the complexity of a real classroom? How do you know that beginning teachers can recognise effective digital pedagogy when they see it? How can teacher educators see through their students’ eyes? The study in this paper has arisen from our interest in what pre-service teachers “see” when observing effective classroom practice and how this might reveal their own technological, pedagogical and content knowledge. We asked 104 pre-service teachers from Early Years, Primary and Secondary cohorts to watch and comment upon selected exemplary videos of teachers using ICT (information and communication technologies) in Science. The pre-service teachers recorded their observations using a simple PMI (plus, minus, interesting) matrix which were then coded using the SOLO Taxonomy to look for evidence of their familiarity with and judgements of digital pedagogies. From this, we determined that the majority of preservice teachers we surveyed were using a descriptive rather than a reflective strategy, that is, not extending beyond what was demonstrated in the teaching exemplar or differentiating between action and purpose. We also determined that this method warrants wider trialling as a means of evaluating students’ understandings of the complexity of the digital classroom.
Resumo:
Introduction: Delirium is a serious issue associated with high morbidity and mortality in older hospitalised people. Early recognition enables diagnosis and treatment of underlying cause/s, which can lead to improved patient outcomes. However, research shows knowledge and accurate nurse recognition of delirium and is poor and lack of education appears to be a key issue related to this problem. Thus, the purpose of this randomised controlled trial (RCT) was to evaluate, in a sample of registered nurses, the usability and effectiveness of a web-based learning site, designed using constructivist learning principles, to improve acute care nurse knowledge and recognition of delirium. Prior to undertaking the RCT preliminary phases involving; validation of vignettes, video-taping five of the validated vignettes, website development and pilot testing were completed. Methods: The cluster RCT involved consenting registered nurse participants (N = 175) from twelve clinical areas within three acute health care facilities in Queensland, Australia. Data were collected through a variety of measures and instruments. Primary outcomes were improved ability of nurses to recognise delirium using written validated vignettes and improved knowledge of delirium using a delirium knowledge questionnaire. The secondary outcomes were aimed at determining nurse satisfaction and usability of the website. Primary outcome measures were taken at baseline (T1), directly after the intervention (T2) and two months later (T3). The secondary outcomes were measured at T2 by participants in the intervention group. Following baseline data collection remaining participants were assigned to either the intervention (n=75) or control (n=72) group. Participants in the intervention group were given access to the learning intervention while the control group continued to work in their clinical area and at that time, did not receive access to the learning intervention. Data from the primary outcome measures were examined in mixed model analyses. Results: Overall, the effect of the online learning intervention over time comparing the intervention group and the control group were positive. The intervention groups‘ scores were higher and the change over time results were statistically significant [T3 and T1 (t=3.78 p=<0.001) and T2 and T1 baseline (t=5.83 p=<0.001)]. Statistically significant improvements were also seen for delirium recognition when comparing T2 and T1 results (t=2.58 p=0.012) between the control and intervention group but not for changes in delirium recognition scores between the two groups from T3 and T1 (t=1.80 p=0.074). The majority of the participants rated the website highly on the visual, functional and content elements. Additionally, nearly 80% of the participants liked the overall website features and there were self-reported improvements in delirium knowledge and recognition by the registered nurses in the intervention group. Discussion: Findings from this study support the concept that online learning is an effective and satisfying method of information delivery. Embedded within a constructivist learning environment the site produced a high level of satisfaction and usability for the registered nurse end-users. Additionally, the results showed that the website significantly improved delirium knowledge & recognition scores and the improvement in delirium knowledge was retained at a two month follow-up. Given the strong effect of the intervention the online delirium intervention should be utilised as a way of providing information to registered nurses. It is envisaged that this knowledge would lead to improved recognition of delirium as well as improvement in patient outcomes however; translation of this knowledge attainment into clinical practice was outside the scope of this study. A critical next step is demonstrating the effect of the intervention in changing clinical behaviour, and improving patient health outcomes.
Resumo:
This paper investigates the use of mel-frequency deltaphase (MFDP) features in comparison to, and in fusion with, traditional mel-frequency cepstral coefficient (MFCC) features within joint factor analysis (JFA) speaker verification. MFCC features, commonly used in speaker recognition systems, are derived purely from the magnitude spectrum, with the phase spectrum completely discarded. In this paper, we investigate if features derived from the phase spectrum can provide additional speaker discriminant information to the traditional MFCC approach in a JFA based speaker verification system. Results are presented which provide a comparison of MFCC-only, MFDPonly and score fusion of the two approaches within a JFA speaker verification approach. Based upon the results presented using the NIST 2008 Speaker Recognition Evaluation (SRE) dataset, we believe that, while MFDP features alone cannot compete with MFCC features, MFDP can provide complementary information that result in improved speaker verification performance when both approaches are combined in score fusion, particularly in the case of shorter utterances.
Resumo:
The symbolic and improvisational nature of Livecoding requires a shared networking framework to be flexible and extensible, while at the same time providing support for synchronisation, persistence and redundancy. Above all the framework should be robust and available across a range of platforms. This paper proposes tuple space as a suitable framework for network communication in ensemble livecoding contexts. The role of tuple space as a concurrency framework and the associated timing aspects of the tuple space model are explored through Spaces, an implementation of tuple space for the Impromptu environment.
Resumo:
Automatic Call Recognition is vital for environmental monitoring. Patten recognition has been applied in automatic species recognition for years. However, few studies have applied formal syntactic methods to species call structure analysis. This paper introduces a novel method to adopt timed and probabilistic automata in automatic species recognition based upon acoustic components as the primitives. We demonstrate this through one kind of birds in Australia: Eastern Yellow Robin.
The backfilled GEI : a cross-capture modality gait feature for frontal and side-view gait recognition
Resumo:
In this paper, we propose a novel direction for gait recognition research by proposing a new capture-modality independent, appearance-based feature which we call the Back-filled Gait Energy Image (BGEI). It can can be constructed from both frontal depth images, as well as the more commonly used side-view silhouettes, allowing the feature to be applied across these two differing capturing systems using the same enrolled database. To evaluate this new feature, a frontally captured depth-based gait dataset was created containing 37 unique subjects, a subset of which also contained sequences captured from the side. The results demonstrate that the BGEI can effectively be used to identify subjects through their gait across these two differing input devices, achieving rank-1 match rate of 100%, in our experiments. We also compare the BGEI against the GEI and GEV in their respective domains, using the CASIA dataset and our depth dataset, showing that it compares favourably against them. The experiments conducted were performed using a sparse representation based classifier with a locally discriminating input feature space, which show significant improvement in performance over other classifiers used in gait recognition literature, achieving state of the art results with the GEI on the CASIA dataset.
Resumo:
A recent Australian literature digitisation project uncovered some surprising discoveries in the children’s books that it digitised. The Children’s Literature Digital Resources (CLDR) Project digitised children’s books that were first published between 1851 to 1945 and made them available online through AustLit: The Australian Literature Resource. The digitisation process also preserved, within the pages of those books, a range of bookplates, book labels, inscriptions, and loose ephemera. This material allows us to trace the provenance of some of the digitised works, some of which came from the personal libraries of now-famous authors, and others from less celebrated sources. These extra-textual traces can contribute to cultural memory of the past by providing evidence of how books were collected and exchanged, and what kinds of books were presented as prizes in schools and Sunday schools. They also provide insight into Australian literary and artistic networks, particularly of the first few decades of the 20th century. This article describes the kinds of material uncovered in the digitisation process and suggests that the material provides insights into literary and cultural histories that might otherwise be forgotten. It also argues that the indexing of this material is vital if it is not to be lost to future researchers.