374 results for Optical character recognition
Abstract:
Probabilistic robotics, most often applied to the problem of simultaneous localisation and mapping (SLAM), requires measures of uncertainty to accompany observations of the environment. This paper describes how uncertainty can be characterised for a vision system that locates coloured landmarks in a typical laboratory environment. The paper describes a model of the uncertainty in segmentation, the internal camera model and the mounting of the camera on the robot. It explains the implementation of the system on a laboratory robot, and provides experimental results that show the coherence of the uncertainty model.
Abstract:
In this paper we propose a new method for utilising phase information by complementing it with traditional magnitude-only spectral subtraction speech enhancement through Complex Spectrum Subtraction (CSS). The proposed approach has the following advantages over traditional magnitude-only spectral subtraction: (a) it introduces complementary information to the enhancement algorithm; (b) it reduces the total number of algorithmic parameters; and (c) it is designed for improving clean speech magnitude spectra and is therefore suitable for both automatic speech recognition (ASR) and speech perception applications. Oracle-based ASR experiments verify this approach, showing an average of 20% relative word accuracy improvements when accurate estimates of the phase spectrum are available. Based on sinusoidal analysis and assuming stationarity between observations (which is shown to be better approximated as the frame rate is increased), this paper also proposes a novel method for acquiring the phase information called Phase Estimation via Delay Projection (PEDEP). Further oracle ASR experiments validate the potential for the proposed PEDEP technique in ideal conditions. Realistic implementation of CSS with PEDEP shows performance comparable to state-of-the-art spectral subtraction techniques in a range of 15-20 dB signal-to-noise ratio environments. These results clearly demonstrate the potential for using phase spectra in spectral subtractive enhancement applications, and at the same time highlight the need for deriving more accurate phase estimates in a wider range of noise conditions.
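The contrast between magnitude-only and complex subtraction can be illustrated on a single frequency bin. The following is a minimal sketch, not the paper's CSS implementation: it assumes the phase estimate in question is the noise phase, and the function names and numeric values are hypothetical, chosen only for illustration.

```python
import cmath


def magnitude_subtraction(noisy_bin, noise_mag):
    """Magnitude-only spectral subtraction for one frequency bin.

    The estimated noise magnitude is subtracted from the noisy magnitude
    (floored at zero) and the noisy phase is reused unchanged.
    """
    clean_mag = max(abs(noisy_bin) - noise_mag, 0.0)
    return cmath.rect(clean_mag, cmath.phase(noisy_bin))


def complex_subtraction(noisy_bin, noise_mag, noise_phase):
    """Subtract the noise estimate as a complex number (magnitude and phase).

    Assumption: an estimate of the noise phase is available, as in an
    oracle experiment.
    """
    return noisy_bin - cmath.rect(noise_mag, noise_phase)


# One bin of clean speech plus noise; the values are made up for illustration.
clean = cmath.rect(1.0, 0.3)
noise = cmath.rect(0.5, 2.0)
noisy = clean + noise

mag_only = magnitude_subtraction(noisy, abs(noise))
with_phase = complex_subtraction(noisy, abs(noise), cmath.phase(noise))

print("true magnitude:          ", abs(clean))
print("magnitude-only estimate: ", abs(mag_only))
print("complex-subtraction est.:", abs(with_phase))
```

With an exact (oracle) phase, the complex subtraction recovers the clean bin, whereas the magnitude-only estimate is biased whenever speech and noise are not in phase. This matches the abstract's point that the benefit depends on how accurate the phase estimate is.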
Abstract:
Uncooperative iris identification systems at a distance and on the move often suffer from poor resolution and poor focus of the captured iris images. The lack of pixel resolution and well-focused images significantly degrades the iris recognition performance. This paper proposes a new approach to incorporate the focus score into a reconstruction-based super-resolution process to generate a high-resolution iris image from a low-resolution, focus-inconsistent video sequence of an eye. A reconstruction-based technique, which can incorporate middle and high frequency components from multiple low resolution frames into one desired super-resolved frame without introducing false high frequency components, is used. A new focus assessment approach is proposed for uncooperative iris at a distance and on the move to improve performance for variations in lighting, size and occlusion. A novel fusion scheme is then proposed to incorporate the proposed focus score into the super-resolution process. The experiments conducted on the Multiple Biometric Grand Challenge portal database show that our proposed approach achieves an EER of 2.1%, outperforming the existing state-of-the-art averaging signal-level fusion approach by 19.2% and the robust mean super-resolution approach by 8.7%.
Abstract:
Purpose: To investigate the short-term influence of imposed monocular defocus upon human optical axial length (the distance from anterior cornea to retinal pigment epithelium) and ocular biometrics. Methods: Twenty-eight young adult subjects (14 myopes and 14 emmetropes) had eye biometrics measured before and then 30 and 60 minutes after exposure to monocular (right eye) defocus. Four different monocular defocus conditions were tested, each on a separate day: control (no defocus), myopic (+3 D defocus), hyperopic (-3 D defocus) and diffuse (0.2 density Bangerter filter) defocus. The fellow eye was optimally corrected (no defocus). Results: Imposed defocus caused small but significant changes in optical axial length (p<0.0001). A significant increase in optical axial length (mean change +8 ± 14 μm, p=0.03) occurred following hyperopic defocus, and a significant reduction in optical axial length (mean change -13 ± 14 μm, p=0.0001) was found following myopic defocus. A small increase in optical axial length was observed following diffuse defocus (mean change +6 ± 13 μm, p=0.053). Choroidal thickness also exhibited some significant changes with certain defocus conditions. No significant difference was found between myopes and emmetropes in the changes in optical axial length or choroidal thickness with defocus. Conclusions: Significant changes in optical axial length occur in human subjects following 60 minutes of monocular defocus. The bi-directional optical axial length changes observed in response to defocus imply the human visual system is capable of detecting the presence and sign of defocus and altering optical axial length to move the retina towards the image plane.
Abstract:
This chapter reports on research work that aims to overcome some limitations of conventional community engagement for urban planning. Adaptive and human-centred design approaches that are well established in human-computer interaction (such as personas and design scenarios) as well as creative writing and dramatic character development methods (such as the Stanislavsky System and the Meisner Technique) are as yet largely unexplored in the rather conservative and long-term design context of urban planning. Based on these approaches, we have been trialling a set of performance-based workshop activities to gain insights into participants’ desires and requirements that may inform the future design of apartments and apartment buildings in inner city Brisbane. The focus of these workshops is to analyse the behaviour and lifestyle of apartment dwellers and generate residential personas that become boundary objects in the cross-disciplinary discussions of urban design and planning teams. Dramatisation and embodied interaction of use cases form part of the strategies we employed to engage participants and elicit community feedback.
Abstract:
Voice recognition is one of the key enablers to reduce driver distraction as in-vehicle systems become more and more complex. With the integration of voice recognition in vehicles, safety and usability are improved as the driver’s eyes and hands are not required to operate system controls. Whilst speaker independent voice recognition is well developed, performance in high noise environments (e.g. vehicles) is still limited. La Trobe University and Queensland University of Technology have developed a low-cost hardware-based speech enhancement system for automotive environments based on spectral subtraction and delay–sum beamforming techniques. The enhancement algorithms have been optimised using authentic Australian English collected under typical driving conditions. Performance tests conducted using speech data collected under a variety of vehicle noise conditions demonstrate a word recognition rate improvement in the order of 10% or more under the noisiest conditions. The system is currently developed to a proof-of-concept stage, and there is potential for even greater performance improvement.
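The delay-sum beamforming idea mentioned in this abstract can be sketched in a few lines. This is a simplified illustration, not the hardware system described: it assumes whole-sample delays that are already known, and the function name `delay_and_sum` is hypothetical.

```python
def delay_and_sum(channels, delays):
    """Align microphone channels by known integer sample delays and average them.

    channels: list of equal-rate sample lists, one per microphone.
    delays:   number of samples by which each channel lags the reference
              (assumed known in advance; a real system has to estimate them).
    """
    usable = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[i + d] for ch, d in zip(channels, delays)) / len(channels)
        for i in range(usable)
    ]


# The same waveform reaches the second microphone two samples later.
source = [0, 1, 0, -1, 0, 1, 0, -1]
mic_a = source + [0, 0]
mic_b = [0, 0] + source

print(delay_and_sum([mic_a, mic_b], delays=[0, 2]))
```

After alignment the desired signal adds coherently, while noise that is uncorrelated between microphones is averaged down, which is the reason for combining channels this way.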
Abstract:
We investigated influences of optics and surround area on color appearance of defocused, small narrow band photopic lights (1’ arc diameter, λmax 510 - 628 nm) centered within a black annulus and surrounded by a white field. Participants included seven normal trichromats with L- or M-cone biased ratios. We controlled chromatic aberration with elements of a Powell achromatizing lens and corrected higher-order aberrations with an adaptive-optics system. Longitudinal chromatic aberrations, but not monochromatic aberrations, are involved in changing appearance of small lights with defocus. Surround field structure is important because color changes were not observed when lights were presented on a uniform white surround.
Abstract:
Within a surveillance video, occlusions are commonplace, and accurately resolving these occlusions is key when seeking to accurately track objects. The challenge of accurately segmenting objects is further complicated by the fact that within many real-world surveillance environments, the objects appear very similar. For example, footage of pedestrians in a city environment will consist of many people wearing dark suits. In this paper, we propose a novel technique to segment groups and resolve occlusions using optical flow discontinuities. We demonstrate that the ratio of continuous to discontinuous pixels within a region can be used to locate the overlapping edges, and incorporate this into an object tracking framework. Results on a portion of the ETISEO database show that the proposed algorithm results in improved tracking performance overall, and improved tracking within occlusions.
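The abstract does not define how pixels are classified, so the following is only a rough sketch of the ratio it mentions. The assumptions are that a pixel counts as discontinuous when its flow vector differs from its right or lower neighbour by more than a threshold, and that flow vectors are given as `(u, v)` pairs. The function names and the threshold are hypothetical.

```python
import math


def discontinuity_mask(flow, threshold=1.0):
    """Mark pixels whose flow differs sharply from the right or lower neighbour."""
    rows, cols = len(flow), len(flow[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            u, v = flow[r][c]
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    u2, v2 = flow[rr][cc]
                    if math.hypot(u - u2, v - v2) > threshold:
                        mask[r][c] = True
    return mask


def continuous_ratio(mask):
    """Ratio of continuous to discontinuous pixels within a region."""
    flat = [flag for row in mask for flag in row]
    discontinuous = sum(flat)
    continuous = len(flat) - discontinuous
    return continuous / discontinuous if discontinuous else math.inf


# Two objects moving in opposite directions meet between columns 1 and 2.
flow = [[(2, 0), (2, 0), (-2, 0), (-2, 0)] for _ in range(2)]
mask = discontinuity_mask(flow)
print(mask[0], continuous_ratio(mask))
```

In this toy region the discontinuous pixels line up along the boundary between the two motions, which is where an overlapping edge would be expected.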
Abstract:
Many cities worldwide face the prospect of major transformation as the world moves towards a global information order. In this new era, urban economies are being radically altered by dynamic processes of economic and spatial restructuring. The result is the creation of ‘informational cities’ or, to use the newer and more popular name, ‘knowledge cities’. For the last two centuries, social production had been primarily understood and shaped by neo-classical economic thought that recognized only three factors of production: land, labor and capital. Knowledge, education, and intellectual capacity were secondary, if not incidental, factors. Human capital was assumed to be either embedded in labor or just one of numerous categories of capital. In the last decades, it has become apparent that knowledge is sufficiently important to deserve recognition as a fourth factor of production. Knowledge and information and the social and technological settings for their production and communication are now seen as keys to development and economic prosperity. The rise of knowledge-based opportunity has, in many cases, been accompanied by a concomitant decline in traditional industrial activity. The replacement of physical commodity production by more abstract forms of production (e.g. information, ideas, and knowledge) has, however paradoxically, reinforced the importance of central places and led to the formation of knowledge cities. Knowledge is produced, marketed and exchanged mainly in cities. Therefore, knowledge cities aim to assist decision-makers in making their cities compatible with the knowledge economy and thus able to compete with other cities. Knowledge cities enable their citizens to foster knowledge creation, knowledge exchange and innovation. They also encourage the continuous creation, sharing, evaluation, renewal and update of knowledge. To compete nationally and internationally, cities need knowledge infrastructures (e.g. universities, research and development institutes); a concentration of well-educated people; technological, mainly electronic, infrastructure; and connections to the global economy (e.g. international companies and finance institutions for trade and investment). Moreover, they must possess the people and things necessary for the production of knowledge and, as importantly, function as breeding grounds for talent and innovation. The economy of a knowledge city creates high value-added products using research, technology, and brainpower. The private and public sectors value knowledge, spend money on its discovery and dissemination and, ultimately, harness it to create goods and services. Although many cities call themselves knowledge cities, currently, only a few cities around the world (e.g., Barcelona, Delft, Dublin, Montreal, Munich, and Stockholm) have earned that label. Many other cities aspire to the status of knowledge city through urban development programs that target knowledge-based urban development. Examples include Copenhagen, Dubai, Manchester, Melbourne, Monterrey, Singapore, and Shanghai.
Knowledge-Based Urban Development
To date, the development of most knowledge cities has proceeded organically as a dependent and derivative effect of global market forces. Urban and regional planning has responded slowly, and sometimes not at all, to the challenges and the opportunities of the knowledge city. That is changing, however. Knowledge-based urban development potentially brings both economic prosperity and a sustainable socio-spatial order. Its goal is to produce and circulate abstract work. The globalization of the world in the last decades of the twentieth century was a dialectical process. On one hand, as the tyranny of distance was eroded, economic networks of production and consumption were constituted at a global scale. At the same time, spatial proximity remained as important as ever, if not more so, for knowledge-based urban development.
Mediated by information and communication technology, personal contact, and the medium of tacit knowledge, organizational and institutional interactions are still closely associated with spatial proximity. The clustering of knowledge production is essential for fostering innovation and wealth creation. The social benefits of knowledge-based urban development extend beyond aggregate economic growth. On the one hand is the possibility of a particularly resilient form of urban development secured in a network of connections anchored at local, national, and global coordinates. On the other hand, quality of place and life, defined by the level of public service (e.g. health and education) and by the conservation and development of the cultural, aesthetic and ecological values that give cities their character and attract or repel the creative class of knowledge workers, is a prerequisite for successful knowledge-based urban development. The goal is a secure economy in a human setting: in short, smart growth or sustainable urban development.
Abstract:
In this paper, a method has been developed for estimating pitch angle, roll angle and aircraft body rates based on horizon detection and temporal tracking using a forward-looking camera, without assistance from other sensors. Using an image processing front-end, we select several lines in an image that may or may not correspond to the true horizon. The optical flow at each candidate line is calculated, which may be used to measure the body rates of the aircraft. Using an Extended Kalman Filter (EKF), the aircraft state is propagated using a motion model and a candidate horizon line is associated using a statistical test based on the optical flow measurements and the location of the horizon. Once associated, the selected horizon line, along with the associated optical flow, is used as a measurement to the EKF. To test the accuracy of the algorithm, two flights were conducted, one using a highly dynamic Uninhabited Airborne Vehicle (UAV) in clear flight conditions and the other in a human-piloted Cessna 172 in conditions where the horizon was partially obscured by terrain, haze and smoke. The UAV flight resulted in pitch and roll error standard deviations of 0.42° and 0.71° respectively when compared with a truth attitude source. The Cessna flight resulted in pitch and roll error standard deviations of 1.79° and 1.75° respectively. The benefits of selecting and tracking the horizon using a motion model and optical flow rather than naively relying on the image processing front-end are also demonstrated.
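The link between a detected horizon line and the attitude angles can be illustrated with simple pinhole-camera geometry. This sketch covers only that geometric step, not the EKF or the data association described above. The assumptions are a calibrated forward-looking camera with focal length in pixels, image y increasing downwards, and a flat, distant horizon; the sign conventions and the function name are choices made for this example rather than those of the paper.

```python
import math


def attitude_from_horizon(x1, y1, x2, y2, cx, cy, focal_px):
    """Estimate roll and pitch (radians) from two points on a horizon line.

    (cx, cy) is the principal point and focal_px the focal length in pixels.
    Roll is the tilt of the line in the image; pitch follows from the signed
    perpendicular offset of the line from the principal point.
    """
    dx, dy = x2 - x1, y2 - y1
    roll = math.atan2(dy, dx)
    # Signed distance from the principal point to the line; it is positive
    # when the horizon lies below the image centre (camera pitched up).
    offset = ((cx - x1) * dy - (cy - y1) * dx) / math.hypot(dx, dy)
    pitch = math.atan2(offset, focal_px)
    return roll, pitch


# A level horizon 50 pixels below the centre of a 640x480 image.
roll, pitch = attitude_from_horizon(0, 290, 640, 290, cx=320, cy=240, focal_px=500)
print(math.degrees(roll), math.degrees(pitch))
```

A filter such as the EKF in the abstract would treat angles like these as noisy measurements and combine them with a motion model, rather than trusting any single frame.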
Abstract:
For several reasons, the Fourier phase domain is less favored than the magnitude domain in signal processing and modeling of speech. To correctly analyze the phase, several factors must be considered and compensated, including the effect of the step size, windowing function and other processing parameters. Building on a review of these factors, this paper investigates a spectral representation based on the Instantaneous Frequency Deviation, but in which the step size between processing frames is used in calculating phase changes, rather than the traditional single sample interval. Reflecting these longer intervals, the term delta-phase spectrum is used to distinguish this from instantaneous derivatives. Experiments show that mel-frequency cepstral coefficient features derived from the delta-phase spectrum (termed Mel-Frequency delta-phase features) can produce broadly similar performance to equivalent magnitude domain features for both voice activity detection and speaker recognition tasks. Further, it is shown that the fusion of the magnitude and phase representations yields performance benefits over either in isolation.
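The idea of measuring phase change across one frame step, rather than one sample, can be sketched as follows. This is an illustrative reading of the abstract, not the paper's definition: it assumes a rectangular window, a direct DFT of a single bin, and compensation by the phase advance expected for that bin's centre frequency. The names `dft_bin`, `wrap` and `delta_phase` are hypothetical.

```python
import cmath
import math


def dft_bin(frame, k):
    """Direct DFT of a single bin k over one frame (rectangular window)."""
    n = len(frame)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))


def wrap(phase):
    """Wrap an angle into the interval [-pi, pi)."""
    return (phase + math.pi) % (2 * math.pi) - math.pi


def delta_phase(signal, k, frame_len, step, start=0):
    """Phase change of bin k between two frames that are `step` samples apart,
    minus the advance expected for the bin's centre frequency."""
    first = dft_bin(signal[start:start + frame_len], k)
    second = dft_bin(signal[start + step:start + step + frame_len], k)
    expected = 2 * math.pi * k * step / frame_len
    return wrap(cmath.phase(second) - cmath.phase(first) - expected)


frame_len, step = 32, 5
# A complex tone half a bin above bin 4 (values chosen for illustration).
tone = [cmath.exp(2j * math.pi * 4.5 * i / frame_len) for i in range(64)]
print(delta_phase(tone, k=4, frame_len=frame_len, step=step))
```

For a tone centred exactly on the bin the result is zero. For the off-centre tone above it equals 2π · 0.5 · step / frame_len, so the deviation grows with the step size, which is why the step has to be accounted for when interpreting phase differences.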
Abstract:
This paper presents a robust place recognition algorithm for mobile robots. The framework proposed combines nonlinear dimensionality reduction, nonlinear regression under noise, and variational Bayesian learning to create consistent probabilistic representations of places from images. These generative models are learnt from a few images and used for multi-class place recognition where classification is computed from a set of feature-vectors. Recognition can be performed in near real-time and accounts for complexity such as changes in illumination, occlusions and blurring. The algorithm was tested with a mobile robot in indoor and outdoor environments with sequences of 1579 and 3820 images respectively. This framework has several potential applications such as map building, autonomous navigation, search-and-rescue tasks and context recognition.
Abstract:
Two archaeal Holliday junction resolving enzymes, Holliday junction cleavage (Hjc) and Holliday junction endonuclease (Hje), have been characterized. Both are members of a nuclease superfamily that includes the type II restriction enzymes, although their DNA cleaving activity is highly specific for four-way junction structure and not nucleic acid sequence. Despite 28% sequence identity, Hje and Hjc cleave junctions with distinct cutting patterns—they cut different strands of a four-way junction, at different distances from the junction centre. We report the high-resolution crystal structure of Hje from Sulfolobus solfataricus. The structure provides a basis to explain the differences in substrate specificity of Hje and Hjc, which result from changes in dimer organization, and suggests a viral origin for the Hje gene. Structural and biochemical data support the modelling of an Hje:DNA junction complex, highlighting a flexible loop that interacts intimately with the junction centre. A highly conserved serine residue on this loop is shown to be essential for the enzyme's activity, suggesting a novel variation of the nuclease active site. The loop may act as a conformational switch, ensuring that the active site is completed only on binding a four-way junction, thus explaining the exquisite specificity of these enzymes.
Abstract:
To date, the majority of films that utilise or feature hip hop music and culture have either been in the realms of documentary, or in ‘show musicals’ (where the film musical’s device of characters bursting into song is justified by the narrative of a pursuit of a career in the entertainment industry). Thus, most films that feature hip hop expression have in some way been tied to the subject of hip hop. A research interest and enthusiasm was developed for utilising hip hop expression in film in a new way, which would extend the narrative possibilities of hip hop film to wider topics and themes. The creation of the thesis film Out of My Cloud, and the writing of this accompanying exegesis, investigates a research concern of the potential for the use of hip hop expression in an ‘integrated musical’ film (where characters break into song without conceit or explanation). Context and rationale for Out of My Cloud (an Australian hip hop ‘integrated musical’ film) is provided in this writing. It is argued that hip hop is particularly suitable for use in a modern narrative film, and particularly in an ‘integrated musical’ film, due to: its current vibrancy and popularity; rap (the vocal element of hip hop) music’s focus on lyrical message and meaning; and rap’s use as an everyday, non-performative method of communication. It is also argued that Australian hip hop deserves greater representation in film and literature due to its current popularity and its nature as a unique and distinct form of hip hop. To date, representation of Australian hip hop in film and television has almost solely been restricted to the documentary form.
Out of My Cloud borrows from elements of social realist cinema such as: contrasts with mainstream cinema, an exploration/recognition of the relationship between environment and development of character, use of non-actors, location-shooting, a political intent of the filmmaker, displaying sympathy for an underclass, representation of underrepresented character types and topics, and a loose narrative structure that does not offer solid resolution. A case is made that it may be appropriate to marry elements of social realist film with hip hop expression due to common characteristics, such as: representation of marginalised or underrepresented groups and issues in society, political objectives of the artist/s, and sympathy for an underclass. In developing and producing Out of My Cloud, a specific method of working with, and filming, actor improvisation was developed. This method was informed by improvisation and associated camera techniques of filmmakers such as Charlie Chaplin, Mike Leigh, Khoa Do, Dogme 95 filmmakers, and Lars von Trier (post-Dogme 95). A review of techniques used by these filmmakers is provided in this writing, as well as the impact they have made on my approach. The method utilised in Out of My Cloud was most influenced by Khoa Do’s technique of guiding actors to improvise fairly loosely, but with a predetermined endpoint in mind. A variation of this technique was developed for use in Out of My Cloud, which involved filming with two cameras to allow edits from multiple angles. Specific processes for creating Out of My Cloud are described and explained in this writing. Particular attention is given to the approaches regarding the story elements and the music elements.
Various significant aspects of the process are referred to, including the filming and recording of live musical performances, the recording of ‘freestyle’ performances (lyrics composed and performed spontaneously) and the creation of a scored musical scene involving a vocal performance without regular timing or rhythm. The documentation of processes in this writing serves to make the successful elements of this film transferable to, and replicable by, other practitioners in the field, whilst flagging missteps so that fellow practitioners can avoid them in future projects. While Out of My Cloud is not without its shortcomings as a short film work (for example in the areas of story and camerawork), it provides a significant contribution to the field as a working example of how hip hop may be utilised in an ‘integrated musical’ film, as well as being a rare example of a narrative film that features Australian hip hop. This film and the accompanying exegesis provide insights that contribute to an understanding of techniques, theories and knowledge in the field of filmmaking practice.
Abstract:
Occlusion is a big challenge for facial expression recognition (FER) in real-world situations. Previous FER efforts to address occlusion suffer from loss of appearance features and are largely limited to a few occlusion types and a single testing strategy. This paper presents a robust approach for FER in occluded images and addresses these issues. A set of Gabor-based templates is extracted from images in the gallery using a Monte Carlo algorithm. These templates are converted into distance features using template matching. The resulting feature vectors are robust to occlusion. Occluded eyes and mouth regions and randomly placed occlusion patches are used for testing. Two testing strategies analyze the effects of these occlusions on the overall recognition performance as well as each facial expression. Experimental results on the Cohn-Kanade database confirm the high robustness of our approach and provide useful insights about the effects of occlusion on FER. Performance is also compared with previous approaches.
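The step of turning templates into distance features by template matching can be sketched on toy data. This is a simplified stand-in rather than the paper's pipeline: it matches raw intensity patches instead of Gabor responses, uses the sum of absolute differences as the distance, and takes the templates as given instead of sampling them with a Monte Carlo procedure. All function names are hypothetical.

```python
def patch_distance(image, template, top, left):
    """Sum of absolute differences between a template and one image window."""
    return sum(
        abs(image[top + r][left + c] - value)
        for r, row in enumerate(template)
        for c, value in enumerate(row)
    )


def distance_feature(image, template):
    """Smallest distance between the template and any window of the image."""
    rows = len(image) - len(template) + 1
    cols = len(image[0]) - len(template[0]) + 1
    return min(
        patch_distance(image, template, top, left)
        for top in range(rows)
        for left in range(cols)
    )


def feature_vector(image, templates):
    """One distance feature per template."""
    return [distance_feature(image, template) for template in templates]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
templates = [[[5, 6], [8, 9]],   # present in the image
             [[0, 0], [0, 0]]]   # not present
print(feature_vector(image, templates))
```

Because each feature depends only on the best-matching window for its template, covering one part of the image changes only the features whose templates match that region. This is one intuition for why such features might tolerate partial occlusion.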