983 resultados para Online handwriting recognition


Relevância:

20.00% 20.00%

Publicador:

Resumo:

This article examines the growing phenomenon of online dating and intimacy in the 21st century. The exponential rise of communications technologies, which is both reflective and constitutive of an increasingly networked and globalized society, has the potential to significantly influence the nature of intimacy in everyday life. Yet, to date, there has been a minimal response by sociologists to seek, describe and understand this influence. In this article, we present some of the key findings of our research on online dating in Australia, in order to foster a debate about the sociological impacts on intimacy in the postmodern world. Based on a web audit of more than 60 online dating sites and in-depth interviews with 23 users of online dating services, we argue that recent global trends are influencing the uptake of online technologies for the purposes of forming intimate relations. Further, some of the mediating effects of these technologies – in particular, the hypercommunication – may have specific implications for the nature of intimacy in the global era.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Knowledge intensive services are the fastest growing segment of the international economy and the digital creative industries are a key segment therein. Australia is well positioned to exploit this opportunity but has a skills shortage in the digital content industries in terms of commercial ready graduates. We report on a solution to this problem, in the form of an online creative community of practice – www.60Sox.org - where new graduates are mentored by Australian industry leaders - the 2bobmob. We describe this community of practice as a virtual creative ecology and discuss networks, peer feedback and mentoring as key elements of post-tertiary learning, in the context of portfolio career progression.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Recovering position from sensor information is an important problem in mobile robotics, known as localisation. Localisation requires a map or some other description of the environment to provide the robot with a context to interpret sensor data. The mobile robot system under discussion is using an artificial neural representation of position. Building a geometrical map of the environment with a single camera and artificial neural networks is difficult. Instead it would be simpler to learn position as a function of the visual input. Usually when learning images, an intermediate representation is employed. An appropriate starting point for biologically plausible image representation is the complex cells of the visual cortex, which have invariance properties that appear useful for localisation. The effectiveness for localisation of two different complex cell models are evaluated. Finally the ability of a simple neural network with single shot learning to recognise these representations and localise a robot is examined.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The paper presents a fast and robust stereo object recognition method. The method is currently unable to identify the rotation of objects. This makes it very good at locating spheres which are rotationally independent. Approximate methods for located non-spherical objects have been developed. Fundamental to the method is that the correspondence problem is solved using information about the dimensions of the object being located. This is in contrast to previous stereo object recognition systems where the scene is first reconstructed by point matching techniques. The method is suitable for real-time application on low-power devices.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Acoustically, car cabins are extremely noisy and as a consequence audio-only, in-car voice recognition systems perform poorly. As the visual modality is immune to acoustic noise, using the visual lip information from the driver is seen as a viable strategy in circumventing this problem by using audio visual automatic speech recognition (AVASR). However, implementing AVASR requires a system being able to accurately locate and track the drivers face and lip area in real-time. In this paper we present such an approach using the Viola-Jones algorithm. Using the AVICAR [1] in-car database, we show that the Viola- Jones approach is a suitable method of locating and tracking the driver’s lips despite the visual variability of illumination and head pose for audio-visual speech recognition system.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Queensland University of Technology (QUT) is a multidisciplinary university in Brisbane, Queensland, Australia, and has 40,000 students and 1,700 researchers. Notable eResearch infrastructure includes the QUT ePrints repository, Microsoft QUT Research Centre, the OAK (Open Access to Knowledge) Law Project, Cambia and leading research institutes. ---------- The Australian Government, via the Australian National Data Service (ANDS), is funding institutions to identify and describe their research datasets, to develop and populate data repositories and collaborative infrastructure, and to seed the Australian Research Data Commons. QUT is currently broadening its range of research support services, including those to support the management of research data, in recognition of the value of these datasets as products of the research process, and in order to maximize the potential for reuse. QUT is integrating Library and High Performance Computing (HPC) services to achieve its research support goals. ---------- The Library and HPC released an online survey using Key Survey to 1,700 researchers in September 2009. A comprehensive range of eResearch practices and skills was presented for response, and grouped into areas of scholarly communication and open access publishing, using collaborative technologies, data management, data collection and management, computation and visualization tools. Researchers were asked to rate their skill level on each practice. 254 responses were received over two weeks. Eight focus groups were also held with 35 higher degree research (HDR) students and staff to provide additional qualitative feedback. A similar survey was released to 100 support staff and 73 responses were received.---------- Preliminary results from the researcher survey and focus groups indicate a gap between current eResearch practices, and the potential for researchers to engage in eResearch practices. Researchers are more likely to seek advice from their peers, than from support staff. HDR students are more positive about eResearch practices and are more willing to learn new ways of conducting research. An account of the survey methodology, the results obtained, and proposed strategies to embed eResearch practices and skills across and within the research disciplines will be provided.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Several approaches have been proposed to recognize handwritten Bengali characters using different curve fitting algorithms and curvature analysis. In this paper, a new algorithm (Curve-fitting Algorithm) to identify various strokes of a handwritten character is developed. The curve-fitting algorithm helps recognizing various strokes of different patterns (line, quadratic curve) precisely. This reduces the error elimination burden heavily. Implementation of this Modified Syntactic Method demonstrates significant improvement in the recognition of Bengali handwritten characters.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Acoustically, car cabins are extremely noisy and as a consequence, existing audio-only speech recognition systems, for voice-based control of vehicle functions such as the GPS based navigator, perform poorly. Audio-only speech recognition systems fail to make use of the visual modality of speech (eg: lip movements). As the visual modality is immune to acoustic noise, utilising this visual information in conjunction with an audio only speech recognition system has the potential to improve the accuracy of the system. The field of recognising speech using both auditory and visual inputs is known as Audio Visual Speech Recognition (AVSR). Continuous research in AVASR field has been ongoing for the past twenty-five years with notable progress being made. However, the practical deployment of AVASR systems for use in a variety of real-world applications has not yet emerged. The main reason is due to most research to date neglecting to address variabilities in the visual domain such as illumination and viewpoint in the design of the visual front-end of the AVSR system. In this paper we present an AVASR system in a real-world car environment using the AVICAR database [1], which is publicly available in-car database and we show that the use of visual speech conjunction with the audio modality is a better approach to improve the robustness and effectiveness of voice-only recognition systems in car cabin environments.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

When classifying a signal, ideally we want our classifier to trigger a large response when it encounters a positive example and have little to no response for all other examples. Unfortunately in practice this does not occur with responses fluctuating, often causing false alarms. There exists a myriad of reasons why this is the case, most notably not incorporating the dynamics of the signal into the classification. In facial expression recognition, this has been highlighted as one major research question. In this paper we present a novel technique which incorporates the dynamics of the signal which can produce a strong response when the peak expression is found and essentially suppresses all other responses as much as possible. We conducted preliminary experiments on the extended Cohn-Kanade (CK+) database which shows its benefits. The ability to automatically and accurately recognize facial expressions of drivers is highly relevant to the automobile. For example, the early recognition of “surprise” could indicate that an accident is about to occur; and various safeguards could immediately be deployed to avoid or minimize injury and damage. In this paper, we conducted initial experiments on the extended Cohn-Kanade (CK+) database which shows its benefits.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The professional doctorate is a degree that is specifically designed for professionals investigating real-world problems and relevant issues for a profession, industry and/or the community. The exploratory study on which this paper is based sought to track the scholarly skill development of a cohort of professional doctoral students who commenced their course in January 2008 at an Australian university. Via an initial survey and two focus groups held six months apart, the study aimed to determine if there had been any qualitative shifts in students’ understandings, expectations and perceptions regarding their developing knowledge and skills. Three key findings that emerged from this study were: (i) the appropriateness of using a blended learning approach in this professional doctoral program; (ii) the challenges of using wikis as an online technology for creating communities of practice; and (iii) the transition from professional to scholar is a process that requires the guided support inherent in the design of this particular doctorate of education program.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

While close talking microphones give the best signal quality and produce the highest accuracy from current Automatic Speech Recognition (ASR) systems, the speech signal enhanced by microphone array has been shown to be an effective alternative in a noisy environment. The use of microphone arrays in contrast to close talking microphones alleviates the feeling of discomfort and distraction to the user. For this reason, microphone arrays are popular and have been used in a wide range of applications such as teleconferencing, hearing aids, speaker tracking, and as the front-end to speech recognition systems. With advances in sensor and sensor network technology, there is considerable potential for applications that employ ad-hoc networks of microphone-equipped devices collaboratively as a virtual microphone array. By allowing such devices to be distributed throughout the users’ environment, the microphone positions are no longer constrained to traditional fixed geometrical arrangements. This flexibility in the means of data acquisition allows different audio scenes to be captured to give a complete picture of the working environment. In such ad-hoc deployment of microphone sensors, however, the lack of information about the location of devices and active speakers poses technical challenges for array signal processing algorithms which must be addressed to allow deployment in real-world applications. While not an ad-hoc sensor network, conditions approaching this have in effect been imposed in recent National Institute of Standards and Technology (NIST) ASR evaluations on distant microphone recordings of meetings. The NIST evaluation data comes from multiple sites, each with different and often loosely specified distant microphone configurations. This research investigates how microphone array methods can be applied for ad-hoc microphone arrays. A particular focus is on devising methods that are robust to unknown microphone placements in order to improve the overall speech quality and recognition performance provided by the beamforming algorithms. In ad-hoc situations, microphone positions and likely source locations are not known and beamforming must be achieved blindly. There are two general approaches that can be employed to blindly estimate the steering vector for beamforming. The first is direct estimation without regard to the microphone and source locations. An alternative approach is instead to first determine the unknown microphone positions through array calibration methods and then to use the traditional geometrical formulation for the steering vector. Following these two major approaches investigated in this thesis, a novel clustered approach which includes clustering the microphones and selecting the clusters based on their proximity to the speaker is proposed. Novel experiments are conducted to demonstrate that the proposed method to automatically select clusters of microphones (ie, a subarray), closely located both to each other and to the desired speech source, may in fact provide a more robust speech enhancement and recognition than the full array could.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Traditional speech enhancement methods optimise signal-level criteria such as signal-to-noise ratio, but these approaches are sub-optimal for noise-robust speech recognition. Likelihood-maximising (LIMA) frameworks are an alternative that optimise parameters of enhancement algorithms based on state sequences generated for utterances with known transcriptions. Previous reports of LIMA frameworks have shown significant promise for improving speech recognition accuracies under additive background noise for a range of speech enhancement techniques. In this paper we discuss the drawbacks of the LIMA approach when multiple layers of acoustic mismatch are present – namely background noise and speaker accent. Experimentation using LIMA-based Mel-filterbank noise subtraction on American and Australian English in-car speech databases supports this discussion, demonstrating that inferior speech recognition performance occurs when a second layer of mismatch is seen during evaluation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Product placement is a fast growing multi-billion dollar industry yet measures of its effectiveness, which influence the critical area of pricing, have been problematic. Past attempts to measure the effect of a placement, and therefore provide a basis for pricing of placements, have been confounded by the effect on consumers of multiple prior exposures of a brand name in all marketing communications. Virtual product placement offers certain advantages: as a tool to measure the effectiveness of product placements; assistance with the problem of lack of audience selectivity in traditional product placement; testing different audiences for brands and addressing a gap in the existing academic literature by focusing on the impact of product placement on recall and recognition of new brands.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Presentation describling a project in data intensive research in the humanities. Measuring activity of publically available data in social networks such as Blogosphere, Twitter, Flickr, YouTube

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In recent times, the improved levels of accuracy obtained by Automatic Speech Recognition (ASR) technology has made it viable for use in a number of commercial products. Unfortunately, these types of applications are limited to only a few of the world’s languages, primarily because ASR development is reliant on the availability of large amounts of language specific resources. This motivates the need for techniques which reduce this language-specific, resource dependency. Ideally, these approaches should generalise across languages, thereby providing scope for rapid creation of ASR capabilities for resource poor languages. Cross Lingual ASR emerges as a means for addressing this need. Underpinning this approach is the observation that sound production is largely influenced by the physiological construction of the vocal tract, and accordingly, is human, and not language specific. As a result, a common inventory of sounds exists across languages; a property which is exploitable, as sounds from a resource poor, target language can be recognised using models trained on resource rich, source languages. One of the initial impediments to the commercial uptake of ASR technology was its fragility in more challenging environments, such as conversational telephone speech. Subsequent improvements in these environments has gained consumer confidence. Pragmatically, if cross lingual techniques are to considered a viable alternative when resources are limited, they need to perform under the same types of conditions. Accordingly, this thesis evaluates cross lingual techniques using two speech environments; clean read speech and conversational telephone speech. Languages used in evaluations are German, Mandarin, Japanese and Spanish. Results highlight that previously proposed approaches provide respectable results for simpler environments such as read speech, but degrade significantly when in the more taxing conversational environment. Two separate approaches for addressing this degradation are proposed. The first is based on deriving better target language lexical representation, in terms of the source language model set. The second, and ultimately more successful approach, focuses on improving the classification accuracy of context-dependent (CD) models, by catering for the adverse influence of languages specific phonotactic properties. Whilst the primary research goal in this thesis is directed towards improving cross lingual techniques, the catalyst for investigating its use was based on expressed interest from several organisations for an Indonesian ASR capability. In Indonesia alone, there are over 200 million speakers of some Malay variant, provides further impetus and commercial justification for speech related research on this language. Unfortunately, at the beginning of the candidature, limited research had been conducted on the Indonesian language in the field of speech science, and virtually no resources existed. This thesis details the investigative and development work dedicated towards obtaining an ASR system with a 10000 word recognition vocabulary for the Indonesian language.