956 results for Multimodal Man-Machine Interface
Abstract:
Cinema, with its passive cinematic apparatus and linear narrative, is often characterised as a contrast to new media narrative strategies, yet from Vertov’s Man with a Movie Camera to Mike Figgis’ TimeCode and Wong Kar-wai’s 2046, cinema provides narrative strategies and spatial conceptualisations which prefigure or are contiguous with new media environments. Both our perception of what cyberspace constitutes and the technology that actualises those perceptions arise out of and are driven by fantasy and desire. This paper will explore the metaphors used to represent and understand new media aesthetics through cinematic representations of new media environments. Two key themes relevant to new media aesthetics emerge. For the first, Irigaray, Haraway, and Grosz are used to explore the de-essentialising haptic and penetrative potential of new technologies and their ability to collapse the boundary between the body and the machine. The second fantasy, of new media as a liminal space that expresses the memorialising function of technology and its relation to mourning, is analysed using Benjamin, Burgin and Rutsky. These altered spaces and perceptions of the body and memory of the post-cinematic subject are illustrated through an analysis of Gondry’s Eternal Sunshine of the Spotless Mind and Jonze’s Being John Malkovich. [From the Author]
Abstract:
In this paper, a novel video-based multimodal biometric verification scheme using subspace-based low-level feature fusion of face and speech is developed for specific speaker recognition in perceptual human-computer interaction (HCI). In the proposed scheme, the human face is tracked and the face pose is estimated to weight the detected face-like regions in successive frames, where ill-posed faces and false-positive detections are assigned lower credit to enhance accuracy. In the audio modality, mel-frequency cepstral coefficients are extracted for voice-based biometric verification. In the fusion step, features from both modalities are projected into a nonlinear Laplacian Eigenmap subspace for multimodal speaker recognition and combined at low level. The proposed approach is tested on a video database of ten human subjects, and the results show that the proposed scheme attains better accuracy than conventional multimodal fusion using latent semantic analysis as well as the single-modality verifications. The experiment in MATLAB shows the potential of the proposed scheme to attain real-time performance for perceptual HCI applications.
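The pose weighting and low-level fusion described in this abstract can be illustrated with a minimal sketch. It assumes each frame yields a face feature vector and a scalar pose "credit"; the function names and the simple concatenation step are hypothetical, and the Laplacian Eigenmap projection used in the paper is deliberately left out.

```python
from typing import List, Sequence


def pose_weighted_mean(frames: Sequence[Sequence[float]],
                       credits: Sequence[float]) -> List[float]:
    """Average per-frame face features, weighting each frame by its pose credit.

    Frames with poor pose or likely false detections carry a lower credit
    and therefore contribute less to the pooled feature vector.
    """
    total = sum(credits)
    if total <= 0:
        raise ValueError("at least one frame needs a positive credit")
    dim = len(frames[0])
    return [sum(c * f[i] for f, c in zip(frames, credits)) / total
            for i in range(dim)]


def fuse_low_level(face: Sequence[float], audio: Sequence[float]) -> List[float]:
    """Assumed fusion step: concatenate face and audio features into one vector."""
    return list(face) + list(audio)


# Two frames; the second has the better pose and gets three times the credit.
face_vec = pose_weighted_mean([[1.0, 0.0], [3.0, 4.0]], [1.0, 3.0])
fused = fuse_low_level(face_vec, [0.1, 0.2])  # audio values stand in for MFCCs
```

In a fuller pipeline the fused vectors would then be projected into the learned subspace before matching.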
Abstract:
SEMAINE has created a large audiovisual database as a part of an iterative approach to building Sensitive Artificial Listener (SAL) agents that can engage a person in a sustained, emotionally colored conversation. Data used to build the agents came from interactions between users and an operator simulating a SAL agent, in different configurations: Solid SAL (designed so that operators displayed an appropriate nonverbal behavior) and Semi-automatic SAL (designed so that users' experience approximated interacting with a machine). We then recorded user interactions with the developed system, Automatic SAL, comparing the most communicatively competent version to versions with reduced nonverbal skills. High quality recording was provided by five high-resolution, high-framerate cameras, and four microphones, recorded synchronously. Recordings total 150 participants, for a total of 959 conversations with individual SAL characters, lasting approximately 5 minutes each. Solid SAL recordings are transcribed and extensively annotated: 6-8 raters per clip traced five affective dimensions and 27 associated categories. Other scenarios are labeled on the same pattern, but less fully. Additional information includes FACS annotation on selected extracts, identification of laughs, nods, and shakes, and measures of user engagement with the automatic system. The material is available through a web-accessible database. © 2010-2012 IEEE.
Abstract:
This paper presents a novel method of audio-visual feature-level fusion for person identification where both the speech and facial modalities may be corrupted, and there is a lack of prior knowledge about the corruption. Furthermore, we assume there is a limited amount of training data for each modality (e.g., a short training speech segment and a single training facial image for each person). A new multimodal feature representation and a modified cosine similarity are introduced to combine and compare bimodal features with limited training data, as well as vastly differing data rates and feature sizes. Optimal feature selection and multicondition training are used to reduce the mismatch between training and testing, thereby making the system robust to unknown bimodal corruption. Experiments have been carried out on a bimodal dataset created from the SPIDRE speaker recognition database and the AR face recognition database with variable noise corruption of speech and occlusion in the face images. The system's speaker identification performance on the SPIDRE database, and facial identification performance on the AR database, are comparable with the literature. Combining both modalities using the new method of multimodal fusion leads to significantly improved accuracy over the unimodal systems, even when both modalities have been corrupted. The new method also shows improved identification accuracy compared with the bimodal systems based on multicondition model training or missing-feature decoding alone.
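As a rough illustration of similarity-based identification over fused feature vectors, the sketch below uses the standard cosine similarity. The abstract does not specify the authors' modified similarity, so this is a stand-in, and the gallery names and vectors are made up for the example.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Standard cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(probe: Sequence[float], gallery: Dict[str, Sequence[float]]) -> str:
    """Return the enrolled identity whose template is most similar to the probe."""
    return max(gallery, key=lambda name: cosine_similarity(probe, gallery[name]))


# Hypothetical enrolled templates (one fused vector per person).
gallery = {"alice": [1.0, 0.0, 1.0], "bob": [0.0, 1.0, 0.0]}
best_match = identify([0.9, 0.1, 0.8], gallery)
```

Cosine similarity depends only on vector direction, which makes it a common choice when the two modalities produce features at very different scales.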
Abstract:
Taking in recent advances in neuroscience and digital technology, Gander and Garland assess the state of the inter-arts in America and the Western world, exploring and questioning the primacy of affect in an increasingly hypertextual everyday environment. In this analysis they signal a move beyond W. J. T. Mitchell’s coinage of the ‘imagetext’ to an approach that centres the reader-viewer in a recognition, after John Dewey, of ‘art as experience’. New thinking in cognitive and computer sciences about the relationship between the body and the mind challenges any established definitions of ‘embodiment’, ‘materiality’, ‘virtuality’ and even ‘intelligence’, they argue, whilst ‘Extended Mind Theory’, they note, marries our cognitive processes with the material forms with which we engage, confirming and complicating Marshall McLuhan’s insight, decades ago, that ‘all media are “extensions of man”’. In this chapter, Gander and Garland open paths and suggest directions into understandings and critical interpretations of new and emerging imagetext worlds and experiences.
Abstract:
The brain-computer interface (BCI) decodes the brain's electrical signals acquired by electroencephalography and transforms these signals into commands to control a device or a piece of software. A limited number of mental tasks have been detected and classified by different research groups. Other types of control are possible: for example, the execution of a foot movement, real or imagined, can modify the brain waves of the motor cortex. We used a BCI to determine whether we could classify forward and backward walking-type navigation, both in real time and offline, using different methods. Ten healthy people participated in the BCI experiment in a virtual tunnel. The experiment was divided into two sessions (48 min each). Each session comprised 320 trials. Subjects were asked to imagine moving forward or backward in the virtual tunnel, in random order, according to a command written on the screen. The trials were conducted with feedback. Three electrodes were mounted on the scalp, over the motor cortex. During the first session, the two tasks (forward and backward navigation) were classified using band power, time-frequency representation, autoregressive models and β-rhythm asymmetry ratios, with linear discriminant analysis and SVM classifiers. Thresholds were calculated offline to form control signals that were used in real time during the second session to initiate, through the user's brain waves, the movement of the virtual tunnel in the requested direction. After 96 min of training, the "online biofeedback" band power method reached a mean classification accuracy of 76%, and offline classification with asymmetry ratios and band power reached a classification accuracy of about 80%.
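A band power feature with a fixed decision threshold, as mentioned in this abstract, might be sketched as follows. This is a minimal illustration using a plain discrete Fourier transform. The sampling rate, band limits, threshold value and the mapping of high or low power to "forward" or "backward" are all assumptions and are not taken from the study.

```python
import cmath
import math
from typing import Sequence


def band_power(signal: Sequence[float], fs: float,
               f_lo: float, f_hi: float) -> float:
    """Sum the squared DFT magnitudes (scaled by n^2) of bins inside [f_lo, f_hi] Hz."""
    n = len(signal)
    power = 0.0
    for k in range(1, n // 2 + 1):
        freq = k * fs / n
        if f_lo <= freq <= f_hi:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(signal))
            power += abs(coeff) ** 2 / n ** 2
    return power


def classify(power: float, threshold: float) -> str:
    """Assumed rule: power above the offline threshold means 'forward'."""
    return "forward" if power > threshold else "backward"


# One second of a synthetic 20 Hz oscillation sampled at 128 Hz.
fs = 128
eeg = [math.sin(2 * math.pi * 20 * t / fs) for t in range(fs)]
beta = band_power(eeg, fs, 13.0, 30.0)   # beta band contains the 20 Hz tone
alpha = band_power(eeg, fs, 8.0, 12.0)   # alpha band is essentially empty
command = classify(beta, threshold=0.1)
```

In practice the threshold would be estimated offline from labelled trials, which is the role the first session plays in the experiment described above.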
Abstract:
Learning Disability (LD) is a general term that describes specific kinds of learning problems. It is a neurological condition that affects a child's brain and impairs the child's ability to carry out one or many specific tasks. Children with learning disabilities are neither slow nor mentally retarded. This disorder can make it difficult for a child to learn as quickly or in the same way as a child who isn't affected by a learning disability. An affected child can have normal or above average intelligence. Such children may have difficulty paying attention, with reading or letter recognition, or with mathematics. This does not mean that children who have learning disabilities are less intelligent. In fact, many children who have learning disabilities are more intelligent than an average child. Learning disabilities vary from child to child. One child with LD may not have the same kind of learning problems as another child with LD. There is no cure for learning disabilities and they are life-long. However, children with LD can be high achievers and can be taught ways to get around the learning disability. In this research work, data mining using machine learning techniques is used to analyze the symptoms of LD, establish interrelationships between them and evaluate the relative importance of these symptoms. To increase the diagnostic accuracy of learning disability prediction, a knowledge based tool is proposed that applies statistical machine learning or data mining techniques to the knowledge obtained from clinical information. The basic idea of the developed knowledge based tool is to increase the accuracy of the learning disability assessment and reduce the time it takes. Different statistical machine learning techniques in data mining are used in the study.
Identifying the important parameters of LD prediction using data mining techniques, identifying the hidden relationships between the symptoms of LD and estimating the relative significance of each symptom of LD are also among the objectives of this research work. The developed tool has many advantages compared to the traditional method of using checklists to determine learning disabilities. To improve the performance of various classifiers, we developed some preprocessing methods for the LD prediction system. A new system based on fuzzy and rough set models is also developed for LD prediction. Here too, the importance of preprocessing is studied. A Graphical User Interface (GUI) is designed for developing an integrated knowledge based tool for prediction of LD as well as its degree. The designed tool stores the details of the children in the student database and retrieves their LD report as and when required. The present study undoubtedly proves the effectiveness of the tool developed based on various machine learning techniques. It also identifies the important parameters of LD and accurately predicts the learning disability in school age children. This thesis makes several major contributions in technical, general and social areas. The results are found to be very beneficial to parents, teachers and institutions. They are able to diagnose the child’s problem at an early stage and can seek proper treatment or counseling at the correct time so as to avoid academic and social losses.
Abstract:
Friction welding is a solid state joining process that produces coalescence in materials, using the heat developed between surfaces through a combination of mechanically induced rubbing motion and applied load. In the rotary friction welding technique, heat is generated by the conversion of mechanical energy into thermal energy at the interface of the work pieces during rotation under pressure. Traditionally, friction welding is carried out on a dedicated machine because of its adaptability to mass production. In the present work, a conventional lathe was modified into a rotary friction welding setup in order to obtain friction welds with different interface surface geometries at two different speeds and to carry out tensile characteristic studies. The surface geometries welded include flat-flat, flat-tapered, tapered-tapered, concave-convex and convex-convex. The maximum load, breaking load and percentage elongation of the different welded geometries were compared. The maximum load and breaking load were found to be highest for the weld formed between a rotating flat and a stationary tapered surface at 500 RPM, with values of 19.219 kN and 14.28 kN respectively. The percentage elongation was found to be highest for the weld formed between a rotating flat and a stationary flat surface at 500 RPM, with a value of 21.4%. Hence, the studies make clear that process parameters such as the interfacing surface geometries of the weld specimens have a strong influence on the tensile characteristics of friction welded joints.
Abstract:
Presentation at the 1997 Dagstuhl Seminar "Evaluation of Multimedia Information Retrieval", Norbert Fuhr, Keith van Rijsbergen, Alan F. Smeaton (eds.), Dagstuhl Seminar Report 175, 14.04. - 18.04.97 (9716). - Abstract: This presentation will introduce ESCHER, a database editor which supports visualization in non-standard applications in engineering, science, tourism and the entertainment industry. It was originally based on the extended nested relational data model and is currently extended to include object-relational properties like inheritance, object types, integrity constraints and methods. It serves as a research platform into areas such as multimedia and visual information systems, QBE-like queries, computer-supported concurrent work (CSCW) and novel storage techniques. In its role as a Visual Information System, a database editor must support browsing and navigation. ESCHER provides this access to data by means of so called fingers. They generalize the cursor paradigm in graphical and text editors. On the graphical display, a finger is reflected by a colored area which corresponds to the object a finger is currently pointing at. In a table more than one finger may point to objects, one of which is the active finger and is used for navigating through the table. The talk will mostly concentrate on giving examples for this type of navigation and will discuss some of the architectural needs for fast object traversal and display. ESCHER is available as public domain software from our ftp site in Kassel. The portable C source can be easily compiled for any machine running UNIX and OSF/Motif, in particular our working environments IBM RS/6000 and Intel-based LINUX systems. A porting to Tcl/Tk is under way.
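The finger idea, a cursor generalized to nested tables, can be sketched in a few lines. This is only an illustration under stated assumptions: the nested relation is modeled as nested Python lists, and the `Finger` class with its `down`, `up` and `next_sibling` methods is hypothetical rather than ESCHER's actual interface.

```python
from typing import Any, List


class Finger:
    """A cursor into a nested table, stored as a path of child indices."""

    def __init__(self, root: List[Any]) -> None:
        self.root = root
        self.path: List[int] = []

    def _resolve(self, path: List[int]) -> Any:
        obj = self.root
        for index in path:
            obj = obj[index]
        return obj

    def target(self) -> Any:
        """Return the object the finger currently points at."""
        return self._resolve(self.path)

    def down(self) -> bool:
        """Descend to the first child if the target is a non-empty nested value."""
        current = self.target()
        if isinstance(current, list) and current:
            self.path.append(0)
            return True
        return False

    def up(self) -> bool:
        """Move back to the enclosing object, if any."""
        if self.path:
            self.path.pop()
            return True
        return False

    def next_sibling(self) -> bool:
        """Move to the next object at the same nesting level, if one exists."""
        if not self.path:
            return False
        parent = self._resolve(self.path[:-1])
        if self.path[-1] + 1 < len(parent):
            self.path[-1] += 1
            return True
        return False


# Hypothetical nested relation: each row holds a name and a nested list of topics.
table = [["Kassel", ["db", "hci"]], ["Dagstuhl", ["ir"]]]
finger = Finger(table)
finger.down()          # first row
finger.next_sibling()  # second row
finger.down()          # "Dagstuhl"
finger.next_sibling()  # the nested topics list of that row
```

Several such fingers could point into the same table at once, with one of them designated as the active finger for navigation.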
Abstract:
This thesis defines Pi, a parallel architecture interface that separates model and machine issues, allowing them to be addressed independently. This provides greater flexibility for both the model and machine builder. Pi addresses a set of common parallel model requirements including low latency communication, fast task switching, low cost synchronization, efficient storage management, the ability to exploit locality, and efficient support for sequential code. Since Pi provides generic parallel operations, it can efficiently support many parallel programming models including hybrids of existing models. Pi also forms a basis of comparison for architectural components.
Abstract:
In this thesis, I designed and implemented a virtual machine (VM) for a monomorphic variant of Athena, a type-omega denotational proof language (DPL). This machine attempts to maintain the minimum state required to evaluate Athena phrases. This thesis also includes the design and implementation of a compiler for monomorphic Athena that compiles to the VM. Finally, it includes details on my implementation of a read-eval-print loop that glues together the VM core and the compiler to provide a full, user-accessible interface to monomorphic Athena. The Athena VM provides the same basis for DPLs that the SECD machine does for pure, functional programming and the Warren Abstract Machine does for Prolog.
Abstract:
This paper looks at how implant technology can be used to increase the range of abilities of a human and/or diminish the effects of a neural illness, such as Parkinson's Disease. The key element is the need for a clear interface linking the human brain directly with a computer. The area of interest here is the use of implant technology, particularly where a connection is made between technology and the human brain and/or nervous system. Pilot tests and experimentation are invariably carried out a priori to investigate the eventual possibilities before human subjects are themselves involved. Some of the more pertinent animal studies are discussed here. The paper goes on to describe human experimentation, in particular that carried out by the author himself, which led to him receiving a neural implant which linked his nervous system bi-directionally with the internet. With this in place, neural signals were transmitted to various technological devices to directly control them. In particular, feedback to the brain was obtained from the fingertips of a robot hand and ultrasonic (extra) sensory input. A view is taken as to the prospects for the future, both in the near term as a therapeutic device and in the long term as a form of enhancement.
Abstract:
The interface between humans and technology is a rapidly changing field. In particular, as technological methods have improved dramatically, interaction has become possible that could only be speculated about even a decade earlier. This interaction can, though, take on a wide range of forms. Indeed, standard buttons and dials with televisual feedback are perhaps a common example. But now virtual reality systems, wearable computers and, most of all, implant technology are throwing up a completely new concept, namely a symbiosis of human and machine. No longer is it sensible simply to consider how a human interacts with a machine, but rather how the human-machine symbiotic combination interacts with the outside world. In this paper we take a look at some of the recent approaches, putting implant technology in context. We also consider some specific practical examples which may well alter the way we look at this symbiosis in the future. The main area of interest as far as symbiotic studies are concerned is clearly the use of implant technology, particularly where a connection is made between technology and the human brain and/or nervous system. Often pilot tests and experimentation have been carried out a priori to investigate the eventual possibilities before human subjects are themselves involved. Some of the more pertinent animal studies are discussed briefly here. The paper, however, concentrates on human experimentation, in particular that carried out by the authors themselves, firstly to indicate what possibilities exist as of now with available technology, but perhaps more importantly to also show what might be possible with such technology in the future and how this may well have extensive social effects.
The driving force behind the integration of technology with humans on a neural level has historically been to restore lost functionality in individuals who have suffered neurological trauma such as spinal cord damage, or who suffer from a debilitating disease such as amyotrophic lateral sclerosis. Very few would argue against the development of implants to enable such people to control their environment, or some aspect of their own body functions. Indeed, this technology in the short term has applications for amelioration of symptoms for the physically impaired, such as alternative senses being bestowed on a blind or deaf individual. However, the issue becomes distinctly more complex when it is proposed that such technology be used on those with no medical need, but who instead wish to enhance and augment their own bodies, particularly in terms of their mental attributes. These issues are discussed here in the light of practical experimental test results and their ethical consequences.
Abstract:
We introduce transreal analysis as a generalisation of real analysis. We find that the generalisation of the real exponential and logarithmic functions is well defined for all transreal numbers. Hence, we derive well defined values of all transreal powers of all non-negative transreal numbers. In particular, we find a well defined value for zero to the power of zero. We also note that the computation of products via the transreal logarithm is identical to the transreal product, as expected. We then generalise all of the common, real, trigonometric functions to transreal functions and show that transreal (sin x)/x is well defined everywhere. This raises the possibility that transreal analysis is total, in other words, that every function and every limit is everywhere well defined. If so, transreal analysis should be an adequate mathematical basis for analysing the perspex machine - a theoretical, super-Turing machine that operates on a total geometry. We go on to dispel all of the standard counter "proofs" that purport to show that division by zero is impossible. This is done simply by carrying the proof through in transreal arithmetic or transreal analysis. We find that either the supposed counter proof has no content or else that it supports the contention that division by zero is possible. The supposed counter proofs rely on extending the standard systems in arbitrary and inconsistent ways and then showing, tautologously, that the chosen extensions are not consistent. This shows only that the chosen extensions are inconsistent and does not bear on the question of whether division by zero is logically possible. By contrast, transreal arithmetic is total and consistent so it defeats any possible "straw man" argument. Finally, we show how to arrange that a function has finite or else unmeasurable (nullity) values, but no infinite values. This arithmetical arrangement might prove useful in mathematical physics because it outlaws naked singularities in all equations.
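Total division in transreal arithmetic can be illustrated with a short sketch. It assumes the usual transreal conventions, in which a positive number divided by zero is infinity, a negative number divided by zero is negative infinity, zero divided by zero is nullity, and the reciprocal of either infinity is zero. Representing nullity by a floating-point NaN is an encoding choice made only for this example; unlike nullity, a NaN does not compare equal to itself, so nullity is tested with `math.isnan`.

```python
import math

INF = math.inf
NULLITY = math.nan  # stand-in encoding for the transreal number nullity


def transreal_div(a: float, b: float) -> float:
    """Divide a by b so that every pair of transreal inputs has a result."""
    if math.isnan(a) or math.isnan(b):
        return NULLITY                  # nullity absorbs every operation
    if b == 0:
        if a > 0:
            return INF                  # positive / 0 = infinity
        if a < 0:
            return -INF                 # negative / 0 = -infinity
        return NULLITY                  # 0 / 0 = nullity
    if math.isinf(b):
        # The reciprocal of either infinity is 0, and infinity * 0 = nullity.
        return NULLITY if math.isinf(a) else 0.0
    return a / b                        # ordinary division, b finite and nonzero


examples = {
    "1/0": transreal_div(1.0, 0.0),
    "-1/0": transreal_div(-1.0, 0.0),
    "0/0": transreal_div(0.0, 0.0),
    "5/inf": transreal_div(5.0, INF),
    "inf/inf": transreal_div(INF, INF),
}
```

Because no branch raises an exception, the function is total over the encoded transreal numbers, which is the property the abstract relies on.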