916 results for Gesture based audio user interface


Relevance:

40.00% 40.00%

Publisher:

Abstract:

The convergence of data, audio and video on IP networks is changing the way individuals, groups and organizations communicate. This diversity of communication media presents opportunities for creating synergistic collaborative communications. This form of collaborative communication is, however, not without its challenges. The increasing number of communication service providers, coupled with a combinatorial mix of offered services, varying Quality-of-Service and oscillating pricing of services, makes it more complex for users to manage and maintain ‘always best’ priced or performing services. Consumers have to manually manage and adapt their communication in line with differences in services across devices, networks and media while ensuring that the usage remains consistent with their intended goals. This dissertation proposes a novel user-centric approach to address this problem. The proposed approach aims to reduce this complexity for the user by (1) providing high-level abstractions and a policy-based methodology for the automated selection of communication services guided by high-level user policies and (2) providing services through the seamless integration of multiple communication service providers, together with an extensible framework to support that integration. The approach was implemented in the Communication Virtual Machine (CVM), a model-driven technology for realizing communication applications. The CVM includes the Network Communication Broker (NCB), the layer responsible for providing a network-independent API to the upper layers of CVM. The initial prototype of the NCB supported only a single communication framework, which limited the number, quality and types of services available. Experimental evaluation shows that the additional overhead of the approach is minimal compared to the individual communication services frameworks. Additionally, the proposed automated approach outperformed the individual communication services frameworks for cross-framework switching.
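The policy-guided selection described in this abstract can be illustrated with a small Python sketch. It is not the CVM or NCB implementation, which the abstract does not detail; the `ServiceOffer` and `UserPolicy` types, their fields, and the provider names are all hypothetical and exist only to show how a high-level policy might pick among offers from several providers.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServiceOffer:
    """One service offered by one provider (all fields are illustrative)."""
    provider: str
    medium: str            # e.g. "audio" or "video"
    price_per_min: float   # current price, arbitrary currency units
    quality: float         # normalized Quality-of-Service score in [0, 1]


@dataclass(frozen=True)
class UserPolicy:
    """A high-level user policy: a quality floor plus a preference."""
    medium: str
    min_quality: float
    prefer: str = "price"  # "price" (cheapest) or "quality" (best QoS)


def select_offer(offers: list[ServiceOffer], policy: UserPolicy) -> Optional[ServiceOffer]:
    """Return the offer that best satisfies the policy, or None if none qualifies."""
    candidates = [
        o for o in offers
        if o.medium == policy.medium and o.quality >= policy.min_quality
    ]
    if not candidates:
        return None
    if policy.prefer == "quality":
        # Highest quality wins; ties are broken by the lower price.
        return max(candidates, key=lambda o: (o.quality, -o.price_per_min))
    # Lowest price wins; ties are broken by the higher quality.
    return min(candidates, key=lambda o: (o.price_per_min, -o.quality))


# Hypothetical offers from three providers for the same medium.
offers = [
    ServiceOffer("provider_a", "audio", 0.05, 0.90),
    ServiceOffer("provider_b", "audio", 0.03, 0.55),
    ServiceOffer("provider_c", "audio", 0.04, 0.80),
]
cheapest_ok = select_offer(offers, UserPolicy("audio", min_quality=0.6))
best_quality = select_offer(offers, UserPolicy("audio", min_quality=0.6, prefer="quality"))
```

Re-running `select_offer` whenever prices or quality scores change is one simple way to approximate the "always best" behavior the abstract mentions, without the user switching providers by hand.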

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The move from Standard Definition (SD) to High Definition (HD) represents a sixfold increase in the data that needs to be processed. With expanding resolutions and evolving compression, there is a need for high performance with flexible architectures that allow for quick upgradability. The technology continues to advance in image display resolution, compression techniques, and video intelligence. Software implementations of these systems can attain accuracy, with tradeoffs among processing performance (to achieve specified frame rates while working on large image data sets), power and cost constraints. There is a need for new architectures that keep pace with the fast innovations in video and imaging. The proposed design contains a dedicated hardware implementation of the pixel- and frame-rate processes on a Field Programmable Gate Array (FPGA) to achieve real-time performance.

The following outlines the contributions of the dissertation. (1) We develop a target detection system by applying a novel running average mean threshold (RAMT) approach to globalize the threshold required for background subtraction. This approach adapts the threshold automatically to different environments (indoor and outdoor) and different targets (humans and vehicles). For low power consumption and better performance, we design the complete system on FPGA. (2) We introduce a safe distance factor and develop an algorithm for occlusion occurrence detection during target tracking. A novel mean-threshold is calculated by motion-position analysis. (3) A new strategy for gesture recognition is developed using Combinational Neural Networks (CNN) based on a tree structure. The method is analyzed on American Sign Language (ASL) gestures. We introduce a novel points-of-interest approach to reduce the feature vector size and a gradient threshold approach for accurate classification. (4) We design a gesture recognition system using a hardware/software co-simulation neural network to obtain the high speed and low memory storage requirements that the FPGA provides. We develop an innovative maximum distant algorithm which uses only 0.39% of the image as the feature vector to train and test the system design. The gesture database sets involved in different applications may vary. Therefore, it is essential to keep the feature vector as small as possible while maintaining the same accuracy and performance.
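The abstract does not give the RAMT formulas, so the following Python sketch only illustrates the general idea of background subtraction with a running-average background and a threshold derived from the frame itself. The update rate `alpha`, the `scale` factor, and the rule "threshold = scale × mean absolute difference" are assumptions made for illustration, not the dissertation's method.

```python
def update_background(background, frame, alpha=0.1):
    """Exponential running average: B <- (1 - alpha) * B + alpha * F."""
    return [(1.0 - alpha) * b + alpha * f for b, f in zip(background, frame)]


def mean_threshold(background, frame, scale=2.0):
    """Global threshold taken as a multiple of the mean absolute difference."""
    diffs = [abs(f - b) for b, f in zip(background, frame)]
    return scale * sum(diffs) / len(diffs)


def foreground_mask(background, frame, scale=2.0):
    """Mark pixels whose difference from the background exceeds the threshold."""
    threshold = mean_threshold(background, frame, scale)
    return [1 if abs(f - b) > threshold else 0 for b, f in zip(background, frame)]


# Toy 1-D "image" of 8 grayscale pixels; a bright target covers pixels 3 and 4.
background = [10.0] * 8
frame = [10.0, 11.0, 9.0, 200.0, 210.0, 10.0, 12.0, 10.0]

mask = foreground_mask(background, frame)           # 1 marks a target pixel
background = update_background(background, frame)   # slowly absorb the new frame
```

Because the threshold is recomputed from each frame's own difference statistics, it rises in noisy scenes and falls in quiet ones, which is the kind of automatic adaptation to indoor and outdoor conditions that the abstract describes.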

Relevance:

40.00% 40.00%

Publisher:

Abstract:

This dissertation introduces the design of a multimodal, adaptive real-time assistive system as an alternative human-computer interface that can be used by individuals with severe motor disabilities. The proposed design is based on the integration of a remote eye-gaze tracking system, voice recognition software, and a virtual keyboard. The methodology relies on a user profile that customizes eye-gaze tracking using neural networks. The user profiling feature facilitates the notion of universal access to computing resources for a wide range of applications such as web browsing, email, word processing and editing.

The study is significant in terms of the integration of key algorithms to yield an adaptable and multimodal interface. The contributions of this dissertation stem from the following accomplishments: (a) establishment of the data transport mechanism between the eye-gaze system and the host computer, yielding a low failure rate of 0.9%; (b) accurate translation of eye data into cursor movement through a series of steps that conclude with calibrated cursor coordinates using an improved conversion function, resulting in an average reduction of 70% in the disparity between the point of gaze and the actual position of the mouse cursor, compared with initial findings; (c) use of both a moving average and a trained neural network to minimize the jitter of the mouse cursor, which yields an average jitter reduction of 35%; (d) introduction of a new mathematical methodology to measure the degree of jitter of the mouse trajectory; (e) embedding of an onscreen keyboard to facilitate text entry, and a graphical interface that is used to generate user profiles for system adaptability.

The adaptable nature of the interface is achieved through the establishment of user profiles, which may contain the jitter and voice characteristics of a particular user as well as a customized list of the most commonly used words ordered according to the user's preferences: in alphabetical or statistical order. This allows the system to successfully provide the capability of interacting with a computer. Every time any of the sub-systems is retrained, the accuracy of the interface response improves even more.
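The moving-average part of the jitter reduction mentioned in this abstract can be sketched in a few lines of Python. The window size and the sample gaze coordinates are made up, and the path-length measure used here is only a simple stand-in, not the new jitter metric the dissertation introduces (which the abstract does not define).

```python
from collections import deque
import math


class MovingAverageCursor:
    """Smooths raw gaze points by averaging the last `window` samples."""

    def __init__(self, window=5):
        self._points = deque(maxlen=window)

    def update(self, x, y):
        """Add a raw sample and return the smoothed cursor position."""
        self._points.append((x, y))
        n = len(self._points)
        return (sum(p[0] for p in self._points) / n,
                sum(p[1] for p in self._points) / n)


def path_length(points):
    """Total distance travelled along a trajectory (a crude jitter proxy)."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


# Hypothetical raw gaze samples while the user fixates near (100, 100).
raw = [(100, 100), (104, 97), (98, 103), (103, 99), (97, 102), (102, 98), (99, 101)]

smoother = MovingAverageCursor(window=3)
smoothed = [smoother.update(x, y) for x, y in raw]
```

A larger window removes more jitter but makes the cursor lag behind deliberate eye movements, which may be one reason the abstract pairs the moving average with a trained neural network.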

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Hardware/software (HW/SW) co-simulation integrates software simulation and hardware simulation simultaneously. Usually, an HW/SW co-simulation platform is used to ease debugging and verification in very large-scale integration (VLSI) design. To accelerate the computation of the gesture recognition technique, an HW/SW implementation using field programmable gate array (FPGA) technology is presented in this paper. The major contributions of this work are: (1) a novel design of a memory controller in the Verilog Hardware Description Language (Verilog HDL) to reduce memory consumption and the load on the processor. (2) The testing part of the neural network algorithm is hardwired to improve speed and performance. American Sign Language gesture recognition is chosen to verify the performance of the approach. Several experiments were carried out on four databases of the gestures (alphabet signs A to Z). (3) The major benefit of this design is that it takes only a few milliseconds to recognize a hand gesture, which makes it computationally more efficient.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

A dynamic job shop with predetermined resource allocation for all the jobs entering the system is a unique manufacturing environment that exists in industry, and it requires effective control of production activities. In this thesis, a framework for an Internet-based, real-time shop floor control system for such a dynamic job shop environment is introduced. The system aims to maintain the schedule feasibility of all the jobs entering the manufacturing system under any circumstance. The system is capable of deciding how often the manufacturing activities should be monitored to check for control decisions that need to be taken on the shop floor. The system provides the decision maker with real-time notification so that feasible alternative solutions can be generated if a disturbance occurs on the shop floor. The control system is also capable of providing the customer with real-time access to the status of the jobs on the shop floor. The communication between the controller, the user and the customer takes place through a web-based, user-friendly GUI. The proposed control system architecture and the interface for the communication system have been designed, developed and implemented.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

This research aims to systematize a proposal for developing a mobile tablet application to help apply the Semantic Differential (SD) technique, under the approach of Participatory Design. In 1975, Osgood et al. created the Semantic Differential technique. Since then, many experiments have used it to measure the affective perception of individuals concerning objects and concepts by means of compound scales of bipolar adjectives, based on the theoretical models that support the technique: the conductible, spatial and metric models. While applying the technique with potential users, the researcher must simultaneously manage several contexts, such as operating the audio recorder, when authorized, and observing and recording the respondent's spontaneous reports. Cognitive overload often occurs during this event. Thus, a single application whose interface is designed for its users and respondents could assist researchers in applying the SD technique. This research aimed to understand the processes inherent to the task of applying the Semantic Differential technique and followed these steps: a) training of users, b) background questionnaire, c) interview with a focus group, and d) cooperative evaluation. Besides these procedures, one can also observe the degrees of facilitation or difficulty concerning the use of the conventional model, which is the development and application of scales with the aid of printed material, pencils, pens, clipboards, a recorder, and software for editing the document and for data analysis. This paper comprises reactions and impressions from the experiences of users of the SD technique. Considering the data collected from observing the users, the hypothesis of the experiment proved to be right. This means that developing a mobile tablet application for the Semantic Differential technique is viable, since it brings all the steps together in a single tool and improves the accomplishment of the task between user/researcher and user/respondent, resulting in their mutual satisfaction.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The main objective of this work was to enable the recognition of human gestures through the development of a computer program. The program captures the gestures executed by the user through a camera attached to the computer and sends the robot the command corresponding to each gesture. In total, five gestures made by the human hand were interpreted. The software (developed in C++) made extensive use of computer vision concepts and the open-source OpenCV library, which directly affect the overall efficiency of the control of mobile robots. The computer vision concepts include the use of filters to smooth or blur the image for noise reduction, the choice of a color space that best suits the developer's working environment, and other information useful for manipulating digital images. The OpenCV library was essential to the project because it provides functions and procedures for complete control of filters, image borders, image area, the geometric center of borders, conversion between color spaces, convex hull and convexity defects, as well as all the means needed to characterize image features. Several problems appeared during development, such as false positives (noise), poor performance caused by inserting several filters with oversized masks, and problems arising from the choice of color space for processing human skin tones. However, after seven versions of the control software had been developed, it was possible to minimize the occurrence of false positives through better use of filters combined with a well-dimensioned mask size (tested at run time), together with programming logic that was refined over the seven versions. The development resulted in software that met the established requirements. After the completion of the control software, the overall effectiveness of the versions was measured: version V reached 84.75%, version VI 93.00%, and version VII 94.67%. These results showed that the final program performed well in interpreting gestures, proving that a mobile robot can be controlled through human gestures without external accessories, which gives the operator better mobility and reduces the cost of maintaining such a system. The great merit of the program was its ability to help demystify the man/machine relationship, since it uses an easy and intuitive interface for the control of mobile robots. Another important feature is that the operator does not need to be close to the mobile robot: to control it, the program only needs to receive the address that the Robotino sends over the network or Wi-Fi.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Brain-computer interfaces (BCI) have the potential to restore communication or control abilities in individuals with severe neuromuscular limitations, such as those with amyotrophic lateral sclerosis (ALS). The role of a BCI is to extract and decode relevant information that conveys a user's intent directly from brain electro-physiological signals and translate this information into executable commands to control external devices. However, the BCI decision-making process is error-prone due to noisy electro-physiological data, representing the classic problem of efficiently transmitting and receiving information via a noisy communication channel.

This research focuses on P300-based BCIs which rely predominantly on event-related potentials (ERP) that are elicited as a function of a user's uncertainty regarding stimulus events, in either an acoustic or a visual oddball recognition task. The P300-based BCI system enables users to communicate messages from a set of choices by selecting a target character or icon that conveys a desired intent or action. P300-based BCIs have been widely researched as a communication alternative, especially in individuals with ALS who represent a target BCI user population. For the P300-based BCI, repeated data measurements are required to enhance the low signal-to-noise ratio of the elicited ERPs embedded in electroencephalography (EEG) data, in order to improve the accuracy of the target character estimation process. As a result, BCIs have relatively slower speeds when compared to other commercial assistive communication devices, and this limits BCI adoption by their target user population. The goal of this research is to develop algorithms that take into account the physical limitations of the target BCI population to improve the efficiency of ERP-based spellers for real-world communication.

In this work, it is hypothesised that building adaptive capabilities into the BCI framework can potentially give the BCI system the flexibility to improve performance by adjusting system parameters in response to changing user inputs. The research in this work addresses three potential areas for improvement within the P300 speller framework: information optimisation, target character estimation and error correction. The visual interface and its operation control the method by which the ERPs are elicited through the presentation of stimulus events. The parameters of the stimulus presentation paradigm can be modified to modulate and enhance the elicited ERPs. A new stimulus presentation paradigm is developed in order to maximise the information content that is presented to the user by tuning stimulus paradigm parameters to positively affect performance. Internally, the BCI system determines the amount of data to collect and the method by which these data are processed to estimate the user's target character. Algorithms that exploit language information are developed to enhance the target character estimation process and to correct erroneous BCI selections. In addition, a new model-based method to predict BCI performance is developed, an approach which is independent of stimulus presentation paradigm and accounts for dynamic data collection. The studies presented in this work provide evidence that the proposed methods for incorporating adaptive strategies in the three areas have the potential to significantly improve BCI communication rates, and the proposed method for predicting BCI performance provides a reliable means to pre-assess BCI performance without extensive online testing.
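The abstract notes that repeated measurements are needed to raise the low signal-to-noise ratio of the elicited ERPs. The Python sketch below shows why averaging helps, using simulated data only: the ERP template shape, the noise level, and the trial count are invented for illustration, and real EEG noise is not independent Gaussian noise as assumed here.

```python
import math
import random
import statistics


def simulate_trial(template, noise_sd, rng):
    """One noisy measurement: the fixed ERP template plus Gaussian noise."""
    return [s + rng.gauss(0.0, noise_sd) for s in template]


def average_trials(trials):
    """Sample-wise mean across trials (time-locked averaging)."""
    return [statistics.fmean(samples) for samples in zip(*trials)]


def rms_error(estimate, template):
    """Root-mean-square deviation of an estimate from the true template."""
    squared = [(e - t) ** 2 for e, t in zip(estimate, template)]
    return math.sqrt(sum(squared) / len(squared))


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Made-up ERP shape: a smooth positive bump over 32 time samples.
template = [4.0 * math.exp(-(((i - 16) / 4.0) ** 2)) for i in range(32)]
noise_sd = 10.0  # noise much larger than the signal, i.e. low SNR

single = simulate_trial(template, noise_sd, rng)
averaged = average_trials([simulate_trial(template, noise_sd, rng) for _ in range(100)])

single_error = rms_error(single, template)
averaged_error = rms_error(averaged, template)
```

With independent noise, averaging N trials reduces the noise standard deviation by roughly a factor of the square root of N, so each extra gain in accuracy costs more stimulus repetitions. That tradeoff is the speed limitation the abstract describes and the motivation for collecting data adaptively.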

Relevance:

40.00% 40.00%

Publisher:

Abstract:

PURPOSE: Radiation therapy is used to treat cancer using carefully designed plans that maximize the radiation dose delivered to the target and minimize damage to healthy tissue, with the dose administered over multiple occasions. Creating treatment plans is a laborious process and presents an obstacle to more frequent replanning, which remains an unsolved problem. However, in between new plans being created, the patient's anatomy can change due to multiple factors including reduction in tumor size and loss of weight, which results in poorer patient outcomes. Cloud computing is a newer technology that is slowly being used for medical applications with promising results. The objective of this work was to design and build a system that could analyze a database of previously created treatment plans, which are stored with their associated anatomical information in studies, to find the one with the most similar anatomy to a new patient. The analyses would be performed in parallel on the cloud to decrease the computation time of finding this plan. METHODS: The system used SlicerRT, a radiation therapy toolkit for the open-source platform 3D Slicer, for its tools to perform the similarity analysis algorithm. Amazon Web Services was used for the cloud instances on which the analyses were performed, as well as for storage of the radiation therapy studies and messaging between the instances and a master local computer. A module was built in SlicerRT to provide the user with an interface to direct the system on the cloud, as well as to perform other related tasks. RESULTS: The cloud-based system out-performed previous methods of conducting the similarity analyses in terms of time, as it analyzed 100 studies in approximately 13 minutes, and produced the same similarity values as those methods. It also scaled up to larger numbers of studies to analyze in the database with a small increase in computation time of just over 2 minutes. 
CONCLUSION: This system successfully analyzes a large database of radiation therapy studies and finds the one that is most similar to a new patient, which represents a potential step forward in achieving feasible adaptive radiation therapy replanning.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Vision-based applications designed for human-machine interaction require fast and accurate hand detection. However, previous work in this field assumes various constraints, such as a limit on the number of detected gestures, because hands are highly complex objects to locate. This paper presents an approach that changes the detection target without limiting the number of detected gestures. Using a cascade classifier, we detect hands based on their wrists. With this approach, we introduce two main contributions: (1) a reliable segmentation, independent of the gesture being made, and (2) a training phase faster than that of previous cascade-classifier-based methods. The paper includes experimental evaluations with different video streams that illustrate the efficiency and suitability of the approach for perceptual interfaces.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-07

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Authentication plays an important role in how we interact with computers, mobile devices, the web, etc. The idea of authentication is to uniquely identify a user before granting access to system privileges. For example, in recent years more corporate information and applications have been accessible via the Internet and Intranet. Many employees are working from remote locations and need access to secure corporate files. During this time, it is possible for malicious or unauthorized users to gain access to the system. For this reason, it is logical to have some mechanism in place to detect whether the logged-in user is the same user in control of the user's session. Therefore, highly secure authentication methods must be used. We posit that each of us is unique in our use of computer systems. It is this uniqueness that is leveraged to "continuously authenticate users" while they use web software. To monitor user behavior, n-gram models are used to capture user interactions with web-based software. This statistical language model essentially captures sequences and sub-sequences of user actions, their orderings, and temporal relationships that make them unique by providing a model of how each user typically behaves. Users are then continuously monitored during software operations. Large deviations from "normal behavior" can possibly indicate malicious or unintended behavior. This approach is implemented in a system called Intruder Detector (ID) that models user actions as embodied in web logs generated in response to a user's actions. User identification through web logs is cost-effective and non-intrusive. We perform experiments on a large fielded system with web logs of approximately 4000 users. For these experiments, we use two classification techniques; binary and multi-class classification. We evaluate model-specific differences of user behavior based on coarse-grain (i.e., role) and fine-grain (i.e., individual) analysis. 
A specific set of metrics is used to provide valuable insight into how each model performs. Intruder Detector achieves accurate results when identifying legitimate users and user types. This tool is also able to detect outliers in role-based user behavior with optimal performance. In addition to web applications, this continuous monitoring technique can be used with other user-based systems such as mobile devices and the analysis of network traffic.
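The n-gram idea in this abstract can be sketched with a bigram model over action names. This is not the Intruder Detector implementation: the action vocabulary, the add-one smoothing, and the use of mean log-likelihood as a deviation score are assumptions chosen to keep the example small.

```python
from collections import Counter
import math


class BigramModel:
    """Bigram model of one user's (or one role's) action sequences."""

    def __init__(self, vocabulary, smoothing=1.0):
        self.vocabulary = set(vocabulary)
        self.smoothing = smoothing          # add-k smoothing for unseen pairs
        self.pair_counts = Counter()        # counts of (previous, current)
        self.context_counts = Counter()     # counts of previous actions

    def train(self, sessions):
        """Count consecutive action pairs in each logged session."""
        for actions in sessions:
            for prev, curr in zip(actions, actions[1:]):
                self.pair_counts[(prev, curr)] += 1
                self.context_counts[prev] += 1

    def probability(self, prev, curr):
        """Smoothed estimate of P(curr | prev)."""
        numerator = self.pair_counts[(prev, curr)] + self.smoothing
        denominator = self.context_counts[prev] + self.smoothing * len(self.vocabulary)
        return numerator / denominator

    def mean_log_likelihood(self, actions):
        """Average log-probability per transition; lower means less typical."""
        pairs = list(zip(actions, actions[1:]))
        if not pairs:
            raise ValueError("need at least two actions to score a session")
        return sum(math.log(self.probability(p, c)) for p, c in pairs) / len(pairs)


# Hypothetical action names extracted from web logs.
vocabulary = ["login", "search", "view", "edit", "export", "logout"]
model = BigramModel(vocabulary)
model.train([
    ["login", "search", "view", "logout"],
    ["login", "search", "view", "edit", "logout"],
    ["login", "view", "edit", "logout"],
])

typical_score = model.mean_log_likelihood(["login", "search", "view", "logout"])
unusual_score = model.mean_log_likelihood(["login", "export", "export", "export", "logout"])
```

A session whose score falls well below the scores of the user's past sessions would be flagged for review. Where to set that cutoff is a tuning decision that this sketch leaves open.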

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Users need to be able to address in-air gesture systems, which means finding where to perform gestures and how to direct them towards the intended system. This is necessary for input to be sensed correctly and without unintentionally affecting other systems. This thesis investigates novel interaction techniques which allow users to address gesture systems properly, helping them find where and how to gesture. It also investigates audio, tactile and interactive light displays for multimodal gesture feedback; these can be used by gesture systems with limited output capabilities (like mobile phones and small household controls), allowing the interaction techniques to be used by a variety of device types. It investigates tactile and interactive light displays in greater detail, as these are not as well understood as audio displays. Experiments 1 and 2 explored tactile feedback for gesture systems, comparing an ultrasound haptic display to wearable tactile displays at different body locations and investigating feedback designs. These experiments found that tactile feedback improves the user experience of gesturing by reassuring users that their movements are being sensed. Experiment 3 investigated interactive light displays for gesture systems, finding this novel display type effective for giving feedback and presenting information. It also found that interactive light feedback is enhanced by audio and tactile feedback. These feedback modalities were then used alongside audio feedback in two interaction techniques for addressing gesture systems: sensor strength feedback and rhythmic gestures. Sensor strength feedback is multimodal feedback that tells users how well they can be sensed, encouraging them to find where to gesture through active exploration. Experiment 4 found that they can do this with 51mm accuracy, with combinations of audio and interactive light feedback leading to the best performance. 
Rhythmic gestures are continuously repeated gesture movements which can be used to direct input. Experiment 5 investigated the usability of this technique, finding that users can match rhythmic gestures well and with ease. Finally, these interaction techniques were combined, resulting in a new single interaction for addressing gesture systems. Using this interaction, users could direct their input with rhythmic gestures while using the sensor strength feedback to find a good location for addressing the system. Experiment 6 studied the effectiveness and usability of this technique, as well as the design space for combining the two types of feedback. It found that this interaction was successful, with users matching 99.9% of rhythmic gestures, with 80mm accuracy from target points. The findings show that gesture systems could successfully use this interaction technique to allow users to address them. Novel design recommendations for using rhythmic gestures and sensor strength feedback were created, informed by the experiment findings.