975 results for Hand gesture recognition
Abstract:
With the release of the Nintendo Wii in 2006, the use of haptic force gestures has become a very popular form of input for interactive entertainment. However, current gesture recognition techniques utilised in Nintendo Wii games fall prey to a lack of control when it comes to recognising simple gestures. This paper presents a simple gesture recognition technique called Peak Testing which gives greater control over gesture interaction. This recognition technique locates force peaks in continuous force data (provided by a gesture device such as the Wiimote) and then cancels any peaks which are not meant for input. Peak Testing is therefore technically able to identify movements in any direction. This paper applies this recognition technique to control virtual instruments and investigates how users respond to this interaction. The technique is then explored as the basis for a robust way to navigate menus with a simple flick of the wrist. We propose that this flick-form of interaction could be a very intuitive way to navigate Nintendo Wii menus instead of the current pointer techniques implemented.
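The two steps described above, locating force peaks and then cancelling the ones not meant as input, can be illustrated with a minimal Python sketch. The abstract does not say how peaks are detected or which peaks are cancelled, so the threshold, the rebound window, and the rule of discarding an opposite-sign peak that closely follows an accepted one are all assumptions made for illustration, not the paper's method.

```python
def find_peaks(samples, threshold):
    """Return indices of local extrema whose magnitude reaches the threshold."""
    peaks = []
    for i in range(1, len(samples) - 1):
        mag = abs(samples[i])
        if mag >= threshold and mag > abs(samples[i - 1]) and mag >= abs(samples[i + 1]):
            peaks.append(i)
    return peaks


def cancel_rebounds(samples, peaks, window):
    """Drop a peak of opposite sign that follows an accepted peak within `window` samples.

    Assumption: such a peak is the deceleration (rebound) of the same flick,
    not a new gesture.
    """
    accepted = []
    for i in peaks:
        if accepted:
            last = accepted[-1]
            if i - last <= window and samples[i] * samples[last] < 0:
                continue  # treat as rebound, not as input
        accepted.append(i)
    return accepted


# Hypothetical single-axis force trace: a flick, its rebound, then a second flick.
trace = [0, 0.2, 2.5, 0.3, -1.8, -0.2, 0, 0, 0, 0, -2.2, 0.1, 0]
raw_peaks = find_peaks(trace, threshold=1.0)
gestures = cancel_rebounds(trace, raw_peaks, window=3)
```

The sign of each surviving peak gives the direction of the movement along that axis; a real device would supply one such trace per axis.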
Abstract:
Local spatio-temporal features combined with a Bag-of-visual-words model are a popular approach to human action recognition. Bag-of-features methods face several challenges, such as extracting appropriate appearance and motion features from videos, converting the extracted features into a form suitable for classification, and designing a suitable classification framework. In this paper we address the problem of efficiently representing the extracted features for classification to improve overall performance. We introduce two generative supervised topic models, maximum entropy discrimination LDA (MedLDA) and class-specific simplex LDA (css-LDA), to encode the raw features in a form suitable for discriminative SVM-based classification. Unsupervised LDA models disconnect topic discovery from the classification task and hence yield poor results compared to the baseline Bag-of-words framework. Supervised LDA techniques, on the other hand, learn the topic structure by considering the class labels and improve the recognition accuracy significantly. MedLDA maximizes likelihood and within-class margins using max-margin techniques and yields a sparse, highly discriminative topic structure, while in css-LDA separate class-specific topics are learned instead of a common set of topics across the entire dataset. In our representation, topics are learned first and then each video is represented as a topic proportion vector, i.e. it is comparable to a histogram of topics. Finally, SVM classification is performed on the learned topic proportion vectors. We demonstrate the efficiency of the two representation techniques through experiments carried out on two popular datasets. Experimental results demonstrate significantly improved performance compared to the baseline Bag-of-features framework, which uses k-means to construct a histogram of words from the feature vectors.
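The "topic proportion vector" mentioned above can be pictured with a small Python sketch. It assumes that each local feature of a video has already been assigned a topic index; in the paper that assignment comes from MedLDA or css-LDA inference, which is not reproduced here. The function name and the toy data are hypothetical.

```python
from collections import Counter


def topic_proportions(topic_assignments, num_topics):
    """Turn per-feature topic indices into a normalized histogram over topics."""
    total = len(topic_assignments)
    if total == 0:
        return [0.0] * num_topics
    counts = Counter(topic_assignments)
    return [counts[k] / total for k in range(num_topics)]


# Hypothetical video with 8 local features assigned to 4 topics.
video_topics = [0, 2, 2, 1, 2, 0, 2, 2]
vector = topic_proportions(video_topics, num_topics=4)
```

A vector like this, one per video, is what would then be passed to the SVM classifier.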
Abstract:
Pen-based user interfaces have become a hot research field in recent years. Pen gestures play an important role in pen-based user interfaces, but they are difficult for UI designers to design and for users to learn and use. To this end, we carried out research on the user-centered design and recognition of pen gestures. We surveyed 100 pen gestures in twelve well-known pen-based systems to identify problems with the pen gestures currently in use. We also conducted a questionnaire to evaluate the degree of matching between commands and pen gestures, in order to discover the characteristics that a good pen gesture should have. Cognitive theories were then applied to analyze how those characteristics help improve the learnability of pen gestures. On this basis, we analyzed pen gesture recognition performance and proposed some improvements to feature selection in the pen gesture recognition algorithm. Finally, we used a couple of psychology experiments to evaluate twelve pen gestures designed on the basis of this research. The results show that those gestures are better for users to learn and use. The research results of this paper can serve designers as a basic principle for designing pen gestures in pen-based systems.
Abstract:
This paper introduces BoostMap, a method that can significantly reduce retrieval time in image and video database systems that employ computationally expensive distance measures, metric or non-metric. Database and query objects are embedded into a Euclidean space, in which similarities can be rapidly measured using a weighted Manhattan distance. Embedding construction is formulated as a machine learning task, where AdaBoost is used to combine many simple, 1D embeddings into a multidimensional embedding that preserves a significant amount of the proximity structure in the original space. Performance is evaluated in a hand pose estimation system, and a dynamic gesture recognition system, where the proposed method is used to retrieve approximate nearest neighbors under expensive image and video similarity measures. In both systems, BoostMap significantly increases efficiency, with minimal losses in accuracy. Moreover, the experiments indicate that BoostMap compares favorably with existing embedding methods that have been employed in computer vision and database applications, i.e., FastMap and Bourgain embeddings.
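The embed-then-compare idea above can be sketched in a few lines of Python. The sketch assumes that each 1D embedding maps an object to its distance from a fixed reference object, and it uses hand-picked weights; in BoostMap the 1D embeddings and their weights are chosen by AdaBoost, which is not shown. The Hamming distance stands in for an expensive distance measure, and all names are hypothetical.

```python
def hamming(a, b):
    """Stand-in for a computationally expensive distance in the original space."""
    return sum(ca != cb for ca, cb in zip(a, b))


def embed(obj, references, distance):
    """Map an object to a vector of 1D embeddings, one per reference object."""
    return [distance(obj, r) for r in references]


def weighted_manhattan(u, v, weights):
    """Cheap distance used in the embedded space."""
    return sum(w * abs(a - b) for w, a, b in zip(weights, u, v))


references = ["0000", "1111"]   # assumed reference objects
weights = [0.5, 1.0]            # assumed weights (learned by AdaBoost in BoostMap)

query_vec = embed("0011", references, hamming)
db_vec = embed("0001", references, hamming)
approx = weighted_manhattan(query_vec, db_vec, weights)
```

Database vectors are computed once offline, so at query time only the query needs its few exact distances to the reference objects; candidates found in the embedded space can then be re-ranked with the exact distance.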
Abstract:
Many real world image analysis problems, such as face recognition and hand pose estimation, involve recognizing a large number of classes of objects or shapes. Large margin methods, such as AdaBoost and Support Vector Machines (SVMs), often provide competitive accuracy rates, but at the cost of evaluating a large number of binary classifiers, thus making it difficult to apply such methods when thousands or millions of classes need to be recognized. This thesis proposes a filter-and-refine framework, whereby, given a test pattern, a small number of candidate classes can be identified efficiently at the filter step, and computationally expensive large margin classifiers are used to evaluate these candidates at the refine step. Two different filtering methods are proposed, ClassMap and OVA-VS (One-vs.-All classification using Vector Search). ClassMap is an embedding-based method, works for both boosted classifiers and SVMs, and tends to map the patterns and their associated classes close to each other in a vector space. OVA-VS maps OVA classifiers and test patterns to vectors based on the weights and outputs of weak classifiers of the boosting scheme. At runtime, finding the strongest-responding OVA classifier becomes a classical vector search problem, where well-known methods can be used to gain efficiency. In our experiments, the proposed methods achieve significant speed-ups, in some cases up to two orders of magnitude, compared to exhaustive evaluation of all OVA classifiers. This was achieved in hand pose recognition and face recognition systems where the number of classes ranges from 535 to 48,600.
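The filter-and-refine idea above can be sketched generically in Python. This is not ClassMap or OVA-VS: the cheap and expensive scoring functions below are toy stand-ins, and the parameter `k` (the number of candidates kept by the filter step) is an assumed tuning knob.

```python
import heapq


def filter_and_refine(pattern, classes, cheap_score, expensive_score, k):
    """Keep the k best classes under a cheap score, then pick the winner
    among them with the expensive classifier."""
    candidates = heapq.nlargest(k, classes, key=lambda c: cheap_score(pattern, c))
    return max(candidates, key=lambda c: expensive_score(pattern, c))


expensive_calls = []


def cheap(pattern, c):
    # Toy filter: coarse comparison on a rounded value.
    return -abs(round(pattern) - c)


def expensive(pattern, c):
    # Toy refine step: records each call to show how few are made.
    expensive_calls.append(c)
    return -abs(pattern - c)


best = filter_and_refine(6.4, range(10), cheap, expensive, k=3)
```

Here the expensive scorer runs on 3 of the 10 classes instead of all of them; the speed-up depends on how well the filter step keeps the true class among the candidates.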
Abstract:
Nearest neighbor retrieval is the task of identifying, given a database of objects and a query object, the objects in the database that are the most similar to the query. Retrieving nearest neighbors is a necessary component of many practical applications, in fields as diverse as computer vision, pattern recognition, multimedia databases, bioinformatics, and computer networks. At the same time, finding nearest neighbors accurately and efficiently can be challenging, especially when the database contains a large number of objects, and when the underlying distance measure is computationally expensive. This thesis proposes new methods for improving the efficiency and accuracy of nearest neighbor retrieval and classification in spaces with computationally expensive distance measures. The proposed methods are domain-independent, and can be applied in arbitrary spaces, including non-Euclidean and non-metric spaces. In this thesis particular emphasis is given to computer vision applications related to object and shape recognition, where expensive non-Euclidean distance measures are often needed to achieve high accuracy. The first contribution of this thesis is the BoostMap algorithm for embedding arbitrary spaces into a vector space with a computationally efficient distance measure. Using this approach, an approximate set of nearest neighbors can be retrieved efficiently - often orders of magnitude faster than retrieval using the exact distance measure in the original space. The BoostMap algorithm has two key distinguishing features with respect to existing embedding methods. First, embedding construction explicitly maximizes the amount of nearest neighbor information preserved by the embedding. Second, embedding construction is treated as a machine learning problem, in contrast to existing methods that are based on geometric considerations. 
The second contribution is a method for constructing query-sensitive distance measures for the purposes of nearest neighbor retrieval and classification. In high-dimensional spaces, query-sensitive distance measures allow for automatic selection of the dimensions that are the most informative for each specific query object. It is shown theoretically and experimentally that query-sensitivity increases the modeling power of embeddings, allowing embeddings to capture a larger amount of the nearest neighbor structure of the original space. The third contribution is a method for speeding up nearest neighbor classification by combining multiple embedding-based nearest neighbor classifiers in a cascade. In a cascade, computationally efficient classifiers are used to quickly classify easy cases, and classifiers that are more computationally expensive and also more accurate are only applied to objects that are harder to classify. An interesting property of the proposed cascade method is that, under certain conditions, classification time actually decreases as the size of the database increases, a behavior that is in stark contrast to the behavior of typical nearest neighbor classification systems. The proposed methods are evaluated experimentally in several different applications: hand shape recognition, off-line character recognition, online character recognition, and efficient retrieval of time series. In all datasets, the proposed methods lead to significant improvements in accuracy and efficiency compared to existing state-of-the-art methods. In some datasets, the general-purpose methods introduced in this thesis even outperform domain-specific methods that have been custom-designed for such datasets.
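The cascade described above can be sketched as follows. The abstract does not say how a stage decides that a case is "easy", so the per-stage confidence value and acceptance threshold are assumptions, and the two toy classifiers are hypothetical stand-ins for embedding-based nearest neighbor classifiers.

```python
def cascade_classify(x, stages):
    """Run classifiers from cheapest to most expensive.

    `stages` is a list of (classifier, threshold) pairs, where classifier(x)
    returns (label, confidence). A stage's answer is accepted when its
    confidence reaches the threshold; the last stage always answers.
    """
    for classifier, threshold in stages[:-1]:
        label, confidence = classifier(x)
        if confidence >= threshold:
            return label
    final_classifier, _ = stages[-1]
    label, _ = final_classifier(x)
    return label


def cheap_stage(x):
    # Fast but rough: confident only far from its decision boundary at 5.
    label = "low" if x < 5 else "high"
    return label, abs(x - 5) / 5


def accurate_stage(x):
    # Slower but more accurate stand-in, with the boundary at 4.5.
    label = "low" if x < 4.5 else "high"
    return label, 1.0


stages = [(cheap_stage, 0.5), (accurate_stage, 0.0)]
easy = cascade_classify(1, stages)     # settled by the cheap stage
hard = cascade_classify(4.8, stages)   # falls through to the accurate stage
```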
Abstract:
Public Display Systems (PDS) have an increasing presence in our cities. These systems provide information and advertising specifically tailored to audiences in spaces such as airports, train stations, and shopping centers. A large number of public displays are also being deployed for entertainment purposes. Designing and prototyping PDS can be a laborious, complex, and costly task. This dissertation focuses on the design and evaluation of PDS at early development phases, with the aim of facilitating low-effort, rapid design and evaluation of interactive PDS. The study centers on the IPED Toolkit, a tool that supports the design, prototyping, and evaluation of public display systems by replicating real-world scenes in the lab. This research aims to identify the benefits and drawbacks of different means of placing overlays/virtual displays on top of panoramic video footage recorded at real-world locations. The means of interaction studied in this work are, on the one hand, the keyboard and mouse and, on the other hand, the tablet with two different techniques of use. To carry out this study, an Android application has been developed that allows users to interact with the IPED Toolkit using the tablet. Additionally, the toolkit has been modified and adapted to tablets using different web technologies. Finally, the user study compares the different means of interaction.
Abstract:
Gesture-based applications have particularities, since users interact in a natural way, much as they interact in the non-digital world. Hence, new requirements arise for the software design process. This paper presents a software development process model for these applications, including requirement specification, design, implementation, and testing procedures. The steps and activities of the proposed model were tested through a case study of a puzzle game. The puzzle is completed when all pieces of a painting are correctly positioned by the drag-and-drop action of the user's hand gesture. The paper also presents the results of applying a heuristic evaluation to this game. © 2012 IEEE.
Abstract:
In this paper a system for face recognition from a tabula rasa (i.e. blank slate) perspective is described. A priori, the system only has the ability to detect faces automatically and represent them in a space of reduced dimension. Later, the system is exposed to over 400 different identities, and the evolution of its recognition performance is observed. The preliminary results indicate, on the one hand, that the system is able to reject most unknown individuals after an initialization stage.
Abstract:
Schizophrenia patients frequently present with subtle motor impairments, including impairments of higher-order motor function such as hand gesture performance. Using cut-off scores from a standardized gesture test, we previously reported gesture deficits in 40% of schizophrenia patients irrespective of the gesture content. However, these findings were based on normative data from an older control group. Hence, we now aimed to determine cut-off scores in an age- and gender-matched control group. Furthermore, we wanted to explore whether gesture categories are differentially affected in schizophrenia. Gesture performance data of 30 schizophrenia patients and 30 matched controls were compared. Categories included meaningless, intransitive (communicative) and transitive (object-related) hand gestures, which were either imitated or pantomimed, i.e. produced on verbal command. Cut-off scores of the age-matched control group were higher than the previous cut-off scores from an older control group. An ANOVA tested effects of group, domain (imitation or pantomime), and semantic category (meaningless, transitive or intransitive), as well as their interaction. According to the new cut-off scores, 67% of the schizophrenia patients demonstrated gestural deficits. Patients performed worse in all gesture categories; however, meaningless gestures on verbal command were particularly impaired (p = 0.008). This category correlated with poor frontal lobe function (p < 0.001). In conclusion, gestural deficits in schizophrenia are even more frequent than previously reported. Gesture categories that pose higher demands on planning and selection, such as pantomime of meaningless gestures, are predominantly affected and associated with the well-known frontal lobe dysfunction.
Abstract:
Schizophrenia patients are severely impaired in nonverbal communication, including social perception and gesture production. However, the impact of nonverbal social perception on gestural behavior remains unknown, as is the contribution of negative symptoms, working memory, and abnormal motor behavior. Thus, the study tested whether poor nonverbal social perception was related to impaired gesture performance, gestural knowledge, or motor abnormalities. Forty-six patients with schizophrenia (80%), schizophreniform (15%), or schizoaffective disorder (5%) and 44 healthy controls matched for age, gender, and education were included. Participants completed 4 tasks on nonverbal communication including nonverbal social perception, gesture performance, gesture recognition, and tool use. In addition, they underwent comprehensive clinical and motor assessments. Patients presented impaired nonverbal communication in all tasks compared with controls. Furthermore, in contrast to controls, performance in patients was highly correlated between tasks, not explained by supramodal cognitive deficits such as working memory. Schizophrenia patients with impaired gesture performance also demonstrated poor nonverbal social perception, gestural knowledge, and tool use. Importantly, motor/frontal abnormalities negatively mediated the strong association between nonverbal social perception and gesture performance. The factors negative symptoms and antipsychotic dosage were unrelated to the nonverbal tasks. The study confirmed a generalized nonverbal communication deficit in schizophrenia. Specifically, the findings suggested that nonverbal social perception in schizophrenia has a relevant impact on gestural impairment beyond the negative influence of motor/frontal abnormalities.
Abstract:
Applying biometrics to daily scenarios involves demanding requirements in terms of software and hardware. At the same time, current biometric techniques are being adapted to present-day devices, such as mobile phones and laptops, which are far from meeting those requirements. In fact, reconciling both needs is one of the most difficult problems in biometrics at present. This paper therefore presents a segmentation algorithm able to provide suitable solutions in terms of precision for hand biometric recognition, considering a wide range of backgrounds such as carpets, glass, grass, mud, pavement, plastic, tiles or wood. Results show that segmentation is carried out with high precision (F-measure 88%), with competitive running times compared to state-of-the-art segmentation algorithms.
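For readers unfamiliar with the metric, the F-measure reported above is the harmonic mean of precision and recall. The Python sketch below computes it for a binary segmentation mask given as a flat list of pixels; the flat-list representation and the toy masks are assumptions for illustration and are unrelated to the paper's data.

```python
def f_measure(predicted, truth):
    """Return F = 2PR / (P + R) for two binary masks of equal length."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p and t)        # hand pixels found
    fp = sum(1 for p, t in pairs if p and not t)    # background marked as hand
    fn = sum(1 for p, t in pairs if not p and t)    # hand pixels missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical 5-pixel masks: 1 = hand, 0 = background.
predicted_mask = [1, 1, 0, 0, 1]
truth_mask = [1, 0, 0, 1, 1]
score = f_measure(predicted_mask, truth_mask)
```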