918 results for Visual Word-recognition
Abstract:
We address the problem of face recognition in video by employing the recently proposed probabilistic linear discriminant analysis (PLDA). PLDA has been shown to be robust against pose and expression in image-based face recognition. In this research, the method is extended and applied to video, where image-set-to-image-set matching is performed. We investigate two approaches to computing similarities between image sets using PLDA: the closest pair approach and the holistic sets approach. To better model face appearances in video, we also propose a heteroscedastic version of PLDA, which learns the within-class covariance of each individual separately. Our experiments on the VidTIMIT and Honda datasets show that the combination of the heteroscedastic PLDA and the closest pair approach achieves the best performance.
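The closest pair idea can be illustrated with a minimal sketch. It assumes one plausible reading of the abstract: the similarity of two image sets is the score of their best-matching pair of images. The function names are hypothetical, and `pair_score` is only a stand-in (negative squared Euclidean distance) for a PLDA match score, which the abstract does not specify.

```python
from itertools import product

def pair_score(x, y):
    # Stand-in for a PLDA match score (assumption, not the paper's model):
    # negative squared Euclidean distance, so higher means more similar.
    return -sum((a - b) ** 2 for a, b in zip(x, y))

def closest_pair_similarity(set_a, set_b, score=pair_score):
    # Score every cross-set pair and keep the best one.
    return max(score(x, y) for x, y in product(set_a, set_b))

# Toy feature vectors standing in for face images from two video sequences.
gallery = [(0.0, 0.0), (1.0, 1.0)]
probe = [(1.0, 1.5), (5.0, 5.0)]
print(closest_pair_similarity(gallery, probe))
```

A holistic sets approach would instead score the two sets jointly, which requires the full PLDA model and is not sketched here.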
Abstract:
Facial expression is one of the main challenges for face recognition in uncontrolled environments. In this paper, we apply the probabilistic linear discriminant analysis (PLDA) method to recognize faces across expressions. Several PLDA approaches are tested and cross-evaluated on the Cohn-Kanade and JAFFE databases. With fewer samples per gallery subject, high recognition rates comparable to previous work have been achieved, indicating the robustness of the approaches. Among the approaches, the mixture of PLDAs has demonstrated better performance. The experimental results also indicate that facial regions around the cheeks, eyes, and eyebrows are more discriminative than regions around the mouth, jaw, chin, and nose.
Abstract:
Project Management (PM) is a well-accepted mode of managing organizations, and more and more organizations are adopting PM in order to satisfy the diversified needs of application areas within a variety of industries and organizations. Concurrently, the number of PM practitioners and people involved at various levels of qualification is rising vigorously. It is therefore paramount to characterize, define and understand this field and its underlying strength, basis and development. For this purpose we refer to the sociology of actor-networks and qualitative scientometrics, which lead to the choice of the co-word analysis method to capture the project management field and its dynamics. Results of a study based on the analysis of the EBSCO Business Source Premier Database will be presented, and some future trends and scenarios proposed. The following main trends are confirmed, in alignment with previous studies: continuous interest in the “cost engineering” aspects, ongoing interest in economic aspects and contracts, how to deal with various project types (categorizations), and the integration with Supply Chain Management and Learning and Knowledge Management. Besides these continuous trends, we can note new areas of interest: the link between strategy and project, Governance, the importance of maturity (organizational performance and metrics, control) and Change Management. We see the actors (Professional Bodies, Governmental Bodies, Agencies, Universities, Industries, Researchers, and Practitioners) reinforcing their competing/cooperative strategies in the development of standards and certifications and moving to more “business oriented” relationships with their members and main stakeholders (Governments, Institutions like European Community, Industries, Agencies, NGOs…), at least at central level.
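The core counting step of co-word analysis can be sketched as follows. This is a generic illustration, not the study's pipeline: it counts how often two keywords appear together in the same record. The function name and the sample keyword lists are hypothetical and are not taken from the EBSCO data.

```python
from collections import Counter
from itertools import combinations

def coword_counts(keyword_lists):
    # Count, for every unordered keyword pair, the number of records
    # in which both keywords occur.
    counts = Counter()
    for keywords in keyword_lists:
        # Deduplicate and sort so each pair has one canonical form.
        for pair in combinations(sorted(set(keywords)), 2):
            counts[pair] += 1
    return counts

# Hypothetical keyword lists, one per bibliographic record.
records = [
    ["governance", "strategy", "project"],
    ["strategy", "project"],
    ["governance", "maturity"],
]
counts = coword_counts(records)
print(counts.most_common(3))
```

Pairs with high counts suggest themes that cluster together; a full analysis would normalize these counts and track them over time.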
Abstract:
Large margin learning approaches, such as support vector machines (SVM), have been successfully applied to numerous classification tasks, especially automatic facial expression recognition. The risk of such approaches, however, is their sensitivity to large margin losses due to the influence of noisy training examples and outliers, which is a common problem in the area of affective computing (i.e., manual coding at the frame level is tedious, so coarse labels are normally assigned). In this paper, we leverage the relaxation of the parallel-hyperplanes constraint and propose the use of modified correlation filters (MCF). The MCF is similar in spirit to SVMs and correlation filters, but with the key difference of optimizing only a single hyperplane. We demonstrate the superiority of MCF over current techniques in a battery of experiments.
Abstract:
In an age of mobile phones, Facebook, Twitter and online dating, interactions in mediated environments often outnumber face to face encounters. Kiss is an interactive light artwork by artists Priscilla Bracks & Gavin Sade. Kiss reacts to people standing in front of the artwork looking at each other - the moment before kissing. Without interaction the work generates a seductive, ambient, red lighting display that creates the restful sense of staring into a fire. A fleeting response of white light – like sparks flying in the air – occurs the moment before two faces touch. These sparks are visible in peripheral vision, but fade when the kissing couple turns to look at the work. This moment - as two people look at each other - is a primal moment when two people recognise each other. Face to face encounters with another person are a privileged phenomenon in which the other person's presence and proximity are strongly felt. Kiss does not respond to every instance of a kiss or a look. Its recognition algorithms are fussy, selecting some faces and not others. As in life it’s difficult to tell why sparks fly with some people but not with others. For some this will be felt as a glitch. “This machine should be part of my social life!” But it does promote trial and error, asking viewers to be intimate in public and look at each other for longer than otherwise socially normal. Ten minutes of continuous eye contact is said in most cases to arouse sexual feelings in both parties. But even if we don’t look that long, a short time may be all that is needed to explore the face of the person we are looking at. We see that they are human like us. We experience beauty, difference, discomfort, perhaps even nervous laughing, before turning to a more intimate moment of recognition.
Abstract:
This article provides a tutorial introduction to visual servo control of robotic manipulators. Since the topic spans many disciplines our goal is limited to providing a basic conceptual framework. We begin by reviewing the prerequisite topics from robotics and computer vision, including a brief review of coordinate transformations, velocity representation, and a description of the geometric aspects of the image formation process. We then present a taxonomy of visual servo control systems. The two major classes of systems, position-based and image-based systems, are then discussed in detail. Since any visual servo system must be capable of tracking image features in a sequence of images, we also include an overview of feature-based and correlation-based methods for tracking. We conclude the tutorial with a number of observations on the current directions of the research field of visual servo control.
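As a point of orientation for the image-based class, the widely used proportional control law can be written as below. The notation is an assumption chosen for illustration and does not come from the abstract: s is the vector of current image features, s* the desired features, L-hat an estimate of the interaction matrix (image Jacobian), L-hat-plus its pseudo-inverse, lambda a positive gain, and v_c the commanded camera velocity.

```latex
\mathbf{e} = \mathbf{s} - \mathbf{s}^{*}, \qquad
\mathbf{v}_c = -\lambda \, \widehat{\mathbf{L}}^{+} \, \mathbf{e}, \qquad \lambda > 0
```

Under this law the feature error is driven towards zero roughly exponentially when the interaction matrix estimate is accurate. Position-based systems apply the same idea to an error defined in Cartesian pose space rather than in the image.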
Abstract:
A female voice softly recites physical and psychological associations of aura colours. On screen, individual words fade in and out rhythmically amid a field of swirling and morphing colours. The animated words correlate with the words being spoken, but not every word is displayed, enabling an alternative range of verbal associations to emerge. “Auric Variations” plays with the mix of affirmation and anxiety that can underscore contemporary subjective experiences and the new age techniques we sometimes use to understand them.
Abstract:
A whole tradition is said to be based on the hierarchical distinction between the perceptual and the conceptual. In art, Niklas Luhmann argues, this schism is played out and repeated in conceptual art. This paper complicates this depiction by examining Ian Burn's last writings, in which, I argue, the artist-writer reviews the challenge of minimal-conceptual art in terms of its perceptual preoccupations. Burn revisits his own work and the legacy of minimal-conceptual art by moving away from the kind of ideology critique he is best known for internationally, in order to reassert the long-overlooked visual-perceptual preoccupations of the conceptual in art.
Abstract:
“Spin” borrows idioms and metaphors from sports commentary and squeezes them into a single emotional rollercoaster. Accompanied by a driving soundtrack, text appears and disappears one word at a time. As the work progresses, multiple words fade in and out at the same time, filling the screen and testing our ability to read and assimilate these well-worn phrases. On the one hand, the work mimes some of what we enjoy about sport – its ability to take us to another place, to incite passion and emotion, and to enable us to share in common experiences, goals and desires. On the other hand, it plays up the hyperbolic language often associated with sports broadcasting. The very language that helps take us to another place, incite passion and make us feel part of something bigger than ourselves, is pushed to its extreme and starts to burst at the seams. This work was commissioned for “Kick Off: contemporary video art program” at Metricon Stadium, Gold Coast, and supported by Project Services, Department of Public Works, Queensland Government.
Abstract:
In this video, an abstract kaleidoscopic pattern slowly morphs and changes colour. It is accompanied by a male voice performing a word association or stream-of-consciousness activity. This work examines the nature of consciousness and identity in a contemporary context. It mixes the languages of meditation, new age philosophy and pop-psychology. Drawing on Zygmunt Bauman’s theoretical work on “liquid modernity”, this work questions how and where we find space for contemplation in a contemporary context increasingly defined by temporary social bonds, consumer choices and private anxieties.
Abstract:
Rapid prototyping environments can speed up research on visual control algorithms. We have designed and implemented a software framework for fast prototyping of visual control algorithms for Micro Aerial Vehicles (MAV). The framework combines a proxy-based network communication architecture with a custom Application Programming Interface. This allows multiple experimental configurations, such as drone swarms or distributed processing of a drone's video stream. Currently, the framework supports a low-cost MAV: the Parrot AR.Drone. Real tests have been performed on this platform, and the results show that the framework introduces comparatively little extra communication delay while adding new functionality and flexibility to the selected drone. This implementation is open-source and can be downloaded from www.vision4uav.com/?q=VC4MAV-FW
Abstract:
Process-aware information systems, ranging from generic workflow systems to dedicated enterprise information systems, use work-lists to offer so-called work items to users. In real scenarios, users can be confronted with a very large number of work items that stem from multiple cases of different processes. In this jungle of work items, users may find it hard to choose the right item to work on next. The system cannot autonomously decide which is the right work item, since the decision is also dependent on conditions that are somehow outside the system. For instance, what is “best” for an organisation should be mediated with what is “best” for its employees. Current work-list handlers show work items as a simple sorted list and therefore do not provide much decision support for choosing the right work item. Since the work-list handler is the dominant interface between the system and its users, it is worthwhile to provide an intuitive graphical interface that uses contextual information about work items and users to provide suggestions about prioritisation of work items. This paper uses the so-called map metaphor to visualise work items and resources (e.g., users) in a sophisticated manner. Moreover, based on distance notions, the work-list handler can suggest the next work item by considering different perspectives. For example, urgent work items of a type that suits the user may be highlighted. The underlying map and distance notions may be of a geographical nature (e.g., a map of a city or office building), but may also be based on process designs, organisational structures, social networks, due dates, calendars, etc. The framework proposed in this paper is generic and can be applied to any process-aware information system. 
Moreover, in order to show its practical feasibility, the paper discusses a full-fledged implementation developed in the context of the open-source workflow environment YAWL, together with two real examples stemming from two very different scenarios. The results of an initial usability evaluation of the implementation are also presented, which provide a first indication of the validity of the approach.
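The distance-based suggestion can be illustrated with a minimal sketch. It assumes a purely geographical map and plain Euclidean distance, which is only one of the distance notions the abstract mentions. The function name, the dictionary fields and the sample items are hypothetical and do not reflect the YAWL implementation.

```python
from math import hypot

def suggest_next(user_pos, work_items):
    # Suggest the work item closest to the user on the map.
    # Each item is a dict with hypothetical "id", "x" and "y" fields.
    ux, uy = user_pos
    return min(work_items, key=lambda item: hypot(item["x"] - ux, item["y"] - uy))

# Hypothetical work items placed on a two-dimensional map.
items = [
    {"id": "A", "x": 3.0, "y": 4.0},
    {"id": "B", "x": 1.0, "y": 1.0},
    {"id": "C", "x": 10.0, "y": 0.0},
]
print(suggest_next((0.0, 0.0), items)["id"])
```

Other perspectives, such as due dates or organisational structure, could be supported by swapping in a different distance function or by combining several distances with weights.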
Abstract:
This paper presents an image-based visual servoing system that was used to track the atmospheric Earth re-entry of Hayabusa. The primary aim of this ground-based tracking platform was to record the emission spectrum radiating from the superheated gas of the shock layer and the surface of the heat shield during re-entry. To the authors' knowledge, this is the first time that a visual servoing system has successfully tracked a super-orbital re-entry of a spacecraft and recorded its spectral signature. Furthermore, we improved the system by including a simplified dynamic model for feed-forward control and demonstrate improved tracking performance on the International Space Station (ISS). We present comparisons between simulation and experimental results on different target trajectories, including tracking results from Hayabusa and the ISS. The required performance for tracking both spacecraft is demanding when combined with a narrow field of view (FOV). We also briefly discuss the preliminary results obtained from the spectroscopy of Hayabusa's heat shield during re-entry.
Abstract:
This paper investigates the effects of limited speech data in the context of speaker verification using a probabilistic linear discriminant analysis (PLDA) approach. Being able to reduce the length of required speech data is important to the development of automatic speaker verification systems in real-world applications. When sufficient speech is available, previous research has shown that heavy-tailed PLDA (HTPLDA) modeling of speakers in the i-vector space provides state-of-the-art performance; however, the robustness of HTPLDA to limited speech resources in development, enrolment and verification is an important issue that has not yet been investigated. In this paper, we analyze the speaker verification performance with regard to the duration of utterances used for both speaker evaluation (enrolment and verification) and score normalization and PLDA modeling during development. Two different approaches to total-variability representation are analyzed within the PLDA approach to show improved performance in short-utterance mismatched evaluation conditions and conditions for which insufficient speech resources are available for adequate system development. The results presented within this paper using the NIST 2008 Speaker Recognition Evaluation dataset suggest that the HTPLDA system can continue to achieve better performance than Gaussian PLDA (GPLDA) as evaluation utterance lengths are decreased. We also highlight the importance of matching durations for score normalization and PLDA modeling to the expected evaluation conditions. Finally, we found that a pooled total-variability approach to PLDA modeling can achieve better performance than the traditional concatenated total-variability approach for short utterances in mismatched evaluation conditions and conditions for which insufficient speech resources are available for adequate system development.