888 results for Computer vision system
Abstract:
[EN] In this work we propose a new variational model for the consistent estimation of motion fields. The aim of this work is to develop appropriate spatio-temporal coherence models. In this sense, we propose two main contributions: a nonlinear flow constancy assumption, similar in spirit to the nonlinear brightness constancy assumption, which conveniently relates flow fields at different time instants; and a nonlinear temporal regularization scheme, which complements the spatial regularization and can cope with piecewise continuous motion fields. These contributions pose a congruent variational model since all the energy terms, except the spatial regularization, are based on nonlinear warpings of the flow field. This model is more general than its spatial counterpart, provides more accurate solutions and preserves the continuity of optical flows in time. In the experimental results, we show that the method attains better results and, in particular, it considerably improves the accuracy in the presence of large displacements.
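For orientation, the two constancy assumptions named above can be sketched in LaTeX. The first equation is the standard nonlinear brightness constancy assumption. The second is only an assumed reading of "similar in spirit" and is not taken from the abstract. The symbols I (image intensity), w (flow field), x (pixel position) and t (time) are notation introduced here.

```latex
\begin{align}
% Nonlinear brightness constancy: a point keeps its intensity along its motion.
I\bigl(\mathbf{x} + \mathbf{w}(\mathbf{x},t),\, t+1\bigr) &= I(\mathbf{x},t) \\
% Assumed analogue for the flow itself: the flow is preserved along the motion.
\mathbf{w}\bigl(\mathbf{x} + \mathbf{w}(\mathbf{x},t),\, t+1\bigr) &= \mathbf{w}(\mathbf{x},t)
\end{align}
```

Both conditions evaluate a quantity at the warped position x + w, which is why no linearization in w is involved.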
Abstract:
[EN] The aim of this work is to propose a new method for estimating the backward flow directly from the optical flow. We assume that the optical flow has already been computed and we need to estimate the inverse mapping. This mapping is not bijective due to the presence of occlusions and disocclusions, so it is not possible to estimate the inverse function over the whole domain. Values in these regions have to be guessed from the available information. We propose an accurate algorithm to calculate the backward flow from the optical flow alone, using a simple relation. Occlusions are filled by selecting the maximum motion, and disocclusions are filled with two different strategies: a min-fill strategy, which fills each disoccluded region with the minimum value around the region; and a restricted min-fill approach, which selects the minimum value in a close neighborhood. In the experimental results, we show the accuracy of the method and compare the results obtained with these two strategies.
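The idea can be illustrated with a minimal one-dimensional sketch. It is not the authors' algorithm. It assumes integer-rounded displacements, takes the "simple relation" to be that the backward flow at x + w(x) equals -w(x), and interprets "maximum" and "minimum" as largest and smallest magnitude. The function name `backward_flow_1d` is hypothetical.

```python
def backward_flow_1d(forward):
    """Estimate a 1D backward flow from a 1D forward flow (illustrative only)."""
    n = len(forward)
    backward = [None] * n  # None marks positions that no pixel maps to

    # Assumed relation: if x moves to y = x + w(x), the backward flow at y is -w(x).
    for x, w in enumerate(forward):
        y = x + int(round(w))
        if not 0 <= y < n:
            continue  # the pixel leaves the domain
        # Occlusion: several pixels land on y, so keep the largest motion.
        if backward[y] is None or abs(w) > abs(backward[y]):
            backward[y] = -w

    # Min-fill: fill each disoccluded run with the smallest-magnitude border value.
    x = 0
    while x < n:
        if backward[x] is not None:
            x += 1
            continue
        start = x
        while x < n and backward[x] is None:
            x += 1
        border = [backward[i] for i in (start - 1, x) if 0 <= i < n]
        fill = min(border, key=abs) if border else 0.0
        for i in range(start, x):
            backward[i] = fill
    return backward


# A two-pixel object at positions 2 and 3 moves right by 2 over a static background.
print(backward_flow_1d([0, 0, 2, 2, 0, 0]))
```

In this example the object lands on positions 4 and 5, where it occludes the static background, so those positions receive the object motion. Positions 2 and 3 are disoccluded and are filled from the static neighbor.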
Abstract:
[EN] In this paper we study a variational problem derived from a computer vision application: video camera calibration with a smoothing constraint. By video camera calibration we mean estimating the location, orientation and lens zoom setting of the camera for each video frame, taking into account the features visible in the image. To simplify the problem we assume that the camera is mounted on a tripod. In that case, for each frame captured at time t, the calibration is given by 3 parameters: (1) P(t) (PAN), the rotation about the tripod's vertical axis; (2) T(t) (TILT), the rotation about the tripod's horizontal axis; and (3) Z(t) (CAMERA ZOOM), the camera lens zoom setting. The calibration function t -> u(t) = (P(t),T(t),Z(t)) is obtained as a minimum of an energy function I[u]. In this paper we study the existence of minima of this energy function as well as the solutions of the associated Euler-Lagrange equations.
Abstract:
[EN] In this report we study a number of fluid optic flow sequences in the context of the FLUID Specific Targeted Research Project - Contract No 513633, funded by the EEC. The main goal of this report is to analyse the behaviour of classical computer vision optic flow techniques when applied to fluid sequences. We use the optic flow sequences provided by other partners of the FLUID project.
Abstract:
[EN] In recent years we have developed several methods for 3D reconstruction. We began with the problem of reconstructing a 3D scene from a stereoscopic pair of images. We developed methods based on energy functionals which produce dense disparity maps while preserving discontinuities at image boundaries. We then moved on to the problem of reconstructing a 3D scene from multiple views (more than 2). The method for multiple view reconstruction relies on the method for stereoscopic reconstruction. For every pair of consecutive images we estimate a disparity map and then we apply a robust method that searches for good correspondences through the sequence of images. Recently we have proposed several methods for 3D surface regularization. This is a postprocessing step necessary for smoothing the final surface, which could be affected by noise or mismatched correspondences. These regularization methods are interesting because they use the information from the reconstruction process and not only from the 3D surface. We have tackled all these problems from an energy minimization approach. We investigate the associated Euler-Lagrange equation of the energy functional, and we approximate the solution of the underlying partial differential equation (PDE) using a gradient descent method.
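As a rough illustration of the energy minimization approach, the sketch below runs gradient descent on a discrete one-dimensional energy. It is not the authors' functional. It swaps in a plain quadratic smoothness term, which does not preserve discontinuities, and the names `energy` and `gradient_descent` and the parameters `alpha`, `step` and `iterations` are assumptions made for the example.

```python
def energy(u, f, alpha):
    """Quadratic data term plus quadratic smoothness term (illustrative only)."""
    data = sum((ui - fi) ** 2 for ui, fi in zip(u, f))
    smooth = sum((u[i + 1] - u[i]) ** 2 for i in range(len(u) - 1))
    return data + alpha * smooth


def gradient_descent(f, alpha=1.0, step=0.1, iterations=200):
    """Minimize energy(u, f, alpha) by explicit gradient descent, starting at u = f."""
    u = list(f)
    n = len(u)
    for _ in range(iterations):
        grad = []
        for i in range(n):
            g = 2.0 * (u[i] - f[i])  # derivative of the data term
            if i > 0:
                g += 2.0 * alpha * (u[i] - u[i - 1])  # smoothness, left neighbor
            if i < n - 1:
                g -= 2.0 * alpha * (u[i + 1] - u[i])  # smoothness, right neighbor
            grad.append(g)
        # Descent step: move against the gradient of the discrete energy.
        u = [ui - step * gi for ui, gi in zip(u, grad)]
    return u


noisy = [0.0, 0.0, 1.0, 0.0, 0.0]
smoothed = gradient_descent(noisy)
print(energy(noisy, noisy, 1.0), energy(smoothed, noisy, 1.0))
```

Setting the gradient to zero gives the discrete counterpart of the Euler-Lagrange equation for this energy. The step size must stay small enough for the explicit iteration to remain stable, and the default used here is chosen for `alpha = 1.0`.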
Abstract:
Distribution of the code is permitted under the terms of the three-clause BSD license.
Abstract:
[EN] In this paper, we address the challenge of gender classification using large databases of images with two goals. The first objective is to evaluate whether the error rate decreases compared to smaller databases. The second goal is to determine whether the classifier that provides the best classification rate for one database improves the classification results for other databases, that is, the cross-database performance.
Abstract:
[EN] In this paper, we experimentally study the combination of face and facial feature detectors to improve face detection performance. The face detection problem, as suggested by recent face detection challenges, is still not solved. Face detectors traditionally fail in large-scale problems and/or when the face is occluded or different head rotations are present. The combination of face and facial feature detectors is evaluated with a public database. The results show an improvement in the positive detection rate together with a reduction in the false detection rate. Additionally, we show that the integration of facial feature detectors provides useful information for pose estimation and face alignment.
Abstract:
[EN] In this paper, we focus on gender recognition in challenging large-scale scenarios. Firstly, we review the results reported in the literature for the problem in large datasets, and select the currently hardest dataset: The Images of Groups. Secondly, we study the extraction of features from the face and its local context to improve the recognition accuracy. Different descriptors, resolutions and classifiers are studied, outperforming previously reported results and reaching an accuracy of 89.8%.
Abstract:
[EN] Gender information may serve to automatically adapt interaction to the user's needs, among other applications. Within the Computer Vision community, gender classification (GC) has mainly been accomplished with the facial pattern. Periocular biometrics has recently attracted researchers' attention, with successful results in the context of identity recognition. However, there is a lack of experimental evaluation of the periocular pattern for GC in the wild. The aim of this paper is to study the performance of this specific facial area in the currently most challenging large dataset for the problem.
Abstract:
[EN] In this work, focus measures based on local binary patterns are presented. Local binary patterns (LBP) have been introduced in computer vision tasks such as texture classification and face recognition. In applications where recognition is based on LBP, a computational saving can be achieved by also using LBP in the focus measures. The behavior of the proposed measures is studied to test whether they fulfill the properties of focus measures, and then a comparison with some well-known focus measures is carried out in different scenarios.
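For readers unfamiliar with LBP, the basic 8-neighbor operator can be sketched as follows. This shows only the standard LBP code computation on a small grayscale image stored as a list of lists. The focus measures built on top of it are not specified in the abstract and are not reproduced here. The function name `lbp_codes` and the bit ordering are choices made for the example.

```python
def lbp_codes(image):
    """Return the basic 8-neighbor LBP code of every interior pixel of a 2D list."""
    # Neighbor offsets in clockwise order, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbor is at least as bright as the center.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


flat = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]
peak = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
print(lbp_codes(flat), lbp_codes(peak))
```

Because a recognition pipeline based on LBP already computes these codes, a focus measure derived from them can reuse that work, which is the computational saving the abstract refers to.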
Abstract:
[EN]Perceptual User Interfaces (PUIs) aim at facilitating human-computer interaction with the aid of human-like capacities (computer vision, speech recognition, etc.). In PUIs, the human face is a central element, since it conveys not only identity but also other important information, particularly with respect to the user’s mood or emotional state. This paper describes both a face detector and a smile detector for PUIs. Both are suitable for real-time interaction.
Abstract:
[EN] This paper focuses on four different methods for determining the initial shape for the AAM algorithm, and on their performance in two different classification tasks: one on the DaFEx facial expression database and one on real-world data obtained from a robot’s point of view.
Abstract:
Visual correspondence is a key computer vision task that aims at identifying projections of the same 3D point into images taken either from different viewpoints or at different time instants. This task has been the subject of intense research activity in recent years in scenarios such as object recognition, motion detection, stereo vision, pattern matching and image registration. The approaches proposed in the literature typically aim at improving the state of the art by increasing the reliability, the accuracy or the computational efficiency of visual correspondence algorithms. The research work carried out during the Ph.D. course and presented in this dissertation deals with three specific visual correspondence problems: fast pattern matching, stereo correspondence and robust image matching. The dissertation presents original contributions to the theory of visual correspondence, as well as applications dealing with 3D reconstruction and multi-view video surveillance.
Abstract:
Recognizing, tracking and identifying a gesture is a complex, multi-step operation. In recent years, with the widespread arrival of increasingly sophisticated interactive interfaces, approaches to human-machine interaction have multiplied. The common goal is “transparent” communication between the user and the computer, which must interpret human gestures by means of mathematical algorithms. Gesture recognition is a way for the machine to begin to understand human body language. The discipline studies new forms of interaction between the two and has two broad objectives: (a) tracking the movements of a particular limb; (b) recognizing the tracked path as an identifying gesture. Each of these two points encompasses many research areas, because many approaches have been proposed over the years. It is not a matter of simple image capture: a supporting framework, sometimes a very elaborate one, is needed in which the raw data from the camera undergo advanced filtering and algorithmic processing, so as to turn raw information into usable, reliable data. Gesture recognition technology is as significant as the introduction of touch interfaces on smartphones. Industry has now begun to produce devices that offer users a new experience, as natural as possible. From video games, to television controlled with small gestures, to the biomedical field, a new generation of devices is being introduced whose uses are countless, and for each application area the specific characteristics must be studied carefully in order to produce something new and effective. This thesis aims to contribute to this discipline.
To date, a great many applications and their associated devices aim to capture large movements: the gesture is performed with most of the body and occupies a significant region of space. This thesis instead proposes an approach in which the movements to be tracked and recognized are made “in the small”. It deals with gestures classified as fine, where the hand movements are made in front of the body, for example in the chest area. The possible application areas are many; the craft domain was chosen and adopted for this work.