855 resultados para video object segmentation


Relevância:

40.00% 40.00%

Publicador:

Resumo:

Imitation is an important form of social behavior, and research has aimed to discover and explain the neural and kinematic aspects of imitation. However, much of this research has featured single participants imitating in response to pre-recorded video stimuli. This is in spite of findings that show reduced neural activation to video vs. real life movement stimuli, particularly in the motor cortex. We investigated the degree to which video stimuli may affect the imitation process using a novel motion tracking paradigm with high spatial and temporal resolution. We recorded 14 positions on the hands, arms, and heads of two individuals in an imitation experiment. One individual freely moved within given parameters (moving balls across a series of pegs) and a second participant imitated. This task was performed with either simple (one ball) or complex (three balls) movement difficulty, and either face-to-face or via a live video projection. After an exploratory analysis, three dependent variables were chosen for examination: 3D grip position, joint angles in the arm, and grip aperture. A cross-correlation and multivariate analysis revealed that object-directed imitation task accuracy (as represented by grip position) was reduced in video compared to face-to-face feedback, and in complex compared to simple difficulty. This was most prevalent in the left-right and forward-back motions, relevant to the imitator sitting face-to-face with the actor or with a live projected video of the same actor. The results suggest that for tasks which require object-directed imitation, video stimuli may not be an ecologically valid way to present task materials. However, no similar effects were found in the joint angle and grip aperture variables, suggesting that there are limits to the influence of video stimuli on imitation. The implications of these results are discussed with regards to previous findings, and with suggestions for future experimentation.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Bilayer segmentation of live video in uncontrolled environments is an essential task for home applications in which the original background of the scene must be replaced, as in videochats or traditional videoconference. The main challenge in such conditions is overcome all difficulties in problem-situations (e. g., illumination change, distract events such as element moving in the background and camera shake) that may occur while the video is being captured. This paper presents a survey of segmentation methods for background substitution applications, describes the main concepts and identifies events that may cause errors. Our analysis shows that although robust methods rely on specific devices (multiple cameras or sensors to generate depth maps) which aid the process. In order to achieve the same results using conventional devices (monocular video cameras), most current research relies on energy minimization frameworks, in which temporal and spacial information are probabilistically combined with those of color and contrast.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

This contribution discusses the effects of camera aperture correction in broadcast video on colour-based keying. The aperture correction is used to ’sharpen’ an image and is one element that distinguishes the ’TV-look’ from ’film-look’. ’If a very high level of sharpening is applied, as is the case in many TV productions then this significantly shifts the colours around object boundaries with hight contrast. This paper discusses these effects and their impact on keying and describes a simple low-pass filter to compensate for them. Tests with colour-based segmentation algorithms show that the proposed compensation is an effective way of decreasing the keying artefacts on object boundaries.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We present a user supported tracking framework that combines automatic tracking with extended user input to create error free tracking results that are suitable for interactive video production. The goal of our approach is to keep the necessary user input as small as possible. In our framework, the user can select between different tracking algorithms - existing ones and new ones that are described in this paper. Furthermore, the user can automatically fuse the results of different tracking algorithms with our robust fusion approach. The tracked object can be marked in more than one frame, which can significantly improve the tracking result. After tracking, the user can validate the results in an easy way, thanks to the support of a powerful interpolation technique. The tracking results are iteratively improved until the complete track has been found. After the iterative editing process the tracking result of each object is stored in an interactive video file that can be loaded by our player for interactive videos.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Automatic visual object counting and video surveillance have important applications for home and business environments, such as security and management of access points. However, in order to obtain a satisfactory performance these technologies need professional and expensive hardware, complex installations and setups, and the supervision of qualified workers. In this paper, an efficient visual detection and tracking framework is proposed for the tasks of object counting and surveillance, which meets the requirements of the consumer electronics: off-the-shelf equipment, easy installation and configuration, and unsupervised working conditions. This is accomplished by a novel Bayesian tracking model that can manage multimodal distributions without explicitly computing the association between tracked objects and detections. In addition, it is robust to erroneous, distorted and missing detections. The proposed algorithm is compared with a recent work, also focused on consumer electronics, proving its superior performance.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

La segmentación de imágenes es un campo importante de la visión computacional y una de las áreas de investigación más activas, con aplicaciones en comprensión de imágenes, detección de objetos, reconocimiento facial, vigilancia de vídeo o procesamiento de imagen médica. La segmentación de imágenes es un problema difícil en general, pero especialmente en entornos científicos y biomédicos, donde las técnicas de adquisición imagen proporcionan imágenes ruidosas. Además, en muchos de estos casos se necesita una precisión casi perfecta. En esta tesis, revisamos y comparamos primero algunas de las técnicas ampliamente usadas para la segmentación de imágenes médicas. Estas técnicas usan clasificadores a nivel de pixel e introducen regularización sobre pares de píxeles que es normalmente insuficiente. Estudiamos las dificultades que presentan para capturar la información de alto nivel sobre los objetos a segmentar. Esta deficiencia da lugar a detecciones erróneas, bordes irregulares, configuraciones con topología errónea y formas inválidas. Para solucionar estos problemas, proponemos un nuevo método de regularización de alto nivel que aprende información topológica y de forma a partir de los datos de entrenamiento de una forma no paramétrica usando potenciales de orden superior. Los potenciales de orden superior se están popularizando en visión por computador, pero la representación exacta de un potencial de orden superior definido sobre muchas variables es computacionalmente inviable. Usamos una representación compacta de los potenciales basada en un conjunto finito de patrones aprendidos de los datos de entrenamiento que, a su vez, depende de las observaciones. Gracias a esta representación, los potenciales de orden superior pueden ser convertidos a potenciales de orden 2 con algunas variables auxiliares añadidas. Experimentos con imágenes reales y sintéticas confirman que nuestro modelo soluciona los errores de aproximaciones más débiles. Incluso con una regularización de alto nivel, una precisión exacta es inalcanzable, y se requeire de edición manual de los resultados de la segmentación automática. La edición manual es tediosa y pesada, y cualquier herramienta de ayuda es muy apreciada. Estas herramientas necesitan ser precisas, pero también lo suficientemente rápidas para ser usadas de forma interactiva. Los contornos activos son una buena solución: son buenos para detecciones precisas de fronteras y, en lugar de buscar una solución global, proporcionan un ajuste fino a resultados que ya existían previamente. Sin embargo, requieren una representación implícita que les permita trabajar con cambios topológicos del contorno, y esto da lugar a ecuaciones en derivadas parciales (EDP) que son costosas de resolver computacionalmente y pueden presentar problemas de estabilidad numérica. Presentamos una aproximación morfológica a la evolución de contornos basada en un nuevo operador morfológico de curvatura que es válido para superficies de cualquier dimensión. Aproximamos la solución numérica de la EDP de la evolución de contorno mediante la aplicación sucesiva de un conjunto de operadores morfológicos aplicados sobre una función de conjuntos de nivel. Estos operadores son muy rápidos, no sufren de problemas de estabilidad numérica y no degradan la función de los conjuntos de nivel, de modo que no hay necesidad de reinicializarlo. Además, su implementación es mucho más sencilla que la de las EDP, ya que no requieren usar sofisticados algoritmos numéricos. Desde un punto de vista teórico, profundizamos en las conexiones entre operadores morfológicos y diferenciales, e introducimos nuevos resultados en este área. Validamos nuestra aproximación proporcionando una implementación morfológica de los contornos geodésicos activos, los contornos activos sin bordes, y los turbopíxeles. En los experimentos realizados, las implementaciones morfológicas convergen a soluciones equivalentes a aquéllas logradas mediante soluciones numéricas tradicionales, pero con ganancias significativas en simplicidad, velocidad y estabilidad. ABSTRACT Image segmentation is an important field in computer vision and one of its most active research areas, with applications in image understanding, object detection, face recognition, video surveillance or medical image processing. Image segmentation is a challenging problem in general, but especially in the biological and medical image fields, where the imaging techniques usually produce cluttered and noisy images and near-perfect accuracy is required in many cases. In this thesis we first review and compare some standard techniques widely used for medical image segmentation. These techniques use pixel-wise classifiers and introduce weak pairwise regularization which is insufficient in many cases. We study their difficulties to capture high-level structural information about the objects to segment. This deficiency leads to many erroneous detections, ragged boundaries, incorrect topological configurations and wrong shapes. To deal with these problems, we propose a new regularization method that learns shape and topological information from training data in a nonparametric way using high-order potentials. High-order potentials are becoming increasingly popular in computer vision. However, the exact representation of a general higher order potential defined over many variables is computationally infeasible. We use a compact representation of the potentials based on a finite set of patterns learned fromtraining data that, in turn, depends on the observations. Thanks to this representation, high-order potentials can be converted into pairwise potentials with some added auxiliary variables and minimized with tree-reweighted message passing (TRW) and belief propagation (BP) techniques. Both synthetic and real experiments confirm that our model fixes the errors of weaker approaches. Even with high-level regularization, perfect accuracy is still unattainable, and human editing of the segmentation results is necessary. The manual edition is tedious and cumbersome, and tools that assist the user are greatly appreciated. These tools need to be precise, but also fast enough to be used in real-time. Active contours are a good solution: they are good for precise boundary detection and, instead of finding a global solution, they provide a fine tuning to previously existing results. However, they require an implicit representation to deal with topological changes of the contour, and this leads to PDEs that are computationally costly to solve and may present numerical stability issues. We present a morphological approach to contour evolution based on a new curvature morphological operator valid for surfaces of any dimension. We approximate the numerical solution of the contour evolution PDE by the successive application of a set of morphological operators defined on a binary level-set. These operators are very fast, do not suffer numerical stability issues, and do not degrade the level set function, so there is no need to reinitialize it. Moreover, their implementation is much easier than their PDE counterpart, since they do not require the use of sophisticated numerical algorithms. From a theoretical point of view, we delve into the connections between differential andmorphological operators, and introduce novel results in this area. We validate the approach providing amorphological implementation of the geodesic active contours, the active contours without borders, and turbopixels. In the experiments conducted, the morphological implementations converge to solutions equivalent to those achieved by traditional numerical solutions, but with significant gains in simplicity, speed, and stability.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The increasing use of video editing software has resulted in a necessity for faster and more efficient editing tools. Here, we propose a lightweight high-quality video indexing tool that is suitable for video editing software.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The increasing use of video editing software requires faster and more efficient editing tools. As a first step, these tools perform a temporal segmentation in shots that allows a later building of indexes describing the video content. Here, we propose a novel real-time high-quality shot detection strategy, suitable for the last generation of video editing software requiring both low computational cost and high quality results. While abrupt transitions are detected through a very fast pixel-based analysis, gradual transitions are obtained from an efficient edge-based analysis. Both analyses are reinforced with a motion analysis that helps to detect and discard false detections. This motion analysis is carried out exclusively over a reduced set of candidate transitions, thus maintaining the computational requirements demanded by new applications to fulfill user needs.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Low-cost systems that can obtain a high-quality foreground segmentation almostindependently of the existing illumination conditions for indoor environments are verydesirable, especially for security and surveillance applications. In this paper, a novelforeground segmentation algorithm that uses only a Kinect depth sensor is proposedto satisfy the aforementioned system characteristics. This is achieved by combininga mixture of Gaussians-based background subtraction algorithm with a new Bayesiannetwork that robustly predicts the foreground/background regions between consecutivetime steps. The Bayesian network explicitly exploits the intrinsic characteristics ofthe depth data by means of two dynamic models that estimate the spatial and depthevolution of the foreground/background regions. The most remarkable contribution is thedepth-based dynamic model that predicts the changes in the foreground depth distributionbetween consecutive time steps. This is a key difference with regard to visible imagery,where the color/gray distribution of the foreground is typically assumed to be constant.Experiments carried out on two different depth-based databases demonstrate that theproposed combination of algorithms is able to obtain a more accurate segmentation of theforeground/background than other state-of-the art approaches.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

The perception of an object as a single entity within a visual scene requires that its features are bound together and segregated from the background and/or other objects. Here, we used magnetoencephalography (MEG) to assess the hypothesis that coherent percepts may arise from the synchronized high frequency (gamma) activity between neurons that code features of the same object. We also assessed the role of low frequency (alpha, beta) activity in object processing. The target stimulus (i.e. object) was a small patch of a concentric grating of 3c/°, viewed eccentrically. The background stimulus was either a blank field or a concentric grating of 3c/° periodicity, viewed centrally. With patterned backgrounds, the target stimulus emerged--through rotation about its own centre--as a circular subsection of the background. Data were acquired using a 275-channel whole-head MEG system and analyzed using Synthetic Aperture Magnetometry (SAM), which allows one to generate images of task-related cortical oscillatory power changes within specific frequency bands. Significant oscillatory activity across a broad range of frequencies was evident at the V1/V2 border, and subsequent analyses were based on a virtual electrode at this location. When the target was presented in isolation, we observed that: (i) contralateral stimulation yielded a sustained power increase in gamma activity; and (ii) both contra- and ipsilateral stimulation yielded near identical transient power changes in alpha (and beta) activity. When the target was presented against a patterned background, we observed that: (i) contralateral stimulation yielded an increase in high-gamma (>55 Hz) power together with a decrease in low-gamma (40-55 Hz) power; and (ii) both contra- and ipsilateral stimulation yielded a transient decrease in alpha (and beta) activity, though the reduction tended to be greatest for contralateral stimulation. The opposing power changes across different regions of the gamma spectrum with 'figure/ground' stimulation suggest a possible dual role for gamma rhythms in visual object coding, and provide general support of the binding-by-synchronization hypothesis. As the power changes in alpha and beta activity were largely independent of the spatial location of the target, however, we conclude that their role in object processing may relate principally to changes in visual attention.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Automatic video segmentation plays a vital role in sports videos annotation. This paper presents a fully automatic and computationally efficient algorithm for analysis of sports videos. Various methods of automatic shot boundary detection have been proposed to perform automatic video segmentation. These investigations mainly concentrate on detecting fades and dissolves for fast processing of the entire video scene without providing any additional feedback on object relativity within the shots. The goal of the proposed method is to identify regions that perform certain activities in a scene. The model uses some low-level feature video processing algorithms to extract the shot boundaries from a video scene and to identify dominant colours within these boundaries. An object classification method is used for clustering the seed distributions of the dominant colours to homogeneous regions. Using a simple tracking method a classification of these regions to active or static is performed. The efficiency of the proposed framework is demonstrated over a standard video benchmark with numerous types of sport events and the experimental results show that our algorithm can be used with high accuracy for automatic annotation of active regions for sport videos.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Intravascular ultrasound (IVUS) image segmentation can provide more detailed vessel and plaque information, resulting in better diagnostics, evaluation and therapy planning. A novel automatic segmentation proposal is described herein; the method relies on a binary morphological object reconstruction to segment the coronary wall in IVUS images. First, a preprocessing followed by a feature extraction block are performed, allowing for the desired information to be extracted. Afterward, binary versions of the desired objects are reconstructed, and their contours are extracted to segment the image. The effectiveness is demonstrated by segmenting 1300 images, in which the outcomes had a strong correlation to their corresponding gold standard. Moreover, the results were also corroborated statistically by having as high as 92.72% and 91.9% of true positive area fraction for the lumen and media adventitia border, respectively. In addition, this approach can be adapted easily and applied to other related modalities, such as intravascular optical coherence tomography and intravascular magnetic resonance imaging. (E-mail: matheuscardosomg@hotmail.com) (C) 2011 World Federation for Ultrasound in Medicine & Biology.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we present a method for real-time detection and tracking of people in video captured by a depth camera. For each object to be assessed, an ordered sequence of values that represents the distances between its center of mass to the boundary points is calculated. The recognition is based on the analysis of the total distance value between the above sequence and some pre-defined human poses, after apply the Dynamic Time Warping. This similarity approach showed robust results in people detection.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Dissertation submitted in partial fulfilment of the requirements for the Degree of Master of Science in Geospatial Technologies

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The Casa da Música Foundation, responsible for the management of Casa da Música do Porto building, has the need to obtain statistical data related to the number of building’s visitors. This information is a valuable tool for the elaboration of periodical reports concerning the success of this cultural institution. For this reason it was necessary to develop a system capable of returning the number of visitors for a requested period of time. This represents a complex task due to the building’s unique architectural design, characterized by very large doors and halls, and the sudden large number of people that pass through them in moments preceding and proceeding the different activities occurring in the building. To achieve the technical solution for this challenge, several image processing methods, for people detection with still cameras, were first studied. The next step was the development of a real time algorithm, using OpenCV libraries and computer vision concepts,to count individuals with the desired accuracy. This algorithm includes the scientific and technical knowledge acquired in the study of the previous methods. The themes developed in this thesis comprise the fields of background maintenance, shadow and highlight detection, and blob detection and tracking. A graphical interface was also built, to help on the development, test and tunning of the proposed system, as a complement to the work. Furthermore, tests to the system were also performed, to certify the proposed techniques against a set of limited circumstances. The results obtained revealed that the algorithm was successfully applied to count the number of people in complex environments with reliable accuracy.