6 resultados para video object segmentation
em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Resumo:
Bilayer segmentation of live video in uncontrolled environments is an essential task for home applications in which the original background of the scene must be replaced, as in videochats or traditional videoconference. The main challenge in such conditions is overcome all difficulties in problem-situations (e. g., illumination change, distract events such as element moving in the background and camera shake) that may occur while the video is being captured. This paper presents a survey of segmentation methods for background substitution applications, describes the main concepts and identifies events that may cause errors. Our analysis shows that although robust methods rely on specific devices (multiple cameras or sensors to generate depth maps) which aid the process. In order to achieve the same results using conventional devices (monocular video cameras), most current research relies on energy minimization frameworks, in which temporal and spacial information are probabilistically combined with those of color and contrast.
Resumo:
A deep theoretical analysis of the graph cut image segmentation framework presented in this paper simultaneously translates into important contributions in several directions. The most important practical contribution of this work is a full theoretical description, and implementation, of a novel powerful segmentation algorithm, GC(max). The output of GC(max) coincides with a version of a segmentation algorithm known as Iterative Relative Fuzzy Connectedness, IRFC. However, GC(max) is considerably faster than the classic IRFC algorithm, which we prove theoretically and show experimentally. Specifically, we prove that, in the worst case scenario, the GC(max) algorithm runs in linear time with respect to the variable M=|C|+|Z|, where |C| is the image scene size and |Z| is the size of the allowable range, Z, of the associated weight/affinity function. For most implementations, Z is identical to the set of allowable image intensity values, and its size can be treated as small with respect to |C|, meaning that O(M)=O(|C|). In such a situation, GC(max) runs in linear time with respect to the image size |C|. We show that the output of GC(max) constitutes a solution of a graph cut energy minimization problem, in which the energy is defined as the a"" (a) norm ayenF (P) ayen(a) of the map F (P) that associates, with every element e from the boundary of an object P, its weight w(e). This formulation brings IRFC algorithms to the realm of the graph cut energy minimizers, with energy functions ayenF (P) ayen (q) for qa[1,a]. Of these, the best known minimization problem is for the energy ayenF (P) ayen(1), which is solved by the classic min-cut/max-flow algorithm, referred to often as the Graph Cut algorithm. We notice that a minimization problem for ayenF (P) ayen (q) , qa[1,a), is identical to that for ayenF (P) ayen(1), when the original weight function w is replaced by w (q) . Thus, any algorithm GC(sum) solving the ayenF (P) ayen(1) minimization problem, solves also one for ayenF (P) ayen (q) with qa[1,a), so just two algorithms, GC(sum) and GC(max), are enough to solve all ayenF (P) ayen (q) -minimization problems. We also show that, for any fixed weight assignment, the solutions of the ayenF (P) ayen (q) -minimization problems converge to a solution of the ayenF (P) ayen(a)-minimization problem (ayenF (P) ayen(a)=lim (q -> a)ayenF (P) ayen (q) is not enough to deduce that). An experimental comparison of the performance of GC(max) and GC(sum) algorithms is included. This concentrates on comparing the actual (as opposed to provable worst scenario) algorithms' running time, as well as the influence of the choice of the seeds on the output.
Resumo:
Primary voice production occurs in the larynx through vibrational movements carried out by vocal folds. However, many problems can affect this complex system resulting in voice disorders. In this context, time-frequency-shape analysis based on embedding phase space plots and nonlinear dynamics methods have been used to evaluate the vocal fold dynamics during phonation. For this purpose, the present work used high-speed video to record the vocal fold movements of three subjects and extract the glottal area time series using an image segmentation algorithm. This signal is used for an optimization method which combines genetic algorithms and a quasi-Newton method to optimize the parameters of a biomechanical model of vocal folds based on lumped elements (masses, springs and dampers). After optimization, this model is capable of simulating the dynamics of recorded vocal folds and their glottal pulse. Bifurcation diagrams and phase space analysis were used to evaluate the behavior of this deterministic system in different circumstances. The results showed that this methodology can be used to extract some physiological parameters of vocal folds and reproduce some complex behaviors of these structures contributing to the scientific and clinical evaluation of voice production. (C) 2010 Elsevier Inc. All rights reserved.
Resumo:
Abstract Background Atherosclerosis causes millions of deaths, annually yielding billions in expenses round the world. Intravascular Optical Coherence Tomography (IVOCT) is a medical imaging modality, which displays high resolution images of coronary cross-section. Nonetheless, quantitative information can only be obtained with segmentation; consequently, more adequate diagnostics, therapies and interventions can be provided. Since it is a relatively new modality, many different segmentation methods, available in the literature for other modalities, could be successfully applied to IVOCT images, improving accuracies and uses. Method An automatic lumen segmentation approach, based on Wavelet Transform and Mathematical Morphology, is presented. The methodology is divided into three main parts. First, the preprocessing stage attenuates and enhances undesirable and important information, respectively. Second, in the feature extraction block, wavelet is associated with an adapted version of Otsu threshold; hence, tissue information is discriminated and binarized. Finally, binary morphological reconstruction improves the binary information and constructs the binary lumen object. Results The evaluation was carried out by segmenting 290 challenging images from human and pig coronaries, and rabbit iliac arteries; the outcomes were compared with the gold standards made by experts. The resultant accuracy was obtained: True Positive (%) = 99.29 ± 2.96, False Positive (%) = 3.69 ± 2.88, False Negative (%) = 0.71 ± 2.96, Max False Positive Distance (mm) = 0.1 ± 0.07, Max False Negative Distance (mm) = 0.06 ± 0.1. Conclusions In conclusion, by segmenting a number of IVOCT images with various features, the proposed technique showed to be robust and more accurate than published studies; in addition, the method is completely automatic, providing a new tool for IVOCT segmentation.
Resumo:
Recently there has been a considerable interest in dynamic textures due to the explosive growth of multimedia databases. In addition, dynamic texture appears in a wide range of videos, which makes it very important in applications concerning to model physical phenomena. Thus, dynamic textures have emerged as a new field of investigation that extends the static or spatial textures to the spatio-temporal domain. In this paper, we propose a novel approach for dynamic texture segmentation based on automata theory and k-means algorithm. In this approach, a feature vector is extracted for each pixel by applying deterministic partially self-avoiding walks on three orthogonal planes of the video. Then, these feature vectors are clustered by the well-known k-means algorithm. Although the k-means algorithm has shown interesting results, it only ensures its convergence to a local minimum, which affects the final result of segmentation. In order to overcome this drawback, we compare six methods of initialization of the k-means. The experimental results have demonstrated the effectiveness of our proposed approach compared to the state-of-the-art segmentation methods.
Resumo:
Dynamic texture is a recent field of investigation that has received growing attention from computer vision community in the last years. These patterns are moving texture in which the concept of selfsimilarity for static textures is extended to the spatiotemporal domain. In this paper, we propose a novel approach for dynamic texture representation, that can be used for both texture analysis and segmentation. In this method, deterministic partially self-avoiding walks are performed in three orthogonal planes of the video in order to combine appearance and motion features. We validate our method on three applications of dynamic texture that present interesting challenges: recognition, clustering and segmentation. Experimental results on these applications indicate that the proposed method improves the dynamic texture representation compared to the state of the art.