909 resultados para Scene segmentation
Resumo:
Chaotic synchronization has been discovered to be an important property of neural activities, which in turn has encouraged many researchers to develop chaotic neural networks for scene and data analysis. In this paper, we study the synchronization role of coupled chaotic oscillators in networks of general topology. Specifically, a rigorous proof is presented to show that a large number of oscillators with arbitrary geometrical connections can be synchronized by providing a sufficiently strong coupling strength. Moreover, the results presented in this paper not only are valid to a wide class of chaotic oscillators, but also cover the parameter mismatch case. Finally, we show how the obtained result can be applied to construct an oscillatory network for scene segmentation.
Resumo:
Synchronization and chaos play important roles in neural activities and have been applied in oscillatory correlation modeling for scene and data analysis. Although it is an extensively studied topic, there are still few results regarding synchrony in locally coupled systems. In this paper we give a rigorous proof to show that large numbers of coupled chaotic oscillators with parameter mismatch in a 2D lattice can be synchronized by providing a sufficiently large coupling strength. We demonstrate how the obtained result can be applied to construct an oscillatory network for scene segmentation. (C) 2007 Elsevier B.V. All rights reserved.
Resumo:
"COO-2118-0028."
Resumo:
We are concerned with the problem of image segmentation in which each pixel is assigned to one of a predefined finite number of classes. In Bayesian image analysis, this requires fusing together local predictions for the class labels with a prior model of segmentations. Markov Random Fields (MRFs) have been used to incorporate some of this prior knowledge, but this not entirely satisfactory as inference in MRFs is NP-hard. The multiscale quadtree model of Bouman and Shapiro (1994) is an attractive alternative, as this is a tree-structured belief network in which inference can be carried out in linear time (Pearl 1988). It is an hierarchical model where the bottom-level nodes are pixels, and higher levels correspond to downsampled versions of the image. The conditional-probability tables (CPTs) in the belief network encode the knowledge of how the levels interact. In this paper we discuss two methods of learning the CPTs given training data, using (a) maximum likelihood and the EM algorithm and (b) emphconditional maximum likelihood (CML). Segmentations obtained using networks trained by CML show a statistically-significant improvement in performance on synthetic images. We also demonstrate the methods on a real-world outdoor-scene segmentation task.
Resumo:
Object selection refers to the mechanism of extracting objects of interest while ignoring other objects and background in a given visual scene. It is a fundamental issue for many computer vision and image analysis techniques and it is still a challenging task to artificial Visual systems. Chaotic phase synchronization takes place in cases involving almost identical dynamical systems and it means that the phase difference between the systems is kept bounded over the time, while their amplitudes remain chaotic and may be uncorrelated. Instead of complete synchronization, phase synchronization is believed to be a mechanism for neural integration in brain. In this paper, an object selection model is proposed. Oscillators in the network representing the salient object in a given scene are phase synchronized, while no phase synchronization occurs for background objects. In this way, the salient object can be extracted. In this model, a shift mechanism is also introduced to change attention from one object to another. Computer simulations show that the model produces some results similar to those observed in natural vision systems.
Resumo:
The aging population has become a burning issue for all modern societies around the world recently. There are two important issues existing now to be solved. One is how to continuously monitor the movements of those people having suffered a stroke in natural living environment for providing more valuable feedback to guide clinical interventions. The other one is how to guide those old people effectively when they are at home or inside other buildings and to make their life easier and convenient. Therefore, human motion tracking and navigation have been active research fields with the increasing number of elderly people. However, motion capture has been extremely challenging to go beyond laboratory environments and obtain accurate measurements of human physical activity especially in free-living environments, and navigation in free-living environments also poses some problems such as the denied GPS signal and the moving objects commonly presented in free-living environments. This thesis seeks to develop new technologies to enable accurate motion tracking and positioning in free-living environments. This thesis comprises three specific goals using our developed IMU board and the camera from the imaging source company: (1) to develop a robust and real-time orientation algorithm using only the measurements from IMU; (2) to develop a robust distance estimation in static free-living environments to estimate people’s position and navigate people in static free-living environments and simultaneously the scale ambiguity problem, usually appearing in the monocular camera tracking, is solved by integrating the data from the visual and inertial sensors; (3) in case of moving objects viewed by the camera existing in free-living environments, to firstly design a robust scene segmentation algorithm and then respectively estimate the motion of the vIMU system and moving objects. To achieve real-time orientation tracking, an Adaptive-Gain Orientation Filter (AGOF) is proposed in this thesis based on the basic theory of deterministic approach and frequency-based approach using only measurements from the newly developed MARG (Magnet, Angular Rate, and Gravity) sensors. To further obtain robust positioning, an adaptive frame-rate vision-aided IMU system is proposed to develop and implement fast vIMU ego-motion estimation algorithms, where the orientation is estimated in real time from MARG sensors in the first step and then used to estimate the position based on the data from visual and inertial sensors. In case of the moving objects viewed by the camera existing in free-living environments, a robust scene segmentation algorithm is firstly proposed to obtain position estimation and simultaneously the 3D motion of moving objects. Finally, corresponding simulations and experiments have been carried out.
Resumo:
The main challenges of multimedia data retrieval lie in the effective mapping between low-level features and high-level concepts, and in the individual users' subjective perceptions of multimedia content. ^ The objectives of this dissertation are to develop an integrated multimedia indexing and retrieval framework with the aim to bridge the gap between semantic concepts and low-level features. To achieve this goal, a set of core techniques have been developed, including image segmentation, content-based image retrieval, object tracking, video indexing, and video event detection. These core techniques are integrated in a systematic way to enable the semantic search for images/videos, and can be tailored to solve the problems in other multimedia related domains. In image retrieval, two new methods of bridging the semantic gap are proposed: (1) for general content-based image retrieval, a stochastic mechanism is utilized to enable the long-term learning of high-level concepts from a set of training data, such as user access frequencies and access patterns of images. (2) In addition to whole-image retrieval, a novel multiple instance learning framework is proposed for object-based image retrieval, by which a user is allowed to more effectively search for images that contain multiple objects of interest. An enhanced image segmentation algorithm is developed to extract the object information from images. This segmentation algorithm is further used in video indexing and retrieval, by which a robust video shot/scene segmentation method is developed based on low-level visual feature comparison, object tracking, and audio analysis. Based on shot boundaries, a novel data mining framework is further proposed to detect events in soccer videos, while fully utilizing the multi-modality features and object information obtained through video shot/scene detection. ^ Another contribution of this dissertation is the potential of the above techniques to be tailored and applied to other multimedia applications. This is demonstrated by their utilization in traffic video surveillance applications. The enhanced image segmentation algorithm, coupled with an adaptive background learning algorithm, improves the performance of vehicle identification. A sophisticated object tracking algorithm is proposed to track individual vehicles, while the spatial and temporal relationships of vehicle objects are modeled by an abstract semantic model. ^
Resumo:
This paper proposes an automatic hand detection system that combines the Fourier-Mellin Transform along with other computer vision techniques to achieve hand detection in cluttered scene color images. The proposed system uses the Fourier-Mellin Transform as an invariant feature extractor to perform RST invariant hand detection. In a first stage of the system a simple non-adaptive skin color-based image segmentation and an interest point detector based on corners are used in order to identify regions of interest that contains possible matches. A sliding window algorithm is then used to scan the image at different scales performing the FMT calculations only in the previously detected regions of interest and comparing the extracted FM descriptor of the windows with a hand descriptors database obtained from a train image set. The results of the performed experiments suggest the use of Fourier-Mellin invariant features as a promising approach for automatic hand detection.
Resumo:
This paper proposes an automatic hand detection system that combines the Fourier-Mellin Transform along with other computer vision techniques to achieve hand detection in cluttered scene color images. The proposed system uses the Fourier-Mellin Transform as an invariant feature extractor to perform RST invariant hand detection. In a first stage of the system a simple non-adaptive skin color-based image segmentation and an interest point detector based on corners are used in order to identify regions of interest that contains possible matches. A sliding window algorithm is then used to scan the image at different scales performing the FMT calculations only in the previously detected regions of interest and comparing the extracted FM descriptor of the windows with a hand descriptors database obtained from a train image set. The results of the performed experiments suggest the use of Fourier-Mellin invariant features as a promising approach for automatic hand detection.
A new approach to segmentation based on fusing circumscribed contours, region growing and clustering
Resumo:
One of the major problems in machine vision is the segmentation of images of natural scenes. This paper presents a new proposal for the image segmentation problem which has been based on the integration of edge and region information. The main contours of the scene are detected and used to guide the posterior region growing process. The algorithm places a number of seeds at both sides of a contour allowing stating a set of concurrent growing processes. A previous analysis of the seeds permits to adjust the homogeneity criterion to the regions's characteristics. A new homogeneity criterion based on clustering analysis and convex hull construction is proposed
Resumo:
We propose a probabilistic object classifier for outdoor scene analysis as a first step in solving the problem of scene context generation. The method begins with a top-down control, which uses the previously learned models (appearance and absolute location) to obtain an initial pixel-level classification. This information provides us the core of objects, which is used to acquire a more accurate object model. Therefore, their growing by specific active regions allows us to obtain an accurate recognition of known regions. Next, a stage of general segmentation provides the segmentation of unknown regions by a bottom-strategy. Finally, the last stage tries to perform a region fusion of known and unknown segmented objects. The result is both a segmentation of the image and a recognition of each segment as a given object class or as an unknown segmented object. Furthermore, experimental results are shown and evaluated to prove the validity of our proposal
Resumo:
We propose a probabilistic object classifier for outdoor scene analysis as a first step in solving the problem of scene context generation. The method begins with a top-down control, which uses the previously learned models (appearance and absolute location) to obtain an initial pixel-level classification. This information provides us the core of objects, which is used to acquire a more accurate object model. Therefore, their growing by specific active regions allows us to obtain an accurate recognition of known regions. Next, a stage of general segmentation provides the segmentation of unknown regions by a bottom-strategy. Finally, the last stage tries to perform a region fusion of known and unknown segmented objects. The result is both a segmentation of the image and a recognition of each segment as a given object class or as an unknown segmented object. Furthermore, experimental results are shown and evaluated to prove the validity of our proposal
A new approach to segmentation based on fusing circumscribed contours, region growing and clustering
Resumo:
One of the major problems in machine vision is the segmentation of images of natural scenes. This paper presents a new proposal for the image segmentation problem which has been based on the integration of edge and region information. The main contours of the scene are detected and used to guide the posterior region growing process. The algorithm places a number of seeds at both sides of a contour allowing stating a set of concurrent growing processes. A previous analysis of the seeds permits to adjust the homogeneity criterion to the regions's characteristics. A new homogeneity criterion based on clustering analysis and convex hull construction is proposed
Resumo:
A deep theoretical analysis of the graph cut image segmentation framework presented in this paper simultaneously translates into important contributions in several directions. The most important practical contribution of this work is a full theoretical description, and implementation, of a novel powerful segmentation algorithm, GC(max). The output of GC(max) coincides with a version of a segmentation algorithm known as Iterative Relative Fuzzy Connectedness, IRFC. However, GC(max) is considerably faster than the classic IRFC algorithm, which we prove theoretically and show experimentally. Specifically, we prove that, in the worst case scenario, the GC(max) algorithm runs in linear time with respect to the variable M=|C|+|Z|, where |C| is the image scene size and |Z| is the size of the allowable range, Z, of the associated weight/affinity function. For most implementations, Z is identical to the set of allowable image intensity values, and its size can be treated as small with respect to |C|, meaning that O(M)=O(|C|). In such a situation, GC(max) runs in linear time with respect to the image size |C|. We show that the output of GC(max) constitutes a solution of a graph cut energy minimization problem, in which the energy is defined as the a"" (a) norm ayenF (P) ayen(a) of the map F (P) that associates, with every element e from the boundary of an object P, its weight w(e). This formulation brings IRFC algorithms to the realm of the graph cut energy minimizers, with energy functions ayenF (P) ayen (q) for qa[1,a]. Of these, the best known minimization problem is for the energy ayenF (P) ayen(1), which is solved by the classic min-cut/max-flow algorithm, referred to often as the Graph Cut algorithm. We notice that a minimization problem for ayenF (P) ayen (q) , qa[1,a), is identical to that for ayenF (P) ayen(1), when the original weight function w is replaced by w (q) . Thus, any algorithm GC(sum) solving the ayenF (P) ayen(1) minimization problem, solves also one for ayenF (P) ayen (q) with qa[1,a), so just two algorithms, GC(sum) and GC(max), are enough to solve all ayenF (P) ayen (q) -minimization problems. We also show that, for any fixed weight assignment, the solutions of the ayenF (P) ayen (q) -minimization problems converge to a solution of the ayenF (P) ayen(a)-minimization problem (ayenF (P) ayen(a)=lim (q -> a)ayenF (P) ayen (q) is not enough to deduce that). An experimental comparison of the performance of GC(max) and GC(sum) algorithms is included. This concentrates on comparing the actual (as opposed to provable worst scenario) algorithms' running time, as well as the influence of the choice of the seeds on the output.
Resumo:
Bilayer segmentation of live video in uncontrolled environments is an essential task for home applications in which the original background of the scene must be replaced, as in videochats or traditional videoconference. The main challenge in such conditions is overcome all difficulties in problem-situations (e. g., illumination change, distract events such as element moving in the background and camera shake) that may occur while the video is being captured. This paper presents a survey of segmentation methods for background substitution applications, describes the main concepts and identifies events that may cause errors. Our analysis shows that although robust methods rely on specific devices (multiple cameras or sensors to generate depth maps) which aid the process. In order to achieve the same results using conventional devices (monocular video cameras), most current research relies on energy minimization frameworks, in which temporal and spacial information are probabilistically combined with those of color and contrast.