916 resultados para Visual search method


Relevância:

30.00% 30.00%

Publicador:

Resumo:

The report addresses the problem of visual recognition under two sources of variability: geometric and photometric. The geometric deals with the relation between 3D objects and their views under orthographic and perspective projection. The photometric deals with the relation between 3D matte objects and their images under changing illumination conditions. Taken together, an alignment-based method is presented for recognizing objects viewed from arbitrary viewing positions and illuminated by arbitrary settings of light sources.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A key problem in object recognition is selection, namely, the problem of identifying regions in an image within which to start the recognition process, ideally by isolating regions that are likely to come from a single object. Such a selection mechanism has been found to be crucial in reducing the combinatorial search involved in the matching stage of object recognition. Even though selection is of help in recognition, it has largely remained unsolved because of the difficulty in isolating regions belonging to objects under complex imaging conditions involving occlusions, changing illumination, and object appearances. This thesis presents a novel approach to the selection problem by proposing a computational model of visual attentional selection as a paradigm for selection in recognition. In particular, it proposes two modes of attentional selection, namely, attracted and pay attention modes as being appropriate for data and model-driven selection in recognition. An implementation of this model has led to new ways of extracting color, texture and line group information in images, and their subsequent use in isolating areas of the scene likely to contain the model object. Among the specific results in this thesis are: a method of specifying color by perceptual color categories for fast color region segmentation and color-based localization of objects, and a result showing that the recognition of texture patterns on model objects is possible under changes in orientation and occlusions without detailed segmentation. The thesis also presents an evaluation of the proposed model by integrating with a 3D from 2D object recognition system and recording the improvement in performance. These results indicate that attentional selection can significantly overcome the computational bottleneck in object recognition, both due to a reduction in the number of features, and due to a reduction in the number of matches during recognition using the information derived during selection. Finally, these studies have revealed a surprising use of selection, namely, in the partial solution of the pose of a 3D object.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A method is presented for the visual analysis of objects by computer. It is particularly well suited for opaque objects with smoothly curved surfaces. The method extracts information about the object's surface properties, including measures of its specularity, texture, and regularity. It also aids in determining the object's shape. The application of this method to a simple recognition task ??e recognition of fruit ?? discussed. The results on a more complex smoothly curved object, a human face, are also considered.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We present a statistical image-based shape + structure model for Bayesian visual hull reconstruction and 3D structure inference. The 3D shape of a class of objects is represented by sets of contours from silhouette views simultaneously observed from multiple calibrated cameras. Bayesian reconstructions of new shapes are then estimated using a prior density constructed with a mixture model and probabilistic principal components analysis. We show how the use of a class-specific prior in a visual hull reconstruction can reduce the effect of segmentation errors from the silhouette extraction process. The proposed method is applied to a data set of pedestrian images, and improvements in the approximate 3D models under various noise conditions are shown. We further augment the shape model to incorporate structural features of interest; unknown structural parameters for a novel set of contours are then inferred via the Bayesian reconstruction process. Model matching and parameter inference are done entirely in the image domain and require no explicit 3D construction. Our shape model enables accurate estimation of structure despite segmentation errors or missing views in the input silhouettes, and works even with only a single input view. Using a data set of thousands of pedestrian images generated from a synthetic model, we can accurately infer the 3D locations of 19 joints on the body based on observed silhouette contours from real images.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Integration of inputs by cortical neurons provides the basis for the complex information processing performed in the cerebral cortex. Here, we propose a new analytic framework for understanding integration within cortical neuronal receptive fields. Based on the synaptic organization of cortex, we argue that neuronal integration is a systems--level process better studied in terms of local cortical circuitry than at the level of single neurons, and we present a method for constructing self-contained modules which capture (nonlinear) local circuit interactions. In this framework, receptive field elements naturally have dual (rather than the traditional unitary influence since they drive both excitatory and inhibitory cortical neurons. This vector-based analysis, in contrast to scalarsapproaches, greatly simplifies integration by permitting linear summation of inputs from both "classical" and "extraclassical" receptive field regions. We illustrate this by explaining two complex visual cortical phenomena, which are incompatible with scalar notions of neuronal integration.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, we develop a novel index structure to support efficient approximate k-nearest neighbor (KNN) query in high-dimensional databases. In high-dimensional spaces, the computational cost of the distance (e.g., Euclidean distance) between two points contributes a dominant portion of the overall query response time for memory processing. To reduce the distance computation, we first propose a structure (BID) using BIt-Difference to answer approximate KNN query. The BID employs one bit to represent each feature vector of point and the number of bit-difference is used to prune the further points. To facilitate real dataset which is typically skewed, we enhance the BID mechanism with clustering, cluster adapted bitcoder and dimensional weight, named the BID⁺. Extensive experiments are conducted to show that our proposed method yields significant performance advantages over the existing index structures on both real life and synthetic high-dimensional datasets.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A visual SLAM system has been implemented and optimised for real-time deployment on an AUV equipped with calibrated stereo cameras. The system incorporates a novel approach to landmark description in which landmarks are local sub maps that consist of a cloud of 3D points and their associated SIFT/SURF descriptors. Landmarks are also sparsely distributed which simplifies and accelerates data association and map updates. In addition to landmark-based localisation the system utilises visual odometry to estimate the pose of the vehicle in 6 degrees of freedom by identifying temporal matches between consecutive local sub maps and computing the motion. Both the extended Kalman filter and unscented Kalman filter have been considered for filtering the observations. The output of the filter is also smoothed using the Rauch-Tung-Striebel (RTS) method to obtain a better alignment of the sequence of local sub maps and to deliver a large-scale 3D acquisition of the surveyed area. Synthetic experiments have been performed using a simulation environment in which ray tracing is used to generate synthetic images for the stereo system

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The aim of this study was to compare the contrast visual processing of concentric sinusoidal gratings stimuli between adolescents and adults. The study included 20 volunteers divided into two groups: 10 adolescents aged 13-19 years (M=16.5, SD=1.65) and 10 adults aged 20-26 years (M=21.8, SD=2.04). In order to measure the contrast sensitivity at spatial frequencies of 0.6, 2.5, 5 and 20 degrees of visual angle (cpd), it was used the psychophysical method of two alternative forced choice (2AFC). A One Way ANOVA performance showed a significant difference in the comparison between groups: F [(4, 237)=3.74, p<.05]. The post-hoc Tukey HSD showed a significant difference between the frequencies of 0.6 (p <.05) and 20 cpd (p<.05). Thus, the results showed that the visual perception behaves differently with regard to the sensory mechanisms that render the contrast towards adolescents and adults. These results are useful to better characterize and comprehend human vision development.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The human visual ability to perceive depth looks like a puzzle. We perceive three-dimensional spatial information quickly and efficiently by using the binocular stereopsis of our eyes and, what is mote important the learning of the most common objects which we achieved through living. Nowadays, modelling the behaviour of our brain is a fiction, that is why the huge problem of 3D perception and further, interpretation is split into a sequence of easier problems. A lot of research is involved in robot vision in order to obtain 3D information of the surrounded scene. Most of this research is based on modelling the stereopsis of humans by using two cameras as if they were two eyes. This method is known as stereo vision and has been widely studied in the past and is being studied at present, and a lot of work will be surely done in the future. This fact allows us to affirm that this topic is one of the most interesting ones in computer vision. The stereo vision principle is based on obtaining the three dimensional position of an object point from the position of its projective points in both camera image planes. However, before inferring 3D information, the mathematical models of both cameras have to be known. This step is known as camera calibration and is broadly describes in the thesis. Perhaps the most important problem in stereo vision is the determination of the pair of homologue points in the two images, known as the correspondence problem, and it is also one of the most difficult problems to be solved which is currently investigated by a lot of researchers. The epipolar geometry allows us to reduce the correspondence problem. An approach to the epipolar geometry is describes in the thesis. Nevertheless, it does not solve it at all as a lot of considerations have to be taken into account. As an example we have to consider points without correspondence due to a surface occlusion or simply due to a projection out of the camera scope. The interest of the thesis is focused on structured light which has been considered as one of the most frequently used techniques in order to reduce the problems related lo stereo vision. Structured light is based on the relationship between a projected light pattern its projection and an image sensor. The deformations between the pattern projected into the scene and the one captured by the camera, permits to obtain three dimensional information of the illuminated scene. This technique has been widely used in such applications as: 3D object reconstruction, robot navigation, quality control, and so on. Although the projection of regular patterns solve the problem of points without match, it does not solve the problem of multiple matching, which leads us to use hard computing algorithms in order to search the correct matches. In recent years, another structured light technique has increased in importance. This technique is based on the codification of the light projected on the scene in order to be used as a tool to obtain an unique match. Each token of light is imaged by the camera, we have to read the label (decode the pattern) in order to solve the correspondence problem. The advantages and disadvantages of stereo vision against structured light and a survey on coded structured light are related and discussed. The work carried out in the frame of this thesis has permitted to present a new coded structured light pattern which solves the correspondence problem uniquely and robust. Unique, as each token of light is coded by a different word which removes the problem of multiple matching. Robust, since the pattern has been coded using the position of each token of light with respect to both co-ordinate axis. Algorithms and experimental results are included in the thesis. The reader can see examples 3D measurement of static objects, and the more complicated measurement of moving objects. The technique can be used in both cases as the pattern is coded by a single projection shot. Then it can be used in several applications of robot vision. Our interest is focused on the mathematical study of the camera and pattern projector models. We are also interested in how these models can be obtained by calibration, and how they can be used to obtained three dimensional information from two correspondence points. Furthermore, we have studied structured light and coded structured light, and we have presented a new coded structured light pattern. However, in this thesis we started from the assumption that the correspondence points could be well-segmented from the captured image. Computer vision constitutes a huge problem and a lot of work is being done at all levels of human vision modelling, starting from a)image acquisition; b) further image enhancement, filtering and processing, c) image segmentation which involves thresholding, thinning, contour detection, texture and colour analysis, and so on. The interest of this thesis starts in the next step, usually known as depth perception or 3D measurement.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The ISO norm line 9241 states some criteria for ergonomics of human system interaction. In markets with a huge variety of offers and little possibility of differentiation, providers can gain a decisive competitive advantage by user oriented interfaces. A precondition for this is that relevant information can be obtained for entrepreneurial decisions in this regard. To test how users of universal search result pages use those pages and pay attention to different elements, an eye tracking experiment with a mixed design has been developed. Twenty subjects were confronted with search engine result pages (SERPs) and were instructed to make a decision while conditions “national vs. international city” and “with vs. without miniaturized Google map” were used. Different parameters like fixation count, duration and time to first fixation were computed from the eye tracking raw data and supplemented by click rate data as well as data from questionnaires. Results of this pilot study revealed some remarkable facts like a vampire effect on miniaturized Google maps. Furthermore, Google maps did not shorten the process of decision making, Google ads were not fixated, visual attention on SERPs was influenced by position of the elements on the SERP and by the users’ familiarity with the search target. These results support the theory of Amount of Invested Mental Effort (AIME) and give providers empirical evidence to take users’ expectations into account. Furthermore, the results indicated that the task oriented goal mode of participants was a moderator for the attention spent on ads. Most important, SERPs with images attracted the viewers’ attention much longer than those without images. This unique selling proposition may lead to a distortion of competition on markets.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, we introduce a novel high-level visual content descriptor which is devised for performing semantic-based image classification and retrieval. The work can be treated as an attempt to bridge the so called “semantic gap”. The proposed image feature vector model is fundamentally underpinned by the image labelling framework, called Collaterally Confirmed Labelling (CCL), which incorporates the collateral knowledge extracted from the collateral texts of the images with the state-of-the-art low-level image processing and visual feature extraction techniques for automatically assigning linguistic keywords to image regions. Two different high-level image feature vector models are developed based on the CCL labelling of results for the purposes of image data clustering and retrieval respectively. A subset of the Corel image collection has been used for evaluating our proposed method. The experimental results to-date already indicates that our proposed semantic-based visual content descriptors outperform both traditional visual and textual image feature models.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Facilitating the visual exploration of scientific data has received increasing attention in the past decade or so. Especially in life science related application areas the amount of available data has grown at a breath taking pace. In this paper we describe an approach that allows for visual inspection of large collections of molecular compounds. In contrast to classical visualizations of such spaces we incorporate a specific focus of analysis, for example the outcome of a biological experiment such as high throughout screening results. The presented method uses this experimental data to select molecular fragments of the underlying molecules that have interesting properties and uses the resulting space to generate a two dimensional map based on a singular value decomposition algorithm and a self organizing map. Experiments on real datasets show that the resulting visual landscape groups molecules of similar chemical properties in densely connected regions.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Visual exploration of scientific data in life science area is a growing research field due to the large amount of available data. The Kohonen’s Self Organizing Map (SOM) is a widely used tool for visualization of multidimensional data. In this paper we present a fast learning algorithm for SOMs that uses a simulated annealing method to adapt the learning parameters. The algorithm has been adopted in a data analysis framework for the generation of similarity maps. Such maps provide an effective tool for the visual exploration of large and multi-dimensional input spaces. The approach has been applied to data generated during the High Throughput Screening of molecular compounds; the generated maps allow a visual exploration of molecules with similar topological properties. The experimental analysis on real world data from the National Cancer Institute shows the speed up of the proposed SOM training process in comparison to a traditional approach. The resulting visual landscape groups molecules with similar chemical properties in densely connected regions.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper, we present a distributed computing framework for problems characterized by a highly irregular search tree, whereby no reliable workload prediction is available. The framework is based on a peer-to-peer computing environment and dynamic load balancing. The system allows for dynamic resource aggregation, does not depend on any specific meta-computing middleware and is suitable for large-scale, multi-domain, heterogeneous environments, such as computational Grids. Dynamic load balancing policies based on global statistics are known to provide optimal load balancing performance, while randomized techniques provide high scalability. The proposed method combines both advantages and adopts distributed job-pools and a randomized polling technique. The framework has been successfully adopted in a parallel search algorithm for subgraph mining and evaluated on a molecular compounds dataset. The parallel application has shown good calability and close-to linear speedup in a distributed network of workstations.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Empirical orthogonal functions (EOFs) are widely used in climate research to identify dominant patterns of variability and to reduce the dimensionality of climate data. EOFs, however, can be difficult to interpret. Rotated empirical orthogonal functions (REOFs) have been proposed as more physical entities with simpler patterns than EOFs. This study presents a new approach for finding climate patterns with simple structures that overcomes the problems encountered with rotation. The method achieves simplicity of the patterns by using the main properties of EOFs and REOFs simultaneously. Orthogonal patterns that maximise variance subject to a constraint that induces a form of simplicity are found. The simplified empirical orthogonal function (SEOF) patterns, being more 'local'. are constrained to have zero loadings outside the main centre of action. The method is applied to winter Northern Hemisphere (NH) monthly mean sea level pressure (SLP) reanalyses over the period 1948-2000. The 'simplified' leading patterns of variability are identified and compared to the leading patterns obtained from EOFs and REOFs. Copyright (C) 2005 Royal Meteorological Society.