813 results for scene representation


Relevance:

30.00%

Publisher:

Abstract:

Holistic representations of natural scenes are an effective and powerful source of information for semantic classification and analysis of arbitrary images. Recently, the frequency domain has been successfully exploited to holistically encode the content of natural scenes in order to obtain a robust representation for scene classification. In this paper, we present a new approach to naturalness classification of scenes using the frequency domain. The proposed method is based on the ordering of the Discrete Fourier Power Spectra. Features extracted from this ordering are shown to be sufficient to build a robust holistic representation for Natural vs. Artificial scene classification. Experiments show that the proposed frequency domain method matches the accuracy of other state-of-the-art solutions. © 2008 Springer Berlin Heidelberg.
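As a rough illustration of what an ordering of a discrete Fourier power spectrum could look like, the sketch below computes the 2D power spectrum of a tiny grayscale image and returns the strongest non-DC frequency bins, strongest first. The abstract does not specify which features are extracted from the ordering, so the function names (`power_spectrum`, `strongest_bins`) and the choice of "top-k bin indices" as the feature are assumptions made purely for illustration.

```python
import cmath

def power_spectrum(img):
    """Return |F(u, v)|^2 for a small grayscale image given as a list of rows.

    Naive O(N^4) DFT, adequate only for tiny illustrative images.
    """
    h, w = len(img), len(img[0])
    spec = [[0.0] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2 * cmath.pi * (u * y / h + v * x / w)
                    acc += img[y][x] * cmath.exp(1j * angle)
            spec[u][v] = abs(acc) ** 2
    return spec

def strongest_bins(img, k=4):
    """Hypothetical ordering feature: the k strongest non-DC bins (u, v)."""
    spec = power_spectrum(img)
    bins = [
        (spec[u][v], (u, v))
        for u in range(len(spec))
        for v in range(len(spec[0]))
        if (u, v) != (0, 0)  # skip the DC term (mean intensity)
    ]
    bins.sort(key=lambda b: -b[0])
    return [idx for _, idx in bins[:k]]
```

For a 4×4 image of vertical stripes with period 2, the strongest non-DC bin is `(0, 2)`; for horizontal stripes it is `(2, 0)`. Such regular, axis-aligned energy is the kind of cue one might expect to differ between artificial and natural scenes, although the actual classifier in the paper may use the ordering quite differently.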

Relevance:

30.00%

Publisher:

Abstract:

Multiple sound sources often contain harmonics that overlap and may be degraded by environmental noise. The auditory system is capable of teasing apart these sources into distinct mental objects, or streams. Such an "auditory scene analysis" enables the brain to solve the cocktail party problem. A neural network model of auditory scene analysis, called the ARTSTREAM model, is presented to propose how the brain accomplishes this feat. The model clarifies how the frequency components that correspond to a given acoustic source may be coherently grouped together into distinct streams based on pitch and spatial cues. The model also clarifies how multiple streams may be distinguished and separated by the brain. Streams are formed as spectral-pitch resonances that emerge through feedback interactions between the frequency-specific spectral representation of a sound source and its pitch. First, the model transforms a sound into a spatial pattern of frequency-specific activation across a spectral stream layer. The sound has multiple parallel representations at this layer. A sound's spectral representation activates a bottom-up filter that is sensitive to harmonics of the sound's pitch. The filter activates a pitch category which, in turn, activates a top-down expectation that allows one voice or instrument to be tracked through a noisy multiple-source environment. Spectral components are suppressed if they do not match harmonics of the top-down expectation that is read out by the selected pitch, thereby allowing another stream to capture these components, as in the "old-plus-new heuristic" of Bregman. Multiple simultaneously occurring spectral-pitch resonances can thereby emerge. These resonance and matching mechanisms are specialized versions of Adaptive Resonance Theory, or ART, which clarifies how pitch representations can self-organize during learning of harmonic bottom-up filters and top-down expectations.
The model also clarifies how spatial location cues can help to disambiguate two sources with similar spectral cues. Data are simulated from psychophysical grouping experiments, such as how a tone sweeping upwards in frequency creates a bounce percept by grouping with a downward sweeping tone due to proximity in frequency, even if noise replaces the tones at their intersection point. Illusory auditory percepts are also simulated, such as the auditory continuity illusion of a tone continuing through a noise burst even if the tone is not present during the noise, and the scale illusion of Deutsch whereby downward and upward scales presented alternately to the two ears are regrouped based on frequency proximity, leading to a bounce percept. Since related sorts of resonances have been used to quantitatively simulate psychophysical data about speech perception, the model strengthens the hypothesis that ART-like mechanisms are used at multiple levels of the auditory system. Proposals for developing the model to explain more complex streaming data are also provided.

Relevance:

30.00%

Publisher:

Abstract:

This study develops a neuromorphic model of human lightness perception that is inspired by how the mammalian visual system is designed for this function. It is known that biological visual representations can adapt to a billion-fold change in luminance. How such a system determines absolute lightness under varying illumination conditions to generate a consistent interpretation of surface lightness remains an unsolved problem. Such a process, called "anchoring" of lightness, has properties including articulation, insulation, configuration, and area effects. The model quantitatively simulates such psychophysical lightness data, as well as other data such as discounting the illuminant, the double brilliant illusion, and lightness constancy and contrast effects. The model retina embodies gain control at retinal photoreceptors, and spatial contrast adaptation at the negative feedback circuit between mechanisms that model the inner segment of photoreceptors and interacting horizontal cells. The model can thereby adjust its sensitivity to input intensities ranging from dim moonlight to dazzling sunlight. A new anchoring mechanism, called the Blurred-Highest-Luminance-As-White (BHLAW) rule, helps simulate how surface lightness becomes sensitive to the spatial scale of objects in a scene. The model is also able to process natural color images under variable lighting conditions, and is compared with the popular RETINEX model.

Relevance:

30.00%

Publisher:

Abstract:

This article, which is part of a more detailed piece of work, highlights the use of the portrait on the film posters of the first Spanish poster artists before the star system was introduced in Spain. To this end, it traces the evolution in the representation of characters on film posters from the second decade of the twentieth century to the beginning of the 1930s, a period in which the artistic and advertising avant-gardes profoundly influenced the work of Spanish poster artists. In the late 1920s, posters moved from the simple inclusion of a scene based on a picture from the film to a chromatic and realistic representation of the star's face. These were the years when the influence of the major North American studios began to show in Spain. Nevertheless, the article highlights these artists' technical and compositional freedom and their influence on subsequent poster artists, many of whom would integrate portraits and settings on their posters, following the guidelines of the major studios or of the independent ones, without abandoning their own personal way of painting the film stars' faces.

Relevance:

30.00%

Publisher:

Abstract:

This paper describes a system for the computer understanding of English. The system answers questions, executes commands, and accepts information in normal English dialog. It uses semantic information and context to understand discourse and to disambiguate sentences. It combines a complete syntactic analysis of each sentence with a "heuristic understander" which uses different kinds of information about a sentence, other parts of the discourse, and general information about the world in deciding what the sentence means. It is based on the belief that a computer cannot deal reasonably with language unless it can "understand" the subject it is discussing. The program is given a detailed model of the knowledge needed by a simple robot having only a hand and an eye. We can give it instructions to manipulate toy objects, interrogate it about the scene, and give it information it will use in deduction. In addition to knowing the properties of toy objects, the program has a simple model of its own mentality. It can remember and discuss its plans and actions as well as carry them out. It enters into a dialog with a person, responding to English sentences with actions and English replies, and asking for clarification when its heuristic programs cannot understand a sentence through use of context and physical knowledge.

Relevance:

30.00%

Publisher:

Abstract:

We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail, we are given a set of labeled images of scenes (for example, coast, forest, city, river, etc.), and our objective is to classify a new image into one of these categories. Our approach consists of first discovering latent "topics" using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature here applied to a bag of visual words representation for each image, and subsequently training a multiway classifier on the topic distribution vector for each image. We compare this approach to that of representing each image by a bag of visual words vector directly and training a multiway classifier on these vectors. To this end, we introduce a novel vocabulary using dense color SIFT descriptors and then investigate the classification performance under changes in the size of the visual vocabulary, the number of latent topics learned, and the type of discriminative classifier used (k-nearest neighbor or SVM). We achieve superior classification performance to recent publications that have used a bag of visual words representation, in all cases using the authors' own data sets and testing protocols. We also investigate the gain in adding spatial information. We show applications to image retrieval with relevance feedback and to scene classification in videos.
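The topic-discovery stage described above can be sketched as a plain EM fit of pLSA on a matrix of visual-word counts. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `plsa`, the random initialisation, and the fixed iteration count are hypothetical choices, and the visual vocabulary (dense color SIFT plus quantisation) and the downstream k-NN or SVM classifier are omitted.

```python
import random

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit pLSA by EM.

    counts[d][w] is the number of occurrences of visual word w in image d.
    Returns (p_z_d, p_w_z): P(topic | image) per image and
    P(word | topic) per topic.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        s = sum(row)
        # Fall back to a uniform distribution if a row received no mass.
        return [x / s for x in row] if s > 0 else [1.0 / len(row)] * len(row)

    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]

    for _ in range(n_iter):
        new_zd = [[0.0] * n_topics for _ in range(n_docs)]
        new_wz = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                n_dw = counts[d][w]
                if n_dw == 0:
                    continue
                # E-step: posterior P(z | d, w), up to normalisation.
                post = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                total = sum(post)
                if total == 0:
                    continue
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    r = n_dw * post[z] / total
                    new_zd[d][z] += r
                    new_wz[z][w] += r
        # M-step: renormalise expected counts into distributions.
        p_z_d = [normalize(row) for row in new_zd]
        p_w_z = [normalize(row) for row in new_wz]
    return p_z_d, p_w_z
```

Each row of `p_z_d` is the low-dimensional topic distribution vector that would be passed to the multiway classifier in place of the raw bag of visual words vector.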

Relevance:

30.00%

Publisher:

Abstract:

The τ–ω model of microwave emission from soil and vegetation layers is widely used to estimate soil moisture content from passive microwave observations. Its application to prospective satellite-based observations aggregating several thousand square kilometres requires understanding of the effects of scene heterogeneity. The effects of heterogeneity in soil surface roughness, soil moisture, water area and vegetation density on the retrieval of soil moisture from simulated single- and multi-angle observing systems were tested. Uncertainty in water area proved the most serious problem for both systems, causing errors of a few percent in soil moisture retrieval. Single-angle retrieval was largely unaffected by the other factors studied here. In multiple-angle retrievals, errors of around one percent arose from heterogeneity in either soil roughness or soil moisture. Errors of a few percent were caused by vegetation heterogeneity. A simple extension of the model vegetation representation was shown to reduce this error substantially for scenes containing a range of vegetation types.
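To make the heterogeneity question concrete, the sketch below evaluates the commonly used zeroth-order form of the τ–ω model, in which the vegetation transmissivity is exp(−τ / cos θ), and aggregates a scene as the area-weighted mean of its patches. The abstract does not spell out the exact formulation, the soil reflectivity model, or the retrieval procedure, so the function names and parameter values here are assumptions for illustration only.

```python
import math

def brightness_temperature(r_soil, tau, omega, theta_deg, t_soil, t_canopy):
    """Zeroth-order tau-omega brightness temperature (kelvin), one polarisation.

    r_soil: soil reflectivity; tau: vegetation optical depth;
    omega: single-scattering albedo; theta_deg: incidence angle in degrees.
    """
    gamma = math.exp(-tau / math.cos(math.radians(theta_deg)))  # transmissivity
    soil_term = (1.0 - r_soil) * gamma * t_soil
    canopy_term = (1.0 - omega) * (1.0 - gamma) * (1.0 + r_soil * gamma) * t_canopy
    return soil_term + canopy_term

def scene_brightness(patches):
    """Area-weighted brightness temperature of a heterogeneous scene.

    patches: list of (area_fraction, params) pairs, where params is a dict of
    keyword arguments for brightness_temperature.
    """
    return sum(frac * brightness_temperature(**params) for frac, params in patches)

# Hypothetical scene: half bare soil, half moderately vegetated.
bare = dict(r_soil=0.2, tau=0.0, omega=0.05, theta_deg=40.0,
            t_soil=290.0, t_canopy=290.0)
vegetated = dict(bare, tau=0.3)
mixed = scene_brightness([(0.5, bare), (0.5, vegetated)])
uniform = brightness_temperature(**dict(bare, tau=0.15))  # scene-mean tau
```

Because the model is nonlinear in τ, `mixed` and `uniform` generally differ; inverting the aggregated signal with a single homogeneous parameter set is one way that scene heterogeneity can bias a retrieval.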

Relevance:

30.00%

Publisher:

Abstract:

During the 1870s and 1880s, several British women writers traveled by transcontinental railroad across the American West via Salt Lake City, Utah, the capital of the Church of Jesus Christ of Latter-day Saints, or Mormons. These women subsequently wrote books about their travels for a home audience with a taste for adventures in the American West, and particularly for accounts of Mormon plural marriage, which was sanctioned by the Church before 1890. "The plight of the Mormon woman," a prominent social reform and literary theme of the period, situated Mormon women at the center of popular representations of Utah during the second half of the nineteenth century. "The Mormon question" thus lends itself to an analysis of how a stereotyped subaltern group was represented by elite British travelers. These residents of western American territories, however, differed in important respects from the typical subaltern subjects discussed by Victorian travelers. These white, upwardly mobile, and articulate Mormon plural wives attempted to influence observers' representations of them through a variety of narrative strategies. Both British women travel writers and Mormon women wrote from the margins of power and credibility, and as interpreters of the Mormon scene were concerned to establish their representational authority.

Relevance:

30.00%

Publisher:

Abstract:

Self-organising neural models have the ability to provide a good representation of the input space. In particular, the Growing Neural Gas (GNG) is a suitable model because of its flexibility, rapid adaptation and excellent quality of representation. However, this type of learning is time-consuming, especially for high-dimensional input data. Since real applications often work under time constraints, it is necessary to adapt the learning process in order to complete it in a predefined time. This paper proposes a Graphics Processing Unit (GPU) parallel implementation of the GNG with Compute Unified Device Architecture (CUDA). In contrast to existing algorithms, the proposed GPU implementation accelerates the learning process while maintaining a good quality of representation. Comparative experiments using iterative, parallel and hybrid implementations are carried out to demonstrate the effectiveness of the CUDA implementation. The results show that GNG learning with the proposed implementation achieves a speed-up of 6× compared with the single-threaded CPU implementation. The GPU implementation has also been applied to a real application with time constraints, the acceleration of 3D scene reconstruction for egomotion, in order to validate the proposal.
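For readers unfamiliar with GNG, the following is a minimal sequential sketch of one adaptation step in the style of the standard algorithm: find the two units nearest to the input, accumulate the winner's error, move the winner and its topological neighbours toward the input, and refresh or prune edges. It is not the paper's CUDA implementation; the function name `gng_adapt`, the data layout, and the parameter defaults are assumptions, and periodic node insertion and error decay are left out for brevity.

```python
def gng_adapt(nodes, edges, errors, x, eps_b=0.2, eps_n=0.006, a_max=50):
    """One GNG adaptation step for input vector x (mutates its arguments).

    nodes:  list of weight vectors (lists of floats)
    edges:  dict mapping frozenset({i, j}) -> edge age
    errors: list of accumulated squared errors, one per node
    Returns the indices (winner, runner_up).
    """
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    # Nearest and second-nearest units. Each distance is independent of the
    # others, which makes this search a natural candidate for data-parallel
    # execution.
    order = sorted(range(len(nodes)), key=lambda i: dist2(nodes[i], x))
    s1, s2 = order[0], order[1]

    errors[s1] += dist2(nodes[s1], x)

    # Age the winner's edges and pull its topological neighbours toward x.
    for e in edges:
        if s1 in e:
            edges[e] += 1
            (n,) = e - {s1}
            nodes[n] = [w + eps_n * (xi - w) for w, xi in zip(nodes[n], x)]

    # Move the winner itself.
    nodes[s1] = [w + eps_b * (xi - w) for w, xi in zip(nodes[s1], x)]

    # Create or refresh the edge between winner and runner-up.
    edges[frozenset((s1, s2))] = 0

    # Drop edges that have become too old.
    for e in [e for e, age in edges.items() if age > a_max]:
        del edges[e]

    return s1, s2
```

Calling `gng_adapt` once per input sample, and adding a node near the highest-error unit every fixed number of samples, gives the usual sequential training loop against which a parallel version would be compared.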

Relevance:

30.00%

Publisher:

Abstract:

This piece of art is a flipbook, analogous to the ones children play with as they make cartoon balls bounce with the quick flipping of pages between their thumb and index finger. However, instead of a playful scene, this flipbook is a commentary on Albanian Sworn Virgins. These are women from Northern Albania who, in their youth, swear to celibacy in order to gain the societal power that is exclusive to men in their culture. This flipbook demonstrates this cultural male-to-female shift and comments on its inability to ever be fully realized. This commentary is inspired by the words of Albanian Sworn Virgins in Elvira Dones’ documentary, Sworn Virgins, who feel betrayed by their biological need to menstruate and who view their reproductive system as a permanent obstacle in completing their societal shift. Just as a child’s flipbook tells a story, this flipbook illustrates the Albanian Sworn Virgins’ forever-unfinished transformation.

Relevance:

30.00%

Publisher:

Abstract:

Many object recognition techniques perform some flavour of point pattern matching between a model and a scene. Such points are usually selected through a feature detection algorithm that is robust to a class of image transformations, and a suitable descriptor is computed over them in order to obtain a reliable matching. Moreover, some approaches take an additional step by casting the correspondence problem into a matching between graphs defined over feature points. The motivation is that the relational model would add more discriminative power; however, the overall effectiveness strongly depends on the ability to build a graph that is stable with respect to changes in both the object appearance and the spatial distribution of interest points. In fact, widely used graph-based representations have been shown to suffer from some limitations, especially with respect to changes in the Euclidean organization of the feature points. In this paper we introduce a technique to build relational structures over corner points that does not depend on the spatial distribution of the features. © 2012 ICPR Org Committee.

Relevance:

30.00%

Publisher:

Abstract:

The current generation of computers provides performance high enough to build computationally expensive computer vision applications for mobile robotics. Building a map of the environment is a common task for a robot and is essential for allowing robots to move through that environment. Traditionally, mobile robots have used a combination of several sensors based on different technologies. Lasers, sonars and contact sensors have typically been used in mobile robotic architectures; however, color cameras are an important sensor because we want robots to use the same information that humans use to sense and move through different environments. Color cameras are cheap and flexible, but a lot of work needs to be done to give robots enough visual understanding of scenes. Computer vision algorithms are computationally complex, but robots now have access to a variety of powerful architectures that can be used for mobile robotics purposes. The advent of low-cost RGB-D sensors such as the Microsoft Kinect, which provide 3D colored point clouds at high frame rates, has made computer vision even more relevant in the mobile robotics field. The combination of visual and 3D data allows systems to use both computer vision and 3D processing and therefore to be aware of more details of the surrounding environment. The research described in this thesis was motivated by the need for scene mapping. Being aware of the surrounding environment is a key feature in many mobile robotics applications, from simple robotic navigation to complex surveillance applications. In addition, the acquisition of a 3D model of a scene is useful in many areas, such as video game scene modeling, where well-known places are reconstructed and added to game systems, or advertising, where, once a 3D model of a room has been obtained, the system can add pieces of furniture using augmented reality techniques.
In this thesis we perform an experimental study of state-of-the-art registration methods to find which one best fits our scene mapping purposes. Different methods are tested and analyzed on scenes with different distributions of visual and geometric appearance. In addition, this thesis proposes two methods for 3D data compression and representation of 3D maps. Our 3D representation proposal is based on the Growing Neural Gas (GNG) method. This type of Self-Organizing Map (SOM) has been successfully used for clustering, pattern recognition and topology representation of various kinds of data. Until now, Self-Organizing Maps have been primarily computed offline, and their application to 3D data has mainly focused on noise-free models without considering time constraints. Self-organising neural models have the ability to provide a good representation of the input space. In particular, the Growing Neural Gas (GNG) is a suitable model because of its flexibility, rapid adaptation and excellent quality of representation. However, this type of learning is time-consuming, especially for high-dimensional input data. Since real applications often work under time constraints, it is necessary to adapt the learning process in order to complete it in a predefined time. This thesis proposes a hardware implementation that leverages the computing power of modern GPUs, taking advantage of a new paradigm known as General-Purpose Computing on Graphics Processing Units (GPGPU). Our proposed geometrical 3D compression method seeks to reduce the 3D information by using plane detection as the basic structure for compressing the data. This is because our target environments are man-made, and therefore many points belong to planar surfaces. Our proposed method achieves good compression results in such man-made scenarios. The detected and compressed planes can also be used in other applications, such as surface reconstruction or plane-based registration algorithms.
Finally, we have also demonstrated the benefits of GPU technologies by obtaining a high-performance implementation of a common CAD/CAM technique called Virtual Digitizing.