788 results for RST-invariant object representation
Abstract:
The question of how shape is represented is of central interest for understanding visual processing in cortex. While the tuning properties of cells in the early part of the ventral visual stream, thought to be responsible for object recognition in the primate, are comparatively well understood, several different theories have been proposed regarding tuning in higher visual areas, such as V4. We used the model of object recognition in cortex presented by Riesenhuber and Poggio (1999), in which more complex shape tuning in higher layers results from combining afferent inputs tuned to simpler features, and compared the tuning properties of model units in intermediate layers to those of V4 neurons from the literature. In particular, we investigated shape representation in visual areas V1 and V4 using oriented bars and various types of gratings (polar, hyperbolic, and Cartesian), as used in several physiology experiments. Our computational model was able to reproduce several physiological findings, such as the broadening distribution of orientation bandwidths and the emergence of a bias toward non-Cartesian stimuli. Interestingly, the simulation results suggest that some V4 neurons receive input from afferents with spatially separated receptive fields, leading to experimentally testable predictions. However, the simulations also show that the stimulus set of Cartesian and non-Cartesian gratings is not sufficiently complex to probe shape tuning in higher areas, necessitating the use of more complex stimulus sets.
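The hierarchical principle described in this abstract, complex tuning built by combining afferents tuned to simpler features, can be illustrated with a minimal sketch in the spirit of the Riesenhuber and Poggio model: alternating template-matching ("S") layers and max-pooling ("C") layers. This is an illustrative toy, not the authors' actual model; the two hand-coded filters and the tiny image are assumptions for demonstration only.

```python
import numpy as np

def s_layer(image, filters):
    """Template-matching layer: respond to simple oriented features.
    Each filter is correlated with every local patch (valid positions)."""
    h, w = image.shape
    fh, fw = filters[0].shape
    out = np.zeros((len(filters), h - fh + 1, w - fw + 1))
    for k, f in enumerate(filters):
        for i in range(out.shape[1]):
            for j in range(out.shape[2]):
                out[k, i, j] = np.sum(image[i:i + fh, j:j + fw] * f)
    return out

def c_layer(s_maps, pool=2):
    """Pooling layer: MAX over local spatial neighbourhoods, which
    confers position tolerance while preserving feature selectivity."""
    k, h, w = s_maps.shape
    out = np.zeros((k, h // pool, w // pool))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            out[:, i, j] = s_maps[:, i * pool:(i + 1) * pool,
                                     j * pool:(j + 1) * pool].max(axis=(1, 2))
    return out

# Two toy oriented filters: horizontal-edge and vertical-edge detectors.
filters = [np.array([[1., 1.], [-1., -1.]]),
           np.array([[1., -1.], [1., -1.]])]
image = np.zeros((9, 9))
image[4, :] = 1.0                    # a horizontal bar stimulus
c1 = c_layer(s_layer(image, filters))
```

As expected, the horizontal-edge channel responds strongly to the horizontal bar while the vertical-edge channel stays silent; deeper layers of the full model combine such pooled responses into more complex shape tuning.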
Abstract:
In this paper we present a component-based person detection system that is capable of detecting frontal, rear, and near-side views of people, as well as partially occluded people, in cluttered scenes. The framework described here for people is easily applied to other objects as well. The motivation for developing a component-based approach is twofold: first, to improve the performance of person detection systems on frontal and rear views of people, and second, to develop a framework that directly addresses the problem of detecting people who are partially occluded or whose body parts blend in with the background. Data classification is handled by several support vector machine classifiers arranged in two layers, an architecture known as an Adaptive Combination of Classifiers (ACC). The system performs very well and can detect people even when not all components of a person are found. Its performance is significantly better than that of a full-body person detector designed along similar lines, suggesting that the improvement is due to the component-based approach and the ACC classification structure.
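The two-layer idea can be sketched as follows: first-layer classifiers score individual body components, and a second-layer classifier combines those scores into a final decision, so a person can still be detected when one component is weak or occluded. In the actual system both layers are trained SVMs; the fixed linear weights and toy feature vectors below are illustrative stand-ins, not the paper's parameters.

```python
import numpy as np

# Layer 1: one linear scorer per body component. In the real system
# these are trained component SVMs; the weights here are assumptions.
component_w = {
    "head": np.array([1.0, 0.0, 0.0]),
    "legs": np.array([0.0, 1.0, 0.0]),
    "arms": np.array([0.0, 0.0, 1.0]),
}

def component_scores(features):
    """Layer 1: each component classifier scores its own feature vector."""
    return np.array([w @ features[name] for name, w in component_w.items()])

def combine(scores, w2, b2=0.0):
    """Layer 2: a linear combination classifier over component scores.
    The combined margin can stay positive even if one component score
    is low, e.g. because that body part is occluded."""
    return float(w2 @ scores + b2)

features = {"head": np.array([0.9, 0.0, 0.0]),
            "legs": np.array([0.0, 0.8, 0.0]),
            "arms": np.array([0.0, 0.0, -0.2])}   # arms occluded
scores = component_scores(features)
decision = combine(scores, w2=np.ones(3), b2=-1.0)
print(decision > 0)   # person detected despite weak arm evidence
```

The design choice this illustrates: pushing the occlusion problem into the combination layer means no single missing component can veto a detection.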
Abstract:
We present a new method for reliable matching between different images. The method exploits a projective invariant property of concentric circles and the corresponding projected ellipses to find complete region correspondences centered on interest points. It matches interest points allowing for a full perspective transformation and exploits all the available luminance information in the regions. Experiments have been conducted on many different data sets to compare our approach to SIFT local descriptors. The results show that the new method offers increased robustness to partial visibility, object rotation in depth, and viewpoint angle change.
Abstract:
The growth of databases containing ever more difficult images and an ever larger number of categories is driving the development of image representations that remain discriminative across multiple classes, and of algorithms that are efficient in both learning and classification. This thesis explores the problem of classifying images by the object they contain when a large number of categories is involved. We first investigate how a hybrid system, combining a generative model and a discriminative model, can benefit image classification when the level of human annotation is minimal. For this task we introduce a new vocabulary based on a dense representation of color-SIFT descriptors, and then study how the different parameters affect the final classification. We then propose a method for incorporating spatial information into the hybrid system, showing that context information is of great help for image classification. Next, we introduce a new shape descriptor that represents the image by its local and spatial shape, together with a kernel that incorporates this spatial information in pyramidal form. Shape is represented by a compact vector, yielding a descriptor well suited to kernel-based learning algorithms. Our experiments show that this shape information achieves results comparable to, and sometimes better than, appearance-based descriptors. We also investigate how different features can be combined for image classification, and show that the proposed shape descriptor together with an appearance descriptor substantially improves classification. Finally, we describe an algorithm that detects regions of interest automatically during training and classification.
This provides a way to suppress the image background and adds invariance to the position of objects within images. We show that computing shape and appearance over this region of interest, and using random forest classifiers, improves both classification accuracy and computation time. We compare our results with those in the literature, using the same databases and the same training and classification protocols as the original authors, and show that all the innovations introduced increase the final classification performance.
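The pyramidal spatial encoding mentioned in this abstract can be illustrated with the standard spatial-pyramid construction: histograms of visual-word labels are computed over an increasingly fine grid of cells and concatenated. This is a generic sketch of that family of representations, not the thesis's specific descriptor; the toy label map and vocabulary size are assumptions.

```python
import numpy as np

def spatial_pyramid(labels, n_words, levels=2):
    """Concatenate visual-word histograms over an increasingly fine grid.
    Level l splits the image into 2^l x 2^l cells: coarse levels capture
    global layout, finer levels add local spatial detail."""
    h, w = labels.shape
    feats = []
    for l in range(levels + 1):
        cells = 2 ** l
        for i in range(cells):
            for j in range(cells):
                cell = labels[i * h // cells:(i + 1) * h // cells,
                              j * w // cells:(j + 1) * w // cells]
                hist = np.bincount(cell.ravel(), minlength=n_words)
                feats.append(hist / max(cell.size, 1))  # per-cell normalization
        # A level-dependent weight, as in pyramid-match kernels, could be
        # applied here; it is omitted for brevity.
    return np.concatenate(feats)

# Toy 16x16 map of visual-word labels drawn from a 4-word vocabulary.
labels = np.random.default_rng(1).integers(0, 4, size=(16, 16))
f = spatial_pyramid(labels, n_words=4, levels=2)
# Feature length: (1 + 4 + 16) cells x 4 words = 84 dimensions.
```

The resulting compact vector can be fed directly to a kernel-based learner, which is the property the abstract emphasizes for its shape descriptor.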
Abstract:
Eye tracking has become a prevalent technique for evaluating user interaction and behaviour with study objects in defined contexts. Common eye-tracking data representation techniques offer valuable input regarding user interaction and eye-gaze behaviour, namely through the measurement of fixations and saccades. However, these and other techniques may be insufficient for representing the data acquired in specific studies, namely because of the complexity of the study object being analysed. This paper contributes a summary of the data representation and information visualization techniques used in data analysis within different contexts (advertising, websites, television news, and video games). Additionally, several methodological approaches are presented, resulting from studies completed and under way at CETAC.MEDIA - Communication Sciences and Technologies Research Centre. In the studies described, traditional data representation techniques proved insufficient; new forms of representing data, based on common techniques, therefore had to be developed with the objective of improving communication and information strategies. For each of these studies, a brief summary of the contribution to its respective area is presented, along with the data representation techniques used and some of the results obtained.
Abstract:
Generalizing the notion of an eigenvector, invariant subspaces are frequently used in the context of linear eigenvalue problems, leading to conceptually elegant and numerically stable formulations in applications that require the computation of several eigenvalues and/or eigenvectors. Similar benefits can be expected for polynomial eigenvalue problems, for which the concept of an invariant subspace needs to be replaced by the concept of an invariant pair. Little has been known so far about numerical aspects of such invariant pairs. The aim of this paper is to fill this gap. The behavior of invariant pairs under perturbations of the matrix polynomial is studied and a first-order perturbation expansion is given. From a computational point of view, we investigate how to best extract invariant pairs from a linearization of the matrix polynomial. Moreover, we describe efficient refinement procedures directly based on the polynomial formulation. Numerical experiments with matrix polynomials from a number of applications demonstrate the effectiveness of our extraction and refinement procedures.
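Concretely, for a matrix polynomial P(λ) = Σᵢ λⁱ Aᵢ, an invariant pair (X, S) generalizes an eigenpair (x, λ) and satisfies Σᵢ Aᵢ X Sⁱ = 0. The sketch below, a numerical illustration under the assumption that A₂ is invertible and not the paper's refinement procedure, extracts an invariant pair of a quadratic polynomial from its companion linearization and checks the defining residual.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 4, 2                      # problem size; number of eigenpairs kept
A0, A1, A2 = (rng.standard_normal((n, n)) for _ in range(3))

# Companion linearization of P(lam) = A0 + lam*A1 + lam^2*A2, reduced to
# a standard eigenproblem (assumes A2 is invertible):
#   C = [[0, I], [-A2^{-1} A0, -A2^{-1} A1]]
C = np.block([[np.zeros((n, n)), np.eye(n)],
              [-np.linalg.solve(A2, A0), -np.linalg.solve(A2, A1)]])
evals, evecs = np.linalg.eig(C)

# Extract an invariant pair (X, S): X is the top n-block of k selected
# eigenvectors (here: largest modulus), S the diagonal of their eigenvalues.
idx = np.argsort(-np.abs(evals))[:k]
X = evecs[:n, idx]
S = np.diag(evals[idx])

# Defining property of an invariant pair: A0 X + A1 X S + A2 X S^2 = 0.
residual = A0 @ X + A1 @ X @ S + A2 @ X @ S @ S
print(np.linalg.norm(residual))  # small: (X, S) solves P up to roundoff
```

In exact arithmetic the residual vanishes because each column of an eigenvector of C has the structure [x; λx], which unfolds to A₀x + λA₁x + λ²A₂x = 0; the paper's contribution concerns the perturbation behaviour of such pairs and how to refine them directly on the polynomial formulation.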
Abstract:
We studied how the integration of seen and felt tactile stimulation modulates somatosensory processing, and investigated whether visuotactile integration depends on temporal contiguity of stimulation, and its coherence with a pre-existing body representation. During training, participants viewed a rubber hand or a rubber object that was tapped either synchronously with stimulation of their own hand, or in an uncorrelated fashion. In a subsequent test phase, somatosensory event-related potentials (ERPs) were recorded to tactile stimulation of the left or right hand, to assess how tactile processing was affected by previous visuotactile experience during training. An enhanced somatosensory N140 component was elicited after synchronous, compared with uncorrelated, visuotactile training, irrespective of whether participants viewed a rubber hand or rubber object. This early effect of visuotactile integration on somatosensory processing is interpreted as a candidate electrophysiological correlate of the rubber hand illusion that is determined by temporal contiguity, but not by pre-existing body representations. ERP modulations were observed beyond 200 msec post-stimulus, suggesting an attentional bias induced by visuotactile training. These late modulations were absent when the stimulation of a rubber hand and the participant’s own hand was uncorrelated during training, suggesting that pre-existing body representations may affect later stages of tactile processing.
Abstract:
Perception is linked to action via two routes: a direct route based on affordance information in the environment and an indirect route based on semantic knowledge about objects. The present study explored the factors modulating the recruitment of the two routes, in particular the factors affecting the selection of paired objects. In Experiment 1, we presented real objects among semantically related or unrelated distracters. Participants had to select two objects that can interact. The presence of distracters affected selection times, but the semantic relation between the objects and the distracters did not. Furthermore, participants first selected the active object (e.g. teaspoon) with their right hand, followed by the passive object (e.g. mug), often with their left hand. In Experiment 2, we presented pictures of the same objects with no hand grip, a congruent hand grip, or an incongruent hand grip. Participants had to decide whether the two objects can interact. Action decisions were faster when the presentation of the active object preceded that of the passive object, and when the grip was congruent. Interestingly, participants were slower when the objects were semantically but not functionally related; this effect increased with congruently gripped objects. Our data show that action decisions in the presence of strong affordance cues (real objects, pictures of congruently gripped objects) rely on sensory-motor representations, supporting a direct route from perception to action that bypasses semantic knowledge. However, in the case of weak affordance cues (pictures), semantic information interfered with action decisions, indicating that semantic knowledge impacts action decisions. The data support the dual-route account of perception-to-action.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
We set up a new calculational framework for the Yang-Mills vacuum transition amplitude in the Schrödinger representation. After integrating out hard-mode contributions perturbatively and performing a gauge-invariant gradient expansion of the ensuing soft-mode action, a manageable saddle-point expansion for the vacuum overlap can be formulated. In combination with the squeezed approximation to the vacuum wave functional this allows for an essentially analytical treatment of physical amplitudes. Moreover, it leads to the identification of dominant and gauge-invariant classes of gauge field orbits which play the role of gluonic infrared (IR) degrees of freedom. The latter emerge as a diverse set of saddle-point solutions and are represented by unitary matrix fields. We discuss their scale stability, the associated virial theorem and other general properties including topological quantum numbers and action bounds. We then find important saddle-point solutions (most of them solitons) explicitly and examine their physical impact. While some are related to tunneling solutions of the classical Yang-Mills equation, i.e. to instantons and merons, others appear to play unprecedented roles. A remarkable new class of IR degrees of freedom consists of Faddeev-Niemi type link and knot solutions, potentially related to glueballs.
Abstract:
Characteristics of speech, especially figures of speech, are used by specific communities or domains and thus reflect their identities through their choice of vocabulary. This topic should be an object of study in the context of knowledge representation, since it deals with different contexts of document production. This study explores the dimensions of the concepts of euphemism, dysphemism, and orthophemism, focusing on the latter with the goal of extracting a concept which can be included in discussions about subject analysis and indexing. Euphemism is used as an alternative to a non-preferred expression or an offensive attribution, to avoid potential offense taken by the listener or by other persons; for instance, pass away. Dysphemism, on the other hand, is used by speakers to talk about people and things that frustrate and annoy them; their choice of language indicates disapproval, and the topic is therefore denigrated, humiliated, or degraded; for instance, kick the bucket. While euphemism tries to make something sound better, dysphemism tries to make something sound worse. Orthophemism (Allan and Burridge 2006) is likewise used as an alternative to other expressions, but it is the preferred, formal, and direct expression for representing an object or a situation; for instance, die. This paper suggests that the comprehension and use of these concepts could support the following: possible contributions from linguistics and terminology to subject analysis, as demonstrated by Talamo et al. (1992); a decrease in the polysemy and ambiguity of terms used to represent certain topics of documents; and the construction and evaluation of indexing languages. The concept of orthophemism can also serve to support associative relationships in the context of subject analysis, indexing, and even information retrieval related to more specific requests.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Generic programming is likely to become a new challenge for a critical mass of developers. It is therefore crucial to refine the support for generic programming in mainstream Object-Oriented languages, both at the design and at the implementation level, and to suggest novel ways to exploit the additional expressiveness made available by genericity. This study is meant to contribute towards bringing Java genericity to a more mature stage with respect to mainstream programming practice, by increasing the effectiveness of its implementation and by revealing its full expressive power in real-world scenarios. The main contribution of the thesis is twofold. First, we propose a revised implementation of Java generics that greatly increases the expressiveness of the Java platform by adding reification support for generic types. Secondly, we show how Java genericity can be leveraged in a real-world case study in the context of multi-paradigm language integration. Several approaches have been proposed to overcome the lack of reification of generic types in the Java programming language, typically by defining new translation techniques that allow a runtime representation of generics and wildcards. Unfortunately, most approaches suffer from several problems: heterogeneous translations are known to be problematic for the reification of generic methods and wildcards, while more sophisticated techniques requiring changes to the Java runtime support reified generics through a true language extension (where clauses), so that backward compatibility is compromised.
In this thesis we develop a sophisticated type-passing technique to address the problem of reification of generic types in the Java programming language; this approach, first pioneered by the so-called EGO translator, is here turned into a full-blown solution that reifies generic types inside the Java Virtual Machine (JVM) itself, thus overcoming both the performance penalties and the compatibility issues of the original EGO translator. Java-Prolog integration: integrating Object-Oriented and declarative programming has been the subject of several research efforts and corresponding technologies. Such proposals come in two flavours: either attempting to join the two paradigms, or simply providing an interface library for accessing Prolog's declarative features from a mainstream Object-Oriented language such as Java. Both solutions have drawbacks. Hybrid languages featuring both Object-Oriented and logic traits are typically too complex, making mainstream application development a harder task; library-based integration approaches, in turn, offer no true language integration, and some “boilerplate code” has to be written to bridge the paradigm mismatch. In this thesis we develop a framework called PatJ which promotes seamless exploitation of Prolog programming in Java. A sophisticated use of generics and wildcards allows a precise mapping to be defined between Object-Oriented and declarative features. PatJ defines a hierarchy of classes in which the bidirectional semantics of Prolog terms is modelled directly at the level of the Java generic type system.
Abstract:
Chronic obstructive pulmonary disease (COPD) is an umbrella term for diseases that lead to coughing, sputum production, and dyspnea (shortness of breath) at rest or during exertion; chronic bronchitis and pulmonary emphysema are counted among them. The progression of COPD is closely linked to an increase in the wall volume of the small airways (bronchi). High-resolution computed tomography (CT) is considered the gold standard (the best and most reliable diagnostic method) for examining lung morphology. When measuring bronchi, which are approximately tubular structures, in CT images, the small size of the bronchi relative to the resolution of a clinical CT scanner poses a major problem. This thesis shows how CT images are computed from conventional X-ray projections, where the mathematical and physical sources of error in the image formation process lie, and how a CT system can be made mathematically tractable by interpreting it as a linear shift-invariant (LSI) system. Based on linear systems theory, methods for describing the resolving power of imaging techniques are derived. It is shown how the tracheobronchial tree can be robustly segmented from a CT data set and, by means of a topology-preserving 3-dimensional skeletonization algorithm, converted into a skeleton representation and subsequently into an acyclic graph. Based on linear systems theory, a new, promising integral-based method (IBM) for measuring small structures in CT images is presented. To validate the IBM results, various measurements were performed on a phantom consisting of 10 different silicone tubes.
With the help of the skeleton and graph representations, the complete segmented tracheobronchial tree can be measured in 3-dimensional space. For 8 pigs scanned twice each, good reproducibility of the IBM results was demonstrated. In a further study conducted with IBM, it was shown that the average percentage bronchial wall thickness in CT data sets of 16 smokers is significantly higher than in data sets of 15 non-smokers. IBM may also prove useful for wall-thickness measurements in other fields, or can at least serve as a source of ideas. An article describing the developed method and the study results obtained with it has been accepted for publication in the journal IEEE Transactions on Medical Imaging.
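The LSI interpretation invoked in this abstract can be summarized by the standard convolution model of imaging; this is the generic textbook formulation, not the thesis's specific derivation:

```latex
g(\mathbf{x}) = (f * h)(\mathbf{x}) = \int f(\mathbf{x}')\, h(\mathbf{x} - \mathbf{x}')\, \mathrm{d}\mathbf{x}'
```

where $f$ is the true attenuation distribution, $h$ the scanner's point spread function (PSF), and $g$ the reconstructed image. In the Fourier domain this becomes $G = F \cdot H$, and the modulation transfer function $\mathrm{MTF} = |H|$ characterizes resolving power. Structures whose size is comparable to the PSF width, such as thin bronchial walls, are strongly blurred by the convolution, which is what motivates an integral-based rather than an edge-based measurement of wall thickness.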
Abstract:
In this thesis we consider systems of finitely many particles moving on paths given by a strong Markov process and undergoing branching and reproduction at random times. The branching rate of a particle, its number of offspring and their spatial distribution are allowed to depend on the particle's position and possibly on the configuration of coexisting particles. In addition there is immigration of new particles, with the rate of immigration and the distribution of immigrants possibly depending on the configuration of pre-existing particles as well. In the first two chapters of this work, we concentrate on the case that the joint motion of particles is governed by a diffusion with interacting components. The resulting process of particle configurations was studied by E. Löcherbach (2002, 2004) and is known as a branching diffusion with immigration (BDI). Chapter 1 contains a detailed introduction of the basic model assumptions, in particular an assumption of ergodicity which guarantees that the BDI process is positive Harris recurrent with finite invariant measure on the configuration space. This object and a closely related quantity, namely the invariant occupation measure on the single-particle space, are investigated in Chapter 2 where we study the problem of the existence of Lebesgue-densities with nice regularity properties. For example, it turns out that the existence of a continuous density for the invariant measure depends on the mechanism by which newborn particles are distributed in space, namely whether branching particles reproduce at their death position or their offspring are distributed according to an absolutely continuous transition kernel. In Chapter 3, we assume that the quantities defining the model depend only on the spatial position but not on the configuration of coexisting particles. 
In this framework (which was considered by Höpfner and Löcherbach (2005) in the special case that branching particles reproduce at their death position), the particle motions are independent, and we can allow for more general Markov processes instead of diffusions. The resulting configuration process is a branching Markov process in the sense introduced by Ikeda, Nagasawa and Watanabe (1968), complemented by an immigration mechanism. Generalizing results obtained by Höpfner and Löcherbach (2005), we give sufficient conditions for ergodicity in the sense of positive recurrence of the configuration process and finiteness of the invariant occupation measure in the case of general particle motions and offspring distributions.