920 results for Visual pattern recognition
Abstract:
Voluntary control of information processing is crucial to allocate resources and prioritize the processes that are most important under a given situation; the algorithms underlying such control, however, are often not clear. We investigated possible algorithms of control for the performance of the majority function, in which participants searched for and identified one of two alternative categories (left or right pointing arrows) as composing the majority in each stimulus set. We manipulated the amount (set size of 1, 3, and 5) and content (ratio of left and right pointing arrows within a set) of the inputs to test competing hypotheses regarding mental operations for information processing. Using a novel measure based on computational load, we found that reaction time was best predicted by a grouping search algorithm as compared to alternative algorithms (i.e., exhaustive or self-terminating search). The grouping search algorithm involves sampling and resampling of the inputs before a decision is reached. These findings highlight the importance of investigating the implications of voluntary control via algorithms of mental operations.
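The two alternative algorithms named above can be contrasted with a minimal sketch. It assumes arrows are encoded as "L" or "R" and uses the number of items examined as a rough stand-in for computational load; this is not the paper's measure. The function names are hypothetical, and the grouping search is not implemented because the abstract does not specify its sampling rule.

```python
from typing import Sequence, Tuple


def exhaustive_majority(arrows: Sequence[str]) -> Tuple[str, int]:
    """Inspect every arrow, then report the majority direction and items examined."""
    left = sum(1 for a in arrows if a == "L")
    right = len(arrows) - left
    return ("L" if left > right else "R"), len(arrows)


def self_terminating_majority(arrows: Sequence[str]) -> Tuple[str, int]:
    """Stop as soon as one direction accounts for more than half of the set."""
    needed = len(arrows) // 2 + 1  # smallest count that guarantees a majority
    counts = {"L": 0, "R": 0}
    for examined, a in enumerate(arrows, start=1):
        counts[a] += 1
        if counts[a] >= needed:
            return a, examined
    raise ValueError("no majority: set size must be odd")
```

For a set such as L, L, L, R, R the exhaustive search examines all five items, whereas the self-terminating search stops after three; for a mixed ordering such as L, R, L, R, L both examine five.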
Abstract:
BACKGROUND: A key aspect of representations for object recognition and scene analysis in the ventral visual stream is the spatial frame of reference, be it a viewer-centered, object-centered, or scene-based coordinate system. Coordinate transforms from retinocentric space to other reference frames involve combining neural visual responses with extraretinal postural information. METHODOLOGY/PRINCIPAL FINDINGS: We examined whether such spatial information is available to anterior inferotemporal (AIT) neurons in the macaque monkey by measuring the effect of eye position on responses to a set of simple 2D shapes. We report, for the first time, a significant eye position effect in over 40% of recorded neurons with small gaze angle shifts from central fixation. Although eye position modulates responses, it does not change shape selectivity. CONCLUSIONS/SIGNIFICANCE: These data demonstrate that spatial information is available in AIT for the representation of objects and scenes within a non-retinocentric frame of reference. More generally, the availability of spatial information in AIT calls into question the classic dichotomy in visual processing that associates object shape processing with ventral structures such as AIT but places spatial processing in a separate anatomical stream projecting to dorsal structures.
Abstract:
We seek to determine the relationship between threshold and suprathreshold perception for position offset and stereoscopic depth perception under conditions that elevate their respective thresholds. Two threshold-elevating conditions were used: (1) increasing the interline gap and (2) dioptric blur. Although increasing the interline gap increases position (Vernier) offset and stereoscopic disparity thresholds substantially, the perception of suprathreshold position offset and stereoscopic depth remains unchanged. Perception of suprathreshold position offset also remains unchanged when the Vernier threshold is elevated by dioptric blur. We show that such normalization of suprathreshold position offset can be attributed to the topographical-map-based encoding of position. On the other hand, dioptric blur increases the stereoscopic disparity thresholds and reduces the perceived suprathreshold stereoscopic depth, which can be accounted for by a disparity-computation model in which the activities of absolute disparity encoders are multiplied by a Gaussian weighting function that is centered on the horopter. Overall, the statement "equal suprathreshold perception occurs in threshold-elevated and unelevated conditions when the stimuli are equally above their corresponding thresholds" describes the results better than the statement "suprathreshold stimuli are perceived as equal when they are equal multiples of their respective threshold values."
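The two competing statements quoted above can be written compactly. The notation is an assumption introduced here for illustration, not the authors': $s_1, s_2$ are the stimulus magnitudes in the threshold-elevated and unelevated conditions, $T_1, T_2$ the corresponding thresholds, $d$ the disparity relative to the horopter, and $\sigma$ a hypothetical width parameter for the Gaussian weighting.

```latex
% Statement 1 (better supported): equal perception when stimuli are equally above threshold
s_1 - T_1 = s_2 - T_2
% Statement 2: equal perception when stimuli are equal multiples of threshold
\frac{s_1}{T_1} = \frac{s_2}{T_2}
% Gaussian weighting centered on the horopter (d = 0), applied to disparity-encoder activity
w(d) = \exp\!\left(-\frac{d^2}{2\sigma^2}\right)
```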
Abstract:
Scientific background: Marine mammals use sound for communication, navigation and prey detection. Acoustic sensors therefore allow the detection of marine mammals, even during polar winter months, when restricted visibility prohibits visual sightings. The animals are surrounded by a permanent natural soundscape, which, in polar waters, is mainly dominated by the movement of ice. In addition to the detection of marine mammals, acoustic long-term recordings provide information on the intensity and temporal variability of characteristic natural and anthropogenic background sounds, as well as their influence on the vocalization of marine mammals. Scientific objectives: The PerenniAL Acoustic Observatory in the Antarctic Ocean (PALAOA, Hawaiian "whale") near Neumayer Station is intended to record the underwater soundscape in the vicinity of the shelf ice edge over the duration of several years. These long-term recordings will allow the acoustic repertoire of whales and seals to be studied continuously in an environment almost undisturbed by humans. The data will be analyzed to (1) register species-specific vocalizations, (2) infer the approximate number of animals inside the measuring range, (3) calculate their movements relative to the observatory, and (4) examine possible effects of the sporadic shipping traffic on the acoustic and locomotive behaviour of marine mammals. The data, which are largely free of anthropogenic noise, also provide a basis for setting up passive acoustic mitigation systems used on research vessels. Noise-free bioacoustic data thereby represent the foundation for the development of automatic pattern recognition procedures in the presence of interfering sounds, e.g. propeller noise.
Abstract:
This paper presents a robust approach for recognition of thermal face images based on decision-level fusion of 34 different region classifiers. The region classifiers concentrate on local variations. They use singular value decomposition (SVD) for feature extraction. The decisions of the region classifiers are fused by majority voting. The algorithm is tolerant of the false exclusion of thermal information caused by an inconsistent distribution of temperature statistics, which generally makes the identification process difficult. The algorithm is extensively evaluated on the UGC-JU thermal face database and the Terravic facial infrared database, and the recognition rates are found to be 95.83% and 100%, respectively. A comparative study with existing works in the literature has also been made.
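Decision-level fusion by majority voting can be illustrated with a minimal sketch. It assumes each region classifier has already emitted one identity label; the function name and the tie-breaking rule (first label seen wins) are assumptions made here, not details taken from the paper.

```python
from collections import Counter
from typing import Hashable, Sequence


def majority_vote(decisions: Sequence[Hashable]) -> Hashable:
    """Return the label chosen by the largest number of region classifiers.

    Ties go to the label encountered first (an assumption of this sketch).
    """
    if not decisions:
        raise ValueError("need at least one classifier decision")
    # most_common orders equal counts by first occurrence
    return Counter(decisions).most_common(1)[0][0]
```

With 34 region decisions of which 20 name one subject and 14 name another, the fused decision is the first subject, even though a substantial minority of regions disagreed.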
Abstract:
One of the greatest challenges for the scientific community is to give machines, in the future, the capabilities of the human visual and cognitive systems so that, for example, in video surveillance environments they can automatically provide a reliable description of what is happening in the scene. This thesis proposes a reference framework and uses it to discuss and lay out the steps needed to develop more intelligent systems capable of extracting and analysing, at different levels of abstraction and by means of independent processing modules, the information needed to understand what is happening in a broad set of scenarios of differing nature. It starts from a requirements analysis and identifies the current challenges for this type of system, which themselves constitute the objectives of this thesis, thereby contributing a knowledge-based data model that makes it possible to analyse different situations in which people and vehicles are the main actors, while leaving the door open to adaptation to other domains. The thesis also studies the different processes that can be launched internally, as well as the need to integrate feedback mechanisms at different levels that allow the system to adapt better to changes in the environment. As a result, a hierarchical reference framework is proposed that integrates perception, interpretation and learning capabilities to overcome the challenges identified in this field, and thus to develop more robust, flexible and intelligent surveillance systems capable of operating in a variety of environments. Experimental results obtained on different data samples (mainly video sequences) demonstrate the effectiveness of the proposed framework compared with others proposed in the past.
A first case study demonstrates the creation of a monitoring system for outdoor parking environments that detects vehicles and analyses free parking spaces. A second case study demonstrates the flexibility of the proposed reference framework in adapting to the requirements of a completely different surveillance environment, namely a smart home, where the automatic analysis of activities of daily living is the focus of the study. ABSTRACT One of the most ambitious objectives for the Computer Vision and Pattern Recognition research community is for machines to achieve capabilities similar to those of the human visual and cognitive system, and thus provide a trustworthy description of what is happening in the scene under surveillance. A number of well-established architectural frameworks for scenario understanding, used to develop applications working in a variety of environments, can be found in the literature. In this Thesis, a highly descriptive methodology for the development of scene understanding applications is presented. It consists of a set of formal guidelines to let machines extract and analyse, at different levels of abstraction and by means of independent processing modules that interact with each other, the information necessary to understand a broad set of different real-world surveillance scenarios. Taking into account the challenges posed by working at both low and high levels, we contribute a highly descriptive knowledge-based data model for the analysis of different situations in which people and vehicles are the main actors, leaving the door open for the development of interesting applications in diverse smart domains. Recommendations to let systems achieve high-level behaviour understanding are also provided.
Furthermore, feedback mechanisms are proposed so that a system can better understand its environment and the surrounding logical context, thereby reducing uncertainty and noise and increasing its robustness and precision in the face of low-level or high-level errors. As a result, a hierarchical cognitive reference architecture is proposed that integrates the perception, interpretation, attention and learning capabilities needed to overcome the main challenges identified in this area of research, thus allowing the development of more robust, flexible and smart surveillance systems that cope with the different requirements of a variety of environments. Once the crucial issues that should be treated explicitly in the design of this kind of system have been formulated and discussed, experimental results show the effectiveness of the proposed framework compared with others proposed in the past. Two case studies were implemented to test the capabilities of the framework. The first case study presents how the proposed framework can be used to create intelligent parking monitoring systems. The second case study demonstrates the flexibility of the system to cope with the requirements of a completely different environment, a smart home where activities of daily living are performed. Finally, general conclusions and future lines of work to further enhance the capabilities of the proposed framework are presented.
Abstract:
The primate temporal cortex has been demonstrated to play an important role in visual memory and pattern recognition. It is of particular interest to investigate whether activity-dependent modification of synaptic efficacy, a presumptive mechanism for learning and memory, is present in this cortical region. Here we address this issue by examining the induction of synaptic plasticity in surgically resected human inferior and middle temporal cortex. The results show that synaptic strength in the human temporal cortex could undergo bidirectional modifications, depending on the pattern of conditioning stimulation. High frequency stimulation (100 or 40 Hz) in layer IV induced long-term potentiation (LTP) of both intracellular excitatory postsynaptic potentials and evoked field potentials in layers II/III. The LTP induced by 100 Hz tetanus was blocked by 50-100 µM DL-2-amino-5-phosphonovaleric acid, suggesting that N-methyl-D-aspartate receptors were responsible for its induction. Long-term depression (LTD) was elicited by prolonged low frequency stimulation (1 Hz, 15 min). It was reduced, but not completely blocked, by DL-2-amino-5-phosphonovaleric acid, implying that some other mechanisms in addition to N-methyl-D-aspartate receptors were involved in LTD induction. LTD was input-specific, i.e., low frequency stimulation of one pathway produced LTD of synaptic transmission in that pathway only. Finally, the LTP and LTD could reverse each other, suggesting that they can act cooperatively to modify the functional state of the cortical network. These results suggest that LTP and LTD are possible mechanisms for the visual memory and pattern recognition functions performed in the human temporal cortex.
Abstract:
In this paper, we propose a novel method for the unsupervised clustering of graphs in the context of the constellation approach to object recognition. The method is an EM central clustering algorithm which builds prototypical graphs on the basis of fast matching with graph transformations. Our experiments, both with random graphs and in realistic situations (visual localization), show that our prototypes improve on the set median graphs and also on the prototypes derived from our previous incremental method. We also discuss how the method scales with a growing number of images.
Abstract:
We have studied the effect of inactivated microbial stimuli (Candida albicans, Candida glabrata, Saccharomyces boulardii, and Staphylococcus aureus) on the in vitro differentiation of lineage negative (Lin−) hematopoietic progenitor mouse cells. Purified Lin− progenitors were co-cultured for 7 days with the stimuli, and cell differentiation was determined by flow cytometry analysis. All the stimuli assayed caused differentiation toward the myeloid lineage. S. boulardii and particularly C. glabrata induced differentiation of Lin− cells to a lesser extent, as the major population of differentiated cells corresponded to monocytes, whereas C. albicans and S. aureus induced differentiation beyond monocytes: to monocyte-derived dendritic cells and macrophages, respectively. Interestingly, signaling through TLR2 by its pure ligand Pam3CSK4 directed differentiation of Lin− cells almost exclusively to macrophages. These data support the notion that hematopoiesis can be modulated in response to microbial stimuli in a pathogen-dependent manner, being determined by the pathogen-associated molecular patterns and the pattern-recognition receptors involved, in order to generate the populations of mature cells required to deal with the pathogen.
Abstract:
Human behaviour recognition has been, and still remains, a challenging problem that involves different areas of computational intelligence. The automated understanding of people's activities from video sequences is an open research topic to which the computer vision and pattern recognition areas have devoted considerable effort. In this paper, the problem is studied from a prediction point of view. We propose a novel method that can detect behaviour early using a small portion of the input, and can also predict behaviour from new inputs. Specifically, we propose a predictive method based on a simple representation of the trajectories of a person in the scene, which allows a high-level understanding of the global human behaviour. The representation of the trajectory is used as a descriptor of the activity of the individual. The descriptors are used as the input cue of a classification stage for pattern recognition purposes. Classifiers are trained using the trajectory representation of the complete sequence. However, partial sequences are processed to evaluate the early prediction capabilities for a specific observation time of the scene. The experiments were carried out using the three different datasets of the CAVIAR database, taking into account the behaviour of an individual. Additionally, different classic classifiers were used in the experiments in order to evaluate the robustness of the proposal. Results confirm the high accuracy of the proposal in the early recognition of people's behaviours.
Abstract:
In this work, a modified version of the elastic bunch graph matching (EBGM) algorithm for face recognition is introduced. First, faces are detected by using a fuzzy skin detector based on the RGB color space. Then, the fiducial points for the facial graph are extracted automatically by adjusting a grid of points to the result of an edge detector. After that, the position of the nodes, their relation with their neighbors and their Gabor jets are calculated in order to obtain the feature vector defining each face. A self-organizing map (SOM) framework is then presented, in which the calculation of the winning neuron and the recognition process are performed by using a similarity function that takes into account both the geometric and texture information of the facial graph. The set of experiments carried out for our SOM-EBGM method shows the accuracy of our proposal when compared with other state-of-the-art methods.
Abstract:
"Supported in part by Contract AT(11-1) 1018 with the U.S. Atomic Energy Commission and the Advanced Research Projects Agency."
Abstract:
Supported by: Contract AT (11-1)-1018 with the U.S. Atomic Energy Commission and the Advanced Research Projects Agency.
Abstract:
This paper defines the 3D reconstruction problem as the process of reconstructing a 3D scene from numerous 2D visual images of that scene. It is well known that this problem is ill-posed, and numerous constraints and assumptions are used in 3D reconstruction algorithms in order to reduce the solution space. Unfortunately, most constraints only work in a certain range of situations and often constraints are built into the most fundamental methods (e.g. Area Based Matching assumes that all the pixels in the window belong to the same object). This paper presents a novel formulation of the 3D reconstruction problem, using a voxel framework and first order logic equations, which does not contain any additional constraints or assumptions. Solving this formulation for a set of input images gives all the possible solutions for that set, rather than picking a solution that is deemed most likely. Using this formulation, this paper studies the problem of uniqueness in 3D reconstruction and how the solution space changes for different configurations of input images. It is found that it is not possible to guarantee a unique solution, no matter how many images are taken of the scene, their orientation or even how much color variation is in the scene itself. Results of using the formulation to reconstruct a few small voxel spaces are also presented. They show that the number of solutions is extremely large for even very small voxel spaces (a 5 x 5 voxel space gives 10 to 10^7 solutions). This shows the need for constraints to reduce the solution space to a reasonable size. Finally, it is noted that because of the discrete nature of the formulation, the solution space size can be easily calculated, making the formulation a useful tool to numerically evaluate the usefulness of any constraints that are added.
Abstract:
One hundred and twelve university students completed 7 tests assessing word-reading accuracy, print exposure, phonological sensitivity, phonological coding and knowledge of English morphology as predictors of spelling accuracy. Together the tests accounted for 71% of the variance in spelling, with phonological skills and morphological knowledge emerging as strong predictors of spelling accuracy for words with both regular and irregular sound-spelling correspondences. The pattern of relationships was consistent with a model in which, as a function of the learning opportunities that are provided by reading experience, phonological skills promote the learning of individual word orthographies and structural relationships among words.