866 results for Computer Vision and Pattern Recognition
Abstract:
[ES]This paper describes an analysis performed for facial description in static images and video streams. The still-image setting is analyzed first in order to determine the optimal classifier configuration for each problem: gender recognition, race classification, and detection of glasses and moustaches. These results are then applied to significant samples extracted automatically in real time from video streams, achieving promising results in the facial description of 70 individuals by gender, race, and the presence of glasses and a moustache.
Abstract:
[EN]Facial image processing is becoming widespread in human-computer applications, despite its complexity. High-level processes such as face recognition or gender determination rely on low-level routines that must effectively detect and normalize the faces that appear in the input image. In this paper, a face detection and normalization system is described. The approach taken is based on a cascade of fast, weak classifiers that together try to determine whether a frontal face is present in the image.
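The abstract gives no implementation details; as a rough illustration of the cascade idea, here is a minimal sketch using OpenCV's stock Haar cascade, a closely related detector rather than this paper's system. The cascade file ships with opencv-python; the image path is a placeholder.

```python
# Minimal sketch of cascade-based frontal face detection with OpenCV's
# stock Haar cascade (an illustration, not this paper's classifier).
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("input.jpg")                 # placeholder input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Each stage is a fast, weak test; a window must pass every stage
# before it is reported as a face.
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

for (x, y, w, h) in faces:
    # Normalization step: crop and rescale the face to a canonical size.
    face = cv2.resize(gray[y:y + h, x:x + w], (64, 64))
```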
Abstract:
[EN]This paper describes an Active Vision System whose design assumes a distinction between fast (reactive) and slow (background) processes. Fast processes must operate in cycles with critical timeouts that may affect system stability, while slow processes, though necessary, do not compromise stability if their execution is delayed. Based on this simple taxonomy, a control architecture has been proposed and a prototype implemented that is able to track people in real time with a robotic head while trying to identify the target. In this system, the tracking module is considered the reactive part of the system, while person identification is treated as a background task.
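A minimal sketch of this fast/slow split, with hypothetical stand-ins for the camera, tracker, head controller, and recognizer (all names and timings here are assumptions):

```python
# Two-rate control loop: a reactive tracking loop plus a background
# identification thread fed through a one-slot mailbox.
import queue
import threading
import time

def grab_frame():            # stand-in for a camera read
    time.sleep(0.03)
    return object()

def track(frame):            # stand-in for the fast tracker
    return (0, 0)

def move_head(target):       # stand-in for the robotic head controller
    pass

def identify_person(frame):  # stand-in for the slow recognizer
    time.sleep(0.5)

frames = queue.Queue(maxsize=1)   # one-slot mailbox to the slow path

def background_identification():
    # Slow process: may lag behind without destabilizing the system.
    while True:
        identify_person(frames.get())

threading.Thread(target=background_identification, daemon=True).start()

while True:
    # Reactive process: must complete each cycle within its timeout.
    frame = grab_frame()
    move_head(track(frame))
    if not frames.full():
        frames.put_nowait(frame)  # hand off a sample; drop it if busy
```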
Abstract:
In the last decade, research in Computer Vision has developed several algorithms to help botanists and non-experts classify plants based on images of their leaves. LeafSnap is a mobile application that uses a multiscale curvature model of the leaf margin to classify leaf images into species. It has achieved high levels of accuracy on 184 tree species from the northeastern US. We extend the research that led to the development of LeafSnap along two lines. First, LeafSnap's underlying algorithms are applied to a set of 66 tree species from Costa Rica. Then, texture is used as an additional criterion to measure the level of improvement achieved in the automatic identification of Costa Rican tree species. A 25.6% improvement was achieved for a Costa Rican clean image dataset and 42.5% for a Costa Rican noisy image dataset. In both cases, our results show this improvement to be statistically significant. Further statistical analysis of visual noise impact, best algorithm combinations per species, and the best value of k, the minimal cardinality of the set of candidate species that the tested algorithms render as best matches, is also presented in this research.
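The abstract does not spell out the texture descriptor; as an illustrative sketch of fusing a margin (shape) cue with a texture cue, here is one possible feature pipeline. The turning-angle histogram and LBP choices are assumptions, not LeafSnap's exact multiscale curvature model.

```python
# Sketch: fuse a leaf-margin shape descriptor with a texture descriptor,
# then classify by nearest neighbour; feature choices are illustrative.
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.neighbors import KNeighborsClassifier

def leaf_features(gray, contour):
    # Shape: histogram of turning angles along the leaf margin (contour
    # is an (N, 2) array of boundary points).
    d = np.diff(contour, axis=0)
    angles = np.arctan2(d[:, 1], d[:, 0])
    curvature = np.diff(np.unwrap(angles))
    shape_hist, _ = np.histogram(curvature, bins=16, range=(-np.pi, np.pi),
                                 density=True)
    # Texture: local binary pattern histogram over the blade interior.
    lbp = local_binary_pattern(gray, P=8, R=1, method="uniform")
    tex_hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    return np.concatenate([shape_hist, tex_hist])

# Hypothetical usage with labelled training leaves:
#   X = [leaf_features(g, c) for g, c in training_leaves]
#   knn = KNeighborsClassifier(n_neighbors=5).fit(X, species_labels)
# The top-k candidate species then come from knn.kneighbors(query).
```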
Abstract:
Increasing the size of the training data has proven very effective in many computer vision tasks. Using large-scale image datasets (e.g. ImageNet) with simple learning techniques (e.g. linear classifiers), one can achieve state-of-the-art performance in object recognition compared to sophisticated learning techniques on smaller image sets. Semantic search on visual data has become very popular. There are billions of images on the internet and the number is increasing every day. Dealing with large-scale image sets is demanding in itself: they take so much memory that processing them with complex algorithms on single-CPU machines becomes impossible. Finding an efficient image representation can be a key to attacking this problem. But efficiency alone is not enough for image understanding: the representation should also be comprehensive and rich in semantic information. In this proposal we develop an approach to computing binary codes that provide a rich and efficient image representation. We demonstrate several tasks in which binary features can be very effective. We show how binary features can speed up large-scale image classification. We present techniques to learn the binary features from supervised image sets (with different types of semantic supervision: class labels, textual descriptions). We propose several problems that are very important in finding and using efficient image representations.
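A sketch of one classic way to obtain compact binary codes: the sign of random projections (hyperplane LSH). The thesis learns its codes with semantic supervision; this unsupervised baseline only illustrates the memory and speed payoff of a binary representation.

```python
# Binary codes via random hyperplane hashing; search by Hamming distance.
import numpy as np

rng = np.random.default_rng(0)
d, n_bits = 512, 256                       # descriptor dim, code length
W = rng.standard_normal((d, n_bits))       # random hyperplanes

def encode(X):
    """Map real-valued descriptors (n, d) to packed binary codes."""
    return np.packbits(X @ W > 0, axis=1)  # (n, n_bits // 8) uint8

def hamming(q, codes):
    """Hamming distance from one code to many via popcount on XOR."""
    return np.unpackbits(q ^ codes, axis=1).sum(axis=1)

X = rng.standard_normal((10000, d))        # stand-in image descriptors
codes = encode(X)                          # 32 bytes each vs kilobytes of floats
query = encode(rng.standard_normal((1, d)))
nearest = np.argsort(hamming(query, codes))[:10]
```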
Abstract:
The research described in this thesis was motivated by the need for a robust model capable of representing 3D data obtained with 3D sensors, which are inherently noisy. In addition, time constraints have to be considered, as these sensors are capable of providing a 3D data stream in real time. This thesis proposes the use of Self-Organizing Maps (SOMs) as a 3D representation model. In particular, we propose the use of the Growing Neural Gas (GNG) network, which has been successfully used for clustering, pattern recognition, and topology representation of multi-dimensional data. Until now, Self-Organizing Maps have been computed primarily offline, and their application to 3D data has mainly focused on noise-free models, without considering time constraints. A hardware implementation is proposed that leverages the computing power of modern GPUs, taking advantage of the paradigm known as General-Purpose Computing on Graphics Processing Units (GPGPU). The proposed methods were applied to different problems and applications in the area of computer vision, such as the recognition and localization of objects, visual surveillance, and 3D reconstruction.
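A minimal sketch of the core GNG adaptation step on a stream of 3D points; node insertion, error accumulation, and the GPU parallelization the thesis targets are omitted, and the parameter values are illustrative.

```python
# Core GNG step: drag winner and its neighbours toward each sample,
# age incident edges, refresh the winner edge, prune stale edges.
import numpy as np

rng = np.random.default_rng(0)
nodes = rng.standard_normal((2, 3))          # start with two random units
edges = {(0, 1): 0}                          # edge -> age
eps_w, eps_n, max_age = 0.05, 0.006, 50

def adapt(x):
    global edges
    d = np.linalg.norm(nodes - x, axis=1)
    s1, s2 = np.argsort(d)[:2]               # two nearest units
    nodes[s1] += eps_w * (x - nodes[s1])     # drag winner toward sample
    for (a, b) in list(edges):
        if s1 in (a, b):
            other = b if a == s1 else a
            nodes[other] += eps_n * (x - nodes[other])  # drag neighbours
            edges[(a, b)] += 1               # age edges incident to winner
    edges[tuple(sorted((s1, s2)))] = 0       # refresh/create winner edge
    edges = {e: t for e, t in edges.items() if t <= max_age}

for x in rng.standard_normal((1000, 3)):     # stand-in 3D sensor stream
    adapt(x)
```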
Abstract:
Visual inputs to artificial and biological visual systems are often quantized: cameras accumulate photons from the visual world, and the brain receives action potentials from visual sensory neurons. Collecting more information quanta leads to a longer acquisition time and better performance. In many visual tasks, collecting a small number of quanta is sufficient to solve the task well. The ability to determine the right number of quanta is pivotal in situations where visual information is costly to obtain, such as photon-starved or time-critical environments. In these situations, conventional vision systems that always collect a fixed and large amount of information are infeasible. I develop a framework that judiciously determines the number of information quanta to observe based on the cost of observation and the requirement for accuracy. The framework implements the optimal speed-versus-accuracy tradeoff when two assumptions are met, namely that the task is fully specified probabilistically and constant over time. I also extend the framework to address scenarios that violate the assumptions. I apply the framework to three recognition tasks: visual search (where both assumptions are satisfied), scotopic visual recognition (where the model is not specified), and visual discrimination with unknown stimulus onset (where the model is dynamic over time). Scotopic classification experiments suggest that the framework leads to dramatic improvement in photon-efficiency compared to conventional computer vision algorithms. Human psychophysics experiments confirm that the framework provides a parsimonious and versatile explanation for human behavior under time pressure in both static and dynamic environments.
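The optimal stopping rule under the two stated assumptions is in the spirit of Wald's sequential probability ratio test; a toy sketch with Poisson photon counts follows (the rates and threshold are illustrative, not the thesis's models):

```python
# SPRT-style stopping: accumulate log-likelihood-ratio evidence one
# quantum at a time and stop as soon as a threshold is crossed.
import numpy as np

rng = np.random.default_rng(0)

def sprt(sample, log_lr, threshold=5.0, max_quanta=10000):
    """Observe quanta until the evidence for either hypothesis is strong."""
    llr = 0.0
    for n in range(1, max_quanta + 1):
        llr += log_lr(sample())         # evidence from one more quantum
        if abs(llr) >= threshold:       # confident enough: stop early
            return (llr > 0), n
    return (llr > 0), max_quanta

# Toy photon-count example: Poisson rates under the two hypotheses.
# log[P1(k)/P0(k)] = k*log(r1/r0) - (r1 - r0) for Poisson counts k.
r0, r1, true_rate = 1.0, 2.0, 2.0
decision, n_used = sprt(
    sample=lambda: rng.poisson(true_rate),
    log_lr=lambda k: k * np.log(r1 / r0) - (r1 - r0))
# n_used adapts to difficulty: easy cases stop after very few quanta.
```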
Abstract:
In the study of complex networks, vertex centrality measures are used to identify the most important vertices within a graph. A related problem is that of measuring the centrality of an edge. In this paper, we propose a novel edge centrality index rooted in quantum information. More specifically, we measure the importance of an edge in terms of the contribution that it gives to the Von Neumann entropy of the graph. We show that this can be computed in terms of the Holevo quantity, a well-known quantum information theoretical measure. While computing the Von Neumann entropy, and hence the Holevo quantity, requires computing the spectrum of the graph Laplacian, we show how to obtain a simplified measure through a quadratic approximation of the Shannon entropy. This in turn shows that the proposed centrality measure is strongly correlated with the negative degree centrality on the line graph. We evaluate our centrality measure through an extensive set of experiments on real-world as well as synthetic networks, and we compare it against commonly used alternative measures.
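A sketch of the quantities involved: the graph's density matrix, its Von Neumann entropy, and the quadratic (linear-entropy) approximation. The naive per-edge score below, the entropy change on deleting the edge, only illustrates "contribution to the entropy"; it is not the paper's exact Holevo construction.

```python
# Von Neumann entropy of a graph from its Laplacian spectrum, plus the
# quadratic approximation and a naive edge-deletion score.
import numpy as np
import networkx as nx

def density_matrix(G):
    L = nx.laplacian_matrix(G).toarray().astype(float)
    return L / np.trace(L)                   # unit-trace, positive semidefinite

def von_neumann_entropy(G):
    lam = np.linalg.eigvalsh(density_matrix(G))
    lam = lam[lam > 1e-12]                   # convention: 0 log 0 = 0
    return float(-np.sum(lam * np.log2(lam)))

def quadratic_entropy(G):
    rho = density_matrix(G)
    return float(1.0 - np.trace(rho @ rho))  # S(rho) approx 1 - Tr(rho^2)

G = nx.karate_club_graph()
S = von_neumann_entropy(G)
edge_score = {}
for e in G.edges():
    H = G.copy()
    H.remove_edge(*e)
    edge_score[e] = S - von_neumann_entropy(H)   # entropy contribution of e
```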
Abstract:
Laplacian-based descriptors, such as the Heat Kernel Signature and the Wave Kernel Signature, allow one to embed the vertices of a graph into a vectorial space, and have been successfully used to find the optimal matching between a pair of input graphs. While the HKS uses a heat diffusion process to probe the local structure of a graph, the WKS attempts to do the same through wave propagation. In this paper, we propose an alternative structural descriptor that is based on continuous-time quantum walks. More specifically, we characterise the structure of a graph using its average mixing matrix. The average mixing matrix is a doubly-stochastic matrix that encodes the time-averaged behaviour of a continuous-time quantum walk on the graph. We propose to use the rows of the average mixing matrix for increasing stopping times to develop a novel signature, the Average Mixing Matrix Signature (AMMS). We perform an extensive range of experiments and show that the proposed signature is robust under structural perturbations of the original graphs and outperforms both the HKS and WKS when used as a node descriptor in a graph matching task.
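Given the spectral decomposition of the adjacency matrix into projectors E_r, the infinite-time average mixing matrix is the sum of the entrywise squares of the E_r; a sketch follows. The AMMS itself additionally truncates the time average at increasing stopping times, which this infinite-time version omits.

```python
# Infinite-time average mixing matrix of a continuous-time quantum walk
# with the adjacency matrix as Hamiltonian: M = sum_r E_r o E_r, where
# the E_r are spectral projectors and o is the entrywise product.
import numpy as np
import networkx as nx

def average_mixing_matrix(G):
    A = nx.to_numpy_array(G)
    lam, V = np.linalg.eigh(A)
    M = np.zeros_like(A)
    start = 0
    for i in range(1, len(lam) + 1):
        # Close a group of (numerically) equal eigenvalues.
        if i == len(lam) or abs(lam[i] - lam[start]) > 1e-8:
            E = V[:, start:i] @ V[:, start:i].T    # spectral projector E_r
            M += E * E                             # entrywise square
            start = i
    return M   # doubly stochastic: row u is node u's time-averaged profile

M = average_mixing_matrix(nx.petersen_graph())
signature_row = np.sort(M[0])   # sorted row: a permutation-invariant descriptor
```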
Abstract:
The study of acoustic communication in animals often requires not only the recognition of species-specific acoustic signals but also the identification of individual subjects, all against a complex acoustic background. Moreover, when very long recordings are to be analyzed, automatic recognition and identification processes are invaluable tools for extracting the relevant biological information. A pattern recognition methodology based on hidden Markov models is presented, inspired by the successful results obtained with the most widely known and complex acoustic communication signal: human speech. This methodology was applied here for the first time to the detection and recognition of fish acoustic signals, specifically in a stream of round-the-clock recordings of Lusitanian toadfish (Halobatrachus didactylus) in their natural estuarine habitat. The results show that this methodology is able not only to detect the mating sounds (boatwhistles) but also to identify individual male toadfish, reaching an identification rate of ca. 95%. Moreover, this method also proved to be a powerful tool for assessing signal durations in large data sets. However, the system failed to recognize other sound types.
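A sketch of a standard per-individual HMM recipe in this spirit: one Gaussian HMM per known fish, fit on MFCC features, with classification by maximum log-likelihood. The file names, MFCC settings, and number of states are all assumptions, not the paper's configuration.

```python
# Speech-style HMM identification applied to animal calls.
import numpy as np
import librosa
from hmmlearn.hmm import GaussianHMM

def mfcc_features(path):
    y, sr = librosa.load(path, sr=None)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T   # (frames, 13)

training_calls = {                         # hypothetical labelled recordings
    "male_1": ["male_1_call_a.wav", "male_1_call_b.wav"],
    "male_2": ["male_2_call_a.wav", "male_2_call_b.wav"],
}

models = {}
for fish_id, paths in training_calls.items():
    feats = [mfcc_features(p) for p in paths]
    X, lengths = np.vstack(feats), [len(f) for f in feats]
    models[fish_id] = GaussianHMM(n_components=5).fit(X, lengths)

def identify(path):
    """Return the individual whose model best explains the call."""
    X = mfcc_features(path)
    return max(models, key=lambda fid: models[fid].score(X))
```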
Abstract:
The first mechanical automaton concept is found in a Chinese text written in the 3rd century BC, while Computer Vision was born in the late 1960s. Visual perception applied to machines (i.e. Machine Vision) is therefore a young and exciting alliance. When robots came in, the new field of Robotic Vision was born, and these terms began to be erroneously interchanged. In short, we can say that Machine Vision is an engineering domain concerned with the industrial use of vision. Robotic Vision, instead, is a research field that tries to incorporate robotics aspects into computer vision algorithms. Visual Servoing, for example, is one of the problems that cannot be solved by computer vision alone. Accordingly, a large part of this work deals with boosting popular Computer Vision techniques by exploiting robotics: e.g. the use of kinematics to localize a vision sensor mounted as the robot end-effector. The remainder of this work is dedicated to the counterpart, i.e. the use of computer vision to solve real robotic problems such as grasping objects or navigating while avoiding obstacles. A brief survey of the mapping data structures most widely used in robotics will be presented, along with SkiMap, a novel sparse data structure created both for robotic mapping and as a general-purpose 3D spatial index. Several approaches to object detection and manipulation that exploit the aforementioned mapping strategies will then be proposed, along with a completely new Machine Teaching facility intended to simplify the training procedure of modern Deep Learning networks.
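SkiMap itself is built on a tree of skip lists; as a much simpler stand-in that shows the interface such a sparse 3D spatial index offers, here is a dict-based voxel map sketch (the resolution and hit threshold are illustrative):

```python
# Sparse voxel map: hash only occupied cells, keyed by integer voxel
# coordinates; a stand-in for SkiMap's skip-list tree, not its design.
from collections import defaultdict

class SparseVoxelMap:
    def __init__(self, resolution=0.05):            # 5 cm voxels
        self.res = resolution
        self.hits = defaultdict(int)                # voxel key -> hit count

    def key(self, x, y, z):
        return (int(x // self.res), int(y // self.res), int(z // self.res))

    def integrate(self, points):
        """Fuse one 3D scan (iterable of (x, y, z)) into the map."""
        for p in points:
            self.hits[self.key(*p)] += 1

    def occupied(self, x, y, z, min_hits=3):
        return self.hits.get(self.key(x, y, z), 0) >= min_hits

m = SparseVoxelMap()
m.integrate([(0.10, 0.20, 1.00)] * 5)               # stand-in scan points
assert m.occupied(0.10, 0.20, 1.00)
```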
Abstract:
To evaluate the use of optical and nonoptical aids during reading and writing activities in individuals with acquired low vision. This study was performed using descriptive and cross-sectional surveys. The data collection instrument was created with structured questions developed from an exploratory study and a previous interview-based test, and it evaluated the following variables: personal characteristics, use of optical and nonoptical aids, and activities that required the use of optical and nonoptical aids. The study population included 30 subjects with acquired low vision and visual acuities of 20/200-20/400. Most subjects (60.0%) reported the use of some optical aid. Of these, the majority (83.3%) cited spectacles as the most widely used optical aid. The majority (63.3%) of subjects also reported the use of nonoptical aids, the most frequent being letter magnification (68.4%), followed by bringing objects closer to the eyes (57.8%). Subjects often used more than one nonoptical aid. The majority of participants reported the use of optical and nonoptical aids during reading activities, highlighting the use of spectacles, magnifying glasses, and letter magnification; however, even after the use of these aids, we found that the subjects often needed to read the text more than once to understand it. During writing activities, all subjects reported the use of optical aids, while most stated that they did not use nonoptical aids for such activities.
Abstract:
OBJECTIVE: To develop a method and a device to quantify vision in candelas (cd). Studies measuring vision are important for all the visual sciences. METHODS: This is a theoretical and experimental study. The details of the psychophysical method and of the device calibration are described. Preliminary tests were performed on volunteers. RESULTS: This is a simple psychophysical test whose result is expressed in units of the International System of Units. With the technical description provided, it will be possible to reproduce the experiment at other research centers. CONCLUSION: Results measured in luminous intensity (cd) are an option for visual assessment. These results will make it possible to extrapolate the measurements to mathematical models and to simulate individual effects with aberrometric data.
Abstract:
The peritoneal cavity (PerC) is a singular compartment where many cell populations reside and interact. Despite the widely adopted experimental approach of intraperitoneal (i.p.) inoculation, little is known about the behavior of the different cell populations within the PerC. To evaluate the dynamics of peritoneal macrophage (Mφ) subsets, namely small peritoneal Mφ (SPM) and large peritoneal Mφ (LPM), in response to infectious stimuli, C57BL/6 mice were injected i.p. with zymosan or Trypanosoma cruzi. These conditions resulted in the marked modification of the PerC myelo-monocytic compartment, characterized by the disappearance of LPM and the accumulation of SPM and monocytes. In parallel, adherent cells isolated from stimulated PerC displayed reduced staining for beta-galactosidase, a biomarker for senescence. Further, the adherent cells showed increased nitric oxide (NO) production and a higher frequency of IL-12-producing cells in response to subsequent LPS and IFN-gamma stimulation. Among myelo-monocytic cells, SPM, rather than LPM or monocytes, appear to be the central effectors of the activated PerC; they display higher phagocytic activity and are the main source of IL-12. Thus, our data provide a first demonstration of the consequences of the dynamics between peritoneal Mφ subpopulations by showing that the substitution of LPM by robust SPM and monocytes in response to infectious stimuli greatly improves PerC effector activity.
Abstract:
By means of continuous topology optimization, this paper discusses the influence of material gradation and layout on the overall stiffness behavior of functionally graded structures. The formulation is associated with symmetry and pattern-repetition constraints, including material gradation effects at both global and local levels. For instance, constraints associated with pattern repetition are applied by considering material gradation either over the global structure or locally over the specific pattern. By means of pattern repetition, we recover previous results in the literature that were obtained using homogenization and optimization of cellular materials.