920 resultados para Visual pattern recognition
Resumo:
It is widely accepted that infants begin learning their native language not by learning words, but by discovering features of the speech signal: consonants, vowels, and combinations of these sounds. Learning to understand words, as opposed to just perceiving their sounds, is said to come later, between 9 and 15 mo of age, when infants develop a capacity for interpreting others' goals and intentions. Here, we demonstrate that this consensus about the developmental sequence of human language learning is flawed: in fact, infants already know the meanings of several common words from the age of 6 mo onward. We presented 6- to 9-mo-old infants with sets of pictures to view while their parent named a picture in each set. Over this entire age range, infants directed their gaze to the named pictures, indicating their understanding of spoken words. Because the words were not trained in the laboratory, the results show that even young infants learn ordinary words through daily experience with language. This surprising accomplishment indicates that, contrary to prevailing beliefs, either infants can already grasp the referential intentions of adults at 6 mo or infants can learn words before this ability emerges. The precocious discovery of word meanings suggests a perspective in which learning vocabulary and learning the sound structure of spoken language go hand in hand as language acquisition begins.
Resumo:
This paper presents the novel theory for performing multi-agent activity recognition without requiring large training corpora. The reduced need for data means that robust probabilistic recognition can be performed within domains where annotated datasets are traditionally unavailable. Complex human activities are composed from sequences of underlying primitive activities. We do not assume that the exact temporal ordering of primitives is necessary, so can represent complex activity using an unordered bag. Our three-tier architecture comprises low-level video tracking, event analysis and high-level inference. High-level inference is performed using a new, cascading extension of the Rao–Blackwellised Particle Filter. Simulated annealing is used to identify pairs of agents involved in multi-agent activity. We validate our framework using the benchmarked PETS 2006 video surveillance dataset and our own sequences, and achieve a mean recognition F-Score of 0.82. Our approach achieves a mean improvement of 17% over a Hidden Markov Model baseline.
Resumo:
[EN]Social robots are receiving much interest in the robotics community. The most important goal for such robots lies in their interaction capabilities. An attention system is crucial, both as a filter to center the robot’s perceptual resources and as a mean of letting the observer know that the robot has intentionality. In this paper a simple but flexible and functional attentional model is described. The model, which has been implemented in an interactive robot currently under development, fuses both visual and auditive information extracted from the robot’s environment, and can incorporate knowledge-based influences on attention.
Resumo:
With the world of professional sports shifting towards employing better sport analytics, the demand for vision-based performance analysis is growing increasingly in recent years. In addition, the nature of many sports does not allow the use of any kind of sensors or other wearable markers attached to players for monitoring their performances during competitions. This provides a potential application of systematic observations such as tracking information of the players to help coaches to develop their visual skills and perceptual awareness needed to make decisions about team strategy or training plans. My PhD project is part of a bigger ongoing project between sport scientists and computer scientists involving also industry partners and sports organisations. The overall idea is to investigate the contribution technology can make to the analysis of sports performance on the example of team sports such as rugby, football or hockey. A particular focus is on vision-based tracking, so that information about the location and dynamics of the players can be gained without any additional sensors on the players. To start with, prior approaches on visual tracking are extensively reviewed and analysed. In this thesis, methods to deal with the difficulties in visual tracking to handle the target appearance changes caused by intrinsic (e.g. pose variation) and extrinsic factors, such as occlusion, are proposed. This analysis highlights the importance of the proposed visual tracking algorithms, which reflect these challenges and suggest robust and accurate frameworks to estimate the target state in a complex tracking scenario such as a sports scene, thereby facilitating the tracking process. Next, a framework for continuously tracking multiple targets is proposed. Compared to single target tracking, multi-target tracking such as tracking the players on a sports field, poses additional difficulties, namely data association, which needs to be addressed. Here, the aim is to locate all targets of interest, inferring their trajectories and deciding which observation corresponds to which target trajectory is. In this thesis, an efficient framework is proposed to handle this particular problem, especially in sport scenes, where the players of the same team tend to look similar and exhibit complex interactions and unpredictable movements resulting in matching ambiguity between the players. The presented approach is also evaluated on different sports datasets and shows promising results. Finally, information from the proposed tracking system is utilised as the basic input for further higher level performance analysis such as tactics and team formations, which can help coaches to design a better training plan. Due to the continuous nature of many team sports (e.g. soccer, hockey), it is not straightforward to infer the high-level team behaviours, such as players’ interaction. The proposed framework relies on two distinct levels of performance analysis: low-level performance analysis, such as identifying players positions on the play field, as well as a high-level analysis, where the aim is to estimate the density of player locations or detecting their possible interaction group. The related experiments show the proposed approach can effectively explore this high-level information, which has many potential applications.
Resumo:
The research described in this thesis was motivated by the need of a robust model capable of representing 3D data obtained with 3D sensors, which are inherently noisy. In addition, time constraints have to be considered as these sensors are capable of providing a 3D data stream in real time. This thesis proposed the use of Self-Organizing Maps (SOMs) as a 3D representation model. In particular, we proposed the use of the Growing Neural Gas (GNG) network, which has been successfully used for clustering, pattern recognition and topology representation of multi-dimensional data. Until now, Self-Organizing Maps have been primarily computed offline and their application in 3D data has mainly focused on free noise models, without considering time constraints. It is proposed a hardware implementation leveraging the computing power of modern GPUs, which takes advantage of a new paradigm coined as General-Purpose Computing on Graphics Processing Units (GPGPU). The proposed methods were applied to different problem and applications in the area of computer vision such as the recognition and localization of objects, visual surveillance or 3D reconstruction.
Resumo:
Nowadays, new computers generation provides a high performance that enables to build computationally expensive computer vision applications applied to mobile robotics. Building a map of the environment is a common task of a robot and is an essential part to allow the robots to move through these environments. Traditionally, mobile robots used a combination of several sensors from different technologies. Lasers, sonars and contact sensors have been typically used in any mobile robotic architecture, however color cameras are an important sensor due to we want the robots to use the same information that humans to sense and move through the different environments. Color cameras are cheap and flexible but a lot of work need to be done to give robots enough visual understanding of the scenes. Computer vision algorithms are computational complex problems but nowadays robots have access to different and powerful architectures that can be used for mobile robotics purposes. The advent of low-cost RGB-D sensors like Microsoft Kinect which provide 3D colored point clouds at high frame rates made the computer vision even more relevant in the mobile robotics field. The combination of visual and 3D data allows the systems to use both computer vision and 3D processing and therefore to be aware of more details of the surrounding environment. The research described in this thesis was motivated by the need of scene mapping. Being aware of the surrounding environment is a key feature in many mobile robotics applications from simple robotic navigation to complex surveillance applications. In addition, the acquisition of a 3D model of the scenes is useful in many areas as video games scene modeling where well-known places are reconstructed and added to game systems or advertising where once you get the 3D model of one room the system can add furniture pieces using augmented reality techniques. In this thesis we perform an experimental study of the state-of-the-art registration methods to find which one fits better to our scene mapping purposes. Different methods are tested and analyzed on different scene distributions of visual and geometry appearance. In addition, this thesis proposes two methods for 3d data compression and representation of 3D maps. Our 3D representation proposal is based on the use of Growing Neural Gas (GNG) method. This Self-Organizing Maps (SOMs) has been successfully used for clustering, pattern recognition and topology representation of various kind of data. Until now, Self-Organizing Maps have been primarily computed offline and their application in 3D data has mainly focused on free noise models without considering time constraints. Self-organising neural models have the ability to provide a good representation of the input space. In particular, the Growing Neural Gas (GNG) is a suitable model because of its flexibility, rapid adaptation and excellent quality of representation. However, this type of learning is time consuming, specially for high-dimensional input data. Since real applications often work under time constraints, it is necessary to adapt the learning process in order to complete it in a predefined time. This thesis proposes a hardware implementation leveraging the computing power of modern GPUs which takes advantage of a new paradigm coined as General-Purpose Computing on Graphics Processing Units (GPGPU). Our proposed geometrical 3D compression method seeks to reduce the 3D information using plane detection as basic structure to compress the data. This is due to our target environments are man-made and therefore there are a lot of points that belong to a plane surface. Our proposed method is able to get good compression results in those man-made scenarios. The detected and compressed planes can be also used in other applications as surface reconstruction or plane-based registration algorithms. Finally, we have also demonstrated the goodness of the GPU technologies getting a high performance implementation of a CAD/CAM common technique called Virtual Digitizing.
Resumo:
The study of acoustic communication in animals often requires not only the recognition of species specific acoustic signals but also the identification of individual subjects, all in a complex acoustic background. Moreover, when very long recordings are to be analyzed, automatic recognition and identification processes are invaluable tools to extract the relevant biological information. A pattern recognition methodology based on hidden Markov models is presented inspired by successful results obtained in the most widely known and complex acoustical communication signal: human speech. This methodology was applied here for the first time to the detection and recognition of fish acoustic signals, specifically in a stream of round-the-clock recordings of Lusitanian toadfish (Halobatrachus didactylus) in their natural estuarine habitat. The results show that this methodology is able not only to detect the mating sounds (boatwhistles) but also to identify individual male toadfish, reaching an identification rate of ca. 95%. Moreover this method also proved to be a powerful tool to assess signal durations in large data sets. However, the system failed in recognizing other sound types.
Resumo:
Inductively Coupled Plasma Optical Emission Spectrometry was used to determine Ca, Mg, Mn, Fe, Zn and Cu in samples of processed and natural coconut water. The sample preparation consisted in a filtration step followed by a dilution. The analysis was made employing optimized instrumental parameters and the results were evaluated using methods of Pattern Recognition. The data showed common concentration values for the analytes present in processed and natural samples. Principal Component Analysis (PCA) and Hierarchical Cluster Analysis (HCA) indicated that the samples of different kinds were statistically different when the concentrations of all the analytes were considered simultaneously.
Resumo:
Chemometric activities in Brazil are described according to three phases: before the existence of microcomputers in the 1970s, through the initial stages of microcomputer use in the 1980s and during the years of extensive microcomputer applications of the ´90s and into this century. Pioneering activities in both the university and industry are emphasized. Active research areas in chemometrics are cited including experimental design, pattern recognition and classification, curve resolution for complex systems and multivariate calibration. New trends in chemometrics, especially higher order methods for treating data, are emphasized.
Resumo:
The peritoneal cavity (PerC) is a singular compartment where many cell populations reside and interact. Despite the widely adopted experimental approach of intraperitoneal (i.p.) inoculation, little is known about the behavior of the different cell populations within the PerC. To evaluate the dynamics of peritoneal macrophage (Mempty set) subsets, namely small peritoneal Mempty set (SPM) and large peritoneal Mempty set (LPM), in response to infectious stimuli, C57BL/6 mice were injected i.p. with zymosan or Trypanosoma cruzi. These conditions resulted in the marked modification of the PerC myelo-monocytic compartment characterized by the disappearance of LPM and the accumulation of SPM and monocytes. In parallel, adherent cells isolated from stimulated PerC displayed reduced staining for beta-galactosidase, a biomarker for senescence. Further, the adherent cells showed increased nitric oxide (NO) and higher frequency of IL-12-producing cells in response to subsequent LPS and IFN-gamma stimulation. Among myelo-monocytic cells, SPM rather than LPM or monocytes, appear to be the central effectors of the activated PerC; they display higher phagocytic activity and are the main source of IL-12. Thus, our data provide a first demonstration of the consequences of the dynamics between peritoneal Mempty set subpopulations by showing that substitution of LPM by a robust SPM and monocytes in response to infectious stimuli greatly improves PerC effector activity.
Resumo:
Activation of NF-kappa B and 5-lipoxygenase-mediated (5-LO-mediated) biosynthesis of the lipid mediator leukotriene B(4) (LTB(4)) are pivotal components of host defense and inflammatory responses. However, the role of LTB(4) in mediating innate immune responses elicited by specific TLR ligands and cytokines is unknown. Here we have shown that responses dependent on MyD88 (an adaptor protein that mediates signaling through all of the known TLRs, except TLR3, as well as IL-1 beta and IL-18) are reduced in mice lacking either 5-LO or the LTB(4) receptor BTL1, and that macrophages from these mice are impaired in MyD88-dependent activation of NF-kappa B. This macrophage defect was associated with lower basal and inducible expression of MyD88 and reflected impaired activation of STAT1 and overexpression of the STAT1 inhibitor SOCS1. Expression of MyD88 and responsiveness to the TLR4 ligand LPS were decreased by Stat1 siRNA silencing in WT macrophages and restored by Socs1 siRNA in 5-LO-deficient macrophages. These results uncover a pivotal role in macrophages for the GPCR BLT1 in regulating activation of NF-kappa B through Stat1-dependent expression of MyD88.
Resumo:
Background: Feature selection is a pattern recognition approach to choose important variables according to some criteria in order to distinguish or explain certain phenomena (i.e., for dimensionality reduction). There are many genomic and proteomic applications that rely on feature selection to answer questions such as selecting signature genes which are informative about some biological state, e. g., normal tissues and several types of cancer; or inferring a prediction network among elements such as genes, proteins and external stimuli. In these applications, a recurrent problem is the lack of samples to perform an adequate estimate of the joint probabilities between element states. A myriad of feature selection algorithms and criterion functions have been proposed, although it is difficult to point the best solution for each application. Results: The intent of this work is to provide an open-source multiplataform graphical environment for bioinformatics problems, which supports many feature selection algorithms, criterion functions and graphic visualization tools such as scatterplots, parallel coordinates and graphs. A feature selection approach for growing genetic networks from seed genes ( targets or predictors) is also implemented in the system. Conclusion: The proposed feature selection environment allows data analysis using several algorithms, criterion functions and graphic visualization tools. Our experiments have shown the software effectiveness in two distinct types of biological problems. Besides, the environment can be used in different pattern recognition applications, although the main concern regards bioinformatics tasks.
Resumo:
The aim of this Study was to compare the learning process of a highly complex ballet skill following demonstrations of point light and video models 16 participants divided into point light and video groups (ns = 8) performed 160 trials of a pirouette equally distributed in blocks of 20 trials alternating periods of demonstration and practice with a retention test a day later Measures of head and trunk oscillation coordination d1 parity from the model and movement time difference showed similarities between video and point light groups ballet experts evaluations indicated superiority of performance in the video over the point light group Results are discussed in terms of the task requirements of dissociation between head and trunk rotations focusing on the hypothesis of sufficiency and higher relevance of information contained in biological motion models applied to learning of complex motor skills
Resumo:
Today several different unsupervised classification algorithms are commonly used to cluster similar patterns in a data set based only on its statistical properties. Specially in image data applications, self-organizing methods for unsupervised classification have been successfully applied for clustering pixels or group of pixels in order to perform segmentation tasks. The first important contribution of this paper refers to the development of a self-organizing method for data classification, named Enhanced Independent Component Analysis Mixture Model (EICAMM), which was built by proposing some modifications in the Independent Component Analysis Mixture Model (ICAMM). Such improvements were proposed by considering some of the model limitations as well as by analyzing how it should be improved in order to become more efficient. Moreover, a pre-processing methodology was also proposed, which is based on combining the Sparse Code Shrinkage (SCS) for image denoising and the Sobel edge detector. In the experiments of this work, the EICAMM and other self-organizing models were applied for segmenting images in their original and pre-processed versions. A comparative analysis showed satisfactory and competitive image segmentation results obtained by the proposals presented herein. (C) 2008 Published by Elsevier B.V.
Resumo:
A rapid method for classification of mineral waters is proposed. The discrimination power was evaluated by a novel combination of chemometric data analysis and qualitative multi-elemental fingerprints of mineral water samples acquired from different regions of the Brazilian territory. The classification of mineral waters was assessed using only the wavelength emission intensities obtained by inductively coupled plasma optical emission spectrometry (ICP OES), monitoring different lines of Al, B, Ba, Ca, Cl, Cu, Co, Cr, Fe, K, Mg, Mn, Na, Ni, P, Pb, S, Sb, Si, Sr, Ti, V, and Zn, and Be, Dy, Gd, In, La, Sc and Y as internal standards. Data acquisition was done under robust (RC) and non-robust (NRC) conditions. Also, the combination of signal intensities of two or more emission lines for each element were evaluated instead of the individual lines. The performance of two classification-k-nearest neighbor (kNN) and soft independent modeling of class analogy (SIMCA)-and preprocessing algorithms, autoscaling and Pareto scaling, were evaluated for the ability to differentiate between the various samples in each approach tested (combination of robust or non-robust conditions with use of individual lines or sum of the intensities of emission lines). It was shown that qualitative ICP OES fingerprinting in combination with multivariate analysis is a promising analytical tool that has potential to become a recognized procedure for rapid authenticity and adulteration testing of mineral water samples or other material whose physicochemical properties (or origin) are directly related to mineral content.