888 resultados para Computer vision system
Resumo:
Pós-graduação em Engenharia Mecânica - FEG
Resumo:
Automatic video surveillance system has been a frequent topic of research due to the large number of promising applications. In this research, we developed a tracking and counting people system, as well as suspicious activities detector. The model tracks individual objects as they pass through the field of vision of the camera using vision algorithms to classify the activities of each person, and according to this features, detect dangerous situations. This dissertation includes a review of several techniques trying to develop a robust and low computacional costs system to be used in glass door barrier turnstiles avoiding fraud
Resumo:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Resumo:
Pós-graduação em Ciência da Computação - IBILCE
Resumo:
Pós-graduação em Televisão Digital: Informação e Conhecimento - FAAC
Resumo:
The aims of this study were to investigate work conditions, to estimate the prevalence and to describe risk factors associated with Computer Vision Syndrome among two call centers' operators in Sao Paulo (n = 476). The methods include a quantitative cross-sectional observational study and an ergonomic work analysis, using work observation, interviews and questionnaires. The case definition was the presence of one or more specific ocular symptoms answered as always, often or sometimes. The multiple logistic regression model, were created using the stepwise forward likelihood method and remained the variables with levels below 5% (p < 0.05). The operators were mainly female and young (from 15 to 24 years old). The call center was opened 24 hours and the operators weekly hours were 36 hours with break time from 21 to 35 minutes per day. The symptoms reported were eye fatigue (73.9%), "weight" in the eyes (68.2%), "burning" eyes (54.6%), tearing (43.9%) and weakening of vision (43.5%). The prevalence of Computer Vision Syndrome was 54.6%. Associations verified were: being female (OR 2.6, 95% CI 1.6 to 4.1), lack of recognition at work (OR 1.4, 95% CI 1.1 to 1.8), organization of work in call center (OR 1.4, 95% CI 1.1 to 1.7) and high demand at work (OR 1.1, 95% CI 1.0 to 1.3). The organization and psychosocial factors at work should be included in prevention programs of visual syndrome among call centers' operators.
Resumo:
Bilayer segmentation of live video in uncontrolled environments is an essential task for home applications in which the original background of the scene must be replaced, as in videochats or traditional videoconference. The main challenge in such conditions is overcome all difficulties in problem-situations (e. g., illumination change, distract events such as element moving in the background and camera shake) that may occur while the video is being captured. This paper presents a survey of segmentation methods for background substitution applications, describes the main concepts and identifies events that may cause errors. Our analysis shows that although robust methods rely on specific devices (multiple cameras or sensors to generate depth maps) which aid the process. In order to achieve the same results using conventional devices (monocular video cameras), most current research relies on energy minimization frameworks, in which temporal and spacial information are probabilistically combined with those of color and contrast.
Resumo:
[EN] In this paper we present a variational technique for the reconstruction of 3D cylindrical surfaces. Roughly speaking by a cylindrical surface we mean a surface that can be parameterized using the projection on a cylinder in terms of two coordinates, representing the displacement and angle in a cylindrical coordinate system respectively. The starting point for our method is a set of different views of a cylindrical surface, as well as a precomputed disparity map estimation between pair of images. The proposed variational technique is based on an energy minimization where we balance on the one hand the regularity of the cylindrical function given by the distance of the surface points to cylinder axis, and on the other hand, the distance between the projection of the surface points on the images and the expected location following the precomputed disparity map estimation between pair of images. One interesting advantage of this approach is that we regularize the 3D surface by means of a bi-dimensio al minimization problem. We show some experimental results for large stereo sequences.
Resumo:
[EN] In this paper we present some real problems which appear in computer vision which yields to nonlinear system of algebraic equations. We study the problem of camera calibration. Roughly speaking camera calibration consists in looking at the camera position in the 3- D world using as information the projection of a 3- D Scene in a 2-D plane (the photogram). The problem is quite different when we use a single view or several views (stereo vision) of the 3-D scene. We will show in this paper how these problems yields to nonlinear algebraic system of equations.
Resumo:
[EN]The widespread availability of portable computing power and inexpensive digital cameras is opening up new possibilities for retailers. One example is in optical shops, where a number of systems exist that facilitate eyeglasses selection. These systems are now more necessary as the market is saturated with an increasingly complex array of lenses, frames, coatings, tints, photochromic and polarizing treatments, etc. Research challenges encompass Computer Vision, Multimedia and Human-Computer Interaction. Cost factors are also of importance for widespread product acceptance. This paper describes a low-cost system that allows the user to visualize di erent spectacle models in live video. The user can also move the spectacles to adjust its position on the face. Experiments show the potential of the system.
Resumo:
The term Ambient Intelligence (AmI) refers to a vision on the future of the information society where smart, electronic environment are sensitive and responsive to the presence of people and their activities (Context awareness). In an ambient intelligence world, devices work in concert to support people in carrying out their everyday life activities, tasks and rituals in an easy, natural way using information and intelligence that is hidden in the network connecting these devices. This promotes the creation of pervasive environments improving the quality of life of the occupants and enhancing the human experience. AmI stems from the convergence of three key technologies: ubiquitous computing, ubiquitous communication and natural interfaces. Ambient intelligent systems are heterogeneous and require an excellent cooperation between several hardware/software technologies and disciplines, including signal processing, networking and protocols, embedded systems, information management, and distributed algorithms. Since a large amount of fixed and mobile sensors embedded is deployed into the environment, the Wireless Sensor Networks is one of the most relevant enabling technologies for AmI. WSN are complex systems made up of a number of sensor nodes which can be deployed in a target area to sense physical phenomena and communicate with other nodes and base stations. These simple devices typically embed a low power computational unit (microcontrollers, FPGAs etc.), a wireless communication unit, one or more sensors and a some form of energy supply (either batteries or energy scavenger modules). WNS promises of revolutionizing the interactions between the real physical worlds and human beings. Low-cost, low-computational power, low energy consumption and small size are characteristics that must be taken into consideration when designing and dealing with WSNs. To fully exploit the potential of distributed sensing approaches, a set of challengesmust be addressed. Sensor nodes are inherently resource-constrained systems with very low power consumption and small size requirements which enables than to reduce the interference on the physical phenomena sensed and to allow easy and low-cost deployment. They have limited processing speed,storage capacity and communication bandwidth that must be efficiently used to increase the degree of local ”understanding” of the observed phenomena. A particular case of sensor nodes are video sensors. This topic holds strong interest for a wide range of contexts such as military, security, robotics and most recently consumer applications. Vision sensors are extremely effective for medium to long-range sensing because vision provides rich information to human operators. However, image sensors generate a huge amount of data, whichmust be heavily processed before it is transmitted due to the scarce bandwidth capability of radio interfaces. In particular, in video-surveillance, it has been shown that source-side compression is mandatory due to limited bandwidth and delay constraints. Moreover, there is an ample opportunity for performing higher-level processing functions, such as object recognition that has the potential to drastically reduce the required bandwidth (e.g. by transmitting compressed images only when something ‘interesting‘ is detected). The energy cost of image processing must however be carefully minimized. Imaging could play and plays an important role in sensing devices for ambient intelligence. Computer vision can for instance be used for recognising persons and objects and recognising behaviour such as illness and rioting. Having a wireless camera as a camera mote opens the way for distributed scene analysis. More eyes see more than one and a camera system that can observe a scene from multiple directions would be able to overcome occlusion problems and could describe objects in their true 3D appearance. In real-time, these approaches are a recently opened field of research. In this thesis we pay attention to the realities of hardware/software technologies and the design needed to realize systems for distributed monitoring, attempting to propose solutions on open issues and filling the gap between AmI scenarios and hardware reality. The physical implementation of an individual wireless node is constrained by three important metrics which are outlined below. Despite that the design of the sensor network and its sensor nodes is strictly application dependent, a number of constraints should almost always be considered. Among them: • Small form factor to reduce nodes intrusiveness. • Low power consumption to reduce battery size and to extend nodes lifetime. • Low cost for a widespread diffusion. These limitations typically result in the adoption of low power, low cost devices such as low powermicrocontrollers with few kilobytes of RAMand tenth of kilobytes of program memory with whomonly simple data processing algorithms can be implemented. However the overall computational power of the WNS can be very large since the network presents a high degree of parallelism that can be exploited through the adoption of ad-hoc techniques. Furthermore through the fusion of information from the dense mesh of sensors even complex phenomena can be monitored. In this dissertation we present our results in building several AmI applications suitable for a WSN implementation. The work can be divided into two main areas:Low Power Video Sensor Node and Video Processing Alghoritm and Multimodal Surveillance . Low Power Video Sensor Nodes and Video Processing Alghoritms In comparison to scalar sensors, such as temperature, pressure, humidity, velocity, and acceleration sensors, vision sensors generate much higher bandwidth data due to the two-dimensional nature of their pixel array. We have tackled all the constraints listed above and have proposed solutions to overcome the current WSNlimits for Video sensor node. We have designed and developed wireless video sensor nodes focusing on the small size and the flexibility of reuse in different applications. The video nodes target a different design point: the portability (on-board power supply, wireless communication), a scanty power budget (500mW),while still providing a prominent level of intelligence, namely sophisticated classification algorithmand high level of reconfigurability. We developed two different video sensor node: The device architecture of the first one is based on a low-cost low-power FPGA+microcontroller system-on-chip. The second one is based on ARM9 processor. Both systems designed within the above mentioned power envelope could operate in a continuous fashion with Li-Polymer battery pack and solar panel. Novel low power low cost video sensor nodes which, in contrast to sensors that just watch the world, are capable of comprehending the perceived information in order to interpret it locally, are presented. Featuring such intelligence, these nodes would be able to cope with such tasks as recognition of unattended bags in airports, persons carrying potentially dangerous objects, etc.,which normally require a human operator. Vision algorithms for object detection, acquisition like human detection with Support Vector Machine (SVM) classification and abandoned/removed object detection are implemented, described and illustrated on real world data. Multimodal surveillance: In several setup the use of wired video cameras may not be possible. For this reason building an energy efficient wireless vision network for monitoring and surveillance is one of the major efforts in the sensor network community. Energy efficiency for wireless smart camera networks is one of the major efforts in distributed monitoring and surveillance community. For this reason, building an energy efficient wireless vision network for monitoring and surveillance is one of the major efforts in the sensor network community. The Pyroelectric Infra-Red (PIR) sensors have been used to extend the lifetime of a solar-powered video sensor node by providing an energy level dependent trigger to the video camera and the wireless module. Such approach has shown to be able to extend node lifetime and possibly result in continuous operation of the node.Being low-cost, passive (thus low-power) and presenting a limited form factor, PIR sensors are well suited for WSN applications. Moreover techniques to have aggressive power management policies are essential for achieving long-termoperating on standalone distributed cameras needed to improve the power consumption. We have used an adaptive controller like Model Predictive Control (MPC) to help the system to improve the performances outperforming naive power management policies.
Resumo:
L'analisi di un'immagine con strumenti automatici si è sviluppata in quella che oggi viene chiamata "computer vision", la materia di studio proveniente dal mondo informatico che si occupa, letteralmente, di "vedere oltre", di estrarre da una figura una serie di aspetti strutturali, sotto forma di dati numerici. Tra le tante aree di ricerca che ne derivano, una in particolare è dedicata alla comprensione di un dettaglio estremamente interessante, che si presta ad applicazioni di molteplici tipologie: la profondità. L'idea di poter recuperare ciò che, apparentemente, si era perso fermando una scena ed imprimendone l'istante in un piano a due dimensioni poteva sembrare, fino a non troppi anni fa, qualcosa di impossibile. Grazie alla cosiddetta "visione stereo", invece, oggi possiamo godere della "terza dimensione" in diversi ambiti, legati ad attività professionali piuttosto che di svago. Inoltre, si presta ad utilizzi ancora più interessanti quando gli strumenti possono vantare caratteristiche tecniche accessibili, come dimensioni ridotte e facilità d'uso. Proprio quest'ultimo aspetto ha catturato l'attenzione di un gruppo di lavoro, dal quale è nata l'idea di sviluppare una soluzione, chiamata "SuperStereo", capace di permettere la stereo vision usando uno strumento estremamente diffuso nel mercato tecnologico globale: uno smartphone e, più in generale, qualsiasi dispositivo mobile appartenente a questa categoria.
Resumo:
Generic object recognition is an important function of the human visual system and everybody finds it highly useful in their everyday life. For an artificial vision system it is a really hard, complex and challenging task because instances of the same object category can generate very different images, depending of different variables such as illumination conditions, the pose of an object, the viewpoint of the camera, partial occlusions, and unrelated background clutter. The purpose of this thesis is to develop a system that is able to classify objects in 2D images based on the context, and identify to which category the object belongs to. Given an image, the system can classify it and decide the correct categorie of the object. Furthermore the objective of this thesis is also to test the performance and the precision of different supervised Machine Learning algorithms in this specific task of object image categorization. Through different experiments the implemented application reveals good categorization performances despite the difficulty of the problem. However this project is open to future improvement; it is possible to implement new algorithms that has not been invented yet or using other techniques to extract features to make the system more reliable. This application can be installed inside an embedded system and after trained (performed outside the system), so it can become able to classify objects in a real-time. The information given from a 3D stereocamera, developed inside the department of Computer Engineering of the University of Bologna, can be used to improve the accuracy of the classification task. The idea is to segment a single object in a scene using the depth given from a stereocamera and in this way make the classification more accurate.
Resumo:
Questo studio si propone di realizzare un’applicazione per dispositivi Android che permetta, per mezzo di un gioco di ruolo strutturato come caccia al tesoro, di visitare in prima persona città d’arte e luoghi turistici. Gli utenti finali, grazie alle funzionalità dell’app stessa, potranno giocare, creare e condividere cacce al tesoro basate sulla ricerca di edifici, monumenti, luoghi di rilevanza artistico-storica o turistica; in particolare al fine di completare ciascuna tappa di una caccia al tesoro il giocatore dovrà scattare una fotografia al monumento o edificio descritto nell’obiettivo della caccia stessa. Il software grazie ai dati rilevati tramite GPS e giroscopio (qualora il dispositivo ne sia dotato) e per mezzo di un algoritmo di instance recognition sarà in grado di affermare se la foto scattata rappresenta la risposta corretta al quesito della tappa. L’applicazione GeoPhotoHunt rappresenta non solo uno strumento ludico per la visita di città turistiche o più in generale luoghi di interesse, lo studio propone, infatti come suo contributo originale, l’implementazione su piattaforma mobile di un Content Based Image Retrieval System (CBIR) del tutto indipendente da un supporto server. Nello specifico il server dell’applicazione non sarà altro che uno strumento di appoggio con il quale i membri della “community” di GeoPhotoHunt potranno pubblicare le cacce al tesoro da loro create e condividere i punteggi che hanno totalizzato partecipando a una caccia al tesoro. In questo modo quando un utente ha scaricato sul proprio smartphone i dati di una caccia al tesoro potrà iniziare l’avventura anche in assenza di una connessione internet. L’intero studio è stato suddiviso in più fasi, ognuna di queste corrisponde ad una specifica sezione dell’elaborato che segue. In primo luogo si sono effettuate delle ricerche, soprattutto nel web, con lo scopo di individuare altre applicazioni che implementano l’idea della caccia al tesoro su piattaforma mobile o applicazioni che implementassero algoritmi di instance recognition direttamente su smartphone. In secondo luogo si è ricercato in letteratura quali fossero gli algoritmi di riconoscimento di immagini più largamente diffusi e studiati in modo da avere una panoramica dei metodi da testare per poi fare la scelta dell’algoritmo più adatto al caso di studio. Quindi si è proceduto con lo sviluppo dell’applicazione GeoPhotoHunt stessa, sia per quanto riguarda l’app front-end per dispositivi Android sia la parte back-end server. Infine si è passati ad una fase di test di algoritmi di riconoscimento di immagini in modo di avere una sufficiente quantità di dati sperimentali da permettere di effettuare una scelta dell’algoritmo più adatto al caso di studio. Al termine della fase di testing si è deciso di implementare su Android un algoritmo basato sulla distanza tra istogrammi di colore costruiti sulla scala cromatica HSV, questo metodo pur non essendo robusto in presenza di variazioni di luminosità e contrasto, rappresenta un buon compromesso tra prestazioni, complessità computazionale in modo da rendere la user experience quanto più coinvolgente.