916 resultados para computer vision


Relevância:

100.00% 100.00%

Publicador:

Resumo:

L’objectif de cette thèse par articles est de présenter modestement quelques étapes du parcours qui mènera (on espère) à une solution générale du problème de l’intelligence artificielle. Cette thèse contient quatre articles qui présentent chacun une différente nouvelle méthode d’inférence perceptive en utilisant l’apprentissage machine et, plus particulièrement, les réseaux neuronaux profonds. Chacun de ces documents met en évidence l’utilité de sa méthode proposée dans le cadre d’une tâche de vision par ordinateur. Ces méthodes sont applicables dans un contexte plus général, et dans certains cas elles on tété appliquées ailleurs, mais ceci ne sera pas abordé dans le contexte de cette de thèse. Dans le premier article, nous présentons deux nouveaux algorithmes d’inférence variationelle pour le modèle génératif d’images appelé codage parcimonieux “spike- and-slab” (CPSS). Ces méthodes d’inférence plus rapides nous permettent d’utiliser des modèles CPSS de tailles beaucoup plus grandes qu’auparavant. Nous démontrons qu’elles sont meilleures pour extraire des détecteur de caractéristiques quand très peu d’exemples étiquetés sont disponibles pour l’entraînement. Partant d’un modèle CPSS, nous construisons ensuite une architecture profonde, la machine de Boltzmann profonde partiellement dirigée (MBP-PD). Ce modèle a été conçu de manière à simplifier d’entraînement des machines de Boltzmann profondes qui nécessitent normalement une phase de pré-entraînement glouton pour chaque couche. Ce problème est réglé dans une certaine mesure, mais le coût d’inférence dans le nouveau modèle est relativement trop élevé pour permettre de l’utiliser de manière pratique. Dans le deuxième article, nous revenons au problème d’entraînement joint de machines de Boltzmann profondes. Cette fois, au lieu de changer de famille de modèles, nous introduisons un nouveau critère d’entraînement qui donne naissance aux machines de Boltzmann profondes à multiples prédictions (MBP-MP). Les MBP-MP sont entraînables en une seule étape et ont un meilleur taux de succès en classification que les MBP classiques. Elles s’entraînent aussi avec des méthodes variationelles standard au lieu de nécessiter un classificateur discriminant pour obtenir un bon taux de succès en classification. Par contre, un des inconvénients de tels modèles est leur incapacité de générer deséchantillons, mais ceci n’est pas trop grave puisque la performance de classification des machines de Boltzmann profondes n’est plus une priorité étant donné les dernières avancées en apprentissage supervisé. Malgré cela, les MBP-MP demeurent intéressantes parce qu’elles sont capable d’accomplir certaines tâches que des modèles purement supervisés ne peuvent pas faire, telles que celle de classifier des données incomplètes ou encore celle de combler intelligemment l’information manquante dans ces données incomplètes. Le travail présenté dans cette thèse s’est déroulé au milieu d’une période de transformations importantes du domaine de l’apprentissage à réseaux neuronaux profonds qui a été déclenchée par la découverte de l’algorithme de “dropout” par Geoffrey Hinton. Dropout rend possible un entraînement purement supervisé d’architectures de propagation unidirectionnel sans être exposé au danger de sur- entraînement. Le troisième article présenté dans cette thèse introduit une nouvelle fonction d’activation spécialement con ̧cue pour aller avec l’algorithme de Dropout. Cette fonction d’activation, appelée maxout, permet l’utilisation de aggrégation multi-canal dans un contexte d’apprentissage purement supervisé. Nous démontrons comment plusieurs tâches de reconnaissance d’objets sont mieux accomplies par l’utilisation de maxout. Pour terminer, sont présentons un vrai cas d’utilisation dans l’industrie pour la transcription d’adresses de maisons à plusieurs chiffres. En combinant maxout avec une nouvelle sorte de couche de sortie pour des réseaux neuronaux de convolution, nous démontrons qu’il est possible d’atteindre un taux de succès comparable à celui des humains sur un ensemble de données coriace constitué de photos prises par les voitures de Google. Ce système a été déployé avec succès chez Google pour lire environ cent million d’adresses de maisons.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The application of computer vision based quality control has been slowly but steadily gaining importance mainly due to its speed in achieving results and also greatly due to its non- destnictive nature of testing. Besides, in food applications it also does not contribute to contamination. However, computer vision applications in quality control needs the application of an appropriate software for image analysis. Eventhough computer vision based quality control has several advantages, its application has limitations as to the type of work to be done, particularly so in the food industries. Selective applications, however, can be highly advantageous and very accurate.Computer vision based image analysis could be used in morphometric measurements of fish with the same accuracy as the existing conventional method. The method is non-destructive and non-contaminating thus providing anadvantage in seafood processing.The images could be stored in archives and retrieved at anytime to carry out morphometric studies for biologists.Computer vision and subsequent image analysis could be used in measurements of various food products to assess uniformity of size. One product namely cutlet and product ingredients namely coating materials such as bread crumbs and rava were selected for the study. Computer vision based image analysis was used in the measurements of length, width and area of cutlets. Also the width of coating materials like bread crumbs was measured.Computer imaging and subsequent image analysis can be very effectively used in quality evaluations of product ingredients in food processing. Measurement of width of coating materials could establish uniformity of particles or the lack of it. The application of image analysis in bacteriological work was also done

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper discusses and compares the use of vision based and non-vision based technologies in developing intelligent environments. By reviewing the related projects that use vision based techniques in intelligent environment design, the achieved functions, technical issues and drawbacks of those projects are discussed and summarized, and the potential solutions for future improvement are proposed, which leads to the prospective direction of my PhD research.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A vision system for recognizing rigid and articulated three-dimensional objects in two-dimensional images is described. Geometrical models are extracted from a commercial computer aided design package. The models are then augmented with appearance and functional information which improves the system's hypothesis generation, hypothesis verification, and pose refinement. Significant advantages over existing CAD-based vision systems, which utilize only information available in the CAD system, are realized. Examples show the system recognizing, locating, and tracking a variety of objects in a robot work-cell and in natural scenes.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Optical characteristics of stirred curd were simultaneously monitored during syneresis in a 10-L cheese vat using computer vision and colorimetric measurements. Curd syneresis kinetic conditions were varied using 2 levels of milk pH (6.0 and 6.5) and 2 agitation speeds (12.1 and 27.2 rpm). Measured optical parameters were compared with gravimetric measurements of syneresis, taken simultaneously. The results showed that computer vision and colorimeter measurements have potential for monitoring syneresis. The 2 different phases, curd and whey, were distinguished by means of color differences. As syneresis progressed, the backscattered light became increasingly yellow in hue for circa 20 min for the higher stirring speed and circa 30 min for the lower stirring speed. Syneresis-related gravimetric measurements of importance to cheese making (e.g., curd moisture content, total solids in whey, and yield of whey) correlated significantly with computer vision and colorimetric measurements..

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The meltabilities of 14 process cheese samples were determined at 2 and 4 weeks after manufacture using sensory analysis, a computer vision method, and the Olson and Price test. Sensory analysis meltability correlated with both computer vision meltability (R-2 = 0.71, P < 0.001) and Olson and Price meltability (R-2 = 0.69, P < 0.001). There was a marked lack of correlation between the computer vision method and the Olson and Price test. This study showed that the Olson and Price test gave greater repeatability than the computer vision method. Results showed process cheese meltability decreased with increasing inorganic salt content and with lower moisture/fat ratios. There was very little evidence in this study to show that process cheese meltability changed between 2 and 4 weeks after manufacture..

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The rapid development of data transfer through internet made it easier to send the data accurate and faster to the destination. There are many transmission media to transfer the data to destination like e-mails; at the same time it is may be easier to modify and misuse the valuable information through hacking. So, in order to transfer the data securely to the destination without any modifications, there are many approaches like cryptography and steganography. This paper deals with the image steganography as well as with the different security issues, general overview of cryptography, steganography and digital watermarking approaches.  The problem of copyright violation of multimedia data has increased due to the enormous growth of computer networks that provides fast and error free transmission of any unauthorized duplicate and possibly manipulated copy of multimedia information. In order to be effective for copyright protection, digital watermark must be robust which are difficult to remove from the object in which they are embedded despite a variety of possible attacks. The message to be send safe and secure, we use watermarking. We use invisible watermarking to embed the message using LSB (Least Significant Bit) steganographic technique. The standard LSB technique embed the message in every pixel, but my contribution for this proposed watermarking, works with the hint for embedding the message only on the image edges alone. If the hacker knows that the system uses LSB technique also, it cannot decrypt correct message. To make my system robust and secure, we added cryptography algorithm as Vigenere square. Whereas the message is transmitted in cipher text and its added advantage to the proposed system. The standard Vigenere square algorithm works with either lower case or upper case. The proposed cryptography algorithm is Vigenere square with extension of numbers also. We can keep the crypto key with combination of characters and numbers. So by using these modifications and updating in this existing algorithm and combination of cryptography and steganography method we develop a secure and strong watermarking method. Performance of this watermarking scheme has been analyzed by evaluating the robustness of the algorithm with PSNR (Peak Signal to Noise Ratio) and MSE (Mean Square Error) against the quality of the image for large amount of data. While coming to see results of the proposed encryption, higher value of 89dB of PSNR with small value of MSE is 0.0017. Then it seems the proposed watermarking system is secure and robust for hiding secure information in any digital system, because this system collect the properties of both steganography and cryptography sciences.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Objective: To define and evaluate a Computer-Vision (CV) method for scoring Paced Finger-Tapping (PFT) in Parkinson's disease (PD) using quantitative motion analysis of index-fingers and to compare the obtained scores to the UPDRS (Unified Parkinson's Disease Rating Scale) finger-taps (FT). Background: The naked-eye evaluation of PFT in clinical practice results in coarse resolution to determine PD status. Besides, sensor mechanisms for PFT evaluation may cause patients discomfort. In order to avoid cost and effort of applying wearable sensors, a CV system for non-invasive PFT evaluation is introduced. Methods: A database of 221 PFT videos from 6 PD patients was processed. The subjects were instructed to position their hands above their shoulders besides the face and tap the index-finger against the thumb consistently with speed. They were facing towards a pivoted camera during recording. The videos were rated by two clinicians between symptom levels 0-to-3 using UPDRS-FT. The CV method incorporates a motion analyzer and a face detector. The method detects the face of testee in each video-frame. The frame is split into two images from face-rectangle center. Two regions of interest are located in each image to detect index-finger motion of left and right hands respectively. The tracking of opening and closing phases of dominant hand index-finger produces a tapping time-series. This time-series is normalized by the face height. The normalization calibrates the amplitude in tapping signal which is affected by the varying distance between camera and subject (farther the camera, lesser the amplitude). A total of 15 features were classified using K-nearest neighbor (KNN) classifier to characterize the symptoms levels in UPDRS-FT. The target ratings provided by the raters were averaged. Results: A 10-fold cross validation in KNN classified 221 videos between 3 symptom levels with 75% accuracy. An area under the receiver operating characteristic curves of 82.6% supports feasibility of the obtained features to replicate clinical assessments. Conclusions: The system is able to track index-finger motion to estimate tapping symptoms in PD. It has certain advantages compared to other technologies (e.g. magnetic sensors, accelerometers etc.) for PFT evaluation to improve and automate the ratings

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This project aims to apply image processing techniques in computer vision featuring an omnidirectional vision system to agricultural mobile robots (AMR) used for trajectory navigation problems, as well as localization matters. To carry through this task, computational methods based on the JSEG algorithm were used to provide the classification and the characterization of such problems, together with Artificial Neural Networks (ANN) for pattern recognition. Therefore, it was possible to run simulations and carry out analyses of the performance of JSEG image segmentation technique through Matlab/Octave platforms, along with the application of customized Back-propagation algorithm and statistical methods in a Simulink environment. Having the aforementioned procedures been done, it was practicable to classify and also characterize the HSV space color segments, not to mention allow the recognition of patterns in which reasonably accurate results were obtained.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The main application area in this project, is to deploy image processing and segmentation techniques in computer vision through an omnidirectional vision system to agricultural mobile robots (AMR) used for trajectory navigation problems, as well as localization matters. Thereby, computational methods based on the JSEG algorithm were used to provide the classification and the characterization of such problems, together with Artificial Neural Networks (ANN) for image recognition. Hence, it was possible to run simulations and carry out analyses of the performance of JSEG image segmentation technique through Matlab/Octave computational platforms, along with the application of customized Back-propagation Multilayer Perceptron (MLP) algorithm and statistical methods as structured heuristics methods in a Simulink environment. Having the aforementioned procedures been done, it was practicable to classify and also characterize the HSV space color segments, not to mention allow the recognition of segmented images in which reasonably accurate results were obtained. © 2010 IEEE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Inferences about leaf anatomical characteristics had largely been made by manually measuring diverse leaf regions, such as cuticle, epidermis and parenchyma to evaluate differences caused by environmental variables. Here we tested an approach for data acquisition and analysis in ecological quantitative leaf anatomy studies based on computer vision and pattern recognition methods. A case study was conducted on Gochnatia polymorpha (Less.) Cabrera (Asteraceae), a Neotropical savanna tree species that has high phenotypic plasticity. We obtained digital images of cross-sections of its leaves developed under different light conditions (sun vs. shade), different seasons (dry vs. wet) and in different soil types (oxysoil vs. hydromorphic soil), and analyzed several visual attributes, such as color, texture and tissues thickness in a perpendicular plane from microscopic images. The experimental results demonstrated that computational analysis is capable of distinguishing anatomical alterations in microscope images obtained from individuals growing in different environmental conditions. The methods presented here offer an alternative way to determine leaf anatomical differences. © 2013 Elsevier B.V.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Tesis en inglés. Eliminadas las páginas en blanco del pdf

Relevância:

100.00% 100.00%

Publicador:

Resumo:

[EN]This paper describes a low-cost system that allows the user to visualize different glasses models in live video. The user can also move the glasses to adjust its position on the face. The system, which runs at 9.5 frames/s on general-purpose hardware, has a homeostatic module that keeps image parameters controlled. This is achieved by using a camera with motorized zoom, iris, white balance, etc. This feature can be specially useful in environments with changing illumination and shadows, like in an optical shop. The system also includes a face and eye detection module and a glasses management module.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

[ES]This paper describes some simple but useful computer vision techniques for human-robot interaction. First, an omnidirectional camera setting is described that can detect people in the surroundings of the robot, giving their angular positions and a rough estimate of the distance. The device can be easily built with inexpensive components. Second, we comment on a color-based face detection technique that can alleviate skin-color false positives. Third, a simple head nod and shake detector is described, suitable for detecting affirmative/negative, approval/dissaproval, understanding/disbelief head gestures.