978 results for Automatic Image Annotation


Relevance:

30.00%

Abstract:

This project is based on Artificial Intelligence (A.I.) and Digital Image Processing (I.P.) for automatic condition monitoring of sleepers in the railway track. Rail inspection is a very important task in railway maintenance, both for traffic safety and for preventing dangerous situations. Monitoring railway track infrastructure is an important aspect in which periodic inspection of the rail rolling plane is required. To date, inspection of the railroad has been carried out manually by trained personnel: a human operator walks along the railway track searching for sleeper anomalies. This method of monitoring is no longer acceptable because of its slowness and subjectivity. Hence, it is desirable to automate such intuitive human skills in order to develop more robust and reliable testing methods. Images of wooden sleepers have been used as data for my project. The aim of this project is to present a vision-based technique for inspecting railway sleepers (wooden planks under the railway track) by automatic interpretation of Non-Destructive Test (NDT) data, using A.I. techniques to determine the results of inspection.

Relevance:

30.00%

Abstract:

A camera-based machine vision system for the automatic inspection of surface defects in aluminum die casting is presented. The system uses a hybrid image processing algorithm based on mathematical morphology to detect defects with different sizes and shapes. The defect inspection algorithm consists of two parts. The first is a parameter learning algorithm, in which a genetic algorithm is used to extract optimal structuring element parameters, and segmentation and noise removal thresholds. The second is a defect detection algorithm, in which the parameters obtained by the genetic algorithm are used for morphological operations. The machine vision system has been applied in an industrial setting to detect two types of casting defects: parts mix-up and any defects on the surface of castings. The system performs with 99% or higher accuracy for both parts mix-up and defect detection and is currently used in industry as part of normal production.
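The detection stage described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a square structuring element, uses a black top-hat (closing minus original) to highlight dark surface defects, and takes the half-width `k` and the `threshold` as hand-picked stand-ins for the values that the genetic algorithm would learn. All function names are hypothetical.

```python
def _window_op(img, k, op):
    """Apply op (min or max) over a (2k+1) x (2k+1) window, clipped at the borders."""
    h, w = len(img), len(img[0])
    return [[op(img[y][x]
                for y in range(max(0, r - k), min(h, r + k + 1))
                for x in range(max(0, c - k), min(w, c + k + 1)))
             for c in range(w)]
            for r in range(h)]


def erode(img, k):
    """Grayscale erosion with a square structuring element of half-width k."""
    return _window_op(img, k, min)


def dilate(img, k):
    """Grayscale dilation with a square structuring element of half-width k."""
    return _window_op(img, k, max)


def black_top_hat(img, k):
    """Closing minus original: large where a dark spot is smaller than the element."""
    closed = erode(dilate(img, k), k)
    return [[closed[r][c] - img[r][c] for c in range(len(img[0]))]
            for r in range(len(img))]


def detect_defects(img, k, threshold):
    """Binary mask of pixels whose top-hat response exceeds the threshold.

    k and threshold are assumed inputs here; in the abstract they are
    the parameters found by the genetic algorithm.
    """
    return [[1 if v > threshold else 0 for v in row]
            for row in black_top_hat(img, k)]


# Toy surface: uniform brightness 200 with one dark defect pixel.
surface = [[200] * 7 for _ in range(7)]
surface[3][3] = 50
mask = detect_defects(surface, k=1, threshold=100)
```

In this toy case only the single dark pixel is flagged. A real system would work on camera images and would need the noise removal step mentioned in the abstract.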

Relevance:

30.00%

Abstract:

A machine vision system is presented for the automatic inspection of surface defects in aluminium die casting. The system uses a hybrid image processing algorithm based on mathematical morphology to detect defects with different sizes and shapes. The defect inspection algorithm consists of two parts. The first is a parameter learning algorithm, in which a genetic algorithm is used to extract optimal structuring element parameters, and segmentation and noise removal thresholds. The second is a defect detection algorithm, in which the parameters obtained by the genetic algorithm are used for morphological operations. The machine vision system has been applied in an industrial setting to detect two types of casting defects: parts mix-up and any defects on the surface of castings. The system performs with 99% or higher accuracy for both parts mix-up and defect detection and is currently used in industry as part of normal production.

Relevance:

30.00%

Abstract:

Age Specific Human-Computer Interaction (ASHCI) has vast potential applications in daily life. However, automatic age estimation techniques are still underdeveloped. One of the main reasons is that the aging effects on human faces present several unique characteristics, which make age estimation a challenging task that requires non-standard classification approaches. Based on the particular characteristics of facial aging effects, this paper proposes the AGES (AGing pattErn Subspace) method for automatic age estimation. The basic idea is to model the aging pattern, which is defined as a sequence of personal aging face images, by learning a representative subspace. The proper aging pattern for an unseen face image is then determined by the projection in the subspace that can best reconstruct the face image, while the position of the face image in that aging pattern indicates its age. The AGES method has shown encouraging performance in the comparative experiments, both as an age estimator and as an age range estimator.

Relevance:

30.00%

Abstract:

While recognition of most facial variations, such as identity, expression, and gender, has been extensively studied, automatic age estimation has rarely been explored. In contrast to other facial variations, aging variation presents several unique characteristics which make age estimation a challenging task. This paper proposes an automatic age estimation method named AGES (AGing pattErn Subspace). The basic idea is to model the aging pattern, which is defined as the sequence of a particular individual's face images sorted in time order, by constructing a representative subspace. The proper aging pattern for a previously unseen face image is determined by the projection in the subspace that can reconstruct the face image with minimum reconstruction error, while the position of the face image in that aging pattern will then indicate its age. In the experiments, AGES and its variants are compared with the limited existing age estimation methods (WAS and AAS) and some well-established classification methods (kNN, BP, C4.5, and SVM). Moreover, a comparison with human perception ability on age is conducted. It is interesting to note that the performance of AGES is not only significantly better than that of all the other algorithms, but also comparable to that of the human observers.
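The age-position search by minimum reconstruction error can be sketched in a deliberately reduced form. The sketch assumes the subspace has already been learned and, for brevity, has a single basis vector; learning the subspace from incomplete aging patterns is not shown. The names `estimate_age`, `mean_pattern` and `basis` are hypothetical, and an aging pattern is assumed to be the concatenation of one fixed-length feature vector per age.

```python
def estimate_age(face, mean_pattern, basis, dim):
    """Return (age_index, error) giving the smallest reconstruction error.

    face         -- feature vector of the unseen face, length dim
    mean_pattern -- mean aging pattern, length n_ages * dim
    basis        -- single basis vector of the subspace, length n_ages * dim
    dim          -- number of features per age position
    """
    n_ages = len(mean_pattern) // dim
    best_age, best_err = None, float("inf")
    for age in range(n_ages):
        lo = age * dim
        mu = mean_pattern[lo:lo + dim]   # part of the mean at this age slot
        w = basis[lo:lo + dim]           # part of the basis at this age slot
        centred = [f - m for f, m in zip(face, mu)]
        # Least-squares coefficient using only the entries that are present
        # when the face is placed at this age position.
        ww = sum(x * x for x in w)
        coeff = sum(c * x for c, x in zip(centred, w)) / ww if ww > 0 else 0.0
        err = sum((c - coeff * x) ** 2 for c, x in zip(centred, w))
        if err < best_err:
            best_age, best_err = age, err
    return best_age, best_err


# Toy subspace: 3 age positions, 2 features per position.
mean_pattern = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0]
basis = [1.0, 0.0, 1.0, 0.0, 1.0, 0.0]
age, err = estimate_age([5.0, 2.0], mean_pattern, basis, dim=2)
```

Here the face is reconstructed exactly only at the third position (index 2), so that index is returned as the estimated age.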

Relevance:

30.00%

Abstract:

This thesis focuses on novel technologies for facial image analysis and covers three topics: face recognition under uncontrolled conditions, automatic facial age estimation, and context-aware fusion of face and gait. These are either key issues in bridging laboratory research and real applications, or innovative problems that have barely been studied before.

Relevance:

30.00%

Abstract:

Tonewood for musical instruments is quarter-sawn and frequently quality-graded based on visual appearance, mechanical and acoustic properties. The assessment uses simple human (subjective) observation, and two "experts" can rate the same sample differently. This paper describes the application of integral transforms (Fourier and Radon) for automatic (objective) assessment of the visual appearance of 10 Sitka spruce (Picea sitchensis) sample images. This work considers surface classification on the basis of grain orientation, count, spacing, and evenness or uniformity.

Relevance:

30.00%

Abstract:

This paper addresses the challenge of bridging the semantic gap that exists between the simplicity of features that can be currently computed in automated content indexing systems and the richness of semantics in user queries posed for media search and retrieval. It proposes a unique computational approach to extraction of expressive elements of motion pictures for deriving high-level semantics of stories portrayed, thus enabling rich video annotation and interpretation. This approach, motivated and directed by the existing cinematic conventions known as film grammar, as a first step toward demonstrating its effectiveness, uses the attributes of motion and shot length to define and compute a novel measure of tempo of a movie. Tempo flow plots are defined and derived for a number of full-length movies and edge analysis is performed leading to the extraction of dramatic story sections and events signaled by their unique tempo. The results confirm tempo as a useful high-level semantic construct in its own right and a promising component of others such as rhythm, tone or mood of a film. In addition to the development of this computable tempo measure, a study is conducted as to the usefulness of biasing it toward either of its constituents, namely, motion or shot length. Finally, a refinement is made to the shot length normalizing mechanism, driven by the peculiar characteristics of shot length distribution exhibited by movies. Results of these additional studies, and possible applications and limitations are discussed.
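A tempo measure built from shot length and motion, with a bias toward either constituent, might look like the sketch below. The abstract does not give the formula, so the normalisation here is an assumption: shot length is centred on its median and motion on its mean, each is divided by its standard deviation, and the hypothetical weights `alpha` and `beta` set the bias. The smoothing and edge analysis of the tempo flow plot are omitted.

```python
from statistics import mean, median, pstdev


def tempo(shot_lengths, motions, alpha=0.5, beta=0.5):
    """Per-shot tempo: short shots and high motion both raise the value.

    shot_lengths -- length of each shot (e.g. in frames)
    motions      -- a motion magnitude for each shot
    alpha, beta  -- assumed weights biasing tempo toward shot length or motion
    """
    med_s = median(shot_lengths)
    sd_s = pstdev(shot_lengths) or 1.0   # avoid division by zero on constant input
    mu_m = mean(motions)
    sd_m = pstdev(motions) or 1.0
    return [alpha * (med_s - s) / sd_s + beta * (m - mu_m) / sd_m
            for s, m in zip(shot_lengths, motions)]


# Toy film: a calm stretch, two fast and busy shots, then calm again.
lengths = [10, 10, 2, 2, 10]
motion = [1, 1, 5, 5, 1]
flow = tempo(lengths, motion)
```

With these toy values the two short, high-motion shots receive the highest tempo. Setting `alpha=1, beta=0` reduces the measure to the shot-length term alone.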

Relevance:

30.00%

Abstract:

In order to enable high-level semantics-based video annotation and interpretation, we tackle the problem of automatic decomposition of motion pictures into meaningful story units, namely scenes. Since a scene is a complicated and subjective concept, we first propose guidelines from film production to determine when a scene change occurs in film. We examine different rules and conventions followed as part of Film Grammar to guide and shape our algorithmic solution for determining a scene boundary. Two different techniques are proposed as new solutions in this paper. Our experimental results on 10 full-length movies show that our technique based on shot sequence coherence performs well and reasonably better than the color edges-based approach.

Relevance:

30.00%

Abstract:

We present results on an extension to our approach for automatic sports video annotation. Sports video is augmented with accelerometer data from wrist bands worn by umpires in the game. We solve the problem of automatic segmentation and robust gesture classification using a hierarchical hidden Markov model in conjunction with a filler model. The hierarchical model allows us to consider gestures at different levels of abstraction, and the filler model allows us to handle extraneous umpire movements. Results are presented for labeling video of a game of cricket.

Relevance:

30.00%

Abstract:

This paper describes an application of camera motion estimation to index cricket games. The shots are labeled with the type of shot: glance left, glance right, left drive, right drive, left cut, right pull and straight drive. The method has the advantages that it is fast and avoids complex image segmentation. The classification of the cricket shots is done using an incremental learning algorithm. We tested the method on over 600 shots and the results show that the system has a classification accuracy of 74%.

Relevance:

30.00%

Abstract:

This paper details research that will explore the analysis of human behaviour via video surveillance. Digital computer images will be obtained from video footage of a real world scene, and positions of people in the scene will be identified and tracked through each frame in the sequence.

The noted positions will build into a pattern of motion that can be examined and classified. It is proposed that specific events, such as panic or fight situations, will have unique, and therefore identifying, characteristics that will enable automatic detection of such events.

It is envisaged that active cameras will be used when a situation of interest occurs, to enable more information to be extracted from the scene (e.g., panning to follow action, or zooming to enhance detail).

Relevance:

30.00%

Abstract:

The self-quotient image is a biologically inspired representation which has been proposed as an illumination invariant feature for automatic face recognition. Owing to the lack of strong domain specific assumptions underlying this representation, it can be readily extracted from raw images irrespective of the person's pose, facial expression, etc. What makes the self-quotient image additionally attractive is that it can be computed quickly and in a closed form using simple low-level image operations. However, it is generally accepted that the self-quotient is insufficiently robust to large illumination changes, which is why it is mainly used in applications in which low precision is an acceptable compromise for high recall (e.g. retrieval systems). Yet, in this paper we demonstrate that the performance of this representation in challenging illuminations has been greatly underestimated. We show that its error rate can be reduced by over an order of magnitude, without any changes to the representation itself. Rather, we focus on the manner in which the dissimilarity between two self-quotient images is computed. By modelling the dominant sources of noise affecting the representation, we propose and evaluate a series of different dissimilarity measures, the best of which reduces the initial error rate of 63.0% down to only 5.7% on the notoriously challenging YaleB data set.
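For readers unfamiliar with the representation itself, a self-quotient image is the input image divided pixel-wise by a smoothed copy of itself. The sketch below illustrates only that step, not the dissimilarity measures the paper proposes. It assumes a plain box filter for the smoothing (other smoothing kernels are possible) and a small `eps` to avoid division by zero; the function names are hypothetical.

```python
def box_smooth(img, k):
    """Mean over a (2k+1) x (2k+1) window, clipped at the image borders."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            vals = [img[y][x]
                    for y in range(max(0, r - k), min(h, r + k + 1))
                    for x in range(max(0, c - k), min(w, c + k + 1))]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out


def self_quotient(img, k=1, eps=1e-6):
    """Pixel-wise ratio of the image to its smoothed version.

    The box filter and eps are assumptions made for this sketch.
    """
    smooth = box_smooth(img, k)
    return [[img[r][c] / (smooth[r][c] + eps) for c in range(len(img[0]))]
            for r in range(len(img))]


# Toy image and the same image under three times brighter illumination.
image = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
brighter = [[3 * v for v in row] for row in image]
q_image = self_quotient(image)
q_brighter = self_quotient(brighter)
```

A uniform change in brightness cancels in the ratio, so `q_image` and `q_brighter` agree up to the effect of `eps`. Spatially varying illumination is only partly cancelled, which is consistent with the robustness limits discussed in the abstract.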

Relevance:

30.00%

Abstract:

In this chapter we described a novel framework for automatic face recognition in the presence of varying illumination, primarily applicable to matching face sets or sequences. The framework is based on simple image processing filters that compete with unprocessed greyscale input to yield a single matching score between individuals. By performing all numerically demanding computation offline, our method (i) retains the matching efficiency of simple image filters, while (ii) offering greatly increased robustness, as all online processing is performed in closed form. Evaluated on a large, real-world data corpus, the proposed framework was shown to be successful in video-based recognition across a wide range of illumination, pose and face motion pattern changes.

Relevance:

30.00%

Abstract:

The objective of this work is to recognize all the frontal faces of a character in the closed world of a movie or situation comedy, given a small number of query faces. This is challenging because faces in a feature-length film are relatively uncontrolled, with a wide variability of scale, pose, illumination, and expression, and may also be partially occluded. We develop a recognition method based on a cascade of processing steps that normalize for the effects of the changing imaging environment. In particular there are three areas of novelty: (i) we suppress the background surrounding the face, enabling the maximum area of the face to be retained for recognition rather than a subset; (ii) we include a pose refinement step to optimize the registration between the test image and face exemplar; and (iii) we use robust distance to a sub-space to allow for partial occlusion and expression change. The method is applied and evaluated on several feature-length films. It is demonstrated that high recall rates (over 92%) can be achieved whilst maintaining good precision (over 93%).