26 results for Automatic Image Annotation


Relevance:

100.00%

Publisher:

Abstract:

This paper presents a novel multi-label classification framework for domains with large numbers of labels. Automatic image annotation is such a domain, as the available semantic concepts typically number in the hundreds. The proposed framework comprises an initial clustering phase that breaks the original training set into several disjoint clusters of data. It then trains a multi-label classifier on the data of each cluster. Given a new test instance, the framework first finds the nearest cluster and then applies the corresponding model. Empirical results using two clustering algorithms, four multi-label classification algorithms and three image annotation data sets suggest that the proposed approach can improve the performance and reduce the training time of standard multi-label classification algorithms, particularly when the number of labels is large.
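The cluster-then-classify idea described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class name `ClusteredMultiLabel`, the naive deterministic k-means initialisation, and the per-cluster 1-nearest-neighbour label lookup are all assumptions standing in for the clustering and multi-label algorithms the authors actually evaluated.

```python
import math


def nearest(p, centroids, allowed=None):
    """Index of the centroid closest to p (optionally restricted to `allowed`)."""
    candidates = range(len(centroids)) if allowed is None else allowed
    return min(candidates, key=lambda i: math.dist(p, centroids[i]))


def kmeans(points, k, iters=20):
    """Plain k-means with evenly spaced initial centroids (an assumption)."""
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(c) / len(g) for c in zip(*g)]
    return centroids


class ClusteredMultiLabel:
    """Hypothetical wrapper: one simple multi-label model per cluster."""

    def __init__(self, k):
        self.k = k

    def fit(self, X, Y):
        # Phase 1: split the training set into disjoint clusters.
        self.centroids = kmeans(X, self.k)
        self.members = [[] for _ in range(self.k)]
        for x, y in zip(X, Y):
            self.members[nearest(x, self.centroids)].append((x, y))
        return self

    def predict(self, x):
        # Phase 2: route the instance to the nearest non-empty cluster,
        # then apply that cluster's model (here: 1-NN label-set lookup).
        non_empty = [i for i in range(self.k) if self.members[i]]
        c = nearest(x, self.centroids, non_empty)
        _, labels = min(self.members[c], key=lambda m: math.dist(x, m[0]))
        return set(labels)
```

Because each per-cluster model sees only its own cluster's examples, both training and prediction touch a fraction of the data, which is the source of the training-time reduction the abstract reports.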

Relevance:

100.00%

Publisher:

Abstract:

This paper presents an empirical study of multi-label classification methods and gives suggestions for multi-label classification that are effective for automatic image annotation applications. The study shows that the triple random ensemble multi-label classification algorithm (TREMLC) outperforms its counterparts, especially on the scene image dataset. The multi-label k-nearest neighbor (ML-kNN) and binary relevance (BR) learning algorithms perform well on the Corel image dataset. Based on the overall evaluation results, selected image examples are used to illustrate the label prediction performance of the algorithms. This provides an indication of the suitability of different multi-label classification methods for automatic image annotation under different problem settings.

Relevance:

100.00%

Publisher:

Abstract:

This paper presents an image-to-text translation platform consisting of image segmentation, region feature extraction, region blob clustering, and translation components. A multi-label learning method is suggested for realizing the translation component. Empirical studies show that the translation component achieves better predictive performance than its counterparts when it employs a dual-random ensemble multi-label classification algorithm.

Relevance:

80.00%

Publisher:

Abstract:

Innovative media management, annotation, delivery, and navigation services will enrich online shopping, help-desk services, and anytime-anywhere training over wireless devices. However, the semantic gap between the rich meaning that users want when they query and browse media and the shallowness of the content descriptions that one can actually compute is weakening today's automatic content-annotation systems. To address such problems, an approach that markedly departs from existing methods based on detecting and annotating low-level audio-visual features is advocated.

Relevance:

40.00%

Publisher:

Abstract:

This paper proposes a unique computational approach to extraction of expressive elements of motion pictures for deriving high level semantics of stories portrayed, thus enabling better video annotation and interpretation systems. This approach, motivated and directed by the existing cinematic conventions known as film grammar, as a first step towards demonstrating its effectiveness, uses the attributes of motion and shot length to define and compute a novel measure of tempo of a movie. Tempo flow plots are defined and derived for four full-length movies and edge analysis is performed leading to the extraction of dramatic story sections and events signaled by their unique tempo. The results confirm tempo as a useful attribute in its own right and a promising component of semantic constructs such as tone or mood of a film.

Relevance:

30.00%

Publisher:

Abstract:

A camera-based machine vision system for the automatic inspection of surface defects in aluminum die casting is presented. The system uses a hybrid image processing algorithm based on mathematical morphology to detect defects of different sizes and shapes. The defect inspection algorithm consists of two parts. The first is a parameter learning algorithm, in which a genetic algorithm is used to extract optimal structuring element parameters, and segmentation and noise removal thresholds. The second is a defect detection algorithm, in which the parameters obtained by the genetic algorithm are used for morphological operations. The machine vision system has been applied in an industrial setting to detect two types of casting defects: part mix-up and any defects on the surface of castings. The system achieves 99% or higher accuracy for both part mix-up and defect detection and is currently used in industry as part of normal production.
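As a rough illustration of the detection stage only, the sketch below thresholds a grayscale image and applies a morphological opening (erosion followed by dilation) with a square structuring element. The function names, the square element, and the fixed `t` and `r` values are assumptions made for the example; in the system described above these parameters are learned by a genetic algorithm, which is not shown here.

```python
def threshold(img, t):
    """Segment: mark pixels brighter than t as candidate defect pixels."""
    return [[1 if v > t else 0 for v in row] for row in img]


def _neighbourhood(mask, y, x, r):
    """Values under a (2r+1) x (2r+1) square element; outside the image counts as 0."""
    h, w = len(mask), len(mask[0])
    return [
        mask[y + dy][x + dx] if 0 <= y + dy < h and 0 <= x + dx < w else 0
        for dy in range(-r, r + 1)
        for dx in range(-r, r + 1)
    ]


def erode(mask, r):
    """Keep a pixel only if the whole structuring element fits inside the mask."""
    return [[int(all(_neighbourhood(mask, y, x, r))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def dilate(mask, r):
    """Set a pixel if any pixel under the structuring element is set."""
    return [[int(any(_neighbourhood(mask, y, x, r))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def detect_defects(img, t=128, r=1):
    """Opening removes blobs smaller than the element (noise), keeps larger ones."""
    return dilate(erode(threshold(img, t), r), r)
```

With `r=1`, an isolated bright pixel is erased by the erosion step, while a 3 x 3 bright patch survives and is restored to its original extent by the dilation.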

Relevance:

30.00%

Publisher:

Abstract:

A machine vision system is presented for the automatic inspection of surface defects in aluminium die casting. The system uses a hybrid image processing algorithm based on mathematical morphology to detect defects of different sizes and shapes. The defect inspection algorithm consists of two parts. The first is a parameter learning algorithm, in which a genetic algorithm is used to extract optimal structuring element parameters, and segmentation and noise removal thresholds. The second is a defect detection algorithm, in which the parameters obtained by the genetic algorithm are used for morphological operations. The machine vision system has been applied in an industrial setting to detect two types of casting defects: part mix-up and any defects on the surface of castings. The system achieves 99% or higher accuracy for both part mix-up and defect detection and is currently used in industry as part of normal production.

Relevance:

30.00%

Publisher:

Abstract:

Age Specific Human-Computer Interaction (ASHCI) has vast potential applications in daily life. However, automatic age estimation techniques are still underdeveloped. One of the main reasons is that the aging effects on human faces present several unique characteristics which make age estimation a challenging task that requires non-standard classification approaches. Given the special nature of facial aging effects, this paper proposes the AGES (AGing pattErn Sub-space) method for automatic age estimation. The basic idea is to model the aging pattern, which is defined as a sequence of personal aging face images, by learning a representative subspace. The proper aging pattern for an unseen face image is then determined by the projection in the subspace that can best reconstruct the face image, while the position of the face image in that aging pattern will indicate its age. The AGES method has shown encouraging performance in comparative experiments, both as an age estimator and as an age range estimator.

Relevance:

30.00%

Publisher:

Abstract:

While recognition of most facial variations, such as identity, expression, and gender, has been extensively studied, automatic age estimation has rarely been explored. In contrast to other facial variations, aging variation presents several unique characteristics which make age estimation a challenging task. This paper proposes an automatic age estimation method named AGES (AGing pattErn Subspace). The basic idea is to model the aging pattern, which is defined as the sequence of a particular individual's face images sorted in time order, by constructing a representative subspace. The proper aging pattern for a previously unseen face image is determined by the projection in the subspace that can reconstruct the face image with minimum reconstruction error, while the position of the face image in that aging pattern will then indicate its age. In the experiments, AGES and its variants are compared with the limited existing age estimation methods (WAS and AAS) and some well-established classification methods (kNN, BP, C4.5, and SVM). Moreover, a comparison with human perception ability on age is conducted. It is interesting to note that the performance of AGES is not only significantly better than that of all the other algorithms, but also comparable to that of the human observers.

Relevance:

30.00%

Publisher:

Abstract:

This thesis focuses on novel technologies for facial image analysis, covering three topics: face recognition under uncontrolled conditions, automatic facial age estimation, and context-aware fusion of face and gait. They are either key issues bridging laboratory research and real applications, or innovative problems that have barely been studied before.

Relevance:

30.00%

Publisher:

Abstract:

Tonewood for musical instruments is quarter-sawn and frequently quality-graded based on visual appearance and on mechanical and acoustic properties. The assessment uses simple human (subjective) observation, and two "experts" can rate the same sample differently. This paper describes the application of integral transforms (Fourier and Radon) for automatic (objective) assessment of the visual appearance of 10 Sitka spruce (Picea sitchensis) sample images. This work considers surface classification on the basis of grain orientation, count, spacing, and evenness or uniformity.

Relevance:

30.00%

Publisher:

Abstract:

This paper addresses the challenge of bridging the semantic gap that exists between the simplicity of features that can be currently computed in automated content indexing systems and the richness of semantics in user queries posed for media search and retrieval. It proposes a unique computational approach to extraction of expressive elements of motion pictures for deriving high-level semantics of stories portrayed, thus enabling rich video annotation and interpretation. This approach, motivated and directed by the existing cinematic conventions known as film grammar, as a first step toward demonstrating its effectiveness, uses the attributes of motion and shot length to define and compute a novel measure of tempo of a movie. Tempo flow plots are defined and derived for a number of full-length movies and edge analysis is performed leading to the extraction of dramatic story sections and events signaled by their unique tempo. The results confirm tempo as a useful high-level semantic construct in its own right and a promising component of others such as rhythm, tone or mood of a film. In addition to the development of this computable tempo measure, a study is conducted as to the usefulness of biasing it toward either of its constituents, namely, motion or shot length. Finally, a refinement is made to the shot length normalizing mechanism, driven by the peculiar characteristics of shot length distribution exhibited by movies. Results of these additional studies, and possible applications and limitations are discussed.
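The abstract above does not give the tempo formula, so the following is only one plausible shape for such a measure, written as a sketch under stated assumptions: shorter-than-median shots and above-average motion both raise tempo, each term is normalised by its standard deviation, and the hypothetical weights `alpha` and `beta` play the role of biasing the measure toward shot length or motion.

```python
import statistics


def tempo(shot_lengths, motions, alpha=0.5, beta=0.5):
    """Per-shot tempo as a weighted sum of two normalised deviations.

    Assumed form (not taken from the abstract):
        alpha * (median_len - len) / std_len + beta * (motion - mean_motion) / std_motion
    """
    med_s = statistics.median(shot_lengths)
    sd_s = statistics.pstdev(shot_lengths) or 1.0  # guard against zero spread
    mu_m = statistics.mean(motions)
    sd_m = statistics.pstdev(motions) or 1.0
    return [
        alpha * (med_s - s) / sd_s + beta * (m - mu_m) / sd_m
        for s, m in zip(shot_lengths, motions)
    ]
```

Setting `alpha=1, beta=0` yields a tempo driven purely by shot length, and `alpha=0, beta=1` one driven purely by motion. A smoothed version of the resulting sequence is the kind of curve on which edge analysis could then be run.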

Relevance:

30.00%

Publisher:

Abstract:

In order to enable high-level semantics-based video annotation and interpretation, we tackle the problem of automatic decomposition of motion pictures into meaningful story units, namely scenes. Since a scene is a complicated and subjective concept, we first propose guidelines from film production to determine when a scene change occurs in film. We examine different rules and conventions followed as part of Film Grammar to guide and shape our algorithmic solution for determining a scene boundary. Two different techniques are proposed as new solutions in this paper. Our experimental results on 10 full-length movies show that our technique based on shot sequence coherence performs well and reasonably better than the color edges-based approach.