399 results for tao
Abstract:
Both commercial and scientific applications often need to transform color images into gray-scale images, e.g., to reduce the cost of printing color images in publications or to help color-blind people see the visual cues of color images. However, conventional color-to-gray algorithms are not ready for practical applications because they encounter the following problems: 1) visual cues are not well defined, so it is unclear how to preserve important cues in the transformed gray-scale images; 2) some algorithms have an extremely high computational cost; and 3) some require human-computer interaction to produce a reasonable transformation. To solve or at least reduce these problems, we propose a new algorithm based on a probabilistic graphical model, with the assumption that the image is defined over a Markov random field. The color-to-gray procedure can thus be regarded as a labeling process that preserves the newly well-defined visual cues of a color image in the transformed gray-scale image. Visual cues are measurements that a perceiver can extract from a color image. They indicate the state of some properties of the image that the perceiver is interested in perceiving. Different people may perceive different cues from the same color image, and three cues are defined in this paper, namely, color spatial consistency, image structure information, and color channel perception priority. We cast color to gray as a visual cue preservation procedure based on a probabilistic graphical model and optimize the model based on an integral minimization problem. We apply the new algorithm to both natural color images and artificial pictures, and demonstrate that the proposed approach outperforms representative conventional algorithms in terms of effectiveness and efficiency. In addition, it requires no human-computer interaction.
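To illustrate why a fixed conversion can lose visual cues, the following minimal sketch uses a weighted channel sum with Rec. 601 luma weights. This is an assumed baseline chosen for illustration, not the algorithm proposed in the abstract and not necessarily one of the methods it compares against; the function name `luminance_gray` and the two sample colors are hypothetical.

```python
import numpy as np

# Rec. 601 luma weights: an assumed, commonly used fixed conversion.
REC601 = np.array([0.299, 0.587, 0.114])

def luminance_gray(rgb):
    """Map RGB values in [0, 1] (last axis = channels) to gray by a weighted sum."""
    return np.asarray(rgb, dtype=float) @ REC601

# Two clearly different colors chosen so that their weighted sums coincide.
red = np.array([1.0, 0.0, 0.0])
green = np.array([0.0, 0.299 / 0.587, 0.0])

print(luminance_gray(red), luminance_gray(green))
```

Both colors map to essentially the same gray level, so an edge between a red region and this green region would vanish after conversion. Preserving such cues is the kind of problem the abstract describes.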
Abstract:
Eye detection plays an important role in many practical applications. This paper presents a novel two-step scheme for eye detection. The first step models an eye by a newly defined visual-context pattern (VCP), and the second step applies semisupervised boosting for precise detection. A VCP describes both the spatial and appearance relations between an eye region (region of eye) and a reference region (region of reference). The context feature of a VCP is extracted by using the integral image. To reduce human labeling effort, we apply semisupervised boosting, which integrates the context feature and the Haar-like features for precise eye detection. Experimental results on several standard face data sets demonstrate that the proposed approach is effective, robust, and efficient. We finally show that this approach is ready for practical applications.
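The integral image mentioned above is a standard summed-area table that lets the sum over any rectangle be read with four lookups. The sketch below shows only that general technique under assumed names (`integral_image`, `region_sum`); it does not reproduce the VCP context feature itself, whose definition the abstract does not give.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero row and column prepended."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def region_sum(ii, top, left, height, width):
    """Sum of img[top:top+height, left:left+width] using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

# Toy 4x4 "image" for demonstration only.
img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
print(region_sum(ii, 1, 1, 2, 2), img[1:3, 1:3].sum())
```

Once the table is built, each rectangle sum costs constant time regardless of its size, which is why Haar-like and similar region features are usually computed this way.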
Abstract:
Inspired by the human visual cognition mechanism, this paper presents a scene classification method based on an improved standard model feature. Compared with state-of-the-art efforts in scene classification, the newly proposed method is more robust, more selective, and of lower complexity. These advantages are demonstrated by two sets of experiments on both our own database and standard public ones. Furthermore, occlusion and disorder problems in scene classification for video surveillance are studied for the first time in this paper.
Abstract:
Mammographic mass detection is an important task for the early diagnosis of breast cancer. However, it is difficult to distinguish masses from normal regions because of their abundant morphological characteristics and ambiguous margins. To improve mass detection performance, it is essential to preprocess the mammogram effectively so as to preserve both the intensity distribution and the morphological characteristics of regions. In this paper, morphological component analysis is first introduced to decompose a mammogram into a piecewise-smooth component and a texture component. The former is utilized in our detection scheme as it effectively suppresses both structural noise and the effects of blood vessels. Then, we propose two novel concentric layer criteria to detect different types of suspicious regions in a mammogram. The combination is evaluated on the Digital Database for Screening Mammography, where 100 malignant cases and 50 benign cases are utilized. The sensitivity of the proposed scheme is 99% for malignant cases, 88% for benign cases, and 95.3% over all cases. The results show that the proposed detection scheme achieves satisfactory detection performance and a preferable compromise between sensitivity and false positive rate.
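The reported figures can be checked for internal consistency with a short calculation. This sketch assumes that sensitivity is the fraction of cases in which the mass is detected and that the overall figure pools the two groups; the variable names are illustrative.

```python
# Case counts and per-group sensitivities as stated in the abstract.
malignant_total, benign_total = 100, 50
malignant_sens, benign_sens = 0.99, 0.88

# Assumption: each sensitivity is detected cases divided by total cases in the group.
detected = round(malignant_sens * malignant_total) + round(benign_sens * benign_total)
overall = detected / (malignant_total + benign_total)

print(detected, round(100 * overall, 1))
```

Under these assumptions, 99 malignant and 44 benign detections give 143 of 150 cases, which matches the stated overall sensitivity of 95.3%.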
Abstract:
Watermarking aims to hide particular information in a carrier without changing the visual perception of the carrier itself. Local features are good candidates for addressing the watermark synchronization error caused by geometric distortions and have attracted great attention for content-based image watermarking. This paper presents a novel feature point-based image watermarking scheme that is robust against geometric distortions. The scale invariant feature transform (SIFT) is first adopted to extract feature points and to generate a disk for each feature point that is invariant to translation and scaling. For each disk, orientation alignment is then performed to achieve rotation invariance. Finally, the watermark is embedded in the middle-frequency discrete Fourier transform (DFT) coefficients of each disk to improve the robustness against common image processing operations. Extensive experimental results and comparisons with some representative image watermarking methods confirm the excellent performance of the proposed method in robustness against various geometric distortions as well as common image processing operations.
Abstract:
Video-based facial expression recognition is a challenging problem in computer vision and human-computer interaction. To address this problem, texture features have been extracted and widely used, because they can capture image intensity changes caused by skin deformation. However, existing texture features encounter problems with albedo and lighting variations. To solve both problems, we propose a new texture feature called image ratio features. Compared with previously proposed texture features, e.g., high gradient component features, image ratio features are more robust to albedo and lighting variations. In addition, to further improve facial expression recognition accuracy based on image ratio features, we combine image ratio features with facial animation parameters (FAPs), which describe the geometric motions of facial feature points. The performance evaluation is based on the Carnegie Mellon University Cohn-Kanade database, our own database, and the Japanese Female Facial Expression database. Experimental results show that the proposed image ratio feature is more robust to albedo and lighting variations, and that the combination of image ratio features and FAPs outperforms each feature alone. In addition, we study asymmetric facial expressions based on our own facial expression database and demonstrate the superior performance of our combined expression recognition system.
Abstract:
This paper presents a new image segmentation method that applies an edge-based level set method in a relay fashion. The proposed method segments an image in a series of nested subregions that are automatically created by shrinking the stabilized curves in their previous subregions. The final result is obtained by combining all boundaries detected in these subregions. The proposed method has the following three advantages: 1) it can be executed automatically without human-computer interaction; 2) it applies the edge-based level set method in a relay fashion to detect all boundaries; and 3) it automatically obtains a full segmentation without specifying the number of relays in advance. The comparison experiments illustrate that the proposed method performs better than representative level set methods and obtains similar or better results compared with other popular segmentation algorithms.
Abstract:
The Gaussian process latent variable model (GP-LVM) has been identified as an effective probabilistic approach for dimensionality reduction because it can obtain a low-dimensional manifold of a data set in an unsupervised fashion. However, the GP-LVM is insufficient for supervised learning tasks (e.g., classification and regression) because it ignores the class label information during dimensionality reduction. In this paper, a supervised GP-LVM is developed for supervised learning tasks, and the maximum a posteriori algorithm is introduced to estimate the positions of all samples in the latent variable space. We present experimental evidence suggesting that the supervised GP-LVM is able to use the class label information effectively and thus consistently outperforms the GP-LVM and the discriminative extension of the GP-LVM. A comparison with some supervised classification methods, such as Gaussian process classification and support vector machines, is also given to illustrate the advantage of the proposed method.