976 results for image set


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Faces are complex patterns that often differ in only subtle ways. Face recognition algorithms have difficulty in coping with differences in lighting, cameras, pose, expression, etc. We propose a novel approach for facial recognition based on a new feature extraction method called fractal image-set encoding. This feature extraction method is a specialized fractal image coding technique that makes fractal codes more suitable for object and face recognition. A fractal code of a gray-scale image can be divided into two parts – geometrical parameters and luminance parameters. We show that fractal codes for an image are not unique and that we can change the set of fractal parameters without significant change in the quality of the reconstructed image. Fractal image-set coding keeps geometrical parameters the same for all images in the database. Differences between images are captured in the non-geometrical or luminance parameters – which are faster to compute. Results on a subset of the XM2VTS database are presented.
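As a rough illustration of why the luminance parameters are cheap once the geometry is fixed, the sketch below fits a contrast `s` and brightness `o` per block by ordinary least squares, the usual fit in fractal image coding. The function names, the flat pixel-list layout, and the assumption that each domain block has already been downsampled to the size of its range block are all hypothetical choices made for this example, not details taken from the abstract.

```python
def fit_luminance(domain, rng):
    """Least-squares contrast s and brightness o so that s * domain + o ~ rng.

    `domain` and `rng` are equal-length lists of pixel intensities
    (assumption: the domain block is already downsampled to range size).
    """
    n = len(domain)
    mean_d = sum(domain) / n
    mean_r = sum(rng) / n
    var_d = sum((d - mean_d) ** 2 for d in domain)
    if var_d == 0:
        # Flat domain block: contrast is undefined, so use brightness only.
        return 0.0, mean_r
    cov = sum((d - mean_d) * (r - mean_r) for d, r in zip(domain, rng))
    s = cov / var_d
    o = mean_r - s * mean_d
    return s, o


def encode_with_shared_geometry(geometry, image):
    """Compute one (s, o) pair per block for a fixed, shared geometry.

    `geometry` is a list of (range_indices, domain_indices) pairs that is
    reused for every image; `image` is a flat list of pixel intensities.
    """
    codes = []
    for range_idx, domain_idx in geometry:
        domain = [image[i] for i in domain_idx]
        rng = [image[i] for i in range_idx]
        codes.append(fit_luminance(domain, rng))
    return codes
```

Because the block pairing is never searched for, encoding a new image reduces to one closed-form fit per block, and the resulting lists of `(s, o)` pairs can be compared directly between images.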

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Existing multi-model approaches for image set classification extract local models by clustering each image set individually only once, with fixed clusters used for matching with other image sets. However, this may result in the two closest clusters representing different characteristics of an object, due to different undesirable environmental conditions (such as variations in illumination and pose). To address this problem, we propose to constrain the clustering of each query image set by forcing the clusters to resemble the clusters in the gallery image sets. We first define a Frobenius norm distance between subspaces over Grassmann manifolds based on reconstruction error. We then extract local linear subspaces from a gallery image set via sparse representation. For each local linear subspace, we adaptively construct the corresponding closest subspace from the samples of a probe image set by joint sparse representation. We show that by minimising the sparse representation reconstruction error, we approach the nearest point on a Grassmann manifold. Experiments on Honda, ETH-80 and Cambridge-Gesture datasets show that the proposed method consistently outperforms several other recent techniques, such as Affine Hull based Image Set Distance (AHISD), Sparse Approximated Nearest Points (SANP) and Manifold Discriminant Analysis (MDA).
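To make the idea of a Frobenius-norm distance between subspaces concrete, here is a minimal sketch using the projection metric, a common distance on Grassmann manifolds. The abstract does not give the exact form of the authors' reconstruction-error-based distance, so this is a stand-in rather than their definition; the function names and the column-per-sample layout are assumptions.

```python
import numpy as np


def orthonormal_basis(samples, dim):
    """Orthonormal basis of the `dim` leading directions of an image set.

    `samples` has shape (n_features, n_samples), one image per column.
    """
    u, _, _ = np.linalg.svd(samples, full_matrices=False)
    return u[:, :dim]


def projection_distance(u1, u2):
    """Projection metric: ||U1 U1^T - U2 U2^T||_F / sqrt(2).

    Depends only on the subspaces spanned, not on the bases chosen.
    """
    p1 = u1 @ u1.T
    p2 = u2 @ u2.T
    return float(np.linalg.norm(p1 - p2, ord="fro") / np.sqrt(2))
```

Comparing projection matrices rather than basis matrices is what makes the value well defined on the manifold: two different orthonormal bases of the same subspace give a distance of zero.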

Relevance:

100.00% 100.00%

Publisher:

Abstract:

State-of-the-art image-set matching techniques typically implicitly model each image-set with a Gaussian distribution. Here, we propose to go beyond these representations and model image-sets as probability distribution functions (PDFs) using kernel density estimators. To compare and match image-sets, we exploit Csiszár f-divergences, which bear strong connections to the geodesic distance defined on the space of PDFs, i.e., the statistical manifold. Furthermore, we introduce valid positive definite kernels on the statistical manifold, which let us make use of more powerful classification schemes to match image-sets. Finally, we introduce a supervised dimensionality reduction technique that learns a latent space where f-divergences reflect the class labels of the data. Our experiments on diverse problems, such as video-based face recognition and dynamic texture classification, evidence the benefits of our approach over the state-of-the-art image-set matching methods.
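One way to picture the pipeline is to fit a kernel density estimate to each image-set and then estimate an f-divergence from the samples. The sketch below uses an isotropic Gaussian kernel with a hand-picked bandwidth and the symmetric Kullback-Leibler (Jeffreys) divergence, which is one member of the f-divergence family. The abstract does not say which divergences, kernels, or estimators the authors use, so every name and parameter here is an assumption.

```python
import numpy as np


def kde_log_density(points, data, bandwidth):
    """Log-density of an isotropic Gaussian KDE built on `data`.

    `points` is (m, d) and `data` is (n, d); returns an array of length m.
    """
    d = data.shape[1]
    sq_dist = ((points[:, None, :] - data[None, :, :]) ** 2).sum(axis=2)
    log_kernel = (-0.5 * sq_dist / bandwidth**2
                  - 0.5 * d * np.log(2 * np.pi * bandwidth**2))
    # Numerically stable log of the mean kernel value.
    peak = log_kernel.max(axis=1, keepdims=True)
    log_mean = peak + np.log(np.exp(log_kernel - peak).mean(axis=1, keepdims=True))
    return log_mean.ravel()


def kl_divergence(set_p, set_q, bandwidth=1.0):
    """Plug-in estimate of KL(p || q), averaged over the samples of set_p."""
    log_p = kde_log_density(set_p, set_p, bandwidth)
    log_q = kde_log_density(set_p, set_q, bandwidth)
    return float(np.mean(log_p - log_q))


def symmetric_kl(set_p, set_q, bandwidth=1.0):
    """Jeffreys divergence: KL(p || q) + KL(q || p)."""
    return (kl_divergence(set_p, set_q, bandwidth)
            + kl_divergence(set_q, set_p, bandwidth))
```

A nearest-neighbour matcher would assign a query set to the gallery set with the smallest divergence. The plug-in estimator is biased for small sets, so its values are better read as relative rankings than as exact divergences.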

Relevance:

70.00% 70.00%

Publisher:

Abstract:

We address the problem of face recognition on video by employing the recently proposed probabilistic linear discriminant analysis (PLDA). The PLDA has been shown to be robust against pose and expression in image-based face recognition. In this research, the method is extended and applied to video where image set to image set matching is performed. We investigate two approaches of computing similarities between image sets using the PLDA: the closest pair approach and the holistic sets approach. To better model face appearances in video, we also propose the heteroscedastic version of the PLDA which learns the within-class covariance of each individual separately. Our experiments on the VidTIMIT and Honda datasets show that the combination of the heteroscedastic PLDA and the closest pair approach achieves the best performance.
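The closest pair approach can be sketched as taking the best pairwise score over all image pairs drawn from the two sets. In the example below a negative squared Euclidean distance stands in for the PLDA similarity score, which is not reproduced here; the function names and the dictionary-based gallery are hypothetical. The holistic sets approach is not sketched because it depends on the PLDA model itself.

```python
def neg_sq_euclidean(a, b):
    """Placeholder similarity (assumption): higher means more alike."""
    return -sum((x - y) ** 2 for x, y in zip(a, b))


def closest_pair_similarity(set_a, set_b, score=neg_sq_euclidean):
    """Similarity of two image sets = best score over all cross-set pairs."""
    return max(score(a, b) for a in set_a for b in set_b)


def identify(probe_set, gallery, score=neg_sq_euclidean):
    """Return the gallery identity whose image set best matches the probe.

    `gallery` maps an identity label to a list of feature vectors.
    """
    return max(
        gallery,
        key=lambda name: closest_pair_similarity(probe_set, gallery[name], score),
    )
```

Because only the single best pair counts, one well-matched frame is enough to link two videos, which makes the rule tolerant of poor frames elsewhere in either set.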

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Traditional nearest points methods use all the samples in an image set to construct a single convex or affine hull model for classification. However, strong artificial features and noisy data may be generated from combinations of training samples when significant intra-class variations and/or noise occur in the image set. Existing multi-model approaches extract local models by clustering each image set individually only once, with fixed clusters used for matching with various image sets. This may not be optimal for discrimination, as undesirable environmental conditions (e.g., illumination and pose variations) may result in the two closest clusters representing different characteristics of an object (e.g., a frontal face being compared to a non-frontal face). To address the above problem, we propose a novel approach to enhance nearest points based methods by integrating affine/convex hull classification with an adapted multi-model approach. We first extract multiple local convex hulls from a query image set via maximum margin clustering to diminish the artificial variations and constrain the noise in local convex hulls. We then propose adaptive reference clustering (ARC) to constrain the clustering of each gallery image set by forcing the clusters to resemble the clusters in the query image set. By applying ARC, noisy clusters in the query set can be discarded. Experiments on Honda, MoBo and ETH-80 datasets show that the proposed method outperforms single model approaches and other recent techniques, such as Sparse Approximated Nearest Points, Mutual Subspace Method and Manifold Discriminant Analysis.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Feature-based image watermarking schemes, which aim to survive various geometric distortions, have attracted great attention in recent years. Existing schemes have shown robustness against rotation, scaling, and translation, but few are resistant to cropping, nonisotropic scaling, random bending attacks (RBAs), and affine transformations. Seo and Yoo present a geometrically invariant image watermarking scheme based on affine covariant regions (ACRs) that provides a certain degree of robustness. To further enhance the robustness, we propose a new image watermarking scheme on the basis of Seo's work, which is insensitive to geometric distortions as well as common image processing operations. Our scheme is mainly composed of three components: 1) a feature selection procedure based on a graph-theoretical clustering algorithm is applied to obtain a set of stable and nonoverlapped ACRs; 2) for each chosen ACR, local normalization and orientation alignment are performed to generate a geometrically invariant region, which can clearly improve the robustness of the proposed watermarking scheme; and 3) in order to prevent the degradation in image quality caused by the normalization and inverse normalization, indirect inverse normalization is adopted to achieve a good compromise between imperceptibility and robustness. Experiments are carried out on an image set of 100 images collected from the Internet, and the preliminary results demonstrate that the developed method improves the performance over some representative image watermarking approaches in terms of robustness.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

This paper describes an investigation of changes in image appearance when images are viewed at different image sizes on a high-end LCD device. Two digital image capturing devices of different overall image quality were used for recording identical natural scenes with a variety of pictorial contents. From each capturing device, a total of sixty-four captured scenes, including architecture, nature, portraits, still and moving objects and artworks under various illumination conditions and recorded noise levels, were selected. The test set included some images where camera shake was purposefully introduced. An achromatic version of the image set that contained only lightness information was obtained by processing the captured images in CIELAB space. Rank order experiments were carried out to determine which image attribute(s) were most affected when the displayed image size was altered. These evaluations were carried out for both chromatic and achromatic versions of the stimuli. For the achromatic stimuli, attributes such as contrast, brightness, sharpness and noisiness were rank-ordered by the observers in terms of the degree of change. The same attributes, as well as hue and colourfulness, were investigated for the chromatic versions of the stimuli. Results showed that sharpness and contrast were the two most affected attributes with changes in displayed image size. The ranking of the remaining attributes varied with image content and illumination conditions. Further experiments were carried out to link original scene content to the attributes that changed most with changes in image size.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

This paper explains the Genetic Algorithm (GA) evolution of optimized wavelets that surpass the CDF 9/7 wavelet for fingerprint compression and reconstruction. Optimized wavelets have already been evolved in previous works in the literature, but they are highly computationally complex and time consuming. Therefore, in this work, a simple approach is taken to reduce the computational complexity of the evolution algorithm. A training image set comprising three cropped 32x32 images performed much better than the coefficients reported in the literature. An average improvement of 1.0059 dB in PSNR above the classical CDF 9/7 wavelet over the 80 fingerprint images was achieved. In addition, the computational speed was increased by 90.18%. The evolved coefficients for compression ratio (CR) 16:1 also yielded better average PSNR for other CRs. Improvement in average PSNR was observed for degraded and noisy images as well.
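The PSNR figures quoted above follow the standard definition, PSNR = 10 log10(MAX^2 / MSE). A minimal sketch, assuming 8-bit images flattened to lists of intensities (the function name and the default peak value of 255 are assumptions made for this example):

```python
import math


def psnr(original, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(original) != len(reconstructed):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # Identical images: no reconstruction error.
    return 10.0 * math.log10(max_value**2 / mse)
```

An improvement such as the reported 1.0059 dB would then be the difference between the average `psnr` of images reconstructed with the evolved wavelet and the average for the CDF 9/7 reconstruction, taken over the same test images.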

Relevance:

70.00% 70.00%

Publisher:

Abstract:

PURPOSE: To determine whether a 3-mm isotropic target margin adequately covers the prostate and seminal vesicles (SVs) during administration of an intensity-modulated radiation therapy (IMRT) treatment fraction, assuming that daily image-guided setup is performed just before each fraction. MATERIALS AND METHODS: In-room computed tomographic (CT) scans were acquired immediately before and after a daily treatment fraction in 46 patients with prostate cancer. An eight-field IMRT plan was designed using the pre-fraction CT with a 3-mm margin and subsequently recalculated on the post-fraction CT. For convenience of comparison, dose plans were scaled to a full course of treatment (75.6 Gy). Dose coverage was assessed on the post-treatment CT image set. RESULTS: During one treatment fraction (21.4 ± 5.5 min), there were reductions in the volumes of the prostate and SVs receiving the prescribed dose (median reduction 0.1% and 1.0%, respectively, p<0.001) and in the minimum dose to 0.1 cm³ of their volumes (median reduction 0.5 and 1.5 Gy, p<0.001). Of the 46 patients, three patients' prostates and eight patients' SVs did not maintain dose coverage above 70 Gy. Rectal filling correlated with decreased percentage-volume of SV receiving 75.6, 70, and 60 Gy (p<0.02). CONCLUSIONS: The 3-mm intrafractional margin was adequate for prostate dose coverage. However, a significant subset of patients lost SV dose coverage. The rectal volume change significantly affected SV dose coverage. For advanced-stage prostate cancers, we recommend using larger margins or improving organ immobilization (such as with a rectal balloon) to ensure SV coverage.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

The influence of respiratory motion on patient anatomy poses a challenge to accurate radiation therapy, especially in lung cancer treatment. Modern radiation therapy planning uses models of tumor respiratory motion to account for target motion in targeting. The tumor motion model can be verified on a per-treatment session basis with four-dimensional cone-beam computed tomography (4D-CBCT), which acquires an image set of the dynamic target throughout the respiratory cycle during the therapy session. 4D-CBCT is undersampled if the scan time is too short. However, short scan time is desirable in clinical practice to reduce patient setup time. This dissertation presents the design and optimization of 4D-CBCT to reduce the impact of undersampling artifacts with short scan times. This work measures the impact of undersampling artifacts on the accuracy of target motion measurement under different sampling conditions and for various object sizes and motions. The results provide a minimum scan time such that the target tracking error is less than a specified tolerance. This work also presents new image reconstruction algorithms for reducing undersampling artifacts in undersampled datasets by taking advantage of the assumption that the relevant motion of interest is contained within a volume-of-interest (VOI). It is shown that the VOI-based reconstruction provides more accurate image intensity than standard reconstruction. The VOI-based reconstruction produced 43% lower least-squares error inside the VOI and 84% lower error throughout the image in a study designed to simulate target motion. The VOI-based reconstruction approach can reduce acquisition time and improve image quality in 4D-CBCT.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

Modelling video sequences by subspaces has recently shown promise for recognising human actions. Subspaces are able to accommodate the effects of various image variations and can capture the dynamic properties of actions. Subspaces form a non-Euclidean and curved Riemannian manifold known as a Grassmann manifold. Inference on manifold spaces is usually achieved by embedding the manifolds in higher dimensional Euclidean spaces. In this paper, we instead propose to embed the Grassmann manifolds into reproducing kernel Hilbert spaces and then tackle the problem of discriminant analysis on such manifolds. To achieve efficient machinery, we propose graph-based local discriminant analysis that utilises within-class and between-class similarity graphs to characterise intra-class compactness and inter-class separability, respectively. Experiments on KTH, UCF Sports, and Ballet datasets show that the proposed approach obtains marked improvements in discrimination accuracy in comparison to several state-of-the-art methods, such as the kernel version of affine hull image-set distance, tensor canonical correlation analysis, spatial-temporal words and hierarchy of discriminative space-time neighbourhood features.