143 resultados para facial images
Resumo:
Cognitive modelling of phenomena in clinical practice allows the operationalisation of otherwise diffuse descriptive terms such as craving or flashbacks. This supports the empirical investigation of the clinical phenomena and the development of targeted treatment interventions. This paper focuses on the cognitive processes underpinning craving, which is recognised as a motivating experience in substance dependence. We use a high-level cognitive architecture, Interacting Cognitive Subsystems (ICS), to compare two theories of craving: Tiffany's theory, centred on the control of automated action schemata, and our own Elaborated Intrusion theory of craving. Data from a questionnaire study of the subjective aspects of everyday desires experienced by a large non-clinical population are presented. Both the data and the high-level modelling support the central claim of the Elaborated Intrusion theory that imagery is a key element of craving, providing the subjective experience and mediating much of the associated disruption of concurrent cognition.
Resumo:
The automatic extraction of road features from remote sensed images has been a topic of great interest within the photogrammetric and remote sensing communities for over 3 decades. Although various techniques have been reported in the literature, it is still challenging to efficiently extract the road details with the increasing of image resolution as well as the requirement for accurate and up-to-date road data. In this paper, we will focus on the automatic detection of road lane markings, which are crucial for many applications, including lane level navigation and lane departure warning. The approach consists of four steps: i) data preprocessing, ii) image segmentation and road surface detection, iii) road lane marking extraction based on the generated road surface, and iv) testing and system evaluation. The proposed approach utilized the unsupervised ISODATA image segmentation algorithm, which segments the image into vegetation regions, and road surface based only on the Cb component of YCbCr color space. A shadow detection method based on YCbCr color space is also employed to detect and recover the shadows from the road surface casted by the vehicles and trees. Finally, the lane marking features are detected from the road surface using the histogram clustering. The experiments of applying the proposed method to the aerial imagery dataset of Gympie, Queensland demonstrate the efficiency of the approach.
Resumo:
Acoustically, vehicles are extremely noisy environments and as a consequence audio-only in-car voice recognition systems perform very poorly. Seeing that the visual modality is immune to acoustic noise, using the visual lip information from the driver is seen as a viable strategy in circumventing this problem. However, implementing such an approach requires a system being able to accurately locate and track the driver’s face and facial features in real-time. In this paper we present such an approach using the Viola-Jones algorithm. Using this system, we present our results which show that using the Viola-Jones approach is a suitable method of locating and tracking the driver’s lips despite the visual variability of illumination and head pose.
Resumo:
Road features extraction from remote sensed imagery has been a long-term topic of great interest within the photogrammetry and remote sensing communities for over three decades. The majority of the early work only focused on linear feature detection approaches, with restrictive assumption on image resolution and road appearance. The widely available of high resolution digital aerial images makes it possible to extract sub-road features, e.g. road pavement markings. In this paper, we will focus on the automatic extraction of road lane markings, which are required by various lane-based vehicle applications, such as, autonomous vehicle navigation, and lane departure warning. The proposed approach consists of three phases: i) road centerline extraction from low resolution image, ii) road surface detection in the original image, and iii) pavement marking extraction on the generated road surface. The proposed method was tested on the aerial imagery dataset of the Bruce Highway, Queensland, and the results demonstrate the efficiency of our approach.
Resumo:
With the increasing resolution of remote sensing images, road network can be displayed as continuous and homogeneity regions with a certain width rather than traditional thin lines. Therefore, road network extraction from large scale images refers to reliable road surface detection instead of road line extraction. In this paper, a novel automatic road network detection approach based on the combination of homogram segmentation and mathematical morphology is proposed, which includes three main steps: (i) the image is classified based on homogram segmentation to roughly identify the road network regions; (ii) the morphological opening and closing is employed to fill tiny holes and filter out small road branches; and (iii) the extracted road surface is further thinned by a thinning approach, pruned by a proposed method and finally simplified with Douglas-Peucker algorithm. Lastly, the results from some QuickBird images and aerial photos demonstrate the correctness and efficiency of the proposed process.
Resumo:
This article describes a project to unwrap an ancient Egyptian mummy using X-ray computed tomography (CT). About 600 X-ray CT images were obtained through the mummified body of a female named Tjetmutjengebtiu (or Jeni for short), who was a singer in the great temple of Karnak in Egypt during the 22nd dynasty (c. 945-715 BC). The X-ray CT images reveal details of the remains of body organs, wrappings and jewellery. 3D reconstructions of Jeni’s teeth suggest that she was probably only around 20 years old when she died, although the cause of death cannot be ascertained from the CT scans. The CT images were used to build a 3D model of Jeni’s head which enabled an artist to paint a picture of what Jeni may have looked like during life. A PowerPoint presentation and movie clips are provided as supplementary material that may be useful for teaching.
Resumo:
Gabor representations have been widely used in facial analysis (face recognition, face detection and facial expression detection) due to their biological relevance and computational properties. Two popular Gabor representations used in literature are: 1) Log-Gabor and 2) Gabor energy filters. Even though these representations are somewhat similar, they also have distinct differences as the Log-Gabor filters mimic the simple cells in the visual cortex while the Gabor energy filters emulate the complex cells, which causes subtle differences in the responses. In this paper, we analyze the difference between these two Gabor representations and quantify these differences on the task of facial action unit (AU) detection. In our experiments conducted on the Cohn-Kanade dataset, we report an average area underneath the ROC curve (A`) of 92.60% across 17 AUs for the Gabor energy filters, while the Log-Gabor representation achieved an average A` of 96.11%. This result suggests that small spatial differences that the Log-Gabor filters pick up on are more useful for AU detection than the differences in contours and edges that the Gabor energy filters extract.
Resumo:
When classifying a signal, ideally we want our classifier to trigger a large response when it encounters a positive example and have little to no response for all other examples. Unfortunately in practice this does not occur with responses fluctuating, often causing false alarms. There exists a myriad of reasons why this is the case, most notably not incorporating the dynamics of the signal into the classification. In facial expression recognition, this has been highlighted as one major research question. In this paper we present a novel technique which incorporates the dynamics of the signal which can produce a strong response when the peak expression is found and essentially suppresses all other responses as much as possible. We conducted preliminary experiments on the extended Cohn-Kanade (CK+) database which shows its benefits. The ability to automatically and accurately recognize facial expressions of drivers is highly relevant to the automobile. For example, the early recognition of “surprise” could indicate that an accident is about to occur; and various safeguards could immediately be deployed to avoid or minimize injury and damage. In this paper, we conducted initial experiments on the extended Cohn-Kanade (CK+) database which shows its benefits.
Resumo:
Interactive documents for use with the World Wide Web have been developed for viewing multi-dimensional radiographic and visual images of human anatomy, derived from the Visible Human Project. Emphasis has been placed on user-controlled features and selections. The purpose was to develop an interface which was independent of host operating system and browser software which would allow viewing of information by multiple users. The interfaces were implemented using HyperText Markup Language (HTML) forms, C programming language and Perl scripting language. Images were pre-processed using ANALYZE and stored on a Web server in CompuServe GIF format. Viewing options were included in the document design, such as interactive thresholding and two-dimensional slice direction. The interface is an example of what may be achieved using the World Wide Web. Key applications envisaged for such software include education, research and accessing of information through internal databases and simultaneous sharing of images by remote computers by health personnel for diagnostic purposes.
Resumo:
The task addressed in this thesis is the automatic alignment of an ensemble of misaligned images in an unsupervised manner. This application is especially useful in computer vision applications where annotations of the shape of an object of interest present in a collection of images is required. Performing this task manually is a slow, tedious, expensive and error prone process which hinders the progress of research laboratories and businesses. Most recently, the unsupervised removal of geometric variation present in a collection of images has been referred to as congealing based on the seminal work of Learned-Miller [21]. The only assumption made in congealing is that the parametric nature of the misalignment is known a priori (e.g. translation, similarity, a�ne, etc) and that the object of interest is guaranteed to be present in each image. The capability to congeal an ensemble of misaligned images stemming from the same object class has numerous applications in object recognition, detection and tracking. This thesis concerns itself with the construction of a congealing algorithm titled, least-squares congealing, which is inspired by the well known image to image alignment algorithm developed by Lucas and Kanade [24]. The algorithm is shown to have superior performance characteristics when compared to previously established methods: canonical congealing by Learned-Miller [21] and stochastic congealing by Z�ollei [39].
Resumo:
This thesis addresses the problem of detecting and describing the same scene points in different wide-angle images taken by the same camera at different viewpoints. This is a core competency of many vision-based localisation tasks including visual odometry and visual place recognition. Wide-angle cameras have a large field of view that can exceed a full hemisphere, and the images they produce contain severe radial distortion. When compared to traditional narrow field of view perspective cameras, more accurate estimates of camera egomotion can be found using the images obtained with wide-angle cameras. The ability to accurately estimate camera egomotion is a fundamental primitive of visual odometry, and this is one of the reasons for the increased popularity in the use of wide-angle cameras for this task. Their large field of view also enables them to capture images of the same regions in a scene taken at very different viewpoints, and this makes them suited for visual place recognition. However, the ability to estimate the camera egomotion and recognise the same scene in two different images is dependent on the ability to reliably detect and describe the same scene points, or ‘keypoints’, in the images. Most algorithms used for this purpose are designed almost exclusively for perspective images. Applying algorithms designed for perspective images directly to wide-angle images is problematic as no account is made for the image distortion. The primary contribution of this thesis is the development of two novel keypoint detectors, and a method of keypoint description, designed for wide-angle images. Both reformulate the Scale- Invariant Feature Transform (SIFT) as an image processing operation on the sphere. As the image captured by any central projection wide-angle camera can be mapped to the sphere, applying these variants to an image on the sphere enables keypoints to be detected in a manner that is invariant to image distortion. Each of the variants is required to find the scale-space representation of an image on the sphere, and they differ in the approaches they used to do this. Extensive experiments using real and synthetically generated wide-angle images are used to validate the two new keypoint detectors and the method of keypoint description. The best of these two new keypoint detectors is applied to vision based localisation tasks including visual odometry and visual place recognition using outdoor wide-angle image sequences. As part of this work, the effect of keypoint coordinate selection on the accuracy of egomotion estimates using the Direct Linear Transform (DLT) is investigated, and a simple weighting scheme is proposed which attempts to account for the uncertainty of keypoint positions during detection. A word reliability metric is also developed for use within a visual ‘bag of words’ approach to place recognition.
Resumo:
In this study, the feasibility of difference imaging for improving the contrast of electronic portal imaging device (EPID) images is investigated. The difference imaging technique consists of the acquisition of two EPID images (with and without the placement of an additional layer of attenuating medium on the surface of the EPID)and the subtraction of one of these images from the other. The resulting difference image shows improved contrast, compared to a standard EPID image, since it is generated by lower-energy photons. Results of this study show that, ¯rstly, this method can produce images exhibiting greater contrast than is seen in standard megavoltage EPID images and that, secondly, the optimal thickness of attenuating material for producing a maximum contrast enhancement may vary with phantom thickness and composition. Further studies of the possibilities and limitations of the di®erence imaging technique, and the physics behind it, are therefore recommended.