11 resultados para Saliency
em Queensland University of Technology - ePrints Archive
Resumo:
Detection of Region of Interest (ROI) in a video leads to more efficient utilization of bandwidth. This is because any ROIs in a given frame can be encoded in higher quality than the rest of that frame, with little or no degradation of quality from the perception of the viewers. Consequently, it is not necessary to uniformly encode the whole video in high quality. One approach to determine ROIs is to use saliency detectors to locate salient regions. This paper proposes a methodology for obtaining ground truth saliency maps to measure the effectiveness of ROI detection by considering the role of user experience during the labelling process of such maps. User perceptions can be captured and incorporated into the definition of salience in a particular video, taking advantage of human visual recall within a given context. Experiments with two state-of-the-art saliency detectors validate the effectiveness of this approach to validating visual saliency in video. This paper will provide the relevant datasets associated with the experiments.
Resumo:
This paper proposes a novel approach to video deblocking which performs perceptually adaptive bilateral filtering by considering color, intensity, and motion features in a holistic manner. The method is based on bilateral filter which is an effective smoothing filter that preserves edges. The bilateral filter parameters are adaptive and avoid over-blurring of texture regions and at the same time eliminate blocking artefacts in the smooth region and areas of slow motion content. This is achieved by using a saliency map to control the strength of the filter for each individual point in the image based on its perceptual importance. The experimental results demonstrate that the proposed algorithm is effective in deblocking highly compressed video sequences and to avoid over-blurring of edges and textures in salient regions of image.
Resumo:
In this paper, the problem of moving object detection in aerial video is addressed. While motion cues have been extensively exploited in the literature, how to use spatial information is still an open problem. To deal with this issue, we propose a novel hierarchical moving target detection method based on spatiotemporal saliency. Temporal saliency is used to get a coarse segmentation, and spatial saliency is extracted to obtain the object’s appearance details in candidate motion regions. Finally, by combining temporal and spatial saliency information, we can get refined detection results. Additionally, in order to give a full description of the object distribution, spatial saliency is detected in both pixel and region levels based on local contrast. Experiments conducted on the VIVID dataset show that the proposed method is efficient and accurate.
Resumo:
An important trend in Chilean retailing industry is the increase in channel blurring. This investigation attempts to identify the relevant store attributes for different retail formats (grocery, department store, drug store, and home improvement). Do consumer store attribute saliency vary for different retail formats? Interviews identified twelve salient store attributes for the different retail formats. Survey results showed differences in store attribute saliencies for consumers when shopping at different formats. Seven of the twelve variables showed significant differences across formats. However, two attributes were relatively important for all four retail formats: product quality and responsiveness of employees.
Resumo:
This is the second part of a paper that explores a range of magico-religious experiences such as immaterial voices and visions, in terms of local cultural, moral and socio-political circumstances in an Aboriginal town in rural Queensland. This part of the paper explores the political and cultural symbolism and meaning of suicide. It charts the saliency of suicide amongst two groups of kin and cohorts and the social meaningfulness and problematic of the voices and visions in relation to suicide, to identity and family forms and to funerals and a heavily drinking lifestyle. I argue that voices and visions are used to reinterpret social experience and to establish meaning and that tragically suicide evokes connectivity rather than anomie and here cannot be understood merely as an individualistic act or evidence of individual pathology. Rather it is about transformation and crossing a threshold to join an enduring domain of Aboriginality. In this life world, where family is the highest social value and where a relational view of persons holds sway, the individualistic practice of psychiatric and other helping professions, is a considerable problem.
Resumo:
Affine covariant local image features are a powerful tool for many applications, including matching and calibrating wide baseline images. Local feature extractors that use a saliency map to locate features require adaptation processes in order to extract affine covariant features. The most effective extractors make use of the second moment matrix (SMM) to iteratively estimate the affine shape of local image regions. This paper shows that the Hessian matrix can be used to estimate local affine shape in a similar fashion to the SMM. The Hessian matrix requires significantly less computation effort than the SMM, allowing more efficient affine adaptation. Experimental results indicate that using the Hessian matrix in conjunction with a feature extractor that selects features in regions with high second order gradients delivers equivalent quality correspondences in less than 17% of the processing time, compared to the same extractor using the SMM.
Resumo:
The increasing popularity of video consumption from mobile devices requires an effective video coding strategy. To overcome diverse communication networks, video services often need to maintain sustainable quality when the available bandwidth is limited. One of the strategy for a visually-optimised video adaptation is by implementing a region-of-interest (ROI) based scalability, whereby important regions can be encoded at a higher quality while maintaining sufficient quality for the rest of the frame. The result is an improved perceived quality at the same bit rate as normal encoding, which is particularly obvious at the range of lower bit rate. However, because of the difficulties of predicting region-of-interest (ROI) accurately, there is a limited research and development of ROI-based video coding for general videos. In this paper, the phase spectrum quaternion of Fourier Transform (PQFT) method is adopted to determine the ROI. To improve the results of ROI detection, the saliency map from the PQFT is augmented with maps created from high level knowledge of factors that are known to attract human attention. Hence, maps that locate faces and emphasise the centre of the screen are used in combination with the saliency map to determine the ROI. The contribution of this paper lies on the automatic ROI detection technique for coding a low bit rate videos which include the ROI prioritisation technique to give different level of encoding qualities for multiple ROIs, and the evaluation of the proposed automatic ROI detection that is shown to have a close performance to human ROI, based on the eye fixation data.
Resumo:
Many state of the art vision-based Simultaneous Localisation And Mapping (SLAM) and place recognition systems compute the salience of visual features in their environment. As computing salience can be problematic in radically changing environments new low resolution feature-less systems have been introduced, such as SeqSLAM, all of which consider the whole image. In this paper, we implement a supervised classifier system (UCS) to learn the salience of image regions for place recognition by feature-less systems. SeqSLAM only slightly benefits from the results of training, on the challenging real world Eynsham dataset, as it already appears to filter less useful regions of a panoramic image. However, when recognition is limited to specific image regions performance improves by more than an order of magnitude by utilising the learnt image region saliency. We then investigate whether the region salience generated from the Eynsham dataset generalizes to another car-based dataset using a perspective camera. The results suggest the general applicability of an image region salience mask for optimizing route-based navigation applications.
Resumo:
We present a novel approach for multi-object detection in aerial videos based on tracking. The proposed method mainly involves three steps. Firstly, the spatial-temporal saliency is employed to detect moving objects. Secondly, the detected objects are tracked by mean shift in the subsequent frames. Finally, the saliency results are fused with the weight map generated by tracking to get refined detection results, and in turn the modified detection results are used to update the tracking models. The proposed algorithm is evaluated on VIVID aerial videos, and the results show that our approach can reliably detect moving objects even in challenging situations. Meanwhile, the proposed method can process videos in real time, without the effect of time delay.
Resumo:
For robots operating in outdoor environments, a number of factors, including weather, time of day, rough terrain, high speeds, and hardware limitations, make performing vision-based simultaneous localization and mapping with current techniques infeasible due to factors such as image blur and/or underexposure, especially on smaller platforms and low-cost hardware. In this paper, we present novel visual place-recognition and odometry techniques that address the challenges posed by low lighting, perceptual change, and low-cost cameras. Our primary contribution is a novel two-step algorithm that combines fast low-resolution whole image matching with a higher-resolution patch-verification step, as well as image saliency methods that simultaneously improve performance and decrease computing time. The algorithms are demonstrated using consumer cameras mounted on a small vehicle in a mixed urban and vegetated environment and a car traversing highway and suburban streets, at different times of day and night and in various weather conditions. The algorithms achieve reliable mapping over the course of a day, both when incrementally incorporating new visual scenes from different times of day into an existing map, and when using a static map comprising visual scenes captured at only one point in time. Using the two-step place-recognition process, we demonstrate for the first time single-image, error-free place recognition at recall rates above 50% across a day-night dataset without prior training or utilization of image sequences. This place-recognition performance enables topologically correct mapping across day-night cycles.
Resumo:
Policies of inclusion challenge the construct of readiness and require schools to prepare for the diversity of children as they transition to school. However, there is limited empirical evidence concerning how this challenge is met. This paper presents two Australian studies that investigate inclusive practices in the transition to school. Study 1 examined the predictors of child outcomes across a sample of 1831 children in 39 schools. The results indicate that both quantity and quality of programme provision influenced outcomes and that programme effects were particularly potent for children with diverse abilities and backgrounds. Study 2 focuses on pedagogy in three of the schools to highlight how this provision can be achieved. Results show that provisions were reactive, that saliency of children’s needs directed school practices and that professional knowledge impacted on measures of quality. Inclusive processes accounting for both child progress and broader family and teaching influences are necessary for improved transition to school.