2 results for airport security, vigilance, threat image projection, detection performance

in Glasgow Theses Service


Relevance:

100.00%

Publisher:

Abstract:

It is well known that self-generated stimuli are processed differently from externally generated stimuli. For example, many people have noticed since childhood that it is very difficult to tickle oneself. In the auditory domain, self-generated sounds elicit smaller brain responses than externally generated sounds, a phenomenon known as the sensory attenuation (SA) effect. SA manifests as reduced amplitudes of evoked responses measured with M/EEG, decreased firing rates of neurons, and a lower perceived loudness for self-generated sounds. The predominant explanation for SA is based on the idea that self-generated stimuli are predicted (e.g., the forward model account); on this view, it is the predictability of the stimuli that is crucial for SA. In contrast, the sensory gating account emphasizes a general suppressive effect of actions on sensory processing, regardless of the predictability of the stimuli. Both accounts have received empirical support, which suggests that both mechanisms may exist. Chapter 2 presents three behavioural studies on the influence of motor activation on auditory perception. Study 1 compared the effects of SA and attention in an auditory detection task and showed that SA was present even when substantial attention was paid to unpredictable stimuli. Study 2 compared the perceived loudness of tones generated by another person between Chinese and British participants: a decrease in perceived loudness for other-generated tones, relative to externally generated tones, was found among Chinese but not among British participants. Study 3 found partial evidence that auditory detection performance was impaired even when participants merely read action-related words. In chapter 3, study 4 replicated the classic SA effect of M100 suppression with MEG. Time-frequency analysis revealed a potential sequence of neural information processing in auditory cortex: prior to the onset of self-generated tones there was an increase in oscillatory power in the alpha band, and after stimulus onset reduced gamma power and alpha/beta phase locking were observed. These three temporally segregated oscillatory events correlated with each other and with the SA effect, and may constitute the neural implementation of SA. Chapter 4 presents a TMS-MEG study (study 5) investigating the role of the cerebellum in adapting to the delayed presentation of self-generated tones. In the sham stimulation condition, the brain adapted to the delay (about 100 ms) within 300 trials of learning, shown by a significant increase of the SA effect in the suppression of the M100, but not the M200, component. After the cerebellum was stimulated with a suppressive TMS protocol, however, the adaptation in M100 suppression disappeared and M200 suppression reversed to M200 enhancement. These data support the idea that the suppressive effect of actions on auditory processing is a consequence of both motor-driven sensory predictions and general sensory gating. The results also demonstrate the importance of neural oscillations in implementing the SA effect and the critical role of the cerebellum in learning sensory predictions under sensory perturbation.
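As an illustration of the measure at the heart of these studies, the sketch below shows one way an SA index could be computed from single-channel epochs time-locked to tone onset. The function name, the peak-based definition and the choice of component window are assumptions made for illustration, not the analysis pipeline used in the thesis.

```python
import numpy as np

def sensory_attenuation_index(self_epochs, external_epochs, window):
    """Relative reduction of the evoked-response peak for self-generated
    versus externally generated tones (positive values indicate attenuation).

    self_epochs, external_epochs : arrays of shape (n_trials, n_samples),
        single-channel data time-locked to tone onset.
    window : slice covering the component of interest (e.g. around 100 ms
        post-onset for an M100-like component).
    """
    evoked_self = self_epochs.mean(axis=0)   # trial-averaged evoked response
    evoked_ext = external_epochs.mean(axis=0)

    peak_self = np.abs(evoked_self[window]).max()
    peak_ext = np.abs(evoked_ext[window]).max()

    return (peak_ext - peak_self) / peak_ext
```

With data sampled at 1 kHz and tone onset at sample 0, for example, window = slice(80, 120) would roughly bracket an M100-like peak.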

Relevance:

40.00%

Publisher:

Abstract:

With the rise of smartphones, lifelogging devices (e.g. Google Glass) and the popularity of image sharing websites (e.g. Flickr), users are capturing and sharing every aspect of their lives online, producing a wealth of visual content. The majority of these uploaded images are poorly annotated or exist in complete semantic isolation, which makes building retrieval systems difficult, as one must first understand the meaning of an image in order to retrieve it. To alleviate this problem, many image sharing websites offer manual annotation tools that allow users to “tag” their photos; however, this process is laborious and has consequently been poorly adopted: Sigurbjörnsson and van Zwol (2008) showed that 64% of images uploaded to Flickr are annotated with fewer than 4 tags. As a result, an entire body of research has focused on the automatic annotation of images (Hanbury, 2008; Smeulders et al., 2000; Zhang et al., 2012a), in which one attempts to bridge the semantic gap between an image’s appearance and its meaning, e.g. the objects present. Despite two decades of research, the semantic gap still largely exists, and automatic annotation models therefore often offer unsatisfactory performance for industrial implementation. Further, these techniques can only annotate what they see, ignoring the “bigger picture” surrounding an image (e.g. its location, the event, the people present, etc.). Much work has therefore focused on building photo tag recommendation (PTR) methods, which aid the user in the annotation process by suggesting tags related to those already present. These works have mainly focused on computing relationships between tags from historical images, e.g. that NY and timessquare co-occur in many images and are therefore highly correlated. However, tags are inherently noisy, sparse and ill-defined, often resulting in poor PTR accuracy, e.g. does NY refer to New York or New Year? This thesis proposes exploiting an image’s context, which, unlike textual evidence, is always present, to alleviate this ambiguity in the tag recommendation process. Specifically, we exploit the “what, who, where, when and how” of the image capture process to complement textual evidence in various photo tag recommendation and retrieval scenarios. In part II, we combine textual, content-based (e.g. the number of faces present) and contextual (e.g. the day of the week the photo was taken) signals for tag recommendation, achieving up to a 75% improvement in precision@5 over a text-only TF-IDF baseline. We then consider external knowledge sources (i.e. Wikipedia and Twitter) as an alternative to the (slower-moving) Flickr corpus on which to build recommendation models, showing that similar accuracy can be achieved on these faster-moving, yet entirely textual, datasets. In part II we also highlight the merits of diversifying tag recommendation lists, before discussing at length various problems with existing automatic image annotation and photo tag recommendation evaluation collections. In part III, we propose three new image retrieval scenarios, namely “visual event summarisation”, “image popularity prediction” and “lifelog summarisation”. In the first scenario, we attempt to produce a ranking of relevant and diverse images for various news events by (i) removing irrelevant images, such as memes and visual duplicates, before (ii) semantically clustering the remaining images based on the tweets in which they were originally posted. Using this approach, we were able to achieve over 50% precision for images in the top 5 ranks. In the second retrieval scenario, we show that by combining contextual and content-based features we can predict whether an image will become “popular” (or not) with 74% accuracy, using an SVM classifier. Finally, in chapter 9 we employ blur detection and perceptual-hash clustering to remove noisy images from lifelogs, before combining visual and geo-temporal signals to capture a user’s “key moments” within their day. We believe that the results of this thesis represent an important step towards building effective image retrieval models when sufficient textual content is lacking (i.e. a cold start).
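The core tag recommendation idea described above, suggesting tags that historically co-occur with those already present, can be sketched in a few lines. The names below (build_cooccurrence, recommend_tags) and the raw co-occurrence counts used for scoring are illustrative placeholders; the thesis's actual models additionally fold in content-based and contextual signals.

```python
from collections import Counter, defaultdict
from itertools import combinations

def build_cooccurrence(tagged_images):
    """Count how often each pair of tags appears together across a set of
    historical images (each image is given as an iterable of tags)."""
    co = defaultdict(Counter)
    for tags in tagged_images:
        for a, b in combinations(sorted(set(tags)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return co

def recommend_tags(co, seed_tags, k=5):
    """Suggest the k tags that co-occur most often with the seed tags."""
    scores = Counter()
    for t in seed_tags:
        scores.update(co.get(t, Counter()))
    for t in seed_tags:
        scores.pop(t, None)          # never re-suggest a tag already present
    return [tag for tag, _ in scores.most_common(k)]

# Toy example: "ny" co-occurs most often with "timessquare" in the history.
history = [["ny", "timessquare", "night"],
           ["ny", "timessquare"],
           ["ny", "newyear"]]
print(recommend_tags(build_cooccurrence(history), ["ny"]))
# -> ['timessquare', 'night', 'newyear']
```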
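The abstract does not say which perceptual hash the thesis uses to remove visual duplicates from lifelogs. As a rough illustration of the general technique only, the sketch below uses a simple average hash and a greedy Hamming-distance grouping; the 10-bit threshold is an arbitrary placeholder, and the blur detection step is omitted.

```python
import numpy as np
from PIL import Image

def average_hash(path, hash_size=8):
    """Tiny perceptual hash: downscale, convert to greyscale and threshold at
    the mean; near-duplicate images yield hashes with a small Hamming distance."""
    img = Image.open(path).convert("L").resize((hash_size, hash_size))
    pixels = np.asarray(img, dtype=np.float32)
    return (pixels > pixels.mean()).flatten()

def deduplicate(paths, threshold=10):
    """Greedily keep one representative per group of near-duplicates
    (images whose hashes differ in fewer than `threshold` bits)."""
    kept, hashes = [], []
    for p in paths:
        h = average_hash(p)
        if all(np.count_nonzero(h != kh) >= threshold for kh in hashes):
            kept.append(p)
            hashes.append(h)
    return kept
```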