256 results for Database, Image Retrieval, Browsing, Semantic Concept
Abstract:
This thesis targets a challenging issue: enhancing users' experience when dealing with massive and overloaded web information. The novel pattern-based topic model proposed in this thesis generates high-quality multi-topic user interest models by combining statistical topic modelling with pattern mining. We have successfully applied the pattern-based topic model to both information filtering and information retrieval. The success of the proposed model in finding the information most relevant to users comes mainly from its precise semantic representations of documents and its accurate classification of topics at both the document level and the collection level.
Abstract:
Affect is an important feature of multimedia content and conveys valuable information for multimedia indexing and retrieval. Most existing studies of affective content analysis are limited to low-level features or mid-level representations, and are generally criticized for their inability to address the gap between low-level features and high-level human affective perception. The facial expressions of subjects in images carry important semantic information that can substantially influence human affective perception, but have seldom been investigated for affective classification of facial images in practical applications. This paper presents an automatic image emotion detector (IED) for affective classification of practical (or non-laboratory) data using facial expressions, where many “real-world” challenges are present, including pose, illumination, and size variations. The proposed method is novel, with its framework designed specifically to overcome these challenges using multi-view versions of face and fiducial point detectors, and a combination of point-based texture and geometry. Performance comparisons of several key parameters of relevant algorithms are conducted to explore the optimum parameters for high accuracy and fast computation speed. A comprehensive set of experiments with existing and new datasets shows that the method is effective despite pose variations, fast, appropriate for large-scale data, and as accurate as the state-of-the-art method on laboratory-based data. The proposed method was also applied to affective classification of images from the British Broadcasting Corporation (BBC) in a task typical of a practical application, providing some valuable insights.
Abstract:
The time taken to name target pictures increases monotonically as a function of prior retrieval of other exemplars of the same semantic category and is unaffected by the number of intervening items. This cumulative semantic interference effect is generally attributed to three mechanisms: shared feature activation, priming and lexical-level selection. However, at least two additional mechanisms have been proposed: (1) a 'booster' to amplify lexical-level activation and (2) retrieval-induced forgetting (RIF). In a perfusion functional Magnetic Resonance Imaging (fMRI) experiment, we tested hypotheses concerning the involvement of all five mechanisms. Our results demonstrate that the cumulative interference effect is associated with perfusion signal changes in the left perirhinal and middle temporal cortices that increase monotonically according to the ordinal position of the exemplars being named. The left inferior frontal gyrus (LIFG) also showed significant perfusion signal changes across ordinal presentations; however, these responses did not conform to a monotonically increasing function. None of the cerebral regions linked with RIF in prior neuroimaging and modelling studies showed significant effects. This might be due to methodological differences between the RIF paradigm and continuous naming, as the latter does not involve practicing particular information. We interpret the results as indicating that priming of shared features and lexical-level selection mechanisms contribute to the cumulative interference effect, while adding noise to a booster mechanism could account for the pattern of responses observed in the LIFG.
Abstract:
Naming an object entails a number of processing stages, including retrieval of a target lexical concept and encoding of its phonological word form. We investigated these stages using the picture-word interference task in an fMRI experiment. Participants named target pictures in the presence of auditorily presented semantically related, phonologically related, or unrelated distractor words or in isolation. We observed BOLD signal changes in left-hemisphere regions associated with lexical-conceptual and phonological processing, including the mid-to-posterior lateral temporal cortex. However, these BOLD responses manifested as signal reductions for all distractor conditions relative to naming alone. Compared with unrelated words, phonologically related distractors showed further signal reductions, whereas only the pars orbitalis of the left inferior frontal cortex showed a selective reduction in response in the semantic condition. We interpret these findings as indicating that the word forms of lexical competitors are phonologically encoded and that competition during lexical selection is reduced by phonologically related distractors. Since the extended nature of auditory presentation requires a large portion of a word to be presented before its meaning is accessed, we attribute the BOLD signal reductions observed for semantically related and unrelated words to lateral inhibition mechanisms engaged after target name selection has occurred, as has been proposed in some production models.
Abstract:
Ignoring an object slows subsequent naming responses to it, a phenomenon known as negative priming (NP). A central issue in NP research concerns the level of representation at which the effect occurs. As object naming is typically considered to involve access to abstract semantic representations, Tipper (1985) proposed that the NP effect occurred at this level of processing, and other researchers supported this proposal by demonstrating a similar result with categorically related objects (e.g., Allport et al., 1985; Murray, 1995), an effect referred to as semantic NP. However, objects within categories share more physical or structural features than objects from different categories. Consequently, the NP effect observed with categorically related objects might occur at a structural rather than semantic level of representation. We used event-related fMRI, interleaving overt object naming and image acquisition, to demonstrate for the first time that the semantic NP effect activates the left posterior-mid fusiform and insular-opercular cortices. Moreover, both naming latencies and left posterior-mid fusiform cortex responses were influenced by the structural similarity of prime-probe object pairings in the categorically related condition, increasing with the number of shared features. None of the cerebral regions activated in a previous fMRI study of the identity NP effect (de Zubicaray et al., 2006) showed similar activation during semantic NP, including the left anterolateral temporal cortex, a region considered critical for semantic processing. The results suggest that the identity and semantic NP effects differ with respect to their neural mechanisms, and the label "semantic NP" might be a misnomer. We conclude that the effect is most likely the result of competition between structurally similar category exemplars that determines the efficiency of object name retrieval.
Abstract:
Difficulty naming objects is one of the most common impairments in people with aphasia post-stroke, irrespective of aphasia classification (Goodglass & Wingfield, 1997). Thus, remediation of naming impairments is often a focus of treatment in the rehabilitation of language. Such treatments typically employ phonological or semantic approaches, or a combination of the two, in order to target the major cognitive components involved in word retrieval (Nickels, 2002). Although individuals can show greater benefit from one approach over the other, the relationship between an individual’s locus of breakdown in word retrieval and response to a particular treatment approach remains unclear, and knowledge of the underlying neural mechanisms which may be responsible for successful treatment is scarce. The aim of this study was to examine brain activity associated with successful phonological and semantic based treatments for word retrieval using functional Magnetic Resonance Imaging (fMRI).
Abstract:
Automated digital recordings are useful for large-scale temporal and spatial environmental monitoring. An important research effort has been the automated classification of calling bird species. In this paper we examine a related task: retrieving, from a database of audio recordings, birdcalls similar to a user-supplied query call. Such a retrieval task can sometimes be more useful than an automated classifier. We compare three approaches to similarity-based birdcall retrieval, using spectral ridge features and two kinds of gradient features: the structure tensor and the histogram of oriented gradients. The retrieval accuracy of our spectral ridge method is 94%, compared to 82% for the structure tensor method and 90% for the histogram of oriented gradients method. Additionally, the spectral ridge approach potentially offers a more compact representation and is more computationally efficient.
Abstract:
Long-lasting interference effects in picture naming are induced when objects are presented in categorically related contexts in both continuous and blocked cyclic paradigms. Less consistent context effects have been reported when the task is changed to semantic classification. Experiment 1 confirmed the recent finding of cumulative facilitation in the continuous paradigm with living/non-living superordinate categorization. To avoid a potential confound involving participants responding with the identical superordinate category in related contexts in the blocked cyclic paradigm, we devised a novel set of categorically related objects that also varied in terms of relative age – a core semantic type associated with the adjective word class across languages. Experiment 2 demonstrated the typical interference effect with these stimuli in basic level naming. In Experiment 3, using the identical blocked cyclic paradigm, we failed to observe semantic context effects when the same pictures were classified as younger–older. Overall, the results indicate the semantic context effects in the two paradigms do not share a common origin, with the effect in the continuous paradigm arising at the level of conceptual representations or in conceptual-to-lexical connections while the effect in the blocked cyclic paradigm most likely originates at a lexical level of representation. The implications of these findings for current accounts of long-lasting interference effects in spoken word production are discussed.
Abstract:
The impact of disease and treatment on a young adult's self-image and sexuality has been largely overlooked. This is surprising given that establishing social and romantic relationships is a normal occurrence in young adulthood. This article describes three female patients' cancer journeys and demonstrates how their experiences have impacted their psychosocial function and self-regard. The themes of body image, self-esteem, and identity formation are explored, in relation to implications for relationship-building and moving beyond a cancer diagnosis. This article has been written by young cancer survivors, Danielle Tindle, Kelly Denver, and Faye Lilley, in an effort to elucidate the ongoing struggle to reconcile cancer into a normal young adult's life.
Abstract:
Frog species have been declining worldwide at unprecedented rates in the past decades. There are many reasons for this decline including pollution, habitat loss, and invasive species [1]. To preserve, protect, and restore frog biodiversity, it is important to monitor and assess frog species. In this paper, a novel method using image processing techniques for analyzing Australian frog vocalisations is proposed. An FFT is applied to audio data to produce a spectrogram. Then, acoustic events are detected and isolated into corresponding segments through image processing techniques applied to the spectrogram. For each segment, spectral peak tracks are extracted with selected seeds and a region growing technique is utilised to obtain the contour of each frog vocalisation. Based on spectral peak tracks and the contour of each frog vocalisation, six feature sets are extracted. Principal component analysis reduces each feature set down to six principal components which are tested for classification performance with a k-nearest neighbor classifier. This experiment tests the proposed method of classification on fourteen frog species which are geographically well distributed throughout Queensland, Australia. The experimental results show that the best average classification accuracy for the fourteen frog species can be up to 87%.
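The first and last stages of the pipeline described above (spectrogram computation, spectral peak tracking, and k-nearest-neighbour classification) can be illustrated with a minimal sketch. It is not the authors' implementation: the frame length, hop size, Hann window, and the function names `spectrogram`, `peak_track`, and `knn_predict` are all assumptions made for illustration, a naive DFT stands in for the FFT, and the event segmentation, region growing, and PCA steps are omitted.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram from a naive DFT of Hann-windowed frames.

    frame_len and hop are illustrative values, not taken from the paper.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):  # non-negative frequency bins only
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def peak_track(spec):
    """Index of the strongest frequency bin in each frame."""
    return [max(range(len(mags)), key=mags.__getitem__) for mags in spec]


def knn_predict(train, query, k=3):
    """Majority label among the k training vectors closest to query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)


if __name__ == "__main__":
    # Synthetic tone standing in for a vocalisation (hypothetical data).
    tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
    print(peak_track(spectrogram(tone)))
```

In practice a library FFT would replace the naive DFT, and the feature vectors passed to the classifier would be the PCA-reduced feature sets the abstract describes rather than raw peak positions.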
Abstract:
Bioacoustic data can be used for monitoring animal species diversity. The deployment of acoustic sensors enables acoustic monitoring at large temporal and spatial scales. We describe a content-based birdcall retrieval algorithm for the exploration of large databases of acoustic recordings. In the algorithm, an event-based searching scheme and compact features are developed. In detail, ridge events are detected from audio files using event detection on spectral ridges. Then event alignment is used to search through audio files to locate candidate instances. A similarity measure is then applied to dimension-reduced spectral ridge feature vectors. The event-based searching method processes a smaller list of instances for faster retrieval. The experimental results demonstrate that our features achieve a better success rate than existing methods and that the feature dimension is greatly reduced.
Abstract:
Recent advances in neural language models have contributed new methods for learning distributed vector representations of words (also called word embeddings). Two such methods are the continuous bag-of-words model and the skipgram model. These methods have been shown to produce embeddings that capture higher order relationships between words that are highly effective in natural language processing tasks involving the use of word similarity and word analogy. Despite these promising results, there has been little analysis of the use of these word embeddings for retrieval. Motivated by these observations, in this paper, we set out to determine how these word embeddings can be used within a retrieval model and what the benefit might be. To this end, we use neural word embeddings within the well-known translation language model for information retrieval. This language model captures implicit semantic relations between the words in queries and those in relevant documents, thus producing more accurate estimations of document relevance. The word embeddings used to estimate neural language models produce translations that differ from previous translation language model approaches; differences that deliver improvements in retrieval effectiveness. The models are robust to choices made in building word embeddings and, even more so, our results show that embeddings do not even need to be produced from the same corpus being used for retrieval.
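To make the idea of a translation language model driven by word embeddings concrete, here is a minimal sketch. A translation language model scores a query word w against a document d as the sum over document words u of p_t(w | u) p(u | d). The abstract does not say how the translation probabilities are estimated, so the choice below (cosine similarity between embeddings, clipped at zero and normalised) is an assumption, as are the toy two-dimensional embeddings and the function names.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def translation_probs(embeddings):
    """Assumed estimate of p_t(w | u): clipped cosine, normalised over w."""
    probs = {}
    for u, vec_u in embeddings.items():
        sims = {w: max(cosine(vec_w, vec_u), 0.0)
                for w, vec_w in embeddings.items()}
        total = sum(sims.values()) or 1.0  # guard against an all-zero row
        probs[u] = {w: s / total for w, s in sims.items()}
    return probs


def score(query, doc, probs):
    """log p(q | d) = sum over w in q of log sum over u in d of p_t(w | u) p(u | d).

    Every query and document word must have an embedding.
    """
    log_p = 0.0
    for w in query:
        p_w = sum(probs[u][w] * doc.count(u) / len(doc) for u in set(doc))
        log_p += math.log(p_w + 1e-12)  # small constant avoids log(0)
    return log_p


# Hypothetical toy embeddings, for illustration only.
embeddings = {
    "car": [1.0, 0.1],
    "automobile": [0.9, 0.2],
    "banana": [0.0, 1.0],
}
probs = translation_probs(embeddings)
doc_vehicles = ["automobile", "automobile"]
doc_fruit = ["banana", "banana"]

if __name__ == "__main__":
    print(score(["car"], doc_vehicles, probs))
    print(score(["car"], doc_fruit, probs))
```

The query word "car" appears in neither toy document, yet the document about automobiles scores higher because the embeddings place "car" close to "automobile". This is the kind of implicit semantic relation the abstract refers to. A realistic system would also smooth p(u | d) with collection statistics, which this sketch leaves out.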
Abstract:
Clustering identities in a video is a useful task to aid in video search, annotation and retrieval, and cast identification. However, reliably clustering faces across multiple videos is a challenging task due to variations in the appearance of the faces, as videos are captured in an uncontrolled environment. A person's appearance may vary due to session variations including lighting and background changes, occlusions, and changes in expression and make-up. In this paper we propose the novel Local Total Variability Modelling (Local TVM) approach to cluster faces across a news video corpus, and incorporate this into a novel two-stage video clustering system. We first cluster faces within a single video using colour, spatial and temporal cues, after which we use face track modelling and hierarchical agglomerative clustering to cluster faces across the entire corpus. We compare different face recognition approaches within this framework. Experiments on a news video database show that the Local TVM technique is able to effectively model the session variation observed in the data, resulting in improved clustering performance, with much greater computational efficiency than other methods.
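The second stage described above relies on hierarchical agglomerative clustering. The sketch below shows the general technique on toy 2-D points that stand in for face track representations. The average-linkage criterion, the Euclidean distance, the stopping threshold, and the function names are assumptions for illustration; the abstract does not specify them, and the Local TVM modelling itself is not reproduced here.

```python
import math
from itertools import combinations


def average_linkage(c1, c2, points):
    """Mean pairwise Euclidean distance between two clusters of point indices."""
    total = sum(math.dist(points[i], points[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative(points, threshold):
    """Merge the closest pair of clusters until none is within threshold."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        (a, b), dist = min(
            (((a, b), average_linkage(clusters[a], clusters[b], points))
             for a, b in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1],
        )
        if dist > threshold:
            break
        clusters[a].extend(clusters[b])  # a < b, so index a stays valid
        del clusters[b]
    return clusters


if __name__ == "__main__":
    # Hypothetical 2-D stand-ins for face track feature vectors.
    tracks = [(0, 0), (0, 1), (10, 10), (10, 11), (0, 0.5)]
    print(agglomerative(tracks, threshold=3.0))
```

Stopping at a distance threshold rather than at a fixed number of clusters suits the setting in the abstract, where the number of distinct identities in a corpus is not known in advance.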
Abstract:
Natural history collections are an invaluable resource housing a wealth of knowledge, with a long tradition of contributing to a wide range of fields such as taxonomy, quarantine, conservation and climate change. It is recognized, however [Smith and Blagoderov 2012], that such physical collections are often heavily underutilized as a result of the practical issues of accessibility. The digitization of these collections is a step towards removing these access issues, but other hurdles must be addressed before we truly unlock the potential of this knowledge.