7 results for Visual and acoustic signaling
in Glasgow Theses Service
Abstract:
In recent years, more and more Chinese films have been exported abroad. This thesis intends to explore the subtitling of Chinese cinema into English, with Zhang Yimou’s films as a case study. Zhang Yimou is arguably the most critically and internationally acclaimed Chinese filmmaker, who has experimented with a variety of genres of films. I argue that in the subtitling of his films, there is an obvious adoption of the domestication translation strategy that reduces or even omits Chinese cultural references. I try to discover what cultural categories or perspectives of China are prone to the domestication of translation and have formulated five categories: humour, politeness, dialect, history and songs and the Peking Opera. My methodology is that I compare the source Chinese dialogue lines with the existing English subtitles by providing literal translations of the source lines, and I will also give my alternative translations that tend to retain the source cultural references better. I also speculate that the domestication strategy is frequently employed by subtitlers possibly because the subtitlers assume the source cultural references are difficult for target language subtitle readers to comprehend, even if they are translated into a target language. However, subtitle readers are very likely to understand more than what the dialogue lines and the target language subtitles express, because films are multimodal entities and verbal information is not the only source of information for subtitle readers. The image and the sound are also significant sources of information for subtitle readers who are constantly involved in a dynamic film-watching experience. They are also expected to grasp visual and acoustic information. The complete omission or domestication of source cultural references might also affect their interpretation of the non-verbal cues. 
I also propose that translation which frequently domesticates the source culture, when carried out by a translator who is also a native speaker of the source language, is ‘submissive translation’.
Abstract:
The study investigates the acoustic, articulatory and sociophonetic properties of the Swedish /iː/ variant known as 'Viby-i' in 13 speakers of Central Swedish from Stockholm, Gothenburg, Varberg, Jönköping and Katrineholm. The vowel is described in terms of its auditory quality, its acoustic F1 and F2 values, and its tongue configuration. A brief, qualitative description of lip position is also included. Variation in /iː/ production is mapped against five sociolinguistic factors: city, dialectal region, metropolitan vs. urban location, sex and socioeconomic rating. Articulatory data is collected using ultrasound tongue imaging (UTI), for which the study proposes and evaluates a methodology. The study shows that Viby-i varies in auditory strength between speakers, and that strong instances of the vowel are associated with a high F1 and low F2, a trend which becomes more pronounced as the strength of Viby-i increases. The articulation of Viby-i is characterised by a lowered and backed tongue body, sometimes accompanied by a double-bunched tongue shape. The relationship between tongue position and acoustic results appears to be non-linear, suggesting either a measurement error or the influence of additional articulatory factors. Preliminary images of the lips show that Viby-i is produced with a spread but lax lip posture. The lip data also reveals parts of the tongue, which in many speakers appears to be extremely fronted and braced against the lower teeth, or sometimes protruded, when producing Viby-i. No sociophonetic difference is found between speakers from different cities or dialect regions. Metropolitan speakers are found to have an auditorily and acoustically stronger Viby-i than urban speakers, but this pattern is not matched in tongue backing or lowering. Overall the data shows a weak trend towards higher-class females having stronger Viby-i, but these results are tentative due to the limited size and stratification of the sample. 
Further research is needed to fully explore the sociophonetic properties of Viby-i.
Abstract:
This thesis proposes a generic visual perception architecture for robotic clothes perception and manipulation. The proposed architecture is fully integrated with a stereo vision system and a dual-arm robot and is able to perform a number of autonomous laundering tasks. Clothes perception and manipulation is a novel research topic in robotics and has experienced rapid development in recent years. Compared to the task of perceiving and manipulating rigid objects, clothes perception and manipulation poses a greater challenge. This can be attributed to two reasons: firstly, deformable clothing requires precise (high-acuity) visual perception and dexterous manipulation; secondly, as clothing approximates a non-rigid 2-manifold in 3-space which can adopt a quasi-infinite configuration space, the potential variability in the appearance of clothing items makes them difficult for a machine to understand, identify uniquely, and interact with. From an applications perspective, and as part of the EU CloPeMa project, the integrated visual perception architecture refines a pre-existing clothing manipulation pipeline by completing pre-wash clothes (category) sorting (using single-shot or interactive perception for garment categorisation and manipulation) and post-wash dual-arm flattening. To the best of the author’s knowledge, as investigated in this thesis, the autonomous clothing perception and manipulation solutions presented here were first proposed and reported by the author. All of the reported robot demonstrations in this work follow a perception-manipulation methodology where visual and tactile feedback (in the form of surface wrinkledness captured by the high-accuracy depth sensor, i.e. the CloPeMa stereo head, or the predictive confidence modelled by a Gaussian Process) serve as the halting criteria in the flattening and sorting tasks, respectively.
From a scientific perspective, the proposed visual perception architecture addresses the above challenges by parsing and grouping 3D clothing configurations hierarchically from low-level curvatures, through mid-level surface shape representations (providing topological descriptions and 3D texture representations), to high-level semantic structures and statistical descriptions. A range of visual features such as Shape Index, Surface Topologies Analysis and Local Binary Patterns have been adapted within this work to parse clothing surfaces and textures, and several novel features have been devised, including B-Spline Patches with Locality-Constrained Linear coding, and Topology Spatial Distance to describe and quantify generic landmarks (wrinkles and folds). The essence of this proposed architecture comprises 3D generic surface parsing and interpretation, which is critical to underpinning a number of laundering tasks and has the potential to be extended to other rigid and non-rigid object perception and manipulation tasks. The experimental results presented in this thesis demonstrate that: firstly, the proposed grasping approach achieves on average 84.7% accuracy; secondly, the proposed flattening approach is able to flatten towels, t-shirts and pants (shorts) within 9 iterations on average; thirdly, the proposed clothes recognition pipeline can recognise clothes categories from highly wrinkled configurations and advances the state-of-the-art by 36% in terms of classification accuracy, achieving an 83.2% true-positive classification rate when discriminating between five categories of clothes; finally, the Gaussian Process-based interactive perception approach exhibits a substantial improvement over single-shot perception. Accordingly, this thesis has advanced the state-of-the-art of robot clothes perception and manipulation.
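As an illustration of the low-level curvature features mentioned above, the following is a minimal sketch of the standard Shape Index computed from two principal curvatures. It follows the commonly cited Koenderink and van Doorn definition and is not the adapted variant used in the thesis, which the abstract does not specify. The function name `shape_index`, the `eps` threshold and the sign convention (which depends on how surface normals are oriented) are assumptions made for this example.

```python
import math


def shape_index(k1, k2, eps=1e-12):
    """Standard Shape Index in [-1, 1] from two principal curvatures.

    Assumed convention: s = (2/pi) * arctan((k2 + k1) / (k2 - k1)) with k1 >= k2.
    Returns None for (near-)planar points, where the index is undefined.
    """
    # Order the curvatures so that k1 >= k2.
    if k1 < k2:
        k1, k2 = k2, k1
    # A flat patch has no well-defined shape index.
    if abs(k1) < eps and abs(k2) < eps:
        return None
    # atan2 handles umbilic points (k1 == k2) without dividing by zero.
    # arctan(y / (k2 - k1)) equals -atan2(y, k1 - k2) because k1 - k2 >= 0.
    return -(2.0 / math.pi) * math.atan2(k1 + k2, k1 - k2)


if __name__ == "__main__":
    for label, (a, b) in {
        "equal positive curvatures": (1.0, 1.0),
        "one zero curvature": (1.0, 0.0),
        "saddle": (1.0, -1.0),
        "equal negative curvatures": (-1.0, -1.0),
        "plane": (0.0, 0.0),
    }.items():
        print(label, shape_index(a, b))
```

Under this convention the index separates cup-like, rut-like, saddle-like, ridge-like and cap-like patches, which is one plausible way wrinkle and fold landmarks could be distinguished from their local curvature.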
Abstract:
Apparitions of empire and imperial ideologies were deeply embedded in the International Exhibition, a distinct exhibitionary paradigm that came to prominence in the mid-nineteenth century. Exhibitions were platforms for the display of objects, the movement of people, and the dissemination of ideas across and between regions of the British Empire, thereby facilitating contact between its different cultures and societies. This thesis aims to disrupt a dominant understanding of International Exhibitions, which forwards the notion that all exhibitions, irrespective of when or where they were staged, upheld a singular imperial discourse (i.e. Greenhalgh 1988, Rydell 1984). Rather, this thesis suggests International Exhibitions responded to and reflected the unique social, political and economic circumstances in which they took place, functioning as cultural environments in which pressing concerns of the day were worked through. Understood thus, the International Exhibition becomes a space for self-presentation, serving as a stage from which a multitude of interests and identities were constructed, performed and projected. This thesis looks to the visual and material culture of the International Exhibition in order to uncover this more nuanced history, and foregrounds an analysis of the intersections between practices of exhibition-making and identity-making. The primary focus is a set of exhibitions held in Glasgow in the late-1880s and early-1900s, which extends the geographic and temporal boundaries of the existing scholarship. What is more, it looks at representations of Canada at these events, another party whose involvement in the International Exhibition tradition has gone largely unnoticed. Consequently, this thesis is a thematic investigation of the links between a municipality routinely deemed the ‘Second City of the Empire’ and a Dominion settler colony, two types of geographic setting rarely brought into dialogue. 
It analyses three key elements of the exhibition-making process, exploring how iconographies of ‘quasi-nationhood’ were expressed through an exhibition’s planning and negotiation, its architecture and its displays. This original research framework deliberately cuts across strata that continue to define conceptions of the British Empire, and pushes beyond a conceptual model defined by metropole and colony. Through examining International Exhibitions held in Glasgow in the late-Victorian and Edwardian periods, and visions of Canada in evidence at these events, the goal is to offer a novel intervention into the existing literature concerning the cultural history of empire, one that emphasises fluidity rather than fixity and which muddles the boundaries between centre and periphery.
Abstract:
With the rise of smartphones, lifelogging devices (e.g. Google Glass) and the popularity of image sharing websites (e.g. Flickr), users are capturing and sharing every aspect of their lives online, producing a wealth of visual content. Of these uploaded images, the majority are poorly annotated or exist in complete semantic isolation, making the process of building retrieval systems difficult, as one must first understand the meaning of an image in order to retrieve it. To alleviate this problem, many image sharing websites offer manual annotation tools which allow the user to “tag” their photos. However, these techniques are laborious and as a result have been poorly adopted; Sigurbjörnsson and van Zwol (2008) showed that 64% of images uploaded to Flickr are annotated with fewer than 4 tags. Due to this, an entire body of research has focused on the automatic annotation of images (Hanbury, 2008; Smeulders et al., 2000; Zhang et al., 2012a), where one attempts to bridge the semantic gap between an image’s appearance and its meaning, e.g. the objects present. Despite two decades of research, the semantic gap still largely exists and, as a result, automatic annotation models often offer unsatisfactory performance for industrial implementation. Further, these techniques can only annotate what they see, thus ignoring the “bigger picture” surrounding an image (e.g. its location, the event, the people present etc.). Much work has therefore focused on building photo tag recommendation (PTR) methods which aid the user in the annotation process by suggesting tags related to those already present. These works have mainly focused on computing relationships between tags based on historical images, e.g. that NY and timessquare co-occur in many images and are therefore highly correlated. However, tags are inherently noisy, sparse and ill-defined, often resulting in poor PTR accuracy, e.g. does NY refer to New York or New Year?
This thesis proposes the exploitation of an image’s context which, unlike textual evidence, is always present, in order to alleviate this ambiguity in the tag recommendation process. Specifically, we exploit the “what, who, where, when and how” of the image capture process in order to complement textual evidence in various photo tag recommendation and retrieval scenarios. In part II, we combine text, content-based (e.g. number of faces present) and contextual (e.g. day of the week taken) signals for tag recommendation purposes, achieving up to a 75% improvement to precision@5 in comparison to a text-only TF-IDF baseline. We then consider external knowledge sources (i.e. Wikipedia and Twitter) as an alternative to (slower moving) Flickr on which to build recommendation models, showing that similar accuracy can be achieved on these faster moving, yet entirely textual, datasets. In part II, we also highlight the merits of diversifying tag recommendation lists before discussing at length various problems with existing automatic image annotation and photo tag recommendation evaluation collections. In part III, we propose three new image retrieval scenarios, namely “visual event summarisation”, “image popularity prediction” and “lifelog summarisation”. In the first scenario, we attempt to produce a ranking of relevant and diverse images for various news events by (i) removing irrelevant images such as memes and visual duplicates, and then (ii) semantically clustering images based on the tweets in which they were originally posted. Using this approach, we were able to achieve over 50% precision for images in the top 5 ranks. In the second retrieval scenario, we show that by combining contextual and content-based features from images, we are able to predict whether an image will become “popular” (or not) with 74% accuracy, using an SVM classifier.
Finally, in chapter 9 we employ blur detection and perceptual-hash clustering in order to remove noisy images from lifelogs, before combining visual and geo-temporal signals in order to capture a user’s “key moments” within their day. We believe that the results of this thesis represent an important step towards building effective image retrieval models when sufficient textual content is lacking (i.e. a cold start).
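The co-occurrence idea behind such PTR methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any cited work: the helper names `build_cooccurrence` and `recommend`, the raw-count scoring and the toy photo collection are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(photos):
    """Count how often each pair of tags appears on the same photo."""
    cooc = defaultdict(Counter)
    for tags in photos:
        # Deduplicate tags within a photo so each pair counts once per image.
        for a, b in combinations(sorted(set(tags)), 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc


def recommend(cooc, existing_tags, k=5):
    """Suggest up to k tags that co-occur most often with the existing ones."""
    scores = Counter()
    for tag in existing_tags:
        scores.update(cooc.get(tag, {}))
    # Never recommend a tag the user has already supplied.
    for tag in existing_tags:
        scores.pop(tag, None)
    # Sort by descending count, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical historical photos; "ny" is used for both New York and New Year.
    photos = [
        ["ny", "timessquare", "night"],
        ["ny", "timessquare"],
        ["ny", "timessquare", "taxi"],
        ["ny", "fireworks", "party"],
    ]
    cooc = build_cooccurrence(photos)
    print(recommend(cooc, ["ny"], k=3))
```

In the toy data, a query for "ny" ranks "timessquare" first but also surfaces "fireworks", which shows how an ambiguous tag mixes suggestions from unrelated senses when only textual co-occurrence is available.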
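For readers unfamiliar with the precision@5 measure quoted in this abstract, the following minimal sketch shows how it is conventionally computed. The function names, the choice to divide by k even when fewer than k items are returned, and the reading of "improvement" as a relative change over the baseline are assumptions for illustration; the abstract does not state which convention the thesis uses.

```python
def precision_at_k(ranked, relevant, k=5):
    """Fraction of the top-k ranked items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked[:k]
    hits = sum(1 for item in top_k if item in relevant)
    # Assumed convention: divide by k, so a short result list is penalised.
    return hits / k


def relative_improvement(baseline, improved):
    """Relative change of a score over a non-zero baseline score."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (improved - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical recommended tags and ground-truth tags for one photo.
    recommended = ["timessquare", "taxi", "night", "party", "broadway"]
    ground_truth = {"timessquare", "night", "manhattan"}
    print(precision_at_k(recommended, ground_truth, k=5))
```

In an evaluation, this score would typically be averaged over all test photos before comparing a model against a baseline.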
Abstract:
This research investigates the process of “opening out” spaces with sound as an approach to sonic arts practice, investigating the spaces that sounds articulate, reveal and imply in our encounter with them. It positions spatial aesthetics as a key consideration at each stage of the creative process and connects approaches to spatiality in sonic arts practices with contextual considerations drawn from, for example, phenomenological accounts of spatial and sonic experience, human geography, architecture and acoustic ecology. The portfolio consists of seven sonic artworks and two collaborative projects that each engage with these ideas from a different perspective, exploring a number of applications, contexts and outcomes in the investigation. This accompanying commentary discusses these works, providing an introduction to the portfolio followed by a discussion, in the subsequent chapters, of the practices explored and developed in the research process.
Abstract:
Signifying road-related events with warnings can be highly beneficial, especially when imminent attention is needed. This thesis describes how modality, urgency and situation can influence driver responses to multimodal displays used as warnings. These displays utilise all combinations of audio, visual and tactile modalities, reflecting different urgency levels. In this way, a new rich set of cues is designed, conveying information multimodally, to enhance reactions during driving, which is a highly visual task. The importance of the signified events to driving is reflected in the warnings, and safety-critical or non-critical situations are communicated through the cues. Novel warning designs are considered, using both abstract displays, with no semantic association to the signified event, and language-based ones, using speech. These two cue designs are compared, to discover their strengths and weaknesses as car alerts. The situations in which the new cues are delivered are varied, by simulating both critical and non-critical events and both manual and autonomous car scenarios. A novel set of guidelines for using multimodal driver displays is finally provided, considering the modalities utilised, the urgency signified, and the situation simulated.