352 resultados para Audio indexing
Resumo:
Cultural objects are increasingly generated and stored in digital form, yet effective methods for their indexing and retrieval still remain an important area of research. The main problem arises from the disconnection between the content-based indexing approach used by computer scientists and the description-based approach used by information scientists. There is also a lack of representational schemes that allow the alignment of the semantics and context with keywords and low-level features that can be automatically extracted from the content of these cultural objects. This paper presents an integrated approach to address these problems, taking advantage of both computer science and information science approaches. We firstly discuss the requirements from a number of perspectives: users, content providers, content managers and technical systems. We then present an overview of our system architecture and describe various techniques which underlie the major components of the system. These include: automatic object category detection; user-driven tagging; metadata transform and augmentation, and an expression language for digital cultural objects. In addition, we discuss our experience on testing and evaluating some existing collections, analyse the difficulties encountered and propose ways to address these problems.
Resumo:
Our students come from diverse backgrounds. They need flexibility in their learning. First year students tend to worry when they miss lectures or part of lectures. Having the lecture as an on line resource allows students to miss a lecture without stressing about it and to be more relaxed in the lecture, knowing that anything they may miss will be available later. The resource: The Windows based program from Blueberry Software (not Blackberry!) - BB Flashback - allows the simultaneous recording of the computer screen together with the audio, as well as Webcam recording. Editing capabilities include adding pause buttons, graphics and text to the file before exporting it in a flash file. Any diagrams drawn on the board or shown via visualiser can be photographed and easily incorporated. The audio from the file can be extracted if required to be posted as podcast. Exporting modes other than Flash are also available, allowing vodcasting if you wish. What you will need: - the recording software: it can be installed on the lecture hall computer just prior to lecture if needed - a computer: either the ones in lecture halls, especially if fitted with audio recording, or a laptop (I have used audio recording via Bluetooth for mobility). Feedback from students has been positive and will be presented on the poster.
Resumo:
Current multimedia Web search engines still use keywords as the primary means to search. Due to the richness in multimedia contents, general users constantly experience some difficulties in formulating textual queries that are representative enough for their needs. As a result, query reformulation becomes part of an inevitable process in most multimedia searches. Previous Web query formulation studies did not investigate the modification sequences and thus can only report limited findings on the reformulation behavior. In this study, we propose an automatic approach to examine multimedia query reformulation using large-scale transaction logs. The key findings show that search term replacement is the most dominant type of modifications in visual searches but less important in audio searches. Image search users prefer the specified search strategy more than video and audio users. There is also a clear tendency to replace terms with synonyms or associated terms in visual queries. The analysis of the search strategies in different types of multimedia searching provides some insights into user’s searching behavior, which can contribute to the design of future query formulation assistance for keyword-based Web multimedia retrieval systems.
Resumo:
Searching for multimedia is an important activity for users of Web search engines. Studying user's interactions with Web search engine multimedia buttons, including image, audio, and video, is important for the development of multimedia Web search systems. This article provides results from a Weblog analysis study of multimedia Web searching by Dogpile users in 2006. The study analyzes the (a) duration, size, and structure of Web search queries and sessions; (b) user demographics; (c) most popular multimedia Web searching terms; and (d) use of advanced Web search techniques including Boolean and natural language. The current study findings are compared with results from previous multimedia Web searching studies. The key findings are: (a) Since 1997, image search consistently is the dominant media type searched followed by audio and video; (b) multimedia search duration is still short (>50% of searching episodes are <1 min), using few search terms; (c) many multimedia searches are for information about people, especially in audio search; and (d) multimedia search has begun to shift from entertainment to other categories such as medical, sports, and technology (based on the most repeated terms). Implications for design of Web multimedia search engines are discussed.
Resumo:
Tagging has become one of the key activities in next generation websites which allow users selecting short labels to annotate, manage, and share multimedia information such as photos, videos and bookmarks. Tagging does not require users any prior training before participating in the annotation activities as they can freely choose any terms which best represent the semantic of contents without worrying about any formal structure or ontology. However, the practice of free-form tagging can lead to several problems, such as synonymy, polysemy and ambiguity, which potentially increase the complexity of managing the tags and retrieving information. To solve these problems, this research aims to construct a lightweight indexing scheme to structure tags by identifying and disambiguating the meaning of terms and construct a knowledge base or dictionary. News has been chosen as the primary domain of application to demonstrate the benefits of using structured tags for managing the rapidly changing and dynamic nature of news information. One of the main outcomes of this work is an automatically constructed vocabulary that defines the meaning of each named entity tag, which can be extracted from a news article (including person, location and organisation), based on experts suggestions from major search engines and the knowledge from public database such as Wikipedia. To demonstrate the potential applications of the vocabulary, we have used it to provide more functionalities in an online news website, including topic-based news reading, intuitive tagging, clipping and sharing of interesting news, as well as news filtering or searching based on named entity tags. The evaluation results on the impact of disambiguating tags have shown that the vocabulary can help to significantly improve news searching performance. The preliminary results from our user study have demonstrated that users can benefit from the additional functionalities on the news websites as they are able to retrieve more relevant news, clip and share news with friends and families effectively.
Resumo:
The inquiry documented in this thesis is located at the nexus of technological innovation and traditional schooling. As we enter the second decade of a new century, few would argue against the increasingly urgent need to integrate digital literacies with traditional academic knowledge. Yet, despite substantial investments from governments and businesses, the adoption and diffusion of contemporary digital tools in formal schooling remain sluggish. To date, research on technology adoption in schools tends to take a deficit perspective of schools and teachers, with the lack of resources and teacher ‘technophobia’ most commonly cited as barriers to digital uptake. Corresponding interventions that focus on increasing funding and upskilling teachers, however, have made little difference to adoption trends in the last decade. Empirical evidence that explicates the cultural and pedagogical complexities of innovation diffusion within long-established conventions of mainstream schooling, particularly from the standpoint of students, is wanting. To address this knowledge gap, this thesis inquires into how students evaluate and account for the constraints and affordances of contemporary digital tools when they engage with them as part of their conventional schooling. It documents the attempted integration of a student-led Web 2.0 learning initiative, known as the Student Media Centre (SMC), into the schooling practices of a long-established, high-performing independent senior boys’ school in urban Australia. The study employed an ‘explanatory’ two-phase research design (Creswell, 2003) that combined complementary quantitative and qualitative methods to achieve both breadth of measurement and richness of characterisation. In the initial quantitative phase, a self-reported questionnaire was administered to the senior school student population to determine adoption trends and predictors of SMC usage (N=481). Measurement constructs included individual learning dispositions (learning and performance goals, cognitive playfulness and personal innovativeness), as well as social and technological variables (peer support, perceived usefulness and ease of use). Incremental predictive models of SMC usage were conducted using Classification and Regression Tree (CART) modelling: (i) individual-level predictors, (ii) individual and social predictors, and (iii) individual, social and technological predictors. Peer support emerged as the best predictor of SMC usage. Other salient predictors include perceived ease of use and usefulness, cognitive playfulness and learning goals. On the whole, an overwhelming proportion of students reported low usage levels, low perceived usefulness and a lack of peer support for engaging with the digital learning initiative. The small minority of frequent users reported having high levels of peer support and robust learning goal orientations, rather than being predominantly driven by performance goals. These findings indicate that tensions around social validation, digital learning and academic performance pressures influence students’ engagement with the Web 2.0 learning initiative. The qualitative phase that followed provided insights into these tensions by shifting the analytics from individual attitudes and behaviours to shared social and cultural reasoning practices that explain students’ engagement with the innovation. Six indepth focus groups, comprising 60 students with different levels of SMC usage, were conducted, audio-recorded and transcribed. Textual data were analysed using Membership Categorisation Analysis. Students’ accounts converged around a key proposition. The Web 2.0 learning initiative was useful-in-principle but useless-in-practice. While students endorsed the usefulness of the SMC for enhancing multimodal engagement, extending peer-topeer networks and acquiring real-world skills, they also called attention to a number of constraints that obfuscated the realisation of these design affordances in practice. These constraints were cast in terms of three binary formulations of social and cultural imperatives at play within the school: (i) ‘cool/uncool’, (ii) ‘dominant staff/compliant student’, and (iii) ‘digital learning/academic performance’. The first formulation foregrounds the social stigma of the SMC among peers and its resultant lack of positive network benefits. The second relates to students’ perception of the school culture as authoritarian and punitive with adverse effects on the very student agency required to drive the innovation. The third points to academic performance pressures in a crowded curriculum with tight timelines. Taken together, findings from both phases of the study provide the following key insights. First, students endorsed the learning affordances of contemporary digital tools such as the SMC for enhancing their current schooling practices. For the majority of students, however, these learning affordances were overshadowed by the performative demands of schooling, both social and academic. The student participants saw engagement with the SMC in-school as distinct from, even oppositional to, the conventional social and academic performance indicators of schooling, namely (i) being ‘cool’ (or at least ‘not uncool’), (ii) sufficiently ‘compliant’, and (iii) achieving good academic grades. Their reasoned response therefore, was simply to resist engagement with the digital learning innovation. Second, a small minority of students seemed dispositionally inclined to negotiate the learning affordances and performance constraints of digital learning and traditional schooling more effectively than others. These students were able to engage more frequently and meaningfully with the SMC in school. Their ability to adapt and traverse seemingly incommensurate social and institutional identities and norms is theorised as cultural agility – a dispositional construct that comprises personal innovativeness, cognitive playfulness and learning goals orientation. The logic then is ‘both and’ rather than ‘either or’ for these individuals with a capacity to accommodate both learning and performance in school, whether in terms of digital engagement and academic excellence, or successful brokerage across multiple social identities and institutional affiliations within the school. In sum, this study takes us beyond the familiar terrain of deficit discourses that tend to blame institutional conservatism, lack of resourcing and teacher resistance for low uptake of digital technologies in schools. It does so by providing an empirical base for the development of a ‘third way’ of theorising technological and pedagogical innovation in schools, one which is more informed by students as critical stakeholders and thus more relevant to the lived culture within the school, and its complex relationship to students’ lives outside of school. It is in this relationship that we find an explanation for how these individuals can, at the one time, be digital kids and analogue students.
Resumo:
The increasing diversity of the Internet has created a vast number of multilingual resources on the Web. A huge number of these documents are written in various languages other than English. Consequently, the demand for searching in non-English languages is growing exponentially. It is desirable that a search engine can search for information over collections of documents in other languages. This research investigates the techniques for developing high-quality Chinese information retrieval systems. A distinctive feature of Chinese text is that a Chinese document is a sequence of Chinese characters with no space or boundary between Chinese words. This feature makes Chinese information retrieval more difficult since a retrieved document which contains the query term as a sequence of Chinese characters may not be really relevant to the query since the query term (as a sequence Chinese characters) may not be a valid Chinese word in that documents. On the other hand, a document that is actually relevant may not be retrieved because it does not contain the query sequence but contains other relevant words. In this research, we propose two approaches to deal with the problems. In the first approach, we propose a hybrid Chinese information retrieval model by incorporating word-based techniques with the traditional character-based techniques. The aim of this approach is to investigate the influence of Chinese segmentation on the performance of Chinese information retrieval. Two ranking methods are proposed to rank retrieved documents based on the relevancy to the query calculated by combining character-based ranking and word-based ranking. Our experimental results show that Chinese segmentation can improve the performance of Chinese information retrieval, but the improvement is not significant if it incorporates only Chinese segmentation with the traditional character-based approach. In the second approach, we propose a novel query expansion method which applies text mining techniques in order to find the most relevant words to extend the query. Unlike most existing query expansion methods, which generally select the highly frequent indexing terms from the retrieved documents to expand the query. In our approach, we utilize text mining techniques to find patterns from the retrieved documents that highly correlate with the query term and then use the relevant words in the patterns to expand the original query. This research project develops and implements a Chinese information retrieval system for evaluating the proposed approaches. There are two stages in the experiments. The first stage is to investigate if high accuracy segmentation can make an improvement to Chinese information retrieval. In the second stage, a text mining based query expansion approach is implemented and a further experiment has been done to compare its performance with the standard Rocchio approach with the proposed text mining based query expansion method. The NTCIR5 Chinese collections are used in the experiments. The experiment results show that by incorporating the text mining based query expansion with the hybrid model, significant improvement has been achieved in both precision and recall assessments.
Resumo:
This correspondence presents a microphone array shape calibration procedure for diffuse noise environments. The procedure estimates intermicrophone distances by fitting the measured noise coherence with its theoretical model and then estimates the array geometry using classical multidimensional scaling. The technique is validated on noise recordings from two office environments.
Resumo:
Acoustically, vehicles are extremely noisy environments and as a consequence audio-only in-car voice recognition systems perform very poorly. Seeing that the visual modality is immune to acoustic noise, using the visual lip information from the driver is seen as a viable strategy in circumventing this problem. However, implementing such an approach requires a system being able to accurately locate and track the driver’s face and facial features in real-time. In this paper we present such an approach using the Viola-Jones algorithm. Using this system, we present our results which show that using the Viola-Jones approach is a suitable method of locating and tracking the driver’s lips despite the visual variability of illumination and head pose.
Resumo:
In this paper, technology is described as involving processes whereby resources are utilised to satisfy human needs or to take advantage of opportunities, to develop practical solutions to problems. This study, set within one type of technology context, information technology, investigated how, through a one semester undergraduate university course, elements of technological processes were made explicit to students. While it was acknowledged in the development and implementation of this course that students needed to learn technical skills, technological skills and knowledge, including design, were seen as vital also, to enable students to think about information technology from a perspective that was not confined and limited to `technology as hardware and software'. This paper describes how the course, set within a three year program of study, was aimed at helping students to develop their thinking and their knowledge about design processes in an explicit way. An interpretive research approach was used and data sources included a repertory grid `survey'; student interviews; video recordings of classroom interactions, audio recordings of lectures, observations of classroom interactions made by researchers; and artefacts which included students' journals and portfolios. The development of students' knowledge about design practices is discussed and reflections upon student knowledge development in conjunction with their learning experiences are made. Implications for ensuring explicitness of design practice within information technology contexts are presented, and the need to identify what constitutes design knowledge is argued.
Resumo:
Purpose – This paper aims to report findings from an exploratory study investigating the web interactions and technoliteracy of children in the early childhood years. Previous research has studied aspects of older children’s technoliteracy and web searching; however, few studies have analyzed web search data from children younger than six years of age. Design/methodology/approach – The study explored the Google web searching and technoliteracy of young children who are enrolled in a “preparatory classroom” or kindergarten (the year before young children begin compulsory schooling in Queensland, Australia). Young children were video- and audio-taped while conducting Google web searches in the classroom. The data were qualitatively analysed to understand the young children’s web search behaviour. Findings – The findings show that young children engage in complex web searches, including keyword searching and browsing, query formulation and reformulation, relevance judgments, successive searches, information multitasking and collaborative behaviours. The study results provide significant initial insights into young children’s web searching and technoliteracy. Practical implications – The use of web search engines by young children is an important research area with implications for educators and web technologies developers. Originality/value – This is the first study of young children’s interaction with a web search engine.
Resumo:
Purpose: The classic study of Sumby and Pollack (1954, JASA, 26(2), 212-215) demonstrated that visual information aided speech intelligibility under noisy auditory conditions. Their work showed that visual information is especially useful under low signal-to-noise conditions where the auditory signal leaves greater margins for improvement. We investigated whether simulated cataracts interfered with the ability of participants to use visual cues to help disambiguate the auditory signal in the presence of auditory noise. Methods: Participants in the study were screened to ensure normal visual acuity (mean of 20/20) and normal hearing (auditory threshold ≤ 20 dB HL). Speech intelligibility was tested under an auditory only condition and two visual conditions: normal vision and simulated cataracts. The light scattering effects of cataracts were imitated using cataract-simulating filters. Participants wore blacked-out glasses in the auditory only condition and lens-free frames in the normal auditory-visual condition. Individual sentences were spoken by a live speaker in the presence of prerecorded four-person background babble set to a speech-to-noise ratio (SNR) of -16 dB. The SNR was determined in a preliminary experiment to support 50% correct identification of sentence under the auditory only conditions. The speaker was trained to match the rate, intensity and inflections of a prerecorded audio track of everyday speech sentences. The speaker was blind to the visual conditions of the participant to control for bias.Participants’ speech intelligibility was measured by comparing the accuracy of their written account of what they believed the speaker to have said to the actual spoken sentence. Results: Relative to the normal vision condition, speech intelligibility was significantly poorer when participants wore simulated catarcts. Conclusions: The results suggest that cataracts may interfere with the acquisition of visual cues to speech perception.
Resumo:
This article explores the role of radio sound in establishing what I term ‘affective rhythms’ in everyday life. Through exploring the affective qualities of radio sound and its capacity for mood generation in the home, this article explores personal affective states and personal organisation. The term affective rhythm relates both to mood, and to routine. It is the combination of both that allows the possibility of thinking about sound and affect, and how they relate to, and integrate with, routine everyday life. The notion of ‘affective rhythm’ forces us to consider the idea of mood in the light of the routine nature of everyday domestic life.
Resumo:
Information and communication technologies (particularly websites and e-mail) have the potential to deliver health behavior change programs to large numbers of adults at low cost. Controlled trials using these new media to promote physical activity have produced mixed results. User-centered development methods can assist in understanding the preferences of potential participants for website functions and content, and may lead to more effective programs. Eight focus group discussions were conducted with 40 adults after they had accessed a previously trialed physical activity website. The discussions were audio taped, transcribed and interpreted using a themed analysis method. Four key themes emerged: structure, interactivity, environmental context and content. Preferences were expressed for websites that include simple interactive features, together with information on local community activity opportunities. Particular suggestions included online community notice boards, personalized progress charts, e-mail access to expert advice and access to information on specific local physical activity facilities and services. Website physical activity interventions could usefully include personally relevant interactive and environmentally focused features and services identified through a user-centered development process.
Resumo:
LUPTAI is a decision-aiding tool to enable local and state governments to optimise land use and transport integration. In contrast to mobility between land uses (typically via road), accessibility represents opportunity and choice to reach common land use destinations by public transport and/or walking. LUPTAI uses a GIS-based methodology to quantify and map accessibility to common land use destinations by walking and/or public transport. The tool can be applied to small or large study areas. It can be applied to the current situation in a study area or to future scenarios (such as scenarios involving changes to public transport services, public transport corridors or stations, population density or land use). The tool has been piloted on the Gold Coast and the results are encouraging. This paper outlines the GIS-based methodology and the findings related to this pilot study. The paper demonstrates benefits and possible application of LUPTAI to other urbanised local government areas in Queensland. It also discusses how this accessibility indexing approach could be developed into a decision-support tool to assist local and state government agencies in a range of transport and land-use planning activities.