998 resultados para Multimedia computing


Relevância:

70.00% 70.00%

Publicador:

Resumo:

Mobile devices are rapidly developing into the primary technology for users to work, socialize, and play in a variety of settings and contexts. Their pervasiveness has provided researchers with the means to investigate innovative solutions to ever more complex user demands. Tools for Mobile Multimedia Programming and Development investigates the use of mobile platforms for research projects, focusing on the development, testing, and evaluation of prototypes rather than final products, which enables researchers to better understand the needs of users through image processing, object recognition, sensor integration, and user interactions. This book benefits researchers and professionals in multiple disciplines who utilize such techniques in the creation of prototypes for mobile devices and applications. This book is part of the Advances in Wireless Technologies and Telecommunication series collection.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This article addresses the problem of how to select the optimal combination of sensors and how to determine their optimal placement in a surveillance region in order to meet the given performance requirements at a minimal cost for a multimedia surveillance system. We propose to solve this problem by obtaining a performance vector, with its elements representing the performances of subtasks, for a given input combination of sensors and their placement. Then we show that the optimal sensor selection problem can be converted into the form of Integer Linear Programming problem (ILP) by using a linear model for computing the optimal performance vector corresponding to a sensor combination. Optimal performance vector corresponding to a sensor combination refers to the performance vector corresponding to the optimal placement of a sensor combination. To demonstrate the utility of our technique, we design and build a surveillance system consisting of PTZ (Pan-Tilt-Zoom) cameras and active motion sensors for capturing faces. Finally, we show experimentally that optimal placement of sensors based on the design maximizes the system performance.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

CD-ROMs have proliferated as a distribution media for desktop machines for a large variety of multimedia applications (targeted for a single-user environment) like encyclopedias, magazines and games. With CD-ROM capacities up to 3 GB being available in the near future, they will form an integral part of Video on Demand (VoD) servers to store full-length movies and multimedia. In the first section of this paper we look at issues related to the single- user desktop environment. Since these multimedia applications are highly interactive in nature, we take a pragmatic approach, and have made a detailed study of the multimedia application behavior in terms of the I/O request patterns generated to the CD-ROM subsystem by tracing these patterns. We discuss prefetch buffer design and seek time characteristics in the context of the analysis of these traces. We also propose an adaptive main-memory hosted cache that receives caching hints from the application to reduce the latency when the user moves from one node of the hyper graph to another. In the second section we look at the use of CD-ROM in a VoD server and discuss the problem of scheduling multiple request streams and buffer management in this scenario. We adapt the C-SCAN (Circular SCAN) algorithm to suit the CD-ROM drive characteristics and prove that it is optimal in terms of buffer size management. We provide computationally inexpensive relations by which this algorithm can be implemented. We then propose an admission control algorithm which admits new request streams without disrupting the continuity of playback of the previous request streams. The algorithm also supports operations such as fast forward and replay. Finally, we discuss the problem of optimal placement of MPEG streams on CD-ROMs in the third section.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This article presents the results of a study that explored the human side of the multimedia experience. We propose a model that assesses quality variation from three distinct levels: the network, the media and the content levels; and from two views: the technical and the user perspective. By facilitating parameter variation at each of the quality levels and from each of the perspectives, we were able to examine their impact on user quality perception. Results show that a significant reduction in frame rate does not proportionally reduce the user's understanding of the presentation independent of technical parameters, that multimedia content type significantly impacts user information assimilation, user level of enjoyment, and user perception of quality, and that the device display type impacts user information assimilation and user perception of quality. Finally, to ensure the transfer of information, low-level abstraction (network-level) parameters, such as delay and jitter, should be adapted; to maintain the user's level of enjoyment, high-level abstraction quality parameters (content-level), such as the appropriate use of display screens, should be adapted.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Adaptability for distributed object-oriented enterprise frameworks in multimedia technology is a critical mission for system evolution. Today, building adaptive services is a complex task due to lack of adequate framework support in the distributed computing systems. In this paper, we propose a Metalevel Component-Based Framework which uses distributed computing design patterns as components to develop an adaptable pattern-oriented framework for distributed computing applications. We describe our approach of combining a meta-architecture with a pattern-oriented framework, resulting in an adaptable framework which provides a mechanism to facilitate system evolution. This approach resolves the problem of dynamic adaptation in the framework, which is encountered in most distributed multimedia applications. The proposed architecture of the pattern-oriented framework has the abilities to dynamically adapt new design patterns to address issues in the domain of distributed computing and they can be woven together to shape the framework in future. © 2011 Springer Science+Business Media B.V.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Image annotation is a significant step towards semantic based image retrieval. Ontology is a popular approach for semantic representation and has been intensively studied for multimedia analysis. However, relations among concepts are seldom used to extract higher-level semantics. Moreover, the ontology inference is often crisp. This paper aims to enable sophisticated semantic querying of images, and thus contributes to 1) an ontology framework to contain both visual and contextual knowledge, and 2) a probabilistic inference approach to reason the high-level concepts based on different sources of information. The experiment on a natural scene database from LabelMe database shows encouraging results.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Investigates the use of lip information, in conjunction with speech information, for robust speaker verification in the presence of background noise. We have previously shown (Int. Conf. on Acoustics, Speech and Signal Proc., vol. 6, pp. 3693-3696, May 1998) that features extracted from a speaker's moving lips hold speaker dependencies which are complementary with speech features. We demonstrate that the fusion of lip and speech information allows for a highly robust speaker verification system which outperforms either subsystem individually. We present a new technique for determining the weighting to be applied to each modality so as to optimize the performance of the fused system. Given a correct weighting, lip information is shown to be highly effective for reducing the false acceptance and false rejection error rates in the presence of background noise

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Regions in video streams attracting human interest contribute significantly to human understanding of the video. Being able to predict salient and informative Regions of Interest (ROIs) through a sequence of eye movements is a challenging problem. Applications such as content-aware retargeting of videos to different aspect ratios while preserving informative regions and smart insertion of dialog (closed-caption text) into the video stream can significantly be improved using the predicted ROIs. We propose an interactive human-in-the-loop framework to model eye movements and predict visual saliency into yet-unseen frames. Eye tracking and video content are used to model visual attention in a manner that accounts for important eye-gaze characteristics such as temporal discontinuities due to sudden eye movements, noise, and behavioral artifacts. A novel statistical-and algorithm-based method gaze buffering is proposed for eye-gaze analysis and its fusion with content-based features. Our robust saliency prediction is instantiated for two challenging and exciting applications. The first application alters video aspect ratios on-the-fly using content-aware video retargeting, thus making them suitable for a variety of display sizes. The second application dynamically localizes active speakers and places dialog captions on-the-fly in the video stream. Our method ensures that dialogs are faithful to active speaker locations and do not interfere with salient content in the video stream. Our framework naturally accommodates personalisation of the application to suit biases and preferences of individual users.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In collaborative situations, eye gaze is a critical element of behavior which supports and fulfills many activities and roles. In current computer-supported collaboration systems, eye gaze is poorly supported. Even in a state-of-the-art video conferencing system such as the access grid, although one can see the face of the user, much of the communicative power of eye gaze is lost. This article gives an overview of some preliminary work that looks towards integrating eye gaze into an immersive collaborative virtual environment and assessing the impact that this would have on interaction between the users of such a system. Three experiments were conducted to assess the efficacy of eye gaze within immersive virtual environments. In each experiment, subjects observed on a large screen the eye-gaze behavior of an avatar. The eye-gaze behavior of that avatar had previously been recorded from a user with the use of a head-mounted eye tracker. The first experiment was conducted to assess the difference between users' abilities to judge what objects an avatar is looking at with only head gaze being viewed and also with eye- and head-gaze data being displayed. The results from the experiment show that eye gaze is of vital importance to the subjects, correctly identifying what a person is looking at in an immersive virtual environment. The second experiment examined whether a monocular or binocular eye-tracker would be required. This was examined by testing subjects' ability to identify where an avatar was looking from their eye direction alone, or by eye direction combined with convergence. This experiment showed that convergence had a significant impact on the subjects' ability to identify where the avatar was looking. The final experiment looked at the effects of stereo and mono-viewing of the scene, with the subjects being asked to identify where the avatar was looking. This experiment showed that there was no difference in the subjects' ability to detect where the avatar was gazing. This is followed by a description of how the eye-tracking system has been integrated into an immersive collaborative virtual environment and some preliminary results from the use of such a system.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Mulsemedia—multiple sensorial media—captures a wide variety of research efforts and applications. This article presents a historic perspective on mulsemedia work and reviews current developments in the area. These take place across the traditional multimedia spectrum—from virtual reality applications to computer games—as well as efforts in the arts, gastronomy, and therapy, to mention a few. We also describe standardization efforts, via the MPEG-V standard, and identify future developments and exciting challenges the community needs to overcome.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The literature reports research efforts allowing the editing of interactive TV multimedia documents by end-users. In this article we propose complementary contributions relative to end-user generated interactive video, video tagging, and collaboration. In earlier work we proposed the watch-and-comment (WaC) paradigm as the seamless capture of an individual`s comments so that corresponding annotated interactive videos be automatically generated. As a proof of concept, we implemented a prototype application, the WACTOOL, that supports the capture of digital ink and voice comments over individual frames and segments of the video, producing a declarative document that specifies both: different media stream structure and synchronization. In this article, we extend the WaC paradigm in two ways. First, user-video interactions are associated with edit commands and digital ink operations. Second, focusing on collaboration and distribution issues, we employ annotations as simple containers for context information by using them as tags in order to organize, store and distribute information in a P2P-based multimedia capture platform. We highlight the design principles of the watch-and-comment paradigm, and demonstrate related results including the current version of the WACTOOL and its architecture. We also illustrate how an interactive video produced by the WACTOOL can be rendered in an interactive video environment, the Ginga-NCL player, and include results from a preliminary evaluation.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We discuss the design and implementation of an integrated media creation environment, and demonstrate its efficacy in the generation of two simple home movies. The significance for the average user seeking to create home movies lies in the flexible and automatic application of film principles to the task, removal of tedious low-level editing by means of wellformed media transformations in terms of high-level film constructs (e.g. tempo), and content repurposing powered by those same transformations added to the rich semantic information maintained at each phase of the process.