7 resultados para Audio indexing
em Digital Commons at Florida International University
Resumo:
The main challenges of multimedia data retrieval lie in the effective mapping between low-level features and high-level concepts, and in the individual users' subjective perceptions of multimedia content. ^ The objectives of this dissertation are to develop an integrated multimedia indexing and retrieval framework with the aim to bridge the gap between semantic concepts and low-level features. To achieve this goal, a set of core techniques have been developed, including image segmentation, content-based image retrieval, object tracking, video indexing, and video event detection. These core techniques are integrated in a systematic way to enable the semantic search for images/videos, and can be tailored to solve the problems in other multimedia related domains. In image retrieval, two new methods of bridging the semantic gap are proposed: (1) for general content-based image retrieval, a stochastic mechanism is utilized to enable the long-term learning of high-level concepts from a set of training data, such as user access frequencies and access patterns of images. (2) In addition to whole-image retrieval, a novel multiple instance learning framework is proposed for object-based image retrieval, by which a user is allowed to more effectively search for images that contain multiple objects of interest. An enhanced image segmentation algorithm is developed to extract the object information from images. This segmentation algorithm is further used in video indexing and retrieval, by which a robust video shot/scene segmentation method is developed based on low-level visual feature comparison, object tracking, and audio analysis. Based on shot boundaries, a novel data mining framework is further proposed to detect events in soccer videos, while fully utilizing the multi-modality features and object information obtained through video shot/scene detection. ^ Another contribution of this dissertation is the potential of the above techniques to be tailored and applied to other multimedia applications. This is demonstrated by their utilization in traffic video surveillance applications. The enhanced image segmentation algorithm, coupled with an adaptive background learning algorithm, improves the performance of vehicle identification. A sophisticated object tracking algorithm is proposed to track individual vehicles, while the spatial and temporal relationships of vehicle objects are modeled by an abstract semantic model. ^
Resumo:
Moving objects database systems are the most challenging sub-category among Spatio-Temporal database systems. A database system that updates in real-time the location information of GPS-equipped moving vehicles has to meet even stricter requirements. Currently existing data storage models and indexing mechanisms work well only when the number of moving objects in the system is relatively small. This dissertation research aimed at the real-time tracking and history retrieval of massive numbers of vehicles moving on road networks. A total solution has been provided for the real-time update of the vehicles' location and motion information, range queries on current and history data, and prediction of vehicles' movement in the near future. ^ To achieve these goals, a new approach called Segmented Time Associated to Partitioned Space (STAPS) was first proposed in this dissertation for building and manipulating the indexing structures for moving objects databases. ^ Applying the STAPS approach, an indexing structure of associating a time interval tree to each road segment was developed for real-time database systems of vehicles moving on road networks. The indexing structure uses affordable storage to support real-time data updates and efficient query processing. The data update and query processing performance it provides is consistent without restrictions such as a time window or assuming linear moving trajectories. ^ An application system design based on distributed system architecture with centralized organization was developed to maximally support the proposed data and indexing structures. The suggested system architecture is highly scalable and flexible. Finally, based on a real-world application model of vehicles moving in region-wide, main issues on the implementation of such a system were addressed. ^
Resumo:
With the proliferation of multimedia data and ever-growing requests for multimedia applications, there is an increasing need for efficient and effective indexing, storage and retrieval of multimedia data, such as graphics, images, animation, video, audio and text. Due to the special characteristics of the multimedia data, the Multimedia Database management Systems (MMDBMSs) have emerged and attracted great research attention in recent years. Though much research effort has been devoted to this area, it is still far from maturity and there exist many open issues. In this dissertation, with the focus of addressing three of the essential challenges in developing the MMDBMS, namely, semantic gap, perception subjectivity and data organization, a systematic and integrated framework is proposed with video database and image database serving as the testbed. In particular, the framework addresses these challenges separately yet coherently from three main aspects of a MMDBMS: multimedia data representation, indexing and retrieval. In terms of multimedia data representation, the key to address the semantic gap issue is to intelligently and automatically model the mid-level representation and/or semi-semantic descriptors besides the extraction of the low-level media features. The data organization challenge is mainly addressed by the aspect of media indexing where various levels of indexing are required to support the diverse query requirements. In particular, the focus of this study is to facilitate the high-level video indexing by proposing a multimodal event mining framework associated with temporal knowledge discovery approaches. With respect to the perception subjectivity issue, advanced techniques are proposed to support users' interaction and to effectively model users' perception from the feedback at both the image-level and object-level.
Resumo:
Digital systems can generate left and right audio channels that create the effect of virtual sound source placement (spatialization) by processing an audio signal through pairs of Head-Related Transfer Functions (HRTFs) or, equivalently, Head-Related Impulse Responses (HRIRs). The spatialization effect is better when individually-measured HRTFs or HRIRs are used than when generic ones (e.g., from a mannequin) are used. However, the measurement process is not available to the majority of users. There is ongoing interest to find mechanisms to customize HRTFs or HRIRs to a specific user, in order to achieve an improved spatialization effect for that subject. Unfortunately, the current models used for HRTFs and HRIRs contain over a hundred parameters and none of those parameters can be easily related to the characteristics of the subject. This dissertation proposes an alternative model for the representation of HRTFs, which contains at most 30 parameters, all of which have a defined functional significance. It also presents methods to obtain the value of parameters in the model to make it approximately equivalent to an individually-measured HRTF. This conversion is achieved by the systematic deconstruction of HRIR sequences through an augmented version of the Hankel Total Least Squares (HTLS) decomposition approach. An average 95% match (fit) was observed between the original HRIRs and those re-constructed from the Damped and Delayed Sinusoids (DDSs) found by the decomposition process, for ipsilateral source locations. The dissertation also introduces and evaluates an HRIR customization procedure, based on a multilinear model implemented through a 3-mode tensor, for mapping of anatomical data from the subjects to the HRIR sequences at different sound source locations. This model uses the Higher-Order Singular Value Decomposition (HOSVD) method to represent the HRIRs and is capable of generating customized HRIRs from easily attainable anatomical measurements of a new intended user of the system. Listening tests were performed to compare the spatialization performance of customized, generic and individually-measured HRIRs when they are used for synthesized spatial audio. Statistical analysis of the results confirms that the type of HRIRs used for spatialization is a significant factor in the spatialization success, with the customized HRIRs yielding better results than generic HRIRs.
Resumo:
Moving objects database systems are the most challenging sub-category among Spatio-Temporal database systems. A database system that updates in real-time the location information of GPS-equipped moving vehicles has to meet even stricter requirements. Currently existing data storage models and indexing mechanisms work well only when the number of moving objects in the system is relatively small. This dissertation research aimed at the real-time tracking and history retrieval of massive numbers of vehicles moving on road networks. A total solution has been provided for the real-time update of the vehicles’ location and motion information, range queries on current and history data, and prediction of vehicles’ movement in the near future. To achieve these goals, a new approach called Segmented Time Associated to Partitioned Space (STAPS) was first proposed in this dissertation for building and manipulating the indexing structures for moving objects databases. Applying the STAPS approach, an indexing structure of associating a time interval tree to each road segment was developed for real-time database systems of vehicles moving on road networks. The indexing structure uses affordable storage to support real-time data updates and efficient query processing. The data update and query processing performance it provides is consistent without restrictions such as a time window or assuming linear moving trajectories. An application system design based on distributed system architecture with centralized organization was developed to maximally support the proposed data and indexing structures. The suggested system architecture is highly scalable and flexible. Finally, based on a real-world application model of vehicles moving in region-wide, main issues on the implementation of such a system were addressed.
Resumo:
This study examines the correlation between how certified music educators understand audio technology and how they incorporate it in their instructional methods. Participants were classroom music teachers selected from fifty middle schools in Miami- Dade Public Schools. The study adopted a non-experimental research design in which a survey was the primary tool of investigation. The findings reveal that a majority of middle school music teachers in Miami-Dade are not familiar with advanced audiorecording software or any other digital device dedicated to the recording and processing of audio signals. Moreover, they report a lack of opportunities to develop this knowledge. Younger music teachers, however, are more open to developing up-to-date instructional methodologies. Most of the participants agreed that music instruction should be a platform for preparing students for a future in the entertainment industry. A basic knowledge of music business should be delivered to students enrolled in middle-school music courses.
Resumo:
With the exponential growth of the usage of web-based map services, the web GIS application has become more and more popular. Spatial data index, search, analysis, visualization and the resource management of such services are becoming increasingly important to deliver user-desired Quality of Service. First, spatial indexing is typically time-consuming and is not available to end-users. To address this, we introduce TerraFly sksOpen, an open-sourced an Online Indexing and Querying System for Big Geospatial Data. Integrated with the TerraFly Geospatial database [1-9], sksOpen is an efficient indexing and query engine for processing Top-k Spatial Boolean Queries. Further, we provide ergonomic visualization of query results on interactive maps to facilitate the user’s data analysis. Second, due to the highly complex and dynamic nature of GIS systems, it is quite challenging for the end users to quickly understand and analyze the spatial data, and to efficiently share their own data and analysis results with others. Built on the TerraFly Geo spatial database, TerraFly GeoCloud is an extra layer running upon the TerraFly map and can efficiently support many different visualization functions and spatial data analysis models. Furthermore, users can create unique URLs to visualize and share the analysis results. TerraFly GeoCloud also enables the MapQL technology to customize map visualization using SQL-like statements [10]. Third, map systems often serve dynamic web workloads and involve multiple CPU and I/O intensive tiers, which make it challenging to meet the response time targets of map requests while using the resources efficiently. Virtualization facilitates the deployment of web map services and improves their resource utilization through encapsulation and consolidation. Autonomic resource management allows resources to be automatically provisioned to a map service and its internal tiers on demand. v-TerraFly are techniques to predict the demand of map workloads online and optimize resource allocations, considering both response time and data freshness as the QoS target. The proposed v-TerraFly system is prototyped on TerraFly, a production web map service, and evaluated using real TerraFly workloads. The results show that v-TerraFly can accurately predict the workload demands: 18.91% more accurate; and efficiently allocate resources to meet the QoS target: improves the QoS by 26.19% and saves resource usages by 20.83% compared to traditional peak load-based resource allocation.