12 results for audio-visual systems
in Digital Commons at Florida International University
Abstract:
This study explored the critical features of temporal synchrony for the facilitation of prenatal perceptual learning with respect to unimodal stimulation using an animal model, the bobwhite quail. The following related hypotheses were examined: (1) the availability of temporal synchrony is a critical feature to facilitate prenatal perceptual learning, (2) a single temporally synchronous note is sufficient to facilitate prenatal perceptual learning, with respect to unimodal stimulation, and (3) in situations where embryos are exposed to a single temporally synchronous note, facilitated perceptual learning, with respect to unimodal stimulation, will be optimal when the temporally synchronous note occurs at the onset of the stimulation bout. To assess these hypotheses, two experiments were conducted in which quail embryos were exposed to various audio-visual configurations of a bobwhite maternal call and tested at 24 hr after hatching for evidence of facilitated prenatal perceptual learning with respect to unimodal stimulation. Experiment 1 explored if intermodal equivalence was sufficient to facilitate prenatal perceptual learning with respect to unimodal stimulation. A Bimodal Sequential Temporal Equivalence (BSTE) condition was created that provided embryos with sequential auditory and visual stimulation in which the same amodal properties (rate, duration, rhythm) were made available across modalities. Experiment 2 assessed: (a) whether a limited number of temporally synchronous notes are sufficient for facilitated prenatal perceptual learning with respect to unimodal stimulation, and (b) whether there is a relationship between timing of occurrence of a temporally synchronous note and the facilitation of prenatal perceptual learning. Results revealed that prenatal exposure to BSTE was not sufficient to facilitate perceptual learning. In contrast, a maternal call that contained a single temporally synchronous note was sufficient to facilitate embryos’ prenatal perceptual learning with respect to unimodal stimulation. Furthermore, the most salient prenatal condition was that which contained the synchronous note at the onset of the call burst. Embryos’ prenatal perceptual learning of the call was four times faster in this condition than when exposed to a unimodal call. Taken together, bobwhite quail embryos’ remarkable sensitivity to temporal synchrony suggests that this amodal property plays a key role in attention and learning during prenatal development.
Abstract:
Context: Clinicians use exercises in rehabilitation to enhance sensorimotor function; however, evidence supporting their use is scarce. Objective: To evaluate acute effects of handheld vibration on joint position sense (JPS). Design: A repeated-measures, randomized, counterbalanced, 3-condition design. Setting: Sports Medicine and Science Research Laboratory. Patients or Other Participants: 31 healthy college-aged volunteers (16 males, 15 females; age = 23 ± 3 y, mass = 76 ± 14 kg, height = 173 ± 8 cm). Interventions: We measured elbow JPS and monitored training using the Flock of Birds system (Ascension Technology, Burlington, VT) and MotionMonitor software (Innsport, Chicago, IL), accurate to 0.5°. For each condition (15, 5, and 0 Hz vibration), subjects completed three 15-s bouts holding a 2.55 kg Mini-VibraFlex dumbbell (Orthometric, New York, NY) and used software-generated audio/visual biofeedback to locate the target. Participants performed separate pre- and post-test JPS measures for each condition. For JPS testing, subjects held a non-vibrating dumbbell, identified the target (90° flexion) using biofeedback, and relaxed 3-5 s. We then removed feedback, and subjects recreated the target and pressed a trigger. We used SPSS 14.0 (SPSS Inc., Chicago, IL) to perform separate ANOVAs (p < 0.05) for each protocol and calculated effect sizes using standardized mean differences. Main Outcome Measures: Dependent variables were absolute and variable error between target and reproduced angles, pre- and post-vibration training. Results: 0 Hz (F(1,61) = 1.310, p = 0.3) and 5 Hz (F(1,61) = 2.625, p = 0.1) vibration did not affect accuracy. 15 Hz vibration enhanced accuracy (absolute error decreased from 6.5 ± 0.6° to 5.0 ± 0.5°; F(1,61) = 8.681, p = 0.005, ES = 0.3). 0 Hz vibration did not affect variability (F(1,61) = 0.007, p = 0.9). 5 Hz vibration decreased variability (from 3.0 ± 1.8° to 2.3 ± 1.3°; F(1,61) = 7.250, p = 0.009), as did 15 Hz (from 2.8 ± 1.8° to 1.8 ± 1.2°; F(1,61) = 24.027, p < 0.001). Conclusions: Our results support using handheld vibration to improve sensorimotor function. Future research should include injured subjects, functional multi-joint/multi-planar measures, and the long-term effects of similar training.
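As a rough illustration of the outcome measures above, the sketch below (Python, with made-up trial data and hypothetical function names) shows how absolute error, variable error, and a standardized-mean-difference effect size could be computed from reproduced elbow angles; it is not the study's analysis code.

import numpy as np

TARGET_ANGLE = 90.0  # target elbow flexion, in degrees

def absolute_error(reproduced_angles):
    # Mean unsigned deviation from the target angle (accuracy measure).
    return np.mean(np.abs(np.asarray(reproduced_angles) - TARGET_ANGLE))

def variable_error(reproduced_angles):
    # Trial-to-trial variability: standard deviation of the signed errors.
    errors = np.asarray(reproduced_angles) - TARGET_ANGLE
    return np.std(errors, ddof=1)

def effect_size(pre_scores, post_scores):
    # Standardized mean difference (pooled-SD form), a generic stand-in
    # for the effect sizes reported above.
    pre, post = np.asarray(pre_scores, float), np.asarray(post_scores, float)
    pooled_sd = np.sqrt((pre.var(ddof=1) + post.var(ddof=1)) / 2.0)
    return (pre.mean() - post.mean()) / pooled_sd

# Example with invented angles for one participant
pre_trials = [96.5, 95.0, 97.2]
post_trials = [94.8, 95.5, 94.1]
print(absolute_error(pre_trials), variable_error(pre_trials))
print(effect_size([6.5, 6.0, 7.1], [5.0, 4.8, 5.3]))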
Abstract:
The main challenges of multimedia data retrieval lie in the effective mapping between low-level features and high-level concepts, and in individual users' subjective perceptions of multimedia content. The objectives of this dissertation are to develop an integrated multimedia indexing and retrieval framework with the aim of bridging the gap between semantic concepts and low-level features. To achieve this goal, a set of core techniques has been developed, including image segmentation, content-based image retrieval, object tracking, video indexing, and video event detection. These core techniques are integrated in a systematic way to enable semantic search for images/videos, and can be tailored to solve problems in other multimedia-related domains. In image retrieval, two new methods of bridging the semantic gap are proposed: (1) For general content-based image retrieval, a stochastic mechanism is utilized to enable the long-term learning of high-level concepts from a set of training data, such as user access frequencies and access patterns of images. (2) In addition to whole-image retrieval, a novel multiple instance learning framework is proposed for object-based image retrieval, by which a user is allowed to more effectively search for images that contain multiple objects of interest. An enhanced image segmentation algorithm is developed to extract object information from images. This segmentation algorithm is further used in video indexing and retrieval, where a robust video shot/scene segmentation method is developed based on low-level visual feature comparison, object tracking, and audio analysis. Based on shot boundaries, a novel data mining framework is further proposed to detect events in soccer videos, fully utilizing the multi-modality features and object information obtained through video shot/scene detection. Another contribution of this dissertation is the potential of the above techniques to be tailored and applied to other multimedia applications. This is demonstrated by their utilization in traffic video surveillance applications. The enhanced image segmentation algorithm, coupled with an adaptive background learning algorithm, improves the performance of vehicle identification. A sophisticated object tracking algorithm is proposed to track individual vehicles, while the spatial and temporal relationships of vehicle objects are modeled by an abstract semantic model.
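The shot/scene segmentation step mentioned above compares low-level visual features across frames. The Python sketch below is a generic, hedged illustration of that idea using gray-level histogram differences between consecutive frames; the dissertation's enhanced algorithm, which also incorporates object tracking and audio analysis, is not reproduced here.

import numpy as np

def frame_histogram(frame, bins=64):
    # Normalized gray-level histogram of a 2-D frame with values in 0..255.
    hist, _ = np.histogram(frame, bins=bins, range=(0, 255))
    return hist / hist.sum()

def detect_shot_boundaries(frames, threshold=0.4):
    # Flag frame indices where the histogram difference exceeds a threshold.
    boundaries = []
    prev_hist = frame_histogram(frames[0])
    for i, frame in enumerate(frames[1:], start=1):
        hist = frame_histogram(frame)
        if np.abs(hist - prev_hist).sum() > threshold:  # L1 distance
            boundaries.append(i)
        prev_hist = hist
    return boundaries

# Synthetic example: a dark "shot" followed by a bright one
rng = np.random.default_rng(0)
dark = rng.integers(0, 80, size=(10, 120, 160))
bright = rng.integers(150, 255, size=(10, 120, 160))
print(detect_shot_boundaries(list(dark) + list(bright)))  # expect [10]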
Abstract:
Choosing between Light Rail Transit (LRT) and Bus Rapid Transit (BRT) systems is often controversial and not an easy task for transportation planners who are contemplating the upgrade of their public transportation services. These two transit systems provide comparable services for medium-sized cities from the suburban neighborhood to the Central Business District (CBD) and utilize similar right-of-way (ROW) categories. The research is aimed at developing a method to assist transportation planners and decision makers in determining the most feasible system between LRT and BRT. Cost estimation is a major factor when evaluating a transit system. Typically, LRT is more expensive to build and implement than BRT, but has significantly lower Operating and Maintenance (OM) costs than BRT. This dissertation examines the factors impacting capacity and costs, and develops cost models that provide capacity-based cost estimates for the LRT and BRT systems. Various ROW categories and alignment configurations of the systems are also considered in the developed cost models. Kikuchi's fleet size model (1985) and a cost allocation method are used to develop the cost models to estimate capacity and costs. The comparison between LRT and BRT is complicated by the many possible transportation planning and operation scenarios. In the end, a user-friendly computer interface integrated with the established capacity-based cost models, the LRT and BRT Cost Estimator (LBCostor), was developed using the Microsoft Visual Basic language to facilitate the process and guide users through the comparison operations. The cost models and the LBCostor can be used to analyze transit volumes, alignments, ROW configurations, number of stops and stations, headway, size of vehicle, and traffic signal timing at the intersections. Planners can make the necessary changes and adjustments depending on their operating practices.
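At the core of a capacity-based estimate of this kind is the relation between demand, headway, and fleet size (vehicles required is roughly the round-trip time divided by the headway). The Python sketch below illustrates that relation with placeholder parameters and cost rates; it is not the LBCostor model or Kikuchi's exact formulation.

import math

def required_headway_min(peak_demand_pph, vehicle_capacity):
    # Headway (minutes) needed so hourly line capacity meets peak demand.
    vehicles_per_hour = peak_demand_pph / vehicle_capacity
    return 60.0 / vehicles_per_hour

def fleet_size(route_length_km, avg_speed_kmh, layover_min, headway_min):
    # Vehicles required to sustain the headway over the round trip.
    round_trip_min = 2 * (route_length_km / avg_speed_kmh) * 60 + layover_min
    return math.ceil(round_trip_min / headway_min)

def annual_om_cost(fleet, km_per_vehicle_year, cost_per_vehicle_km):
    # Very coarse O&M cost proxy; real models break costs down much further.
    return fleet * km_per_vehicle_year * cost_per_vehicle_km

# Illustrative comparison for one corridor (all numbers are placeholders)
for mode, capacity, cost_per_km in [("LRT", 220, 6.0), ("BRT", 90, 4.5)]:
    h = required_headway_min(peak_demand_pph=3000, vehicle_capacity=capacity)
    n = fleet_size(route_length_km=15, avg_speed_kmh=30, layover_min=10, headway_min=h)
    print(mode, round(h, 1), "min headway,", n, "vehicles,",
          annual_om_cost(n, 60000, cost_per_km), "O&M per year (illustrative)")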
Abstract:
This research pursued the conceptualization, implementation, and verification of a system that enhances digital information displayed on an LCD panel for users with visual refractive errors. The target user groups for this system are individuals who have moderate to severe visual aberrations for which conventional means of compensation, such as glasses or contact lenses, do not improve their vision. This research is based on a priori knowledge of the user's visual aberration, as measured by a wavefront analyzer. With this information it is possible to generate images that, when displayed to this user, will counteract his/her visual aberration. The method described in this dissertation advances the development of techniques for providing such compensation by integrating spatial information in the image as a means to eliminate some of the shortcomings inherent in using display devices such as monitors or LCD panels. Additionally, physiological considerations are discussed and integrated into the method for providing said compensation. In order to provide a realistic sense of the performance of the methods described, they were tested by mathematical simulation in software, as well as by using a single-lens, high-resolution CCD camera that models an aberrated eye, and finally with human subjects having various forms of visual aberrations. Experiments were conducted on these systems, and the data collected from these experiments were evaluated using statistical analysis. The experimental results revealed that the pre-compensation method resulted in a statistically significant improvement in vision for all of the systems. Although significant, the improvement was not as large as expected for the human subject tests. Further analysis suggests that even under the controlled conditions employed for testing with human subjects, the characterization of the eye may be changing. This would require real-time monitoring of relevant variables (e.g., pupil diameter) and continuous adjustment of the pre-compensation process to yield maximum viewing enhancement.
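One common way to realize this kind of pre-compensation is to inverse-filter the displayed image by the eye's point spread function (PSF) derived from the wavefront measurement. The Python sketch below shows a generic Wiener-style version of that idea with a synthetic Gaussian PSF standing in for a refractive aberration; it omits the spatial-information and physiological refinements this dissertation adds.

import numpy as np

def precompensate(image, psf, noise_ratio=0.01):
    # Wiener-style inverse filter of the image by the eye's PSF.
    H = np.fft.fft2(psf, s=image.shape)           # optical transfer function
    wiener = np.conj(H) / (np.abs(H) ** 2 + noise_ratio)
    compensated = np.real(np.fft.ifft2(np.fft.fft2(image) * wiener))
    # Displays cannot render negative or >1 intensities, one reason the
    # achievable gain is limited in practice.
    return np.clip(compensated, 0.0, 1.0)

# Example: a Gaussian blur PSF standing in for a refractive aberration
x = np.arange(-7, 8)
g = np.exp(-(x ** 2) / (2 * 2.0 ** 2))
psf = np.outer(g, g)
psf /= psf.sum()
image = np.zeros((64, 64))
image[28:36, 28:36] = 1.0                          # a bright square target
print(precompensate(image, psf).shape)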
Abstract:
With the proliferation of multimedia data and ever-growing requests for multimedia applications, there is an increasing need for efficient and effective indexing, storage, and retrieval of multimedia data, such as graphics, images, animation, video, audio, and text. Due to the special characteristics of multimedia data, Multimedia Database Management Systems (MMDBMSs) have emerged and attracted great research attention in recent years. Though much research effort has been devoted to this area, it is still far from maturity and many open issues remain. In this dissertation, a systematic and integrated framework is proposed to address three of the essential challenges in developing an MMDBMS, namely the semantic gap, perception subjectivity, and data organization, with video and image databases serving as the testbed. In particular, the framework addresses these challenges separately yet coherently from three main aspects of an MMDBMS: multimedia data representation, indexing, and retrieval. In terms of multimedia data representation, the key to addressing the semantic gap is to intelligently and automatically model mid-level representations and/or semi-semantic descriptors in addition to extracting low-level media features. The data organization challenge is mainly addressed through media indexing, where various levels of indexing are required to support diverse query requirements. In particular, the focus of this study is to facilitate high-level video indexing by proposing a multimodal event mining framework associated with temporal knowledge discovery approaches. With respect to the perception subjectivity issue, advanced techniques are proposed to support user interaction and to effectively model users' perceptions from feedback at both the image level and the object level.
Abstract:
Digital systems can generate left and right audio channels that create the effect of virtual sound source placement (spatialization) by processing an audio signal through pairs of Head-Related Transfer Functions (HRTFs) or, equivalently, Head-Related Impulse Responses (HRIRs). The spatialization effect is better when individually measured HRTFs or HRIRs are used than when generic ones (e.g., from a mannequin) are used. However, the measurement process is not available to the majority of users. There is ongoing interest in finding mechanisms to customize HRTFs or HRIRs to a specific user, in order to achieve an improved spatialization effect for that subject. Unfortunately, the current models used for HRTFs and HRIRs contain over a hundred parameters, and none of those parameters can be easily related to the characteristics of the subject. This dissertation proposes an alternative model for the representation of HRTFs, which contains at most 30 parameters, all of which have a defined functional significance. It also presents methods to obtain the values of the parameters in the model that make it approximately equivalent to an individually measured HRTF. This conversion is achieved by the systematic deconstruction of HRIR sequences through an augmented version of the Hankel Total Least Squares (HTLS) decomposition approach. An average 95% match (fit) was observed between the original HRIRs and those reconstructed from the Damped and Delayed Sinusoids (DDSs) found by the decomposition process, for ipsilateral source locations. The dissertation also introduces and evaluates an HRIR customization procedure, based on a multilinear model implemented through a 3-mode tensor, for mapping anatomical data from the subjects to the HRIR sequences at different sound source locations. This model uses the Higher-Order Singular Value Decomposition (HOSVD) method to represent the HRIRs and is capable of generating customized HRIRs from easily attainable anatomical measurements of a new intended user of the system. Listening tests were performed to compare the spatialization performance of customized, generic, and individually measured HRIRs when they are used for synthesized spatial audio. Statistical analysis of the results confirms that the type of HRIRs used for spatialization is a significant factor in spatialization success, with the customized HRIRs yielding better results than generic HRIRs.
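To make the damped-and-delayed-sinusoid representation concrete, the Python sketch below re-synthesizes an impulse response from a small set of (amplitude, damping, frequency, phase, delay) tuples. The parameter values are invented for illustration, and the HTLS-based decomposition that would estimate them from a measured HRIR is not shown.

import numpy as np

def ddss_hrir(params, n_samples, fs=44100.0):
    # Sum of damped, delayed sinusoids (DDSs).
    # params: iterable of (amplitude, damping_per_s, freq_hz, phase_rad, delay_s)
    t = np.arange(n_samples) / fs
    h = np.zeros(n_samples)
    for amp, damping, freq, phase, delay in params:
        tau = t - delay
        active = tau >= 0.0   # each component starts after its own delay
        h[active] += amp * np.exp(-damping * tau[active]) * \
                     np.cos(2 * np.pi * freq * tau[active] + phase)
    return h

# A toy three-component model; a real fit would use more components while
# keeping the overall parameter count small, as described above.
toy_params = [(1.0, 4000.0, 3000.0, 0.0, 0.0005),
              (0.5, 6000.0, 7500.0, 1.2, 0.0007),
              (0.2, 8000.0, 12000.0, -0.8, 0.0009)]
hrir = ddss_hrir(toy_params, n_samples=256)
print(hrir[:5])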
Abstract:
With the recent explosion in the complexity and amount of digital multimedia data, there has been a huge impact on the operations of various organizations in distinct areas, such as government services, education, medical care, business, and entertainment. To satisfy the growing demand for multimedia data management systems, an integrated framework called DIMUSE is proposed and deployed for distributed multimedia applications to offer a full scope of multimedia-related tools and provide appealing experiences for users. This research mainly focuses on video database modeling and retrieval by addressing a set of core challenges. First, a comprehensive multimedia database modeling mechanism called the Hierarchical Markov Model Mediator (HMMM) is proposed to model high-dimensional media data, including video objects, low-level visual/audio features, and historical access patterns and frequencies. The associated retrieval and ranking algorithms are designed to support not only general queries but also complicated temporal event pattern queries. Second, system training and learning methodologies are incorporated so that user interests are mined efficiently to improve retrieval performance. Third, video clustering techniques are proposed to continuously increase searching speed and accuracy by architecting a more efficient multimedia database structure. A distributed video management and retrieval system is designed and implemented to demonstrate the overall performance. The proposed approach is further customized for a mobile-based video retrieval system to address the perception subjectivity issue by considering individual users' profiles. Moreover, to deal with security and privacy issues and concerns in distributed multimedia applications, DIMUSE also incorporates a practical framework called SMARXO, which supports multilevel multimedia security control. SMARXO efficiently combines role-based access control (RBAC), XML, and an object-relational database management system (ORDBMS) to achieve proficient security control. A distributed multimedia management system named DMMManager (Distributed MultiMedia Manager) is developed with the proposed framework DIMUSE to support multimedia capturing, analysis, retrieval, authoring, and presentation in a single framework.
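As a loose illustration of access-pattern-aware retrieval of the kind described above, the Python sketch below ranks stored videos by blending low-level feature similarity with historical access frequency. The weighting scheme and names are hypothetical and do not reproduce the HMMM formulation.

import numpy as np

def rank_videos(query_feat, video_feats, access_counts, alpha=0.7):
    # Return video indices sorted by a blend of similarity and popularity.
    feats = np.asarray(video_feats, dtype=float)
    query = np.asarray(query_feat, dtype=float)
    # Cosine similarity between the query and each stored feature vector
    sims = feats @ query / (np.linalg.norm(feats, axis=1) * np.linalg.norm(query))
    popularity = np.asarray(access_counts, dtype=float)
    if popularity.max() > 0:
        popularity = popularity / popularity.max()
    scores = alpha * sims + (1 - alpha) * popularity
    return np.argsort(-scores)

# Toy example: three stored videos with 2-D features and access counts
print(rank_videos([1.0, 0.0], [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]], [5, 40, 2]))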
Abstract:
More information is now readily available to computer users than at any time in human history; however, much of this information is often inaccessible to people with blindness or low vision, for whom information must be presented non-visually. Currently, screen readers are able to verbalize on-screen text using text-to-speech (TTS) synthesis; however, much of this vocalization is inadequate for browsing the Internet. An auditory interface that incorporates auditory-spatial orientation was created and tested. For information that can be structured as a two-dimensional table, links can be semantically grouped as cells in a row within an auditory table, which provides a consistent structure for auditory navigation. An auditory display prototype was tested. Sixteen legally blind subjects participated in this research study. Results demonstrated that stereo panning was an effective technique for audio-spatially orienting non-visual navigation in a five-row, six-column HTML table as compared to a centered, stationary synthesized voice. These results were based on measuring the time-to-target (TTT), or the amount of time elapsed from the first prompting to the selection of each tabular link. Preliminary analysis of the TTT values recorded during the experiment showed that the populations did not conform to the ANOVA requirements of normality and equality of variances. Therefore, the data were transformed using the natural logarithm. The repeated-measures two-factor ANOVA results show that the logarithmically transformed TTTs were significantly affected by the tonal variation method, F(1,15) = 6.194, p = 0.025. Similarly, the results show that the logarithmically transformed TTTs were marginally affected by the stereo spatialization method, F(1,15) = 4.240, p = 0.057. The results show that the logarithmically transformed TTTs were not significantly affected by the interaction of both methods, F(1,15) = 1.381, p = 0.258. These results suggest that employing both of these methods simultaneously may cause some confusion for the subject. The significant effect of tonal variation indicates that it actually increased the average TTT; in other words, the presence of preceding tones increased task completion time on average. The marginally significant effect of stereo spatialization decreased the average log(TTT) from 2.405 to 2.264.
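The stereo panning technique evaluated above can be illustrated with a constant-power panning law that maps a table column to left/right channel gains. The Python sketch below is a generic example using a six-column layout that mirrors the study; it is not the prototype's implementation.

import numpy as np

def column_pan_gains(col, n_cols=6):
    # Left/right gains for a 0-based column index, spread across the stereo field.
    pan = col / (n_cols - 1)             # 0.0 = far left, 1.0 = far right
    angle = pan * (np.pi / 2)            # constant-power panning law
    return np.cos(angle), np.sin(angle)  # (left_gain, right_gain)

def spatialize(mono_speech, col, n_cols=6):
    # Turn a mono TTS buffer into a stereo buffer panned to its column.
    left_gain, right_gain = column_pan_gains(col, n_cols)
    mono = np.asarray(mono_speech, dtype=float)
    return np.stack([mono * left_gain, mono * right_gain], axis=1)

for c in range(6):
    print(c, [round(g, 2) for g in column_pan_gains(c)])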
Abstract:
A man-machine system called a teleoperator system has been developed to work in hazardous environments such as nuclear reactor plants. Force reflection is a type of force feedback in which forces experienced by the remote manipulator are fed back to the manual controller. In a force-reflecting teleoperation system, the operator uses the manual controller to direct the remote manipulator and receives visual information from a video image and/or graphical animation on the computer screen. This thesis presents the design of a portable Force-Reflecting Manual Controller (FRMC) for the teleoperation of tasks such as hazardous material handling, waste cleanup, and space-related operations. The work consists of the design and construction of a prototype 1-Degree-of-Freedom (DOF) FRMC, the development of the Graphical User Interface (GUI), and system integration. Two control strategies, PID and fuzzy logic controllers, are developed and experimentally tested, and the system response of each is analyzed and evaluated. In addition, the concept of a telesensation system is introduced, and a variety of design alternatives for a 3-DOF FRMC are proposed for future development.
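For reference, the Python sketch below shows a minimal discrete PID position controller of the general kind compared against the fuzzy logic controller; the gains, time step, and the crude damped joint model are placeholders for illustration, and the fuzzy controller is not shown.

class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measurement):
        # Standard discrete PID: proportional + integral + derivative of error.
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Toy closed loop: drive a simple damped joint model toward 30 degrees
pid = PID(kp=2.0, ki=0.2, kd=1.5, dt=0.01)
position, velocity = 0.0, 0.0
for _ in range(1000):
    torque = pid.update(setpoint=30.0, measurement=position)
    velocity += (torque - 0.5 * velocity) * 0.01   # crude damped dynamics
    position += velocity * 0.01
print(round(position, 1))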