873 results for Audio-visual Speech Recognition, Visual Feature Extraction, Free-parts, Monolithic, ROI
Abstract:
Parkinson's disease (PD) is a neurodegenerative disorder, the second most common after Alzheimer's disease. After diagnosis, treatments can help to relieve the symptoms, but there is no known cure for PD. PD is characterized by a combination of motor and non-motor dysfunctions. Among the motor symptoms is the so-called Freezing of Gait (FoG), a phenomenon in which the patient's feet seem stuck to the floor and it is difficult for the patient to initiate movement. FoG is a severe problem: it is associated with falls, anxiety, loss of mobility, accidents and mortality, and it has substantial clinical and social consequences that decrease the quality of life of PD patients. Medication can be very successful in controlling movement disorders and dealing with some PD symptoms. However, the relationship between medication and the development of FoG remains unclear. Several studies have demonstrated that visual or auditory rhythmic cueing allows PD patients to improve their motor abilities. Rhythmic auditory stimulation (RAS) was shown to be particularly effective at improving gait, especially in patients who manifest FoG. While RAS can reduce the duration and effects of FoG episodes once the FoG has been detected, it cannot prevent the episode because of the detection latency. Predicting the FoG would therefore improve the system. This thesis was developed with two main objectives: (1) to find specific properties of pre-FoG periods that distinguish them from normal walking and from other walking events such as turns and stops, using the information provided by inertial measurement units (IMUs), and (2) to formulate a model that automatically detects pre-FoG patterns in order to completely avoid the upcoming freezing event in PD patients. The first part focuses on the analysis of different methods for extracting features that may precede the occurrence of FoG.
Abstract:
Bibliography: p. 41.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-06
Abstract:
This chapter provides the theoretical foundation and background for the data envelopment analysis (DEA) method. We first introduce the basic DEA models. The balance of this chapter focuses on evidence showing that DEA has been extensively applied to measure the efficiency and productivity of services, including financial services (banking, insurance, securities, and fund management), professional services, health services, education services, environmental and public services, energy services, logistics, tourism, information technology, telecommunications, transport, distribution, audio-visual, media, entertainment, cultural and other business services. Finally, we provide information on the use of Performance Improvement Management Software (PIM-DEA). A free, limited version of this software and the procedure for downloading it are also included in this chapter.
Abstract:
The main challenges of multimedia data retrieval lie in the effective mapping between low-level features and high-level concepts, and in individual users' subjective perceptions of multimedia content.
The objective of this dissertation is to develop an integrated multimedia indexing and retrieval framework that bridges the gap between semantic concepts and low-level features. To achieve this goal, a set of core techniques has been developed, including image segmentation, content-based image retrieval, object tracking, video indexing, and video event detection. These core techniques are integrated in a systematic way to enable semantic search for images and videos, and can be tailored to solve problems in other multimedia-related domains. In image retrieval, two new methods of bridging the semantic gap are proposed: (1) for general content-based image retrieval, a stochastic mechanism is used to enable the long-term learning of high-level concepts from a set of training data, such as user access frequencies and access patterns of images; (2) in addition to whole-image retrieval, a novel multiple instance learning framework is proposed for object-based image retrieval, which allows a user to search more effectively for images that contain multiple objects of interest. An enhanced image segmentation algorithm is developed to extract the object information from images. This segmentation algorithm is further used in video indexing and retrieval, where a robust video shot/scene segmentation method is developed based on low-level visual feature comparison, object tracking, and audio analysis. Based on shot boundaries, a novel data mining framework is further proposed to detect events in soccer videos, fully utilizing the multi-modality features and object information obtained through video shot/scene detection.
Another contribution of this dissertation is the potential of the above techniques to be tailored and applied to other multimedia applications. This is demonstrated by their use in traffic video surveillance applications. The enhanced image segmentation algorithm, coupled with an adaptive background learning algorithm, improves the performance of vehicle identification. A sophisticated object tracking algorithm is proposed to track individual vehicles, while the spatial and temporal relationships of vehicle objects are modeled by an abstract semantic model.
Abstract:
This study explored the critical features of temporal synchrony for the facilitation of prenatal perceptual learning with respect to unimodal stimulation using an animal model, the bobwhite quail. The following related hypotheses were examined: (1) the availability of temporal synchrony is a critical feature to facilitate prenatal perceptual learning, (2) a single temporally synchronous note is sufficient to facilitate prenatal perceptual learning, with respect to unimodal stimulation, and (3) in situations where embryos are exposed to a single temporally synchronous note, facilitated perceptual learning, with respect to unimodal stimulation, will be optimal when the temporally synchronous note occurs at the onset of the stimulation bout. To assess these hypotheses, two experiments were conducted in which quail embryos were exposed to various audio-visual configurations of a bobwhite maternal call and tested at 24 hr after hatching for evidence of facilitated prenatal perceptual learning with respect to unimodal stimulation. Experiment 1 explored if intermodal equivalence was sufficient to facilitate prenatal perceptual learning with respect to unimodal stimulation. A Bimodal Sequential Temporal Equivalence (BSTE) condition was created that provided embryos with sequential auditory and visual stimulation in which the same amodal properties (rate, duration, rhythm) were made available across modalities. Experiment 2 assessed: (a) whether a limited number of temporally synchronous notes are sufficient for facilitated prenatal perceptual learning with respect to unimodal stimulation, and (b) whether there is a relationship between timing of occurrence of a temporally synchronous note and the facilitation of prenatal perceptual learning. Results revealed that prenatal exposure to BSTE was not sufficient to facilitate perceptual learning. 
In contrast, a maternal call that contained a single temporally synchronous note was sufficient to facilitate embryos’ prenatal perceptual learning with respect to unimodal stimulation. Furthermore, the most salient prenatal condition was that which contained the synchronous note at the onset of the call burst. Embryos’ prenatal perceptual learning of the call was four times faster in this condition than when exposed to a unimodal call. Taken together, bobwhite quail embryos’ remarkable sensitivity to temporal synchrony suggests that this amodal property plays a key role in attention and learning during prenatal development.
Abstract:
In recent years, the remarkably rapid advance of technology has led to the development and spread of portable electronic devices that are extremely small yet offer considerable computational power. More specifically, one category of devices that is currently developing rapidly and has already appeared on the world market is that of Wearable devices. As the name suggests, they are designed to be literally worn and are intended to provide continuous support, in various settings, to those who use them. If the user does not necessarily have to use their hands to interact with them, they are called Wearable Hands-Free devices. These are generally able to perceive and capture the user's input using various techniques and methods that are not based on touch. One of these models the user's input through their voice, relying on the discipline of ASR (Automatic Speech Recognition), which deals with the translation of spoken language into text by means of computerized devices. This leads to the objective of the thesis, which is to develop a framework, usable in the context of Wearable devices, that provides a speech recognition service by relying on an existing one, so that it offers a certain level of efficiency and ease of use. More generally, this document aims to provide an in-depth description of Wearable and Wearable Hands-Free devices, defining their characteristics, critical issues, and areas of use. In addition, the intent is to illustrate the operating principles of Automatic Speech Recognition and then to move on to the analysis, design, and development of the aforementioned framework.
Abstract:
In contrast to Muslim traditions and customs, the US government and society seem to invest in the media to forge discourses on the Western way of life. In addition, the media create idealized images of the woman, the hero, the father, and the family, and an everyday speech invoking repeated and widespread moral values, including “justice” and “freedom”, in opposition to “terror”. In this research we analysed the TV series Homeland, using as theoretical support Cultural Studies, particularly the concept of Social Representation by Denise Jodelet, the analytical tools created by Michel Foucault on apparatuses of power, and the feminist studies of Teresa de Lauretis. We sought to see how correlations of forces operate, and how representations of womanhood, sexuality and nationality are built and reiterated in speeches, creating patterns of behaviour for men and women. By spreading images of the “good” man, the “good” wife, and the “hero”, the audio-visual product creates and produces the family, the society and the nation considered exemplary.
Abstract:
When referring to cinema and its emancipatory potential, realism, like Plato’s pharmakon, has signified both illness and cure, poison and medicine. On the one hand, realism is regarded as the main feature of so-called classical cinema, inherently conservative and thoroughly ideological, its main raison d’être being to reify and make a particular version of the status quo believable and to pass it off as ‘reality’ (Burch, 1990; MacCabe, 1974). On the other, realism has also been interpreted as a quest for truth and social justice, as in the positivist ethos that informs documentary (Zavattini, 1953). Even in the latter sense, however, the extent to which realism has served colonizing ends when used to investigate the ‘truth’ of the Other has also been noted, rendering the form profoundly suspicious (Chow, 2007, p. 150). For realism has been a Western form of representation, one that can be traced back to the invention of perspective in painting and that peaked with the secular worldview brought about by the Enlightenment. And like realism, the nation state too is a product of the Enlightenment, nationalism being, as it were, a secular replacement for the religious - that is, enchanted or fantastic - worldview. In this way, realism, cinema and nation are inextricably linked, and equally strained under the current decline of the Enlightenment paradigm. This chapter looks at Y tu Mamá También by Alfonso Cuarón (2001), a highly successful road movie with documentary features, to explore the ways in which realism, cinema and nation interact with each other in the present conditions of ‘globalization’ as experienced in Mexico. The chapter compares and contrasts various interpretations of the role of realism in this film, put forward by critics and scholars, and other discourses about it circulating in the media, with the actual ways in which audiences engage with it.
Abstract:
Current state of the art techniques for landmine detection in ground penetrating radar (GPR) utilize statistical methods to identify characteristics of a landmine response. This research makes use of 2-D slices of data in which subsurface landmine responses have hyperbolic shapes. Various methods from the field of visual image processing are adapted to the 2-D GPR data, producing superior landmine detection results. This research goes on to develop a physics-based GPR augmentation method motivated by current advances in visual object detection. This GPR specific augmentation is used to mitigate issues caused by insufficient training sets. This work shows that augmentation improves detection performance under training conditions that are normally very difficult. Finally, this work introduces the use of convolutional neural networks as a method to learn feature extraction parameters. These learned convolutional features outperform hand-designed features in GPR detection tasks. This work presents a number of methods, both borrowed from and motivated by the substantial work in visual image processing. The methods developed and presented in this work show an improvement in overall detection performance and introduce a method to improve the robustness of statistical classification.
Abstract:
This article argues that sonic technologies, such as telephones, voice recorders and phonographs, alongside more (audio)visual ones such as flickering fluorescent lights, videos and television sets, are crucial to the world of Twin Peaks, and constitute this world as both a communications network with portals to the unknown and an accumulation of recordings of ghosted voices and entities, perhaps finding its ultimate expression in the backwards reprocessed speech in the Black Lodge. This lodge can be understood as a space in which there are nothing but recordings, albeit now on a cosmic, spiritual and demonic level. Using a media archaeological approach to these devices in the series, the article argues that they were already operating by a media archaeological logic, generating the world of Twin Peaks as a haunted archive of sonic and other mediations.
Abstract:
The aim of this article is twofold: on the one hand, to explore the European Union's ability to pursue an audiovisual policy directed at Mercosur and to promote the norms of the Convention on the Diversity of Cultural Expressions; on the other, to analyse the impact of the EU's audiovisual policy model on the development of audiovisual cooperation with Mercosur and to focus on the main vectors shaping Mercosur's audiovisual landscape. The text seeks to highlight how and why the EU pursues an audiovisual policy with that region, and what its purposes and limits of action are. In this regard, it is concerned with understanding how the EU's audiovisual diplomacy interacts with other actors, such as the governmental actions carried out by the EU itself and by Mercosur, as well as the practices of the private sector (Hollywood and the large media conglomerates).
Abstract:
This text presents an introduction to Asperger syndrome and the characteristics that distinguish it, in order to learn a little more about what this Pervasive Developmental Disorder (PDD) consists of. It also sets out which communication and language tools are best suited to the subject's teaching and learning, emphasizing visual, audiovisual, and artistic resources as learning tools for social inclusion in any sphere of society (schools, secondary schools, associations, universities, or public administrations).
Abstract:
Humans have a high ability to extract information from visual data acquired by sight. Through a learning process, which starts at birth and continues throughout life, image interpretation becomes almost instinctive. At a glance, one can easily describe a scene with reasonable precision, naming its main components. Usually, this is done by extracting low-level features such as edges, shapes and textures, and associating them with high-level meanings. In this way, a semantic description of the scene is produced. An example of this is the human capacity to recognize and describe other people's physical and behavioral characteristics, or biometrics. Soft biometrics also represent inherent characteristics of the human body and behaviour, but do not allow unique identification of a person. The field of computer vision aims to develop methods capable of performing visual interpretation with performance similar to that of humans. This thesis proposes computer vision methods that allow high-level information to be extracted from images in the form of soft biometrics. The problem is approached in two ways: unsupervised and supervised learning methods. The first seeks to group images via automatic learning of feature extraction, using convolution techniques, evolutionary computing, and clustering. The images employed in this approach contain faces and people. The second approach employs convolutional neural networks, which can operate on raw images, learning both the feature extraction and classification processes. Here, images are classified according to gender and clothing, divided into the upper and lower parts of the human body. The first approach, when tested with different image datasets, obtained an accuracy of approximately 80% for faces versus non-faces and 70% for people versus non-people. The second, tested using images and videos, obtained an accuracy of about 70% for gender, 80% for upper-body clothing, and 90% for lower-body clothing.
The results of these case studies show that the proposed methods are promising, allowing automatic annotation of images with high-level information. This opens possibilities for the development of applications in diverse areas such as content-based image and video search and automatic video surveillance, reducing human effort in the tasks of manual annotation and monitoring.