873 results for Audio-visual Speech Recognition, Visual Feature Extraction, Free-parts, Monolithic, ROI


Relevance:

100.00% 100.00%

Publisher:

Abstract:

It is already a truism that emerging communication technologies have changed the landscape of communication in every aspect of our lives, but this is especially true of how we communicate at work. Advances in communication technologies have enabled a wide range of digital communication modes to be used for both internal and external business communication, including audio and visual communication and voice-over protocols, as well as text-based channels such as email, forums, instant messaging and social media. In spite of the wide range of available audio-visual channels, and despite the ever-increasing popularity of email, real-time text-based communication technologies (instant messaging, or IM) are also on the rise (see Mak, 2014; Pazos et al., 2013; Radicati & Levenstein, 2013; and Markman in this volume). The prominence of IM is evident in its growing use not only as a tool for internal business communication, but also as a front-stage channel, particularly for customer service encounters or professional-client conversations (Makarem et al., 2009; Pearce et al., 2013; L. Zhang et al., 2011).

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Online writing plays a complex and increasingly prominent role in the life of organizations. From newsletters to press releases, social media marketing and advertising, to virtual presentations and interactions via e-mail and instant messaging, digital writing intertwines and affects the day-to-day running of the company - yet we rarely pay enough attention to it. Typing on the screen can become particularly problematic because digital text-based communication increases the opportunities for misunderstanding: it lacks the direct audio-visual contact and the norms and conventions that would normally help people to understand each other. Providing a clear, convincing and approachable discussion, this book addresses arenas of online writing: virtual teamwork, instant messaging, emails, corporate communication channels, and social media. Instead of offering do and don’t lists, however, it teaches the reader to develop a practice that is observant, reflective, and grounded in the understanding of the basic principles of language and communication. Through real-life examples and case studies, it helps the reader to notice previously unnoticed small details, question previously unchallenged assumptions and practices, and become a competent digital communicator in a wide range of professional contexts.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The primary goal of this dissertation is to develop point-based rigid and non-rigid image registration methods that are more accurate than existing methods. We first present PoIRe, which provides the framework for point-based global rigid registrations. It allows a choice of different search strategies including (a) branch-and-bound, (b) probabilistic hill-climbing, and (c) a novel hybrid method that takes advantage of the best characteristics of the other two methods. We use a robust similarity measure that is insensitive to noise, which is often introduced during feature extraction. We show the robustness of PoIRe by using it to register images obtained with an electronic portal imaging device (EPID), which have large amounts of scatter and low contrast. To evaluate PoIRe we used (a) simulated images and (b) images with fiducial markers; PoIRe was extensively tested with 2D EPID images and images generated from 3D Computed Tomography (CT) and Magnetic Resonance (MR) images. PoIRe was also evaluated using benchmark data sets from the blind Retrospective Image Registration Evaluation (RIRE) project. We show that PoIRe performs better than existing methods such as Iterative Closest Point (ICP) and methods based on mutual information. We also present a novel point-based local non-rigid shape registration algorithm. We extend the robust similarity measure used in PoIRe to non-rigid registrations, adapting it to a free-form deformation (FFD) model and making it robust to local minima, a drawback common to existing non-rigid point-based methods. For non-rigid registrations we show that it performs better than existing methods and that it is less sensitive to starting conditions. We test our non-rigid registration method using available benchmark data sets for shape registration. Finally, we explore the extraction of features invariant to changes in perspective and illumination, and how they can help improve the accuracy of multi-modal registration. For multimodal registration of EPID-DRR images we present a method based on a local descriptor defined by a vector of complex responses to a circular Gabor filter.
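To make the idea of a descriptor built from complex circular Gabor responses more concrete, here is a minimal sketch. It is not the dissertation's implementation: it assumes the common formulation of a circular Gabor filter (a Gaussian envelope multiplied by a complex sinusoid of the radial distance), and the function names, kernel radius, sigma, and frequencies are hypothetical choices made for illustration.

```python
import cmath
import math


def circular_gabor_kernel(radius, sigma, freq):
    """Complex kernel that depends only on the distance from the centre (assumed form)."""
    kernel = {}
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            r2 = dx * dx + dy * dy
            envelope = math.exp(-r2 / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
            kernel[(dx, dy)] = envelope * cmath.exp(2j * math.pi * freq * math.sqrt(r2))
    return kernel


def gabor_descriptor(image, x, y, radius=3, sigma=2.0, freqs=(0.1, 0.2, 0.3)):
    """Vector of complex responses at pixel (x, y), one per radial frequency."""
    responses = []
    for freq in freqs:
        kernel = circular_gabor_kernel(radius, sigma, freq)
        total = 0j
        for (dx, dy), weight in kernel.items():
            total += weight * image[y + dy][x + dx]
        responses.append(total)
    return responses


def rotate90(image):
    """Rotate a square image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


if __name__ == "__main__":
    # Toy 7x7 image; the descriptor is evaluated at the centre pixel.
    toy = [[(3 * r + 5 * c) % 7 for c in range(7)] for r in range(7)]
    print(gabor_descriptor(toy, 3, 3))
```

Because the kernel depends only on radial distance, the response at the centre pixel is unchanged when the patch is rotated by 90 degrees, which is one reason such filters are attractive for descriptors that should not depend on orientation.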

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This dissertation develops an image processing framework with unique feature extraction and similarity measurements for human face recognition in the thermal mid-wave infrared portion of the electromagnetic spectrum. The goal of this research is to design specialized algorithms that extract facial vasculature information, create a thermal facial signature, and identify the individual. The objective is to use such findings in support of a biometrics system for human identification with a high degree of accuracy and reliability. This last assertion rests on the minimal-to-no risk of alteration of the intrinsic physiological characteristics seen through thermal infrared imaging. The proposed thermal facial signature recognition is fully integrated and consolidates the main and critical steps of feature extraction, registration, matching through similarity measures, and validation through testing our algorithm on a database, referred to as C-X1, provided by the Computer Vision Research Laboratory at the University of Notre Dame. Feature extraction was accomplished by first registering the infrared images to a reference image using the Functional MRI of the Brain (FMRIB) Linear Image Registration Tool (FLIRT), modified to suit thermal infrared images. This was followed by segmentation of the facial region using an advanced localized contouring algorithm applied to anisotropically diffused thermal images. Thermal feature extraction from facial images was attained by performing morphological operations such as opening and top-hat segmentation to yield thermal signatures for each subject. Four thermal images taken over a period of six months were used to generate thermal signatures and a thermal template for each subject; the thermal template contains only the most prevalent and consistent features. Finally, a similarity measure technique was used to match signatures to templates, and Principal Component Analysis (PCA) was used to validate the results of the matching process. Thirteen subjects were used for testing the developed technique on an in-house thermal imaging system. Matching with a Euclidean-based similarity measure showed 88% accuracy for skeletonized signatures and templates and 90% accuracy for anisotropically diffused signatures and templates. We also employed the Manhattan-based similarity measure and obtained an accuracy of 90.39% for skeletonized and diffused templates and signatures. Using diffused templates improved the similarity measure by 18.9% on average. The Euclidean- and Manhattan-based similarity measures were also applied to skeletonized signatures and templates of 25 subjects in the C-X1 database. The highly accurate results obtained in the matching process, along with the generalized design process, demonstrate that the thermal infrared system can be used with other thermal-imaging-based systems and related databases. A novel user-initialized registration of thermal facial images has been successfully implemented. Furthermore, the novel approach to developing a thermal signature template from four images taken at various times ensured that unforeseen changes in the vasculature did not affect the biometric matching process, as it relied on consistent thermal features.
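The matching step described above, comparing a signature against per-subject templates under Euclidean and Manhattan distances, can be sketched as follows. This is only an illustrative toy under stated assumptions: signatures and templates are represented as short flat lists of feature values, and the subject names, data, and the `best_match` helper are hypothetical rather than taken from the dissertation.

```python
import math


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_match(signature, templates, distance):
    """Return the subject whose template is closest to the signature."""
    return min(templates, key=lambda subject: distance(signature, templates[subject]))


if __name__ == "__main__":
    # Hypothetical binary feature maps, flattened to lists.
    templates = {
        "subject_a": [1, 0, 1, 1, 0],
        "subject_b": [0, 1, 0, 0, 1],
    }
    signature = [1, 0, 1, 0, 0]
    for name, dist in (("euclidean", euclidean), ("manhattan", manhattan)):
        print(name, best_match(signature, templates, dist))
```

Smaller distance means higher similarity here; a real system would also need a rejection threshold so that an unknown face is not forced onto the nearest template.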

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Context: Clinicians use exercises in rehabilitation to enhance sensorimotor function; however, evidence supporting their use is scarce. Objective: To evaluate acute effects of handheld vibration on joint position sense (JPS). Design: A repeated-measures, randomized, counter-balanced 3-condition design. Setting: Sports Medicine and Science Research Laboratory. Patients or Other Participants: 31 healthy college-aged volunteers (16 males, 15 females; age = 23 ± 3 y, mass = 76 ± 14 kg, height = 173 ± 8 cm). Interventions: We measured elbow JPS and monitored training using the Flock-of-Birds system (Ascension Technology, Burlington, VT) and MotionMonitor software (Innsport, Chicago, IL), accurate to 0.5°. For each condition (15, 5, and 0 Hz vibration), subjects completed three 15-s bouts holding a 2.55 kg Mini-VibraFlex dumbbell (Orthometric, New York, NY), and used software-generated audio/visual biofeedback to locate the target. Participants performed separate pre- and post-test JPS measures for each condition. For JPS testing, subjects held a non-vibrating dumbbell, identified the target (90° flexion) using biofeedback, and relaxed for 3-5 s. We then removed feedback, and subjects recreated the target and pressed a trigger. We used SPSS 14.0 (SPSS Inc., Chicago, IL) to perform separate ANOVAs (p < 0.05) for each protocol and calculated effect sizes using standard-mean differences. Main Outcome Measures: Dependent variables were absolute and variable error between target and reproduced angles, pre-post vibration training. Results: 0 Hz (F(1,61) = 1.310, p = 0.3) and 5 Hz (F(1,61) = 2.625, p = 0.1) vibration did not affect accuracy. 15 Hz vibration enhanced accuracy (6.5 ± 0.6 to 5.0 ± 0.5°) (F(1,61) = 8.681, p = 0.005, ES = 0.3). 0 Hz did not affect variability (F(1,61) = 0.007, p = 0.9). 5 Hz vibration decreased variability (3.0 ± 1.8 to 2.3 ± 1.3°) (F(1,61) = 7.250, p = 0.009), as did 15 Hz (2.8 ± 1.8 to 1.8 ± 1.2°) (F(1,61) = 24.027, p < 0.001). Conclusions: Our results support using handheld vibration to improve sensorimotor function. Future research should include injured subjects, functional multi-joint/multi-planar measures, and long-term effects of similar training.
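The two outcome measures named above, absolute error and variable error, can be illustrated with a short sketch. The definitions used here are the conventional ones and are an assumption, since the abstract does not spell them out: absolute error as the mean unsigned deviation from the target angle, and variable error as the standard deviation of the signed errors. The trial values are invented for illustration and are not data from the study.

```python
import statistics


def absolute_error(target, reproduced):
    """Mean unsigned deviation from the target angle, in degrees (accuracy)."""
    return statistics.fmean([abs(angle - target) for angle in reproduced])


def variable_error(target, reproduced):
    """Population standard deviation of the signed errors, in degrees (consistency).

    Using the population form (divide by n) is an assumption; some authors
    divide by n - 1 instead.
    """
    return statistics.pstdev([angle - target for angle in reproduced])


if __name__ == "__main__":
    target_angle = 90.0                   # 90 degrees of elbow flexion
    trials = [95.0, 88.0, 93.0]           # hypothetical reproduced angles
    print(absolute_error(target_angle, trials))
    print(variable_error(target_angle, trials))
```

The two measures answer different questions: a participant can be consistently wrong (high absolute error, low variable error) or correct on average but erratic (low constant bias, high variable error).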

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Digital society embraces us in every aspect of daily life, and a significant part of the population lives connected across multiple platforms. With the instantaneity of communication flows, we live a routine in which much is accessible with a single click or tap. Television, the predominant medium for several decades, takes on in its digital transition a function beyond the TV we knew: an interactive display that connects to and absorbs content from various sources. The established worldwide models of audiovisual distribution, especially broadcast, are suffering the consequences of changes in audience behavior brought about by new opportunities to access content, which is now interactive and on demand. In this context, Smart TV (connected TV) models over broadband offer differentiated options and claim an ever larger place in the connection with all other displays. Against this backdrop, the present study seeks to describe and analyze the new offerings of content, applications, possibilities and trends in the hybridization of sources for the future of TV.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Pinterest, the online community in which I situate my object of study, allows its members to create audio-visual collections from images (still or moving), videos, audio, graphics and even texts found across the internet. The "pin" under study (the basic audio-visual unit from which these collections are built, and which is often mistaken for a digital photograph) is a small hypermedia network made up of content in formats of diverse kinds, with great potential for art education and the development of digital competences. The pin and, by extension, Pinterest can become significant tools for artists and educators, fostering the use of ICT in teaching and learning processes.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Based on close examinations of instant message (IM) interactions, this chapter argues that an interactional sociolinguistic approach to computer-mediated language use could provide explanations for phenomena that previously could not be accounted for in computer-mediated discourse analysis (CMDA). Drawing on the theoretical framework of relational work (Locher, 2006), the analysis focuses on non-task-oriented talk and its function in forming and establishing communication norms in the team, as well as micro-level phenomena such as hesitation, backchannel signals and emoticons. The conclusions of this preliminary research suggest that the linguistic strategies that substitute for audio-visual signals are deployed strategically in discursive functions and play an important role in relational work.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This dissertation focuses on two vital challenges in relation to whale acoustic signals: detection and classification.

In detection, we evaluated the influence of the uncertain ocean environment on the spectrogram-based detector, and derived the likelihood ratio of the proposed Short Time Fourier Transform detector. Experimental results showed that the proposed detector outperforms detectors based on the spectrogram. The proposed detector is more sensitive to environmental changes because it includes phase information.

In classification, our focus is on finding a robust and sparse representation of whale vocalizations. Because whale vocalizations can be modeled as polynomial phase signals, we can represent the whale calls by their polynomial phase coefficients. In this dissertation, we used the Weyl transform to capture chirp rate information, and used a two dimensional feature set to represent whale vocalizations globally. Experimental results showed that our Weyl feature set outperforms chirplet coefficients and MFCC (Mel Frequency Cepstral Coefficients) when applied to our collected data.

Since whale vocalizations can be represented by polynomial phase coefficients, it is plausible that the signals lie on a manifold parameterized by these coefficients. We also studied the intrinsic structure of high dimensional whale data by exploiting its geometry. Experimental results showed that nonlinear mappings such as Laplacian Eigenmap and ISOMAP outperform linear mappings such as PCA and MDS, suggesting that the whale acoustic data is nonlinear.

We also explored deep learning algorithms on whale acoustic data. We built each layer as convolutions with either a PCA filter bank (PCANet) or a DCT filter bank (DCTNet). With the DCT filter bank, each layer has a different time-frequency scale representation, from which one can extract different physical information. Experimental results showed that our PCANet and DCTNet achieve a high classification rate on the whale vocalization data set. The word error rate of the DCTNet feature is similar to that of the MFSC in speech recognition tasks, suggesting that the convolutional network is able to reveal the acoustic content of speech signals.
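As a rough illustration of what one layer of "convolutions with a DCT filter bank" could look like, the sketch below builds filters from the DCT-II cosine basis and slides them over a 1-D signal. It is a simplified assumption, not the DCTNet architecture itself: the filter length, the unnormalized DCT-II basis, the use of a sliding dot product (cross-correlation, as is common in convolutional networks), and the function names are all hypothetical choices.

```python
import math


def dct_filter_bank(size):
    """Return `size` filters of length `size`, the unnormalized DCT-II cosine basis."""
    return [
        [math.cos(math.pi / size * (n + 0.5) * k) for n in range(size)]
        for k in range(size)
    ]


def apply_filter_bank(signal, bank):
    """Slide every filter over the signal ('valid' positions only).

    Returns one output channel per filter; filter k responds to content
    oscillating at roughly k half-cycles per window.
    """
    size = len(bank[0])
    steps = len(signal) - size + 1
    return [
        [sum(w * signal[t + n] for n, w in enumerate(filt)) for t in range(steps)]
        for filt in bank
    ]


if __name__ == "__main__":
    bank = dct_filter_bank(4)
    # A toy linear chirp: frequency increases with time.
    chirp = [math.sin(0.05 * t * t) for t in range(64)]
    channels = apply_filter_bank(chirp, bank)
    print(len(channels), len(channels[0]))
```

Stacking such layers with different window lengths would give each layer a different time-frequency resolution, which is the intuition behind the scale remark in the abstract.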

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Sexual risk behavior among young adults is a serious public health concern; 50% will contract a sexually transmitted infection (STI) before the age of 25. The current study collected self-report personality and sexual history data, as well as neuroimaging, experimental behavioral (e.g., real-time hypothetical sexual decision-making), and self-report sexual arousal data from 120 heterosexual young adults ages 18-26. In addition, longitudinal changes in self-reported sexual behavior were collected from a subset (n = 70) of the participants. The primary aims of the study were (1) to predict differences in self-reported sexual behavior and hypothetical sexual decision-making (in response to sexually explicit audio-visual cues) as a function of ventral striatum (VS) and amygdala activity, (2) to test whether the association between sexual behavior/decision-making and brain function is moderated by gender, self-reported sexual arousal, and/or trait-level personality factors (i.e., self-control, impulsivity, and sensation seeking), and (3) to examine how the main effects of neural function and interaction effects predict sexual risk behavior over time. Our hypotheses were mostly supported across the sexual behavior and decision-making outcome variables, such that neural risk phenotypes (heightened reward-related ventral striatum activity coupled with decreased threat-related amygdala activity) were associated with a greater number of lifetime sexual partners, both at baseline and over time (longitudinal analyses). Impulsivity moderated the relationship between neural function and self-reported number of sexual partners at baseline and follow-up, as well as experimental condom use decision-making. Sexual arousal and sensation seeking moderated the relationship between neural function and baseline and follow-up self-reports of number of sexual partners. Finally, unique gender differences were observed in the relationship between threat- and reward-related neural reactivity and self-reported sexual risk behavior. The results of this study provide initial evidence for the potential role of neurobiological approaches in understanding sexual decision-making and risk behavior. With continued research, establishing biomarkers for sexual risk behavior could help inform the development of novel and more effective individually tailored sexual health prevention and intervention efforts.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

With the introduction of new input devices, such as multi-touch surface displays, the Nintendo WiiMote, the Microsoft Kinect, and the Leap Motion sensor, among others, the field of Human-Computer Interaction (HCI) finds itself at an important crossroads that requires solving new challenges. Given the amount of three-dimensional (3D) data available today, 3D navigation plays an important role in 3D User Interfaces (3DUI). This dissertation deals with multi-touch, 3D navigation, and how users can explore 3D virtual worlds using a multi-touch, non-stereo, desktop display. The contributions of this dissertation include a feature-extraction algorithm for multi-touch displays (FETOUCH), a multi-touch and gyroscope interaction technique (GyroTouch), a theoretical model for multi-touch interaction using high-level Petri Nets (PeNTa), an algorithm to resolve ambiguities in the multi-touch gesture classification process (Yield), a proposed technique for navigational experiments (FaNS), a proposed gesture (Hold-and-Roll), and an experiment prototype for 3D navigation (3DNav). The verification experiment for 3DNav was conducted with 30 human subjects of both genders. The experiment used the 3DNav prototype to present a pseudo-universe, where each user was required to find five objects using the multi-touch display and five objects using a game controller (GamePad). For the multi-touch display, 3DNav used a commercial library called GestureWorks in conjunction with Yield to resolve the ambiguity posed by the multiplicity of gestures reported by the initial classification. The experiment compared both devices. The task completion time with multi-touch was slightly shorter, but the difference was not statistically significant. The experimental design also included an equation that determined the subjects' level of video game console expertise, which was used to divide users into two groups: casual users and experienced users. The study found that experienced gamers performed significantly faster with the GamePad than casual users did. When looking at the groups separately, casual gamers performed significantly better using the multi-touch display than using the GamePad. Additional results are found in this dissertation.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

For those who are not new to the world of Japanese animation, known mainly as anime, the debate of "dub vs. sub" is by no means out of the ordinary, but rather a very heated argument amongst fans. The study focuses on the differences in the US English version between the two approaches to translating audio-visual media, namely subtitling (official subtitles and fanmade subtitles) and dubbing, in a qualitative context. More precisely, it asks which of the two approaches retains the most information from the same audiovisual segment, in order to satisfy the needs of the anime audience. In order to draw substantial conclusions, the analysis is conducted on a corpus of one episode from the first season of the popular mid-nineties animated TV series Sailor Moon. The main objective of this research is to analyze the three versions and compare the findings to what anime fans expect each of them to provide, in terms of how culture-specific terms are handled, how accurate the translation is, localization, censorship, and omission. As for the fans' opinions, the study includes a survey of fans' personal preferences when choosing between the official subtitled version, the fanmade subtitles and the dubbed version.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Participation in a group exhibition themed around the 25th anniversary of the Elba Benitez Gallery in Madrid. My work comprised a series of performances in which I translated reviews from the magazine Art Forum from 1990. The performances took place in various locations in London, throughout the run of the exhibition, and were streamed live to an iPad in the gallery in Madrid. I made audio-visual recordings of the performances via the streaming media, which located me as the performer alongside the viewers in a single split image. These recordings were then archived in a folder shared between the gallery and me, which visitors to the exhibition could access when a performance was not taking place. The work extends my concerns with translation and performance, and with a consideration of how the mechanism of the gallery and the exhibition might be used to generate innovative viewing engagements facilitated by technology. The work also attempts to develop thinking and practice around the relationship between art works and their documentation; in this case the documentation, and even its potential for distribution, is generated as the work comes into being. The exhibition included works by Ignasi Aballí, Armando Andrade Tudela, Lothar Baumgarten, Carlos Bunga, Cabello/Carceller, Juan Cruz, Gintaras Didžiapetris, Fernanda Fragateiro, Hreinn Fridfinnsson, Carlos Garaicoa, Mario García Torres, David Goldblatt, Cristina Iglesias, Ana Mendieta, Vik Muniz, Ernesto Neto, Francisco Ruiz de Infante, Alexander Sokurov, Francesc Torres and Valentín Vallhonrat.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Estimating the relative orientation and position of a camera is one of the integral topics in the field of computer vision. The accuracy of a certain Finnish technology company's traffic sign inventory and localization process can be improved by utilizing this concept. The company's localization process uses video data produced by a vehicle-mounted camera. The accuracy of the estimated traffic sign locations depends on the relative orientation between the camera and the vehicle. This thesis proposes a computer-vision-based software solution that estimates a camera's orientation relative to the vehicle's direction of movement from video data. The task was solved using feature-based methods and open source software. On simulated data sets, the camera orientation estimates had an absolute error of 0.31 degrees on average. The software solution can be integrated into the traffic sign localization pipeline of the company in question.