861 results for Video-based interface
Abstract:
In the last few years, the number of systems and devices that use voice-based interaction has grown significantly. For continued use of these systems, the interface must be reliable and pleasant in order to provide an optimal user experience. However, there are currently very few studies that try to evaluate how good a voice is when the application is a speech-based interface. In this paper we present a new automatic voice pleasantness classification system based on prosodic and acoustic patterns of voice preference. Our study is based on a multi-language database composed of female voices. In the objective performance evaluation, the system achieved a 7.3% error rate.
Abstract:
This project examines the effects of age, experience, and video-based feedback on the rate and type of safety-relevant events captured on video event recorders in the vehicles of three groups of newly licensed young drivers: (1) 14.5- to 15.5-year-old drivers who hold a minor school license (see Appendix A for the provisions of the Iowa Code governing minor school licenses); (2) 16-year-old drivers with an intermediate license who are driving unsupervised for the first time; (3) 16-year-old drivers with an intermediate license who previously drove unsupervised for at least four months with a school license. METHODS: The young drivers’ vehicles were equipped with an event-triggered video recording device for 24 weeks. Half of the participants received feedback regarding their driving, and the other half received no feedback at all and served as a control group. The number of safety-relevant events per 1,000 miles (i.e., “event rate”) was analyzed for the 90 participants who completed the study. RESULTS: On average, the young drivers who received the video-based intervention had significantly lower event rates than those in the control group. This finding was true for all three groups. An effect of experience was seen for drivers in the control group; the 16-year-olds with driving experience had significantly lower event rates than the 16-year-olds without experience. When the intervention concluded, an increase in event rate was seen for the school license holders, but not for either group of 16-year-old drivers. There is strong evidence that giving young drivers video-based feedback, regardless of their age or level of driving experience, is effective in reducing the rate of safety-relevant events relative to a control group that does not receive feedback. Specific comparisons with regard to age and experience indicated that the age of the driver did not have an effect on the rate of safety-relevant events, while experience did.
Young drivers with six months or more of additional experience behind the wheel had nearly half as many safety-relevant events as those without that experience.
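The event rate used in this abstract is a simple normalization: events divided by distance driven, scaled to 1,000 miles. A minimal sketch follows; the function name and the example figures are hypothetical and are not taken from the study.

```python
def event_rate_per_1000_miles(events: int, miles: float) -> float:
    """Return the number of safety-relevant events per 1,000 miles driven."""
    if miles <= 0:
        raise ValueError("miles must be positive")
    return 1000.0 * events / miles


# Hypothetical numbers for illustration only; they are not from the study.
feedback_rate = event_rate_per_1000_miles(events=6, miles=1500.0)
control_rate = event_rate_per_1000_miles(events=12, miles=1500.0)
print(feedback_rate, control_rate)
```

Normalizing by mileage lets drivers who cover very different distances be compared on the same scale.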
Abstract:
This paper presents methods for moving object detection in airborne video surveillance. Motion segmentation in this scenario is usually difficult because of the small size of the objects, the motion of the camera, and inconsistency in the detected object shape. Here we present a motion segmentation system for moving-camera video, based on background subtraction. Adaptive background building is used so that the background is created from the most recent frame. The proposed system offers a CPU-efficient alternative to conventional background subtraction systems based on batch processing. We further refine the segmented motion by mean-shift-based mode association.
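As a rough illustration of adaptive background subtraction in general, the sketch below keeps a running-average background that is updated from each new frame and flags pixels that differ from it by more than a threshold. It is not the authors' system: it ignores camera-motion compensation and the mean-shift refinement, and the names `alpha` and `threshold` are assumptions introduced here.

```python
from typing import List

Frame = List[List[float]]  # grayscale image as rows of pixel intensities


def update_background(background: Frame, frame: Frame, alpha: float) -> Frame:
    """Blend the newest frame into the background (running average)."""
    return [
        [(1.0 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background: Frame, frame: Frame, threshold: float) -> List[List[bool]]:
    """Mark pixels whose intensity differs from the background by more than threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def segment(frames: List[Frame], alpha: float = 0.5, threshold: float = 30.0):
    """Process frames one at a time (no batch step) and return one mask per frame after the first."""
    background = [row[:] for row in frames[0]]
    masks = []
    for frame in frames[1:]:
        masks.append(foreground_mask(background, frame, threshold))
        background = update_background(background, frame, alpha)
    return masks


# Tiny synthetic example: one bright pixel appears in the second frame.
frames = [
    [[10.0, 10.0], [10.0, 10.0]],
    [[10.0, 200.0], [10.0, 10.0]],
]
masks = segment(frames)
print(masks[0])
```

Because each frame is processed once and then discarded, the cost per frame stays constant, which is the kind of saving over batch processing that the abstract refers to.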
Abstract:
This paper describes the development and validation of a novel web-based interface for gathering feedback from building occupants about their environmental discomfort, including signs of Sick Building Syndrome (SBS). Gathering such feedback may enable better targeting of environmental discomfort down to the individual, as well as the early detection and subsequent resolution by building services of more complex issues such as SBS. The occupant's discomfort is interpreted and converted to air-conditioning system set points using Fuzzy Logic. Experimental results from a multi-zone air-conditioning test rig are included in this paper.
Abstract:
This work presents a new method for activity extraction and reporting from video based on the aggregation of fuzzy relations. Trajectory clustering is first employed, mainly to discover the points of entry and exit of mobiles appearing in the scene. In a second step, proximity relations between the resulting clusters of detected mobiles and contextual elements from the scene are modeled using fuzzy relations. These can then be aggregated using typical soft-computing algebra. A clustering algorithm based on computing the transitive closure of the fuzzy relations builds the structure of the scene and characterises the different ongoing activities in the scene. Discovered activity zones can be reported as activity maps with different granularities thanks to the analysis of the transitive closure matrix. Taking advantage of the soft relation properties, activity zones and related activities can be labeled in a more human-like language. We present results obtained on real videos corresponding to apron monitoring at Toulouse airport in France.
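The idea of clustering through the transitive closure of a fuzzy relation can be sketched as follows. This is a generic max-min closure followed by an alpha-cut, assuming a reflexive and symmetric relation; the example matrix, the function names, and the `alpha` levels are illustrative assumptions and not the authors' data or exact algorithm.

```python
from typing import List

Relation = List[List[float]]


def max_min_compose(a: Relation, b: Relation) -> Relation:
    """Max-min composition of two square fuzzy relations."""
    n = len(a)
    return [
        [max(min(a[i][k], b[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transitive_closure(relation: Relation) -> Relation:
    """Iterate R := max(R, R o R) until the relation stops changing."""
    closure = [row[:] for row in relation]
    while True:
        composed = max_min_compose(closure, closure)
        updated = [
            [max(x, y) for x, y in zip(c_row, p_row)]
            for c_row, p_row in zip(closure, composed)
        ]
        if updated == closure:
            return closure
        closure = updated


def alpha_cut_clusters(closure: Relation, alpha: float) -> List[List[int]]:
    """Group items whose closure membership is at least alpha."""
    clusters, assigned = [], set()
    for i in range(len(closure)):
        if i in assigned:
            continue
        group = [j for j in range(len(closure)) if closure[i][j] >= alpha]
        assigned.update(group)
        clusters.append(group)
    return clusters


# Hypothetical proximity relation between four trajectory clusters.
proximity = [
    [1.0, 0.8, 0.0, 0.1],
    [0.8, 1.0, 0.4, 0.0],
    [0.0, 0.4, 1.0, 0.9],
    [0.1, 0.0, 0.9, 1.0],
]
closure = transitive_closure(proximity)
fine = alpha_cut_clusters(closure, 0.5)
coarse = alpha_cut_clusters(closure, 0.4)
print(fine, coarse)
```

Lowering `alpha` merges zones and raising it splits them, which is one way to read the abstract's remark about activity maps at different granularities.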
Abstract:
The growing use of telecommunications services, especially wireless ones, has required the adoption of new network standards that offer high transmission rates and reach a larger number of users. In this respect, the IEEE 802.16 standard, on which WiMAX is based, emerges as a potential technology for providing broadband in the next generation of wireless networks, mainly because it natively offers Quality of Service (QoS) for voice, data, and video flows. Video-based applications have grown considerably in recent years. In 2011, this type of content is forecast to exceed 50% of all traffic originating from mobile devices. Video applications have strong appeal for end users, who are the ones that should ultimately judge the level of quality received. New forms of performance evaluation that take user perception into account are therefore needed, complementing the traditional techniques that rely only on network aspects (QoS). This led to performance evaluation based on Quality of Experience (QoE), in which the end user's assessment of the application is the main parameter measured. The results of QoE investigations can be used as an extension of traditional QoS methods while also providing information about the delivery of multimedia services from the user's point of view. Examples of control mechanisms that could be included in QoE-aware networks are new routing approaches, base station selection processes, and traffic conditioning. The two evaluation methodologies are complementary and, when used in combination, can yield a more robust evaluation. However, the large amount of information makes this combination difficult.
In this context, the main objective of this dissertation is to create a methodology for predicting video quality in WiMAX networks through the combined use of simulations and Computational Intelligence (CI) techniques. From QoS and QoE parameters obtained through the simulations, the future behavior of the video is predicted using Artificial Neural Networks (ANNs). While simulations allow a range of options, such as extrapolating scenarios to mimic real-world situations, CI techniques speed up the analysis of the results so that predictions of future behavior, correlations, and other analyses can be made. In this work, ANNs were chosen because they are the most widely used technique for behavior prediction, as proposed in this dissertation.
Abstract:
Recently, stable markerless 6 DOF video-based hand-tracking devices became available. These devices simultaneously track the positions and orientations of both of the user's hands in different postures at a rate of at least 25 frames per second. Such hand-tracking allows the human hands to be used as natural input devices. However, the absence of physical buttons for performing click actions and state changes poses severe challenges in designing an efficient and easy-to-use 3D interface on top of such a device. In particular, a solution has to be found for coupling and decoupling a virtual object’s movements to the user’s hand (i.e. grabbing and releasing). In this paper, we introduce a novel technique for efficient two-handed grabbing and releasing of objects and for intuitively manipulating them in virtual space. This technique is integrated in a novel 3D interface for virtual manipulations. A user experiment shows the superior applicability of this new technique. Last but not least, we describe how this technique can be exploited in practice to improve interaction by integrating it with RTT DeltaGen, a professional CAD/CAS visualization and editing tool.
Abstract:
BACKGROUND The number of older adults in the global population is increasing. This demographic shift leads to an increasing prevalence of age-associated disorders, such as Alzheimer's disease and other types of dementia. With the progression of the disease, the risk for institutional care increases, which contrasts with the desire of most patients to stay in their home environment. Despite doctors' and caregivers' awareness of the patient's cognitive status, they are often uncertain about its consequences for activities of daily living (ADL). To provide effective care, they need to know how patients cope with ADL and, in particular, to estimate the risks associated with cognitive decline. The occurrence, performance, and duration of different ADL are important indicators of functional ability. The patient's ability to cope with these activities is traditionally assessed with questionnaires, which has disadvantages (e.g., lack of reliability and sensitivity). Several groups have proposed sensor-based systems to recognize and quantify these activities in the patient's home. Combined with Web technology, these systems can inform caregivers about their patients in real time (e.g., via smartphone). OBJECTIVE We hypothesize that a non-intrusive system, which does not use body-mounted sensors, video-based imaging, or microphone recordings, would be better suited for use in dementia patients. Since it does not require the patient's attention and compliance, such a system might be well accepted by patients. We present a passive, Web-based, non-intrusive, assistive technology system that recognizes and classifies ADL. METHODS The components of this novel assistive technology system were wireless sensors distributed in every room of the participant's home and a central computer unit (CCU). The environmental data were acquired for 20 days (per participant) and then stored and processed on the CCU. In consultation with medical experts, eight ADL were classified.
RESULTS In this study, 10 healthy participants (6 women, 4 men; mean age 48.8 years; SD 20.0 years; age range 28-79 years) were included. For explorative purposes, one female Alzheimer patient (Montreal Cognitive Assessment score=23, Timed Up and Go=19.8 seconds, Trail Making Test A=84.3 seconds, Trail Making Test B=146 seconds) was measured in parallel with the healthy subjects. In total, 1317 ADL were performed by the participants, 1211 ADL were classified correctly, and 106 ADL were missed. This led to an overall sensitivity of 91.27% and a specificity of 92.52%. Each subject performed an average of 134.8 ADL (SD 75). CONCLUSIONS The non-intrusive wireless sensor system can acquire environmental data essential for the classification of activities of daily living. By analyzing the retrieved data, it is possible to distinguish and assign data patterns to subjects' specific activities and to identify eight different activities of daily living. The Web-based technology allows the system to improve care and provides valuable information about the patient in real time.
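For readers unfamiliar with the metric, sensitivity is the fraction of performed activities that were detected correctly. The sketch below applies the standard definition to the pooled counts quoted in the abstract (1211 correct, 106 missed). Pooling gives roughly 91.95%, slightly different from the reported 91.27%; the reported value may have been averaged per activity or per participant, which the abstract does not state, so treat the pooled figure as an illustration only.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Sensitivity (recall): correctly detected events over all events that occurred."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no events to evaluate")
    return true_positives / total


# Counts quoted in the abstract; pooling them is an assumption made here.
pooled = sensitivity(true_positives=1211, false_negatives=106)
print(f"{pooled:.4f}")
```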
Abstract:
One of the main challenges for intelligent vehicles is the capability of detecting other vehicles in their environment, which constitute the main source of accidents. Specifically, many methods have been proposed in the literature for video-based vehicle detection. Most of them perform supervised classification using some appearance-related feature; in particular, symmetry has been extensively utilized. However, an in-depth analysis of the classification power of this feature is missing. As a first contribution of this paper, a thorough study of the classification performance of symmetry is presented within a Bayesian decision framework. This study reveals that the performance of symmetry-based classification is very limited. Therefore, as a second contribution, a new gradient-based descriptor is proposed for vehicle detection. This descriptor exploits the known rectangular structure of vehicle rears within a Histogram of Oriented Gradients (HOG)-based framework. Experiments show that the proposed descriptor largely outperforms symmetry as a feature for vehicle verification, achieving classification rates over 90%.
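To give a feel for what a gradient-based descriptor measures, the sketch below computes a single magnitude-weighted histogram of unsigned gradient orientations over an image patch. It is a simplified building block of HOG-style features, not the descriptor proposed in the paper; the bin count and the function name are assumptions made for illustration.

```python
import math
from typing import List


def orientation_histogram(image: List[List[float]], bins: int = 9) -> List[float]:
    """Histogram of unsigned gradient orientations (0-180 degrees), weighted by magnitude."""
    hist = [0.0] * bins
    bin_width = 180.0 / bins
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            # Central differences approximate the horizontal and vertical gradients.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            index = min(int(angle / bin_width), bins - 1)
            hist[index] += magnitude
    return hist


# A patch with a vertical edge: intensity changes only along x.
vertical_edge = [[0.0, 0.0, 255.0, 255.0] for _ in range(4)]
# A patch with a horizontal edge: intensity changes only along y.
horizontal_edge = [[0.0] * 4, [0.0] * 4, [255.0] * 4, [255.0] * 4]

hist_v = orientation_histogram(vertical_edge)
hist_h = orientation_histogram(horizontal_edge)
print(hist_v.index(max(hist_v)), hist_h.index(max(hist_h)))
```

A vehicle rear is dominated by horizontal and vertical edges, so a patch containing one would concentrate its energy in the bins around 0 and 90 degrees, which is the kind of structure a rectangular-shape descriptor can exploit.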
Abstract:
This paper presents a novel background modeling system that uses a spatial grid of Support Vector Machine classifiers for segmenting moving objects, which is a key step in many video-based consumer applications. The system is able to adapt to a large range of dynamic background situations since no parametric model or statistical distribution is assumed. This is achieved by using a different classifier per image region that learns the specific appearance of that scene region and its variations (illumination changes, dynamic backgrounds, etc.). The proposed system has been tested with a recent public database, outperforming other state-of-the-art algorithms.
Abstract:
Cellular networks have been widely used to support many new audio- and video-based multimedia applications. The demand for higher data rates and diverse services has driven research on multihop cellular networks (MCNs). With its ad hoc network features, an MCN can offer many additional advantages, such as increased network throughput, scalability, and coverage. However, providing ad hoc capability to MCNs is challenging, as it may require proper wireless interfaces. In this article, the architecture of the IEEE 802.16 network interface for providing ad hoc capability to MCNs is investigated, with a focus on IEEE 802.16 mesh networking and scheduling. Several distributed routing algorithms based on the network entry mechanism are studied and compared with a centralized routing algorithm. The simulation results show that 802.16 mesh networks have limitations in providing sufficient bandwidth for the traffic from the cellular base stations when the cellular network is relatively large. © 2007 IEEE.
Abstract:
Universidade Estadual de Campinas. Faculdade de Educação Física
Abstract:
In the last few years, the number of systems and devices that use voice-based interaction has grown significantly. For continued use of these systems, the interface must be reliable and pleasant in order to provide an optimal user experience. However, there are currently very few studies that try to evaluate how pleasant a voice is from a perceptual point of view when the final application is a speech-based interface. In this paper we present an objective definition of voice pleasantness based on the composition of a representative feature subset, together with a new automatic system for voice pleasantness classification and intensity estimation. Our study is based on a database composed of European Portuguese female voices, but the methodology can be extended to male voices or to other languages. In the objective performance evaluation, the system achieved a 9.1% error rate for voice pleasantness classification and a 15.7% error rate for voice pleasantness intensity estimation.
Abstract:
Master's degree in Informatics Engineering - Specialization in Graphics and Multimedia Systems