151 resultados para Wheelock, Eleazar, 1711-1779.
Resumo:
For the first time in this paper we present results showing the effect of speaker head pose angle on automatic lip-reading performance over a wide range of closely spaced angles. We analyse the effect head pose has upon the features themselves and show that by selecting coefficients with minimum variance w.r.t. pose angle, recognition performance can be improved when train-test pose angles differ. Experiments are conducted using the initial phase of a unique multi view Audio-Visual database designed specifically for research and development of pose-invariant lip-reading systems. We firstly show that it is the higher order horizontal spatial frequency components that become most detrimental as the pose deviates. Secondly we assess the performance of different feature selection masks across a range of pose angles including a new mask based on Minimum Cross-Pose Variance coefficients. We report a relative improvement of 50% in Word Error Rate when using our selection mask over a common energy based selection during profile view lip-reading.
Resumo:
This paper presents a novel method of audio-visual feature-level fusion for person identification where both the speech and facial modalities may be corrupted, and there is a lack of prior knowledge about the corruption. Furthermore, we assume there are limited amount of training data for each modality (e.g., a short training speech segment and a single training facial image for each person). A new multimodal feature representation and a modified cosine similarity are introduced to combine and compare bimodal features with limited training data, as well as vastly differing data rates and feature sizes. Optimal feature selection and multicondition training are used to reduce the mismatch between training and testing, thereby making the system robust to unknown bimodal corruption. Experiments have been carried out on a bimodal dataset created from the SPIDRE speaker recognition database and AR face recognition database with variable noise corruption of speech and occlusion in the face images. The system's speaker identification performance on the SPIDRE database, and facial identification performance on the AR database, is comparable with the literature. Combining both modalities using the new method of multimodal fusion leads to significantly improved accuracy over the unimodal systems, even when both modalities have been corrupted. The new method also shows improved identification accuracy compared with the bimodal systems based on multicondition model training or missing-feature decoding alone.
Resumo:
Conventional approaches of digital modulation schemes make use of amplitude, frequency and/or phase as modulation characteristic to transmit data. In this paper, we exploit circular polarization (CP) of the propagating electromagnetic carrier as modulation attribute which is a novel concept in digital communications. The requirement of antenna alignment to maximize received power is eliminated for CP signals and these are not affected by linearly polarized jamming signals. The work presents the concept of Circular Polarization Modulation for 2, 4 and 8 states of carrier and refers them as binary circular polarization modulation (BCPM), quaternary circular polarization modulation (QCPM) and 8-state circular polarization modulation (8CPM) respectively. Issues of modulation, demodulation, 3D symbol constellations and 3D propagating waveforms for the proposed modulation schemes are presented and analyzed in the presence of channel effects, and they are shown to have the same bit error performance in the presence of AWGN compared with conventional schemes while provide 3dB gain in the flat Rayleigh fading channel.
Resumo:
This paper examines the ability of the doubly fed induction generator (DFIG) to deliver multiple reactive power objectives during variable wind conditions. The reactive power requirement is decomposed based on various control objectives (e.g. power factor control, voltage control, loss minimisation, and flicker mitigation) defined around different time frames (i.e. seconds, minutes, and hourly), and the control reference is generated by aggregating the individual reactive power requirement for each control strategy. A novel coordinated controller is implemented for the rotor-side converter and the grid-side converter considering their capability curves and illustrating that it can effectively utilise the aggregated DFIG reactive power capability for system performance enhancement. The performance of the multi-objective strategy is examined for a range of wind and network conditions, and it is shown that for the majority of the scenarios, more than 92% of the main control objective can be achieved while introducing the integrated flicker control scheme with the main reactive power control scheme. Therefore, optimal control coordination across the different control strategies can maximise the availability of ancillary services from DFIG-based wind farms without additional dynamic reactive power devices being installed in power networks.
Resumo:
Smart Spaces, Ambient Intelligence, and Ambient Assisted Living are environmental paradigms that strongly depend on their capability to recognize human actions. While most solutions rest on sensor value interpretations and video analysis applications, few have realized the importance of incorporating common-sense capabilities to support the recognition process. Unfortunately, human action recognition cannot be successfully accomplished by only analyzing body postures. On the contrary, this task should be supported by profound knowledge of human agency nature and its tight connection to the reasons and motivations that explain it. The combination of this knowledge and the knowledge about how the world works is essential for recognizing and understanding human actions without committing common-senseless mistakes. This work demonstrates the impact that episodic reasoning has in improving the accuracy of a computer vision system for human action recognition. This work also presents formalization, implementation, and evaluation details of the knowledge model that supports the episodic reasoning.
Resumo:
We present a study of the behavior of two different figures of merit for quantum correlations, entanglement of formation and quantum discord, under quantum channels showing how the former can, counterintuitively, be more resilient to such environments spoiling effects. By exploiting strict conservation relations between the two measures and imposing necessary constraints on the initial conditions we are able to explicitly show this predominance is related to build-up of the system-environment correlations.
Resumo:
In this paper, we examine a novel approach to network security against passive eavesdroppers in a ray-tracing model and implement it on a hardware platform. By configuring antenna array beam patterns to transmit the data to specific regions, it is possible to create defined regions of coverage for targeted users. By adapting the antenna configuration according to the intended user’s channel state information, this allows the vulnerability of the physical regions to eavesdropping to be reduced. We present the application of our concept to 802.11n networks where an antenna array is employed at the access point. A range of antenna array configurations are examined by simulation and then realized using the Wireless Open-Access Research Platform(WARP)
Resumo:
In this paper, we first provide a theoretical validation for a low-complexity transmit diversity algorithm which employs only one RF chain and a low-complexity switch for transmission. Our theoretical analysis is compared to the simulation results and proved to be accurate. We then apply the transmit diversity scheme to multiple-input and multiple-output (MIMO) systems with bit-interleaved coded modulation (BICM). © 2012 IEEE.
Resumo:
With security and surveillance, there is an increasing need to be able to process image data efficiently and effectively either at source or in a large data networks. Whilst Field Programmable Gate Arrays have been seen as a key technology for enabling this, they typically use high level and/or hardware description language synthesis approaches; this provides a major disadvantage in terms of the time needed to design or program them and to verify correct operation; it considerably reduces the programmability capability of any technique based on this technology. The work here proposes a different approach of using optimised soft-core processors which can be programmed in software. In particular, the paper proposes a design tool chain for programming such processors that uses the CAL Actor Language as a starting point for describing an image processing algorithm and targets its implementation to these custom designed, soft-core processors on FPGA. The main purpose is to exploit the task and data parallelism in order to achieve the same parallelism as a previous HDL implementation but avoiding the design time, verification and debugging steps associated with such approaches.