488 results for File processing (Computer science)
Abstract:
Audio-visual speech recognition, or the combination of visual lip-reading with traditional acoustic speech recognition, has previously been shown to provide a considerable improvement over acoustic-only approaches in noisy environments, such as that present in an automotive cabin. The research presented in this paper extends the established audio-visual speech recognition literature to show that further improvements in speech recognition accuracy can be obtained when multiple frontal or near-frontal views of a speaker's face are available. A series of visual speech recognition experiments using a four-stream visual synchronous hidden Markov model (SHMM) is conducted on the four-camera AVICAR automotive audio-visual speech database. We study the relative contribution of the side and central orientated cameras in improving visual speech recognition accuracy. Finally, combination of the four visual streams with a single audio stream in a five-stream SHMM demonstrates a relative improvement of over 56% in word recognition accuracy when compared to the acoustic-only approach in the noisiest conditions of the AVICAR database.
Abstract:
Teleradiology allows medical images to be transmitted over electronic networks for clinical interpretation, and for improved healthcare access, delivery and standards. Although such remote transmission of images raises various new and complex legal and ethical issues, including image retention and fraud, privacy and malpractice liability, considerations of the security measures used in teleradiology remain unchanged. Addressing this problem warrants an investigation of the security measures, their relative functional limitations and the scope for considering them further. In this paper, starting with various security and privacy standards, the security requirements of medical images as well as expected threats in teleradiology are reviewed. This makes it possible to determine the limitations of the conventional measures used against the expected threats. Further, we thoroughly study the utilization of digital watermarking for teleradiology. Following the key attributes and roles of various watermarking parameters, justification for watermarking over conventional security measures is made in terms of their various objectives, properties, and requirements. We also outline the main objectives of medical image watermarking for teleradiology, and provide recommendations on suitable watermarking techniques and their characterization. Finally, concluding remarks and directions for future research are presented.
Abstract:
From a law enforcement standpoint, the ability to search for a person matching a semantic description (e.g. 1.8 m tall, red shirt, jeans) is highly desirable. While a significant research effort has focused on person re-detection (the task of identifying a previously observed individual in surveillance video), these techniques require descriptors to be built from existing image or video observations. As such, person re-detection techniques are not suited to situations where footage of the person of interest is not readily available, such as a witness reporting a recent crime. In this paper, we present a novel framework that is able to search for a person based on a semantic description. The proposed approach uses size and colour cues, and does not require a person detection routine to locate people in the scene, improving utility in crowded conditions. The proposed approach is demonstrated with a new database that will be made available to the research community, and we show that the proposed technique is able to correctly localise a person in a video based on a simple semantic description.
Abstract:
Person re-identification involves recognising individuals in different locations across a network of cameras and is a challenging task due to a large number of varying factors such as pose (both subject and camera) and ambient lighting conditions. Existing databases do not adequately capture these variations, making evaluations of proposed techniques difficult. In this paper, we present a new challenging multi-camera surveillance database designed for the task of person re-identification. This database consists of 150 unscripted sequences of subjects travelling in a building environment through up to eight camera views, appearing from various angles and in varying illumination conditions. A flexible XML-based evaluation protocol is provided to allow a highly configurable evaluation setup, enabling a variety of scenarios relating to pose and lighting conditions to be evaluated. A baseline person re-identification system consisting of colour, height and texture models is demonstrated on this database.
Abstract:
Thermal-infrared imagery is relatively robust to many of the failure conditions of visual and laser-based SLAM systems, such as fog, dust and smoke. The ability to use thermal-infrared video for localization is therefore highly appealing for many applications. However, operating in thermal-infrared is beyond the capacity of existing SLAM implementations. This paper presents the first known monocular SLAM system designed and tested for hand-held use in the thermal-infrared modality. The implementation includes a flexible feature detection layer able to achieve robust feature tracking in high-noise, low-texture thermal images. A novel approach for structure initialization is also presented. The system is robust to irregular motion and capable of handling the unique mechanical shutter interruptions common to thermal-infrared cameras. The evaluation demonstrates promising performance of the algorithm in several environments.
Abstract:
This paper establishes practical stability results for an important range of approximate discrete-time filtering problems involving mismatch between the true system and the approximating filter model. Practical stability is established in the sense of an asymptotic bound on the amount of bias introduced by the model approximation. Our analysis applies to a wide range of estimation problems and justifies the common practice of approximating intractable infinite dimensional nonlinear filters by simpler computationally tractable filters.
Abstract:
In this paper, a real-time vision-based power line extraction solution is investigated for active UAV guidance. The line extraction algorithm starts from ridge points detected by steerable filters. A collinear line segment fitting algorithm follows, considering global and local information together with multiple collinear measurements. A GPU-boosted implementation of the algorithm is also investigated in the experiments. The experimental results show that the proposed algorithm outperforms two baseline line detection algorithms and is able to fit long collinear line segments. The low computational cost of the algorithm makes it suitable for real-time applications.
Abstract:
The Midwestern US is rich in wind resources, and wind power is being developed in this region at a very brisk pace. Transporting this energy resource to load centers invariably requires massive transmission lines. Developing additional transmission to support reliable integration of wind onto the power grid presents a multitude of interesting challenges spanning various areas of power systems, such as transmission planning, real-time operations and cost allocation for new transmission. The Midwest ISO, as a regional transmission provider, is responsible for processing requests to interconnect proposed generation onto the transmission grid under its purview. This paper provides information about some of the issues faced in performing interconnection planning studies and the Midwest ISO's efforts to improve its generator interconnection procedures. Related cost-allocation efforts currently ongoing at the Midwest ISO to streamline integration of bulk quantities of wind power into the transmission grid are also presented.
Abstract:
This paper establishes sufficient conditions to bound the error in perturbed conditional mean estimates derived from a perturbed model (only the scalar case is shown in this paper but a similar result is expected to hold for the vector case). The results established here extend recent stability results on approximating information state filter recursions to stability results on the approximate conditional mean estimates. The presented filter stability results provide bounds for a wide variety of model error situations.
Abstract:
Over the past few years, the Midwest ISO has experienced a surge in requests to interconnect large amounts of wind generation, driven largely by a favorable political environment and an abundant wind resource in the Midwestern US. This tremendous influx of proposed generators, along with a highly constrained transmission system, adversely impacted interconnection queue processing, resulting in an unmanageable backlog. Under these circumstances, the Midwest ISO successfully reformed the interconnection tariff to improve cycle times and provide increased certainty to interconnection customers. One of the key features of the reformed queue process is the System Planning and Analysis (SPA) phase, which allows integration of the interconnection studies with regional transmission planning. This paper presents a brief background of the queue reform effort and then examines in detail the work performed at the Midwest ISO during the first SPA cycle: the study approach, the challenges faced in having to study over 50,000 MW of wind generation, and the effective solutions designed to complete these studies within tariff timelines.
Abstract:
Spatial navigation requires the processing of complex, disparate and often ambiguous sensory data. The neurocomputations underpinning this vital ability remain poorly understood. Controversy remains as to whether multimodal sensory information must be combined into a unified representation, consistent with Tolman's "cognitive map", or whether differential activation of independent navigation modules suffices to explain observed navigation behaviour. Here we demonstrate that key neural correlates of spatial navigation in darkness cannot be explained if the path integration system acts independently of boundary (landmark) information. In vivo recordings demonstrate that the rodent head direction (HD) system becomes unstable within three minutes without vision. In contrast, rodents maintain stable place fields and grid fields for over half an hour without vision. Using a simple HD error model, we show analytically that idiothetic path integration (iPI) alone cannot be used to maintain any stable place representation beyond two to three minutes. We then use a measure of place stability based on information theoretic principles to prove that featureless boundaries alone cannot be used to improve localization above chance level. Having shown that neither iPI nor boundaries alone are sufficient, we then address the question of whether their combination is sufficient and - we conjecture - necessary to maintain place stability for prolonged periods without vision. We addressed this question in simulations and robot experiments using a navigation model comprising a particle filter and boundary map. The model replicates published experimental results on place field and grid field stability without vision, and makes testable predictions including place field splitting and grid field rescaling if the true arena geometry differs from the acquired boundary map.
We discuss our findings in light of current theories of animal navigation and neuronal computation, and elaborate on their implications and significance for the design, analysis and interpretation of experiments.
Abstract:
This paper addresses the issue of analogical inference, and its potential role as the mediator of new therapeutic discoveries, by using disjunction operators based on quantum connectives to combine many potential reasoning pathways into a single search expression. In it, we extend our previous work in which we developed an approach to analogical retrieval using the Predication-based Semantic Indexing (PSI) model, which encodes both concepts and the relationships between them in high-dimensional vector space. As in our previous work, we leverage the ability of PSI to infer predicate pathways connecting two example concepts, in this case comprising known therapeutic relationships. For example, given that drug x TREATS disease z, we might infer the predicate pathway drug x INTERACTS WITH gene y ASSOCIATED WITH disease z, and use this pathway to search for drugs related to another disease in similar ways. As biological systems tend to be characterized by networks of relationships, we evaluate the ability of quantum-inspired operators to mediate inference and retrieval across multiple relations, by testing the ability of different approaches to recover known therapeutic relationships. In addition, we introduce a novel complex vector based implementation of PSI, based on Plate’s Circular Holographic Reduced Representations, which we utilize for all experiments in addition to the binary vector based approach we have applied in our previous research.
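The binding step behind a complex-vector representation of this kind can be sketched briefly. The following is a minimal illustration of circular holographic reduced representations in general, not the PSI implementation described in the abstract: each concept is assumed to be a vector of unit-magnitude complex numbers, binding is element-wise multiplication (phase addition), and release is multiplication by the complex conjugate. All function names, the dimension of 1000 and the `drug`/`treats` labels are illustrative assumptions.

```python
import cmath
import math
import random


def random_phasor_vector(dim, rng):
    """Return dim unit-magnitude complex numbers with uniformly random phases."""
    return [cmath.exp(1j * rng.uniform(-math.pi, math.pi)) for _ in range(dim)]


def bind(a, b):
    """Bind two phasor vectors by element-wise multiplication (phases add)."""
    return [x * y for x, y in zip(a, b)]


def unbind(bound, a):
    """Release a from a bound vector by multiplying with its complex conjugate."""
    return [x * y.conjugate() for x, y in zip(bound, a)]


def similarity(a, b):
    """Mean cosine of the element-wise phase differences.

    Close to 1.0 for identical vectors and close to 0.0 for unrelated ones.
    """
    return sum((x * y.conjugate()).real for x, y in zip(a, b)) / len(a)


# Hypothetical concept and predicate vectors (assumed names, fixed seed).
rng = random.Random(0)
drug = random_phasor_vector(1000, rng)
treats = random_phasor_vector(1000, rng)

bound = bind(treats, drug)          # encodes the pair (TREATS, drug)
recovered = unbind(bound, treats)   # releasing TREATS should give back drug
```

Under these assumptions, `similarity(recovered, drug)` is close to 1, while `similarity(bound, drug)` stays near 0, which is the property that lets a bound vector hide its components until the right predicate is used to release them.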
Abstract:
Key decisions at the collection, pre-processing, transformation, mining and interpretation phases of any knowledge discovery from databases (KDD) process depend heavily on assumptions and theoretical perspectives relating to the type of task to be performed and the characteristics of the data sourced. In this article, we compare and contrast the theoretical perspectives and assumptions taken in data mining exercises in the legal domain with those adopted in data mining in Traditional Chinese Medicine (TCM) and allopathic medicine. The juxtaposition yields insights for the application of KDD to TCM.
Abstract:
Quality-based frame selection is a crucial task in video face recognition, both to improve the recognition rate and to reduce the computational cost. In this paper we present a framework that uses a variety of cues (face symmetry, sharpness, contrast, closeness of mouth, brightness and openness of the eye) to select the highest quality facial images available in a video sequence for recognition. Normalized feature scores are fused using a neural network, and frames with high quality scores are used in a Local Gabor Binary Pattern Histogram Sequence based face recognition system. Experiments on the Honda/UCSD database show that the proposed method selects the best quality face images in the video sequence, resulting in improved recognition performance.
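The normalise-fuse-select pipeline can be illustrated with a small sketch. The abstract fuses normalized cue scores with a neural network; the sketch below swaps in a plain unweighted mean as a stand-in, so it shows the overall shape of the selection step rather than the authors' fusion method. The function names, the min-max normalisation and the example cue values are all assumptions made for illustration.

```python
def min_max_normalize(values):
    """Rescale raw per-frame scores for one cue to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A cue that never varies carries no ranking information.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def select_frames(cue_scores, top_k):
    """Rank frames by fused quality and return the best top_k frame indices.

    cue_scores maps a cue name to a list of raw per-frame scores; every
    list must have the same length. Fusion here is an unweighted mean of
    the normalized cues (a simplification, not a neural network).
    """
    normalized = [min_max_normalize(scores) for scores in cue_scores.values()]
    n_frames = len(normalized[0])
    fused = [
        sum(cue[i] for cue in normalized) / len(normalized)
        for i in range(n_frames)
    ]
    ranked = sorted(range(n_frames), key=lambda i: fused[i], reverse=True)
    return ranked[:top_k], fused


# Hypothetical raw scores for three frames and two cues.
example_cues = {
    "sharpness": [0.1, 0.9, 0.5],
    "brightness": [10.0, 30.0, 20.0],
}
best_frames, fused_scores = select_frames(example_cues, top_k=2)
```

Normalising each cue before fusion matters because the raw cues live on different scales; without it, a cue such as brightness would dominate the fused score simply because its numbers are larger.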
Abstract:
Detailed representations of complex flow datasets are often difficult to generate using traditional vector visualisation techniques such as arrow plots and streamlines. This is particularly true when the flow regime changes in time. Texture-based techniques, which are based on the advection of dense textures, are novel techniques for visualising such flows. We review two popular texture-based techniques and their application to flow datasets sourced from active research projects. The techniques investigated were line integral convolution (LIC) [1] and image-based flow visualisation (IBFV) [18]. We evaluated these and report on their effectiveness from a visualisation perspective. We also report on their ease of implementation and computational overheads.
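As a rough illustration of the first of these techniques, line integral convolution smears a noise texture along the streamlines of the flow, so that pixel values become correlated along the flow direction. The sketch below is a deliberately simplified version, not the implementation evaluated in the paper: it assumes a box kernel, fixed-size Euler steps and nearest-pixel sampling, and the names `lic`, `field`, `half_length` and `step` are illustrative.

```python
import math


def lic(noise, field, half_length=5, step=0.5):
    """Line integral convolution with a box kernel.

    noise is a 2D list indexed [row][col]; field is a function
    (x, y) -> (vx, vy) giving the flow vector at a point. Each output
    pixel is the mean of the noise values met while tracing the
    streamline through that pixel, forwards and backwards.
    """
    height, width = len(noise), len(noise[0])
    out = [[0.0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            total, count = noise[row][col], 1
            for direction in (1.0, -1.0):
                # Start each trace at the pixel centre.
                x, y = col + 0.5, row + 0.5
                for _ in range(half_length):
                    vx, vy = field(x, y)
                    norm = math.hypot(vx, vy)
                    if norm == 0.0:
                        break  # stagnation point: stop tracing
                    x += direction * step * vx / norm
                    y += direction * step * vy / norm
                    c, r = int(math.floor(x)), int(math.floor(y))
                    if not (0 <= c < width and 0 <= r < height):
                        break  # streamline left the image
                    total += noise[r][c]
                    count += 1
            out[row][col] = total / count
    return out


def uniform_horizontal(x, y):
    """Hypothetical test field: constant flow along the x axis."""
    return (1.0, 0.0)
```

With `uniform_horizontal` as the field, a single bright pixel in an otherwise dark row is spread along that row and leaves the other rows untouched, which is the streak pattern that makes the flow direction visible.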