920 results for Computer vision teaching


Relevance:

80.00%

Abstract:

The world is rich with information such as signage and maps to assist humans to navigate. We present a method to extract topological spatial information from a generic bitmap floor plan and build a topometric graph that can be used by a mobile robot for tasks such as path planning and guided exploration. The algorithm first detects and extracts text in an image of the floor plan. Using the locations of the extracted text, flood fill is used to find the rooms and hallways. Doors are found by matching SURF features and these form the connections between rooms, which are the edges of the topological graph. Our system is able to automatically detect doors and differentiate between hallways and rooms, which is important for effective navigation. We show that our method can extract a topometric graph from a floor plan and is robust against ambiguous cases most commonly seen in floor plans including elevators and stairwells.
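The room-finding step described above (flood fill seeded at the locations of detected text) can be illustrated with a minimal sketch. The toy grid, the label names, and their coordinates are hypothetical stand-ins for a real bitmap and real text-detection output; the sketch does not cover door detection or the hallway/room distinction.

```python
from collections import deque

def flood_fill_region(grid, seed):
    """Return the set of free cells 4-connected to `seed`.

    `grid` is a list of equal-length strings: '#' is a wall pixel,
    '.' is free space. `seed` is a (row, col) tuple.
    """
    rows, cols = len(grid), len(grid[0])
    if grid[seed[0]][seed[1]] == '#':
        return set()
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] != '#' and (nr, nc) not in region:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

# Hypothetical toy floor plan: two rooms separated by a solid wall.
PLAN = [
    "#########",
    "#...#...#",
    "#...#...#",
    "#########",
]

# Hypothetical text-label positions (row, col), as a text detector might report.
LABELS = {"Room A": (1, 2), "Room B": (2, 6)}

# One flood fill per label yields one region per room.
ROOMS = {name: flood_fill_region(PLAN, pos) for name, pos in LABELS.items()}
```

Each label seeds its own fill, so regions that share no free cells come out as separate nodes of the graph.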

Relevance:

80.00%

Abstract:

Recovering the motion of a non-rigid body from a set of monocular images permits the analysis of dynamic scenes in uncontrolled environments. However, the extension of factorisation algorithms for rigid structure from motion to the low-rank non-rigid case has proved challenging. This stems from the comparatively hard problem of finding a linear “corrective transform” which recovers the projection and structure matrices from an ambiguous factorisation. We elucidate that this greater difficulty is due to the need to find multiple solutions to a non-trivial problem, casting a number of previous approaches as alleviating this issue by either a) introducing constraints on the basis, making the problems nonidentical, or b) incorporating heuristics to encourage a diverse set of solutions, making the problems inter-dependent. While it has previously been recognised that finding a single solution to this problem is sufficient to estimate cameras, we show that it is possible to bootstrap this partial solution to find the complete transform in closed-form. However, we acknowledge that our method minimises an algebraic error and is thus inherently sensitive to deviation from the low-rank model. We compare our closed-form solution for non-rigid structure with known cameras to the closed-form solution of Dai et al. [1], which we find to produce only coplanar reconstructions. We therefore make the recommendation that 3D reconstruction error always be measured relative to a trivial reconstruction such as a planar one.

Relevance:

80.00%

Abstract:

Color displays used in image processing systems consist of a refresh memory buffer storing digital image data which are converted into analog signals to display an image by driving the primary color channels (red, green, and blue) of a color television monitor. The color cathode ray tube (CRT) of the monitor is unable to reproduce colors exactly due to phosphor limitations, exponential luminance response of the tube to the applied signal, and limitations imposed by the digital-to-analog conversion. In this paper we describe some computer simulation studies (using the U*V*W* color space) carried out to measure these reproduction errors. Further, a procedure to correct for color reproduction error due to the exponential luminance response (gamma) of the picture tube is proposed, using a video-lookup-table and a higher resolution digital-to-analog converter. It is found, on the basis of computer simulation studies, that the proposed gamma correction scheme is effective and robust with respect to variations in the assumed value of the gamma.
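The idea of correcting gamma with a lookup table feeding a higher-resolution digital-to-analog converter can be sketched as follows. The gamma value of 2.2 and the 8-bit input and 10-bit output depths are assumptions chosen for illustration; the abstract does not state the values used in the paper.

```python
def build_gamma_lut(gamma=2.2, in_bits=8, out_bits=10):
    """Map each input code to a pre-compensated DAC code.

    The table applies the inverse of the display response, so that
    (lut_value / out_max) ** gamma is roughly proportional to the input code.
    """
    in_max = (1 << in_bits) - 1
    out_max = (1 << out_bits) - 1
    return [
        round(((code / in_max) ** (1.0 / gamma)) * out_max)
        for code in range(in_max + 1)
    ]

# Assumed configuration: 8-bit image data driving a 10-bit converter.
LUT_10BIT = build_gamma_lut(out_bits=10)

# Same correction squeezed into an 8-bit converter, for comparison.
LUT_8BIT = build_gamma_lut(out_bits=8)

# Count how many input levels remain distinguishable after correction.
DISTINCT_10BIT = len(set(LUT_10BIT))
DISTINCT_8BIT = len(set(LUT_8BIT))
```

Under these assumptions the 10-bit table keeps all 256 input levels distinct, while the 8-bit table merges some bright levels, which is one way to see why a higher-resolution converter helps.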

Relevance:

80.00%

Abstract:

Domain-invariant representations are key to addressing the domain shift problem where the training and test examples follow different distributions. Existing techniques that have attempted to match the distributions of the source and target domains typically compare these distributions in the original feature space. This space, however, may not be directly suitable for such a comparison, since some of the features may have been distorted by the domain shift, or may be domain specific. In this paper, we introduce a Domain Invariant Projection approach: an unsupervised domain adaptation method that overcomes this issue by extracting the information that is invariant across the source and target domains. More specifically, we learn a projection of the data to a low-dimensional latent space where the distance between the empirical distributions of the source and target examples is minimized. We demonstrate the effectiveness of our approach on the task of visual object recognition and show that it outperforms state-of-the-art methods on a standard domain adaptation benchmark dataset.

Relevance:

80.00%

Abstract:

In this paper, we tackle the problem of unsupervised domain adaptation for classification. In the unsupervised scenario where no labeled samples from the target domain are provided, a popular approach consists in transforming the data such that the source and target distributions become similar. To compare the two distributions, existing approaches make use of the Maximum Mean Discrepancy (MMD). However, this does not exploit the fact that probability distributions lie on a Riemannian manifold. Here, we propose to make better use of the structure of this manifold and rely on the distance on the manifold to compare the source and target distributions. In this framework, we introduce a sample selection method and a subspace-based method for unsupervised domain adaptation, and show that both these manifold-based techniques outperform the corresponding approaches based on the MMD. Furthermore, we show that our subspace-based approach yields state-of-the-art results on a standard object recognition benchmark.
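For readers unfamiliar with the MMD baseline mentioned above, the following is a minimal sketch of the standard biased empirical estimate of the squared MMD with a Gaussian kernel. It illustrates only the baseline, not the manifold-based distance the abstract proposes; the kernel bandwidth and the sample points are arbitrary assumptions.

```python
import math

def rbf_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))

def mmd_squared(source, target, sigma=1.0):
    """Biased empirical estimate of the squared MMD between two samples."""
    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, sigma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (
        mean_kernel(source, source)
        + mean_kernel(target, target)
        - 2.0 * mean_kernel(source, target)
    )

# Hypothetical 2-D samples: the target is the source shifted by (3, 3).
SOURCE = [(0.0, 0.0), (0.5, 0.2), (0.1, 0.4), (0.3, 0.3)]
TARGET = [(x + 3.0, y + 3.0) for x, y in SOURCE]

MMD_SAME = mmd_squared(SOURCE, SOURCE)
MMD_SHIFTED = mmd_squared(SOURCE, TARGET)
```

The estimate is zero when both samples are identical and grows as the samples move apart, which is the quantity MMD-based adaptation methods try to reduce.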

Relevance:

80.00%

Abstract:

Movement of tephritid flies underpins their survival, reproduction, and ability to establish in new areas and is thus of importance when designing effective management strategies. Much of the knowledge currently available on tephritid movement throughout landscapes comes from the use of direct or indirect methods that rely on the trapping of individuals. Here, we review published experimental designs and methods from mark-release-recapture (MRR) studies, as well as other methods, that have been used to estimate movement of the four major tephritid pest genera (Bactrocera, Ceratitis, Anastrepha, and Rhagoletis). In doing so, we aim to illustrate the theoretical and practical considerations needed to study tephritid movement. MRR studies make use of traps to directly estimate the distance that tephritid species can move within a generation and to evaluate the ecological and physiological factors that influence dispersal patterns. MRR studies, however, require careful planning to ensure that the results obtained are not biased by the methods employed, including marking methods, trap properties, trap spacing, and spatial extent of the trapping array. Despite these obstacles, MRR remains a powerful tool for determining tephritid movement, with data particularly required for understudied species that affect developing countries. To ensure that future MRR studies are successful, we suggest that site selection be carefully considered and sufficient resources be allocated to achieve optimal spacing and placement of traps in line with the stated aims of each study. An alternative to MRR is to make use of indirect methods for determining movement, or more correctly, gene flow, which have become widely available with the development of molecular tools. 
Key to these methods is the trapping and sequencing of a suitable number of individuals to represent the genetic diversity of the sampled population and investigate population structuring using nuclear genomic markers or non-recombinant mitochondrial DNA markers. Microsatellites are currently the preferred marker for detecting recent population displacement and provide genetic information that may be used in assignment tests for the direct determination of contemporary movement. Neither MRR nor molecular methods, however, are able to monitor fine-scale movements of individual flies. Recent developments in the miniaturization of electronics offer the tantalising possibility to track individual movements of insects using harmonic radar. Computer vision and radio frequency identification tags may also permit the tracking of fine-scale movements by tephritid flies by automated resampling, although these methods come with the same problems as traditional traps used in MRR studies. Although all methods described in this chapter have limitations, a better understanding of tephritid movement far outweighs the drawbacks of the individual methods because of the need for this information to manage tephritid populations.

Relevance:

80.00%

Abstract:

Deep convolutional network models have dominated recent work in human action recognition as well as image classification. However, these methods are often unduly influenced by the image background, learning and exploiting the presence of cues in typical computer vision datasets. For unbiased robotics applications, the degree of variation and novelty in action backgrounds is far greater than in computer vision datasets. To address this challenge, we propose an “action region proposal” method that, informed by optical flow, extracts image regions likely to contain actions for input into the network both during training and testing. In a range of experiments, we demonstrate that manually segmenting the background is not enough; but through active action region proposals during training and testing, state-of-the-art or better performance can be achieved on individual spatial and temporal video components. Finally, we show that, by focusing attention through action region proposals, we can further improve upon the existing state of the art in spatio-temporally fused action recognition performance.

Relevance:

80.00%

Abstract:

The aim of the study was to explore why the MuPSiNet project - a computer and network supported learning environment for the field of health care and social work - did not develop as expected. To grasp the problem, some hypotheses were formulated. The hypotheses regarded the teachers' skills in and attitudes towards computing and their attitudes towards constructivist study methods. An online survey containing 48 items was performed. The survey targeted all the teachers within the field of health care and social work in the country, and it produced 461 responses that were analysed against the hypotheses. The reliability of the variables was tested using the Cronbach alpha coefficient and t-tests. Poor basic computing skills among the teachers combined with a vulnerable technical solution, and inadequate project management combined with lack of administrative models for transforming economic resources into manpower were the factors that turned out to play a decisive role in the project. Other important findings were that the teachers had rather poor skills and knowledge in computing, computer safety and computer supported instruction, and that these skills were significantly poorer among female teachers, who were in the majority in the sample. The fraction of teachers who were familiar with software for electronic patient records (EPR) was low. The attitudes towards constructivist teaching methods were positive, and further education seemed to markedly increase the teachers' readiness to use alternative teaching methods. The most important conclusions were the following: In order to integrate EPR software as a natural tool in teaching planning and documenting health care, it is crucial that the teachers have sufficient basic skills in computing and that more teachers have personal experience of using EPR software.
In order for computer supported teaching to become accepted it is necessary to arrange extensive further education for the teachers presently working, and for that further education to succeed it should be backed up locally by, among other things, sufficient support in matters concerning computer supported teaching. The attitudes towards computing showed significant gender differences. Based on the findings it is suggested that basic skills in computing should also include an awareness of data safety in relation to work in different kinds of computer networks, and that projects of this kind should be built up around a proper project organisation with sufficient resources. Suggestions concerning curricular development and further education are also presented. Conclusions concerning the research method were that reminders have a better effect, and that respondents tend to answer open-ended questions more verbosely in electronically distributed online surveys compared to traditional surveys. A method of utilising randomized passwords to guarantee respondent anonymity while maintaining sample control is presented. Keywords: computer-assisted learning, computer-assisted instruction, health care, social work, vocational education, computerized patient record, online survey
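The abstract above reports testing reliability with the Cronbach alpha coefficient. As a reminder of how that coefficient is usually computed, here is a minimal sketch of the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The response matrix is invented for illustration and is not data from the study.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses sample variances throughout.
    """
    k = len(responses[0])
    item_variances = [variance([row[i] for row in responses]) for i in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1.0 - sum(item_variances) / total_variance)

# Hypothetical answers of five respondents to three Likert-scale items.
SURVEY = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
    [3, 3, 4],
]

ALPHA = cronbach_alpha(SURVEY)
```

Values closer to 1 indicate that the items vary together and so plausibly measure the same underlying variable.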

Relevance:

80.00%

Abstract:

Age estimation from facial images is receiving increasing attention for applications such as age-based access control and age-adaptive targeted marketing. Since even humans can be led into error by the complex biological processes involved, finding a robust method remains a research challenge today. In this paper, we propose a new framework for the integration of Active Appearance Models (AAM), Local Binary Patterns (LBP), Gabor wavelets (GW) and Local Phase Quantization (LPQ) in order to obtain a highly discriminative feature representation which is able to model shape, appearance, wrinkles and skin spots. In addition, this paper proposes a novel flexible hierarchical age estimation approach consisting of a multi-class Support Vector Machine (SVM) to classify a subject into an age group followed by a Support Vector Regression (SVR) to estimate a specific age. The errors that may happen in the classification step, caused by the hard boundaries between age classes, are compensated in the specific age estimation by a flexible overlapping of the age ranges. The performance of the proposed approach was evaluated on the FG-NET Aging and MORPH Album 2 datasets, and mean absolute errors (MAE) of 4.50 and 5.86 years were achieved, respectively. The robustness of the proposed approach was also evaluated on a merge of both datasets, and an MAE of 5.20 years was achieved. Furthermore, we compared the age estimates made by humans with those of the proposed approach and found that the machine outperforms humans. The proposed approach is competitive with the current state of the art and provides additional robustness to blur, lighting and expression variance brought about by the local phase features.

Relevance:

80.00%

Abstract:

Video surveillance infrastructure has been widely installed in public places for security purposes. However, live video feeds are typically monitored by human staff, making the detection of important events as they occur difficult. As such, an expert system that can automatically detect events of interest in surveillance footage is highly desirable. Although a number of approaches have been proposed, they have significant limitations: supervised approaches, which can detect a specific event, ideally require a large number of samples with the event spatially and temporally localised; while unsupervised approaches, which do not require this demanding annotation, can only detect whether an event is abnormal and not specific event types. To overcome these problems, we formulate a weakly-supervised approach using Kullback-Leibler (KL) divergence to detect rare events. The proposed approach leverages the sparse nature of the target events to its advantage, and we show that this data imbalance guarantees the existence of a decision boundary to separate samples that contain the target event from those that do not. This trait, combined with the coarse annotation used by weakly supervised learning (that only indicates approximately when an event occurs), greatly reduces the annotation burden while retaining the ability to detect specific events. Furthermore, the proposed classifier requires only a decision threshold, simplifying its use compared to other weakly supervised approaches. We show that the proposed approach outperforms state-of-the-art methods on a popular real-world traffic surveillance dataset, while preserving real time performance.
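To make the role of the KL divergence and a single decision threshold more concrete, here is a minimal sketch that compares a clip's feature histogram with a reference histogram and flags the clip when the divergence exceeds a threshold. This is an illustration of the general idea only, not the classifier from the paper; the histograms, the threshold value, and the helper names are all hypothetical.

```python
import math

def normalize(counts):
    """Turn raw bin counts into a discrete probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length.

    `eps` guards against division by zero when a bin of q is empty.
    """
    return sum(pi * math.log(pi / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical feature histograms (for example, quantized motion directions).
REFERENCE = normalize([40, 30, 20, 10])     # footage without the target event
TYPICAL_CLIP = normalize([38, 32, 19, 11])  # resembles the reference
UNUSUAL_CLIP = normalize([5, 5, 10, 80])    # dominated by one rare pattern

# Hypothetical decision threshold; in practice it would be tuned on data.
THRESHOLD = 0.1

def is_flagged(clip_histogram):
    """Flag a clip whose histogram diverges strongly from the reference."""
    return kl_divergence(clip_histogram, REFERENCE) > THRESHOLD
```

With these numbers the typical clip stays below the threshold and the unusual clip is flagged, so the only free parameter in the sketch is the threshold itself.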

Relevance:

80.00%

Abstract:

In this paper we investigate the effectiveness of class-specific sparse codes in the context of discriminative action classification. The bag-of-words representation is widely used in activity recognition to encode features, and although it yields state-of-the-art performance with several feature descriptors, it still suffers from large quantization errors, which reduce overall performance. Recently proposed sparse representation methods have been shown to effectively represent features as a linear combination of atoms from an overcomplete dictionary by minimizing the reconstruction error. In contrast to most sparse representation methods, which focus on Sparse-Reconstruction based Classification (SRC), this paper focuses on discriminative classification using an SVM by constructing class-specific sparse codes for motion and appearance separately. Experimental results demonstrate that separate motion-specific and appearance-specific sparse coefficients provide the most effective and discriminative representation for each class compared to a single set of class-specific sparse coefficients.

Relevance:

80.00%

Abstract:

This paper presents an effective feature representation method in the context of activity recognition. Efficient and effective feature representation plays a crucial role not only in activity recognition, but also in a wide range of applications such as motion analysis, tracking, 3D scene understanding etc. In the context of activity recognition, local features are increasingly popular for representing videos because of their simplicity and efficiency. While they achieve state-of-the-art performance with low computational requirements, their performance is still limited for real world applications due to a lack of contextual information and models not being tailored to specific activities. We propose a new activity representation framework to address the shortcomings of the popular but simple bag-of-words approach. In our framework, multiple-instance SVM (mi-SVM) is first used to identify positive features for each action category, and the k-means algorithm is used to generate a codebook. Then locality-constrained linear coding is used to encode the features into the generated codebook, followed by spatio-temporal pyramid pooling to convey the spatio-temporal statistics. Finally, an SVM is used to classify the videos. Experiments carried out on two popular datasets with varying complexity demonstrate significant performance improvement over the baseline bag-of-features method.
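The baseline that this abstract compares against, bag-of-words encoding over a codebook, can be sketched in a few lines. The sketch below uses hard assignment to the nearest codeword, which is the simple baseline and not the locality-constrained linear coding the paper proposes; the two-entry codebook and the feature vectors are hypothetical, and in practice the codebook would come from k-means.

```python
def nearest_codeword(feature, codebook):
    """Index of the codebook entry closest to `feature` (squared Euclidean)."""
    def squared_distance(j):
        return sum((f - c) ** 2 for f, c in zip(feature, codebook[j]))
    return min(range(len(codebook)), key=squared_distance)

def bow_histogram(features, codebook):
    """Normalized histogram of hard codeword assignments for one video."""
    counts = [0] * len(codebook)
    for feature in features:
        counts[nearest_codeword(feature, codebook)] += 1
    return [count / len(features) for count in counts]

# Hypothetical codebook of two 2-D codewords (normally produced by k-means).
CODEBOOK = [(0.0, 0.0), (10.0, 10.0)]

# Hypothetical local features extracted from one video.
FEATURES = [(1.0, 0.0), (0.0, 1.0), (9.0, 9.0), (0.0, 0.0)]

HISTOGRAM = bow_histogram(FEATURES, CODEBOOK)
```

Each feature is replaced by a single codeword index, so information about how far it lies from that codeword is discarded; softer coding schemes aim to reduce this quantization loss.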

Relevance:

80.00%

Abstract:

The study examines various uses of computer technology in the acquisition of information for visually impaired people. For this study, 29 visually impaired persons took part in a survey about their experiences concerning the acquisition of information and the use of computers, especially with a screen magnification program, a speech synthesizer and a braille display. According to the responses, the evolution of computer technology offers an important possibility for visually impaired people to cope with everyday activities and interact with the environment. Nevertheless, the functionality of assistive technology needs further development to become more usable and versatile. Since the challenges of independent observation of the environment were emphasized in the survey, the study led to the development of a portable text vision system called Tekstinäkö. Contrary to typical stand-alone applications, the Tekstinäkö system was constructed by combining devices and programs that are readily available on the consumer market. As the system operates, pictures are taken by a digital camera and instantly transmitted to a text recognition program on a laptop computer, which reads the text aloud using a speech synthesizer. Visually impaired test users reported that even uncertain interpretations of texts in the environment given by the Tekstinäkö system are at least a welcome addition to a complete perception of the environment. It became clear that even with modest development work it is possible to bring new, useful and valuable methods to the everyday life of disabled people. The unconventional production process of the system appeared to be efficient as well. The achieved results and the proposed working model offer one suggestion for giving enough attention to easily overlooked needs of the people with special abilities.
ACM Computing Classification System (1998): K.4.2 Social Issues: Assistive technologies for persons with disabilities; I.4.9 Image processing and computer vision: Applications. Keywords: Visually impaired, computer-assisted, information, acquisition, assistive technology, computer, screen magnification program, speech synthesizer, braille display, survey, testing, text recognition, camera, text, perception, picture, environment, transportation, guidance, independence, vision, disabled, blind, speech, synthesizer, braille, software engineering, programming, program, system, freeware, shareware, open source, Tekstinäkö, text vision, TopOCR, Autohotkey, computer engineering, computer science

Relevance:

80.00%

Abstract:

Surveying threatened and invasive species to obtain accurate population estimates is an important but challenging task that requires a considerable investment in time and resources. Estimates using existing ground-based monitoring techniques, such as camera traps and surveys performed on foot, are known to be resource intensive, potentially inaccurate and imprecise, and difficult to validate. Recent developments in unmanned aerial vehicles (UAV), artificial intelligence and miniaturized thermal imaging systems represent a new opportunity for wildlife experts to inexpensively survey relatively large areas. The system presented in this paper includes thermal image acquisition as well as a video processing pipeline to perform object detection, classification and tracking of wildlife in forest or open areas. The system is tested on thermal video data from ground based and test flight footage, and is found to be able to detect all the target wildlife located in the surveyed area. The system is flexible in that the user can readily define the types of objects to classify and the object characteristics that should be considered during classification.

Relevance:

80.00%

Abstract:

Natural history collections are an invaluable resource housing a wealth of knowledge, with a long tradition of contributing to a wide range of fields such as taxonomy, quarantine, conservation and climate change. It is recognized, however [Smith and Blagoderov 2012], that such physical collections are often heavily underutilized as a result of the practical issues of accessibility. The digitization of these collections is a step towards removing these access issues, but other hurdles must be addressed before we truly unlock the potential of this knowledge.