858 resultados para Feature Descriptors


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Objective
Pedestrian detection under video surveillance systems has always been a hot topic in computer vision research. These systems are widely used in train stations, airports, large commercial plazas, and other public places. However, pedestrian detection remains difficult because of complex backgrounds. Given its development in recent years, the visual attention mechanism has attracted increasing attention in object detection and tracking research, and previous studies have achieved substantial progress and breakthroughs. We propose a novel pedestrian detection method based on the semantic features under the visual attention mechanism.
Method
The proposed semantic feature-based visual attention model is a spatial-temporal model that consists of two parts: the static visual attention model and the motion visual attention model. The static visual attention model in the spatial domain is constructed by combining bottom-up with top-down attention guidance. Based on the characteristics of pedestrians, the bottom-up visual attention model of Itti is improved by intensifying the orientation vectors of elementary visual features to make the visual saliency map suitable for pedestrian detection. In terms of pedestrian attributes, skin color is selected as a semantic feature for pedestrian detection. The regional and Gaussian models are adopted to construct the skin color model. Skin feature-based visual attention guidance is then proposed to complete the top-down process. The bottom-up and top-down visual attentions are linearly combined using the proper weights obtained from experiments to construct the static visual attention model in the spatial domain. The spatial-temporal visual attention model is then constructed via the motion features in the temporal domain. Based on the static visual attention model in the spatial domain, the frame difference method is combined with optical flowing to detect motion vectors. Filtering is applied to process the field of motion vectors. The saliency of motion vectors can be evaluated via motion entropy to make the selected motion feature more suitable for the spatial-temporal visual attention model.
Result
Standard datasets and practical videos are selected for the experiments. The experiments are performed on a MATLAB R2012a platform. The experimental results show that our spatial-temporal visual attention model demonstrates favorable robustness under various scenes, including indoor train station surveillance videos and outdoor scenes with swaying leaves. Our proposed model outperforms the visual attention model of Itti, the graph-based visual saliency model, the phase spectrum of quaternion Fourier transform model, and the motion channel model of Liu in terms of pedestrian detection. The proposed model achieves a 93% accuracy rate on the test video.
Conclusion
This paper proposes a novel pedestrian method based on the visual attention mechanism. A spatial-temporal visual attention model that uses low-level and semantic features is proposed to calculate the saliency map. Based on this model, the pedestrian targets can be detected through focus of attention shifts. The experimental results verify the effectiveness of the proposed attention model for detecting pedestrians.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

To maintain the pace of development set by Moore's law, production processes in semiconductor manufacturing are becoming more and more complex. The development of efficient and interpretable anomaly detection systems is fundamental to keeping production costs low. As the dimension of process monitoring data can become extremely high anomaly detection systems are impacted by the curse of dimensionality, hence dimensionality reduction plays an important role. Classical dimensionality reduction approaches, such as Principal Component Analysis, generally involve transformations that seek to maximize the explained variance. In datasets with several clusters of correlated variables the contributions of isolated variables to explained variance may be insignificant, with the result that they may not be included in the reduced data representation. It is then not possible to detect an anomaly if it is only reflected in such isolated variables. In this paper we present a new dimensionality reduction technique that takes account of such isolated variables and demonstrate how it can be used to build an interpretable and robust anomaly detection system for Optical Emission Spectroscopy data.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The 2015 FRVT gender classification (GC) report evidences the problems that current approaches tackle in situations with large variations in pose, illumination, background and facial expression. The report suggests that both commercial and research solutions are hardly able to reach an accuracy over 90% for The Images of Groups dataset, a proven scenario exhibiting unrestricted or in the wild conditions. In this paper, we focus on this challenging dataset, stepping forward in GC performance by observing: 1) recent literature results combining multiple local descriptors, and 2) the psychophysics evidences of the greater importance of the ocular and mouth areas to solve this task...

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The governance of climate adaptation involves the collective efforts of multiple societal actors to address problems, or to reap the benefits, associated with impacts of climate change. Governing involves the creation of institutions, rules and organizations, and the selection of normative principles to guide problem solution and institution building. We argue that actors involved in governing climate change adaptation, as climate change governance regimes evolve, inevitably must engage in making choices, for instance on problem definitions, jurisdictional levels, on modes of governance and policy instruments, and on the timing of interventions. Yet little is known about how and why these choices are made in practice, and how such choices affect the outcomes of our efforts to govern adaptation. In this introduction we review the current state of evidence and the specific contribution of the articles published in this Special Feature, which are aimed at bringing greater clarity in these matters, and thereby informing both governance theory and practice. Collectively, the contributing papers suggest that the way issues are defined has important consequences for the support for governance interventions, and their effectiveness. The articles suggest that currently the emphasis in adaptation governance is on the local and regional levels, while underscoring the benefits of interventions and governance at higher jurisdictional levels in terms of visioning and scaling-up effective approaches. The articles suggest that there is a central role of government agencies in leading governance interventions to address spillover effects, to provide public goods, and to promote the long-term perspectives for planning. They highlight the issue of justice in the governance of adaptation showing how governance measures have wide distributional consequences, including the potential to amplify existing inequalities, access to resources, or generating new injustices through distribution of risks. For several of these findings, future research directions are suggested.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The estimating of the relative orientation and position of a camera is one of the integral topics in the field of computer vision. The accuracy of a certain Finnish technology company’s traffic sign inventory and localization process can be improved by utilizing the aforementioned concept. The company’s localization process uses video data produced by a vehicle installed camera. The accuracy of estimated traffic sign locations depends on the relative orientation between the camera and the vehicle. This thesis proposes a computer vision based software solution which can estimate a camera’s orientation relative to the movement direction of the vehicle by utilizing video data. The task was solved by using feature-based methods and open source software. When using simulated data sets, the camera orientation estimates had an absolute error of 0.31 degrees on average. The software solution can be integrated to be a part of the traffic sign localization pipeline of the company in question.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The use of human brain electroencephalography (EEG) signals for automatic person identi cation has been investigated for a decade. It has been found that the performance of an EEG-based person identication system highly depends on what feature to be extracted from multi-channel EEG signals. Linear methods such as Power Spectral Density and Autoregressive Model have been used to extract EEG features. However these methods assumed that EEG signals are stationary. In fact, EEG signals are complex, non-linear, non-stationary, and random in nature. In addition, other factors such as brain condition or human characteristics may have impacts on the performance, however these factors have not been investigated and evaluated in previous studies. It has been found in the literature that entropy is used to measure the randomness of non-linear time series data. Entropy is also used to measure the level of chaos of braincomputer interface systems. Therefore, this thesis proposes to study the role of entropy in non-linear analysis of EEG signals to discover new features for EEG-based person identi- cation. Five dierent entropy methods including Shannon Entropy, Approximate Entropy, Sample Entropy, Spectral Entropy, and Conditional Entropy have been proposed to extract entropy features that are used to evaluate the performance of EEG-based person identication systems and the impacts of epilepsy, alcohol, age and gender characteristics on these systems. Experiments were performed on the Australian EEG and Alcoholism datasets. Experimental results have shown that, in most cases, the proposed entropy features yield very fast person identication, yet with compatible accuracy because the feature dimension is low. In real life security operation, timely response is critical. The experimental results have also shown that epilepsy, alcohol, age and gender characteristics have impacts on the EEG-based person identication systems.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Model Driven based approach for Service Evolution in Clouds will mainly focus on the reusable evolution patterns' advantage to solve evolution problems. During the process, evolution pattern will be driven by MDA models to pattern aspects. Weaving the aspects into service based process by using Aspect-Oriented extended BPEL engine at runtime will be the dynamic feature of the evolution.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The use of digital image processing techniques is prominent in medical settings for the automatic diagnosis of diseases. Glaucoma is the second leading cause of blindness in the world and it has no cure. Currently, there are treatments to prevent vision loss, but the disease must be detected in the early stages. Thus, the objective of this work is to develop an automatic detection method of Glaucoma in retinal images. The methodology used in the study were: acquisition of image database, Optic Disc segmentation, texture feature extraction in different color models and classification of images in glaucomatous or not. We obtained results of 93% accuracy

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The size of online image datasets is constantly increasing. Considering an image dataset with millions of images, image retrieval becomes a seemingly intractable problem for exhaustive similarity search algorithms. Hashing methods, which encodes high-dimensional descriptors into compact binary strings, have become very popular because of their high efficiency in search and storage capacity. In the first part, we propose a multimodal retrieval method based on latent feature models. The procedure consists of a nonparametric Bayesian framework for learning underlying semantically meaningful abstract features in a multimodal dataset, a probabilistic retrieval model that allows cross-modal queries and an extension model for relevance feedback. In the second part, we focus on supervised hashing with kernels. We describe a flexible hashing procedure that treats binary codes and pairwise semantic similarity as latent and observed variables, respectively, in a probabilistic model based on Gaussian processes for binary classification. We present a scalable inference algorithm with the sparse pseudo-input Gaussian process (SPGP) model and distributed computing. In the last part, we define an incremental hashing strategy for dynamic databases where new images are added to the databases frequently. The method is based on a two-stage classification framework using binary and multi-class SVMs. The proposed method also enforces balance in binary codes by an imbalance penalty to obtain higher quality binary codes. We learn hash functions by an efficient algorithm where the NP-hard problem of finding optimal binary codes is solved via cyclic coordinate descent and SVMs are trained in a parallelized incremental manner. For modifications like adding images from an unseen class, we propose an incremental procedure for effective and efficient updates to the previous hash functions. Experiments on three large-scale image datasets demonstrate that the incremental strategy is capable of efficiently updating hash functions to the same retrieval performance as hashing from scratch.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Image (Video) retrieval is an interesting problem of retrieving images (videos) similar to the query. Images (Videos) are represented in an input (feature) space and similar images (videos) are obtained by finding nearest neighbors in the input representation space. Numerous input representations both in real valued and binary space have been proposed for conducting faster retrieval. In this thesis, we present techniques that obtain improved input representations for retrieval in both supervised and unsupervised settings for images and videos. Supervised retrieval is a well known problem of retrieving same class images of the query. We address the practical aspects of achieving faster retrieval with binary codes as input representations for the supervised setting in the first part, where binary codes are used as addresses into hash tables. In practice, using binary codes as addresses does not guarantee fast retrieval, as similar images are not mapped to the same binary code (address). We address this problem by presenting an efficient supervised hashing (binary encoding) method that aims to explicitly map all the images of the same class ideally to a unique binary code. We refer to the binary codes of the images as `Semantic Binary Codes' and the unique code for all same class images as `Class Binary Code'. We also propose a new class­ based Hamming metric that dramatically reduces the retrieval times for larger databases, where only hamming distance is computed to the class binary codes. We also propose a Deep semantic binary code model, by replacing the output layer of a popular convolutional Neural Network (AlexNet) with the class binary codes and show that the hashing functions learned in this way outperforms the state­ of ­the art, and at the same time provide fast retrieval times. In the second part, we also address the problem of supervised retrieval by taking into account the relationship between classes. For a given query image, we want to retrieve images that preserve the relative order i.e. we want to retrieve all same class images first and then, the related classes images before different class images. We learn such relationship aware binary codes by minimizing the similarity between inner product of the binary codes and the similarity between the classes. We calculate the similarity between classes using output embedding vectors, which are vector representations of classes. Our method deviates from the other supervised binary encoding schemes as it is the first to use output embeddings for learning hashing functions. We also introduce new performance metrics that take into account the related class retrieval results and show significant gains over the state­ of­ the art. High Dimensional descriptors like Fisher Vectors or Vector of Locally Aggregated Descriptors have shown to improve the performance of many computer vision applications including retrieval. In the third part, we will discuss an unsupervised technique for compressing high dimensional vectors into high dimensional binary codes, to reduce storage complexity. In this approach, we deviate from adopting traditional hyperplane hashing functions and instead learn hyperspherical hashing functions. The proposed method overcomes the computational challenges of directly applying the spherical hashing algorithm that is intractable for compressing high dimensional vectors. A practical hierarchical model that utilizes divide and conquer techniques using the Random Select and Adjust (RSA) procedure to compress such high dimensional vectors is presented. We show that our proposed high dimensional binary codes outperform the binary codes obtained using traditional hyperplane methods for higher compression ratios. In the last part of the thesis, we propose a retrieval based solution to the Zero shot event classification problem - a setting where no training videos are available for the event. To do this, we learn a generic set of concept detectors and represent both videos and query events in the concept space. We then compute similarity between the query event and the video in the concept space and videos similar to the query event are classified as the videos belonging to the event. We show that we significantly boost the performance using concept features from other modalities.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Phenotypic variation in plants can be evaluated by morphological characterization using visual attributes. Fruits have been the major descriptors for identification of different varieties of fruit crops. However, even in their absence, farmers, breeders and interested stakeholders require to distinguish between different mango varieties. This study aimed at determining diversity in mango germplasm from the Upper Athi River (UAR) and providing useful alternative descriptors for the identification of different mango varieties in the absence of fruits. A total of 20 International Plant Genetic Resources Institute (IPGRI) descriptors for mango were selected for use in the visual assessment of 98 mango accessions from 15 sites of the UAR region of eastern Kenya. Purposive sampling was used to identify farmers growing diverse varieties of mangoes. Evaluation of the descriptors was performed on-site and the data collected were then subjected to multivariate analysis including Principal Component Analysis (PCA) and Cluster analysis, one- way analysis of variance (ANOVA) and Chi square tests. Results classified the accessions into two major groups corresponding to indigenous and exotic varieties. The PCA showed the first six principal components accounting for 75.12% of the total variance. A strong and highly significant correlation was observed between the color of fully grown leaves, leaf blade width, leaf blade length and petiole length and also between the leaf attitude, color of young leaf, stem circumference, tree height, leaf margin, growth habit and fragrance. Useful descriptors for morphological evaluation were 14 out of the selected 20; however, ANOVA and Chi square test revealed that diversity in the accessions was majorly as a result of variations in color of young leaves, leaf attitude, leaf texture, growth habit, leaf blade length, leaf blade width and petiole length traits. These results reveal that mango germplasm in the UAR has significant diversity and that other morphological traits apart from fruits can be useful in morphological characterization of mango.