996 resultados para dictionary learning


Relevância:

60.00% 60.00%

Publicador:

Resumo:

Nanotechnology has revolutionised humanity's capability in building microscopic systems by manipulating materials on a molecular and atomic scale. Nan-osystems are becoming increasingly smaller and more complex from the chemical perspective which increases the demand for microscopic characterisation techniques. Among others, transmission electron microscopy (TEM) is an indispensable tool that is increasingly used to study the structures of nanosystems down to the molecular and atomic scale. However, despite the effectivity of this tool, it can only provide 2-dimensional projection (shadow) images of the 3D structure, leaving the 3-dimensional information hidden which can lead to incomplete or erroneous characterization. One very promising inspection method is Electron Tomography (ET), which is rapidly becoming an important tool to explore the 3D nano-world. ET provides (sub-)nanometer resolution in all three dimensions of the sample under investigation. However, the fidelity of the ET tomogram that is achieved by current ET reconstruction procedures remains a major challenge. This thesis addresses the assessment and advancement of electron tomographic methods to enable high-fidelity three-dimensional investigations. A quality assessment investigation was conducted to provide a quality quantitative analysis of the main established ET reconstruction algorithms and to study the influence of the experimental conditions on the quality of the reconstructed ET tomogram. Regular shaped nanoparticles were used as a ground-truth for this study. It is concluded that the fidelity of the post-reconstruction quantitative analysis and segmentation is limited, mainly by the fidelity of the reconstructed ET tomogram. This motivates the development of an improved tomographic reconstruction process. In this thesis, a novel ET method was proposed, named dictionary learning electron tomography (DLET). DLET is based on the recent mathematical theorem of compressed sensing (CS) which employs the sparsity of ET tomograms to enable accurate reconstruction from undersampled (S)TEM tilt series. DLET learns the sparsifying transform (dictionary) in an adaptive way and reconstructs the tomogram simultaneously from highly undersampled tilt series. In this method, the sparsity is applied on overlapping image patches favouring local structures. Furthermore, the dictionary is adapted to the specific tomogram instance, thereby favouring better sparsity and consequently higher quality reconstructions. The reconstruction algorithm is based on an alternating procedure that learns the sparsifying dictionary and employs it to remove artifacts and noise in one step, and then restores the tomogram data in the other step. Simulation and real ET experiments of several morphologies are performed with a variety of setups. Reconstruction results validate its efficiency in both noiseless and noisy cases and show that it yields an improved reconstruction quality with fast convergence. The proposed method enables the recovery of high-fidelity information without the need to worry about what sparsifying transform to select or whether the images used strictly follow the pre-conditions of a certain transform (e.g. strictly piecewise constant for Total Variation minimisation). This can also avoid artifacts that can be introduced by specific sparsifying transforms (e.g. the staircase artifacts the may result when using Total Variation minimisation). Moreover, this thesis shows how reliable elementally sensitive tomography using EELS is possible with the aid of both appropriate use of Dual electron energy loss spectroscopy (DualEELS) and the DLET compressed sensing algorithm to make the best use of the limited data volume and signal to noise inherent in core-loss electron energy loss spectroscopy (EELS) from nanoparticles of an industrially important material. Taken together, the results presented in this thesis demonstrates how high-fidelity ET reconstructions can be achieved using a compressed sensing approach.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Object recognition has long been a core problem in computer vision. To improve object spatial support and speed up object localization for object recognition, generating high-quality category-independent object proposals as the input for object recognition system has drawn attention recently. Given an image, we generate a limited number of high-quality and category-independent object proposals in advance and used as inputs for many computer vision tasks. We present an efficient dictionary-based model for image classification task. We further extend the work to a discriminative dictionary learning method for tensor sparse coding. In the first part, a multi-scale greedy-based object proposal generation approach is presented. Based on the multi-scale nature of objects in images, our approach is built on top of a hierarchical segmentation. We first identify the representative and diverse exemplar clusters within each scale. Object proposals are obtained by selecting a subset from the multi-scale segment pool via maximizing a submodular objective function, which consists of a weighted coverage term, a single-scale diversity term and a multi-scale reward term. The weighted coverage term forces the selected set of object proposals to be representative and compact; the single-scale diversity term encourages choosing segments from different exemplar clusters so that they will cover as many object patterns as possible; the multi-scale reward term encourages the selected proposals to be discriminative and selected from multiple layers generated by the hierarchical image segmentation. The experimental results on the Berkeley Segmentation Dataset and PASCAL VOC2012 segmentation dataset demonstrate the accuracy and efficiency of our object proposal model. Additionally, we validate our object proposals in simultaneous segmentation and detection and outperform the state-of-art performance. To classify the object in the image, we design a discriminative, structural low-rank framework for image classification. We use a supervised learning method to construct a discriminative and reconstructive dictionary. By introducing an ideal regularization term, we perform low-rank matrix recovery for contaminated training data from all categories simultaneously without losing structural information. A discriminative low-rank representation for images with respect to the constructed dictionary is obtained. With semantic structure information and strong identification capability, this representation is good for classification tasks even using a simple linear multi-classifier.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Cross domain and cross-modal matching has many applications in the field of computer vision and pattern recognition. A few examples are heterogeneous face recognition, cross view action recognition, etc. This is a very challenging task since the data in two domains can differ significantly. In this work, we propose a coupled dictionary and transformation learning approach that models the relationship between the data in both domains. The approach learns a pair of transformation matrices that map the data in the two domains in such a manner that they share common sparse representations with respect to their own dictionaries in the transformed space. The dictionaries for the two domains are learnt in a coupled manner with an additional discriminative term to ensure improved recognition performance. The dictionaries and the transformation matrices are jointly updated in an iterative manner. The applicability of the proposed approach is illustrated by evaluating its performance on different challenging tasks: face recognition across pose, illumination and resolution, heterogeneous face recognition and cross view action recognition. Extensive experiments on five datasets namely, CMU-PIE, Multi-PIE, ChokePoint, HFB and IXMAS datasets and comparisons with several state-of-the-art approaches show the effectiveness of the proposed approach. (C) 2015 Elsevier B.V. All rights reserved.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Cross domain and cross-modal matching has many applications in the field of computer vision and pattern recognition. A few examples are heterogeneous face recognition, cross view action recognition, etc. This is a very challenging task since the data in two domains can differ significantly. In this work, we propose a coupled dictionary and transformation learning approach that models the relationship between the data in both domains. The approach learns a pair of transformation matrices that map the data in the two domains in such a manner that they share common sparse representations with respect to their own dictionaries in the transformed space. The dictionaries for the two domains are learnt in a coupled manner with an additional discriminative term to ensure improved recognition performance. The dictionaries and the transformation matrices are jointly updated in an iterative manner. The applicability of the proposed approach is illustrated by evaluating its performance on different challenging tasks: face recognition across pose, illumination and resolution, heterogeneous face recognition and cross view action recognition. Extensive experiments on five datasets namely, CMU-PIE, Multi-PIE, ChokePoint, HFB and IXMAS datasets and comparisons with several state-of-the-art approaches show the effectiveness of the proposed approach. (C) 2015 Elsevier B.V. All rights reserved.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

We propose a joint representation and classification framework that achieves the dual goal of finding the most discriminative sparse overcomplete encoding and optimal classifier parameters. Formulating an optimization problem that combines the objective function of the classification with the representation error of both labeled and unlabeled data, constrained by sparsity, we propose an algorithm that alternates between solving for subsets of parameters, whilst preserving the sparsity. The method is then evaluated over two important classification problems in computer vision: object categorization of natural images using the Caltech 101 database and face recognition using the Extended Yale B face database. The results show that the proposed method is competitive against other recently proposed sparse overcomplete counterparts and considerably outperforms many recently proposed face recognition techniques when the number training samples is small.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This study applies theories of cognitive linguistics to the compilation of English learners’ dictionaries. Specifically, it employs the concepts of basic level categories and image schemas, two basic cognitive experiences, to examine the ‘definition proper’ of English dictionaries for foreign learners. In the study, the definition proper refers to the constituent part of a reference work that provides an explanation of the meanings of a word, phrase or term. This rationalization mainly consists of defining vocabulary, sense division and arrangement, as well as the means of defining (i.e. paraphrase, true definition, functional definition, and pictorial illustration). The aim of the study is to suggest ways of aligning the consultation and learning of definitions with dictionary users’ cognitive experiences. For this purpose, an analysis of the definition proper of the fourth edition of the Longman Dictionary of Contemporary English (LDOCE4) from the perspective of basic cognitive experiences has been undertaken. The study found that, generally, the lexicographic practices of LDOCE4 are consistent with theories of cognitive linguistics. However, there exist shortcomings that result from disregarding basic cognitive experiences.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

To perform super resolution of low resolution images, state-of-the-art methods are based on learning a pair of lowresolution and high-resolution dictionaries from multiple images. These trained dictionaries are used to replace patches in lowresolution image with appropriate matching patches from the high-resolution dictionary. In this paper we propose using a single common image as dictionary, in conjunction with approximate nearest neighbour fields (ANNF) to perform super resolution (SR). By using a common source image, we are able to bypass the learning phase and also able to reduce the dictionary from a collection of hundreds of images to a single image. By adapting recent developments in ANNF computation, to suit super-resolution, we are able to perform much faster and accurate SR than existing techniques. To establish this claim, we compare the proposed algorithm against various state-of-the-art algorithms, and show that we are able to achieve b etter and faster reconstruction without any training.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this study, we introduce an original distance definition for graphs, called the Markov-inverse-F measure (MiF). This measure enables the integration of classical graph theory indices with new knowledge pertaining to structural feature extraction from semantic networks. MiF improves the conventional Jaccard and/or Simpson indices, and reconciles both the geodesic information (random walk) and co-occurrence adjustment (degree balance and distribution). We measure the effectiveness of graph-based coefficients through the application of linguistic graph information for a neural activity recorded during conceptual processing in the human brain. Specifically, the MiF distance is computed between each of the nouns used in a previous neural experiment and each of the in-between words in a subgraph derived from the Edinburgh Word Association Thesaurus of English. From the MiF-based information matrix, a machine learning model can accurately obtain a scalar parameter that specifies the degree to which each voxel in (the MRI image of) the brain is activated by each word or each principal component of the intermediate semantic features. Furthermore, correlating the voxel information with the MiF-based principal components, a new computational neurolinguistics model with a network connectivity paradigm is created. This allows two dimensions of context space to be incorporated with both semantic and neural distributional representations.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper describes a trainable system capable of tracking faces and facialsfeatures like eyes and nostrils and estimating basic mouth features such as sdegrees of openness and smile in real time. In developing this system, we have addressed the twin issues of image representation and algorithms for learning. We have used the invariance properties of image representations based on Haar wavelets to robustly capture various facial features. Similarly, unlike previous approaches this system is entirely trained using examples and does not rely on a priori (hand-crafted) models of facial features based on optical flow or facial musculature. The system works in several stages that begin with face detection, followed by localization of facial features and estimation of mouth parameters. Each of these stages is formulated as a problem in supervised learning from examples. We apply the new and robust technique of support vector machines (SVM) for classification in the stage of skin segmentation, face detection and eye detection. Estimation of mouth parameters is modeled as a regression from a sparse subset of coefficients (basis functions) of an overcomplete dictionary of Haar wavelets.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

El test de circuits és una fase del procés de producció que cada vegada pren més importància quan es desenvolupa un nou producte. Les tècniques de test i diagnosi per a circuits digitals han estat desenvolupades i automatitzades amb èxit, mentre que aquest no és encara el cas dels circuits analògics. D'entre tots els mètodes proposats per diagnosticar circuits analògics els més utilitzats són els diccionaris de falles. En aquesta tesi se'n descriuen alguns, tot analitzant-ne els seus avantatges i inconvenients. Durant aquests últims anys, les tècniques d'Intel·ligència Artificial han esdevingut un dels camps de recerca més importants per a la diagnosi de falles. Aquesta tesi desenvolupa dues d'aquestes tècniques per tal de cobrir algunes de les mancances que presenten els diccionaris de falles. La primera proposta es basa en construir un sistema fuzzy com a eina per identificar. Els resultats obtinguts son força bons, ja que s'aconsegueix localitzar la falla en un elevat tant percent dels casos. Per altra banda, el percentatge d'encerts no és prou bo quan a més a més s'intenta esbrinar la desviació. Com que els diccionaris de falles es poden veure com una aproximació simplificada al Raonament Basat en Casos (CBR), la segona proposta fa una extensió dels diccionaris de falles cap a un sistema CBR. El propòsit no és donar una solució general del problema sinó contribuir amb una nova metodologia. Aquesta consisteix en millorar la diagnosis dels diccionaris de falles mitjançant l'addició i l'adaptació dels nous casos per tal d'esdevenir un sistema de Raonament Basat en Casos. Es descriu l'estructura de la base de casos així com les tasques d'extracció, de reutilització, de revisió i de retenció, fent èmfasi al procés d'aprenentatge. En el transcurs del text s'utilitzen diversos circuits per mostrar exemples dels mètodes de test descrits, però en particular el filtre biquadràtic és l'utilitzat per provar les metodologies plantejades, ja que és un dels benchmarks proposats en el context dels circuits analògics. Les falles considerades son paramètriques, permanents, independents i simples, encara que la metodologia pot ser fàcilment extrapolable per a la diagnosi de falles múltiples i catastròfiques. El mètode es centra en el test dels components passius, encara que també es podria extendre per a falles en els actius.