101 results for Bag-of-Features
at Indian Institute of Science - Bangalore - India
Abstract:
Feature selection is an important first step in regional hydrologic studies (RHYS). Over the past few decades, advances in data collection facilities have resulted in the development of data archives on a variety of hydro-meteorological variables that may be used as features in RHYS. Currently there are no established procedures for selecting features from such archives. Therefore, hydrologists often use subjective methods to arrive at a set of features, which may lead to misleading results. To alleviate this problem, a probabilistic clustering method for regionalization is presented to determine appropriate features from the available dataset. The effectiveness of the method is demonstrated by application to regionalization of watersheds in the conterminous United States for low flow frequency analysis. Plausible homogeneous regions formed by using the proposed clustering method are compared with those from conventional methods of regionalization using L-moment based homogeneity tests. Results show that the proposed methodology is promising for RHYS.
Abstract:
In this paper, we propose a simple and effective approach to classify H.264 compressed videos by capturing orientation information from the motion vectors. Our major contribution involves computing the Histogram of Oriented Motion Vectors (HOMV) for partially overlapping hierarchical space-time cubes. HOMV is found to be very effective in defining the motion characteristics of these cubes. We then use a Bag of Features (BoF) approach to define the video as a histogram of HOMV keywords, obtained using k-means clustering. The video feature thus computed is found to be very effective in classifying videos. We demonstrate our results with experiments on two large publicly available video databases.
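The two building blocks named in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names `homv` and `bof_histogram`, the number of orientation bins, and the magnitude weighting are all hypothetical choices, and the codebook is assumed to have been learned beforehand (for example with k-means over training descriptors).

```python
import numpy as np

def homv(motion_vectors, n_bins=8):
    """Orientation histogram of the motion vectors in one space-time cube.

    motion_vectors: array-like of shape (n, 2) holding (dx, dy) per block.
    Assumptions (not from the abstract): n_bins bins over [0, 2*pi),
    each vote weighted by the vector magnitude, histogram L1-normalized.
    """
    mv = np.asarray(motion_vectors, dtype=float)
    angles = np.arctan2(mv[:, 1], mv[:, 0]) % (2 * np.pi)
    mags = np.hypot(mv[:, 0], mv[:, 1])
    # Clamp guards against an angle that rounds up to exactly 2*pi.
    bins = np.minimum((angles / (2 * np.pi) * n_bins).astype(int), n_bins - 1)
    hist = np.bincount(bins, weights=mags, minlength=n_bins)
    total = hist.sum()
    return hist / total if total > 0 else hist

def bof_histogram(descriptors, codebook):
    """Describe a video as a normalized histogram of codeword ("keyword") counts.

    descriptors: (n, d) HOMV descriptors, one per space-time cube.
    codebook:    (k, d) cluster centres, assumed to come from k-means.
    """
    d = np.asarray(descriptors, dtype=float)
    c = np.asarray(codebook, dtype=float)
    # Squared Euclidean distance from every descriptor to every codeword.
    dists = ((d[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(c)).astype(float)
    return hist / hist.sum()
```

The resulting fixed-length histogram can then be passed to any standard classifier; the abstract does not say which classifier the authors used.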
Abstract:
We propose a robust method for mosaicing of document images using features derived from connected components. Each connected component is described using the Angular Radial Transform (ART). To ensure geometric consistency during feature matching, the ART coefficients of a connected component are augmented with those of its two nearest neighbors. The proposed method addresses two critical issues often encountered in correspondence matching: (i) the stability of features and (ii) robustness against false matches due to multiple instances of characters in a document image. The use of connected components guarantees stable localization across images. The augmented features ensure successful correspondence matching even in the presence of multiple similar regions within the page. We illustrate the effectiveness of the proposed method on camera-captured document images exhibiting large variations in viewpoint, illumination and scale.
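The neighbor-augmentation step can be illustrated with a short sketch. It assumes that each connected component already has a descriptor vector (standing in for its ART coefficients, whose computation is not shown) and that "nearest" means Euclidean distance between component centroids; both points, and the name `augment_with_neighbors`, are assumptions rather than details taken from the abstract.

```python
import numpy as np

def augment_with_neighbors(centroids, descriptors, k=2):
    """Append to each component's descriptor those of its k nearest neighbors.

    centroids:   (n, 2) component centroid coordinates (assumed distance basis).
    descriptors: (n, d) per-component descriptors, e.g. ART coefficients.
    Returns an (n, (k + 1) * d) array: own descriptor, then nearest, then next.
    """
    c = np.asarray(centroids, dtype=float)
    f = np.asarray(descriptors, dtype=float)
    # Pairwise centroid distances; a component is never its own neighbor.
    dist = np.linalg.norm(c[:, None, :] - c[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    nearest = np.argsort(dist, axis=1)[:, :k]
    return np.concatenate([f] + [f[nearest[:, j]] for j in range(k)], axis=1)
```

Because the augmented vector encodes the local arrangement of components, two visually identical characters in different contexts receive different descriptors, which is the property the abstract relies on to reject false matches.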
Abstract:
The success of automatic speaker recognition in laboratory environments suggests applications in forensic science for establishing the identity of individuals on the basis of features extracted from speech. A theoretical model for such a verification scheme for continuous normally distributed features is developed. The three cases of using (a) a single feature, (b) multiple independent measurements of a single feature, and (c) multiple independent features are explored. The number of independent features needed for a reliable personal identification is computed based on the theoretical model and an exploratory study of some speech features.
Abstract:
The concept of feature selection in a nonparametric unsupervised learning environment is practically undeveloped because no true measure for the effectiveness of a feature exists in such an environment. The lack of a feature selection phase preceding the clustering process seriously affects the reliability of such learning. New concepts such as significant features, level of significance of features, and immediate neighborhood are introduced, which implicitly meet the need for feature selection in the context of clustering techniques.
Abstract:
In this paper we propose a hypothetical scheme for recognizing alphanumerics. The scheme is based on the known physiological structure of the visual cortex and the concept of a short line extractor neuron (SLEN). We assume four basic types of such units for extracting vertical, horizontal, right-inclined and left-inclined straight line segments. The patterns reconstructed from the scheme show perfect agreement with the test patterns. The model indicates that the recognition of letters T and H requires extraction of the largest number of features.
Abstract:
It is important to identify the ``correct'' number of topics in mechanisms like Latent Dirichlet Allocation (LDA), as it determines the quality of the features that are presented to classifiers like SVM. In this work we propose a measure to identify the correct number of topics and offer empirical evidence in its favor in terms of classification accuracy and the number of topics that are naturally present in the corpus. We show the merit of the measure by applying it to real-world as well as synthetic data sets (both text and images). In proposing this measure, we view LDA as a matrix factorization mechanism, wherein a given corpus C is split into two matrix factors M1 and M2 as given by $C_{d \times w} = M1_{d \times t} \times Q_{t \times w}$, where d is the number of documents present in the corpus and w is the size of the vocabulary. The quality of the split depends on ``t'', the number of topics chosen. The measure is computed in terms of the symmetric KL-divergence of salient distributions that are derived from these matrix factors. We observe that the divergence values are higher for a non-optimal number of topics; this is shown by a `dip' at the right value for `t'.
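The selection rule described here (pick the number of topics at which the divergence dips) can be sketched as below. The abstract does not say how the two "salient distributions" are derived from the matrix factors, so this sketch takes them as given inputs; the function names, the `eps` smoothing constant, and the example scores are hypothetical.

```python
import numpy as np

def symmetric_kl(p, q, eps=1e-12):
    """Symmetric KL divergence KL(p||q) + KL(q||p) of two discrete distributions.

    eps is an assumed smoothing constant that avoids log(0); inputs are
    renormalized to sum to one after smoothing.
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

def pick_num_topics(divergence_by_t):
    """Return the candidate t with the lowest divergence (the 'dip')."""
    return min(divergence_by_t, key=divergence_by_t.get)

# Usage with made-up scores for three candidate topic counts.
scores = {5: 0.8, 10: 0.2, 20: 0.6}
best_t = pick_num_topics(scores)
```

In practice each score would come from fitting LDA with that value of t, deriving the two distributions from the resulting factors, and calling `symmetric_kl` on them.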
Abstract:
A 4 Å electron-density map of Pf1 filamentous bacterial virus has been calculated from x-ray fiber diffraction data by using the maximum-entropy method. This method produces a map that is free of features due to noise in the data and enables incomplete isomorphous-derivative phase information to be supplemented by information about the nature of the solution. The map shows gently curved (banana-shaped) rods of density about 70 Å long, oriented roughly parallel to the virion axis but slewing by about 1/6th turn while running from a radius of 28 Å to one of 13 Å. Within these rods, there is a helical periodicity with a pitch of 5 to 6 Å. We interpret these rods to be the helical subunits of the virion. The position of strongly diffracted intensity on the x-ray fiber pattern shows that the basic helix of the virion is right handed and that neighboring nearly parallel protein helices cross one another in an unusual negative sense.
Abstract:
A new feature-based technique is introduced to solve the nonlinear forward problem (FP) of electrical capacitance tomography, with the target application of monitoring the metal fill profile in the lost foam casting process. The new technique is based on combining a linear solution to the FP and a correction factor (CF). The CF is estimated using an artificial neural network (ANN) trained using key features extracted from the metal distribution. The CF adjusts the linear solution of the FP to account for the nonlinear effects caused by the shielding effects of the metal. This approach shows promising results and avoids the curse of dimensionality by using features, rather than the actual metal distribution, to train the ANN. The ANN is trained using nine features extracted from the metal distributions as input. The expected sensor readings are generated using ANSYS software. The performance of the ANN for the training and testing data was satisfactory, with an average root-mean-square error equal to 2.2%.
Abstract:
The present work demonstrates a novel strategy to synthesize orthogonally bio-engineered magnetonanohybrids (MNPs) through the design of versatile, biocompatible linkers whose structure includes: (i) a robust anchor to bind with metal-oxide surfaces; (ii) tailored surface groups to act as spacers and (iii) a general method to implement orthogonal functionalizations of the substrate via ``click chemistry''. Ligands that possess the synthetic generality of features (i)-(iii) are categorized as ``universal ligands''. Herein, we report the synthesis of a novel, azido-terminated poly(ethylene glycol) (PEG) silane that can easily self-assemble on MNPs through hetero-condensation between surface hydroxyl groups and the silane end of the ligand, and simultaneously provide multiple clickable sites for high density, chemoselective bio-conjugation. To establish the universal-ligand-strategy, we clicked alkyl-functionalized folate onto the surface of PEGylated MNPs. By further integrating a near-infrared fluorescent (NIRF) marker (Alexa-Fluor 647) with MNPs, we demonstrated their folate-receptor mediated internalization inside cancer cells and subsequent translocation into lysosomes and mitochondria. Ex vivo NIRF imaging established that the azido-PEG-silane developed in course of the study can effectively reduce the sequestration of MNPs by macrophage organs (viz. liver and spleen). These folate-PEG-MNPs were not only stealth and noncytotoxic but their dual optical and magnetic properties aided in tracking their whereabouts through combined magnetic resonance and optical imaging. Together, these results provided a strong motivation for the future use of the ``universal ligand'' strategy towards development of ``smart'' nanohybrids for theragnostic applications.
Abstract:
When a document corpus is very large, we often need to reduce the number of features. However, it is not possible to apply conventional Non-negative Matrix Factorization (NMF) to a billion-by-million matrix, as the matrix may not fit in memory. Here we present a novel Online NMF algorithm. Using Online NMF, we reduce the original high-dimensional space to a low-dimensional space. We then cluster all the documents in the reduced dimension using the k-means algorithm. We show experimentally that good performance can be achieved by processing small subsets of documents. The proposed method outperforms existing algorithms.
Abstract:
SARAS is a correlation spectrometer purpose-designed for precision measurements of the cosmic radio background and faint features in the sky spectrum at long wavelengths that arise from redshifted 21-cm from gas in the reionization epoch. SARAS operates in the octave band 87.5-175 MHz. We present herein the system design, arguing for a complex correlation spectrometer concept. The SARAS design concept provides a differential measurement between the antenna temperature and that of an internal reference termination, with measurements in switched system states allowing for cancellation of additive contaminants from a large part of the signal flow path including the digital spectrometer. A switched noise injection scheme provides absolute spectral calibration. Additionally, we argue for an electrically small frequency-independent antenna over an absorber ground. Various critical design features that aid in avoidance of systematics and in providing calibration products for the parametrization of other unavoidable systematics are described and the rationale discussed. The signal flow and processing are analyzed and the response to noise temperatures of the antenna, reference termination and amplifiers is computed. Multi-path propagation arising from internal reflections is considered in the analysis, which includes a harmonic series of internal reflections. We opine that the SARAS design concept is advantageous for precision measurement of the absolute cosmic radio background spectrum; therefore, the design features and analysis methods presented here are expected to serve as a basis for implementations tailored to measurements of a multiplicity of features in the background sky at long wavelengths, which may arise from events in the dark ages and subsequent reionization era.
Abstract:
We propose a novel space-time descriptor for region-based tracking which is very concise and efficient. The regions represented by covariance matrices within a temporal fragment are used to estimate this space-time descriptor, which we call the Eigenprofiles (EP). The EP so obtained is used in estimating the covariance matrix of features over spatio-temporal fragments. The second-order statistics of spatio-temporal fragments form our target model, which can be adapted for variations across the video. The model, being concise, also allows the use of multiple spatially overlapping fragments to represent the target. We demonstrate good tracking results on very challenging datasets shot under insufficient illumination conditions.
Abstract:
α-helices are amongst the most common secondary structural elements seen in membrane proteins and are packed in the form of helix bundles. These α-helices encounter varying external environments (hydrophobic, hydrophilic) that may influence the sequence preferences at their N- and C-termini. The role of the external environment in stabilization of the helix termini in membrane proteins is still unknown. Here we analyze α-helices in a high-resolution dataset of integral α-helical membrane proteins and establish that their sequence and conformational preferences differ from those in globular proteins. We specifically examine these preferences at the N- and C-termini in helices initiating/terminating inside the membrane core as well as in linkers connecting these transmembrane helices. We find that the sequence preferences and structural motifs at capping (Ncap and Ccap) and near-helical (N' and C') positions are influenced by a combination of features including the membrane environment and the innate helix initiation and termination property of residues forming structural motifs. We also find that a large number of helix termini which do not form any particular capping motif are stabilized by formation of hydrogen bonds and hydrophobic interactions contributed from the neighboring helices in the membrane protein. We further validate the sequence preferences obtained from our analysis with data from an ultradeep sequencing study that identifies evolutionarily conserved amino acids in the rat neurotensin receptor. The results from our analysis provide insights for the secondary structure prediction, modeling and design of membrane proteins. Proteins 2014; 82:3420-3436. (c) 2014 Wiley Periodicals, Inc.