89 results for datasets


Abstract:

In this paper, we present an extension of the iterative closest point (ICP) algorithm that simultaneously registers multiple 3D scans. While ICP fails to utilize the multiview constraints available, our method exploits the information redundancy in a set of 3D scans by using the averaging of relative motions. This averaging method utilizes the Lie group structure of motions, resulting in a 3D registration method that is both efficient and accurate. In addition, we present two variants of our approach, i.e., a method that solves for multiview 3D registration while obeying causality and a transitive correspondence variant that efficiently solves the correspondence problem across multiple scans. We present experimental results to characterize our method and explain its behavior as well as those of some other multiview registration methods in the literature. We establish the superior accuracy of our method in comparison to these multiview methods with registration results on a set of well-known real datasets of 3D scans.

Abstract:

This paper details the prediction of blast-induced ground vibration using an artificial neural network (ANN). The data were generated from five different coal mines. Twenty-one parameters, covering rock mass parameters, explosive parameters and blast design parameters, were used to develop one comprehensive ANN model for five different coal-bearing formations. A total of 131 datasets were used to develop the ANN model and 44 datasets were used to test it. The developed ANN model was compared with the USBM model, and the capability of the comprehensive ANN model to predict blast-induced ground vibration was found to be superior.

Abstract:

Structural Support Vector Machines (SSVMs) and Conditional Random Fields (CRFs) are popular discriminative methods used for classifying structured and complex objects like parse trees, image segments and part-of-speech tags. The datasets involved are very large dimensional, and the models designed using typical training algorithms for SSVMs and CRFs are non-sparse. This non-sparse nature of models results in slow inference. Thus, there is a need to devise new algorithms for sparse SSVM and CRF classifier design. Use of elastic net and L1-regularizer has already been explored for solving primal CRF and SSVM problems, respectively, to design sparse classifiers. In this work, we focus on dual elastic net regularized SSVM and CRF. By exploiting the weakly coupled structure of these convex programming problems, we propose a new sequential alternating proximal (SAP) algorithm to solve these dual problems. This algorithm works by sequentially visiting each training set example and solving a simple subproblem restricted to a small subset of variables associated with that example. Numerical experiments on various benchmark sequence labeling datasets demonstrate that the proposed algorithm scales well. Further, the classifiers designed are sparser than those designed by solving the respective primal problems and demonstrate comparable generalization performance. Thus, the proposed SAP algorithm is a useful alternative for sparse SSVM and CRF classifier design.

Abstract:

The objective of this work is to develop downscaling methodologies to obtain a long time record of inundation extent at high spatial resolution based on the existing low-spatial-resolution results of the Global Inundation Extent from Multi-Satellites (GIEMS) dataset. In semiarid regions, high-spatial-resolution a priori information can be provided by visible and infrared observations from the Moderate Resolution Imaging Spectroradiometer (MODIS). The study concentrates on the Inner Niger Delta where MODIS-derived inundation extent has been estimated at a 500-m resolution. The space-time variability is first analyzed using a principal component analysis (PCA). This is particularly effective for understanding the inundation variability, interpolating in time, or filling in missing values. Two innovative methods are developed (linear regression and matrix inversion), both based on the PCA representation. These GIEMS downscaling techniques have been calibrated using the 500-m MODIS data. The downscaled fields show the expected space-time behaviors from MODIS. A 20-yr dataset of the inundation extent at 500 m is derived from this analysis for the Inner Niger Delta. The methods are very general and may be applied to many basins and to variables other than inundation, provided enough a priori high-spatial-resolution information is available. The derived high-spatial-resolution dataset will be used in the framework of the Surface Water Ocean Topography (SWOT) mission to develop and test the instrument simulator as well as to select the calibration validation sites (with high space-time inundation variability). In addition, once SWOT observations are available, the downscaling methodology will be calibrated on them in order to downscale the GIEMS datasets and to extend the SWOT benefits back in time to 1993.

Abstract:

The maximum entropy approach to classification is very well studied in applied statistics and machine learning, and almost all the methods that exist in the literature are discriminative in nature. In this paper, we introduce a maximum entropy classification method with feature selection for high-dimensional data such as text datasets that is generative in nature. To tackle the curse of dimensionality of large datasets, we employ the conditional independence assumption (Naive Bayes) and perform feature selection simultaneously by enforcing a 'maximum discrimination' between estimated class conditional densities. For two-class problems, the proposed method uses Jeffreys (J) divergence to discriminate the class conditional densities. To extend our method to the multi-class case, we propose a completely new approach by considering a multi-distribution divergence: we replace Jeffreys divergence by Jensen-Shannon (JS) divergence to discriminate conditional densities of multiple classes. In order to reduce computational complexity, we employ a modified Jensen-Shannon divergence (JS(GM)), based on the AM-GM inequality. We show that the resulting divergence is a natural generalization of Jeffreys divergence to the case of multiple distributions. As for theoretical justification, we show that when one intends to select the best features in a generative maximum entropy approach, maximum discrimination using J-divergence emerges naturally in binary classification. The performance of the proposed algorithms and a comparative study have been demonstrated on high-dimensional text and gene expression datasets, showing that our methods scale up very well with high-dimensional datasets.
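
For readers unfamiliar with the two divergences named in this abstract, the following minimal sketch computes their standard textbook forms for discrete distributions. It is an illustration only: the function names are hypothetical, the Jensen-Shannon version uses the usual arithmetic-mean mixture, and the modified JS(GM) variant proposed in the paper is not reproduced here.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q), natural log; q must be > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jeffreys_divergence(p, q):
    """Jeffreys (J) divergence: the symmetrised KL divergence between two distributions."""
    return kl_divergence(p, q) + kl_divergence(q, p)

def jensen_shannon_divergence(dists, weights=None):
    """Standard JS divergence of several distributions: weighted KL to their mixture."""
    n = len(dists)
    weights = weights or [1.0 / n] * n
    size = len(dists[0])
    # Arithmetic-mean mixture of all distributions (assumption: standard JS, not JS(GM)).
    mixture = [sum(w * d[i] for w, d in zip(weights, dists)) for i in range(size)]
    return sum(w * kl_divergence(d, mixture) for w, d in zip(weights, dists))
```

Unlike Jeffreys divergence, the Jensen-Shannon form accepts any number of distributions, which is what makes it a candidate for the multi-class case.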

Abstract:

Elastic Net Regularizers have shown much promise in designing sparse classifiers for linear classification. In this work, we propose an alternating optimization approach to solve the dual problems of elastic net regularized linear classification Support Vector Machines (SVMs) and logistic regression (LR). One of the sub-problems turns out to be a simple projection. The other sub-problem can be solved using dual coordinate descent methods developed for non-sparse L2-regularized linear SVMs and LR, without altering their iteration complexity and convergence properties. Experiments on very large datasets indicate that the proposed dual coordinate descent - projection (DCD-P) methods are fast and achieve comparable generalization performance after the first pass through the data, with extremely sparse models.

Abstract:

The classification of time series data is an interesting problem in the field of data mining. Although several algorithms have been proposed for time series classification, we have developed an innovative algorithm which is computationally fast and, in several cases, accurate when compared with the 1NN classifier. In our method, we calculate the fuzzy membership of each test pattern in each class. We have experimented with six benchmark datasets and compared our method with the 1NN classifier.

Abstract:

This paper discusses a novel high-speed approach for human action recognition in the H.264/AVC compressed domain. The proposed algorithm utilizes cues from quantization parameters and motion vectors extracted from the compressed video sequence for feature extraction and further classification using Support Vector Machines (SVM). The ultimate goal of our work is to present a much faster algorithm than pixel-domain counterparts, with comparable accuracy, utilizing only the sparse information from compressed video. Partial decoding rules out the complexity of full decoding and minimizes computational load and memory usage, which can result in reduced hardware utilization and fast recognition results. The proposed approach can handle illumination changes, scale, and appearance variations, and is robust in outdoor as well as indoor testing scenarios. We have tested our method on two benchmark action datasets and achieved more than 85% accuracy. The proposed algorithm classifies actions at a speed (>2000 fps) approximately 100 times faster than existing state-of-the-art pixel-domain algorithms.

Abstract:

The tonic is a fundamental concept in Indian art music. It is the base pitch, which an artist chooses in order to construct the melodies during a rāg(a) rendition, and all accompanying instruments are tuned using the tonic pitch. Consequently, tonic identification is a fundamental task for most computational analyses of Indian art music, such as intonation analysis, melodic motif analysis and rāg recognition. In this paper we review existing approaches for tonic identification in Indian art music and evaluate them on six diverse datasets for a thorough comparison and analysis. We study the performance of each method in different contexts such as the presence/absence of additional metadata, the quality of audio data, the duration of audio data, music tradition (Hindustani/Carnatic) and the gender of the singer (male/female). We show that the approaches that combine multi-pitch analysis with machine learning provide the best performance in most cases (90% identification accuracy on average), and are robust across the aforementioned contexts compared to the approaches based on expert knowledge. In addition, we show that the performance of the latter can be improved when additional metadata is available to further constrain the problem. Finally, we present a detailed error analysis of each method, providing further insights into the advantages and limitations of the methods.

Abstract:

In this paper, we present a novel algorithm for piecewise linear regression which can learn continuous as well as discontinuous piecewise linear functions. The main idea is to repeatedly partition the data and learn a linear model in each partition. The proposed algorithm is similar in spirit to the k-means clustering algorithm. We show that our algorithm can also be viewed as a special case of an EM algorithm for maximum likelihood estimation under a reasonable probability model. We empirically demonstrate the effectiveness of our approach by comparing its performance with that of state-of-the-art algorithms on various datasets. (C) 2014 Elsevier Inc. All rights reserved.
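
The partition-then-fit idea described in this abstract can be illustrated with a small sketch. This is not the authors' algorithm: it is a generic k-means-style loop for one-dimensional inputs, and the reassignment rule (each point moves to the line with the smallest squared residual), the function names, and the iteration cap are all assumptions made for illustration.

```python
def fit_line(points):
    """Ordinary least-squares fit y = a*x + b for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    if var_x == 0:
        return 0.0, mean_y  # degenerate partition: fall back to a constant model
    a = sum((x - mean_x) * (y - mean_y) for x, y in points) / var_x
    return a, mean_y - a * mean_x

def best_model(point, models):
    """Index of the line with the smallest squared residual at this point."""
    x, y = point
    return min(range(len(models)),
               key=lambda j: (y - (models[j][0] * x + models[j][1])) ** 2)

def piecewise_fit(points, labels, n_iter=20):
    """Alternate between fitting one line per partition and reassigning points."""
    k = max(labels) + 1
    models = []
    for _ in range(n_iter):
        models = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            models.append(fit_line(members) if members else (0.0, 0.0))
        new_labels = [best_model(p, models) for p in points]
        if new_labels == labels:  # partition is stable, so stop
            break
        labels = new_labels
    return models, labels
```

As with k-means, the result depends on the initial partition passed in `labels`, so in practice several initialisations would be tried.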

Abstract:

Designing a robust algorithm for visual object tracking has been a challenging task for many years. There are trackers in the literature that are reasonably accurate for many tracking scenarios, but most of them are computationally expensive. This narrows down their applicability, as many tracking applications demand real-time response. In this paper, we present a tracker based on random ferns. Tracking is posed as a classification problem and classification is done using ferns. We used ferns as they rely on binary features and are extremely fast at both training and classification compared to other classification algorithms. Our experiments show that the proposed tracker performs well on some of the most challenging tracking datasets and executes much faster than one of the state-of-the-art trackers, without much difference in tracking accuracy.

Abstract:

The structural annotation of proteins with no detectable homologs of known 3D structure identified using sequence-search methods is a major challenge today. We propose an original method that computes the conditional probabilities for the amino-acid sequence of a protein to fit to known protein 3D structures using a structural alphabet, known as Protein Blocks (PBs). PBs constitute a library of 16 local structural prototypes that approximate every part of protein backbone structures. It is used to encode 3D protein structures into 1D PB sequences and to capture sequence to structure relationships. Our method relies on amino acid occurrence matrices, one for each PB, to score global and local threading of query amino acid sequences to protein folds encoded into PB sequences. It does not use any information from residue contacts or sequence-search methods or explicit incorporation of the hydrophobic effect. The performance of the method was assessed with independent test datasets derived from SCOP 1.75A. With a Z-score cutoff that achieved 95% specificity (i.e., less than 5% false positives), global and local threading showed sensitivity of 64.1% and 34.2%, respectively. We further tested its performance on 57 difficult CASP10 targets that had no known homologs in PDB: 38 compatible templates were identified by our approach and 66% of these hits yielded correctly predicted structures. This method scales up well and offers promising perspectives for structural annotations at the genomic level. It has been implemented in the form of a web-server that is freely available at http://www.bo-protscience.fr/forsa.
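
The relationship between a score cutoff, specificity and sensitivity mentioned in this abstract can be sketched as follows. This is a generic illustration, not the authors' evaluation code: the function names are hypothetical, and the scores are assumed to be plain numbers where higher means a better match.

```python
import math

def specificity_cutoff(negative_scores, specificity=0.95):
    """Cutoff such that at least `specificity` of the negatives score at or below it."""
    ordered = sorted(negative_scores)
    index = math.ceil(specificity * len(ordered)) - 1
    return ordered[index]

def sensitivity(positive_scores, cutoff):
    """Fraction of true positives scoring strictly above the cutoff."""
    return sum(1 for s in positive_scores if s > cutoff) / len(positive_scores)
```

Raising the required specificity pushes the cutoff up, which can only keep or lower the sensitivity; the two figures are therefore always reported together.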

Abstract:

The understanding of protein-protein interactions is indispensable in comprehending most of the biological processes in a cell. Small-scale experiments as well as large-scale high-throughput techniques over the past few decades have facilitated identification and analysis of protein-protein interactions, which form the basis of much of our knowledge on functional and regulatory aspects of proteins. However, such a rich catalog of interaction data should be used with caution when establishing protein-protein interactions in silico, as the high-throughput datasets are prone to false positives. Numerous computational means developed to pursue genome-wide studies on protein-protein interactions at times overlook the mechanistic and molecular details, thus questioning the reliability of predicted protein-protein interactions. We review the development, advantages, and shortcomings of varied approaches and demonstrate that by providing a structural viewpoint in terms of shape complementarity and interaction energies at protein-protein interfaces, coupled with information on expression and localization of proteins homologous to an interacting pair, it is possible to assess the credibility of predicted interactions in a biological context. With a focus on the human pathogen Mycobacterium tuberculosis H37Rv, we show that such scrupulous use of details at the molecular level can predict physicochemically viable protein-protein interactions across host and pathogen. Such predicted interactions have the potential to provide a molecular basis of probable mechanisms of pathogenesis and hence open up ways to explore their usefulness as targets in the light of drug discovery. (c) 2014 IUBMB Life, 66(11):759-774, 2014

Abstract:

The ability of Coupled General Circulation Models (CGCMs) participating in the Intergovernmental Panel on Climate Change's fourth assessment report (IPCC AR4) for the 20th century climate (20C3M scenario) to simulate the daily precipitation over the Indian region is explored. The skill is evaluated on a 2.5° × 2.5° grid square compared with the India Meteorological Department's (IMD) gridded dataset, and every GCM is ranked for each of these grids based on its skill score. Skill scores (SSs) are estimated from the probability density functions (PDFs) obtained from observed IMD datasets and GCM simulations. The methodology takes into account (high) extreme precipitation events simulated by GCMs. The results are analyzed and presented for three categories and six zones. The three categories are the monsoon season (JJASO - June to October), the non-monsoon season (JFMAMND - January to May, November, December) and the entire year ("Annual"). The six precipitation zones are peninsular, west central, northwest, northeast, central northeast India, and the hilly region. Sensitivity analysis was performed for three spatial scales (2.5° grid square, zones, and all of India) in the three categories. The models were ranked based on the SS. The category JFMAMND had a higher SS than the JJASO category. The northwest zone had higher SSs, whereas the peninsular and hilly regions had lower SSs. No single GCM can be identified as the best for all categories and zones. Some models consistently outperformed the model ensemble, and one model had particularly poor performance. Results show that most models underestimated the daily precipitation rates in the 0-1 mm/day range and overestimated them in the 1-15 mm/day range.
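
The abstract does not spell out how a skill score is computed from the two PDFs. One common PDF-overlap measure, assumed here purely for illustration, sums the bin-wise minimum of the observed and simulated relative-frequency histograms, so identical distributions score 1 and disjoint ones score 0. The function names and bin edges below are hypothetical.

```python
def binned_pdf(values, edges):
    """Relative frequency of values in each half-open bin [edges[b], edges[b+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for b in range(len(counts)):
            if edges[b] <= v < edges[b + 1]:
                counts[b] += 1
                break  # values outside all bins are simply ignored
    total = sum(counts)
    return [c / total for c in counts] if total else counts

def overlap_skill_score(observed, simulated, edges):
    """Sum of bin-wise minima of the two binned PDFs (assumed overlap-style score)."""
    pdf_obs = binned_pdf(observed, edges)
    pdf_sim = binned_pdf(simulated, edges)
    return sum(min(o, s) for o, s in zip(pdf_obs, pdf_sim))
```

For daily precipitation, the bin edges would typically be rain-rate thresholds in mm/day, and the score would be computed separately for each grid square and season.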

Abstract:

Aims. In this work we search for the signatures of low-dimensional chaos in the temporal behavior of the Kepler-field blazar W2R 1946+42. Methods. We use a publicly available, approximately 160 000-point-long and mostly equally spaced light curve of W2R 1946+42. We apply the correlation integral method to both real datasets and phase-randomized surrogates. Results. We are not able to confirm the presence of low-dimensional chaos in the light curve. This result, however, still leads to some important implications for blazar emission mechanisms, which are discussed.
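
As background for the correlation integral method named in this abstract, here is a minimal sketch of the quantity itself: the fraction of pairs of delay-embedded vectors that lie closer than a radius r. The embedding dimension, lag, choice of the maximum norm and the function names are assumptions for illustration; the generation of phase-randomized surrogates is not shown.

```python
def delay_embed(series, dim, lag=1):
    """Build delay vectors (x[i], x[i+lag], ..., x[i+(dim-1)*lag])."""
    n = len(series) - (dim - 1) * lag
    return [tuple(series[i + k * lag] for k in range(dim)) for i in range(n)]

def correlation_integral(vectors, r):
    """Fraction of distinct vector pairs whose max-norm distance is below r."""
    n = len(vectors)
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if max(abs(a - b) for a, b in zip(vectors[i], vectors[j])) < r:
                close += 1
    return 2.0 * close / (n * (n - 1))
```

In practice the slope of log C(r) against log r is examined for increasing embedding dimensions; a slope that saturates at a low value would suggest low-dimensional dynamics. The double loop is quadratic in the number of points, so a light curve of the length quoted above would need a faster neighbour search.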