16 resultados para Non-formal learning
em Indian Institute of Science - Bangalore - Índia
Resumo:
Learning to rank from relevance judgment is an active research area. Itemwise score regression, pairwise preference satisfaction, and listwise structured learning are the major techniques in use. Listwise structured learning has been applied recently to optimize important non-decomposable ranking criteria like AUC (area under ROC curve) and MAP(mean average precision). We propose new, almost-lineartime algorithms to optimize for two other criteria widely used to evaluate search systems: MRR (mean reciprocal rank) and NDCG (normalized discounted cumulative gain)in the max-margin structured learning framework. We also demonstrate that, for different ranking criteria, one may need to use different feature maps. Search applications should not be optimized in favor of a single criterion, because they need to cater to a variety of queries. E.g., MRR is best for navigational queries, while NDCG is best for informational queries. A key contribution of this paper is to fold multiple ranking loss functions into a multi-criteria max-margin optimization.The result is a single, robust ranking model that is close to the best accuracy of learners trained on individual criteria. In fact, experiments over the popular LETOR and TREC data sets show that, contrary to conventional wisdom, a test criterion is often not best served by training with the same individual criterion.
Resumo:
Modeling the performance behavior of parallel applications to predict the execution times of the applications for larger problem sizes and number of processors has been an active area of research for several years. The existing curve fitting strategies for performance modeling utilize data from experiments that are conducted under uniform loading conditions. Hence the accuracy of these models degrade when the load conditions on the machines and network change. In this paper, we analyze a curve fitting model that attempts to predict execution times for any load conditions that may exist on the systems during application execution. Based on the experiments conducted with the model for a parallel eigenvalue problem, we propose a multi-dimensional curve-fitting model based on rational polynomials for performance predictions of parallel applications in non-dedicated environments. We used the rational polynomial based model to predict execution times for 2 other parallel applications on systems with large load dynamics. In all the cases, the model gave good predictions of execution times with average percentage prediction errors of less than 20%
Resumo:
This paper describes the application of vector spaces over Galois fields, for obtaining a formal description of a picture in the form of a very compact, non-redundant, unique syntactic code. Two different methods of encoding are described. Both these methods consist in identifying the given picture as a matrix (called picture matrix) over a finite field. In the first method, the eigenvalues and eigenvectors of this matrix are obtained. The eigenvector expansion theorem is then used to reconstruct the original matrix. If several of the eigenvalues happen to be zero this scheme results in a considerable compression. In the second method, the picture matrix is reduced to a primitive diagonal form (Hermite canonical form) by elementary row and column transformations. These sequences of elementary transformations constitute a unique and unambiguous syntactic code-called Hermite code—for reconstructing the picture from the primitive diagonal matrix. A good compression of the picture results, if the rank of the matrix is considerably lower than its order. An important aspect of this code is that it preserves the neighbourhood relations in the picture and the primitive remains invariant under translation, rotation, reflection, enlargement and replication. It is also possible to derive the codes for these transformed pictures from the Hermite code of the original picture by simple algebraic manipulation. This code will find extensive applications in picture compression, storage, retrieval, transmission and in designing pattern recognition and artificial intelligence systems.
Resumo:
Screening and early identification of primary immunodeficiency disease (PID) genes is a major challenge for physicians. Many resources have catalogued molecular alterations in known PID genes along with their associated clinical and immunological phenotypes. However, these resources do not assist in identifying candidate PID genes. We have recently developed a platform designated Resource of Asian PDIs, which hosts information pertaining to molecular alterations, protein-protein interaction networks, mouse studies and microarray gene expression profiling of all known PID genes. Using this resource as a discovery tool, we describe the development of an algorithm for prediction of candidate PID genes. Using a support vector machine learning approach, we have predicted 1442 candidate PID genes using 69 binary features of 148 known PID genes and 3162 non-PID genes as a training data set. The power of this approach is illustrated by the fact that six of the predicted genes have recently been experimentally confirmed to be PID genes. The remaining genes in this predicted data set represent attractive candidates for testing in patients where the etiology cannot be ascribed to any of the known PID genes.
Resumo:
A new form of a multi-step transversal linearization (MTL) method is developed and numerically explored in this study for a numeric-analytical integration of non-linear dynamical systems under deterministic excitations. As with other transversal linearization methods, the present version also requires that the linearized solution manifold transversally intersects the non-linear solution manifold at a chosen set of points or cross-section in the state space. However, a major point of departure of the present method is that it has the flexibility of treating non-linear damping and stiffness terms of the original system as damping and stiffness terms in the transversally linearized system, even though these linearized terms become explicit functions of time. From this perspective, the present development is closely related to the popular practice of tangent-space linearization adopted in finite element (FE) based solutions of non-linear problems in structural dynamics. The only difference is that the MTL method would require construction of transversal system matrices in lieu of the tangent system matrices needed within an FE framework. The resulting time-varying linearized system matrix is then treated as a Lie element using Magnus’ characterization [W. Magnus, On the exponential solution of differential equations for a linear operator, Commun. Pure Appl. Math., VII (1954) 649–673] and the associated fundamental solution matrix (FSM) is obtained through repeated Lie-bracket operations (or nested commutators). An advantage of this approach is that the underlying exponential transformation could preserve certain intrinsic structural properties of the solution of the non-linear problem. Yet another advantage of the transversal linearization lies in the non-unique representation of the linearized vector field – an aspect that has been specifically exploited in this study to enhance the spectral stability of the proposed family of methods and thus contain the temporal propagation of local errors. A simple analysis of the formal orders of accuracy is provided within a finite dimensional framework. Only a limited numerical exploration of the method is presently provided for a couple of popularly known non-linear oscillators, viz. a hardening Duffing oscillator, which has a non-linear stiffness term, and the van der Pol oscillator, which is self-excited and has a non-linear damping term.
Resumo:
We present a case study of formal verification of full-wave rectifier for analog and mixed signal designs. We have used the Checkmate tool from CMU [1], which is a public domain formal verification tool for hybrid systems. Due to the restriction imposed by Checkmate it necessitates to make the changes in the Checkmate implementation to implement the complex and non-linear system. Full-wave rectifier has been implemented by using the Checkmate custom blocks and the Simulink blocks from MATLAB from Math works. After establishing the required changes in the Checkmate implementation we are able to efficiently verify, the safety properties of the full-wave rectifier.
Resumo:
Two optimal non-linear reinforcement schemes—the Reward-Inaction and the Penalty-Inaction—for the two-state automaton functioning in a stationary random environment are considered. Very simple conditions of symmetry of the non-linear function figuring in the reinforcement scheme are shown to be necessary and sufficient for optimality. General expressions for the variance and rate of learning are derived. These schemes are compared with the already existing optimal linear schemes in the light of average variance and average rate of learning.
Resumo:
The enantiodivergent formal syntheses of both enantiomers of aspercyclide C is accomplished. Starting from L-(+)-tartaric acid, the key protected allylic alcohol, (3R,4R)-4-(methoxy-methoxy) non-1-en-3-ol is prepared, and is then elaborated into both enantiomers of 3-(4-methoxybenzyl)oxy]non-1-en-4-ol via Mitsunobu inversion. Esterification with a known biaryl acid, followed by ring-closing metathesis and deprotection completes the syntheses.
Resumo:
Channel assignment in multi-channel multi-radio wireless networks poses a significant challenge due to scarcity of number of channels available in the wireless spectrum. Further, additional care has to be taken to consider the interference characteristics of the nodes in the network especially when nodes are in different collision domains. This work views the problem of channel assignment in multi-channel multi-radio networks with multiple collision domains as a non-cooperative game where the objective of the players is to maximize their individual utility by minimizing its interference. Necessary and sufficient conditions are derived for the channel assignment to be a Nash Equilibrium (NE) and efficiency of the NE is analyzed by deriving the lower bound of the price of anarchy of this game. A new fairness measure in multiple collision domain context is proposed and necessary and sufficient conditions for NE outcomes to be fair are derived. The equilibrium conditions are then applied to solve the channel assignment problem by proposing three algorithms, based on perfect/imperfect information, which rely on explicit communication between the players for arriving at an NE. A no-regret learning algorithm known as Freund and Schapire Informed algorithm, which has an additional advantage of low overhead in terms of information exchange, is proposed and its convergence to the stabilizing outcomes is studied. New performance metrics are proposed and extensive simulations are done using Matlab to obtain a thorough understanding of the performance of these algorithms on various topologies with respect to these metrics. It was observed that the algorithms proposed were able to achieve good convergence to NE resulting in efficient channel assignment strategies.
Resumo:
The impulse response of a typical wireless multipath channel can be modeled as a tapped delay line filter whose non-zero components are sparse relative to the channel delay spread. In this paper, a novel method of estimating such sparse multipath fading channels for OFDM systems is explored. In particular, Sparse Bayesian Learning (SBL) techniques are applied to jointly estimate the sparse channel and its second order statistics, and a new Bayesian Cramer-Rao bound is derived for the SBL algorithm. Further, in the context of OFDM channel estimation, an enhancement to the SBL algorithm is proposed, which uses an Expectation Maximization (EM) framework to jointly estimate the sparse channel, unknown data symbols and the second order statistics of the channel. The EM-SBL algorithm is able to recover the support as well as the channel taps more efficiently, and/or using fewer pilot symbols, than the SBL algorithm. To further improve the performance of the EM-SBL, a threshold-based pruning of the estimated second order statistics that are input to the algorithm is proposed, and its mean square error and symbol error rate performance is illustrated through Monte-Carlo simulations. Thus, the algorithms proposed in this paper are capable of obtaining efficient sparse channel estimates even in the presence of a small number of pilots.
Resumo:
This paper(1) presents novel algorithms and applications for a particular class of mixed-norm regularization based Multiple Kernel Learning (MKL) formulations. The formulations assume that the given kernels are grouped and employ l(1) norm regularization for promoting sparsity within RKHS norms of each group and l(s), s >= 2 norm regularization for promoting non-sparse combinations across groups. Various sparsity levels in combining the kernels can be achieved by varying the grouping of kernels-hence we name the formulations as Variable Sparsity Kernel Learning (VSKL) formulations. While previous attempts have a non-convex formulation, here we present a convex formulation which admits efficient Mirror-Descent (MD) based solving techniques. The proposed MD based algorithm optimizes over product of simplices and has a computational complexity of O (m(2)n(tot) log n(max)/epsilon(2)) where m is no. training data points, n(max), n(tot) are the maximum no. kernels in any group, total no. kernels respectively and epsilon is the error in approximating the objective. A detailed proof of convergence of the algorithm is also presented. Experimental results show that the VSKL formulations are well-suited for multi-modal learning tasks like object categorization. Results also show that the MD based algorithm outperforms state-of-the-art MKL solvers in terms of computational efficiency.
Resumo:
An extension to a formal verification approach of hybrid systems is proposed to verify analog and mixed signal (AMS) designs. AMS designs can be formally modeled as hybrid systems and therefore lend themselves to the formal analysis and verification techniques applied to hybrid systems. The proposed approach employs simulation traces obtained from an actual design implementation of AMS circuit blocks (for example, in the form of SPICE netlists) to carry out formal analysis and verification. This enables the same platform used for formally validating an abstract model of an AMS design, to be also used for validating its different refinements and design implementation; thereby, providing a simple route to formal verification at different levels of implementation. The feasibility of the proposed approach is demonstrated with a case study based on a tunnel diode oscillator. Since the device characteristic of a tunnel diode is highly non-linear with a negative resistance region, dynamic behavior of circuits in which it is employed as an element is difficult to model, analyze and verify within a general hybrid system formal verification tool. In the case study presented the formal model and the proposed computational techniques have been incorporated into CheckMate, a formal verification tool based on MATLAB and Simulink-Stateflow Framework from MathWorks.
Resumo:
In this paper, we present a methodology for identifying best features from a large feature space. In high dimensional feature space nearest neighbor search is meaningless. In this feature space we see quality and performance issue with nearest neighbor search. Many data mining algorithms use nearest neighbor search. So instead of doing nearest neighbor search using all the features we need to select relevant features. We propose feature selection using Non-negative Matrix Factorization(NMF) and its application to nearest neighbor search. Recent clustering algorithm based on Locally Consistent Concept Factorization(LCCF) shows better quality of document clustering by using local geometrical and discriminating structure of the data. By using our feature selection method we have shown further improvement of performance in the clustering.
Resumo:
Learning from Positive and Unlabelled examples (LPU) has emerged as an important problem in data mining and information retrieval applications. Existing techniques are not ideally suited for real world scenarios where the datasets are linearly inseparable, as they either build linear classifiers or the non-linear classifiers fail to achieve the desired performance. In this work, we propose to extend maximum margin clustering ideas and present an iterative procedure to design a non-linear classifier for LPU. In particular, we build a least squares support vector classifier, suitable for handling this problem due to symmetry of its loss function. Further, we present techniques for appropriately initializing the labels of unlabelled examples and for enforcing the ratio of positive to negative examples while obtaining these labels. Experiments on real-world datasets demonstrate that the non-linear classifier designed using the proposed approach gives significantly better generalization performance than the existing relevant approaches for LPU.
Resumo:
We consider the problem of Probably Ap-proximate Correct (PAC) learning of a bi-nary classifier from noisy labeled exam-ples acquired from multiple annotators(each characterized by a respective clas-sification noise rate). First, we consider the complete information scenario, where the learner knows the noise rates of all the annotators. For this scenario, we derive sample complexity bound for the Mini-mum Disagreement Algorithm (MDA) on the number of labeled examples to be ob-tained from each annotator. Next, we consider the incomplete information sce-nario, where each annotator is strategic and holds the respective noise rate as a private information. For this scenario, we design a cost optimal procurement auc-tion mechanism along the lines of Myer-son’s optimal auction design framework in a non-trivial manner. This mechanism satisfies incentive compatibility property,thereby facilitating the learner to elicit true noise rates of all the annotators.