919 results for Rank regression
Abstract:
Gaussian Processes (GPs) are promising Bayesian methods for classification and regression problems. They have also been used for semi-supervised learning tasks. In this paper, we propose a new algorithm for solving semi-supervised binary classification problems using sparse GP regression (GPR) models. It is closely related to semi-supervised learning based on support vector regression (SVR) and maximum margin clustering. The proposed algorithm is simple and easy to implement. Unlike the SVR-based algorithm, it gives a sparse solution directly. Also, the hyperparameters are estimated easily without resorting to an expensive cross-validation technique. The use of a sparse GPR model helps make the proposed algorithm scalable. Preliminary results on synthetic and real-world data sets demonstrate the efficacy of the new algorithm.
Abstract:
This paper presents an optimization algorithm for an ammonia reactor based on a regression model relating the yield to several parameters, control inputs and disturbances. This model is derived from the data generated by hybrid simulation of the steady-state equations describing the reactor behaviour. The simplicity of the optimization program, along with its ability to take into account constraints on flow variables, makes it well suited to supervisory control applications.
Abstract:
Background: In higher primates, although LH/CG play a critical role in the control of corpus luteum (CL) function, the direct effects of progesterone (P4) in the maintenance of CL structure and function are unclear. Several experiments were conducted in the bonnet monkey to examine direct effects of P4 on gene expression changes in the CL, during induced luteolysis and the late luteal phase of natural cycles. Methods: To identify differentially expressed genes encoding PR, PR binding factors, cofactors and PR downstream signaling target genes, the genome-wide analysis data generated in CL of monkeys after LH/P4 depletion and LH replacement were mined and validated by real-time RT-PCR analysis. Initially, expression of these P4-related genes was determined in CL during different stages of the luteal phase. The recently reported model system of induced luteolysis, yet capable of responding to tropic support, afforded an ideal situation to examine direct effects of P4 on structure and function of CL. For this purpose, P4 was infused via ALZET pumps into monkeys 24 h after LH/P4 depletion to maintain mid-luteal-phase circulating P4 concentration (P4 replacement). In another experiment, exogenous P4 was supplemented during the late luteal phase to mimic early pregnancy. Results: Based on the published microarray data, 45 genes were identified as commonly regulated by LH and P4. From these, 19 genes belonging to PR signaling were selected to determine their expression in LH/P4 depletion and P4 replacement experiments. Analysis of these 19 genes revealed 8 to be directly responsive to P4, whereas the others were regulated by both LH and P4. Progesterone supplementation for 24 h during the late luteal phase also showed changes in expression of 17 out of the 19 genes examined. Conclusion: These results taken together suggest that P4 regulates, directly or indirectly, expression of a number of genes involved in CL structure and function.
Abstract:
Extensive molecular dynamics simulations have been carried out to calculate the orientational correlation functions $C_l(t)$, $C_l(t) = \frac{4\pi}{2l+1} \sum_{m=-l}^{l} \langle Y_{lm}^{*}(\Omega(0))\, Y_{lm}(\Omega(t)) \rangle$ (where $Y_{lm}(\Omega)$ are the spherical harmonics), of point dipoles in a cubic lattice. The decay of $C_1(t)$ is found to be strikingly different from that of the higher-$l$ correlation functions: the latter do not exhibit diffusive dynamics even at long times. Both the cumulant expansion expression of Lynden-Bell and the conventional memory function equation provide a very good description of $C_1(t)$ at short times but fail to reproduce the observed slow, long-time decay of $C_1(t)$.
Abstract:
An important yet unsolved problem in the field of orientational relaxation in dipolar liquids is the dependence of the correlation functions $C_l(t)$, $C_l(t) = \frac{4\pi}{2l+1} \sum_{m=-l}^{l} \langle Y_{lm}^{*}(\Omega(0))\, Y_{lm}(\Omega(t)) \rangle$, on the rank $l$ (where $Y_{lm}(\Omega)$ are the usual spherical harmonics). The existing theories on this effect differ in their predictions. To investigate this, we have carried out extensive computer simulations of a Brownian dipolar lattice. The dielectric friction was found to decrease rapidly with increasing $l$, in qualitative agreement with the predictions of Hubbard-Wolynes. However, the observed effect is much stronger than the predictions of the existing theories.
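For illustration only: by the addition theorem for spherical harmonics, the sum over $m$ in the definition above reduces to the Legendre polynomial $P_l$ of the cosine of the angle between the orientations at times 0 and $t$, so $C_l(t) = \langle P_l(\mathbf{u}(0) \cdot \mathbf{u}(t)) \rangle$. The following is a minimal Python sketch of that estimate. The function names and the trajectory layout (a list of frames, each a list of unit vectors) are assumptions made for this example and are not taken from the paper.

```python
def legendre(l, x):
    """Legendre polynomial P_l(x), evaluated with the Bonnet recurrence."""
    if l == 0:
        return 1.0
    p_prev, p_curr = 1.0, x
    for n in range(1, l):
        # (n + 1) P_{n+1}(x) = (2n + 1) x P_n(x) - n P_{n-1}(x)
        p_prev, p_curr = p_curr, ((2 * n + 1) * x * p_curr - n * p_prev) / (n + 1)
    return p_curr


def orientational_correlation(traj, l, lag):
    """Estimate C_l at a time lag given in frames.

    traj[k][i] is the unit orientation vector (x, y, z) of dipole i at
    frame k (hypothetical layout). The average runs over all dipoles and
    all available time origins.
    """
    if lag < 0 or lag >= len(traj):
        raise ValueError("lag must satisfy 0 <= lag < number of frames")
    total, count = 0.0, 0
    for k in range(len(traj) - lag):
        for u0, ut in zip(traj[k], traj[k + lag]):
            cos_gamma = sum(a * b for a, b in zip(u0, ut))
            total += legendre(l, cos_gamma)
            count += 1
    return total / count
```

Calling `orientational_correlation(traj, l, lag)` for several values of `l` on the same trajectory gives the rank dependence discussed in the abstract; the sketch says nothing about how the trajectory itself is generated.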
Abstract:
A successful protein-protein docking study culminates in the identification of decoys at top ranks with near-native quaternary structures. However, this task remains enigmatic because no generalized scoring functions exist that effectively rank decoys according to their similarity to near-native quaternary structures. Difficulties arise because of the highly irregular nature of the protein surface and the significant variation of the nonbonding and solvation energies based on the chemical composition of the protein-protein interface. In this work, we describe a novel method combining an interface-size filter, a regression model for geometric compatibility (based on two correlated surface and packing parameters), and normalized interaction energy (calculated from correlated nonbonded and solvation energies), to effectively rank decoys from a set of 10,000 decoys. Tests on 30 unbound binary protein-protein complexes show that in 16 cases we can identify at least one decoy in the top three ranks having ≤ 10 Å backbone root mean square deviation from the true binding geometry. Comparisons with other state-of-the-art methods confirm the improved ranking power of our method without the use of any experiment-guided restraints, evolutionary information, statistical propensities, or modified interaction energy equations. Tests on 118 less-difficult bound binary protein-protein complexes with ≤ 35% sequence redundancy at the interface showed that in 77% of cases, at least 1 in 10,000 decoys was identified with ≤ 5 Å backbone root mean square deviation from the true geometry at first rank. The work will promote the use of new concepts where correlations among parameters provide more robust scoring models. It will facilitate studies involving molecular interactions, including modeling of large macromolecular assemblies and protein structure prediction. (C) 2010 Wiley Periodicals, Inc. J Comput Chem 32: 787-796, 2011.
Abstract:
Protein-protein docking programs typically perform four major tasks: (i) generation of docking poses, (ii) selection of a subset of poses, (iii) their structural refinement and (iv) scoring and ranking for the final assessment of the true quaternary structure. Although the tasks can be integrated or performed in serial order, they are by nature modular, allowing an opportunity to substitute one algorithm with another. We have implemented two modular web services, (i) PRUNE: to select a subset of docking poses generated during sampling search (http://pallab.serc.iisc.ernet.in/prune) and (ii) PROBE: to refine, score and rank them (http://pallab.serc.iisc.ernet.in/probe). The former uses a new interface-area-based edge-scoring function to eliminate > 95% of the poses generated during docking search. In contrast to other multi-parameter-based screening functions, this single-parameter-based elimination reduces the computational time significantly, in addition to increasing the chances of selecting native-like models in the top rank list. The PROBE server ranks the pruned poses after structure refinement and scoring, using a regression model for geometric compatibility and normalized interaction energy. While web services similar to PROBE are rare, no web service akin to PRUNE has been described before. Both servers are publicly accessible and free to use.
Abstract:
This paper introduces a scheme for classification of online handwritten characters based on polynomial regression of the sampled points of the sub-strokes in a character. The segmentation is based on the velocity profile of the written character, which requires smoothing of the velocity profile. We propose a novel scheme for smoothing the velocity profile curve and identifying the critical points at which to segment the character. We also propose another segmentation method based on human eye perception. We then extract two sets of features for recognition of handwritten characters. Each sub-stroke is a simple curve, a part of the character, and is represented by the distance of each point from the first point. This forms the first feature vector for each character. The second feature vector consists of the coefficients obtained from the B-splines fitted to the control knots obtained from the segmentation algorithm. The feature vector is fed to an SVM classifier, which yields an efficiency of 68% using the polynomial regression technique and 74% using the spline fitting method.
Abstract:
We address the problem of local-polynomial modeling of smooth time-varying signals with unknown functional form, in the presence of additive noise. The problem formulation is in the time domain and the polynomial coefficients are estimated in the pointwise minimum mean square error (PMMSE) sense. The choice of the window length for local modeling introduces a bias-variance tradeoff, which we solve optimally by using the intersection-of-confidence-intervals (ICI) technique. The combination of the local polynomial model and the ICI technique gives rise to an adaptive signal model equipped with a time-varying PMMSE-optimal window length whose performance is superior to that obtained by using a fixed window length. We also evaluate the sensitivity of the ICI technique with respect to the confidence interval width. Simulation results on electrocardiogram (ECG) signals show that at 0 dB signal-to-noise ratio (SNR), one can achieve about 12 dB improvement in SNR. Monte-Carlo performance analysis shows that the performance is comparable to that of basic wavelet techniques. For 0 dB SNR, the adaptive window technique yields about 2-3 dB higher SNR than wavelet regression techniques, and for SNRs greater than 12 dB, the wavelet techniques yield about 2 dB higher SNR.
Abstract:
We consider the problem of maintaining information about the rank of a matrix $M$ under changes to its entries. For an $n \times n$ matrix $M$, we show an amortized upper bound of $O(n^{\omega-1})$ arithmetic operations per change for this problem, where $\omega < 2.376$ is the exponent for matrix multiplication, under the assumption that there is a {\em lookahead} of up to $\Theta(n)$ locations. That is, we know up to the next $\Theta(n)$ locations $(i_1,j_1),(i_2,j_2),\ldots,$ whose entries are going to change, in advance; however we do not know the new entries in these locations in advance. We get the new entries in these locations in a dynamic manner.
Abstract:
Learning to rank from relevance judgments is an active research area. Itemwise score regression, pairwise preference satisfaction, and listwise structured learning are the major techniques in use. Listwise structured learning has been applied recently to optimize important non-decomposable ranking criteria like AUC (area under ROC curve) and MAP (mean average precision). We propose new, almost-linear-time algorithms to optimize for two other criteria widely used to evaluate search systems: MRR (mean reciprocal rank) and NDCG (normalized discounted cumulative gain) in the max-margin structured learning framework. We also demonstrate that, for different ranking criteria, one may need to use different feature maps. Search applications should not be optimized in favor of a single criterion, because they need to cater to a variety of queries. E.g., MRR is best for navigational queries, while NDCG is best for informational queries. A key contribution of this paper is to fold multiple ranking loss functions into a multi-criteria max-margin optimization. The result is a single, robust ranking model that is close to the best accuracy of learners trained on individual criteria. In fact, experiments over the popular LETOR and TREC data sets show that, contrary to conventional wisdom, a test criterion is often not best served by training with the same individual criterion.
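To make the two evaluation criteria named above concrete, here is a minimal Python sketch that computes MRR and NDCG for already-ranked result lists. It covers only the metrics, not the paper's max-margin training procedure. The function names are hypothetical, and the exponential gain $2^{rel} - 1$ with a $\log_2$ position discount is one common NDCG convention assumed here; the abstract does not say which variant the authors use.

```python
import math


def reciprocal_rank(relevances):
    """Return 1 / (1-based rank of the first relevant item), or 0.0 if none."""
    for rank, rel in enumerate(relevances, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries):
    """Average the reciprocal rank over a list of per-query relevance lists."""
    return sum(reciprocal_rank(r) for r in queries) / len(queries)


def dcg(relevances, k=None):
    """Discounted cumulative gain over the top k items (all items if k is None)."""
    rels = relevances if k is None else relevances[:k]
    # Assumed convention: gain 2^rel - 1, discount log2(position + 1).
    return sum((2 ** rel - 1) / math.log2(pos + 1)
               for pos, rel in enumerate(rels, start=1))


def ndcg(relevances, k=None):
    """DCG divided by the DCG of the ideal (descending-relevance) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0
```

Each input list holds the relevance labels of the returned items in ranked order. MRR depends only on the position of the first relevant item, while NDCG rewards placing highly graded items near the top, which is consistent with the abstract's remark that the two criteria suit different query types.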
Abstract:
In this paper we propose a novel, scalable, clustering-based Ordinal Regression formulation, which is an instance of a Second Order Cone Program (SOCP) with one Second Order Cone (SOC) constraint. The main contribution of the paper is a fast algorithm, CB-OR, which solves the proposed formulation more efficiently than general-purpose solvers. Another main contribution of the paper is to pose the problem of focused crawling as a large-scale Ordinal Regression problem and solve it using the proposed CB-OR. Focused crawling is an efficient mechanism for discovering resources of interest on the web. Posing the problem of focused crawling as an Ordinal Regression problem avoids the need for a negative class and topic hierarchy, which are the main drawbacks of the existing focused crawling methods. Experiments on large synthetic and benchmark datasets show the scalability of CB-OR. Experiments also show that the proposed focused crawler outperforms the state-of-the-art.
Abstract:
This paper presents a method for partial automation of specification-based regression testing, which we call ESSE (Explicit State Space Enumeration). The first step in the ESSE method is the extraction of a finite state model of the system, making use of an already tested version of the system under test (SUT). The finite state model thus obtained is then used to compute good test sequences that can be used to regression test subsequent versions of the system. We present two new algorithms for test sequence computation, both based on the finite state model generated by the above method. We also provide the details and results of the experimental evaluation of the ESSE method. Comparison with a random-testing algorithm used in practice has shown substantial improvements.
Abstract:
We associate a sheaf model to a class of Hilbert modules satisfying a natural finiteness condition. It is obtained as the dual to a linear system of Hermitian vector spaces (in the sense of Grothendieck). A refined notion of curvature is derived from this construction leading to a new unitary invariant for the Hilbert module. A division problem with bounds, originating in Douady's privilege, is related to this framework. A series of concrete computations illustrate the abstract concepts of the paper.
Abstract:
Nonlinear equations in mathematical physics and engineering are solved by linearizing the equations and forming various iterative procedures, then executing the numerical simulation. For strongly nonlinear problems, the solution obtained in the iterative process can diverge due to numerical instability. As a result, the application of numerical simulation for strongly nonlinear problems is limited. Helicopter aeroelasticity involves the solution of systems of nonlinear equations in a computationally expensive environment. Reliable solution methods which do not need Jacobian calculation at each iteration are needed for this problem. In this paper, a comparative study is done by incorporating different methods for solving the nonlinear equations in helicopter trim. Three different methods based on calculating the Jacobian at the initial guess are investigated. (C) 2011 Elsevier Masson SAS. All rights reserved.
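The abstract does not name the three methods compared. As a hedged illustration of the general idea of computing the Jacobian only at the initial guess, the sketch below implements the classical chord (modified Newton) iteration in Python: the Jacobian is approximated once by forward differences at the starting point and reused in every step. All function names, the finite-difference step, and the tolerances are assumptions for this example; it is not presented as one of the paper's methods.

```python
def fd_jacobian(f, x, h=1e-6):
    """Forward-difference approximation of the Jacobian of f at x."""
    fx = f(x)
    jac = [[0.0] * len(x) for _ in fx]
    for j in range(len(x)):
        xp = list(x)
        xp[j] += h
        fp = f(xp)
        for i in range(len(fx)):
            jac[i][j] = (fp[i] - fx[i]) / h
    return jac


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= factor * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def chord_method(f, x0, tol=1e-10, max_iter=200):
    """Iterate x <- x - J(x0)^{-1} f(x) with the Jacobian frozen at x0."""
    jac0 = fd_jacobian(f, x0)  # computed once, at the initial guess only
    x = list(x0)
    for _ in range(max_iter):
        fx = f(x)
        if max(abs(v) for v in fx) < tol:
            return x
        dx = solve_linear(jac0, [-v for v in fx])
        x = [xi + di for xi, di in zip(x, dx)]
    raise RuntimeError("chord iteration did not converge")
```

For example, `chord_method(lambda v: [v[0] ** 2 + v[1] ** 2 - 4.0, v[0] - v[1]], [1.5, 1.5])` approaches the root where both components equal the square root of 2. Freezing the Jacobian trades Newton's quadratic convergence for linear convergence, and the iteration can fail when the initial guess is far from the solution, which is the kind of reliability question the abstract raises for strongly nonlinear problems.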