25 resultados para Ranked Regression

em Indian Institute of Science - Bangalore - Índia


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Chemical composition of rainwater changes from sea to inland under the influence of several major factors - topographic location of area, its distance from sea, annual rainfall. A model is developed here to quantify the variation in precipitation chemistry under the influence of inland distance and rainfall amount. Various sites in India categorized as 'urban', 'suburban' and 'rural' have been considered for model development. pH, HCO3, NO3 and Mg do not change much from coast to inland while, SO4 and Ca change is subjected to local emissions. Cl and Na originate solely from sea salinity and are the chemistry parameters in the model. Non-linear multiple regressions performed for the various categories revealed that both rainfall amount and precipitation chemistry obeyed a power law reduction with distance from sea. Cl and Na decrease rapidly for the first 100 km distance from sea, then decrease marginally for the next 100 km, and later stabilize. Regression parameters estimated for different cases were found to be consistent (R-2 similar to 0.8). Variation in one of the parameters accounted for urbanization. Model was validated using data points from the southern peninsular region of the country. Estimates are found to be within 99.9% confidence interval. Finally, this relationship between the three parameters - rainfall amount, coastline distance, and concentration (in terms of Cl and Na) was validated with experiments conducted in a small experimental watershed in the south-west India. Chemistry estimated using the model was in good correlation with observed values with a relative error of similar to 5%. Monthly variation in the chemistry is predicted from a downscaling model and then compared with the observed data. Hence, the model developed for rain chemistry is useful in estimating the concentrations at different spatio-temporal scales and is especially applicable for south-west region of India. (C) 2008 Elsevier Ltd. All rights reserved.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background: A genetic network can be represented as a directed graph in which a node corresponds to a gene and a directed edge specifies the direction of influence of one gene on another. The reconstruction of such networks from transcript profiling data remains an important yet challenging endeavor. A transcript profile specifies the abundances of many genes in a biological sample of interest. Prevailing strategies for learning the structure of a genetic network from high-dimensional transcript profiling data assume sparsity and linearity. Many methods consider relatively small directed graphs, inferring graphs with up to a few hundred nodes. This work examines large undirected graphs representations of genetic networks, graphs with many thousands of nodes where an undirected edge between two nodes does not indicate the direction of influence, and the problem of estimating the structure of such a sparse linear genetic network (SLGN) from transcript profiling data. Results: The structure learning task is cast as a sparse linear regression problem which is then posed as a LASSO (l1-constrained fitting) problem and solved finally by formulating a Linear Program (LP). A bound on the Generalization Error of this approach is given in terms of the Leave-One-Out Error. The accuracy and utility of LP-SLGNs is assessed quantitatively and qualitatively using simulated and real data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) initiative provides gold standard data sets and evaluation metrics that enable and facilitate the comparison of algorithms for deducing the structure of networks. The structures of LP-SLGNs estimated from the INSILICO1, INSILICO2 and INSILICO3 simulated DREAM2 data sets are comparable to those proposed by the first and/or second ranked teams in the DREAM2 competition. The structures of LP-SLGNs estimated from two published Saccharomyces cerevisae cell cycle transcript profiling data sets capture known regulatory associations. In each S. cerevisiae LP-SLGN, the number of nodes with a particular degree follows an approximate power law suggesting that its degree distributions is similar to that observed in real-world networks. Inspection of these LP-SLGNs suggests biological hypotheses amenable to experimental verification. Conclusion: A statistically robust and computationally efficient LP-based method for estimating the topology of a large sparse undirected graph from high-dimensional data yields representations of genetic networks that are biologically plausible and useful abstractions of the structures of real genetic networks. Analysis of the statistical and topological properties of learned LP-SLGNs may have practical value; for example, genes with high random walk betweenness, a measure of the centrality of a node in a graph, are good candidates for intervention studies and hence integrated computational – experimental investigations designed to infer more realistic and sophisticated probabilistic directed graphical model representations of genetic networks. The LP-based solutions of the sparse linear regression problem described here may provide a method for learning the structure of transcription factor networks from transcript profiling and transcription factor binding motif data.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Regression ra tes of a hypergolic combination of fuel and oxidiser have been experimentally measured as a function of chamber pressure, mass flux and the percentage component of the hypergolic compound in natural rubber. The hypergolic compound used is difurfurylidene cyclohexanone (DFCH) which is hypergolic with the oxidiser red fuming nitric acid (RFNA) with ignition dela y of 60-70 ms. The data of weight loss versus time is obtained for burn times varying between 5 and 20 seconds. Two methods of correlating the data using mass flux of oxidiser and the total flux of hot gases have shown that index n of the regression law r=aGoxn or r=aGnxn-1 (x the axial distance) is about 0.5 or a little lower and not 0.8 even though the flow through the port is turbulent. It is argued that the reduction of index n is due to heterogeneous reaction between the liquid oxidiser and the hypergolic fuel component on the surface.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Lateral displacement and global stability are the two main stability criteria for soil nail walls. Conventional design methods do not adequately address the deformation behaviour of soil nail walls, owing to the complexity involved in handling a large number of influencing factors. Consequently, limited methods of deformation estimates based on empirical relationships and in situ performance monitoring are available in the literature. It is therefore desirable that numerical techniques and statistical methods are used in order to gain a better insight into the deformation behaviour of soil nail walls. In the present study numerical experiments are conducted using a 2 4 factorial design method. Based on analysis of the maximum lateral deformation and factor-of-safety observations from the numerical experiments, regression models for maximum lateral deformation and factor-of-safety prediction are developed and checked for adequacy. Selection of suitable design factors for the 2 4 factorial design of numerical experiments enabled the use of the proposed regression models over a practical range of soil nail wall heights and in situ soil variability. It is evident from the model adequacy analyses and illustrative example that the proposed regression models provided a reasonably good estimate of the lateral deformation and global factor of safety of the soil nail walls.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper gives a new iterative algorithm for kernel logistic regression. It is based on the solution of a dual problem using ideas similar to those of the Sequential Minimal Optimization algorithm for Support Vector Machines. Asymptotic convergence of the algorithm is proved. Computational experiments show that the algorithm is robust and fast. The algorithmic ideas can also be used to give a fast dual algorithm for solving the optimization problem arising in the inner loop of Gaussian Process classifiers.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present two new support vector approaches for ordinal regression. These approaches find the concentric spheres with minimum volume that contain most of the training samples. Both approaches guarantee that the radii of the spheres are properly ordered at the optimal solution. The size of the optimization problem is linear in the number of training samples. The popular SMO algorithm is adapted to solve the resulting optimization problem. Numerical experiments on some real-world data sets verify the usefulness of our approaches for data mining.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Processor architects have a challenging task of evaluating a large design space consisting of several interacting parameters and optimizations. In order to assist architects in making crucial design decisions, we build linear regression models that relate Processor performance to micro-architecture parameters, using simulation based experiments. We obtain good approximate models using an iterative process in which Akaike's information criteria is used to extract a good linear model from a small set of simulations, and limited further simulation is guided by the model using D-optimal experimental designs. The iterative process is repeated until desired error bounds are achieved. We used this procedure to establish the relationship of the CPI performance response to 26 key micro-architectural parameters using a detailed cycle-by-cycle superscalar processor simulator The resulting models provide a significance ordering on all micro-architectural parameters and their interactions, and explain the performance variations of micro-architectural techniques.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Gaussian Processes (GPs) are promising Bayesian methods for classification and regression problems. They have also been used for semi-supervised learning tasks. In this paper, we propose a new algorithm for solving semi-supervised binary classification problem using sparse GP regression (GPR) models. It is closely related to semi-supervised learning based on support vector regression (SVR) and maximum margin clustering. The proposed algorithm is simple and easy to implement. It gives a sparse solution directly unlike the SVR based algorithm. Also, the hyperparameters are estimated easily without resorting to expensive cross-validation technique. Use of sparse GPR model helps in making the proposed algorithm scalable. Preliminary results on synthetic and real-world data sets demonstrate the efficacy of the new algorithm.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents an optimization algorithm for an ammonia reactor based on a regression model relating the yield to several parameters, control inputs and disturbances. This model is derived from the data generated by hybrid simulation of the steady-state equations describing the reactor behaviour. The simplicity of the optimization program along with its ability to take into account constraints on flow variables make it best suited in supervisory control applications.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background: In higher primates, although LH/CG play a critical role in the control of corpus luteum (CL) function, the direct effects of progesterone (P4) in the maintenance of CL structure and function are unclear. Several experiments were conducted in the bonnet monkey to examine direct effects of P4 on gene expression changes in the CL, during induced luteolysis and the late luteal phase of natural cycles. Methods: To identify differentially expressed genes encoding PR, PR binding factors, cofactors and PR downstream signaling target genes, the genome-wide analysis data generated in CL of monkeys after LH/P-4 depletion and LH replacement were mined and validated by real-time RT-PCR analysis. Initially, expression of these P4 related genes were determined in CL during different stages of luteal phase. The recently reported model system of induced luteolysis, yet capable of responsive to tropic support, afforded an ideal situation to examine direct effects of P4 on structure and function of CL. For this purpose, P4 was infused via ALZET pumps into monkeys 24 h after LH/P4 depletion to maintain mid luteal phase circulating P4 concentration (P4 replacement). In another experiment, exogenous P4 was supplemented during late luteal phase to mimic early pregnancy. Results: Based on the published microarray data, 45 genes were identified to be commonly regulated by LH and P4. From these 19 genes belonging to PR signaling were selected to determine their expression in LH/P-4 depletion and P4 replacement experiments. These 19 genes when analyzed revealed 8 genes to be directly responsive to P4, whereas the other genes to be regulated by both LH and P4. Progesterone supplementation for 24 h during the late luteal phase also showed changes in expression of 17 out of 19 genes examined. Conclusion: These results taken together suggest that P4 regulates, directly or indirectly, expression of a number of genes involved in the CL structure and function.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper introduces a scheme for classification of online handwritten characters based on polynomial regression of the sampled points of the sub-strokes in a character. The segmentation is done based on the velocity profile of the written character and this requires a smoothening of the velocity profile. We propose a novel scheme for smoothening the velocity profile curve and identification of the critical points to segment the character. We also porpose another method for segmentation based on the human eye perception. We then extract two sets of features for recognition of handwritten characters. Each sub-stroke is a simple curve, a part of the character, and is represented by the distance measure of each point from the first point. This forms the first set of feature vector for each character. The second feature vector are the coeficients obtained from the B-splines fitted to the control knots obtained from the segmentation algorithm. The feature vector is fed to the SVM classifier and it indicates an efficiency of 68% using the polynomial regression technique and 74% using the spline fitting method.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We address the problem of local-polynomial modeling of smooth time-varying signals with unknown functional form, in the presence of additive noise. The problem formulation is in the time domain and the polynomial coefficients are estimated in the pointwise minimum mean square error (PMMSE) sense. The choice of the window length for local modeling introduces a bias-variance tradeoff, which we solve optimally by using the intersection-of-confidence-intervals (ICI) technique. The combination of the local polynomial model and the ICI technique gives rise to an adaptive signal model equipped with a time-varying PMMSE-optimal window length whose performance is superior to that obtained by using a fixed window length. We also evaluate the sensitivity of the ICI technique with respect to the confidence interval width. Simulation results on electrocardiogram (ECG) signals show that at 0dB signal-to-noise ratio (SNR), one can achieve about 12dB improvement in SNR. Monte-Carlo performance analysis shows that the performance is comparable to the basic wavelet techniques. For 0 dB SNR, the adaptive window technique yields about 2-3dB higher SNR than wavelet regression techniques and for SNRs greater than 12dB, the wavelet techniques yield about 2dB higher SNR.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper we propose a novel, scalable, clustering based Ordinal Regression formulation, which is an instance of a Second Order Cone Program (SOCP) with one Second Order Cone (SOC) constraint. The main contribution of the paper is a fast algorithm, CB-OR, which solves the proposed formulation more eficiently than general purpose solvers. Another main contribution of the paper is to pose the problem of focused crawling as a large scale Ordinal Regression problem and solve using the proposed CB-OR. Focused crawling is an efficient mechanism for discovering resources of interest on the web. Posing the problem of focused crawling as an Ordinal Regression problem avoids the need for a negative class and topic hierarchy, which are the main drawbacks of the existing focused crawling methods. Experiments on large synthetic and benchmark datasets show the scalability of CB-OR. Experiments also show that the proposed focused crawler outperforms the state-of-the-art.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a method of partial automation of specification based regression testing, which we call ESSE (Explicit State Space Enumeration). The first step in ESSE method is the extraction of a finite state model of the system making use of an already tested version of the system under test (SUT). Thereafter, the finite state model thus obtained is used to compute good test sequences that can be used to regression test subsequent versions of the system. We present two new algorithms for test sequence computation - both based on our finite state model generated by the above method. We also provide the details and results of the experimental evaluation of ESSE method. Comparison with a practically used random-testing algorithm has shown substantial improvements.