29 results for Quantile regression
Abstract:
This paper proposes a novel approach to solve the ordinal regression problem using Gaussian processes. The proposed approach, probabilistic least squares ordinal regression (PLSOR), obtains the probability distribution over ordinal labels using a particular likelihood function. It performs model selection (hyperparameter optimization) using the leave-one-out cross-validation (LOO-CV) technique. PLSOR retains the conceptual simplicity and ease of implementation of the least squares approach. Unlike the existing Gaussian process ordinal regression (GPOR) approaches, PLSOR does not use any approximation techniques for inference. We compare the proposed approach with the state-of-the-art GPOR approaches on some synthetic and benchmark data sets. Experimental results show the competitiveness of the proposed approach.
Abstract:
This paper proposes a sparse modeling approach to solve ordinal regression problems using Gaussian processes (GP). Designing a sparse GP model is important from training time and inference time viewpoints. We first propose a variant of the Gaussian process ordinal regression (GPOR) approach, leave-one-out GPOR (LOO-GPOR). It performs model selection using the leave-one-out cross-validation (LOO-CV) technique. We then provide an approach to design a sparse model for GPOR. The sparse GPOR model reduces computational time and storage requirements. Further, it provides faster inference. We compare the proposed approaches with the state-of-the-art GPOR approach on some benchmark data sets. Experimental results show that the proposed approaches are competitive.
Abstract:
Multiple input multiple output (MIMO) systems with a large number of antennas have been gaining wide attention as they enable very high throughputs. A major impediment is the complexity at the receiver needed to detect the transmitted data. To this end, we propose a new receiver, called LRR (Linear Regression of MMSE Residual), which improves the MMSE receiver by learning a linear regression model for the error of the MMSE receiver. The LRR receiver uses pilot data to estimate the channel, and then uses locally generated training data (not transmitted over the channel) to find the linear regression parameters. The proposed receiver is suitable for applications where the channel remains constant for a long period (slow-fading channels) and performs quite well: at a bit error rate (BER) of 10^-3, the SNR gain over the MMSE receiver is about 7 dB for a 16 x 16 system; for a 64 x 64 system the gain is about 8.5 dB. For large coherence time, the complexity order of the LRR receiver is the same as that of the MMSE receiver, and in simulations we find that it needs about 4 times as many floating point operations. We also show that a further gain of about 4 dB is obtained by local search around the estimate given by the LRR receiver.
Abstract:
An important question in kernel regression is one of estimating the order and bandwidth parameters from available noisy data. We propose to solve the problem within a risk estimation framework. Considering an independent and identically distributed (i.i.d.) Gaussian observations model, we use Stein's unbiased risk estimator (SURE) to estimate a weighted mean-square error (MSE) risk, and optimize it with respect to the order and bandwidth parameters. The two parameters are thus spatially adapted in such a manner that noise smoothing and fine structure preservation are simultaneously achieved. On the application side, we consider the problem of image restoration from uniform/non-uniform data, and show that the SURE approach to spatially adaptive kernel regression results in better quality estimation compared with its spatially non-adaptive counterparts. The denoising results obtained are comparable to those obtained using other state-of-the-art techniques, and in some scenarios, superior.
Abstract:
Elastic Net Regularizers have shown much promise in designing sparse classifiers for linear classification. In this work, we propose an alternating optimization approach to solve the dual problems of elastic net regularized linear classification Support Vector Machines (SVMs) and logistic regression (LR). One of the sub-problems turns out to be a simple projection. The other sub-problem can be solved using dual coordinate descent methods developed for non-sparse L2-regularized linear SVMs and LR, without altering their iteration complexity and convergence properties. Experiments on very large datasets indicate that the proposed dual coordinate descent - projection (DCD-P) methods are fast and achieve comparable generalization performance after the first pass through the data, with extremely sparse models.
Abstract:
Using a realistic nonlinear mathematical model for melanoma dynamics and the technique of optimal dynamic inversion (exact feedback linearization with static optimization), a multimodal automatic drug dosage strategy is proposed in this paper for complete regression of melanoma cancer in humans. The proposed strategy computes different drug dosages and gives a nonlinear state feedback solution for driving the number of cancer cells to zero. However, it is observed that once the tumor has regressed to a certain value, external drug dosages are no longer needed, as the immune system and other therapeutic states are able to regress the tumor at a sufficiently fast rate, which is greater than an exponential rate. As the model has three different drug dosages, after applying the dynamic inversion philosophy, drug dosages can be selected in an optimized manner without crossing their toxicity limits. The combination of drug dosages is decided by appropriately selecting the control design parameter values based on physical constraints. The process is automated for all possible combinations of the chemotherapy and immunotherapy drug dosages, with preferential emphasis on having the maximum possible variety of drug inputs at any given point of time. A simulation study with a standard patient model shows that tumor cells are regressed from 2 x 10^7 to the order of 10^5 cells because of external drug dosages in 36.93 days. After this, no external drug dosages are required as the immune system and other therapeutic states are able to regress the tumor at a greater than exponential rate; hence, the tumor goes to zero (less than 0.01) in 48.77 days and the healthy immune system of the patient is restored. A study with different chemotherapy drug resistance values is also carried out. (C) 2014 Elsevier Ltd. All rights reserved.
Abstract:
This work presents the results of an experimental investigation of semi-solid rheocasting of A356 Al alloy using a cooling slope. The experiments have been carried out following the Taguchi method of parameter design (L9 orthogonal array of experiments). Four key process variables (slope angle, pouring temperature, wall temperature, and length of travel of the melt) at three different levels have been considered for the present experimentation. Regression analysis and analysis of variance (ANOVA) have also been performed to develop a mathematical model for the degree of sphericity evolution of the primary alpha-Al phase and to find the significance and percentage contribution of each process variable towards the final degree of sphericity, respectively. The best processing condition has been identified for the optimum degree of sphericity (0.83) as A3, B3, C2, D1, i.e., slope angle of 60 degrees, pouring temperature of 650 degrees C, wall temperature of 60 degrees C, and 500 mm length of travel of the melt, based on mean response and signal to noise ratio (SNR). ANOVA results show that the length of travel has the maximum impact on the degree of sphericity evolution. The sphericity predicted by the developed regression model and the values obtained experimentally are found to be in good agreement with each other. The sphericity values obtained from the confirmation experiment, performed at the 95% confidence level, confirm that the optimum result is correct and that the confirmation experiment values are within permissible limits. (c) 2014 Elsevier Ltd. All rights reserved.
Abstract:
Seismic site characterization is the basic requirement for seismic microzonation and site response studies of an area. Site characterization helps to gauge the average dynamic properties of soil deposits and thus helps to evaluate the surface level response. This paper presents a seismic site characterization of Agartala city, the capital of Tripura state, in the northeast of India. Seismically, Agartala city is situated in the Bengal Basin zone which is classified as a highly active seismic zone, assigned by Indian seismic code BIS-1893, Indian Standard Criteria for Earthquake Resistant Design of Structures, Part-1 General Provisions and Buildings. According to the Bureau of Indian Standards, New Delhi (2002), it is the highest seismic level (zone-V) in the country. The city is very close to the Sylhet fault (Bangladesh) where two major earthquakes (M_w > 7) have occurred in the past and severely affected this city and the whole of northeast India. In order to perform site response evaluation, a series of geophysical tests at 27 locations were conducted using the multichannel analysis of surface waves (MASW) technique, which is an advanced method for obtaining shear wave velocity (V_s) profiles from in situ measurements. Similarly, standard penetration test (SPT-N) bore log data sets have been obtained from the Urban Development Department, Govt. of Tripura. In the collected data sets, out of 50 bore logs, 27 were selected which are close to the MASW test locations and used for further study. Both the data sets (V_s profiles with depth and SPT-N bore log profiles) have been used to calculate the average shear wave velocity (V_s30) and average SPT-N values for the upper 30 m depth of the subsurface soil profiles. These were used for site classification of the study area recommended by the National Earthquake Hazard Reduction Program (NEHRP) manual.
The average V_s30 and SPT-N classified the study area as seismic site class D and E categories, indicating that the city is susceptible to site effects and liquefaction. Further, the different data set combinations between V_s and SPT-N (corrected and uncorrected) values have been used to develop site-specific correlation equations by statistical regression, with V_s as a function of SPT-N value (corrected and uncorrected), considered with or without depth. However, after considering the data set pairs, a probabilistic approach has also been presented to develop a correlation using a quantile-quantile (Q-Q) plot. A comparison has also been made with the well known published correlations (for all soils) available in the literature. The present correlations closely agree with the other equations, but, comparatively, the correlation of shear wave velocity with the variation of depth and uncorrected SPT-N values provides a more suitable predicting model. Also the Q-Q plot agrees with all the other equations. In the absence of in situ measurements, the present correlations could be used to estimate V_s profiles of the study area for site response studies.
Abstract:
In this paper, we present a novel algorithm for piecewise linear regression which can learn continuous as well as discontinuous piecewise linear functions. The main idea is to repeatedly partition the data and learn a linear model in each partition. The proposed algorithm is similar in spirit to the k-means clustering algorithm. We show that our algorithm can also be viewed as a special case of an EM algorithm for maximum likelihood estimation under a reasonable probability model. We empirically demonstrate the effectiveness of our approach by comparing its performance with that of state-of-the-art algorithms on various datasets. (C) 2014 Elsevier Inc. All rights reserved.
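As a rough illustration of the "repeatedly partition the data and learn a linear model in each partition" idea, the following sketch alternates between fitting one line per partition and reassigning each point to the line that predicts it best. It is not the algorithm from the paper, whose details the abstract does not give: the 1-D setting, the residual-based reassignment rule, the initial split by x-rank, and the names `fit_line` and `piecewise_fit` are all assumptions made for this example.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = a*x + b to a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    a = sxy / sxx if sxx > 0 else 0.0
    return a, my - a * mx


def piecewise_fit(points, k=2, iters=20):
    """k-means-style alternation (hypothetical sketch, 1-D inputs only)."""
    n = len(points)
    # Assumed initialisation: k contiguous chunks along the x-axis.
    order = sorted(range(n), key=lambda i: points[i][0])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * k // n
    models = []
    for _ in range(iters):
        # Step 1: fit one linear model per partition.
        models = []
        for j in range(k):
            part = [p for p, lab in zip(points, labels) if lab == j]
            models.append(fit_line(part) if part else (0.0, 0.0))
        # Step 2: reassign every point to the model with the smallest
        # squared residual (ties go to the lowest model index).
        new_labels = [
            min(range(k), key=lambda j: (y - (models[j][0] * x + models[j][1])) ** 2)
            for x, y in points
        ]
        if new_labels == labels:  # partition is stable, so stop
            break
        labels = new_labels
    return models, labels


# Usage: a hinge function with its kink at x = 3.
data = [(float(x), max(0.0, x - 3.0)) for x in range(-10, 11)]
models, labels = piecewise_fit(data, k=2)
```

Like k-means, an alternation of this kind depends on the initial partition and can settle in a poor local solution; an empty partition is handled here only by a placeholder model.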
Abstract:
Regional frequency analysis is widely used for estimating quantiles of hydrological extreme events at sparsely gauged/ungauged target sites in river basins. It involves identification of a region (group of watersheds) resembling watershed of the target site, and use of information pooled from the region to estimate quantile for the target site. In the analysis, watershed of the target site is assumed to completely resemble watersheds in the identified region in terms of mechanism underlying generation of extreme event. In reality, it is rare to find watersheds that completely resemble each other. Fuzzy clustering approach can account for partial resemblance of watersheds and yield region(s) for the target site. Formation of regions and quantile estimation requires discerning information from fuzzy-membership matrix obtained based on the approach. Practitioners often defuzzify the matrix to form disjoint clusters (regions) and use them as the basis for quantile estimation. The defuzzification approach (DFA) results in loss of information discerned on partial resemblance of watersheds. The lost information cannot be utilized in quantile estimation, owing to which the estimates could have significant error. To avert the loss of information, a threshold strategy (TS) was considered in some prior studies. In this study, it is analytically shown that the strategy results in under-prediction of quantiles. To address this, a mathematical approach is proposed in this study and its effectiveness in estimating flood quantiles relative to DFA and TS is demonstrated through Monte-Carlo simulation experiments and case study on Mid-Atlantic water resources region, USA. (C) 2015 Elsevier B.V. All rights reserved.
Abstract:
Scaling approaches are widely used by hydrologists for Regional Frequency Analysis (RFA) of floods at ungauged/sparsely gauged site(s) in river basins. This paper proposes a Recursive Multi-scaling (RMS) approach to RFA that overcomes limitations of conventional simple- and multi-scaling approaches. The approach involves identification of a separate set of attributes corresponding to each of the sites (being considered in the study area/region) in a recursive manner according to their importance, and utilizing those attributes to construct effective regional regression relationships to estimate statistical raw moments (SMs) of peak flows. The SMs are then utilized to arrive at parameters of flood frequency distribution and quantile estimate(s) corresponding to target return period(s). Effectiveness of the RMS approach in arriving at flood quantile estimates for ungauged sites is demonstrated through leave-one-out cross-validation experiment on watersheds in Indiana State, USA. Results indicate that the approach outperforms index-flood based Region-of-Influence approach, simple- and multi-scaling approaches and a multiple linear regression method. (C) 2015 Elsevier B.V. All rights reserved.
Abstract:
Climate change in response to a change in external forcing can be understood in terms of fast response to the imposed forcing and slow feedback associated with surface temperature change. Previous studies have investigated the characteristics of fast response and slow feedback for different forcing agents. Here we examine the extent to which fast response and slow feedback derived from time-mean results of climate model simulations can be used to infer total climate change. To achieve this goal, we develop a multivariate regression model of climate change, in which the change in a climate variable is represented by a linear combination of its sensitivity to CO2 forcing, solar forcing, and change in global mean surface temperature. We derive the parameters of the regression model using time-mean results from a set of HadCM3L climate model step-forcing simulations, and then use the regression model to emulate HadCM3L-simulated transient climate change. Our results show that the regression model emulates well the HadCM3L-simulated temporal evolution and spatial distribution of climate change, including surface temperature, precipitation, runoff, soil moisture, cloudiness, and radiative fluxes under transient CO2 and/or solar forcing scenarios. Our findings suggest that temporal and spatial patterns of total change for the climate variables considered here can be represented well by the sum of fast response and slow feedback. Furthermore, by using a simple 1-D heat-diffusion climate model, we show that the temporal and spatial characteristics of climate change under transient forcing scenarios can be emulated well using information from step-forcing simulations alone.
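One plausible way to write the regression model described in this abstract is shown below. The symbols are assumptions introduced here, not notation from the paper: \Delta X is the change in a climate variable, F_{CO2} and F_{sol} are the CO2 and solar forcings, \Delta T is the change in global mean surface temperature, and a, b, c are the fitted sensitivities.

```latex
\Delta X(t) \;\approx\; a\,F_{\mathrm{CO_2}}(t) \;+\; b\,F_{\mathrm{sol}}(t) \;+\; c\,\Delta T(t)
```

Read this way, the first two terms would correspond to the fast response to each forcing and the last term to the slow feedback associated with surface temperature change.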
Abstract:
In this paper, we present two new stochastic approximation algorithms for the problem of quantile estimation. The algorithms use the characterization of the quantile in terms of an optimization problem given in [1]. They take the form of stochastic gradient descent on the objective of that optimization problem. Asymptotic convergence of the algorithms to the true quantile is proven using the ODE method. The theoretical results are also supported by empirical evidence. The algorithms are shown to provide significant improvement in terms of memory requirement and accuracy.
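For readers unfamiliar with this family of methods, the sketch below shows a textbook Robbins-Monro style quantile estimator: stochastic gradient descent on the pinball (check) loss, whose minimizer is the tau-quantile. It is not either of the two algorithms proposed in the paper. The function name `sgd_quantile`, the starting point, and the step-size schedule `step0 / n**0.75` are assumptions chosen for illustration.

```python
import random


def sgd_quantile(samples, tau, q0=0.0, step0=1.0):
    """Streaming estimate of the tau-quantile using O(1) memory.

    Each observation x contributes a subgradient of the pinball loss
    with respect to the current estimate q:
        1 - tau  if x <= q   (estimate too high, move down)
        -tau     if x >  q   (estimate too low, move up)
    """
    q = q0
    for n, x in enumerate(samples, start=1):
        grad = (1.0 if x <= q else 0.0) - tau
        q -= (step0 / n ** 0.75) * grad  # assumed decaying step size
    return q


# Usage: estimate the 0.9-quantile of a standard normal stream,
# whose true value is roughly 1.2816.
rng = random.Random(0)
stream = (rng.gauss(0.0, 1.0) for _ in range(100_000))
estimate = sgd_quantile(stream, tau=0.9)
```

The estimator keeps a single number rather than the whole sample, which is the kind of memory saving the abstract refers to. Its accuracy in practice depends on the step-size schedule.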
Abstract:
Naturally occurring compounds are considered attractive candidates for cancer treatment and prevention. Quercetin and ellagic acid are naturally occurring flavonoids found abundantly in several fruits and vegetables. In the present study, we evaluate and compare the antitumor efficacies of quercetin and ellagic acid in animal models and cancer cell lines in a comprehensive manner. We found that quercetin induced cytotoxicity in leukemic cells in a dose-dependent manner, while ellagic acid showed only limited toxicity. Besides leukemic cells, quercetin also induced cytotoxicity in breast cancer cells; however, its effect on normal cells was limited or absent. Further, quercetin caused S phase arrest during cell cycle progression in the tested cancer cells. Quercetin induced tumor regression in mice at a concentration 3-fold lower than ellagic acid. Importantly, administration of quercetin led to an approximately 5-fold increase in the life span of tumor-bearing mice compared to that of untreated controls. Further, we found that quercetin interacts with DNA directly, which could be one of the mechanisms for inducing apoptosis in both cancer cell lines and tumor tissues by activating the intrinsic pathway. Thus, our data suggest that quercetin can be further explored for its potential use in cancer therapeutics and combination therapy.