919 resultados para Rank regression


Relevância:

40.00% 40.00%

Publicador:

Resumo:

The paper addresses the problem of learning a regression model parameterized by a fixed-rank positive semidefinite matrix. The focus is on the nonlinear nature of the search space and on scalability to high-dimensional problems. The mathematical developments rely on the theory of gradient descent algorithms adapted to the Riemannian geometry that underlies the set of fixedrank positive semidefinite matrices. In contrast with previous contributions in the literature, no restrictions are imposed on the range space of the learned matrix. The resulting algorithms maintain a linear complexity in the problem size and enjoy important invariance properties. We apply the proposed algorithms to the problem of learning a distance function parameterized by a positive semidefinite matrix. Good performance is observed on classical benchmarks. © 2011 Gilles Meyer, Silvere Bonnabel and Rodolphe Sepulchre.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Let Y_i = f(x_i) + E_i\ (1\le i\le n) with given covariates x_1\lt x_2\lt \cdots\lt x_n , an unknown regression function f and independent random errors E_i with median zero. It is shown how to apply several linear rank test statistics simultaneously in order to test monotonicity of f in various regions and to identify its local extrema.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A smoothed rank-based procedure is developed for the accelerated failure time model to overcome computational issues. The proposed estimator is based on an EM-type procedure coupled with the induced smoothing. "The proposed iterative approach converges provided the initial value is based on a consistent estimator, and the limiting covariance matrix can be obtained from a sandwich-type formula. The consistency and asymptotic normality of the proposed estimator are also established. Extensive simulations show that the new estimator is not only computationally less demanding but also more reliable than the other existing estimators.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The paper addresses the problem of low-rank trace norm minimization. We propose an algorithm that alternates between fixed-rank optimization and rank-one updates. The fixed-rank optimization is characterized by an efficient factorization that makes the trace norm differentiable in the search space and the computation of duality gap numerically tractable. The search space is nonlinear but is equipped with a Riemannian structure that leads to efficient computations. We present a second-order trust-region algorithm with a guaranteed quadratic rate of convergence. Overall, the proposed optimization scheme converges superlinearly to the global solution while maintaining complexity that is linear in the number of rows and columns of the matrix. To compute a set of solutions efficiently for a grid of regularization parameters we propose a predictor-corrector approach that outperforms the naive warm-restart approach on the fixed-rank quotient manifold. The performance of the proposed algorithm is illustrated on problems of low-rank matrix completion and multivariate linear regression. © 2013 Society for Industrial and Applied Mathematics.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Motivated by the problem of learning a linear regression model whose parameter is a large fixed-rank non-symmetric matrix, we consider the optimization of a smooth cost function defined on the set of fixed-rank matrices. We adopt the geometric framework of optimization on Riemannian quotient manifolds. We study the underlying geometries of several well-known fixed-rank matrix factorizations and then exploit the Riemannian quotient geometry of the search space in the design of a class of gradient descent and trust-region algorithms. The proposed algorithms generalize our previous results on fixed-rank symmetric positive semidefinite matrices, apply to a broad range of applications, scale to high-dimensional problems, and confer a geometric basis to recent contributions on the learning of fixed-rank non-symmetric matrices. We make connections with existing algorithms in the context of low-rank matrix completion and discuss the usefulness of the proposed framework. Numerical experiments suggest that the proposed algorithms compete with state-of-the-art algorithms and that manifold optimization offers an effective and versatile framework for the design of machine learning algorithms that learn a fixed-rank matrix. © 2013 Springer-Verlag Berlin Heidelberg.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

It is well known that regression analyses involving compositional data need special attention because the data are not of full rank. For a regression analysis where both the dependent and independent variable are components we propose a transformation of the components emphasizing their role as dependent and independent variables. A simple linear regression can be performed on the transformed components. The regression line can be depicted in a ternary diagram facilitating the interpretation of the analysis in terms of components. An exemple with time-budgets illustrates the method and the graphical features

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Mature weight breeding values were estimated using a multi-trait animal model (MM) and a random regression animal model (RRM). Data consisted of 82 064 weight records from 8 145 animals, recorded from birth to eight years of age. Weights at standard ages were considered in the MM. All models included contemporary groups as fixed effects, and age of dam (linear and quadratic effects) and animal age as covariates. In the RRM, mean trends were modelled through a cubic regression on orthogonal polynomials of animal age and genetic maternal and direct and maternal permanent environmental effects were also included as random. Legendre polynomials of orders 4, 3, 6 and 3 were used for animal and maternal genetic and permanent environmental effects, respectively, considering five classes of residual variances. Mature weight (five years) direct heritability estimates were 0.35 (MM) and 0.38 (RRM). Rank correlation between sires' breeding values estimated by MM and RRM was 0.82. However, selecting the top 2% (12) or 10% (62) of the young sires based on the MM predicted breeding values, respectively 71% and 80% of the same sires would be selected if RRM estimates were used instead. The RRM modelled the changes in the (co)variances with age adequately and larger breeding value accuracies can be expected using this model. © South African Society for Animal Science.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Studies investigating the use of random regression models for genetic evaluation of milk production in Zebu cattle are scarce. In this study, 59,744 test-day milk yield records from 7,810 first lactations of purebred dairy Gyr (Bos indicus) and crossbred (dairy Gyr × Holstein) cows were used to compare random regression models in which additive genetic and permanent environmental effects were modeled using orthogonal Legendre polynomials or linear spline functions. Residual variances were modeled considering 1, 5, or 10 classes of days in milk. Five classes fitted the changes in residual variances over the lactation adequately and were used for model comparison. The model that fitted linear spline functions with 6 knots provided the lowest sum of residual variances across lactation. On the other hand, according to the deviance information criterion (DIC) and Bayesian information criterion (BIC), a model using third-order and fourth-order Legendre polynomials for additive genetic and permanent environmental effects, respectively, provided the best fit. However, the high rank correlation (0.998) between this model and that applying third-order Legendre polynomials for additive genetic and permanent environmental effects, indicates that, in practice, the same bulls would be selected by both models. The last model, which is less parameterized, is a parsimonious option for fitting dairy Gyr breed test-day milk yield records. © 2013 American Dairy Science Association.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Various inference procedures for linear regression models with censored failure times have been studied extensively. Recent developments on efficient algorithms to implement these procedures enhance the practical usage of such models in survival analysis. In this article, we present robust inferences for certain covariate effects on the failure time in the presence of "nuisance" confounders under a semiparametric, partial linear regression setting. Specifically, the estimation procedures for the regression coefficients of interest are derived from a working linear model and are valid even when the function of the confounders in the model is not correctly specified. The new proposals are illustrated with two examples and their validity for cases with practical sample sizes is demonstrated via a simulation study.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Histopathologic determination of tumor regression provides important prognostic information for locally advanced gastroesophageal carcinomas after neoadjuvant treatment. Regression grading systems mostly refer to the amount of therapy-induced fibrosis in relation to residual tumor or the estimated percentage of residual tumor in relation to the former tumor site. Although these methods are generally accepted, currently there is no common standard for reporting tumor regression in gastroesophageal cancers. We compared the application of these 2 major principles for assessment of tumor regression: hematoxylin and eosin-stained slides from 89 resection specimens of esophageal adenocarcinomas following neoadjuvant chemotherapy were independently reviewed by 3 pathologists from different institutions. Tumor regression was determined by the 5-tiered Mandard system (fibrosis/tumor relation) and the 4-tiered Becker system (residual tumor in %). Interobserver agreement for the Becker system showed better weighted κ values compared with the Mandard system (0.78 vs. 0.62). Evaluation of the whole embedded tumor site showed improved results (Becker: 0.83; Mandard: 0.73) as compared with only 1 representative slide (Becker: 0.68; Mandard: 0.71). Modification into simplified 3-tiered systems showed comparable interobserver agreement but better prognostic stratification for both systems (log rank Becker: P=0.015; Mandard P=0.03), with independent prognostic impact for overall survival (modified Becker: P=0.011, hazard ratio=3.07; modified Mandard: P=0.023, hazard ratio=2.72). In conclusion, both systems provide substantial to excellent interobserver agreement for estimation of tumor regression after neoadjuvant chemotherapy in esophageal adenocarcinomas. A simple 3-tiered system with the estimation of residual tumor in % (complete regression/1% to 50% residual tumor/>50% residual tumor) maintains the highest reproducibility and prognostic value.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Expert elicitation is the process of retrieving and quantifying expert knowledge in a particular domain. Such information is of particular value when the empirical data is expensive, limited, or unreliable. This paper describes a new software tool, called Elicitator, which assists in quantifying expert knowledge in a form suitable for use as a prior model in Bayesian regression. Potential environmental domains for applying this elicitation tool include habitat modeling, assessing detectability or eradication, ecological condition assessments, risk analysis, and quantifying inputs to complex models of ecological processes. The tool has been developed to be user-friendly, extensible, and facilitate consistent and repeatable elicitation of expert knowledge across these various domains. We demonstrate its application to elicitation for logistic regression in a geographically based ecological context. The underlying statistical methodology is also novel, utilizing an indirect elicitation approach to target expert knowledge on a case-by-case basis. For several elicitation sites (or cases), experts are asked simply to quantify their estimated ecological response (e.g. probability of presence), and its range of plausible values, after inspecting (habitat) covariates via GIS.