990 results for reduced rank regression
Abstract:
Machine learning provides tools for automated construction of predictive models in data intensive areas of engineering and science. The family of regularized kernel methods has in recent years become one of the mainstream approaches to machine learning, due to a number of advantages the methods share. The approach provides theoretically well-founded solutions to the problems of under- and overfitting, allows learning from structured data, and has been empirically demonstrated to yield high predictive performance on a wide range of application domains. Historically, the problems of classification and regression have gained the majority of attention in the field. In this thesis we focus on another type of learning problem, that of learning to rank. In learning to rank, the aim is to learn, from a set of past observations, a ranking function that can order new objects according to how well they match some underlying criterion of goodness. As an important special case of the setting, we can recover the bipartite ranking problem, corresponding to maximizing the area under the ROC curve (AUC) in binary classification. Ranking applications appear in a large variety of settings; examples encountered in this thesis include document retrieval in web search, recommender systems, information extraction and automated parsing of natural language. We consider the pairwise approach to learning to rank, where ranking models are learned by minimizing the expected probability of ranking any two randomly drawn test examples incorrectly. The development of computationally efficient kernel methods, based on this approach, has in the past proven to be challenging. Moreover, it is not clear what techniques for estimating the predictive performance of learned models are the most reliable in the ranking setting, and how the techniques can be implemented efficiently. The contributions of this thesis are as follows.
First, we develop RankRLS, a computationally efficient kernel method for learning to rank that is based on minimizing a regularized pairwise least-squares loss. In addition to training methods, we introduce a variety of algorithms for tasks such as model selection, multi-output learning, and cross-validation, based on computational shortcuts from matrix algebra. Second, we improve the fastest known training method for the linear version of the RankSVM algorithm, which is one of the best-established methods for learning to rank. Third, we study the combination of the empirical kernel map and reduced set approximation, which allows the large-scale training of kernel machines using linear solvers, and propose computationally efficient solutions to cross-validation when using the approach. Next, we explore the problem of reliable cross-validation when using AUC as a performance criterion, through an extensive simulation study. We demonstrate that the proposed leave-pair-out cross-validation approach leads to more reliable performance estimation than commonly used alternative approaches. Finally, we present a case study on applying machine learning to information extraction from biomedical literature, which combines several of the approaches considered in the thesis. The thesis is divided into two parts. Part I provides the background for the research work and summarizes the most central results; Part II consists of the five original research articles that are the main contribution of this thesis.
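The pairwise view described in this abstract can be illustrated with a small sketch. It is not the thesis's RankRLS implementation: `pairwise_auc` and `pairwise_squared_loss` are hypothetical helper names, and the squared loss shown is one common form of a pairwise least-squares objective, assumed here for illustration and given without the regularization term.

```python
from itertools import combinations, product


def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p, n in product(pos, neg):
        if p > n:
            correct += 1.0
        elif p == n:
            correct += 0.5
    return correct / (len(pos) * len(neg))


def pairwise_squared_loss(scores, targets):
    """Assumed pairwise least-squares loss (illustrative, unregularized).

    Sums ((y_i - y_j) - (f_i - f_j))^2 over all unordered pairs i < j.
    """
    total = 0.0
    for i, j in combinations(range(len(scores)), 2):
        diff = (targets[i] - targets[j]) - (scores[i] - scores[j])
        total += diff * diff
    return total


if __name__ == "__main__":
    print(pairwise_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
    print(pairwise_squared_loss([0.0, 0.0, 0.0], [1.0, 2.0, 3.0]))
```

Enumerating pairs explicitly costs quadratic time in the number of examples, which is the kind of expense the matrix-algebra shortcuts mentioned in the abstract are meant to avoid.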
Abstract:
Cardiopulmonary reflexes are activated via changes in cardiac filling pressure (volume-sensitive reflex) and chemical stimulation (chemosensitive reflex). The sensitivity of the cardiopulmonary reflexes to these stimuli is impaired in the spontaneously hypertensive rat (SHR) and other models of hypertension and is thought to be associated with cardiac hypertrophy. The present study investigated whether the sensitivity of the cardiopulmonary reflexes in SHR is restored when cardiac hypertrophy and hypertension are reduced by enalapril treatment. Untreated SHR and WKY rats were fed a normal diet. Other groups of rats were treated with enalapril (10 mg kg-1 day-1, mixed in the diet; SHRE or WKYE) for one month. After treatment, the volume-sensitive reflex was evaluated in each group by determining the decrease in magnitude of the efferent renal sympathetic nerve activity (RSNA) produced by acute isotonic saline volume expansion. Chemoreflex sensitivity was evaluated by examining the bradycardia response elicited by phenyldiguanide administration. Cardiac hypertrophy was determined from the left ventricular/body weight (LV/BW) ratio. Volume expansion produced an attenuated renal sympathoinhibitory response in SHR as compared to WKY rats. Enalapril treatment, however, restored the volume expansion-induced decrease in RSNA in SHRE to the levels observed in normotensive WKY rats. SHR with established hypertension had a higher LV/BW ratio (45%) as compared to normotensive WKY rats. With enalapril treatment, the LV/BW ratio was reduced to 19% in SHRE. Finally, the reflex-induced bradycardia response produced by phenyldiguanide was significantly attenuated in SHR compared to WKY rats. Unlike the effects on the volume reflex, the sensitivity of the cardiac chemosensitive reflex to phenyldiguanide was not restored by enalapril treatment in SHRE.
Taken together, these results indicate that the impairment of the volume-sensitive, but not the chemosensitive, reflex can be restored by treatment of SHR with enalapril. It is possible that by augmenting the gain of the volume-sensitive reflex control of RSNA, enalapril contributed to the reversal of cardiac hypertrophy and normalization of arterial blood pressure in SHR.
Abstract:
Exercise capacity and quality of life (QOL) are important outcome predictors in patients with systolic heart failure (HF), independent of left ventricular (LV) ejection fraction (LVEF). LV diastolic function has been shown to be a better predictor of aerobic exercise capacity in patients with systolic dysfunction and a New York Heart Association (NYHA) classification ≥II. We hypothesized that the currently used index of diastolic function E/e' is associated with exercise capacity and QOL, even in optimally treated HF patients with reduced LVEF. This prospective study included 44 consecutive patients aged 55±11 years (27 men and 17 women), with LVEF<0.50 and NYHA functional class I-III, receiving optimal pharmacological treatment and in a stable clinical condition, as shown by the absence of dyspnea exacerbation for at least 3 months. All patients had conventional transthoracic echocardiography and answered the Minnesota Living with HF Questionnaire, followed by the 6-min walk test (6MWT). In a multivariable model with 6MWT as the dependent variable, age and E/e' explained 27% of the walked distance in 6MWT (P=0.002; multivariate regression analysis). No association was found between walk distance and LVEF or mitral annulus systolic velocity. Only normalized left atrium volume, a sensitive index of diastolic function, was associated with decreased QOL. Despite the small number of patients included, this study offers evidence that diastolic function is associated with physical capacity and QOL and should be considered along with ejection fraction in patients with compensated systolic HF.
Abstract:
It is well known that regression analyses involving compositional data need special attention because the data are not of full rank. For a regression analysis where both the dependent and independent variables are components, we propose a transformation of the components emphasizing their role as dependent and independent variables. A simple linear regression can be performed on the transformed components. The regression line can be depicted in a ternary diagram, facilitating the interpretation of the analysis in terms of components. An example with time-budgets illustrates the method and the graphical features.
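The rank problem mentioned in this abstract can be seen in a few lines. The sketch below only demonstrates the deficiency; it does not implement the transformation the authors propose, and `design_rank` and the sample compositions are made up for illustration.

```python
import numpy as np


def design_rank(parts, drop_last=False):
    """Rank of a design matrix made of an intercept column plus components.

    `parts` holds compositions row-wise (each row sums to 1). With every
    component included, the intercept equals the sum of the component
    columns, so the matrix cannot have full column rank.
    """
    parts = np.asarray(parts, dtype=float)
    if drop_last:
        parts = parts[:, :-1]  # leave one component out of the design
    design = np.column_stack([np.ones(len(parts)), parts])
    return int(np.linalg.matrix_rank(design)), design.shape[1]


# Illustrative three-part compositions (e.g. shares of a time budget).
compositions = [
    [0.20, 0.30, 0.50],
    [0.10, 0.60, 0.30],
    [0.50, 0.25, 0.25],
    [0.40, 0.40, 0.20],
    [0.30, 0.10, 0.60],
]

if __name__ == "__main__":
    print(design_rank(compositions))                  # (rank, number of columns)
    print(design_rank(compositions, drop_last=True))
```

Dropping a component is only the crudest remedy; it restores full column rank but treats the parts asymmetrically, which is one reason dedicated transformations are used for compositional regressors.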
Abstract:
This paper analyses the cut flower market as an example of an invasion pathway along which species of non-indigenous plant pests can travel to reach new areas. The paper examines the probability of pest detection by assessing information on pest detection and detection effort associated with the import of cut flowers. We test the link between the probability of plant pest arrivals, as a precursor to potential invasion, and the volume of traded flowers using count data regression models. The analysis is applied to the UK import of specific genera of cut flowers from Kenya between 1996 and 2004. There is a link between pest detection and the genus of cut flower imported. Hence, pest detection efforts should focus on identifying and targeting those imported plants with a high risk of carrying pest species. For most of the plants studied, efforts allocated to inspection have a significant influence on the probability of pest detection. However, it is shown that by better targeting inspection efforts, plant inspection effort could be reduced without increasing the risk of pest entry. Similarly, for most of the plants analysed, an increase in volume traded will not necessarily lead to an increase in the number of pests entering the UK. For some species, such as conclude that analysis at the rank of plant genus is important both to understand the effectiveness of plant pest detection efforts and consequently to manage the risk of introduction of non-indigenous species.
Abstract:
An automatic nonlinear predictive model-construction algorithm is introduced based on forward regression and the predicted-residual-sums-of-squares (PRESS) statistic. The proposed algorithm is based on the fundamental concept of evaluating a model's generalisation capability through crossvalidation. This is achieved by using the PRESS statistic as a cost function to optimise model structure. In particular, the proposed algorithm is developed with the aim of achieving computational efficiency, such that the computational effort, which would usually be extensive in the computation of the PRESS statistic, is reduced or minimised. The computation of PRESS is simplified by avoiding a matrix inversion through the use of the orthogonalisation procedure inherent in forward regression, and is further reduced significantly by the introduction of a forward-recursive formula. Based on the properties of the PRESS statistic, the proposed algorithm can achieve a fully automated procedure without resort to any other validation data set for iterative model evaluation. Numerical examples are used to demonstrate the efficacy of the algorithm.
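As a rough illustration of why PRESS does not require refitting the model once per held-out point, the sketch below computes it for ordinary linear least squares from a single QR factorization, using the standard identity that the leave-one-out residual equals the ordinary residual divided by one minus the leverage. This is not the forward-recursive formula of the paper; `press_statistic` and `press_brute_force` are hypothetical names, and the brute-force version is included only as a check.

```python
import numpy as np


def press_statistic(X, y):
    """PRESS for linear least squares from one orthogonal factorization.

    With X = QR (reduced QR), the hat matrix is Q Q^T, so the leverage h_ii
    is the squared norm of row i of Q. The leave-one-out residual is then
    e_i / (1 - h_ii), and no explicit matrix inverse is formed.
    """
    Q, R = np.linalg.qr(X)
    leverage = np.sum(Q ** 2, axis=1)
    beta = np.linalg.solve(R, Q.T @ y)
    residuals = y - X @ beta
    loo_residuals = residuals / (1.0 - leverage)
    return float(np.sum(loo_residuals ** 2))


def press_brute_force(X, y):
    """Reference PRESS: refit the model n times, leaving one point out each time."""
    n = len(y)
    total = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        total += float((y[i] - X[i] @ beta) ** 2)
    return total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.column_stack([np.ones(30), rng.normal(size=(30, 3))])
    y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=30)
    print(press_statistic(X, y), press_brute_force(X, y))
```

The two functions should agree up to floating-point error; the shortcut version needs one factorization instead of n separate fits.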
Abstract:
An efficient data-based modeling algorithm for nonlinear system identification is introduced for radial basis function (RBF) neural networks with the aim of maximizing generalization capability based on the concept of leave-one-out (LOO) cross validation. Each of the RBF kernels has its own kernel width parameter and the basic idea is to optimize the multiple pairs of regularization parameters and kernel widths, each of which is associated with a kernel, one at a time within the orthogonal forward regression (OFR) procedure. Thus, each OFR step consists of one model term selection based on the LOO mean square error (LOOMSE), followed by the optimization of the associated kernel width and regularization parameter, also based on the LOOMSE. Since the same LOOMSE is adopted for model selection as in our previous state-of-the-art local regularization assisted orthogonal least squares (LROLS) algorithm, our proposed new OFR algorithm is also capable of producing a very sparse RBF model with excellent generalization performance. Unlike our previous LROLS algorithm, which requires an additional iterative loop to optimize the regularization parameters as well as an additional procedure to optimize the kernel width, the proposed new OFR algorithm optimizes both the kernel widths and regularization parameters within the single OFR procedure, and consequently the required computational complexity is dramatically reduced. Nonlinear system identification examples are included to demonstrate the effectiveness of this new approach in comparison to the well-known approaches of support vector machine and least absolute shrinkage and selection operator as well as the LROLS algorithm.
Abstract:
Background Children with callous-unemotional (CU) traits, a proposed precursor to adult psychopathy, are characterized by impaired emotion recognition, reduced responsiveness to others’ distress, and a lack of guilt or empathy. Reduced attention to faces, and more specifically to the eye region, has been proposed to underlie these difficulties, although this has never been tested longitudinally from infancy. Attention to faces occurs within the context of dyadic caregiver interactions, and early environment including parenting characteristics has been associated with CU traits. The present study tested whether infants’ preferential tracking of a face with direct gaze and levels of maternal sensitivity predict later CU traits. Methods Data were analyzed from a stratified random sample of 213 participants drawn from a population-based sample of 1233 first-time mothers. Infants’ preferential face tracking at 5 weeks and maternal sensitivity at 29 weeks were entered into a weighted linear regression as predictors of CU traits at 2.5 years. Results Controlling for a range of confounders (e.g., deprivation), lower preferential face tracking predicted higher CU traits (p = .001). Higher maternal sensitivity predicted lower CU traits in girls (p = .009), but not boys. No significant interaction between face tracking and maternal sensitivity was found. Conclusions This is the first study to show that attention to social features during infancy as well as early sensitive parenting predict the subsequent development of CU traits. Identifying such early atypicalities offers the potential for developing parent-mediated interventions in children at risk for developing CU traits.
Abstract:
This is a note about proxy variables and instruments for identification of structural parameters in regression models. In our experience, econometric textbooks treat these two issues separately, although in practice the two concepts are very often combined. Usually, proxy variables are inserted in instrumental variable regressions with the motivation that they are exogenous, implicitly meaning that they are exogenous in a reduced form model and not in a structural model. Actually, if these variables are exogenous they should be redundant in the structural model, e.g. IQ as a proxy for ability. Valid proxies reduce unexplained variation and increase the efficiency of the estimator of the structural parameter of interest. This is especially important in situations when the instrument is weak. With a simple example we demonstrate what is required of a proxy and an instrument when they are combined. It turns out that when a researcher has a valid instrument, the requirements on the proxy variable are weaker than if no such instrument exists.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
One of the main difficulties in studying quantum field theory, in the perturbative regime, is the calculation of D-dimensional Feynman integrals. In general, one introduces the so-called Feynman parameters and, associated with them, the cumbersome parametric integrals. Solving these integrals beyond the one-loop level can be a difficult task. The negative-dimensional integration method (NDIM) is a technique whereby such a problem is dramatically reduced. We present the calculation of two-loop integrals in three different cases: scalar ones with three different masses, massless with arbitrary tensor rank, with and N insertions of a two-loop diagram.
Abstract:
The impact of tillage systems on soil CO2 emission is a complex issue as different soil types are managed in various ways, from no-till to intensive land preparation. In southern Brazil, a new management option has recently been adopted, with no-tillage as well as no burning of crop residues left on the soil surface after harvesting, especially in sugar cane areas. Although this practice has helped to restore soil carbon, the tillage impact on soil carbon loss in such areas has not been widely investigated. This study evaluated the effect of moldboard plowing followed by offset disk harrow and chisel plowing on clay oxisol CO2 emission in a sugar cane field treated with no-tillage and high crop residue input in the last 6 years. Emissions after tillage were compared to undisturbed soil CO2 emissions during a 4-week period by using an LI-6400 system coupled to a portable soil chamber. Conventional tillage caused the highest emission during almost the whole period studied, except for the efflux immediately following tillage, when the reduced tillage plot produced the highest peak. The lowest emissions were recorded 7 days after tillage, at the end of a dry period, when soil moisture reached its lowest level. A linear regression between soil CO2 effluxes and soil moisture in the no-till and conventional plots corroborates the fact that moisture, and not soil temperature, was a controlling factor. Total soil CO2 loss was huge and indicates that the adoption of reduced tillage would considerably decrease soil carbon dioxide emission in our region, particularly during the summer season and when growers leave large amounts of crop residues on the soil surface. Although it is known that crop residues are important for restoring soil carbon, our result indicates that an amount equivalent to approximately 30% of annual crop carbon residues could be transferred to the atmosphere, in a period of only 4 weeks, when conventional tillage is applied on no-tilled soils.
(c) 2005 Elsevier B.V. All rights reserved.
Abstract:
Toward the end of the larval phase (pre-pupa), the reproductive systems of Melipona quadrifasciata and Frieseomelitta varia workers are anatomically similar. Scanning electron microscopy showed that during this developmental phase the right and left ovaries are fused and form a heart-shaped structure located above the midgut. Each ovary is connected to the genital chamber by a long and slender lateral oviduct. During pupal development, the lateral oviducts of workers from both species become extremely reduced due to a drastic process of cell death, as shown by transmission electron microscopy. During the lateral oviduct shortening, their simple columnar epithelial cells show some signs of apoptosis in addition to necrosis. Cell death was characterized by cytoplasmic vesiculation, peculiar accumulation of glycogen, and dilation of cytoplasmic organelles such as mitochondria and rough endoplasmic reticulum. The nuclei, at first irregularly contoured, became swollen, with chromatin flocculation and various areas of condensed chromatin next to the nuclear envelope. At the end of the pupal phase, deep recesses marked the nuclei. At emergence, worker and queen reproductive systems showed marked differences, although reduction in the lateral oviducts was an event occurring in both castes. However, in queens the ovarioles increased in length and the spermatheca was larger than that of workers. At the external anatomical level, the reproductive system of workers and queens could be distinguished in the white- and pink-eyed pupal phase. The metamorphic function of the death of lateral oviduct cells, with consequent oviduct shortening, is discussed in terms of the anatomical reorganization of the reproductive system and of the ventrolateral positioning of adult worker bee ovaries. (C) 2000 Wiley-Liss, Inc.
Abstract:
Objectives: To describe the use of antenatal corticosteroid and clinical evolution of preterm babies. Methods: An observational prospective cohort study was carried out. All 463 pregnant women and their 514 newborn babies with gestational age ranging from 23 to 34 weeks, born at the Brazilian Neonatal Research Network units, were evaluated from August 1 to December 31, 2001. The data were obtained through maternal interview, analysis of medical records, and follow-up of the newborn infants. Data analysis was performed with the use of chi-square, Student's t, Mann-Whitney, and ANOVA tests and multiple logistic regression, with the level of significance set at 5%. Results: Treatment was directly associated with the number of prenatal visits, with maternal hypertension and with the antenatal use of tocolytic agents. Babies from treated pregnant women presented better Apgar scores at the 1st and 5th minute, reduced need for intervention in the delivery room and lower SNAPPE II. They were born with higher birth weight and longer gestational age, and needed less surfactant use, ventilation, and oxygenation time. After multiple logistic regression, the use of antenatal corticosteroid independently improved birth conditions and decreased ventilation time, but was related to an increased occurrence of neonatal sepsis. Conclusions: The use of corticosteroid was associated with better prenatal care and birth conditions, better preterm evolution but higher risk of infection. Copyright © 2004 by Sociedade Brasileira de Pediatria.
Abstract:
Introduction: The present study examines cardiovascular risk factor profiles and 24-month mortality in patients with symptomatic peripheral arterial disease (PAD). Study Design: Prospective observational study including 75 consecutive patients with PAD (67 ± 9.7 years of age; 52 men and 23 women) hospitalized for planned peripheral vascular reconstruction. Doppler echocardiograms were performed before surgery in 54 cases. Univariate analyses were performed using Student's t-test or Fisher's exact test. Survival analysis at 24-month follow-up was performed using the Cox regression model and Kaplan-Meier method including age and chronic use of aspirin as covariates. Survival curves were compared using the log-rank test. Results: Hypertension and smoking were the most frequent risk factors (52 cases and 51 cases, respectively), followed by diabetes (32 cases). Undertreated dyslipidemia was found in 26 cases. Fasting glucose levels (131 ± 69.1 mg/dl) were elevated in 29 cases. Myocardial hypertrophy was found in 18 out of 54 patients. Thirty-four patients had been treated with aspirin. Overall mortality over 24 months was 24% and was associated with age (HR: 0.064; CI95: 0.014-0.115; p=0.013) and lack of use of aspirin, as no deaths occurred among those using this drug (p<0.001). No association was found between cardiovascular death (11 cases) and the other risk factors. Conclusion: There is a high prevalence of uncontrolled (treated or untreated) cardiovascular risk factors in patients undergoing planned peripheral vascular reconstruction, and chronic use of aspirin is associated with reduced all-cause mortality in these patients.