948 results for Linear regression analysis
Abstract:
We consider a mixture model approach to the regression analysis of competing-risks data. Attention is focused on inference concerning the effects of factors on both the probability of occurrence and the hazard rate conditional on each of the failure types. These two quantities are specified in the mixture model using the logistic model and the proportional hazards model, respectively. We propose a semi-parametric mixture method to estimate the logistic and regression coefficients jointly, whereby the component-baseline hazard functions are completely unspecified. Estimation is by maximum likelihood based on the full likelihood, implemented via an expectation-conditional maximization (ECM) algorithm. Simulation studies are performed to compare the performance of the proposed semi-parametric method with a fully parametric mixture approach. The results show that when the component-baseline hazard is monotonically increasing, the semi-parametric and fully parametric mixture approaches are comparable for mildly and moderately censored samples. When the component-baseline hazard is not monotonically increasing, the semi-parametric method consistently provides less biased estimates than a fully parametric approach and is comparable in efficiency in the estimation of the parameters for all levels of censoring. The methods are illustrated using a real data set of prostate cancer patients treated with different dosages of the drug diethylstilbestrol. Copyright (C) 2003 John Wiley & Sons, Ltd.
Abstract:
Predicting the time and efficiency of the remediation of contaminated soils by soil vapor extraction remains a difficult challenge for the scientific community and consultants. This work reports the development of multiple linear regression and artificial neural network models to predict the remediation time and efficiency of soil vapor extractions performed in soils contaminated separately with benzene, toluene, ethylbenzene, xylene, trichloroethylene, and perchloroethylene. The results demonstrated that the artificial neural network approach performs better than the multiple linear regression models. The artificial neural network model allowed an accurate prediction of remediation time and efficiency based only on soil and pollutant characteristics, thereby allowing a simple and quick preliminary evaluation of the viability of the process.
Abstract:
In health-related research it is common to have multiple outcomes of interest in a single study. These outcomes are often analysed separately, ignoring the correlation between them. One would expect that a multivariate approach would be a more efficient alternative to individual analyses of each outcome. Surprisingly, this is not always the case. In this article we discuss different settings of linear models and compare the multivariate and univariate approaches. We show that for linear regression models, the estimates of the regression parameters associated with covariates that are shared across the outcomes are the same for the multivariate and univariate models, while for outcome-specific covariates the multivariate model performs better in terms of efficiency.
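The equality for shared covariates can be sketched with a short derivation. This is an illustration under stated assumptions, not the article's own argument: it assumes every one of the m outcomes uses the same n × p design matrix X, that observations are independent across subjects, and that the between-outcome error covariance Σ is known and invertible. The symbols Y, X, Σ and β are introduced here for illustration only.

```latex
% Assumption: all m outcomes share the same n x p design matrix X,
% and the stacked error vector has covariance \Sigma \otimes I_n.
\begin{align*}
\operatorname{vec}(Y) &= (I_m \otimes X)\,\beta + \varepsilon,
  \qquad \operatorname{Cov}(\varepsilon) = \Sigma \otimes I_n, \\
\hat\beta_{\mathrm{GLS}}
  &= \bigl[(I_m \otimes X)^\top (\Sigma^{-1} \otimes I_n)(I_m \otimes X)\bigr]^{-1}
     (I_m \otimes X)^\top (\Sigma^{-1} \otimes I_n)\,\operatorname{vec}(Y) \\
  &= \bigl[\Sigma^{-1} \otimes X^\top X\bigr]^{-1}
     (\Sigma^{-1} \otimes X^\top)\,\operatorname{vec}(Y) \\
  &= \bigl(I_m \otimes (X^\top X)^{-1} X^\top\bigr)\operatorname{vec}(Y).
\end{align*}
```

The last line is ordinary least squares applied to each outcome separately: Σ cancels, so under these assumptions modelling the correlation between outcomes does not change the point estimates. The cancellation relies on the design matrix being identical across outcomes, which is why outcome-specific covariates behave differently.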
Abstract:
Accurate size measurements are fundamental in characterizing the population structure and secondary production of a species. The purpose of this study was to determine the best morphometric parameter to estimate the size of individuals of Capitella capitata (Fabricius, 1780). The morphometric analysis was applied to individuals collected in the intertidal zones of two beaches on the northern coast of the state of São Paulo, Brazil: São Francisco and Araçá. The following measurements were taken: the width and length (height) of the 4th, 5th and 7th setigers, and the length of the thoracic region (first nine setigers). The area and volume of these setigers were calculated and a linear regression analysis was applied to the data. The data were log-transformed to turn the allometric equation y = ax^b into a straight line (log y = log a + b * log x). The measurements that best correlated with the thoracic length in individuals from both beaches were the length of setiger 5 (r² = 0.722; p<0.05 in São Francisco and r² = 0.795; p<0.05 in Araçá) and the area of setiger 7 (r² = 0.705; p<0.05 in São Francisco and r² = 0.634; p<0.05 in Araçá). According to these analyses, the length of setiger 5 and/or the area of setiger 7 are the best parameters to evaluate the growth of individuals of C. capitata.
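The log-transformation step can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `fit_allometric`, the use of natural logarithms, and the noise-free synthetic data are all assumptions made here (the abstract does not state the logarithm base, and the base does not affect b or r²).

```python
import math

def fit_allometric(xs, ys):
    """Fit y = a * x**b by ordinary least squares on log-transformed data.

    Returns (a, b, r2), where r2 is the coefficient of determination
    of the straight-line fit log y = log a + b * log x.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    syy = sum((v - mean_y) ** 2 for v in ly)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx                    # slope = allometric exponent
    log_a = mean_y - b * mean_x      # intercept = log of the coefficient a
    r2 = sxy ** 2 / (sxx * syy)
    return math.exp(log_a), b, r2

# Synthetic, noise-free data generated from y = 2 * x**1.5 (not study data).
xs = [0.5, 1.0, 2.0, 4.0]
ys = [2.0 * x ** 1.5 for x in xs]
a, b, r2 = fit_allometric(xs, ys)
```

With real measurements the points scatter around the line, and r² then plays the role of the values reported in the abstract.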
Abstract:
This paper explores the effects of two main sources of innovation (intramural and external R&D) on the productivity level in a sample of 3,267 Catalonian firms. The data set used is based on the official innovation survey of Catalonia, which was part of the Spanish sample of CIS4, covering the years 2002-2004. We compare empirical results obtained with standard OLS and quantile regression techniques in both manufacturing and services industries. In quantile regression, results suggest different patterns for both innovation sources as we move across conditional quantiles. The elasticity of productivity with respect to intramural R&D activities decreases as we move up to higher productivity levels in both manufacturing and services sectors, while the effects of external R&D rise in high-technology industries but are more ambiguous in low-technology and knowledge-intensive services. JEL codes: O300, C100, O140. Keywords: Innovation sources, R&D, Productivity, Quantile regression
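The contrast between OLS and quantile regression can be sketched on a toy example. This is not the paper's model or data: the functions `check_loss`, `quantile_line` and `ols_slope` are hypothetical helpers, the data are synthetic, and the brute-force search over pairs of points is practical only for tiny samples (it relies on the fact that a simple linear quantile fit has an optimal line passing through two observations).

```python
from itertools import combinations

def check_loss(residual, tau):
    """Quantile (pinball) loss: tau*r for r >= 0, (tau - 1)*r for r < 0."""
    return residual * (tau - (residual < 0))

def quantile_line(xs, ys, tau):
    """Return (intercept, slope) minimising the summed check loss.

    Brute force: try every line through two observations with distinct x.
    """
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        if x1 == x2:
            continue
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        loss = sum(check_loss(y - (intercept + slope * x), tau)
                   for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]

def ols_slope(xs, ys):
    """Slope of the ordinary least squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

# Synthetic heteroscedastic data: at each x the outcomes are 0.5x, x and 1.5x,
# so the spread of y grows with x.
xs = [x for x in range(1, 6) for _ in range(3)]
ys = [k * x for x in range(1, 6) for k in (0.5, 1.0, 1.5)]

slopes = {tau: quantile_line(xs, ys, tau)[1] for tau in (0.1, 0.5, 0.9)}
mean_slope = ols_slope(xs, ys)
```

On this toy data OLS gives one slope for the conditional mean, while the quantile fits return a different slope at each quantile, which is the kind of pattern the abstract describes when it compares effects across conditional quantiles.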
Abstract:
This paper explores the effects of two main sources of innovation (intramural and external R&D) on the productivity level in a sample of 3,267 Catalan firms. The data set used is based on the official innovation survey of Catalonia, which was part of the Spanish sample of CIS4, covering the years 2002-2004. We compare empirical results obtained with standard OLS and quantile regression techniques in both manufacturing and services industries. In quantile regression, results suggest different patterns for both innovation sources as we move across conditional quantiles. The elasticity of productivity with respect to intramural R&D activities decreases as we move up to higher productivity levels in both manufacturing and services sectors, while the effects of external R&D rise in high-technology industries but are more ambiguous in low-technology and services industries.
Abstract:
Privatization of local public services has been implemented worldwide in recent decades. Why local governments privatize has been the subject of much discussion, and many empirical works have been devoted to analyzing the factors that explain local privatization. Such works have found a great diversity of motivations, and the variation among reported empirical results is large. To investigate this diversity we undertake a meta-regression analysis of the factors explaining the decision to privatize local services. Overall, our results indicate that significant relationships are very dependent upon the characteristics of the studies. Indeed, fiscal stress and political considerations have been found to contribute to local privatization especially in the studies of US cases published in the eighties that consider a broad range of services. Studies that focus on one service capture more accurately the influence of scale economies on privatization. Finally, governments of small towns are more affected by fiscal stress, political considerations and economic efficiency, while ideology seems to play a major role for large cities.
Abstract:
This paper explores the effects of two main sources of innovation (intramural and external R&D) on the productivity level in a sample of 3,267 Catalonian firms. The data set used is based on the official innovation survey of Catalonia, which was part of the Spanish sample of CIS4, covering the years 2002-2004. We compare empirical results obtained with standard OLS and quantile regression techniques in both manufacturing and services industries. In quantile regression, results suggest different patterns for both innovation sources as we move across conditional quantiles. The elasticity of productivity with respect to intramural R&D activities decreases as we move up to higher productivity levels in both manufacturing and services sectors, while the effects of external R&D rise in high-technology industries but are more ambiguous in low-technology and knowledge-intensive services. JEL codes: O300, C100, O140. Keywords: Innovation sources, R&D, Productivity, Quantile Regression
Abstract:
X-ray is a technology used for numerous applications in the medical field. X-ray projection produces a two-dimensional (2D) grey-level texture from a three-dimensional (3D) object. Until now, no clear demonstration or correlation has established 2D texture analysis as a valid indirect evaluation of 3D microarchitecture. TBS is a new texture parameter based on the measurement of the experimental variogram. TBS evaluates the variation between 2D image grey-levels. The aim of this study was to evaluate correlations between 3D bone microarchitecture parameters, evaluated from μCT reconstructions, and the TBS value, calculated on 2D projected images. 30 dried human cadaveric vertebrae were acquired on a micro-scanner (eXplorer Locus, GE) at an isotropic resolution of 93 μm. 3D vertebral body models were used. The following 3D microarchitecture parameters were used: bone volume fraction (BV/TV), trabecular thickness (TbTh), trabecular space (TbSp), trabecular number (TbN) and connectivity density (ConnD). 3D/2D projections were computed according to the Beer-Lambert law at X-ray energies of 50, 100 and 150 keV. TBS was assessed on 2D projected images. Correlations between TBS and the 3D microarchitecture parameters were evaluated using linear regression analysis. A paired t-test was used to assess the effect of X-ray energy on TBS. Multiple linear regressions (backward) were used to evaluate relationships between TBS and 3D microarchitecture parameters using a bootstrap process. BV/TV of the sample ranged from 18.5 to 37.6%, with an average value of 28.8%. Correlation analysis showed that TBS was strongly correlated with ConnD (0.856≤r≤0.862; p<0.001) and with TbN (0.805≤r≤0.810; p<0.001), and negatively with TbSp (−0.714≤r≤−0.726; p<0.001), regardless of X-ray energy. Results show that lower TBS values are related to "degraded" microarchitecture, with low ConnD, low TbN and high TbSp. The opposite is also true.
X-ray energy had no effect on TBS or on the correlations between TBS and the 3D microarchitecture parameters. In this study, we demonstrated that TBS was significantly correlated with the 3D microarchitecture parameters ConnD and TbN, and negatively with TbSp, regardless of the X-ray energy used. This article is part of a Special Issue entitled ECTS 2011. Disclosure of interest: None declared.
Abstract:
Background/objectives: Bioelectrical impedance analysis (BIA) is used in population and clinical studies as a technique for estimating body composition. Because of significant under-representation in existing literature, we sought to develop and validate predictive equation(s) for BIA for studies in populations of African origin. Subjects/methods: Among five cohorts of the Modeling the Epidemiologic Transition Study, height, weight, waist circumference and body composition, using isotope dilution, were measured in 362 adults, ages 25-45, with mean body mass indexes ranging from 24 to 32. BIA measures of resistance and reactance were obtained using tetrapolar placement of electrodes and the same model of analyzer across sites (BIA 101Q, RJL Systems). Multiple linear regression analysis was used to develop equations for predicting fat-free mass (FFM), as measured by isotope dilution; covariates included sex, age, waist, reactance and height²/resistance, along with dummy variables for each site. Developed equations were then tested in a validation sample; FFM predicted by previously published equations was tested in the total sample. Results: A site-combined equation and site-specific equations were developed. The mean differences between FFM (reference) and FFM predicted by the study-derived equations were between 0.4 and 0.6 kg (that is, 1% difference between the actual and predicted FFM), and the measured and predicted values were highly correlated. The site-combined equation performed slightly better than the site-specific equations and the previously published equations. Conclusions: Relatively small differences exist between BIA equations to estimate FFM, whether study-derived or published equations, although the site-combined equation performed slightly better than others. The study-derived equations provide an important tool for research in these understudied populations.
Abstract:
In line with the rights and incentives provided by the Bayh-Dole Act of 1980, U.S. universities have increased their involvement in patenting and licensing activities through their own technology transfer offices. Only a few U.S. universities are obtaining large returns, however, whereas others are continuing with these activities despite negligible or negative returns. We assess the U.S. universities’ potential to generate returns from licensing activities by modeling and estimating quantiles of the distribution of net licensing returns conditional on some of their structural characteristics. We find limited prospects for public universities without a medical school everywhere in their distribution. Other groups of universities (private, and public with a medical school) can expect significant but still fairly modest returns only beyond the 0.9th quantile. These findings call into question the appropriateness of the revenue-generating motive for the aggressive rate of patenting and licensing by U.S. universities.
Abstract:
Aim: This study used data from temperate forest communities to assess: (1) five different stepwise selection methods with generalized additive models, (2) the effect of weighting absences to ensure a prevalence of 0.5, (3) the effect of limiting absences beyond the environmental envelope defined by presences, (4) four different methods for incorporating spatial autocorrelation, and (5) the effect of integrating an interaction factor defined by a regression tree on the residuals of an initial environmental model. Location: State of Vaud, western Switzerland. Methods: Generalized additive models (GAMs) were fitted using the grasp package (generalized regression analysis and spatial predictions, http://www.cscf.ch/grasp). Results: Model selection based on cross-validation appeared to be the best compromise between model stability and performance (parsimony) among the five methods tested. Weighting absences returned models that perform better than models fitted with the original sample prevalence. This appeared to be mainly due to the impact of very low prevalence values on evaluation statistics. Removing zeroes beyond the range of presences on main environmental gradients changed the set of selected predictors, and potentially their response curve shape. Moreover, removing zeroes slightly improved model performance and stability when compared with the baseline model on the same data set. Incorporating a spatial trend predictor improved model performance and stability significantly. Even better models were obtained when including local spatial autocorrelation. A novel approach to include interactions proved to be an efficient way to account for interactions between all predictors at once.
Main conclusions: Models and spatial predictions of 18 forest communities were significantly improved by using either: (1) cross-validation as a model selection method, (2) weighted absences, (3) limited absences, (4) predictors accounting for spatial autocorrelation, or (5) a factor variable accounting for interactions between all predictors. The final choice of model strategy should depend on the nature of the available data and the specific study aims. Statistical evaluation is useful in searching for the best modelling practice. However, one should not neglect to consider the shapes and interpretability of response curves, as well as the resulting spatial predictions in the final assessment.
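The idea of weighting absences to reach a prevalence of 0.5 can be sketched as simple arithmetic. This is one plausible scheme and an assumption made here, not a description of what the grasp package does: presences keep weight 1 and every absence receives the same weight, chosen so that the weighted absences sum to the number of presences. The function names and the example labels are hypothetical.

```python
def prevalence_weights(labels):
    """Return one weight per observation so that weighted prevalence is 0.5.

    labels: sequence of 1 (presence) and 0 (absence).
    Presences keep weight 1; each absence gets n_presences / n_absences.
    """
    n_pres = sum(1 for v in labels if v == 1)
    n_abs = len(labels) - n_pres
    if n_pres == 0 or n_abs == 0:
        raise ValueError("need at least one presence and one absence")
    w_abs = n_pres / n_abs
    return [1.0 if v == 1 else w_abs for v in labels]

def weighted_prevalence(labels, weights):
    """Share of the total weight carried by presences."""
    return sum(w for v, w in zip(labels, weights) if v == 1) / sum(weights)

# Hypothetical sample: 2 presences and 8 absences (raw prevalence 0.2).
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
weights = prevalence_weights(labels)
balanced = weighted_prevalence(labels, weights)
```

Weights of this kind would then be passed as observation weights to whatever model-fitting routine is used.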
Abstract:
We present an exact test for whether two random variables that have known bounds on their support are negatively correlated. The alternative hypothesis is that they are not negatively correlated. No assumptions are made on the underlying distributions. We show by example that the Spearman rank correlation test, as the competing exact test of correlation in nonparametric settings, rests on an additional assumption on the data-generating process without which it is not valid as a test for correlation. We then show how to test for the significance of the slope in a linear regression analysis that involves a single independent variable and where outcomes of the dependent variable belong to a known bounded set.
Abstract:
Many of the most interesting questions ecologists ask lead to analyses of spatial data. Yet, perhaps confused by the large number of statistical models and fitting methods available, many ecologists seem to believe this is best left to specialists. Here, we describe the issues that need consideration when analysing spatial data and illustrate these using simulation studies. Our comparative analysis involves using methods including generalized least squares, spatial filters, wavelet revised models, conditional autoregressive models and generalized additive mixed models to estimate regression coefficients from synthetic but realistic data sets, including some which violate standard regression assumptions. We assess the performance of each method using two measures and using statistical error rates for model selection. Methods that performed well included generalized least squares family of models and a Bayesian implementation of the conditional auto-regressive model. Ordinary least squares also performed adequately in the absence of model selection, but had poorly controlled Type I error rates and so did not show the improvements in performance under model selection when using the above methods. Removing large-scale spatial trends in the response led to poor performance. These are empirical results; hence extrapolation of these findings to other situations should be performed cautiously. Nevertheless, our simulation-based approach provides much stronger evidence for comparative analysis than assessments based on single or small numbers of data sets, and should be considered a necessary foundation for statements of this type in future.
Abstract:
Uromodulin is expressed exclusively in the thick ascending limb and is the most abundant protein excreted in normal urine. Variants in UMOD, which encodes uromodulin, are associated with renal function, and urinary uromodulin levels may be a biomarker for kidney disease. However, the genetic factors regulating uromodulin excretion are unknown. We conducted a meta-analysis of urinary uromodulin levels to identify associated common genetic variants in the general population. We included 10,884 individuals of European descent from three genetic isolates and three urban cohorts. Each study measured uromodulin indexed to creatinine and conducted linear regression analysis of approximately 2.5 million single nucleotide polymorphisms using an additive model. We also tested whether variants in genes expressed in the thick ascending limb associate with uromodulin levels. rs12917707, located near UMOD and previously associated with renal function and CKD, had the strongest association with urinary uromodulin levels (P<0.001). In all cohorts, carriers of a G allele of this variant had higher uromodulin levels than noncarriers did (geometric means 10.24, 14.05, and 17.67 μg/g creatinine for zero, one, or two copies of the G allele). rs12446492 in the adjacent gene PDILT (protein disulfide isomerase-like, testis expressed) also reached genome-wide significance (P<0.001). Regarding genes expressed in the thick ascending limb, variants in KCNJ1, SORL1, and CAB39 associated with urinary uromodulin levels. These data indicate that common variants in the UMOD promoter region may influence urinary uromodulin levels. They also provide insights into uromodulin biology and the association of UMOD variants with renal function.