916 results for Weighted linear regression schemes
Abstract:
Strategies are compared for the development of a linear regression model with stochastic (multivariate normal) regressor variables and the subsequent assessment of its predictive ability. Bias and mean squared error of four estimators of predictive performance are evaluated in simulated samples of 32 population correlation matrices. Models including all of the available predictors are compared with those obtained using selected subsets. The subset selection procedures investigated include two stopping rules, $C_p$ and $S_p$, each combined with an 'all possible subsets' or 'forward selection' of variables. The estimators of performance utilized include parametric ($\mathrm{MSEP}_m$) and non-parametric (PRESS) assessments in the entire sample, and two data-splitting estimates restricted to a random or balanced (Snee's DUPLEX) 'validation' half sample. The simulations were performed as a designed experiment, with population correlation matrices representing a broad range of data structures. The techniques examined for subset selection do not generally result in improved predictions relative to the full model. Approaches using 'forward selection' result in slightly smaller prediction errors and less biased estimators of predictive accuracy than 'all possible subsets' approaches, but no differences are detected between the performances of $C_p$ and $S_p$. In every case, prediction errors of models obtained by subset selection in either of the half splits exceed those obtained using all predictors and the entire sample. Only the random split estimator is conditionally (on $\beta$) unbiased; however, $\mathrm{MSEP}_m$ is unbiased on average, and PRESS is nearly so in unselected (fixed form) models. When subset selection techniques are used, $\mathrm{MSEP}_m$ and PRESS always underestimate prediction errors, by as much as 27 percent (on average) in small samples. Despite their bias, the mean squared errors (MSE) of these estimators are at least 30 percent less than that of the unbiased random split estimator. The DUPLEX split estimator suffers from large MSE as well as bias and seems of little value within the context of stochastic regressor variables. To maximize predictive accuracy while retaining a reliable estimate of that accuracy, it is recommended that the entire sample be used for model development and a leave-one-out statistic (e.g. PRESS) be used for assessment.
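The closing recommendation lends itself to a short illustration. Below is a minimal sketch of the leave-one-out PRESS statistic for an ordinary least-squares fit, computed via the hat-matrix identity (the i-th deleted residual equals e_i/(1 - h_ii)) so that no refitting is required. Python with NumPy is an assumed toolchain here; this is not the study's simulation code.

```python
import numpy as np

def press_statistic(X, y):
    """Leave-one-out PRESS for an OLS fit via the hat-matrix shortcut:
    the i-th deleted residual is e_i / (1 - h_ii), so the model never
    needs to be refit n times."""
    X1 = np.column_stack([np.ones(len(y)), X])   # design matrix with intercept
    H = X1 @ np.linalg.solve(X1.T @ X1, X1.T)    # hat matrix
    e = y - H @ y                                # ordinary residuals
    h = np.diag(H)                               # leverages
    return np.sum((e / (1.0 - h)) ** 2)

# toy check with stochastic (multivariate normal) regressors
rng = np.random.default_rng(0)
X = rng.multivariate_normal(np.zeros(3), np.eye(3), size=50)
y = X @ np.array([1.0, 0.5, -0.2]) + rng.normal(size=50)
print(press_statistic(X, y))
```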
Abstract:
We propose a linear regression method for estimating Weibull parameters from life tests. The method uses stochastic models of the unreliability at each failure instant. As a result, a heteroscedastic regression problem arises that is solved by weighted least-squares minimization. The main feature of our method is an innovative s-normalization of the failure data models, used to obtain analytic expressions for the centers and weights of the regression. The method has been contrasted in Monte Carlo studies with Benard's approximation and maximum likelihood estimation, and it achieves the highest global scores for robustness and performance.
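The paper's s-normalization is not reproduced here, but the general scheme it refines can be sketched: linearize the Weibull CDF, estimate the unreliability at each failure instant with Benard's median-rank approximation (the benchmark the authors compare against), and fit by weighted least squares. The delta-method weights below are one common illustrative choice, not the authors' analytic expressions.

```python
import numpy as np

def weibull_wls(times):
    """Weighted least-squares fit of a Weibull (shape beta, scale eta)
    from complete failure times, using median ranks and delta-method
    weights -- a generic stand-in, not the paper's s-normalization."""
    t = np.sort(np.asarray(times, dtype=float))
    n = len(t)
    i = np.arange(1, n + 1)
    F = (i - 0.3) / (n + 0.4)           # Benard's median-rank approximation
    x = np.log(t)
    y = np.log(-np.log(1.0 - F))        # linearized CDF: y = beta*x - beta*ln(eta)
    # delta-method variance of y_i from Var(F_i) ~ F(1-F)/(n+2)
    var_y = (F * (1 - F) / (n + 2)) / ((1 - F) * np.log(1 - F)) ** 2
    w = 1.0 / var_y                     # heteroscedastic weights
    W = np.sum(w)
    xbar, ybar = np.sum(w * x) / W, np.sum(w * y) / W
    beta = np.sum(w * (x - xbar) * (y - ybar)) / np.sum(w * (x - xbar) ** 2)
    eta = np.exp(xbar - ybar / beta)    # from intercept = -beta * ln(eta)
    return beta, eta

rng = np.random.default_rng(1)
sample = rng.weibull(2.0, size=30) * 100.0   # true shape 2, scale 100
print(weibull_wls(sample))
```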
Abstract:
Specific cutting energy (SE) has been widely used to assess rock cuttability for mechanical excavation purposes. Several prediction models have been developed for SE by correlating rock properties with SE values. However, some textural and compositional rock parameters, i.e. the texture coefficient and the feldspar, mafic, and felsic mineral contents, were not considered. The present study investigates the effects of these previously ignored rock parameters, along with engineering rock properties, on SE. Mineralogical and petrographic analyses, rock mechanics tests, and linear rock cutting tests were performed on sandstone samples taken from sites around Ankara, Turkey. Relationships between SE and rock properties were evaluated using bivariate correlation and linear regression analyses. The tests and subsequent analyses revealed that the texture coefficient and feldspar content of sandstones affected rock cuttability, evidenced by significant correlations between these parameters and SE at a 90% confidence level. Felsic and mafic mineral contents of sandstones did not exhibit any statistically significant correlation with SE. Cementation coefficient, effective porosity, and pore volume correlated well with SE. Poisson's ratio, Brazilian tensile strength, Shore scleroscope hardness, Schmidt hammer hardness, dry density, and point load strength index showed very strong linear correlations with SE at confidence levels of 95% and above, and all were found suitable for predicting SE individually, based on the results of regression analysis, ANOVA, Student's t-tests, and R² values. Poisson's ratio exhibited the highest correlation with SE and appears to be the most reliable SE prediction tool in sandstones.
Abstract:
Correlation and regression are two of the statistical procedures most widely used by optometrists. However, these tests are often misused or interpreted incorrectly, leading to erroneous conclusions from clinical experiments. This review examines the major statistical tests concerned with correlation and regression that are most likely to arise in clinical investigations in optometry. First, the use, interpretation and limitations of Pearson's product moment correlation coefficient are described. Second, the least squares method of fitting a linear regression to data, and methods for testing how well a regression line fits the data, are described. Third, the problems of using linear regression methods in observational studies, when there are errors associated with measuring the independent variable, and of predicting a new value of Y for a given X, are discussed. Finally, methods for testing whether a non-linear relationship provides a better fit to the data, and for comparing two or more regression lines, are considered.
Abstract:
Fitting a linear regression to data provides much more information about the relationship between two variables than a simple correlation test. A goodness of fit test of the line should always be carried out. Hence, 'r squared' estimates the strength of the relationship between Y and X, ANOVA tests whether a statistically significant line is present, and the 't' test determines whether the slope of the line differs significantly from zero. In addition, it is important to check whether the data fit the assumptions for regression analysis and, if not, whether a transformation of the Y and/or X variables is necessary.
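The three quantities named above are all reported by any least-squares routine. A minimal sketch using statsmodels (an assumed toolchain; the data are simulated):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 40)
y = 2.0 + 0.8 * x + rng.normal(scale=1.5, size=x.size)

model = sm.OLS(y, sm.add_constant(x)).fit()
print(model.rsquared)                       # 'r squared': strength of the relationship
print(model.fvalue, model.f_pvalue)         # ANOVA: is a significant line present?
print(model.tvalues[1], model.pvalues[1])   # 't' test: does the slope differ from zero?
```

For a simple straight-line fit the F statistic is the square of the slope's t statistic, so the two tests necessarily agree.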
Abstract:
1. Fitting a linear regression to data provides much more information about the relationship between two variables than a simple correlation test. A goodness of fit test of the line should always be carried out. Hence, r squared estimates the strength of the relationship between Y and X, ANOVA tests whether a statistically significant line is present, and the 't' test determines whether the slope of the line differs significantly from zero. 2. Always check whether the data collected fit the assumptions for regression analysis and, if not, whether a transformation of the Y and/or X variables is necessary. 3. If the regression line is to be used for prediction, it is important to determine whether the prediction involves an individual y value or a mean (see the sketch below). Care should be taken if predictions are made close to the extremities of the data; they are subject to considerable error if x falls beyond the range of the data. Multiple predictions require correction of the P values. 4. If several individual regression lines have been calculated from a number of similar sets of data, consider whether they should be combined to form a single regression line. 5. If the data exhibit a degree of curvature, then fitting a higher-order polynomial curve may provide a better fit than a straight line. In this case, a test of whether the data depart significantly from a linear regression should be carried out.
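Point 3's distinction corresponds to the confidence interval for the fitted mean versus the prediction interval for a single new observation. A minimal sketch, again assuming statsmodels and simulated data:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
x = np.linspace(0, 10, 40)
y = 1.0 + 0.5 * x + rng.normal(scale=1.0, size=x.size)
fit = sm.OLS(y, sm.add_constant(x)).fit()

x_new = np.array([[1.0, 5.0]])        # [intercept, x] for a prediction at x = 5
pred = fit.get_prediction(x_new)
print(pred.conf_int())                # CI for the *mean* of Y at x = 5
print(pred.conf_int(obs=True))        # wider PI for an *individual* y at x = 5
```

The interval with obs=True is always wider, because it adds the residual variance of a single new observation to the uncertainty of the fitted mean.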
Abstract:
In previous Statnotes, the application of correlation and regression methods to the analysis of two variables (X, Y) was described. These methods can be used to determine whether there is a linear relationship between the two variables, whether the relationship is positive or negative, to test the degree of significance of the linear relationship, and to obtain an equation relating Y to X. This Statnote extends the methods of linear correlation and regression to situations where there are two or more X variables, i.e., 'multiple linear regression'.
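The extension amounts to fitting one partial coefficient per X variable. A minimal sketch with statsmodels (an assumption; the data are simulated):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
X = rng.normal(size=(60, 2))          # two predictor variables X1, X2
y = 1.0 + 0.7 * X[:, 0] - 0.4 * X[:, 1] + rng.normal(scale=0.5, size=60)

fit = sm.OLS(y, sm.add_constant(X)).fit()
print(fit.params)     # intercept plus one partial coefficient per X variable
print(fit.summary())  # per-coefficient t tests, overall F test, R-squared
```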
Abstract:
Estimation of economic relationships often requires imposition of constraints such as positivity or monotonicity on each observation. Methods to impose such constraints, however, vary depending upon the estimation technique employed. We describe a general methodology to impose (observation-specific) constraints for the class of linear regression estimators using a method known as constraint weighted bootstrapping. While this method has received attention in the nonparametric regression literature, we show how it can be applied for both parametric and nonparametric estimators. A benefit of this method is that imposing numerous constraints simultaneously can be performed seamlessly. We apply this method to Norwegian dairy farm data to estimate both unconstrained and constrained parametric and nonparametric models.
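The reweighting idea can be sketched under simplifying assumptions: choose observation weights as close to uniform as possible, subject to the weighted least-squares estimate satisfying an observation-specific constraint (here, positivity of every fitted value). This is an illustrative toy using scipy's SLSQP solver and a quadratic distance from uniform weights; it is not the authors' estimator, and the Norwegian dairy data are not reproduced.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = 0.2 + 0.5 * X[:, 1] + rng.normal(scale=0.8, size=n)

def beta_hat(w):
    """Weighted least-squares coefficients for observation weights w."""
    Xw = X * w[:, None]
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)

# weights as close to uniform as possible (quadratic distance), subject to
# positivity of every fitted value at the weighted estimate
objective = lambda w: np.sum((w - 1.0 / n) ** 2)
constraints = (
    {"type": "ineq", "fun": lambda w: X @ beta_hat(w)},  # one constraint per obs.
    {"type": "eq", "fun": lambda w: np.sum(w) - 1.0},    # weights sum to one
)
res = minimize(objective, np.full(n, 1.0 / n), bounds=[(0.0, 1.0)] * n,
               constraints=constraints, method="SLSQP")
print(beta_hat(res.x))   # constrained parametric fit
```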
Abstract:
Purpose: To determine whether curve-fitting analysis of the ranked segment distributions of topographic optic nerve head (ONH) parameters, derived using the Heidelberg Retina Tomograph (HRT), provides a more effective statistical descriptor to differentiate the normal from the glaucomatous ONH. Methods: The sample comprised 22 normal control subjects (mean age 66.9 years; S.D. 7.8) and 22 glaucoma patients (mean age 72.1 years; S.D. 6.9) with glaucoma confirmed by reproducible visual field defects on the Humphrey Field Analyser. Three 10° images of the ONH were obtained using the HRT. The mean topography image was determined, and the HRT software was used to calculate the rim volume, rim area to disc area ratio, normalised rim area to disc area ratio and retinal nerve fibre cross-sectional area for each patient at 10° sectoral intervals. The values were ranked in descending order, and each ranked-segment curve of ordered values was fitted using the least squares method. Results: There was no difference in disc area between the groups. The group mean cup-disc area ratio was significantly lower in the normal group (0.204 ± 0.16) than in the glaucoma group (0.533 ± 0.083) (p < 0.001). The visual field indices, mean deviation and corrected pattern S.D., were significantly greater (p < 0.001) in the glaucoma group (-9.09 dB ± 3.3 and 7.91 ± 3.4, respectively) than in the normal group (-0.15 dB ± 0.9 and 0.95 dB ± 0.8, respectively). Univariate linear regression provided the best overall fit to the ranked segment data. The equation parameters of the regression line manually applied to the normalised rim area-disc area and the rim area-disc area ratio data correctly classified 100% of normal subjects and glaucoma patients. In this study sample, the regression analysis of ranked segment parameters was more effective than conventional ranked segment analysis, in which glaucoma patients were misclassified in approximately 50% of cases. Further investigation in larger samples will enable the calculation of confidence intervals for normality. These reference standards will then need to be investigated in an independent sample to fully validate the technique. Conclusions: Using a curve-fitting approach to fit ranked segment curves retains information relating to the topographic nature of neural loss. Such methodology appears to overcome some of the deficiencies of conventional ranked segment analysis and, subject to validation in larger scale studies, may be of clinical utility for detecting and monitoring glaucomatous damage. © 2007 The College of Optometrists.
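The ranking-and-fitting step can be sketched in a few lines: sort the sectoral values in descending order, fit a straight line through the ranked curve by least squares, and keep the slope and intercept as descriptors. The sector profiles below are hypothetical, not HRT output, and the function name is illustrative.

```python
import numpy as np

def ranked_segment_fit(sector_values):
    """Rank sectoral values in descending order and fit a straight line
    through the ranked curve; (slope, intercept) summarize the profile."""
    ranked = np.sort(np.asarray(sector_values, dtype=float))[::-1]
    rank = np.arange(1, len(ranked) + 1)
    slope, intercept = np.polyfit(rank, ranked, deg=1)
    return slope, intercept

# illustrative 10-degree sector profiles (36 sectors of rim/disc area ratio)
rng = np.random.default_rng(6)
normal_eye = 0.8 + 0.05 * rng.normal(size=36)
glaucoma_eye = np.clip(0.5 + 0.15 * rng.normal(size=36), 0, None)
print(ranked_segment_fit(normal_eye))     # shallow slope: uniform neural rim
print(ranked_segment_fit(glaucoma_eye))   # steeper slope: focal neural loss
```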
Abstract:
Background/aims - To determine which biometric parameters provide optimum predictive power for ocular volume. Methods - Sixty-seven adult subjects were scanned with a Siemens 3-T MRI scanner. Mean spherical error (MSE, D) was measured with a Shin-Nippon autorefractor, and a Zeiss IOLMaster was used to measure axial length (AL), anterior chamber depth (ACD) and corneal radius (CR), all in mm. Total ocular volume (TOV) was calculated from T2-weighted MRIs (voxel size 1.0 mm³) using an automatic voxel counting and shading algorithm. Each MR slice was subsequently edited manually in the axial, sagittal and coronal planes, the latter enabling location of the posterior pole of the crystalline lens and partitioning of TOV into anterior volume (AV) and posterior volume (PV) regions. Results - Mean values (±SD) for MSE (D), AL (mm), ACD (mm) and CR (mm) were −2.62±3.83, 24.51±1.47, 3.55±0.34 and 7.75±0.28, respectively. Mean values (±SD) for TOV, AV and PV (mm³) were 8168.21±1141.86, 1099.40±139.24 and 7068.82±1134.05, respectively. TOV showed significant correlation with MSE, AL, PV (all p<0.001), CR (p=0.043) and ACD (p=0.024). Apart from CR, these correlations were shown to be wholly attributable to variation in PV. Multiple linear regression indicated that the combination of AL and CR provided an optimum R² value of 79.4% for TOV. Conclusion - Clinically useful estimations of ocular volume can be obtained from measurement of AL and CR.
Abstract:
2000 Mathematics Subject Classification: 62J12, 62K15, 91B42, 62H99.
Abstract:
Analysis of risk measures associated with price series data movements and their prediction is of strategic importance in the financial markets, as well as to policy makers, in particular for short- and long-term planning and for setting economic growth targets. For example, oil price risk management focuses primarily on when and how an organization can best prevent costly exposure to price risk. Value-at-Risk (VaR) is the commonly practised instrument to measure risk and is evaluated by analysing the negative/positive tail of the probability distribution of the returns (profit or loss). In modelling applications, least-squares estimation (LSE)-based linear regression models are often employed for modelling and analysing correlated data. These linear models are optimal and perform relatively well under conditions such as errors following normal or approximately normal distributions, being free of large outliers and satisfying the Gauss-Markov assumptions. However, in practical situations the LSE-based linear regression models often fail to provide optimal results, for instance in non-Gaussian situations, especially when the errors follow distributions with fat tails even though the error terms possess a finite variance. This is the situation in risk analysis, which involves analysing tail distributions. Thus, applications of the LSE-based regression models may be questioned for appropriateness and may have limited applicability. We have carried out a risk analysis of Iranian crude oil price data based on Lp-norm regression models and have noted that the LSE-based models do not always perform best. We discuss results from the L1-, L2- and L∞-norm based linear regression models. ACM Computing Classification System (1998): B.1.2, F.1.3, F.2.3, G.3, J.2.
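The contrast between the three norms can be sketched by fitting the same line under each loss. This is a minimal illustration with scipy (an assumption); fat-tailed Student-t errors stand in for returns data, and the VaR pipeline and the Iranian price series are not reproduced.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)
x = np.linspace(0, 1, 100)
y = 1.0 + 2.0 * x + rng.standard_t(df=2, size=x.size)   # fat-tailed errors

def fit_lp(p):
    """Fit y = a + b*x by minimizing the Lp norm of the residuals."""
    def loss(theta):
        r = y - (theta[0] + theta[1] * x)
        return np.max(np.abs(r)) if np.isinf(p) else np.sum(np.abs(r) ** p)
    return minimize(loss, x0=[0.0, 0.0], method="Nelder-Mead").x

for p in (1, 2, np.inf):
    print(p, fit_lp(p))   # L1 resists the fat tails; L-infinity chases them
```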
Abstract:
2010 Mathematics Subject Classification: 68T50, 62H30, 62J05.
Abstract:
This paper explains how Poisson regression can be used in studies in which the dependent variable describes the number of occurrences of some rare event such as suicide. After pointing out why ordinary linear regression is inappropriate for treating dependent variables of this sort, we go on to present the basic Poisson regression model and show how it fits in the broad class of generalized linear models. Then we turn to discussing a major problem of Poisson regression known as overdispersion and suggest possible solutions, including the correction of standard errors and negative binomial regression. The paper ends with a detailed empirical example, drawn from our own research on suicide.
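The workflow described, fitting a Poisson generalized linear model, checking for overdispersion, and refitting with a negative binomial family, can be sketched briefly. statsmodels is an assumed toolchain, and simulated overdispersed counts stand in for suicide data; the dispersion parameter alpha is fixed here purely for simplicity.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 200
x = rng.normal(size=n)
# overdispersed counts: a gamma-mixed Poisson (i.e., negative binomial)
mu = np.exp(0.3 + 0.6 * x)
counts = rng.poisson(mu * rng.gamma(shape=2.0, scale=0.5, size=n))

X = sm.add_constant(x)
poisson_fit = sm.GLM(counts, X, family=sm.families.Poisson()).fit()
print(poisson_fit.params)
# overdispersion check: Pearson chi-square / df should be near 1 for Poisson
print(poisson_fit.pearson_chi2 / poisson_fit.df_resid)
# negative binomial as a remedy for overdispersion
nb_fit = sm.GLM(counts, X, family=sm.families.NegativeBinomial(alpha=1.0)).fit()
print(nb_fit.params)
```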