901 resultados para QUANTILE REGRESSION


Relevância:

20.00% 20.00%

Publicador:

Resumo:

An investigator may also wish to select a small subset of the X variables which give the best prediction of the Y variable. In this case, the question is how many variables should the regression equation include? One method would be to calculate the regression of Y on every subset of the X variables and choose the subset that gives the smallest mean square deviation from the regression. Most investigators, however, prefer to use a ‘stepwise multiple regression’ procedure. There are two forms of this analysis called the ‘step-up’ (or ‘forward’) method and the ‘step-down’ (or ‘backward’) method. This Statnote illustrates the use of stepwise multiple regression with reference to the scenario introduced in Statnote 24, viz., the influence of climatic variables on the growth of the crustose lichen Rhizocarpon geographicum (L.)DC.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The aim of this research work was primarily to examine the relevance of patient parameters, ward structures, procedures and practices, in respect of the potential hazards of wound cross-infection and nasal colonisation with multiple resistant strains of Staphylococcus aureus, which it is thought might provide a useful indication of a patient's general susceptibility to wound infection. Information from a large cross-sectional survey involving 12,000 patients from some 41 hospitals and 375 wards was collected over a five-year period from 1967-72, and its validity checked before any subsequent analysis was carried out. Many environmental factors and procedures which had previously been thought (but never conclusively proved) to have an influence on wound infection or nasal colonisation rates, were assessed, and subsequently dismissed as not being significant, provided that the standard of the current range of practices and procedures is maintained and not allowed to deteriorate. Retrospective analysis revealed that the probability of wound infection was influenced by the patient's age, duration of pre-operative hospitalisation, sex, type of wound, presence and type of drain, number of patients in ward, and other special risk factors, whilst nasal colonisation was found to be influenced by the patient's age, total duration of hospitalisation, sex, antibiotics, proportion of occupied beds in the ward, average distance between bed centres and special risk factors. A multi-variate regression analysis technique was used to develop statistical models, consisting of variable patient and environmental factors which were found to have a significant influence on the risks pertaining to wound infection and nasal colonisation. A relationship between wound infection and nasal colonisation was then established and this led to the development of a more advanced model for predicting wound infections, taking advantage of the additional knowledge of the patient's state of nasal colonisation prior to operation.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In previous statnotes, the application of correlation and regression methods to the analysis of two variables (X,Y) was described. These methods can be used to determine whether there is a linear relationship between the two variables, whether the relationship is positive or negative, to test the degree of significance of the linear relationship, and to obtain an equation relating Y to X. This Statnote extends the methods of linear correlation and regression to situations where there are two or more X variables, i.e., 'multiple linear regression’.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Factors associated with duration of dementia in a consecutive series of 103 Alzheimer's disease (AD) cases were studied using the Kaplan-Meier estimator and Cox regression analysis (proportional hazard model). Mean disease duration was 7.1 years (range: 6 weeks-30 years, standard deviation = 5.18); 25% of cases died within four years, 50% within 6.9 years, and 75% within 10 years. Familial AD cases (FAD) had a longer duration than sporadic cases (SAD), especially cases linked to presenilin (PSEN) genes. No significant differences in duration were associated with age, sex, or apolipoprotein E (Apo E) genotype. Duration was reduced in cases with arterial hypertension. Cox regression analysis suggested longer duration was associated with an earlier disease onset and increased senile plaque (SP) and neurofibrillary tangle (NFT) pathology in the orbital gyrus (OrG), CA1 sector of the hippocampus, and nucleus basalis of Meynert (NBM). The data suggest shorter disease duration in SAD and in cases with hypertensive comorbidity. In addition, degree of neuropathology did not influence survival, but spread of SP/NFT pathology into the frontal lobe, hippocampus, and basal forebrain was associated with longer disease duration. © 2014 R. A. Armstrong.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Purpose: To determine whether curve-fitting analysis of the ranked segment distributions of topographic optic nerve head (ONH) parameters, derived using the Heidelberg Retina Tomograph (HRT), provide a more effective statistical descriptor to differentiate the normal from the glaucomatous ONH. Methods: The sample comprised of 22 normal control subjects (mean age 66.9 years; S.D. 7.8) and 22 glaucoma patients (mean age 72.1 years; S.D. 6.9) confirmed by reproducible visual field defects on the Humphrey Field Analyser. Three 10°-images of the ONH were obtained using the HRT. The mean topography image was determined and the HRT software was used to calculate the rim volume, rim area to disc area ratio, normalised rim area to disc area ratio and retinal nerve fibre cross-sectional area for each patient at 10°-sectoral intervals. The values were ranked in descending order, and each ranked-segment curve of ordered values was fitted using the least squares method. Results: There was no difference in disc area between the groups. The group mean cup-disc area ratio was significantly lower in the normal group (0.204 ± 0.16) compared with the glaucoma group (0.533 ± 0.083) (p < 0.001). The visual field indices, mean deviation and corrected pattern S.D., were significantly greater (p < 0.001) in the glaucoma group (-9.09 dB ± 3.3 and 7.91 ± 3.4, respectively) compared with the normal group (-0.15 dB ± 0.9 and 0.95 dB ± 0.8, respectively). Univariate linear regression provided the best overall fit to the ranked segment data. The equation parameters of the regression line manually applied to the normalised rim area-disc area and the rim area-disc area ratio data, correctly classified 100% of normal subjects and glaucoma patients. In this study sample, the regression analysis of ranked segment parameters method was more effective than conventional ranked segment analysis, in which glaucoma patients were misclassified in approximately 50% of cases. Further investigation in larger samples will enable the calculation of confidence intervals for normality. These reference standards will then need to be investigated for an independent sample to fully validate the technique. Conclusions: Using a curve-fitting approach to fit ranked segment curves retains information relating to the topographic nature of neural loss. Such methodology appears to overcome some of the deficiencies of conventional ranked segment analysis, and subject to validation in larger scale studies, may potentially be of clinical utility for detecting and monitoring glaucomatous damage. © 2007 The College of Optometrists.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Solving many scientific problems requires effective regression and/or classification models for large high-dimensional datasets. Experts from these problem domains (e.g. biologists, chemists, financial analysts) have insights into the domain which can be helpful in developing powerful models but they need a modelling framework that helps them to use these insights. Data visualisation is an effective technique for presenting data and requiring feedback from the experts. A single global regression model can rarely capture the full behavioural variability of a huge multi-dimensional dataset. Instead, local regression models, each focused on a separate area of input space, often work better since the behaviour of different areas may vary. Classical local models such as Mixture of Experts segment the input space automatically, which is not always effective and it also lacks involvement of the domain experts to guide a meaningful segmentation of the input space. In this paper we addresses this issue by allowing domain experts to interactively segment the input space using data visualisation. The segmentation output obtained is then further used to develop effective local regression models.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The accurate in silico identification of T-cell epitopes is a critical step in the development of peptide-based vaccines, reagents, and diagnostics. It has a direct impact on the success of subsequent experimental work. Epitopes arise as a consequence of complex proteolytic processing within the cell. Prior to being recognized by T cells, an epitope is presented on the cell surface as a complex with a major histocompatibility complex (MHC) protein. A prerequisite therefore for T-cell recognition is that an epitope is also a good MHC binder. Thus, T-cell epitope prediction overlaps strongly with the prediction of MHC binding. In the present study, we compare discriminant analysis and multiple linear regression as algorithmic engines for the definition of quantitative matrices for binding affinity prediction. We apply these methods to peptides which bind the well-studied human MHC allele HLA-A*0201. A matrix which results from combining results of the two methods proved powerfully predictive under cross-validation. The new matrix was also tested on an external set of 160 binders to HLA-A*0201; it was able to recognize 135 (84%) of them.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Background - The binding between peptide epitopes and major histocompatibility complex proteins (MHCs) is an important event in the cellular immune response. Accurate prediction of the binding between short peptides and the MHC molecules has long been a principal challenge for immunoinformatics. Recently, the modeling of MHC-peptide binding has come to emphasize quantitative predictions: instead of categorizing peptides as "binders" or "non-binders" or as "strong binders" and "weak binders", recent methods seek to make predictions about precise binding affinities. Results - We developed a quantitative support vector machine regression (SVR) approach, called SVRMHC, to model peptide-MHC binding affinities. As a non-linear method, SVRMHC was able to generate models that out-performed existing linear models, such as the "additive method". By adopting a new "11-factor encoding" scheme, SVRMHC takes into account similarities in the physicochemical properties of the amino acids constituting the input peptides. When applied to MHC-peptide binding data for three mouse class I MHC alleles, the SVRMHC models produced more accurate predictions than those produced previously. Furthermore, comparisons based on Receiver Operating Characteristic (ROC) analysis indicated that SVRMHC was able to out-perform several prominent methods in identifying strongly binding peptides. Conclusion - As a method with demonstrated performance in the quantitative modeling of MHC-peptide binding and in identifying strong binders, SVRMHC is a promising immunoinformatics tool with not inconsiderable future potential.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

General Regression Neuro-Fuzzy Network, which combines the properties of conventional General Regression Neural Network and Adaptive Network-based Fuzzy Inference System is proposed in this work. This network relates to so-called “memory-based networks”, which is adjusted by one-pass learning algorithm.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Abstract A new LIBS quantitative analysis method based on analytical line adaptive selection and Relevance Vector Machine (RVM) regression model is proposed. First, a scheme of adaptively selecting analytical line is put forward in order to overcome the drawback of high dependency on a priori knowledge. The candidate analytical lines are automatically selected based on the built-in characteristics of spectral lines, such as spectral intensity, wavelength and width at half height. The analytical lines which will be used as input variables of regression model are determined adaptively according to the samples for both training and testing. Second, an LIBS quantitative analysis method based on RVM is presented. The intensities of analytical lines and the elemental concentrations of certified standard samples are used to train the RVM regression model. The predicted elemental concentration analysis results will be given with a form of confidence interval of probabilistic distribution, which is helpful for evaluating the uncertainness contained in the measured spectra. Chromium concentration analysis experiments of 23 certified standard high-alloy steel samples have been carried out. The multiple correlation coefficient of the prediction was up to 98.85%, and the average relative error of the prediction was 4.01%. The experiment results showed that the proposed LIBS quantitative analysis method achieved better prediction accuracy and better modeling robustness compared with the methods based on partial least squares regression, artificial neural network and standard support vector machine.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The task of approximation-forecasting for a function, represented by empirical data was investigated. Certain class of the functions as forecasting tools: so called RFT-transformers, – was proposed. Least Square Method and superposition are the principal composing means for the function generating. Besides, the special classes of beam dynamics with delay were introduced and investigated to get classical results regarding gradients. These results were applied to optimize the RFT-transformers. The effectiveness of the forecast was demonstrated on the empirical data from the Forex market.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Data fluctuation in multiple measurements of Laser Induced Breakdown Spectroscopy (LIBS) greatly affects the accuracy of quantitative analysis. A new LIBS quantitative analysis method based on the Robust Least Squares Support Vector Machine (RLS-SVM) regression model is proposed. The usual way to enhance the analysis accuracy is to improve the quality and consistency of the emission signal, such as by averaging the spectral signals or spectrum standardization over a number of laser shots. The proposed method focuses more on how to enhance the robustness of the quantitative analysis regression model. The proposed RLS-SVM regression model originates from the Weighted Least Squares Support Vector Machine (WLS-SVM) but has an improved segmented weighting function and residual error calculation according to the statistical distribution of measured spectral data. Through the improved segmented weighting function, the information on the spectral data in the normal distribution will be retained in the regression model while the information on the outliers will be restrained or removed. Copper elemental concentration analysis experiments of 16 certified standard brass samples were carried out. The average value of relative standard deviation obtained from the RLS-SVM model was 3.06% and the root mean square error was 1.537%. The experimental results showed that the proposed method achieved better prediction accuracy and better modeling robustness compared with the quantitative analysis methods based on Partial Least Squares (PLS) regression, standard Support Vector Machine (SVM) and WLS-SVM. It was also demonstrated that the improved weighting function had better comprehensive performance in model robustness and convergence speed, compared with the four known weighting functions.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

2002 Mathematics Subject Classification: 62J05, 62G35.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

2002 Mathematics Subject Classification: 62M20, 62-07, 62J05, 62P20.