847 resultados para Penalized regression
Resumo:
Purpose: To determine whether curve-fitting analysis of the ranked segment distributions of topographic optic nerve head (ONH) parameters, derived using the Heidelberg Retina Tomograph (HRT), provide a more effective statistical descriptor to differentiate the normal from the glaucomatous ONH. Methods: The sample comprised of 22 normal control subjects (mean age 66.9 years; S.D. 7.8) and 22 glaucoma patients (mean age 72.1 years; S.D. 6.9) confirmed by reproducible visual field defects on the Humphrey Field Analyser. Three 10°-images of the ONH were obtained using the HRT. The mean topography image was determined and the HRT software was used to calculate the rim volume, rim area to disc area ratio, normalised rim area to disc area ratio and retinal nerve fibre cross-sectional area for each patient at 10°-sectoral intervals. The values were ranked in descending order, and each ranked-segment curve of ordered values was fitted using the least squares method. Results: There was no difference in disc area between the groups. The group mean cup-disc area ratio was significantly lower in the normal group (0.204 ± 0.16) compared with the glaucoma group (0.533 ± 0.083) (p < 0.001). The visual field indices, mean deviation and corrected pattern S.D., were significantly greater (p < 0.001) in the glaucoma group (-9.09 dB ± 3.3 and 7.91 ± 3.4, respectively) compared with the normal group (-0.15 dB ± 0.9 and 0.95 dB ± 0.8, respectively). Univariate linear regression provided the best overall fit to the ranked segment data. The equation parameters of the regression line manually applied to the normalised rim area-disc area and the rim area-disc area ratio data, correctly classified 100% of normal subjects and glaucoma patients. In this study sample, the regression analysis of ranked segment parameters method was more effective than conventional ranked segment analysis, in which glaucoma patients were misclassified in approximately 50% of cases. Further investigation in larger samples will enable the calculation of confidence intervals for normality. These reference standards will then need to be investigated for an independent sample to fully validate the technique. Conclusions: Using a curve-fitting approach to fit ranked segment curves retains information relating to the topographic nature of neural loss. Such methodology appears to overcome some of the deficiencies of conventional ranked segment analysis, and subject to validation in larger scale studies, may potentially be of clinical utility for detecting and monitoring glaucomatous damage. © 2007 The College of Optometrists.
Resumo:
Solving many scientific problems requires effective regression and/or classification models for large high-dimensional datasets. Experts from these problem domains (e.g. biologists, chemists, financial analysts) have insights into the domain which can be helpful in developing powerful models but they need a modelling framework that helps them to use these insights. Data visualisation is an effective technique for presenting data and requiring feedback from the experts. A single global regression model can rarely capture the full behavioural variability of a huge multi-dimensional dataset. Instead, local regression models, each focused on a separate area of input space, often work better since the behaviour of different areas may vary. Classical local models such as Mixture of Experts segment the input space automatically, which is not always effective and it also lacks involvement of the domain experts to guide a meaningful segmentation of the input space. In this paper we addresses this issue by allowing domain experts to interactively segment the input space using data visualisation. The segmentation output obtained is then further used to develop effective local regression models.
Resumo:
The accurate in silico identification of T-cell epitopes is a critical step in the development of peptide-based vaccines, reagents, and diagnostics. It has a direct impact on the success of subsequent experimental work. Epitopes arise as a consequence of complex proteolytic processing within the cell. Prior to being recognized by T cells, an epitope is presented on the cell surface as a complex with a major histocompatibility complex (MHC) protein. A prerequisite therefore for T-cell recognition is that an epitope is also a good MHC binder. Thus, T-cell epitope prediction overlaps strongly with the prediction of MHC binding. In the present study, we compare discriminant analysis and multiple linear regression as algorithmic engines for the definition of quantitative matrices for binding affinity prediction. We apply these methods to peptides which bind the well-studied human MHC allele HLA-A*0201. A matrix which results from combining results of the two methods proved powerfully predictive under cross-validation. The new matrix was also tested on an external set of 160 binders to HLA-A*0201; it was able to recognize 135 (84%) of them.
Resumo:
Background - The binding between peptide epitopes and major histocompatibility complex proteins (MHCs) is an important event in the cellular immune response. Accurate prediction of the binding between short peptides and the MHC molecules has long been a principal challenge for immunoinformatics. Recently, the modeling of MHC-peptide binding has come to emphasize quantitative predictions: instead of categorizing peptides as "binders" or "non-binders" or as "strong binders" and "weak binders", recent methods seek to make predictions about precise binding affinities. Results - We developed a quantitative support vector machine regression (SVR) approach, called SVRMHC, to model peptide-MHC binding affinities. As a non-linear method, SVRMHC was able to generate models that out-performed existing linear models, such as the "additive method". By adopting a new "11-factor encoding" scheme, SVRMHC takes into account similarities in the physicochemical properties of the amino acids constituting the input peptides. When applied to MHC-peptide binding data for three mouse class I MHC alleles, the SVRMHC models produced more accurate predictions than those produced previously. Furthermore, comparisons based on Receiver Operating Characteristic (ROC) analysis indicated that SVRMHC was able to out-perform several prominent methods in identifying strongly binding peptides. Conclusion - As a method with demonstrated performance in the quantitative modeling of MHC-peptide binding and in identifying strong binders, SVRMHC is a promising immunoinformatics tool with not inconsiderable future potential.
Resumo:
General Regression Neuro-Fuzzy Network, which combines the properties of conventional General Regression Neural Network and Adaptive Network-based Fuzzy Inference System is proposed in this work. This network relates to so-called “memory-based networks”, which is adjusted by one-pass learning algorithm.
Resumo:
Abstract A new LIBS quantitative analysis method based on analytical line adaptive selection and Relevance Vector Machine (RVM) regression model is proposed. First, a scheme of adaptively selecting analytical line is put forward in order to overcome the drawback of high dependency on a priori knowledge. The candidate analytical lines are automatically selected based on the built-in characteristics of spectral lines, such as spectral intensity, wavelength and width at half height. The analytical lines which will be used as input variables of regression model are determined adaptively according to the samples for both training and testing. Second, an LIBS quantitative analysis method based on RVM is presented. The intensities of analytical lines and the elemental concentrations of certified standard samples are used to train the RVM regression model. The predicted elemental concentration analysis results will be given with a form of confidence interval of probabilistic distribution, which is helpful for evaluating the uncertainness contained in the measured spectra. Chromium concentration analysis experiments of 23 certified standard high-alloy steel samples have been carried out. The multiple correlation coefficient of the prediction was up to 98.85%, and the average relative error of the prediction was 4.01%. The experiment results showed that the proposed LIBS quantitative analysis method achieved better prediction accuracy and better modeling robustness compared with the methods based on partial least squares regression, artificial neural network and standard support vector machine.
Resumo:
* Supported by the Army Research Office under grant DAAD-19-02-10059.
Resumo:
The task of approximation-forecasting for a function, represented by empirical data was investigated. Certain class of the functions as forecasting tools: so called RFT-transformers, – was proposed. Least Square Method and superposition are the principal composing means for the function generating. Besides, the special classes of beam dynamics with delay were introduced and investigated to get classical results regarding gradients. These results were applied to optimize the RFT-transformers. The effectiveness of the forecast was demonstrated on the empirical data from the Forex market.
Resumo:
Data fluctuation in multiple measurements of Laser Induced Breakdown Spectroscopy (LIBS) greatly affects the accuracy of quantitative analysis. A new LIBS quantitative analysis method based on the Robust Least Squares Support Vector Machine (RLS-SVM) regression model is proposed. The usual way to enhance the analysis accuracy is to improve the quality and consistency of the emission signal, such as by averaging the spectral signals or spectrum standardization over a number of laser shots. The proposed method focuses more on how to enhance the robustness of the quantitative analysis regression model. The proposed RLS-SVM regression model originates from the Weighted Least Squares Support Vector Machine (WLS-SVM) but has an improved segmented weighting function and residual error calculation according to the statistical distribution of measured spectral data. Through the improved segmented weighting function, the information on the spectral data in the normal distribution will be retained in the regression model while the information on the outliers will be restrained or removed. Copper elemental concentration analysis experiments of 16 certified standard brass samples were carried out. The average value of relative standard deviation obtained from the RLS-SVM model was 3.06% and the root mean square error was 1.537%. The experimental results showed that the proposed method achieved better prediction accuracy and better modeling robustness compared with the quantitative analysis methods based on Partial Least Squares (PLS) regression, standard Support Vector Machine (SVM) and WLS-SVM. It was also demonstrated that the improved weighting function had better comprehensive performance in model robustness and convergence speed, compared with the four known weighting functions.
Resumo:
2002 Mathematics Subject Classification: 62J05, 62G35.
Resumo:
2002 Mathematics Subject Classification: 62M20, 62-07, 62J05, 62P20.
Resumo:
2000 Mathematics Subject Classification: 62J12, 62K15, 91B42, 62H99.
Resumo:
2000 Mathematics Subject Classification: 62J12, 62P10.
Resumo:
2000 Mathematics Subject Classification: 62F10, 62J05, 62P30
Resumo:
Analysis of risk measures associated with price series data movements and its predictions are of strategic importance in the financial markets as well as to policy makers in particular for short- and longterm planning for setting up economic growth targets. For example, oilprice risk-management focuses primarily on when and how an organization can best prevent the costly exposure to price risk. Value-at-Risk (VaR) is the commonly practised instrument to measure risk and is evaluated by analysing the negative/positive tail of the probability distributions of the returns (profit or loss). In modelling applications, least-squares estimation (LSE)-based linear regression models are often employed for modeling and analyzing correlated data. These linear models are optimal and perform relatively well under conditions such as errors following normal or approximately normal distributions, being free of large size outliers and satisfying the Gauss-Markov assumptions. However, often in practical situations, the LSE-based linear regression models fail to provide optimal results, for instance, in non-Gaussian situations especially when the errors follow distributions with fat tails and error terms possess a finite variance. This is the situation in case of risk analysis which involves analyzing tail distributions. Thus, applications of the LSE-based regression models may be questioned for appropriateness and may have limited applicability. We have carried out the risk analysis of Iranian crude oil price data based on the Lp-norm regression models and have noted that the LSE-based models do not always perform the best. We discuss results from the L1, L2 and L∞-norm based linear regression models. ACM Computing Classification System (1998): B.1.2, F.1.3, F.2.3, G.3, J.2.