33 resultados para LINEAR-REGRESSION MODELS

em Aston University Research Archive


Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the Bayesian framework, predictions for a regression problem are expressed in terms of a distribution of output values. The mode of this distribution corresponds to the most probable output, while the uncertainty associated with the predictions can conveniently be expressed in terms of error bars. In this paper we consider the evaluation of error bars in the context of the class of generalized linear regression models. We provide insights into the dependence of the error bars on the location of the data points and we derive an upper bound on the true error bars in terms of the contributions from individual data points which are themselves easily evaluated.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The main aim of this paper is to provide a tutorial on regression with Gaussian processes. We start from Bayesian linear regression, and show how by a change of viewpoint one can see this method as a Gaussian process predictor based on priors over functions, rather than on priors over parameters. This leads in to a more general discussion of Gaussian processes in section 4. Section 5 deals with further issues, including hierarchical modelling and the setting of the parameters that control the Gaussian process, the covariance functions for neural network models and the use of Gaussian processes in classification problems.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Linear models reach their limitations in applications with nonlinearities in the data. In this paper new empirical evidence is provided on the relative Euro inflation forecasting performance of linear and non-linear models. The well established and widely used univariate ARIMA and multivariate VAR models are used as linear forecasting models whereas neural networks (NN) are used as non-linear forecasting models. It is endeavoured to keep the level of subjectivity in the NN building process to a minimum in an attempt to exploit the full potentials of the NN. It is also investigated whether the historically poor performance of the theoretically superior measure of the monetary services flow, Divisia, relative to the traditional Simple Sum measure could be attributed to a certain extent to the evaluation of these indices within a linear framework. Results obtained suggest that non-linear models provide better within-sample and out-of-sample forecasts and linear models are simply a subset of them. The Divisia index also outperforms the Simple Sum index when evaluated in a non-linear framework. © 2005 Taylor & Francis Group Ltd.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The spatial patterns of discrete beta-amyloid (Abeta) deposits in brain tissue from patients with Alzheimer disease (AD) were studied using a statistical method based on linear regression, the results being compared with the more conventional variance/mean (V/M) method. Both methods suggested that Abeta deposits occurred in clusters (400 to <12,800 mu m in diameter) in all but 1 of the 42 tissues examined. In many tissues, a regular periodicity of the Abeta deposit clusters parallel to the tissue boundary was observed. In 23 of 42 (55%) tissues, the two methods revealed essentially the same spatial patterns of Abeta deposits; in 15 of 42 (36%), the regression method indicated the presence of clusters at a scale not revealed by the V/M method; and in 4 of 42 (9%), there was no agreement between the two methods. Perceived advantages of the regression method are that there is a greater probability of detecting clustering at multiple scales, the dimension of larger Abeta clusters can be estimated more accurately, and the spacing between the clusters may be estimated. However, both methods may be useful, with the regression method providing greater resolution and the V/M method providing greater simplicity and ease of interpretation. Estimates of the distance between regularly spaced Abeta clusters were in the range 2,200-11,800 mu m, depending on tissue and cluster size. The regular periodicity of Abeta deposit clusters in many tissues would be consistent with their development in relation to clusters of neurons that give rise to specific neuronal projections.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Non-linear relationships are common in microbiological research and often necessitate the use of the statistical techniques of non-linear regression or curve fitting. In some circumstances, the investigator may wish to fit an exponential model to the data, i.e., to test the hypothesis that a quantity Y either increases or decays exponentially with increasing X. This type of model is straight forward to fit as taking logarithms of the Y variable linearises the relationship which can then be treated by the methods of linear regression.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In some circumstances, there may be no scientific model of the relationship between X and Y that can be specified in advance and indeed the objective of the investigation may be to provide a ‘curve of best fit’ for predictive purposes. In such an example, the fitting of successive polynomials may be the best approach. There are various strategies to decide on the polynomial of best fit depending on the objectives of the investigation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Multiple regression analysis is a complex statistical method with many potential uses. It has also become one of the most abused of all statistical procedures since anyone with a data base and suitable software can carry it out. An investigator should always have a clear hypothesis in mind before carrying out such a procedure and knowledge of the limitations of each aspect of the analysis. In addition, multiple regression is probably best used in an exploratory context, identifying variables that might profitably be examined by more detailed studies. Where there are many variables potentially influencing Y, they are likely to be intercorrelated and to account for relatively small amounts of the variance. Any analysis in which R squared is less than 50% should be suspect as probably not indicating the presence of significant variables. A further problem relates to sample size. It is often stated that the number of subjects or patients must be at least 5-10 times the number of variables included in the study.5 This advice should be taken only as a rough guide but it does indicate that the variables included should be selected with great care as inclusion of an obviously unimportant variable may have a significant impact on the sample size required.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

1. The techniques associated with regression, whether linear or non-linear, are some of the most useful statistical procedures that can be applied in clinical studies in optometry. 2. In some cases, there may be no scientific model of the relationship between X and Y that can be specified in advance and the objective may be to provide a ‘curve of best fit’ for predictive purposes. In such cases, the fitting of a general polynomial type curve may be the best approach. 3. An investigator may have a specific model in mind that relates Y to X and the data may provide a test of this hypothesis. Some of these curves can be reduced to a linear regression by transformation, e.g., the exponential and negative exponential decay curves. 4. In some circumstances, e.g., the asymptotic curve or logistic growth law, a more complex process of curve fitting involving non-linear estimation will be required.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The accurate in silico identification of T-cell epitopes is a critical step in the development of peptide-based vaccines, reagents, and diagnostics. It has a direct impact on the success of subsequent experimental work. Epitopes arise as a consequence of complex proteolytic processing within the cell. Prior to being recognized by T cells, an epitope is presented on the cell surface as a complex with a major histocompatibility complex (MHC) protein. A prerequisite therefore for T-cell recognition is that an epitope is also a good MHC binder. Thus, T-cell epitope prediction overlaps strongly with the prediction of MHC binding. In the present study, we compare discriminant analysis and multiple linear regression as algorithmic engines for the definition of quantitative matrices for binding affinity prediction. We apply these methods to peptides which bind the well-studied human MHC allele HLA-A*0201. A matrix which results from combining results of the two methods proved powerfully predictive under cross-validation. The new matrix was also tested on an external set of 160 binders to HLA-A*0201; it was able to recognize 135 (84%) of them.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We investigate the dependence of Bayesian error bars on the distribution of data in input space. For generalized linear regression models we derive an upper bound on the error bars which shows that, in the neighbourhood of the data points, the error bars are substantially reduced from their prior values. For regions of high data density we also show that the contribution to the output variance due to the uncertainty in the weights can exhibit an approximate inverse proportionality to the probability density. Empirical results support these conclusions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background/Aim - People of south Asian origin have an excessive risk of morbidity and mortality from cardiovascular disease. We examined the effect of ethnicity on known risk factors and analysed the risk of cardiovascular events and mortality in UK south Asian and white Europeans patients with type 2 diabetes over a 2 year period. Methods - A total of 1486 south Asian (SA) and 492 white European (WE) subjects with type 2 diabetes were recruited from 25 general practices in Coventry and Birmingham, UK. Baseline data included clinical history, anthropometry and measurements of traditional risk factors – blood pressure, total cholesterol, HbA1c. Multiple linear regression models were used to examine ethnicity differences in individual risk factors. Ten-year cardiovascular risk was estimated using the Framingham and UKPDS equations. All subjects were followed up for 2 years. Cardiovascular events (CVD) and mortality between the two groups were compared. Findings - Significant differences were noted in risk profiles between both groups. After adjustment for clustering and confounding a significant ethnicity effect remained only for higher HbA1c (0.50 [0.22 to 0.77]; P?=?0.0004) and lower HDL (-0.09 [-0.17 to -0.01]; P?=?0.0266). Baseline CVD history was predictive of CVD events during follow-up for SA (P?

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Multiple Pheromone Ant Clustering Algorithm (MPACA) models the collective behaviour of ants to find clusters in data and to assign objects to the most appropriate class. It is an ant colony optimisation approach that uses pheromones to mark paths linking objects that are similar and potentially members of the same cluster or class. Its novelty is in the way it uses separate pheromones for each descriptive attribute of the object rather than a single pheromone representing the whole object. Ants that encounter other ants frequently enough can combine the attribute values they are detecting, which enables the MPACA to learn influential variable interactions. This paper applies the model to real-world data from two domains. One is logistics, focusing on resource allocation rather than the more traditional vehicle-routing problem. The other is mental-health risk assessment. The task for the MPACA in each domain was to predict class membership where the classes for the logistics domain were the levels of demand on haulage company resources and the mental-health classes were levels of suicide risk. Results on these noisy real-world data were promising, demonstrating the ability of the MPACA to find patterns in the data with accuracy comparable to more traditional linear regression models. © 2013 Polish Information Processing Society.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper presents a forecasting technique for forward energy prices, one day ahead. This technique combines a wavelet transform and forecasting models such as multi- layer perceptron, linear regression or GARCH. These techniques are applied to real data from the UK gas markets to evaluate their performance. The results show that the forecasting accuracy is improved significantly by using the wavelet transform. The methodology can be also applied to forecasting market clearing prices and electricity/gas loads.