27 resultados para multiple linear regression models

em Aston University Research Archive


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Multiple regression analysis is a complex statistical method with many potential uses. It has also become one of the most abused of all statistical procedures since anyone with a data base and suitable software can carry it out. An investigator should always have a clear hypothesis in mind before carrying out such a procedure and knowledge of the limitations of each aspect of the analysis. In addition, multiple regression is probably best used in an exploratory context, identifying variables that might profitably be examined by more detailed studies. Where there are many variables potentially influencing Y, they are likely to be intercorrelated and to account for relatively small amounts of the variance. Any analysis in which R squared is less than 50% should be suspect as probably not indicating the presence of significant variables. A further problem relates to sample size. It is often stated that the number of subjects or patients must be at least 5-10 times the number of variables included in the study.5 This advice should be taken only as a rough guide but it does indicate that the variables included should be selected with great care as inclusion of an obviously unimportant variable may have a significant impact on the sample size required.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The accurate in silico identification of T-cell epitopes is a critical step in the development of peptide-based vaccines, reagents, and diagnostics. It has a direct impact on the success of subsequent experimental work. Epitopes arise as a consequence of complex proteolytic processing within the cell. Prior to being recognized by T cells, an epitope is presented on the cell surface as a complex with a major histocompatibility complex (MHC) protein. A prerequisite therefore for T-cell recognition is that an epitope is also a good MHC binder. Thus, T-cell epitope prediction overlaps strongly with the prediction of MHC binding. In the present study, we compare discriminant analysis and multiple linear regression as algorithmic engines for the definition of quantitative matrices for binding affinity prediction. We apply these methods to peptides which bind the well-studied human MHC allele HLA-A*0201. A matrix which results from combining results of the two methods proved powerfully predictive under cross-validation. The new matrix was also tested on an external set of 160 binders to HLA-A*0201; it was able to recognize 135 (84%) of them.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the Bayesian framework, predictions for a regression problem are expressed in terms of a distribution of output values. The mode of this distribution corresponds to the most probable output, while the uncertainty associated with the predictions can conveniently be expressed in terms of error bars. In this paper we consider the evaluation of error bars in the context of the class of generalized linear regression models. We provide insights into the dependence of the error bars on the location of the data points and we derive an upper bound on the true error bars in terms of the contributions from individual data points which are themselves easily evaluated.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background/Aim - People of south Asian origin have an excessive risk of morbidity and mortality from cardiovascular disease. We examined the effect of ethnicity on known risk factors and analysed the risk of cardiovascular events and mortality in UK south Asian and white Europeans patients with type 2 diabetes over a 2 year period. Methods - A total of 1486 south Asian (SA) and 492 white European (WE) subjects with type 2 diabetes were recruited from 25 general practices in Coventry and Birmingham, UK. Baseline data included clinical history, anthropometry and measurements of traditional risk factors – blood pressure, total cholesterol, HbA1c. Multiple linear regression models were used to examine ethnicity differences in individual risk factors. Ten-year cardiovascular risk was estimated using the Framingham and UKPDS equations. All subjects were followed up for 2 years. Cardiovascular events (CVD) and mortality between the two groups were compared. Findings - Significant differences were noted in risk profiles between both groups. After adjustment for clustering and confounding a significant ethnicity effect remained only for higher HbA1c (0.50 [0.22 to 0.77]; P?=?0.0004) and lower HDL (-0.09 [-0.17 to -0.01]; P?=?0.0266). Baseline CVD history was predictive of CVD events during follow-up for SA (P?

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The main aim of this paper is to provide a tutorial on regression with Gaussian processes. We start from Bayesian linear regression, and show how by a change of viewpoint one can see this method as a Gaussian process predictor based on priors over functions, rather than on priors over parameters. This leads in to a more general discussion of Gaussian processes in section 4. Section 5 deals with further issues, including hierarchical modelling and the setting of the parameters that control the Gaussian process, the covariance functions for neural network models and the use of Gaussian processes in classification problems.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The spatial patterns of discrete beta-amyloid (Abeta) deposits in brain tissue from patients with Alzheimer disease (AD) were studied using a statistical method based on linear regression, the results being compared with the more conventional variance/mean (V/M) method. Both methods suggested that Abeta deposits occurred in clusters (400 to <12,800 mu m in diameter) in all but 1 of the 42 tissues examined. In many tissues, a regular periodicity of the Abeta deposit clusters parallel to the tissue boundary was observed. In 23 of 42 (55%) tissues, the two methods revealed essentially the same spatial patterns of Abeta deposits; in 15 of 42 (36%), the regression method indicated the presence of clusters at a scale not revealed by the V/M method; and in 4 of 42 (9%), there was no agreement between the two methods. Perceived advantages of the regression method are that there is a greater probability of detecting clustering at multiple scales, the dimension of larger Abeta clusters can be estimated more accurately, and the spacing between the clusters may be estimated. However, both methods may be useful, with the regression method providing greater resolution and the V/M method providing greater simplicity and ease of interpretation. Estimates of the distance between regularly spaced Abeta clusters were in the range 2,200-11,800 mu m, depending on tissue and cluster size. The regular periodicity of Abeta deposit clusters in many tissues would be consistent with their development in relation to clusters of neurons that give rise to specific neuronal projections.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In previous statnotes, the application of correlation and regression methods to the analysis of two variables (X,Y) was described. These methods can be used to determine whether there is a linear relationship between the two variables, whether the relationship is positive or negative, to test the degree of significance of the linear relationship, and to obtain an equation relating Y to X. This Statnote extends the methods of linear correlation and regression to situations where there are two or more X variables, i.e., 'multiple linear regression’.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Multiple Pheromone Ant Clustering Algorithm (MPACA) models the collective behaviour of ants to find clusters in data and to assign objects to the most appropriate class. It is an ant colony optimisation approach that uses pheromones to mark paths linking objects that are similar and potentially members of the same cluster or class. Its novelty is in the way it uses separate pheromones for each descriptive attribute of the object rather than a single pheromone representing the whole object. Ants that encounter other ants frequently enough can combine the attribute values they are detecting, which enables the MPACA to learn influential variable interactions. This paper applies the model to real-world data from two domains. One is logistics, focusing on resource allocation rather than the more traditional vehicle-routing problem. The other is mental-health risk assessment. The task for the MPACA in each domain was to predict class membership where the classes for the logistics domain were the levels of demand on haulage company resources and the mental-health classes were levels of suicide risk. Results on these noisy real-world data were promising, demonstrating the ability of the MPACA to find patterns in the data with accuracy comparable to more traditional linear regression models. © 2013 Polish Information Processing Society.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Linear models reach their limitations in applications with nonlinearities in the data. In this paper new empirical evidence is provided on the relative Euro inflation forecasting performance of linear and non-linear models. The well established and widely used univariate ARIMA and multivariate VAR models are used as linear forecasting models whereas neural networks (NN) are used as non-linear forecasting models. It is endeavoured to keep the level of subjectivity in the NN building process to a minimum in an attempt to exploit the full potentials of the NN. It is also investigated whether the historically poor performance of the theoretically superior measure of the monetary services flow, Divisia, relative to the traditional Simple Sum measure could be attributed to a certain extent to the evaluation of these indices within a linear framework. Results obtained suggest that non-linear models provide better within-sample and out-of-sample forecasts and linear models are simply a subset of them. The Divisia index also outperforms the Simple Sum index when evaluated in a non-linear framework. © 2005 Taylor & Francis Group Ltd.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Non-linear relationships are common in microbiological research and often necessitate the use of the statistical techniques of non-linear regression or curve fitting. In some circumstances, the investigator may wish to fit an exponential model to the data, i.e., to test the hypothesis that a quantity Y either increases or decays exponentially with increasing X. This type of model is straight forward to fit as taking logarithms of the Y variable linearises the relationship which can then be treated by the methods of linear regression.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In some circumstances, there may be no scientific model of the relationship between X and Y that can be specified in advance and indeed the objective of the investigation may be to provide a ‘curve of best fit’ for predictive purposes. In such an example, the fitting of successive polynomials may be the best approach. There are various strategies to decide on the polynomial of best fit depending on the objectives of the investigation.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

1. The techniques associated with regression, whether linear or non-linear, are some of the most useful statistical procedures that can be applied in clinical studies in optometry. 2. In some cases, there may be no scientific model of the relationship between X and Y that can be specified in advance and the objective may be to provide a ‘curve of best fit’ for predictive purposes. In such cases, the fitting of a general polynomial type curve may be the best approach. 3. An investigator may have a specific model in mind that relates Y to X and the data may provide a test of this hypothesis. Some of these curves can be reduced to a linear regression by transformation, e.g., the exponential and negative exponential decay curves. 4. In some circumstances, e.g., the asymptotic curve or logistic growth law, a more complex process of curve fitting involving non-linear estimation will be required.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the global economy, innovation is one of the most important competitive assets for companies willing to compete in international markets. As competition moves from standardised products to customised ones, depending on each specific market needs, economies of scale are not anymore the only winning strategy. Innovation requires firms to establish processes to acquire and absorb new knowledge, leading to the recent theory of Open Innovation. Knowledge sharing and acquisition happens when firms are embedded in networks with other firms, university, institutions and many other economic actors. Several typologies of innovation and firm networks have been identified, with various geographical spans. One of the first being modelled was the Industrial Cluster (or in Italian Distretto Industriale) which was for long considered the benchmark for innovation and economic development. Other kind of networks have been modelled since the late 1970s; Regional Innovation Systems represent one of the latest and more diffuse model of innovation networks, specifically introduced to combine local networks and the global economy. This model was qualitatively exploited since its introduction, but, together with National Innovation Systems, is among the most inspiring for policy makers and is often cited by them, not always properly. The aim of this research is to setup an econometric model describing Regional Innovation Systems, becoming one the first attempts to test and enhance this theory with a quantitative approach. A dataset of 104 secondary and primary data from European regions was built in order to run a multiple linear regression, testing if Regional Innovation Systems are really correlated to regional innovation and regional innovation in cooperation with foreign partners. Furthermore, an exploratory multiple linear regression was performed to verify which variables, among those describing a Regional Innovation Systems, are the most significant for innovating, alone or with foreign partners. Furthermore, the effectiveness of present innovation policies has been tested based on the findings of the econometric model. The developed model confirmed the role of Regional Innovation Systems for creating innovation even in cooperation with international partners: this represents one of the firsts quantitative confirmation of a theory previously based on qualitative models only. Furthermore the results of this model confirmed a minor influence of National Innovation Systems: comparing the analysis of existing innovation policies, both at regional and national level, to our findings, emerged the need for potential a pivotal change in the direction currently followed by policy makers. Last, while confirming the role of the presence a learning environment in a region and the catalyst role of regional administration, this research offers a potential new perspective for the whole private sector in creating a Regional Innovation System.