Biblioteca Digital

931 results for multiple linear regression analysis

A Combinatorial Approach to the Variable Selection in Multiple Linear Regression: Analysis of Selwood et al. Data Set - A Case Study.

Abstract:

A combinatorial protocol (CP) is introduced here and interfaced with multiple linear regression (MLR) for variable selection. The efficiency of CP-MLR rests primarily on restricting the entry of correlated variables at the model development stage. It has been used to analyse the Selwood et al. data set [16], and the resulting models are compared with those reported from the GFA [8] and MUSEUM [9] approaches. For this data set, CP-MLR identified three highly independent models (27, 28 and 31) with Q2 values in the range 0.632-0.518. These models are also divergent and unique. Although the present study shares no models with the GFA [8] and MUSEUM [9] results, several descriptors are common to all these studies, including the present one. A simulation is also carried out on the same data set to explain model formation in CP-MLR. The results demonstrate that the proposed method should be able to offer solutions for data sets with 50 to 60 descriptors in a reasonable time frame. By carefully selecting the inter-parameter correlation cutoff values in CP-MLR, one can identify divergent models and handle data sets larger than the present one without excessive computer time.
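The correlation-restriction idea can be illustrated with a minimal sketch. It is not the authors' CP-MLR implementation: the abstract does not spell out the protocol's filters, so the function names, the toy descriptors and the single pairwise-correlation rule below are all assumptions made for illustration. The sketch enumerates descriptor subsets and keeps only those whose members stay below a chosen inter-parameter correlation cutoff; regression models would then be fitted to the surviving subsets.

```python
from itertools import combinations
from math import sqrt


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def candidate_subsets(descriptors, size, cutoff):
    """Yield descriptor-name tuples whose pairwise |r| stays below cutoff.

    `descriptors` maps a (hypothetical) descriptor name to its column of values.
    """
    names = sorted(descriptors)
    for subset in combinations(names, size):
        if all(abs(pearson(descriptors[p], descriptors[q])) < cutoff
               for p, q in combinations(subset, 2)):
            yield subset


# Toy data (invented): "b" is nearly a copy of "a"; "c" is weakly related to both.
toy = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [1.1, 2.0, 2.9, 4.2, 5.0],
    "c": [2.0, -1.0, 0.5, -0.5, 1.0],
}
kept = list(candidate_subsets(toy, size=2, cutoff=0.5))
print(kept)
```

With this toy data the pair ("a", "b") is rejected because the two columns are almost collinear, so only subsets pairing "c" with one of the others go forward. A lower cutoff prunes the search space more aggressively, which is one plausible reading of how the cutoff keeps computing time manageable.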

Dimension Reduction of the Explanatory Variables in Multiple Linear Regression

Abstract:

2002 Mathematics Subject Classification: 62J05, 62G35.

Data analysis methods in optometry Part 7: multiple linear regression

Abstract:

Multiple regression analysis is a complex statistical method with many potential uses. It has also become one of the most abused of all statistical procedures, since anyone with a database and suitable software can carry it out. An investigator should always have a clear hypothesis in mind before carrying out such a procedure, and should know the limitations of each aspect of the analysis. In addition, multiple regression is probably best used in an exploratory context, identifying variables that might profitably be examined by more detailed studies. Where there are many variables potentially influencing Y, they are likely to be intercorrelated and to account for relatively small amounts of the variance. Any analysis in which R squared is less than 50% should be suspect as probably not indicating the presence of significant variables. A further problem relates to sample size. It is often stated that the number of subjects or patients must be at least 5-10 times the number of variables included in the study [5]. This advice should be taken only as a rough guide, but it does indicate that the variables included should be selected with great care, as inclusion of an obviously unimportant variable may have a significant impact on the sample size required.

Predicting class I major histocompatibility complex (MHC) binders using multivariate statistics: comparison of discriminant analysis and multiple linear regression

Abstract:

The accurate in silico identification of T-cell epitopes is a critical step in the development of peptide-based vaccines, reagents, and diagnostics. It has a direct impact on the success of subsequent experimental work. Epitopes arise as a consequence of complex proteolytic processing within the cell. Prior to being recognized by T cells, an epitope is presented on the cell surface as a complex with a major histocompatibility complex (MHC) protein. A prerequisite therefore for T-cell recognition is that an epitope is also a good MHC binder. Thus, T-cell epitope prediction overlaps strongly with the prediction of MHC binding. In the present study, we compare discriminant analysis and multiple linear regression as algorithmic engines for the definition of quantitative matrices for binding affinity prediction. We apply these methods to peptides which bind the well-studied human MHC allele HLA-A*0201. A matrix which results from combining results of the two methods proved powerfully predictive under cross-validation. The new matrix was also tested on an external set of 160 binders to HLA-A*0201; it was able to recognize 135 (84%) of them.

Applying Least Absolute Shrinkage Selection Operator and Akaike Information Criterion Analysis to Find the Best Multiple Linear Regression Models between Climate Indices and Components of Cow’s Milk

Abstract:

This study focuses on multiple linear regression models relating six climate indices (temperature humidity THI, environmental stress ESI, equivalent temperature index ETI, heat load HLI, modified HLI (HLI new), and respiratory rate predictor RRP) to three main components of cow’s milk (yield, fat, and protein) for cows in Iran. The least absolute shrinkage selection operator (LASSO) and the Akaike information criterion (AIC) techniques are applied to select the best model for milk predictands with the smallest number of climate predictors. Uncertainty is estimated by bootstrapping through resampling. Cross-validation is used to avoid over-fitting. Climatic parameters are calculated from the NASA-MERRA global atmospheric reanalysis. Milk data for the months from April to September, 2002 to 2010, are used. The best linear regression models are found in spring between milk yield as the predictand and THI, ESI, ETI, HLI, and RRP as predictors, with p-value < 0.001 and R2 (0.50, 0.49) respectively. In summer, milk yield with the independent variables THI, ETI, and ESI shows the strongest relation (p-value < 0.001), with R2 of 0.69. For fat and protein the results are only marginal. This method is suggested for impact studies of climate variability/change in the agriculture and food science fields when short time series or data with large uncertainty are available.
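As a rough illustration of the AIC step only (the LASSO, bootstrap and cross-validation stages are not shown), the sketch below ranks candidate least-squares models by the Gaussian AIC written up to an additive constant, n ln(RSS/n) + 2k. The function names, model labels, residual sums of squares and sample size are invented for the example and are not taken from the study.

```python
from math import log


def gaussian_aic(rss, n, k):
    """AIC of a least-squares fit, up to an additive constant: n*ln(RSS/n) + 2k.

    rss: residual sum of squares, n: number of observations,
    k: number of estimated parameters (intercept included).
    """
    return n * log(rss / n) + 2 * k


def best_by_aic(candidates, n):
    """Return the name of the candidate model with the smallest AIC.

    `candidates` maps a model name to a (rss, k) pair.
    """
    def score(name):
        rss, k = candidates[name]
        return gaussian_aic(rss, n, k)

    return min(candidates, key=score)


# Hypothetical fits on n = 50 observations (all numbers invented).
fits = {
    "one predictor": (12.0, 2),
    "two predictors": (11.8, 3),
    "three predictors": (9.0, 4),
}
choice = best_by_aic(fits, n=50)
print(choice)
```

In this made-up case the second predictor lowers the residual sum of squares too little to pay for its extra parameter, while the third model's larger improvement outweighs the penalty. This is the trade-off that lets AIC favour models with few predictors.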

Multiple Linear Regression and Artificial Neural Networks to Predict Time and Efficiency of Soil Vapor Extraction

Abstract:

Predicting the time and efficiency of the remediation of contaminated soils by soil vapor extraction remains a difficult challenge for the scientific community and consultants. This work reports the development of multiple linear regression and artificial neural network models to predict the remediation time and efficiency of soil vapor extractions performed in soils contaminated separately with benzene, toluene, ethylbenzene, xylene, trichloroethylene, and perchloroethylene. The results demonstrated that the artificial neural network approach performs better than the multiple linear regression models. The artificial neural network model allowed an accurate prediction of remediation time and efficiency based only on soil and pollutant characteristics, consequently allowing a simple and quick preliminary evaluation of process viability.

Methods of Estimation in Multiple Linear Regression: Application to Clinical Data

Abstract:

This article evaluates different parameter estimation strategies for a multiple linear regression model. The model parameters were estimated using data from a clinical trial that aimed to verify whether the mechanical test of the maximum-force property (EM-FM) is associated with femoral mass, femoral diameter and the experimental group of ovariectomized female rats of the species Rattus norvegicus albinus, Wistar variety. Three methodologies for estimating the model parameters are compared: the classical methodology, based on the least squares method; the Bayesian methodology, based on Bayes' theorem; and the bootstrap method, based on resampling procedures.
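Two of the three strategies can be contrasted in a short sketch. It is deliberately simplified: it fits a single predictor rather than the article's three, it omits the Bayesian approach, and the femoral mass and maximum force values are invented. The function names and the percentile interval are assumptions of this illustration, not the article's procedure.

```python
import random


def ols_line(x, y):
    """Classical least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def bootstrap_slope_interval(x, y, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the slope, resampling (x, y) pairs."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:  # degenerate resample: slope is undefined
            continue
        slopes.append(ols_line(xs, [y[i] for i in idx])[1])
    slopes.sort()
    lower = slopes[int((1 - level) / 2 * n_boot)]
    upper = slopes[int((1 + level) / 2 * n_boot) - 1]
    return lower, upper


# Invented data: femoral mass (g) against maximum force (N).
mass = [0.61, 0.66, 0.70, 0.74, 0.79, 0.83, 0.88, 0.92]
force = [98.0, 104.0, 103.0, 112.0, 118.0, 117.0, 126.0, 131.0]
intercept, slope = ols_line(mass, force)
low, high = bootstrap_slope_interval(mass, force)
print(f"slope = {slope:.1f}, bootstrap interval = ({low:.1f}, {high:.1f})")
```

The least-squares fit gives a single point estimate, while the bootstrap describes its sampling variability without assuming normally distributed errors. Extending the sketch to several predictors would replace `ols_line` with a solver for the normal equations.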

Data methods in optometry. Part 10: non-linear regression analysis

Abstract:

1. The techniques associated with regression, whether linear or non-linear, are some of the most useful statistical procedures that can be applied in clinical studies in optometry. 2. In some cases, there may be no scientific model of the relationship between X and Y that can be specified in advance and the objective may be to provide a ‘curve of best fit’ for predictive purposes. In such cases, the fitting of a general polynomial type curve may be the best approach. 3. An investigator may have a specific model in mind that relates Y to X and the data may provide a test of this hypothesis. Some of these curves can be reduced to a linear regression by transformation, e.g., the exponential and negative exponential decay curves. 4. In some circumstances, e.g., the asymptotic curve or logistic growth law, a more complex process of curve fitting involving non-linear estimation will be required.
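Point 3 can be made concrete with a small sketch. It assumes the model Y = a·exp(bX): taking logarithms gives ln Y = ln a + bX, which is an ordinary straight-line regression of ln Y on X. The function name and the synthetic data are illustrative choices.

```python
from math import exp, log


def fit_exponential(x, y):
    """Fit y = a * exp(b * x) by least squares on (x, ln y).

    Every y value must be positive for the logarithm to exist.
    """
    ln_y = [log(v) for v in y]
    n = len(x)
    mean_x, mean_ly = sum(x) / n, sum(ln_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (li - mean_ly) for xi, li in zip(x, ln_y))
    b = sxy / sxx                   # slope of the straight line in log space
    a = exp(mean_ly - b * mean_x)   # back-transform the intercept
    return a, b


# Synthetic, noise-free negative exponential decay with a = 3 and b = -0.5.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [3.0 * exp(-0.5 * t) for t in xs]
a, b = fit_exponential(xs, ys)
print(round(a, 6), round(b, 6))
```

The fit minimises squared errors on the logarithmic scale, so with noisy data it weights observations differently from a direct non-linear fit on the original scale. Curves such as the asymptotic or logistic forms in point 4 have no such transformation and need iterative non-linear estimation.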

Analysis and estimative of schistosomiasis prevalence for the state of Minas Gerais, Brazil, using multiple regression with social and environmental spatial data

Abstract:

The aim of this work is to establish a relationship between schistosomiasis prevalence and social-environmental variables in the state of Minas Gerais, Brazil, through multiple linear regression. The final regression model was established, after a variable selection phase, with a set of spatial variables comprising the summer minimum temperature, the human development index, and vegetation type. Based on this model, a schistosomiasis risk map was built for Minas Gerais.

Multiple linear and principal component regressions for modelling ecotoxicity bioassay response

Abstract:

The ecotoxicological response of the living organisms in an aquatic system depends on the physical, chemical and bacteriological variables, as well as the interactions between them. An important challenge to scientists is to understand the interaction and behaviour of factors involved in a multidimensional process such as the ecotoxicological response. With this aim, multiple linear regression (MLR) and principal component regression were applied to the ecotoxicity bioassay response of Chlorella vulgaris and Vibrio fischeri in water collected at seven sites of the Leça River during five monitoring campaigns (February, May, June, August and September of 2006). The river water characterization included the analysis of 22 physicochemical and 3 microbiological parameters. The model that best fitted the data was MLR, which shows: (i) a negative correlation with dissolved organic carbon, zinc and manganese, and a positive one with turbidity and arsenic, regarding C. vulgaris toxic response; (ii) a negative correlation with conductivity and turbidity and a positive one with phosphorus, hardness, iron, mercury, arsenic and faecal coliforms, concerning V. fischeri toxic response. This integrated assessment may allow the evaluation of the effect of future pollution abatement measures on the water quality of the Leça River.

A study of the effect of heat source location in a ventilated room using multiple regression analysis

Abstract:

Multiple regression analysis is a statistical technique that allows a dependent variable to be predicted from more than one independent variable and the influential independent variables to be identified. In this study, multiple regression analysis is applied to experimental data to predict the room mean velocity and to determine the parameters that most influence the velocity. More than 120 experiments for four different heat source locations were carried out in a test chamber with a high-level wall-mounted air supply terminal at air change rates of 3-6 ach. The influence of environmental parameters such as supply air momentum, room heat load, Archimedes number and local temperature ratio was examined by two methods: a simple regression analysis incorporated into scatter matrix plots, and multiple stepwise regression analysis. It is concluded that, when a heat source is located along the jet centre line, the supply momentum mainly influences the room mean velocity regardless of the plume strength. However, when the heat source is located outside the jet region, the local temperature ratio (the inverse of the local heat removal effectiveness) is a major influencing parameter.

AGE INFLUENCE ON THE HEART RATE BEHAVIOR ON THE REST-EXERCISE TRANSITION: AN ANALYSIS BY DELTAS AND LINEAR REGRESSION

Abstract:

Background: Changes in heart rate during the rest-exercise transition can be characterized by mathematical calculations: deltas over 0-10 and 0-30 seconds to make inferences about the parasympathetic nervous system, and linear regression and a delta applied to the 60-240 second data range to make inferences about the sympathetic nervous system. The objective of this study was to test the hypothesis that young and middle-aged subjects have different heart rate responses to moderate- and high-intensity exercise, as assessed with these different calculations. Methods: Seven middle-aged men and ten young men, all apparently healthy, were subjected to constant-load tests (intense and moderate) on a cycle ergometer. The heart rate data were submitted to analysis of deltas (0-10, 0-30 and 60-240 seconds) and simple linear regression (60-240 seconds). The parameters obtained from the simple linear regression analysis were the intercept and the slope angle. We used the Shapiro-Wilk test to check the distribution of the data and the unpaired t-test for comparisons between groups. The level of statistical significance was 5%. Results: The intercept and the 0-10 second delta were lower in the middle-aged group at both loads tested, and the slope angle was lower in the middle-aged group during moderate exercise. Conclusion: Young subjects show a greater magnitude of vagal withdrawal in the initial stage of the HR response during constant-load exercise and a faster adjustment of the sympathetic response in moderate exercise.
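The calculations named in the Methods can be sketched on a synthetic heart rate series. Several things here are assumptions of the illustration rather than details from the study: a 10-second sampling grid, a perfectly linear synthetic response, the function names, and the reporting of a plain slope (beats per minute per second) instead of a slope angle.

```python
def ols_line(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def delta(times, hr, start, end):
    """Change in heart rate between two sampled instants (in seconds)."""
    return hr[times.index(end)] - hr[times.index(start)]


def window_fit(times, hr, start, end):
    """Intercept and slope of a line fitted to samples with start <= t <= end."""
    window = [(t, h) for t, h in zip(times, hr) if start <= t <= end]
    ts = [t for t, _ in window]
    hs = [h for _, h in window]
    return ols_line(ts, hs)


# Synthetic series: 80 bpm at exercise onset, rising by 0.1 bpm each second.
times = list(range(0, 241, 10))
hr = [80.0 + 0.1 * t for t in times]

early_deltas = (delta(times, hr, 0, 10), delta(times, hr, 0, 30))
late_delta = delta(times, hr, 60, 240)
intercept, slope = window_fit(times, hr, 60, 240)
print(early_deltas, late_delta, intercept, slope)
```

On real recordings the early deltas and the late-window fit would differ in character, which is what allows them to be read as separate markers of vagal withdrawal and sympathetic adjustment.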

Comparison of Some Improved Estimators for Linear Regression Model under Different Conditions

Abstract:

The multiple linear regression model plays a key role in statistical inference and has extensive applications in the business, environmental, physical and social sciences. Multicollinearity has been a considerable problem in multiple regression analysis. When the regressor variables are multicollinear, it becomes difficult to make precise statistical inferences about the regression coefficients. The statistical methods discussed in this thesis for this situation are the ridge regression, Liu, two-parameter biased and LASSO estimators. First, an analytical comparison on the basis of risk was made among the ridge, Liu and LASSO estimators under the orthonormal regression model. I found that LASSO dominates the least squares, ridge and Liu estimators over a significant portion of the parameter space for large dimension. Second, a simulation study was conducted to compare the performance of the ridge, Liu and two-parameter biased estimators by the mean squared error criterion. I found that the two-parameter biased estimator performs better than its corresponding ridge regression estimator. Overall, the Liu estimator performs better than both the ridge and two-parameter biased estimators.
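Under an orthonormal design (X'X = I) each of these estimators reduces to a simple coordinate-wise transformation of the least squares estimate, which is what makes an analytical risk comparison tractable. The sketch below uses the standard closed forms, not the thesis's own derivations. The parameter names `k`, `d` and `lam` are illustrative, and the LASSO form assumes the objective 0.5·||y − Xβ||² + lam·||β||₁.

```python
def ridge_orthonormal(b_ols, k):
    """Ridge estimate (X'X + kI)^-1 X'y when X'X = I: uniform shrinkage by 1/(1+k)."""
    return [b / (1.0 + k) for b in b_ols]


def liu_orthonormal(b_ols, d):
    """Liu estimate (X'X + I)^-1 (X'y + d*b_ols) when X'X = I: scaling by (1+d)/2."""
    return [(1.0 + d) / 2.0 * b for b in b_ols]


def lasso_orthonormal(b_ols, lam):
    """LASSO estimate when X'X = I: soft-thresholding of each coefficient at lam."""
    return [(1.0 if b > 0 else -1.0) * max(abs(b) - lam, 0.0) for b in b_ols]


# Illustrative least squares coefficients: one large, one small.
b_ols = [2.0, -0.3]
print(ridge_orthonormal(b_ols, k=1.0))    # both coefficients halved
print(liu_orthonormal(b_ols, d=0.5))      # both coefficients scaled by 0.75
print(lasso_orthonormal(b_ols, lam=0.5))  # the small coefficient is set to zero
```

Ridge and Liu rescale every coefficient by a common factor, whereas LASSO subtracts a fixed amount and zeroes small coefficients entirely. That difference is a natural reason for the risk comparison to depend on the region of the parameter space.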

Prediction of fat-free mass using bioelectrical impedance analysis in young adults from five populations of African origin.

Abstract:

Background/objectives: Bioelectrical impedance analysis (BIA) is used in population and clinical studies as a technique for estimating body composition. Because of significant under-representation in existing literature, we sought to develop and validate predictive equation(s) for BIA for studies in populations of African origin. Subjects/methods: Among five cohorts of the Modeling the Epidemiologic Transition Study, height, weight, waist circumference and body composition, using isotope dilution, were measured in 362 adults, ages 25-45 with mean body mass indexes ranging from 24 to 32. BIA measures of resistance and reactance were measured using tetrapolar placement of electrodes and the same model of analyzer across sites (BIA 101Q, RJL Systems). Multiple linear regression analysis was used to develop equations for predicting fat-free mass (FFM), as measured by isotope dilution; covariates included sex, age, waist, reactance and height²/resistance, along with dummy variables for each site. Developed equations were then tested in a validation sample; FFM predicted by previously published equations were tested in the total sample. Results: A site-combined equation and site-specific equations were developed. The mean differences between FFM (reference) and FFM predicted by the study-derived equations were between 0.4 and 0.6 kg (that is, 1% difference between the actual and predicted FFM), and the measured and predicted values were highly correlated. The site-combined equation performed slightly better than the site-specific equations and the previously published equations. Conclusions: Relatively small differences exist between BIA equations to estimate FFM, whether study-derived or published equations, although the site-combined equation performed slightly better than others. The study-derived equations provide an important tool for research in these understudied populations.

Analysis of the erosive effect of different dietary substances and medications

Abstract:

Excessive consumption of acidic drinks and foods contributes to tooth erosion. The aims of the present in vitro study were twofold: (1) to assess the erosive potential of different dietary substances and medications; (2) to determine the chemical properties with an impact on the erosive potential. We selected sixty agents: soft drinks, an energy drink, sports drinks, alcoholic drinks, juice, fruit, mineral water, yogurt, tea, coffee, salad dressing and medications. The erosive potential of the tested agents was quantified as the changes in surface hardness (ΔSH) of enamel specimens within the first 2 min (ΔSH2-0 = SH2 min - SHbaseline) and the second 2 min exposure (ΔSH4-2 = SH4 min - SH2 min). To characterise these agents, various chemical properties, e.g. pH, concentrations of Ca, Pi and F, titratable acidity to pH 7·0 and buffering capacity at the original pH value (β), as well as degree of saturation (pK - pI) with respect to hydroxyapatite (HAP) and fluorapatite (FAP), were determined. Erosive challenge caused a statistically significant reduction in SH for all agents except for coffee, some medications and alcoholic drinks, and non-flavoured mineral waters, teas and yogurts (P < 0·01). By multiple linear regression analysis, 52 % of the variation in ΔSH after 2 min and 61 % after 4 min immersion were explained by pH, β and concentrations of F and Ca (P < 0·05). pH was the variable with the highest impact in multiple regression and bivariate correlation analyses. Furthermore, a high bivariate correlation was also obtained between (pK - pI)HAP, (pK - pI)FAP and ΔSH.
