828 resultados para Linear regression analysis
Resumo:
A combinatorial protocol (CP) is introduced here to interface it with the multiple linear regression (MLR) for variable selection. The efficiency of CP-MLR is primarily based on the restriction of entry of correlated variables to the model development stage. It has been used for the analysis of Selwood et al data set [16], and the obtained models are compared with those reported from GFA [8] and MUSEUM [9] approaches. For this data set CP-MLR could identify three highly independent models (27, 28 and 31) with Q2 value in the range of 0.632-0.518. Also, these models are divergent and unique. Even though, the present study does not share any models with GFA [8], and MUSEUM [9] results, there are several descriptors common to all these studies, including the present one. Also a simulation is carried out on the same data set to explain the model formation in CP-MLR. The results demonstrate that the proposed method should be able to offer solutions to data sets with 50 to 60 descriptors in reasonable time frame. By carefully selecting the inter-parameter correlation cutoff values in CP-MLR one can identify divergent models and handle data sets larger than the present one without involving excessive computer time.
Resumo:
1. The techniques associated with regression, whether linear or non-linear, are some of the most useful statistical procedures that can be applied in clinical studies in optometry. 2. In some cases, there may be no scientific model of the relationship between X and Y that can be specified in advance and the objective may be to provide a ‘curve of best fit’ for predictive purposes. In such cases, the fitting of a general polynomial type curve may be the best approach. 3. An investigator may have a specific model in mind that relates Y to X and the data may provide a test of this hypothesis. Some of these curves can be reduced to a linear regression by transformation, e.g., the exponential and negative exponential decay curves. 4. In some circumstances, e.g., the asymptotic curve or logistic growth law, a more complex process of curve fitting involving non-linear estimation will be required.
Resumo:
Background: Changes in heart rate during rest-exercise transition can be characterized by the application of mathematical calculations, such as deltas 0-10 and 0-30 seconds to infer on the parasympathetic nervous system and linear regression and delta applied to data range from 60 to 240 seconds to infer on the sympathetic nervous system. The objective of this study was to test the hypothesis that young and middle-aged subjects have different heart rate responses in exercise of moderate and intense intensity, with different mathematical calculations. Methods: Seven middle-aged men and ten young men apparently healthy were subject to constant load tests (intense and moderate) in cycle ergometer. The heart rate data were submitted to analysis of deltas (0-10, 0-30 and 60-240 seconds) and simple linear regression (60-240 seconds). The parameters obtained from simple linear regression analysis were: intercept and slope angle. We used the Shapiro-Wilk test to check the distribution of data and the "t" test for unpaired comparisons between groups. The level of statistical significance was 5%. Results: The value of the intercept and delta 0-10 seconds was lower in middle age in two loads tested and the inclination angle was lower in moderate exercise in middle age. Conclusion: The young subjects present greater magnitude of vagal withdrawal in the initial stage of the HR response during constant load exercise and higher speed of adjustment of sympathetic response in moderate exercise.
Resumo:
2002 Mathematics Subject Classification: 62J05, 62G35.
Resumo:
Processor architects have a challenging task of evaluating a large design space consisting of several interacting parameters and optimizations. In order to assist architects in making crucial design decisions, we build linear regression models that relate Processor performance to micro-architecture parameters, using simulation based experiments. We obtain good approximate models using an iterative process in which Akaike's information criteria is used to extract a good linear model from a small set of simulations, and limited further simulation is guided by the model using D-optimal experimental designs. The iterative process is repeated until desired error bounds are achieved. We used this procedure to establish the relationship of the CPI performance response to 26 key micro-architectural parameters using a detailed cycle-by-cycle superscalar processor simulator The resulting models provide a significance ordering on all micro-architectural parameters and their interactions, and explain the performance variations of micro-architectural techniques.
Resumo:
(ABR) is of fundamental importance to the investiga- tion of the auditory system behavior, though its in- terpretation has a subjective nature because of the manual process employed in its study and the clinical experience required for its analysis. When analyzing the ABR, clinicians are often interested in the identi- fication of ABR signal components referred to as Jewett waves. In particular, the detection and study of the time when these waves occur (i.e., the wave la- tency) is a practical tool for the diagnosis of disorders affecting the auditory system. In this context, the aim of this research is to compare ABR manual/visual analysis provided by different examiners. Methods: The ABR data were collected from 10 normal-hearing subjects (5 men and 5 women, from 20 to 52 years). A total of 160 data samples were analyzed and a pair- wise comparison between four distinct examiners was executed. We carried out a statistical study aiming to identify significant differences between assessments provided by the examiners. For this, we used Linear Regression in conjunction with Bootstrap, as a me- thod for evaluating the relation between the responses given by the examiners. Results: The analysis sug- gests agreement among examiners however reveals differences between assessments of the variability of the waves. We quantified the magnitude of the ob- tained wave latency differences and 18% of the inves- tigated waves presented substantial differences (large and moderate) and of these 3.79% were considered not acceptable for the clinical practice. Conclusions: Our results characterize the variability of the manual analysis of ABR data and the necessity of establishing unified standards and protocols for the analysis of these data. These results may also contribute to the validation and development of automatic systems that are employed in the early diagnosis of hearing loss.
Resumo:
In this paper, we study the influence of the National Telecom Business Volume by the data in 2008 that have been published in China Statistical Yearbook of Statistics. We illustrate the procedure of modeling “National Telecom Business Volume” on the following eight variables, GDP, Consumption Levels, Retail Sales of Social Consumer Goods Total Renovation Investment, the Local Telephone Exchange Capacity, Mobile Telephone Exchange Capacity, Mobile Phone End Users, and the Local Telephone End Users. The testing of heteroscedasticity and multicollinearity for model evaluation is included. We also consider AIC and BIC criterion to select independent variables, and conclude the result of the factors which are the optimal regression model for the amount of telecommunications business and the relation between independent variables and dependent variable. Based on the final results, we propose several recommendations about how to improve telecommunication services and promote the economic development.
Resumo:
In this thesis, we consider Bayesian inference on the detection of variance change-point models with scale mixtures of normal (for short SMN) distributions. This class of distributions is symmetric and thick-tailed and includes as special cases: Gaussian, Student-t, contaminated normal, and slash distributions. The proposed models provide greater flexibility to analyze a lot of practical data, which often show heavy-tail and may not satisfy the normal assumption. As to the Bayesian analysis, we specify some prior distributions for the unknown parameters in the variance change-point models with the SMN distributions. Due to the complexity of the joint posterior distribution, we propose an efficient Gibbs-type with Metropolis- Hastings sampling algorithm for posterior Bayesian inference. Thereafter, following the idea of [1], we consider the problems of the single and multiple change-point detections. The performance of the proposed procedures is illustrated and analyzed by simulation studies. A real application to the closing price data of U.S. stock market has been analyzed for illustrative purposes.
Resumo:
Issued May 1980.
Spatial pattern analysis of beta-amyloid (A beta) deposits in Alzheimer disease by linear regression
Resumo:
The spatial patterns of discrete beta-amyloid (Abeta) deposits in brain tissue from patients with Alzheimer disease (AD) were studied using a statistical method based on linear regression, the results being compared with the more conventional variance/mean (V/M) method. Both methods suggested that Abeta deposits occurred in clusters (400 to <12,800 mu m in diameter) in all but 1 of the 42 tissues examined. In many tissues, a regular periodicity of the Abeta deposit clusters parallel to the tissue boundary was observed. In 23 of 42 (55%) tissues, the two methods revealed essentially the same spatial patterns of Abeta deposits; in 15 of 42 (36%), the regression method indicated the presence of clusters at a scale not revealed by the V/M method; and in 4 of 42 (9%), there was no agreement between the two methods. Perceived advantages of the regression method are that there is a greater probability of detecting clustering at multiple scales, the dimension of larger Abeta clusters can be estimated more accurately, and the spacing between the clusters may be estimated. However, both methods may be useful, with the regression method providing greater resolution and the V/M method providing greater simplicity and ease of interpretation. Estimates of the distance between regularly spaced Abeta clusters were in the range 2,200-11,800 mu m, depending on tissue and cluster size. The regular periodicity of Abeta deposit clusters in many tissues would be consistent with their development in relation to clusters of neurons that give rise to specific neuronal projections.
Resumo:
Multiple regression analysis is a complex statistical method with many potential uses. It has also become one of the most abused of all statistical procedures since anyone with a data base and suitable software can carry it out. An investigator should always have a clear hypothesis in mind before carrying out such a procedure and knowledge of the limitations of each aspect of the analysis. In addition, multiple regression is probably best used in an exploratory context, identifying variables that might profitably be examined by more detailed studies. Where there are many variables potentially influencing Y, they are likely to be intercorrelated and to account for relatively small amounts of the variance. Any analysis in which R squared is less than 50% should be suspect as probably not indicating the presence of significant variables. A further problem relates to sample size. It is often stated that the number of subjects or patients must be at least 5-10 times the number of variables included in the study.5 This advice should be taken only as a rough guide but it does indicate that the variables included should be selected with great care as inclusion of an obviously unimportant variable may have a significant impact on the sample size required.
Resumo:
The accurate in silico identification of T-cell epitopes is a critical step in the development of peptide-based vaccines, reagents, and diagnostics. It has a direct impact on the success of subsequent experimental work. Epitopes arise as a consequence of complex proteolytic processing within the cell. Prior to being recognized by T cells, an epitope is presented on the cell surface as a complex with a major histocompatibility complex (MHC) protein. A prerequisite therefore for T-cell recognition is that an epitope is also a good MHC binder. Thus, T-cell epitope prediction overlaps strongly with the prediction of MHC binding. In the present study, we compare discriminant analysis and multiple linear regression as algorithmic engines for the definition of quantitative matrices for binding affinity prediction. We apply these methods to peptides which bind the well-studied human MHC allele HLA-A*0201. A matrix which results from combining results of the two methods proved powerfully predictive under cross-validation. The new matrix was also tested on an external set of 160 binders to HLA-A*0201; it was able to recognize 135 (84%) of them.
Resumo:
This study focuses on multiple linear regression models relating six climate indices (temperature humidity THI, environmental stress ESI, equivalent temperature index ETI, heat load HLI, modified HLI (HLI new), and respiratory rate predictor RRP) with three main components of cow’s milk (yield, fat, and protein) for cows in Iran. The least absolute shrinkage selection operator (LASSO) and the Akaike information criterion (AIC) techniques are applied to select the best model for milk predictands with the smallest number of climate predictors. Uncertainty estimation is employed by applying bootstrapping through resampling. Cross validation is used to avoid over-fitting. Climatic parameters are calculated from the NASA-MERRA global atmospheric reanalysis. Milk data for the months from April to September, 2002 to 2010 are used. The best linear regression models are found in spring between milk yield as the predictand and THI, ESI, ETI, HLI, and RRP as predictors with p-value < 0.001 and R2 (0.50, 0.49) respectively. In summer, milk yield with independent variables of THI, ETI, and ESI show the highest relation (p-value < 0.001) with R2 (0.69). For fat and protein the results are only marginal. This method is suggested for the impact studies of climate variability/change on agriculture and food science fields when short-time series or data with large uncertainty are available.