71 resultados para LINEAR-REGRESSION
em CentAUR: Central Archive University of Reading - UK
Resumo:
This paper derives some exact power properties of tests for spatial autocorrelation in the context of a linear regression model. In particular, we characterize the circumstances in which the power vanishes as the autocorrelation increases, thus extending the work of Krämer (2005). More generally, the analysis in the paper sheds new light on how the power of tests for spatial autocorrelation is affected by the matrix of regressors and by the spatial structure. We mainly focus on the problem of residual spatial autocorrelation, in which case it is appropriate to restrict attention to the class of invariant tests, but we also consider the case when the autocorrelation is due to the presence of a spatially lagged dependent variable among the regressors. A numerical study aimed at assessing the practical relevance of the theoretical results is included
Resumo:
(ABR) is of fundamental importance to the investiga- tion of the auditory system behavior, though its in- terpretation has a subjective nature because of the manual process employed in its study and the clinical experience required for its analysis. When analyzing the ABR, clinicians are often interested in the identi- fication of ABR signal components referred to as Jewett waves. In particular, the detection and study of the time when these waves occur (i.e., the wave la- tency) is a practical tool for the diagnosis of disorders affecting the auditory system. In this context, the aim of this research is to compare ABR manual/visual analysis provided by different examiners. Methods: The ABR data were collected from 10 normal-hearing subjects (5 men and 5 women, from 20 to 52 years). A total of 160 data samples were analyzed and a pair- wise comparison between four distinct examiners was executed. We carried out a statistical study aiming to identify significant differences between assessments provided by the examiners. For this, we used Linear Regression in conjunction with Bootstrap, as a me- thod for evaluating the relation between the responses given by the examiners. Results: The analysis sug- gests agreement among examiners however reveals differences between assessments of the variability of the waves. We quantified the magnitude of the ob- tained wave latency differences and 18% of the inves- tigated waves presented substantial differences (large and moderate) and of these 3.79% were considered not acceptable for the clinical practice. Conclusions: Our results characterize the variability of the manual analysis of ABR data and the necessity of establishing unified standards and protocols for the analysis of these data. These results may also contribute to the validation and development of automatic systems that are employed in the early diagnosis of hearing loss.
Resumo:
The current energy requirements system used in the United Kingdom for lactating dairy cows utilizes key parameters such as metabolizable energy intake (MEI) at maintenance (MEm), the efficiency of utilization of MEI for 1) maintenance, 2) milk production (k(l)), 3) growth (k(g)), and the efficiency of utilization of body stores for milk production (k(t)). Traditionally, these have been determined using linear regression methods to analyze energy balance data from calorimetry experiments. Many studies have highlighted a number of concerns over current energy feeding systems particularly in relation to these key parameters, and the linear models used for analyzing. Therefore, a database containing 652 dairy cow observations was assembled from calorimetry studies in the United Kingdom. Five functions for analyzing energy balance data were considered: straight line, two diminishing returns functions, (the Mitscherlich and the rectangular hyperbola), and two sigmoidal functions (the logistic and the Gompertz). Meta-analysis of the data was conducted to estimate k(g) and k(t). Values of 0.83 to 0.86 and 0.66 to 0.69 were obtained for k(g) and k(t) using all the functions (with standard errors of 0.028 and 0.027), respectively, which were considerably different from previous reports of 0.60 to 0.75 for k(g) and 0.82 to 0.84 for k(t). Using the estimated values of k(g) and k(t), the data were corrected to allow for body tissue changes. Based on the definition of k(l) as the derivative of the ratio of milk energy derived from MEI to MEI directed towards milk production, MEm and k(l) were determined. Meta-analysis of the pooled data showed that the average k(l) ranged from 0.50 to 0.58 and MEm ranged between 0.34 and 0.64 MJ/kg of BW0.75 per day. Although the constrained Mitscherlich fitted the data as good as the straight line, more observations at high energy intakes (above 2.4 MJ/kg of BW0.75 per day) are required to determine conclusively whether milk energy is related to MEI linearly or not.
Resumo:
Forecasting wind power is an important part of a successful integration of wind power into the power grid. Forecasts with lead times longer than 6 h are generally made by using statistical methods to post-process forecasts from numerical weather prediction systems. Two major problems that complicate this approach are the non-linear relationship between wind speed and power production and the limited range of power production between zero and nominal power of the turbine. In practice, these problems are often tackled by using non-linear non-parametric regression models. However, such an approach ignores valuable and readily available information: the power curve of the turbine's manufacturer. Much of the non-linearity can be directly accounted for by transforming the observed power production into wind speed via the inverse power curve so that simpler linear regression models can be used. Furthermore, the fact that the transformed power production has a limited range can be taken care of by employing censored regression models. In this study, we evaluate quantile forecasts from a range of methods: (i) using parametric and non-parametric models, (ii) with and without the proposed inverse power curve transformation and (iii) with and without censoring. The results show that with our inverse (power-to-wind) transformation, simpler linear regression models with censoring perform equally or better than non-linear models with or without the frequently used wind-to-power transformation.
Resumo:
We use sunspot group observations from the Royal Greenwich Observatory (RGO) to investigate the effects of intercalibrating data from observers with different visual acuities. The tests are made by counting the number of groups RB above a variable cut-off threshold of observed total whole-spot area (uncorrected for foreshortening) to simulate what a lower acuity observer would have seen. The synthesised annual means of RB are then re-scaled to the full observed RGO group number RA using a variety of regression techniques. It is found that a very high correlation between RA and RB (rAB > 0.98) does not prevent large errors in the intercalibration (for example sunspot maximum values can be over 30 % too large even for such levels of rAB). In generating the backbone sunspot number (RBB), Svalgaard and Schatten (2015, this issue) force regression fits to pass through the scatter plot origin which generates unreliable fits (the residuals do not form a normal distribution) and causes sunspot cycle amplitudes to be exaggerated in the intercalibrated data. It is demonstrated that the use of Quantile-Quantile (“Q Q”) plots to test for a normal distribution is a useful indicator of erroneous and misleading regression fits. Ordinary least squares linear fits, not forced to pass through the origin, are sometimes reliable (although the optimum method used is shown to be different when matching peak and average sunspot group numbers). However, other fits are only reliable if non-linear regression is used. From these results it is entirely possible that the inflation of solar cycle amplitudes in the backbone group sunspot number as one goes back in time, relative to related solar-terrestrial parameters, is entirely caused by the use of inappropriate and non-robust regression techniques to calibrate the sunspot data.
Resumo:
This paper discusses the dangers inherent in allempting to simplify something as complex as development. It does this by exploring the Lynn and Vanhanen theory of deterministic development which asserts that varying levels of economic development seen between countries can be explained by differences in 'national intelligence' (national IQ). Assuming that intelligence is genetically determined, and as different races have been shown to have different IQ, then they argue that economic development (measured as GDP/capita) is largely a function of race and interventions to address imbalances can only have a limited impact. The paper presents the Lynne and Vanhanen case and critically discusses the data and analyses (linear regression) upon which it is based. It also extends the cause-effect basis of Lynne and Vanhanen's theory for economic development into human development by using the Human Development Index (HDI). It is argued that while there is nothing mathematically incorrect with their calculations, there are concerns over the data they employ. Even more fundamentally it is argued that statistically significant correlations between the various components of the HDI and national IQ can occur via a host of cause-effect pathways, and hence the genetic determinism theory is far from proven. The paper ends by discussing the dangers involved in the use of over-simplistic measures of development as a means of exploring cause-effect relationships. While the creators of development indices such as the HDI have good intentions, simplistic indices can encourage simplistic explanations of under-development. (c) 2005 Elsevier B.V. All rights reserved.
Resumo:
Real-time rainfall monitoring in Africa is of great practical importance for operational applications in hydrology and agriculture. Satellite data have been used in this context for many years because of the lack of surface observations. This paper describes an improved artificial neural network algorithm for operational applications. The algorithm combines numerical weather model information with the satellite data. Using this algorithm, daily rainfall estimates were derived for 4 yr of the Ethiopian and Zambian main rainy seasons and were compared with two other algorithms-a multiple linear regression making use of the same information as that of the neural network and a satellite-only method. All algorithms were validated against rain gauge data. Overall, the neural network performs best, but the extent to which it does so depends on the calibration/validation protocol. The advantages of the neural network are most evident when calibration data are numerous and close in space and time to the validation data. This result emphasizes the importance of a real-time calibration system.
Resumo:
This study investigates the response of wintertime North Atlantic Oscillation (NAO) to increasing concentrations of atmospheric carbon dioxide (CO2) as simulated by 18 global coupled general circulation models that participated in phase 2 of the Coupled Model Intercomparison Project (CMIP2). NAO has been assessed in control and transient 80-year simulations produced by each model under constant forcing, and 1% per year increasing concentrations of CO2, respectively. Although generally able to simulate the main features of NAO, the majority of models overestimate the observed mean wintertime NAO index of 8 hPa by 5-10 hPa. Furthermore, none of the models, in either the control or perturbed simulations, are able to reproduce decadal trends as strong as that seen in the observed NAO index from 1970-1995. Of the 15 models able to simulate the NAO pressure dipole, 13 predict a positive increase in NAO with increasing CO2 concentrations. The magnitude of the response is generally small and highly model-dependent, which leads to large uncertainty in multi-model estimates such as the median estimate of 0.0061 +/- 0.0036 hPa per %CO2. Although an increase of 0.61 hPa in NAO for a doubling in CO2 represents only a relatively small shift of 0.18 standard deviations in the probability distribution of winter mean NAO, this can cause large relative increases in the probabilities of extreme values of NAO associated with damaging impacts. Despite the large differences in NAO responses, the models robustly predict similar statistically significant changes in winter mean temperature (warmer over most of Europe) and precipitation (an increase over Northern Europe). Although these changes present a pattern similar to that expected due to an increase in the NAO index, linear regression is used to show that the response is much greater than can be attributed to small increases in NAO. NAO trends are not the key contributor to model-predicted climate change in wintertime mean temperature and precipitation over Europe and the Mediterranean region. However, the models' inability to capture the observed decadal variability in NAO might also signify a major deficiency in their ability to simulate the NAO-related responses to climate change.
Resumo:
Suprathermal electrons (E > 80 eV) carry heat flux away from the Sun. Processes controlling the heat flux are not well understood. To gain insight into these processes, we model heat flux as a linear dependence on two independent parameters: electron number flux and electron pitch angle anisotropy. Pitch angle anisotropy is further modeled as a linear dependence on two solar wind components: magnetic field strength and plasma density. These components show no correlation with number flux, reinforcing its independence from pitch angle anisotropy. Multiple linear regression applied to 2 years of Wind data shows good correspondence between modeled and observed heat flux and anisotropy. The results suggest that the interplay of solar wind parameters and electron number flux results in distinctive heat flux dropouts at heliospheric features like plasma sheets but that these parameters continuously modify heat flux. This is inconsistent with magnetic disconnection as the primary cause of heat flux dropouts. Analysis of fast and slow solar wind regimes separately shows that electron number flux and pitch angle anisotropy are equally correlated with heat flux in slow wind but that number flux is the dominant correlative in fast wind. Also, magnetic field strength correlates better with pitch angle anisotropy in slow wind than in fast wind. The energy dependence of the model fits suggests different scattering processes in fast and slow wind.
Resumo:
Estimating the magnitude of Agulhas leakage, the volume flux of water from the Indian to the Atlantic Ocean, is difficult because of the presence of other circulation systems in the Agulhas region. Indian Ocean water in the Atlantic Ocean is vigorously mixed and diluted in the Cape Basin. Eulerian integration methods, where the velocity field perpendicular to a section is integrated to yield a flux, have to be calibrated so that only the flux by Agulhas leakage is sampled. Two Eulerian methods for estimating the magnitude of Agulhas leakage are tested within a high-resolution two-way nested model with the goal to devise a mooring-based measurement strategy. At the GoodHope line, a section halfway through the Cape Basin, the integrated velocity perpendicular to that line is compared to the magnitude of Agulhas leakage as determined from the transport carried by numerical Lagrangian floats. In the first method, integration is limited to the flux of water warmer and more saline than specific threshold values. These threshold values are determined by maximizing the correlation with the float-determined time series. By using the threshold values, approximately half of the leakage can directly be measured. The total amount of Agulhas leakage can be estimated using a linear regression, within a 90% confidence band of 12 Sv. In the second method, a subregion of the GoodHope line is sought so that integration over that subregion yields an Eulerian flux as close to the float-determined leakage as possible. It appears that when integration is limited within the model to the upper 300 m of the water column within 900 km of the African coast the time series have the smallest root-mean-square difference. This method yields a root-mean-square error of only 5.2 Sv but the 90% confidence band of the estimate is 20 Sv. It is concluded that the optimum thermohaline threshold method leads to more accurate estimates even though the directly measured transport is a factor of two lower than the actual magnitude of Agulhas leakage in this model.
Resumo:
The relation between the Agulhas Current retroflection location and the magnitude of Agulhas leakage, the transport of water from the Indian to the Atlantic Ocean, is investigated in a high-resolution numerical ocean model. Sudden eastward retreats of the Agulhas Current retroflection loop are linearly related to the shedding of Agulhas rings, where larger retreats generate larger rings. Using numerical Lagrangian floats a 37 year time series of the magnitude of Agulhas leakage in the model is constructed. The time series exhibits large amounts of variability, both on weekly and annual time scales. A linear relation is found between the magnitude of Agulhas leakage and the location of the Agulhas Current retroflection, both binned to three month averages. In the relation, a more westward location of the Agulhas Current retroflection corresponds to an increased transport from the Indian Ocean to the Atlantic Ocean. When this relation is used in a linear regression and applied to almost 20 years of altimetry data, it yields a best estimate of the mean magnitude of Agulhas leakage of 13.2 Sv. The early retroflection of 2000, when Agulhas leakage was probably halved, can be identified using the regression.
Resumo:
The purpose of this study was to improve the prediction of the quantity and type of Volatile Fatty Acids (VFA) produced from fermented substrate in the rumen of lactating cows. A model was formulated that describes the conversion of substrate (soluble carbohydrates, starch, hemi-cellulose, cellulose, and protein) into VFA (acetate, propionate, butyrate, and other VFA). Inputs to the model were observed rates of true rumen digestion of substrates, whereas outputs were observed molar proportions of VFA in rumen fluid. A literature survey generated data of 182 diets (96 roughage and 86 concentrate diets). Coefficient values that define the conversion of a specific substrate into VFA were estimated meta-analytically by regression of the model against observed VFA molar proportions using non-linear regression techniques. Coefficient estimates significantly differed for acetate and propionate production in particular, between different types of substrate and between roughage and concentrate diets. Deviations of fitted from observed VFA molar proportions could be attributed to random error for 100%. In addition to regression against observed data, simulation studies were performed to investigate the potential of the estimation method. Fitted coefficient estimates from simulated data sets appeared accurate, as well as fitted rates of VFA production, although the model accounted for only a small fraction (maximally 45%) of the variation in VFA molar proportions. The simulation results showed that the latter result was merely a consequence of the statistical analysis chosen and should not be interpreted as an indication of inaccuracy of coefficient estimates. Deviations between fitted and observed values corresponded to those obtained in simulations. (c) 2005 Elsevier Ltd. All rights reserved.
Resumo:
A study was designed to examine the relationships between protein, condensed tannin and cell wall carbohydrate content and composition and the nutritional quality of seven tropical legumes (Desmodium ovalifolium, Flemingia macrophylla, Leucaena leucocephala, L pallida, L macrophylla, Calliandra calothyrsus and Clitotia fairchildiana). Among the legume species studied, D ovalifolium showed the lowest concentration of nitrogen, while L leucocephala showed the highest. Fibre (NDF) content was lowest in C calothyrsus, L Leucocephala and L pallida and highest in L macrophylla, which had no measurable condensed tannins. The highest tannin concentration was found in C calothyrsus. Total non-structural polysaccharides (NSP) varied among legumes species (lowest in C calothyrsus and highest in D ovalifolium), and glucose and uronic acids were the most abundant carbohydrate constituents in all legumes. Total NSP losses were lowest in F macrophylla and highest in L leucocephala and L pallida. Gas accumulation and acetate and propionate levels were 50% less with F macrophylla and D ovalifolium as compared with L leucocephala. The highest levels of branched-chain fatty acids were observed with non-tanniniferous legumes, and negative concentrations were observed with some of the legumes with high tannin content (D ovalifolium and F macrophylla). Linear regression analysis showed that the presence of condensed tannins was more related to a reduction of the initial rate of gas production (0-48 h) than to the final amount of gas produced or the extent (144h) of dry matter degradation, which could be due to differences in tannin chemistry. Consequently, more attention should be given in the future to elucidating the impact of tannin structure on the nutritional quality of tropical forage legumes. (C) 2003 Society of Chemical Industry.