Biblioteca Digital

916 results for Weighted linear regression schemes

Regression analysis of compositional data when both the dependent variable and independent variable are components

Relevance:

100.00% 100.00%

Publisher:

Abstract:

It is well known that regression analyses involving compositional data need special attention because the data are not of full rank. For a regression analysis where both the dependent and independent variables are components, we propose a transformation of the components emphasizing their role as dependent and independent variables. A simple linear regression can be performed on the transformed components. The regression line can be depicted in a ternary diagram, facilitating the interpretation of the analysis in terms of components. An example with time-budgets illustrates the method and the graphical features.

Association between Long-Term Exposure to Traffic-Related Air Pollution and Subclinical Atherosclerosis: The REGICOR Study

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Background: Epidemiological evidence of the effects of long-term exposure to air pollution on the chronic processes of atherogenesis is limited. Objective: We investigated the association of long-term exposure to traffic-related air pollution with subclinical atherosclerosis, measured by carotid intima media thickness (IMT) and ankle–brachial index (ABI). Methods: We performed a cross-sectional analysis using data collected during the reexamination (2007–2010) of 2,780 participants in the REGICOR (Registre Gironí del Cor: the Gerona Heart Register) study, a population-based prospective cohort in Girona, Spain. Long-term exposure across residences was calculated as the last 10 years’ time-weighted average of residential nitrogen dioxide (NO2) estimates (based on a local-scale land-use regression model), traffic intensity in the nearest street, and traffic intensity in a 100 m buffer. Associations with IMT and ABI were estimated using linear regression and multinomial logistic regression, respectively, controlling for sex, age, smoking status, education, marital status, and several other potential confounders or intermediates. Results: Exposure contrasts between the 5th and 95th percentiles for NO2 (25 μg/m³), traffic intensity in the nearest street (15,000 vehicles/day), and traffic load within 100 m (7,200,000 vehicle-m/day) were associated with differences in IMT of 0.56% (95% CI: –1.5, 2.6%), 2.32% (95% CI: 0.48, 4.17%), and 1.91% (95% CI: –0.24, 4.06%), respectively. Exposures were positively associated with an ABI of > 1.3, but not an ABI of < 0.9. Stronger associations were observed among those with a high level of education and in men ≥ 60 years of age. Conclusions: Long-term traffic-related exposures were associated with subclinical markers of atherosclerosis. Prospective studies are needed to confirm associations and further examine differences among population subgroups.

Key words: ankle–brachial index, average daily traffic, cardiovascular disease, exposure assessment, exposure to tailpipe emissions, intima media thickness, land use regression model, Mediterranean diet, nitrogen dioxide

Novel Regression Methods For Spectral Data

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Singular Value Decomposition (SVD), Principal Component Analysis (PCA) and Multiple Linear Regression (MLR) are some of the mathematical preliminaries that are discussed prior to explaining PLS and PCR models. Both PLS and PCR are applied to real spectral data and their differences and similarities are discussed in this thesis. The challenge lies in establishing the optimum number of components to be included in either of the models but this has been overcome by using various diagnostic tools suggested in this thesis. Correspondence analysis (CA) and PLS were applied to ecological data. The idea of CA was to correlate the macrophytes species and lakes. The differences between PLS model for ecological data and PLS for spectral data are noted and explained in this thesis.

Thermoacclimatory variations in the activities of enzymes implicated in ion transport in the rainbow trout, Salmo gairdneri

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Two groups of rainbow trout were acclimated to 2°, 10°, and 18°C. Plasma sodium, potassium, and chloride levels were determined for both. One group was employed in the estimation of branchial and renal (Na+-K+)-stimulated, (HCO3-)-stimulated, and (Mg++)-dependent ATPase activities, while the other was used in the measurement of carbonic anhydrase activity in the blood, gill and kidney. Assays were conducted using two incubation temperature schemes. One provided for incubation of all preparations at a common temperature of 25°C, a value equivalent to the upper incipient lethal level for this species. In the other procedure the preparations were incubated at the appropriate acclimation temperature of the sampled fish. Trout were able to maintain plasma sodium and chloride levels essentially constant over the temperature range employed. The different incubation temperature protocols produced different levels of activity, and, in some cases, contrary trends with respect to acclimation temperature. This information was discussed in relation to previous work on gill and kidney. The standing-gradient flow hypothesis was discussed with reference to the structure of the chloride cell, known thermally induced changes in ion uptake, and the enzyme activities obtained in this study. Modifications of the model of gill ion uptake suggested by Maetz (1971) were proposed, resulting in high- and low-temperature models. In short, ion transport at the gill at low temperatures appears to involve sodium and chloride uptake by heteroionic exchange mechanisms working in association with carbonic anhydrase. Gill (Na+-K+)-ATPase and erythrocyte carbonic anhydrase seem to provide the supplemental uptake required at higher temperatures. It appears that the kidney is prominent in ion transport at low temperatures while the gill is more important at high temperatures. Linear regression analyses involving weight, plasma ion levels, and enzyme activities indicated several trends, the most significant being the interrelationship observed between plasma sodium and chloride. This and other data obtained in the study were considered in light of the theory that a link exists between plasma sodium and chloride regulatory mechanisms.

Simulation-Based Finite-Sample Normality Tests in Linear Regressions

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In the literature on tests of normality, much concern has been expressed over the problems associated with residual-based procedures. Indeed, the specialized tables of critical points which are needed to perform the tests have been derived for the location-scale model; hence reliance on available significance points in the context of regression models may cause size distortions. We propose a general solution to the problem of controlling the size of normality tests for the disturbances of standard linear regression, which is based on using the technique of Monte Carlo tests.
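The Monte Carlo test idea in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say which normality statistic is used, so a Jarque-Bera-type statistic is assumed here, and the function names are hypothetical. The sketch relies on the fact that OLS residuals under Gaussian errors depend only on the design matrix and the errors, so a scale-invariant statistic can be simulated without knowing the regression coefficients or the error variance.

```python
import numpy as np

def jb_stat(resid):
    """Jarque-Bera-type statistic (assumed choice) from residual skewness and kurtosis."""
    e = resid - resid.mean()
    s2 = np.mean(e ** 2)
    skew = np.mean(e ** 3) / s2 ** 1.5
    kurt = np.mean(e ** 4) / s2 ** 2
    return len(e) * (skew ** 2 / 6.0 + (kurt - 3.0) ** 2 / 24.0)

def ols_residuals(X, y):
    """Residuals from an ordinary least squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

def mc_normality_pvalue(X, y, n_rep=999, seed=0):
    """Monte Carlo p-value for the null of Gaussian regression errors."""
    rng = np.random.default_rng(seed)
    observed = jb_stat(ols_residuals(X, y))
    # Simulate the statistic under the null using the SAME design matrix X.
    sims = np.array([
        jb_stat(ols_residuals(X, rng.standard_normal(len(y))))
        for _ in range(n_rep)
    ])
    # Rank-based p-value: share of simulated statistics at least as large.
    return (np.sum(sims >= observed) + 1) / (n_rep + 1)
```

With `n_rep=999` the p-value is a multiple of 1/1000, so rejecting when it is at most 0.05 gives a test whose level does not rely on asymptotic critical values.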

Finite-Sample Diagnostics for Multivariate Regressions with Applications to Linear Asset Pricing Models

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper, we propose several finite-sample specification tests for multivariate linear regressions (MLR) with applications to asset pricing models. We focus on departures from the i.i.d. errors assumption, at univariate and multivariate levels, with Gaussian and non-Gaussian (including Student t) errors. The univariate tests studied extend existing exact procedures by allowing for unspecified parameters in the error distributions (e.g., the degrees of freedom in the case of the Student t distribution). The multivariate tests are based on properly standardized multivariate residuals to ensure invariance to MLR coefficients and error covariances. We consider tests for serial correlation, tests for multivariate GARCH and sign-type tests against general dependencies and asymmetries. The procedures proposed provide exact versions of those applied in Shanken (1990) which consist in combining univariate specification tests. Specifically, we combine tests across equations using the MC test procedure to avoid Bonferroni-type bounds. Since non-Gaussian based tests are not pivotal, we apply the “maximized MC” (MMC) test method [Dufour (2002)], where the MC p-value for the tested hypothesis (which depends on nuisance parameters) is maximized (with respect to these nuisance parameters) to control the test’s significance level. The tests proposed are applied to an asset pricing model with observable risk-free rates, using monthly returns on New York Stock Exchange (NYSE) portfolios over five-year subperiods from 1926 to 1995. Our empirical results reveal the following. Whereas univariate exact tests indicate significant serial correlation, asymmetries and GARCH in some equations, such effects are much less prevalent once error cross-equation covariances are accounted for. In addition, significant departures from the i.i.d. hypothesis are less evident once we allow for non-Gaussian errors.

Pilot forecasting consultancy project for Méderi

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This work, carried out at Hospital Méderi, is a consultancy on forecasting models. It analyzes a database of goods stored in the general warehouse, supplied by the institution, using four different forecasting methods: weighted moving average, simple moving average, linear regression, and exponential smoothing. Based on the result of each forecast, a recommendation is made to the hospital on which method it should use to predict demand most accurately.
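The four forecasting methods named in this abstract can be sketched as one-step-ahead forecasts in Python. This is a minimal illustration, not the procedure used in the project: the function names, the weights, the window length and the smoothing constant are all assumptions, and the demand series in the usage note is made up.

```python
import numpy as np

def simple_moving_average(demand, window):
    """Forecast the next period as the mean of the last `window` observations."""
    return float(np.mean(demand[-window:]))

def weighted_moving_average(demand, weights):
    """Forecast as a weighted mean of the most recent observations.

    `weights` are ordered from oldest to newest and need not sum to one.
    """
    w = np.asarray(weights, dtype=float)
    recent = np.asarray(demand[-len(w):], dtype=float)
    return float(np.dot(recent, w) / w.sum())

def linear_trend_forecast(demand):
    """Fit demand = intercept + slope * t by least squares and extrapolate one step."""
    t = np.arange(len(demand))
    slope, intercept = np.polyfit(t, demand, 1)
    return float(intercept + slope * len(demand))

def exponential_smoothing(demand, alpha):
    """Simple exponential smoothing; the final level is the next-period forecast."""
    level = float(demand[0])
    for d in demand[1:]:
        level = alpha * d + (1.0 - alpha) * level
    return level
```

For the made-up series `[10, 12, 14, 16]`, a two-period simple moving average gives 15, weights `[1, 3]` give 15.5, the linear trend extrapolates to 18, and exponential smoothing with `alpha=0.5` gives 14.25. Choosing among the methods, as the abstract describes, would then come down to comparing forecast errors on held-out periods.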

Alternatives to linear analysis of energy balance data from lactating dairy cows

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The current energy requirements system used in the United Kingdom for lactating dairy cows utilizes key parameters such as metabolizable energy intake (MEI) at maintenance (MEm), the efficiency of utilization of MEI for 1) maintenance, 2) milk production (k(l)), 3) growth (k(g)), and the efficiency of utilization of body stores for milk production (k(t)). Traditionally, these have been determined using linear regression methods to analyze energy balance data from calorimetry experiments. Many studies have highlighted a number of concerns over current energy feeding systems, particularly in relation to these key parameters and the linear models used for analysis. Therefore, a database containing 652 dairy cow observations was assembled from calorimetry studies in the United Kingdom. Five functions for analyzing energy balance data were considered: the straight line, two diminishing returns functions (the Mitscherlich and the rectangular hyperbola), and two sigmoidal functions (the logistic and the Gompertz). Meta-analysis of the data was conducted to estimate k(g) and k(t). Values of 0.83 to 0.86 and 0.66 to 0.69 were obtained for k(g) and k(t) using all the functions (with standard errors of 0.028 and 0.027), respectively, which were considerably different from previous reports of 0.60 to 0.75 for k(g) and 0.82 to 0.84 for k(t). Using the estimated values of k(g) and k(t), the data were corrected to allow for body tissue changes. Based on the definition of k(l) as the derivative of the ratio of milk energy derived from MEI to MEI directed towards milk production, MEm and k(l) were determined. Meta-analysis of the pooled data showed that the average k(l) ranged from 0.50 to 0.58 and MEm ranged between 0.34 and 0.64 MJ/kg of BW^0.75 per day. Although the constrained Mitscherlich fitted the data as well as the straight line, more observations at high energy intakes (above 2.4 MJ/kg of BW^0.75 per day) are required to determine conclusively whether milk energy is related to MEI linearly or not.

Probabilistic wind power forecasts with an inverse power curve transformation and censored regression

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Forecasting wind power is an important part of a successful integration of wind power into the power grid. Forecasts with lead times longer than 6 h are generally made by using statistical methods to post-process forecasts from numerical weather prediction systems. Two major problems that complicate this approach are the non-linear relationship between wind speed and power production and the limited range of power production between zero and nominal power of the turbine. In practice, these problems are often tackled by using non-linear non-parametric regression models. However, such an approach ignores valuable and readily available information: the power curve of the turbine's manufacturer. Much of the non-linearity can be directly accounted for by transforming the observed power production into wind speed via the inverse power curve so that simpler linear regression models can be used. Furthermore, the fact that the transformed power production has a limited range can be taken care of by employing censored regression models. In this study, we evaluate quantile forecasts from a range of methods: (i) using parametric and non-parametric models, (ii) with and without the proposed inverse power curve transformation and (iii) with and without censoring. The results show that with our inverse (power-to-wind) transformation, simpler linear regression models with censoring perform as well as or better than non-linear models with or without the frequently used wind-to-power transformation.
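The inverse (power-to-wind) transformation described in this abstract can be illustrated with a short Python sketch. The power curve values below are hypothetical, not taken from the study or from any manufacturer, and the function name is an assumption. The sketch interpolates the inverse of a monotone power curve and flags the observations at zero or nominal power, where the inverse is not unique and a censored regression model would be needed.

```python
import numpy as np

# Hypothetical power curve (assumed values): wind speed in m/s against
# power as a fraction of nominal power, strictly increasing in between.
CURVE_SPEED = np.array([3.0, 5.0, 7.0, 9.0, 11.0, 13.0])
CURVE_POWER = np.array([0.0, 0.1, 0.3, 0.6, 0.9, 1.0])

def power_to_wind(power):
    """Map observed power to wind speed via the inverse power curve.

    Returns the transformed values together with flags marking
    observations at the lower and upper limits of the power range.
    """
    power = np.asarray(power, dtype=float)
    # np.interp needs increasing x-coordinates, so the roles of the two
    # curve arrays are swapped to obtain the inverse mapping.
    speed = np.interp(power, CURVE_POWER, CURVE_SPEED)
    left_censored = power <= CURVE_POWER[0]
    right_censored = power >= CURVE_POWER[-1]
    return speed, left_censored, right_censored
```

For example, a power fraction of 0.45 lies halfway between the curve points at 7 and 9 m/s and is mapped to 8 m/s, while observations of 0 or 1 are flagged as censored rather than treated as exact wind speeds.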

Tests of sunspot number sequences: 3. Effects of regression procedures on the calibration of historic sunspot data

Relevance:

100.00% 100.00%

Publisher:

Abstract:

We use sunspot group observations from the Royal Greenwich Observatory (RGO) to investigate the effects of intercalibrating data from observers with different visual acuities. The tests are made by counting the number of groups RB above a variable cut-off threshold of observed total whole-spot area (uncorrected for foreshortening) to simulate what a lower acuity observer would have seen. The synthesised annual means of RB are then re-scaled to the full observed RGO group number RA using a variety of regression techniques. It is found that a very high correlation between RA and RB (rAB > 0.98) does not prevent large errors in the intercalibration (for example sunspot maximum values can be over 30% too large even for such levels of rAB). In generating the backbone sunspot number (RBB), Svalgaard and Schatten (2015, this issue) force regression fits to pass through the scatter plot origin which generates unreliable fits (the residuals do not form a normal distribution) and causes sunspot cycle amplitudes to be exaggerated in the intercalibrated data. It is demonstrated that the use of Quantile-Quantile (“Q-Q”) plots to test for a normal distribution is a useful indicator of erroneous and misleading regression fits. Ordinary least squares linear fits, not forced to pass through the origin, are sometimes reliable (although the optimum method used is shown to be different when matching peak and average sunspot group numbers). However, other fits are only reliable if non-linear regression is used. From these results it is entirely possible that the inflation of solar cycle amplitudes in the backbone group sunspot number as one goes back in time, relative to related solar-terrestrial parameters, is entirely caused by the use of inappropriate and non-robust regression techniques to calibrate the sunspot data.
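The difference between a fit forced through the origin and an ordinary least squares fit with a free intercept can be seen in a small Python sketch. The data in the usage note are a toy example, not RGO sunspot observations, and the function names are hypothetical.

```python
import numpy as np

def fit_through_origin(x, y):
    """Least squares slope for the model y = slope * x (no intercept)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(x * y) / np.sum(x * x))

def fit_with_intercept(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    slope, intercept = np.polyfit(x, y, 1)
    return float(slope), float(intercept)
```

For toy data generated exactly as y = 2 + 3x at x = 1, 2, 3, 4, the free fit recovers a slope of 3 and an intercept of 2, while the origin-forced fit gives a slope of 110/30, about 3.67. The forced fit therefore overestimates y at the largest x, which is the kind of amplitude inflation the abstract attributes to fits through the origin.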

Deviance residuals in generalised log-gamma regression models with censored observations

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this article, we compare three residuals based on the deviance component in generalised log-gamma regression models with censored observations. For different parameter settings, sample sizes and censoring percentages, various simulation studies are performed and the empirical distribution of each residual is displayed and compared with the standard normal distribution. For all cases studied, the empirical distributions of the proposed residuals are in general symmetric around zero, but only a martingale-type residual presented negligible kurtosis for the majority of the cases studied. These studies suggest that the residual analysis usually performed in normal linear regression models can be straightforwardly extended for the martingale-type residual in generalised log-gamma regression models with censored data. A lifetime data set is analysed under log-gamma regression models and a model checking based on the martingale-type residual is performed.

Income indicators based on electricity consumption: household and regional approaches from the perspective of spatial statistics

Relevance:

100.00% 100.00%

Publisher:

Abstract:

To evaluate the use of electricity consumption as a socioeconomic indicator, this research analyzes information at two levels of geographic aggregation. At the first, from a territorial perspective, it investigates indicators of Income and Electricity Consumption aggregated by weighting areas (sets of census tracts) in the municipality of São Paulo, using microdata from the 2000 Demographic Census together with the AES Eletropaulo household database. It applies Spatial Auto-Regression (SAR) and Geographically Weighted Regression (GWR) models, and a novel combined model (GWR+SAR) developed in this study. Several neighborhood matrices were used to assess the spatial influence (with a center-periphery pattern) of the variables under study. The variables showed strong spatial autocorrelation (Moran's I above 58% for Electricity Consumption and above 75% for Household Income). The relationships between Income and Electricity Consumption proved very strong (the coefficients of explanation for Income reached values from 0.93 to 0.98). At the second level, the household, it uses data collected in the Annual Residential Customer Satisfaction Survey, coordinated by the Associação Brasileira dos Distribuidores de Energia Elétrica (ABRADEE), for the years 2004, 2006, 2007, 2008 and 2009. Weighted Linear Model (WLM), GWR and SAR models were applied to the survey data, with the interviews allocated to the centroid and to the seat of each district. For 2009, the actual locations of the interviewed households were obtained. In addition, six algorithms were developed for distributing points inside the district polygons. The models based on centroids and seats obtained a coefficient of determination R² of about 0.45 for the GWR technique, while the models based on scattering points inside the district polygons reduced this explanation to about 0.40. These results suggest that the algorithms for allocating points within polygons allow a more realistic association between the analyzed constructs to be observed. Taken together, the findings show that billing information from electricity distributors has great potential to support strategic decisions. Because they are current, available and updated monthly, socioeconomic indicators based on electricity consumption can be very useful as an aid to processes of classification, concentration and forecasting of household income.

Multiple linear regression equation for estimating experimental error

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This work aimed to estimate multiple linear regression equations having, as explanatory variables, the other traits evaluated in maize experiments and, as main variables, the least significant difference as a percentage of the mean (DMS%) and the error mean square (QMe) for grain weight. The data comprised 610 experiments conducted in the Rede de Ensaios Nacionais de Competição de Cultivares de Milho, carried out between 1986 and 1996 (522 experiments) and in 1997 (88 experiments). Two regression equations were estimated with the 522 experiments and validated with the remaining 88 by simple regression analysis between the actual values and those estimated by the equations. For DMS%, the equation did not estimate the same value as the original formula; for QMe, the equation could be used for estimation. The Lilliefors test showed that the QMe values adhered to the standard normal distribution, and a classification table for QMe values was built, based on the values observed in the analysis of variance of the experiments and on those estimated by the regression equation.

Infant mortality and socioeconomic conditions in the micro-regions of Northeast Brazil

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The study aims to answer the following question: what are the different profiles of infant mortality, according to demographic, socioeconomic, infrastructure and health care conditions, for the micro-regions of Northeast Brazil? Thus, the main objective is to analyze the profiles or typologies associating mortality levels with the sociodemographic conditions of the micro-regions in 2010. To this end, data were taken from the birth and death records of SIM and SINASC (DATASUS/MS), from the 2010 population Census microdata, and from SIDRA/IBGE. A weighted multiple linear regression model was used to find the most significant variables in explaining child mortality in 2010. A cluster analysis was also performed, initially seeking evidence of homogeneous groups of micro-regions based on the significant variables. The logit of the infant mortality rate was used as the dependent variable, while demographic, socioeconomic, infrastructure and health care variables for the micro-regions were taken as the independent variables of the model. Bayesian estimation was applied to the birth and death data because of underreporting and the random fluctuations of small counts in small areas. Spatial statistics techniques were used to determine the spatial behavior of the distribution of rates from thematic maps. Finally, the GoM (Grade of Membership) method was used to find typologies of mortality associated with the selected variables by micro-region, in order to answer the main question of the study. The results point to the formation of three profiles: Profile 1, high infant mortality and unfavorable social conditions; Profile 2, low infant mortality with median social conditions of life; and Profile 3, median and high infant mortality social conditions. With this classification, it was found that, of the 188 micro-regions, 20 (10%) fit extreme profile 1, 59 (31.4%) were characterized in extreme profile 2, 34 (18.1%) were characterized in extreme profile 3, and only 9 (4.8%) were classified as an amorphous profile. The other micro-regions fell into mixed profiles. Such profiles suggest the need for different public policy interventions aimed at reducing child mortality in the region.
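A weighted linear regression on the logit of a rate, as mentioned in this abstract, can be sketched in Python as follows. This is an illustration under assumptions: the abstract does not state which weights were used (weights proportional to the number of births would be one plausible choice), and the function names and the numbers in the usage note are hypothetical.

```python
import numpy as np

def logit(p):
    """Log-odds transform of a rate strictly between 0 and 1."""
    p = np.asarray(p, dtype=float)
    return np.log(p / (1.0 - p))

def weighted_least_squares(X, y, w):
    """Minimize sum_i w_i * (y_i - x_i' beta)^2 over beta.

    Scaling each row by sqrt(w_i) turns the weighted problem into an
    ordinary least squares problem that np.linalg.lstsq can solve.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    sw = np.sqrt(np.asarray(w, dtype=float))
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta
```

A typical call would be `weighted_least_squares(X, logit(rate), weights)`, where `X` holds a column of ones plus the explanatory variables for each micro-region. Areas with larger weights then pull the fitted coefficients more strongly, which limits the influence of noisy rates from small areas.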
