37 resultados para LINEAR-REGRESSION
em Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Resumo:
We consider the application of normal theory methods to the estimation and testing of a general type of multivariate regressionmodels with errors--in--variables, in the case where various data setsare merged into a single analysis and the observable variables deviatepossibly from normality. The various samples to be merged can differ on the set of observable variables available. We show that there is a convenient way to parameterize the model so that, despite the possiblenon--normality of the data, normal--theory methods yield correct inferencesfor the parameters of interest and for the goodness--of--fit test. Thetheory described encompasses both the functional and structural modelcases, and can be implemented using standard software for structuralequations models, such as LISREL, EQS, LISCOMP, among others. An illustration with Monte Carlo data is presented.
Resumo:
Peer-reviewed
Resumo:
We introduce several exact nonparametric tests for finite sample multivariatelinear regressions, and compare their powers. This fills an important gap inthe literature where the only known nonparametric tests are either asymptotic,or assume one covariate only.
Resumo:
Random coefficient regression models have been applied in differentfields and they constitute a unifying setup for many statisticalproblems. The nonparametric study of this model started with Beranand Hall (1992) and it has become a fruitful framework. In thispaper we propose and study statistics for testing a basic hypothesisconcerning this model: the constancy of coefficients. The asymptoticbehavior of the statistics is investigated and bootstrapapproximations are used in order to determine the critical values ofthe test statistics. A simulation study illustrates the performanceof the proposals.
Resumo:
We present an exact test for whether two random variables that have known bounds on their support are negatively correlated. The alternative hypothesis is that they are not negatively correlated. No assumptions are made on the underlying distributions. We show by example that the Spearman rank correlation test as the competing exact test of correlation in nonparametric settings rests on an additional assumption on the data generating process without which it is not valid as a test for correlation.We then show how to test for the significance of the slope in a linear regression analysis that invovles a single independent variable and where outcomes of the dependent variable belong to a known bounded set.
Resumo:
It is well known that regression analyses involving compositional data need special attention because the data are not of full rank. For a regression analysis where both the dependent and independent variable are components we propose a transformation of the components emphasizing their role as dependent and independent variables. A simple linear regression can be performed on the transformed components. The regression line can be depicted in a ternary diagram facilitating the interpretation of the analysis in terms of components. An exemple with time-budgets illustrates the method and the graphical features
Resumo:
Inductive learning aims at finding general rules that hold true in a database. Targeted learning seeks rules for the predictions of the value of a variable based on the values of others, as in the case of linear or non-parametric regression analysis. Non-targeted learning finds regularities without a specific prediction goal. We model the product of non-targeted learning as rules that state that a certain phenomenon never happens, or that certain conditions necessitate another. For all types of rules, there is a trade-off between the rule's accuracy and its simplicity. Thus rule selection can be viewed as a choice problem, among pairs of degree of accuracy and degree of complexity. However, one cannot in general tell what is the feasible set in the accuracy-complexity space. Formally, we show that finding out whether a point belongs to this set is computationally hard. In particular, in the context of linear regression, finding a small set of variables that obtain a certain value of R2 is computationally hard. Computational complexity may explain why a person is not always aware of rules that, if asked, she would find valid. This, in turn, may explain why one can change other people's minds (opinions, beliefs) without providing new information.
Resumo:
Social entrepreneurship has been a subject of growing interest by academics and governments, however little still being known about environmental factors that affect this phenomenon. The main objective of this study is to analyze how these factors affect social entrepreneurial activity, in the light of the institutional economic theory as the conceptual framework. Using linear regression analysis for a sample of 49 countries, is studied the impact of informal institutions (social needs, societal attitudes and education) and formal institutions (public spending, access to finance and governance effectiveness) on social entrepreneurial activity. The findings suggest that while societal attitudes increase the rates of social entrepreneurship, public spending has a negative relationship with this phenomenon. Finally, the empirical evidence found could be useful for the definition of government policies on promoting social entrepreneurship.
Resumo:
The preceding two editions of CoDaWork included talks on the possible considerationof densities as infinite compositions: Egozcue and D´ıaz-Barrero (2003) extended theEuclidean structure of the simplex to a Hilbert space structure of the set of densitieswithin a bounded interval, and van den Boogaart (2005) generalized this to the setof densities bounded by an arbitrary reference density. From the many variations ofthe Hilbert structures available, we work with three cases. For bounded variables, abasis derived from Legendre polynomials is used. For variables with a lower bound, westandardize them with respect to an exponential distribution and express their densitiesas coordinates in a basis derived from Laguerre polynomials. Finally, for unboundedvariables, a normal distribution is used as reference, and coordinates are obtained withrespect to a Hermite-polynomials-based basis.To get the coordinates, several approaches can be considered. A numerical accuracyproblem occurs if one estimates the coordinates directly by using discretized scalarproducts. Thus we propose to use a weighted linear regression approach, where all k-order polynomials are used as predictand variables and weights are proportional to thereference density. Finally, for the case of 2-order Hermite polinomials (normal reference)and 1-order Laguerre polinomials (exponential), one can also derive the coordinatesfrom their relationships to the classical mean and variance.Apart of these theoretical issues, this contribution focuses on the application of thistheory to two main problems in sedimentary geology: the comparison of several grainsize distributions, and the comparison among different rocks of the empirical distribution of a property measured on a batch of individual grains from the same rock orsediment, like their composition
Resumo:
Background: There is growing evidence that traffic-related air pollution reduces birth weight. Improving exposure assessment is a key issue to advance in this research area.Objective: We investigated the effect of prenatal exposure to traffic-related air pollution via geographic information system (GIS) models on birth weight in 570 newborns from the INMA (Environment and Childhood) Sabadell cohort.Methods: We estimated pregnancy and trimester-specific exposures to nitrogen dioxide and aromatic hydrocarbons [benzene, toluene, ethylbenzene, m/p-xylene, and o-xylene (BTEX)] by using temporally adjusted land-use regression (LUR) models. We built models for NO2 and BTEX using four and three 1-week measurement campaigns, respectively, at 57 locations. We assessed the relationship between prenatal air pollution exposure and birth weight with linear regression models. We performed sensitivity analyses considering time spent at home and time spent in nonresidential outdoor environments during pregnancy.Results: In the overall cohort, neither NO2 nor BTEX exposure was significantly associated with birth weight in any of the exposure periods. When considering only women who spent < 2 hr/day in nonresidential outdoor environments, the estimated reductions in birth weight associated with an interquartile range increase in BTEX exposure levels were 77 g [95% confidence interval (CI), 7–146 g] and 102 g (95% CI, 28–176 g) for exposures during the whole pregnancy and the second trimester, respectively. The effects of NO2 exposure were less clear in this subset.Conclusions: The association of BTEX with reduced birth weight underscores the negative role of vehicle exhaust pollutants in reproductive health. Time–activity patterns during pregnancy complement GIS-based models in exposure assessment.
Resumo:
The aim of this paper is twofold. First, we study the determinants of economic growth among a wide set of potential variables for the Spanish provinces (NUTS3). Among others, we include various types of private, public and human capital in the group of growth factors. Also,we analyse whether Spanish provinces have converged in economic terms in recent decades. Thesecond objective is to obtain cross-section and panel data parameter estimates that are robustto model speci¯cation. For this purpose, we use a Bayesian Model Averaging (BMA) approach.Bayesian methodology constructs parameter estimates as a weighted average of linear regression estimates for every possible combination of included variables. The weight of each regression estimate is given by the posterior probability of each model.
Resumo:
The aim of this paper is twofold. First, we study the determinants of economic growth among a wide set of potential variables for the Spanish provinces (NUTS3). Among others, we include various types of private, public and human capital in the group of growth factors. Also,we analyse whether Spanish provinces have converged in economic terms in recent decades. Thesecond objective is to obtain cross-section and panel data parameter estimates that are robustto model speci¯cation. For this purpose, we use a Bayesian Model Averaging (BMA) approach.Bayesian methodology constructs parameter estimates as a weighted average of linear regression estimates for every possible combination of included variables. The weight of each regression estimate is given by the posterior probability of each model.
Resumo:
En este documento se ilustra de un modo práctico, el empleo de tres instrumentos que permiten al actuario definir grupos arancelarios y estimar premios de riesgo en el proceso que tasa la clase para el seguro de no vida. El primero es el análisis de segmentación (CHAID y XAID) usado en primer lugar en 1997 por UNESPA en su cartera común de coches. El segundo es un proceso de selección gradual con el modelo de regresión a base de distancia. Y el tercero es un proceso con el modelo conocido y generalizado de regresión linear, que representa la técnica más moderna en la bibliografía actuarial. De estos últimos, si combinamos funciones de eslabón diferentes y distribuciones de error, podemos obtener el aditivo clásico y modelos multiplicativos
Resumo:
In a recent paper A. S. Johal and D. J. Dunstan [Phys. Rev. B 73, 024106 (2006)] have applied multivariate linear regression analysis to the published data of the change in ultrasonic velocity with applied stress. The aim is to obtain the best estimates for the third-order elastic constants in cubic materials. From such an analysis they conclude that uniaxial stress data on metals turns out to be nearly useless by itself. The purpose of this comment is to point out that by a proper analysis of uniaxial stress data it is possible to obtain reliable values of third-order elastic constants in cubic metals and alloys. Cu-based shape memory alloys are used as an illustrative example.
Resumo:
En este documento se ilustra de un modo práctico, el empleo de tres instrumentos que permiten al actuario definir grupos arancelarios y estimar premios de riesgo en el proceso que tasa la clase para el seguro de no vida. El primero es el análisis de segmentación (CHAID y XAID) usado en primer lugar en 1997 por UNESPA en su cartera común de coches. El segundo es un proceso de selección gradual con el modelo de regresión a base de distancia. Y el tercero es un proceso con el modelo conocido y generalizado de regresión linear, que representa la técnica más moderna en la bibliografía actuarial. De estos últimos, si combinamos funciones de eslabón diferentes y distribuciones de error, podemos obtener el aditivo clásico y modelos multiplicativos