993 results for Instrument variable regression


Relevance:

30.00%

Publisher:

Abstract:

An investigator may also wish to select a small subset of the X variables which gives the best prediction of the Y variable. In this case, the question is how many variables the regression equation should include. One method would be to calculate the regression of Y on every subset of the X variables and choose the subset that gives the smallest mean square deviation from the regression. Most investigators, however, prefer to use a ‘stepwise multiple regression’ procedure. There are two forms of this analysis called the ‘step-up’ (or ‘forward’) method and the ‘step-down’ (or ‘backward’) method. This Statnote illustrates the use of stepwise multiple regression with reference to the scenario introduced in Statnote 24, viz., the influence of climatic variables on the growth of the crustose lichen Rhizocarpon geographicum (L.) DC.
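The ‘step-up’ (forward) procedure described above can be sketched as follows. This is a minimal illustration on synthetic data, not the Statnote's own analysis: the function names and the relative-gain stopping rule `min_rel_gain` are assumptions made for the example (an F-to-enter threshold is the more common choice).

```python
import numpy as np

def residual_mean_square(X, y, cols):
    """Residual mean square of an OLS fit of y on an intercept plus X[:, cols]."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    return float(resid @ resid) / (n - design.shape[1])

def forward_stepwise(X, y, min_rel_gain=0.05):
    """Add, one at a time, the variable that most reduces the residual mean square.

    Stops when the best remaining candidate lowers it by less than
    min_rel_gain (hypothetical stopping rule chosen for this sketch).
    """
    selected = []
    remaining = list(range(X.shape[1]))
    current = residual_mean_square(X, y, selected)
    while remaining:
        scores = {j: residual_mean_square(X, y, selected + [j]) for j in remaining}
        best = min(scores, key=scores.get)
        if current - scores[best] < min_rel_gain * current:
            break
        selected.append(best)
        remaining.remove(best)
        current = scores[best]
    return selected

# Synthetic data: only columns 1 and 3 actually influence y.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 1] - 2.0 * X[:, 3] + rng.normal(scale=0.5, size=200)
selected = forward_stepwise(X, y)
```

The ‘step-down’ (backward) variant would start from the full set of variables and remove the least useful one at each step.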

Relevance:

30.00%

Publisher:

Abstract:

The aim of this research work was primarily to examine the relevance of patient parameters, ward structures, procedures and practices, in respect of the potential hazards of wound cross-infection and nasal colonisation with multiple resistant strains of Staphylococcus aureus, which it is thought might provide a useful indication of a patient's general susceptibility to wound infection. Information from a large cross-sectional survey involving 12,000 patients from some 41 hospitals and 375 wards was collected over a five-year period from 1967-72, and its validity checked before any subsequent analysis was carried out. Many environmental factors and procedures which had previously been thought (but never conclusively proved) to have an influence on wound infection or nasal colonisation rates, were assessed, and subsequently dismissed as not being significant, provided that the standard of the current range of practices and procedures is maintained and not allowed to deteriorate. Retrospective analysis revealed that the probability of wound infection was influenced by the patient's age, duration of pre-operative hospitalisation, sex, type of wound, presence and type of drain, number of patients in ward, and other special risk factors, whilst nasal colonisation was found to be influenced by the patient's age, total duration of hospitalisation, sex, antibiotics, proportion of occupied beds in the ward, average distance between bed centres and special risk factors. A multi-variate regression analysis technique was used to develop statistical models, consisting of variable patient and environmental factors which were found to have a significant influence on the risks pertaining to wound infection and nasal colonisation. 
A relationship between wound infection and nasal colonisation was then established and this led to the development of a more advanced model for predicting wound infections, taking advantage of the additional knowledge of the patient's state of nasal colonisation prior to operation.

Relevance:

30.00%

Publisher:

Abstract:

Direct quantile regression involves estimating a given quantile of a response variable as a function of input variables. We present a new framework for direct quantile regression where a Gaussian process model is learned, minimising the expected tilted loss function. The integration required in learning is not analytically tractable, so to speed up the learning we employ the Expectation Propagation algorithm. We describe how this work relates to other quantile regression methods and apply the method to both synthetic and real data sets. The method is shown to be competitive with state-of-the-art methods whilst allowing for the leverage of the full Gaussian process probabilistic framework.
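To make the tilted loss concrete, the sketch below shows only the loss itself: minimising the mean tilted (pinball) loss over a constant prediction recovers the empirical quantile. The Gaussian process model and the Expectation Propagation step from the abstract are not reproduced here; the data, the grid search and the function name are assumptions for illustration.

```python
import numpy as np

def tilted_loss(y, q, tau):
    """Mean tilted (pinball) loss of predicting the constant q at quantile level tau."""
    u = y - q
    # Under-predictions are weighted by tau, over-predictions by (1 - tau).
    return float(np.mean(np.where(u >= 0, tau * u, (tau - 1.0) * u)))

rng = np.random.default_rng(1)
y = rng.normal(size=5000)
tau = 0.9

# Brute-force search over candidate constants (illustration only).
grid = np.linspace(-3.0, 3.0, 601)
losses = [tilted_loss(y, q, tau) for q in grid]
q_hat = grid[int(np.argmin(losses))]

# The minimiser should sit close to the empirical 0.9 quantile.
empirical = float(np.quantile(y, tau))
```

In a regression setting the constant `q` is replaced by a function of the inputs, which is where the model (a Gaussian process in the abstract above) enters.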

Relevance:

30.00%

Publisher:

Abstract:

2002 Mathematics Subject Classification: 62J05, 62G35.

Relevance:

30.00%

Publisher:

Abstract:

2000 Mathematics Subject Classification: 62J12, 62P10.

Relevance:

30.00%

Publisher:

Abstract:

Analysis of risk measures associated with price series data movements, and their prediction, is of strategic importance in the financial markets as well as to policy makers, in particular for short- and long-term planning for setting up economic growth targets. For example, oil price risk management focuses primarily on when and how an organization can best prevent the costly exposure to price risk. Value-at-Risk (VaR) is the commonly practised instrument to measure risk and is evaluated by analysing the negative/positive tail of the probability distributions of the returns (profit or loss). In modelling applications, least-squares estimation (LSE)-based linear regression models are often employed for modelling and analysing correlated data. These linear models are optimal and perform relatively well under conditions such as errors following normal or approximately normal distributions, being free of large outliers and satisfying the Gauss-Markov assumptions. However, often in practical situations, the LSE-based linear regression models fail to provide optimal results, for instance, in non-Gaussian situations especially when the errors follow distributions with fat tails and error terms possess a finite variance. This is the situation in case of risk analysis which involves analyzing tail distributions. Thus, applications of the LSE-based regression models may be questioned for appropriateness and may have limited applicability. We have carried out the risk analysis of Iranian crude oil price data based on the Lp-norm regression models and have noted that the LSE-based models do not always perform the best. We discuss results from the L1, L2 and L∞-norm based linear regression models. ACM Computing Classification System (1998): B.1.2, F.1.3, F.2.3, G.3, J.2.

Relevance:

30.00%

Publisher:

Abstract:

The solution of a TU cooperative game can be a distribution of the value of the grand coalition, i.e. it can be a distribution of the payoff (utility) all the players together achieve. In a regression model, the evaluation of the explanatory variables can be a distribution of the overall fit, i.e. the fit of the model in which every regressor variable is involved. Furthermore, we can take regression models as TU cooperative games where the explanatory (regressor) variables are the players. In this paper we introduce the class of regression games, characterize it and apply the Shapley value to evaluating the explanatory variables in regression models. In order to support our approach we consider Young (1985)'s axiomatization of the Shapley value, and conclude that the Shapley value is a reasonable tool to evaluate the explanatory variables of regression models.
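A small sketch of the idea, under one explicit assumption: the value of a coalition of regressors is taken to be the R² of the OLS model that uses exactly those regressors (the paper may define the fit differently). The Shapley value then distributes the full model's R² among the regressors. Function names and data are hypothetical.

```python
import numpy as np
from itertools import combinations
from math import factorial

def r_squared(X, y, cols):
    """R^2 of an OLS fit of y on an intercept plus the columns in cols (0 for the empty set)."""
    if not cols:
        return 0.0
    n = len(y)
    design = np.column_stack([np.ones(n), X[:, list(cols)]])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    total = y - y.mean()
    return 1.0 - float(resid @ resid) / float(total @ total)

def shapley_values(X, y):
    """Shapley value of each regressor in the game v(S) = R^2 of the model on S."""
    p = X.shape[1]
    phi = np.zeros(p)
    for i in range(p):
        others = [j for j in range(p) if j != i]
        for size in range(p):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(p - size - 1) / factorial(p)
            for S in combinations(others, size):
                phi[i] += weight * (r_squared(X, y, S + (i,)) - r_squared(X, y, S))
    return phi

# Synthetic data: column 0 matters most, column 2 not at all.
rng = np.random.default_rng(3)
X = rng.normal(size=(300, 3))
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(size=300)
phi = shapley_values(X, y)
full_fit = r_squared(X, y, (0, 1, 2))
```

By the efficiency property of the Shapley value, the entries of `phi` sum to `full_fit`, which is the "distribution of the overall fit" the abstract refers to. The enumeration is exponential in the number of regressors, so this form is only practical for small models.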

Relevance:

30.00%

Publisher:

Abstract:

This paper explains how Poisson regression can be used in studies in which the dependent variable describes the number of occurrences of some rare event such as suicide. After pointing out why ordinary linear regression is inappropriate for treating dependent variables of this sort, we go on to present the basic Poisson regression model and show how it fits in the broad class of generalized linear models. Then we turn to discussing a major problem of Poisson regression known as overdispersion and suggest possible solutions, including the correction of standard errors and negative binomial regression. The paper ends with a detailed empirical example, drawn from our own research on suicide.
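The following sketch illustrates the two points the abstract raises: fitting a log-link Poisson regression and checking for overdispersion. It uses synthetic counts, not the suicide data from the paper; the IRLS fitting routine, the Pearson dispersion statistic and all names are choices made for this example.

```python
import numpy as np

def fit_poisson(X, y, n_iter=25):
    """Fit a log-link Poisson regression by iteratively reweighted least squares.

    Assumes column 0 of X is the intercept.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(eta)
        z = eta + (y - mu) / mu      # working response
        XtW = X.T * mu               # working weights equal the fitted means
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    return beta

def pearson_dispersion(X, y, beta):
    """Pearson chi-square divided by residual degrees of freedom (about 1 for Poisson data)."""
    mu = np.exp(X @ beta)
    return float(np.sum((y - mu) ** 2 / mu)) / (len(y) - X.shape[1])

rng = np.random.default_rng(2)
n = 2000
x = rng.uniform(-1.0, 1.0, size=n)
X = np.column_stack([np.ones(n), x])
mu_true = np.exp(0.5 + 0.8 * x)

# Counts that really are Poisson, and counts with the same mean but extra variance.
y_pois = rng.poisson(mu_true)
r = 2.0
y_over = rng.negative_binomial(r, r / (r + mu_true))

beta_pois = fit_poisson(X, y_pois)
beta_over = fit_poisson(X, y_over)
disp_pois = pearson_dispersion(X, y_pois, beta_pois)
disp_over = pearson_dispersion(X, y_over, beta_over)
```

A dispersion value well above 1 signals overdispersion. One common correction of the standard errors multiplies the Poisson standard errors by the square root of this statistic; fitting a negative binomial model is the alternative the abstract mentions.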

Relevance:

30.00%

Publisher:

Abstract:

Five models delineating the person-situation fit controversy were developed and tested. Hypotheses were tested to determine the linkages between vision congruence, empowerment, locus of control, job satisfaction, organizational commitment, and employee performance. Vision was defined as a mental image of a possible and desirable future state of the organization. Data were collected from 213 employees in a major flower import company. Participants were from various organizational levels and ethnic backgrounds. The data collection procedure consisted of three parts. First, a profile analysis instrument was used which was developed employing a Q-sort based technique, to measure the vision congruence between the CEO and each employee. Second, employees completed a survey instrument which included scales measuring empowerment, locus of control, job satisfaction, organizational commitment, and social desirability. Third, supervisor performance ratings were gathered from employee files. Data analysis consisted of using Kendall's tau to measure the correlation between CEO's and each employee's vision. Path analyses were conducted using the EQS structural equation program to test five theoretical models for goodness-of-fit. Regression analysis was employed to test whether locus of control acted as a moderator variable. The results showed that vision congruence is significantly related to job satisfaction and employee commitment, and perceived empowerment acts as an intervening variable affecting employee outcomes. The study also found that people with an internal locus of control were more likely to feel empowered than were those with external beliefs. Implications of these findings for both researchers and practitioners are discussed and suggestions for future research directions are provided.

Relevance:

30.00%

Publisher:

Abstract:

Annual average daily traffic (AADT) is important information for many transportation planning, design, operation, and maintenance activities, as well as for the allocation of highway funds. Many studies have attempted AADT estimation using the factor approach, regression analysis, time series, and artificial neural networks. However, these methods are unable to account for the spatially variable influence of independent variables on the dependent variable, even though it is well known that for many transportation problems, including AADT estimation, spatial context is important. In this study, applications of geographically weighted regression (GWR) methods to estimating AADT were investigated. The GWR-based methods considered the influence of correlations among the variables over space and the spatial non-stationarity of the variables. A GWR model allows different relationships between the dependent and independent variables to exist at different points in space. In other words, model parameters vary from location to location, and the locally linear regression parameters at a point are affected more by observations near that point than by observations further away. The study area was Broward County, Florida. Broward County lies on the Atlantic coast between Palm Beach and Miami-Dade counties. In this study, a total of 67 variables were considered as potential AADT predictors, and six variables (lanes, speed, regional accessibility, direct access, density of roadway length, and density of seasonal household) were selected to develop the models. To investigate the predictive powers of various AADT predictors over space, statistics including local r-square, local parameter estimates, and local errors were examined and mapped. The local variations in relationships among parameters were investigated, measured, and mapped to assess the usefulness of GWR methods. The results indicated that the GWR models were able to better explain the variation in the data and to predict AADT with smaller errors than the ordinary linear regression models for the same dataset. Additionally, GWR was able to model the spatial non-stationarity in the data, i.e., the spatially varying relationship between AADT and predictors, which cannot be modeled in ordinary linear regression.
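The core of a GWR fit, in which observations near a point count more than distant ones, can be sketched as a weighted least-squares regression repeated at every location. The Gaussian kernel, the fixed bandwidth, the function name and the synthetic data below are assumptions for illustration; they are not the study's Broward County model.

```python
import numpy as np

def gwr_local_coefficients(coords, X, y, bandwidth):
    """Fit one weighted least-squares regression per observation location.

    Weights decay with distance through a Gaussian kernel (an assumed choice).
    Returns an array with one row of [intercept, slopes...] per location.
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    betas = np.empty((n, design.shape[1]))
    for i in range(n):
        d2 = np.sum((coords - coords[i]) ** 2, axis=1)
        w = np.exp(-0.5 * d2 / bandwidth ** 2)
        XtW = design.T * w
        betas[i] = np.linalg.solve(XtW @ design, XtW @ y)
    return betas

# Synthetic example: the effect of x on y grows from west to east.
rng = np.random.default_rng(4)
n = 400
coords = rng.uniform(0.0, 1.0, size=(n, 2))
x = rng.normal(size=n)
true_slope = 1.0 + 2.0 * coords[:, 0]
y = true_slope * x + rng.normal(scale=0.1, size=n)

betas = gwr_local_coefficients(coords, x, y, bandwidth=0.15)
corr = float(np.corrcoef(betas[:, 1], true_slope)[0, 1])
```

An ordinary linear regression would return a single slope for the whole area, whereas the local slopes in `betas[:, 1]` track the spatial trend. In practice the bandwidth is usually chosen by cross-validation or an information criterion rather than fixed by hand.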

Relevance:

30.00%

Publisher:

Abstract:

The purpose of this study was to develop an instrument to measure high school students’ perspectives on global awareness and attitudes toward social issues. The research questions that guided this study were: (a) Can acceptable validity and reliability estimates be established for an instrument developed to measure high school students' global awareness? (b) Can acceptable validity and reliability estimates be established for an instrument developed to measure high school students' attitudes towards global social issues? (c) What is the relationship between high school students’ GPA, race/ethnicity, gender, socio-economic status, parents’ education, getting the news, reading and listening habits, the number of classes taken in the social sciences, whether they speak a second language, and have experienced living in or visiting other countries, and their perception of global awareness and attitudes toward global social issues? An ex post facto research design was used and the data were collected using a 4-part Likert-type survey. It was administered to 704 students at 14 schools in the Miami-Dade County, Florida area. A factor analysis with an orthogonal varimax rotation was used to select the factors that best represented the three constructs – global education, global citizenship, and global workforce. This was done to establish construct validity. Cronbach’s alpha was used to determine the reliability of the instrument. Descriptive statistics and a hierarchical multiple regression were used for the demographics to establish their relationship, if any, to the findings. Key findings of the study were that reliable and valid estimates can be developed for the instrument. The multiple regression analysis for models 1 and 2 accounted for 3% and 5% of the variance in self-perceptions of global awareness (factor 1). The regression model also accounted for 5% and 13% of the variance in the two models for attitudes toward global social issues (factor 2). The demographics that were statistically significant were: ethnicity, gender, SES, parents’ education, listening to music, getting the news, speaking a second language, GPA, classes taken in the social sciences, and visiting other countries. An important finding for the study was that those attending public schools (as opposed to private schools) had more positive attitudes towards global social issues (factor 2). The statistics indicated that these students had taken history, economics, and social studies – a curriculum infused with global perspectives.
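For readers unfamiliar with the reliability statistic mentioned above, Cronbach's alpha can be computed directly from a respondents-by-items score matrix. The sketch below uses made-up scores, not the survey data, and the function name is hypothetical.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) matrix of scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)        # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)    # variance of the summed scale
    return k / (k - 1) * (1.0 - item_vars.sum() / total_var)

rng = np.random.default_rng(5)

# Five items that all repeat the same underlying score: perfectly consistent.
trait = rng.normal(size=1000)
consistent = np.column_stack([trait] * 5)
alpha_consistent = cronbach_alpha(consistent)

# Five unrelated items: no internal consistency.
unrelated = rng.normal(size=(1000, 5))
alpha_unrelated = cronbach_alpha(unrelated)
```

Values close to 1 indicate that the items move together, while values near 0 indicate that they do not measure a common construct.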

Relevance:

30.00%

Publisher:

Abstract:

Fitting statistical models is computationally challenging when the sample size or the dimension of the dataset is huge. An attractive approach for down-scaling the problem size is to first partition the dataset into subsets and then fit using distributed algorithms. The dataset can be partitioned either horizontally (in the sample space) or vertically (in the feature space), and the challenge arises in defining an algorithm with low communication, theoretical guarantees and excellent practical performance in general settings. For sample space partitioning, I propose a MEdian Selection Subset AGgregation Estimator (message) algorithm for solving these issues. The algorithm applies feature selection in parallel for each subset using regularized regression or a Bayesian variable selection method, calculates the 'median' feature inclusion index, estimates coefficients for the selected features in parallel for each subset, and then averages these estimates. The algorithm is simple, involves minimal communication, scales efficiently in sample size, and has theoretical guarantees. I provide extensive experiments to show excellent performance in feature selection, estimation, prediction, and computation time relative to usual competitors.

While sample space partitioning is useful in handling datasets with large sample size, feature space partitioning is more effective when the data dimension is high. Existing methods for partitioning features, however, are either vulnerable to high correlations or inefficient in reducing the model dimension. In the thesis, I propose a new embarrassingly parallel framework named DECO for distributed variable selection and parameter estimation. In DECO, variables are first partitioned and allocated to m distributed workers. The decorrelated subset data within each worker are then fitted via any algorithm designed for high-dimensional problems. We show that by incorporating the decorrelation step, DECO can achieve consistent variable selection and parameter estimation on each subset with (almost) no assumptions. In addition, the convergence rate is nearly minimax optimal for both sparse and weakly sparse models and does not depend on the partition number m. Extensive numerical experiments are provided to illustrate the performance of the new framework.

For datasets with both large sample sizes and high dimensionality, I propose a new "divide-and-conquer" framework DEME (DECO-message) by leveraging both the DECO and the message algorithm. The new framework first partitions the dataset in the sample space into row cubes using message and then partitions the feature space of the cubes using DECO. This procedure is equivalent to partitioning the original data matrix into multiple small blocks, each with a feasible size that can be stored and fitted in a computer in parallel. The results are then synthesized via the DECO and message algorithm in a reverse order to produce the final output. The whole framework is extremely scalable.

Relevance:

30.00%

Publisher:

Abstract:

In the past few years, there has been a concern among economists and policy makers that increased openness to international trade affects some regions in a country more than others. Recent research has found that local labor markets more exposed to import competition through their initial employment composition experience worse outcomes in several dimensions such as employment, wages, and poverty. Although there is evidence that regions within a country exhibit variation in the intensity with which they trade with each other and with other countries, trade linkages have been ignored in empirical analyses of the regional effects of trade, which focus on differences in employment composition. In this dissertation, I investigate how local labor markets' trade linkages shape the response of wages to international trade shocks. In the second chapter, I lay out a standard multi-sector general equilibrium model of trade, where domestic regions trade with each other and with the rest of the world. Using this benchmark, I decompose a region's wage change resulting from a national import cost shock into a direct effect on prices, holding other endogenous variables constant, and a series of general equilibrium effects. I argue the direct effect provides a natural measure of exposure to import competition within the model since it summarizes the effect of the shock on a region's wage as a function of initial conditions given by its trade linkages. I call my proposed measure linkage exposure while I refer to the measures used in previous studies as employment exposure. My theoretical analysis also shows that the assumptions previous studies make on trade linkages are not consistent with the standard trade model. In the third chapter, I calibrate the model to the Brazilian economy in 1991, at the beginning of a period of trade liberalization, to perform a series of experiments. 
In each of them, I reduce the Brazilian import cost by 1 percent in a single sector and I calculate how much of the cross-regional variation in counterfactual wage changes is explained by exposure measures. Over this set of experiments, employment exposure explains, for the median sector, 2 percent of the variation in counterfactual wage changes while linkage exposure explains 44 percent. In addition, I propose an estimation strategy that incorporates trade linkages in the analysis of the effects of trade on observed wages. In the model, changes in wages are completely determined by changes in market access, an endogenous variable that summarizes the real demand faced by a region. I show that a linkage measure of exposure is a valid instrument for changes in market access within Brazil. By using observed wage changes in Brazil between 1991 and 2000, my estimates imply that a region at the 25th percentile of the change in domestic market access induced by trade liberalization experiences a 0.6 log points larger wage decline (or smaller wage increase) than a region at the 75th percentile. The estimates from a regression of wage changes on exposure imply that a region at the 25th percentile of exposure experiences a 3 log points larger wage decline (or smaller wage increase) than a region at the 75th percentile. I conclude that estimates based on exposure overstate the negative impact of trade liberalization on wages in Brazil. In the fourth chapter, I extend the standard model to allow for two types of workers according to their education levels: skilled and unskilled. I show that there is substantial variation across Brazilian regions in the skill premium. I use the exogenous variation provided by tariff changes to estimate the impact of market access on the skill premium. I find that decreased domestic market access resulting from trade liberalization resulted in a higher skill premium. 
I propose a mechanism to explain this result: that the manufacturing sector is relatively more intensive in unskilled labor and I show empirical evidence that supports this hypothesis.
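The estimation strategy above, in which an exposure measure serves as an instrument for an endogenous regressor, follows the general logic of two-stage least squares. The sketch below illustrates that logic on synthetic data only; the variable names, coefficients and data-generating process are assumptions and have no connection to the Brazilian data or the dissertation's estimates.

```python
import numpy as np

def ols(X, y):
    """Ordinary least-squares coefficients."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def two_stage_least_squares(y, x_endog, z):
    """2SLS with one endogenous regressor and one instrument (intercept added)."""
    n = len(y)
    Z = np.column_stack([np.ones(n), z])
    x_hat = Z @ ols(Z, x_endog)                 # first stage: project x on the instrument
    X_hat = np.column_stack([np.ones(n), x_hat])
    return ols(X_hat, y)                        # second stage: regress y on fitted x

rng = np.random.default_rng(6)
n = 5000
z = rng.normal(size=n)                          # instrument (hypothetical exposure measure)
u = rng.normal(size=n)                          # unobserved confounder
x = 0.8 * z + u + rng.normal(scale=0.5, size=n) # endogenous regressor
y = 1.0 + 0.5 * x - 1.0 * u + rng.normal(scale=0.5, size=n)

beta_ols = ols(np.column_stack([np.ones(n), x]), y)
beta_iv = two_stage_least_squares(y, x, z)
```

Because `u` affects both `x` and `y`, the plain regression slope `beta_ols[1]` is biased, while `beta_iv[1]` should be close to the true effect of 0.5. Standard errors taken from the second-stage regression are not valid as they stand and need the usual 2SLS correction.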

Relevance:

30.00%

Publisher:

Abstract:

Dissertation (Master's), Universidade de Brasília, Faculdade de Economia, Administração e Contabilidade, Programa de Pós-Graduação em Administração, 2016.

Relevance:

30.00%

Publisher:

Abstract:

Tourist accommodation expenditure is a widely investigated topic as it represents a major contribution to the total tourist expenditure. The identification of the determinant factors is commonly based on supply-driven applications while little research has been done on important travel characteristics. This paper proposes a demand-driven analysis of tourist accommodation price by focusing on data generated from room bookings. The investigation focuses on modeling the relationship between key travel characteristics and the price paid to book the accommodation. To accommodate the distributional characteristics of the expenditure variable, the analysis is based on the estimation of a quantile regression model. The findings support the econometric approach used and enable the elaboration of relevant managerial implications.