824 results for Quantile regression
Abstract:
A large body of evidence shows that education is extremely important in many economic and social dimensions. In Brazil, education is a right guaranteed by the Federal Constitution; however, Brazilian legislation promotes and supports the right to the three stages of basic education (Kindergarten, Elementary and High School) better than the right to education at College level. According to educational census data (INEP, 2009), 78% of all enrolments in College education are in private schools, while the reverse is found in High School: 84% of all enrolments are in public schools, which reveals a contradiction in university admission. In Brazil, public universities mostly receive students who performed better and were prepared in private elementary and high schools, while private universities serve students who received their basic education in public schools, which are characterized as low quality. These facts have led researchers to investigate the possible determinants of student performance on standardized tests, such as the Brazilian Vestibular exam, to guide the development of policies aimed at equal access to College education. Inspired by North American affirmative action policies, some Brazilian public universities have proposed quota policies to enable and facilitate the entry of "minorities" (blacks, pardos, natives, people of low income and public school students) to free College education. At the Federal University of Rio Grande do Norte (UFRN), the first incentives for candidates from public schools emerged in 2006 and have been improved and extended over the last 7 years. This study aimed to analyse and discuss the Argument of Inclusion (AI), the affirmative action policy that provides additional scoring for students from public schools.
Using an extensive database, Ordinary Least Squares (OLS) and Quantile Regression were applied, controlling for the personal, socioeconomic and educational characteristics of candidates in the 2010 Vestibular exam of the Federal University of Rio Grande do Norte (UFRN). The results demonstrate the importance of this incentive system, as well as the magnitude of other variables.
Abstract:
Dissertation (master's), UnB/UFPB/UFRN, Programa MultiInstitucional e Inter-Regional de Pós-Graduação em Ciências Contábeis, 2016.
Abstract:
This study evaluates the performance of 73 Colombian equity-focused collective investment funds (FICs) from 2005 to 2015. To quantify the value generated by these funds relative to their respective benchmarks, Jensen's alpha is calculated using two regression methodologies: Ordinary Least Squares (OLS) and Quantile Regression. We also analyse whether these funds show evidence of market timing, using two models: a quadratic effect and an interactive binary variable. Likewise, our study proposes the creation of a private company in Colombia that would provide investors with accurate information on the characteristics and historical performance of these collective investment funds, as Morningstar Inc. does in the United States. This would allow investors to select the funds with the best prospects and, as would be expected, make this market more efficient and attractive to potential new investors.
Abstract:
What have we learnt from the 2006-2012 crisis, including events such as the subprime crisis, the bankruptcy of Lehman Brothers or the European sovereign debt crisis, among others? It is usually assumed that, for firms with a quoted CDS, this CDS is the key factor in establishing the credit risk premium for a new financial asset. Thus, the CDS is a key element for any investor seeking relative value opportunities across a firm’s capital structure. In the first chapter we study the most relevant aspects of the microstructure of the CDS market in terms of pricing, to gain a clear idea of how this market works. We consider such an analysis a necessary foundation for the empirical studies carried out in the remaining chapters. In its document “Basel III: A global regulatory framework for more resilient banks and banking systems”, the Basel Committee sets a capital charge for credit valuation adjustment (CVA) risk in the trading book and the methodology for computing the capital requirement. This regulatory requirement has added extra pressure for in-depth knowledge of the CDS market, and this motivates the analysis performed in this thesis. The problem arises in estimating the credit risk premium for counterparties without a directly quoted CDS in the market. How can we estimate the credit spread for an issuer without a CDS? In addition, given the period of high volatility in the credit market in the last few years and, in particular, after the default of Lehman Brothers on 15 September 2008, we observe large outliers in the distribution of credit spreads across the different combinations of rating, industry and region. After an exhaustive analysis of the results from the different models studied, we have reached the following conclusions. It is clear that hierarchical regression models fit the data much better than non-hierarchical ones.
Furthermore, we generally prefer the median model (50%-quantile regression) to the mean model (standard OLS regression) due to its robustness when assigning a price to a new credit asset without a spread, minimizing the “inversion problem”. Finally, an additional fundamental reason to prefer the median model is the typical right-skewed distribution of CDS spreads...
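The preference for the median model can be illustrated with the check (pinball) loss that quantile regression minimises. The following is a minimal sketch for the intercept-only case, not the thesis's hierarchical model; the spread values and the helper names `pinball_loss` and `best_constant` are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss for one residual y - q at quantile level tau."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def best_constant(values, tau):
    """Return the observed value that minimises the total pinball loss.

    A minimiser of the check loss always exists among the observed
    values, so scanning them is enough for this small illustration.
    """
    return min(values, key=lambda q: sum(pinball_loss(y - q, tau) for y in values))


# Hypothetical right-skewed spreads (basis points) with one large outlier.
spreads = [80, 95, 100, 110, 120, 135, 900]

mean_spread = sum(spreads) / len(spreads)    # least-squares answer, pulled up by 900
median_spread = best_constant(spreads, 0.5)  # tau = 0.5 recovers the sample median
upper_spread = best_constant(spreads, 0.9)   # a high quantile follows the tail instead
```

In this toy sample the mean is twice the median because of a single observation, which is the kind of sensitivity the abstract attributes to the mean model under right-skewed spreads.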
Abstract:
Methodology: Instrumental-variable quantile regression model for panel data using the partial production function
Abstract:
Given the persistence of regional differences in labour income in Colombia, this article sets out to quantify how much of this differential is attributable to differences in labour market structures, understood as differences in the returns to the characteristics of the labour force. To this end, an Oaxaca-Blinder-type decomposition method is used, and Bogotá, the city with the highest labour income, is compared with other main cities. The decomposition results show that the differences in structure favour Bogotá and explain more than half of the total difference, indicating that reducing labour income disparities between cities requires more than training the labour force: it is also necessary to investigate why the returns to characteristics differ between cities.
Abstract:
The Normal Quantile Transform (NQT) has been used in many hydrological and meteorological applications to make the Cumulative Distribution Function (CDF) of observed, simulated and forecast river discharge, water level or precipitation data Gaussian. It is also the heart of the meta-Gaussian model for assessing the total predictive uncertainty of the Hydrological Uncertainty Processor (HUP) developed by Krzysztofowicz. In the field of geostatistics this transformation is better known as the Normal-Score Transform. This paper discusses some possible problems caused by small sample sizes when applying the NQT in flood forecasting systems and outlines a novel way to solve them by combining extreme value analysis and non-parametric regression methods. The method is illustrated with examples of hydrological stream-flow forecasts.
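As a rough illustration of the transform itself (not of the paper's proposed fix), the sketch below maps each observation to the standard-normal quantile of its empirical plotting position. The Weibull plotting position rank / (n + 1), the function name and the discharge values are assumptions made for this example.

```python
from statistics import NormalDist


def normal_quantile_transform(values):
    """Map each value to the standard-normal quantile of its plotting position.

    Uses the plotting position rank / (n + 1) (an assumption; other
    choices exist) and gives tied values their average rank.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


# Hypothetical discharge values in m^3/s.
discharge = [12.0, 30.5, 18.2, 250.0, 45.1]
scores = normal_quantile_transform(discharge)
```

With n observations the largest score is the n / (n + 1) normal quantile, so values beyond the observed range have no defined image. This is one way to see why small samples leave the tails of the transform undetermined.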
Abstract:
Forecasting wind power is an important part of successfully integrating wind power into the power grid. Forecasts with lead times longer than 6 h are generally made by using statistical methods to post-process forecasts from numerical weather prediction systems. Two major problems that complicate this approach are the non-linear relationship between wind speed and power production and the limited range of power production between zero and the nominal power of the turbine. In practice, these problems are often tackled by using non-linear non-parametric regression models. However, such an approach ignores valuable and readily available information: the power curve of the turbine's manufacturer. Much of the non-linearity can be directly accounted for by transforming the observed power production into wind speed via the inverse power curve so that simpler linear regression models can be used. Furthermore, the limited range of the transformed power production can be handled by employing censored regression models. In this study, we evaluate quantile forecasts from a range of methods: (i) using parametric and non-parametric models, (ii) with and without the proposed inverse power curve transformation and (iii) with and without censoring. The results show that with our inverse (power-to-wind) transformation, simpler linear regression models with censoring perform as well as or better than non-linear models with or without the frequently used wind-to-power transformation.
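The power-to-wind transformation can be sketched as an interpolation of the inverted power curve that also records where the response is censored. The curve values below are invented for illustration and do not come from any manufacturer; `power_to_wind` is a hypothetical helper, and the censored regression step itself is not shown.

```python
# Hypothetical power curve: (wind speed in m/s, power as a fraction of
# nominal power). These numbers are invented for illustration only.
POWER_CURVE = [(3.0, 0.0), (5.0, 0.1), (8.0, 0.45), (11.0, 0.9), (13.0, 1.0)]


def power_to_wind(power, curve=POWER_CURVE):
    """Invert a strictly increasing power curve by linear interpolation.

    Returns (wind_speed, censoring). censoring is "left" when power is at
    zero (the true wind speed is at or below cut-in), "right" when power
    is at nominal (the true wind speed is at or above rated speed), and
    None otherwise.
    """
    if power <= curve[0][1]:
        return curve[0][0], "left"
    if power >= curve[-1][1]:
        return curve[-1][0], "right"
    for (w0, p0), (w1, p1) in zip(curve, curve[1:]):
        if p0 <= power <= p1:
            return w0 + (w1 - w0) * (power - p0) / (p1 - p0), None


# Observed production as a fraction of nominal power (hypothetical).
observed = [0.0, 0.1, 0.275, 1.0]
transformed = [power_to_wind(p) for p in observed]
```

The censoring flags mark the observations a censored regression model would treat as bounds rather than exact values.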
Abstract:
We use sunspot group observations from the Royal Greenwich Observatory (RGO) to investigate the effects of intercalibrating data from observers with different visual acuities. The tests are made by counting the number of groups R_B above a variable cut-off threshold of observed total whole-spot area (uncorrected for foreshortening) to simulate what a lower-acuity observer would have seen. The synthesised annual means of R_B are then re-scaled to the full observed RGO group number R_A using a variety of regression techniques. It is found that a very high correlation between R_A and R_B (r_AB > 0.98) does not prevent large errors in the intercalibration (for example, sunspot maximum values can be over 30 % too large even for such levels of r_AB). In generating the backbone sunspot number (R_BB), Svalgaard and Schatten (2015, this issue) force regression fits to pass through the scatter plot origin, which generates unreliable fits (the residuals do not form a normal distribution) and causes sunspot cycle amplitudes to be exaggerated in the intercalibrated data. It is demonstrated that the use of Quantile-Quantile (“Q-Q”) plots to test for a normal distribution is a useful indicator of erroneous and misleading regression fits. Ordinary least squares linear fits, not forced to pass through the origin, are sometimes reliable (although the optimum method used is shown to be different when matching peak and average sunspot group numbers). However, other fits are only reliable if non-linear regression is used. From these results it is entirely possible that the inflation of solar cycle amplitudes in the backbone group sunspot number as one goes back in time, relative to related solar-terrestrial parameters, is entirely caused by the use of inappropriate and non-robust regression techniques to calibrate the sunspot data.
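The effect of forcing a fit through the origin can be reproduced on a toy data set. The sketch below is not the paper's analysis: the numbers are invented so that the full count equals the thresholded count plus a constant, and the function names are hypothetical.

```python
def fit_through_origin(x, y):
    """Least-squares slope for y = b * x (no intercept)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def fit_with_intercept(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    return my - b * mx, b


# Hypothetical annual means: r_b is the thresholded count, r_a the full count.
# The data follow r_a = 2 + r_b exactly, so the true intercept is not zero.
r_b = [1.0, 2.0, 3.0, 4.0]
r_a = [3.0, 4.0, 5.0, 6.0]

b0 = fit_through_origin(r_b, r_a)
a, b = fit_with_intercept(r_b, r_a)

# Residuals of the forced fit run from positive to negative: a systematic misfit.
residuals_origin = [yi - b0 * xi for xi, yi in zip(r_b, r_a)]
```

Here the two series are perfectly correlated, yet the forced slope is 5/3 instead of 1, so extrapolating to r_b = 10 gives about 16.7 where the generating line gives 12. The trending residuals are the kind of departure from normality that a Q-Q plot would flag.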
Abstract:
The ability to measure gene expression on a genome-wide scale is one of the most promising accomplishments in molecular biology. Microarrays, the technology that first permitted this, were riddled with problems due to unwanted sources of variability. Many of these problems are now mitigated, after a decade’s worth of statistical methodology development. The recently developed RNA sequencing (RNA-seq) technology has generated much excitement, in part due to claims of reduced variability in comparison to microarrays. However, we show that RNA-seq data exhibit unwanted and obscuring variability similar to what was first observed in microarrays. In particular, we find that GC-content has a strong sample-specific effect on gene expression measurements that, if left uncorrected, leads to false positives in downstream results. We also report on commonly observed data distortions that demonstrate the need for data normalization. Here we describe statistical methodology that improves precision by 42% without loss of accuracy. Our resulting conditional quantile normalization (CQN) algorithm combines robust generalized regression, to remove systematic bias introduced by deterministic features such as GC-content, with quantile normalization, to correct for global distortions.
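The quantile normalization component mentioned above can be sketched on its own. This is plain quantile normalization only, not the CQN algorithm (the regression step on GC-content is omitted); the function name and the expression values are hypothetical.

```python
def quantile_normalize(samples):
    """Force every sample to share the same empirical distribution.

    samples: list of equal-length lists, one list per sample.
    Each value is replaced by the mean, across samples, of the values
    holding the same rank. Ties are broken by position for simplicity.
    """
    n = len(samples[0])
    sorted_cols = [sorted(col) for col in samples]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [sum(col[k] for col in sorted_cols) / len(samples) for k in range(n)]
    normalized = []
    for col in samples:
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Hypothetical expression values for four genes in three samples.
samples = [[5.0, 2.0, 3.0, 4.0], [4.0, 1.0, 4.0, 2.0], [3.0, 4.0, 6.0, 8.0]]
normalized = quantile_normalize(samples)
```

After the call, every sample contains the same set of values, and only the assignment of those values to genes differs between samples.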
Abstract:
The performance of the Hosmer-Lemeshow global goodness-of-fit statistic for logistic regression models was explored in a wide variety of conditions not previously fully investigated. Computer simulations, each consisting of 500 regression models, were run to assess the statistic in 23 different situations. The items which varied among the situations included the number of observations used in each regression, the number of covariates, the degree of dependence among the covariates, the combinations of continuous and discrete variables, and the generation of the values of the dependent variable for model fit or lack of fit. The study found that the $C_g^*$ statistic was adequate in tests of significance for most situations. However, when testing data which deviate from a logistic model, the statistic has low power to detect such deviation. Although grouping of the estimated probabilities into quantiles from 8 to 30 was studied, the deciles of risk approach was generally sufficient. Subdividing the estimated probabilities into more than 10 quantiles when there are many covariates in the model is not necessary, despite theoretical reasons which suggest otherwise. Because it does not follow a $\chi^2$ distribution, the statistic is not recommended for use in models containing only categorical variables with a limited number of covariate patterns. The statistic performed adequately when there were at least 10 observations per quantile. Large numbers of observations per quantile did not lead to incorrect conclusions that the model did not fit the data when it actually did. However, the statistic failed to detect lack of fit when it existed and should be supplemented with further tests for the influence of individual observations.
Careful examination of the parameter estimates is also essential since the statistic did not perform as desired when there was moderate to severe collinearity among covariates. Two methods studied for handling tied values of the estimated probabilities made only a slight difference in conclusions about model fit. Neither method split observations with identical probabilities into different quantiles. Approaches which create equal size groups by separating ties should be avoided.
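For orientation, the statistic under study can be computed as in the sketch below, which sums (observed - expected)^2 / (n_g * p_g * (1 - p_g)) over groups of sorted fitted probabilities. It is a simplified illustration, not the study's code: the function name is hypothetical, the fitted probabilities are assumed to lie strictly between 0 and 1, and the grouping rule is the naive equal-size cut.

```python
def hosmer_lemeshow(y, p, groups=10):
    """Hosmer-Lemeshow statistic using groups of (nearly) equal size.

    y: observed 0/1 outcomes; p: fitted probabilities from a logistic model.
    Observations are sorted by p and cut into `groups` consecutive blocks.
    Note: this naive cut can separate tied probabilities, which the
    abstract above advises against; data with ties need more care.
    """
    pairs = sorted(zip(p, y), key=lambda t: t[0])  # stable sort on p only
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        block = pairs[g * n // groups:(g + 1) * n // groups]
        if not block:
            continue
        observed = sum(yi for _, yi in block)
        expected = sum(pi for pi, _ in block)
        size = len(block)
        mean_p = expected / size
        stat += (observed - expected) ** 2 / (size * mean_p * (1.0 - mean_p))
    return stat


# Tiny hypothetical example with two groups.
outcomes = [0, 0, 1, 1]
fitted = [0.2, 0.2, 0.8, 0.8]
hl = hosmer_lemeshow(outcomes, fitted, groups=2)
```

The result is conventionally compared with a chi-square distribution with (groups - 2) degrees of freedom; with the default of 10 groups this is the deciles of risk approach referred to in the abstract.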
Abstract:
Prediction at ungauged sites is essential for water resources planning and management. Ungauged sites have no observations of flood magnitude, but some site and basin characteristics are known. Regression models relate physiographic and climatic basin characteristics to flood quantiles, which can be estimated from observed data at gauged sites. However, these models assume linear relationships between variables. Prediction intervals are estimated from the variance of the residuals in the estimated model. Furthermore, the effect of the uncertainties in the explanatory variables on the dependent variable cannot be assessed. This paper presents a methodology to propagate the uncertainties that arise in the process of predicting flood quantiles at ungauged basins with a regression model. In addition, Bayesian networks were explored as a feasible tool for predicting flood quantiles at ungauged sites. Bayesian networks benefit from taking uncertainties into account thanks to their probabilistic nature. They are able to capture non-linear relationships between variables and they give a probability distribution of discharges as a result. The methodology was applied to a case study in the Tagus basin in Spain.
Abstract:
Analysis of risk measures associated with price series movements, and their prediction, is of strategic importance to financial markets as well as to policy makers, in particular for short- and long-term planning when setting economic growth targets. For example, oil-price risk management focuses primarily on when and how an organization can best prevent costly exposure to price risk. Value-at-Risk (VaR) is the commonly practised instrument to measure risk and is evaluated by analysing the negative/positive tail of the probability distributions of the returns (profit or loss). In modelling applications, least-squares estimation (LSE)-based linear regression models are often employed for modelling and analysing correlated data. These linear models are optimal and perform relatively well under conditions such as errors following normal or approximately normal distributions, being free of large outliers and satisfying the Gauss-Markov assumptions. However, in practical situations the LSE-based linear regression models often fail to provide optimal results, for instance, in non-Gaussian situations especially when the errors follow distributions with fat tails and error terms possess a finite variance. This is the situation in risk analysis, which involves analysing tail distributions. Thus, the appropriateness of LSE-based regression models may be questioned and their applicability may be limited. We have carried out the risk analysis of Iranian crude oil price data based on the Lp-norm regression models and have noted that the LSE-based models do not always perform the best. We discuss results from the L1, L2 and L∞-norm based linear regression models. ACM Computing Classification System (1998): B.1.2, F.1.3, F.2.3, G.3, J.2.
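How the three norms differ can be seen in the simplest regression, a model with only an intercept, where each norm has a closed-form solution. The sketch below makes that simplification and does not fit the paper's linear models; the return series and function names are hypothetical.

```python
def l1_location(values):
    """Median: minimises the sum of absolute deviations (L1 norm)."""
    s = sorted(values)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2.0


def l2_location(values):
    """Mean: minimises the sum of squared deviations (L2 norm)."""
    return sum(values) / len(values)


def linf_location(values):
    """Midrange: minimises the largest absolute deviation (L-infinity norm)."""
    return (min(values) + max(values)) / 2.0


# Hypothetical daily returns in percent, with one extreme move.
returns_pct = [-2, -1, 0, 1, 2, 30]

fits = {
    "L1": l1_location(returns_pct),
    "L2": l2_location(returns_pct),
    "Linf": linf_location(returns_pct),
}
```

On this sample the L1 fit stays near the bulk of the data (0.5), the L2 fit moves to 5.0, and the L∞ fit is set entirely by the two extremes (14.0), which shows how differently the three norms respond to a fat tail.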