960 results for Hybrid semi-parametric modeling
Abstract:
The Retrospective Survey of Selected Workers in Québec (Enquête rétrospective sur les travailleurs sélectionnés au Québec) made it possible to analyse the training-employment relationship of immigrant women admitted as principal applicants, and to examine their employment trajectories in comparison with their male counterparts. Particular attention is paid to the effects of gender and region of origin, as well as the interaction between these two variables. Semi-parametric Cox models highlight how individual characteristics, but also training activities undertaken in the host society, affect over time the relative risks of obtaining a first job matching one's pre-migration educational qualifications. Linear regressions then identify the determinants of wages after two years in the country. The results show that access to qualified employment does not differ according to whether the immigrant is a man or a woman. Intragroup differences do appear, however, by region of origin, with a clear advantage for immigrants from Western Europe and the United States. Access to a first job (regardless of qualifications) and wages, for their part, reveal gender-based differences, to the disadvantage of women. Among women, labour-market entry proceeds similarly across regional groups, whereas the male groups are more heterogeneous. Moreover, certain individual characteristics, such as knowledge of French and admission category, affect male and female immigrants differently in access to a first job.
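The Cox models mentioned in this abstract estimate how covariates shift the relative risk of a first qualified job over time. As an illustrative sketch only (simulated data, a single binary covariate, no ties or censoring; nothing from the survey itself), a Cox coefficient can be obtained by maximizing the partial likelihood directly:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.integers(0, 2, n)                 # hypothetical binary covariate (e.g. group indicator)
true_beta = 0.7
# exponential event times with hazard exp(beta * x): higher hazard when x = 1
t = rng.exponential(1.0 / np.exp(true_beta * x))

def partial_loglik(beta):
    """Cox partial log-likelihood, every observation treated as an event."""
    ll = 0.0
    for i in range(n):
        at_risk = t >= t[i]               # risk set at the i-th event time
        ll += beta * x[i] - np.log(np.exp(beta * x[at_risk]).sum())
    return ll

# maximize over a grid (Newton-Raphson would be used in practice)
grid = np.linspace(-2.0, 2.0, 401)
beta_hat = grid[np.argmax([partial_loglik(b) for b in grid])]
print(round(beta_hat, 2))
```

The exponentiated coefficient is the estimated relative risk between the two groups, which is how results of this kind are usually reported.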
Abstract:
Work carried out under the supervision of a jury committee composed of the following members: Dr Leila Ben Amor, Dr Diane Sauriol, Daniel Fiset, PhD, and Éric Lacourse, PhD.
Abstract:
This thesis comprises three articles, one published and two in preparation. Its central topic is the treatment of representative outliers in two important aspects of surveys: small area estimation and imputation in the presence of item non-response. Regarding small areas, robust estimators under unit-level models have been studied. Sinha & Rao (2009) propose a robust version of the empirical best linear unbiased predictor for small area means. Their robust estimator is of the plug-in type and, in light of the work of Chambers (1986), can be biased in some situations. Chambers et al. (2014) propose a bias-corrected estimator. In addition, estimators of the mean squared error have been associated with these point estimators: Sinha & Rao (2009) propose a parametric bootstrap procedure, while analytic methods are proposed in Chambers et al. (2014). However, their theoretical validity has not been established, and their empirical performance is not fully satisfactory. Here, we examine two new approaches to obtaining a robust version of the empirical best linear unbiased predictor: the first builds on the work of Chambers (1986), and the second is based on the concept of conditional bias as a measure of the influence of a population unit. Both classes of robust small area estimators also include a bias-correction term. Unlike the estimator of Chambers et al. (2014), which uses only the information available in the domain of interest, they both use the information available in all domains.
In some situations, a non-negligible bias is possible for the Sinha & Rao (2009) estimator, whereas the proposed estimators exhibit a small bias for an appropriate choice of the influence function and the robustness constant. Monte Carlo simulations were carried out comparing the proposed estimators with those of Sinha & Rao (2009) and Chambers et al. (2014). The results show that the latter two can be substantially biased, while the proposed estimators perform better in terms of both bias and mean squared error. We also propose a new bootstrap procedure for estimating the mean squared error of robust small area estimators. Unlike existing procedures, we formally establish the asymptotic validity of the proposed bootstrap method. Moreover, the proposed method is semi-parametric: it does not rely on distributional assumptions for the errors or the random effects, which makes it particularly attractive and more widely applicable. We examine the performance of our bootstrap procedure in Monte Carlo simulations; the results show that it performs well, and in particular better than all the competitors studied. An application of the proposed method is illustrated by analysing the real, outlier-containing data of Battese, Harter & Fuller (1988). Turning to imputation in the presence of item non-response, several forms of single imputation have been studied. Deterministic within-class regression imputation, which includes ratio imputation and mean imputation, is often used in surveys.
These imputation methods can lead to biased imputed estimators if the imputation model or the non-response model is misspecified. Doubly robust estimators have been developed in recent years; they are unbiased if at least one of the imputation or non-response models is correctly specified. In the presence of outliers, however, doubly robust imputed estimators can be very unstable. Using the concept of conditional bias, we propose an outlier-robust version of the doubly robust estimator. Simulation studies show that the proposed estimator performs well for an appropriate choice of the robustness constant.
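The robustification described in this abstract rests on bounding the influence of outlying units. A generic, hypothetical illustration of that idea (not the thesis's conditional-bias estimator) is a Huber-type M-estimate of a mean, which downweights representative outliers that would otherwise dominate the ordinary mean:

```python
import numpy as np

def psi_huber(u, c):
    """Huber influence function: identity on [-c, c], clipped beyond."""
    return np.clip(u, -c, c)

rng = np.random.default_rng(1)
y = rng.normal(10.0, 1.0, 100)
y[:3] = 200.0                             # a few representative outliers

# robust scale via the normalized median absolute deviation
s = 1.4826 * np.median(np.abs(y - np.median(y)))

# M-estimate of the mean: iterate m until mean(psi((y - m)/s)) is zero
m = np.median(y)
for _ in range(50):
    m = m + s * psi_huber((y - m) / s, c=1.345).mean()

print(round(y.mean(), 1), round(m, 1))    # ordinary mean vs robust mean
```

The ordinary mean is dragged far from the bulk of the data by the three outliers, while the bounded influence function keeps the robust estimate near 10; the constant c plays the role of the robustness constant discussed in the abstract.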
Abstract:
Multivariate lifetime data arise in various forms, including recurrent event data, when individuals are followed to observe the sequence of occurrences of a certain type of event, and correlated lifetimes, when an individual is followed for the occurrence of two or more types of events or when distinct individuals have dependent event times. In most studies there are covariates, such as treatments, group indicators, individual characteristics, or environmental conditions, whose relationship to lifetime is of interest; this leads to the consideration of regression models. The well-known Cox proportional hazards model and its variations, which use marginal hazard functions for the analysis of multivariate survival data in the literature, are not sufficient to explain the complete dependence structure of a pair of lifetimes on the covariate vector. Motivated by this, in Chapter 2 we introduce a bivariate proportional hazards model using the vector hazard function of Johnson and Kotz (1975), in which the covariates under study have different effects on the two components of the vector hazard function. The proposed model is useful in real-life situations for studying the dependence structure of a pair of lifetimes on the covariate vector. The well-known partial likelihood approach is used for the estimation of the parameter vectors. In Chapter 3 we introduce a bivariate proportional hazards model for gap times of recurrent events; the model incorporates both marginal and joint dependence of the distribution of gap times on the covariate vector. In many fields of application, the mean residual life function is considered a more informative concept than the hazard function. Motivated by this, in Chapter 4 we consider a new semi-parametric model, a bivariate proportional mean residual life model, to assess the relationship between mean residual life and covariates for gap times of recurrent events.
The counting process approach is used for inference on the gap times of recurrent events. In many survival studies the distribution of lifetime may depend on the distribution of censoring time; in Chapter 5 we introduce a proportional hazards model for duration times and develop inference procedures under dependent (informative) censoring. In Chapter 6 we introduce a bivariate proportional hazards model for competing risks data under right censoring. The asymptotic properties of the estimators of the parameters of the models developed in the preceding chapters are studied, and the proposed models are applied to various real-life situations.
Abstract:
The service quality of any sector has two major aspects, namely technical and functional. Technical quality can be attained by maintaining the technical specifications decided by the organization; functional quality refers to the manner in which the service is delivered to the customer, which can be assessed from customer feedback. A field survey was conducted based on the SERVQUAL management tool, with 28 constructs designed under 7 dimensions of service quality. Stratified sampling techniques were used to obtain 336 valid responses, and the gap scores between expectations and perceptions were analysed using statistical techniques to identify the weakest dimension. To assess the technical aspect of availability, six months of live outage data from base transceiver stations were collected. Statistical and exploratory techniques were used to model the network performance: the failure patterns were modelled as competing risks, and the probability distributions of service outage and restoration times were parameterized. Since the availability of the network is a function of the reliability and maintainability of the network elements, any service provider who wishes to keep up service level agreements on availability should be aware of the variability of these elements and the effects of their interactions. Availability variations were studied by designing a discrete-time event simulation model with probabilistic input parameters. The distribution parameters derived from the live data analysis were used to design experiments that define the availability domain of the network under consideration, which can serve as a reference for planning and implementing maintenance activities. A new metric is proposed that incorporates a consistency index along with key service parameters and can be used to compare the performance of different service providers.
The developed tool can be used for reliability analysis of mobile communication systems and assumes greater significance in the wake of mobile number portability. It also makes possible a relative measure of the effectiveness of different service providers.
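The core of an availability simulation of this kind can be sketched in a few lines: alternate random up and down times and report the fraction of the horizon spent up. The MTBF/MTTR values below are hypothetical, and exponential failure and restoration times are assumed purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
mtbf, mttr = 500.0, 5.0                   # hypothetical mean hours between failures / to restore

def simulate_availability(horizon=100_000.0):
    """Alternate up/down cycles until the horizon; return fraction of time up."""
    t = up = 0.0
    while t < horizon:
        u = rng.exponential(mtbf)         # time to next failure
        d = rng.exponential(mttr)         # restoration time
        up += min(u, horizon - t)         # clip the last cycle at the horizon
        t += u + d
    return up / horizon

a = simulate_availability()
print(round(a, 4))
```

With these inputs the steady-state availability is MTBF / (MTBF + MTTR) = 500/505, about 0.990, and the simulation fluctuates around that value; sweeping the two input distributions is what traces out an "availability domain" like the one described above.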
Beyond infrastructure: the impact of public libraries on the quality of education
Abstract:
The literature on educational quality has paid little attention to the role of public libraries among the determinants of educational performance. Public libraries are assets external to the school and to the student's home, but they form part of the surrounding social environment. The opening at the end of 2001 of three large libraries in Bogotá, known as megalibraries, allows us to analyse the impact of these initiatives on the quality of education in nearby schools. This impact would operate through mechanisms beyond the simple reduction of the cost of access to information: the libraries renewed the public space by creating pleasant, education-friendly environments, and they regularly offer recreational activities aimed at the residents of the area. Using the distance from the school to the library as a proxy for the cost of access, we apply a difference-in-differences design together with a Blinder-Oaxaca decomposition. We find that the libraries appear to have no significant impact on overall academic performance in the official SABER 11 examinations in the years following their opening. We recommend analysing specific programmes that use the libraries for school activities, as well as other possible outcome variables such as attitudes toward study and aspirations to higher education.
Abstract:
This paper explores the changing survival patterns of cereal crop variety innovations in the UK since the introduction of plant breeders' rights in the mid-1960s. Using non-parametric, semi-parametric and parametric approaches, we examine the determinants of the survival of wheat variety innovations, focusing on the impacts of changes to the Plant Variety Protection (PVP) regime over the last four decades. We find that the period since the introduction of the PVP regime has been characterised by the accelerated development of new varieties and increased private-sector participation in the breeding of cereal crop varieties. However, the increased flow of varieties has been accompanied by a sharp decline in the longevity of innovations. These trends may have contributed to a reduction in the returns appropriated by plant breeders from protected variety innovations, and may explain both the decline of conventional plant breeding in the UK and the persistent demand from the seed industry for stronger protection. The strengthening of the PVP regime in conformity with the UPOV Convention of 1991, the introduction of EU-wide protection through the Community Plant Variety Office, and the introduction of royalties on farm-saved seed have had a positive effect on the longevity of protected variety innovations, but have not been adequate to offset the long-term decline in survival durations.
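The non-parametric part of a survival analysis like this one typically starts from a Kaplan-Meier estimate of the survivor function. The sketch below applies a hand-rolled Kaplan-Meier estimator to invented variety lifetimes, with censoring marking varieties still on the market at the end of observation (the numbers are illustrative, not the paper's data):

```python
import numpy as np

# hypothetical "years on the market" for ten varieties; 0 = still marketed (censored)
durations = np.array([2, 3, 3, 5, 6, 8, 10, 12, 15, 20])
observed  = np.array([1, 1, 0, 1, 1, 1, 0,  1,  1,  1])

def kaplan_meier(t, d):
    """Return (time, survival) pairs of the product-limit estimator."""
    order = np.argsort(t, kind="stable")  # events sorted before censorings at ties
    t, d = t[order], d[order]
    n = len(t)
    surv, s = [], 1.0
    for i in range(n):
        at_risk = n - i                   # number still at risk just before t[i]
        if d[i]:                          # survival drops only at observed exits
            s *= 1.0 - 1.0 / at_risk
        surv.append((t[i], s))
    return surv

for time, s in kaplan_meier(durations, observed):
    print(time, round(s, 3))
```

Comparing such curves across pre- and post-PVP cohorts, and then moving to semi-parametric (Cox) and parametric regressions, is the usual progression for studying determinants of longevity.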
Abstract:
We address the problem of automatically identifying and restoring damaged and contaminated images. We suggest a novel approach based on a semi-parametric model. This has two components, a parametric component describing known physical characteristics and a more flexible non-parametric component. The latter avoids the need for a detailed model for the sensor, which is often costly to produce and lacking in robustness. We assess our approach using an analysis of electroencephalographic images contaminated by eye-blink artefacts and highly damaged photographs contaminated by non-uniform lighting. These experiments show that our approach provides an effective solution to problems of this type.
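A common way to fit a semi-parametric model of this kind, a parametric linear part plus a flexible non-parametric part, is backfitting: alternate between estimating the parametric coefficient and smoothing the residual. The sketch below is a generic partial-linear illustration on simulated data (the kernel smoother, bandwidth, and data-generating model are all assumptions for the demo, not the paper's estimator):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 400
t = np.sort(rng.uniform(0.0, 1.0, n))     # index over which contamination varies
x = rng.normal(0.0, 1.0, n)               # covariate with a known linear effect
f = np.sin(2 * np.pi * t)                 # unknown smooth contamination component
y = 1.5 * x + f + rng.normal(0.0, 0.1, n)

def smooth(t, r, h=0.05):
    """Nadaraya-Watson Gaussian-kernel smoother of r against t."""
    K = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
    return (K * r[None, :]).sum(axis=1) / K.sum(axis=1)

# backfitting: alternate the parametric least-squares step and the smoothing step
beta, f_hat = 0.0, np.zeros(n)
for _ in range(20):
    beta = np.dot(x, y - f_hat) / np.dot(x, x)
    f_hat = smooth(t, y - beta * x)

print(round(beta, 2))
```

The parametric coefficient is recovered while the non-parametric component absorbs the structured contamination, which is the division of labour the abstract describes.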
Abstract:
In this article, we introduce a semi-parametric Bayesian approach based on Dirichlet process priors for the discrete calibration problem in binomial regression models. An interesting topic is the dosimetry problem related to the dose-response model. A hierarchical formulation is provided so that a Markov chain Monte Carlo approach is developed. The methodology is applied to simulated and real data.
Abstract:
A number of recent works have introduced statistical methods for detecting genetic loci that affect phenotypic variability, which we refer to as variability-controlling quantitative trait loci (vQTL). These are genetic variants whose allelic state predicts how much phenotype values will vary about their expected means. Such loci are of great potential interest in both human and non-human genetic studies, one reason being that a detected vQTL could represent a previously undetected interaction with other genes or environmental factors. The simultaneous publication of these new methods in different journals has in many cases precluded opportunity for comparison. We survey some of these methods, the respective trade-offs they imply, and the connections between them. The methods fall into three main groups: classical non-parametric, fully parametric, and semi-parametric two-stage approximations. Choosing between alternatives involves balancing the need for robustness, flexibility, and speed. For each method, we identify important assumptions and limitations, including those of practical importance, such as their scope for including covariates and random effects. We show in simulations that both parametric methods and their semi-parametric approximations can give elevated false positive rates when they ignore mean-variance relationships intrinsic to the data generation process. We conclude that choice of method depends on the trait distribution, the need to include non-genetic covariates, and the population size and structure, coupled with a critical evaluation of how these fit with the assumptions of the statistical model.
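Among the classical non-parametric options such a survey covers, a median-centred Levene (Brown-Forsythe) test is a standard way to flag variance heterogeneity across genotype groups. A minimal sketch on simulated phenotypes, assuming three genotype groups with equal means but genotype-dependent variance (a vQTL-like effect):

```python
import numpy as np
from scipy.stats import levene

rng = np.random.default_rng(3)
# hypothetical phenotype values for three genotype groups: same mean,
# different spread, mimicking a variance-controlling locus
g0 = rng.normal(0.0, 1.0, 300)
g1 = rng.normal(0.0, 1.5, 300)
g2 = rng.normal(0.0, 2.0, 300)

# center="median" gives the Brown-Forsythe variant, robust to non-normality
stat, p = levene(g0, g1, g2, center="median")
print(round(stat, 2), p < 0.01)
```

As the abstract notes, such distribution-free tests trade power for robustness, and they offer no direct way to include covariates or random effects, which is where the parametric and semi-parametric alternatives come in.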
Abstract:
This work investigates the impact of schooling on income distribution in the states/regions of Brazil. Using a semi-parametric model, discussed in DiNardo, Fortin & Lemieux (1996), we measure how much of the income differences between the Northeast and Southeast regions (the country's poorest and richest) and between the states of Ceará and São Paulo in those regions can be explained by differences in the schooling levels of the resident population. Using data from the National Household Survey (PNAD), we construct counterfactual densities by reweighting the distribution of the poorest region/state by the schooling profile of the richest. We conclude that: (i) more than 50% of the income difference is explained by the difference in schooling; (ii) the highest deciles of the income distribution gain more from an increase in schooling, closely approaching the wage distribution of the richest region/state; and (iii) an increase in schooling, holding the wage structure constant, aggravates wage disparity in the poorest regions/states.
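The DiNardo-Fortin-Lemieux counterfactual can be sketched directly: reweight each observation in the poorer region by the ratio of the richer region's schooling share to the poorer region's, so the reweighted sample has the rich region's schooling profile but keeps the poor region's wage structure. The data below are invented for illustration (discrete schooling levels, a linear log-wage rule), not PNAD:

```python
import numpy as np

rng = np.random.default_rng(4)
# hypothetical schooling (years) and log wages in two regions
school_poor = rng.choice([4, 8, 11, 15], size=5000, p=[0.40, 0.35, 0.20, 0.05])
school_rich = rng.choice([4, 8, 11, 15], size=5000, p=[0.10, 0.30, 0.35, 0.25])
wage_poor = 0.5 + 0.12 * school_poor + rng.normal(0.0, 0.3, 5000)

# empirical schooling shares in each region
levels_p, counts_p = np.unique(school_poor, return_counts=True)
share_poor = dict(zip(levels_p, counts_p / len(school_poor)))
levels_r, counts_r = np.unique(school_rich, return_counts=True)
share_rich = dict(zip(levels_r, counts_r / len(school_rich)))

# DFL weight: P_rich(schooling) / P_poor(schooling) for each poor-region worker
w = np.array([share_rich[s] / share_poor[s] for s in school_poor])

actual = wage_poor.mean()
counterfactual = np.average(wage_poor, weights=w)
print(round(actual, 3), round(counterfactual, 3))
```

The gap between the actual and counterfactual means is the part of the regional wage difference attributable to schooling composition; applied to whole densities rather than means, the same weights produce the counterfactual distributions the abstract describes.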
Abstract:
Convex combinations of long memory estimates using the same data observed at different sampling rates can decrease the standard deviation of the estimates, at the cost of inducing a slight bias. The convex combination of such estimates requires a preliminary correction for the bias observed at lower sampling rates, reported by Souza and Smith (2002). Through Monte Carlo simulations, we investigate the bias and the standard deviation of the combined estimates, as well as the root mean squared error (RMSE), which takes both into account. While comparing the results of standard methods and their combined versions, the latter achieve lower RMSE, for the two semi-parametric estimators under study (by about 30% on average for ARFIMA(0,d,0) series).
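The mechanics of such a convex combination are simple: after correcting the bias of the lower-sampling-rate estimate, average the two estimates with a convex weight and compare root mean squared errors. A toy Monte Carlo sketch with hypothetical bias and noise levels (not the paper's long-memory estimators):

```python
import numpy as np

rng = np.random.default_rng(5)
theta = 1.0                               # true quantity being estimated
R = 2000                                  # Monte Carlo replications

# two hypothetical estimators of theta: 'a' unbiased but noisy, 'b' less
# noisy but biased (as after aggregation to a lower sampling rate)
a = theta + rng.normal(0.0, 0.30, R)
b = (theta - 0.05) + rng.normal(0.0, 0.15, R)
b_corr = b + 0.05                         # preliminary bias correction

def rmse(est):
    return np.sqrt(np.mean((est - theta) ** 2))

w = 0.25                                  # convex weight on the noisier estimator
combined = w * a + (1.0 - w) * b_corr

print(round(rmse(a), 3), round(rmse(combined), 3))
```

Because the combination averages two noisy estimates of the same quantity, its variance falls below that of the noisier component, which is the RMSE gain the abstract reports for the combined semi-parametric estimators.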
Abstract:
Productivity is often calculated by approximating a Cobb-Douglas production function. Such estimates, however, may suffer from simultaneity and input selection bias. Olley and Pakes (1996) introduced a semi-parametric method that makes it possible to estimate the production function parameters consistently, and thus to obtain reliable productivity measures, while controlling for these sources of bias. This study applies the method to a firm in the sugar and ethanol industry, using Stata's opreg command to estimate the production function and describing the economic intuition behind the results.
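For intuition only, the log-linear Cobb-Douglas regression that Olley-Pakes corrects can be written as a plain least-squares fit. In the simulation below the inputs are drawn independently of the productivity shock, so OLS happens to be consistent; it is precisely when input choices respond to the shock that simultaneity bias appears and the Olley-Pakes procedure becomes necessary (all numbers here are invented):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 1000
# hypothetical plant-level data in logs
k = rng.normal(3.0, 1.0, n)               # log capital
l = rng.normal(2.0, 1.0, n)               # log labour
omega = rng.normal(0.0, 0.2, n)           # productivity shock
y = 0.4 * k + 0.6 * l + omega + rng.normal(0.0, 0.1, n)   # log output

# OLS on the log Cobb-Douglas form: y = b0 + b_k * k + b_l * l
X = np.column_stack([np.ones(n), k, l])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta, 2))
```

In the realistic case where k and l are chosen after the firm observes omega, the regressors are correlated with the error and these coefficients are biased; Olley-Pakes uses investment as a proxy to invert out the unobserved shock in a first, non-parametric stage.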
Abstract:
This paper presents a methodology to estimate and identify different kinds of economic interaction whenever these interactions can be established in the form of spatial dependence. First, we apply the semi-parametric approach of Chen and Conley (2001) to the estimation of reaction functions. The methodology is then applied to the analysis of financial providers in Thailand. Based on a sample of financial institutions, we provide an economic framework to test whether the actual spatial pattern is compatible with strategic competition (local interactions) or social planning (global interactions). Our estimates suggest that the provision of credit access by commercial banks and suppliers is determined by spatial competition, while the Thai Bank of Agriculture and Agricultural Cooperatives is distributed as in a social planner's problem.
Abstract:
Second-order polynomial models have been used extensively to approximate the relationship between a response variable and several continuous factors. However, sometimes polynomial models do not adequately describe the important features of the response surface. This article describes the use of fractional polynomial models. It is shown how the models can be fitted, an appropriate model selected, and inference conducted. Polynomial and fractional polynomial models are fitted to two published datasets, illustrating that sometimes the fractional polynomial can give as good a fit to the data and much more plausible behavior between the design points than the polynomial model. © 2005 American Statistical Association and the International Biometric Society.
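A first-degree fractional polynomial can be selected by trying each power in the conventional set and keeping the one with the smallest residual sum of squares, with p = 0 read as log x following Royston and Altman's convention. A minimal sketch on simulated data (the curve and noise level are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(0.5, 5.0, 40)
y = 2.0 + 3.0 / x + rng.normal(0.0, 0.1, 40)   # true curve uses the power p = -1

# first-degree fractional polynomial: y = b0 + b1 * x^p, p from the usual set
powers = [-2, -1, -0.5, 0, 0.5, 1, 2, 3]

def fit(p):
    """Least-squares fit for one candidate power; p = 0 means log x."""
    z = np.log(x) if p == 0 else x ** p
    X = np.column_stack([np.ones_like(x), z])
    beta, rss, *_ = np.linalg.lstsq(X, y, rcond=None)
    return rss[0], beta

best_p = min(powers, key=lambda p: fit(p)[0])
print(best_p)
```

Higher-degree fractional polynomials combine two or more such terms; model selection then compares deviances across the candidate power combinations, which is the procedure the article describes for choosing an appropriate model.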