902 results for two stage quantile regression
Abstract:
We investigated the association between diet and head and neck cancer (HNC) risk using data from the International Head and Neck Cancer Epidemiology (INHANCE) consortium. The INHANCE pooled data included 22 case-control studies with 14,520 cases and 22,737 controls. Center-specific quartiles among the controls were used for food groups, and frequencies per week were used for single food items. A dietary pattern score combining high fruit and vegetable intake and low red meat intake was created. Odds ratios (OR) and 95% confidence intervals (CI) for the dietary items on the risk of HNC were estimated with a two-stage random-effects logistic regression model. An inverse association was observed for higher-frequency intake of fruit (4th vs. 1st quartile OR = 0.52, 95% CI = 0.43-0.62, p (trend) < 0.01) and vegetables (OR = 0.66, 95% CI = 0.49-0.90, p (trend) = 0.01). Intake of red meat (OR = 1.40, 95% CI = 1.13-1.74, p (trend) < 0.01) was positively associated with HNC risk. Higher dietary pattern scores, reflecting high fruit/vegetable and low red meat intake, were associated with reduced HNC risk (per score increment OR = 0.90, 95% CI = 0.84-0.97).
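The second stage of a two-stage random-effects analysis can be sketched as follows. This is a minimal illustration, not the consortium's code: it assumes that study-specific log odds ratios and standard errors have already been estimated by per-study logistic regressions (stage one), and it pools them with the DerSimonian-Laird estimator, which is one common choice that the abstract does not name. The function name `pool_random_effects` is hypothetical.

```python
import math

def pool_random_effects(log_ors, std_errs):
    """Pool study-specific log odds ratios (stage 2 of a two-stage analysis).

    Uses the DerSimonian-Laird moment estimator for the between-study
    variance tau^2 (an assumption; other estimators exist).
    Returns (pooled OR, lower 95% limit, upper 95% limit, tau^2).
    """
    k = len(log_ors)
    # Inverse-variance (fixed-effect) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in std_errs]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sw
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errs]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_ors)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    lo, hi = pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled
    return math.exp(pooled), math.exp(lo), math.exp(hi), tau2
```

Calling `pool_random_effects([0.0, 1.0], [0.1, 0.1])` on two made-up studies gives a between-study variance of 0.49 and a pooled log odds ratio of 0.5, which shows how heterogeneity widens the pooled interval compared with a fixed-effect analysis.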
Abstract:
Our consumption of groundwater, in particular as drinking water or for irrigation, has increased considerably over the years. Many problems have emerged as a result, ranging from the prospection of new resources to the remediation of polluted aquifers. Regardless of the hydrogeological problem considered, the main challenge remains the characterization of subsurface properties. A stochastic approach is therefore necessary to represent this uncertainty by considering multiple geological scenarios and generating a large number of geostatistical realizations. This leads to the main limitation of these approaches: the computational cost of simulating complex flow processes for each of these realizations. In the first part of the thesis, this problem is investigated in the context of uncertainty propagation, where an ensemble of realizations is identified as representing the subsurface properties. To propagate this uncertainty to the quantity of interest while limiting the computational cost, current methods rely on approximate flow models. This allows the identification of a subset of realizations that represents the variability of the initial ensemble. The complex flow model is then evaluated only for this subset, and inference is made on the basis of these complex responses. Our objective is to improve the performance of this approach by using all of the available information. To this end, the subset of approximate and exact responses is used to build an error model, which then serves to correct the remaining approximate responses and to predict the response of the complex model. This method maximizes the use of the available information without a perceptible increase in computation time. The uncertainty propagation is therefore more accurate and more robust.
The strategy explored in the first chapter consists in learning, from a subset of realizations, the relationship between the approximate and complex flow models. In the second part of the thesis, this methodology is formalized mathematically by introducing a regression model between functional responses. Because this problem is ill-posed, its dimensionality must be reduced. To this end, the innovation of the present work lies in the use of functional principal component analysis (FPCA), which not only reduces the dimensionality while maximizing the retained information, but also makes it possible to diagnose the quality of the error model in this functional space. The proposed methodology is applied to a pollution problem involving a non-aqueous phase liquid, and the results show that the error model allows a strong reduction in computation time while correctly estimating the uncertainty. In addition, for each approximate response, the error model provides a prediction of the complex response. The concept of a functional error model is therefore relevant for uncertainty propagation, but also for Bayesian inference problems. Markov chain Monte Carlo (MCMC) methods are the most commonly used algorithms for generating geostatistical realizations consistent with the observations. However, these methods suffer from a very low acceptance rate in high-dimensional problems, resulting in a large number of wasted flow simulations. A two-step approach, "two-stage MCMC", was introduced to avoid unnecessary simulations of the complex model through a preliminary evaluation of the realization. In the third part of the thesis, the approximate flow model coupled with an error model serves as the preliminary evaluation for two-stage MCMC.
We demonstrate an increase in the acceptance rate by a factor of 1.5 to 3 compared with a classical MCMC implementation. One question remains unanswered: how to choose the size of the training set and how to identify the realizations that optimize the construction of the error model. This requires an iterative strategy so that, with each new flow simulation, the error model is improved by incorporating the new information. This is developed in the fourth part of the thesis, where the methodology is applied to a saline intrusion problem in a coastal aquifer. -- Our consumption of groundwater, in particular as drinking water and for irrigation, has considerably increased over the years and groundwater is becoming an increasingly scarce and endangered resource. Nowadays, we are facing many problems ranging from water prospection to sustainable management and remediation of polluted aquifers. Independently of the hydrogeological problem, the main challenge remains dealing with the incomplete knowledge of the underground properties. Stochastic approaches have been developed to represent this uncertainty by considering multiple geological scenarios and generating a large number of realizations. The main limitation of this approach is the computational cost associated with performing complex flow simulations in each realization. In the first part of the thesis, we explore this issue in the context of uncertainty propagation, where an ensemble of geostatistical realizations is identified as representative of the subsurface uncertainty. To propagate this lack of knowledge to the quantity of interest (e.g., the concentration of pollutant in extracted water), it is necessary to evaluate the flow response of each realization.
Due to computational constraints, state-of-the-art methods make use of approximate flow simulation to identify a subset of realizations that represents the variability of the ensemble. The complex and computationally heavy flow model is then run for this subset, based on which inference is made. Our objective is to increase the performance of this approach by using all of the available information and not solely the subset of exact responses. Two error models are proposed to correct the approximate responses following a machine learning approach. For the subset identified by a classical approach (here the distance kernel method) both the approximate and the exact responses are known. This information is used to construct an error model and correct the ensemble of approximate responses to predict the "expected" responses of the exact model. The proposed methodology makes use of all the available information without perceptible additional computational costs and leads to an increase in accuracy and robustness of the uncertainty propagation. The strategy explored in the first chapter consists in learning from a subset of realizations the relationship between proxy and exact curves. In the second part of this thesis, the strategy is formalized in a rigorous mathematical framework by defining a regression model between functions. As this problem is ill-posed, it is necessary to reduce its dimensionality. The novelty of the work comes from the use of functional principal component analysis (FPCA), which not only performs the dimensionality reduction while maximizing the retained information, but also allows a diagnostic of the quality of the error model in the functional space. The proposed methodology is applied to a pollution problem by a non-aqueous phase liquid. The error model allows a strong reduction of the computational cost while providing a good estimate of the uncertainty.
The individual correction of the proxy response by the error model leads to an excellent prediction of the exact response, opening the door to many applications. The concept of functional error model is useful not only in the context of uncertainty propagation, but also, and maybe even more so, to perform Bayesian inference. Markov chain Monte Carlo (MCMC) algorithms are the most common choice to ensure that the generated realizations are sampled in accordance with the observations. However, this approach suffers from a low acceptance rate in high-dimensional problems, resulting in a large number of wasted flow simulations. This led to the introduction of two-stage MCMC, where the computational cost is decreased by avoiding unnecessary simulation of the exact flow thanks to a preliminary evaluation of the proposal. In the third part of the thesis, a proxy is coupled to an error model to provide an approximate response for the two-stage MCMC set-up. We demonstrate an increase in acceptance rate by a factor of three with respect to one-stage MCMC results. An open question remains: how do we choose the size of the learning set and identify the realizations to optimize the construction of the error model? This requires devising an iterative strategy to construct the error model, such that, as new flow simulations are performed, the error model is iteratively improved by incorporating the new information. This is discussed in the fourth part of the thesis, in which we apply this methodology to a problem of saline intrusion in a coastal aquifer.
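The two-stage MCMC idea described above can be illustrated with a toy sketch. It is not the thesis implementation: it assumes a one-dimensional target, a symmetric Gaussian random-walk proposal, and the standard delayed-acceptance form of the second-stage ratio. The names `two_stage_mcmc`, `toy_log_exact` and `toy_log_proxy` are hypothetical, and the two toy densities merely stand in for the exact flow model and the corrected proxy.

```python
import math
import random

def toy_log_exact(x):
    # Stand-in for the expensive exact model: standard normal log-density
    # (up to an additive constant).
    return -0.5 * x * x

def toy_log_proxy(x):
    # Stand-in for the cheap proxy: a shifted, wider normal log-density.
    return -0.5 * ((x - 0.3) / 1.5) ** 2

def two_stage_mcmc(log_exact, log_proxy, x0, n_steps, step=1.0, seed=0):
    """Delayed-acceptance Metropolis sampler with a symmetric proposal.

    Stage 1 screens each proposal with the proxy only; the exact model is
    evaluated only for proposals that pass. Returns (samples, n_exact),
    where n_exact counts exact-model evaluations made inside the loop.
    """
    rng = random.Random(seed)
    x = x0
    lp_x, le_x = log_proxy(x), log_exact(x)
    samples, n_exact = [], 0
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)
        lp_y = log_proxy(y)
        # Stage 1: accept or reject using the proxy ratio alone.
        # 1.0 - rng.random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < lp_y - lp_x:
            n_exact += 1
            le_y = log_exact(y)
            # Stage 2: correct for the proxy error so that the chain
            # still targets the exact distribution.
            log_a2 = (le_y - le_x) - (lp_y - lp_x)
            if math.log(1.0 - rng.random()) < log_a2:
                x, lp_x, le_x = y, lp_y, le_y
        samples.append(x)
    return samples, n_exact
```

With `two_stage_mcmc(toy_log_exact, toy_log_proxy, 0.0, 20000)`, the returned `n_exact` is smaller than the number of proposals because every proposal rejected at stage 1 costs no exact evaluation. The closer the proxy is to the exact model, the fewer exact evaluations end in a stage-2 rejection.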
Abstract:
The costs of treating infected total joint arthroplasties represent an ever-growing burden on society. Different patient-adapted therapeutic options, such as débridement and retention or 1- or 2-step exchange, can be used. If a 2-step exchange is used, a short (2-4 weeks) or long (>4-6 weeks) interval must be chosen. The Swiss DRG (Diagnosis Related Groups) system determines the reimbursement the hospital receives for the treatment of an infected total arthroplasty. This review assesses the cost-effectiveness of hospitalisation practices linked to surgical treatment in the two-stage exchange of a prosthetic joint infection. The aim of this retrospective study is to compare the financial impact of a short (2 to 4 weeks) versus a long (6 weeks and above) interval during a two-stage procedure. It is a retrospective study of patients who underwent a two-stage procedure for a hip or knee prosthetic joint infection at CHUV hospital, Lausanne (Switzerland), between 2012 and 2013. The review analyses the correlation between the interval length and the length of the hospital stay, as well as the costs and revenues per hospital stay. On average, there is a loss of 40,000 euros per hospitalisation for the treatment of prosthetic joint infection. Revenues never cover all the costs, even with a short-interval procedure. This economic loss increases with the length of the hospital stay if a long interval is chosen. The review explores potential for improvement in reimbursement and hospitalisation practices in the current Swiss healthcare setting. Alternative arrangements should be considered to decrease the burden of medical costs, either by (a) increasing the reimbursement for the treatment of infected total joints or (b) splitting the hospital stay with partner hospitals (rapid transfer after the first operation from the centre hospital to a level 2 hospital, and retransfer to the centre for the second operation) in order to increase revenues.
Abstract:
This study evaluated the effects of hydraulic retention time (HRT) and organic loading rate (OLR) on the performance of two-stage UASB (upflow anaerobic sludge blanket) reactors treating swine wastewater. The system consisted of two pilot-scale UASB reactors installed in series, with volumes of 908 and 188 L for the first and second stages (R1 and R2), respectively. The HRTs applied to the two-stage anaerobic treatment system (R1 + R2) were 19.3, 29.0 and 57.9 h. The OLR applied to R1 ranged from 5.5 to 40.1 kg CODtotal (m³ d)⁻¹. The average removal efficiencies of chemical oxygen demand (COD) and total suspended solids (TSS) ranged, respectively, from 66.3 to 88.2% and from 62.5 to 89.3% in R1, and from 85.5 to 95.5% and from 76.4 to 96.1% in the system (R1 + R2). The volumetric production of methane in the system (R1 + R2) ranged from 0.295 to 0.721 m³ CH4 (m³ reactor d)⁻¹. The OLRs applied did not limit the achievement of high CODtotal and TSS removal efficiencies or methane production. Including the second-stage UASB reactor increased the CODtotal and TSS removal efficiencies, especially when the treatment system was subjected to the lowest HRT and the highest OLR.
Abstract:
The performance of two pilot-scale upflow anaerobic sludge blanket (UASB) reactors (908 and 188 L), installed in series (R1 and R2), was evaluated. The reactors were fed with swine wastewater with TSS of around 5 and 13 g L⁻¹. The UASB reactors were subjected to hydraulic detention times (HDT) of 36 and 18 h with volumetric organic loads (VOL) of 5.5 to 34.4 g COD (L d)⁻¹ in R1, and HDT of 7.5 and 3.7 h with VOL of 5.1 to 45.2 g COD (L d)⁻¹ in R2. The average COD removal efficiencies ranged from 55 to 85% in R1 and from 43 to 57% in R2, resulting in values from 82 to 93% for the two-stage UASB system. Methane concentrations in the biogas were 69 to 74%, with specific production of 0.05 to 0.27 L CH4 (g removedCOD)⁻¹ in R1 and of 0.10 to 0.12 L CH4 (g removedCOD)⁻¹ in R2. The average removal efficiencies were 61 to 75% for totalP, 39 to 69% for KN, 82 to 93% for orgN and 20 to 94% for Fe, Zn, Cu and Mn. The amN concentrations were not reduced, indicating the need for post-treatment before effluent disposal into water bodies. Total coliforms were reduced by 99.8123 to 99.9989% and thermotolerant coliforms by 99.9725 to 99.9999%. The conditions imposed on the two-stage UASB reactors provided high conversions of removedCOD into methane (up to 77%) and reductions of organic and inorganic pollution loads from swine wastewater.
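The detention times reported for the two reactors are consistent with simple series-flow arithmetic, sketched below. The sketch assumes that the same flow passes through both reactors with no recycle or bypass, which the abstract implies but does not state. The function name `hdt_second_stage` is hypothetical.

```python
def hdt_second_stage(hdt_first_h, vol_first_l, vol_second_l):
    """Detention time (h) in the second reactor of a series pair.

    Assumes the same volumetric flow passes through both reactors
    (no recycle or bypass), so flow = volume / detention time.
    """
    flow_l_per_h = vol_first_l / hdt_first_h
    return vol_second_l / flow_l_per_h

# Volumes from the abstract: R1 = 908 L, R2 = 188 L.
for hdt_r1 in (36.0, 18.0):
    print(hdt_r1, round(hdt_second_stage(hdt_r1, 908.0, 188.0), 1))
```

Under this assumption, the R1 detention times of 36 and 18 h give about 7.5 and 3.7 h in R2, matching the values reported in the abstract.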
Abstract:
This paper studies tests of joint hypotheses in time series regression with a unit root in which weakly dependent and heterogeneously distributed innovations are allowed. We consider two types of regression: one with a constant and lagged dependent variable, and the other with a trend added. The statistics studied are the regression "F-test" originally analysed by Dickey and Fuller (1981) in a less general framework. The limiting distributions are found using functional central limit theory. New test statistics are proposed which require only already tabulated critical values but which are valid in a quite general framework (including finite order ARMA models generated by Gaussian errors). This study extends the results on single coefficients derived in Phillips (1986a) and Phillips and Perron (1986).
Abstract:
In the accounting literature, interaction or moderating effects are usually assessed by means of OLS regression, and summated rating scales are constructed to reduce measurement error bias. Structural equation models and two-stage least squares regression could be used to completely eliminate this bias, but large samples are needed. Partial least squares is appropriate for small samples but does not correct measurement error bias. In this article, disattenuated regression is discussed as a small-sample alternative and is illustrated on data of Bisbe and Otley (in press) that examine the interaction effect of innovation and style of use of budgets on performance. Sizeable differences emerge between OLS and disattenuated regression.
Abstract:
Several methods have been suggested to estimate non-linear models with interaction terms in the presence of measurement error. Structural equation models eliminate measurement error bias, but require large samples. Ordinary least squares regression on summated scales, regression on factor scores and partial least squares are appropriate for small samples but do not correct measurement error bias. Two-stage least squares regression does correct measurement error bias, but the results strongly depend on the instrumental variable choice. This article discusses the old disattenuated regression method as an alternative for correcting measurement error in small samples. The method is extended to the case of interaction terms and is illustrated on a model that examines the interaction effect of innovation and style of use of budgets on business performance. Alternative reliability estimates that can be used to disattenuate the estimates are discussed. A comparison is made with the alternative methods. Methods that do not correct for measurement error bias perform very similarly and considerably worse than disattenuated regression.
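The basic idea of disattenuation can be shown with the textbook single-predictor case. This sketch does not reproduce the article's extension to interaction terms. It assumes classical measurement error, under which the OLS slope is shrunk by the reliability of the predictor and an observed correlation is shrunk by the square root of the product of the two reliabilities. The function names and the numbers in the usage note are hypothetical.

```python
import math

def disattenuate_slope(beta_ols, reliability_x):
    """Correct a simple-regression slope for error in the predictor.

    Under classical measurement error the OLS slope converges to
    beta * reliability_x, so dividing by the reliability undoes the bias.
    """
    if not 0.0 < reliability_x <= 1.0:
        raise ValueError("reliability must lie in (0, 1]")
    return beta_ols / reliability_x

def disattenuate_correlation(r_xy, reliability_x, reliability_y):
    """Correct an observed correlation for unreliability in both variables."""
    if not (0.0 < reliability_x <= 1.0 and 0.0 < reliability_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(reliability_x * reliability_y)
```

For instance, an OLS slope of 0.4 on a scale with reliability 0.8 corresponds to a corrected slope of 0.5. The size of the correction depends directly on the reliability estimate, which is why the choice among alternative reliability estimates matters.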
Abstract:
The crisis that broke out in the United States mortgage market in 2008 and spread throughout the entire financial system exposed the level of interconnection that currently exists among the sector's institutions and their relationships with the productive sector. It also highlighted the need to identify and characterize the systemic risk inherent in the system, so that regulators can pursue stability both for individual institutions and for the system as a whole. Using a model that combines the informational power of networks with a spatial autoregressive (panel-type) model, this paper shows the importance of adding to the micro-prudential approach (proposed in Basel II) a variable that captures the effect of being connected to other institutions, thereby carrying out a macro-prudential analysis (proposed in Basel III).
Abstract:
This paper analyzes the measure of systemic importance ΔCoVaR proposed by Adrian and Brunnermeier (2009, 2010) within the context of a similar class of risk measures used in the risk management literature. In addition, we develop a series of testing procedures, based on ΔCoVaR, to identify and rank the systemically important institutions. We stress the importance of statistical testing in interpreting the measure of systemic importance. An empirical application illustrates the testing procedures, using equity data for three European banks.
Abstract:
The article analyzes the determinants of the presence of unwanted children in Colombia. It uses information from the National Demographic and Health Survey (Encuesta Nacional de Demografía y Salud, ENDS, 2005), specifically for women aged 40 or older. Given the special characteristics of the variable analyzed, count models are used to test whether certain socioeconomic characteristics, such as education or economic stratum, explain the presence of unwanted children. The woman's education and area of residence are found to be significant determinants of unplanned births. In addition, the negative relationship between the number of unwanted children and the woman's education has key implications for social policy.
Abstract:
Even though antenatal care is universally regarded as important, the determinants of demand for antenatal care have not been widely studied. Evidence on which socioeconomic conditions influence whether a pregnant woman attends at least one antenatal consultation, and on how these factors affect absences from antenatal consultations, is very limited. To generate this evidence, a two-stage analysis was performed with data from the Demographic and Health Survey carried out by Profamilia in Colombia in 2005. The first stage was a logit model, reporting marginal effects on the probability of attending the first visit, and the second stage was an ordinary least squares model. Mothers living in the Pacific region, as well as young mothers, appear to have a lower probability of attending the first visit, but these factors are not related to the number of absences from antenatal consultations once the first visit has taken place. The effect of health insurance was surprising because the effects differed across health insurers. Some family and personal conditions, such as whether the last child was wanted and the number of previous children, proved important in determining demand. The mother's educational attainment was found to be important, whereas the father's was not. This paper provides some elements for policy making aimed at increasing demand for antenatal care, and for stimulating research on demand for specific health issues.
Abstract:
We propose and estimate a financial distress model that explicitly accounts for the interactions or spill-over effects between financial institutions, through the use of a spatial continuity matrix that is built from financial network data of interbank transactions. Such a setup of the financial distress model allows for the empirical validation of the importance of network externalities in determining financial distress, in addition to institution-specific and macroeconomic covariates. The relevance of such a specification is that it simultaneously incorporates micro-prudential factors (Basel 2) as well as macro-prudential and systemic factors (Basel 3) as determinants of financial distress. Results indicate that network externalities are an important determinant of the financial health of a financial institution. The parameter that measures the effect of network externalities is both economically and statistically significant, and its inclusion as a risk factor reduces the importance of firm-specific variables such as the size or degree of leverage of the financial institution. In addition, we analyze the policy implications of the network factor model for capital requirements and deposit insurance pricing.
Abstract:
Background The persistence of rural-urban disparities in child nutrition outcomes in developing countries, alongside rapid urbanisation and increasing incidence of child malnutrition in urban areas, raises an important health policy question: whether fundamentally different nutrition policies and interventions are required in rural and urban areas. Addressing this question requires an enhanced understanding of the main drivers of rural-urban disparities in child nutrition outcomes, especially for the vulnerable segments of the population. This study applies recently developed statistical methods to quantify the contribution of different socio-economic determinants to rural-urban differences in child nutrition outcomes in two South Asian countries, Bangladesh and Nepal. Methods Using DHS data sets for Bangladesh and Nepal, we apply quantile regression-based counterfactual decomposition methods to quantify the contribution of (1) the differences in levels of socio-economic determinants (covariate effects) and (2) the differences in the strength of association between socio-economic determinants and child nutrition outcomes (coefficient effects) to the observed rural-urban disparities in child HAZ scores. The methodology employed in the study allows the covariate and coefficient effects to vary across the entire distribution of child nutrition outcomes. This is particularly useful in providing specific insights into factors influencing rural-urban disparities at the lower tails of child HAZ score distributions. It also helps assess the importance of individual determinants and how they vary across the distribution of HAZ scores. Results There are no fundamental differences in the characteristics that determine child nutrition outcomes in urban and rural areas.
Differences in the levels of a limited number of socio-economic characteristics – maternal education, spouse’s education and the wealth index (incorporating household asset ownership and access to drinking water and sanitation) contribute a major share of rural-urban disparities in the lowest quantiles of child nutrition outcomes. Differences in the strength of association between socio-economic characteristics and child nutrition outcomes account for less than a quarter of rural-urban disparities at the lower end of the HAZ score distribution. Conclusions Public health interventions aimed at overcoming rural-urban disparities in child nutrition outcomes need to focus principally on bridging gaps in socio-economic endowments of rural and urban households and improving the quality of rural infrastructure. Improving child nutrition outcomes in developing countries does not call for fundamentally different approaches to public health interventions in rural and urban areas.
Abstract:
A novel two-stage construction algorithm for linear-in-the-parameters classifiers is proposed, aimed at noisy two-class classification problems. The purpose of the first stage is to produce a prefiltered signal that is used as the desired output for the second stage, which constructs a sparse linear-in-the-parameters classifier. For the first-stage learning that generates the prefiltered signal, a two-level algorithm is introduced to maximise the model's generalisation capability: an elastic net model identification algorithm using singular value decomposition is employed at the lower level, while the two regularisation parameters are selected by maximising the Bayesian evidence using a particle swarm optimization algorithm. Analysis is provided to demonstrate how "Occam's razor" is embodied in this approach. The second stage of sparse classifier construction is based on orthogonal forward regression with the D-optimality algorithm. Extensive experimental results demonstrate that the proposed approach is effective and yields competitive results for noisy data sets.