897 results for random regression model
Abstract:
In the fixed design regression model, additional weights are considered for the Nadaraya-Watson and Gasser-Müller kernel estimators. We study their asymptotic behavior and the relationships between new and classical estimators. For a simple family of weights, and considering the IMSE as global loss criterion, we show some possible theoretical advantages. An empirical study illustrates the performance of the weighted estimators in finite samples.
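As a rough illustration of what attaching additional weights to a kernel estimator can look like, here is a minimal sketch of a Nadaraya-Watson estimate with one extra weight per design point. The abstract does not state the weight family, so the `weights` argument, the Gaussian kernel, the bandwidth, and the toy data are all assumptions made for this example.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def weighted_nw(x0, xs, ys, h, weights=None):
    """Nadaraya-Watson estimate at x0 with optional extra design-point weights.

    weights=None reproduces the classical estimator. The `weights` argument is
    a hypothetical stand-in for the additional weights studied in the paper.
    """
    if weights is None:
        weights = [1.0] * len(xs)
    num = 0.0
    den = 0.0
    for x, y, w in zip(xs, ys, weights):
        k = w * gaussian_kernel((x0 - x) / h)
        num += k * y
        den += k
    if den == 0.0:
        raise ValueError("all kernel weights vanished; increase the bandwidth h")
    return num / den


# Fixed equispaced design on [0, 1] with a noiseless linear response (toy data).
n = 21
xs = [i / (n - 1) for i in range(n)]
ys = [2.0 * x + 1.0 for x in xs]

classical = weighted_nw(0.5, xs, ys, h=0.1)

# Hypothetical weights that halve the contribution of the two boundary points.
edge_weights = [0.5 if i in (0, n - 1) else 1.0 for i in range(n)]
reweighted = weighted_nw(0.5, xs, ys, h=0.1, weights=edge_weights)
```

Because the design, the response, and the weights are all symmetric about 0.5, both estimates at the midpoint coincide with the true value up to rounding. Differences between weighted and classical versions would show up mainly near the boundary or with asymmetric weights.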
Abstract:
Most methods for small-area estimation are based on composite estimators derived from design- or model-based methods. A composite estimator is a linear combination of a direct and an indirect estimator with weights that usually depend on unknown parameters which need to be estimated. Although model-based small-area estimators are usually based on random-effects models, the assumption of fixed effects is at face value more appropriate. Model-based estimators are justified by the assumption of random (interchangeable) area effects; in practice, however, areas are not interchangeable. In the present paper we empirically assess the quality of several small-area estimators in the setting in which the area effects are treated as fixed. We consider two settings: one that draws samples from a theoretical population, and another that draws samples from an empirical population of a labor force register maintained by the National Institute of Social Security (NISS) of Catalonia. We distinguish two types of composite estimators: a) those that use weights that involve area-specific estimates of bias and variance; and b) those that use weights that involve a common variance and a common squared bias estimate for all the areas. We assess their precision and discuss alternatives to optimizing composite estimation in applications.
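The two weighting schemes, a) and b), can be sketched as follows. This is only an illustration under simplifying assumptions that the abstract does not state: the direct estimator is taken as unbiased, the indirect estimator as having negligible variance, and the two as uncorrelated, so the MSE-minimizing weight on the direct estimator is squared bias / (variance + squared bias). All function names and numbers are hypothetical.

```python
def composite_weight(var_direct, sq_bias_indirect, var_indirect=0.0):
    """Weight on the direct estimator that minimizes the composite MSE,
    assuming an unbiased direct estimator uncorrelated with the indirect one."""
    mse_indirect = sq_bias_indirect + var_indirect
    total = var_direct + mse_indirect
    if total == 0.0:
        return 0.5  # both estimators are exact, so any weight gives the same MSE
    return mse_indirect / total


def composite_estimate(direct, indirect, weight):
    """Linear combination of a direct and an indirect estimate."""
    return weight * direct + (1.0 - weight) * indirect


# Hypothetical inputs for three areas.
direct = [10.0, 12.0, 8.0]      # direct estimates per area
indirect = [10.0, 10.0, 10.0]   # indirect (synthetic) estimates per area
var_direct = [4.0, 1.0, 9.0]    # estimated variances of the direct estimates
sq_bias = [1.0, 4.0, 1.0]       # estimated squared biases of the indirect estimates

# Type a): area-specific variance and squared-bias estimates.
weights_a = [composite_weight(v, b) for v, b in zip(var_direct, sq_bias)]
composite_a = [composite_estimate(d, i, w)
               for d, i, w in zip(direct, indirect, weights_a)]

# Type b): one common variance and one common squared bias for all areas.
common_var = sum(var_direct) / len(var_direct)
common_sq_bias = sum(sq_bias) / len(sq_bias)
weight_b = composite_weight(common_var, common_sq_bias)
composite_b = [composite_estimate(d, i, weight_b)
               for d, i in zip(direct, indirect)]
```

Pooling in type b) trades area-specific accuracy of the weights for stability, since area-level bias and variance estimates are themselves noisy in small samples.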
Abstract:
This paper shows how recently developed regression-based methods for the decomposition of health inequality can be extended to incorporate individual heterogeneity in the responses of health to the explanatory variables. We illustrate our method with an application to the Canadian NPHS of 1994. Our strategy for the estimation of heterogeneous responses is based on the quantile regression model. The results suggest that there is an important degree of heterogeneity in the association of health to explanatory variables which, in turn, accounts for a substantial percentage of inequality in observed health. A particularly interesting finding is that the marginal response of health to income is zero for healthy individuals but positive and significant for unhealthy individuals. The heterogeneity in the income response reduces both overall health inequality and income-related health inequality.
Abstract:
BACKGROUND: The prevalence of hyperuricemia has rarely been investigated in developing countries. The purpose of the present study was to investigate the prevalence of hyperuricemia and the association between uric acid levels and the various cardiovascular risk factors in a developing country with high average blood pressures (the Seychelles, Indian Ocean, population mainly of African origin). METHODS: This cross-sectional health examination survey was based on a population random sample from the Seychelles. It included 1011 subjects aged 25 to 64 years. Blood pressure (BP), body mass index (BMI), waist circumference, waist-to-hip ratio, total and HDL cholesterol, serum triglycerides and serum uric acid were measured. Data were analyzed using scatterplot smoothing techniques and gender-specific linear regression models. RESULTS: The prevalence of a serum uric acid level >420 micromol/L in men was 35.2% and the prevalence of a serum uric acid level >360 micromol/L was 8.7% in women. Serum uric acid was strongly related to serum triglycerides in men as well as in women (r = 0.73 in men and r = 0.59 in women, p < 0.001). Uric acid levels were also significantly associated but to a lesser degree with age, BMI, blood pressure, alcohol and the use of antihypertensive therapy. In a regression model, triglycerides, age, BMI, antihypertensive therapy and alcohol consumption accounted for about 50% (R2) of the serum uric acid variations in men as well as in women. CONCLUSIONS: This study shows that the prevalence of hyperuricemia can be high in a developing country such as the Seychelles. Besides alcohol consumption and the use of antihypertensive therapy, mainly diuretics, serum uric acid is markedly associated with parameters of the metabolic syndrome, in particular serum triglycerides. 
Considering the growing incidence of obesity and metabolic syndrome worldwide and the potential link between hyperuricemia and cardiovascular complications, more emphasis should be put on the evolving prevalence of hyperuricemia in developing countries.
Abstract:
We investigated the association between diet and head and neck cancer (HNC) risk using data from the International Head and Neck Cancer Epidemiology (INHANCE) consortium. The INHANCE pooled data included 22 case-control studies with 14,520 cases and 22,737 controls. Center-specific quartiles among the controls were used for food groups, and frequencies per week were used for single food items. A dietary pattern score combining high fruit and vegetable intake and low red meat intake was created. Odds ratios (OR) and 95% confidence intervals (CI) for the dietary items on the risk of HNC were estimated with a two-stage random-effects logistic regression model. An inverse association was observed for higher-frequency intake of fruit (4th vs. 1st quartile OR = 0.52, 95% CI = 0.43-0.62, p (trend) < 0.01) and vegetables (OR = 0.66, 95% CI = 0.49-0.90, p (trend) = 0.01). Intake of red meat (OR = 1.40, 95% CI = 1.13-1.74, p (trend) < 0.01) was positively associated with HNC risk. Higher dietary pattern scores, reflecting high fruit/vegetable and low red meat intake, were associated with reduced HNC risk (per score increment OR = 0.90, 95% CI = 0.84-0.97).
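In a two-stage analysis, study-specific log odds ratios are estimated first and then pooled with a random-effects model. The abstract does not say which pooling estimator was used; the sketch below uses the DerSimonian-Laird method-of-moments estimator as one common choice, and the three study-level inputs are invented for illustration only.

```python
import math


def dersimonian_laird(log_ors, variances):
    """Pool study-level log odds ratios with a DerSimonian-Laird random-effects model.

    Returns the pooled log OR, its standard error, and the between-study
    variance estimate tau^2.
    """
    k = len(log_ors)
    w = [1.0 / v for v in variances]                 # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))  # Cochran's Q
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]     # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_ors)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return pooled, se, tau2


# Hypothetical first-stage results from three studies (not INHANCE data).
log_ors = [-0.2, -0.6, -1.0]
variances = [0.04, 0.04, 0.04]

pooled, se, tau2 = dersimonian_laird(log_ors, variances)
or_pooled = math.exp(pooled)
ci_low = math.exp(pooled - 1.96 * se)
ci_high = math.exp(pooled + 1.96 * se)
```

Working on the log scale keeps the confidence interval symmetric around the pooled log OR; exponentiating at the end gives an interval of the same form as those reported in the abstract.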
Abstract:
We consider an infinite number of noninteracting lattice random walkers with the goal of determining statistical properties of the time, out of a total time T, that a single site has been occupied by n random walkers. Initially the random walkers are assumed uniformly distributed on the lattice except for the target site at the origin, which is unoccupied. The random-walk model is taken to be a continuous-time random walk and the pausing-time density at the target site is allowed to differ from the pausing-time density at other sites. We calculate the dependence of the mean time of occupancy by n random walkers as a function of n and the observation time T. We also find the variance for the cumulative time during which the site is unoccupied. The large-T behavior of the variance differs according to whether the random walk is transient or recurrent. It is shown that the variance is proportional to T at large T in three or more dimensions, to T^{3/2} in one dimension, and to T ln T in two dimensions.
Abstract:
The methylation status of the O(6)-methylguanine-DNA methyltransferase (MGMT) gene is an important predictive biomarker for benefit from alkylating agent therapy in glioblastoma. Recent studies in anaplastic glioma suggest a prognostic value for MGMT methylation. Investigation of pathogenetic and epigenetic features of this intriguingly distinct behavior requires accurate MGMT classification to assess high-throughput molecular databases. Promoter methylation-mediated gene silencing is strongly dependent on the location of the methylated CpGs, complicating classification. Using the HumanMethylation450 (HM-450K) BeadChip interrogating 176 CpGs annotated for the MGMT gene, with 14 located in the promoter, two distinct regions in the CpG island of the promoter were identified with high importance for gene silencing and outcome prediction. A logistic regression model (MGMT-STP27) comprising probes cg12434587 and cg12981137 provided good classification properties and prognostic value (kappa = 0.85; log-rank p < 0.001) using a training set of 63 glioblastomas from homogeneously treated patients, for whom MGMT methylation was previously shown to be predictive for outcome based on classification by methylation-specific PCR. MGMT-STP27 was successfully validated in an independent cohort of chemo-radiotherapy-treated glioblastoma patients (n = 50; kappa = 0.88; outcome, log-rank p < 0.001). Lower prevalence of MGMT methylation among CpG island methylator phenotype (CIMP) positive tumors was found in glioblastomas from The Cancer Genome Atlas than in low-grade and anaplastic glioma cohorts, while in CIMP-negative gliomas MGMT was classified as methylated in approximately 50% regardless of tumor grade. The proposed MGMT-STP27 prediction model allows mining of datasets derived on the HM-450K or HM-27K BeadChip to explore effects of distinct epigenetic context of MGMT methylation suspected to modulate treatment resistance in different tumor types.
Abstract:
The objectives of this study were to develop a computerized method to screen for potentially avoidable hospital readmissions using routinely collected data and a prediction model to adjust rates for case mix. We studied hospital information system data of a random sample of 3,474 inpatients discharged alive in 1997 from a university hospital and medical records of those (1,115) readmitted within 1 year. The gold standard was set on the basis of the hospital data and medical records: all readmissions were classified as foreseen readmissions, unforeseen readmissions for a new affection, or unforeseen readmissions for a previously known affection. The latter category was submitted to a systematic medical record review to identify the main cause of readmission. Potentially avoidable readmissions were defined as a subgroup of unforeseen readmissions for a previously known affection occurring within an appropriate interval, set to maximize the chance of detecting avoidable readmissions. The computerized screening algorithm was strictly based on routine statistics: diagnosis and procedures coding and admission mode. The prediction was based on a Poisson regression model. There were 454 (13.1%) unforeseen readmissions for a previously known affection within 1 year. Fifty-nine readmissions (1.7%) were judged avoidable, most of them occurring within 1 month, which was the interval used to define potentially avoidable readmissions (n = 174, 5.0%). The intra-sample sensitivity and specificity of the screening algorithm both reached approximately 96%. Higher risk for potentially avoidable readmission was associated with previous hospitalizations, high comorbidity index, and long length of stay; lower risk was associated with surgery and delivery. The model offers satisfactory predictive performance and a good medical plausibility. The proposed measure could be used as an indicator of inpatient care outcome. 
However, the instrument should be validated using other sets of data from various hospitals.
Abstract:
BACKGROUND: Patients presenting to the emergency department (ED) four or more times within a 12-month period make up as much as 5% of ED patients but account for 21% of total ED visits. In this study we sought to characterize social and medical vulnerability factors of ED frequent users (FUs) and to explore whether these factors hold simultaneously. METHODS: We performed a case-control study at Lausanne University Hospital, Switzerland. Patients over 18 years presenting to the ED at least once within the study period (April 2008 to March 2009) were included. FUs were defined as patients with four or more ED visits within the previous 12 months. Outcome data were extracted from medical records of the first ED attendance within the study period. Outcomes included basic demographics and social variables, ED admission diagnosis, somatic and psychiatric days hospitalized over 12 months, and having a primary care physician. We calculated the percentage of FUs and non-FUs having at least one social and one medical vulnerability factor. The four chosen social factors were: unemployed and/or dependent on government welfare; institutionalized and/or without fixed residence; separated, divorced or widowed; and under guardianship. The four medical vulnerability factors were: ≥6 somatic days hospitalized, ≥1 psychiatric days hospitalized, ≥5 clinical departments used (all three factors measured over 12 months), and ED admission diagnosis of alcohol and/or drug abuse. Univariate and multivariate logistic regression analyses allowed comparison of two random samples of 354 FUs and 354 non-FUs (statistical power 0.9, alpha 0.05 for all outcomes except gender, country of birth, and insurance type). RESULTS: FUs accounted for 7.7% of ED patients and 24.9% of ED visits. Univariate logistic regression showed that FUs were older (mean age 49.8 vs. 45.2 yrs, p=0.003), more often separated and/or divorced (17.5% vs. 13.9%, p=0.029) or widowed (13.8% vs. 8.8%, p=0.029), and either unemployed or dependent on government welfare (31.3% vs. 13.3%, p<0.001), compared to non-FUs. FUs accumulated more days hospitalized over 12 months (mean number of somatic days per patient 1.0 vs. 0.3, p<0.001; mean number of psychiatric days per patient 0.12 vs. 0.03, p<0.001). The two groups were similar regarding gender distribution (females 51.7% vs. 48.3%). The multivariate logistic regression model was based on the six most significant factors identified by univariate analysis. The model showed that FUs had more social problems, as they were more likely to be institutionalized or not have a fixed residence (OR 4.62; 95% CI, 1.65 to 12.93), and to be unemployed or dependent on government welfare (OR 2.03; 95% CI, 1.31 to 3.14) compared to non-FUs. FUs were more likely to need medical care, as indicated by involvement of ≥5 clinical departments over 12 months (OR 6.2; 95% CI, 3.74 to 10.15), having an ED admission diagnosis of substance abuse (OR 3.23; 95% CI, 1.23 to 8.46) and having a primary care physician (OR 1.70; 95% CI, 1.13 to 2.56); however, they were less likely to present with an admission diagnosis of injury (OR 0.64; 95% CI, 0.40 to 1.00) compared to non-FUs. FUs were more likely to combine at least one social with one medical vulnerability factor (38.4% vs. 12.1%, OR 7.74; 95% CI 5.03 to 11.93). CONCLUSIONS: FUs were more likely than non-FUs to have social and medical vulnerability factors and to have multiple factors in combination.
Abstract:
Some models have been developed using agrometeorological and remote sensing data to estimate agricultural production. However, it is expected that the use of SAR images can improve their performance. The main objective of this study was to estimate sugarcane production using a multiple linear regression model which considers agronomic data and ALOS/PALSAR images obtained from the 2007/08, 2008/09 and 2009/10 cropping seasons. The performance of the models was evaluated by the coefficient of determination, t-test, Willmott agreement index (d), random error and standard error. The model was able to explain 79%, 12% and 74% of the variation in the observed production of the 2007/08, 2008/09 and 2009/10 cropping seasons, respectively. Performance of the model for the 2008/09 cropping season was poor because of a long period of drought in that season. When the three seasons were considered together, the model explained 66% of the variation. Results showed that SAR-based yield prediction models can help sugar mill technicians to improve such estimates.
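Two of the evaluation measures named above have simple closed forms. The sketch below computes the coefficient of determination as 1 - SS_res / SS_tot and the Willmott agreement index in its original form, d = 1 - Σ(P - O)² / Σ(|P - Ō| + |O - Ō|)². Whether the study used exactly these variants is an assumption, and the observed and predicted yields are made-up numbers.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, computed as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def willmott_d(observed, predicted):
    """Willmott index of agreement; 1 means perfect agreement."""
    mean_obs = sum(observed) / len(observed)
    num = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
              for o, p in zip(observed, predicted))
    return 1.0 - num / den


# Hypothetical observed and predicted yields (arbitrary units).
observed = [70.0, 80.0, 90.0, 100.0]
predicted = [72.0, 78.0, 93.0, 97.0]

r2 = r_squared(observed, predicted)
d = willmott_d(observed, predicted)
```

The two measures answer different questions: R² reports the share of variation explained, while d penalizes systematic over- or under-prediction relative to the observed mean.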
Abstract:
Several authors have recently discussed the limited dependent variable regression model with serial correlation between residuals. The pseudo-maximum likelihood estimators obtained by ignoring serial correlation altogether have been shown to be consistent. We present alternative pseudo-maximum likelihood estimators which are obtained by ignoring serial correlation only selectively. Monte Carlo experiments on a model with first-order serial correlation suggest that our alternative estimators have substantially lower mean-squared errors in medium-size and small samples, especially when the serial correlation coefficient is high. The same experiments also suggest that the true level of the confidence intervals established with our estimators by assuming asymptotic normality is somewhat lower than the intended level. Although the paper focuses on models with only first-order serial correlation, the generalization of the proposed approach to serial correlation of higher order is also discussed briefly.
Abstract:
The objective of this study was to determine the impact of a subclinical intramammary infection (IMI) caused by coagulase-negative staphylococci (CNS) or Staphylococcus aureus, diagnosed during the first month of lactation in heifers, on somatic cell count (SCC), milk production, and culling risk during the current lactation. Bacteriological data from composite milk samples of 2,273 Holstein heifers in 50 herds were interpreted according to the recommendations of the National Mastitis Council. Among the 1,691 heifers meeting the selection criteria, 90 (5%) were positive for S. aureus, 168 (10%) were positive for CNS, and 153 (9%) were negative (no pathogen isolated). The natural-log-transformed SCC (lnSCC) was modeled using linear regression with herd as a random effect. The lnSCC in the S. aureus and CNS groups was significantly higher than in the control group from 40 to 300 days in milk (DIM) (P < 0.0001 for all contrasts). The daily lnSCC in the S. aureus and CNS groups was on average 1.2 and 0.6 higher, respectively, than in the control group. A similar model was fitted for milk production, including age at calving, the parent-average genetic value for milk production, and the natural logarithm of DIM at test day. Milk production did not differ significantly among the 3 culture groups from 40 to 300 DIM (P ≥ 0.12). Cox survival models showed that culling risk did not differ significantly between the S. aureus or CNS group and the control group (P ≥ 0.16). Preventing IMI caused by CNS and S. aureus in early lactation remains important given their association with SCC during the current lactation.
Abstract:
Imputation is often used in surveys to handle item nonresponse. It is well known that treating imputed values as observed values leads to substantial underestimation of the variance of point estimators. To remedy this problem, several variance estimation methods have been proposed in the literature, including adapted resampling methods such as the bootstrap and the jackknife. We define the concept of double robustness for point and variance estimation under the nonresponse model approach and the imputation model approach. We focus on jackknife variance estimation, which is often used in practice. We study the properties of several jackknife variance estimators for deterministic and random regression imputation. We first consider simple random sampling. Stratified sampling and unequal probability sampling are also studied. A simulation study compares several jackknife variance estimation methods in terms of bias and relative stability when the sampling fraction is not negligible. Finally, we establish the asymptotic normality of imputed estimators for deterministic and random regression imputation.
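For orientation, here is a minimal sketch of the plain delete-one jackknife variance estimator under simple random sampling, with an optional finite-population correction for a non-negligible sampling fraction. It treats every value as observed and makes no adjustment for imputation, so it is only the starting point that imputation-adjusted jackknife estimators build on, not any of the estimators studied in the thesis. The function name, the `sampling_fraction` parameter, and the data are hypothetical.

```python
def jackknife_variance(values, estimator, sampling_fraction=0.0):
    """Delete-one jackknife variance of `estimator` applied to `values`.

    sampling_fraction is n / N; the factor (1 - n / N) is the usual
    finite-population correction for simple random sampling without
    replacement. No adjustment for imputed values is made here.
    """
    n = len(values)
    # Recompute the estimator with each observation left out in turn.
    replicates = [estimator(values[:i] + values[i + 1:]) for i in range(n)]
    mean_rep = sum(replicates) / n
    var = (n - 1) / n * sum((r - mean_rep) ** 2 for r in replicates)
    return (1.0 - sampling_fraction) * var


def sample_mean(xs):
    return sum(xs) / len(xs)


# Hypothetical fully observed sample of size 5.
sample = [2.0, 4.0, 4.0, 5.0, 7.0]

v_jack = jackknife_variance(sample, sample_mean)
# Same sample, assumed drawn from a population of N = 20 (sampling fraction 0.25).
v_jack_fpc = jackknife_variance(sample, sample_mean, sampling_fraction=0.25)
```

For the sample mean, the delete-one jackknife reproduces the textbook estimator s²/n, which makes it a convenient sanity check before moving to estimators that involve imputed values.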
Abstract:
Single imputation is very often used in surveys to compensate for item nonresponse. In some situations, the variable requiring imputation takes the value zero a very large number of times. This is very common in business surveys that collect economic variables. In this thesis, we study the properties of two imputation methods often used in practice and show that they generally produce biased imputed estimators. Motivated by a mixture model, we propose three imputation methods and study their properties in terms of bias. For these imputation methods, we consider a jackknife variance estimator that is consistent for the true variance under the assumption that the sampling fraction is negligible. Finally, we carry out a simulation study to assess the performance of the point and variance estimators in terms of bias and mean squared error.
Abstract:
Peritoneal dialysis (PD) is a renal replacement therapy that can be carried out at home by means of a technology. It requires certain aptitudes of the patient (motivation and competence) and a particular organization of the care team in order to achieve autonomy in performing the dialysis. In the context of a home therapy such as peritoneal dialysis, patients' level of autonomy and the factors associated with it have not previously been examined. That is the subject of this thesis. Drawing on self-determination theory and a literature review, a conceptual framework was developed that hypothesizes that three types of essential factors could influence autonomy: individual, technological, and organizational factors. To test these hypotheses, a sequential mixed-methods design with two components was carried out. A first, qualitative component, operationalized through interviews with 12 patients and 11 nurses, made it possible to explore and better define the dimensions of autonomy relevant to PD and to improve the development of a questionnaire. After validation, the questionnaire was used for data collection in the second, quantitative component, yielding results from a probability sample (n = 98) drawn from the population of peritoneal dialysis patients in Quebec (N = 700). The objective of this second component was to measure patients' degree of autonomy and to examine the associations between technological, organizational, and individual factors and the different dimensions of autonomy. Univariate and multivariate analyses were carried out for this purpose. The results show that four dimensions of autonomy are essential to achieve in home dialysis.
These are clinical, technical, functional (daily freedom), and organizational (independence from the care institution) autonomy. For these four types of autonomy, patients reported being highly autonomous, a result reflected in the scores obtained on a scale of 1 to 5: clinical autonomy (4.1), technical autonomy (4.8), functional autonomy (4.1), and organizational autonomy (4.5). Each of these types of autonomy is associated to varying degrees with the three factors of the conceptual model: individual (motivation and competence), technological (user-friendliness), and organizational (clinical, technical, and family support) factors. More specifically, motivation appears to be associated with functional autonomy. User-friendliness appears to be associated with clinical autonomy, whereas myopathy could compromise it. The user-friendliness of the technology and the patient's competence appear to contribute to better organizational autonomy. As for technical autonomy, all patients reported being highly autonomous in handling the technology. This result may be explained by adequate training made available to patients in predialysis, by continuous follow-up, and by daily handling over years of use. Although the technology studied in this thesis is peritoneal dialysis, we conclude that when control of a therapeutic technology is transferred to the home to treat a chronic disease, it is essential to organize this transfer so that the three factors, technical (user-friendliness), individual (motivation, training, and competence), and organizational (caregiver support), are in place to ensure autonomy at all four levels: technical, clinical, functional, and organizational.