87 results for nested multinomial logit
Batch effect confounding leads to strong bias in performance estimates obtained by cross-validation.
Abstract:
BACKGROUND: With the large amount of biological data that is currently publicly available, many investigators combine multiple data sets to increase the sample size and potentially also the power of their analyses. However, technical differences ("batch effects") as well as differences in sample composition between the data sets may significantly affect the ability to draw generalizable conclusions from such studies. FOCUS: The current study focuses on the construction of classifiers, and the use of cross-validation to estimate their performance. In particular, we investigate the impact of batch effects and differences in sample composition between batches on the accuracy of the classification performance estimate obtained via cross-validation. The focus on estimation bias is a main difference compared to previous studies, which have mostly focused on the predictive performance and how it relates to the presence of batch effects. DATA: We work on simulated data sets. To have realistic intensity distributions, we use real gene expression data as the basis for our simulation. Random samples from this expression matrix are selected and assigned to group 1 (e.g., 'control') or group 2 (e.g., 'treated'). We introduce batch effects and select some features to be differentially expressed between the two groups. We consider several scenarios for our study, most importantly different levels of confounding between groups and batch effects. METHODS: We focus on well-known classifiers: logistic regression, Support Vector Machines (SVM), k-nearest neighbors (kNN) and Random Forests (RF). Feature selection is performed with the Wilcoxon test or the lasso. Parameter tuning and feature selection, as well as the estimation of the prediction performance of each classifier, are performed within a nested cross-validation scheme. The estimated classification performance is then compared to what is obtained when applying the classifier to independent data.
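The nested cross-validation scheme mentioned in the METHODS can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are a synthetic one-feature toy set, the only classifier is a hand-written kNN, the tuned parameter is k, and feature selection and batch effects are left out. All function names (`make_folds`, `knn_predict`, `nested_cv`) are hypothetical.

```python
import random

def make_folds(indices, k, rng):
    """Shuffle the indices and split them into k roughly equal folds."""
    shuffled = indices[:]
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def knn_predict(train, query, k):
    """Majority vote among the k nearest training points (one feature)."""
    nearest = sorted(train, key=lambda row: abs(row[0] - query))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0

def accuracy(train, test, k):
    """Share of test points whose label is predicted correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

def nested_cv(data, outer_k=5, inner_k=3, grid=(1, 3, 5), seed=0):
    """Outer loop estimates performance; inner loop tunes k on training data only."""
    rng = random.Random(seed)
    outer = make_folds(list(range(len(data))), outer_k, rng)
    scores = []
    for i, test_idx in enumerate(outer):
        train_idx = [j for f, fold in enumerate(outer) if f != i for j in fold]
        inner = make_folds(train_idx, inner_k, rng)

        def inner_score(k):
            # Mean validation accuracy of this k over the inner folds.
            total = 0.0
            for m, val_idx in enumerate(inner):
                fit_idx = [j for f, fold in enumerate(inner) if f != m for j in fold]
                total += accuracy([data[j] for j in fit_idx],
                                  [data[j] for j in val_idx], k)
            return total / inner_k

        best_k = max(grid, key=inner_score)
        # The outer test fold is touched only once, after tuning is finished.
        scores.append(accuracy([data[j] for j in train_idx],
                               [data[j] for j in test_idx], best_k))
    return sum(scores) / len(scores)

# Synthetic two-group data: (feature value, group label).
rng = random.Random(1)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(30)] + \
       [(rng.gauss(3.0, 1.0), 1) for _ in range(30)]
estimate = nested_cv(data)
```

The point of the design is that the outer test fold never influences the choice of k. The resulting estimate can still be biased when, as the abstract argues, batch and group are confounded in the data being split.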
Abstract:
OBJECTIVES: HIV infection and exposure to certain antiretroviral drugs are associated with dyslipidemia and increased risk for coronary events. Whether this risk is mediated by highly atherogenic lipoproteins is unclear. We investigated the association of highly atherogenic small dense low-density lipoproteins (LDLs) and apolipoprotein B with coronary events in HIV-infected individuals receiving antiretroviral therapy. METHODS: We conducted a case-control study nested into the Swiss HIV Cohort Study to investigate the association of small dense LDL and apolipoprotein B with coronary events in 98 antiretroviral drug-treated patients with a first coronary event (19 fatal and 79 nonfatal coronary events with 53 definite and 15 possible myocardial infarctions, 11 angioplasties or bypasses) and 393 treated controls matched for age, gender, and smoking status. Lipids were measured by ultracentrifugation. RESULTS: In models including cholesterol, triglycerides, high-density lipoprotein cholesterol, blood pressure, central obesity, diabetes, and family history, there was an independent association between small dense LDL and coronary events [odds ratio (OR) for 1 mg/dL increase: 1.06, 95% confidence interval (CI): 1.00 to 1.11] and apolipoprotein B (OR for 10 mg/dL increase: 1.16, 95% CI: 1.02 to 1.32). When adding HIV and antiretroviral therapy-related variables, ORs were 1.04 (95% CI: 0.99 to 1.10) for small dense LDL and 1.13 (95% CI: 0.99 to 1.30) for apolipoprotein B. In both models, blood pressure and HIV viral load were independently associated with the odds for coronary events. CONCLUSIONS: HIV-infected patients receiving antiretroviral therapy with elevated small dense LDL and apolipoprotein B are at increased risk for coronary events, as are patients without sustained HIV suppression.
Abstract:
Background: It has been suggested that the relative importance of oestrogen-metabolising pathways may affect the risk of oestrogen-dependent tumours including endometrial cancer. One hypothesis is that the 2-hydroxy pathway is protective, whereas the 16α-hydroxy pathway is harmful. Methods: We conducted a case-control study nested within three prospective cohorts to assess whether the circulating 2-hydroxyestrone : 16α-hydroxyestrone (2-OHE1 : 16α-OHE1) ratio is inversely associated with endometrial cancer risk in postmenopausal women. A total of 179 cases and 336 controls, matching cases on cohort, age and date of blood donation, were included. Levels of 2-OHE1 and 16α-OHE1 were measured using a monoclonal antibody-based enzyme assay. Results: Endometrial cancer risk increased with increasing levels of both metabolites, with odds ratios in the top tertiles of 2.4 (95% CI=1.3, 4.6; P(trend)=0.007) for 2-OHE1 and 1.9 (95% CI=1.1, 3.5; P(trend)=0.03) for 16α-OHE1 in analyses adjusting for endometrial cancer risk factors. These associations were attenuated and no longer statistically significant after further adjustment for oestrone or oestradiol levels. No significant association was observed for the 2-OHE1 : 16α-OHE1 ratio. Conclusion: Our results do not support the hypothesis that greater metabolism of oestrogen via the 2-OH pathway, relative to the 16α-OH pathway, protects against endometrial cancer.
Abstract:
Veterans of infection, Leishmania parasites have been plaguing mammals for centuries, causing a morbidity toll second only to that of malaria as the most devastating protozoan parasitic disease in the world. Cutaneous leishmaniasis (CL) is, by far, the most prevalent form of the disease, with symptoms ranging from a single self-healing lesion to chronic metastatic leishmaniasis (ML). In an increasingly immunocompromised population, complicated CL is becoming a more likely outcome, characterized by severely inflamed, destructive lesions that are often refractory to current treatment. This is perhaps because our ageing arsenal of variably effective antileishmanial drugs may be directly or indirectly immunomodulatory and may thus have variable effects in each type and stage of CL. Indeed, widely differing immune biases are created by the various species of Leishmania, and these immunological watersheds are further shifted by extrinsic disturbances in immune homeostasis. For example, we recently showed that a naturally occurring RNA virus (Leishmania RNA virus (LRV)) within some Leishmania parasites creates hyperinflammatory cross-talk, which can predispose to ML: a case of immunological misfire that may require a different approach to immunotherapy, whereby treatments are tailored to underlying immune biases. Understanding the intersecting immune pathways of leishmaniasis and its co-infections will enable us to identify new drug targets, and thereby design therapeutic strategies that work by untangling the immunological cross-wires of pathogenic cross-talk.
Abstract:
OBJECTIVE: Most studies on alcohol as a risk factor for injuries have been mechanism specific, and few have considered several mechanisms simultaneously or reported alcohol-attributable fractions (AAFs), which was the aim of the current study. METHOD: Data from 3,592 injured and 3,489 noninjured patients collected between January 2003 and June 2004 in the surgical ward of the emergency department of the Lausanne University Hospital (Switzerland) were analyzed. Four injury mechanisms derived from the International Classification of Diseases, 10th Revision, were considered: transportation-related injuries, falls, exposure to forces and other events, and interpersonal violence. Multinomial logistic regression models were calculated to estimate the risk relationships of different levels of alcohol consumption, using noninjured patients as quasi-controls. The AAFs were then calculated. RESULTS: Risk relationships between injury and acute consumption were found across all mechanisms, commonly resulting in dose-response relationships. Marked differences between mechanisms were observed for relative risks and AAFs, which varied between 15.2% and 33.1% and between 10.1% and 35.9%, depending on the time window of consumption (either 6 hours or 24 hours before injury, respectively). Low and medium levels of alcohol consumption generally were associated with the most AAFs. CONCLUSIONS: This study underscores the implications of even low levels of alcohol consumption on the risk of sustaining injuries through any of the mechanisms considered. Substantial AAFs are reported for each mechanism, particularly for injuries resulting from interpersonal violence. Observation of a so-called preventive paradox phenomenon is discussed, and prevention or intervention measures are described.
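The abstract does not say which formula was used for the AAFs. As a rough illustration only, the sketch below uses one common textbook definition (a Levin-type attributable fraction extended to several exposure levels), and all prevalences and relative risks are hypothetical numbers chosen for the example, not values from the study.

```python
def attributable_fractions(prevalences, relative_risks):
    """Levin-type attributable fraction for several exposure levels.

    prevalences[i]    : share of the population at consumption level i
    relative_risks[i] : risk at level i relative to non-drinkers
    Returns the overall fraction and the contribution of each level.
    """
    excess = [p * (rr - 1.0) for p, rr in zip(prevalences, relative_risks)]
    denominator = 1.0 + sum(excess)
    per_level = [e / denominator for e in excess]
    return sum(per_level), per_level

# Hypothetical low / medium / high consumption levels.
total_aaf, by_level = attributable_fractions(
    prevalences=[0.30, 0.10, 0.02],
    relative_risks=[1.5, 2.0, 4.0],
)
```

With these made-up inputs the low-consumption level contributes the largest share even though its relative risk is the smallest, because it is by far the most prevalent. That is the kind of pattern the abstract refers to as a preventive paradox.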
Abstract:
Therapist competence is a key variable for psychotherapy research. Empirically, the relationship between competence and therapeutic outcome has shown contradictory results and needs to be clarified, especially with regard to possible variables influencing this relationship. A total of 78 outpatients were treated by 15 therapists in a very brief 4-session format, based on psychoanalytic theory. Data were analyzed by means of a nested design using hierarchical linear modeling. No direct link between therapist competence and outcome was found; however, results corroborated the importance of alliance patterns as a moderator of the relationship between therapist competence and outcome. Only in dyads with alliance change over the course of treatment was competence clearly and positively related to outcome. These findings are discussed with regard to the importance for outcome of therapist competence and alliance construction processes.
Abstract:
With a life expectancy at the age of 65 of around 20 years, damaging health risk behaviours of young-old adults have become a target for preventive actions. Such risk factors necessitate an accurate understanding of the present and past socioeconomic conditions associated with health risk behaviours. The aim of our study is to assess the impact of certain life events as well as economic and environmental factors on health risk behaviours. We included 1309 participants of the Lausanne Cohort Lc65+ aged 65-70 years and employed logistic regression analyses, with individuals nested within areas. The results illustrate the influences of socioeconomic factors from childhood to young-old age. Life experiences in adulthood and economic resources in young-old age are both associated with unfavourable health behaviours. Neighbourhood is a modest determinant as well, particularly regarding alcohol consumption. Therefore, prevention against health risk behaviours should focus on population subgroups defined on the basis of their socioeconomic and living contexts.
Abstract:
We present a Bayesian approach for estimating the relative frequencies of multi-single nucleotide polymorphism (SNP) haplotypes in populations of the malaria parasite Plasmodium falciparum by using microarray SNP data from human blood samples. Each sample comes from a malaria patient and contains one or several parasite clones that may genetically differ. Samples containing multiple parasite clones with different genetic markers pose a special challenge. The situation is comparable with a polyploid organism. The data from each blood sample indicates whether the parasites in the blood carry a mutant or a wildtype allele at various selected genomic positions. If both mutant and wildtype alleles are detected at a given position in a multiply infected sample, the data indicates the presence of both alleles, but the ratio is unknown. Thus, the data only partially reveals which specific combinations of genetic markers (i.e. haplotypes across the examined SNPs) occur in distinct parasite clones. In addition, SNP data may contain errors at non-negligible rates. We use a multinomial mixture model with partially missing observations to represent this data and a Markov chain Monte Carlo method to estimate the haplotype frequencies in a population. Our approach addresses both challenges, multiple infections and data errors.
Abstract:
Traditionally, the common reserving methods used by non-life actuaries are based on the assumption that future claims are going to behave in the same way as they did in the past. There are two main sources of variability in the claims development process: the variability of the speed with which the claims are settled and the variability in the severity of the claims between different accident years. Large changes in these processes will generate distortions in the estimation of the claims reserves. The main objective of this thesis is to provide an indicator which first identifies and quantifies these two influences and second determines which model is adequate for a specific situation. Two stochastic models were analysed and the predictive distributions of the future claims were obtained. The main advantage of the stochastic models is that they provide measures of variability of the reserve estimates. The first model (PDM) combines the conjugate family Dirichlet-Multinomial with the Poisson distribution. The second model (NBDM) improves on the first by combining two conjugate families: Poisson-Gamma (for the distribution of the ultimate amounts) and Dirichlet-Multinomial (for the distribution of the incremental claims payments). It was found that the second model makes it possible to express the variability in the speed of the reporting process and in the development of claims severity as a function of two parameters of the above-mentioned distributions. These are the shape parameter of the Gamma distribution and the Dirichlet parameter. Depending on the relation between them, we can decide on the adequacy of the claims reserve estimation method. The parameters were estimated by the Method of Moments and Maximum Likelihood. The results were tested using selected simulated data and then using real data originating from three lines of business: Property/Casualty, General Liability, and Accident Insurance.
These data include different developments and specificities. The thesis shows that when the Dirichlet parameter is greater than the shape parameter of the Gamma distribution, the model has a positive correlation between past and future claims payments, which suggests the Chain-Ladder method as appropriate for the claims reserve estimation. In terms of claims reserves, if the cumulated payments are high, the positive correlation will imply high expectations for the future payments, resulting in high claims reserve estimates. The negative correlation appears when the Dirichlet parameter is lower than the shape parameter of the Gamma distribution, meaning low expected future payments for the same high observed cumulated payments. This corresponds to the situation when claims are reported rapidly and fewer claims remain expected subsequently. The extreme case appears when all claims are reported at the same time, leading to expectations for the future payments of zero or equal to the aggregated amount of the ultimate paid claims. For this latter case, the Chain-Ladder method is not recommended.
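For readers unfamiliar with the Chain-Ladder method named above, here is a minimal sketch of its usual deterministic form, assuming a cumulative run-off triangle with volume-weighted development factors. The triangle values are hypothetical, and the code does not implement the PDM or NBDM models of the thesis.

```python
def chain_ladder(triangle):
    """Basic chain-ladder projection on a cumulative run-off triangle.

    triangle[i] holds the cumulative payments observed so far for
    accident year i, so older years have longer rows.
    """
    n = len(triangle)
    factors = []
    for j in range(n - 1):
        # Use only accident years observed at both development steps j and j + 1.
        rows = [r for r in triangle if len(r) > j + 1]
        factors.append(sum(r[j + 1] for r in rows) / sum(r[j] for r in rows))
    ultimates = []
    for row in triangle:
        value = row[-1]
        # Roll the latest observed amount forward with the remaining factors.
        for j in range(len(row) - 1, n - 1):
            value *= factors[j]
        ultimates.append(value)
    reserves = [u - row[-1] for u, row in zip(ultimates, triangle)]
    return factors, ultimates, reserves

# Hypothetical cumulative payments for three accident years.
triangle = [
    [100.0, 150.0, 165.0],
    [110.0, 165.0],
    [120.0],
]
factors, ultimates, reserves = chain_ladder(triangle)
```

Because each projected amount is proportional to the latest cumulative payment, high payments to date always produce high projected future payments. This is the positive-correlation situation that the abstract identifies as the one where Chain-Ladder is appropriate.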
Abstract:
Active labor-market policies (ALMPs) have developed significantly over the past two decades across Organization for Economic Cooperation and Development (OECD) countries, with substantial cross-national differences in terms of both extent and overall orientation. The objective of this article is to account for cross-national variation in this policy field. It starts by reviewing existing scholarship concerning political, institutional, and ideational determinants of ALMPs. It then argues that ALMP is too broad a category to be used without further specification, and it develops a typology of four different types of ALMPs: incentive reinforcement, employment assistance, occupation, and human capital investment. These are discussed and examined through ALMP expenditure profiles in selected countries. The article uses this typology to analyze ALMP trajectories in six Western European countries and shows that the role of this instrument changes dramatically over time. It concludes that there is little regularity in the political determinants of ALMPs. In contrast, it finds strong institutional and ideational effects, nested in the interaction between the changing economic context and existing labor-market policies.
Abstract:
OBJECTIVE: To investigate HIV-related immunodeficiency as a risk factor for hepatocellular carcinoma (HCC) among persons infected with HIV, while controlling for the effect of frequent coinfection with hepatitis C and B viruses. DESIGN: A case-control study nested in the Swiss HIV Cohort Study. METHODS: Twenty-six HCC patients were identified in the Swiss HIV Cohort Study or through linkage with Swiss Cancer Registries, and were individually matched to 251 controls according to Swiss HIV Cohort Study centre, sex, HIV-transmission category, age and year at enrollment. Odds ratios and corresponding confidence intervals were estimated by conditional logistic regression. RESULTS: All HCC patients were positive for hepatitis B surface antigen or antibodies against hepatitis C virus. HCC patients included 14 injection drug users (three positive for hepatitis B surface antigen and 13 for antibodies against hepatitis C virus) and 12 men having sex with men/heterosexual/other (11 positive for hepatitis B surface antigen, three for antibodies against hepatitis C virus), revealing a strong relationship between HIV transmission route and hepatitis viral type. Latest CD4+ cell count [odds ratio (OR) per 100 cells/μL decrease = 1.33, 95% confidence interval (CI) 1.06-1.68] and CD4+ cell count percentage (OR per 10% decrease = 1.65, 95% CI 1.01-2.71) were significantly associated with HCC. The effects of CD4+ cell count were concentrated among men having sex with men/heterosexual/other rather than injecting drug users. Highly active antiretroviral therapy use was not significantly associated with HCC risk (OR for ever versus never = 0.59, 95% CI 0.18-1.91). CONCLUSION: Lower CD4+ cell counts increased the risk for HCC among persons infected with HIV, an effect that was particularly evident for hepatitis B virus-related HCC arising in non-injecting drug users.
Abstract:
Gastric cancer affects about one million people per year worldwide, being the second leading cause of cancer mortality. The study of its etiology remains therefore a global issue as it may allow the identification of major targets, besides eradication of Helicobacter pylori infection, for primary prevention. It has however received little attention, given its comparatively low incidence in most high-income countries. We introduce a consortium of epidemiological investigations named the 'Stomach cancer Pooling (StoP) Project'. Twenty-two studies agreed to participate, for a total of over 9000 cases and 23 000 controls. Twenty studies have already shared the original data set. Of the patients, 40% are from Asia, 43% from Europe, and 17% from North America; 34% are women and 66% men; the median age is 61 years; 56% are from population-based case-control studies, 41% from hospital-based ones, and 3% from nested case-control studies derived from cohort investigations. Biological samples are available from 12 studies. The aim of the StoP Project is to analyze the role of lifestyle and genetic determinants in the etiology of gastric cancer through pooled analyses of individual-level data. The uniquely large data set will allow us to define and quantify the main effects of each risk factor of interest, including a number of infrequent habits, and to adequately address associations in subgroups of the population, as well as interaction within and between environmental and genetic factors. Further, we will carry out separate analyses according to different histotypes and subsites of gastric cancer, to identify potential different risk patterns and etiological characteristics.
Abstract:
Question: When multiple observers record the same spatial units of alpine vegetation, how much variation is there in the records and what are the consequences of this variation for monitoring schemes to detect change? Location: One test summit in Switzerland (Alps) and one test summit in Scotland (Cairngorm Mountains). Method: Eight observers used the GLORIA protocols for species composition and visual cover estimates in percent on large summit sections (>100 m²) and species composition and frequency in nested quadrats (1 m²). Results: The multiple records from the same spatial unit for species composition and species cover showed considerable variation in the two countries. Estimates of pseudoturnover of composition and coefficients of variation of cover estimates for vascular plant species in 1 m × 1 m quadrats showed less variation than in previously published reports whereas our results in larger sections were broadly in line with previous reports. In Scotland, estimates for bryophytes and lichens were more variable than for vascular plants. Conclusions: Statistical power calculations indicated that, unless large numbers of plots were used, changes in cover or frequency were only likely to be detected for abundant species (exceeding 10% cover) or if relative changes were large (50% or more). Lower variation could be reached with the point methods and with larger numbers of small plots. However, as summits often strongly differ from each other, supplementary summits cannot be considered as a way of increasing statistical power without introducing a supplementary component of variance into the analysis and hence the power calculations.
Abstract:
Aims: To describe the drinking patterns and their baseline predictive factors during a 12-month period after an initial evaluation for alcohol treatment. Methods: CONTROL is a single-center, prospective, observational study evaluating consecutive alcohol-dependent patients. Using a curve clustering methodology based on a polynomial regression mixture model, we identified three clusters of patients with dominant alcohol use patterns described as mostly abstainers, mostly moderate drinkers and mostly heavy drinkers. Multinomial logistic regression analysis was used to identify baseline factors (socio-demographic, alcohol dependence consequences and related factors) predictive of belonging to each drinking cluster. Results: The sample included 143 alcohol-dependent adults (63.6% males), mean age 44.6 ± 11.8 years. The clustering method identified 47 (32.9%) mostly abstainers, 56 (39.2%) mostly moderate drinkers and 40 (28.0%) mostly heavy drinkers. Multivariate analyses indicated that mild or severe depression at baseline predicted belonging to the mostly moderate drinkers cluster during follow-up (relative risk ratio (RRR) 2.42, CI [1.02-5.73], P = 0.045), while living alone (RRR 2.78, CI [1.03-7.50], P = 0.044) and reporting more alcohol-related consequences (RRR 1.03, CI [1.01-1.05], P = 0.004) predicted belonging to the mostly heavy drinkers cluster during follow-up. Conclusion: In this sample, the drinking patterns of alcohol-dependent patients were predicted by baseline factors, i.e. depression, living alone or alcohol-related consequences, findings that may inform clinicians about the likely drinking patterns of their alcohol-dependent patients over the year following the initial evaluation for alcohol treatment.
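To make the relative risk ratios (RRRs) concrete, the sketch below shows how a multinomial logit turns coefficients into cluster probabilities, and why the exponentiated slope equals the RRR against the reference category. The abstract does not name the reference cluster or report intercepts, so the "reference" label and the intercepts here are hypothetical. Only the value 2.42 is borrowed from the abstract, purely for illustration.

```python
import math

def class_probabilities(x, coefs):
    """Multinomial logit probabilities with one reference category.

    coefs maps each non-reference class to (intercept, slope);
    the reference class has a linear score fixed at zero.
    """
    scores = {"reference": 0.0}
    for name, (intercept, slope) in coefs.items():
        scores[name] = intercept + slope * x
    denom = sum(math.exp(s) for s in scores.values())
    return {name: math.exp(s) / denom for name, s in scores.items()}

# Hypothetical coefficients; x = 1 stands for depression at baseline.
coefs = {
    "moderate": (-0.2, math.log(2.42)),
    "heavy": (-0.5, 0.3),
}
p0 = class_probabilities(0, coefs)
p1 = class_probabilities(1, coefs)

# Ratio of P(moderate) / P(reference) with and without the predictor.
ratio_change = (p1["moderate"] / p1["reference"]) / (p0["moderate"] / p0["reference"])
```

Here `ratio_change` recovers the RRR of 2.42 regardless of the intercepts. The RRR describes a change in the ratio of two category probabilities, not in the probability of the category itself.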
Abstract:
PURPOSE: The objective of this study was to investigate the effects of weather, rank, and home advantage on international football match results and scores in the Gulf Cooperation Council (GCC) region. METHODS: Football matches (n = 2008) in six GCC countries were analyzed. To determine the weather influence on the likelihood of favorable outcome and goal difference, a generalized linear model with a logit link function and multiple regression analysis were performed. RESULTS: In the GCC region, home teams tend to have greater likelihood of a favorable outcome (P < 0.001) and higher goal difference (P < 0.001). Temperature difference was identified as a significant explanatory variable when used independently (P < 0.001) or after adjustment for home advantage and team ranking (P < 0.001). The likelihood of favorable outcome for GCC teams increases by 3% for every 1-unit increase in temperature difference. After inclusion of interaction with opposition, this advantage remains significant only when playing against non-GCC opponents. While home advantage increased the odds of favorable outcome (P < 0.001) and goal difference (P < 0.001) after inclusion of interaction term, the likelihood of favorable outcome for a GCC team decreased (P < 0.001) when playing against a stronger opponent. Finally, the temperature and wet bulb globe temperature approximation were found to be better indicators of the effect of environmental conditions on match outcomes than absolute and relative humidity or heat index. CONCLUSIONS: In the GCC region, higher temperature increased the likelihood of a favorable outcome when playing against non-GCC teams. However, international ranking should be considered because an opponent with a higher rank reduced, but did not eliminate, the likelihood of a favorable outcome.
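Under a logit link, a per-unit effect is usually an odds ratio. Assuming the reported 3% refers to the odds (the abstract does not say so explicitly), the sketch below shows how such an effect translates into a probability. The baseline probability of 0.5 and the 10-unit difference are hypothetical inputs.

```python
def shifted_probability(baseline_prob, odds_ratio_per_unit, units):
    """Apply a per-unit odds ratio to a baseline probability (logit scale)."""
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * odds_ratio_per_unit ** units
    return new_odds / (1.0 + new_odds)

# Hypothetical: even baseline, odds ratio 1.03 per unit, 10-unit difference.
p10 = shifted_probability(0.5, 1.03, 10)  # about 0.573
```

The effect compounds multiplicatively on the odds, so a 10-unit difference raises the odds by roughly 34% rather than 30%. The resulting change in probability depends on the baseline.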