869 results for Cox regression
Forward Stepwise Ridge Regression (FSRR) based variable selection for highly correlated input spaces
Abstract:
Virtual metrology (VM) aims to predict metrology values using sensor data from production equipment and physical metrology values of preceding samples. VM is a promising technology for the semiconductor manufacturing industry as it can reduce the frequency of in-line metrology operations and provide supportive information for other operations such as fault detection, predictive maintenance and run-to-run control. The prediction models for VM can be from a large variety of linear and nonlinear regression methods and the selection of a proper regression method for a specific VM problem is not straightforward, especially when the candidate predictor set is of high dimension, correlated and noisy. Using process data from a benchmark semiconductor manufacturing process, this paper evaluates the performance of four typical regression methods for VM: multiple linear regression (MLR), least absolute shrinkage and selection operator (LASSO), neural networks (NN) and Gaussian process regression (GPR). It is observed that GPR performs the best among the four methods and that, remarkably, the performance of linear regression approaches that of GPR as the subset of selected input variables is increased. The observed competitiveness of high-dimensional linear regression models, which does not hold true in general, is explained in the context of extreme learning machines and functional link neural networks.
Abstract:
Background and Purpose: The aim was to investigate prospectively the all-cause mortality risk up to and after coronary heart disease (CHD) and stroke events in European middle-aged men.
Methods: The study population comprised 10 424 men 50 to 59 years of age recruited between 1991 and 1994 in France (N=7855) and Northern Ireland (N=2747) within the Prospective Epidemiological Study of Myocardial Infarction. Incident CHD and stroke events and deaths from all causes were prospectively registered during the 10-year follow-up. In Cox's proportional hazards regression analysis, CHD and stroke events during follow-up were used as time-dependent covariates.
Results: A total of 769 CHD and 132 stroke events were adjudicated, and 569 deaths up to and 66 after CHD or stroke occurred during follow-up. After adjustment for study country and cardiovascular risk factors, the hazard ratios of all-cause mortality were 1.58 (95% confidence interval 1.18-2.12) after CHD and 3.13 (95% confidence interval 1.98-4.92) after stroke.
Conclusions: These findings support continuous efforts to promote both primary and secondary prevention of cardiovascular disease.
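Illustrative note: treating a non-fatal event as a time-dependent covariate is commonly done by splitting each participant's follow-up into (start, stop] intervals, so that time before the event counts as unexposed and time after it as exposed. The Python sketch below shows that data layout only. The names `Interval` and `split_follow_up` are hypothetical and are not taken from the study, and the resulting rows would still have to be passed to a Cox model fitter that accepts start/stop data.

```python
from typing import List, NamedTuple, Optional


class Interval(NamedTuple):
    start: float    # years since baseline at which the interval opens
    stop: float     # years since baseline at which the interval closes
    had_event: int  # time-dependent covariate: 1 once CHD/stroke has occurred
    died: int       # 1 if death from any cause closes this interval


def split_follow_up(follow_up_end: float, died: bool,
                    event_time: Optional[float] = None) -> List[Interval]:
    """Split one participant's follow-up at the time of a non-fatal event.

    Hypothetical helper: person-time before the event is labelled
    unexposed (had_event=0) and person-time after it exposed (had_event=1).
    """
    # No event during follow-up: a single unexposed interval.
    if event_time is None or event_time >= follow_up_end:
        return [Interval(0.0, follow_up_end, 0, int(died))]
    # Event at baseline: the whole follow-up is exposed.
    if event_time <= 0.0:
        return [Interval(0.0, follow_up_end, 1, int(died))]
    # Event during follow-up: death can only close the second interval.
    return [
        Interval(0.0, event_time, 0, 0),
        Interval(event_time, follow_up_end, 1, int(died)),
    ]


if __name__ == "__main__":
    # A participant with an event at year 3 who died at year 8.
    for row in split_follow_up(8.0, True, 3.0):
        print(row)
```

Splitting person-time this way avoids counting the years before the event as exposed, which would bias the post-event hazard ratio downwards.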
Abstract:
OBJECTIVE: To investigate the impact of smoking and smoking cessation on cardiovascular mortality, acute coronary events, and stroke events in people aged 60 and older, and to calculate and report risk advancement periods for cardiovascular mortality in addition to traditional epidemiological relative risk measures.
DESIGN: Individual participant meta-analysis using data from 25 cohorts participating in the CHANCES consortium. Data were harmonised, analysed separately employing Cox proportional hazard regression models, and combined by meta-analysis.
RESULTS: Overall, 503,905 participants aged 60 and older were included in this study, of whom 37,952 died from cardiovascular disease. Random effects meta-analysis of the association of smoking status with cardiovascular mortality yielded a summary hazard ratio of 2.07 (95% CI 1.82 to 2.36) for current smokers and 1.37 (1.25 to 1.49) for former smokers compared with never smokers. Corresponding summary estimates for risk advancement periods were 5.50 years (4.25 to 6.75) for current smokers and 2.16 years (1.38 to 2.39) for former smokers. The excess risk in smokers increased with cigarette consumption in a dose-response manner, and decreased continuously with time since smoking cessation in former smokers. Relative risk estimates for acute coronary events and for stroke events were somewhat lower than for cardiovascular mortality, but patterns were similar.
CONCLUSIONS: Our study corroborates and expands evidence from previous studies in showing that smoking is a strong independent risk factor of cardiovascular events and mortality even at older age, advancing cardiovascular mortality by more than five years, and demonstrating that smoking cessation in these age groups is still beneficial in reducing the excess risk.
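Illustrative note: the abstract does not say which random-effects estimator was used. The Python sketch below assumes the common DerSimonian-Laird method and pools cohort-level hazard ratios on the log scale, recovering each standard error from the reported 95% confidence interval. The function name `pool_random_effects` and the three example cohorts are hypothetical and are not CHANCES data.

```python
import math
from typing import List, Tuple


def pool_random_effects(
    studies: List[Tuple[float, float, float]]
) -> Tuple[float, float, float, float]:
    """Pool (HR, CI lower, CI upper) triples with DerSimonian-Laird.

    Returns (pooled HR, CI lower, CI upper, tau squared).
    """
    z = 1.96  # normal quantile for a 95% confidence interval
    # Work on the log scale; recover each variance from the CI width.
    y = [math.log(hr) for hr, _, _ in studies]
    v = [((math.log(hi) - math.log(lo)) / (2 * z)) ** 2 for _, lo, hi in studies]

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / vi for vi in v]
    y_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, y))

    # Between-study variance, truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(studies) - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to every within-study variance.
    w_star = [1.0 / (vi + tau2) for vi in v]
    y_pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return (math.exp(y_pooled), math.exp(y_pooled - z * se),
            math.exp(y_pooled + z * se), tau2)


if __name__ == "__main__":
    # Hypothetical cohort-level hazard ratios with 95% CIs.
    cohorts = [(1.8, 1.4, 2.3), (2.4, 1.9, 3.0), (2.0, 1.5, 2.7)]
    hr, lo, hi, tau2 = pool_random_effects(cohorts)
    print(f"summary HR {hr:.2f} ({lo:.2f} to {hi:.2f}), tau^2 = {tau2:.4f}")
```

When the cohorts agree closely, the estimated between-study variance is zero and the result coincides with the fixed-effect estimate.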
Abstract:
A forward and backward least angle regression (LAR) algorithm is proposed to construct the nonlinear autoregressive model with exogenous inputs (NARX) that is widely used to describe a large class of nonlinear dynamic systems. The main objective of this paper is to improve model sparsity and generalization performance of the original forward LAR algorithm. This is achieved by introducing a replacement scheme using an additional backward LAR stage. The backward stage replaces insignificant model terms selected by forward LAR with more significant ones, leading to an improved model in terms of the model compactness and performance. A numerical example to construct four types of NARX models, namely polynomials, radial basis function (RBF) networks, neuro fuzzy and wavelet networks, is presented to illustrate the effectiveness of the proposed technique in comparison with some popular methods.
Abstract:
Wnt/β-catenin signaling has a central role in the development and progression of most colon cancers (CCs). Germline variants in Wnt/β-catenin pathway genes may result in altered gene function and/or activity, thereby causing inter-individual differences in relation to tumor recurrence capacity and chemoresistance. We investigated germline polymorphisms in a comprehensive panel of Wnt/β-catenin pathway genes to predict time to tumor recurrence (TTR) in patients with stage III and high-risk stage II CC. A total of 234 patients treated with 5-fluorouracil-based chemotherapy were included in this study. Whole-blood samples were analyzed for putative functional germline polymorphisms in SFRP3, SFRP4, DKK2, DKK3, Axin2, APC, TCF7L2, WNT5B, CXXC4, NOTCH2 and GLI1 genes by PCR-based restriction fragment-length polymorphism or direct DNA sequencing. Polymorphisms with statistical significance were validated in an independent study cohort. The minor allele of WNT5B rs2010851 T>G was significantly associated with a shorter TTR (10.7 vs 4.9 years; hazard ratio: 2.48; 95% CI, 0.96-6.38; P=0.04) in high-risk stage II CC patients. This result remained significant in multivariate Cox's regression analysis. This study identifies the WNT5B germline variant rs2010851 as a stage-dependent prognostic marker for CC patients after 5-fluorouracil-based adjuvant therapy.
Abstract:
In many applications, and especially those where batch processes are involved, a target scalar output of interest is often dependent on one or more time series of data. With the exponential growth in data logging in modern industries such time series are increasingly available for statistical modeling in soft sensing applications. In order to exploit time series data for predictive modelling, it is necessary to summarise the information they contain as a set of features to use as model regressors. Typically this is done in an unsupervised fashion using simple techniques such as computing statistical moments, principal components or wavelet decompositions, often leading to significant information loss and hence suboptimal predictive models. In this paper, a functional learning paradigm is exploited in a supervised fashion to derive continuous, smooth estimates of time series data (yielding aggregated local information), while simultaneously estimating a continuous shape function yielding optimal predictions. The proposed Supervised Aggregative Feature Extraction (SAFE) methodology can be extended to support nonlinear predictive models by embedding the functional learning framework in a Reproducing Kernel Hilbert Spaces setting. SAFE has a number of attractive features including closed form solution and the ability to explicitly incorporate first and second order derivative information. Using simulation studies and a practical semiconductor manufacturing case study we highlight the strengths of the new methodology with respect to standard unsupervised feature extraction approaches.
Abstract:
Both polygenicity (many small genetic effects) and confounding biases, such as cryptic relatedness and population stratification, can yield an inflated distribution of test statistics in genome-wide association studies (GWAS). However, current methods cannot distinguish between inflation from a true polygenic signal and bias. We have developed an approach, LD Score regression, that quantifies the contribution of each by examining the relationship between test statistics and linkage disequilibrium (LD). The LD Score regression intercept can be used to estimate a more powerful and accurate correction factor than genomic control. We find strong evidence that polygenicity accounts for the majority of the inflation in test statistics in many GWAS of large sample size.
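Illustrative note: LD Score regression regresses each SNP's chi-square statistic on its LD Score, under the model E[chi2_j] = N*h2*l_j/M + N*a + 1, where N is the sample size, M the number of SNPs, h2 the heritability, l_j the LD Score of SNP j and a the contribution of confounding. The slope therefore carries the polygenic signal and the intercept the bias. The Python sketch below uses ordinary least squares for brevity, which is a simplification because the published method uses a weighted regression. The function name `ld_score_regression` and the toy numbers are hypothetical.

```python
from typing import Sequence, Tuple


def ld_score_regression(chi2: Sequence[float], ld_scores: Sequence[float],
                        n_samples: int, n_snps: int) -> Tuple[float, float]:
    """Unweighted sketch: regress chi-square statistics on LD Scores.

    Returns (intercept, h2). An intercept above 1 suggests confounding;
    the slope equals N * h2 / M, so h2 = slope * M / N.
    """
    k = len(chi2)
    mean_l = sum(ld_scores) / k
    mean_c = sum(chi2) / k
    # Closed-form simple linear regression.
    sxx = sum((l - mean_l) ** 2 for l in ld_scores)
    sxy = sum((l - mean_l) * (c - mean_c) for l, c in zip(ld_scores, chi2))
    slope = sxy / sxx
    intercept = mean_c - slope * mean_l
    h2 = slope * n_snps / n_samples
    return intercept, h2


if __name__ == "__main__":
    # Toy, noise-free data generated from intercept 1.05 and slope 0.002.
    ld = [10.0, 50.0, 100.0, 200.0]
    stats = [1.05 + 0.002 * l for l in ld]
    intercept, h2 = ld_score_regression(stats, ld, n_samples=50_000,
                                        n_snps=1_000_000)
    print(f"intercept = {intercept:.3f}, h2 = {h2:.3f}")
```

In this reading, inflation that grows with LD Score is attributed to polygenicity, while inflation that is constant across LD Scores is attributed to confounding.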
Abstract:
Background: Around 10-15% of patients with locally advanced rectal cancer (LARC) undergo a pathologically complete response (TRG4) to neoadjuvant chemoradiotherapy; the rest of patients exhibit a spectrum of tumour regression (TRG1-3). Understanding therapy-related genomic alterations may help us to identify underlying biology or novel targets associated with response that could increase the efficacy of therapy in patients that do not benefit from the current standard of care.
Methods: 48 FFPE rectal cancer biopsies and matched resections were analysed using the WG-DASL HumanHT-12_v4 Beadchip array on the Illumina iScan. Bioinformatic analysis was conducted in Partek Genomics Suite and RStudio. The limma and glmnet packages were used to identify genes differentially expressed between tumour regression grades. Validation of microarray results will be carried out using IHC, RNAscope and RT-PCR.
Results: Immune response genes were observed from supervised analysis of the biopsies which may have predictive value. Differential gene expression from the resections as well as pre and post therapy analysis revealed induction of genes in a tumour regression dependent manner. Pathway mapping and Gene Ontology analysis of these genes suggested antigen processing and natural killer mediated cytotoxicity respectively. The natural killer-like gene signature was switched off in non-responders and on in the responders. IHC has confirmed the presence of Natural killer cells through CD56+ staining.
Conclusion: Identification of NK cell genes and CD56+ cells in patients responding to neoadjuvant chemoradiotherapy warrants further investigation into their association with tumour regression grade in LARC. NK cells are known to lyse malignant cells and determining whether their presence is a cause or consequence of response is crucial. Interrogation of the cytokines upregulated in our NK-like signature will help guide future in vitro models.
Abstract:
Histone deacetylases (HDACs) are enzymes involved in transcriptional repression. We aimed to examine the significance of HDAC1 and HDAC2 gene expression in the prediction of recurrence and survival in 156 patients with hepatocellular carcinoma (HCC) among a South East Asian population who underwent curative surgical resection in Singapore. We found that HDAC1 and HDAC2 were upregulated in the majority of HCC tissues. The presence of HDAC1 in tumor tissues was correlated with poor tumor differentiation. Notably, HDAC1 expression in adjacent non-tumor hepatic tissues was correlated with the presence of satellite nodules and multiple lesions, suggesting that HDAC1 upregulation within the field of HCC may contribute to tumor spread. Using competing risk regression analysis, we found that increased cancer-specific mortality was significantly associated with HDAC2 expression. Mortality was also increased with high HDAC1 expression. In the liver cancer cell lines, HEP3B, HEPG2, PLC5, and a colorectal cancer cell line, HCT116, the combined knockdown of HDAC1 and HDAC2 increased cell death and reduced cell proliferation as well as colony formation. In contrast, knockdown of either HDAC1 or HDAC2 alone had minimal effects on cell death and proliferation. Taken together, our study suggests that both HDAC1 and HDAC2 exert pro-survival effects in HCC cells, and the combination of isoform-specific HDAC inhibitors against both HDACs may be effective in targeting HCC to reduce mortality.
Abstract:
Master's dissertation, Water and Coastal Management, Faculdade de Ciências e Tecnologia, Universidade do Algarve, 2010
Abstract:
Context: Androgen inactivation is mainly regulated by metabolic enzymes of the UDP-glucuronosyltransferase (UGT) family. This metabolic process controls the bioavailability of systemic and local steroid hormones. Objective: To study the relationship between expression of the enzyme UDP-glucuronosyltransferase 2B polypeptide 28 (UGT2B28), which is involved in hormone biotransformation, and both circulating hormone levels and clinicopathological characteristics in prostate cancer (PCa). Design and participants: We used large-scale immunohistochemistry (tissue microarray) on tissues from 239 patients with localized PCa. An additional 51 patients lacking the UGT2B28 gene in their genome were studied to confirm the importance of this enzyme for circulating hormone levels. Results: Overexpression of UGT2B28 was associated with lower prostate-specific antigen (PSA) levels at diagnosis, a higher Gleason score, and positive margins and nodal status, and was independently associated with the risk of progression. Overexpression of the enzyme was also associated with higher circulating levels of testosterone (T) and dihydrotestosterone (DHT). Patients not expressing the UGT2B28 gene had lower levels of T (19%), DHT (17%) and glucuronidated metabolites (18-38%), and higher levels of the adrenal precursor androstenedione (36%). Conclusion: The UGT2B28 enzyme modifies circulating levels of T and DHT, and its overexpression is associated with higher-grade PCa.
Our study uncovered a new role for UGT2B28, that of a regulator of steroidogenesis, and highlighted the interconnection between the hormone biotransformation capacity of cancer cells, hormone levels, clinicopathological characteristics and the risk of progression.
Abstract:
Airborne concentrations of Poaceae pollen have been monitored in Poznań for more than ten years and the length of the dataset is now considered sufficient for statistical analysis. The objective of this paper is to produce long-range forecasts that predict certain characteristics of the grass pollen season (such as the start, peak and end dates of the grass pollen season) as well as short-term forecasts that predict daily variations in grass pollen counts for the next day or next few days throughout the main grass pollen season. The method of forecasting was regression analysis. Correlation analysis was used to examine the relationship between grass pollen counts and the factors that affect its production, release and dispersal. The models were constructed with data from 1994-2004 and tested on data from 2005 and 2006. The forecast models predicted the start of the grass pollen season to within 2 days and achieved 61% and 70% accuracy on a scale of 1-4 when forecasting variations in daily grass pollen counts in 2005 and 2006 respectively. This study has emphasised how important the weather during the few weeks or months preceding pollination is to grass pollen production, and draws attention to the importance of considering large-scale patterns of climate variability (indices of the North Atlantic Oscillation) when constructing forecast models for allergenic pollen.