113 results for ICF CLASSIFICATION
Abstract:
Classification methods with embedded feature selection capability are very appealing for the analysis of complex processes, since they allow the analysis of root causes even when the number of input variables is high. In this work, we investigate the performance of three classification techniques within a Monte Carlo strategy with the aim of root cause analysis. We consider the naive Bayes classifier and the logistic regression model with two different implementations for controlling model complexity, namely a LASSO-like implementation with L1-norm regularization and a fully Bayesian implementation of the logistic model, the so-called relevance vector machine. Several challenges can arise when estimating such models, mainly linked to the characteristics of the data: a large number of input variables, high correlation among subsets of variables, a number of variables higher than the number of available data points, and unbalanced datasets. Using an ecological dataset and a semiconductor manufacturing dataset, we show the advantages and drawbacks of each method, highlighting the superior performance in terms of classification accuracy of the relevance vector machine with respect to the other classifiers. Moreover, we show how the combination of the proposed techniques and the Monte Carlo approach can be used to get more robust insights into the problem under analysis when faced with challenging modelling conditions.
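The idea of pairing an L1-regularized logistic model with a Monte Carlo strategy can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the resampling scheme or the solver, so the random subsampling, the proximal-gradient fit, and all names (`fit_l1_logistic`, `selection_frequency`, `lam`, `frac`) are hypothetical, and the data are synthetic.

```python
import math
import random


def soft_threshold(v, t):
    """Shrink v toward zero by t; this is what makes L1 weights exactly zero."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit_l1_logistic(X, y, lam=0.1, lr=0.5, n_iter=200):
    """Fit logistic regression with an L1 penalty by proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            err = 1.0 / (1.0 + math.exp(-z)) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= lr * grad_b  # the intercept is not penalized
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


def selection_frequency(X, y, n_runs=20, frac=0.8, seed=0, **fit_kwargs):
    """Refit on random subsamples (an assumed Monte Carlo scheme) and report
    how often each input variable receives a non-zero weight."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    counts = [0] * d
    for _ in range(n_runs):
        idx = rng.sample(range(n), int(frac * n))
        w, _ = fit_l1_logistic([X[i] for i in idx], [y[i] for i in idx], **fit_kwargs)
        for j, wj in enumerate(w):
            if wj != 0.0:
                counts[j] += 1
    return [c / n_runs for c in counts]


# Synthetic data: only the first variable drives the label.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(60)]
y = [1 if row[0] > 0 else 0 for row in X]
freq = selection_frequency(X, y, n_runs=20, lam=0.1, lr=0.5, n_iter=200)
print(freq)
```

Variables that are selected in most runs are candidates for root causes; variables selected only occasionally are more likely artefacts of a particular subsample.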
Abstract:
The applicability of ultra-short-term wind power prediction (USTWPP) models is reviewed. The proposed USTWPP method extracts features from historical wind power time series (WPTS) data and classifies every short WPTS into one of several subsets, each well defined by a stationary pattern. All WPTS that match none of the stationary patterns are sorted into a nonstationary-pattern subset. Each of these WPTS subsets needs a USTWPP model specially optimized for it offline. For online application, the pattern of the most recent short WPTS is recognized, and the corresponding prediction model is then called for USTWPP. The validity of the proposed method is verified by simulations.
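The classify-then-dispatch scheme described above can be sketched in a few lines. Everything concrete here is an assumption for illustration: the abstract does not name the patterns, features, thresholds, or per-pattern models, so the pattern labels, `classify_window`, the tolerances, and the three toy predictors are hypothetical stand-ins.

```python
from statistics import mean, pstdev


def classify_window(window, flat_tol=0.05, noise_tol=0.2):
    """Assign a short series to a (hypothetical) pattern from its first differences."""
    diffs = [b - a for a, b in zip(window, window[1:])]
    if pstdev(diffs) > noise_tol:
        return "nonstationary"  # matches none of the stationary patterns
    slope = mean(diffs)
    if abs(slope) <= flat_tol:
        return "flat"
    return "rising" if slope > 0 else "falling"


def persistence_model(window):
    """Predict that the next value equals the last one."""
    return window[-1]


def trend_model(window):
    """Extrapolate the average step observed in the window."""
    diffs = [b - a for a, b in zip(window, window[1:])]
    return window[-1] + mean(diffs)


def mean_model(window):
    """Fall back to the window mean when no stationary pattern fits."""
    return mean(window)


# One model per pattern, standing in for models optimized offline.
MODELS = {
    "flat": persistence_model,
    "rising": trend_model,
    "falling": trend_model,
    "nonstationary": mean_model,
}


def predict_next(window):
    """Online step: recognize the pattern, then call the matching model."""
    pattern = classify_window(window)
    return pattern, MODELS[pattern](window)


print(predict_next([1.0, 1.5, 2.0, 2.5]))
```

Keeping the pattern-to-model mapping in a dictionary means a new pattern only needs a new entry, without changing the online step.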
Abstract:
Breast cancer remains a frequent cause of female cancer death despite the great strides in elucidation of biological subtypes and their reported clinical and prognostic significance. We have defined a general cohort of breast cancers in terms of putative actionable targets, involving growth and proliferative factors, the cell cycle, and apoptotic pathways, both as single biomarkers across a general cohort and within intrinsic molecular subtypes.
We identified 293 patients treated with adjuvant chemotherapy. Additional hormonal therapy and trastuzumab were administered depending on hormonal and HER2 status, respectively. We performed immunohistochemistry for ER, PR, HER2, MM1, CK5/6, p53, TOP2A, EGFR, IGF1R, PTEN, p-mTOR and e-cadherin. The cohort was classified into luminal (62%) and non-luminal (38%) tumors as well as luminal A (27%), luminal B HER2 negative (22%) and positive (12%), HER2 enriched (14%) and triple negative (25%). Patients with luminal tumors and co-overexpression of TOP2A or IGF1R loss displayed worse overall survival (p=0.0251 and p=0.0008, respectively). Non-luminal tumors had much more heterogeneous expression profiles, with no individual markers of prognostic significance. Non-luminal tumors were characterised by EGFR and TOP2A overexpression, IGF1R, PTEN and p-mTOR negativity, and extreme p53 expression.
Our results indicate that only a minority of intrinsic subtype tumors purely express single novel actionable targets. This lack of pure biomarker expression is particularly prevalent in the triple negative subgroup and may allude to the mechanism of targeted therapy inaction and myriad disappointing trial results. Utilising a combinatorial biomarker approach may enhance studies of targeted therapies, providing additional information during design and patient selection while also helping to decipher negative trial results.
Abstract:
Mobile malware has been growing in scale and complexity as smartphone usage continues to rise. Android has surpassed other mobile platforms as the most popular whilst also witnessing a dramatic increase in malware targeting the platform. A worrying trend that is emerging is the increasing sophistication of Android malware to evade detection by traditional signature-based scanners. As such, Android app marketplaces remain at risk of hosting malicious apps that could evade detection before being downloaded by unsuspecting users. Hence, in this paper we present an effective approach to alleviate this problem based on Bayesian classification models obtained from static code analysis. The models are built from a collection of code and app characteristics that provide indicators of potential malicious activities. The models are evaluated with real malware samples in the wild and results of experiments are presented to demonstrate the effectiveness of the proposed approach.
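A Bayesian classifier over statically extracted app properties can be sketched as below. This is a minimal Bernoulli naive Bayes with Laplace smoothing, not the paper's model: the abstract does not list its features or training data, so the three binary features and the six training rows are hypothetical.

```python
import math


def train_bernoulli_nb(samples, labels, alpha=1.0):
    """Estimate a log prior and smoothed per-feature probabilities for each class."""
    n_features = len(samples[0])
    model = {}
    for c in sorted(set(labels)):
        rows = [s for s, lab in zip(samples, labels) if lab == c]
        log_prior = math.log(len(rows) / len(samples))
        # P(feature_j = 1 | class c) with Laplace smoothing
        probs = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[c] = (log_prior, probs)
    return model


def predict(model, x):
    """Return the class with the highest posterior log score for feature vector x."""
    def score(c):
        log_prior, probs = model[c]
        return log_prior + sum(
            math.log(p if xi else 1.0 - p) for xi, p in zip(x, probs)
        )
    return max(model, key=score)


# Hypothetical binary features from static analysis, for illustration only:
# [uses an SMS API, loads code dynamically, registers a boot receiver]
X = [[1, 1, 0], [1, 0, 1], [1, 1, 1], [0, 0, 0], [0, 0, 1], [0, 1, 0]]
y = ["malware", "malware", "malware", "benign", "benign", "benign"]

model = train_bernoulli_nb(X, y)
print(predict(model, [1, 1, 1]), predict(model, [0, 0, 0]))
```

Smoothing keeps a feature that never appears in one class from driving that class's probability to zero, which matters when the training set is small.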
Abstract:
The Magellanic Clouds are uniquely placed to study the stellar contribution to dust emission. Individual stars can be resolved in these systems even in the mid-infrared, and they are close enough to allow detection of infrared excess caused by dust. We have searched the Spitzer Space Telescope data archive for all Infrared Spectrograph (IRS) staring-mode observations of the Small Magellanic Cloud (SMC) and found that 209 Infrared Array Camera (IRAC) point sources within the footprint of the Surveying the Agents of Galaxy Evolution in the Small Magellanic Cloud (SAGE-SMC) Spitzer Legacy programme were targeted, within a total of 311 staring-mode observations. We classify these point sources using a decision tree method of object classification, based on infrared spectral features, continuum and spectral energy distribution shape, bolometric luminosity, cluster membership and variability information. We find 58 asymptotic giant branch (AGB) stars, 51 young stellar objects, 4 post-AGB objects, 22 red supergiants, 27 stars (of which 23 are dusty OB stars), 24 planetary nebulae (PNe), 10 Wolf-Rayet stars, 3 H II regions, 3 R Coronae Borealis stars, 1 Blue Supergiant and 6 other objects, including 2 foreground AGB stars. We use these classifications to evaluate the success of photometric classification methods reported in the literature.
Abstract:
Sediment particle size analysis (PSA) is routinely used to support benthic macrofaunal community distribution data in habitat mapping and Ecological Status (ES) assessment. No optimal PSA Method to explain variability in multivariate macrofaunal distribution has been identified nor have the effects of changing sampling strategy been examined. Here, we use benthic macrofaunal and PSA grabs from two embayments in the south of Ireland. Four frequently used PSA Methods and two common sampling strategies are applied. A combination of laser particle sizing and wet/dry sieving without peroxide pre-treatment to remove organics was identified as the optimal Method for explaining macrofaunal distributions. ES classifications and EUNIS sediment classification were robust to changes in PSA Method. Fauna and PSA samples returned from the same grab sample significantly decreased macrofaunal variance explained by PSA and caused ES to be classified as lower. Employing the optimal PSA Method and sampling strategy will improve benthic monitoring. © 2012 Elsevier Ltd.
Molecular classification of non-invasive breast lesions for personalised therapy and chemoprevention
Abstract:
Breast cancer screening has led to a dramatic increase in the detection of pre-invasive breast lesions. While mastectomy is almost guaranteed to treat the disease, more conservative approaches could be as effective if patients can be stratified based on risk of co-existing or recurrent invasive disease. Here we use a range of biomarkers to interrogate and classify purely non-invasive lesions (PNL) and those with co-existing invasive breast cancer (CEIN). Apart from Ductal Carcinoma In Situ (DCIS), relative homogeneity is observed. DCIS contained a greater spread of molecular subtypes. Interestingly, high expression of p-mTOR was observed in all PNL with lower expression in DCIS and invasive carcinoma while the opposite expression pattern was observed for TOP2A. Comparing PNL with CEIN, we have identified p53 and Ki67 as predictors of CEIN with a combined PPV and NPV of 90.48% and 43.3% respectively. Furthermore, HER2 expression showed the best concordance between DCIS and its invasive counterpart. We propose that these biomarkers can be used to improve the management of patients with pre-invasive breast lesions following further validation and clinical trials. p53 and Ki67 could be used to stratify patients into low and high-risk groups for co-existing disease. Knowledge of expression of more actionable targets such as HER2 or TOP2A can be used to design chemoprevention or neo-adjuvant strategies. Increased knowledge of the molecular profile of pre-invasive lesions can only serve to enhance our understanding of the disease and, in the era of personalised medicine, bring us closer to improving breast cancer care.
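For readers less familiar with the two predictive values quoted above, the following sketch shows how PPV and NPV are computed from confusion-matrix counts. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study's data.

```python
def ppv_npv(tp, fp, tn, fn):
    """Return (PPV, NPV) from confusion-matrix counts.

    PPV = TP / (TP + FP): share of marker-positive cases that truly have the condition.
    NPV = TN / (TN + FN): share of marker-negative cases that truly lack it.
    """
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical counts, for illustration only
ppv, npv = ppv_npv(tp=8, fp=2, tn=6, fn=4)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

A high PPV paired with a low NPV means that a positive marker result is informative, while a negative result does little to rule out co-existing disease.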