9 results for Big data analytics
at Université de Lausanne, Switzerland
Abstract:
Given the rapid increase of species with a sequenced genome, the need to identify orthologous genes between them has emerged as a central bioinformatics task. Many different methods exist for orthology detection, which makes it difficult to decide which one to choose for a particular application. Here, we review the latest developments and issues in the orthology field, and summarize the most recent results reported at the third 'Quest for Orthologs' meeting. We focus on community efforts such as the adoption of reference proteomes, standard file formats and benchmarking. Progress in these areas is good, and they are already beneficial to both orthology consumers and providers. However, a major current issue is that the massive increase in complete proteomes poses computational challenges to many of the ortholog database providers, as most orthology inference algorithms scale at least quadratically with the number of proteomes. The Quest for Orthologs consortium is an open community with a number of working groups that join efforts to enhance various aspects of orthology analysis, such as defining standard formats and datasets, documenting community resources and benchmarking. AVAILABILITY AND IMPLEMENTATION: All such materials are available at http://questfororthologs.org. CONTACT: erik.sonnhammer@scilifelab.se or c.dessimoz@ucl.ac.uk.
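To make the quadratic scaling mentioned above concrete, the following minimal sketch counts the proteome pairs an all-against-all comparison would have to process. It assumes, as a simplification not stated in the abstract, that a method compares every unordered pair of proteomes exactly once; the function name `pairwise_comparisons` is hypothetical.

```python
def pairwise_comparisons(n_proteomes: int) -> int:
    """Number of unordered proteome pairs in an all-against-all comparison.

    Assumption (for illustration only): each pair is compared exactly once.
    """
    if n_proteomes < 0:
        raise ValueError("n_proteomes must be non-negative")
    return n_proteomes * (n_proteomes - 1) // 2


# A tenfold increase in proteomes gives roughly a hundredfold increase in pairs.
for n in (10, 100, 1000):
    print(n, pairwise_comparisons(n))
```

Under this assumption, the pair count grows as n(n-1)/2, which is why doubling the number of proteomes roughly quadruples the work.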
Abstract:
The emergence of powerful new technologies, the existence of large quantities of data, and increasing demands for the extraction of added value from these technologies and data have created a number of significant challenges for those charged with both corporate and information technology management. The possibilities are great, the expectations high, and the risks significant. Organisations seeking to employ cloud technologies and exploit the value of the data to which they have access, be this in the form of "Big Data" available from different external sources or data held within the organisation, in structured or unstructured formats, need to understand the risks involved in such activities. Data owners have responsibilities towards the subjects of the data and must also, frequently, demonstrate that they are in compliance with current standards, laws and regulations. This thesis sets out to explore the nature of the technologies that organisations might utilise, identify the most pertinent constraints and risks, and propose a framework for the management of data from discovery to external hosting that will allow the most significant risks to be managed through the definition, implementation, and performance of appropriate internal control activities.
Abstract:
Background: The purpose of the work reported here is to test reliable molecular profiles using routinely processed formalin-fixed paraffin-embedded (FFPE) tissues from participants of the clinical trial BIG 1-98 with a median follow-up of 60 months. Methods: RNA from fresh frozen (FF) and FFPE tumor samples of 82 patients was used for quality control, and independent FFPE tissues of 342 postmenopausal participants of BIG 1-98 with ER-positive cancer were analyzed by measuring prospectively selected genes and computing scores representing the functions of the estrogen receptor (eight genes, ER_8), the progesterone receptor (five genes, PGR_5), Her2 (two genes, HER2_2), and proliferation (ten genes, PRO_10) by quantitative reverse transcription PCR (qRT-PCR) on TaqMan Low Density Arrays. Molecular scores were computed for each category, and the ER_8, PGR_5, HER2_2, and PRO_10 scores were combined into a RISK_25 score. Results: Pearson correlation coefficients between FF- and FFPE-derived scores were at least 0.94, and high concordance was observed between molecular scores and immunohistochemical data. The HER2_2, PGR_5, PRO_10 and RISK_25 scores were significant predictors of disease-free survival (DFS) in univariate Cox proportional hazard regression. PRO_10 and RISK_25 scores predicted DFS in patients with histological grade II breast cancer and in lymph node-positive disease. The PRO_10 and PGR_5 scores were independent predictors of DFS in multivariate Cox regression models incorporating clinical risk indicators; PRO_10 outperformed Ki-67 labeling index in multivariate Cox proportional hazard analyses. Conclusions: Scores representing the endocrine responsiveness and proliferation status of breast cancers were developed from gene expression analyses based on RNA derived from FFPE tissues. 
The validation of the molecular scores with tumor samples of participants of the BIG 1-98 trial demonstrates that such scores can serve as independent prognostic factors to estimate disease-free survival (DFS) in postmenopausal patients with estrogen receptor-positive breast cancer.
Abstract:
Letrozole, an aromatase inhibitor, is ineffective in the presence of ovarian estrogen production. Two subpopulations of apparently postmenopausal women might derive reduced benefit from letrozole due to residual or returning ovarian activity: younger women (who have the potential for residual subclinical ovarian estrogen production), and those with chemotherapy-induced menopause who may experience return of ovarian function. In these situations tamoxifen may be preferable to an aromatase inhibitor. Among 4,922 patients allocated to the monotherapy arms (5 years of letrozole or tamoxifen) in the BIG 1-98 trial we identified two relevant subpopulations: patients with potential residual ovarian function, defined as having natural menopause, treated without adjuvant or neoadjuvant chemotherapy and age ≤ 55 years (n = 641); and those with chemotherapy-induced menopause (n = 105). Neither of the subpopulations examined showed treatment effects differing from the trial population as a whole (interaction P values are 0.23 and 0.62, respectively). Indeed, both among the 641 patients aged ≤ 55 years with natural menopause and no chemotherapy (HR 0.77 [0.51, 1.16]) and among the 105 patients with chemotherapy-induced menopause (HR 0.51 [0.19, 1.39]), the disease-free survival (DFS) point estimate favoring letrozole was marginally more beneficial than in the trial as a whole (HR 0.84 [0.74, 0.95]). Contrary to our initial concern, DFS results for young postmenopausal patients who did not receive chemotherapy and patients with chemotherapy-induced menopause parallel the letrozole benefit seen in the BIG 1-98 population as a whole. These data support the use of letrozole even in such patients.
Abstract:
In recent years there has been explosive growth in the development of adaptive, data-driven methods. One efficient data-driven approach is based on statistical learning theory (SLT) (Vapnik 1998). The theory rests on the Structural Risk Minimisation (SRM) principle and has a solid statistical background. When applying SRM, we try not only to reduce the training error (that is, to fit the available data with a model) but also to reduce the complexity of the model and to reduce the generalisation error. Many nonlinear learning procedures recently developed in neural networks and statistics can be understood and interpreted in terms of the structural risk minimisation inductive principle. A recent methodology based on SRM is the Support Vector Machine (SVM). At present SLT is still under intensive development and SVMs are finding new areas of application (www.kernel-machines.org). SVMs produce robust, nonlinear data models with excellent generalisation abilities, which is very important for both monitoring and forecasting. SVMs perform particularly well when the input space is high dimensional and the training data set is not big enough to develop a corresponding nonlinear model. Moreover, SVMs use only support vectors to derive decision boundaries. This opens a way to sampling optimisation, estimation of noise in data, quantification of data redundancy, etc. A presentation of SVMs for spatially distributed data is given in (Kanevski and Maignan 2004).
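The trade-off described above, between fitting the training data and limiting model complexity, can be illustrated with a small linear soft-margin SVM. The sketch below is not taken from the cited works: it assumes a toy 2-D data set with made-up values, a hinge-loss objective with an L2 complexity penalty, and plain full-batch subgradient descent as the optimiser (real SVM libraries typically use dedicated solvers). All names, parameter values, and the 1.1 margin threshold are hypothetical.

```python
# Toy, linearly separable 2-D data (hypothetical values, for illustration only).
X = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (4.0, 4.0), (5.0, 4.0), (4.0, 5.0)]
y = [-1, -1, -1, 1, 1, 1]


def decision(w, b, x):
    """Linear decision function w . x + b."""
    return w[0] * x[0] + w[1] * x[1] + b


def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=5000):
    """Full-batch subgradient descent on the soft-margin objective
    (lam / 2) * ||w||^2 + mean(max(0, 1 - y_i * (w . x_i + b))).

    The first term penalises model complexity; the second is the training
    (hinge) error. lam, lr and epochs are illustrative choices.
    """
    w, b, n = [0.0, 0.0], 0.0, len(X)
    for _ in range(epochs):
        grad_w = [lam * w[0], lam * w[1]]  # gradient of the complexity term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * decision(w, b, xi) < 1:  # margin violated: hinge is active
                grad_w[0] -= yi * xi[0] / n
                grad_w[1] -= yi * xi[1] / n
                grad_b -= yi / n
        w = [w[0] - lr * grad_w[0], w[1] - lr * grad_w[1]]
        b -= lr * grad_b
    return w, b


w, b = train_linear_svm(X, y)
predictions = [1 if decision(w, b, x) >= 0 else -1 for x in X]
margins = [yi * decision(w, b, xi) for xi, yi in zip(X, y)]
# Points whose functional margin is close to 1 act as (approximate) support vectors.
near_margin = [x for x, m in zip(X, margins) if m <= 1.1]

print("w =", w, "b =", b)
print("predictions:", predictions)
print("points near the margin:", near_margin)
```

Only the points lying on or near the margin influence the final boundary, which is the property the abstract refers to when it says that SVMs use only support vectors to derive decision boundaries.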
Abstract:
BACKGROUND: Postmenopausal women with hormone receptor-positive early breast cancer have persistent, long-term risk of breast-cancer recurrence and death. Therefore, trials assessing endocrine therapies for this patient population need extended follow-up. We present an update of efficacy outcomes in the Breast International Group (BIG) 1-98 study at 8·1 years median follow-up. METHODS: BIG 1-98 is a randomised, phase 3, double-blind trial of postmenopausal women with hormone receptor-positive early breast cancer that compares 5 years of tamoxifen or letrozole monotherapy, or sequential treatment with 2 years of one of these drugs followed by 3 years of the other. Randomisation was done with permuted blocks, and stratified according to the two-arm or four-arm randomisation option, participating institution, and chemotherapy use. Patients, investigators, data managers, and medical reviewers were masked. The primary efficacy endpoint was disease-free survival (events were invasive breast cancer relapse, second primaries [contralateral breast and non-breast], or death without previous cancer event). Secondary endpoints were overall survival, distant recurrence-free interval (DRFI), and breast cancer-free interval (BCFI). The monotherapy comparison included patients randomly assigned to tamoxifen or letrozole for 5 years. In 2005, after a significant disease-free survival benefit was reported for letrozole as compared with tamoxifen, a protocol amendment facilitated the crossover to letrozole of patients who were still receiving tamoxifen alone; Cox models and Kaplan-Meier estimates with inverse probability of censoring weighting (IPCW) are used to account for selective crossover to letrozole of patients (n=619) in the tamoxifen arm. 
Comparison of sequential treatments to letrozole monotherapy included patients enrolled and randomly assigned to letrozole for 5 years, letrozole for 2 years followed by tamoxifen for 3 years, or tamoxifen for 2 years followed by letrozole for 3 years. Treatment has ended for all patients and detailed safety results for adverse events that occurred during the 5 years of treatment have been reported elsewhere. Follow-up is continuing for those enrolled in the four-arm option. BIG 1-98 is registered at ClinicalTrials.gov, number NCT00004205. FINDINGS: 8010 patients were included in the trial, with a median follow-up of 8·1 years (range 0-12·4). 2459 were randomly assigned to monotherapy with tamoxifen for 5 years and 2463 to monotherapy with letrozole for 5 years. In the four-arm option of the trial, 1546 were randomly assigned to letrozole for 5 years, 1548 to tamoxifen for 5 years, 1540 to letrozole for 2 years followed by tamoxifen for 3 years, and 1548 to tamoxifen for 2 years followed by letrozole for 3 years. At a median follow-up of 8·7 years from randomisation (range 0-12·4), letrozole monotherapy was significantly better than tamoxifen, whether by IPCW or intention-to-treat analysis (IPCW disease-free survival HR 0·82 [95% CI 0·74-0·92], overall survival HR 0·79 [0·69-0·90], DRFI HR 0·79 [0·68-0·92], BCFI HR 0·80 [0·70-0·92]; intention-to-treat disease-free survival HR 0·86 [0·78-0·96], overall survival HR 0·87 [0·77-0·999], DRFI HR 0·86 [0·74-0·998], BCFI HR 0·86 [0·76-0·98]). At a median follow-up of 8·0 years from randomisation (range 0-11·2) for the comparison of the sequential groups with letrozole monotherapy, there were no statistically significant differences in any of the four endpoints for either sequence. 
8-year intention-to-treat estimates (each with SE ≤1·1%) for letrozole monotherapy, letrozole followed by tamoxifen, and tamoxifen followed by letrozole were 78·6%, 77·8%, 77·3% for disease-free survival; 87·5%, 87·7%, 85·9% for overall survival; 89·9%, 88·7%, 88·1% for DRFI; and 86·1%, 85·3%, 84·3% for BCFI. INTERPRETATION: For postmenopausal women with endocrine-responsive early breast cancer, a reduction in breast cancer recurrence and mortality is obtained by letrozole monotherapy when compared with tamoxifen monotherapy. Sequential treatments involving tamoxifen and letrozole do not improve outcome compared with letrozole monotherapy, but might be useful strategies when considering an individual patient's risk of recurrence and treatment tolerability. FUNDING: Novartis, United States National Cancer Institute, International Breast Cancer Study Group.
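The abstract above reports Kaplan-Meier estimates of survival endpoints. As a reminder of how the basic estimator works, here is a minimal sketch of the unweighted product-limit calculation on made-up follow-up times. It is not the trial's analysis: the inverse probability of censoring weighting (IPCW) step is deliberately omitted, and the function name `kaplan_meier` and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Unweighted Kaplan-Meier (product-limit) estimate.

    times:  follow-up duration for each patient
    events: 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving  # events and censored patients both leave the risk set
    return curve


# Hypothetical toy data: five patients, two of them censored.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(f"t = {t}: S(t) = {s:.2f}")
```

Censored patients contribute to the risk set until they leave it but never trigger a drop in the curve; IPCW builds on this by reweighting the patients who remain under observation, which this sketch does not attempt.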
Abstract:
BACKGROUND: Pathological complete response (pCR) following chemotherapy is strongly associated with both breast cancer subtype and long-term survival. Within a phase III neoadjuvant chemotherapy trial, we sought to determine whether the prognostic implications of pCR, TP53 status and treatment arm (taxane versus non-taxane) differed between intrinsic subtypes. PATIENTS AND METHODS: Patients were randomized to receive either six cycles of anthracycline-based chemotherapy or three cycles of docetaxel then three cycles of epirubicin/docetaxel (T-ET). pCR was defined as no evidence of residual invasive cancer (or very few scattered tumour cells) in primary tumour and lymph nodes. We used a simplified intrinsic subtypes classification, as suggested by the 2011 St Gallen consensus. Interactions between pCR, TP53 status, treatment arm and intrinsic subtype on event-free survival (EFS), distant metastasis-free survival (DMFS) and overall survival (OS) were studied using landmark and two-step multivariate analyses. RESULTS: Sufficient data for pCR analyses were available in 1212 (65%) of 1856 patients randomized. pCR occurred in 222 of 1212 (18%) patients: 37 of 496 (7.5%) luminal A, 22 of 147 (15%) luminal B/HER2 negative, 51 of 230 (22%) luminal B/HER2 positive, 43 of 118 (36%) HER2 positive/non-luminal, 69 of 221 (31%) triple negative (TN). The prognostic effect of pCR on EFS did not differ between subtypes and was an independent predictor for better EFS [hazard ratio (HR) = 0.40, P < 0.001 in favour of pCR], DMFS (HR = 0.32, P < 0.001) and OS (HR = 0.32, P < 0.001). Chemotherapy arm was an independent predictor only for EFS (HR = 0.73, P = 0.004 in favour of T-ET). The interaction between TP53, intrinsic subtypes and survival outcomes only approached statistical significance for EFS (P = 0.1). CONCLUSIONS: pCR is an independent predictor of favourable clinical outcomes in all molecular subtypes in a two-step multivariate analysis. 
CLINICALTRIALS.GOV: EORTC 10994/BIG 1-00 trial, registration number NCT00017095.
Abstract:
BACKGROUND: Aromatase inhibitors provide superior disease control when compared with tamoxifen as adjuvant therapy for postmenopausal women with endocrine-responsive early breast cancer. PURPOSE: To present the design, history, and analytic challenges of the Breast International Group (BIG) 1-98 trial: an international, multicenter, randomized, double-blind, phase-III study comparing the aromatase inhibitor letrozole with tamoxifen in this clinical setting. METHODS: From 1998 to 2003, BIG 1-98 enrolled 8028 women to receive monotherapy with either tamoxifen or letrozole for 5 years, or sequential therapy of 2 years of one agent followed by 3 years of the other. Randomization to one of four treatment groups permitted two complementary analyses to be conducted several years apart. The first, reported in 2005, provided a head-to-head comparison of letrozole versus tamoxifen. Statistical power was increased by an enriched design, which included patients who were assigned sequential treatments until the time of the treatment switch. The second, reported in late 2008, used a conditional landmark approach to test the hypothesis that switching endocrine agents at approximately 2 years from randomization for patients who are disease-free is superior to continuing with the original agent. RESULTS: The 2005 analysis showed the superiority of letrozole compared with tamoxifen. The patients who were assigned tamoxifen alone were unblinded and offered the opportunity to switch to letrozole. Results from other trials increased the clinical relevance of the question of whether to start treatment with letrozole or tamoxifen, and analysis plans were expanded to evaluate sequential versus single-agent strategies from randomization. LIMITATIONS: Due to the unblinding of patients assigned tamoxifen alone, analysis of updated data will require ascertainment of the influence of selective crossover from tamoxifen to letrozole. 
CONCLUSIONS: BIG 1-98 is an example of an enriched design, involving complementary analyses addressing different questions several years apart, and subject to evolving analytic plans influenced by new data that emerge over time.