4 results for Test data generation
in Archivo Digital para la Docencia y la Investigación - Repositorio Institucional de la Universidad del País Vasco
Abstract:
Revised: 2006-11
Abstract:
Published as an article in: Studies in Nonlinear Dynamics & Econometrics, 2004, vol. 8, issue 3, article 6.
Abstract:
Background: With the recent availability of low-toxicity biological and targeted therapies, evidence is emerging of a subpopulation of cancer patients with long-term survival. We studied an unselected population with advanced lung cancer to look for evidence of multimodality in the survival distribution and to estimate the proportion of long-term survivors. Methods: We used survival data from 4944 patients with non-small-cell lung cancer (NSCLC), stages IIIb-IV at diagnosis, registered in the National Cancer Registry of Cuba (NCRC) between January 1998 and December 2006. We fitted a one-component survival model and two-component mixture models to identify short- and long-term survivors. The Bayesian information criterion was used for model selection. Results: For all of the selected parametric distributions, the two-component model gave the best fit. The short-term survival population (median survival of almost 4 months) represented 64% of patients. The long-term survival population included 35% of patients and showed a median survival of around 12 months. None of the short-term survival patients was still alive at month 24, while 10% of the long-term survival patients died afterwards. Conclusions: There is a subgroup showing long-term evolution among patients with advanced lung cancer. As survival rates continue to improve with the new generation of therapies, prognostic models that account for short- and long-term survival subpopulations should be considered in clinical research.
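The model-selection step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes exponential components, ignores censoring, and runs on synthetic survival times generated inside the script. The function names and the simulated parameters are hypothetical, chosen only to show how the Bayesian information criterion compares a one-component model with a two-component mixture.

```python
import math
import random


def exp_pdf(t, rate):
    """Density of an exponential distribution with the given rate."""
    return rate * math.exp(-rate * t)


def fit_one_component(times):
    """Maximum-likelihood fit of a single exponential (1 parameter)."""
    rate = len(times) / sum(times)
    loglik = sum(math.log(exp_pdf(t, rate)) for t in times)
    return rate, loglik


def fit_two_component(times, n_iter=300):
    """EM fit of a two-component exponential mixture (3 parameters)."""
    n = len(times)
    mean = sum(times) / n
    # Starting values: one faster and one slower component (an assumption).
    w, r1, r2 = 0.5, 2.0 / mean, 0.5 / mean
    for _ in range(n_iter):
        # E-step: probability that each observation belongs to component 1.
        resp = []
        for t in times:
            a = w * exp_pdf(t, r1)
            b = (1.0 - w) * exp_pdf(t, r2)
            resp.append(a / (a + b))
        # M-step: update the mixing weight and both rates.
        s = sum(resp)
        w = s / n
        r1 = s / sum(g * t for g, t in zip(resp, times))
        r2 = (n - s) / sum((1.0 - g) * t for g, t in zip(resp, times))
    loglik = sum(
        math.log(w * exp_pdf(t, r1) + (1.0 - w) * exp_pdf(t, r2)) for t in times
    )
    return (w, r1, r2), loglik


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; lower values indicate a better model."""
    return n_params * math.log(n_obs) - 2.0 * loglik


# Synthetic data only: two well-separated subpopulations (means 2 and 30).
rng = random.Random(0)
times = [
    rng.expovariate(1 / 2.0) if rng.random() < 0.65 else rng.expovariate(1 / 30.0)
    for _ in range(2000)
]

_, loglik_one = fit_one_component(times)
(weight, rate_short, rate_long), loglik_two = fit_two_component(times)

bic_one = bic(loglik_one, 1, len(times))
bic_two = bic(loglik_two, 3, len(times))
print("BIC one-component:", round(bic_one, 1))
print("BIC two-component:", round(bic_two, 1))
```

A real analysis of registry data would need a likelihood that handles censored observations and would compare several parametric families, as the abstract indicates.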
Abstract:
In one-class classification (OCC), one of the classes, the target class, has to be distinguished from all other possible objects, which are considered nontargets. This situation arises in many biomedical problems, for example in diagnosis, image-based tumor recognition, or the analysis of electrocardiogram data. In this paper, an approach to OCC based on a typicality test is experimentally compared with reference state-of-the-art OCC techniques (Gaussian, mixture of Gaussians, naive Parzen, Parzen, and support vector data description) using biomedical data sets. We evaluate the procedures on twelve experimental data sets whose data are not necessarily continuous. As there are few benchmark data sets for one-class classification, all data sets considered in the evaluation have multiple classes. Each class in turn is considered as the target class, and the units in the other classes are considered as new units to be classified. The results of the comparison show the good performance of the typicality approach, which is applicable to high-dimensional data. It can also be used for any kind of data (continuous, discrete, or nominal), whereas applying the state-of-the-art approaches is not straightforward when nominal variables are present.
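The evaluation protocol, in which each class in turn is treated as the target class, can be sketched as follows. This is an illustrative sketch only: it uses a simple one-dimensional Gaussian acceptance rule as a stand-in classifier, not the typicality test from the paper, and the toy data, the threshold `z_max`, and all function names are hypothetical.

```python
import statistics


def fit_gaussian_occ(target_values, z_max=2.0):
    """Fit a 1-D Gaussian to the target class and return an acceptance rule."""
    mu = statistics.mean(target_values)
    sigma = statistics.stdev(target_values)

    def accepts(x):
        # Accept x as a target if it lies within z_max standard deviations.
        return abs(x - mu) <= z_max * sigma

    return accepts


def evaluate_each_class_as_target(data_by_class, z_max=2.0):
    """For each class used as target, return the fraction of other-class units rejected."""
    results = {}
    for target, target_values in data_by_class.items():
        accepts = fit_gaussian_occ(target_values, z_max)
        # Units from all remaining classes play the role of new units to classify.
        others = [
            x
            for label, values in data_by_class.items()
            if label != target
            for x in values
        ]
        rejected = sum(1 for x in others if not accepts(x))
        results[target] = rejected / len(others)
    return results


# Toy multi-class data set (made up for illustration).
toy = {
    "a": [1.0, 1.2, 0.9, 1.1, 1.0],
    "b": [5.0, 5.2, 4.9, 5.1],
    "c": [9.0, 9.3, 8.8, 9.1],
}
rejection_rates = evaluate_each_class_as_target(toy)
print(rejection_rates)
```

Because the classifier only ever sees the target class during fitting, the same loop works for any OCC method that exposes a fit step and an accept/reject decision.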