2 results for Clinical Classification


Relevance:

30.00%

Publisher:

Abstract:

Background and aims: Machine learning techniques for the text mining of cancer-related clinical documents have not been sufficiently explored. Here, two techniques are presented for pre-processing free-text breast cancer pathology reports, with the aim of facilitating the extraction of information relevant to cancer staging.

Materials and methods: The first technique was implemented using the freely available software RapidMiner to classify the reports according to their general layout, as either ‘semi-structured’ or ‘unstructured’. The second technique was developed using the open source language engineering framework GATE and aimed to predict which chunks of the report text contain information on the cancer morphology, the tumour size, the hormone receptor status and the number of positive nodes. The classifiers were trained on 635 and tested on 163 manually classified or annotated reports from the Northern Ireland Cancer Registry.

Results: The layout classifier produced its best result, 99.4% accuracy (with only one semi-structured report predicted as unstructured), using the k-nearest neighbours algorithm with the binary term occurrence word vector type, a stopword filter and pruning. For chunk recognition, the best results were found using the PAUM algorithm with the same parameters for all cases, except for the prediction of chunks containing cancer morphology. For semi-structured reports, precision ranged from 0.94 to 0.97 and recall from 0.83 to 0.92, while for unstructured reports precision ranged from 0.64 to 0.91 and recall from 0.41 to 0.68. Poor results were found when the classifier was trained on semi-structured reports but tested on unstructured ones.
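As an illustration of the layout classification step, the following is a minimal Python sketch of k-nearest neighbours voting over binary term occurrence vectors, with a stopword filter and document-frequency pruning. It is not the RapidMiner process used in the study: the stopword list, the toy reports, the pruning threshold `min_df`, the value of `k` and the choice of Jaccard similarity are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical stopword list; the study's actual filter is not specified.
STOPWORDS = {"the", "a", "of", "and", "is", "in", "was", "with"}


def tokenize(text):
    """Lowercase, strip surrounding punctuation, and drop stopwords."""
    tokens = (t.strip(".,:;()").lower() for t in text.split())
    return {t for t in tokens if t and t not in STOPWORDS}


def build_vocabulary(docs, min_df=2):
    """Pruning (assumed form): keep terms found in at least min_df documents."""
    doc_freq = Counter(term for doc in docs for term in tokenize(doc))
    return {term for term, count in doc_freq.items() if count >= min_df}


def binary_vector(text, vocabulary):
    """Binary term occurrence: record only whether each term is present."""
    return tokenize(text) & vocabulary


def jaccard(a, b):
    """Set-overlap similarity between two binary vectors (an assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def knn_predict(text, train, vocabulary, k=3):
    """Majority vote among the k most similar training reports."""
    query = binary_vector(text, vocabulary)
    scored = sorted(
        ((jaccard(query, binary_vector(doc, vocabulary)), label)
         for doc, label in train),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Invented toy reports, not data from the Northern Ireland Cancer Registry.
TRAIN = [
    ("Tumour size: 22 mm. Grade: 2. Nodes positive: 1", "semi-structured"),
    ("Tumour size: 15 mm. Grade: 3. Nodes positive: 0", "semi-structured"),
    ("Grade: 1. Tumour size: 9 mm. Nodes positive: 2", "semi-structured"),
    ("Sections show an invasive ductal carcinoma measuring about 22 mm",
     "unstructured"),
    ("Sections show invasive lobular carcinoma and one node contains "
     "metastatic carcinoma", "unstructured"),
    ("The sections show a carcinoma measuring 15 mm with no nodal involvement",
     "unstructured"),
]
VOCABULARY = build_vocabulary([doc for doc, _ in TRAIN])

if __name__ == "__main__":
    print(knn_predict("Tumour size: 30 mm. Grade: 2. Nodes positive: 4",
                      TRAIN, VOCABULARY))
    print(knn_predict("Sections show an invasive carcinoma measuring 30 mm",
                      TRAIN, VOCABULARY))
```

In this sketch the field labels of a semi-structured report ("size", "grade", "nodes") survive pruning and dominate the overlap, which is one plausible reason a presence-only representation can separate the two layouts.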

Conclusions: These results show that it is possible and beneficial to predict the layout of reports, and that the accuracy with which report segments containing specific information can be predicted is sensitive to both the report layout and the type of information sought.

Relevance:

30.00%

Publisher:

Abstract:

Hemizygous deletion of 17p (del(17p)) has been identified as a variable associated with poor prognosis in myeloma, although its impact in the context of thalidomide therapy is not well described. The clinical outcome of 85 myeloma patients with del(17p) treated in a clinical trial incorporating both conventional and thalidomide-based induction therapies was examined. The clinical impact of deletion, low expression, and mutation of TP53 was also determined. Patients with del(17p) did not have inferior response rates compared to patients without del(17p), but, despite this, del(17p) was associated with impaired overall survival (OS) (median OS 26.6 vs. 48.5 months, P < 0.001). Within the del(17p) group, thalidomide induction therapy was associated with improved response rates compared to conventional therapy, but there was no impact on OS. Thalidomide maintenance was associated with impaired OS, although our analysis suggests that this effect may have been due to confounding variables. A minimally deleted region on 17p13.1 involving 17 genes was identified, of which only TP53 and SAT2 were underexpressed. TP53 was mutated in <1% of patients without del(17p) and in 27% of patients with del(17p). The higher TP53 mutation rate in samples with del(17p) suggests a role for TP53 in these clinical outcomes. In conclusion, del(17p) defined a patient group associated with short survival in myeloma, and although thalidomide induction therapy was associated with improved response rates, it did not impact OS, suggesting that alternative therapeutic strategies are required for this group. (C) 2011 Wiley-Liss, Inc.