5 resultados para k-nearest neighbours estimation


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Background and aims: Machine learning techniques for the text mining of cancer-related clinical documents have not been sufficiently explored. Here some techniques are presented for the pre-processing of free-text breast cancer pathology reports, with the aim of facilitating the extraction of information relevant to cancer staging.

Materials and methods: The first technique was implemented using the freely available software RapidMiner to classify the reports according to their general layout: ‘semi-structured’ and ‘unstructured’. The second technique was developed using the open source language engineering framework GATE and aimed at the prediction of chunks of the report text containing information pertaining to the cancer morphology, the tumour size, its hormone receptor status and the number of positive nodes. The classifiers were trained and tested respectively on sets of 635 and 163 manually classified or annotated reports, from the Northern Ireland Cancer Registry.

Results: The best result of 99.4% accuracy – which included only one semi-structured report predicted as unstructured – was produced by the layout classifier with the k nearest algorithm, using the binary term occurrence word vector type with stopword filter and pruning. For chunk recognition, the best results were found using the PAUM algorithm with the same parameters for all cases, except for the prediction of chunks containing cancer morphology. For semi-structured reports the performance ranged from 0.97 to 0.94 and from 0.92 to 0.83 in precision and recall, while for unstructured reports performance ranged from 0.91 to 0.64 and from 0.68 to 0.41 in precision and recall. Poor results were found when the classifier was trained on semi-structured reports but tested on unstructured.

Conclusions: These results show that it is possible and beneficial to predict the layout of reports and that the accuracy of prediction of which segments of a report may contain certain information is sensitive to the report layout and the type of information sought.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The main objective of this work was to develop a novel dimensionality reduction technique as a part of an integrated pattern recognition solution capable of identifying adulterants such as hazelnut oil in extra virgin olive oil at low percentages based on spectroscopic chemical fingerprints. A novel Continuous Locality Preserving Projections (CLPP) technique is proposed which allows the modelling of the continuous nature of the produced in-house admixtures as data series instead of discrete points. The maintenance of the continuous structure of the data manifold enables the better visualisation of this examined classification problem and facilitates the more accurate utilisation of the manifold for detecting the adulterants. The performance of the proposed technique is validated with two different spectroscopic techniques (Raman and Fourier transform infrared, FT-IR). In all cases studied, CLPP accompanied by k-Nearest Neighbors (kNN) algorithm was found to outperform any other state-of-the-art pattern recognition techniques.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A modified UNIFAC–VISCO group contribution method was developed for the correlation and prediction of viscosity of ionic liquids as a function of temperature at 0.1 MPa. In this original approach, cations and anions were regarded as peculiar molecular groups. The significance of this approach comes from the ability to calculate the viscosity of mixtures of ionic liquids as well as pure ionic liquids. Binary interaction parameters for selected cations and anions were determined by fitting the experimental viscosity data available in literature for selected ionic liquids. The temperature dependence on the viscosity of the cations and anions were fitted to a Vogel–Fulcher–Tamman behavior. Binary interaction parameters and VFT type fitting parameters were then used to determine the viscosity of pure and mixtures of ionic liquids with different combinations of cations and anions to ensure the validity of the prediction method. Consequently, the viscosities of binary ionic liquid mixtures were then calculated by using this prediction method. In this work, the viscosity data of pure ionic liquids and of binary mixtures of ionic liquids are successfully calculated from 293.15 K to 363.15 K at 0.1 MPa. All calculated viscosity data showed excellent agreement with experimental data with a relative absolute average deviation lower than 1.7%.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Li-ion batteries have been widely used in electric vehicles, and battery internal state estimation plays an important role in the battery management system. However, it is technically challenging, in particular, for the estimation of the battery internal temperature and state-ofcharge (SOC), which are two key state variables affecting the battery performance. In this paper, a novel method is proposed for realtime simultaneous estimation of these two internal states, thus leading to a significantly improved battery model for realtime SOC estimation. To achieve this, a simplified battery thermoelectric model is firstly built, which couples a thermal submodel and an electrical submodel. The interactions between the battery thermal and electrical behaviours are captured, thus offering a comprehensive description of the battery thermal and electrical behaviour. To achieve more accurate internal state estimations, the model is trained by the simulation error minimization method, and model parameters are optimized by a hybrid optimization method combining a meta-heuristic algorithm and the least square approach. Further, timevarying model parameters under different heat dissipation conditions are considered, and a joint extended Kalman filter is used to simultaneously estimate both the battery internal states and time-varying model parameters in realtime. Experimental results based on the testing data of LiFePO4 batteries confirm the efficacy of the proposed method.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The viscosity of ionic liquids (ILs) has been modeled as a function of temperature and at atmospheric pressure using a new method based on the UNIFAC–VISCO method. This model extends the calculations previously reported by our group (see Zhao et al. J. Chem. Eng. Data 2016, 61, 2160–2169) which used 154 experimental viscosity data points of 25 ionic liquids for regression of a set of binary interaction parameters and ion Vogel–Fulcher–Tammann (VFT) parameters. Discrepancies in the experimental data of the same IL affect the quality of the correlation and thus the development of the predictive method. In this work, mathematical gnostics was used to analyze the experimental data from different sources and recommend one set of reliable data for each IL. These recommended data (totally 819 data points) for 70 ILs were correlated using this model to obtain an extended set of binary interaction parameters and ion VFT parameters, with a regression accuracy of 1.4%. In addition, 966 experimental viscosity data points for 11 binary mixtures of ILs were collected from literature to establish this model. All the binary data consist of 128 training data points used for the optimization of binary interaction parameters and 838 test data points used for the comparison of the pure evaluated values. The relative average absolute deviation (RAAD) for training and test is 2.9% and 3.9%, respectively.