941 resultados para Automatic classification
Resumo:
This paper presents a Computer Aided Diagnosis (CAD) system that automatically classifies microcalcifications detected on digital mammograms into one of the five types proposed by Michele Le Gal, a classification scheme that allows radiologists to determine whether a breast tumor is malignant or not without the need for surgeries. The developed system uses a combination of wavelets and Artificial Neural Networks (ANN) and is executed on an Altera DE2-115 Development Kit, a kit containing a Field-Programmable Gate Array (FPGA) that allows the system to be smaller, cheaper and more energy efficient. Results have shown that the system was able to correctly classify 96.67% of test samples, which can be used as a second opinion by radiologists in breast cancer early diagnosis. (C) 2013 The Authors. Published by Elsevier B.V.
Resumo:
In this paper we presente a classification system that uses a combination of texture features from stromal regions: Haralick features and Local Binary Patterns (LBP) in wavelet domain. The system has five steps for classification of the tissues. First, the stromal regions were detected and extracted using segmentation techniques based on thresholding and RGB colour space. Second, the Wavelet decomposition was applied in the extracted regions to obtain the Wavelet coefficients. Third, the Haralick and LBP features were extracted from the coefficients. Fourth, relevant features were selected using the ANOVA statistical method. The classication (fifth step) was performed with Radial Basis Function (RBF) networks. The system was tested in 105 prostate images, which were divided into three groups of 35 images: normal, hyperplastic and cancerous. The system performance was evaluated using the area under the ROC curve and resulted in 0.98 for normal versus cancer, 0.95 for hyperplasia versus cancer and 0.96 for normal versus hyperplasia. Our results suggest that texture features can be used as discriminators for stromal tissues prostate images. Furthermore, the system was effective to classify prostate images, specially the hyperplastic class which is the most difficult type in diagnosis and prognosis.
Resumo:
This paper presents a neural network based technique for the classification of segments of road images into cracks and normal images. The density and histogram features are extracted. The features are passed to a neural network for the classification of images into images with and without cracks. Once images are classified into cracks and non-cracks, they are passed to another neural network for the classification of a crack type after segmentation. Some experiments were conducted and promising results were obtained. The selected results and a comparative analysis are included in this paper.
Resumo:
Report published in the Proceedings of the National Conference on "Education and Research in the Information Society", Plovdiv, May, 2014
Resumo:
Objective Death certificates provide an invaluable source for cancer mortality statistics; however, this value can only be realised if accurate, quantitative data can be extracted from certificates – an aim hampered by both the volume and variable nature of certificates written in natural language. This paper proposes an automatic classification system for identifying cancer related causes of death from death certificates. Methods Detailed features, including terms, n-grams and SNOMED CT concepts were extracted from a collection of 447,336 death certificates. These features were used to train Support Vector Machine classifiers (one classifier for each cancer type). The classifiers were deployed in a cascaded architecture: the first level identified the presence of cancer (i.e., binary cancer/nocancer) and the second level identified the type of cancer (according to the ICD-10 classification system). A held-out test set was used to evaluate the effectiveness of the classifiers according to precision, recall and F-measure. In addition, detailed feature analysis was performed to reveal the characteristics of a successful cancer classification model. Results The system was highly effective at identifying cancer as the underlying cause of death (F-measure 0.94). The system was also effective at determining the type of cancer for common cancers (F-measure 0.7). Rare cancers, for which there was little training data, were difficult to classify accurately (F-measure 0.12). Factors influencing performance were the amount of training data and certain ambiguous cancers (e.g., those in the stomach region). The feature analysis revealed a combination of features were important for cancer type classification, with SNOMED CT concept and oncology specific morphology features proving the most valuable. Conclusion The system proposed in this study provides automatic identification and characterisation of cancers from large collections of free-text death certificates. This allows organisations such as Cancer Registries to monitor and report on cancer mortality in a timely and accurate manner. In addition, the methods and findings are generally applicable beyond cancer classification and to other sources of medical text besides death certificates.
Resumo:
Objective To evaluate the effects of Optical Character Recognition (OCR) on the automatic cancer classification of pathology reports. Method Scanned images of pathology reports were converted to electronic free-text using a commercial OCR system. A state-of-the-art cancer classification system, the Medical Text Extraction (MEDTEX) system, was used to automatically classify the OCR reports. Classifications produced by MEDTEX on the OCR versions of the reports were compared with the classification from a human amended version of the OCR reports. Results The employed OCR system was found to recognise scanned pathology reports with up to 99.12% character accuracy and up to 98.95% word accuracy. Errors in the OCR processing were found to minimally impact on the automatic classification of scanned pathology reports into notifiable groups. However, the impact of OCR errors is not negligible when considering the extraction of cancer notification items, such as primary site, histological type, etc. Conclusions The automatic cancer classification system used in this work, MEDTEX, has proven to be robust to errors produced by the acquisition of freetext pathology reports from scanned images through OCR software. However, issues emerge when considering the extraction of cancer notification items.
Resumo:
Objective: To develop a system for the automatic classification of pathology reports for Cancer Registry notifications. Method: A two pass approach is proposed to classify whether pathology reports are cancer notifiable or not. The first pass queries pathology HL7 messages for known report types that are received by the Queensland Cancer Registry (QCR), while the second pass aims to analyse the free text reports and identify those that are cancer notifiable. Cancer Registry business rules, natural language processing and symbolic reasoning using the SNOMED CT ontology were adopted in the system. Results: The system was developed on a corpus of 500 histology and cytology reports (with 47% notifiable reports) and evaluated on an independent set of 479 reports (with 52% notifiable reports). Results show that the system can reliably classify cancer notifiable reports with a sensitivity, specificity, and positive predicted value (PPV) of 0.99, 0.95, and 0.95, respectively for the development set, and 0.98, 0.96, and 0.96 for the evaluation set. High sensitivity can be achieved at a slight expense in specificity and PPV. Conclusion: The system demonstrates how medical free-text processing enables the classification of cancer notifiable pathology reports with high reliability for potential use by Cancer Registries and pathology laboratories.
Resumo:
Human expert analyses are commonly used in bioacoustic studies and can potentially limit the reproducibility of these results. In this paper, a machine learning method is presented to statistically classify avian vocalizations. Automated approaches were applied to isolate bird songs from long field recordings, assess song similarities, and classify songs into distinct variants. Because no positive controls were available to assess the true classification of variants, multiple replicates of automatic classification of song variants were analyzed to investigate clustering uncertainty. The automatic classifications were more similar to the expert classifications than expected by chance. Application of these methods demonstrated the presence of discrete song variants in an island population of the New Zealand hihi (Notiomystis cincta). The geographic patterns of song variation were then revealed by integrating over classification replicates. Because this automated approach considers variation in song variant classification, it reduces potential human bias and facilitates the reproducibility of the results.
Resumo:
In competitive combat sporting environments like boxing, the statistics on a boxer's performance, including the amount and type of punches thrown, provide a valuable source of data and feedback which is routinely used for coaching and performance improvement purposes. This paper presents a robust framework for the automatic classification of a boxer's punches. Overhead depth imagery is employed to alleviate challenges associated with occlusions, and robust body-part tracking is developed for the noisy time-of-flight sensors. Punch recognition is addressed through both a multi-class SVM and Random Forest classifiers. A coarse-to-fine hierarchical SVM classifier is presented based on prior knowledge of boxing punches. This framework has been applied to shadow boxing image sequences taken at the Australian Institute of Sport with 8 elite boxers. Results demonstrate the effectiveness of the proposed approach, with the hierarchical SVM classifier yielding a 96% accuracy, signifying its suitability for analysing athletes punches in boxing bouts.
Resumo:
The work presented here is part of a larger study to identify novel technologies and biomarkers for early Alzheimer disease (AD) detection and it focuses on evaluating the suitability of a new approach for early AD diagnosis by non-invasive methods. The purpose is to examine in a pilot study the potential of applying intelligent algorithms to speech features obtained from suspected patients in order to contribute to the improvement of diagnosis of AD and its degree of severity. In this sense, Artificial Neural Networks (ANN) have been used for the automatic classification of the two classes (AD and control subjects). Two human issues have been analyzed for feature selection: Spontaneous Speech and Emotional Response. Not only linear features but also non-linear ones, such as Fractal Dimension, have been explored. The approach is non invasive, low cost and without any side effects. Obtained experimental results were very satisfactory and promising for early diagnosis and classification of AD patients.
Resumo:
Automatic molecular classification of cancer based on DNA microarray has many advantages over conventional classification based on morphological appearance of the tumor. Using artificial neural networks is a general approach for automatic classification. In this paper, Direction-Basis-Function neuron and Priority-Ordered algorithm are applied to neural networks. And the leukemia gene expression dataset is used as an example to testify the classifier. The result of our method is compared to that of SVM. It shows that our method makes a better performance than SVM.
Resumo:
The electroencephalogram (EEG) is an important noninvasive tool used in the neonatal intensive care unit (NICU) for the neurologic evaluation of the sick newborn infant. It provides an excellent assessment of at-risk newborns and formulates a prognosis for long-term neurologic outcome.The automated analysis of neonatal EEG data in the NICU can provide valuable information to the clinician facilitating medical intervention. The aim of this thesis is to develop a system for automatic classification of neonatal EEG which can be mainly divided into two parts: (1) classification of neonatal EEG seizure from nonseizure, and (2) classifying neonatal background EEG into several grades based on the severity of the injury using atomic decomposition. Atomic decomposition techniques use redundant time-frequency dictionaries for sparse signal representations or approximations. The first novel contribution of this thesis is the development of a novel time-frequency dictionary coherent with the neonatal EEG seizure states. This dictionary was able to track the time-varying nature of the EEG signal. It was shown that by using atomic decomposition and the proposed novel dictionary, the neonatal EEG transition from nonseizure to seizure states could be detected efficiently. The second novel contribution of this thesis is the development of a neonatal seizure detection algorithm using several time-frequency features from the proposed novel dictionary. It was shown that the time-frequency features obtained from the atoms in the novel dictionary improved the seizure detection accuracy when compared to that obtained from the raw EEG signal. With the assistance of a supervised multiclass SVM classifier and several timefrequency features, several methods to automatically grade EEG were explored. In summary, the novel techniques proposed in this thesis contribute to the application of advanced signal processing techniques for automatic assessment of neonatal EEG recordings.
Resumo:
Aquesta tesi està emmarcada dins la detecció precoç de masses, un dels símptomes més clars del càncer de mama, en imatges mamogràfiques. Primerament, s'ha fet un anàlisi extensiu dels diferents mètodes de la literatura, concloent que aquests mètodes són dependents de diferent paràmetres: el tamany i la forma de la massa i la densitat de la mama. Així, l'objectiu de la tesi és analitzar, dissenyar i implementar un mètode de detecció robust i independent d'aquests tres paràmetres. Per a tal fi, s'ha construït un patró deformable de la massa a partir de l'anàlisi de masses reals i, a continuació, aquest model és buscat en les imatges seguint un esquema probabilístic, obtenint una sèrie de regions sospitoses. Fent servir l'anàlisi 2DPCA, s'ha construït un algorisme capaç de discernir aquestes regions són realment una massa o no. La densitat de la mama és un paràmetre que s'introdueix de forma natural dins l'algorisme.
Resumo:
Hospitals attached to the Spanish Ministry of Health are currently using the International Classification of Diseases 9 Clinical Modification (ICD9-CM) to classify health discharge records. Nowadays, this work is manually done by experts. This paper tackles the automatic classification of real Discharge Records in Spanish following the ICD9-CM standard. The challenge is that the Discharge Records are written in spontaneous language. We explore several machine learning techniques to deal with the classification problem. Random Forest resulted in the most competitive one, achieving an F-measure of 0.876.
Resumo:
This paper deals with the classification of news items in ePaper, a prototype system of a future personalized newspaper service on a mobile reading device. The ePaper system aggregates news items from various news providers and delivers to each subscribed user (reader) a personalized electronic newspaper, utilizing content-based and collaborative filtering methods. The ePaper can also provide users "standard" (i.e., not personalized) editions of selected newspapers, as well as browsing capabilities in the repository of news items. This paper concentrates on the automatic classification of incoming news using hierarchical news ontology. Based on this classification on one hand, and on the users' profiles on the other hand, the personalization engine of the system is able to provide a personalized paper to each user onto her mobile reading device.