2 resultados para Scatter plot
em Cochin University of Science
Resumo:
Knowledge discovery in databases is the non-trivial process of identifying valid, novel potentially useful and ultimately understandable patterns from data. The term Data mining refers to the process which does the exploratory analysis on the data and builds some model on the data. To infer patterns from data, data mining involves different approaches like association rule mining, classification techniques or clustering techniques. Among the many data mining techniques, clustering plays a major role, since it helps to group the related data for assessing properties and drawing conclusions. Most of the clustering algorithms act on a dataset with uniform format, since the similarity or dissimilarity between the data points is a significant factor in finding out the clusters. If a dataset consists of mixed attributes, i.e. a combination of numerical and categorical variables, a preferred approach is to convert different formats into a uniform format. The research study explores the various techniques to convert the mixed data sets to a numerical equivalent, so as to make it equipped for applying the statistical and similar algorithms. The results of clustering mixed category data after conversion to numeric data type have been demonstrated using a crime data set. The thesis also proposes an extension to the well known algorithm for handling mixed data types, to deal with data sets having only categorical data. The proposed conversion has been validated on a data set corresponding to breast cancer. Moreover, another issue with the clustering process is the visualization of output. Different geometric techniques like scatter plot, or projection plots are available, but none of the techniques display the result projecting the whole database but rather demonstrate attribute-pair wise analysis
Resumo:
Efficient optic disc segmentation is an important task in automated retinal screening. For the same reason optic disc detection is fundamental for medical references and is important for the retinal image analysis application. The most difficult problem of optic disc extraction is to locate the region of interest. Moreover it is a time consuming task. This paper tries to overcome this barrier by presenting an automated method for optic disc boundary extraction using Fuzzy C Means combined with thresholding. The discs determined by the new method agree relatively well with those determined by the experts. The present method has been validated on a data set of 110 colour fundus images from DRION database, and has obtained promising results. The performance of the system is evaluated using the difference in horizontal and vertical diameters of the obtained disc boundary and that of the ground truth obtained from two expert ophthalmologists. For the 25 test images selected from the 110 colour fundus images, the Pearson correlation of the ground truth diameters with the detected diameters by the new method are 0.946 and 0.958 and, 0.94 and 0.974 respectively. From the scatter plot, it is shown that the ground truth and detected diameters have a high positive correlation. This computerized analysis of optic disc is very useful for the diagnosis of retinal diseases