5 resultados para grid, clustering, statistical, clustering
em Cochin University of Science
Resumo:
Knowledge discovery in databases is the non-trivial process of identifying valid, novel potentially useful and ultimately understandable patterns from data. The term Data mining refers to the process which does the exploratory analysis on the data and builds some model on the data. To infer patterns from data, data mining involves different approaches like association rule mining, classification techniques or clustering techniques. Among the many data mining techniques, clustering plays a major role, since it helps to group the related data for assessing properties and drawing conclusions. Most of the clustering algorithms act on a dataset with uniform format, since the similarity or dissimilarity between the data points is a significant factor in finding out the clusters. If a dataset consists of mixed attributes, i.e. a combination of numerical and categorical variables, a preferred approach is to convert different formats into a uniform format. The research study explores the various techniques to convert the mixed data sets to a numerical equivalent, so as to make it equipped for applying the statistical and similar algorithms. The results of clustering mixed category data after conversion to numeric data type have been demonstrated using a crime data set. The thesis also proposes an extension to the well known algorithm for handling mixed data types, to deal with data sets having only categorical data. The proposed conversion has been validated on a data set corresponding to breast cancer. Moreover, another issue with the clustering process is the visualization of output. Different geometric techniques like scatter plot, or projection plots are available, but none of the techniques display the result projecting the whole database but rather demonstrate attribute-pair wise analysis
Resumo:
An Overview of known spatial clustering algorithms The space of interest can be the two-dimensional abstraction of the surface of the earth or a man-made space like the layout of a VLSI design, a volume containing a model of the human brain, or another 3d-space representing the arrangement of chains of protein molecules. The data consists of geometric information and can be either discrete or continuous. The explicit location and extension of spatial objects define implicit relations of spatial neighborhood (such as topological, distance and direction relations) which are used by spatial data mining algorithms. Therefore, spatial data mining algorithms are required for spatial characterization and spatial trend analysis. Spatial data mining or knowledge discovery in spatial databases differs from regular data mining in analogous with the differences between non-spatial data and spatial data. The attributes of a spatial object stored in a database may be affected by the attributes of the spatial neighbors of that object. In addition, spatial location, and implicit information about the location of an object, may be exactly the information that can be extracted through spatial data mining
Resumo:
In this paper, moving flock patterns are mined from spatio- temporal datasets by incorporating a clustering algorithm. A flock is defined as the set of data that move together for a certain continuous amount of time. Finding out moving flock patterns using clustering algorithms is a potential method to find out frequent patterns of movement in large trajectory datasets. In this approach, SPatial clusteRing algoRithm thrOugh sWarm intelligence (SPARROW) is the clustering algorithm used. The advantage of using SPARROW algorithm is that it can effectively discover clusters of widely varying sizes and shapes from large databases. Variations of the proposed method are addressed and also the experimental results show that the problem of scalability and duplicate pattern formation is addressed. This method also reduces the number of patterns produced
Resumo:
A spectral angle based feature extraction method, Spectral Clustering Independent Component Analysis (SC-ICA), is proposed in this work to improve the brain tissue classification from Magnetic Resonance Images (MRI). SC-ICA provides equal priority to global and local features; thereby it tries to resolve the inefficiency of conventional approaches in abnormal tissue extraction. First, input multispectral MRI is divided into different clusters by a spectral distance based clustering. Then, Independent Component Analysis (ICA) is applied on the clustered data, in conjunction with Support Vector Machines (SVM) for brain tissue analysis. Normal and abnormal datasets, consisting of real and synthetic T1-weighted, T2-weighted and proton density/fluid-attenuated inversion recovery images, were used to evaluate the performance of the new method. Comparative analysis with ICA based SVM and other conventional classifiers established the stability and efficiency of SC-ICA based classification, especially in reproduction of small abnormalities. Clinical abnormal case analysis demonstrated it through the highest Tanimoto Index/accuracy values, 0.75/98.8%, observed against ICA based SVM results, 0.17/96.1%, for reproduced lesions. Experimental results recommend the proposed method as a promising approach in clinical and pathological studies of brain diseases
Resumo:
Cerebral glioma is the most prevalent primary brain tumor, which are classified broadly into low and high grades according to the degree of malignancy. High grade gliomas are highly malignant which possess a poor prognosis, and the patients survive less than eighteen months after diagnosis. Low grade gliomas are slow growing, least malignant and has better response to therapy. To date, histological grading is used as the standard technique for diagnosis, treatment planning and survival prediction. The main objective of this thesis is to propose novel methods for automatic extraction of low and high grade glioma and other brain tissues, grade detection techniques for glioma using conventional magnetic resonance imaging (MRI) modalities and 3D modelling of glioma from segmented tumor slices in order to assess the growth rate of tumors. Two new methods are developed for extracting tumor regions, of which the second method, named as Adaptive Gray level Algebraic set Segmentation Algorithm (AGASA) can also extract white matter and grey matter from T1 FLAIR an T2 weighted images. The methods were validated with manual Ground truth images, which showed promising results. The developed methods were compared with widely used Fuzzy c-means clustering technique and the robustness of the algorithm with respect to noise is also checked for different noise levels. Image texture can provide significant information on the (ab)normality of tissue, and this thesis expands this idea to tumour texture grading and detection. Based on the thresholds of discriminant first order and gray level cooccurrence matrix based second order statistical features three feature sets were formulated and a decision system was developed for grade detection of glioma from conventional T2 weighted MRI modality.The quantitative performance analysis using ROC curve showed 99.03% accuracy for distinguishing between advanced (aggressive) and early stage (non-aggressive) malignant glioma. The developed brain texture analysis techniques can improve the physician’s ability to detect and analyse pathologies leading to a more reliable diagnosis and treatment of disease. The segmented tumors were also used for volumetric modelling of tumors which can provide an idea of the growth rate of tumor; this can be used for assessing response to therapy and patient prognosis.