923 results for classification aided by clustering
Abstract:
This paper reports on a sensor array able to distinguish tastes and used to classify red wines. The array comprises sensing units made from Langmuir-Blodgett (LB) films of conducting polymers and lipids and layer-by-layer (LBL) films from chitosan deposited onto gold interdigitated electrodes. Using impedance spectroscopy as the principle of detection, we show that distinct clusters can be identified in principal component analysis (PCA) plots for six types of red wine. Distinction can be made with regard to vintage, vineyard and brands of the red wine. Furthermore, if the data are treated with artificial neural networks (ANNs), this artificial tongue can identify wine samples stored under different conditions. This is illustrated by considering 900 wine samples, obtained with 30 measurements for each of the five bottles of the six wines, which could be recognised with 100% accuracy using the algorithms Standard Backpropagation and Backpropagation momentum in the ANNs. (C) 2003 Elsevier B.V. All rights reserved.
Abstract:
This study explores, in 3 steps, how the 3 main library classification systems, the Library of Congress Classification, the Dewey Decimal Classification, and the Universal Decimal Classification, cover human knowledge. First, we mapped the knowledge covered by the 3 systems. We used the 10 Pillars of Knowledge: Map of Human Knowledge, which comprises 10 pillars, as an evaluative model. We mapped all the subject-based classes and subclasses that are part of the first 2 levels of the 3 hierarchical structures. Then, we zoomed into each of the 10 pillars and analyzed how the three systems cover the 10 knowledge domains. Finally, we focused on the 3 library systems. Based on the way each one of them covers the 10 knowledge domains, it is evident that they failed to adequately and systematically present contemporary human knowledge. They are unsystematic and biased, and, at the top 2 levels of the hierarchical structures, they are incomplete.
Abstract:
Spatial analysis and fuzzy classification techniques were used to estimate the spatial distributions of heavy metals in soil. The work was applied to soils in a coastal region that is characterized by intense urban occupation and large numbers of different industries. Concentrations of heavy metals were determined using geostatistical techniques and classes of risk were defined using fuzzy classification. The resulting prediction mappings identify the locations of high concentrations of Pb, Zn, Ni, and Cu in topsoils of the study area. The maps show that areas of high pollution of Ni and Cu are located at the northeast, where there is a predominance of industrial and agricultural activities; Pb and Zn also occur in high concentrations in the northeast, but the maps also show significant concentrations of Pb and Zn in other areas, mainly in the central and southeastern parts, where there are urban leisure activities and trade centers. Maps were also prepared showing levels of pollution risk. These maps show that (1) Cu presents a large pollution risk in the north-northwest, midwest, and southeast sectors, (2) Pb represents a moderate risk in most areas, (3) Zn generally exhibits low risk, and (4) Ni represents either low risk or no risk in the studied area. This study shows that combining geostatistics with fuzzy theory can provide results that offer insight into risk assessment for environmental pollution.
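The fuzzy risk classes described in this abstract can be illustrated with a minimal sketch. The abstract does not give the membership functions or the concentration limits the authors used, so the piecewise-linear shapes, the function names, and the limits `low_limit` and `high_limit` below are all hypothetical.

```python
def falling(x, lo, hi):
    """Membership that is 1 below lo and decreases linearly to 0 at hi."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def rising(x, lo, hi):
    """Membership that is 0 below lo and increases linearly to 1 at hi."""
    return 1.0 - falling(x, lo, hi)


def triangle(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def risk_memberships(conc, low_limit, high_limit):
    """Degree to which a concentration belongs to each risk class.

    low_limit and high_limit are hypothetical thresholds, not values
    taken from the study.
    """
    mid = (low_limit + high_limit) / 2
    return {
        "low": falling(conc, low_limit, mid),
        "moderate": triangle(conc, low_limit, mid, high_limit),
        "high": rising(conc, mid, high_limit),
    }


def risk_class(conc, low_limit, high_limit):
    """Pick the class with the largest membership degree."""
    degrees = risk_memberships(conc, low_limit, high_limit)
    return max(degrees, key=degrees.get)


print(risk_memberships(50, 20, 100))  # {'low': 0.25, 'moderate': 0.75, 'high': 0.0}
print(risk_class(50, 20, 100))        # moderate
```

Unlike a hard threshold, a sample near a class boundary keeps partial membership in both neighbouring classes, which is what lets a fuzzy map express graded pollution risk.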
Abstract:
This article presents a quantitative and objective approach to cat ganglion cell characterization and classification. Several biologically relevant features, such as diameter, eccentricity, fractal dimension, influence histogram, influence area, convex hull area, and convex hull diameter, are derived from geometrical transforms and then processed by three different clustering methods (Ward's hierarchical scheme, K-means, and a genetic algorithm), whose results are then combined by a voting strategy. These experiments indicate the superiority of some features and also suggest some possible biological implications.
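Combining several clusterings by voting can be sketched as follows. The abstract does not specify the voting scheme, so this example assumes a pairwise majority vote: two objects end up in the same consensus cluster when most of the input clusterings group them together. The function name and the toy label lists are hypothetical.

```python
from itertools import combinations


def consensus_by_voting(labelings):
    """Merge several clusterings of the same objects by pairwise majority vote.

    labelings: list of label lists, one per clustering method. Label values
    need not match across methods; only co-membership is compared.
    """
    n = len(labelings[0])
    needed = len(labelings) // 2 + 1  # strict majority of the methods

    # Union-find structure used to merge objects that are voted together.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        votes = sum(1 for labels in labelings if labels[i] == labels[j])
        if votes >= needed:
            parent[find(j)] = find(i)

    # Relabel the merged groups as 0, 1, 2, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


# Toy labels standing in for the outputs of three clustering methods.
ward = [0, 0, 1, 1, 2]
kmeans = [1, 1, 0, 0, 0]
genetic = [0, 0, 1, 1, 1]
print(consensus_by_voting([ward, kmeans, genetic]))  # [0, 0, 1, 1, 1]
```

Comparing co-membership rather than raw labels avoids having to match label names between methods. Merging through union-find makes the result transitive, which is a simplification.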
Abstract:
Supplements the information contained in document E/CEPAL/G.1207
Abstract:
Twenty-seven honey samples, produced in ten cities of the State of Pará (Amazon region, northern Brazil) by three different bee species (Apis mellifera, Melipona fasciculata and Melipona flavoneata), were analyzed for their mineral element contents (Al, As, Ba, Be, Bi, Ca, Cd, Co, Cr, Cu, Fe, K, Li, Mg, Mn, Na, Ni, Sr and Zn) and several physicochemical parameters (color, moisture, density, pH, insoluble solids and total soluble solids, ash, electrical conductivity, formol index, free acidity, hydroxymethylfurfural, reducing and total sugars, and sucrose). Mineral contents were determined by inductively coupled plasma atomic emission spectrometry (ICP OES), and the physicochemical analyses followed official methodologies. The results of the physicochemical analyses were in agreement with national and international legislation, as well as with similar studies around the world. Multivariate statistical analysis (hierarchical cluster analysis (HCA) and principal component analysis (PCA)) was applied to the metal contents and the physicochemical parameters, making it possible to separate the honey samples according to the producing species.
Abstract:
This paper presents a Computer Aided Diagnosis (CAD) system that automatically classifies microcalcifications detected on digital mammograms into one of the five types proposed by Michele Le Gal, a classification scheme that allows radiologists to determine whether a breast tumor is malignant or not without the need for surgeries. The developed system uses a combination of wavelets and Artificial Neural Networks (ANN) and is executed on an Altera DE2-115 Development Kit, a kit containing a Field-Programmable Gate Array (FPGA) that allows the system to be smaller, cheaper and more energy efficient. Results have shown that the system was able to correctly classify 96.67% of test samples, which can be used as a second opinion by radiologists in breast cancer early diagnosis. (C) 2013 The Authors. Published by Elsevier B.V.
Abstract:
In Computer-Aided Diagnosis schemes for mammography analysis, the modules are interconnected, so each one directly affects the operation of the system as a whole. Identifying mammograms with and without masses is needed to reduce the false positive rate in the automatic selection of regions of interest for further image segmentation. This study evaluates the performance of three techniques in classifying regions of interest as containing masses or not (without clinical findings). Its main contribution is to introduce the Optimum-Path Forest (OPF) classifier in this context, which had not been done before. We compared OPF against two kinds of neural networks, Radial Basis Function and Multilayer Perceptron (MLP), on a private dataset composed of 120 images. Texture features were used for this purpose, and the experiments showed that MLP networks were slightly better than OPF, but the latter is much faster, which can make it a suitable tool for real-time recognition systems.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
This work proposes a method for data clustering based on complex networks theory. A data set is represented as a network by considering different metrics to establish the connection between each pair of objects. The clusters are obtained by taking into account five community detection algorithms. The network-based clustering approach is applied to two real-world databases and two sets of artificially generated data. The obtained results suggest that the exponential of the Minkowski distance is the most suitable metric to quantify the similarities between pairs of objects. In addition, the community identification method based on greedy optimization provides the best cluster solution. We compare the network-based clustering approach with some traditional clustering algorithms and verify that it provides the lowest classification error rate. (C) 2012 Elsevier B.V. All rights reserved.
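The idea of turning a data set into a network can be sketched under a few assumptions. The similarity is taken here as exp(-d), with d the Minkowski distance, which is one reading of "the exponential of the Minkowski distance". The edge threshold and the toy points are hypothetical. Connected components stand in for community detection, which is a much simpler technique than the greedy optimization the abstract refers to.

```python
import math


def minkowski(a, b, p=2):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def similarity(a, b, p=2):
    """Assumed similarity: exp(-distance), 1 for identical points."""
    return math.exp(-minkowski(a, b, p))


def build_network(points, threshold, p=2):
    """Link every pair whose similarity reaches the (hypothetical) threshold."""
    adjacency = {i: set() for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if similarity(points[i], points[j], p) >= threshold:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def connected_components(adjacency):
    """Label each node by its connected component (stand-in for communities)."""
    labels = {}
    current = 0
    for start in adjacency:
        if start in labels:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in labels:
                    labels[neighbour] = current
                    stack.append(neighbour)
        current += 1
    return [labels[i] for i in range(len(adjacency))]


points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.2)]
network = build_network(points, threshold=0.5)
print(connected_components(network))  # [0, 0, 0, 1, 1]
```

A real community detection step would also split groups that are only weakly connected, which connected components cannot do.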
Abstract:
Multicentric carpotarsal osteolysis (MCTO) is a rare skeletal dysplasia characterized by aggressive osteolysis, particularly affecting the carpal and tarsal bones, and is frequently associated with progressive renal failure. Using exome capture and next-generation sequencing in five unrelated simplex cases of MCTO, we identified previously unreported missense mutations clustering within a 51 base pair region of the single exon of MAFB, validated by Sanger sequencing. A further six unrelated simplex cases with MCTO were also heterozygous for previously unreported mutations within this same region, as were affected members of two families with autosomal-dominant MCTO. MAFB encodes a transcription factor that negatively regulates RANKL-induced osteoclastogenesis and is essential for normal renal development. Identification of this gene paves the way for development of novel therapeutic approaches for this crippling disease and provides insight into normal bone and kidney development.
Abstract:
The attributes describing a data set may often be arranged in meaningful subsets, each of which corresponds to a different aspect of the data. An unsupervised algorithm (SCAD) that simultaneously performs fuzzy clustering and aspects weighting was proposed in the literature. However, SCAD may fail and halt given certain conditions. To fix this problem, its steps are modified and then reordered to reduce the number of parameters required to be set by the user. In this paper we prove that each step of the resulting algorithm, named ASCAD, globally minimizes its cost-function with respect to the argument being optimized. The asymptotic analysis of ASCAD leads to a time complexity which is the same as that of fuzzy c-means. A hard version of the algorithm and a novel validity criterion that considers aspect weights in order to estimate the number of clusters are also described. The proposed method is assessed over several artificial and real data sets.
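For reference, the fuzzy c-means baseline that the abstract compares against can be sketched as below. This is plain fuzzy c-means with Euclidean distance, not SCAD or ASCAD: it has no aspect weights. The function names, the fixed iteration count, and the toy data are hypothetical choices for illustration.

```python
import math


def memberships_for(points, centers, m):
    """Membership of every point in every cluster for fuzzifier m > 1."""
    exponent = 2.0 / (m - 1.0)
    table = []
    for x in points:
        dists = [math.sqrt(sum((a - b) ** 2 for a, b in zip(x, c)))
                 for c in centers]
        at_center = [d == 0.0 for d in dists]
        if any(at_center):
            # A point lying exactly on a center belongs fully to it.
            row = [1.0 / sum(at_center) if hit else 0.0 for hit in at_center]
        else:
            row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                   for d_i in dists]
        table.append(row)
    return table


def update_centers(points, table, m, n_clusters):
    """Each center becomes the mean of the points weighted by membership**m."""
    dim = len(points[0])
    centers = []
    for i in range(n_clusters):
        weights = [row[i] ** m for row in table]
        total = sum(weights)
        centers.append(tuple(
            sum(w * x[d] for w, x in zip(weights, points)) / total
            for d in range(dim)))
    return centers


def fuzzy_c_means(points, initial_centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of rounds."""
    centers = list(initial_centers)
    for _ in range(iterations):
        table = memberships_for(points, centers, m)
        centers = update_centers(points, table, m, len(centers))
    return centers, memberships_for(points, centers, m)


points = [(0.0,), (0.2,), (0.1,), (5.0,), (5.2,), (5.1,)]
centers, table = fuzzy_c_means(points, [(0.0,), (5.0,)])
print(centers)
print(table[0])
```

Each round alternates two updates, memberships given centers and centers given memberships. The abstract's claim is that ASCAD keeps this same asymptotic time complexity while also weighting aspects.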