828 results for classification aided by clustering
Abstract:
Chronic kidney disease (CKD) has become a serious public health problem because of its associated morbidity, premature mortality and attendant healthcare costs. The rising number of persons with CKD is linked with ageing population structure and an increased prevalence of diabetes, hypertension and obesity. There is an inherited risk associated with developing CKD as evidenced by familial clustering and differing prevalence rates across ethnic groups. Earlier studies to determine the inherited risk factors for CKD rarely identified genetic variants that were robustly replicated. However, improvements in genotyping technologies and analytical methods are now helping to identify promising genetic loci aided by international collaboration and multi-consortia efforts. More recently, epigenetic modifications have been proposed to play a role in both the inherited susceptibility to CKD and, importantly, to explain how the environment dynamically interacts with the genome to alter an individual's disease risk. Genome-wide, epigenome-wide and whole transcriptome studies have been performed and optimal approaches for integrative analysis are being developed. This review summarises recent research and the current status of genetic and epigenetic risk factors influencing CKD using population-based information.
Abstract:
Aims: We investigated the physical properties and dynamical evolution of near-Earth asteroid (NEA) (190491) 2000 FJ10 in order to assess the suitability of this accessible NEA as a space mission target. Methods: Photometry and colour determination were carried out with the 1.54 m Kuiper Telescope (Mt Bigelow, USA) and the 10 m Southern African Large Telescope (SALT; Sutherland, South Africa) during the object's recent favourable apparition in 2011-12. During the earlier 2008 apparition, a spectrum of the object in the 6000-9000 Angstrom region was obtained with the 4.2 m William Herschel Telescope (WHT; Canary Islands, Spain). Interpretation of the observational results was aided by numerical simulations of 1000 dynamical clones of 2000 FJ10 up to 10^6 yr in the past and in the future. Results: The asteroid's spectrum and colours determined by our observations suggest a taxonomic classification within the S-complex, although other classifications (V, D, E, M, P) cannot be ruled out. On this evidence, it is unlikely to be a primitive, relatively unaltered remnant from the early history of the solar system and is thus a low-priority target for robotic sample return. Our photometry placed a lower bound of 2 h on the asteroid's rotation period. Its absolute magnitude was estimated to be 21.54 ± 0.1 which, for a typical S-complex albedo, translates into a diameter of 130 ± 20 m. Our dynamical simulations show that it has likely been an Amor for the past 10^5 yr. Although currently not Earth-crossing, it will likely become so during the period 50-100 kyr in the future. It may have arrived from the inner or central main belt >1 Myr ago as a former member of a low-inclination S-class asteroid family. Its relatively slow rotation and large size make it a suitable destination for a human mission. We show that ballistic Earth-190491-Earth transfer trajectories with ΔV < 2 km s^-1 at the asteroid exist between 2052 and 2061.
Based on observations made with the Southern African Large Telescope (SALT).
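The step from absolute magnitude to diameter quoted above can be checked with the standard relation D = 1329 km / sqrt(p) × 10^(-H/5). The sketch below is only illustrative: the abstract does not state the albedo it used, so the value p = 0.25 is an assumption chosen as a plausible S-complex albedo, and the function name is hypothetical.

```python
import math

def diameter_km(abs_magnitude: float, albedo: float) -> float:
    """Standard asteroid size relation: D = 1329 km / sqrt(p) * 10^(-H/5)."""
    return 1329.0 / math.sqrt(albedo) * 10 ** (-abs_magnitude / 5.0)

# H = 21.54 is taken from the abstract; the albedo 0.25 is an assumed value.
ASSUMED_ALBEDO = 0.25
diameter_m = diameter_km(21.54, ASSUMED_ALBEDO) * 1000.0
print(f"Estimated diameter: {diameter_m:.0f} m")
```

With this assumed albedo the estimate falls inside the 130 ± 20 m range given in the abstract; a darker surface would give a larger diameter for the same H.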
Abstract:
Ferrofluids belonging to the series NixFe1-xFe2O4 were synthesised by two different procedures: one by standard co-precipitation techniques, the other by co-precipitation for synthesis of particles followed by dispersion aided by high-energy ball milling, with a view to understanding the effect of strain and size anisotropy on the magneto-optical properties of ferrofluids. The birefringence measurements were carried out using a standard ellipsometer. The birefringence signal obtained for chemically synthesised samples was satisfactorily fitted to the standard second Langevin function. The ball-milled ferrofluids showed a deviation from this fit, and their birefringence was enhanced by an order of magnitude. This large enhancement in the birefringence value cannot be attributed to an increase in grain size, since the grain sizes of the samples synthesised by both routes are comparable; instead, it can be attributed to the lattice strain-induced shape anisotropy (oblation) arising from the high-energy ball-milling process. Thus magneto-optical (MO) signals can be tuned by the ball-milling process, which may find potential applications.
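For reference, the second Langevin function commonly used to fit field-induced birefringence can be written as follows. The symbols are assumptions, since the abstract does not define them: Δn is the measured birefringence, Δn_s its saturation value, and ξ the ratio of the magnetic energy of a particle (moment m in field B) to the thermal energy.

```latex
\frac{\Delta n}{\Delta n_s} = L_2(\xi) = 1 - \frac{3\coth\xi}{\xi} + \frac{3}{\xi^{2}},
\qquad \xi = \frac{mB}{k_B T}
```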
Abstract:
Superparamagnetic nanocomposites based on γ-Fe2O3 and sulphonated polystyrene were synthesised by an ion-exchange process, and structural characterisation was carried out using X-ray diffraction. Doping of cobalt into the γ-Fe2O3 lattice was effected in situ, and the doping was varied in the atomic percentage range 1–10. The optical absorption studies show a band gap of 2.84 eV, which is blue shifted by 0.64 eV compared with the reported value for bulk samples (2.2 eV). This is explained on the basis of weak quantum confinement. Further size reduction can result in strong confinement, which can yield transparent magnetic nanocomposites because of further blue shifting. The band gap is red shifted by the addition of cobalt to the lattice, and this red shift increases with increasing doping. The observed red shift can be attributed to the strain in the lattice caused by the anisotropy induced by the addition of cobalt. Thus, blue shifting of the band gap is aided by weak exciton confinement, and subsequent red shifting of the band gap is assisted by cobalt doping.
Abstract:
In recent years there has been an apparent shift in research from content-based image retrieval (CBIR) to automatic image annotation in order to bridge the gap between low-level features and high-level semantics of images. Automatic Image Annotation (AIA) techniques facilitate extraction of high-level semantic concepts from images by machine learning techniques. Many AIA techniques use feature analysis as the first step to identify the objects in the image. However, high-dimensional image features degrade the performance of the system. This paper describes and evaluates an automatic image annotation framework which uses SURF descriptors to select the right number of features and the right features for annotation. The proposed framework uses a hybrid approach in which k-means clustering is used in the training phase and fuzzy K-NN classification in the annotation phase. The performance of the system is evaluated using standard metrics.
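A hybrid of k-means in training and fuzzy K-NN at annotation time could be wired together roughly as sketched below. This is an illustration under stated assumptions, not the paper's framework: the toy 2-D vectors stand in for SURF-derived features, the keywords are invented, and the choice to label each centroid by majority vote and use centroids as fuzzy K-NN prototypes is a guess at one plausible coupling. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    assign = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
    return centroids, assign

def label_centroids(assign, labels, k):
    """Majority keyword per cluster (assumes no cluster is empty)."""
    votes = [Counter() for _ in range(k)]
    for a, lab in zip(assign, labels):
        votes[a][lab] += 1
    return [v.most_common(1)[0][0] for v in votes]

def fuzzy_knn(query, prototypes, labels, k=3, m=2.0):
    """Fuzzy k-NN with crisp prototype labels: memberships are
    inverse-distance-weighted votes of the k nearest prototypes."""
    nearest = sorted(zip(prototypes, labels), key=lambda pl: math.dist(query, pl[0]))[:k]
    weights = defaultdict(float)
    for proto, label in nearest:
        d = max(math.dist(query, proto), 1e-12)  # avoid division by zero
        weights[label] += 1.0 / d ** (2.0 / (m - 1.0))
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}

# Toy training data (invented): feature vectors with annotation keywords.
features = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.3), (4.8, 5.0)]
keywords = ["sky", "grass", "sky", "grass", "sky", "grass"]

centroids, assign = kmeans(features, k=2)
centroid_labels = label_centroids(assign, keywords, k=2)
memberships = fuzzy_knn((0.5, 0.5), centroids, centroid_labels, k=2)
print(memberships)
```

The fuzzy output is a membership per keyword rather than a single hard label, which is what allows an annotation system to rank candidate keywords for an image.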
Abstract:
Knowledge discovery in databases is the non-trivial process of identifying valid, novel, potentially useful and ultimately understandable patterns from data. The term data mining refers to the process of performing exploratory analysis on the data and building a model from it. To infer patterns from data, data mining involves different approaches such as association rule mining, classification techniques or clustering techniques. Among the many data mining techniques, clustering plays a major role, since it helps to group related data for assessing properties and drawing conclusions. Most clustering algorithms act on a dataset with a uniform format, since the similarity or dissimilarity between data points is a significant factor in finding the clusters. If a dataset consists of mixed attributes, i.e. a combination of numerical and categorical variables, a preferred approach is to convert the different formats into a uniform format. The research study explores the various techniques for converting mixed data sets to a numerical equivalent, so that statistical and similar algorithms can be applied. The results of clustering mixed-category data after conversion to a numeric data type are demonstrated using a crime data set. The thesis also proposes an extension to the well-known algorithm for handling mixed data types, to deal with data sets having only categorical data. The proposed conversion has been validated on a breast cancer data set. Another issue with the clustering process is the visualization of output. Different geometric techniques such as scatter plots or projection plots are available, but none of them displays the result for the whole database; they instead show attribute-pair-wise analysis.
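One common way to bring mixed attributes into a uniform numeric format is to min-max scale the numerical columns and one-hot encode the categorical ones. The sketch below shows only that generic option; the abstract does not say which conversions the thesis evaluates, and the records, field names and function name here are invented for illustration.

```python
def to_numeric(records, numeric_keys, categorical_keys):
    """Min-max scale numeric fields and one-hot encode categorical fields."""
    ranges = {k: (min(r[k] for r in records), max(r[k] for r in records))
              for k in numeric_keys}
    levels = {k: sorted({r[k] for r in records}) for k in categorical_keys}
    rows = []
    for r in records:
        row = []
        for k in numeric_keys:
            lo, hi = ranges[k]
            # A constant column carries no information; map it to 0.0.
            row.append((r[k] - lo) / (hi - lo) if hi > lo else 0.0)
        for k in categorical_keys:
            # One indicator per category level, in sorted order.
            row.extend(1.0 if r[k] == level else 0.0 for level in levels[k])
        rows.append(row)
    return rows

# Hypothetical mixed-type records (not from the thesis's crime data set).
records = [
    {"age": 20, "offence": "theft"},
    {"age": 40, "offence": "fraud"},
    {"age": 30, "offence": "theft"},
]
rows = to_numeric(records, numeric_keys=["age"], categorical_keys=["offence"])
for row in rows:
    print(row)
```

After conversion every record is a vector of floats in [0, 1], so distance-based clustering algorithms can be applied directly.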
Abstract:
Short video on laser classification produced by the National Physical Laboratory
Abstract:
The growth of databases that contain increasingly difficult images and a larger number of categories is forcing the development of image representation techniques that are discriminative when working with multiple classes, and of algorithms that are efficient in learning and classification. This thesis explores the problem of classifying images according to the object they contain when a large number of categories is available. First, it investigates how a hybrid system formed by a generative model and a discriminative model can benefit the image classification task when the level of human annotation is minimal. For this task we introduce a new vocabulary using a dense representation of colour-SIFT descriptors, and then investigate how the different parameters affect the final classification. Next, a method is proposed to incorporate spatial information into the hybrid system, showing that context information is of great help for image classification. We then introduce a new shape descriptor that represents the image by its local shape and its spatial shape, together with a kernel that incorporates this spatial information in pyramidal form. Shape is represented by a compact vector, yielding a descriptor well suited for use with kernel-based learning algorithms. The experiments show that this shape information gives results similar to (and sometimes better than) appearance-based descriptors. We also investigate how different features can be combined for image classification and show that the proposed shape descriptor together with an appearance descriptor substantially improves classification. Finally, an algorithm is described that detects regions of interest automatically during training and classification.
This provides a method for suppressing the image background and adds invariance to the position of objects within the images. It is shown that using shape and appearance over this region of interest, together with random forest classifiers, improves classification and computational time. Our results are compared with results from the literature using the same databases as the authors, as well as the same learning and classification protocols. All the innovations introduced are seen to improve the final image classification.
Abstract:
This contribution proposes a powerful technique for two-class imbalanced classification problems by combining the synthetic minority over-sampling technique (SMOTE) and the particle swarm optimisation (PSO) aided radial basis function (RBF) classifier. In order to enhance the significance of the small and specific region belonging to the positive class in the decision region, the SMOTE is applied to generate synthetic instances for the positive class to balance the training data set. Based on the over-sampled training data, the RBF classifier is constructed by applying the orthogonal forward selection procedure, in which the classifier's structure and the parameters of RBF kernels are determined using a PSO algorithm based on the criterion of minimising the leave-one-out misclassification rate. The experimental results obtained on a simulated imbalanced data set and three real imbalanced data sets are presented to demonstrate the effectiveness of our proposed algorithm.
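The over-sampling step can be illustrated with a minimal SMOTE sketch: each synthetic instance is placed at a random position on the segment between a minority sample and one of its k nearest minority neighbours. This covers only the SMOTE part, not the PSO-tuned RBF classifier; the data, the seed and the function name are assumptions made for the example.

```python
import math
import random

def smote(minority, n_synthetic, k=2, seed=0):
    """Generate synthetic minority points by interpolating between a
    randomly chosen sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Nearest neighbours among the other minority samples only.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

# Hypothetical positive-class samples.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
synthetic = smote(minority, n_synthetic=6)
print(synthetic)
```

Because every synthetic point is a convex combination of two real minority samples, the new instances stay inside the region already occupied by the positive class.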
Abstract:
Airborne lidar provides accurate height information of objects on the earth and has been recognized as a reliable and accurate surveying tool in many applications. In particular, lidar data offer vital and significant features for urban land-cover classification, which is an important task in urban land-use studies. In this article, we present an effective approach in which lidar data fused with its co-registered images (i.e. aerial colour images containing red, green and blue (RGB) bands and near-infrared (NIR) images) and other derived features are used effectively for accurate urban land-cover classification. The proposed approach begins with an initial classification performed by the Dempster–Shafer theory of evidence with a specifically designed basic probability assignment function. It outputs two results, i.e. the initial classification and pseudo-training samples, which are selected automatically according to the combined probability masses. Second, a support vector machine (SVM)-based probability estimator is adopted to compute the class conditional probability (CCP) for each pixel from the pseudo-training samples. Finally, a Markov random field (MRF) model is established to combine spatial contextual information into the classification. In this stage, the initial classification result and the CCP are exploited. An efficient belief propagation (EBP) algorithm is developed to search for the global minimum-energy solution for the maximum a posteriori (MAP)-MRF framework in which three techniques are developed to speed up the standard belief propagation (BP) algorithm. Lidar and its co-registered data acquired by Toposys Falcon II are used in performance tests. The experimental results prove that fusing the height data and optical images is particularly suited for urban land-cover classification. There is no training sample needed in the proposed approach, and the computational cost is relatively low. An average classification accuracy of 93.63% is achieved.
Abstract:
Joint noise is an important parameter for diagnosing temporomandibular dysfunction. In this study, eight groups (n=9) were formed according to the joint dysfunction classification provided by vibration analysis equipment. The parameters used to analyze joint noise were total vibration energy, peak amplitude, and peak frequency. Mouth opening range was also analyzed. Statistical analysis results for each parameter were significant at the 1% level. Each analyzed group presented different noise characteristics, which allowed the groups to be placed within a determined value category. The patient group with a normal condyle/disk relationship always presented the lowest values. The type of joint noise was characterized by analyzing total integral noise, peak amplitude, peak frequency, and mouth opening. Analyzing joint noise using electrovibratography suggests the type of joint dysfunction and may help to establish a diagnosis, as well as a treatment plan.
Abstract:
Data mining is a relatively new field of research whose objective is to acquire knowledge from large amounts of data. In medical and health care areas, due to regulations and to the availability of computers, a large amount of data is becoming available [27]. On the one hand, practitioners are expected to use all this data in their work; on the other, such a large amount of data cannot be processed by humans in a short time to make diagnoses, prognoses and treatment schedules. A major objective of this thesis is to evaluate data mining tools in medical and health care applications in order to develop a tool that can help make reasonably accurate decisions. The goal is to find a pattern among patients who developed pneumonia by clustering lab data values that were recorded every day. This pattern can then be generalized to patients who have not been diagnosed with the disease but whose lab values show the same trend as those of pneumonia patients. Ten tables were extracted from a large database of a hospital in Jena for this work. In the ICU (intensive care unit), COPRA, a patient management system, is used. All the tables and data are stored in a German-language database.
Abstract:
This article highlights the potential benefits of the Kohonen method for classifying rivers with similar characteristics when determining regional ecological flows using the ELOHA (Ecological Limits of Hydrologic Alteration) methodology. There are currently many methodologies for the classification of rivers; however, none of them offers the characteristics of the Kohonen method, namely that (i) it provides the number of groups that actually underlie the information presented, (ii) it can be used for variable importance analysis, (iii) it can always display the classification process in two dimensions, and (iv) the clustering structure remains regardless of the parameters used in the model. In order to evaluate the potential benefits of the Kohonen method, 174 flow stations distributed along the great river basin “Magdalena-Cauca” (Colombia) were analyzed. In each case, 73 variables were obtained for the classification process. Six trials were done using different combinations of variables, and the results were validated against the reference classification obtained by Ingfocol in 2010, whose results were also framed using ELOHA guidelines. In the validation process it was found that two of the tested models reproduced more than 80% of the reference classification on the first trial, meaning that more than 80% of the flow stations analyzed in both models formed invariant groups of streams.
Abstract:
In this paper, artificial neural networks (ANNs) based on supervised and unsupervised algorithms were investigated for use in the study of rheological parameters of solid pharmaceutical excipients, in order to develop computational tools for manufacturing solid dosage forms. Among the four supervised neural networks investigated, the best learning performance was achieved by a feedforward multilayer perceptron whose architecture was composed of eight neurons in the input layer, sixteen neurons in the hidden layer and one neuron in the output layer. Learning and predictive performance for the angle of repose was poor, while the Carr index and Hausner ratio (CI and HR, respectively) showed very good fitting capacity and learning; HR and CI were therefore considered suitable descriptors for the next stage of development of supervised ANNs. Clustering capacity was evaluated for five unsupervised strategies. Networks based on purely unsupervised competitive strategies, namely classic "Winner-Take-All", "Frequency-Sensitive Competitive Learning" and "Rival-Penalized Competitive Learning" (WTA, FSCL and RPCL, respectively), were able to cluster the database; however, this classification was very poor, showing severe classification errors by grouping data with conflicting properties into the same cluster or even the same neuron. Moreover, the criteria adopted by the neural network for this clustering could not be established. Self-Organizing Maps (SOM) and Neural Gas (NG) networks showed better clustering capacity. Both recognized the two major groupings of data corresponding to lactose (LAC) and cellulose (CEL). However, SOM showed some errors in classifying data from the minority excipients magnesium stearate (EMG), talc (TLC) and attapulgite (ATP). The NG network in turn performed a very consistent classification of the data and resolved the misclassifications of SOM, making it the most appropriate network for classifying the data of the study.
The use of NG networks in pharmaceutical technology had not previously been reported. NG therefore has great potential for use in the development of software for automated classification systems for pharmaceutical powders and as a new tool for mining and clustering data in drug development.
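For readers unfamiliar with the two flow descriptors retained above, both follow from the bulk and tapped densities of a powder: HR = tapped / bulk and CI = 100 × (tapped − bulk) / tapped. The sketch below uses these standard definitions; the density values and function names are hypothetical and do not come from the study.

```python
def hausner_ratio(bulk_density: float, tapped_density: float) -> float:
    """Hausner ratio: tapped density divided by bulk density."""
    return tapped_density / bulk_density

def carr_index(bulk_density: float, tapped_density: float) -> float:
    """Carr (compressibility) index in percent."""
    return 100.0 * (tapped_density - bulk_density) / tapped_density

# Hypothetical densities in g/mL for an illustrative powder.
bulk, tapped = 0.50, 0.625
hr = hausner_ratio(bulk, tapped)
ci = carr_index(bulk, tapped)
print(f"HR = {hr:.2f}, CI = {ci:.1f}%")
```

The two quantities carry the same information, since CI = 100 × (1 − 1/HR), which is consistent with both behaving similarly as network targets.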