10 resultados para means clustering
em Universidad Politécnica de Madrid
Resumo:
Salamanca has been considered among the most polluted cities in Mexico. The vehicular park, the industry and the emissions produced by agriculture, as well as orography and climatic characteristics have propitiated the increment in pollutant concentration of Particulate Matter less than 10 μg/m3 in diameter (PM10). In this work, a Multilayer Perceptron Neural Network has been used to make the prediction of an hour ahead of pollutant concentration. A database used to train the Neural Network corresponds to historical time series of meteorological variables (wind speed, wind direction, temperature and relative humidity) and air pollutant concentrations of PM10. Before the prediction, Fuzzy c-Means clustering algorithm have been implemented in order to find relationship among pollutant and meteorological variables. These relationship help us to get additional information that will be used for predicting. Our experiments with the proposed system show the importance of this set of meteorological variables on the prediction of PM10 pollutant concentrations and the neural network efficiency. The performance estimation is determined using the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE). The results shown that the information obtained in the clustering step allows a prediction of an hour ahead, with data from past 2 hours
Resumo:
In this work we propose an image acquisition and processing methodology (framework) developed for performance in-field grapes and leaves detection and quantification, based on a six step methodology: 1) image segmentation through Fuzzy C-Means with Gustafson Kessel (FCM-GK) clustering; 2) obtaining of FCM-GK outputs (centroids) for acting as seeding for K-Means clustering; 3) Identification of the clusters generated by K-Means using a Support Vector Machine (SVM) classifier. 4) Performance of morphological operations over the grapes and leaves clusters in order to fill holes and to eliminate small pixels clusters; 5)Creation of a mosaic image by Scale-Invariant Feature Transform (SIFT) in order to avoid overlapping between images; 6) Calculation of the areas of leaves and grapes and finding of the centroids in the grape bunches. Image data are collected using a colour camera fixed to a mobile platform. This platform was developed to give a stabilized surface to guarantee that the images were acquired parallel to de vineyard rows. In this way, the platform avoids the distortion of the images that lead to poor estimation of the areas. Our preliminary results are promissory, although they still have shown that it is necessary to implement a camera stabilization system to avoid undesired camera movements, and also a parallel processing procedure in order to speed up the mosaicking process.
Resumo:
Salamanca, situated in center of Mexico is among the cities which suffer most from the air pollution in Mexico. The vehicular park and the industry, as well as orography and climatic characteristics have propitiated the increment in pollutant concentration of Sulphur Dioxide (SO2). In this work, a Multilayer Perceptron Neural Network has been used to make the prediction of an hour ahead of pollutant concentration. A database used to train the Neural Network corresponds to historical time series of meteorological variables and air pollutant concentrations of SO2. Before the prediction, Fuzzy c-Means and K-means clustering algorithms have been implemented in order to find relationship among pollutant and meteorological variables. Our experiments with the proposed system show the importance of this set of meteorological variables on the prediction of SO2 pollutant concentrations and the neural network efficiency. The performance estimation is determined using the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE). The results showed that the information obtained in the clustering step allows a prediction of an hour ahead, with data from past 2 hours.
Resumo:
This work presents a method to detect Microcalcifications in Regions of Interest from digitized mammograms. The method is based mainly on the combination of Image Processing, Pattern Recognition and Artificial Intelligence. The Top-Hat transform is a technique based on mathematical morphology operations that, in this work is used to perform contrast enhancement of microcalcifications in the region of interest. In order to find more or less homogeneous regions in the image, we apply a novel image sub-segmentation technique based on Possibilistic Fuzzy c-Means clustering algorithm. From the original region of interest we extract two window-based features, Mean and Deviation Standard, which will be used in a classifier based on a Artificial Neural Network in order to identify microcalcifications. Our results show that the proposed method is a good alternative in the stage of microcalcifications detection, because this stage is an important part of the early Breast Cancer detection
Resumo:
In the last few years there has been a heightened interest in data treatment and analysis with the aim of discovering hidden knowledge and eliciting relationships and patterns within this data. Data mining techniques (also known as Knowledge Discovery in Databases) have been applied over a wide range of fields such as marketing, investment, fraud detection, manufacturing, telecommunications and health. In this study, well-known data mining techniques such as artificial neural networks (ANN), genetic programming (GP), forward selection linear regression (LR) and k-means clustering techniques, are proposed to the health and sports community in order to aid with resistance training prescription. Appropriate resistance training prescription is effective for developing fitness, health and for enhancing general quality of life. Resistance exercise intensity is commonly prescribed as a percent of the one repetition maximum. 1RM, dynamic muscular strength, one repetition maximum or one execution maximum, is operationally defined as the heaviest load that can be moved over a specific range of motion, one time and with correct performance. The safety of the 1RM assessment has been questioned as such an enormous effort may lead to muscular injury. Prediction equations could help to tackle the problem of predicting the 1RM from submaximal loads, in order to avoid or at least, reduce the associated risks. We built different models from data on 30 men who performed up to 5 sets to exhaustion at different percentages of the 1RM in the bench press action, until reaching their actual 1RM. Also, a comparison of different existing prediction equations is carried out. The LR model seems to outperform the ANN and GP models for the 1RM prediction in the range between 1 and 10 repetitions. At 75% of the 1RM some subjects (n = 5) could perform 13 repetitions with proper technique in the bench press action, whilst other subjects (n = 20) performed statistically significant (p < 0:05) more repetitions at 70% than at 75% of their actual 1RM in the bench press action. Rate of perceived exertion (RPE) seems not to be a good predictor for 1RM when all the sets are performed until exhaustion, as no significant differences (p < 0:05) were found in the RPE at 75%, 80% and 90% of the 1RM. Also, years of experience and weekly hours of strength training are better correlated to 1RM (p < 0:05) than body weight. O'Connor et al. 1RM prediction equation seems to arise from the data gathered and seems to be the most accurate 1RM prediction equation from those proposed in literature and used in this study. Epley's 1RM prediction equation is reproduced by means of data simulation from 1RM literature equations. Finally, future lines of research are proposed related to the problem of the 1RM prediction by means of genetic algorithms, neural networks and clustering techniques. RESUMEN En los últimos años ha habido un creciente interés en el tratamiento y análisis de datos con el propósito de descubrir relaciones, patrones y conocimiento oculto en los mismos. Las técnicas de data mining (también llamadas de \Descubrimiento de conocimiento en bases de datos\) se han aplicado consistentemente a lo gran de un gran espectro de áreas como el marketing, inversiones, detección de fraude, producción industrial, telecomunicaciones y salud. En este estudio, técnicas bien conocidas de data mining como las redes neuronales artificiales (ANN), programación genética (GP), regresión lineal con selección hacia adelante (LR) y la técnica de clustering k-means, se proponen a la comunidad del deporte y la salud con el objetivo de ayudar con la prescripción del entrenamiento de fuerza. Una apropiada prescripción de entrenamiento de fuerza es efectiva no solo para mejorar el estado de forma general, sino para mejorar la salud e incrementar la calidad de vida. La intensidad en un ejercicio de fuerza se prescribe generalmente como un porcentaje de la repetición máxima. 1RM, fuerza muscular dinámica, una repetición máxima o una ejecución máxima, se define operacionalmente como la carga máxima que puede ser movida en un rango de movimiento específico, una vez y con una técnica correcta. La seguridad de las pruebas de 1RM ha sido cuestionada debido a que el gran esfuerzo requerido para llevarlas a cabo puede derivar en serias lesiones musculares. Las ecuaciones predictivas pueden ayudar a atajar el problema de la predicción de la 1RM con cargas sub-máximas y son empleadas con el propósito de eliminar o al menos, reducir los riesgos asociados. En este estudio, se construyeron distintos modelos a partir de los datos recogidos de 30 hombres que realizaron hasta 5 series al fallo en el ejercicio press de banca a distintos porcentajes de la 1RM, hasta llegar a su 1RM real. También se muestra una comparación de algunas de las distintas ecuaciones de predicción propuestas con anterioridad. El modelo LR parece superar a los modelos ANN y GP para la predicción de la 1RM entre 1 y 10 repeticiones. Al 75% de la 1RM algunos sujetos (n = 5) pudieron realizar 13 repeticiones con una técnica apropiada en el ejercicio press de banca, mientras que otros (n = 20) realizaron significativamente (p < 0:05) más repeticiones al 70% que al 75% de su 1RM en el press de banca. El ínndice de esfuerzo percibido (RPE) parece no ser un buen predictor del 1RM cuando todas las series se realizan al fallo, puesto que no existen diferencias signifiativas (p < 0:05) en el RPE al 75%, 80% y el 90% de la 1RM. Además, los años de experiencia y las horas semanales dedicadas al entrenamiento de fuerza están más correlacionadas con la 1RM (p < 0:05) que el peso corporal. La ecuación de O'Connor et al. parece surgir de los datos recogidos y parece ser la ecuación de predicción de 1RM más precisa de aquellas propuestas en la literatura y empleadas en este estudio. La ecuación de predicción de la 1RM de Epley es reproducida mediante simulación de datos a partir de algunas ecuaciones de predicción de la 1RM propuestas con anterioridad. Finalmente, se proponen futuras líneas de investigación relacionadas con el problema de la predicción de la 1RM mediante algoritmos genéticos, redes neuronales y técnicas de clustering.
Resumo:
On-line partial discharge (PD) measurements have become a common technique for assessing the insulation condition of installed high voltage (HV) insulated cables. When on-line tests are performed in noisy environments, or when more than one source of pulse-shaped signals are present in a cable system, it is difficult to perform accurate diagnoses. In these cases, an adequate selection of the non-conventional measuring technique and the implementation of effective signal processing tools are essential for a correct evaluation of the insulation degradation. Once a specific noise rejection filter is applied, many signals can be identified as potential PD pulses, therefore, a classification tool to discriminate the PD sources involved is required. This paper proposes an efficient method for the classification of PD signals and pulse-type noise interferences measured in power cables with HFCT sensors. By using a signal feature generation algorithm, representative parameters associated to the waveform of each pulse acquired are calculated so that they can be separated in different clusters. The efficiency of the clustering technique proposed is demonstrated through an example with three different PD sources and several pulse-shaped interferences measured simultaneously in a cable system with a high frequency current transformer (HFCT).
Resumo:
Cereals microstructure is one of the primary quality attributes of cereals. Cereals rehydration and milk diffusion depends on such microstructure and thus, the crispiness and the texture, which will make it more palatable for the final consumer. Magnetic Resonance Imaging (MRI) is a very powerful topographic tool since acquisition parameter leads to a wide possibility for identifying textures, structures and liquids mobility. It is suited for non-invasive imaging of water and fats. Rehydration and diffusion cereals processes were measured by MRI at different times and using two different kinds of milk, varying their fat level. Several images were obtained. A combination of textural analysis (based on the analysis of histograms) and segmentation methods (in order to understand the rehydration level of each variety of cereals) were performed. According to the rehydration level, no advisable clustering behavior was found. Nevertheless, some differences were noticeable between the coating, the type of milk and the variety of cereals
Resumo:
A new method for detecting microcalcifications in regions of interest (ROIs) extracted from digitized mammograms is proposed. The top-hat transform is a technique based on mathematical morphology operations and, in this paper, is used to perform contrast enhancement of the mi-crocalcifications. To improve microcalcification detection, a novel image sub-segmentation approach based on the possibilistic fuzzy c-means algorithm is used. From the original ROIs, window-based features, such as the mean and standard deviation, were extracted; these features were used as an input vector in a classifier. The classifier is based on an artificial neural network to identify patterns belonging to microcalcifications and healthy tissue. Our results show that the proposed method is a good alternative for automatically detecting microcalcifications, because this stage is an important part of early breast cancer detection
Resumo:
Industrial applications of computer vision sometimes require detection of atypical objects that occur as small groups of pixels in digital images. These objects are difficult to single out because they are small and randomly distributed. In this work we propose an image segmentation method using the novel Ant System-based Clustering Algorithm (ASCA). ASCA models the foraging behaviour of ants, which move through the data space searching for high data-density regions, and leave pheromone trails on their path. The pheromone map is used to identify the exact number of clusters, and assign the pixels to these clusters using the pheromone gradient. We applied ASCA to detection of microcalcifications in digital mammograms and compared its performance with state-of-the-art clustering algorithms such as 1D Self-Organizing Map, k-Means, Fuzzy c-Means and Possibilistic Fuzzy c-Means. The main advantage of ASCA is that the number of clusters needs not to be known a priori. The experimental results show that ASCA is more efficient than the other algorithms in detecting small clusters of atypical data.
Resumo:
Recent advances in non-destructive imaging techniques, such as X-ray computed tomography (CT), make it possible to analyse pore space features from the direct visualisation from soil structures. A quantitative characterisation of the three-dimensional solid-pore architecture is important to understand soil mechanics, as they relate to the control of biological, chemical, and physical processes across scales. This analysis technique therefore offers an opportunity to better interpret soil strata, as new and relevant information can be obtained. In this work, we propose an approach to automatically identify the pore structure of a set of 200-2D images that represent slices of an original 3D CT image of a soil sample, which can be accomplished through non-linear enhancement of the pixel grey levels and an image segmentation based on a PFCM (Possibilistic Fuzzy C-Means) algorithm. Once the solids and pore spaces have been identified, the set of 200-2D images is then used to reconstruct an approximation of the soil sample by projecting only the pore spaces. This reconstruction shows the structure of the soil and its pores, which become more bounded, less bounded, or unbounded with changes in depth. If the soil sample image quality is sufficiently favourable in terms of contrast, noise and sharpness, the pore identification is less complicated, and the PFCM clustering algorithm can be used without additional processing; otherwise, images require pre-processing before using this algorithm. Promising results were obtained with four soil samples, the first of which was used to show the algorithm validity and the additional three were used to demonstrate the robustness of our proposal. The methodology we present here can better detect the solid soil and pore spaces on CT images, enabling the generation of better 2D?3D representations of pore structures from segmented 2D images.