928 results for K-Means Cluster


Relevance:

90.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

90.00%

Publisher:

Abstract:

Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)

Relevance:

90.00%

Publisher:

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)

Relevance:

90.00%

Publisher:

Abstract:

The objective of this work was to characterize, through physicochemical parameters, honey from the Campos do Jordão microregion and to verify how samples group according to the seasonality of production (summer and winter). Thirty honey samples were assessed from beekeepers located in the cities of Monteiro Lobato, Campos do Jordão, Santo Antonio do Pinhal, and São Bento do Sapucaí-SP, covering both honey production periods (November to February and July to September, during 2007 and 2008; n = 30). Samples were submitted to physicochemical analysis of total acidity, pH, moisture, water activity, density, amino acids, ash, color, and electrical conductivity, identifying physicochemical standards of honey samples from both production periods. Next, a cluster analysis of the data was carried out using the k-means algorithm, which grouped the samples into two classes (summer and winter). An Artificial Neural Network (ANN) was then trained in a supervised manner using the backpropagation algorithm. According to the analysis, the ANN classified the samples with 80% accuracy. The ANN proved an effective tool for grouping honey samples from the Campos do Jordão region according to their physicochemical characteristics, depending on the production period.

Relevance:

90.00%

Publisher:

Abstract:

Atmospheric aerosol particles affect humans and the environment in many ways. A precise characterization of the particles helps to understand their effects and to assess the consequences. Particles can be characterized by their size, shape, and chemical composition. Laser ablation mass spectrometry makes it possible to determine the size and chemical composition of individual aerosol particles. In this work, the SPLAT (Single Particle Laser Ablation Time-of-flight mass spectrometer) was further developed to improve the analysis of atmospheric aerosol particles in particular. The aerosol inlet was optimized to transfer the widest possible particle size range (80 nm - 3 µm) into the SPLAT and focus it into a narrow beam. A new description of the relationship between particle size and particle velocity in vacuum was found. Alignment of the inlet was automated using stepper motors. The optical detection of the particles was improved so that particles smaller than 100 nm can be detected. Building on the optical detection and the automated tilting of the inlet, a new method for characterizing the particle beam was developed. The control electronics of the SPLAT were improved so that the maximum analysis frequency is limited only by the ablation laser, which can fire at about 10 Hz at most. By optimizing the vacuum system, ion loss in the mass spectrometer was reduced by a factor of 4.
In addition to the hardware developments of the SPLAT, a large part of this work consisted of designing and implementing a software solution for analyzing the raw data acquired with the SPLAT. CRISP (Concise Retrieval of Information from Single Particles) is a software package built on IGOR PRO (Wavemetrics, USA) that allows efficient evaluation of single-particle raw data. CRISP contains a newly developed algorithm for automatic mass calibration of each individual mass spectrum, including suppression of noise and of problems with signals that exhibit intense tailing. CRISP provides methods for automatic classification of the particles. Implemented are k-means, fuzzy c-means, and a form of hierarchical classification based on a minimum spanning tree. CRISP offers the option of pre-processing the data so that the automatic classification runs faster and the results are of higher quality. In addition, CRISP can easily sort particles according to predefined criteria. The data structure and infrastructure underlying CRISP were designed with maintainability and extensibility in mind.
In the course of this work, the SPLAT was deployed successfully in several campaigns, and the capabilities of CRISP were demonstrated on the data sets obtained.
The SPLAT can now be operated efficiently in the field to characterize atmospheric aerosol, while CRISP enables fast and targeted evaluation of the data.

Relevance:

90.00%

Publisher:

Abstract:

A non-hierarchical K-means algorithm is used to cluster 47 years (1960–2006) of 10-day HYSPLIT backward trajectories to the Pico Mountain (PM) observatory on a seasonal basis. The resulting cluster centers identify the major transport pathways and collectively comprise a long-term climatology of transport to the observatory. The transport climatology improves our ability to interpret the observations made there and our understanding of pollution source regions for the station and the central North Atlantic region. I determine which pathways dominate transport to the observatory and examine the impacts of these transport patterns on the O3, NOy, NOx, and CO measurements made there during 2001–2006. Transport from the U.S., Canada, and the Atlantic most frequently reaches the station, but Europe, east Africa, and the Pacific can also contribute significantly depending on the season. Transport from Canada was correlated with the North Atlantic Oscillation (NAO) in spring and winter, and transport from the Pacific was uncorrelated with the NAO. The highest CO and O3 are observed during spring. Summer is also characterized by high CO and O3 and the highest NOy and NOx of any season. Previous studies at the station attributed the summertime high CO and O3 to transport of boreal wildfire emissions (for 2002–2004), and boreal fires continued to affect the station during 2005 and 2006. The particle dispersion model FLEXPART was used to calculate anthropogenic and biomass-burning CO tracer values at the station in an attempt to identify the regions responsible for the high CO and O3 observations during spring and the biomass-burning impacts in summer.

Relevance:

90.00%

Publisher:

Abstract:

The primary goal of this project is to demonstrate the practical use of data mining algorithms to cluster a solved steady-state computational fluid dynamics (CFD) flow domain into a simplified lumped-parameter network. A commercial-quality code, “cfdMine”, was created using volume-weighted k-means clustering that can cluster a 20 million cell CFD domain on a single CPU in several hours or less. Additionally, agglomeration and k-means Mahalanobis were added as optional post-processing steps to further enhance the separation of the clusters. The resultant nodal network is considered a reduced-order model and can be solved transiently at minimal computational cost. The reduced-order network is then instantiated in the commercial thermal solver MuSES to perform transient conjugate heat transfer, with convection predicted by the lumped network (based on steady-state CFD). When the lumped nodal network is inserted into a MuSES model, the potential for developing a “localized heat transfer coefficient” is shown to be an improvement over existing techniques. The clustering was also found to provide a new flow visualization technique. Finally, fixing clusters near equipment demonstrates a new capability to track temperatures near specific objects (such as equipment in vehicles).
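
As a rough illustration of what "volume-weighted" k-means could mean, the sketch below runs the usual assign-and-update loop but computes each centroid as a weighted mean, so that large cells pull the centroid harder than small ones. This is not the cfdMine implementation, which the abstract does not describe. The function name `weighted_kmeans`, its parameters, and the toy data are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_kmeans(
    points: Sequence[Point],
    weights: Sequence[float],
    init_centroids: Sequence[Point],
    iterations: int = 20,
) -> Tuple[List[Point], List[int]]:
    """Hypothetical weighted k-means: weights stand in for cell volumes."""
    centroids = [tuple(c) for c in init_centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        for i, p in enumerate(points):
            labels[i] = min(
                range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]),
            )
        # Update step: each centroid becomes the weighted mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            total = sum(w for w, lab in zip(weights, labels) if lab == j)
            if total == 0:
                new_centroids.append(old)  # keep an empty cluster where it was
                continue
            new_centroids.append(tuple(
                sum(w * p[d] for p, w, lab in zip(points, weights, labels) if lab == j) / total
                for d in range(len(old))
            ))
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, labels


# Toy 1-D example: the point at 1.0 carries three times the weight of the others.
centroids, labels = weighted_kmeans(
    points=[(0.0,), (1.0,), (10.0,), (11.0,)],
    weights=[1, 3, 1, 1],
    init_centroids=[(0.0,), (10.0,)],
)
```

In the toy example the first centroid settles at 0.75 rather than the unweighted midpoint 0.5, which is the effect the weighting is meant to produce.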

Relevance:

90.00%

Publisher:

Abstract:

Salamanca, situated in central Mexico, is among the Mexican cities that suffer most from air pollution. The vehicle fleet and industry, together with the orography and climatic characteristics, have driven an increase in the concentration of sulphur dioxide (SO2). In this work, a Multilayer Perceptron neural network is used to predict pollutant concentration one hour ahead. The database used to train the neural network consists of historical time series of meteorological variables and SO2 concentrations. Before the prediction, Fuzzy c-Means and K-means clustering algorithms were applied to find relationships between pollutant and meteorological variables. Our experiments with the proposed system show the importance of this set of meteorological variables for predicting SO2 concentrations, as well as the efficiency of the neural network. Performance is estimated using the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE). The results show that the information obtained in the clustering step allows a prediction one hour ahead using data from the past 2 hours.
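
For reference, the two error measures named in the abstract can be computed as in the minimal sketch below. The function names `rmse` and `mae` and the sample values are illustrative assumptions and do not come from the paper.

```python
import math
from typing import Sequence


def _check(actual: Sequence[float], predicted: Sequence[float]) -> None:
    """Reject empty or mismatched series before computing an error measure."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("series must be non-empty and of equal length")


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root Mean Square Error: square root of the mean squared difference."""
    _check(actual, predicted)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Error: mean of the absolute differences."""
    _check(actual, predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up hourly concentrations and one-hour-ahead predictions.
observed = [1.0, 2.0, 3.0]
forecast = [1.0, 2.0, 6.0]
print(rmse(observed, forecast), mae(observed, forecast))
```

Because RMSE squares each error before averaging, a single large miss raises it more than it raises MAE, which is one reason the two are often reported together.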

Relevance:

90.00%

Publisher:

Abstract:

We use quantitative X-ray diffraction to determine the mineralogy of late Quaternary marine sediments from the West and East Greenland shelves offshore from early Tertiary basalt outcrops. Despite the similar basalt outcrop area (60 000-70 000 km²), there are significant differences between East and West Greenland sediments in the fraction of minerals (e.g. pyroxene) sourced from the basalt outcrops. We demonstrate the differences in the mineralogy between East and West Greenland marine sediments on three scales: (1) modern day, (2) late Quaternary inputs and (3) detailed down-core variations in 10 cores from the two margins. On the East Greenland Shelf (EGS), late Quaternary samples have an average quartz weight per cent of 6.2 ± 2.3 versus 12.8 ± 3.9 from the West Greenland Shelf (WGS), and 12.02 ± 4.8 versus 1.9 ± 2.3 wt% for pyroxene. K-means clustering indicated only 9% of the samples did not fit a simple EGS vs. WGS dichotomy. Sediments from the EGS and WGS are also isotopically distinct, with the EGS having higher εNd (-18 to 4) than those from the WGS (εNd = -25 to -35). We attribute the striking dichotomy in sediment composition to fundamentally different long-term Quaternary styles of glaciation on the two basalt outcrops.

Relevance:

90.00%

Publisher:

Abstract:

In this paper we present an efficient k-means clustering algorithm for two-dimensional data. The proposed algorithm reorganizes the dataset into a nested binary tree. At each node, data items are compared with only the two nearest means with respect to each dimension and assigned to the cluster with the closer mean. The main idea is as follows: we build the nested binary tree, then scan the data in raster order by in-order traversal of the tree, and finally compare the data item at each node with only its two nearest means to assign it to the intended cluster. In this way we significantly reduce the computational cost, both by reducing the number of comparisons with means and by minimizing use of the Euclidean distance formula. Our results show that the method performs clustering much faster than the classical algorithms. © Springer-Verlag Berlin Heidelberg 2005
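
The abstract does not give enough detail to reproduce the nested binary tree, so the sketch below only illustrates the underlying idea in one dimension. When the means are kept sorted, the nearest mean to a value must be one of its two neighbours in that order, so a binary search followed by a single comparison replaces a scan over all k means. The function name `nearest_mean_index` and the sample means are assumptions for illustration, not taken from the paper.

```python
from bisect import bisect_left
from typing import Sequence


def nearest_mean_index(value: float, sorted_means: Sequence[float]) -> int:
    """Return the index of the mean closest to value.

    sorted_means must be in ascending order. Only the two means that
    bracket value are compared; ties go to the lower mean.
    """
    if not sorted_means:
        raise ValueError("at least one mean is required")
    pos = bisect_left(sorted_means, value)
    if pos == 0:
        return 0  # value lies at or below the smallest mean
    if pos == len(sorted_means):
        return len(sorted_means) - 1  # value lies above the largest mean
    left, right = sorted_means[pos - 1], sorted_means[pos]
    # No squared Euclidean distance is needed in 1-D, only two subtractions.
    return pos - 1 if value - left <= right - value else pos


# Hypothetical cluster means along one axis.
means = [0.0, 10.0, 20.0]
assignments = [nearest_mean_index(v, means) for v in (4.0, 6.0, 25.0)]
```

This replaces the k distance evaluations per item of a plain assignment step with a logarithmic search and one comparison, at the cost of keeping the means sorted after every update.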

Relevance:

90.00%

Publisher:

Abstract:

The aim of the present study was to trace the mortality profile of the elderly in Brazil using two age groups: 60 to 69 years (young-old) and 80 years or more (oldest-old). To do this, we sought to characterize the trend and distinctions of different mortality profiles, as well as the quality of the data and associations with socioeconomic and sanitary conditions in the micro-regions of Brazil. Data were collected from the Mortality Information System (SIM) and the Brazilian Institute of Geography and Statistics (IBGE). Based on these data, the coefficients of mortality were calculated for the chapters of the International Classification of Diseases (ICD-10). A polynomial regression model was used to ascertain the trend of the main chapters. Non-hierarchical cluster analysis (K-Means) was used to obtain the profiles for different Brazilian micro-regions. Factor analysis of the contextual variables was used to obtain the socio-economic and sanitary deprivation indices (IPSS). The trend of the CMId and of the ratio of its values in the two age groups confirmed a decrease in most of the indicators, particularly for ill-defined causes among the oldest-old. Among the young-old, the following profiles emerged: the Development Profile; the Modernity Profile; the Epidemiological Paradox Profile and the Ignorance Profile. Among the oldest-old, the latter three profiles were confirmed, in addition to the Low Mortality Rates Profile. When comparing the mean IPSS values in global terms, all of the groups were different in both of the age groups. The Ignorance Profile was compared with the other profiles using orthogonal contrasts. This profile differed from all of the others in isolation and in clusters. However, the mean IPSS was similar for the Low Mortality Rates Profile among the oldest-old.
Furthermore, associations were found between the data quality indicators, the CMId for ill-defined causes, the general coefficient of mortality for each age group (CGMId) and the IPSS of the micro-regions. The worst rates were recorded in areas with the greatest socioeconomic and sanitary deprivation. The findings of the present study show that, despite the decrease in the mortality coefficients, there are notable differences in the profiles related to contextual conditions, including regional differences in data quality. These differences increase the vulnerability of the age groups studied and the health inequities that are already present.

Relevance:

90.00%

Publisher:

Abstract:

Understanding the impact of atmospheric black carbon (BC) containing particles on human health and radiative forcing requires knowledge of the mixing state of BC, including the characteristics of the materials with which it is internally mixed. In this study, we demonstrate for the first time the capabilities of the Aerodyne Soot-Particle Aerosol Mass Spectrometer equipped with a light scattering module (LS-SP-AMS) to examine the mixing state of refractory BC (rBC) and other aerosol components in an urban environment (downtown Toronto). K-means clustering analysis was used to classify single particle mass spectra into chemically distinct groups. One resultant cluster is dominated by rBC mass spectral signals (C1+ to C5+) while the organic signals fall into a few major clusters, identified as hydrocarbon-like organic aerosol (HOA), oxygenated organic aerosol (OOA), and cooking emission organic aerosol (COA). A nearly external mixing is observed with small BC particles only thinly coated by HOA ( 28% by mass on average), while over 90% of the HOA-rich particles did not contain detectable amounts of rBC. Most of the particles classified into other inorganic and organic clusters were not significantly associated with BC. The single particle results also suggest that HOA and COA emitted from anthropogenic sources were likely major contributors to organic-rich particles with low to mid-range aerodynamic diameter (dva). The similar temporal profiles and mass spectral features of the organic clusters and the factors from a positive matrix factorization (PMF) analysis of the ensemble aerosol dataset validate the conventional interpretation of the PMF results.

Relevance:

90.00%

Publisher:

Abstract:

Continental margin sediments of SE South America originate from various terrestrial sources, each conveying specific magnetic and element signatures. Here, we aim to identify the sources and transport characteristics of shelf and slope sediments deposited between East Brazil and Patagonia (20°-48°S) using enviromagnetic, major element, and grain-size data. A set of five source-indicative parameters (i.e., chi-fd%, ARM/IRM, S0.3T, SIRM/Fe and Fe/K) of 25 surface samples (16-1805 m water depth) was analyzed by fuzzy c-means clustering and non-linear mapping to depict and unmix sediment-province characteristics. This multivariate approach yields three regionally coherent sediment provinces with petrologically and climatically distinct source regions. The southernmost province is entirely restricted to the slope off the Argentinean Pampas and has been identified as relict Andean-sourced sands with coarse unaltered magnetite. The direct transport to the slope was enabled by Rio Colorado and Rio Negro meltwaters during glacial and deglacial phases of low sea level. The adjacent shelf province consists of coastal loessoidal sands (highest hematite and goethite proportions) delivered from the Argentinean Pampas by wave erosion and westerly winds. The northernmost province includes the Plata mudbelt and Rio Grande Cone. It contains tropically weathered clayey silts from the La Plata Drainage Basin with pronounced proportions of fine magnetite, which were distributed up to ~24° S by the Brazilian Coastal Current and admixed to coarser relict sediments of Pampean loessoidal origin. Grain-size analyses of all samples showed that sediment fractionation during transport and deposition had little impact on magnetic and element source characteristics. This study corroborates the high potential of the chosen approach to assess sediment origin in regions with contrasting sediment sources, complex transport dynamics, and large grain-size variability.

Relevance:

90.00%

Publisher:

Abstract:

This work proposes a new hybrid system for multi-class sentiment analysis based on the General Inquirer (GI) dictionary and a hierarchical arrangement of Logistic Model Tree (LMT) classifiers. The system comprises three layers. The Bipolar Layer (BL) consists of one LMT (LMT-1) that classifies sentiment polarity, while the second layer, the Intensity Layer (IL), comprises two LMTs (LMT-2 and LMT-3) that separately detect three intensities of positive sentiment and three intensities of negative sentiment. During the construction phase only, the Grouping Layer (GL) is used to cluster the positive and negative instances by means of two k-means runs, one for each. In the pre-processing phase, texts are segmented into words, which are tagged, stemmed, and finally passed to the GI dictionary in order to count and label only the verbs, nouns, adjectives, and adverbs with 24 markers, which are then used to compute the feature vectors. In the sentiment classification phase, the feature vectors are first fed to LMT-1, then clustered in the GL according to class label; the resulting groups are labeled manually, and finally the positive instances are fed to LMT-2 and the negative instances to LMT-3. The three trees are trained and evaluated on the Movie Review and SenTube datasets with stratified 10-fold cross-validation. LMT-1 yields a tree with 48 leaves and size 95, with 90.88% accuracy, while LMT-2 and LMT-3 each yield a tree with one leaf and size one, with 99.28% and 99.37% accuracy, respectively. The experiments show that the proposed hierarchical classification methodology performs better than other prevailing approaches.