40 results for CLUSTER ANALYSIS

in CentAUR: Central Archive University of Reading - UK


Relevance:

100.00%

Publisher:

Abstract:

A first step in interpreting the wide variation in trace gas concentrations measured over time at a given site is to classify the data according to the prevailing weather conditions. In order to classify measurements made during two intensive field campaigns at Mace Head, on the west coast of Ireland, an objective method of assigning data to different weather types has been developed. Air-mass back trajectories calculated using winds from ECMWF analyses, arriving at the site in 1995–1997, were allocated to clusters based on a statistical analysis of the latitude, longitude and pressure of the trajectory at 12 h intervals over 5 days. The robustness of the analysis was assessed by using an ensemble of back trajectories calculated for four points around Mace Head. Separate analyses were made for each of the 3 years, and for four 3-month periods. The use of these clusters in classifying ground-based ozone measurements at Mace Head is described, including the need to exclude data which have been influenced by local perturbations to the regional flow pattern, for example, by sea breezes. Even with a limited data set, based on 2 months of intensive field measurements in 1996 and 1997, there are statistically significant differences in ozone concentrations in air from the different clusters. The limitations of this type of analysis for classification and interpretation of ground-based chemistry measurements are discussed.
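The allocation of back trajectories to clusters can be sketched roughly as follows. The abstract does not name the clustering algorithm, so k-means with a deterministic farthest-point initialisation is used here purely as an assumption; the trajectory values, the three time steps and the function names are all hypothetical. Each trajectory is flattened into one vector of latitude, longitude and pressure values, and the columns are standardised so that degrees and hPa carry comparable weight.

```python
import math


def standardise(rows):
    """Scale every column to zero mean and unit variance (constant columns are left at zero)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, iterations=20):
    """Plain k-means; assumed here, not taken from the paper."""
    # Farthest-point initialisation: start at the first point, then keep
    # adding the point that is farthest from the centroids chosen so far.
    centroids = [list(points[0])]
    while len(centroids) < k:
        centroids.append(list(max(
            points, key=lambda p: min(math.dist(p, c) for c in centroids))))
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Synthetic trajectories: (lat, lon, pressure in hPa) at 0, 12 and 24 h before arrival.
westerly = [[53.3, -9.9, 1000, 52.0 + d, -20.0 - d, 950, 50.0 + d, -32.0 - d, 900]
            for d in (0.0, 0.5, 1.0)]
easterly = [[53.3, -9.9, 1000, 53.0 + d, -2.0 + d, 980, 52.0 + d, 6.0 + d, 960]
            for d in (0.0, 0.5, 1.0)]
labels, _ = kmeans(standardise(westerly + easterly), k=2)
print(labels)
```

In a real analysis the vectors would hold every 12 h position over the full 5 days, and the resulting labels would then be used to group the ozone measurements by air-mass type.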

Relevance:

100.00%

Publisher:

Abstract:

The overall operation and internal complexity of a particular piece of production machinery can be depicted in terms of clusters of multidimensional points which describe the process states, the value in each point dimension representing a measured variable from the machinery. The paper describes a new cluster analysis technique for use with manufacturing processes, to illustrate how machine behaviour can be categorised and how regions of good and poor machine behaviour can be identified. The cluster algorithm presented is the novel mean-tracking algorithm, capable of locating N-dimensional clusters in a large data space in which a considerable amount of noise is present. Implementation of the algorithm on a real-world high-speed machinery application is described, with clusters being formed from machinery data to indicate machinery error regions and error-free regions. This analysis is seen to provide a promising step ahead in the field of multivariable control of manufacturing systems.

Relevance:

100.00%

Publisher:

Abstract:

This paper deals with the selection of centres for radial basis function (RBF) networks. A novel mean-tracking clustering algorithm is described as a way in which centres can be chosen based on a batch of collected data. A direct comparison is made between the mean-tracking algorithm and k-means clustering, and it is shown that mean-tracking clustering is significantly better in terms of achieving an RBF network which performs accurate function modelling.
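To make the idea of clustering-based centre selection concrete, the sketch below builds a small Gaussian RBF network whose centres come from k-means, i.e. the baseline method in the comparison. The mean-tracking algorithm is not specified in the abstract and is not reproduced here. The target function, the shared width of 1.0, the ridge term and all function names are assumptions made for illustration.

```python
import math


def kmeans_1d(xs, k, iterations=100):
    """Plain k-means on scalars, initialised at evenly spaced quantiles."""
    data = sorted(xs)
    centres = [data[(2 * j + 1) * len(data) // (2 * k)] for j in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for x in data:
            groups[min(range(k), key=lambda j: abs(x - centres[j]))].append(x)
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return centres


def gaussian(x, centre, width):
    return math.exp(-((x - centre) ** 2) / (2 * width ** 2))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_rbf(xs, ys, centres, width, ridge=1e-8):
    """Least-squares output weights from the lightly regularised normal equations."""
    phi = [[gaussian(x, c, width) for c in centres] for x in xs]
    k = len(centres)
    a = [[sum(row[i] * row[j] for row in phi) + (ridge if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(row[i] * y for row, y in zip(phi, ys)) for i in range(k)]
    return solve(a, b)


def predict(x, centres, weights, width):
    return sum(w * gaussian(x, c, width) for w, c in zip(weights, centres))


# Toy modelling task: approximate sin(x) on [0, 2*pi] from 100 samples.
xs = [2 * math.pi * i / 99 for i in range(100)]
ys = [math.sin(x) for x in xs]
centres = kmeans_1d(xs, k=8)
weights = fit_rbf(xs, ys, centres, width=1.0)
max_error = max(abs(predict(x, centres, weights, 1.0) - y) for x, y in zip(xs, ys))
print(centres, max_error)
```

Swapping `kmeans_1d` for another centre-selection routine while keeping `fit_rbf` fixed is one simple way to compare clustering methods by the modelling error they lead to.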

Relevance:

100.00%

Publisher:

Abstract:

This paper describes the novel use of cluster analysis in the field of industrial process control. The severe multivariable process problems encountered in manufacturing have often led to machine shutdowns, where the need for corrective actions arises in order to resume operation. Production faults which are caused by processes running in less efficient regions may be prevented or diagnosed using reasoning based on cluster analysis. Indeed, the internal complexity of production machinery may be depicted in clusters of multidimensional data points which characterise the manufacturing process. The application of a Mean-Tracking cluster algorithm (developed in Reading) to field data acquired from a high-speed machine will be discussed. The objective of such an application is to illustrate how machine behaviour can be studied, in particular how regions of erroneous and stable running behaviour can be identified.

Relevance:

100.00%

Publisher:

Abstract:

Boreal winter wind storm situations over Central Europe are investigated by means of an objective cluster analysis. Surface data from the NCEP-Reanalysis and ECHAM4/OPYC3-climate change GHG simulation (IS92a) are considered. To achieve an optimum separation of clusters of extreme storm conditions, 55 clusters of weather patterns are differentiated. To reduce the computational effort, a PCA is initially performed, leading to a data reduction of about 98 %. The clustering itself was computed on 3-day periods constructed with the first six PCs using the k-means clustering algorithm. The applied method enables an evaluation of the time evolution of the synoptic developments. The climate change signal is constructed by a projection of the GCM simulation on the EOFs obtained from the NCEP-Reanalysis. Consequently, the same clusters are obtained and frequency distributions can be compared. For Central Europe, four primary storm clusters are identified. These clusters account for almost 72 % of the historical extreme storm events but only 5 % of the total relative frequency. Moreover, they show a statistically significant signature in the associated wind fields over Europe. An increased frequency of Central European storm clusters is detected with enhanced GHG conditions, associated with an enhancement of the pressure gradient over Central Europe. Consequently, more intense wind events over Central Europe are expected. The presented algorithm will be highly valuable for the analysis of the large data volumes required for, e.g., multi-model ensemble analysis, particularly because of the enormous data reduction.
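The projection step can be illustrated with a minimal sketch. It assumes that the EOFs, the mean field and the cluster centroids have already been derived from the reference data set; the toy numbers below are hypothetical and use 4 grid points, 2 EOFs and 2 clusters rather than the six PCs, 3-day periods and 55 clusters of the study. New fields are projected onto the fixed EOFs and assigned to the nearest existing centroid, so cluster frequencies from two data sets can be compared directly.

```python
import math
from collections import Counter


def project(field, mean_field, eofs):
    """Principal components of one field: dot products of its anomaly with each EOF."""
    anomaly = [v - m for v, m in zip(field, mean_field)]
    return [sum(a * e for a, e in zip(anomaly, eof)) for eof in eofs]


def nearest_cluster(pcs, centroids):
    return min(range(len(centroids)), key=lambda j: math.dist(pcs, centroids[j]))


def cluster_frequencies(fields, mean_field, eofs, centroids):
    """Relative frequency of every cluster in a set of fields."""
    counts = Counter(nearest_cluster(project(f, mean_field, eofs), centroids)
                     for f in fields)
    return {j: counts[j] / len(fields) for j in range(len(centroids))}


# Hypothetical reference quantities (orthonormal EOFs, centroids in PC space).
mean_field = [1000.0, 1000.0, 1000.0, 1000.0]
eofs = [[0.5, 0.5, -0.5, -0.5], [0.5, -0.5, 0.5, -0.5]]
centroids = [[10.0, 0.0], [-10.0, 0.0]]

# Hypothetical pressure fields from a reference period and a scenario run.
reference = [[1005, 1005, 995, 995], [995, 995, 1005, 1005],
             [996, 994, 1004, 1006], [994, 996, 1006, 1004]]
scenario = [[1006, 1004, 994, 996], [1005, 1005, 995, 995],
            [1004, 1006, 996, 994], [995, 995, 1005, 1005]]

ref_freq = cluster_frequencies(reference, mean_field, eofs, centroids)
scen_freq = cluster_frequencies(scenario, mean_field, eofs, centroids)
print(ref_freq, scen_freq)
```

Because both data sets share the same EOFs and centroids, any difference between the two frequency dictionaries reflects a change in how often each pattern occurs rather than a change in the cluster definitions.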

Relevance:

100.00%

Publisher:

Abstract:

A realistic representation of the North Atlantic tropical cyclone tracks is crucial as it allows, for example, explaining potential changes in US landfalling systems. Here we present a tentative study, which examines the ability of recent climate models to represent North Atlantic tropical cyclone tracks. Tracks from two types of climate models are evaluated: explicit tracks are obtained from tropical cyclones simulated in regional or global climate models with moderate to high horizontal resolution (1° to 0.25°), and downscaled tracks are obtained using a downscaling technique with large-scale environmental fields from a subset of these models. For both configurations, tracks are objectively separated into four groups using a cluster technique, leading to a zonal and a meridional separation of the tracks. The meridional separation largely captures the separation between deep tropical and sub-tropical, hybrid or baroclinic cyclones, while the zonal separation segregates Gulf of Mexico and Cape Verde storms. The properties of the tracks’ seasonality, intensity and power dissipation index in each cluster are documented for both configurations. Our results show that except for the seasonality, the downscaled tracks better capture the observed characteristics of the clusters. We also use three different idealized scenarios to examine the possible future changes of tropical cyclone tracks under 1) warming sea surface temperature, 2) increasing carbon dioxide, and 3) a combination of the two. The response to each scenario is highly variable depending on the simulation considered. Finally, we examine the role of each cluster in these future changes and find no preponderant contribution of any single cluster over the others.

Relevance:

100.00%

Publisher:

Abstract:

Background: The validity of ensemble averaging on event-related potential (ERP) data has been questioned, due to its assumption that the ERP is identical across trials. Thus, there is a need for preliminary testing for cluster structure in the data. New method: We propose a complete pipeline for the cluster analysis of ERP data. To increase the signal-to-noise ratio (SNR) of the raw single-trials, we used a denoising method based on Empirical Mode Decomposition (EMD). Next, we used a bootstrap-based method to determine the number of clusters, through a measure called the Stability Index (SI). We then used a clustering algorithm based on a Genetic Algorithm (GA) to define initial cluster centroids for subsequent k-means clustering. Finally, we visualised the clustering results through a scheme based on Principal Component Analysis (PCA). Results: After validating the pipeline on simulated data, we tested it on data from two experiments – a P300 speller paradigm on a single subject and a language processing study on 25 subjects. Results revealed evidence for the existence of 6 clusters in one experimental condition from the language processing study. Further, a two-way chi-square test revealed an influence of subject on cluster membership.
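The final step, testing whether cluster membership depends on subject, can be sketched as a chi-square test on a subject-by-cluster contingency table. The counts below are hypothetical, and the function name is an assumption. Only the test statistic and degrees of freedom are computed, and the statistic would then be compared with a tabulated critical value (3.841 for 1 degree of freedom at the 5 % level).

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if subject and cluster were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical trial counts: rows are subjects, columns are clusters.
table = [[30, 10],
         [10, 30]]
statistic, dof = chi_square(table)
print(statistic, dof)
```

With these made-up counts every expected cell is 20, so the statistic is far above the 5 % critical value and the two subjects' trials would be judged to fall into the clusters in different proportions.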

Relevance:

100.00%

Publisher:

Abstract:

Flow in geophysical fluids is commonly summarized by coherent streams, for example conveyor belt flows in extratropical cyclones or jet streaks in the upper troposphere. Typically, parcel trajectories are calculated from the flow field and subjective thresholds are used to distinguish coherent streams of interest. This methodology contribution develops a more objective approach to distinguish coherent airstreams within extratropical cyclones. Agglomerative clustering is applied to trajectories along with a method to identify the optimal number of cluster classes. The methodology is applied to trajectories associated with the low-level jets of a well-studied extratropical cyclone. For computational efficiency, a constraint that trajectories must pass through these jet regions is applied prior to clustering; the partitioning into different airstreams is then performed by the agglomerative clustering. It is demonstrated that the methodology can identify the salient flow structures of cyclones: the warm and cold conveyor belts. A test focusing on the airstreams terminating at the tip of the bent-back front further demonstrates the success of the method in that it can distinguish fine-scale flow structure such as descending sting jet airstreams.
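A bare-bones version of agglomerative clustering on trajectories might look like the sketch below. Average linkage, the Euclidean distance, the fixed target number of clusters and the toy trajectories are all assumptions; the abstract states neither the linkage criterion nor the rule used to choose the optimal number of classes, and neither is reproduced here.

```python
import math


def agglomerate(points, n_clusters):
    """Repeatedly merge the two closest clusters (average linkage) until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        return sum(math.dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = [(linkage(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical trajectories: pressure (in units of 100 hPa) at three successive times.
ascending = [(9.5, 7.0, 4.0), (9.4, 6.8, 3.9), (9.6, 7.2, 4.2)]
descending = [(6.0, 7.5, 8.5), (6.1, 7.4, 8.6)]
clusters = agglomerate(ascending + descending, n_clusters=2)
print(clusters)
```

Here the three ascending parcels end up in one cluster and the two descending parcels in the other, which is the kind of airstream separation the method aims for on real trajectory sets.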

Relevance:

70.00%

Publisher:

Abstract:

Data from various stations having different measurement record periods between 1988 and 2007 are analyzed to investigate the surface ozone concentration, long-term trends, and seasonal changes in and around Ireland. Time series statistical analysis is performed on the monthly mean data using seasonal and trend decomposition procedures and the Box-Jenkins approach (autoregressive integrated moving average). In general, ozone concentrations in the Irish region are found to have a negative trend at all sites except at the coastal sites of Mace Head and Valentia. Data from the most polluted Dublin city site have shown a very strong negative trend of −0.33 ppb/yr with a 95% confidence limit of 0.17 ppb/yr (i.e., −0.33 ± 0.17) for the period 2002−2007, and for the site near the city of Cork, the trend is found to be −0.20 ± 0.11 ppb/yr over the same period. The negative trend for other sites is more pronounced when the data span is considered from around the year 2000 to 2007. Rural sites of Wexford and Monaghan have also shown a very strong negative trend of −0.99 ± 0.13 and −0.58 ± 0.12, respectively, for the period 2000−2007. Mace Head, a site that is representative of ozone changes in the air advected from the Atlantic to Europe in the marine planetary boundary layer, has shown a positive trend of about +0.16 ± 0.04 ppb per annum over the entire period 1988−2007, but this positive trend has reduced during recent years (e.g., in the period 2001−2007). Cluster analyses of back trajectories are performed for the stations having a long record of data, Mace Head and Lough Navar. For Mace Head, the northern and western clean air sectors have shown a similar positive trend (+0.17 ± 0.02 ppb/yr for the northern sector and +0.18 ± 0.02 ppb/yr for the western sector) for the whole period, but partial analysis for the clean western sector at Mace Head shows different trends during different time periods with a decrease in the positive trend since 1988 indicating a deceleration in the ozone trend for Atlantic air masses entering Europe.
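Trend figures of the form "slope ± 95 % limit" can be illustrated with a much simpler stand-in than the methods named in the abstract. The sketch below fits an ordinary least-squares line and reports 1.96 standard errors as the half-width. This is an assumption for illustration only: it is not the seasonal-trend decomposition or Box-Jenkins procedure used in the study, the normal-approximation factor 1.96 is optimistic for short records, and the series is synthetic.

```python
import math


def linear_trend(times, values):
    """Ordinary least-squares slope and an approximate 95 % half-width (1.96 standard errors)."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    slope = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values)) / sxx
    intercept = v_mean - slope * t_mean
    residual_ss = sum((v - (intercept + slope * t)) ** 2 for t, v in zip(times, values))
    std_error = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, 1.96 * std_error


# Synthetic annual means (ppb): a -0.5 ppb/yr trend plus a small alternating perturbation.
times = list(range(8))
values = [30.0 - 0.5 * t + (0.1 if t % 2 == 0 else -0.1) for t in times]
slope, half_width = linear_trend(times, values)
print(f"{slope:.2f} +/- {half_width:.2f} ppb/yr")
```

Applying the same function separately to the measurements assigned to each trajectory cluster would give per-sector trends analogous to the northern and western sector values quoted above.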

Relevance:

70.00%

Publisher:

Abstract:

The aim of this study was to determine whether geographical differences impact the composition of bacterial communities present in the airways of cystic fibrosis (CF) patients attending CF centers in the United States or United Kingdom. Thirty-eight patients were matched on the basis of clinical parameters into 19 pairs, each comprising one U.S. and one United Kingdom patient. Analysis was performed to determine what, if any, bacterial correlates could be identified. Two culture-independent strategies were used: terminal restriction fragment length polymorphism (T-RFLP) profiling and 16S rRNA clone sequencing. Overall, 73 different terminal restriction fragment lengths were detected, ranging from 2 to 10 for U.S. and 2 to 15 for United Kingdom patients. The statistical analysis of T-RFLP data indicated that patient pairing was successful and revealed substantial transatlantic similarities in the bacterial communities. A small number of bands was present in the vast majority of patients in both locations, indicating that these are species common to the CF lung. Clone sequence analysis also revealed that a number of species not traditionally associated with the CF lung were present in both sample groups. The species number per sample was similar, but differences in species presence were observed between sample groups. Cluster analysis revealed geographical differences in bacterial presence and relative species abundance. Overall, the U.S. samples clustered more tightly with each other than the United Kingdom samples did, which may reflect the lower diversity detected in the U.S. sample group. The impact of cross-infection and biogeography is considered, and the implications for treating CF lung infections are also discussed.

Relevance:

70.00%

Publisher:

Abstract:

Accurate monitoring of degradation levels in soils is essential in order to understand and achieve complete degradation of petroleum hydrocarbons in contaminated soils. We aimed to develop the use of multivariate methods for the monitoring of biodegradation of diesel in soils and to determine if diesel contaminated soils could be remediated to a chemical composition similar to that of an uncontaminated soil. An incubation experiment was set up with three contrasting soil types. Each soil was exposed to diesel at varying stages of degradation and then analysed for key hydrocarbons throughout 161 days of incubation. Hydrocarbon distributions were analysed by Principal Coordinate Analysis and similar samples grouped by cluster analysis. Variation and differences between samples were determined using permutational multivariate analysis of variance. It was found that all soils followed trajectories approaching the chemical composition of the unpolluted soil. Some contaminated soils were no longer significantly different from the uncontaminated soil after 161 days of incubation. The use of cluster analysis allows the assignment of a percentage chemical similarity of a diesel contaminated soil to an uncontaminated soil sample. This will aid in the monitoring of hydrocarbon contaminated sites and the establishment of potential endpoints for successful remediation.
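One way a "percentage chemical similarity" between two hydrocarbon profiles could be computed is sketched below. The abstract does not say which resemblance measure was used, so the Bray-Curtis similarity here is an assumption, chosen only because it is commonly paired with Principal Coordinate Analysis. The concentration values and variable names are hypothetical.

```python
def bray_curtis_similarity(a, b):
    """Percentage similarity: 100 * (1 - sum|a_i - b_i| / sum(a_i + b_i))."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return 100.0 * (1.0 - numerator / denominator)


# Hypothetical concentrations of four key hydrocarbons (arbitrary units).
uncontaminated = [1.0, 2.0, 1.0, 0.0]
contaminated_day_0 = [10.0, 20.0, 15.0, 5.0]
contaminated_day_161 = [2.0, 3.0, 1.0, 0.0]

similarity_start = bray_curtis_similarity(contaminated_day_0, uncontaminated)
similarity_end = bray_curtis_similarity(contaminated_day_161, uncontaminated)
print(round(similarity_start, 1), round(similarity_end, 1))
```

A similarity that rises over the incubation period, as in this toy example, is the kind of signal that could serve as a remediation endpoint once it crosses an agreed threshold.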

Relevance:

60.00%

Publisher:

Abstract:

The atmospheric composition of the central North Atlantic region has been sampled using the FAAM BAe146 instrumented aircraft during the Intercontinental Transport of Ozone and Precursors (ITOP) campaign, part of the wider International Consortium for Atmospheric Research on Transport and Transformation (ICARTT). This paper presents an overview of the ITOP campaign. Between late July and early August 2004, twelve flights comprising 72 hours of measurement were made in a region from approximately 20 to 40°W and 33 to 47°N centered on Faial Island, Azores, ranging in altitude from 50 to 9000 m. The vertical profiles of O3 and CO are consistent with previous observations made in this region during 1997 and our knowledge of the seasonal cycles within the region. A cluster analysis technique is used to partition the data set into air mass types with distinct chemical signatures. Six clusters provide a suitable balance between cluster generality and specificity. The clusters are labeled as biomass burning, low level outflow, upper level outflow, moist lower troposphere, marine and upper troposphere. During this summer, boreal forest fire emissions from Alaska and northern Canada were found to provide a major perturbation of tropospheric composition in CO, PAN, organic compounds and aerosol. Anthropogenically influenced air from the continental boundary layer of the USA was clearly observed running above the marine boundary layer right across the mid-Atlantic, retaining high pollution levels in VOCs and sulfate aerosol. Upper level outflow events were found to have far lower sulfate aerosol, resulting from washout on ascent, but much higher PAN associated with the colder temperatures. Lagrangian links with flights of other aircraft over the USA and Europe show that such signatures are maintained many days downwind of emission regions. Some other features of the data set are highlighted, including the strong perturbations to many VOCs and OVOCs in this remote region.

Relevance:

60.00%

Publisher:

Abstract:

Invasive plant species have been shown to alter the microbial community composition of the soils they invade and it is suggested that this below-ground perturbation of potential pathogens, decomposers or symbionts may feed back positively to allow invasive success. Whether these perturbations are mediated through specific components of root exudation is not understood. We focussed on 8-hydroxyquinoline, a putative allelochemical of Centaurea diffusa (diffuse knapweed) and used an artificial root system to differentiate the effects of 8-hydroxyquinoline against a background of total rhizodeposition as mimicked through supply of a synthetic exudate solution. In soil proximal (0-10 cm) to the artificial root, synthetic exudates had a highly significant (p < 0.001) influence on dehydrogenase, fluorescein diacetate hydrolysis and urease activity. In addition, 8-hydroxyquinoline was significant (p = 0.003) as a main effect on dehydrogenase activity and interacted with synthetic exudates to affect urease activity (p = 0.09). Hierarchical cluster analysis of 16S rDNA-based DGGE band patterns also identified a primary effect of synthetic exudates and a secondary effect of 8-hydroxyquinoline on bacterial community structure. Thus, we show that the artificial rhizosphere produced by the synthetic exudates was the predominant effect, but that the influence of the 8-hydroxyquinoline signal on the activity and structure of soil microbial communities could also be detected. © 2009 Elsevier Ltd. All rights reserved.

Relevance:

60.00%

Publisher:

Abstract:

One of the most influential and popular data mining methods is the k-Means algorithm for cluster analysis. Techniques for improving the efficiency of k-Means have been largely explored in two main directions. The amount of computation can be significantly reduced by adopting geometrical constraints and an efficient data structure, notably a multidimensional binary search tree (KD-Tree). These techniques reduce the number of distance computations the algorithm performs at each iteration. A second direction is parallel processing, where data and computation loads are distributed over many processing nodes. However, little work has been done to provide a parallel formulation of the efficient sequential techniques based on KD-Trees. Such approaches are expected to have an irregular distribution of computation load and can suffer from load imbalance. This issue has so far limited the adoption of these efficient k-Means variants in parallel computing environments. In this work, we provide a parallel formulation of the KD-Tree based k-Means algorithm for distributed memory systems and address its load balancing issue. Three solutions have been developed and tested. Two approaches are based on a static partitioning of the data set and a third solution incorporates a dynamic load balancing policy.
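The data-parallel structure of k-Means under a static partitioning can be sketched as below. This is a sequential simulation under stated assumptions, not the paper's method: there is no KD-Tree pruning, no real message passing and no load balancing, and the function names and the round-robin partitioning are hypothetical. It shows only the pattern in which each node accumulates per-cluster sums and counts for its own chunk, after which a reduction combines them into new centroids.

```python
import math


def partial_sums(chunk, centroids):
    """Work of one node: assign its points and accumulate per-cluster sums and counts."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p in chunk:
        j = min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts


def reduce_and_update(partials, centroids):
    """Reduction step: combine every node's sums and counts into new centroids."""
    updated = []
    for j, old in enumerate(centroids):
        count = sum(counts[j] for _, counts in partials)
        if count == 0:
            updated.append(list(old))  # keep an empty cluster where it was
            continue
        updated.append([sum(sums[j][d] for sums, _ in partials) / count
                        for d in range(len(old))])
    return updated


def kmeans_step(points, centroids, n_nodes):
    """One k-Means iteration with the data split statically over n_nodes chunks."""
    chunks = [points[i::n_nodes] for i in range(n_nodes)]
    return reduce_and_update([partial_sums(c, centroids) for c in chunks], centroids)


points = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0), (10.0, 10.0), (10.0, 12.0), (12.0, 10.0)]
centroids = [[0.0, 0.0], [10.0, 10.0]]
serial = kmeans_step(points, centroids, n_nodes=1)
parallel = kmeans_step(points, centroids, n_nodes=3)
print(serial, parallel)
```

In this plain formulation every point costs the same amount of work, so equal-sized chunks are balanced. Once tree-based pruning makes the cost per chunk irregular, that is no longer true, which is the load-imbalance problem the abstract describes.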