856 results for Data Driven Clustering
Abstract:
The purpose of this lecture is to review recent developments in data analysis, initialization and data assimilation. The development of 3-dimensional multivariate schemes has been very timely because of their suitability for handling the many different types of observations made during FGGE. Great progress has taken place in the initialization of global models with the aid of the non-linear normal mode technique. However, in spite of this progress, several fundamental problems remain unsatisfactorily solved. Of particular importance are the initialization of the divergent wind fields in the Tropics and the search for proper ways to initialize weather systems driven by non-adiabatic processes. The unsatisfactory ways in which such processes are being initialized lead to excessively long spin-up times.
Abstract:
During a period of heliospheric disturbance in 2007-9 associated with a co-rotating interaction region (CIR), a characteristic periodic variation becomes apparent in neutron monitor data. This variation is phase locked to periodic heliospheric current sheet crossings. Phase-locked electrical variations are also seen in the terrestrial lower atmosphere in the southern UK, including an increase in the vertical conduction current density of fair weather atmospheric electricity during increases in the neutron monitor count rate and energetic proton count rates measured by spacecraft. At the same time as the conduction current increases, changes in the cloud microphysical properties lead to an increase in the detected height of the cloud base at Lerwick Observatory, Shetland, with associated changes in surface meteorological quantities. As electrification is expected at the base of layer clouds, which can influence droplet properties, these observations of phase-locked thermodynamic, cloud, atmospheric electricity and solar sector changes are not inconsistent with a heliospheric disturbance driving lower troposphere changes.
Abstract:
A global aerosol transport model (Oslo CTM2) with main aerosol components included is compared to five satellite retrievals of aerosol optical depth (AOD) and one data set of the satellite-derived radiative effect of aerosols. The model is driven with meteorological data for the period November 1996 to June 1997 which is the time period investigated in this study. The modelled AOD is within the range of the AOD from the various satellite retrievals over oceanic regions. The direct radiative effect of the aerosols as well as the atmospheric absorption by aerosols are in both cases found to be of the order of 20 Wm−2 in certain regions in both the satellite-derived and the modelled estimates as a mean over the period studied. Satellite and model data exhibit similar patterns of aerosol optical depth, radiative effect of aerosols, and atmospheric absorption of the aerosols. Recently published results show that global aerosol models have a tendency to underestimate the magnitude of the clear-sky direct radiative effect of aerosols over ocean compared to satellite-derived estimates. However, this is only to a small extent the case with the Oslo CTM2. The global mean direct radiative effect of aerosols over ocean is modelled with the Oslo CTM2 to be –5.5 Wm−2 and the atmospheric aerosol absorption 1.5 Wm−2.
Abstract:
Exascale systems are the next frontier in high-performance computing and are expected to deliver a performance of the order of 10^18 operations per second using massive multicore processors. Very large- and extreme-scale parallel systems pose critical algorithmic challenges, especially related to concurrency, locality and the need to avoid global communication patterns. This work investigates a novel protocol for dynamic group communication that can be used to remove the global communication requirement and to reduce the communication cost in parallel formulations of iterative data mining algorithms. The protocol is used to provide a communication-efficient parallel formulation of the k-means algorithm for cluster analysis. The approach is based on a collective communication operation for dynamic groups of processes and exploits non-uniform data distributions. Non-uniform data distributions can be either found in real-world distributed applications or induced by means of multidimensional binary search trees. The analysis of the proposed dynamic group communication protocol has shown that it does not introduce significant communication overhead. The parallel clustering algorithm has also been extended to accommodate an approximation error, which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements.
Abstract:
Under particular large-scale atmospheric conditions, several windstorms may affect Europe within a short time period. The occurrence of such cyclone families leads to large socioeconomic impacts and cumulative losses. The serial clustering of windstorms is analyzed for the North Atlantic/western Europe. Clustering is quantified as the dispersion (ratio variance/mean) of cyclone passages over a certain area. Dispersion statistics are derived for three reanalysis data sets and a 20-run European Centre Hamburg Version 5 /Max Planck Institute Version–Ocean Model Version 1 global climate model (ECHAM5/MPI-OM1 GCM) ensemble. The dependence of the seriality on cyclone intensity is analyzed. Confirming previous studies, serial clustering is identified in reanalysis data sets primarily on both flanks and downstream regions of the North Atlantic storm track. This pattern is a robust feature in the reanalysis data sets. For the whole area, extreme cyclones cluster more than nonextreme cyclones. The ECHAM5/MPI-OM1 GCM is generally able to reproduce the spatial patterns of clustering under recent climate conditions, but some biases are identified. Under future climate conditions (A1B scenario), the GCM ensemble indicates that serial clustering may decrease over the North Atlantic storm track area and parts of western Europe. This decrease is associated with an extension of the polar jet toward Europe, which implies a tendency to a more regular occurrence of cyclones over parts of the North Atlantic Basin poleward of 50°N and western Europe. An increase of clustering of cyclones is projected south of Newfoundland. The detected shifts imply a change in the risk of occurrence of cumulative events over Europe under future climate conditions.
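The dispersion measure mentioned in this abstract (the variance-to-mean ratio of cyclone passage counts) can be illustrated with a minimal Python sketch. The counts below are hypothetical, and the use of the population variance is an assumption; the abstract does not say which estimator or time window the authors used. A ratio near 1 is what a Poisson process would give, a ratio above 1 indicates clustering, and a ratio below 1 indicates a more regular occurrence.

```python
from statistics import mean, pvariance


def dispersion_ratio(counts):
    """Variance-to-mean ratio of event counts per time window."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; the ratio is undefined")
    # Population variance is an assumption; a sample variance could be used instead.
    return pvariance(counts) / m


# Hypothetical cyclone passages per window over one area (not data from the study).
clustered = [0, 0, 5, 0, 0, 6, 0, 1]   # events arrive in bursts
regular = [2, 1, 2, 1, 2, 1, 2, 1]     # events arrive evenly

print(dispersion_ratio(clustered))  # greater than 1: overdispersed (clustered)
print(dispersion_ratio(regular))    # less than 1: underdispersed (regular)
```

Both hypothetical series have the same mean count, so the difference in the ratio comes only from how the events are grouped in time.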
Abstract:
We present an efficient method of combining wide angle neutron scattering data with detailed atomistic models, allowing us to perform a quantitative and qualitative mapping of the organisation of the chain conformation in both glass and liquid phases. The structural refinement method presented in this work is based on the exploitation of the intrachain features of the diffraction pattern and its intimate linkage with atomistic models by the use of internal coordinates for bond lengths, valence angles and torsion rotations. Atomic connectivity is defined through these coordinates, which are in turn assigned by pre-defined probability distributions, thus allowing the models in question to be built stochastically. Incremental variation of these coordinates allows for the construction of models that minimise the differences between the observed and calculated structure factors. We present a series of neutron scattering data of 1,2 polybutadiene in the temperature range 120-400 K. Analysis of the experimental data yields bond lengths for C-C and C=C of 1.54 Å and 1.35 Å respectively. Valence angles of the backbone were found to be 112° and the torsion distributions are characterised by five rotational states, a three-fold trans-skew± for the backbone and gauche± for the vinyl group. Rotational states of the vinyl group were found to be equally populated, indicating a largely atactic chain. The two backbone torsion angles exhibit different behaviour with respect to temperature of their trans population, with one of them adopting an almost all-trans sequence. Consequently, the resulting configuration leads to a rather persistent chain, as indicated by the value of the characteristic ratio extrapolated from the model. We compare our results with theoretical predictions, computer simulations, RIS models and previously reported experimental results.
Abstract:
The global vegetation response to climate and atmospheric CO2 changes between the last glacial maximum and recent times is examined using an equilibrium vegetation model (BIOME4), driven by output from 17 climate simulations from the Palaeoclimate Modelling Intercomparison Project. Features common to all of the simulations include expansion of treeless vegetation in high northern latitudes; southward displacement and fragmentation of boreal and temperate forests; and expansion of drought-tolerant biomes in the tropics. These features are broadly consistent with pollen-based reconstructions of vegetation distribution at the last glacial maximum. Glacial vegetation in high latitudes reflects cold and dry conditions due to the low CO2 concentration and the presence of large continental ice sheets. The extent of drought-tolerant vegetation in tropical and subtropical latitudes reflects a generally drier low-latitude climate. Comparisons of the observations with BIOME4 simulations, with and without consideration of the direct physiological effect of CO2 concentration on C3 photosynthesis, suggest an important additional role of low CO2 concentration in restricting the extent of forests, especially in the tropics. Global forest cover was overestimated by all models when climate change alone was used to drive BIOME4, and estimated more accurately when physiological effects of CO2 concentration were included. This result suggests that both CO2 effects and climate effects were important in determining glacial-interglacial changes in vegetation. More realistic simulations of glacial vegetation and climate will need to take into account the feedback effects of these structural and physiological changes on the climate.
Abstract:
Global communication requirements and load imbalance of some parallel data mining algorithms are the major obstacles to exploiting the computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication cost in iterative parallel data mining algorithms. In particular, the analysis focuses on one of the most influential and popular data mining methods, the k-means algorithm for cluster analysis. The straightforward parallel formulation of the k-means algorithm requires a global reduction operation at each iteration step, which hinders its scalability. This work studies a different parallel formulation of the algorithm in which the requirement of global communication can be relaxed while still providing the exact solution of the centralised k-means algorithm. The proposed approach exploits a non-uniform data distribution, which can either be found in real-world distributed applications or be induced by means of multi-dimensional binary search trees. The approach can also be extended to accommodate an approximation error, which allows a further reduction of the communication costs.
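The "straightforward parallel formulation" this abstract refers to can be sketched in a few lines of Python. This is only an illustration of the baseline with a global reduction, not the relaxed formulation proposed in the work. Partitions are simulated as plain lists in one process, and the function names (`local_stats`, `global_reduce`, `kmeans_step`) are hypothetical. Each partition computes per-centroid sums and counts, and a single element-wise sum across all partitions stands in for the all-reduce that a real message-passing implementation would perform at every iteration.

```python
def local_stats(points, centroids):
    """Per-centroid coordinate sums and counts for one partition's points."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p in points:
        # Assign the point to its nearest centroid (squared Euclidean distance).
        j = min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts


def global_reduce(stats):
    """Element-wise sum over all partitions (stand-in for a global all-reduce)."""
    k, dim = len(stats[0][0]), len(stats[0][0][0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for part_sums, part_counts in stats:
        for j in range(k):
            counts[j] += part_counts[j]
            for d in range(dim):
                sums[j][d] += part_sums[j][d]
    return sums, counts


def kmeans_step(partitions, centroids):
    """One k-means iteration: local statistics, global reduction, centroid update."""
    sums, counts = global_reduce([local_stats(p, centroids) for p in partitions])
    return [
        [s / counts[j] for s in sums[j]] if counts[j] else list(centroids[j])
        for j in range(len(centroids))
    ]


# Hypothetical data: two partitions, two centroids.
partitions = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 10.0), (10.0, 12.0)]]
centroids = [(1.0, 1.0), (9.0, 9.0)]
print(kmeans_step(partitions, centroids))
```

In this toy data each partition only ever contributes to one centroid, which is the kind of non-uniform distribution the abstract describes: most of what the global reduction exchanges is zeros, which hints at why a group-wise exchange could be enough.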
Abstract:
Urbanization is one of the major forms of habitat alteration occurring at the present time. Although this is typically deleterious to biodiversity, some species flourish within these human-modified landscapes, potentially leading to negative and/or positive interactions between people and wildlife. Hence, up-to-date assessment of urban wildlife populations is important for developing appropriate management strategies. Surveying urban wildlife is limited by land partition and private ownership, rendering many common survey techniques difficult. Garnering public involvement is one solution, but this method is constrained by the inherent biases of non-standardised survey effort associated with voluntary participation. We used a television-led media approach to solicit national participation in an online sightings survey to investigate changes in the distribution of urban foxes in Great Britain and to explore relationships between urban features and fox occurrence and sightings density. Our results show that media-based approaches can generate a large national database on the current distribution of a recognisable species. Fox distribution in England and Wales has changed markedly within the last 25 years, with sightings submitted from 91% of urban areas previously predicted to support few or no foxes. Data were highly skewed, with 90% of urban areas having <30 fox sightings per 1000 people km−2. The extent of total urban area was the only variable with a significant impact on both fox occurrence and sightings density in urban areas; longitude and percentage of public green urban space were significantly positively and negatively associated, respectively, with sightings density only. Latitude and distance to the nearest neighbouring conurbation had no impact on either occurrence or sightings density. Given the limitations associated with this method, further investigations are needed to determine the association between sightings density and actual fox density, and the variability of fox density within and between urban areas in Britain.
Abstract:
Background. Current models of concomitant, intermittent strabismus, heterophoria, convergence and accommodation anomalies are either theoretically complex or incomplete. We propose an alternative and more practical way to conceptualize clinical patterns. Methods. In each of three hypothetical scenarios (normal; high AC/A and low CA/C ratios; low AC/A and high CA/C ratios) there can be a disparity-biased or blur-biased “style”, despite identical ratios. We calculated a disparity bias index (DBI) to reflect these biases. We suggest how clinical patterns fit these scenarios and provide early objective data from small illustrative clinical groups. Results. Normal adults and children showed disparity bias (adult DBI 0.43 (95%CI 0.50-0.36), child DBI 0.20 (95%CI 0.31-0.07)) (p=0.001). Accommodative esotropes showed less disparity bias (DBI 0.03). In the high AC/A and low CA/C scenario, early presbyopes had a mean DBI of 0.17 (95%CI 0.28-0.06), compared to a DBI of -0.31 in convergence excess esotropes. In the low AC/A and high CA/C scenario, near exotropes had a mean DBI of 0.27, while we predict that non-strabismic, non-amblyopic hyperopes with good vision without spectacles will show lower DBIs. Disparity bias ranged between 1.25 and -1.67. Conclusions. Establishing disparity or blur bias, together with knowing whether convergence to target demand exceeds accommodation or vice versa, explains clinical patterns more effectively than AC/A and CA/C ratios alone. Excessive bias or inflexibility in near-cue use increases the risk of clinical problems. We suggest clinicians look carefully at details of accommodation and convergence changes induced by lenses, dissociation and prisms and use these to plan treatment in relation to the model.
Abstract:
Cognitive experiments involving motor execution (ME) and motor imagery (MI) have been intensively studied using functional magnetic resonance imaging (fMRI). However, the functional networks of a multitask paradigm that includes both ME and MI have not been widely explored. In this article, we aimed to investigate the functional networks involved in MI and ME using a method combining hierarchical clustering analysis (HCA) and independent component analysis (ICA). Ten right-handed subjects were recruited to participate in a multitask experiment with conditions such as visual cue, MI, ME and rest. The results showed four activation clusters, comprising parts of the visual network, the ME network, the MI network and parts of the resting state network. Furthermore, the integration among these functional networks was also revealed. The findings further demonstrated that the combined HCA and ICA approach is an effective method for analyzing multitask fMRI data.
Abstract:
During the last decades, several windstorm series hit Europe leading to large aggregated losses. Such storm series are examples of serial clustering of extreme cyclones, presenting a considerable risk for the insurance industry. Clustering of events and return periods of storm series for Germany are quantified based on potential losses using empirical models. Two reanalysis data sets and observations from German weather stations are considered for 30 winters. Histograms of events exceeding selected return levels (1-, 2- and 5-year) are derived. Return periods of historical storm series are estimated based on the Poisson and the negative binomial distributions. Over 4000 years of general circulation model (GCM) simulations forced with current climate conditions are analysed to provide a better assessment of historical return periods. Estimations differ between distributions, for example 40 to 65 years for the 1990 series. For such less frequent series, estimates obtained with the Poisson distribution clearly deviate from empirical data. The negative binomial distribution provides better estimates, even though a sensitivity to return level and data set is identified. The consideration of GCM data permits a strong reduction of uncertainties. The present results support the importance of considering explicitly clustering of losses for an adequate risk assessment for economical applications.
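To illustrate why the choice of count distribution matters for the return period of a storm series, the following Python sketch compares a Poisson and a negative binomial model for the number of events per winter. It is a simplified illustration under stated assumptions, not the procedure used in the study: the mean and variance are hypothetical, the negative binomial is fitted by matching those two moments, and the return period is taken as the reciprocal of the probability of at least n events in a winter.

```python
import math


def poisson_pmf(k, lam):
    """P(N = k) for a Poisson distribution with rate lam."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def negbin_pmf(k, mean, var):
    """P(N = k) for a negative binomial with the given mean and variance."""
    if var <= mean:
        raise ValueError("negative binomial requires variance > mean")
    r = mean ** 2 / (var - mean)   # shape parameter from moment matching
    p = mean / var                 # success probability from moment matching
    log_pmf = (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
               + r * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)


def return_period(pmf, n):
    """Average number of winters between winters with at least n events."""
    p_exceed = 1.0 - sum(pmf(k) for k in range(n))
    return 1.0 / p_exceed


# Hypothetical statistics of events per winter (not values from the study).
mean_events, var_events = 1.0, 2.0
n_events = 4

rp_poisson = return_period(lambda k: poisson_pmf(k, mean_events), n_events)
rp_negbin = return_period(lambda k: negbin_pmf(k, mean_events, var_events), n_events)
print(rp_poisson, rp_negbin)
```

With these hypothetical numbers the negative binomial model, which allows the variance to exceed the mean, gives a much shorter return period for a four-event winter than the Poisson model does. This is the qualitative effect of clustering on risk estimates that the abstract points to.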
Abstract:
The terrestrial magnetopause suffered considerable sudden changes in its location on 9–10 September 1978. These magnetopause motions were accompanied by disturbances of the geomagnetic field on the ground. We present a study of the magnetopause motions and the ground magnetic signatures using, for the latter, 10 s averaged data from 14 high latitude ground magnetometer stations. Observations in the solar wind (from IMP 8) are employed and the motions of the magnetopause are monitored directly by the spacecraft ISEE 1 and 2. With these coordinated observations we are able to show that it is the sudden changes in the solar wind dynamic pressure that are responsible for the disturbances seen on the ground. At some ground stations we see evidence of a “ringing” of the magnetospheric cavity, while at others only the initial impulse is evident. We note that at some stations field perturbations closely match the hypothesized ground signatures of flux transfer events. In accordance with more recent work in the area (e.g. Potemra et al., 1989, J. geophys. Res., in press), we argue that causes other than impulsive reconnection may produce the twin ionospheric flow vortex originally proposed as a flux transfer event signature.
Abstract:
ERA-Interim/Land is a global land surface reanalysis data set covering the period 1979–2010. It describes the evolution of soil moisture, soil temperature and snowpack. ERA-Interim/Land is the result of a single 32-year simulation with the latest ECMWF (European Centre for Medium-Range Weather Forecasts) land surface model driven by meteorological forcing from the ERA-Interim atmospheric reanalysis and precipitation adjustments based on monthly GPCP v2.1 (Global Precipitation Climatology Project). The horizontal resolution is about 80 km and the time frequency is 3-hourly. ERA-Interim/Land includes a number of parameterization improvements in the land surface scheme with respect to the original ERA-Interim data set, which makes it more suitable for climate studies involving land water resources. The quality of ERA-Interim/Land is assessed by comparing with ground-based and remote sensing observations. In particular, estimates of soil moisture, snow depth, surface albedo, turbulent latent and sensible fluxes, and river discharges are verified against a large number of site measurements. ERA-Interim/Land provides a global integrated and coherent estimate of soil moisture and snow water equivalent, which can also be used for the initialization of numerical weather prediction and climate models.
Abstract:
Some recent winters in Western Europe have been characterized by the occurrence of multiple extratropical cyclones following a similar path. The occurrence of such cyclone clusters leads to large socio-economic impacts due to damaging winds, storm surges, and floods. Recent studies have statistically characterized the clustering of extratropical cyclones over the North Atlantic and Europe and hypothesized potential physical mechanisms responsible for their formation. Here we analyze 4 months characterized by multiple cyclones over Western Europe (February 1990, January 1993, December 1999, and January 2007). The evolution of the eddy driven jet stream, Rossby wave-breaking, and upstream/downstream cyclone development are investigated to infer the role of the large-scale flow and to determine if clustered cyclones are related to each other. Results suggest that optimal conditions for the occurrence of cyclone clusters are provided by a recurrent extension of an intensified eddy driven jet toward Western Europe lasting at least 1 week. Multiple Rossby wave-breaking occurrences on both the poleward and equatorward flanks of the jet contribute to the development of these anomalous large-scale conditions. The analysis of the daily weather charts reveals that upstream cyclone development (secondary cyclogenesis, where new cyclones are generated on the trailing fronts of mature cyclones) is strongly related to cyclone clustering, with multiple cyclones developing on a single jet streak. The present analysis permits a deeper understanding of the physical reasons leading to the occurrence of cyclone families over the North Atlantic, enabling a better estimation of the associated cumulative risk over Europe.