79 results for Data distribution

in CentAUR: Central Archive University of Reading - UK


Relevance:

100.00%

Publisher:

Abstract:

Global communication requirements and load imbalance of some parallel data mining algorithms are the major obstacles to exploiting the computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication cost in parallel data mining algorithms and, in particular, in the k-means algorithm for cluster analysis. In the straightforward parallel formulation of the k-means algorithm, data and computation loads are uniformly distributed over the processing nodes. This approach has excellent load balancing characteristics that may suggest it could scale up to large and extreme-scale parallel computing systems. However, at each iteration step the algorithm requires a global reduction operation which hinders the scalability of the approach. This work studies a different parallel formulation of the algorithm where the requirement of global communication is removed, while maintaining the same deterministic nature of the centralised algorithm. The proposed approach exploits a non-uniform data distribution which can be either found in real-world distributed applications or can be induced by means of multi-dimensional binary search trees. The approach can also be extended to accommodate an approximation error which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements.
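As a rough illustration of the baseline this abstract describes, the sketch below simulates one iteration of the straightforward parallel k-means formulation: each node computes per-cluster sums and counts over its own partition, and a global reduction combines them. It is not the authors' proposed method. The function names (local_stats, kmeans_step) and the use of in-memory NumPy partitions in place of real processing nodes are assumptions made for the example.

```python
import numpy as np

def local_stats(points, centroids):
    """Per-node work: assign local points to the nearest centroid and
    return per-cluster coordinate sums and point counts."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    k, dim = centroids.shape
    sums = np.zeros((k, dim))
    counts = np.zeros(k, dtype=int)
    np.add.at(sums, labels, points)   # accumulate coordinates per cluster
    np.add.at(counts, labels, 1)      # count points per cluster
    return sums, counts

def kmeans_step(partitions, centroids):
    """One iteration; `partitions` stands in for the data held by each node
    (hypothetical setup, simulated here in a single process)."""
    stats = [local_stats(p, centroids) for p in partitions]
    # Global reduction: every node's partial result is needed here,
    # which is the step that limits scalability.
    total_sums = sum(s for s, _ in stats)
    total_counts = sum(c for _, c in stats)
    new_centroids = centroids.copy()
    nonempty = total_counts > 0       # leave empty clusters where they are
    new_centroids[nonempty] = total_sums[nonempty] / total_counts[nonempty][:, None]
    return new_centroids
```

Because the reduction only sums partial results, the updated centroids match those of a centralised step on the full data set, whatever the partitioning.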

Relevance:

60.00%

Publisher:

Abstract:

The IPLV (integrated part-load value) overall coefficient, presented by the Air-Conditioning and Refrigeration Institute (ARI) of America, shows the operating status of the air-conditioning system's refrigeration host only. No logical solution has yet been developed for an overall operation coefficient that reflects the whole air-conditioning system under part load. In this research, the running-time proportions of air-conditioning systems under part load have been obtained by analysing energy consumption data from practical operation in all public buildings in Chongqing. This was achieved using analysis methods based on the month-by-month statistical distribution of energy consumption data for public buildings. Compared with the IPLV weighting numbers, the part-load operation coefficient of the air-conditioning system derived in this research not only shows the status of the refrigeration host, but also reflects and calculates the energy efficiency of the whole air-conditioning system. The coefficient, which results from processing and analysing practical running data, shows the actual and objective running status by area and building type. The method differs from the model analysis used to obtain the IPLV weighting numbers in that it yields both the four equal proportions and the part-load operation coefficient of the air-conditioning system under any load rate, as necessary.

Relevance:

60.00%

Publisher:

Abstract:

Global communication requirements and load imbalance of some parallel data mining algorithms are the major obstacles to exploiting the computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication cost in iterative parallel data mining algorithms. In particular, the analysis focuses on one of the most influential and popular data mining methods, the k-means algorithm for cluster analysis. The straightforward parallel formulation of the k-means algorithm requires a global reduction operation at each iteration step, which hinders its scalability. This work studies a different parallel formulation of the algorithm where the requirement of global communication can be relaxed while still providing the exact solution of the centralised k-means algorithm. The proposed approach exploits a non-uniform data distribution which can be either found in real-world distributed applications or can be induced by means of multi-dimensional binary search trees. The approach can also be extended to accommodate an approximation error which allows a further reduction of the communication costs.

Relevance:

40.00%

Publisher:

Abstract:

There has been a clear lack of common data exchange semantics for inter-organisational workflow management systems, where research has mainly focused on technical issues rather than language constructs. This paper presents the neutral data exchange semantics required for workflow integration within the AXAEDIS framework and presents the mechanism for discovering objects in the object repository when little or no knowledge about the object is available. The paper also presents a workflow-independent integration architecture with the AXAEDIS Framework.

Relevance:

40.00%

Publisher:

Abstract:

The absorption spectra of phytoplankton in the visible domain hold implicit information on the phytoplankton community structure. Here we use this information to retrieve quantitative information on phytoplankton size structure by developing a novel method to compute the exponent of an assumed power-law for their particle-size spectrum. This quantity, in combination with total chlorophyll-a concentration, can be used to estimate the fractional concentration of chlorophyll in any arbitrarily-defined size class of phytoplankton. We further define and derive expressions for two distinct measures of cell size of mixed populations, namely, the average spherical diameter of a bio-optically equivalent homogeneous population of cells of equal size, and the average equivalent spherical diameter of a population of cells that follow a power-law particle-size distribution. The method relies on measurements of two quantities of a phytoplankton sample: the concentration of chlorophyll-a, which is an operational index of phytoplankton biomass, and the total absorption coefficient of phytoplankton in the red peak of visible spectrum at 676 nm. A sensitivity analysis confirms that the relative errors in the estimates of the exponent of particle size spectra are reasonably low. The exponents of phytoplankton size spectra, estimated for a large set of in situ data from a variety of oceanic environments (~ 2400 samples), are within a reasonable range; and the estimated fractions of chlorophyll in pico-, nano- and micro-phytoplankton are generally consistent with those obtained by an independent, indirect method based on diagnostic pigments determined using high-performance liquid chromatography. The estimates of cell size for in situ samples dominated by different phytoplankton types (diatoms, prymnesiophytes, Prochlorococcus, other cyanobacteria and green algae) yield nominal sizes consistent with the taxonomic classification. 
To estimate the same quantities from satellite-derived ocean-colour data, we combine our method with algorithms for obtaining inherent optical properties from remote sensing. The spatial distribution of the size-spectrum exponent and the chlorophyll fractions of pico-, nano- and micro-phytoplankton estimated from satellite remote sensing are in agreement with the current understanding of the biogeography of phytoplankton functional types in the global oceans. This study contributes to our understanding of the distribution and time evolution of phytoplankton size structure in the global oceans.
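To make the idea of "fractional concentration of chlorophyll in any arbitrarily-defined size class" concrete, here is a minimal sketch under stated assumptions; it is not the paper's derivation. It assumes cell number density follows a power law D^(-xi) in diameter D, that chlorophyll per cell scales as D^m (the default m = 3, i.e. proportional to volume, is a hypothetical choice), and that the population spans d_min to d_max micrometres (also hypothetical bounds). The function names are illustrative.

```python
import math

def _power_integral(a, b, p):
    """Integral of D**(p - 1) from a to b, including the p == 0 (log) case."""
    if abs(p) < 1e-12:
        return math.log(b / a)
    return (b**p - a**p) / p

def chl_fraction(d_lo, d_hi, xi, m=3.0, d_min=0.2, d_max=50.0):
    """Fraction of total chlorophyll held by cells with diameter in [d_lo, d_hi].

    Assumes number density ~ D**(-xi) and chlorophyll per cell ~ D**m, so the
    chlorophyll density in diameter space is ~ D**(m - xi).
    """
    p = m - xi + 1.0
    return _power_integral(d_lo, d_hi, p) / _power_integral(d_min, d_max, p)

def size_class_fractions(xi, **kwargs):
    """Pico (<2 um), nano (2-20 um) and micro (>20 um) chlorophyll fractions."""
    d_min = kwargs.get("d_min", 0.2)
    d_max = kwargs.get("d_max", 50.0)
    return {
        "pico": chl_fraction(d_min, 2.0, xi, **kwargs),
        "nano": chl_fraction(2.0, 20.0, xi, **kwargs),
        "micro": chl_fraction(20.0, d_max, xi, **kwargs),
    }
```

Multiplying a fraction by the total chlorophyll-a concentration gives the chlorophyll attributed to that size class. Under these assumptions, a larger exponent xi shifts chlorophyll towards the smaller classes.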

Relevance:

40.00%

Publisher:

Abstract:

The global vegetation response to climate and atmospheric CO2 changes between the last glacial maximum and recent times is examined using an equilibrium vegetation model (BIOME4), driven by output from 17 climate simulations from the Palaeoclimate Modelling Intercomparison Project. Features common to all of the simulations include expansion of treeless vegetation in high northern latitudes; southward displacement and fragmentation of boreal and temperate forests; and expansion of drought-tolerant biomes in the tropics. These features are broadly consistent with pollen-based reconstructions of vegetation distribution at the last glacial maximum. Glacial vegetation in high latitudes reflects cold and dry conditions due to the low CO2 concentration and the presence of large continental ice sheets. The extent of drought-tolerant vegetation in tropical and subtropical latitudes reflects a generally drier low-latitude climate. Comparisons of the observations with BIOME4 simulations, with and without consideration of the direct physiological effect of CO2 concentration on C3 photosynthesis, suggest an important additional role of low CO2 concentration in restricting the extent of forests, especially in the tropics. Global forest cover was overestimated by all models when climate change alone was used to drive BIOME4, and estimated more accurately when physiological effects of CO2 concentration were included. This result suggests that both CO2 effects and climate effects were important in determining glacial-interglacial changes in vegetation. More realistic simulations of glacial vegetation and climate will need to take into account the feedback effects of these structural and physiological changes on the climate.

Relevance:

40.00%

Publisher:

Abstract:

The MATLAB model is contained within the compressed folders (versions are available as .zip and .tgz). This model uses MERRA reanalysis data (>34 years available) to estimate the hourly aggregated wind power generation for a predefined (fixed) distribution of wind farms. A ready-made example is included for the wind farm distribution of Great Britain, April 2014 ("CF.dat"). This consists of an hourly time series of GB-total capacity factor spanning the period 1980-2013 inclusive. Given the global nature of reanalysis data, the model can be applied to any specified distribution of wind farms in any region of the world. Users are, however, strongly advised to bear in mind the limitations of reanalysis data when using this model/data. This is discussed in our paper: Cannon, Brayshaw, Methven, Coker, Lenaghan. "Using reanalysis data to quantify extreme wind power generation statistics: a 33 year case study in Great Britain". Submitted to Renewable Energy in March, 2014. Additional information about the model is contained in the model code itself, in the accompanying ReadMe file, and on our website: http://www.met.reading.ac.uk/~energymet/data/Cannon2014/
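The aggregation step behind a region-total capacity factor can be sketched as a capacity-weighted average of per-farm capacity factors. This is only an illustration in Python, not the MATLAB model itself. The function name, the array layout (hours by farms) and the example numbers are assumptions, and the conversion from reanalysis wind speed to per-farm capacity factor is not shown.

```python
import numpy as np

def aggregate_capacity_factor(farm_cf, installed_mw):
    """Capacity-weighted total capacity factor for each hour.

    farm_cf:      array of shape (hours, farms), per-farm capacity factors in [0, 1]
    installed_mw: array of shape (farms,), installed capacity of each farm
    """
    farm_cf = np.asarray(farm_cf, dtype=float)
    installed_mw = np.asarray(installed_mw, dtype=float)
    # Total generation per hour divided by total installed capacity.
    return farm_cf @ installed_mw / installed_mw.sum()

# Two hours, two farms (made-up values for illustration only).
hourly_total = aggregate_capacity_factor([[0.5, 1.0], [0.0, 0.2]], [100.0, 300.0])
```

Keeping the farm distribution fixed, as the model description states, means the weights stay constant over the whole time series, so variations in the total reflect the weather alone.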

Relevance:

30.00%

Publisher:

Abstract:

This paper provides for the first time an objective short-term (8 yr) climatology of African convective weather systems based on satellite imagery. Eight years of infrared International Satellite Cloud Climatology Project-European Space Agency's Meteorological Satellite (ISCCP-Meteosat) satellite imagery has been analyzed using objective feature identification, tracking, and statistical techniques for the July, August, and September periods and the region of Africa and the adjacent Atlantic Ocean. This allows various diagnostics to be computed and used to study the distribution of mesoscale and synoptic-scale convective weather systems from mesoscale cloud clusters and squall lines to tropical cyclones. An 8-yr seasonal climatology (1983-90) and the seasonal cycle of this convective activity are presented and discussed. Also discussed is the dependence of organized convection in this region on orography, convective and potential instability, and vertical wind shear, using European Centre for Medium-Range Weather Forecasts reanalysis data.

Relevance:

30.00%

Publisher:

Abstract:

Satellite-based rainfall monitoring is widely used for climatological studies because of its full global coverage, but it is also of great importance for operational purposes, especially in areas such as Africa where there is a lack of ground-based rainfall data. Satellite rainfall estimates have enormous potential benefits as input to hydrological and agricultural models because of their real-time availability, low cost and full spatial coverage. One issue that needs to be addressed is the uncertainty on these estimates. This is particularly important in assessing the likely errors on the output from non-linear models (rainfall-runoff or crop yield) which make use of the rainfall estimates, aggregated over an area, as input. Correct assessment of the uncertainty on the rainfall is non-trivial as it must take account of
• the difference in spatial support of the satellite information and independent data used for calibration
• uncertainties on the independent calibration data
• the non-Gaussian distribution of rainfall amount
• the spatial intermittency of rainfall
• the spatial correlation of the rainfall field
This paper describes a method for estimating the uncertainty on satellite-based rainfall values taking account of these factors. The method involves firstly a stochastic calibration which completely describes the probability of rainfall occurrence and the pdf of rainfall amount for a given satellite value, and secondly the generation of an ensemble of rainfall fields based on the stochastic calibration but with the correct spatial correlation structure within each ensemble member. This is achieved by the use of geostatistical sequential simulation. The ensemble generated in this way may be used to estimate uncertainty at larger spatial scales. A case study of daily rainfall monitoring in The Gambia, West Africa, for the purpose of crop yield forecasting is presented to illustrate the method.

Relevance:

30.00%

Publisher:

Abstract:

Since the advent of the internet in everyday life in the 1990s, the barriers to producing, distributing and consuming multimedia data such as videos, music, ebooks, etc. have steadily been lowered for most computer users, so that almost everyone with internet access can join the online communities who produce, consume and, of course, share media artefacts. Along with this trend, the violation of personal data privacy and copyright has increased, with illegal file sharing being rampant across many online communities, particularly for certain music genres and amongst the younger age groups. This has had a devastating effect on the traditional media distribution market, in most cases leaving the distribution companies and the content owner with huge financial losses. To prove that a copyright violation has occurred, one can deploy fingerprinting mechanisms to uniquely identify the property. However, this is currently based only on uni-modal approaches. In this paper we describe some of the design challenges and architectural approaches to multi-modal fingerprinting currently being examined for evaluation studies within a PhD research programme on optimisation of multi-modal fingerprinting architectures. Accordingly, we outline the available modalities that are being integrated through this research programme, which aims to establish the optimal architecture for multi-modal media security protection over the internet as the online distribution environment for both legal and illegal distribution of media products.

Relevance:

30.00%

Publisher:

Abstract:

The continuous operation of insect-monitoring radars in the UK has permitted, for the first time, the characterization of various phenomena associated with high-altitude migration of large insects over this part of northern Europe. Previous studies have taken a case-study approach, concentrating on a small number of nights of particular interest. Here, combining data from two radars, and from an extensive suction- and light-trapping network, we have undertaken a more systematic, longer-term study of diel flight periodicity and vertical distribution of macro-insects in the atmosphere. Firstly, we identify general features of insect abundance and stratification, occurring during the 24-hour cycle, which emerge from four years’ aggregated radar data for the summer months in southern Britain. These features include mass emigrations at dusk and to a lesser extent at dawn, and daytime concentrations associated with thermal convection. We then focus our attention on the well-defined layers of large nocturnal migrants that form in the early evening, usually at heights of 200–500 m above ground. We present evidence from both radar and trap data that these nocturnal layers are composed mainly of noctuid moths, with species such as Noctua pronuba, Autographa gamma, Agrotis exclamationis, A. segetum, Xestia c-nigrum and Phlogophora meticulosa predominating.

Relevance:

30.00%

Publisher:

Abstract:

Understanding links between the El Niño-Southern Oscillation (ENSO) and snow would be useful for seasonal forecasting, but also for understanding natural variability and interpreting climate change predictions. Here, a 545-year run of the general circulation model HadCM3, with prescribed external forcings and fixed greenhouse gas concentrations, is used to explore the impact of ENSO on snow water equivalent (SWE) anomalies. In North America, positive ENSO events reduce the mean SWE and skew the distribution towards lower values, and vice versa during negative ENSO events. This is associated with a dipole SWE anomaly structure, with anomalies of opposite sign centered in western Canada and the central United States. In Eurasia, warm episodes lead to a more positively skewed distribution and the mean SWE is raised. Again, the opposite effect is seen during cold episodes. In Eurasia the largest anomalies are concentrated in the Himalayas. These correlations with February SWE distribution are seen to exist from the previous June-July-August (JJA) ENSO index onwards, and are weakly detected in 50-year subsections of the control run, but only a shifted North American response can be detected in the analysis of 40 years of ERA-40 reanalysis data. The ENSO signal in SWE from the long run could still contribute to regional predictions, although it would be a weak indicator only.