88 resultados para Software Package Data Exchange (SPDX)


Relevância:

30.00% 30.00%

Publicador:

Resumo:

We describe a model-data fusion (MDF) inter-comparison project (REFLEX), which compared various algorithms for estimating carbon (C) model parameters consistent with both measured carbon fluxes and states and a simple C model. Participants were provided with the model and with both synthetic net ecosystem exchange (NEE) of CO2 and leaf area index (LAI) data, generated from the model with added noise, and observed NEE and LAI data from two eddy covariance sites. Participants endeavoured to estimate model parameters and states consistent with the model for all cases over the two years for which data were provided, and generate predictions for one additional year without observations. Nine participants contributed results using Metropolis algorithms, Kalman filters and a genetic algorithm. For the synthetic data case, parameter estimates compared well with the true values. The results of the analyses indicated that parameters linked directly to gross primary production (GPP) and ecosystem respiration, such as those related to foliage allocation and turnover, or temperature sensitivity of heterotrophic respiration, were best constrained and characterised. Poorly estimated parameters were those related to the allocation to and turnover of fine root/wood pools. Estimates of confidence intervals varied among algorithms, but several algorithms successfully located the true values of annual fluxes from synthetic experiments within relatively narrow 90% confidence intervals, achieving >80% success rate and mean NEE confidence intervals <110 gC m−2 year−1 for the synthetic case. Annual C flux estimates generated by participants generally agreed with gap-filling approaches using half-hourly data. The estimation of ecosystem respiration and GPP through MDF agreed well with outputs from partitioning studies using half-hourly data. Confidence limits on annual NEE increased by an average of 88% in the prediction year compared to the previous year, when data were available. Confidence intervals on annual NEE increased by 30% when observed data were used instead of synthetic data, reflecting and quantifying the addition of model error. Finally, our analyses indicated that incorporating additional constraints, using data on C pools (wood, soil and fine roots) would help to reduce uncertainties for model parameters poorly served by eddy covariance data.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Climate modeling is a complex process, requiring accurate and complete metadata in order to identify, assess and use climate data stored in digital repositories. The preservation of such data is increasingly important given the development of ever-increasingly complex models to predict the effects of global climate change. The EU METAFOR project has developed a Common Information Model (CIM) to describe climate data and the models and modelling environments that produce this data. There is a wide degree of variability between different climate models and modelling groups. To accommodate this, the CIM has been designed to be highly generic and flexible, with extensibility built in. METAFOR describes the climate modelling process simply as "an activity undertaken using software on computers to produce data." This process has been described as separate UML packages (and, ultimately, XML schemas). This fairly generic structure canbe paired with more specific "controlled vocabularies" in order to restrict the range of valid CIM instances. The CIM will aid digital preservation of climate models as it will provide an accepted standard structure for the model metadata. Tools to write and manage CIM instances, and to allow convenient and powerful searches of CIM databases,. Are also under development. Community buy-in of the CIM has been achieved through a continual process of consultation with the climate modelling community, and through the METAFOR team’s development of a questionnaire that will be used to collect the metadata for the Intergovernmental Panel on Climate Change’s (IPCC) Coupled Model Intercomparison Project Phase 5 (CMIP5) model runs.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Pocket Data Mining (PDM) is our new term describing collaborative mining of streaming data in mobile and distributed computing environments. With sheer amounts of data streams are now available for subscription on our smart mobile phones, the potential of using this data for decision making using data stream mining techniques has now been achievable owing to the increasing power of these handheld devices. Wireless communication among these devices using Bluetooth and WiFi technologies has opened the door wide for collaborative mining among the mobile devices within the same range that are running data mining techniques targeting the same application. This paper proposes a new architecture that we have prototyped for realizing the significant applications in this area. We have proposed using mobile software agents in this application for several reasons. Most importantly the autonomic intelligent behaviour of the agent technology has been the driving force for using it in this application. Other efficiency reasons are discussed in details in this paper. Experimental results showing the feasibility of the proposed architecture are presented and discussed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Distributed and collaborative data stream mining in a mobile computing environment is referred to as Pocket Data Mining PDM. Large amounts of available data streams to which smart phones can subscribe to or sense, coupled with the increasing computational power of handheld devices motivates the development of PDM as a decision making system. This emerging area of study has shown to be feasible in an earlier study using technological enablers of mobile software agents and stream mining techniques [1]. A typical PDM process would start by having mobile agents roam the network to discover relevant data streams and resources. Then other (mobile) agents encapsulating stream mining techniques visit the relevant nodes in the network in order to build evolving data mining models. Finally, a third type of mobile agents roam the network consulting the mining agents for a final collaborative decision, when required by one or more users. In this paper, we propose the use of distributed Hoeffding trees and Naive Bayes classifers in the PDM framework over vertically partitioned data streams. Mobile policing, health monitoring and stock market analysis are among the possible applications of PDM. An extensive experimental study is reported showing the effectiveness of the collaborative data mining with the two classifers.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Advances in hardware and software in the past decade allow to capture, record and process fast data streams at a large scale. The research area of data stream mining has emerged as a consequence from these advances in order to cope with the real time analysis of potentially large and changing data streams. Examples of data streams include Google searches, credit card transactions, telemetric data and data of continuous chemical production processes. In some cases the data can be processed in batches by traditional data mining approaches. However, in some applications it is required to analyse the data in real time as soon as it is being captured. Such cases are for example if the data stream is infinite, fast changing, or simply too large in size to be stored. One of the most important data mining techniques on data streams is classification. This involves training the classifier on the data stream in real time and adapting it to concept drifts. Most data stream classifiers are based on decision trees. However, it is well known in the data mining community that there is no single optimal algorithm. An algorithm may work well on one or several datasets but badly on others. This paper introduces eRules, a new rule based adaptive classifier for data streams, based on an evolving set of Rules. eRules induces a set of rules that is constantly evaluated and adapted to changes in the data stream by adding new and removing old rules. It is different from the more popular decision tree based classifiers as it tends to leave data instances rather unclassified than forcing a classification that could be wrong. The ongoing development of eRules aims to improve its accuracy further through dynamic parameter setting which will also address the problem of changing feature domain values.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The discourse surrounding the virtual has moved away from the utopian thinking accompanying the rise of the Internet in the 1990s. The Cyber-gurus of the last decades promised a technotopia removed from materiality and the confines of the flesh and the built environment, a liberation from old institutions and power structures. But since then, the virtual has grown into a distinct yet related sphere of cultural and political production that both parallels and occasionally flows over into the old world of material objects. The strict dichotomy of matter and digital purity has been replaced more recently with a more complex model where both the world of stuff and the world of knowledge support, resist and at the same time contain each other. Online social networks amplify and extend existing ones; other cultural interfaces like youtube have not replaced the communal experience of watching moving images in a semi-public space (the cinema) or the semi-private space (the family living room). Rather the experience of viewing is very much about sharing and communicating, offering interpretations and comments. Many of the web’s strongest entities (Amazon, eBay, Gumtree etc.) sit exactly at this juncture of applying tools taken from the knowledge management industry to organize the chaos of the material world along (post-)Fordist rationality. Since the early 1990s there have been many artistic and curatorial attempts to use the Internet as a platform of producing and exhibiting art, but a lot of these were reluctant to let go of the fantasy of digital freedom. Storage Room collapses the binary opposition of real and virtual space by using online data storage as a conduit for IRL art production. The artworks here will not be available for viewing online in a 'screen' environment but only as part of a downloadable package with the intention that the exhibition could be displayed (in a physical space) by any interested party and realised as ambitiously or minimally as the downloader wishes, based on their means. The artists will therefore also supply a set of instructions for the physical installation of the work alongside the digital files. In response to this curatorial initiative, File Transfer Protocol invites seven UK based artists to produce digital art for a physical environment, addressing the intersection between the virtual and the material. The files range from sound, video, digital prints and net art, blueprints for an action to take place, something to be made, a conceptual text piece, etc. About the works and artists: Polly Fibre is the pseudonym of London-based artist Christine Ellison. Ellison creates live music using domestic devices such as sewing machines, irons and slide projectors. Her costumes and stage sets propose a physical manifestation of the virtual space that is created inside software like Photoshop. For this exhibition, Polly Fibre invites the audience to create a musical composition using a pair of amplified scissors and a turntable. http://www.pollyfibre.com John Russell, a founding member of 1990s art group Bank, is an artist, curator and writer who explores in his work the contemporary political conditions of the work of art. In his digital print, Russell collages together visual representations of abstract philosophical ideas and transforms them into a post apocalyptic landscape that is complex and banal at the same time. www.john-russell.org The work of Bristol based artist Jem Nobel opens up a dialogue between the contemporary and the legacy of 20th century conceptual art around questions of collectivism and participation, authorship and individualism. His print SPACE concretizes the representation of the most common piece of Unicode: the vacant space between words. In this way, the gap itself turns from invisible cipher to sign. www.jemnoble.com Annabel Frearson is rewriting Mary Shelley's Frankenstein using all and only the words from the original text. Frankenstein 2, or the Monster of Main Stream, is read in parts by different performers, embodying the psychotic character of the protagonist, a mongrel hybrid of used language. www.annabelfrearson.com Darren Banks uses fragments of effect laden Holywood films to create an impossible space. The fictitious parts don't add up to a convincing material reality, leaving the viewer with a failed amalgamation of simulations of sophisticated technologies. www.darrenbanks.co.uk FIELDCLUB is collaboration between artist Paul Chaney and researcher Kenna Hernly. Chaney and Hernly developed together a project that critically examines various proposals for the management of sustainable ecological systems. Their FIELDMACHINE invites the public to design an ideal agricultural field. By playing with different types of crops that are found in the south west of England, it is possible for the user, for example, to create a balanced, but protein poor, diet or to simply decide to 'get rid' of half the population. The meeting point of the Platonic field and it physical consequences, generates a geometric abstraction that investigates the relationship between modernist utopianism and contemporary actuality. www.fieldclub.co.uk Pil and Galia Kollectiv, who have also curated the exhibition are London-based artists and run the xero, kline & coma gallery. Here they present a dialogue between two computers. The conversation opens with a simple text book problem in business studies. But gradually the language, mimicking the application of game theory in the business sector, becomes more abstract. The two interlocutors become adversaries trapped forever in a competition without winners. www.kollectiv.co.uk

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Advances in hardware and software technology enable us to collect, store and distribute large quantities of data on a very large scale. Automatically discovering and extracting hidden knowledge in the form of patterns from these large data volumes is known as data mining. Data mining technology is not only a part of business intelligence, but is also used in many other application areas such as research, marketing and financial analytics. For example medical scientists can use patterns extracted from historic patient data in order to determine if a new patient is likely to respond positively to a particular treatment or not; marketing analysts can use extracted patterns from customer data for future advertisement campaigns; finance experts have an interest in patterns that forecast the development of certain stock market shares for investment recommendations. However, extracting knowledge in the form of patterns from massive data volumes imposes a number of computational challenges in terms of processing time, memory, bandwidth and power consumption. These challenges have led to the development of parallel and distributed data analysis approaches and the utilisation of Grid and Cloud computing. This chapter gives an overview of parallel and distributed computing approaches and how they can be used to scale up data mining to large datasets.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We describe ncWMS, an implementation of the Open Geospatial Consortium’s Web Map Service (WMS) specification for multidimensional gridded environmental data. ncWMS can read data in a large number of common scientific data formats – notably the NetCDF format with the Climate and Forecast conventions – then efficiently generate map imagery in thousands of different coordinate reference systems. It is designed to require minimal configuration from the system administrator and, when used in conjunction with a suitable client tool, provides end users with an interactive means for visualizing data without the need to download large files or interpret complex metadata. It is also used as a “bridging” tool providing interoperability between the environmental science community and users of geographic information systems. ncWMS implements a number of extensions to the WMS standard in order to fulfil some common scientific requirements, including the ability to generate plots representing timeseries and vertical sections. We discuss these extensions and their impact upon present and future interoperability. We discuss the conceptual mapping between the WMS data model and the data models used by gridded data formats, highlighting areas in which the mapping is incomplete or ambiguous. We discuss the architecture of the system and particular technical innovations of note, including the algorithms used for fast data reading and image generation. ncWMS has been widely adopted within the environmental data community and we discuss some of the ways in which the software is integrated within data infrastructures and portals.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The tropical tropopause is considered to be the main region of upward transport of tropospheric air carrying water vapor and other tracers to the tropical stratosphere. The lower tropical stratosphere is also the region where the quasi-biennial oscillation (QBO) in the zonal wind is observed. The QBO is positioned in the region where the upward transport of tropospheric tracers to the overworld takes place. Hence the QBO can in principle modulate these transports by its secondary meridional circulation. This modulation is investigated in this study by an analysis of general circulation model (GCM) experiments with an assimilated QBO. The experiments show, first, that the temperature signal of the QBO modifies the specific humidity in the air transported upward and, second, that the secondary meridional circulation modulates the velocity of the upward transport. Thus during the eastward phase of the QBO the upward moving air is moister and the upward velocity is less than during the westward phase of the QBO. It was further found that the QBO period is too short to allow an equilibration of the moisture in the QBO region. This causes a QBO signal of the moisture which is considerably smaller than what could be obtained in the limiting case of indefinitely long QBO phases. This also allows a high sensitivity of the mean moisture over a QBO cycle to the El Niño-Southern Oscillation (ENSO) phenomena or major tropical volcanic eruptions. The interplay of sporadic volcanic eruptions, ENSO, and QBO can produce low-frequency variability in the water vapor content of the tropical stratosphere, which renders the isolation of the QBO signal in observational data of water vapor in the equatorial lower stratosphere difficult.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

JASMIN is a super-data-cluster designed to provide a high-performance high-volume data analysis environment for the UK environmental science community. Thus far JASMIN has been used primarily by the atmospheric science and earth observation communities, both to support their direct scientific workflow, and the curation of data products in the STFC Centre for Environmental Data Archival (CEDA). Initial JASMIN configuration and first experiences are reported here. Useful improvements in scientific workflow are presented. It is clear from the explosive growth in stored data and use that there was a pent up demand for a suitable big-data analysis environment. This demand is not yet satisfied, in part because JASMIN does not yet have enough compute, the storage is fully allocated, and not all software needs are met. Plans to address these constraints are introduced.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We review the decision by the European Commission in the case of the UK Agricultural Registration Exchange. We propose a theoretical model, offering a basis for some of the intuitive arguments used by the Commission on the anti-competitive role of information exchange in the case of price and non price collusion. Market transparency on non price data is shown to be a collusion facilitating device which may achieve stability in otherwise unstable cartels.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Cloud computing is usually regarded as being energy efficient and thus emitting less greenhouse gases (GHG) than traditional forms of computing. When the energy consumption of Microsoft’s cloud computing Office 365 (O365) and traditional Office 2010 (O2010) software suites were tested and modeled, some cloud services were found to consume more energy than the traditional form. The developed model in this research took into consideration the energy consumption at the three main stages of data transmission; data center, network, and end user device. Comparable products from each suite were selected and activities were defined for each product to represent a different computing type. Microsoft provided highly confidential data for the data center stage, while the networking and user device stages were measured directly. A new measurement and software apportionment approach was defined and utilized allowing the power consumption of cloud services to be directly measured for the user device stage. Results indicated that cloud computing is more energy efficient for Excel and Outlook which consumed less energy and emitted less GHG than the standalone counterpart. The power consumption of the cloud based Outlook (8%) and Excel (17%) was lower than their traditional counterparts. However, the power consumption of the cloud version of Word was 17% higher than its traditional equivalent. A third mixed access method was also measured for Word which emitted 5% more GHG than the traditional version. It is evident that cloud computing may not provide a unified way forward to reduce energy consumption and GHG. Direct conversion from the standalone package into the cloud provision platform can now consider energy and GHG emissions at the software development and cloud service design stage using the methods described in this research.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We consider the forecasting performance of two SETAR exchange rate models proposed by Kräger and Kugler [J. Int. Money Fin. 12 (1993) 195]. Assuming that the models are good approximations to the data generating process, we show that whether the non-linearities inherent in the data can be exploited to forecast better than a random walk depends on both how forecast accuracy is assessed and on the ‘state of nature’. Evaluation based on traditional measures, such as (root) mean squared forecast errors, may mask the superiority of the non-linear models. Generalized impulse response functions are also calculated as a means of portraying the asymmetric response to shocks implied by such models.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper combines and generalizes a number of recent time series models of daily exchange rate series by using a SETAR model which also allows the variance equation of a GARCH specification for the error terms to be drawn from more than one regime. An application of the model to the French Franc/Deutschmark exchange rate demonstrates that out-of-sample forecasts for the exchange rate volatility are also improved when the restriction that the data it is drawn from a single regime is removed. This result highlights the importance of considering both types of regime shift (i.e. thresholds in variance as well as in mean) when analysing financial time series.