933 resultados para Software Package Data Exchange (SPDX)


Relevância:

30.00% 30.00%

Publicador:

Resumo:

Many recent papers have documented periodicities in returns, return volatility, bid–ask spreads and trading volume, in both equity and foreign exchange markets. We propose and employ a new test for detecting subtle periodicities in time series data based on a signal coherence function. The technique is applied to a set of seven half-hourly exchange rate series. Overall, we find the signal coherence to be maximal at the 8-h and 12-h frequencies. Retaining only the most coherent frequencies for each series, we implement a trading rule that is based on these observed periodicities. Our results demonstrate in all cases except one that, in gross terms, the rules can generate returns that are considerably greater than those of a buy-and-hold strategy, although they cannot retain their profitability net of transactions costs. We conjecture that this methodology could constitute an important tool for financial market researchers which will enable them to detect, quantify and rank the various periodic components in financial data better.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In recent years, a sharp divergence of London Stock Exchange equity prices from dividends has been noted. In this paper, we examine whether this divergence can be explained by reference to the existence of a speculative bubble. Three different empirical methodologies are used: variance bounds tests, bubble specification tests, and cointegration tests based on both ex post and ex ante data. We find that, stock prices diverged significantly from their fundamental values during the late 1990's, and that this divergence has all the characteristics of a bubble.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper deals with the integration of radial basis function (RBF) networks into the industrial software control package Connoisseur. The paper shows the improved modelling capabilities offered by RBF networks within the Connoisseur environment compared to linear modelling techniques such as recursive least squares. The paper also goes on to mention the way this improved modelling capability, obtained through the RBF networks will be utilised within Connoisseur.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Metabolic stable isotope labeling is increasingly employed for accurate protein (and metabolite) quantitation using mass spectrometry (MS). It provides sample-specific isotopologues that can be used to facilitate comparative analysis of two or more samples. Stable Isotope Labeling by Amino acids in Cell culture (SILAC) has been used for almost a decade in proteomic research and analytical software solutions have been established that provide an easy and integrated workflow for elucidating sample abundance ratios for most MS data formats. While SILAC is a discrete labeling method using specific amino acids, global metabolic stable isotope labeling using isotopes such as (15)N labels the entire element content of the sample, i.e. for (15)N the entire peptide backbone in addition to all nitrogen-containing side chains. Although global metabolic labeling can deliver advantages with regard to isotope incorporation and costs, the requirements for data analysis are more demanding because, for instance for polypeptides, the mass difference introduced by the label depends on the amino acid composition. Consequently, there has been less progress on the automation of the data processing and mining steps for this type of protein quantitation. Here, we present a new integrated software solution for the quantitative analysis of protein expression in differential samples and show the benefits of high-resolution MS data in quantitative proteomic analyses.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper builds upon previous research on currency bands, and provides a model for the Colombian peso. Stochastic differential equations are combined with information related to the Colombian currency band to estimate competing models of the behaviour of the Colombian peso within the limits of the currency band. The resulting moments of the density function for the simulated returns describe adequately most of the characteristics of the sample returns data. The factor included to account for the intra-marginal intervention performed to drive the rate towards the Central Parity accounts only for 6.5% of the daily change, which supports the argument that intervention, if performed by the Central Bank, it is not directed to push the currency towards the limits. Moreover, the credibility of the Colombian Central Bank, Banco de la República’s ability to defend the band seems low.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Much consideration is rightly given to the design of metadata models to describe data. At the other end of the data-delivery spectrum much thought has also been given to the design of geospatial delivery interfaces such as the Open Geospatial Consortium standards, Web Coverage Service (WCS), Web Map Server and Web Feature Service (WFS). Our recent experience with the Climate Science Modelling Language shows that an implementation gap exists where many challenges remain unsolved. To bridge this gap requires transposing information and data from one world view of geospatial climate data to another. Some of the issues include: the loss of information in mapping to a common information model, the need to create ‘views’ onto file-based storage, and the need to map onto an appropriate delivery interface (as with the choice between WFS and WCS for feature types with coverage-valued properties). Here we summarise the approaches we have taken in facing up to these problems.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

A new electronic software distribution (ESD) life cycle analysis (LCA)methodology and model structure were constructed to calculate energy consumption and greenhouse gas (GHG) emissions. In order to counteract the use of high level, top-down modeling efforts, and to increase result accuracy, a focus upon device details and data routes was taken. In order to compare ESD to a relevant physical distribution alternative,physical model boundaries and variables were described. The methodology was compiled from the analysis and operational data of a major online store which provides ESD and physical distribution options. The ESD method included the calculation of power consumption of data center server and networking devices. An in-depth method to calculate server efficiency and utilization was also included to account for virtualization and server efficiency features. Internet transfer power consumption was analyzed taking into account the number of data hops and networking devices used. The power consumed by online browsing and downloading was also factored into the model. The embedded CO2e of server and networking devices was proportioned to each ESD process. Three U.K.-based ESD scenarios were analyzed using the model which revealed potential CO2e savings of 83% when ESD was used over physical distribution. Results also highlighted the importance of server efficiency and utilization methods.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Despite the increasing use of groupware technologies in education, there is little evidence of their impact, especially within an enquiry-based learning (EBL) context. In this paper, we examine the use of a commercial standard Group Intelligence software called GroupSystems®ThinkTank. To date, ThinkTank has been adopted mainly in the USA and supports teams in generating ideas, categorising, prioritising, voting and multi-criteria decision-making and automatically generates a report at the end of each session. The software was used by students carrying out an EBL project, set by employers, for a full academic year. The criteria for assessing the impact of ThinkTank on student learning were those of creativity, participation, productivity, engagement and understanding. Data was collected throughout the year using a combination of interviews and questionnaires, and written feedback from employers. The overall findings show an increase in levels of productivity and creativity, evidence of a deeper understanding of their work but some variation in attitudes towards participation in the early stages of the project.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Recently major processor manufacturers have announced a dramatic shift in their paradigm to increase computing power over the coming years. Instead of focusing on faster clock speeds and more powerful single core CPUs, the trend clearly goes towards multi core systems. This will also result in a paradigm shift for the development of algorithms for computationally expensive tasks, such as data mining applications. Obviously, work on parallel algorithms is not new per se but concentrated efforts in the many application domains are still missing. Multi-core systems, but also clusters of workstations and even large-scale distributed computing infrastructures provide new opportunities and pose new challenges for the design of parallel and distributed algorithms. Since data mining and machine learning systems rely on high performance computing systems, research on the corresponding algorithms must be on the forefront of parallel algorithm research in order to keep pushing data mining and machine learning applications to be more powerful and, especially for the former, interactive. To bring together researchers and practitioners working in this exciting field, a workshop on parallel data mining was organized as part of PKDD/ECML 2006 (Berlin, Germany). The six contributions selected for the program describe various aspects of data mining and machine learning approaches featuring low to high degrees of parallelism: The first contribution focuses the classic problem of distributed association rule mining and focuses on communication efficiency to improve the state of the art. After this a parallelization technique for speeding up decision tree construction by means of thread-level parallelism for shared memory systems is presented. The next paper discusses the design of a parallel approach for dis- tributed memory systems of the frequent subgraphs mining problem. This approach is based on a hierarchical communication topology to solve issues related to multi-domain computational envi- ronments. The forth paper describes the combined use and the customization of software packages to facilitate a top down parallelism in the tuning of Support Vector Machines (SVM) and the next contribution presents an interesting idea concerning parallel training of Conditional Random Fields (CRFs) and motivates their use in labeling sequential data. The last contribution finally focuses on very efficient feature selection. It describes a parallel algorithm for feature selection from random subsets. Selecting the papers included in this volume would not have been possible without the help of an international Program Committee that has provided detailed reviews for each paper. We would like to also thank Matthew Otey who helped with publicity for the workshop.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

We describe a model-data fusion (MDF) inter-comparison project (REFLEX), which compared various algorithms for estimating carbon (C) model parameters consistent with both measured carbon fluxes and states and a simple C model. Participants were provided with the model and with both synthetic net ecosystem exchange (NEE) of CO2 and leaf area index (LAI) data, generated from the model with added noise, and observed NEE and LAI data from two eddy covariance sites. Participants endeavoured to estimate model parameters and states consistent with the model for all cases over the two years for which data were provided, and generate predictions for one additional year without observations. Nine participants contributed results using Metropolis algorithms, Kalman filters and a genetic algorithm. For the synthetic data case, parameter estimates compared well with the true values. The results of the analyses indicated that parameters linked directly to gross primary production (GPP) and ecosystem respiration, such as those related to foliage allocation and turnover, or temperature sensitivity of heterotrophic respiration, were best constrained and characterised. Poorly estimated parameters were those related to the allocation to and turnover of fine root/wood pools. Estimates of confidence intervals varied among algorithms, but several algorithms successfully located the true values of annual fluxes from synthetic experiments within relatively narrow 90% confidence intervals, achieving >80% success rate and mean NEE confidence intervals <110 gC m−2 year−1 for the synthetic case. Annual C flux estimates generated by participants generally agreed with gap-filling approaches using half-hourly data. The estimation of ecosystem respiration and GPP through MDF agreed well with outputs from partitioning studies using half-hourly data. Confidence limits on annual NEE increased by an average of 88% in the prediction year compared to the previous year, when data were available. Confidence intervals on annual NEE increased by 30% when observed data were used instead of synthetic data, reflecting and quantifying the addition of model error. Finally, our analyses indicated that incorporating additional constraints, using data on C pools (wood, soil and fine roots) would help to reduce uncertainties for model parameters poorly served by eddy covariance data.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Climate modeling is a complex process, requiring accurate and complete metadata in order to identify, assess and use climate data stored in digital repositories. The preservation of such data is increasingly important given the development of ever-increasingly complex models to predict the effects of global climate change. The EU METAFOR project has developed a Common Information Model (CIM) to describe climate data and the models and modelling environments that produce this data. There is a wide degree of variability between different climate models and modelling groups. To accommodate this, the CIM has been designed to be highly generic and flexible, with extensibility built in. METAFOR describes the climate modelling process simply as "an activity undertaken using software on computers to produce data." This process has been described as separate UML packages (and, ultimately, XML schemas). This fairly generic structure canbe paired with more specific "controlled vocabularies" in order to restrict the range of valid CIM instances. The CIM will aid digital preservation of climate models as it will provide an accepted standard structure for the model metadata. Tools to write and manage CIM instances, and to allow convenient and powerful searches of CIM databases,. Are also under development. Community buy-in of the CIM has been achieved through a continual process of consultation with the climate modelling community, and through the METAFOR team’s development of a questionnaire that will be used to collect the metadata for the Intergovernmental Panel on Climate Change’s (IPCC) Coupled Model Intercomparison Project Phase 5 (CMIP5) model runs.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Pocket Data Mining (PDM) is our new term describing collaborative mining of streaming data in mobile and distributed computing environments. With sheer amounts of data streams are now available for subscription on our smart mobile phones, the potential of using this data for decision making using data stream mining techniques has now been achievable owing to the increasing power of these handheld devices. Wireless communication among these devices using Bluetooth and WiFi technologies has opened the door wide for collaborative mining among the mobile devices within the same range that are running data mining techniques targeting the same application. This paper proposes a new architecture that we have prototyped for realizing the significant applications in this area. We have proposed using mobile software agents in this application for several reasons. Most importantly the autonomic intelligent behaviour of the agent technology has been the driving force for using it in this application. Other efficiency reasons are discussed in details in this paper. Experimental results showing the feasibility of the proposed architecture are presented and discussed.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Distributed and collaborative data stream mining in a mobile computing environment is referred to as Pocket Data Mining PDM. Large amounts of available data streams to which smart phones can subscribe to or sense, coupled with the increasing computational power of handheld devices motivates the development of PDM as a decision making system. This emerging area of study has shown to be feasible in an earlier study using technological enablers of mobile software agents and stream mining techniques [1]. A typical PDM process would start by having mobile agents roam the network to discover relevant data streams and resources. Then other (mobile) agents encapsulating stream mining techniques visit the relevant nodes in the network in order to build evolving data mining models. Finally, a third type of mobile agents roam the network consulting the mining agents for a final collaborative decision, when required by one or more users. In this paper, we propose the use of distributed Hoeffding trees and Naive Bayes classifers in the PDM framework over vertically partitioned data streams. Mobile policing, health monitoring and stock market analysis are among the possible applications of PDM. An extensive experimental study is reported showing the effectiveness of the collaborative data mining with the two classifers.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Advances in hardware and software in the past decade allow to capture, record and process fast data streams at a large scale. The research area of data stream mining has emerged as a consequence from these advances in order to cope with the real time analysis of potentially large and changing data streams. Examples of data streams include Google searches, credit card transactions, telemetric data and data of continuous chemical production processes. In some cases the data can be processed in batches by traditional data mining approaches. However, in some applications it is required to analyse the data in real time as soon as it is being captured. Such cases are for example if the data stream is infinite, fast changing, or simply too large in size to be stored. One of the most important data mining techniques on data streams is classification. This involves training the classifier on the data stream in real time and adapting it to concept drifts. Most data stream classifiers are based on decision trees. However, it is well known in the data mining community that there is no single optimal algorithm. An algorithm may work well on one or several datasets but badly on others. This paper introduces eRules, a new rule based adaptive classifier for data streams, based on an evolving set of Rules. eRules induces a set of rules that is constantly evaluated and adapted to changes in the data stream by adding new and removing old rules. It is different from the more popular decision tree based classifiers as it tends to leave data instances rather unclassified than forcing a classification that could be wrong. The ongoing development of eRules aims to improve its accuracy further through dynamic parameter setting which will also address the problem of changing feature domain values.