937 results for Algorithmic pairs trading, statistical arbitrage, Kalman filter, mean reversion.
Abstract:
Foreign exchange trading has emerged in recent times as a significant activity in many countries. As with most forms of trading, the activity is influenced by many random parameters so that the creation of a system that effectively emulates the trading process is very helpful. In this paper, we try to create such a system with a genetic algorithm engine to emulate trader behaviour on the foreign exchange market and to find the most profitable trading strategy.
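The abstract does not describe the genetic algorithm engine itself, so the following is only a minimal sketch of a generic bit-string genetic algorithm, not the authors' system. The function name `evolve`, the tournament size, the one-point crossover, and the toy fitness (the number of 1 bits) are all assumptions made for illustration. In a trading setting the bit string could encode the parameters of a trading rule and the fitness could be the rule's simulated profit.

```python
import random


def evolve(fitness, genome_len, pop_size=30, generations=40,
           mutation_rate=0.02, seed=0):
    """Generic bit-string GA (hypothetical sketch, not the paper's engine).

    Returns the best genome found and the best fitness seen in each generation.
    """
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        scored = sorted(pop, key=fitness, reverse=True)
        best_history.append(fitness(scored[0]))
        # Elitism: carry the current best genome over unchanged.
        next_pop = [scored[0][:]]
        while len(next_pop) < pop_size:
            # Tournament selection: best of three random individuals.
            p1 = max(rng.sample(pop, 3), key=fitness)
            p2 = max(rng.sample(pop, 3), key=fitness)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, genome_len)
            child = p1[:cut] + p2[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    return best, best_history


# Toy usage: maximise the number of 1 bits in a 20-bit genome.
best, history = evolve(sum, genome_len=20)
print(sum(best), history[0], history[-1])
```

Because the best genome is copied unchanged into each new generation, the recorded best fitness can never decrease from one generation to the next.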
Abstract:
A formalism for modelling the dynamics of Genetic Algorithms (GAs) using methods from statistical mechanics, originally due to Prügel-Bennett and Shapiro, is reviewed, generalized and improved upon. This formalism can be used to predict the averaged trajectory of macroscopic statistics describing the GA's population. These macroscopics are chosen to average well between runs, so that fluctuations from mean behaviour can often be neglected. Where necessary, non-trivial terms are determined by assuming maximum entropy with constraints on known macroscopics. Problems of realistic size are described in compact form and finite population effects are included, often proving to be of fundamental importance. The macroscopics used here are cumulants of an appropriate quantity within the population and the mean correlation (Hamming distance) within the population. Including the correlation as an explicit macroscopic provides a significant improvement over the original formulation. The formalism is applied to a number of simple optimization problems in order to determine its predictive power and to gain insight into GA dynamics. Problems which are most amenable to analysis come from the class where alleles within the genotype contribute additively to the phenotype. This class can be treated with some generality, including problems with inhomogeneous contributions from each site, non-linear or noisy fitness measures, simple diploid representations and temporally varying fitness. The results can also be applied to a simple learning problem, generalization in a binary perceptron, and a limit is identified for which the optimal training batch size can be determined for this problem. The theory is compared to averaged results from a real GA in each case, showing excellent agreement if the maximum entropy principle holds. Some situations where this approximation breaks down are identified.
In order to fully test the formalism, an attempt is made on the strongly NP-hard problem of storing random patterns in a binary perceptron. Here, the relationship between the genotype and phenotype (training error) is strongly non-linear. Mutation is modelled under the assumption that perceptron configurations are typical of perceptrons with a given training error. Unfortunately, this assumption does not provide a good approximation in general. It is conjectured that perceptron configurations would have to be constrained by other statistics in order to accurately model mutation for this problem. Issues arising from this study are discussed in conclusion and some possible areas of further research are outlined.
Abstract:
A major problem in modern probabilistic modeling is the huge computational complexity involved in typical calculations with multivariate probability distributions when the number of random variables is large. Because exact computations are infeasible in such cases and Monte Carlo sampling techniques may reach their limits, there is a need for methods that allow for efficient approximate computations. One of the simplest approximations is based on the mean field method, which has a long history in statistical physics. The method is widely used, particularly in the growing field of graphical models. Researchers from disciplines such as statistical physics, computer science, and mathematical statistics are studying ways to improve this and related methods and are exploring novel application areas. Leading approaches include the variational approach, which goes beyond factorizable distributions to achieve systematic improvements; the TAP (Thouless-Anderson-Palmer) approach, which incorporates correlations by including effective reaction terms in the mean field theory; and the more general methods of graphical models. Bringing together ideas and techniques from these diverse disciplines, this book covers the theoretical foundations of advanced mean field methods, explores the relation between the different approaches, examines the quality of the approximation obtained, and demonstrates their application to various areas of probabilistic modeling.
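As an illustration of the simplest approximation mentioned above, the sketch below iterates the standard naive mean field equations for an Ising-type model, m_i = tanh(beta * (h_i + sum_j J_ij m_j)), to a fixed point. It is not taken from the book: the function name `naive_mean_field`, the damping scheme, the iteration count, and the example couplings are assumptions chosen for illustration.

```python
import math


def naive_mean_field(J, h, beta=1.0, iters=200, damping=0.5):
    """Damped fixed-point iteration of the naive mean field equations.

    J is a symmetric coupling matrix (list of lists), h a list of local
    fields. Returns approximate magnetisations m_i in (-1, 1).
    """
    n = len(h)
    m = [0.0] * n
    for _ in range(iters):
        new_m = [
            math.tanh(beta * (h[i] + sum(J[i][j] * m[j]
                                         for j in range(n) if j != i)))
            for i in range(n)
        ]
        # Damping mixes old and new values to help the iteration settle.
        m = [damping * old + (1.0 - damping) * new
             for old, new in zip(m, new_m)]
    return m


# Hypothetical example: two ferromagnetically coupled spins in a weak field.
J = [[0.0, 0.5], [0.5, 0.0]]
h = [0.1, 0.1]
print(naive_mean_field(J, h))
```

With all couplings set to zero the iteration reduces to m_i = tanh(beta * h_i), which is the exact result for independent spins and a convenient sanity check.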
Abstract:
We discuss the application of TAP mean field methods, known from the statistical mechanics of disordered systems, to Bayesian classification with Gaussian processes. In contrast to previous applications, no knowledge about the distribution of inputs is needed. Simulation results for the Sonar data set are given.
Abstract:
We derive a mean field algorithm for binary classification with Gaussian processes which is based on the TAP approach originally proposed in the statistical physics of disordered systems. The theory also yields an approximate leave-one-out estimator for the generalization error, which is computed at no extra computational cost. We show that from the TAP approach, it is possible to derive both a simpler 'naive' mean field theory and support vector machines (SVMs) as limiting cases. For both mean field algorithms and support vector machines, simulation results for three small benchmark data sets are presented. They show (1) that one may get state-of-the-art performance by using the leave-one-out estimator for model selection and (2) that the built-in leave-one-out estimators are extremely precise when compared to the exact leave-one-out estimate. The latter result is taken as strong support for the internal consistency of the mean field approach.
Abstract:
The performance of "typical set (pairs) decoding" for ensembles of Gallager's linear code is investigated using statistical physics. In this decoding method, errors occur, either when the information transmission is corrupted by atypical noise, or when multiple typical sequences satisfy the parity check equation as provided by the received corrupted codeword. We show that the average error rate for the second type of error over a given code ensemble can be accurately evaluated using the replica method, including the sensitivity to message length. Our approach generally improves the existing analysis known in the information theory community, which was recently reintroduced in IEEE Trans. Inf. Theory 45, 399 (1999), and is believed to be the most accurate to date. © 2002 The American Physical Society.
Abstract:
This paper studies the behaviour of returns for a sample of cross-listed stocks, listed on both the Paris Bourse and SEAQ-International in London. The aim of the paper is to discover which market adjusts to fundamental news more quickly, the home market of Paris or SEAQ-International. We find that prices in London adjust to changes in their fundamental value more slowly than Paris prices, despite the ability to quickly arbitrage between the two markets. We suggest that this finding may reflect the type of trading that takes place in the two markets and differences associated with the reporting of large trades. We also estimate the amount of noise present in the two markets and show that the Paris market is noisier than London. © 2003 Published by Elsevier B.V.
Abstract:
Previously, it has been shown that the profits from a simple market timing trading rule applied to a portfolio of shares can be affected by the inter-relationships between the returns of the component securities. In this short letter, the results from applying a more sophisticated 'filter' rule to the same data are reported. Unlike the simple trading rule, the filter rule does produce some evidence of economic profits.
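The abstract does not state the exact filter rule used, so the sketch below assumes one common textbook formulation: go long once the price has risen by a fraction x from its most recent trough, and exit once it has fallen by x from its peak since entry. The function name `filter_rule_positions`, the long-or-flat position convention, and the sample prices are hypothetical.

```python
def filter_rule_positions(prices, x):
    """Positions (1 = long, 0 = out) under an assumed x-fraction filter rule.

    Enter when the price is at least x above the running trough; exit when
    it is at least x below the running peak since entry.
    """
    position = 0
    trough = peak = prices[0]
    positions = []
    for p in prices:
        if position == 0:
            trough = min(trough, p)
            if p >= trough * (1 + x):
                position = 1
                peak = p  # start tracking the peak from the entry price
        else:
            peak = max(peak, p)
            if p <= peak * (1 - x):
                position = 0
                trough = p  # start tracking the trough from the exit price
        positions.append(position)
    return positions


# Hypothetical price path with a 5% filter.
prices = [100, 98, 96, 101, 104, 108, 103, 99, 97, 103]
print(filter_rule_positions(prices, 0.05))
# [0, 0, 0, 1, 1, 1, 1, 0, 0, 1]
```

The filter size x trades off responsiveness against the number of trades: a smaller x reacts sooner but trades more often, which matters once transaction costs are counted.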
Abstract:
The techniques and insights from two distinct areas of financial economic modelling are combined to provide evidence of the influence of firm size on the volatility of stock portfolio returns. Portfolio returns are characterized by positive serial correlation induced by the varying levels of non-synchronous trading among the component stocks. This serial correlation is greatest for portfolios of small firms. The conditional volatility of stock returns has been shown to be well represented by the GARCH family of statistical processes. Using a GARCH model of the variance of capitalization-based portfolio returns, conditioned on the autocorrelation structure in the conditional mean, striking differences related to firm size are uncovered.
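To make the modelling idea concrete, the sketch below filters a return series through an AR(1) conditional mean and a GARCH(1,1) conditional variance, sigma2_t = omega + alpha * eps_{t-1}^2 + beta * sigma2_{t-1}. The abstract does not give the model orders, so AR(1)-GARCH(1,1) is an assumption, as are the function name `ar1_garch11_filter`, the parameter values, and the choice to start the variance at its unconditional level.

```python
def ar1_garch11_filter(returns, mu, phi, omega, alpha, beta):
    """Residuals and conditional variances for an assumed AR(1)-GARCH(1,1).

    Mean:     r_t = mu + phi * r_{t-1} + eps_t
    Variance: sigma2_t = omega + alpha * eps_{t-1}**2 + beta * sigma2_{t-1}
    """
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite variance")
    # Start from the unconditional variance omega / (1 - alpha - beta).
    sigma2 = omega / (1.0 - alpha - beta)
    prev_r = returns[0]
    residuals, variances = [], []
    for r in returns[1:]:
        eps = r - (mu + phi * prev_r)  # AR(1) residual
        residuals.append(eps)
        variances.append(sigma2)       # variance in force for this period
        sigma2 = omega + alpha * eps ** 2 + beta * sigma2
        prev_r = r
    return residuals, variances


# Hypothetical parameters and returns, for illustration only.
eps, var = ar1_garch11_filter([0.0, 0.02, -0.01], mu=0.0, phi=0.5,
                              omega=0.0001, alpha=0.1, beta=0.8)
print(eps, var)
```

A positive phi plays the role of the positive serial correlation that the abstract attributes to non-synchronous trading; removing it in the mean equation before modelling the variance keeps that autocorrelation from being mistaken for volatility.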
Abstract:
The suitability of a new plastic supporting medium for biofiltration was tested over a three-year period. Tests were carried out on the stability, surface properties, mechanical strength, and dimensions of the medium. There was no evidence to suggest that the medium was deficient in any of these respects. The specific surface (320 m² m⁻³) and the voidage (94%) of the new medium are unlike those of any other medium used in biofiltration, and a pilot plant containing two filters was built to observe its effects on ecology and performance. Performance was estimated by chemical analysis and ecology was studied by film examination and fauna counts. A system of removable sampling baskets was designed to enable samples to be obtained from two intermediate depths of filter. One of the major operating problems of percolating filters is excessive accumulation of film. The amount of film is influenced by hydraulic and organic load, and each filter was run at a different loading. One was operated at 1.2 m³ m⁻³ day⁻¹ (BOD load 0.24 kg m⁻³ day⁻¹), judged at the time to be the lowest filtration rate to offer advantages over conventional media. The other filter was operated at more than twice this loading (2.4 m³ m⁻³ day⁻¹; BOD load 0.55 kg m⁻³ day⁻¹), giving roughly 2.5x and 6x the conventional loadings recommended for a Royal Commission effluent. The amount of film in each filter was normally low (0.05-3 kg m⁻³ as volatile solids) and did not affect efficiency. The evidence collected during the study indicated that the ecology of the filters was normal when compared with data from the literature relating to filters with mineral media. There were indications that full ecological stability was yet to be reached and that this was affecting the efficiency of the filters. The lower rate filter produced an average 87% BOD removal, giving a consistent Royal Commission effluent during the summer months. The higher rate filter produced a mean 83% BOD removal but at no stage a consistent Royal Commission effluent.
From the data on ecology and performance, the filters resembled conventional filters rather than high rate filters.
Abstract:
The dynamics of the non-equilibrium Ising model with parallel updates is investigated using a generalized mean field approximation that incorporates multiple two-site correlations at any two time steps, which can be obtained recursively. The proposed method shows significant improvement in predicting local system properties compared to other mean field approximation techniques, particularly in systems with symmetric interactions. Results are also evaluated against those obtained from Monte Carlo simulations. The method is also employed to obtain parameter values for the kinetic inverse Ising modeling problem, where couplings and local field values of a fully connected spin system are inferred from data. © 2014 IOP Publishing Ltd and SISSA Medialab srl.
Abstract:
We present a data-based statistical study on the effects of seasonal variations in the growth rates of gastro-intestinal (GI) parasitic infection in livestock. The growth rate is estimated through the variation in the number of eggs per gram (EPG) of faeces in animals. In accordance with earlier studies, our analysis also shows that rainfall is the dominant variable in determining EPG infection rates compared to other macro-parameters like temperature and humidity. Our statistical analysis clearly indicates an oscillatory dependence of EPG levels on rainfall fluctuations. The monsoon season recorded the highest infection, at least 2.5 times that of the next most infected period (summer). A least-squares fit of the EPG versus rainfall data indicates an approach towards a super diffusive (i.e. root mean square displacement growing faster than the square root of the elapsed time, as obtained for simple diffusion) infection growth pattern for low rainfall regimes (technically defined as zeroth level dependence) that gets remarkably augmented for large rainfall zones. Our analysis further indicates that for low fluctuations in temperature (true of the bulk of the data), the EPG level saturates beyond a critical value of the rainfall, a threshold that is expected to indicate the onset of the nonlinear regime. The probability density functions (PDFs) of the EPG data show oscillatory behavior in the large rainfall regime (greater than 500 mm), the frequency of oscillation, once again, being determined by the ambient wetness (rainfall and humidity). Data recorded over three pilot projects spanning three measures of rainfall and humidity bear testimony to the universality of this statistical argument. © 2013 Chattopadhyay and Bandyopadhyay.
Abstract:
Purpose: To determine the effect of coloured light filter overlays on reading rates for people with age-related macular degeneration (AMD). Method: Using a prospective clinical trial design, we examined the null hypothesis that coloured light filter overlays do not improve reading rates in AMD when compared to a clear filter. Reading rates for 12 subjects with non-exudative AMD, associated with a relative scotoma and central fixation (mean age 81 years, SD 5.07 years), were determined using the Rate of Reading Test® (printed, nonsense, lower case sans serif, stationary text) with 10 different coloured light filter overlays (Intuitive Overlays®; figures in brackets are percentage transmission values): rose (78%), pink (78%), purple (67%), aqua (81%), blue (74%), lime-green (86%), mint-green (85%), yellow (93%), orange (83%) and grey (71%). A clear overlay (Roscolene #00) (360 cd m⁻²) with 100% transmittance was used as a control. Results: ANOVA indicated that there was no statistically significant difference in reading rates with the coloured light filter overlays compared to the clear filter. Furthermore, chi-squared analysis indicated that the rose, purple and blue filters had a significantly poorer overall ranking in terms of reading rates compared to the other coloured and clear light filters. Conclusion: Coloured light filter overlays are unlikely to provide a clinically significant improvement in reading rates for people with non-exudative AMD associated with a relative scotoma and central fixation. Copyright © Acta Ophthalmol Scand 2004.
Abstract:
Natural language understanding aims to specify a computational model that maps sentences to their semantic meaning representations. In this paper, we propose a novel framework to train statistical models without using expensive fully annotated data. In particular, the input of our framework is a set of sentences labeled with abstract semantic annotations. These annotations encode the underlying embedded semantic structural relations without explicit word/semantic tag alignment. The proposed framework can automatically induce derivation rules that map sentences to their semantic meaning representations. The learning framework is applied to two statistical models, conditional random fields (CRFs) and hidden Markov support vector machines (HM-SVMs). Our experimental results on the DARPA Communicator data show that both CRFs and HM-SVMs outperform the baseline approach, the previously proposed hidden vector state (HVS) model, which is also trained on abstract semantic annotations. In addition, the proposed framework outperforms two other baseline approaches, a hybrid framework combining HVS and HM-SVMs and discriminative training of HVS, with relative error reductions of about 25% and 15% in F-measure.
Abstract:
False friends are pairs of words in two languages that are perceived as similar but have different meanings. We present an improved algorithm for acquiring false friends from a sentence-level aligned parallel corpus, based on statistical observations of word occurrences and co-occurrences in the parallel sentences. The results are compared with an entirely semantic measure of cross-lingual similarity between words, which uses the Web as a corpus by analyzing the words' local contexts extracted from the text snippets returned by Google searches. The statistical and semantic measures are further combined into an improved algorithm for identification of false friends that achieves results almost twice as good as those of previously known algorithms. The evaluation is performed for identifying cognates between Bulgarian and Russian, but the proposed methods could be adopted for other language pairs for which parallel corpora and bilingual glossaries are available.