986 results for Small-error approximation
Abstract:
As low carbon technologies become more pervasive, distribution network operators are looking to support the expected changes in demand on low voltage networks through the smarter control of storage devices. Accurate forecasts of demand at the individual household level, or for small aggregations of households, can improve the peak demand reduction delivered by such devices by helping to plan appropriate charging and discharging cycles. However, before such methods can be developed, validation measures are required which can assess the accuracy and usefulness of forecasts of volatile and noisy household-level demand. In this paper we introduce a new forecast verification error measure that reduces the so-called “double penalty” effect incurred, relative to traditional point-wise metrics such as the Mean Absolute Error and p-norms in general, by forecasts whose features are displaced in space or time. The proposed measure is based on finding a restricted permutation of the original forecast that minimises the point-wise error according to a given metric. We illustrate the advantages of the measure using half-hourly domestic household electrical energy usage data recorded by smart meters, and discuss the effect of the permutation restriction.
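As a rough illustration of the idea (assuming, for concreteness, a time-window restriction of w half-hourly slots and an absolute-error metric; the exact restriction and metric used in the paper may differ), the adjusted error can be computed by solving a small assignment problem:

```python
# Hedged sketch: permutation-restricted error measure.  We search for the
# assignment of forecast values to time slots, with displacements limited to
# `w` slots, that minimises the mean absolute error against the observations.
import numpy as np
from scipy.optimize import linear_sum_assignment

def adjusted_mae(forecast, actual, w=1):
    n = len(actual)
    idx = np.arange(n)
    # Cost of placing forecast value j at observation slot i.
    cost = np.abs(actual[:, None] - forecast[None, :])
    # Forbid displacements of more than w slots with a large penalty.
    cost[np.abs(idx[:, None] - idx[None, :]) > w] = 1e9
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].mean()

actual   = np.array([0.2, 0.2, 2.5, 0.3, 0.2, 0.2])  # observed peak in slot 2
forecast = np.array([0.2, 0.2, 0.3, 2.5, 0.2, 0.2])  # same peak, one slot late
print(np.abs(forecast - actual).mean())  # plain MAE: double penalty for the shifted peak
print(adjusted_mae(forecast, actual))    # adjusted error: the displaced peak is forgiven
```

With the window set to zero the measure reduces to the plain MAE; widening it trades temporal precision for tolerance of displaced features.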
Abstract:
Background: Expression microarrays are increasingly used to obtain large-scale transcriptomic information on a wide range of biological samples. Nevertheless, there is still much debate on the best ways to process data, to design experiments and to analyse the output. Furthermore, many of the more sophisticated mathematical approaches to data analysis in the literature remain inaccessible to much of the biological research community. In this study we examine ways of extracting and analysing a large data set obtained using the Agilent long oligonucleotide transcriptomics platform, applied to a set of human macrophage and dendritic cell samples. Results: We describe and validate a series of data extraction, transformation and normalisation steps which are implemented via a new R function. Analysis of replicate normalised reference data demonstrates that intra-array variability is small (only around 2% of the mean log signal), while inter-array variability from replicate array measurements has a standard deviation (SD) of around 0.5 log2 units (6% of the mean). The common practice of working with ratios of Cy5/Cy3 signal offers little further improvement in terms of reducing error. Comparison to expression data obtained using Arabidopsis samples demonstrates that the large number of genes in each sample showing a low level of transcription reflects the real complexity of the cellular transcriptome. Multidimensional scaling is used to show that the processed data identify an underlying structure which reflects some of the key biological variables defining the data set. This structure is robust, allowing reliable comparison of samples collected over a number of years by a variety of operators. Conclusions: This study outlines a robust and easily implemented pipeline for extracting, transforming, normalising and visualising transcriptomic array data from the Agilent expression platform. The analysis is used to obtain quantitative estimates of the SD arising from experimental (non-biological) intra- and inter-array variability, and a lower threshold for determining whether an individual gene is expressed. The study provides a reliable basis for further, more extensive studies of the systems biology of eukaryotic cells.
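As an illustration of how such variability estimates can be computed (this is not the paper's R pipeline, and the numbers below come from synthetic data generated purely for demonstration), one can take the SD across duplicate spots within an array and across replicate arrays on the log2 scale:

```python
# Illustrative only: synthetic log2 signals standing in for replicate array data.
import numpy as np

rng = np.random.default_rng(0)
genes = 1000
true_log2 = rng.uniform(6.0, 14.0, genes)                    # "true" expression per gene
arrays = [true_log2 + rng.normal(0.0, 0.5, genes)            # inter-array noise
          for _ in range(4)]
spots = [np.stack([a + rng.normal(0.0, 0.2, genes)           # duplicate-spot (intra-array) noise
                   for _ in range(2)]) for a in arrays]

intra_sd = np.mean([s.std(axis=0, ddof=1) for s in spots])   # SD across duplicate spots
inter_sd = np.stack([s.mean(axis=0) for s in spots]).std(axis=0, ddof=1).mean()
print(f"intra-array SD ~ {intra_sd:.2f}, inter-array SD ~ {inter_sd:.2f} (log2 units)")
```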
Abstract:
We study the approximation of harmonic functions by means of harmonic polynomials in two-dimensional, bounded, star-shaped domains. Assuming that the functions possess analytic extensions to a δ-neighbourhood of the domain, we prove exponential convergence of the approximation error with respect to the degree of the approximating harmonic polynomial. All the constants appearing in the bounds are explicit and depend only on the shape-regularity of the domain and on δ. We apply the obtained estimates to show exponential convergence, with rate O(exp(−b N^{1/2})), N being the number of degrees of freedom and b > 0, of an hp-dGFEM discretisation of the Laplace equation based on piecewise harmonic polynomials. This result is an improvement over the classical rate O(exp(−b N^{1/3})), and is due to the use of harmonic polynomial spaces as opposed to complete polynomial spaces.
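The improvement in the rate can be seen from a simple dimension count; the sketch below (with constants suppressed and a geometric hp mesh of O(p) elements assumed, as is typical for such estimates) paraphrases the argument rather than the paper's precise statement.

```latex
% Exponential convergence in the degree p of the harmonic polynomial space H_p:
\[
  \inf_{v \in H_p} \|u - v\| \;\le\; C\, e^{-b p}.
\]
% In two dimensions, dim H_p = 2p + 1 = O(p), whereas the full polynomial space
% satisfies dim P_p = (p+1)(p+2)/2 = O(p^2).  On an hp mesh with O(p) elements
% the total number of degrees of freedom therefore scales like N = O(p^2) for
% harmonic spaces and N = O(p^3) for full spaces, i.e. p ~ N^{1/2} vs p ~ N^{1/3},
% which turns e^{-bp} into the global rates
\[
  O\!\left(e^{-b\sqrt{N}}\right) \quad\text{vs.}\quad O\!\left(e^{-b\sqrt[3]{N}}\right).
\]
```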
Abstract:
For certain observation types, such as remotely sensed data, the observation errors are correlated, and these correlations are state- and time-dependent. In this work, we develop a method for diagnosing and incorporating spatially correlated and time-dependent observation error in an ensemble data assimilation system. The method combines an ensemble transform Kalman filter with a technique that uses statistical averages of background and analysis innovations to provide an estimate of the observation error covariance matrix. To evaluate the performance of the method, we perform identical twin experiments using the Lorenz ’96 and Kuramoto-Sivashinsky models. Using our approach, a good approximation to the true observation error covariance can be recovered in cases where the initial estimate of the error covariance is incorrect. Spatial observation error covariances whose true length scale changes slowly in time can also be captured. We find that using the estimated correlated observation error in the assimilation improves the analysis.
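A minimal sketch of the innovation-based estimate of R that the abstract alludes to (a Desroziers-style diagnostic), assuming a linear observation operator H and stored background and analysis states from the filter; variable names are illustrative.

```python
# Hedged sketch: estimate the observation error covariance from statistical
# averages of background and analysis innovations, R ≈ E[(y - H x_a)(y - H x_b)^T].
import numpy as np

def estimate_R(obs, x_background, x_analysis, H):
    """obs: (T, p) observations; x_background, x_analysis: (T, n) states; H: (p, n)."""
    d_ob = obs - x_background @ H.T   # background innovations  y - H x_b
    d_oa = obs - x_analysis   @ H.T   # analysis residuals      y - H x_a
    R_hat = (d_oa.T @ d_ob) / obs.shape[0]
    return 0.5 * (R_hat + R_hat.T)    # symmetrise before reuse in the assimilation
```

Refreshing such an estimate as the filter cycles is what allows slowly varying error correlations, like those described above, to be tracked.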
Abstract:
The techno-economic performance of a small wind turbine is very sensitive to the available wind resource. However, due to financial and practical constraints, installers rely on low resolution wind speed databases to assess a potential site. This study investigates whether the two site assessment tools currently used in the UK, NOABL and the Energy Saving Trust wind speed estimator, are accurate enough to estimate the techno-economic performance of a small wind turbine. Both tools tend to overestimate the wind speed, with mean errors of 23% and 18% for the NOABL and Energy Saving Trust tools, respectively. A techno-economic assessment of 33 small wind turbines at each site shows that these errors can have a significant impact on the estimated load factor of an installation. Consequently, site/turbine combinations which are not economically viable can be predicted to be viable. Furthermore, both models tend to underestimate the wind resource at relatively high wind speed sites; this can lead to missed opportunities, as economically viable turbine/site combinations are predicted to be non-viable. These results show that a better understanding of the local wind resource is required to make small wind turbines a viable technology in the UK.
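To see why a wind speed error of this magnitude matters economically, note that turbine output grows roughly with the cube of wind speed below rated power. The sketch below uses a simplified idealised power curve and a Rayleigh wind speed distribution (assumptions made here for illustration, not the study's assessment method) to show how an overestimated mean speed inflates the load factor.

```python
# Hedged sketch: effect of an overestimated annual mean wind speed on the
# load factor of a generic small turbine.
import numpy as np

def load_factor(mean_speed, rated_power=5.0, cut_in=3.0, rated_speed=11.0, cut_out=25.0):
    v = np.linspace(0.0, 30.0, 3001)
    sigma = mean_speed * np.sqrt(2.0 / np.pi)            # Rayleigh with given mean speed
    pdf = (v / sigma**2) * np.exp(-v**2 / (2.0 * sigma**2))
    # Idealised power curve: cubic ramp between cut-in and rated speed, flat to cut-out.
    power = np.where((v >= cut_in) & (v < rated_speed),
                     rated_power * (v**3 - cut_in**3) / (rated_speed**3 - cut_in**3),
                     np.where((v >= rated_speed) & (v <= cut_out), rated_power, 0.0))
    return np.trapz(power * pdf, v) / rated_power

true_speed = 5.0                      # m/s, measured at the site
predicted  = true_speed * 1.23        # a NOABL-style overestimate of ~23%
print(f"load factor, measured wind:  {load_factor(true_speed):.2%}")
print(f"load factor, predicted wind: {load_factor(predicted):.2%}")
```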
Abstract:
The Monte Carlo Independent Column Approximation (McICA) is a flexible method for representing subgrid-scale cloud inhomogeneity in radiative transfer schemes. It does, however, introduce conditional random errors, but these have been shown to have little effect on climate simulations, where the spatial and temporal scales of interest are large enough for the effects of noise to be averaged out. This article considers the effect of McICA noise on a numerical weather prediction (NWP) model, where the time and spatial scales of interest are much closer to those at which the errors manifest themselves; this, as we show, means that the noise is more significant. We suggest methods for efficiently reducing the magnitude of McICA noise and test these methods in a global NWP version of the UK Met Office Unified Model (MetUM). The resultant errors are put into context by comparison with errors due to the widely used assumption of maximum-random-overlap of plane-parallel homogeneous cloud. For a simple implementation of the McICA scheme, forecasts of near-surface temperature are found to be worse than those obtained using the plane-parallel, maximum-random-overlap representation of clouds. However, by applying the methods suggested in this article, we can reduce the noise enough to give forecasts of near-surface temperature that are an improvement on the plane-parallel, maximum-random-overlap forecasts. We conclude that the McICA scheme can be used to improve the representation of clouds in NWP models, provided that the associated noise is sufficiently small.
Abstract:
We present and analyse a space–time discontinuous Galerkin method for wave propagation problems. The special feature of the scheme is that it is a Trefftz method, namely that the trial and test functions are solutions of the partial differential equation to be discretised in each element of the (space–time) mesh. The method considered is a modification of the discontinuous Galerkin schemes of Kretzschmar et al. (2014) and of Monk & Richter (2005). For Maxwell’s equations in one space dimension, we prove stability of the method, quasi-optimality, best approximation estimates for polynomial Trefftz spaces and (fully explicit) error bounds of high order in the meshwidth and in the polynomial degree. The analysis framework also applies to scalar wave problems and to Maxwell’s equations in higher space dimensions. Some numerical experiments demonstrate the theoretical results and the faster convergence compared to the non-Trefftz version of the scheme.
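Schematically, the quasi-optimality statement mentioned above takes the usual form shown below; the precise mesh-dependent norms and the constant are those defined in the paper and are only indicated here.

```latex
% Hedged sketch of the quasi-optimality bound for the Trefftz-DG solution u_hp;
% the DG norms on the left and right are the paper's mesh-dependent norms.
\[
  \| u - u_{hp} \|_{\mathrm{DG}}
  \;\le\; C \,
  \inf_{v_{hp} \in \mathbf{V}_p(\mathcal{T}_h)} \| u - v_{hp} \|_{\mathrm{DG}^+},
\]
% where V_p(T_h) is the piecewise polynomial Trefftz space on the space-time mesh.
% Combined with best-approximation estimates for Trefftz spaces, this yields error
% bounds of high order in the meshwidth h and in the polynomial degree p.
```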
Abstract:
We estimate the conditions for detectability of two planets in a 2/1 mean-motion resonance from radial velocity data, as a function of their masses, the number of observations and the signal-to-noise ratio. Even for a data set of the order of 100 observations and standard deviations of the order of a few meters per second, we find that Jovian-size resonant planets are difficult to detect if the masses of the planets differ by a factor larger than ~4. This is consistent with the present population of real exosystems in the 2/1 commensurability, most of which have resonant pairs with similar minimum masses, and could indicate that many other resonant systems exist but are currently beyond the detectability limit. Furthermore, we analyze the error distribution in the masses and orbital elements of orbital fits from synthetic data sets for resonant planets in the 2/1 commensurability. For various mass ratios and numbers of data points we find that the eccentricity of the outer planet is systematically overestimated, although the inner planet's eccentricity suffers a much smaller effect. If the initial conditions correspond to small-amplitude oscillations around stable apsidal corotation resonances, the amplitudes estimated from the orbital fits are biased toward larger values, in accordance with results found in real resonant extrasolar systems.
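As a toy illustration of the detectability question (not the orbital-fitting machinery used in the paper), one can generate a synthetic radial velocity series for two near-circular planets near the 2/1 commensurability, with masses differing by a factor of a few, and compare the semi-amplitude of the weaker signal with the measurement noise:

```python
# Hedged sketch: synthetic radial velocities for a 2/1 pair with very different
# semi-amplitudes.  Circular orbits are assumed for simplicity; the paper's
# analysis uses full Keplerian fits.
import numpy as np

rng = np.random.default_rng(1)
n_obs, sigma = 100, 3.0                       # ~100 points, a few m/s of noise
t = np.sort(rng.uniform(0.0, 1000.0, n_obs))  # observation epochs in days

P_in, P_out = 220.0, 440.0                    # periods in a 2/1 ratio (days)
K_in, K_out = 40.0, 10.0                      # semi-amplitudes (m/s)
rv = (K_in * np.sin(2 * np.pi * t / P_in)
      + K_out * np.sin(2 * np.pi * t / P_out)
      + rng.normal(0.0, sigma, n_obs))

# Crude detectability indicator for the weaker (outer) signal:
# the least-squares amplitude uncertainty for a sinusoid is ~ sigma * sqrt(2/N).
print(f"outer-planet SNR ~ {K_out * np.sqrt(n_obs / 2) / sigma:.1f}")
```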
Abstract:
This thesis consists of three articles that analyse the term structure of interest rates using different data sets and models. Chapter 1 proposes a parametric interest rate model that allows for segmentation and local shocks in the term structure. Using US Treasury data, two versions of this segmented model are implemented. Based on a sequence of 142 forecasting experiments, the proposed models are compared with benchmarks and are found to perform better in out-of-sample forecasts, especially for short maturities and for the 12-month forecast horizon. Chapter 2 adds no-arbitrage restrictions when estimating a dynamic Gaussian polynomial term structure model for the Brazilian interest rate market. This article proposes an important approximation for the time series of the term structure risk factors, which allows the interest rate risk premium to be extracted without optimising a full dynamic model. This methodology has the advantage of being easy to implement and provides a good approximation for the term structure risk premium, which can be used in different applications. Chapter 3 models the joint dynamics of nominal and real rates using a no-arbitrage affine term structure model with macroeconomic variables, in order to decompose the difference between nominal and real rates into an inflation risk premium and inflation expectations in the US market. A version without macroeconomic variables and a version with these variables are implemented; the inflation risk premia obtained are small and stable over the period analysed, but differ between the two models considered.
Abstract:
Ties among event times are often recorded in survival studies. For example, in a two-week laboratory study where event times are measured in days, ties are very likely to occur. The proportional hazards model might be used in this setting with an approximated partial likelihood function. This approximation works well when the number of ties is small. On the other hand, discrete regression models are suggested when the data are heavily tied. However, in many situations it is not clear which approach should be used in practice. In this work, empirical guidelines based on Monte Carlo simulations are provided. These recommendations are based on a measure of the amount of tied data present and on the mean square error. An example illustrates the proposed criterion.
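A minimal sketch of the kind of tie diagnostic such guidelines rely on (the specific measure used in the paper may differ): compute the fraction of observed events that share their event time with at least one other event.

```python
# Hedged sketch: quantify how heavily tied a survival data set is.  The measure
# below (share of events occurring at a non-unique time) is illustrative; the
# paper's criterion combines a tie measure with the mean square error from
# Monte Carlo simulations.
import numpy as np

def tied_event_fraction(times, events):
    """times: event/censoring times; events: 1 = event observed, 0 = censored."""
    event_times = np.asarray(times)[np.asarray(events) == 1]
    _, counts = np.unique(event_times, return_counts=True)
    return counts[counts > 1].sum() / event_times.size

# Two-week study with daily measurements: ties are almost unavoidable.
times  = [1, 2, 2, 3, 3, 3, 5, 7, 7, 9, 10, 14, 14, 14]
events = [1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0,  1,  1,  0]
print(f"fraction of tied events: {tied_event_fraction(times, events):.2f}")
```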
Abstract:
A comparative study of aggregation error bounds for the generalized transportation problem is presented. A priori and a posteriori error bounds were derived, and a computational study was performed to (a) test the correlation between the a priori, the a posteriori and the actual error, and (b) quantify the difference between the error bounds and the actual error. Based on the results, we conclude that calculating the a priori error bound is a useful strategy for selecting the appropriate aggregation level. The a posteriori error bound provides a good quantitative measure of the actual error.
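A small sketch of the underlying comparison (using an ordinary transportation LP rather than the generalized problem, and without the paper's bound formulas): solve the full problem and an aggregated version in which two demand nodes are merged, then measure the actual error as the difference in optimal costs.

```python
# Hedged sketch: actual aggregation error for a tiny transportation LP.
import numpy as np
from scipy.optimize import linprog

def solve_transportation(cost, supply, demand):
    """Minimise sum_ij cost[i,j]*x[i,j] with row sums = supply, column sums = demand."""
    m, n = cost.shape
    A_eq = np.zeros((m + n, m * n))
    for i in range(m):
        A_eq[i, i * n:(i + 1) * n] = 1.0   # supply constraint for source i
    for j in range(n):
        A_eq[m + j, j::n] = 1.0            # demand constraint for sink j
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=np.concatenate([supply, demand]),
                  bounds=(0, None), method="highs")
    return res.fun

cost   = np.array([[4.0, 6.0, 8.0],
                   [5.0, 3.0, 7.0]])
supply = np.array([30.0, 40.0])
demand = np.array([20.0, 25.0, 25.0])
full_opt = solve_transportation(cost, supply, demand)

# Aggregate demand nodes 2 and 3: summed demand, demand-weighted average cost
# (one common aggregation choice, assumed here for illustration).
w = demand[1:] / demand[1:].sum()
agg_cost = np.column_stack([cost[:, 0], cost[:, 1:] @ w])
agg_opt = solve_transportation(agg_cost, supply, np.array([demand[0], demand[1:].sum()]))

print(f"actual aggregation error = {abs(agg_opt - full_opt):.3f}")
```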
Abstract:
Semi-supervised learning is applied to classification problems where only a small portion of the data items is labeled. In these cases, the reliability of the labels is a crucial factor, because mislabeled items may propagate wrong labels to a large portion of, or even the entire, data set. This paper addresses this problem by presenting a graph-based (network-based) semi-supervised learning method specifically designed to handle data sets with mislabeled samples. The method uses teams of walking particles, with competitive and cooperative behavior, for label propagation in the network constructed from the input data set. The proposed model is nature-inspired and incorporates features that make it robust to a considerable amount of mislabeled data items. Computer simulations show the performance of the method in the presence of different percentages of mislabeled data, in networks of different sizes and average node degrees. Importantly, these simulations reveal the existence of critical points in the mislabeled subset size, below which the network is free of wrong label contamination, but above which the mislabeled samples start to propagate their labels to the rest of the network. Moreover, numerical comparisons have been made between the proposed method and other representative graph-based semi-supervised learning methods using both artificial and real-world data sets. Interestingly, the proposed method increasingly outperforms the others as the percentage of mislabeled samples grows. © 2012 IEEE.
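For context, a compact baseline from the graph-based family against which such methods are compared (this is the classic local-and-global-consistency iteration, not the particle competition-cooperation model proposed in the paper); plain propagation like this will happily spread a mislabeled seed through its neighbourhood, which is exactly the failure mode the paper targets.

```python
# Hedged sketch: standard graph-based label propagation baseline,
# F <- alpha * S * F + (1 - alpha) * Y, shown for comparison only.
import numpy as np

def label_propagation(W, y, alpha=0.9, n_iter=200):
    """W: symmetric affinity matrix (connected graph, no isolated nodes);
    y: integer labels 0..K-1, with -1 marking unlabeled items."""
    y = np.asarray(y)
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))          # symmetric normalisation D^-1/2 W D^-1/2
    K = int(y.max()) + 1
    Y = np.zeros((len(y), K))
    labeled = np.flatnonzero(y >= 0)
    Y[labeled, y[labeled]] = 1.0             # one-hot seeds (possibly mislabeled)
    F = Y.copy()
    for _ in range(n_iter):
        F = alpha * (S @ F) + (1.0 - alpha) * Y
    return F.argmax(axis=1)
```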
Abstract:
In this work we analyze the convergence of solutions of the Poisson equation with Neumann boundary conditions in a two-dimensional thin domain with highly oscillatory behavior. We consider the case where the height of the domain and the amplitude and period of the oscillations are all of the same order, given by a small parameter ε > 0. Using an appropriate corrector approach, we show strong convergence and give error estimates when we replace the original solutions by the first-order expansion obtained through the Multiple-Scale Method.
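Schematically, the first-order expansion referred to above has the classical two-scale form sketched below; this is written only to fix notation, and the precise corrector problem and convergence rate are those derived in the paper.

```latex
% Hedged sketch of the two-scale ansatz (schematic; not the paper's statement).
\[
  u^{\varepsilon}(x_1,x_2) \;\approx\;
  u_0(x_1) \;+\; \varepsilon\, u_1\!\left(x_1, \tfrac{x_1}{\varepsilon}, \tfrac{x_2}{\varepsilon}\right),
\]
% where u_0 solves the homogenised limit problem posed on the lower-dimensional
% limit domain and the first-order corrector u_1 accounts for the boundary
% oscillations, whose amplitude and period are of the same order epsilon as the
% height of the thin domain.
```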