891 results for Log-normal distribution
Abstract:
It is extremely important to ensure that people with disabilities can access information and cultural works on an equal basis with others. Access is fundamentally important to enable people with disabilities to fully participate in economic, social, and political life. This is both a pressing moral imperative and a legal requirement in international law. Australia should take clear steps to affirmatively redress the fundamental inequalities of access that people with disabilities face. This requires a fundamental shift in the way that we think about copyright and disability rights: the mechanisms for enabling access should not be a limited exception to normal distribution, but should instead be strong positive rights that are able to be routinely and practically exercised.
Abstract:
Aim: Determining how ecological processes vary across space is a major focus in ecology. Current methods that investigate such effects remain constrained by important limiting assumptions. Here we provide an extension to geographically weighted regression in which local regression and spatial weighting are used in combination. This method can be used to investigate non-stationarity and spatial-scale effects using any regression technique that can accommodate uneven weighting of observations, including machine learning. Innovation: We extend the use of spatial weights to generalized linear models and boosted regression trees by using simulated data for which the results are known, and compare these local approaches with existing alternatives such as geographically weighted regression (GWR). The spatial weighting procedure (1) explained up to 80% of the deviance in simulated species richness, (2) optimized the normal distribution of model residuals when applied to generalized linear models versus GWR, and (3) detected nonlinear relationships and interactions between response variables and their predictors when applied to boosted regression trees. Predictor ranking changed with spatial scale, highlighting the scales at which different species–environment relationships need to be considered. Main conclusions: GWR is useful for investigating spatially varying species–environment relationships. However, the use of local weights implemented in alternative modelling techniques can help detect nonlinear relationships and high-order interactions that were previously unassessed. Therefore, this method not only informs us how location and scale influence our perception of patterns and processes but also offers a way to deal with different ecological interpretations that can emerge as different areas of spatial influence are considered during model fitting.
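The local weighting idea in this abstract can be sketched as a kernel-weighted least-squares fit repeated at each focal location. This is a minimal illustration and not the authors' implementation: the Gaussian kernel, the `bandwidth` parameter, the single-predictor linear model, the toy data, and all function names are assumptions made for the example.

```python
import math

def gaussian_weights(focal, sites, bandwidth):
    """Weight every site by its distance to the focal location (Gaussian kernel, an assumption)."""
    return [math.exp(-0.5 * (math.dist(focal, s) / bandwidth) ** 2) for s in sites]

def weighted_linear_fit(x, y, w):
    """Weighted least squares for y = intercept + slope * x; returns (intercept, slope)."""
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def local_slopes(sites, x, y, bandwidth):
    """Fit one weighted regression per site and collect the local slopes."""
    return [weighted_linear_fit(x, y, gaussian_weights(s, sites, bandwidth))[1]
            for s in sites]

# Hypothetical toy data: 20 sites on a line; the true slope is 1 in the west and 3 in the east.
sites = [(float(i), 0.0) for i in range(20)]
x = [float((7 * i) % 5 + 1) for i in range(20)]
y = [(1.0 if i < 10 else 3.0) * xi for i, xi in enumerate(x)]
slopes = local_slopes(sites, x, y, bandwidth=1.5)
```

Any learner that accepts observation weights could take the place of `weighted_linear_fit`. Refitting with several `bandwidth` values is one way to probe the spatial-scale effects the abstract mentions.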
Abstract:
Reconstructing 3D motion data is highly under-constrained due to several common sources of data loss during measurement, such as projection, occlusion, or miscorrespondence. We present a statistical model of 3D motion data, based on the Kronecker structure of the spatiotemporal covariance of natural motion, as a prior on 3D motion. This prior is expressed as a matrix normal distribution, composed of separable and compact row and column covariances. We relate the marginals of the distribution to the shape, trajectory, and shape-trajectory models of prior art. When the marginal shape distribution is not available from training data, we show how placing a hierarchical prior over shapes results in a convex MAP solution in terms of the trace-norm. The matrix normal distribution, fit to a single sequence, outperforms state-of-the-art methods at reconstructing 3D motion data in the presence of significant data loss, while providing covariance estimates of the imputed points.
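The Kronecker structure mentioned in this abstract can be illustrated with a small sketch. For a matrix normal variable with row covariance U and column covariance V, the covariance of the column-stacked matrix is the Kronecker product of V and U, so Cov(X[i][j], X[k][l]) = U[i][k] * V[j][l]. The toy matrices and names below are hypothetical and are not taken from the paper.

```python
def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[a_val * b_val for a_val in a_row for b_val in b_row]
            for a_row in a for b_row in b]

def n_symmetric_entries(n):
    """Number of distinct entries in a symmetric n x n matrix."""
    return n * (n + 1) // 2

# Hypothetical toy covariances for a 2 x 3 random matrix.
row_cov = [[2.0, 1.0],
           [1.0, 2.0]]            # U, 2 x 2
col_cov = [[1.0, 0.5, 0.0],
           [0.5, 1.0, 0.5],
           [0.0, 0.5, 1.0]]       # V, 3 x 3

# Covariance of the column-stacked vector vec(X): 6 x 6.
full_cov = kron(col_cov, row_cov)

# Entries needed for the separable form versus an unstructured 6 x 6 covariance.
separable_entries = n_symmetric_entries(2) + n_symmetric_entries(3)
unstructured_entries = n_symmetric_entries(6)
```

The entry counts show why the separable form is compact: here 9 entries describe a covariance that would need 21 if left unstructured.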
Abstract:
The widespread and increasing resistance of internal parasites to anthelmintic control is a serious problem for the Australian sheep and wool industry. As part of control programmes, laboratories use the Faecal Egg Count Reduction Test (FECRT) to determine resistance to anthelmintics. It is important to have confidence in the measure of resistance, not only for the producer planning a drenching programme but also for companies investigating the efficacy of their products. The determination of resistance and corresponding confidence limits as given in anthelmintic efficacy guidelines of the Standing Committee on Agriculture (SCA) is based on a number of assumptions. This study evaluated the appropriateness of these assumptions for typical data and compared the effectiveness of the standard FECRT procedure with the effectiveness of alternative procedures. Several sets of historical experimental data from sheep and goats were analysed to determine that a negative binomial distribution was a more appropriate distribution to describe pre-treatment helminth egg counts in faeces than a normal distribution. Simulated egg counts for control animals were generated stochastically from negative binomial distributions and those for treated animals from negative binomial and binomial distributions. Three methods for determining resistance when percent reduction is based on arithmetic means were applied. The first was that advocated in the SCA guidelines, the second similar to the first but basing the variance estimates on negative binomial distributions, and the third using Wadley’s method with the distribution of the response variate assumed negative binomial and a logit link transformation. These were also compared with a fourth method recommended by the International Co-operation on Harmonisation of Technical Requirements for Registration of Veterinary Medicinal Products (VICH) programme, in which percent reduction is based on the geometric means. 
A wide selection of parameters was investigated, and 1000 simulations were run for each set. Percent reduction and confidence limits were then calculated for the methods, together with the number of times in each set of 1000 simulations the theoretical percent reduction fell within the estimated confidence limits and the number of times resistance would have been said to occur. These simulations provide the basis for setting conditions under which the methods could be recommended. The authors show that, given the distribution of helminth egg counts found in Queensland flocks, the method based on arithmetic rather than geometric means should be used, and suggest that resistance be redefined as occurring when the upper level of percent reduction is less than 95%. At least ten animals per group are required in most circumstances, though even 20 may be insufficient where effectiveness of the product is close to the cut-off point for defining resistance.
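The difference between percent reduction based on arithmetic means and on geometric means can be sketched as follows. This is an illustration only: the egg counts are invented, the `+1` offset used to handle zero counts in the geometric mean is an assumption, and the confidence-limit formulas of the SCA guidelines are not reproduced here.

```python
import math
from statistics import fmean

def reduction_arithmetic(control, treated):
    """Percent reduction based on arithmetic mean egg counts."""
    return 100.0 * (1.0 - fmean(treated) / fmean(control))

def geometric_mean_offset(counts, offset=1.0):
    """Geometric mean via logs; the offset (an assumption) keeps zero counts finite."""
    logs = [math.log(c + offset) for c in counts]
    return math.exp(fmean(logs)) - offset

def reduction_geometric(control, treated):
    """Percent reduction based on (offset) geometric mean egg counts."""
    return 100.0 * (1.0 - geometric_mean_offset(treated) / geometric_mean_offset(control))

# Hypothetical egg counts (eggs per gram) for untreated and treated animals.
control_counts = [100, 200, 300, 400]
treated_counts = [0, 10, 20, 10]

arithmetic_result = reduction_arithmetic(control_counts, treated_counts)
geometric_result = reduction_geometric(control_counts, treated_counts)
```

With skewed counts the two estimates can land on different sides of a 95% threshold, which is why the choice of mean matters when deciding whether resistance is declared.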
Abstract:
We have derived a versatile gene-based test for genome-wide association studies (GWAS). Our approach, called VEGAS (versatile gene-based association study), is applicable to all GWAS designs, including family-based GWAS, meta-analyses of GWAS on the basis of summary data, and DNA-pooling-based GWAS, where existing approaches based on permutation are not possible, as well as singleton data, where they are. The test incorporates information from a full set of markers (or a defined subset) within a gene and accounts for linkage disequilibrium between markers by using simulations from the multivariate normal distribution. We show that for an association study using singletons, our approach produces results equivalent to those obtained via permutation in a fraction of the computation time. We demonstrate proof-of-principle by using the gene-based test to replicate several genes known to be associated on the basis of results from a family-based GWAS for height in 11,536 individuals and a DNA-pooling-based GWAS for melanoma in approximately 1300 cases and controls. Our method has the potential to identify novel associated genes; provide a basis for selecting SNPs for replication; and be directly used in network (pathway) approaches that require per-gene association test statistics. We have implemented the approach in both an easy-to-use web interface, which only requires the uploading of markers with their association p-values, and a separate downloadable application.
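A gene-based test of this kind can be sketched with a small Monte Carlo routine. The abstract states only that linkage disequilibrium is handled by simulating from the multivariate normal distribution. The choice of test statistic below (the sum of 1-df chi-squared values recovered from the marker p-values), the toy LD matrix, and all names are assumptions for illustration, not the published implementation.

```python
import math
import random
from statistics import NormalDist

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low

def gene_based_p(snp_p_values, ld, n_sim=5000, seed=0):
    """Empirical gene-level p-value under a multivariate normal null with correlation `ld`."""
    rng = random.Random(seed)
    std = NormalDist()
    # Convert each two-sided marker p-value to a 1-df chi-squared statistic (z squared).
    observed = sum(std.inv_cdf(1.0 - p / 2.0) ** 2 for p in snp_p_values)
    low = cholesky(ld)
    n = len(snp_p_values)
    hits = 0
    for _ in range(n_sim):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correlate the independent draws so that they share the LD structure.
        correlated = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        if sum(c * c for c in correlated) >= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)

# Hypothetical LD (correlation) matrix for three markers in one gene.
toy_ld = [[1.0, 0.5, 0.2],
          [0.5, 1.0, 0.5],
          [0.2, 0.5, 1.0]]
```

A call such as `gene_based_p([1e-4, 1e-3, 0.01], toy_ld)` returns a gene-level p-value from marker p-values and an LD matrix alone, with no genotype-level permutation.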
Abstract:
Maize is one of the most important crops in the world. The products generated from this crop are largely used in the starch industry, the animal and human nutrition sector, and biomass energy production and refineries. For these reasons, there is much interest in estimating the potential grain yield of maize genotypes in relation to the environment in which they will be grown, as the productivity directly affects agribusiness or farm profitability. Questions like these can be investigated with ecophysiological crop models, which can be organized according to different philosophies and structures. The main objective of this work is to conceptualize a stochastic model for predicting maize grain yield and productivity under different conditions of water supply while considering the uncertainties of daily climate data. Therefore, one focus is to explain the model construction in detail, and the other is to present some results in light of the philosophy adopted. A deterministic model was built as the basis for the stochastic model. The former performed well in terms of the curve shape of the above-ground dry matter over time as well as the grain yield under full and moderate water deficit conditions. Through the use of a triangular distribution for the harvest index and a bivariate normal distribution of the averaged daily solar radiation and air temperature, the stochastic model satisfactorily simulated grain productivity, i.e., it was found that 10,604 kg ha⁻¹ is the most likely grain productivity, very similar to the productivity simulated by the deterministic model and for the real conditions based on a field experiment. © 2012 American Society of Agricultural and Biological Engineers.
Abstract:
Having the ability to work with complex models can be highly beneficial, but the computational cost of doing so is often large. Complex models often have intractable likelihoods, so methods that directly use the likelihood function are infeasible. In these situations, the benefits of working with likelihood-free methods become apparent. Likelihood-free methods, such as parametric Bayesian indirect likelihood that uses the likelihood of an alternative parametric auxiliary model, have been explored throughout the literature as a good alternative when the model of interest is complex. One of these methods is called the synthetic likelihood (SL), which assumes a multivariate normal approximation to the likelihood of a summary statistic of interest. This paper explores the accuracy and computational efficiency of the Bayesian version of the synthetic likelihood (BSL) approach in comparison to a competitor known as approximate Bayesian computation (ABC) and its sensitivity to its tuning parameters and assumptions. We relate BSL to pseudo-marginal methods and propose to use an alternative SL that uses an unbiased estimator of the exact working normal likelihood when the summary statistic has a multivariate normal distribution. Several applications of varying complexity are considered to illustrate the findings of this paper.
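The synthetic likelihood idea described in this abstract can be sketched as follows: simulate the summary statistic many times at a candidate parameter value, fit a normal distribution to the simulated summaries, and evaluate the observed summary under that fit. The toy simulator, the two summaries (sample mean and sample standard deviation), and all names are assumptions for illustration. The sketch uses the plain plug-in estimate, not the unbiased estimator proposed in the paper.

```python
import math
import random
from statistics import fmean, stdev

def simulate_summary(theta, rng, n_obs=50):
    """Hypothetical simulator: draw data from Normal(theta, 1) and summarise it."""
    data = [rng.gauss(theta, 1.0) for _ in range(n_obs)]
    return (fmean(data), stdev(data))

def synthetic_loglik(theta, observed, n_sim=200, seed=1):
    """Bivariate normal log-density of `observed`, fitted to summaries simulated at theta."""
    rng = random.Random(seed)
    sims = [simulate_summary(theta, rng) for _ in range(n_sim)]
    m0 = fmean([s[0] for s in sims])
    m1 = fmean([s[1] for s in sims])
    v00 = sum((s[0] - m0) ** 2 for s in sims) / (n_sim - 1)
    v11 = sum((s[1] - m1) ** 2 for s in sims) / (n_sim - 1)
    v01 = sum((s[0] - m0) * (s[1] - m1) for s in sims) / (n_sim - 1)
    det = v00 * v11 - v01 ** 2
    d0, d1 = observed[0] - m0, observed[1] - m1
    # Quadratic form d' * inverse(covariance) * d for the 2 x 2 case.
    quad = (v11 * d0 * d0 - 2.0 * v01 * d0 * d1 + v00 * d1 * d1) / det
    return -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad

# Hypothetical observed summary: sample mean 2.0, sample standard deviation 1.0.
observed_summary = (2.0, 1.0)
```

In a Bayesian version this value would stand in for the intractable log-likelihood inside an MCMC acceptance ratio. `n_sim` is the kind of tuning parameter whose influence the abstract says is examined.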
Abstract:
This thesis introduced two novel reputation models to generate accurate item reputation scores using ratings data and the statistics of the dataset. It also presented an innovative method that incorporates reputation awareness in recommender systems by employing voting system methods to produce more accurate top-N item recommendations. Additionally, this thesis introduced a personalisation method for generating reputation scores based on users' interests, where a single item can have different reputation scores for different users. The personalised reputation scores are then used in the proposed reputation-aware recommender systems to enhance the recommendation quality.
Abstract:
Water quality data are often collected at different sites over time to improve water quality management. Water quality data usually exhibit the following characteristics: non-normal distribution, presence of outliers, missing values, values below detection limits (censored), and serial dependence. It is essential to apply appropriate statistical methodology when analyzing water quality data to draw valid conclusions and hence provide useful advice in water management. In this chapter, we provide and demonstrate various statistical tools for analyzing such water quality data, and also introduce how to use the statistical software R to analyze water quality data with these methods. A dataset collected from the Susquehanna River Basin is used to demonstrate the statistical methods provided in this chapter. The dataset can be downloaded from the website http://www.srbc.net/programs/CBP/nutrientprogram.htm.
Abstract:
This thesis studies quantile residuals and uses different methodologies to develop test statistics that are applicable in evaluating linear and nonlinear time series models based on continuous distributions. Models based on mixtures of distributions are of special interest because it turns out that for those models traditional residuals, often referred to as Pearson's residuals, are not appropriate. As such models have become more and more popular in practice, especially with financial time series data, there is a need for reliable diagnostic tools that can be used to evaluate them. The aim of the thesis is to show how such diagnostic tools can be obtained and used in model evaluation. The quantile residuals considered here are defined in such a way that, when the model is correctly specified and its parameters are consistently estimated, they are approximately independent with standard normal distribution. All the tests derived in the thesis are pure significance type tests and are theoretically sound in that they properly take the uncertainty caused by parameter estimation into account. In Chapter 2 a general framework based on the likelihood function and smooth functions of univariate quantile residuals is derived that can be used to obtain misspecification tests for various purposes. Three easy-to-use tests aimed at detecting non-normality, autocorrelation, and conditional heteroscedasticity in quantile residuals are formulated. It also turns out that these tests can be interpreted as Lagrange Multiplier or score tests so that they are asymptotically optimal against local alternatives. Chapter 3 extends the concept of quantile residuals to multivariate models. The framework of Chapter 2 is generalized and tests aimed at detecting non-normality, serial correlation, and conditional heteroscedasticity in multivariate quantile residuals are derived based on it. 
Score test interpretations are obtained for the serial correlation and conditional heteroscedasticity tests and in a rather restricted special case for the normality test. In Chapter 4 the tests are constructed using the empirical distribution function of quantile residuals. The so-called Khmaladze's martingale transformation is applied in order to eliminate the uncertainty caused by parameter estimation. Various test statistics are considered so that critical bounds for histogram type plots as well as Quantile-Quantile and Probability-Probability type plots of quantile residuals are obtained. Chapters 2, 3, and 4 contain simulations and empirical examples which illustrate the finite sample size and power properties of the derived tests and also how the tests and related graphical tools based on residuals are applied in practice.
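A quantile residual is obtained by passing each observation through the model's cumulative distribution function and then through the inverse standard normal distribution function. The sketch below applies this to an independent two-component Gaussian mixture, which is a simplification of the time series setting of the thesis, where the distribution would be conditional on the past. The mixture parameters and all names are hypothetical.

```python
import random
from statistics import NormalDist, fmean, pvariance

STD_NORMAL = NormalDist()

def mixture_cdf(y, weights, means, sds):
    """Cumulative distribution function of a Gaussian mixture."""
    return sum(w * NormalDist(m, s).cdf(y) for w, m, s in zip(weights, means, sds))

def quantile_residual(y, weights, means, sds, eps=1e-12):
    """Inverse standard normal CDF applied to the model CDF evaluated at y."""
    u = mixture_cdf(y, weights, means, sds)
    u = min(max(u, eps), 1.0 - eps)   # keep u strictly inside (0, 1)
    return STD_NORMAL.inv_cdf(u)

def simulate_mixture(n, weights, means, sds, seed=0):
    """Draw n independent observations from the mixture."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        i = rng.choices(range(len(weights)), weights=weights)[0]
        draws.append(rng.gauss(means[i], sds[i]))
    return draws

# Hypothetical, correctly specified mixture model.
weights, means, sds = [0.3, 0.7], [-2.0, 3.0], [1.0, 2.0]
data = simulate_mixture(4000, weights, means, sds)
residuals = [quantile_residual(y, weights, means, sds) for y in data]
```

Because the model here is correctly specified, `residuals` should look like a standard normal sample even though `data` is bimodal. Departures from that are what the normality, autocorrelation, and heteroscedasticity tests described in the abstract look for.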
Abstract:
An atmospheric radio noise burst represents the radiation received from one complete lightning flash at the frequency to which a receiver is tuned and within the receiver bandwidth. At tropical latitudes, the principal source of interference in the frequency range from 0.1 to 10 MHz is the burst form of atmospheric radio noise. The structure of a burst shows several approximately rectangular pulses of random amplitude, duration and frequency of recurrence. The influence of the noise on data communication can only be examined when the number of pulses crossing a certain amplitude threshold per unit time of the noise burst is known. A pulse rate counter designed for this purpose has been used at Bangalore (12°58′N, 77°35′E) to investigate the pulse characteristics of noise bursts at 3 MHz with a receiver bandwidth of 3.3 kHz/6 dB. The results show that the number of pulses lying in the amplitude range between peak and quasi-peak values of the noise bursts and the burst duration corresponding to these pulses follow log-normal distributions. The pulse rates deduced therefrom show a certain correlation between the number of pulses and the duration of the noise burst. The results are discussed with a view to furnishing the necessary information for data communication.
Abstract:
One of the most fundamental and widely accepted ideas in finance is that investors are compensated through higher returns for taking on non-diversifiable risk. Hence the quantification, modeling and prediction of risk has been, and still is, one of the most prolific research areas in financial economics. It was recognized early on that there are predictable patterns in the variance of speculative prices. Later research has shown that there may also be systematic variation in the skewness and kurtosis of financial returns. Lacking in the literature so far is an out-of-sample forecast evaluation of the potential benefits of these new, more complicated models with time-varying higher moments. Such an evaluation is the topic of this dissertation. Essay 1 investigates the forecast performance of the GARCH(1,1) model when estimated with 9 different error distributions on Standard and Poor’s 500 Index Future returns. By utilizing the theory of realized variance to construct an appropriate ex post measure of variance from intra-day data, it is shown that allowing for a leptokurtic error distribution leads to significant improvements in variance forecasts compared to using the normal distribution. This result holds for daily, weekly as well as monthly forecast horizons. It is also found that allowing for skewness and time variation in the higher moments of the distribution does not further improve forecasts. In Essay 2, by using 20 years of daily Standard and Poor’s 500 index returns, it is found that density forecasts are much improved by allowing for constant excess kurtosis but not improved by allowing for skewness. By allowing the kurtosis and skewness to be time varying, the density forecasts are not further improved but on the contrary made slightly worse. In Essay 3 a new model incorporating conditional variance, skewness and kurtosis based on the Normal Inverse Gaussian (NIG) distribution is proposed. 
The new model and two previously used NIG models are evaluated by their Value at Risk (VaR) forecasts on a long series of daily Standard and Poor’s 500 returns. The results show that only the new model produces satisfactory VaR forecasts for both 1% and 5% VaR. Taken together, the results of the thesis show that kurtosis appears not to exhibit predictable time variation, whereas some predictability is found in the skewness. However, the dynamic properties of the skewness are not completely captured by any of the models.
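The two ingredients of the variance forecast evaluation in Essay 1 can be sketched briefly: the GARCH(1,1) variance recursion that produces the forecast, and realized variance as the ex post benchmark built from intra-day returns. The parameter values and return series are hypothetical. The choice of error distribution affects how `omega`, `alpha`, and `beta` are estimated, which is outside the scope of this sketch.

```python
def realized_variance(intraday_returns):
    """Ex post variance measure: the sum of squared intra-day returns."""
    return sum(r * r for r in intraday_returns)

def garch11_variance_path(returns, omega, alpha, beta, initial_variance):
    """Conditional variances from the GARCH(1,1) recursion.

    Element t is the variance for return t; the last element is the
    one-step-ahead forecast made after the final observed return.
    """
    variances = [initial_variance]
    for r in returns:
        variances.append(omega + alpha * r * r + beta * variances[-1])
    return variances

def garch11_forecast(next_variance, omega, alpha, beta, horizon):
    """h-step-ahead variance forecast, reverting to the unconditional variance."""
    unconditional = omega / (1.0 - alpha - beta)
    return unconditional + (alpha + beta) ** (horizon - 1) * (next_variance - unconditional)

# Hypothetical parameters and daily returns (in percent).
omega, alpha, beta = 0.1, 0.1, 0.8
path = garch11_variance_path([1.0, -1.0, 2.0], omega, alpha, beta, initial_variance=1.0)
one_step = path[-1]
two_step = garch11_forecast(one_step, omega, alpha, beta, horizon=2)
```

A forecast for a weekly or monthly horizon can be formed by summing the daily forecasts over the horizon and comparing the total with the realized variance accumulated over the same days.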
Abstract:
Financial time series tend to behave in a manner that is not directly drawn from a normal distribution. Asymmetries and nonlinearities are usually seen and these characteristics need to be taken into account. Forecasting future return and risk is rather complicated. The existing models for predicting risk are of help to a certain degree, but the complexity in financial time series data makes it difficult. The introduction of nonlinearities and asymmetries for the purpose of better models and forecasts regarding both mean and variance is supported by the essays in this dissertation. Linear and nonlinear models are consequently introduced in this dissertation. The advantage of nonlinear models is that they can take asymmetries into account. Asymmetric patterns usually mean that large negative returns appear more often than positive returns of the same magnitude. This goes hand in hand with the fact that negative returns are associated with higher risk than in the case where positive returns of the same magnitude are observed. The reason why these models are of high importance lies in the ability to make the best possible estimations and predictions of future returns and for predicting risk.
Abstract:
Probabilistic analysis of cracking moment from 22 simply supported reinforced concrete beams is performed. When the basic variables follow the distribution considered in this study, the cracking moment of a beam is found to follow a normal distribution. An expression is derived, for characteristic cracking moment, which will be useful in examining reinforced concrete beams for a limit state of cracking.
Abstract:
Anthesis was studied at the canopy level in 10 Norway spruce stands from 9 localities in Finland from 1963 to 1974. Distributions of pollen catches were compared to the normal Gaussian distribution. The basis for the timing studies was the 50 per cent point of the anthesis-fitted normal distribution. Development up to this point was given in calendar days, in degree days (>5 °C) and in period units. The count of each parameter began on March 19 (included). Male flowering in Norway spruce stands was found to have more annual variation in quantity than in Scots pine stands studied earlier. Anthesis in spruce in northern Finland occurred at a later date than in the south. The heat sums needed for anthesis varied latitudinally less in spruce than in pine. The variation of pollen catches in spruce increased towards the north-west as in the case of Scots pine. In the unprocessed data, calendar days were found to be the most accurate forecast of anthesis in Norway spruce both for a single year and for the majority of cases of stand averages over several years. Locally, the period unit could be a more accurate parameter for the stand average. However, on a calendar day basis, when annual deviations between expected and measured heat sums were converted to days, period units were narrowly superior to days. The geographical correlations with respect to the timing of flowering, calculated against distances measured along simulated post-glacial migration routes, were stronger than purely latitudinal correlations. Effects of the reinvasion of Norway spruce into Finland are thus still visible in spruce populations just as they were in Scots pine populations. The proportion of the average annual heat sum needed for spruce anthesis grew rapidly north of a latitude of ca. 63° and the heat sum needed for anthesis decreased only slightly towards the timberline. 
In light of flowering phenology, it seems probable that the northwesterly third of Finnish Norway spruce populations are incompletely adapted to the prevailing cold climate. A moderate warming of the climate would therefore be beneficial for Norway spruce. This accords roughly with the adaptive situation in Scots pine.