998 results for 010401 Applied Statistics


Relevance: 80.00%

Abstract:

A generic method for the estimation of parameters for Stochastic Ordinary Differential Equations (SODEs) is introduced and developed. This algorithm, called the GePERs method, utilises a genetic optimisation algorithm to minimise a stochastic objective function based on the Kolmogorov-Smirnov statistic. Numerical simulations are utilised to form the KS statistic. Some of the factors that improve the precision of the estimates are also examined. This method is used to estimate parameters of diffusion equations and jump-diffusion equations. It is also applied to the problem of model selection for the Queensland electricity market. (C) 2003 Elsevier B.V. All rights reserved.
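As a rough illustration of a simulation-based KS objective, the sketch below compares observed values with endpoints simulated from a candidate parameter. It is not the GePERs method: the mean-reverting model, the Euler-Maruyama discretisation, and the names `ks_statistic`, `simulate_endpoints` and `ks_objective` are all assumptions made for this example, and the genetic optimiser is left out.

```python
import random
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    # Both empirical CDFs are step functions, so the gap peaks at a data point.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def simulate_endpoints(theta, sigma=0.5, x0=1.0, t_end=1.0,
                       steps=50, paths=200, seed=0):
    """Euler-Maruyama endpoints of dX = -theta * X dt + sigma dW (assumed model)."""
    rng = random.Random(seed)
    dt = t_end / steps
    endpoints = []
    for _ in range(paths):
        x = x0
        for _ in range(steps):
            x += -theta * x * dt + sigma * rng.gauss(0.0, dt ** 0.5)
        endpoints.append(x)
    return endpoints


def ks_objective(theta, observed, seed=1):
    """Stochastic objective: KS distance between observed and simulated endpoints."""
    return ks_statistic(observed, simulate_endpoints(theta, seed=seed))
```

An optimiser of any kind (genetic or otherwise) would then search for the `theta` that minimises `ks_objective`; because the objective depends on the simulation seed, repeated evaluations at the same `theta` can differ.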

Relevance: 80.00%

Abstract:

The paper presents a framework for small area population estimation that enables users to select a method that is fit for the purpose. The adjustments to input data that are needed before use are outlined, with emphasis on developing consistent time series of inputs. We show how geographical harmonization of small areas, which is crucial to comparisons over time, can be achieved. For two study regions, the East of England and Yorkshire and the Humber, the differences in output and consequences of adopting different methods are illustrated. The paper concludes with a discussion of how data, on stream since 1998, might be included in future small area estimates.

Relevance: 80.00%

Abstract:

Progress in bean breeding programs requires the exploitation of genetic variation that is present among races or through introgression across gene pools of Phaseolus vulgaris L. Of the two major common bean gene pools, the Andean gene pool seems to have a narrow genetic base, with about 10% of the accessions in the CIAT core collection presenting evidence of introgression. The objective of this study was to quantify the degree of spontaneous introgression in a sample of common bean landraces from the Andean gene pool. The effects of introgression on morphological, economic and nutritional attributes were also investigated. Homogeneity analysis was performed on molecular marker data from 426 Andean-type accessions from the primary centres of origin of the CIAT common bean core collection and two check varieties. Quantitative attribute diversity for 15 traits was studied based on the groups found from the cluster analysis of marker prevalence indices computed for each accession. The two-group summary consisted of one group of 58 accessions (14%) with low prevalence indices and another group of 370 accessions (86%) with high prevalence indices. The smaller group occupied the outlying area of points displayed from homogeneity analysis, yet their geographic origin was widely distributed over the Andean region. This group was regarded as introgressed, since its accessions displayed traits that are associated with the Middle American gene pool: high resistance to Andean disease isolates but low resistance to Middle American disease isolates, low seed weight and high scores for all nutrient elements. Genotypes generated by spontaneous introgression can be helpful for breeders to overcome the difficulties in transferring traits between gene pools.

Relevance: 80.00%

Abstract:

In broader catchment scale investigations, there is a need to understand and ultimately exploit the spatial variation of agricultural crops for an improved economic return. In many instances, this spatial variation is temporally unstable and may be different for various crop attributes and crop species. In the Australian sugar industry, the opportunity arose to evaluate the performance of 231 farms in the Tully Mill area in far north Queensland using production information on cane yield (t/ha) and CCS (a fresh weight measure of sucrose content in the cane) accumulated over a 12-year period. Such an arrangement of data can be expressed as a 3-way array where a farm × attribute × year matrix can be evaluated and interactions considered. Two multivariate techniques, the 3-way mixture method of clustering and the 3-mode principal component analysis, were employed to identify meaningful relationships between farms that performed similarly for both cane yield and CCS. In this context, farm has a spatial component and the aim of this analysis was to determine if systematic patterns in farm performance expressed by cane yield and CCS persisted over time. There was no spatial relationship between cane yield and CCS. However, the analysis revealed that the relationship between farms was remarkably stable from one year to the next for both attributes and there was some spatial aggregation of farm performance in parts of the mill area. This finding is important, since temporally consistent spatial variation may be exploited to improve regional production. Alternatively, the putative causes of the spatial variation may be explored to enhance the understanding of sugarcane production in the wet tropics of Australia.

Relevance: 80.00%

Abstract:

When studying genotype × environment interaction in multi-environment trials, plant breeders and geneticists often consider one of the effects, environments or genotypes, to be fixed and the other to be random. However, there are two main formulations for variance component estimation for the mixed model situation, referred to as the unconstrained-parameters (UP) and constrained-parameters (CP) formulations. These formulations give different estimates of genetic correlation and heritability as well as different tests of significance for the random effects factor. The definition of main effects and interactions and the consequences of such definitions should be clearly understood, and the selected formulation should be consistent for both fixed and random effects. A discussion of the practical outcomes of using the two formulations in the analysis of balanced data from multi-environment trials is presented. It is recommended that the CP formulation be used because of the meaning of its parameters and the corresponding variance components. When managed (fixed) environments are considered, users will have more confidence in prediction for them but will not be overconfident in prediction in the target (random) environments. Genetic gain (predicted response to selection in the target environments from the managed environments) is independent of formulation.

Relevance: 80.00%

Abstract:

Previous research has reported both agreements and serious anomalies in relationships between production attributes of sugarcane varieties in variety trials (VTs) and commercial production (CP). This paper examines VT and CP data for tonnes of cane per hectare (TCH) and sugar content (CCS). Data, analysed by REML, included 107 VTs and 54 CP mill years for 9 varieties from the mill districts of Mulgrave, Babinda, and Tully for harvest years 1982-99. Important consistencies included high TCH of Q152, high CCS of Q117 and Q120, and low CCS of H56-752. Significant anomalies existed with respect to TCH for Q113, Q117, Q120, Q122, Q138, and H56-752 and to CCS for Q113 and Q124. Investigation of these anomalies was assisted by access to independent REML analyses of CP data for 65692 individual Tully cane blocks from 1988 to 1999 and by the knowledge of persons familiar with the preferential uses of varieties by farmers. Minor anomalies were due to limited year or mill area data. Q124 TCH was deemed to be decreased and its CCS increased by severe disease in Babinda CP in the extremely wet 1998 and 1999 seasons. Other serious anomalies have credible but unsubstantiated explanations. The most convincing, for Q113, Q117, Q138, and H56-752, are that these varieties were deployed unevenly with regard to late season harvesting, predominant use or avoidance on high fertility soils, or use confined to low fertility sandy soils, respectively. Uneven deployment results in confounding of these effects in the varietal CP statistics at mill area level. It is concluded that VTs cannot be enhanced to anticipate or evaluate most effects of uneven deployment. They give adequate predictions of relative CP performance for varieties deployed evenly across confounding influences. Routine analyses of individual block CP data would be useful and enhanced by addition of relevant information to the block records.

Relevance: 80.00%

Abstract:

An investigation was conducted to evaluate the impact of experimental designs and spatial analyses (single-trial models) on the response to selection for grain yield in the northern grains region of Australia (Queensland and northern New South Wales). Two sets of multi-environment experiments were considered. One set, based on 33 trials conducted from 1994 to 1996, was used to represent the testing system of the wheat breeding program and is referred to as the multi-environment trial (MET). The second set, based on 47 trials conducted from 1986 to 1993, sampled a more diverse set of years and management regimes and was used to represent the target population of environments (TPE). There were 18 genotypes in common between the MET and TPE sets of trials. From indirect selection theory, the phenotypic correlation coefficient between the MET and TPE single-trial adjusted genotype means [r(p(MT))] was used to determine the effect of the single-trial model on the expected indirect response to selection for grain yield in the TPE based on selection in the MET. Five single-trial models were considered: randomised complete block (RCB), incomplete block (IB), spatial analysis (SS), spatial analysis with a measurement error (SSM) and a combination of spatial analysis and experimental design information to identify the preferred (PF) model. Bootstrap-resampling methodology was used to construct multiple MET data sets, ranging in size from 2 to 20 environments per MET sample. The size and environmental composition of the MET and the single-trial model influenced the r(p(MT)). On average, the PF model resulted in a higher r(p(MT)) than the IB, SS and SSM models, which were in turn superior to the RCB model for MET sizes based on fewer than ten environments. For METs based on ten or more environments, the r(p(MT)) was similar for all single-trial models.

Relevance: 80.00%

Abstract:

Many studies on birds focus on the collection of data through an experimental design, suitable for investigation in a classical analysis of variance (ANOVA) framework. Although many findings are confirmed by one or more experts, expert information is rarely used in conjunction with the survey data to enhance the explanatory and predictive power of the model. We explore this neglected aspect of ecological modelling through a study on Australian woodland birds, focusing on the potential impact of different intensities of commercial cattle grazing on bird density in woodland habitat. We examine a number of Bayesian hierarchical random effects models, which cater for overdispersion and a high frequency of zeros in the data using WinBUGS and explore the variation between and within different grazing regimes and species. The impact and value of expert information is investigated through the inclusion of priors that reflect the experience of 20 experts in the field of bird responses to disturbance. Results indicate that expert information moderates the survey data, especially in situations where there are little or no data. When experts agreed, credible intervals for predictions were tightened considerably. When experts failed to agree, results were similar to those evaluated in the absence of expert information. Overall, we found that without expert opinion our knowledge was quite weak. The fact that the survey data are quite consistent, in general, with expert opinion shows that we do know something about birds and grazing and we could learn a lot faster if we used this approach more in ecology, where data are scarce. Copyright (c) 2005 John Wiley & Sons, Ltd.

Relevance: 80.00%

Abstract:

An important and common problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. As this problem concerns the selection of significant genes from a large pool of candidate genes, it needs to be carried out within the framework of multiple hypothesis testing. In this paper, we focus on the use of mixture models to handle the multiplicity issue. With this approach, a measure of the local FDR (false discovery rate) is provided for each gene. An attractive feature of the mixture model approach is that it provides a framework for the estimation of the prior probability that a gene is not differentially expressed, and this probability can subsequently be used in forming a decision rule. The rule can also be formed to take the false negative rate into account. We apply this approach to a well-known publicly available data set on breast cancer, and discuss our findings with reference to other approaches.
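The local FDR described above can be written, for a two-component mixture, as the posterior probability that a gene belongs to the null component. The sketch below assumes a standard normal null density and a normal alternative density for a per-gene z-score; the function names, the normal forms and the threshold `c` are illustrative assumptions, not the fitted model from the paper.

```python
import math


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution evaluated at z."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def local_fdr(z, pi0, alt_mean, alt_sd=1.0):
    """Posterior probability that a gene with score z is not differentially expressed.

    pi0 is the prior probability of the null component (assumed known here;
    in practice it would be estimated when the mixture is fitted).
    """
    null_part = pi0 * normal_pdf(z)
    alt_part = (1.0 - pi0) * normal_pdf(z, alt_mean, alt_sd)
    return null_part / (null_part + alt_part)


def select_genes(z_scores, pi0, alt_mean, c=0.2):
    """Decision rule: flag genes whose local FDR falls below the threshold c."""
    return [local_fdr(z, pi0, alt_mean) < c for z in z_scores]
```

For example, `select_genes([0.0, 4.5], pi0=0.9, alt_mean=3.0)` flags only the second score, since a z-score near zero is far more plausible under the null component.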

Relevance: 80.00%

Abstract:

An important and common problem in microarray experiments is the detection of genes that are differentially expressed in a given number of classes. As this problem concerns the selection of significant genes from a large pool of candidate genes, it needs to be carried out within the framework of multiple hypothesis testing. In this paper, we focus on the use of mixture models to handle the multiplicity issue. With this approach, a measure of the local false discovery rate is provided for each gene, and it can be implemented so that the implied global false discovery rate is bounded as with the Benjamini-Hochberg methodology based on tail areas. The latter procedure is too conservative, unless it is modified according to the prior probability that a gene is not differentially expressed. An attractive feature of the mixture model approach is that it provides a framework for the estimation of this probability and its subsequent use in forming a decision rule. The rule can also be formed to take the false negative rate into account.
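For comparison, the tail-area approach mentioned above is the standard Benjamini-Hochberg step-up procedure, sketched here in its unmodified form. The function name and the default `alpha` are choices made for this example. Scaling `alpha` by an estimate of the prior null probability, as the abstract suggests, is not shown.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Step-up rule: find the largest rank k with p_(k) <= alpha * k / m,
    then reject the k hypotheses with the smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= alpha * rank / m:
            cutoff = rank
    rejected = [False] * m
    for index in order[:cutoff]:
        rejected[index] = True
    return rejected
```

With `p_values = [0.01, 0.04, 0.03, 0.5]` and `alpha = 0.05`, only the first hypothesis passes its threshold (0.01 against 0.0125), so only it is rejected.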

Relevance: 80.00%

Abstract:

Normal mixture models are often used to cluster continuous data. However, conventional approaches for fitting these models will have problems in producing nonsingular estimates of the component-covariance matrices when the dimension of the observations is large relative to the number of observations. In this case, methods such as principal components analysis (PCA) and the mixture of factor analyzers model can be adopted to avoid these estimation problems. We examine these approaches applied to the Cabernet wine data set of Ashenfelter (1999), considering the clustering of both the wines and the judges, and comparing our results with another analysis. The mixture of factor analyzers model proves particularly effective in clustering the wines, accurately classifying many of the wines by location.

Relevance: 80.00%

Abstract:

Previously the process of finding critical sets in Latin squares has been made cumbersome by the complexity and number of Latin trades that must be constructed. In this paper we develop a theory of Latin trades that yields more transparent constructions. We use these Latin trades to find a new class of critical sets for Latin squares which are a product of the Latin square of order 2 with a back circulant Latin square of odd order.

Relevance: 80.00%

Abstract:

Background: Regression to the mean (RTM) is a statistical phenomenon that can make natural variation in repeated data look like real change. It happens when unusually large or small measurements tend to be followed by measurements that are closer to the mean. Methods: We give some examples of the phenomenon, and discuss methods to overcome it at the design and analysis stages of a study. Results: The effect of RTM in a sample becomes more noticeable with increasing measurement error and when follow-up measurements are only examined on a sub-sample selected using a baseline value. Conclusions: RTM is a ubiquitous phenomenon in repeated data and should always be considered as a possible cause of an observed change. Its effect can be alleviated through better study design and use of suitable statistical methods.
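A small simulation can make the selection effect concrete. In the sketch below, each subject has a fixed true value measured twice with independent error, and no treatment effect is simulated, so any drop between baseline and follow-up in the selected group is pure RTM. The population mean, standard deviations, cutoff and the name `simulate_rtm` are assumptions chosen for illustration.

```python
import random
from statistics import mean


def simulate_rtm(n=5000, true_mean=100.0, true_sd=10.0,
                 error_sd=10.0, cutoff=120.0, seed=42):
    """Return (baseline mean, follow-up mean) for subjects selected on baseline."""
    rng = random.Random(seed)
    baseline, follow_up = [], []
    for _ in range(n):
        truth = rng.gauss(true_mean, true_sd)
        # Two measurements of the same true value with independent errors.
        baseline.append(truth + rng.gauss(0.0, error_sd))
        follow_up.append(truth + rng.gauss(0.0, error_sd))
    # Keep only subjects whose baseline measurement exceeded the cutoff.
    selected = [i for i in range(n) if baseline[i] > cutoff]
    selected_baseline = mean([baseline[i] for i in selected])
    selected_follow_up = mean([follow_up[i] for i in selected])
    return selected_baseline, selected_follow_up
```

Calling `simulate_rtm()` returns a follow-up mean that sits between the population mean and the selected baseline mean. Increasing `error_sd` widens the gap, in line with the Results statement above.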

Relevance: 80.00%

Abstract:

The bispectrum and third-order moment can be viewed as equivalent tools for testing for the presence of nonlinearity in stationary time series. This is because the bispectrum is the Fourier transform of the third-order moment. An advantage of the bispectrum is that its estimator comprises terms that are asymptotically independent at distinct bifrequencies under the null hypothesis of linearity. An advantage of the third-order moment is that its values in any subset of joint lags can be used in the test, whereas when using the bispectrum the entire (or truncated) third-order moment is required to construct the Fourier transform. In this paper, we propose a test for nonlinearity based upon the estimated third-order moment. We use the phase scrambling bootstrap method to give a nonparametric estimate of the variance of our test statistic under the null hypothesis. Using a simulation study, we demonstrate that the test obtains its target significance level, with large power, when compared to an existing standard parametric test that uses the bispectrum. Further we show how the proposed test can be used to identify the source of nonlinearity due to interactions at specific frequencies. We also investigate implications for heuristic diagnosis of nonstationarity.
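The quantity underlying the proposed test can be estimated directly from the data. The sketch below computes a sample third-order moment at a pair of non-negative lags for a mean-centred series. The 1/n normalisation and the name `third_order_moment` are assumptions for this example, and the phase scrambling bootstrap used to estimate the variance is not shown.

```python
def third_order_moment(series, r, s):
    """Sample third-order moment at lags (r, s), with r >= 0 and s >= 0.

    Computes (1/n) * sum over t of x[t] * x[t + r] * x[t + s],
    where x is the series after subtracting its sample mean.
    """
    n = len(series)
    centre = sum(series) / n
    x = [value - centre for value in series]
    # Only terms where both lagged indices stay inside the series contribute.
    last = n - max(r, s)
    return sum(x[t] * x[t + r] * x[t + s] for t in range(last)) / n
```

At lags (0, 0) this reduces to the mean cubed deviation, which is proportional to the sample skewness. Evaluating it over a chosen subset of lag pairs gives the values that a test of this kind would combine.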