Biblioteca Digital

998 resultados para Statistical packages

Automatic indexing : an approach using an index term corpus and combining linguistic and statistical methods

Relevância:

20.00% 20.00%

Publicador:

Veja mais

Statistical Optimization of Iron Electrodes for Alkaline Storage Batteries

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A pressed-plate Fe electrode for alkalines storage batteries, designed using a statistical method (fractional factorial technique), is described. Parameters such as the configuration of the base grid, electrode compaction temperature and pressure, binder composition, mixing time, etc. have been optimised using this method. The optimised electrodes have a capacity of 300 plus /minus 5 mA h/g of active material (mixture of Fe and magnetite) at 7 h rate to a cut-off voltage of 8.86V vs. Hg/HgO, OH exp 17 ref.

Veja mais

Domain Adaptation on the Statistical Manifold

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this paper, we tackle the problem of unsupervised domain adaptation for classification. In the unsupervised scenario where no labeled samples from the target domain are provided, a popular approach consists in transforming the data such that the source and target distributions be- come similar. To compare the two distributions, existing approaches make use of the Maximum Mean Discrepancy (MMD). However, this does not exploit the fact that prob- ability distributions lie on a Riemannian manifold. Here, we propose to make better use of the structure of this man- ifold and rely on the distance on the manifold to compare the source and target distributions. In this framework, we introduce a sample selection method and a subspace-based method for unsupervised domain adaptation, and show that both these manifold-based techniques outperform the cor- responding approaches based on the MMD. Furthermore, we show that our subspace-based approach yields state-of- the-art results on a standard object recognition benchmark.

Veja mais

Annual forecasting of the Australian macadamia crop - integrating tree census data with statistical climate-adjustment models

Relevância:

20.00% 20.00%

Publicador:

Resumo:

To facilitate marketing and export, the Australian macadamia industry requires accurate crop forecasts. Each year, two levels of crop predictions are produced for this industry. The first is an overall longer-term forecast based on tree census data of growers in the Australian Macadamia Society (AMS). This data set currently accounts for around 70% of total production, and is supplemented by our best estimates of non-AMS orchards. Given these total tree numbers, average yields per tree are needed to complete the long-term forecasts. Yields from regional variety trials were initially used, but were found to be consistently higher than the average yields that growers were obtaining. Hence, a statistical model was developed using growers' historical yields, also taken from the AMS database. This model accounted for the effects of tree age, variety, year, region and tree spacing, and explained 65% of the total variation in the yield per tree data. The second level of crop prediction is an annual climate adjustment of these overall long-term estimates, taking into account the expected effects on production of the previous year's climate. This adjustment is based on relative historical yields, measured as the percentage deviance between expected and actual production. The dominant climatic variables are observed temperature, evaporation, solar radiation and modelled water stress. Initially, a number of alternate statistical models showed good agreement within the historical data, with jack-knife cross-validation R2 values of 96% or better. However, forecasts varied quite widely between these alternate models. Exploratory multivariate analyses and nearest-neighbour methods were used to investigate these differences. For 2001-2003, the overall forecasts were in the right direction (when compared with the long-term expected values), but were over-estimates. In 2004 the forecast was well under the observed production, and in 2005 the revised models produced a forecast within 5.1% of the actual production. Over the first five years of forecasting, the absolute deviance for the climate-adjustment models averaged 10.1%, just outside the targeted objective of 10%.

Veja mais

Generalized pencils of rays in statistical wave optics

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The recently introduced generalized pencil of Sudarshan which gives an exact ray picture of wave optics is analysed in some situations of interest to wave optics. A relationship between ray dispersion and statistical inhomogeneity of the field is obtained. A paraxial approximation which preserves the rectilinear propagation character of the generalized pencils is presented. Under this approximation the pencils can be computed directly from the field conditions on a plane, without the necessity to compute the cross-spectral density function in the entire space as an intermediate quantity. The paraxial results are illustrated with examples. The pencils are shown to exhibit an interesting scaling behaviour in the far-zone. This scaling leads to a natural generalization of the Fraunhofer range criterion and of the classical van Cittert-Zernike theorem to planar sources of arbitrary state of coherence. The recently derived results of radiometry with partially coherent sources are shown to be simple consequences of this scaling.

Veja mais

Statistical analysis of the associations of hereditary factors with the risk of Type 1 diabetes and Diabetic Nephropathy based on the familial data collected through ascertaiment from population-based registers

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In genetic epidemiology, population-based disease registries are commonly used to collect genotype or other risk factor information concerning affected subjects and their relatives. This work presents two new approaches for the statistical inference of ascertained data: a conditional and full likelihood approaches for the disease with variable age at onset phenotype using familial data obtained from population-based registry of incident cases. The aim is to obtain statistically reliable estimates of the general population parameters. The statistical analysis of familial data with variable age at onset becomes more complicated when some of the study subjects are non-susceptible, that is to say these subjects never get the disease. A statistical model for a variable age at onset with long-term survivors is proposed for studies of familial aggregation, using latent variable approach, as well as for prospective studies of genetic association studies with candidate genes. In addition, we explore the possibility of a genetic explanation of the observed increase in the incidence of Type 1 diabetes (T1D) in Finland in recent decades and the hypothesis of non-Mendelian transmission of T1D associated genes. Both classical and Bayesian statistical inference were used in the modelling and estimation. Despite the fact that this work contains five studies with different statistical models, they all concern data obtained from nationwide registries of T1D and genetics of T1D. In the analyses of T1D data, non-Mendelian transmission of T1D susceptibility alleles was not observed. In addition, non-Mendelian transmission of T1D susceptibility genes did not make a plausible explanation for the increase in T1D incidence in Finland. Instead, the Human Leucocyte Antigen associations with T1D were confirmed in the population-based analysis, which combines T1D registry information, reference sample of healthy subjects and birth cohort information of the Finnish population. Finally, a substantial familial variation in the susceptibility of T1D nephropathy was observed. The presented studies show the benefits of sophisticated statistical modelling to explore risk factors for complex diseases.

Veja mais

Agronomic Packages for Improved Yield and Quality in the Australian Peanut Industry

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Monitoring aflatoxin and developing improved peanut drying practices, cadmium management and web based irrigation decision support systems.

Veja mais

Statistical properties of turbulence: An overview

Relevância:

20.00% 20.00%

Publicador:

Resumo:

We present an introductory overview of several challenging problems in the statistical characterization of turbulence. We provide examples from fluid turbulence in three and two dimensions, from the turbulent advection of passive scalars, turbulence in the one-dimensional Burgers equation, and fluid turbulence in the presence of polymer additives.

Veja mais

Charged-soliton excitations in statistical mechanics

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A method is developed for demonstrating how solitons with some internal periodic motion may emerge as elementary excitations in the statistical mechanics of field systems. The procedure is demonstrated in the context of complex scalar fields which can, for appropriate choices of the Lagrangian, yield charge-carrying solitons with such internal motion. The derivation uses the techniques of the steepest-descent method for functional integrals. It is shown that, despite the constraint of some fixed total charge, a gaslike excitation of such charged solitons does emerge.

Veja mais

Prediction of mortality and conception rates of beef breeding cattle in northern Australia

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In current simulation packages for the management of extensive beef-cattle enterprises, the relationships for the key biological rates (namely conception and mortality) are quite rudimentary. To better estimate these relationships, cohort-level data covering 17 100 cow-years from six sites across northern Australia were collated and analysed. Further validation data, from 7200 cow-years, were then used to test these relationships. Analytical problems included incomplete and non-standardised data, considerable levels of correlation among the 'independent' variables, and the close similarity of alternate possible models. In addition to formal statistical analyses of these data, the theoretical equations for predicting mortality and conception rates in the current simulation models were reviewed, and then reparameterised and recalibrated where appropriate. The final models explained up to 80% of the variation in the data. These are now proposed as more accurate and useful models to be used in the prediction of biological rates in simulation studies for northern Australia. © The State of Queensland (through the Department of Agriculture, Fisheries and Forestry) 2012. © CSIRO.

Veja mais

Statistical significance of sequential firing patterns in multi-neuronal spike trains

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Sequential firings with fixed time delays are frequently observed in simultaneous recordings from multiple neurons. Such temporal patterns are potentially indicative of underlying microcircuits and it is important to know when a repeatedly occurring pattern is statistically significant. These sequences are typically identified through correlation counts. In this paper we present a method for assessing the significance of such correlations. We specify the null hypothesis in terms of a bound on the conditional probabilities that characterize the influence of one neuron on another. This method of testing significance is more general than the currently available methods since under our null hypothesis we do not assume that the spiking processes of different neurons are independent. The structure of our null hypothesis also allows us to rank order the detected patterns. We demonstrate our method on simulated spike trains.

Veja mais

Studies in Computational Statistical Methods with Applications to Pattern Recognition and Data Analysis

Relevância:

20.00% 20.00%

Publicador:

Veja mais

Nonparametric Statistical Modeling of Recurrent Events: A Bayesian Approach

Relevância:

20.00% 20.00%

Publicador:

Veja mais

Bayesian statistical analysis of bacterial diversity

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Bacteria play an important role in many ecological systems. The molecular characterization of bacteria using either cultivation-dependent or cultivation-independent methods reveals the large scale of bacterial diversity in natural communities, and the vastness of subpopulations within a species or genus. Understanding how bacterial diversity varies across different environments and also within populations should provide insights into many important questions of bacterial evolution and population dynamics. This thesis presents novel statistical methods for analyzing bacterial diversity using widely employed molecular fingerprinting techniques. The first objective of this thesis was to develop Bayesian clustering models to identify bacterial population structures. Bacterial isolates were identified using multilous sequence typing (MLST), and Bayesian clustering models were used to explore the evolutionary relationships among isolates. Our method involves the inference of genetic population structures via an unsupervised clustering framework where the dependence between loci is represented using graphical models. The population dynamics that generate such a population stratification were investigated using a stochastic model, in which homologous recombination between subpopulations can be quantified within a gene flow network. The second part of the thesis focuses on cluster analysis of community compositional data produced by two different cultivation-independent analyses: terminal restriction fragment length polymorphism (T-RFLP) analysis, and fatty acid methyl ester (FAME) analysis. The cluster analysis aims to group bacterial communities that are similar in composition, which is an important step for understanding the overall influences of environmental and ecological perturbations on bacterial diversity. A common feature of T-RFLP and FAME data is zero-inflation, which indicates that the observation of a zero value is much more frequent than would be expected, for example, from a Poisson distribution in the discrete case, or a Gaussian distribution in the continuous case. We provided two strategies for modeling zero-inflation in the clustering framework, which were validated by both synthetic and empirical complex data sets. We show in the thesis that our model that takes into account dependencies between loci in MLST data can produce better clustering results than those methods which assume independent loci. Furthermore, computer algorithms that are efficient in analyzing large scale data were adopted for meeting the increasing computational need. Our method that detects homologous recombination in subpopulations may provide a theoretical criterion for defining bacterial species. The clustering of bacterial community data include T-RFLP and FAME provides an initial effort for discovering the evolutionary dynamics that structure and maintain bacterial diversity in the natural environment.

Veja mais

Statistical and Information-Theoretic Methods for Data-Analysis

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In this Thesis, we develop theory and methods for computational data analysis. The problems in data analysis are approached from three perspectives: statistical learning theory, the Bayesian framework, and the information-theoretic minimum description length (MDL) principle. Contributions in statistical learning theory address the possibility of generalization to unseen cases, and regression analysis with partially observed data with an application to mobile device positioning. In the second part of the Thesis, we discuss so called Bayesian network classifiers, and show that they are closely related to logistic regression models. In the final part, we apply the MDL principle to tracing the history of old manuscripts, and to noise reduction in digital signals.

Veja mais

998 resultados para Statistical packages

Filtro por publicador