To facilitate marketing and export, the Australian macadamia industry requires accurate crop forecasts. Each year, two levels of crop predictions are produced for this industry. The first is an overall longer-term forecast based on tree census data of growers in the Australian Macadamia Society (AMS). This data set currently accounts for around 70% of total production, and is supplemented by our best estimates of non-AMS orchards. Given these total tree numbers, average yields per tree are needed to complete the long-term forecasts. Yields from regional variety trials were initially used, but were found to be consistently higher than the average yields that growers were obtaining. Hence, a statistical model was developed using growers' historical yields, also taken from the AMS database. This model accounted for the effects of tree age, variety, year, region and tree spacing, and explained 65% of the total variation in the yield-per-tree data. The second level of crop prediction is an annual climate adjustment of these overall long-term estimates, taking into account the expected effects on production of the previous year's climate. This adjustment is based on relative historical yields, measured as the percentage deviance between expected and actual production. The dominant climatic variables are observed temperature, evaporation, solar radiation and modelled water stress. Initially, a number of alternative statistical models showed good agreement within the historical data, with jack-knife cross-validation R² values of 96% or better. However, forecasts varied quite widely between these alternative models. Exploratory multivariate analyses and nearest-neighbour methods were used to investigate these differences. For 2001-2003, the overall forecasts were in the right direction (when compared with the long-term expected values), but were over-estimates. In 2004 the forecast was well under the observed production, and in 2005 the revised models produced a forecast within 5.1% of the actual production. Over the first five years of forecasting, the absolute deviance for the climate-adjustment models averaged 10.1%, just outside the targeted objective of 10%.
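
The two accuracy measures mentioned in this abstract, a jack-knife (leave-one-out) cross-validation R² and a mean absolute percentage deviation, can be illustrated with a minimal Python sketch. It assumes an ordinary least-squares model and takes actual production as the denominator of the percentage deviation. The abstract specifies neither, and the function names `loo_cv_r2` and `mean_abs_pct_deviation` are hypothetical.

```python
import numpy as np

def loo_cv_r2(X, y):
    """Leave-one-out cross-validated R^2 for an ordinary least-squares fit.

    Assumption: a linear model with intercept stands in for the
    (unspecified) climate-adjustment models.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # design matrix with intercept
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i          # drop observation i
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        preds[i] = A[i] @ coef            # predict the held-out point
    ss_res = np.sum((y - preds) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def mean_abs_pct_deviation(forecast, actual):
    """Mean absolute deviation of forecasts, as a percentage of actual values."""
    forecast = np.asarray(forecast, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.mean(np.abs(forecast - actual) / actual) * 100.0)
```

A model can score a high cross-validated R² on historical data and still disagree with other equally well-fitting models when extrapolated to a new season, which is the situation the abstract describes.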


In genetic epidemiology, population-based disease registries are commonly used to collect genotype or other risk factor information concerning affected subjects and their relatives. This work presents two new approaches for the statistical inference of ascertained data: a conditional likelihood approach and a full likelihood approach for a disease with a variable age-at-onset phenotype, using familial data obtained from a population-based registry of incident cases. The aim is to obtain statistically reliable estimates of the general population parameters. The statistical analysis of familial data with variable age at onset becomes more complicated when some of the study subjects are non-susceptible, that is to say, these subjects never get the disease. A statistical model for a variable age at onset with long-term survivors is proposed for studies of familial aggregation, using a latent variable approach, as well as for prospective genetic association studies with candidate genes. In addition, we explore the possibility of a genetic explanation of the observed increase in the incidence of Type 1 diabetes (T1D) in Finland in recent decades and the hypothesis of non-Mendelian transmission of T1D-associated genes. Both classical and Bayesian statistical inference were used in the modelling and estimation. Although this work contains five studies with different statistical models, they all concern data obtained from nationwide registries of T1D and the genetics of T1D. In the analyses of T1D data, non-Mendelian transmission of T1D susceptibility alleles was not observed. In addition, non-Mendelian transmission of T1D susceptibility genes did not provide a plausible explanation for the increase in T1D incidence in Finland. Instead, the Human Leucocyte Antigen associations with T1D were confirmed in the population-based analysis, which combines T1D registry information, a reference sample of healthy subjects and birth cohort information of the Finnish population. Finally, a substantial familial variation in the susceptibility to T1D nephropathy was observed. The presented studies show the benefits of sophisticated statistical modelling to explore risk factors for complex diseases.


We have studied the magneto-transport and optical properties of Ga1-xMnxSb crystals (x = 0.01, 0.02, 0.03 and 0.04) grown by the horizontal Bridgman method. Negative magnetoresistance and the anomalous Hall effect have been observed below 10 K. The temperature dependence of the magnetization shows magnetic ordering below 10 K, which could arise from Ga1-xMnxSb alloy formation. In addition, the saturation in magnetization observed even at room temperature suggests the existence of ferromagnetic MnSb clusters. A reduction in the band gap is observed with increasing Mn concentration in the crystals. The temperature dependence of the band gap follows the Bose-Einstein model.
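
For reference, a commonly used Bose-Einstein-type expression for the temperature dependence of a semiconductor band gap is given below. The abstract does not state which parameterization was fitted, so the form and the symbols are assumptions: $E_g(0)$ is the band gap at zero temperature, $a_B$ is an electron-phonon coupling strength, and $\Theta_E$ is an effective phonon temperature.

```latex
E_g(T) = E_g(0) - \frac{2\,a_B}{\exp\!\left(\Theta_E / T\right) - 1}
```

In this form the gap is nearly constant for $T \ll \Theta_E$ and decreases approximately linearly with $T$ for $T \gg \Theta_E$.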


This thesis presents ab initio studies of two kinds of physical systems, quantum dots and bosons, using two program packages, of which the bosonic one has mainly been developed by the author. The implemented models, i.e., configuration interaction (CI) and coupled cluster (CC), take the correlated motion of the particles into account and provide a hierarchy of computational schemes, at the top of which the exact solution, within the limit of the single-particle basis set, is obtained. The theory underlying the models is presented in some detail, in order to provide insight into the approximations made and the circumstances under which they hold. Some of the computational methods are also highlighted. In the final sections the results are summarized. The CI and CC calculations on multiexciton complexes in self-assembled semiconductor quantum dots are presented and compared, along with radiative and non-radiative transition rates. Full CI calculations on quantum rings and double quantum rings are also presented. In the latter case, experimental and theoretical results from the literature are re-examined and an alternative explanation for the reported photoluminescence spectra is found. The boson program is first applied to a fictitious model system consisting of bosonic electrons in a central Coulomb field, for which CI at the singles and doubles level is found to account for almost all of the correlation energy. Finally, the boson program is employed to study Bose-Einstein condensates confined in different anisotropic trap potentials. The effect of the anisotropy on the relative correlation energy is examined, as well as the effect of varying the interaction potential.


Fracture owing to the coalescence of numerous microcracks can be described by a simple statistical model, where a coalescence event occurs stochastically as the number density of nucleated microcracks increases. Both numerical simulation and statistical analysis reveal that a microcrack coalescence process may display avalanche behavior and that the final failure is catastrophic. The cumulative distribution of coalescence events in the vicinity of critical fracture follows a power law, and the fracture profile has a self-affine fractal character. Some macromechanical quantities may be traced back to, and extracted from, the mesoscopic process based on the statistical analysis of coalescence events.


Most previous work on trainable language generation has focused on two paradigms: (a) using a statistical model to rank a set of generated utterances, or (b) using statistics to inform the generation decision process. Both approaches rely on the existence of a handcrafted generator, which limits their scalability to new domains. This paper presents BAGEL, a statistical language generator which uses dynamic Bayesian networks to learn from semantically-aligned data produced by 42 untrained annotators. A human evaluation shows that BAGEL can generate natural and informative utterances from unseen inputs in the information presentation domain. Additionally, generation performance on sparse datasets is improved significantly by using certainty-based active learning, yielding ratings close to the human gold standard with a fraction of the data. © 2010 Association for Computational Linguistics.
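
The certainty-based active learning step mentioned above can be illustrated with a minimal Python sketch: from a pool of unlabelled inputs, pick those on which the current model is least confident and send them for annotation. The abstract does not describe BAGEL's actual certainty measure, so the `confidence` callable and the function name `select_least_certain` are hypothetical placeholders.

```python
def select_least_certain(unlabelled, confidence, k=1):
    """Return the k pool items with the lowest model confidence.

    unlabelled: list of candidate inputs (any type).
    confidence: callable mapping an input to a score, higher = more certain.
                (Assumption: stands in for the generator's own certainty.)
    """
    # sorted() is stable, so ties keep their original pool order.
    return sorted(unlabelled, key=confidence)[:k]


# Usage with made-up confidence scores for three hypothetical inputs.
scores = {"inform(name)": 0.9, "inform(area)": 0.2, "reject(food)": 0.5}
to_annotate = select_least_certain(list(scores), scores.get, k=2)
```

Querying the least certain inputs first is what allows ratings close to the gold standard to be reached with only a fraction of the annotated data.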


In spite of over two decades of intense research, illumination and pose invariance remain prohibitively challenging aspects of face recognition for most practical applications. The objective of this work is to recognize faces using video sequences both for training and recognition input, in a realistic, unconstrained setup in which lighting, pose and user motion pattern vary widely and face images are of low resolution. The central contribution is an illumination invariant, which we show to be suitable for recognition from video of loosely constrained head motion. In particular, there are three contributions: (i) we show how a photometric model of image formation can be combined with a statistical model of generic face appearance variation to exploit the proposed invariant and generalize in the presence of extreme illumination changes; (ii) we introduce a video sequence re-illumination algorithm to achieve fine alignment of two video sequences; and (iii) we use the smoothness of geodesically local appearance manifold structure and a robust same-identity likelihood to achieve robustness to unseen head poses. We describe a fully automatic recognition system based on the proposed method and an extensive evaluation on 323 individuals and 1474 video sequences with extreme illumination, pose and head motion variation. Our system consistently achieved a nearly perfect recognition rate (over 99.7% on all four databases). © 2012 Elsevier Ltd. All rights reserved.


We offer a solution to the problem of efficiently translating algorithms between different types of discrete statistical model. We investigate the expressive power of three classes of model (those with binary variables, with pairwise factors, and with planar topology) as well as their four intersections. We formalize a notion of "simple reduction" for the problem of inferring marginal probabilities and consider whether it is possible to "simply reduce" marginal inference from general discrete factor graphs to factor graphs in each of these seven subclasses. We characterize the reducibility of each class, showing in particular that the class of binary pairwise factor graphs is able to simply reduce only positive models. We also exhibit a continuous "spectral reduction" based on polynomial interpolation, which overcomes this limitation. Experiments assess the performance of standard approximate inference algorithms on the outputs of our reductions.


A statistical model of random waves is developed using the Stokes wave theory of water wave dynamics. A new nonlinear probability distribution function of wave height is presented. The results indicate that wave steepness not only serves as a parameter of the wave height distribution function but also reflects the degree to which the wave height distribution deviates from the Rayleigh distribution. The new wave height distribution overcomes the problem that the Rayleigh distribution overestimates large waves and underestimates ordinary waves. The wave height predicted by the new distribution at small probabilities is also smaller than that predicted by the Rayleigh distribution. Wave height data taken from East China Normal University are used to verify the new distribution. The results indicate that the new distribution fits the measurements much better than the Rayleigh distribution.
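
The Rayleigh baseline against which the new distribution is compared can be written down in a few lines of Python. This is a minimal sketch of the standard Rayleigh wave height distribution only, parameterized by the root-mean-square wave height `h_rms`. The new steepness-dependent distribution is not given in the abstract and is not reproduced here, and the function names are hypothetical.

```python
import math

def rayleigh_pdf(h, h_rms):
    """Rayleigh probability density of wave height h (h >= 0)."""
    return (2.0 * h / h_rms**2) * math.exp(-((h / h_rms) ** 2))

def rayleigh_exceedance(h, h_rms):
    """Probability that a wave height exceeds h under the Rayleigh model."""
    return math.exp(-((h / h_rms) ** 2))
```

Comparing measured exceedance frequencies of large wave heights with `rayleigh_exceedance` is one simple way to see the kind of overestimation in the tail that the abstract attributes to the Rayleigh distribution.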


We formulate and interpret several multi-modal registration methods in the context of a unified statistical and information-theoretic framework. A unified interpretation clarifies the implicit assumptions of each method, yielding a better understanding of their relative strengths and weaknesses. Additionally, we discuss a generative statistical model from which we derive a novel analysis tool, the "auto-information function", as a means of assessing and exploiting the common spatial dependencies inherent in multi-modal imagery. We analytically derive useful properties of the "auto-information function" and verify them empirically on multi-modal imagery. Among the useful aspects of the "auto-information function" is that it can be computed from each imaging modality independently and that it allows one to decompose the search space of registration problems.


In this thesis I theoretically study quantum states of ultracold atoms. The majority of the Chapters focus on engineering specific quantum states of single atoms with high fidelity in experimentally realistic systems. In the sixth Chapter, I investigate the stability and dynamics of new multidimensional solitonic states that can be created in inhomogeneous atomic Bose-Einstein condensates. In Chapter three I present two papers in which I demonstrate how the coherent tunnelling by adiabatic passage (CTAP) process can be implemented in an experimentally realistic atom chip system, to coherently transfer the centre-of-mass of a single atom between two spatially distinct magnetic waveguides. In these works I also utilise GPU (Graphics Processing Unit) computing, which offers a significant performance increase in the numerical simulation of the Schrödinger equation. In Chapter four I investigate the CTAP process for a linear arrangement of radio frequency traps, where the centre-of-mass of both single atoms and clouds of interacting atoms can be coherently controlled. In Chapter five I present a theoretical study of adiabatic radio frequency potentials, where I use Floquet theory to more accurately model situations where frequencies are close and/or field amplitudes are large. I also show how one can create highly versatile 2D adiabatic radio frequency potentials using multiple radio frequency fields with arbitrary field orientation and demonstrate their utility by simulating the creation of ring vortex solitons. In the sixth Chapter I discuss the stability and dynamics of a family of multidimensional solitonic states created in harmonically confined Bose-Einstein condensates. I demonstrate that these solitonic states have interesting dynamical instabilities, where a continuous collapse and revival of the initial state occurs. Through Bogoliubov analysis, I determine the modes responsible for the observed instabilities of each solitonic state and also extract information related to the time at which instability can be observed.


A framework for adaptive and non-adaptive statistical compressive sensing is developed, where a statistical model replaces the standard sparsity model of classical compressive sensing. We propose within this framework optimal task-specific sensing protocols specifically and jointly designed for classification and reconstruction. A two-step adaptive sensing paradigm is developed, where online sensing is applied to detect the signal class in the first step, followed by a reconstruction step adapted to the detected class and the observed samples. The approach is based on information theory, here tailored for Gaussian mixture models (GMMs), where an information-theoretic objective relationship between the sensed signals and a representation of the specific task of interest is maximized. Experimental results using synthetic signals, Landsat satellite attributes, and natural images of different sizes and with different noise levels show the improvements achieved using the proposed framework when compared to more standard sensing protocols. The underlying formulation can be applied beyond GMMs, at the price of higher mathematical and computational complexity. © 1991-2012 IEEE.
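
The two-step idea (detect the signal class first, then reconstruct with a class-adapted estimator) can be sketched in Python for a Gaussian mixture prior and linear measurements y = Φx + noise. This is a minimal illustration under stated assumptions: the sensing matrix `Phi` is fixed rather than optimized or adapted online as in the paper, the noise is white Gaussian with variance `noise_var`, and the function name `classify_and_reconstruct` is hypothetical.

```python
import numpy as np

def classify_and_reconstruct(y, Phi, means, covs, weights, noise_var):
    """Pick the most probable Gaussian class given y, then reconstruct x.

    Step 1: maximize log p(k) + log N(y; Phi mu_k, Phi Sigma_k Phi^T + s^2 I).
    Step 2: return the conditional mean of x under the selected class.
    """
    m = len(y)
    best_k, best_score, best_x = None, -np.inf, None
    for k, (mu, Sigma, w) in enumerate(zip(means, covs, weights)):
        C = Phi @ Sigma @ Phi.T + noise_var * np.eye(m)  # covariance of y
        r = y - Phi @ mu                                 # measurement residual
        sol = np.linalg.solve(C, r)
        _, logdet = np.linalg.slogdet(C)
        # Log posterior of class k, up to a constant shared by all classes.
        score = np.log(w) - 0.5 * (logdet + r @ sol)
        if score > best_score:
            best_k, best_score = k, score
            best_x = mu + Sigma @ Phi.T @ sol            # conditional mean
    return best_k, best_x


# Usage: two well-separated classes observed through one measurement.
Phi = np.array([[1.0, 0.0]])
means = [np.zeros(2), np.array([10.0, 10.0])]
covs = [np.eye(2), np.eye(2)]
k_hat, x_hat = classify_and_reconstruct(
    np.array([10.0]), Phi, means, covs, [0.5, 0.5], noise_var=0.01
)
```

Because each class is Gaussian, both the class score and the reconstruction have closed forms, which is part of what makes the GMM setting tractable compared with more general signal models.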


The lesser sandeel Ammodytes marinus is a key species in the North Sea ecosystem, transferring energy from planktonic producers to top predators. Previous studies have shown a long-term decline in the size of 0-group sandeels in the western North Sea, but they were unable to pinpoint the mechanism (later hatching, slower growth or changes in size-dependent mortality) or cause. To investigate the first 2 possibilities we combined 2 independent time series of sandeel size, namely data from chick-feeding Atlantic puffins Fratercula arctica and from the Continuous Plankton Recorder (CPR), in a novel statistical model implemented using Markov Chain Monte Carlo (MCMC). The model estimated annual mean length on 1 July, as well as hatching date and growth rate for sandeels from 1973 to 2006. Mean length-at-date declined by 22% over this period, corresponding to a 60% decrease in energy content, with a sharper decline since 2002. Up to the mid-1990s, the decline was associated with a trend towards later hatching. Subsequently, hatching became earlier again, and the continued trend towards smaller size appears to have been driven by lower growth rates, particularly in the most recent years, although we could not rule out changes in size-dependent mortality. Our findings point to major changes in key aspects of sandeel life history, which we consider are most likely due to direct and indirect temperature-related changes over a range of biotic factors, including the seasonal distribution of copepods and intra- and inter-specific competition with planktivorous fish. The results have implications both for the many predators of sandeels and for age and size of maturation in this aggregation of North Sea sandeels.


We study quantum information flow in a model comprised of a trapped impurity qubit immersed in a Bose-Einstein-condensed reservoir. We demonstrate how information flux between the qubit and the condensate can be manipulated by engineering the ultracold reservoir within experimentally realistic limits. We show that this system undergoes a transition from Markovian to non-Markovian dynamics, which can be controlled by changing key parameters such as the condensate scattering length. In this way, one can realize a quantum simulator of both Markovian and non-Markovian open quantum systems, the latter ones being characterized by a reverse flow of information from the background gas (reservoir) to the impurity (system).


We study the entanglement of two impurity qubits immersed in a Bose-Einstein condensate (BEC) reservoir. This open quantum system model allows for interpolation between a common dephasing scenario and an independent dephasing scenario by modifying the wavelength of the superlattice superposed on the BEC, and we examine how this influences the dynamical properties of the impurities. We demonstrate the existence of rich dynamics corresponding to different values of reservoir parameters, including phenomena such as entanglement trapping, revivals of entanglement, and entanglement generation. In the spirit of reservoir engineering, we present the optimal BEC parameters for entanglement generation and trapping, showing the key role of the ultracold-gas interactions. Copyright (C) EPLA, 2013