944 results for maximum likelihood method
Abstract:
The effect of interspecific heterosis in crosses between Medicago sativa subsp. sativa and M. sativa subsp. falcata was assessed. Three sativa and three falcata plants were crossed in a diallel design. Progeny dry matter yield and natural plant height were assessed in a replicated field experiment at Gatton, Queensland. Yield data were analysed using the method of residual maximum likelihood (REML) and Griffing's model 1. There were significant differences between the reciprocal, general combining ability (GCA), and specific combining ability (SCA) effects. As expected, S1 populations were lower yielding than their respective intraspecific crosses, and falcata x falcata crosses were significantly lower yielding than sativa x sativa crosses. Some of the interspecific crosses showed substantial SCA effects, yielding at least as well as the best sativa x sativa crosses. We have demonstrated the potential usefulness of unselected M. sativa subsp. falcata as a heterotic group for improving yield in lucerne material adapted to northern Australia, and discuss how it could be incorporated into future breeding to overcome the yield stagnation currently experienced in Australian programs.
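As a concrete illustration of the diallel analysis above, the following sketch computes Griffing's Method 1 estimators of GCA, SCA and reciprocal effects from a table of cross means; the yield values are invented for illustration and are not data from the study.

```python
import numpy as np

# Griffing's (1956) Method 1 estimators for a full diallel (parents, F1s and
# reciprocals). x[i, j] is the mean yield of the cross with parent i as female
# and parent j as male; values below are invented, not data from the study.
p = 6  # 3 sativa + 3 falcata parents, as in the experiment
rng = np.random.default_rng(1)
x = rng.normal(10.0, 1.0, size=(p, p))

grand = x.sum()
row = x.sum(axis=1)   # x_i. : totals over males for female i
col = x.sum(axis=0)   # x_.j : totals over females for male j

# General combining ability of parent i
gca = (row + col) / (2 * p) - grand / p**2

# Specific combining ability of the pair (i, j)
sca = np.empty((p, p))
for i in range(p):
    for j in range(p):
        sca[i, j] = (0.5 * (x[i, j] + x[j, i])
                     - (row[i] + col[i] + row[j] + col[j]) / (2 * p)
                     + grand / p**2)

# Reciprocal effect of the pair (i, j)
rec = 0.5 * (x - x.T)

print("GCA effects:", np.round(gca, 3))
```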
Abstract:
The Wet Tropics World Heritage Area in Far North Queensland, Australia consists predominantly of tropical rainforest and wet sclerophyll forest in areas of variable relief. Previous maps of vegetation communities in the area were produced by a labor-intensive combination of field survey and air-photo interpretation. The aim of this work was therefore to develop a new vegetation mapping method based on imaging radar that incorporates topographical corrections, which could be repeated frequently, and which would reduce the need for detailed field assessments and associated costs. The method employed a topographic correction and mapping procedure developed to enable vegetation structural classes to be mapped from satellite imaging radar. Eight JERS-1 scenes covering the Wet Tropics area for 1996 were acquired from NASDA under the auspices of the Global Rainforest Mapping Project. The JERS scenes were geometrically corrected for topographic distortion using an 80 m DEM and a combination of polynomial warping and radar viewing geometry modeling. An image mosaic was created to cover the Wet Tropics region, and a new technique for image smoothing was applied to the JERS texture bands and DEM before a Maximum Likelihood classification was applied to identify major land-cover and vegetation communities. Despite these efforts, dominant vegetation community classes could only be classified to low levels of accuracy (57.5 percent), which was partly explained by the significantly larger pixel size of the DEM in comparison to the JERS image (12.5 m). In addition, the spatial and floristic detail contained in the classes of the original validation maps was much finer than the JERS classification product was able to distinguish. In comparison to field and aerial photo-based approaches for mapping the vegetation of the Wet Tropics, appropriately corrected SAR data provide a regional-scale, all-weather mapping technique for broader vegetation classes. Further work is required to establish an appropriate combination of imaging radar with elevation data and other environmental surrogates to accurately map vegetation communities across the entire Wet Tropics.
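For readers unfamiliar with the classification step, here is a minimal sketch of a per-pixel Gaussian Maximum Likelihood classifier of the kind applied to the corrected radar mosaic; the band values are synthetic stand-ins, not JERS data.

```python
import numpy as np

# Per-pixel Maximum Likelihood classification: each class is modelled as a
# multivariate Gaussian fitted to training pixels, and every pixel is assigned
# to the class with the highest log-likelihood. Band values are synthetic.
rng = np.random.default_rng(0)
n_bands, n_classes = 3, 4
train = {c: rng.normal(c, 1.0, size=(200, n_bands)) for c in range(n_classes)}

params = {}
for c, X in train.items():
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    params[c] = (mu, np.linalg.inv(cov), np.log(np.linalg.det(cov)))

def classify(pixels):
    """Assign each pixel (rows of band values) to the ML class."""
    scores = []
    for c, (mu, cov_inv, logdet) in params.items():
        d = pixels - mu
        mahal = np.einsum("ij,jk,ik->i", d, cov_inv, d)
        scores.append(-0.5 * (mahal + logdet))  # log-likelihood up to a constant
    return np.argmax(np.column_stack(scores), axis=1)

pixels = rng.normal(1.5, 1.0, size=(10, n_bands))
print(classify(pixels))
```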
Abstract:
Determining the dimensionality of G provides an important perspective on the genetic basis of a multivariate suite of traits. Since the introduction of Fisher's geometric model, the number of genetically independent traits underlying a set of functionally related phenotypic traits has been recognized as an important factor influencing the response to selection. Here, we show how the effective dimensionality of G can be established, using a method for determining the dimensionality of the effect space from a multivariate general linear model introduced by AMEMIYA (1985). We compare this approach with two other available methods, factor-analytic modeling and bootstrapping, using a half-sib experiment that estimated G for eight cuticular hydrocarbons of Drosophila serrata. In our example, eight pheromone traits were shown to be adequately represented by only two underlying genetic dimensions by Amemiya's approach and factor-analytic modeling of the covariance structure at the sire level. In contrast, bootstrapping identified four dimensions with significant genetic variance. A simulation study indicated that while the performance of Amemiya's method was more sensitive to power constraints, it performed as well as or better than factor-analytic modeling in correctly identifying the original genetic dimensions at moderate to high levels of heritability. The bootstrap approach consistently overestimated the number of dimensions in all cases and performed less well than Amemiya's method at subspace recovery.
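The bootstrap comparator discussed above can be illustrated in a few lines: resample the data, re-estimate the covariance matrix, and count the eigenvalues whose lower confidence bound stays above the noise level. This is a simplified sketch on simulated data (true rank 2, eight traits), not Amemiya's procedure.

```python
import numpy as np

# Naive bootstrap assessment of covariance dimensionality: simulate eight
# traits driven by two underlying factors plus residual variance 0.1, then
# count eigenvalues whose bootstrap CI stays above that residual level.
rng = np.random.default_rng(2)
n_traits, true_rank, n_obs = 8, 2, 300
A = rng.normal(size=(n_traits, true_rank))
G_true = A @ A.T
data = rng.multivariate_normal(np.zeros(n_traits),
                               G_true + 0.1 * np.eye(n_traits), n_obs)

boot_eigs = []
for _ in range(500):
    sample = data[rng.integers(0, n_obs, n_obs)]        # resample rows
    boot_eigs.append(np.linalg.eigvalsh(np.cov(sample, rowvar=False))[::-1])
boot_eigs = np.array(boot_eigs)

lower = np.percentile(boot_eigs, 2.5, axis=0)           # lower CI per eigenvalue
print("dimensions with 95% CI above the residual variance:",
      int(np.sum(lower > 0.1)))
```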
Abstract:
We have developed an alignment-free method that calculates phylogenetic distances using a maximum-likelihood approach for a model of sequence change on patterns that are discovered in unaligned sequences. To evaluate the phylogenetic accuracy of our method, and to conduct a comprehensive comparison of existing alignment-free methods (freely available as Python package decaf+py at http://www.bioinformatics.org.au), we have created a data set of reference trees covering a wide range of phylogenetic distances. Amino acid sequences were evolved along the trees and input to the tested methods; from their calculated distances we inferred trees whose topologies we compared to the reference trees. We find our pattern-based method statistically superior to all other tested alignment-free methods. We also demonstrate the general advantage of alignment-free methods over an approach based on automated alignments when sequences violate the assumption of collinearity. Similarly, we compare methods on empirical data from an existing alignment benchmark set that we used to derive reference distances and trees. Our pattern-based approach yields distances that show a linear relationship to reference distances over a substantially longer range than other alignment-free methods. The pattern-based approach outperforms the other alignment-free methods, and its phylogenetic accuracy is statistically indistinguishable from alignment-based distances.
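As a point of orientation, a deliberately simple alignment-free distance based on k-mer frequency profiles is sketched below; it is a generic stand-in for intuition, not the pattern-based maximum-likelihood distance implemented in decaf+py.

```python
import math
from collections import Counter

# A simple alignment-free distance: compare k-mer frequency profiles of two
# unaligned sequences with a cosine-based score. This is a generic stand-in,
# not the pattern-based maximum-likelihood distance of decaf+py.
def kmer_profile(seq, k=3):
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def kmer_distance(a, b, k=3):
    pa, pb = kmer_profile(a, k), kmer_profile(b, k)
    dot = sum(pa[m] * pb[m] for m in pa)
    norm = math.sqrt(sum(v * v for v in pa.values())) * \
           math.sqrt(sum(v * v for v in pb.values()))
    return 1.0 - dot / norm  # 0 for identical profiles, up to 1 for disjoint

print(kmer_distance("MKVLAAGIVKVLAAG", "MKVLSAGIVKVLSAG"))
```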
Abstract:
This work has, as its objective, the development of non-invasive and low-cost systems for monitoring and automatically diagnosing specific neonatal diseases by means of the analysis of suitable video signals. We focus on monitoring infants potentially at risk of diseases characterized by the presence or absence of rhythmic movements of one or more body parts. Seizures and respiratory diseases are specifically considered, but the approach is general. Seizures are defined as sudden neurological and behavioural alterations. They are age-dependent phenomena and the most common sign of central nervous system dysfunction. Neonatal seizures have onset within the 28th day of life in newborns at term and within the 44th week of conceptional age in preterm infants. Their main causes are hypoxic-ischaemic encephalopathy, intracranial haemorrhage, and sepsis. Studies indicate an incidence rate of neonatal seizures of 0.2% of live births, 1.1% for preterm neonates, and 1.3% for infants weighing less than 2500 g at birth. Neonatal seizures can be classified into four main categories: clonic, tonic, myoclonic, and subtle. Seizures in newborns have to be promptly and accurately recognized in order to establish timely treatments that could avoid an increase of the underlying brain damage. Respiratory diseases related to the occurrence of apnoea episodes may be caused by cerebrovascular events. Among the wide range of causes of apnoea, besides seizures, a relevant one is Congenital Central Hypoventilation Syndrome (CCHS) [Healy]. With a reported prevalence of 1 in 200,000 live births, CCHS, formerly known as Ondine's curse, is a rare life-threatening disorder characterized by a failure of the automatic control of breathing, caused by mutations in the PHOX2B gene. CCHS manifests itself, in the neonatal period, with episodes of cyanosis or apnoea, especially during quiet sleep. Reported mortality rates range from 8% to 38% of newborns with genetically confirmed CCHS. Nowadays, CCHS is considered a disorder of autonomic regulation, with a related risk of sudden infant death syndrome (SIDS). Currently, the standard method of diagnosis for both diseases is polysomnography, which relies on a set of sensors such as ElectroEncephaloGram (EEG) sensors, ElectroMyoGraphy (EMG) sensors, ElectroCardioGraphy (ECG) sensors, elastic belt sensors, pulse oximeters and nasal flow-meters. This monitoring system is very expensive, time-consuming, moderately invasive and requires particularly skilled medical personnel, not always available in a Neonatal Intensive Care Unit (NICU). Therefore, automatic, real-time and non-invasive monitoring equipment able to reliably recognize these diseases would be of significant value in the NICU. A very appealing monitoring tool to automatically detect neonatal seizures or breathing disorders may be based on acquiring, through a network of sensors, e.g., a set of video cameras, the movements of the newborn's body (e.g., limbs, chest) and properly processing the relevant signals. An automatic multi-sensor system could be used to permanently monitor every patient in the NICU or specific patients at home. Furthermore, a wire-free technique may be more user-friendly and highly desirable when used with infants, in particular with newborns. This work has focused on a reliable method to estimate the periodicity in pathological movements based on the use of the Maximum Likelihood (ML) criterion.
In particular, average differential luminance signals from multiple Red, Green and Blue (RGB) cameras or depth-sensor devices are extracted, and the presence or absence of a significant periodicity is analysed in order to detect possible pathological conditions. The efficacy of this monitoring system has been measured on the basis of video recordings provided by the Department of Neurosciences of the University of Parma. Concerning clonic seizures, a kinematic analysis was performed to establish a relationship between neonatal seizures and the human inborn pattern of quadrupedal locomotion. Moreover, we decided to build simulators able to replicate the symptomatic movements characteristic of the diseases under consideration. The reason is, essentially, the opportunity to have, at any time, a 'subject' on which to test the continuously evolving detection algorithms. Finally, we have developed a smartphone app, called 'Smartphone based contactless epilepsy detector' (SmartCED), able to detect neonatal clonic seizures and warn the user of their occurrence in real time.
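A minimal sketch of the periodicity test is shown below: for a single tone in white Gaussian noise, the maximum-likelihood frequency estimate coincides with the periodogram peak, so the peak of the luminance signal's periodogram is compared against the background spectrum. The signal, frame rate and threshold are illustrative, not the clinical settings.

```python
import numpy as np

# ML periodicity detection for a tone in white Gaussian noise: the ML
# frequency estimate is the periodogram peak, so scan the periodogram of the
# average differential luminance signal and flag a significant periodicity
# when the peak dominates the spectrum. All values here are illustrative.
fs = 25.0                       # assumed frames per second of the camera
t = np.arange(0, 20, 1 / fs)
rng = np.random.default_rng(3)
signal = 0.8 * np.sin(2 * np.pi * 2.5 * t) + rng.normal(0, 1, t.size)

spec = np.abs(np.fft.rfft(signal - signal.mean())) ** 2
freqs = np.fft.rfftfreq(t.size, 1 / fs)

peak = np.argmax(spec[1:]) + 1                       # skip the DC bin
is_periodic = spec[peak] > 10 * np.median(spec[1:])  # crude significance rule
print(f"dominant frequency {freqs[peak]:.2f} Hz, periodic: {bool(is_periodic)}")
```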
Abstract:
We present results concerning the application of the Good-Turing (GT) estimation method to the frequentist n-tuple system. We show that the Good-Turing method can, to a certain extent, rectify the Zero Frequency Problem by providing, within a formal framework, improved estimates of small tallies. We also show that it leads to better n-tuple system performance than Maximum Likelihood Estimation (MLE). However, preliminary experimental results suggest that replacing zero tallies with an arbitrary constant close to zero before MLE yields better performance than that of the GT system.
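The core of simple Good-Turing estimation fits in a few lines: a tally r is re-estimated as r* = (r + 1) N_{r+1} / N_r, and the mass N_1 / N is reserved for unseen events. The sketch below uses invented counts and omits the smoothing of the N_r curve that practical implementations require.

```python
from collections import Counter

# Simple Good-Turing smoothing for tallies: a count r is re-estimated as
# r* = (r + 1) * N_{r+1} / N_r, and the probability mass reserved for unseen
# events is N_1 / N. Real systems first smooth the N_r curve (e.g. Gale and
# Sampson's simple Good-Turing); this sketch skips that step.
counts = Counter({"a": 3, "b": 2, "c": 2, "d": 1, "e": 1, "f": 1})
N = sum(counts.values())
Nr = Counter(counts.values())          # Nr[r] = number of items seen r times

def gt_adjusted(r):
    return (r + 1) * Nr.get(r + 1, 0) / Nr[r]

print("P(unseen) =", Nr[1] / N)        # mass for zero-frequency events
print("adjusted count for r=1:", gt_adjusted(1))
```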
Abstract:
Most traditional methods for extracting the relationships between two time series are based on cross-correlation. In a non-linear non-stationary environment, these techniques are not sufficient. We show in this paper how to use hidden Markov models (HMMs) to identify the lag (or delay) between different variables for such data. We first present a method using maximum likelihood estimation and propose a simple algorithm which is capable of identifying associations between variables. We also adopt an information-theoretic approach and develop a novel procedure for training HMMs to maximise the mutual information between delayed time series. Both methods are successfully applied to real data. We model the oil drilling process with HMMs and estimate a crucial parameter, namely the lag for returns.
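A sketch of the maximum-likelihood lag search, assuming the hmmlearn package for HMM fitting: for each candidate delay d, pair x_t with y_{t+d}, fit a Gaussian HMM to the joint series, and keep the delay with the highest log-likelihood. The series and the true lag of 5 steps are synthetic.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM  # assumes hmmlearn is installed

# ML lag identification with HMMs: for each candidate delay d, fit an HMM to
# the joint series (x_t, y_{t+d}) and keep the delay with the highest
# log-likelihood. Synthetic data with a true lag of 5 steps.
rng = np.random.default_rng(4)
x = rng.normal(size=600).cumsum()
y = np.roll(x, 5) + rng.normal(0, 0.5, size=600)   # y lags x by 5 steps

def lag_loglik(d, n_states=3):
    joint = np.column_stack([x[:-d or None], y[d:]])
    model = GaussianHMM(n_components=n_states, covariance_type="full",
                        n_iter=50, random_state=0)
    model.fit(joint)
    return model.score(joint)

best = max(range(0, 11), key=lag_loglik)
print("estimated lag:", best)
```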
Abstract:
Most traditional methods for extracting the relationships between two time series are based on cross-correlation. In a non-linear non-stationary environment, these techniques are not sufficient. We show in this paper how to use hidden Markov models to identify the lag (or delay) between different variables for such data. Adopting an information-theoretic approach, we develop a procedure for training HMMs to maximise the mutual information (MMI) between delayed time series. The method is used to model the oil drilling process. We show that cross-correlation gives no information and that the MMI approach outperforms maximum likelihood.
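A companion sketch for the information-theoretic variant, using a simple 2-D histogram estimator of mutual information in place of HMM-based MMI training (which is considerably more involved); the data and the true lag of 7 steps are synthetic.

```python
import numpy as np

# Estimate the mutual information between x_t and y_{t+d} with a 2-D
# histogram and pick the delay that maximises it. A histogram estimator
# stands in here for the HMM-based MMI training described in the abstract.
rng = np.random.default_rng(5)
x = np.sin(np.linspace(0, 40, 600)) + rng.normal(0, 0.3, 600)
y = np.roll(x, 7) + rng.normal(0, 0.3, 600)

def mutual_info(a, b, bins=16):
    pxy, _, _ = np.histogram2d(a, b, bins=bins)
    pxy /= pxy.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    nz = pxy > 0
    return np.sum(pxy[nz] * np.log(pxy[nz] / np.outer(px, py)[nz]))

best = max(range(0, 15), key=lambda d: mutual_info(x[:-d or None], y[d:]))
print("delay with maximal mutual information:", best)
```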
Abstract:
Automatically generating maps of a measured variable of interest can be problematic. In this work we focus on the monitoring network context where observations are collected and reported by a network of sensors, and are then transformed into interpolated maps for use in decision making. Using traditional geostatistical methods, estimating the covariance structure of data collected in an emergency situation can be difficult. Variogram determination, whether by method-of-moment estimators or by maximum likelihood, is very sensitive to extreme values. Even when a monitoring network is in a routine mode of operation, sensors can sporadically malfunction and report extreme values. If this extreme data destabilises the model, causing the covariance structure of the observed data to be incorrectly estimated, the generated maps will be of little value, and the uncertainty estimates in particular will be misleading. Marchant and Lark [2007] propose a REML estimator for the covariance, which is shown to work on small data sets with a manual selection of the damping parameter in the robust likelihood. We show how this can be extended to allow treatment of large data sets together with an automated approach to all parameter estimation. The projected process kriging framework of Ingram et al. [2007] is extended to allow the use of robust likelihood functions, including the two component Gaussian and the Huber function. We show how our algorithm is further refined to reduce the computational complexity while at the same time minimising any loss of information. To show the benefits of this method, we use data collected from radiation monitoring networks across Europe. We compare our results to those obtained from traditional kriging methodologies and include comparisons with Box-Cox transformations of the data. We discuss the issue of whether to treat or ignore extreme values, making the distinction between the robust methods which ignore outliers and transformation methods which treat them as part of the (transformed) process. Using a case study, based on an extreme radiological event over a large area, we show how radiation data collected from monitoring networks can be analysed automatically and then used to generate reliable maps to inform decision making. We show the limitations of the methods and discuss potential extensions to remedy these.
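The sensitivity that motivates the robust approach is easy to demonstrate: below, the classical method-of-moments variogram estimator is compared with the Cressie-Hawkins robust estimator on 1-D synthetic data containing a few sensor spikes. This illustrates the problem, not the REML estimator of Marchant and Lark [2007].

```python
import numpy as np

# Classical vs robust empirical variogram on data with sensor spikes. The
# Cressie & Hawkins (1980) estimator uses the fourth power of the mean root
# absolute difference, which damps the influence of extreme values.
rng = np.random.default_rng(6)
coords = rng.uniform(0, 100, 400)
z = np.sin(coords / 10) + rng.normal(0, 0.2, 400)
z[rng.choice(400, 5, replace=False)] += 15.0   # faulty sensors report spikes

def variogram(coords, z, lags):
    h = np.abs(coords[:, None] - coords[None, :])
    dz = z[:, None] - z[None, :]
    classical, robust = [], []
    for lo, hi in zip(lags[:-1], lags[1:]):
        m = (h > lo) & (h <= hi) & (h > 0)
        n = m.sum()
        classical.append(0.5 * np.mean(dz[m] ** 2))
        robust.append(np.mean(np.sqrt(np.abs(dz[m]))) ** 4
                      / (2 * (0.457 + 0.494 / n)))
    return np.array(classical), np.array(robust)

classical, robust = variogram(coords, z, np.linspace(0, 20, 6))
print("classical:", np.round(classical, 2))
print("robust:   ", np.round(robust, 2))
```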
Abstract:
When making predictions with complex simulators it can be important to quantify the various sources of uncertainty. Errors in the structural specification of the simulator, for example due to missing processes or incorrect mathematical specification, can be a major source of uncertainty, but are often ignored. We introduce a methodology for inferring the discrepancy between the simulator and the system in discrete-time dynamical simulators. We assume a structural form for the discrepancy function, and show how to infer the maximum-likelihood parameter estimates using a particle filter embedded within a Monte Carlo expectation maximization (MCEM) algorithm. We illustrate the method on a conceptual rainfall-runoff simulator (logSPM) used to model the Abercrombie catchment in Australia. We assess the simulator and discrepancy model on the basis of their predictive performance using proper scoring rules. This article has supplementary material online. © 2011 International Biometric Society.
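A toy version of the likelihood machinery: a bootstrap particle filter returns the marginal log-likelihood of a discrete-time model with an additive discrepancy term delta, which is then maximised over a grid (a grid search stands in for the MCEM algorithm to keep the sketch short). The model and data are synthetic, not the logSPM simulator.

```python
import numpy as np

# Bootstrap particle filter log-likelihood for a discrete-time model with an
# additive discrepancy delta, maximised over a grid. Synthetic AR(1)-style
# model, not the logSPM rainfall-runoff simulator.
data_rng = np.random.default_rng(7)
T, n_particles, true_delta = 100, 500, 0.4
x = np.zeros(T)
for t in range(1, T):
    x[t] = 0.9 * x[t - 1] + true_delta + data_rng.normal(0, 0.3)
y = x + data_rng.normal(0, 0.5, T)       # noisy observations of the state

def loglik(delta):
    rng = np.random.default_rng(8)       # common random numbers across grid
    parts = np.zeros(n_particles)
    ll = 0.0
    for t in range(T):
        parts = 0.9 * parts + delta + rng.normal(0, 0.3, n_particles)
        w = np.exp(-0.5 * ((y[t] - parts) / 0.5) ** 2) + 1e-12  # obs density
        ll += np.log(w.mean())
        parts = parts[rng.choice(n_particles, n_particles, p=w / w.sum())]
    return ll

grid = np.linspace(0.0, 0.8, 9)
print("ML discrepancy estimate:", grid[np.argmax([loglik(d) for d in grid])])
```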
Abstract:
Sparse code division multiple access (CDMA), a variation on the standard CDMA method in which the spreading (signature) matrix contains only a relatively small number of nonzero elements, is presented and analysed using methods of statistical physics. The analysis provides results on the performance of maximum likelihood decoding for sparse spreading codes in the large system limit. We present results for both cases of regular and irregular spreading matrices for the binary additive white Gaussian noise channel (BIAWGN) with a comparison to the canonical (dense) random spreading code. © 2007 IOP Publishing Ltd.
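For small systems, maximum likelihood decoding can be done by brute force, which makes the objective concrete: the ML decision is the +/-1 symbol vector b minimising ||y - Sb||^2. The sketch below builds a toy sparse signature matrix; exhaustive search is of course infeasible in the large-system limit the paper analyses.

```python
import itertools
import numpy as np

# Brute-force ML decoding for a toy sparse CDMA system on the BIAWGN channel:
# the ML decision minimises ||y - S b||^2 over b in {-1, +1}^K. Feasible only
# for tiny K; the paper's statistical-physics analysis targets large systems.
rng = np.random.default_rng(8)
N, K, nonzeros_per_user = 12, 6, 3     # chips, users, signature sparsity

S = np.zeros((N, K))
for k in range(K):                     # sparse signature: few nonzero chips
    rows = rng.choice(N, nonzeros_per_user, replace=False)
    S[rows, k] = rng.choice([-1, 1], nonzeros_per_user) / np.sqrt(nonzeros_per_user)

b_true = rng.choice([-1, 1], K)
y = S @ b_true + rng.normal(0, 0.4, N)

best = min(itertools.product([-1, 1], repeat=K),
           key=lambda b: np.sum((y - S @ np.array(b)) ** 2))
print("true:", b_true, "decoded:", np.array(best))
```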
Abstract:
Recently within the machine learning and spatial statistics communities many papers have explored the potential of reduced rank representations of the covariance matrix, often referred to as projected or fixed rank approaches. In such methods the covariance function of the posterior process is represented by a reduced rank approximation which is chosen such that there is minimal information loss. In this paper a sequential framework for inference in such projected processes is presented, where the observations are considered one at a time. We introduce a C++ library for carrying out such projected, sequential estimation which adds several novel features. In particular we have incorporated the ability to use a generic observation operator, or sensor model, to permit data fusion. We can also cope with a range of observation error characteristics, including non-Gaussian observation errors. Inference for the variogram parameters is based on maximum likelihood estimation. We illustrate the projected sequential method in application to synthetic and real data sets. We discuss the software implementation and suggest possible future extensions.
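The reduced-rank idea can be sketched with the subset-of-regressors approximation, in which the posterior mean is computed through m inducing points; this is a static, batch illustration of the projection, not the sequential C++ library described above.

```python
import numpy as np

# Reduced-rank ("projected") GP prediction via the subset-of-regressors
# approximation: the full covariance is replaced by its projection onto m
# inducing points. Batch illustration only; the library adds sequential
# updates, generic observation operators and non-Gaussian errors.
rng = np.random.default_rng(9)

def k(a, b, ell=1.0):                    # squared-exponential covariance
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

X = rng.uniform(0, 10, 200)
y = np.sin(X) + rng.normal(0, 0.2, 200)
Xm = np.linspace(0, 10, 15)              # m = 15 inducing points
Xs = np.linspace(0, 10, 5)               # prediction locations

noise = 0.2 ** 2
Kmm, Kmn, Ksm = k(Xm, Xm), k(Xm, X), k(Xs, Xm)
A = noise * Kmm + Kmn @ Kmn.T
mean = Ksm @ np.linalg.solve(A, Kmn @ y)  # SoR predictive mean
print(np.round(mean, 2), np.round(np.sin(Xs), 2))
```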
Abstract:
The subject of this thesis is the n-tuple network (RAMnet). The major advantage of RAMnets is their speed and the simplicity with which they can be implemented in parallel hardware. On the other hand, this method is not a universal approximator and the training procedure does not involve the minimisation of a cost function. Hence RAMnets are potentially sub-optimal. It is important to understand the source of this sub-optimality and to develop the analytical tools that allow us to quantify the generalisation cost of using this model for any given data. We view RAMnets as classifiers and function approximators and try to determine how critical their lack of universality and optimality is. In order to better understand the inherent restrictions of the model, we review RAMnets, showing their relationship to a number of well established general models such as: Associative Memories, Kanerva's Sparse Distributed Memory, Radial Basis Functions, General Regression Networks and Bayesian Classifiers. We then benchmark the binary RAMnet model against 23 other algorithms using real-world data from the StatLog Project. This large-scale experimental study indicates that RAMnets are often capable of delivering results which are competitive with those obtained by more sophisticated, computationally expensive models. The Frequency Weighted version is also benchmarked and shown to perform worse than the binary RAMnet for large values of the tuple size n. We demonstrate that the main issue in Frequency Weighted RAMnets is adequate probability estimation and propose Good-Turing estimates in place of the more commonly used Maximum Likelihood estimates. Having established the viability of the method numerically, we focus on providing an analytical framework that allows us to quantify the generalisation cost of RAMnets for a given dataset. For the classification network we provide a semi-quantitative argument which is based on the notion of tuple distance. It gives a good indication of whether the network will fail for the given data. A rigorous Bayesian framework with Gaussian process prior assumptions is given for the regression n-tuple net. We show how to calculate the generalisation cost of this net and verify the results numerically for one-dimensional noisy interpolation problems. We conclude that the n-tuple method of classification based on memorisation of random features can be a powerful alternative to slower cost-driven models. The speed of the method is at the expense of its optimality. RAMnets will fail for certain datasets, but the cases when they do so are relatively easy to determine with the analytical tools we provide.
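A minimal binary RAMnet, for concreteness: each of T tuples samples n input bits, the sampled bits form an address into a per-class RAM, training sets the addressed cells, and classification counts address hits per class. The bit patterns below are random stand-ins.

```python
import numpy as np

# Minimal binary RAMnet: T random tuples of n input bits address per-class
# RAM cells; training marks the addressed cells, classification counts how
# many tuples have previously seen the test pattern's address for each class.
rng = np.random.default_rng(10)
n_bits, n, T, n_classes = 64, 8, 20, 2
tuples = [rng.choice(n_bits, n, replace=False) for _ in range(T)]

def addresses(pattern):
    return [int("".join(map(str, pattern[t])), 2) for t in tuples]

rams = [[set() for _ in range(T)] for _ in range(n_classes)]

def train(pattern, label):
    for i, a in enumerate(addresses(pattern)):
        rams[label][i].add(a)

def classify(pattern):
    scores = [sum(a in rams[c][i] for i, a in enumerate(addresses(pattern)))
              for c in range(n_classes)]
    return int(np.argmax(scores))

prototypes = [rng.integers(0, 2, n_bits) for _ in range(n_classes)]
for c, proto in enumerate(prototypes):
    for _ in range(30):                      # noisy copies of each prototype
        noisy = proto.copy()
        noisy[rng.choice(n_bits, 4, replace=False)] ^= 1
        train(noisy, c)

test = prototypes[1].copy()
test[rng.choice(n_bits, 4, replace=False)] ^= 1
print("predicted class:", classify(test))
```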
Abstract:
Urban regions present some of the most challenging areas for the remote sensing community. Many different types of land cover have similar spectral responses, making them difficult to distinguish from one another. Traditional per-pixel classification techniques suffer particularly badly because they use only these spectral properties to determine a class, and no other properties of the image, such as context. This project presents the results of the classification of a densely urban area of Dudley, West Midlands, using four methods: Supervised Maximum Likelihood, SMAP, ECHO and Unsupervised Maximum Likelihood. An accuracy assessment method is then developed to allow a fair representation of each procedure and a direct comparison between them. Subsequently, a classification procedure is developed that makes use of the context in the image, through a per-polygon classification. The imagery is broken up into a series of polygons extracted with the Marr-Hildreth zero-crossing edge detector. These polygons are then refined using a region-growing algorithm, and classified according to the mean class of the fine polygons. The imagery produced by this technique is shown to be of better quality and of a higher accuracy than that of other conventional methods. Further refinements are suggested and examined to improve the aesthetic appearance of the imagery. Finally, a comparison with the results produced from a previous study of the James Bridge catchment, in Darlaston, West Midlands, is made, showing that the polygon-classified ATM imagery performs significantly better than the Maximum Likelihood classified videography used in the initial study, despite the presence of geometric correction errors.
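The per-polygon step can be sketched as follows: detect Marr-Hildreth edges as zero crossings of a Laplacian-of-Gaussian, treat edge-free connected regions as polygons, and relabel each pixel with the majority class of its polygon. The image and per-pixel classes below are synthetic stand-ins, and the region-growing refinement described above is omitted.

```python
import numpy as np
from scipy import ndimage

# Per-polygon classification sketch: Marr-Hildreth edges are zero crossings
# of a Laplacian-of-Gaussian; edge-free connected regions act as polygons,
# and each pixel takes the majority per-pixel class of its polygon.
rng = np.random.default_rng(11)
img = np.zeros((64, 64))
img[16:48, 16:48] = 1.0                       # a bright "land-cover" block
img += rng.normal(0, 0.02, img.shape)

log = ndimage.gaussian_laplace(img, sigma=2.0)
zc = np.zeros(img.shape, dtype=bool)          # zero crossings of the LoG
zc[:-1, :] |= np.signbit(log[:-1, :]) != np.signbit(log[1:, :])
zc[:, :-1] |= np.signbit(log[:, :-1]) != np.signbit(log[:, 1:])
edges = zc & (ndimage.maximum_filter(np.abs(log), 3) > 0.01)  # drop weak ones

polygons, n_poly = ndimage.label(~edges)      # edge-free regions = polygons
per_pixel = (img > 0.5).astype(int)           # stand-in per-pixel classifier

smoothed = np.zeros_like(per_pixel)           # edge pixels keep label 0
for p in range(1, n_poly + 1):
    mask = polygons == p
    smoothed[mask] = np.bincount(per_pixel[mask]).argmax()  # majority class
print("polygons found:", n_poly)
```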
Abstract:
2002 Mathematics Subject Classification: 62F35, 62F15.