983 results for trimmed likelihood estimation
Abstract:
It is system dynamics that determines the function of cells, tissues, and organisms. Developing mathematical models and estimating their parameters are essential for studying the dynamic behavior of biological systems, including metabolic networks, genetic regulatory networks, and signal transduction pathways, under perturbation by external stimuli. In general, biological dynamic systems are only partially observed, so a natural way to model them is with nonlinear state-space equations. Although statistical methods for parameter estimation of linear models of biological dynamic systems have been developed intensively in recent years, the joint estimation of states and parameters of nonlinear dynamic systems remains a challenging task. In this report, we apply the extended Kalman filter (EKF) to the estimation of both states and parameters of nonlinear state-space models. To evaluate its performance for parameter estimation, we apply the EKF to a simulated dataset and to two real datasets: the JAK-STAT and the Ras/Raf/MEK/ERK signal transduction pathways. The preliminary results show that the EKF can accurately estimate the parameters and predict the states of nonlinear state-space equations modeling dynamic biochemical networks.
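The standard way to handle joint state and parameter estimation with the EKF is to augment the state vector with the unknown parameters and filter the augmented nonlinear system. Below is a minimal sketch of that idea for a hypothetical one-species decay model x' = -k*x standing in for a biochemical reaction; the model, noise levels, and initial guesses are illustrative and not taken from the report.

```python
# Minimal sketch (not the report's code): joint state/parameter estimation with an
# extended Kalman filter on an augmented state. A hypothetical one-species decay
# model x' = -k*x stands in for a biochemical network; k is the unknown rate.
import numpy as np

dt, T = 0.1, 200
k_true = 0.5
rng = np.random.default_rng(0)

# Simulate noisy observations of x (assumed observation noise level 0.05).
x = 5.0
ys = []
for _ in range(T):
    x = x + dt * (-k_true * x)
    ys.append(x + rng.normal(scale=0.05))

# Augmented state z = [x, k]; k is modeled as constant (random walk with tiny noise).
z = np.array([ys[0], 0.1])            # crude initial guesses
P = np.diag([1.0, 1.0])
Q = np.diag([1e-4, 1e-5])             # process noise (illustrative values)
R = np.array([[0.05 ** 2]])           # observation noise variance
H = np.array([[1.0, 0.0]])            # we observe x only

for y in ys:
    # Predict: Euler step of the augmented dynamics and its Jacobian F.
    x_pred = z[0] + dt * (-z[1] * z[0])
    k_pred = z[1]
    F = np.array([[1.0 - dt * z[1], -dt * z[0]],
                  [0.0,              1.0]])
    z = np.array([x_pred, k_pred])
    P = F @ P @ F.T + Q
    # Update with the scalar observation y.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    z = z + (K @ (np.array([[y]]) - H @ z.reshape(-1, 1))).ravel()
    P = (np.eye(2) - K @ H) @ P

print("estimate of k (true value 0.5):", z[1])
```

Giving the parameter a small artificial process noise keeps its error covariance from collapsing, so the filter can keep refining the rate estimate as new observations arrive.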
Abstract:
Attractive business cases in various application fields sustain the long-term interest of the research community in indoor localization and tracking. Location tracking is generally treated as a dynamic state estimation problem consisting of two steps: (i) location estimation through measurement, and (ii) location prediction. For the estimation step, one of the most efficient and low-cost solutions is received signal strength (RSS)-based ranging. However, several challenges remain to be addressed: unrealistic propagation models, non-line-of-sight (NLOS) conditions, and multipath propagation. Particle filters are a popular choice for dealing with the inherent nonlinearities in both the location measurements and the motion dynamics. While such filters have been applied successfully to accurate, time-based ranging measurements, dealing with the more error-prone RSS-based ranging is still challenging. In this work, we address these issues with a novel weighted-likelihood bootstrap particle filter for tracking via RSS-based ranging. Our filter weights the individual likelihoods from the different anchor nodes exponentially, according to the ranging estimation. We also employ an improved propagation model, proposed in our recent work, for more accurate RSS-based ranging. We implemented and tested our algorithm in a passive localization system with IEEE 802.15.4 signals, showing that the proposed solution largely outperforms a traditional bootstrap particle filter.
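A bootstrap particle filter of the kind described weights each particle by a product of per-anchor range likelihoods, each raised to an exponent reflecting the quality of that anchor's ranging. The sketch below illustrates one filter step under assumed values: the anchor layout, motion model, ranging noise, and the exponents themselves are all hypothetical, whereas the paper derives the exponents from the ranging estimation.

```python
# Minimal sketch (assumed 2-D random-walk motion, Gaussian ranging error,
# illustrative anchor layout and exponents; not the authors' implementation).
import numpy as np

rng = np.random.default_rng(1)
anchors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
sigma_rss = 1.5                            # assumed std of RSS-derived range error (m)
alpha = np.array([1.0, 0.8, 0.8, 0.6])     # hypothetical per-anchor exponents

N = 2000
particles = rng.uniform(0, 10, size=(N, 2))
weights = np.full(N, 1.0 / N)

def step(particles, weights, ranges):
    # Propagate: random-walk motion model.
    particles = particles + rng.normal(scale=0.3, size=particles.shape)
    # Weight: product of exponentially weighted per-anchor range likelihoods.
    d = np.linalg.norm(particles[:, None, :] - anchors[None, :, :], axis=2)
    loglik = -0.5 * ((d - ranges) / sigma_rss) ** 2          # shape (N, n_anchors)
    logw = (alpha * loglik).sum(axis=1)
    weights = weights * np.exp(logw - logw.max())            # stabilized update
    weights /= weights.sum()
    # Systematic resampling when the effective sample size drops below N/2.
    if 1.0 / np.sum(weights ** 2) < N / 2:
        cs = np.cumsum(weights)
        u = (rng.random() + np.arange(N)) / N
        idx = np.minimum(np.searchsorted(cs, u), N - 1)
        particles, weights = particles[idx], np.full(N, 1.0 / N)
    return particles, weights

true_pos = np.array([3.0, 7.0])
ranges = np.linalg.norm(anchors - true_pos, axis=1) + rng.normal(scale=sigma_rss, size=4)
particles, weights = step(particles, weights, ranges)
print("position estimate:", weights @ particles)
```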
Abstract:
The three articles that comprise this dissertation describe how small area estimation and geographic information systems (GIS) technologies can be integrated to provide useful information about the number of uninsured people and where they are located. Comprehensive data about the numbers and characteristics of the uninsured are typically available only from surveys. Utilization and administrative data are poor proxies from which to develop this information: those who cannot access services are unlikely to be fully captured, either by health care provider utilization data or by state and local administrative data. In the absence of direct measures, a well-developed estimate of the local uninsured count or rate can prove valuable when assessing the unmet health service needs of this population. However, the fact that these are “estimates” increases the chances that the results will be rejected or, at best, treated with suspicion. The visual impact and spatial analysis capabilities afforded by GIS technology can strengthen the likelihood that area estimates will be accepted by those most likely to benefit from the information, including health planners and policy makers. The first article describes how uninsured estimates are currently being performed in the Houston metropolitan region. It details the synthetic model used to calculate numbers and percentages of uninsured, and how the resulting estimates are integrated into a GIS. The second article compares the estimation method of the first article with one currently used by the Texas State Data Center to estimate the number of uninsured for all Texas counties. Estimates are developed for census tracts in Harris County using both models with the same data sets, and the results are statistically compared. The third article describes a new, revised synthetic method being tested to provide uninsured estimates at sub-county levels for eight counties in the Houston metropolitan area. It is designed to replicate the categorical results provided by a current U.S. Census Bureau estimation method. The estimates calculated by this revised model are compared to the most recent U.S. Census Bureau estimates for the same areas and population categories.
Abstract:
Computing the modal parameters of large structures in operational modal analysis often requires processing data from multiple, non-simultaneously recorded sensor setups. These setups share some sensors, the so-called reference sensors, which remain fixed across all measurements, while the other sensors are moved from one setup to the next. One possibility is to process the setups separately, which results in different modal parameter estimates for each setup; the reference sensors are then used to merge or glue the different parts of the mode shapes into global modes, while the natural frequencies and damping ratios are usually averaged. In this paper we present a state-space model that processes all setups at once, so that the global mode shapes are obtained automatically and a single value of the natural frequency and damping ratio is computed for each mode. We also show how this model can be estimated by maximum likelihood using the Expectation-Maximization algorithm. We apply the technique to real data measured on a footbridge.
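Schematically, and under the usual linear time-invariant assumptions of operational modal analysis (the paper's exact parameterization may differ), a multi-setup model of this kind shares the state dynamics and the reference-sensor output equation across all setups:

\[
\begin{aligned}
x_{k+1} &= A\,x_k + w_k,\\
\begin{pmatrix} y_k^{\mathrm{ref}} \\ y_k^{(j)} \end{pmatrix} &=
\begin{pmatrix} C^{\mathrm{ref}} \\ C^{(j)} \end{pmatrix} x_k + v_k,
\qquad j = 1,\dots,N_s,
\end{aligned}
\]

Because A and C^ref are common to every setup j, a single maximum likelihood fit via EM yields mode shapes that are already expressed on one global scale, so no separate merging step is needed.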
Abstract:
Whole-brain resting-state connectivity is a promising biomarker that might help obtain an early diagnosis in many neurological diseases, such as dementia. Inferring resting-state connectivity is often based on correlations, which are sensitive to indirect connections and therefore give an inaccurate representation of the real backbone of the network. The precision matrix is a better representation of whole-brain connectivity, as it considers only direct connections. The network structure can be estimated with the graphical lasso (GL), which achieves sparsity through l1-regularization of the precision matrix. In this paper, we propose a structural-connectivity-adaptive version of the GL, in which weaker anatomical connections are represented as stronger penalties on the corresponding functional connections. We applied beamformer source reconstruction to the resting-state MEG recordings of 81 subjects, of whom 29 were healthy controls, 22 had single-domain amnestic mild cognitive impairment (MCI), and 30 had multiple-domain amnestic MCI. An atlas-based anatomical parcellation into 66 regions was obtained for each subject, and time series were assigned to each region. The fiber densities between the regions, obtained with deterministic tractography from diffusion-weighted MRI, were used to define the anatomical connectivity. Precision matrices were obtained from the region-specific time series in five frequency bands. We compared our method with the traditional GL and with a functionally adaptive version of the GL, in terms of log-likelihood and classification accuracy between the three groups. We conclude that introducing an anatomical prior improves the expressivity of the model and, in most cases, leads to better classification between groups.
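In its usual adaptive form (the abstract does not give the exact mapping from fiber density to penalty, so the weighting below is only indicative), the estimator maximizes an l1-penalized Gaussian log-likelihood with edge-specific penalties:

\[
\hat{\Theta} \;=\; \arg\max_{\Theta \succ 0}\;\; \log\det\Theta \;-\; \operatorname{tr}(S\,\Theta) \;-\; \sum_{i \neq j} \lambda_{ij}\,\lvert\Theta_{ij}\rvert,
\]

where S is the sample covariance of the source time series and λ_ij is chosen to be large when the fiber density between regions i and j is small, so that anatomically weak connections are penalized more heavily. Setting all λ_ij to a common value recovers the traditional GL.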
Abstract:
We propose a general procedure for solving incomplete-data estimation problems. The procedure can be used to find the maximum likelihood estimate or to solve estimating equations in difficult cases such as estimation with the censored or truncated regression model, the nonlinear structural measurement error model, and the random effects model. The procedure is based on the general principle of stochastic approximation and on Markov chain Monte Carlo. Applying the theory of adaptive algorithms, we derive conditions under which the proposed procedure converges. Simulation studies also indicate that the proposed procedure consistently converges to the maximum likelihood estimate for the structural measurement error logistic regression model.
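A generic recursion of this type (a schematic form; the paper's exact update and estimating function may differ) alternates an MCMC draw of the missing data with a Robbins-Monro step on the parameter:

\[
z_k \sim \Pi_{\theta_k}\!\left(z_{k-1},\,\cdot\,\right),
\qquad
\theta_{k+1} \;=\; \theta_k \;+\; \gamma_k\, s\!\left(\theta_k;\, y_{\mathrm{obs}},\, z_k\right),
\]

where Π_θ is a Markov kernel targeting the conditional distribution p(z | y_obs, θ), s is the complete-data score (or, more generally, the estimating function) evaluated at the imputed data, and the step sizes satisfy Σ_k γ_k = ∞ and Σ_k γ_k² < ∞, the usual conditions under which such stochastic approximation schemes converge.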
Abstract:
This paper deals with the estimation of a time-invariant channel spectrum from its own nonuniform samples, assuming there is a bound on the channel’s delay spread. Except for this last assumption, this is the basic estimation problem in systems that provide channel spectral samples. As shown in the paper, however, the delay-spread bound leads us to view the spectrum as a band-limited signal, rather than as the Fourier transform of a tapped delay line (TDL). Using this alternative model, a linear estimator is presented that approximately minimizes the expected root-mean-square (RMS) error for a deterministic channel. Its main advantage over the TDL is that it takes into account the smoothness (time width) of the spectrum, thus providing a performance improvement. The proposed estimator is compared numerically with the maximum likelihood (ML) estimator based on a TDL model in pilot-assisted channel estimation (PACE) for OFDM.
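For reference, the TDL-based ML benchmark reduces, under white Gaussian noise, to a least-squares fit of the tap vector to the pilot observations, after which the full spectrum is reconstructed from the estimated taps. The numpy sketch below uses illustrative sizes and a uniform pilot grid; none of the parameter values are taken from the paper.

```python
# Minimal numpy sketch (illustrative sizes and names) of the TDL-based ML baseline:
# with white Gaussian noise, ML over the tap vector is a least-squares problem on
# the pilot subcarriers; the full channel spectrum follows from the estimated taps.
import numpy as np

rng = np.random.default_rng(2)
N = 256                      # OFDM subcarriers (assumed)
L = 16                       # assumed number of taps (delay-spread bound)
pilots = np.arange(0, N, 8)  # pilot positions (nonuniform positions also work)

# Hypothetical true channel: L complex taps with a decaying power profile.
h = (rng.normal(size=L) + 1j * rng.normal(size=L)) * np.exp(-0.2 * np.arange(L))
F = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(L)) / N)   # N x L DFT matrix
H_true = F @ h

# Pilot observations: known unit-modulus pilot symbols already removed, AWGN added.
noise = 0.05 * (rng.normal(size=pilots.size) + 1j * rng.normal(size=pilots.size))
Y = H_true[pilots] + noise

# ML (least-squares) tap estimate from the pilot rows of F, then the full spectrum.
A = F[pilots, :]
h_hat, *_ = np.linalg.lstsq(A, Y, rcond=None)
H_hat = F @ h_hat
print("RMS spectrum error:", np.sqrt(np.mean(np.abs(H_hat - H_true) ** 2)))
```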
Abstract:
Transportation Department, Office of Systems Engineering, Washington, D.C.
Abstract:
Background: Sentinel node biopsy (SNB) is being increasingly used but its place outside randomized trials has not yet been established. Methods: The first 114 sentinel node (SN) biopsies performed for breast cancer at the Princess Alexandra Hospital from March 1999 to June 2001 are presented. In 111 cases axillary dissection was also performed, allowing the accuracy of the technique to be assessed. A standard combination of preoperative lymphoscintigraphy, intraoperative gamma probe and injection of blue dye was used in most cases. Results are discussed in relation to the risk and potential consequences of understaging. Results: Where both probe and dye were used, the SN was identified in 90% of patients. A significant number of patients were treated in two stages and the technique was no less effective in patients who had SNB performed at a second operation after the primary tumour had already been removed. The interval from radioisotope injection to operation was very wide (between 2 and 22 h) and did not affect the outcome. Nodal metastases were present in 42 patients in whom an SN was found, and in 40 of these the SN was positive, giving a false negative rate of 4.8% (2/42), with the overall percentage of patients understaged being 2%. For this particular group as a whole, the increased risk of death due to systemic therapy being withheld as a consequence of understaging (if SNB alone had been employed) is estimated at less than 1/500. The risk for individuals will vary depending on other features of the particular primary tumour. Conclusion: For patients who elect to have the axilla staged using SNB alone, the risk and consequences of understaging need to be discussed. These risks can be estimated by allowing for the specific surgeon's false negative rate for the technique, and considering the likelihood of nodal metastases for a given tumour. There appears to be no disadvantage with performing SNB at a second operation after the primary tumour has already been removed. Clearly, for a large number of patients, SNB alone will be safe, but ideally participation in randomized trials should continue to be encouraged.
Abstract:
We present a novel method, called the transform likelihood ratio (TLR) method, for estimating rare-event probabilities with heavy-tailed distributions. Via a simple transformation (change of variables), the TLR method reduces the original rare-event probability estimation with heavy-tailed distributions to an equivalent one with light-tailed distributions. Once this transformation has been established, we estimate the rare-event probability via importance sampling, using either the classical exponential change of measure or the standard likelihood ratio change of measure. In the latter case the importance sampling distribution is chosen from the same parametric family as the transformed distribution. We estimate the optimal parameter vector of the importance sampling distribution using the cross-entropy method. We prove the polynomial complexity of the TLR method for certain heavy-tailed models and demonstrate numerically its high efficiency for various heavy-tailed models previously thought to be intractable. We also show that the TLR method can be viewed as a universal tool in the sense that it not only provides a unified view of heavy-tailed simulation but can also be used efficiently in simulation with light-tailed distributions. We present extensive simulation results that support the efficiency of the TLR method.
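The essence of the transformation can be seen in a one-dimensional toy case (assumed here for illustration; the paper treats general heavy-tailed models): a Pareto variable with tail P(X > x) = x^(-alpha) can be written as X = exp(Z/alpha) with Z standard exponential, so the heavy-tailed event {X > gamma} becomes the light-tailed event {Z > alpha*log(gamma)}, which is then estimated by importance sampling with a cross-entropy update of the exponential rate.

```python
# Minimal sketch of the transform-likelihood-ratio idea for one Pareto variable
# (assumed toy setup, not the paper's examples). The transformed rare event is
# handled by exponential importance sampling whose rate is tuned with a simple
# multilevel cross-entropy update.
import numpy as np

rng = np.random.default_rng(3)
alpha, gamma_lvl = 1.5, 1e6
a = alpha * np.log(gamma_lvl)          # transformed threshold: estimate P(Z > a)
true_p = gamma_lvl ** (-alpha)         # exact answer for comparison

theta = 1.0                            # IS rate for Z ~ Exp(theta); start at nominal
for _ in range(10):
    z = rng.exponential(1.0 / theta, size=10_000)
    a_t = min(np.quantile(z, 0.99), a)          # adaptive intermediate level
    w = np.exp(-z * (1.0 - theta)) / theta      # likelihood ratio f_1(z) / f_theta(z)
    hit = z >= a_t
    theta = (w * hit).sum() / (w * hit * z).sum()   # CE update for the Exp rate
    if a_t >= a:
        break

z = rng.exponential(1.0 / theta, size=100_000)
w = np.exp(-z * (1.0 - theta)) / theta
est = np.mean(w * (z > a))
print(f"estimate {est:.3e}  vs  exact {true_p:.3e}")
```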
Abstract:
Genetic assignment methods use genotype likelihoods to draw inference about where individuals were or were not born, potentially allowing direct, real-time estimates of dispersal. We used simulated data sets to test the power and accuracy of Monte Carlo resampling methods in generating statistical thresholds for identifying F0 immigrants in populations with ongoing gene flow, and hence for providing direct, real-time estimates of migration rates. The identification of accurate critical values required that resampling methods preserve the linkage disequilibrium deriving from recent generations of immigrants and reflect the sampling variance present in the data set being analysed. A novel Monte Carlo resampling method taking these aspects into account was proposed and its efficiency evaluated. Power and error were relatively insensitive to the frequency assumed for missing alleles. Power to identify F0 immigrants was improved by using large sample sizes (up to about 50 individuals) and by sampling all populations from which the migrants may have originated. A combination of plotting genotype likelihoods and calculating mean genotype likelihood ratios (D_LR) appeared to be an effective way to predict whether F0 immigrants could be identified for a particular pair of populations using a given set of markers.
Abstract:
In this paper we investigate a Bayesian procedure for the estimation of a flexible generalised distribution, notably the MacGillivray adaptation of the g-and-k distribution. This distribution, described through its inverse cdf or quantile function, generalises the standard normal through extra parameters that together describe skewness and kurtosis. The standard quantile-based methods for estimating the parameters of generalised distributions are often arbitrary and do not rely on computation of the likelihood. MCMC, however, provides a simulation-based alternative for obtaining the maximum likelihood estimates of the parameters of these distributions, or for deriving posterior estimates of the parameters through a Bayesian framework. In this paper we adopt the latter approach. The proposed methodology is illustrated through an application in which the parameter of interest is slightly skewed.
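Since the g-and-k distribution is defined through its quantile function, simulation from it is straightforward, while likelihood evaluation requires numerically inverting that function. A minimal sketch of the standard quantile function (with the conventional constant c = 0.8; the parameter values in the example are arbitrary) is:

```python
# Minimal sketch of the standard g-and-k quantile function; parameter values are
# arbitrary and only illustrate sampling by the inverse-cdf method.
import numpy as np
from scipy.stats import norm

def gk_quantile(u, A, B, g, k, c=0.8):
    """Quantile function Q(u) of the g-and-k distribution."""
    z = norm.ppf(u)
    skew = 1.0 + c * (1.0 - np.exp(-g * z)) / (1.0 + np.exp(-g * z))
    return A + B * skew * (1.0 + z ** 2) ** k * z

# Sampling is trivial: push uniform draws through Q.
rng = np.random.default_rng(4)
sample = gk_quantile(rng.uniform(size=5), A=0.0, B=1.0, g=0.5, k=0.1)
print(sample)
```

An MCMC sampler for the posterior, as adopted in the paper, would evaluate the likelihood of each proposed parameter vector by numerically inverting Q and applying the change-of-variables formula for the density.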
Abstract:
The study of continuously varying, quantitative traits is important in evolutionary biology, agriculture, and medicine. Variation in such traits is attributable to many, possibly interacting, genes whose expression may be sensitive to the environment, which makes their dissection into underlying causative factors difficult. An important population parameter for quantitative traits is heritability, the proportion of total variance that is due to genetic factors. Response to artificial and natural selection and the degree of resemblance between relatives are all functions of this parameter. Following the classic paper by R. A. Fisher in 1918, the estimation of additive and dominance genetic variance and heritability in populations has been based on the expected proportion of genes shared between different types of relatives, together with explicit, often controversial and untestable, models of the genetic and non-genetic causes of family resemblance. With genome-wide coverage of genetic markers it is now possible to estimate such parameters solely within families, using the actual degree of identity-by-descent sharing between relatives. Using genome scans on 4,401 quasi-independent sib pairs, of which 3,375 pairs had phenotypes, we estimated the heritability of height from empirical genome-wide identity-by-descent sharing, which varied from 0.374 to 0.617 (mean 0.498, standard deviation 0.036). The variance in identity-by-descent sharing per chromosome and per genome was consistent with theory. The maximum likelihood estimate of the heritability of height was 0.80, with no evidence for non-genetic causes of sib resemblance, consistent with results from independent twin and family studies but based on an entirely separate source of information. Our application shows that it is feasible to estimate genetic variance solely from within-family segregation and provides an independent validation of previously untestable assumptions. Given sufficient data, our new paradigm will allow the estimation of genetic variation for disease susceptibility and quantitative traits that is free from confounding with non-genetic factors, and will allow partitioning of genetic variation into additive and non-additive components.
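The core of the within-family approach can be sketched with simulated data (an additive-only toy model with illustrative numbers; the full analysis also allows for shared-environment and non-additive components): each sib pair i contributes a bivariate normal likelihood whose phenotypic correlation is the realized IBD sharing pi_i times h^2, and h^2 is estimated by maximizing the sum of these log-likelihoods.

```python
# Minimal sketch (simulated data, additive-only model): ML estimation of h^2 from
# sib pairs whose phenotypic correlation is pi_i * h^2, with pi_i the realized
# genome-wide IBD sharing of pair i.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(5)
n_pairs = 3000
pi = np.clip(rng.normal(0.5, 0.04, n_pairs), 0.2, 0.8)   # realized IBD sharing
h2_true = 0.8

# Simulate standardized sib phenotypes with correlation pi_i * h2_true.
r = pi * h2_true
z1 = rng.normal(size=n_pairs)
z2 = r * z1 + np.sqrt(1 - r ** 2) * rng.normal(size=n_pairs)

def neg_loglik(h2):
    # Bivariate standard normal log-likelihood (constants dropped).
    rho = pi * h2
    quad = (z1 ** 2 - 2 * rho * z1 * z2 + z2 ** 2) / (1 - rho ** 2)
    return np.sum(0.5 * np.log(1 - rho ** 2) + 0.5 * quad)

fit = minimize_scalar(neg_loglik, bounds=(0.0, 1.0), method="bounded")
print("ML estimate of h^2:", round(fit.x, 3))
```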
Abstract:
We have developed an alignment-free method that calculates phylogenetic distances using a maximum-likelihood approach for a model of sequence change on patterns that are discovered in unaligned sequences. To evaluate the phylogenetic accuracy of our method, and to conduct a comprehensive comparison of existing alignment-free methods (freely available as the Python package decaf+py at http://www.bioinformatics.org.au), we have created a data set of reference trees covering a wide range of phylogenetic distances. Amino acid sequences were evolved along the trees and given as input to the tested methods; from their calculated distances we inferred trees whose topologies we compared to the reference trees. We find our pattern-based method statistically superior to all other tested alignment-free methods. We also demonstrate the general advantage of alignment-free methods over an approach based on automated alignments when sequences violate the assumption of collinearity. Similarly, we compare methods on empirical data from an existing alignment benchmark set that we used to derive reference distances and trees. Our pattern-based approach yields distances that show a linear relationship to reference distances over a substantially longer range than other alignment-free methods. The pattern-based approach outperforms alignment-free methods and its phylogenetic accuracy is statistically indistinguishable from alignment-based distances.
Abstract:
Most traditional methods for extracting the relationship between two time series are based on cross-correlation. In a nonlinear, non-stationary environment, these techniques are not sufficient. We show in this paper how to use hidden Markov models (HMMs) to identify the lag (or delay) between different variables in such data. Adopting an information-theoretic approach, we develop a procedure for training HMMs to maximise the mutual information (MMI) between delayed time series. The method is used to model the oil drilling process. We show that cross-correlation gives no information and that the MMI approach outperforms maximum likelihood.
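A much-simplified, non-HMM version of the information-theoretic idea is to scan candidate lags and keep the one maximizing a histogram estimate of the mutual information between one series and the delayed other; the paper instead trains HMMs so that this quantity is maximized, but the toy scan below (with invented data and a known lag) shows why mutual information can recover nonlinear dependencies that cross-correlation misses.

```python
# Minimal sketch (toy data, not the paper's MMI-HMM procedure): estimate the lag
# between two series by scanning delays and maximizing a histogram-based mutual
# information estimate. The dependence is non-monotone (y depends on x squared),
# so cross-correlation is near zero at every lag while MI peaks at the true lag.
import numpy as np

def mutual_info(x, y, bins=16):
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(6)
x = rng.normal(size=5000)
y = np.roll(x ** 2, 7) + 0.1 * rng.normal(size=5000)   # true lag of 7 samples

lags = list(range(1, 30))
mi = [mutual_info(x[:-lag], y[lag:]) for lag in lags]
print("estimated lag:", lags[int(np.argmax(mi))])
```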