941 resultados para Maximum likelihood – Expectation maximization (ML-EM)
Resumo:
In this work, the relationship between diameter at breast height (d) and total height (h) of individual-tree was modeled with the aim to establish provisory height-diameter (h-d) equations for maritime pine (Pinus pinaster Ait.) stands in the Lomba ZIF, Northeast Portugal. Using data collected locally, several local and generalized h-d equations from the literature were tested and adaptations were also considered. Model fitting was conducted by using usual nonlinear least squares (nls) methods. The best local and generalized models selected, were also tested as mixed models applying a first-order conditional expectation (FOCE) approximation procedure and maximum likelihood methods to estimate fixed and random effects. For the calibration of the mixed models and in order to be consistent with the fitting procedure, the FOCE method was also used to test different sampling designs. The results showed that the local h-d equations with two parameters performed better than the analogous models with three parameters. However a unique set of parameter values for the local model can not be used to all maritime pine stands in Lomba ZIF and thus, a generalized model including covariates from the stand, in addition to d, was necessary to obtain an adequate predictive performance. No evident superiority of the generalized mixed model in comparison to the generalized model with nonlinear least squares parameters estimates was observed. On the other hand, in the case of the local model, the predictive performance greatly improved when random effects were included. The results showed that the mixed model based in the local h-d equation selected is a viable alternative for estimating h if variables from the stand are not available. Moreover, it was observed that it is possible to obtain an adequate calibrated response using only 2 to 5 additional h-d measurements in quantile (or random) trees from the distribution of d in the plot (stand). Balancing sampling effort, accuracy and straightforwardness in practical applications, the generalized model from nls fit is recommended. Examples of applications of the selected generalized equation to the forest management are presented, namely how to use it to complete missing information from forest inventory and also showing how such an equation can be incorporated in a stand-level decision support system that aims to optimize the forest management for the maximization of wood volume production in Lomba ZIF maritime pine stands.
Resumo:
In this paper an alternative approach to the one in Henze (1986) is proposed for deriving the odd moments of the skew-normal distribution considered in Azzalini (1985). The approach is based on a Pascal type triangle, which seems to greatly simplify moments computation. Moreover, it is shown that the likelihood equation for estimating the asymmetry parameter in such model is generated as orthogonal functions to the sample vector. As a consequence, conditions for a unique solution of the likelihood equation are established, which seem to hold in more general setting.
Resumo:
A significant problem in the collection of responses to potentially sensitive questions, such as relating to illegal, immoral or embarrassing activities, is non-sampling error due to refusal to respond or false responses. Eichhorn & Hayre (1983) suggested the use of scrambled responses to reduce this form of bias. This paper considers a linear regression model in which the dependent variable is unobserved but for which the sum or product with a scrambling random variable of known distribution, is known. The performance of two likelihood-based estimators is investigated, namely of a Bayesian estimator achieved through a Markov chain Monte Carlo (MCMC) sampling scheme, and a classical maximum-likelihood estimator. These two estimators and an estimator suggested by Singh, Joarder & King (1996) are compared. Monte Carlo results show that the Bayesian estimator outperforms the classical estimators in almost all cases, and the relative performance of the Bayesian estimator improves as the responses become more scrambled.
Resumo:
A mixture model for long-term survivors has been adopted in various fields such as biostatistics and criminology where some individuals may never experience the type of failure under study. It is directly applicable in situations where the only information available from follow-up on individuals who will never experience this type of failure is in the form of censored observations. In this paper, we consider a modification to the model so that it still applies in the case where during the follow-up period it becomes known that an individual will never experience failure from the cause of interest. Unless a model allows for this additional information, a consistent survival analysis will not be obtained. A partial maximum likelihood (ML) approach is proposed that preserves the simplicity of the long-term survival mixture model and provides consistent estimators of the quantities of interest. Some simulation experiments are performed to assess the efficiency of the partial ML approach relative to the full ML approach for survival in the presence of competing risks.
Resumo:
This article deals with the efficiency of fractional integration parameter estimators. This study was based on Monte Carlo experiments involving simulated stochastic processes with integration orders in the range]-1,1[. The evaluated estimation methods were classified into two groups: heuristics and semiparametric/maximum likelihood (ML). The study revealed that the comparative efficiency of the estimators, measured by the lesser mean squared error, depends on the stationary/non-stationary and persistency/anti-persistency conditions of the series. The ML estimator was shown to be superior for stationary persistent processes; the wavelet spectrum-based estimators were better for non-stationary mean reversible and invertible anti-persistent processes; the weighted periodogram-based estimator was shown to be superior for non-invertible anti-persistent processes.
Resumo:
HE PROBIT MODEL IS A POPULAR DEVICE for explaining binary choice decisions in econometrics. It has been used to describe choices such as labor force participation, travel mode, home ownership, and type of education. These and many more examples can be found in papers by Amemiya (1981) and Maddala (1983). Given the contribution of economics towards explaining such choices, and given the nature of data that are collected, prior information on the relationship between a choice probability and several explanatory variables frequently exists. Bayesian inference is a convenient vehicle for including such prior information. Given the increasing popularity of Bayesian inference it is useful to ask whether inferences from a probit model are sensitive to a choice between Bayesian and sampling theory techniques. Of interest is the sensitivity of inference on coefficients, probabilities, and elasticities. We consider these issues in a model designed to explain choice between fixed and variable interest rate mortgages. Two Bayesian priors are employed: a uniform prior on the coefficients, designed to be noninformative for the coefficients, and an inequality restricted prior on the signs of the coefficients. We often know, a priori, whether increasing the value of a particular explanatory variable will have a positive or negative effect on a choice probability. This knowledge can be captured by using a prior probability density function (pdf) that is truncated to be positive or negative. Thus, three sets of results are compared:those from maximum likelihood (ML) estimation, those from Bayesian estimation with an unrestricted uniform prior on the coefficients, and those from Bayesian estimation with a uniform prior truncated to accommodate inequality restrictions on the coefficients.
Resumo:
Objectives: To compare the population modelling programs NONMEM and P-PHARM during investigation of the pharmacokinetics of tacrolimus in paediatric liver-transplant recipients. Methods: Population pharmacokinetic analysis was performed using NONMEM and P-PHARM on retrospective data from 35 paediatric liver-transplant patients receiving tacrolimus therapy. The same data were presented to both programs. Maximum likelihood estimates were sought for apparent clearance (CL/F) and apparent volume of distribution (V/F). Covariates screened for influence on these parameters were weight, age, gender, post-operative day, days of tacrolimus therapy, transplant type, biliary reconstructive procedure, liver function tests, creatinine clearance, haematocrit, corticosteroid dose, and potential interacting drugs. Results: A satisfactory model was developed in both programs with a single categorical covariate - transplant type - providing stable parameter estimates and small, normally distributed (weighted) residuals. In NONMEM, the continuous covariates - age and liver function tests - improved modelling further. Mean parameter estimates were CL/F (whole liver) = 16.3 1/h, CL/F (cut-down liver) = 8.5 1/h and V/F = 565 1 in NONMEM, and CL/F = 8.3 1/h and V/F = 155 1 in P-PHARM. Individual Bayesian parameter estimates were CL/F (whole liver) = 17.9 +/- 8.8 1/h, CL/F (cutdown liver) = 11.6 +/- 18.8 1/h and V/F = 712 792 1 in NONMEM, and CL/F (whole liver) = 12.8 +/- 3.5 1/h, CL/F (cut-down liver) = 8.2 +/- 3.4 1/h and V/F = 221 1641 in P-PHARM. Marked interindividual kinetic variability (38-108%) and residual random error (approximately 3 ng/ml) were observed. P-PHARM was more user friendly and readily provided informative graphical presentation of results. NONMEM allowed a wider choice of errors for statistical modelling and coped better with complex covariate data sets. Conclusion: Results from parametric modelling programs can vary due to different algorithms employed to estimate parameters, alternative methods of covariate analysis and variations and limitations in the software itself.
Resumo:
Research on cluster analysis for categorical data continues to develop, new clustering algorithms being proposed. However, in this context, the determination of the number of clusters is rarely addressed. We propose a new approach in which clustering and the estimation of the number of clusters is done simultaneously for categorical data. We assume that the data originate from a finite mixture of multinomial distributions and use a minimum message length criterion (MML) to select the number of clusters (Wallace and Bolton, 1986). For this purpose, we implement an EM-type algorithm (Silvestre et al., 2008) based on the (Figueiredo and Jain, 2002) approach. The novelty of the approach rests on the integration of the model estimation and selection of the number of clusters in a single algorithm, rather than selecting this number based on a set of pre-estimated candidate models. The performance of our approach is compared with the use of Bayesian Information Criterion (BIC) (Schwarz, 1978) and Integrated Completed Likelihood (ICL) (Biernacki et al., 2000) using synthetic data. The obtained results illustrate the capacity of the proposed algorithm to attain the true number of cluster while outperforming BIC and ICL since it is faster, which is especially relevant when dealing with large data sets.
Resumo:
Submitted in partial fulfillment for the Requirements for the Degree of PhD in Mathematics, in the Speciality of Statistics in the Faculdade de Ciências e Tecnologia
Resumo:
This Letter presents a search at the LHC for s-channel single top-quark production in proton-proton collisions at a centre-of-mass energy of 8 TeV. The analyzed data set was recorded by the ATLAS detector and corresponds to an integrated luminosity of 20.3 fb−1. Selected events contain one charged lepton, large missing transverse momentum and exactly two b-tagged jets. A multivariate event classifier based on boosted decision trees is developed to discriminate s-channel single top-quark events from the main background contributions. The signal extraction is based on a binned maximum-likelihood fit of the output classifier distribution. The analysis leads to an upper limit on the s-channel single top-quark production cross-section of 14.6 pb at the 95% confidence level. The fit gives a cross-section of σs=5.0±4.3 pb, consistent with the Standard Model expectation.
Resumo:
The genus Thylamys Gray, 1843 lives in the central and southern portions of South America inhabiting open and shrub-like vegetation, from prairies to dry forest habitats in contrast to the preference of other Didelphidae genera for more mesic environments. Thylamys is a speciose genus including T. elegans (Waterhouse, 1839), T. macrurus (Olfers, 1818), T. pallidior (Thomas, 1902), T. pusillus (Desmarest, 1804), T. venustus (Thomas, 1902), T. sponsorius (Thomas, 1921), T. cinderella (Thomas, 1902), T. tatei (Handley, 1957), T. karimii (Petter, 1968), and T. velutinus (Wagner, 1842) species. Previous phylogenetic analyses in this genus did not include the Brazilian species T. karimii, which is widely distributed in this country. In this study, phylogenetic analyses were performed to establish the relationships among the Brazilian T. karimii and all other previously analyzed species. We used 402-bp fragments of the mitochondrial cytochrome b gene, and the phylogeny estimates were conducted employing maximum parsimony (MP), maximum likelihood (ML), Bayesian (BY), and neighbor-joining (NJ). The topologies of the trees obtained in the different analyses were all similar and pointed out that T. karimii is the sister taxon of a group constituted of taxa from dry and arid environments named the dryland species. The dryland species consists of T. pusillus, T. pallidior, T. tatei, and T. elegans. The results of this work suggest five species groups in Thylamys. In one of them, T. velutinus and T. kariimi could constitute a sister group forming one Thylamys clade that colonized Brazil.
Resumo:
Background and aims Recent studies have adopted a broad definition of Sapindaceae that includes taxa traditionally placed in Aceraceae and Hippocastanaceae, achieving monophyly but yielding a family difficult to characterize and for which no obvious morphological synapomorphy exists. This expanded circumscription was necessitated by the finding that the monotypic, temperate Asian genus Xanthoceras, historically placed in Sapindaceae tribe Harpullieae, is basal within the group. Here we seek to clarify the relationships of Xanthoceras based on phylogenetic analyses using a dataset encompassing nearly 3/4 of sapindaceous genera, comparing the results with information from morphology and biogeography, in particular with respect to the other taxa placed in Harpullieae. We then re-examine the appropriateness of maintaining the current broad, morphologically heterogeneous definition of Sapindaceae and explore the advantages of an alternative family circumscription. Methods Using 243 samples representing 104 of the 142 currently recognized genera of Sapindaceae s. lat. (including all in Harpullieae), sequence data were analyzed for nuclear (ITS) and plastid (matK, rpoB, trnD-trnT, trnK-matK, trnL-trnF and trnS-trnG) markers, adopting the methodology of a recent family-wide study, performing single-gene and total evidence analyses based on maximum likelihood (ML) and maximum parsimony (MP) criteria, and applying heuristic searches developed for large datasets, viz, a new strategy implemented in RAxML (for ML) and the parsimony ratchet (for MP). Bootstrap analyses were performed for each method to test for congruence between markers. Key results Our findings support earlier suggestions that Harpullieae are polyphyletic: Xanthoceras is confirmed as sister to all other sampled taxa of Sapindaceae s. lat.; the remaining members belong to three other clades within Sapindaceae s. lat., two of which correspond respectively to the groups traditionally treated as Aceraceae and Hippocastanaceae, together forming a clade sister to the largely tropical Sapindaceae s. str., which is monophyletic and morphologically coherent provided Xanthoceras is excluded. Conclusion To overcome the difficulties of a broadly circumscribed Sapindaceae, we resurrect the historically recognized temperate families Aceraceae and Hippocastanaceae, and describe a new family, Xanthoceraceae, thus adopting a monophyletic and easily characterized circumscription of Sapindaceae nearly identical to that used for over a century.
Resumo:
We focus on full-rate, fast-decodable space–time block codes (STBCs) for 2 x 2 and 4 x 2 multiple-input multiple-output (MIMO) transmission. We first derive conditions and design criteria for reduced-complexity maximum-likelihood (ML) decodable 2 x 2 STBCs, and we apply them to two families of codes that were recently discovered. Next, we derive a novel reduced-complexity 4 x 2 STBC, and show that it outperforms all previously known codes with certain constellations.
Resumo:
We design powerful low-density parity-check (LDPC) codes with iterative decoding for the block-fading channel. We first study the case of maximum-likelihood decoding, and show that the design criterion is rather straightforward. Since optimal constructions for maximum-likelihood decoding do not performwell under iterative decoding, we introduce a new family of full-diversity LDPC codes that exhibit near-outage-limit performance under iterative decoding for all block-lengths. This family competes favorably with multiplexed parallel turbo codes for nonergodic channels.
Resumo:
Nonlinear regression problems can often be reduced to linearity by transforming the response variable (e.g., using the Box-Cox family of transformations). The classic estimates of the parameter defining the transformation as well as of the regression coefficients are based on the maximum likelihood criterion, assuming homoscedastic normal errors for the transformed response. These estimates are nonrobust in the presence of outliers and can be inconsistent when the errors are nonnormal or heteroscedastic. This article proposes new robust estimates that are consistent and asymptotically normal for any unimodal and homoscedastic error distribution. For this purpose, a robust version of conditional expectation is introduced for which the prediction mean squared error is replaced with an M scale. This concept is then used to develop a nonparametric criterion to estimate the transformation parameter as well as the regression coefficients. A finite sample estimate of this criterion based on a robust version of smearing is also proposed. Monte Carlo experiments show that the new estimates compare favorably with respect to the available competitors.