945 results for Maximum penalized likelihood estimates
Abstract:
This dissertation proposes statistical methods to formulate, estimate and apply complex transportation models. Two main problems are addressed. The first is an econometric problem: the joint estimation of models that contain both discrete and continuous decision variables. The use of ordered models along with a regression is proposed, and their effectiveness is evaluated with respect to unordered models. Procedures to calculate and optimize the log-likelihood functions of both discrete-continuous approaches are derived, and the difficulties associated with the estimation of unordered models are explained. Numerical approximation methods based on the Genz algorithm are implemented to solve the multidimensional integral associated with the unordered modeling structure. The problems deriving from the lack of smoothness of the probit model around the maximum of the log-likelihood function, which makes the optimization and the calculation of standard deviations very difficult, are carefully analyzed. A methodology to perform out-of-sample validation in the context of a joint model is proposed. Comprehensive numerical experiments have been conducted on both simulated and real data. In particular, the discrete-continuous models are estimated and applied to vehicle ownership and use models on data extracted from the 2009 National Household Travel Survey. The second part of this work offers a comprehensive statistical analysis of free-flow speed distribution; the method is applied to data collected on a sample of roads in Italy. A linear mixed model that includes speed quantiles in its predictors is estimated. Results show that there is no road effect in the analysis of free-flow speeds, which is particularly important for model transferability. A very general framework to predict random effects with few observations and incomplete access to model covariates is formulated and applied to predict the distribution of free-flow speed quantiles. The speed distribution of most road sections is successfully predicted; jack-knife estimates are calculated and used to explain why some sections are poorly predicted. Overall, this work contributes to the literature in transportation modeling by proposing econometric model formulations for discrete-continuous variables, more efficient methods for the calculation of multivariate normal probabilities, and random effects models for free-flow speed estimation that take into account the survey design. All methods are rigorously validated on both real and simulated data.
Abstract:
In this work, the relationship between diameter at breast height (d) and total height (h) of individual trees was modeled in order to establish provisional height-diameter (h-d) equations for maritime pine (Pinus pinaster Ait.) stands in the Lomba ZIF, Northeast Portugal. Using data collected locally, several local and generalized h-d equations from the literature were tested, and adaptations were also considered. Model fitting was conducted using standard nonlinear least squares (nls) methods. The best local and generalized models were also tested as mixed models, applying a first-order conditional expectation (FOCE) approximation procedure and maximum likelihood methods to estimate fixed and random effects. For the calibration of the mixed models, and in order to be consistent with the fitting procedure, the FOCE method was also used to test different sampling designs. The results showed that the local h-d equations with two parameters performed better than the analogous models with three parameters. However, a single set of parameter values for the local model cannot be used for all maritime pine stands in Lomba ZIF, and thus a generalized model including stand covariates, in addition to d, was necessary to obtain adequate predictive performance. The generalized mixed model showed no evident superiority over the generalized model with nonlinear least squares parameter estimates. In the case of the local model, on the other hand, predictive performance improved greatly when random effects were included. The results showed that the mixed model based on the selected local h-d equation is a viable alternative for estimating h if stand variables are not available. Moreover, an adequate calibrated response can be obtained using only 2 to 5 additional h-d measurements on quantile (or random) trees from the distribution of d in the plot (stand). Balancing sampling effort, accuracy and straightforwardness in practical applications, the generalized model from the nls fit is recommended. Examples of applications of the selected generalized equation to forest management are presented, namely how to use it to complete missing information from forest inventory and how such an equation can be incorporated into a stand-level decision support system that aims to optimize forest management for the maximization of wood volume production in Lomba ZIF maritime pine stands.
Linear and nonlinear models in genetic analyses of lamb survival in the Santa Inês hair sheep breed
Abstract:
Records of survival from birth to weaning for 3,846 lambs of the Santa Inês hair sheep breed were analyzed with linear and nonlinear (threshold) sire models to estimate variance components and heritability (h2). The models for survival, analyzed as a trait of the lamb, included the fixed effects of sex, the combination of type of birth and rearing of the lamb, and age of the ewe at lambing; birth weight of the lamb as a covariate; and the random effects of sire, herd-year-season class and residual. Variance components were estimated by restricted maximum likelihood (REML) for the linear model and by an approximation to marginal maximum likelihood (MML) for the threshold model, using the CMMAT2 program. The heritability estimate was 0.29 from the threshold model and 0.14 from the linear model. The Spearman rank correlation between sire transmitting abilities based on the two models was 0.96. These heritability estimates indicate that genetic gain in survival can be obtained through selection.
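The step from sire-model variance components to a heritability estimate can be sketched as follows. This is only an illustration under stated assumptions: sires are taken to be unrelated, so the sire variance is one quarter of the additive genetic variance, and the herd-year-season variance is left out of the denominator, which the abstract does not specify. The function name and any input values are hypothetical and are not taken from the study.

```python
def sire_model_heritability(var_sire: float, var_residual: float) -> float:
    """Heritability from a sire model: h2 = 4 * var_sire / (var_sire + var_residual).

    Assumes unrelated sires (sire variance = 1/4 additive variance) and
    ignores any other random effects in the phenotypic variance.
    """
    phenotypic = var_sire + var_residual
    if phenotypic <= 0:
        raise ValueError("variance components must sum to a positive value")
    return 4.0 * var_sire / phenotypic


if __name__ == "__main__":
    # Hypothetical variance components, for illustration only.
    print(sire_model_heritability(var_sire=0.05, var_residual=0.95))
```

In a threshold model the residual variance on the underlying scale is usually fixed (commonly at 1), so the same ratio would be evaluated on that scale.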
Abstract:
This research develops an econometric framework to analyze time series processes with bounds. The framework is general enough that it can incorporate several different kinds of bounding information that constrain continuous-time stochastic processes between discretely-sampled observations. It applies to situations in which the process is known to remain within an interval between observations, by way of either a known constraint or through the observation of extreme realizations of the process. The main statistical technique employs the theory of maximum likelihood estimation. This approach leads to the development of the asymptotic distribution theory for the estimation of the parameters in bounded diffusion models. The results of this analysis present several implications for empirical research. The advantages are realized in the form of efficiency gains, bias reduction and in the flexibility of model specification. A bias arises in the presence of bounding information that is ignored, while it is mitigated within this framework. An efficiency gain arises, in the sense that the statistical methods make use of conditioning information, as revealed by the bounds. Further, the specification of an econometric model can be uncoupled from the restriction to the bounds, leaving the researcher free to model the process near the bound in a way that avoids bias from misspecification. One byproduct of the improvements in model specification is that the more precise model estimation exposes other sources of misspecification. Some processes reveal themselves to be unlikely candidates for a given diffusion model, once the observations are analyzed in combination with the bounding information. A closer inspection of the theoretical foundation behind diffusion models leads to a more general specification of the model. This approach is used to produce a set of algorithms to make the model computationally feasible and more widely applicable. 
Finally, the modeling framework is applied to a series of interest rates, which, for several years, have been constrained by the lower bound of zero. The estimates from a series of diffusion models suggest a substantial difference in estimation results between models that ignore bounds and the framework that takes bounding information into consideration.
Abstract:
The objective of this study was to evaluate the effects of inclusion or non-inclusion of short lactations and cow (CGG) and/or dam (DGG) genetic group on the genetic evaluation of 305-day milk yield (MY305), age at first calving (AFC), and first calving interval (FCI) of Girolando cows. Covariance components were estimated by the restricted maximum likelihood method in an animal model of single trait analyses. The heritability estimates for MY305, AFC, and FCI ranged from 0.23 to 0.29, 0.40 to 0.44, and 0.13 to 0.14, respectively, when short lactations were not included, and from 0.23 to 0.28, 0.39 to 0.43, and 0.13 to 0.14, respectively, when short lactations were included. The inclusion of short lactations caused little variation in the variance components and heritability estimates of traits, but their non-inclusion resulted in the re-ranking of animals. Models with CGG or DGG fixed effects had higher heritability estimates for all traits compared with models that consider these two effects simultaneously. We recommend using the model with fixed effects of CGG and inclusion of short lactations for the genetic evaluation of Girolando cattle.
Abstract:
Speech recognition in car environments has been identified as a valuable means for reducing driver distraction when operating non-critical in-car systems. Likelihood-maximising (LIMA) frameworks optimise speech enhancement algorithms based on recognised state sequences rather than traditional signal-level criteria such as maximising signal-to-noise ratio. Previously presented LIMA frameworks require calibration utterances to generate optimised enhancement parameters which are used for all subsequent utterances. Sub-optimal recognition performance occurs in noise conditions which are significantly different from that present during the calibration session - a serious problem in rapidly changing noise environments. We propose a dialog-based design which allows regular optimisation iterations in order to track the changing noise conditions. Experiments using Mel-filterbank spectral subtraction are performed to determine the optimisation requirements for vehicular environments and show that minimal optimisation assists real-time operation with improved speech recognition accuracy. It is also shown that the proposed design is able to provide improved recognition performance over frameworks incorporating a calibration session.
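As a rough illustration of the kind of enhancement step whose parameters a likelihood-maximising framework would tune, the sketch below applies a generic spectral subtraction to one frame of Mel-filterbank energies. It is a textbook-style form, not the authors' implementation: the over-subtraction factor `alpha`, the floor factor `beta`, and the function name are all assumptions introduced here.

```python
from typing import List, Sequence


def spectral_subtract(
    noisy: Sequence[float],
    noise: Sequence[float],
    alpha: float = 1.0,
    beta: float = 0.01,
) -> List[float]:
    """Subtract a scaled noise estimate from each filterbank energy.

    alpha scales the noise estimate (hypothetical tunable parameter);
    beta sets a floor as a fraction of the noisy energy so that the
    result never becomes negative.
    """
    if len(noisy) != len(noise):
        raise ValueError("noisy and noise must have the same number of bands")
    return [max(y - alpha * n, beta * y) for y, n in zip(noisy, noise)]


if __name__ == "__main__":
    # Two hypothetical filterbank bands; the second falls back to the floor.
    print(spectral_subtract([10.0, 2.0], [3.0, 3.0], alpha=1.0, beta=0.1))
```

A framework of the kind described would adjust parameters such as `alpha` and `beta` to raise the recogniser's likelihood instead of a signal-level measure.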
Error, Bias, and Long-Branch Attraction in Data for Two Chloroplast Photosystem Genes in Seed Plants
Abstract:
Sequences of two chloroplast photosystem genes, psaA and psbB, together comprising about 3,500 bp, were obtained for all five major groups of extant seed plants and several outgroups among other vascular plants. Strongly supported, but significantly conflicting, phylogenetic signals were obtained in parsimony analyses from partitions of the data into first and second codon positions versus third positions. In the former, both genes agreed on monophyletic gymnosperms, with Gnetales closely related to certain conifers. In the latter, Gnetales are inferred to be the sister group of all other seed plants, with gymnosperms paraphyletic. None of the data supported the modern “anthophyte hypothesis,” which places Gnetales as the sister group of flowering plants. A series of simulation studies was undertaken to examine the error rate for parsimony inference. Three kinds of errors were examined: random error, systematic bias (both properties of finite data sets), and statistical inconsistency owing to long-branch attraction (an asymptotic property). Parsimony reconstructions were extremely biased for third-position data for psbB. Regardless of the true underlying tree, a tree in which Gnetales are sister to all other seed plants was likely to be reconstructed for these data. None of the combinations of genes or partitions permits the anthophyte tree to be reconstructed with high probability. Simulations of progressively larger data sets indicate the existence of long-branch attraction (statistical inconsistency) for third-position psbB data if either the anthophyte tree or the gymnosperm tree is correct. This is also true for the anthophyte tree using either psaA third positions or psbB first and second positions. A factor contributing to bias and inconsistency is extremely short branches at the base of the seed plant radiation, coupled with extremely high rates in Gnetales and nonseed plant outgroups.
M. J. Sanderson, M. F. Wojciechowski, J.-M. Hu, T. Sher Khan, and S. G. Brady
Abstract:
Cost estimates for maintenance and rehabilitation are subject to variation because of uncertainty in the input parameters. This paper presents the results of an analysis to identify the input parameters that affect the variation in predicted road deterioration. Road data obtained from 1688 km of a national highway located in the tropical northeast of Queensland, Australia, were used in the analysis. Data were analysed using a probability-based method, the Monte Carlo simulation technique, together with HDM-4’s roughness prediction model. The results indicated that, among the input parameters, the variability of pavement strength, rut depth, annual equivalent axle load and initial roughness affected the variability of the predicted roughness. The second part of the paper presents an analysis of the variation in cost estimates due to the combined variability of the identified critical input parameters.
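The Monte Carlo idea described above, sampling uncertain inputs and observing the spread of the predicted output, can be sketched as below. The deterioration function is a deliberately made-up placeholder and is not the HDM-4 roughness model; the normal input distributions, parameter names and all numbers are likewise assumptions for illustration only.

```python
import random
import statistics
from typing import Tuple


def toy_roughness(initial_roughness: float, strength: float, years: int = 10) -> float:
    """Hypothetical placeholder, NOT HDM-4: roughness grows faster on weaker pavement."""
    return initial_roughness + years * 0.5 / strength


def simulate(
    mean_roughness: float,
    sd_roughness: float,
    mean_strength: float,
    sd_strength: float,
    n: int = 5000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return the mean and standard deviation of predicted roughness over n draws."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n):
        roughness = rng.gauss(mean_roughness, sd_roughness)
        # Clamp strength away from zero so the toy model stays finite.
        strength = max(rng.gauss(mean_strength, sd_strength), 0.1)
        predictions.append(toy_roughness(roughness, strength))
    return statistics.mean(predictions), statistics.pstdev(predictions)


if __name__ == "__main__":
    # Hypothetical inputs: compare the output spread with and without strength variability.
    print(simulate(2.0, 0.3, 4.0, 0.0))
    print(simulate(2.0, 0.3, 4.0, 0.5))
```

Repeating the run while varying one input's standard deviation at a time is one simple way to see which inputs drive the variability of the prediction.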
Abstract:
Extended spectrum β-lactamases (ESBLs), which are derived from non-ESBL precursors by point mutation of β-lactamase genes (bla), are spreading rapidly all over the world and have caused considerable problems in the treatment of infections caused by bacteria that harbour them. The mechanism of this resistance is not fully understood, and a better understanding of it might significantly influence the choice of diagnostic and treatment strategies. Previous work on the SHV β-lactamase gene, blaSHV, has shown that only Klebsiella pneumoniae strains that contain plasmid-borne blaSHV are able to mutate to phenotypically ESBL-positive strains, and there was also evidence of an increase in blaSHV copy number. It was therefore hypothesised that although a specific point mutation is essential for acquisition of ESBL activity, it is not sufficient, and that blaSHV copy number amplification is also essential for an ESBL-positive phenotype, with homologous recombination being the likely mechanism of blaSHV copy number expansion. In this study, we investigated the mutation rate of non-ESBL-expressing K. pneumoniae isolates to an ESBL-positive status using the MSS-maximum likelihood method. Our data showed that the blaSHV mutation rate of a non-ESBL-expressing isolate is lower than the mutation rate of other single base changes on the chromosome, even with a plasmid-borne blaSHV gene. On the other hand, the mutation rate from low-MIC ESBL-positive (≤ 8 µg/mL for cefotaxime) to high-MIC ESBL-positive (≥ 16 µg/mL for cefotaxime) is very high. This is because only an increase in gene copy number is needed, which is probably mediated by homologous recombination, a process that typically takes place at much higher frequencies than point mutations. Using a subinhibitory concentration of novobiocin as a homologous recombination inhibitor confirmed that this is the case.
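For readers unfamiliar with the MSS-maximum likelihood method (Ma-Sandri-Sarkar), the sketch below shows its general shape for fluctuation-assay data: the Luria-Delbrück distribution of mutant counts is built with the standard recursion, and the expected number of mutations per culture, m, is chosen to maximise the likelihood. The coarse grid search, the function names and the example counts are assumptions made here for brevity; they are not taken from the study, and real analyses typically use a proper optimiser and plating-efficiency corrections.

```python
import math
from typing import List, Sequence


def luria_delbruck_pmf(m: float, r_max: int) -> List[float]:
    """P(r mutants) for r = 0..r_max via p0 = exp(-m), p_r = (m/r) * sum_i p_i / (r - i + 1)."""
    p = [math.exp(-m)]
    for r in range(1, r_max + 1):
        s = sum(p[i] / (r - i + 1) for i in range(r))
        p.append(m / r * s)
    return p


def log_likelihood(m: float, counts: Sequence[int]) -> float:
    """Log-likelihood of observed mutant counts (one per culture) given m."""
    p = luria_delbruck_pmf(m, max(counts))
    return sum(math.log(p[r]) for r in counts)


def estimate_m(counts: Sequence[int], lo: float = 0.01, hi: float = 20.0, step: float = 0.01) -> float:
    """Grid-search maximum likelihood estimate of m (a simplification for illustration)."""
    best_m, best_ll = lo, log_likelihood(lo, counts)
    steps = int(round((hi - lo) / step))
    for k in range(1, steps + 1):
        m = lo + k * step
        ll = log_likelihood(m, counts)
        if ll > best_ll:
            best_m, best_ll = m, ll
    return best_m


if __name__ == "__main__":
    # Hypothetical mutant counts from ten parallel cultures.
    counts = [0, 1, 0, 3, 0, 2, 12, 0, 1, 5]
    print(estimate_m(counts))
```

The mutation rate per cell division is then commonly obtained by dividing the estimated m by the final number of cells per culture.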
Abstract:
This article describes the theoretical underpinning and development of a measurement instrument that provides teachers with a tool to observe the personal creativity characteristics of individual students. The instrument was developed by compiling a list of characteristics identified in the literature as indicative of creative people. The list was then reduced, by grouping like characteristics, to 9 cognitive and dispositional traits that were considered appropriate for elementary students. The 9-item instrument was then administered in 24 classrooms to 520 Year 6 and Year 7 students. Factor analysis using maximum likelihood extraction with an oblimin rotation revealed a single factor with an eigenvalue greater than 1, accounting for 63% of the variance. All 9 items loaded on this factor at .72 or greater. The results indicated that the Creativity Checklist has very high internal consistency and is a reliable measurement instrument (α = .93).
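Assuming the reported internal-consistency coefficient is Cronbach's alpha (the abstract does not name it), the computation can be sketched as follows. The function name and the example ratings are hypothetical; only the standard formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), is relied on.

```python
import statistics
from typing import Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha, where scores[i][j] is respondent i's rating on item j."""
    if len(scores) < 2 or len(scores[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [statistics.variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = statistics.variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Hypothetical ratings: four students, three items.
    ratings = [[4, 5, 4], [2, 2, 3], [5, 5, 5], [3, 2, 3]]
    print(cronbach_alpha(ratings))
```

Values close to 1 indicate that the items vary together across respondents, which is what a single dominant factor would also suggest.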