927 resultados para maximum likelihood estimate
Resumo:
The modal analysis of a structural system consists on computing its vibrational modes. The experimental way to estimate these modes requires to excite the system with a measured or known input and then to measure the system output at different points using sensors. Finally, system inputs and outputs are used to compute the modes of vibration. When the system refers to large structures like buildings or bridges, the tests have to be performed in situ, so it is not possible to measure system inputs such as wind, traffic, . . .Even if a known input is applied, the procedure is usually difficult and expensive, and there are still uncontrolled disturbances acting at the time of the test. These facts led to the idea of computing the modes of vibration using only the measured vibrations and regardless of the inputs that originated them, whether they are ambient vibrations (wind, earthquakes, . . . ) or operational loads (traffic, human loading, . . . ). This procedure is usually called Operational Modal Analysis (OMA), and in general consists on to fit a mathematical model to the measured data assuming the unobserved excitations are realizations of a stationary stochastic process (usually white noise processes). Then, the modes of vibration are computed from the estimated model. The first issue investigated in this thesis is the performance of the Expectation- Maximization (EM) algorithm for the maximum likelihood estimation of the state space model in the field of OMA. The algorithm is described in detail and it is analysed how to apply it to vibration data. After that, it is compared to another well known method, the Stochastic Subspace Identification algorithm. The maximum likelihood estimate enjoys some optimal properties from a statistical point of view what makes it very attractive in practice, but the most remarkable property of the EM algorithm is that it can be used to address a wide range of situations in OMA. In this work, three additional state space models are proposed and estimated using the EM algorithm: • The first model is proposed to estimate the modes of vibration when several tests are performed in the same structural system. Instead of analyse record by record and then compute averages, the EM algorithm is extended for the joint estimation of the proposed state space model using all the available data. • The second state space model is used to estimate the modes of vibration when the number of available sensors is lower than the number of points to be tested. In these cases it is usual to perform several tests changing the position of the sensors from one test to the following (multiple setups of sensors). Here, the proposed state space model and the EM algorithm are used to estimate the modal parameters taking into account the data of all setups. • And last, a state space model is proposed to estimate the modes of vibration in the presence of unmeasured inputs that cannot be modelled as white noise processes. In these cases, the frequency components of the inputs cannot be separated from the eigenfrequencies of the system, and spurious modes are obtained in the identification process. The idea is to measure the response of the structure corresponding to different inputs; then, it is assumed that the parameters common to all the data correspond to the structure (modes of vibration), and the parameters found in a specific test correspond to the input in that test. The problem is solved using the proposed state space model and the EM algorithm. Resumen El análisis modal de un sistema estructural consiste en calcular sus modos de vibración. Para estimar estos modos experimentalmente es preciso excitar el sistema con entradas conocidas y registrar las salidas del sistema en diferentes puntos por medio de sensores. Finalmente, los modos de vibración se calculan utilizando las entradas y salidas registradas. Cuando el sistema es una gran estructura como un puente o un edificio, los experimentos tienen que realizarse in situ, por lo que no es posible registrar entradas al sistema tales como viento, tráfico, . . . Incluso si se aplica una entrada conocida, el procedimiento suele ser complicado y caro, y todavía están presentes perturbaciones no controladas que excitan el sistema durante el test. Estos hechos han llevado a la idea de calcular los modos de vibración utilizando sólo las vibraciones registradas en la estructura y sin tener en cuenta las cargas que las originan, ya sean cargas ambientales (viento, terremotos, . . . ) o cargas de explotación (tráfico, cargas humanas, . . . ). Este procedimiento se conoce en la literatura especializada como Análisis Modal Operacional, y en general consiste en ajustar un modelo matemático a los datos registrados adoptando la hipótesis de que las excitaciones no conocidas son realizaciones de un proceso estocástico estacionario (generalmente ruido blanco). Posteriormente, los modos de vibración se calculan a partir del modelo estimado. El primer problema que se ha investigado en esta tesis es la utilización de máxima verosimilitud y el algoritmo EM (Expectation-Maximization) para la estimación del modelo espacio de los estados en el ámbito del Análisis Modal Operacional. El algoritmo se describe en detalle y también se analiza como aplicarlo cuando se dispone de datos de vibraciones de una estructura. A continuación se compara con otro método muy conocido, el método de los Subespacios. Los estimadores máximo verosímiles presentan una serie de propiedades que los hacen óptimos desde un punto de vista estadístico, pero la propiedad más destacable del algoritmo EM es que puede utilizarse para resolver un amplio abanico de situaciones que se presentan en el Análisis Modal Operacional. En este trabajo se proponen y estiman tres modelos en el espacio de los estados: • El primer modelo se utiliza para estimar los modos de vibración cuando se dispone de datos correspondientes a varios experimentos realizados en la misma estructura. En lugar de analizar registro a registro y calcular promedios, se utiliza algoritmo EM para la estimación conjunta del modelo propuesto utilizando todos los datos disponibles. • El segundo modelo en el espacio de los estados propuesto se utiliza para estimar los modos de vibración cuando el número de sensores disponibles es menor que vi Resumen el número de puntos que se quieren analizar en la estructura. En estos casos es usual realizar varios ensayos cambiando la posición de los sensores de un ensayo a otro (múltiples configuraciones de sensores). En este trabajo se utiliza el algoritmo EM para estimar los parámetros modales teniendo en cuenta los datos de todas las configuraciones. • Por último, se propone otro modelo en el espacio de los estados para estimar los modos de vibración en la presencia de entradas al sistema que no pueden modelarse como procesos estocásticos de ruido blanco. En estos casos, las frecuencias de las entradas no se pueden separar de las frecuencias del sistema y se obtienen modos espurios en la fase de identificación. La idea es registrar la respuesta de la estructura correspondiente a diferentes entradas; entonces se adopta la hipótesis de que los parámetros comunes a todos los registros corresponden a la estructura (modos de vibración), y los parámetros encontrados en un registro específico corresponden a la entrada en dicho ensayo. El problema se resuelve utilizando el modelo propuesto y el algoritmo EM.
Resumo:
In this paper we propose a method to estimate by maximum likelihood the divergence time between two populations, specifically designed for the analysis of nonrecurrent rare mutations. Given the rapidly growing amount of data, rare disease mutations affecting humans seem the most suitable candidates for this method. The estimator RD, and its conditional version RDc, were derived, assuming that the population dynamics of rare alleles can be described by using a birth–death process approximation and that each mutation arose before the split of a common ancestral population into the two diverging populations. The RD estimator seems more suitable for large sample sizes and few alleles, whose age can be approximated, whereas the RDc estimator appears preferable when this is not the case. When applied to three cystic fibrosis mutations, the estimator RD could not exclude a very recent time of divergence among three Mediterranean populations. On the other hand, the divergence time between these populations and the Danish population was estimated to be, on the average, 4,500 or 15,000 years, assuming or not a selective advantage for cystic fibrosis carriers, respectively. Confidence intervals are large, however, and can probably be reduced only by analyzing more alleles or loci.
Resumo:
Most unsignalised intersection capacity calculation procedures are based on gap acceptance models. Accuracy of critical gap estimation affects accuracy of capacity and delay estimation. Several methods have been published to estimate drivers’ sample mean critical gap, the Maximum Likelihood Estimation (MLE) technique regarded as the most accurate. This study assesses three novel methods; Average Central Gap (ACG) method, Strength Weighted Central Gap method (SWCG), and Mode Central Gap method (MCG), against MLE for their fidelity in rendering true sample mean critical gaps. A Monte Carlo event based simulation model was used to draw the maximum rejected gap and accepted gap for each of a sample of 300 drivers across 32 simulation runs. Simulation mean critical gap is varied between 3s and 8s, while offered gap rate is varied between 0.05veh/s and 0.55veh/s. This study affirms that MLE provides a close to perfect fit to simulation mean critical gaps across a broad range of conditions. The MCG method also provides an almost perfect fit and has superior computational simplicity and efficiency to the MLE. The SWCG method performs robustly under high flows; however, poorly under low to moderate flows. Further research is recommended using field traffic data, under a variety of minor stream and major stream flow conditions for a variety of minor stream movement types, to compare critical gap estimates using MLE against MCG. Should the MCG method prove as robust as MLE, serious consideration should be given to its adoption to estimate critical gap parameters in guidelines.
Resumo:
The Fabens method is commonly used to estimate growth parameters k and l infinity in the von Bertalanffy model from tag-recapture data. However, the Fabens method of estimation has an inherent bias when individual growth is variable. This paper presents an asymptotically unbiassed method using a maximum likelihood approach that takes account of individual variability in both maximum length and age-at-tagging. It is assumed that each individual's growth follows a von Bertalanffy curve with its own maximum length and age-at-tagging. The parameter k is assumed to be a constant to ensure that the mean growth follows a von Bertalanffy curve and to avoid overparameterization. Our method also makes more efficient use nf thp measurements at tno and recapture and includes diagnostic techniques for checking distributional assumptions. The method is reasonably robust and performs better than the Fabens method when individual growth differs from the von Bertalanffy relationship. When measurement error is negligible, the estimation involves maximizing the profile likelihood of one parameter only. The method is applied to tag-recapture data for the grooved tiger prawn (Penaeus semisulcatus) from the Gulf of Carpentaria, Australia.
Resumo:
It is common to model the dynamics of fisheries using natural and fishing mortality rates estimated independently using two separate analyses. Fishing mortality is routinely estimated from widely available logbook data, whereas natural mortality estimations have often required more specific, less frequently available, data. However, in the case of the fishery for brown tiger prawn (Penaeus esculentus) in Moreton Bay, both fishing and natural mortality rates have been estimated from logbook data. The present work extended the fishing mortality model to incorporate an eco-physiological response of tiger prawn to temperature, and allowed recruitment timing to vary from year to year. These ecological characteristics of the dynamics of this fishery were ignored in the separate model that estimated natural mortality. Therefore, we propose to estimate both natural and fishing mortality rates within a single model using a consistent set of hypotheses. This approach was applied to Moreton Bay brown tiger prawn data collected between 1990 and 2010. Natural mortality was estimated by maximum likelihood to be equal to 0.032 ± 0.002 week−1, approximately 30% lower than the fixed value used in previous models of this fishery (0.045 week−1).
Resumo:
Maximum likelihood (ML) algorithms, for the joint estimation of synchronisation impairments and channel in multiple input multiple output-orthogonal frequency division multiplexing (MIMO-OFDM) system, are investigated in this work. A system model that takes into account the effects of carrier frequency offset, sampling frequency offset, symbol timing error and channel impulse response is formulated. Cramer-Rao lower bounds for the estimation of continuous parameters are derived, which show the coupling effect among different impairments and the significance of the joint estimation. The authors propose an ML algorithm for the estimation of synchronisation impairments and channel together, using the grid search method. To reduce the complexity of the joint grid search in the ML algorithm, a modified ML (MML) algorithm with multiple one-dimensional searches is also proposed. Further, a stage-wise ML (SML) algorithm using existing algorithms, which estimate less number of parameters, is also proposed. Performance of the estimation algorithms is studied through numerical simulations and it is found that the proposed ML and MML algorithms exhibit better performance than SML algorithm.
Resumo:
Parmi les méthodes d’estimation de paramètres de loi de probabilité en statistique, le maximum de vraisemblance est une des techniques les plus populaires, comme, sous des conditions l´egères, les estimateurs ainsi produits sont consistants et asymptotiquement efficaces. Les problèmes de maximum de vraisemblance peuvent être traités comme des problèmes de programmation non linéaires, éventuellement non convexe, pour lesquels deux grandes classes de méthodes de résolution sont les techniques de région de confiance et les méthodes de recherche linéaire. En outre, il est possible d’exploiter la structure de ces problèmes pour tenter d’accélerer la convergence de ces méthodes, sous certaines hypothèses. Dans ce travail, nous revisitons certaines approches classiques ou récemment d´eveloppées en optimisation non linéaire, dans le contexte particulier de l’estimation de maximum de vraisemblance. Nous développons également de nouveaux algorithmes pour résoudre ce problème, reconsidérant différentes techniques d’approximation de hessiens, et proposons de nouvelles méthodes de calcul de pas, en particulier dans le cadre des algorithmes de recherche linéaire. Il s’agit notamment d’algorithmes nous permettant de changer d’approximation de hessien et d’adapter la longueur du pas dans une direction de recherche fixée. Finalement, nous évaluons l’efficacité numérique des méthodes proposées dans le cadre de l’estimation de modèles de choix discrets, en particulier les modèles logit mélangés.
Resumo:
The variogram is essential for local estimation and mapping of any variable by kriging. The variogram itself must usually be estimated from sample data. The sampling density is a compromise between precision and cost, but it must be sufficiently dense to encompass the principal spatial sources of variance. A nested, multi-stage, sampling with separating distances increasing in geometric progression from stage to stage will do that. The data may then be analyzed by a hierarchical analysis of variance to estimate the components of variance for every stage, and hence lag. By accumulating the components starting from the shortest lag one obtains a rough variogram for modest effort. For balanced designs the analysis of variance is optimal; for unbalanced ones, however, these estimators are not necessarily the best, and the analysis by residual maximum likelihood (REML) will usually be preferable. The paper summarizes the underlying theory and illustrates its application with data from three surveys, one in which the design had four stages and was balanced and two implemented with unbalanced designs to economize when there were more stages. A Fortran program is available for the analysis of variance, and code for the REML analysis is listed in the paper. (c) 2005 Elsevier Ltd. All rights reserved.
Resumo:
An unbalanced nested sampling design was used to investigate the spatial scale of soil and herbicide interactions at the field scale. A hierarchical analysis of variance based on residual maximum likelihood (REML) was used to analyse the data and provide a first estimate of the variogram. Soil samples were taken at 108 locations at a range of separating distances in a 9 ha field to explore small and medium scale spatial variation. Soil organic matter content, pH, particle size distribution, microbial biomass and the degradation and sorption of the herbicide, isoproturon, were determined for each soil sample. A large proportion of the spatial variation in isoproturon degradation and sorption occurred at sampling intervals less than 60 m, however, the sampling design did not resolve the variation present at scales greater than this. A sampling interval of 20-25 m should ensure that the main spatial structures are identified for isoproturon degradation rate and sorption without too great a loss of information in this field.
Resumo:
The variogram is essential for local estimation and mapping of any variable by kriging. The variogram itself must usually be estimated from sample data. The sampling density is a compromise between precision and cost, but it must be sufficiently dense to encompass the principal spatial sources of variance. A nested, multi-stage, sampling with separating distances increasing in geometric progression from stage to stage will do that. The data may then be analyzed by a hierarchical analysis of variance to estimate the components of variance for every stage, and hence lag. By accumulating the components starting from the shortest lag one obtains a rough variogram for modest effort. For balanced designs the analysis of variance is optimal; for unbalanced ones, however, these estimators are not necessarily the best, and the analysis by residual maximum likelihood (REML) will usually be preferable. The paper summarizes the underlying theory and illustrates its application with data from three surveys, one in which the design had four stages and was balanced and two implemented with unbalanced designs to economize when there were more stages. A Fortran program is available for the analysis of variance, and code for the REML analysis is listed in the paper. (c) 2005 Elsevier Ltd. All rights reserved.
Resumo:
Milk, fat, and protein yields of Holstein cows from the States of New York and California in the United States were used to estimate (co)variances among yields in the first three lactations, using an animal model and a derivative-free restricted maximum likelihood (REML) algorithm, and to verify if yields in different lactations are the same trait. The data were split in 20 samples, 10 from each state, with means of 5463 and 5543 cows per sample from California and New York. Mean heritability estimates for milk, fat, and protein yields for California data were, respectively, 0.34, 0.35, and 0.40 for first; 0.31, 0.33, and 0.39 for second; and 0.28, 0.31, and 0.37 for third lactations. For New York data, estimates were 0.35, 0.40, and 0.34 for first; 0.34, 0.44, and 0.38 for second; and 0.32, 0.43, and 0.38 for third lactations. Means of estimates of genetic correlations between first and second, first and third, and second and third lactations for California data were 0.86, 0.77, and 0.96 for milk; 0.89, 0.84, and 0.97 for fat; and 0.90, 0.84, and 0.97 for protein yields. Mean estimates for New York data were 0.87, 0.81, and 0.97 for milk; 0.91, 0.86, and 0.98 for fat; and 0.88, 0.82, and 0.98 for protein yields. Environmental correlations varied from 0.30 to 0.50 and were larger between second and third lactations. Phenotypic correlations were similar for both states and varied from 0.52 to 0.66 for milk, fat and protein yields. These estimates are consistent with previous estimates obtained with animal models. Yields in different lactations are not statistically the same trait but for selection programs such yields can be modelled as the same trait because of the high genetic correlations.
Resumo:
This paper presents a time-domain stochastic system identification method based on maximum likelihood estimation (MLE) with the expectation maximization (EM) algorithm. The effectiveness of this structural identification method is evaluated through numerical simulation in the context of the ASCE benchmark problem on structural health monitoring. The benchmark structure is a four-story, two-bay by two-bay steel-frame scale model structure built in the Earthquake Engineering Research Laboratory at the University of British Columbia, Canada. This paper focuses on Phase I of the analytical benchmark studies. A MATLAB-based finite element analysis code obtained from the IASC-ASCE SHM Task Group web site is used to calculate the dynamic response of the prototype structure. A number of 100 simulations have been made using this MATLAB-based finite element analysis code in order to evaluate the proposed identification method. There are several techniques to realize system identification. In this work, stochastic subspace identification (SSI)method has been used for comparison. SSI identification method is a well known method and computes accurate estimates of the modal parameters. The principles of the SSI identification method has been introduced in the paper and next the proposed MLE with EM algorithm has been explained in detail. The advantages of the proposed structural identification method can be summarized as follows: (i) the method is based on maximum likelihood, that implies minimum variance estimates; (ii) EM is a computational simpler estimation procedure than other optimization algorithms; (iii) estimate more parameters than SSI, and these estimates are accurate. On the contrary, the main disadvantages of the method are: (i) EM algorithm is an iterative procedure and it consumes time until convergence is reached; and (ii) this method needs starting values for the parameters. Modal parameters (eigenfrequencies, damping ratios and mode shapes) of the benchmark structure have been estimated using both the SSI method and the proposed MLE + EM method. The numerical results show that the proposed method identifies eigenfrequencies, damping ratios and mode shapes reasonably well even in the presence of 10% measurement noises. These modal parameters are more accurate than the SSI estimated modal parameters.
Resumo:
A maximum likelihood estimator based on the coalescent for unequal migration rates and different subpopulation sizes is developed. The method uses a Markov chain Monte Carlo approach to investigate possible genealogies with branch lengths and with migration events. Properties of the new method are shown by using simulated data from a four-population n-island model and a source–sink population model. Our estimation method as coded in migrate is tested against genetree; both programs deliver a very similar likelihood surface. The algorithm converges to the estimates fairly quickly, even when the Markov chain is started from unfavorable parameters. The method was used to estimate gene flow in the Nile valley by using mtDNA data from three human populations.
Resumo:
Frequencies of meiotic configurations in cytogenetic stocks are dependent on chiasma frequencies in segments defined by centromeres, breakpoints, and telomeres. The expectation maximization algorithm is proposed as a general method to perform maximum likelihood estimations of the chiasma frequencies in the intervals between such locations. The estimates can be translated via mapping functions into genetic maps of cytogenetic landmarks. One set of observational data was analyzed to exemplify application of these methods, results of which were largely concordant with other comparable data. The method was also tested by Monte Carlo simulation of frequencies of meiotic configurations from a monotelodisomic translocation heterozygote, assuming six different sample sizes. The estimate averages were always close to the values given initially to the parameters. The maximum likelihood estimation procedures can be extended readily to other kinds of cytogenetic stocks and allow the pooling of diverse cytogenetic data to collectively estimate lengths of segments, arms, and chromosomes.
Resumo:
Lognormal distribution has abundant applications in various fields. In literature, most inferences on the two parameters of the lognormal distribution are based on Type-I censored sample data. However, exact measurements are not always attainable especially when the observation is below or above the detection limits, and only the numbers of measurements falling into predetermined intervals can be recorded instead. This is the so-called grouped data. In this paper, we will show the existence and uniqueness of the maximum likelihood estimators of the two parameters of the underlying lognormal distribution with Type-I censored data and grouped data. The proof was first established under the case of normal distribution and extended to the lognormal distribution through invariance property. The results are applied to estimate the median and mean of the lognormal population.