869 results for Markov chains hidden Markov models Viterbi algorithm Forward-Backward algorithm maximum likelihood
Abstract:
The article considers screening human populations with two screening tests. If either of the two tests is positive, then full evaluation of the disease status is undertaken; however, if both diagnostic tests are negative, then disease status remains unknown. This procedure leads to a data constellation in which, for each disease status, the 2 x 2 table associated with the two diagnostic tests used in screening has exactly one empty, unknown cell. To estimate the unobserved cell counts, previous approaches assume independence of the two diagnostic tests and use specific models, including the special mixture model of Walter or unconstrained capture-recapture estimates. Often, as is also demonstrated in this article by means of a simple test, the independence of the two screening tests is not supported by the data. Two new estimators are suggested that allow association of the screening tests, although the form of association must be assumed to be homogeneous over disease status. These estimators are modifications of the simple capture-recapture estimator and are easy to construct. The estimators are investigated for several screening studies with fully evaluated disease status, in which the superior behavior of the new estimators compared to the previous conventional ones can be shown. Finally, the performance of the new estimators is compared with that of maximum likelihood estimators, which are more difficult to obtain in these models. The results indicate that the loss of efficiency is minor.
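For orientation, the simple capture-recapture estimator mentioned above fills the empty both-negative cell from the three observed cells under independence of the two tests. A minimal sketch in Python, with hypothetical counts and an assumed function name; the article's new estimators modify this baseline to allow test association and are not reproduced here:

```python
def capture_recapture_missing_cell(n11, n10, n01):
    """Fill the unobserved both-negative cell of a 2x2 screening table.

    Simple capture-recapture estimate under independence of the two tests:
    n11 -- positive on both tests
    n10 -- positive on test 1 only
    n01 -- positive on test 2 only
    """
    if n11 == 0:
        raise ValueError("n11 must be positive for this estimator")
    n00_hat = n10 * n01 / n11              # estimated both-negative count
    total_hat = n11 + n10 + n01 + n00_hat  # estimated stratum total
    return n00_hat, total_hat

# hypothetical counts for one disease-status stratum
print(capture_recapture_missing_cell(n11=40, n10=25, n01=30))
```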
Abstract:
Estimation of population size with a missing zero-class is an important problem that is encountered in epidemiological assessment studies. Fitting a Poisson model to the observed data by the method of maximum likelihood and estimating the population size based on this fit is an approach that has been widely used for this purpose. In practice, however, the Poisson assumption is seldom satisfied. Zelterman (1988) has proposed a robust estimator for unclustered data that works well in a wide class of distributions applicable for count data. In the work presented here, we extend this estimator to clustered data. The estimator requires fitting a zero-truncated homogeneous Poisson model by maximum likelihood and then using a Horvitz-Thompson estimator of population size. This was found to work well when the data follow the hypothesized homogeneous Poisson model. However, when the true distribution deviates from the hypothesized model, the population size was found to be underestimated. In the search for a more robust estimator, we focused on three models that use all clusters with exactly one case, those clusters with exactly two cases and those with exactly three cases to estimate the probability of the zero-class, and thereby use data collected on all the clusters in the Horvitz-Thompson estimator of population size. The loss in efficiency associated with the gain in robustness was examined based on a simulation study. As a trade-off between gain in robustness and loss in efficiency, the model that uses data collected on clusters with at most three cases to estimate the probability of the zero-class was found to be preferred in general. In applications, we recommend obtaining estimates from all three models and making a choice by considering the estimates from the three models, robustness and the loss in efficiency. (© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)
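The unclustered building block (Zelterman's robust rate estimate plugged into a Horvitz-Thompson estimator) can be sketched as below; the clustered extension developed in the paper is not reproduced here, and the frequency counts are hypothetical:

```python
import math

def zelterman_population_size(f1, f2, n_observed):
    """Horvitz-Thompson population-size estimate using Zelterman's robust
    Poisson-rate estimate for the zero class (unclustered case).

    f1, f2     -- number of units observed exactly once / exactly twice
    n_observed -- total number of observed (non-zero) units
    """
    lam_hat = 2.0 * f2 / f1              # Zelterman (1988) rate estimate
    p0_hat = math.exp(-lam_hat)          # estimated probability of the zero class
    return n_observed / (1.0 - p0_hat)   # Horvitz-Thompson estimator

# hypothetical frequencies of ones and twos among 110 observed units
print(zelterman_population_size(f1=60, f2=25, n_observed=110))
```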
Abstract:
Survival times for the Acacia mangium plantation in the Segaliud Lokan Project, Sabah, East Malaysia were analysed based on 20 permanent sample plots (PSPs) established in 1988 as a spacing experiment. The PSPs were established following a randomized complete block design with five levels of spacing randomly assigned to units within four blocks at different sites. The survival times of trees in years are of interest. Since the inventories were only conducted annually, the actual survival time for each tree was not observed. Hence, the data set comprises censored survival times. Initial analysis of the survival of the Acacia mangium plantation suggested a block-by-spacing interaction; a Weibull model gives a reasonable fit to the replicate survival times within each PSP; but a standard Weibull regression model is inappropriate because the shape parameter differs between PSPs. In this paper we investigate the form of the non-constant Weibull shape parameter. Parsimonious models for the Weibull survival times have been derived using maximum likelihood methods. The factor selection for the parameters is based on a backward elimination procedure. The models are compared using likelihood ratio statistics. The results suggest that both Weibull parameters depend on spacing and block.
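A hedged sketch of the kind of Weibull regression implied, with both the scale and the shape parameter modelled log-linearly in the design factors (the notation is assumed here, not quoted from the paper):

```latex
S_i(t) = \exp\!\left[-\left(\frac{t}{\lambda_i}\right)^{k_i}\right],
\qquad
\log \lambda_i = \mathbf{x}_i^{\top}\boldsymbol{\beta},
\qquad
\log k_i = \mathbf{z}_i^{\top}\boldsymbol{\gamma},
```

where x_i and z_i carry the spacing and block factors for plot i; nested choices of terms in beta and gamma are then compared by likelihood-ratio statistics, matching the backward elimination described above.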
Abstract:
This article is about modeling count data with zero truncation. A parametric count density family is considered. The truncated mixture of densities from this family is different from the mixture of truncated densities from the same family. Whereas the former model is more natural to formulate and to interpret, the latter model is theoretically easier to treat. It is shown that for any mixing distribution leading to a truncated mixture, a (usually different) mixing distribution can be found so that the associated mixture of truncated densities equals the truncated mixture, and vice versa. This implies that the likelihood surfaces for both situations agree, and in this sense both models are equivalent. Zero-truncated count data models are used frequently in the capture-recapture setting to estimate population size, and it can be shown that the two Horvitz-Thompson estimators, associated with the two models, agree. In particular, it is possible to achieve strong results for mixtures of truncated Poisson densities, including reliable, global construction of the unique NPMLE (nonparametric maximum likelihood estimator) of the mixing distribution, implying a unique estimator for the population size. The benefit of these results lies in the fact that it is valid to work with the mixture of truncated count densities, which is less appealing for the practitioner but theoretically easier. Mixtures of truncated count densities form a convex linear model, for which a developed theory exists, including global maximum likelihood theory as well as algorithmic approaches. Once the problem has been solved in this class, it might readily be transformed back to the original problem by means of an explicitly given mapping. Applications of these ideas are given, particularly in the case of the truncated Poisson family.
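In standard notation (symbols assumed here, not quoted from the article), the two formulations and the associated Horvitz-Thompson estimator can be written as:

```latex
% truncated mixture (natural to formulate)
f^{+}_{Q}(y) = \frac{\int p(y;\lambda)\,dQ(\lambda)}{1 - \int p(0;\lambda)\,dQ(\lambda)},
\qquad y = 1, 2, \dots
% mixture of truncated densities (easier to treat)
g_{\tilde{Q}}(y) = \int \frac{p(y;\lambda)}{1 - p(0;\lambda)}\,d\tilde{Q}(\lambda),
\qquad
\hat{N} = \frac{n}{1 - \hat{p}_0},
```

where p(y; lambda) is the parametric count density (e.g. Poisson), n is the number of observed (non-zero) units and p̂_0 is the estimated zero-class probability under the fitted model. The equivalence result says that for each Q there is a Q-tilde (and vice versa) making the two densities coincide, so the two Horvitz-Thompson estimators agree.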
Abstract:
Objectives: To assess the potential source of variation that the surgeon may add to patient outcome in a clinical trial of surgical procedures. Methods: Two large (n = 1380) parallel multicentre randomized surgical trials were undertaken to compare laparoscopically assisted hysterectomy with conventional methods of abdominal and vaginal hysterectomy, involving 43 surgeons. The primary end point of the trial was the occurrence of at least one major complication. Patients were nested within surgeons, giving the data set a hierarchical structure. A total of 10% of patients had at least one major complication, that is, a sparse binary outcome variable. A linear mixed logistic regression model (with logit link function) was used to model the probability of a major complication, with surgeon fitted as a random effect. Models were fitted using the method of maximum likelihood in SAS®. Results: There were many convergence problems. These were resolved using a variety of approaches, including treating all effects as fixed for the initial model building, modelling the variance of a parameter on a logarithmic scale, and centring of continuous covariates. The initial model building process indicated no significant 'type of operation' by surgeon interaction effect in either trial; the 'type of operation' term was highly significant in the abdominal trial, and the 'surgeon' term was not significant in either trial. Conclusions: The analysis did not find a surgeon effect, but it is difficult to conclude that there was not a difference between surgeons. The statistical test may have lacked sufficient power; the variance estimates were small with large standard errors, indicating that the precision of the variance estimates may be questionable.
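The random-effects model described above can be written, in assumed but standard notation, as:

```latex
\operatorname{logit}\Pr(y_{ij} = 1) = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j,
\qquad u_j \sim N(0, \sigma_u^2),
```

where i indexes patients nested within surgeon j, x_ij holds the fixed effects (including type of operation) and u_j is the surgeon random effect; modelling the variance on a logarithmic scale, one of the devices used to aid convergence, amounts to estimating log sigma_u^2 rather than sigma_u^2 directly.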
Abstract:
Performance analysis has been used for many applications including providing feedback to coaches and players, media applications, scoring of sports performance and scientific research into sports performance. The current study has used performance analysis to generate knowledge relating to the demands of netball competition which has been used in the development of a Netball Specific Fitness Test (NSFT). A modified version of the Bloomfield movement classification was used to provide a detailed analysis of player movement during netball competition. This was considered during a needs analysis when proposing the structure of the NSFT. A series of pilot versions were tested during an evolutionary prototyping process that resulted in the final version of the NSFT, which was found to be representative of movement in netball competition and it distinguished between recreational club players and players of university first team level or above. The test is incremental and involves forward, backward and sideways movement, jumping, lunging, turning and choice reaction.
Abstract:
Posterior cortical atrophy (PCA) is a type of dementia that is characterized by visuo-spatial and memory deficits, dyslexia and dysgraphia, relatively early onset and preserved insight. Language deficits have been reported in some cases of PCA. Using an off-line grammaticality judgement task, processing of wh-questions is investigated in a case of PCA. Other aspects of auditory language are also reported. It is shown that processing of wh-questions is influenced by syntactic structure, a novel finding in this condition. The results are discussed with reference to accounts of wh-questions in aphasia. An uneven profile of other language abilities is reported with deficits in digit span (forward, backward), story retelling ability, comparative questions but intact abilities in following commands, repetition, concept definition, generative naming and discourse comprehension.
Abstract:
External interferences can severely degrade the performance of an over-the-horizon radar (OTHR), so suppression of external interference in a strong clutter environment is a prerequisite for target detection. Traditional suppression solutions usually begin with clutter suppression in either the time or the frequency domain, followed by interference detection and suppression. Building on this traditional solution, this paper proposes a method characterized by joint clutter suppression and interference detection: by analyzing the eigenvalues in a short-time moving window centered at different time positions, clutter is suppressed by discarding the three largest eigenvalues at every time position, while detection is achieved by analyzing the remaining eigenvalues at each position. Restoration is then achieved by forward-backward linear prediction using interference-free data surrounding the interference position. In the numerical computation, the eigenvalue decomposition (EVD) is replaced by the singular value decomposition (SVD) based on the equivalence of the two. Data processing and experimental results show the method's effectiveness, with the noise floor lowered by about 10-20 dB.
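As a rough illustration of the SVD-based step only (not the authors' exact procedure; the Hankel embedding, the window data and the choice of three discarded components are assumptions), a minimal NumPy sketch of removing the dominant components from one short-time window:

```python
import numpy as np

def suppress_clutter_window(x, drop=3):
    """Suppress dominant clutter components in one short-time window.

    A Hankel-structured data matrix is built from the window, the `drop`
    largest singular components (assumed to be clutter) are removed, and
    the window is rebuilt by averaging the anti-diagonals.
    """
    n = len(x)
    m = n // 2                                            # embedding dimension
    H = np.lib.stride_tricks.sliding_window_view(x, m)    # (n-m+1, m) Hankel-like matrix
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    s[:drop] = 0.0                                        # discard the strongest components
    H_clean = (U * s) @ Vt
    # average anti-diagonals back to a 1-D signal
    out = np.zeros(n, dtype=H_clean.dtype)
    counts = np.zeros(n)
    for i in range(H_clean.shape[0]):
        out[i:i + m] += H_clean[i]
        counts[i:i + m] += 1
    return out / counts

# hypothetical window: strong slow clutter plus a weak fast component and noise
t = np.arange(256)
window = 10 * np.cos(0.02 * t) + 0.5 * np.cos(0.8 * t) + 0.1 * np.random.randn(256)
cleaned = suppress_clutter_window(window)
```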
Abstract:
A Bayesian method of estimating multivariate sample selection models is introduced and applied to the estimation of a demand system for food in the UK to account for censoring arising from infrequency of purchase. We show how it is possible to impose identifying restrictions on the sample selection equations and that, unlike a maximum likelihood framework, the imposition of adding up at both latent and observed levels is straightforward. Our results emphasise the role played by low incomes and socio-economic circumstances in leading to poor diets and also indicate that the presence of children in a household has a negative impact on dietary quality.
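A generic single-equation sample selection structure of the kind underlying this demand system is sketched below in assumed notation; the multivariate, adding-up version estimated in the paper is more elaborate and is not reproduced here:

```latex
z_i^{*} = \mathbf{w}_i^{\top}\boldsymbol{\gamma} + u_i,
\qquad
y_i^{*} = \mathbf{x}_i^{\top}\boldsymbol{\beta} + \varepsilon_i,
\qquad
y_i =
\begin{cases}
y_i^{*} & \text{if } z_i^{*} > 0,\\
0 & \text{otherwise},
\end{cases}
```

where z_i^* is a latent selection (purchase) variable and y_i^* the latent outcome for a given food group, with subscripts for goods suppressed.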
Abstract:
This chapter considers Multiband Orthogonal Frequency Division Multiplexing (MB-OFDM) modulation and demodulation with the intention of optimizing Ultra-Wideband (UWB) system performance. OFDM is a type of multicarrier modulation and is the most important aspect of MB-OFDM system performance. It is also a low-cost digital signal processing technique that efficiently uses the Fast Fourier Transform (FFT) algorithm to implement multicarrier orthogonality. Within the MB-OFDM approach, OFDM modulation is employed in each 528 MHz wide band to transmit the data, while frequency hopping is used across the different bands. Each parallel bit stream can be mapped onto one of the OFDM subcarriers. Quadrature Phase Shift Keying (QPSK) and Dual Carrier Modulation (DCM) are currently used as the modulation schemes for MB-OFDM in the ECMA-368 defined UWB radio platform. A dual QPSK soft-demapper suitable for ECMA-368 exploits the inherent Time-Domain Spreading (TDS) and guard symbol subcarrier diversity to improve receiver performance, yet merges decoding operations together to minimize hardware and power requirements. There are several methods to demap the DCM: soft bit demapping, Maximum Likelihood (ML) soft bit demapping, and Log Likelihood Ratio (LLR) demapping. The Channel State Information (CSI) aided scheme coupled with the band hopping information is used as a further technique to improve the DCM demapping performance. ECMA-368 offers up to 480 Mb/s instantaneous bit rate to the Medium Access Control (MAC) layer, but depending on radio channel conditions dropped packets result in a lower throughput. An alternative high data rate modulation scheme termed Dual Circular 32-QAM fits within the configuration of the current standard, increasing system throughput and thus maintaining a high throughput even with a moderate level of dropped packets.
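For the plain QPSK case (the DCM demapper in the chapter is more involved and is not reproduced here), LLR soft demapping with CSI entering through the per-subcarrier channel estimate can be sketched as follows; the function name, constellation normalization and values are assumptions:

```python
import numpy as np

def qpsk_llrs(y, h, noise_var):
    """Exact LLRs for Gray-mapped QPSK symbols (+-1 +-1j)/sqrt(2).

    y         -- received subcarrier samples (complex array)
    h         -- per-subcarrier channel estimates (complex array), acting as CSI weights
    noise_var -- complex noise variance per subcarrier
    """
    z = np.conj(h) * y                        # channel-matched samples
    scale = 2.0 * np.sqrt(2.0) / noise_var    # LLR scaling for unit-energy QPSK
    llr_b0 = scale * z.real                   # bit carried on the in-phase axis
    llr_b1 = scale * z.imag                   # bit carried on the quadrature axis
    return llr_b0, llr_b1

# hypothetical one-tap channels and two noisy QPSK symbols
h = np.array([0.9 + 0.2j, 0.4 - 0.1j])
s = np.array([1 + 1j, -1 + 1j]) / np.sqrt(2)
y = h * s + 0.05 * (np.random.randn(2) + 1j * np.random.randn(2))
print(qpsk_llrs(y, h, noise_var=0.005))
```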
Abstract:
This article introduces generalized beta-generated (GBG) distributions. Sub-models include all classical beta-generated, Kumaraswamy-generated and exponentiated distributions. They are maximum entropy distributions under three intuitive conditions, which show that the classical beta generator skewness parameters only control tail entropy and an additional shape parameter is needed to add entropy to the centre of the parent distribution. This parameter controls skewness without necessarily differentiating tail weights. The GBG class also has tractable properties: we present various expansions for moments, generating function and quantiles. The model parameters are estimated by maximum likelihood and the usefulness of the new class is illustrated by means of some real data sets.
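One common way to write the GBG density (this parameterisation is assumed here, not quoted from the article) for a parent cdf G with density g is:

```latex
f(x) = \frac{c}{B(a, b)}\, g(x)\, G(x)^{ac - 1} \left[1 - G(x)^{c}\right]^{b - 1},
\qquad a, b, c > 0,
```

which reduces to the classical beta-generated family for c = 1, to the Kumaraswamy-generated family for a = 1, and to the exponentiated-G family for b = c = 1.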
Abstract:
This paper explores the sensitivity of Atmospheric General Circulation Model (AGCM) simulations to changes in the meridional distribution of sea surface temperature (SST). The simulations are for an aqua-planet, a water-covered Earth with no land, orography or sea-ice and with specified zonally symmetric SST. Simulations from 14 AGCMs developed for Numerical Weather Prediction and climate applications are compared. Four experiments are performed to study the sensitivity to the meridional SST profile. These profiles range from one in which the SST gradient continues to the equator to one which is flat approaching the equator, all with the same maximum SST at the equator. The zonal mean circulation of all models shows strong sensitivity to the latitudinal distribution of SST. The Hadley circulation weakens and shifts poleward as the SST profile flattens in the tropics. One question of interest is the formation of a double versus a single ITCZ. There is large variation between models in the strength of the ITCZ and in where in the SST experiment sequence they transition from a single to a double ITCZ. The SST profiles are defined such that as the equatorial SST gradient flattens, the maximum gradient increases and moves poleward. This leads to a weakening of the mid-latitude jet accompanied by a poleward shift of the jet core. Also considered are tropical wave activity and tropical precipitation frequency distributions. The details of each vary greatly between models, both with a given SST and in the response to the change in SST. One additional experiment is included to examine the sensitivity to an off-equatorial SST maximum. The upward branch of the Hadley circulation follows the SST maximum off the equator. The models that form a single precipitation maximum when the maximum SST is on the equator shift the precipitation maximum off the equator and keep it centered over the SST maximum. Those that form a double structure with a minimum over the equatorial SST maximum shift the double structure off the equator, keeping the minimum over the SST maximum. In both situations only modest changes appear in the shifted profile of zonal average precipitation. When the upward branch of the Hadley circulation moves into the hemisphere with the SST maximum, the zonal average zonal, meridional and vertical winds all indicate that the Hadley cell in the other hemisphere dominates.
Abstract:
The weak-constraint inverse for nonlinear dynamical models is discussed and derived in terms of a probabilistic formulation. The well-known result that for Gaussian error statistics the minimum of the weak-constraint inverse is equal to the maximum-likelihood estimate is rederived. Then several methods based on ensemble statistics that can be used to find the smoother (as opposed to the filter) solution are introduced and compared to traditional methods. A strong point of the new methods is that they avoid the integration of adjoint equations, which is a complex task for real oceanographic or atmospheric applications. They also avoid iterative searches in a Hilbert space, and error estimates can be obtained without much additional computational effort. The feasibility of the new methods is illustrated in a two-layer quasigeostrophic model.
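In assumed notation, the Gaussian-error result mentioned above corresponds to the equivalence between minimising a weak-constraint penalty functional and maximising the likelihood:

```latex
\mathcal{J}[\mathbf{x}] =
(\mathbf{x}_0 - \mathbf{x}_b)^{\top}\mathbf{B}^{-1}(\mathbf{x}_0 - \mathbf{x}_b)
+ \sum_{k} \bigl(\mathbf{x}_k - M(\mathbf{x}_{k-1})\bigr)^{\top}\mathbf{Q}^{-1}\bigl(\mathbf{x}_k - M(\mathbf{x}_{k-1})\bigr)
+ \sum_{k} \bigl(\mathbf{y}_k - H(\mathbf{x}_k)\bigr)^{\top}\mathbf{R}^{-1}\bigl(\mathbf{y}_k - H(\mathbf{x}_k)\bigr),
```

where M is the (imperfect) model, H the observation operator, and B, Q and R the background, model-error and observation-error covariances; with Gaussian statistics the posterior density is proportional to exp(-J/2), so the minimiser of J is the maximum-likelihood (smoother) estimate.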
Abstract:
The origin of tropical forest diversity has been hotly debated for decades. Although specific mechanisms vary, many such explanations propose some vicariance in the distribution of species during glacial cycles and several have been supported by genetic evidence in Neotropical taxa. However, no consensus exists with regard to the extent or time frame of the vicariance events. Here, we analyse the cytochrome oxidase II mitochondrial gene of 250 Sabethes albiprivus B mosquitoes sampled from western Sao Paulo in Brazil. There was very low population structuring among collection sites (Φ_ST = 0.03, P = 0.04). Historic demographic analyses and the contemporary geographic distribution of genetic diversity suggest that the populations sampled are not at demographic equilibrium. Three distinct mitochondrial clades were observed in the samples, one of which differed significantly in its geographic distribution relative to the other two within a small sampling area (approximately 70 x 35 km). This fact, supported by the inability of maximum likelihood analyses to achieve adequate fits to simple models for the population demography of the species, suggests a more complex history, possibly involving disjunct forest refugia. This hypothesis is supported by a genetic signal of recent population growth, which is expected if population sizes of this forest-obligate insect increased during the forest expansions that followed glacial periods. Although a time frame cannot be reliably inferred for the vicariance event leading to the three genetic clades, molecular clock estimates place this at approximately 1 Myr before present.
Abstract:
This work is an assessment of the frequency of extreme values (EVs) of daily rainfall in the city of Sao Paulo, Brazil, over the period 1933-2005, based on the peaks-over-threshold (POT) and Generalized Pareto Distribution (GPD) approach. Usually, a GPD model is fitted to a sample of POT values selected with a constant threshold. However, in this work we use time-dependent thresholds, composed of relatively large p quantiles (for example p of 0.97) of daily rainfall amounts computed from all available data. Samples of POT values were extracted with several values of p. Four different GPD models (GPD-1, GPD-2, GPD-3, and GPD-4) were fitted to each one of these samples by the maximum likelihood (ML) method. The shape parameter was assumed constant for the four models, but time-varying covariates were incorporated into the scale parameter of GPD-2, GPD-3, and GPD-4, describing an annual cycle in GPD-2, a linear trend in GPD-3, and both annual cycle and linear trend in GPD-4. GPD-1, with constant scale and shape parameters, is the simplest model. For identification of the best model among the four we used the rescaled Akaike Information Criterion (AIC) with second-order bias correction. This criterion isolates GPD-3 as the best model, i.e. the one with a positive linear trend in the scale parameter. The slope of this trend is significant compared to the null hypothesis of no trend, at about the 98% confidence level. The non-parametric Mann-Kendall test also showed the presence of a positive trend in the annual frequency of excesses over high thresholds, with the p-value being virtually zero. Therefore, there is strong evidence that high quantiles of daily rainfall in the city of Sao Paulo have been increasing in magnitude and frequency over time. For example, the 0.99 quantile of daily rainfall amount has increased by about 40 mm between 1933 and 2005. Copyright (C) 2008 Royal Meteorological Society
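A hedged sketch of the GPD family described above, with a log-linear scale parameter carrying the annual-cycle and trend covariates (the exact parameterisation used in the paper is not quoted here):

```latex
\Pr(X_t > u_t + x \mid X_t > u_t) = \left(1 + \xi\,\frac{x}{\sigma_t}\right)^{-1/\xi},
\qquad
\log \sigma_t = \beta_0
+ \beta_1 \sin\!\left(\frac{2\pi d_t}{365.25}\right)
+ \beta_2 \cos\!\left(\frac{2\pi d_t}{365.25}\right)
+ \beta_3\, t,
```

where u_t is the (possibly time-dependent) threshold given by the p quantile of daily amounts, d_t is the day of year and t the time index; GPD-1 keeps only beta_0, GPD-2 adds the harmonic terms, GPD-3 adds only the linear trend, and GPD-4 uses all four terms, with the shape parameter xi held constant throughout.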