977 results for Binomial theorem.
Abstract:
Two simple and frequently used capture–recapture estimates of the population size are compared: Chao's lower-bound estimate and Zelterman's estimate allowing for contaminated distributions. In the Poisson case it is shown that if there are only counts of ones and twos, Zelterman's estimator is always bounded above by Chao's estimator. If counts larger than two exist, Zelterman's estimator becomes larger than Chao's, provided that the ratio of the frequencies of counts of twos and ones is small enough. A similar analysis is provided for the binomial case. For a two-component mixture of Poisson distributions the asymptotic bias of both estimators is derived, and it is shown that the Zelterman estimator can experience large overestimation bias. A modified Zelterman estimator is suggested, and the bias-corrected version of Chao's estimator is also considered. All four estimators are compared in a simulation study.
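As an illustration of the comparison described in this abstract, the following minimal sketch computes both estimates in their commonly cited Poisson forms: Chao's lower bound n + f1^2 / (2 f2) and Zelterman's n / (1 - exp(-2 f2 / f1)). The function names and the frequencies f1 = 10, f2 = 5 are hypothetical and are not taken from the paper.

```python
import math

def chao_lower_bound(f1, f2, n):
    """Chao's lower-bound estimate of population size: n + f1^2 / (2 * f2)."""
    return n + f1 ** 2 / (2 * f2)

def zelterman(f1, f2, n):
    """Zelterman's estimate: n / (1 - exp(-lam)) with lam = 2 * f2 / f1."""
    lam = 2 * f2 / f1
    return n / (1 - math.exp(-lam))

# Hypothetical data with counts of ones and twos only (illustrative values).
f1, f2 = 10, 5
n = f1 + f2  # number of observed individuals

chao = chao_lower_bound(f1, f2, n)
zelt = zelterman(f1, f2, n)
print(chao, zelt)
```

With only ones and twos present, the Zelterman value in this example falls below the Chao value, in line with the bound stated in the abstract.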
Abstract:
In this paper we consider the estimation of population size from one-source capture–recapture data, that is, a list in which individuals can potentially be found repeatedly and where the question is how many individuals are missed by the list. As a typical example, we provide data from a drug user study in Bangkok from 2001 where the list consists of drug users who repeatedly contact treatment institutions. Drug users with 1, 2, 3, ... contacts occur, but drug users with zero contacts are not present, requiring the size of this group to be estimated. Statistically, these data can be considered as stemming from a zero-truncated count distribution. We revisit an estimator for the population size suggested by Zelterman that is known to be robust under potential unobserved heterogeneity. We demonstrate that the Zelterman estimator can be viewed as a maximum likelihood estimator for a locally truncated Poisson likelihood which is equivalent to a binomial likelihood. This result allows the extension of the Zelterman estimator by means of logistic regression to include observed heterogeneity in the form of covariates. We also review an estimator proposed by Chao and explain why we are not able to obtain similar results for this estimator. The Zelterman estimator is applied in two case studies, the first a drug user study from Bangkok, the second an illegal immigrant study in the Netherlands. Our results suggest the new estimator should be used, in particular, if substantial unobserved heterogeneity is present.
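The binomial view mentioned in this abstract can be sketched as follows, under the assumption of a Poisson count restricted to the values 1 and 2. In that case the probability of a count of 2 is p = lam / (2 + lam), the binomial maximum likelihood estimate is f2 / (f1 + f2), and solving for lam recovers 2 f2 / f1. The function name and the input values are hypothetical.

```python
def zelterman_rate_from_binomial(f1, f2):
    """Recover the Poisson rate from the binomial MLE on counts 1 and 2.

    For a Poisson count truncated to {1, 2}, P(count = 2) = lam / (2 + lam),
    so lam = 2 * p / (1 - p).
    """
    p_hat = f2 / (f1 + f2)          # binomial MLE of P(count = 2)
    return 2 * p_hat / (1 - p_hat)  # equals 2 * f2 / f1 algebraically

# Hypothetical frequencies of ones and twos (illustrative values only).
lam_hat = zelterman_rate_from_binomial(10, 5)
print(lam_hat)
```

Because p enters through a binomial likelihood, replacing the constant p with a logistic regression on covariates is the natural extension the abstract refers to; the sketch above covers only the covariate-free case.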
Abstract:
None of the current surveillance streams monitoring the presence of scrapie in Great Britain provide a comprehensive and unbiased estimate of the prevalence of the disease at the holding level. Previous work to estimate the under-ascertainment adjusted prevalence of scrapie in Great Britain applied multiple-list capture–recapture methods. The enforcement of new control measures on scrapie-affected holdings in 2004 has stopped the overlap between surveillance sources and, hence, the application of multiple-list capture–recapture models. Alternative methods, still within the capture–recapture methodology, relying on repeated entries in one single list have been suggested in these situations. In this article, we apply one-list capture–recapture approaches to data held on the Scrapie Notifications Database to estimate the undetected population of scrapie-affected holdings with clinical disease in Great Britain for the years 2002, 2003, and 2004. To do so, we develop a new diagnostic tool for indication of heterogeneity as well as a new understanding of the Zelterman and Chao's lower bound estimators to account for potential unobserved heterogeneity. We demonstrate that the Zelterman estimator can be viewed as a maximum likelihood estimator for a special, locally truncated Poisson likelihood equivalent to a binomial likelihood. This understanding allows the extension of the Zelterman approach by means of logistic regression to include observed heterogeneity in the form of covariates (in the case studied here, the holding size and country of origin). Our results confirm the presence of substantial unobserved heterogeneity, supporting the application of our two estimators. The total scrapie-affected holding population in Great Britain is around 300 holdings per year. None of the covariates appear to inform the model significantly.
Abstract:
This note considers variance estimation for population size estimators based on capture–recapture experiments. Whereas a diversity of estimators of the population size has been suggested, the question of estimating the associated variances is less frequently addressed. This note points out that the technique of conditioning can be applied here successfully, which also allows us to identify two sources of variation: the variance due to estimation of the model parameters and the binomial variance due to sampling n units from a population of size N. The technique is applied to estimators typically used in capture–recapture experiments in continuous time, including the estimators of Zelterman and Chao, and improves upon previously used variance estimators. In addition, knowledge of the variances associated with the estimators of Zelterman and Chao allows the suggestion of a new estimator as the weighted sum of the two. The decomposition of the variance into the two sources also allows a new understanding of how resampling techniques like the bootstrap could be used appropriately. Finally, the sample size question for capture–recapture experiments is addressed. Since the variance of population size estimators increases with the sample size, it is suggested to use relative measures such as the observed-to-hidden ratio or the completeness of identification proportion for approaching the question of sample size choice.
Abstract:
The problem of estimating the individual probabilities of a discrete distribution is considered. The true distribution of the independent observations is a mixture of a family of power series distributions. First, we ensure identifiability of the mixing distribution assuming mild conditions. Next, the mixing distribution is estimated by non-parametric maximum likelihood and an estimator for individual probabilities is obtained from the corresponding marginal mixture density. We establish asymptotic normality for the estimator of individual probabilities by showing that, under certain conditions, the difference between this estimator and the empirical proportions is asymptotically negligible. Our framework includes Poisson, negative binomial and logarithmic series as well as binomial mixture models. Simulations highlight the benefit in achieving normality when using the proposed marginal mixture density approach instead of the empirical one, especially for small sample sizes and/or when interest is in the tail areas. A real data example is given to illustrate the use of the methodology.
Abstract:
The contribution investigates the problem of estimating the size of a population, also known as the missing cases problem. Suppose a registration system aims to identify all cases having a certain characteristic such as a specific disease (cancer, heart disease, ...), a disease-related condition (HIV, heroin use, ...) or a specific behavior (driving a car without a license). Every case in such a registration system has a certain notification history in that it might have been identified several times (at least once), which can be understood as a particular capture-recapture situation. Typically, cases which have never been listed on any occasion are left out, and it is this frequency one wants to estimate. In this paper modelling concentrates on the counting distribution, i.e. the distribution of the variable that counts how often a given case has been identified by the registration system. Besides very simple models like the binomial or Poisson distribution, finite (nonparametric) mixtures of these are considered, providing rather flexible modelling tools. Estimation is done using maximum likelihood by means of the EM algorithm. A case study on heroin users in Bangkok in the year 2001 completes the contribution.
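To make the EM idea in this abstract concrete, here is a minimal sketch for the simplest model it mentions, a single homogeneous Poisson distribution with the zero class unobserved. It is not the mixture model of the paper, and the function name and count frequencies are hypothetical. The E-step imputes the expected number of never-listed cases; the M-step re-estimates the Poisson rate from the completed data.

```python
import math

def em_zero_truncated_poisson(freqs, iterations=200):
    """EM for a homogeneous Poisson model with unobserved zero counts.

    freqs maps an observed count (1, 2, 3, ...) to its frequency.
    Returns the estimated rate and the estimated population size.
    """
    n = sum(freqs.values())                       # observed cases
    total = sum(k * f for k, f in freqs.items())  # total identifications
    lam = total / n                               # starting value
    for _ in range(iterations):
        p0 = math.exp(-lam)
        f0 = n * p0 / (1 - p0)   # E-step: expected number of missed cases
        lam = total / (n + f0)   # M-step: Poisson MLE on completed data
    population = n / (1 - math.exp(-lam))
    return lam, population

# Hypothetical count frequencies (illustrative values only).
counts = {1: 10, 2: 5, 3: 2}
lam_hat, n_hat = em_zero_truncated_poisson(counts)
print(lam_hat, n_hat)
```

At convergence the rate satisfies lam / (1 - exp(-lam)) = mean of the observed counts, which is the likelihood equation of the zero-truncated Poisson. A finite mixture would add component weights and one rate per component to the same E-step and M-step pattern.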
Abstract:
The climate belongs to the class of non-equilibrium forced and dissipative systems, for which most results of quasi-equilibrium statistical mechanics, including the fluctuation-dissipation theorem, do not apply. In this paper we show for the first time how the Ruelle linear response theory, developed for studying rigorously the impact of perturbations on general observables of non-equilibrium statistical mechanical systems, can be applied with great success to analyze the climatic response to general forcings. The crucial value of the Ruelle theory lies in the fact that it allows one to compute the response of the system in terms of expectation values of explicit and computable functions of the phase space averaged over the invariant measure of the unperturbed state. We choose as a test bed a classical version of the Lorenz 96 model, which, in spite of its simplicity, has a well-recognized prototypical value as it is a spatially extended one-dimensional model and presents the basic ingredients of the actual atmosphere, such as dissipation, advection and the presence of an external forcing. We recapitulate the main aspects of the general response theory and propose some new general results. We then analyze the frequency dependence of the response of both local and global observables to perturbations having localized as well as global spatial patterns. We derive analytically several properties of the corresponding susceptibilities, such as asymptotic behavior, validity of Kramers-Kronig relations, and sum rules, whose main ingredient is the causality principle. We show that all the coefficients of the leading asymptotic expansions as well as the integral constraints can be written as linear functions of parameters that describe the unperturbed properties of the system, such as its average energy.
Some newly obtained empirical closure equations for such parameters allow us to define such properties as an explicit function of the unperturbed forcing parameter alone for a general class of chaotic Lorenz 96 models. We then verify the theoretical predictions from the outputs of the simulations up to a high degree of precision. The theory is used to explain differences in the response of local and global observables, to define the intensive properties of the system, which do not depend on the spatial resolution of the Lorenz 96 model, and to generalize the concept of climate sensitivity to all time scales. We also show how to reconstruct the linear Green function, which maps perturbations of general time patterns into changes in the expectation value of the considered observable for finite as well as infinite time. Finally, we propose a simple yet general methodology to study general climate change problems on virtually any time scale by resorting to only well-selected simulations, and by taking full advantage of ensemble methods. The specific case of globally averaged surface temperature response to a general pattern of change of the CO2 concentration is discussed. We believe that the proposed approach may constitute a mathematically rigorous and practically very effective way to approach the problem of climate sensitivity, climate prediction, and climate change from a radically new perspective.
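For readers unfamiliar with the test bed named in this abstract, the following sketch shows the standard Lorenz 96 equations, dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F on a periodic ring, with one fourth-order Runge-Kutta step. The grid size, forcing value, time step, and function names are illustrative assumptions and are not taken from the paper.

```python
def lorenz96_tendency(x, forcing):
    """Right-hand side of the Lorenz 96 model on a periodic ring."""
    n = len(x)
    # Negative indices wrap around in Python, giving periodic boundaries.
    return [(x[(i + 1) % n] - x[i - 2]) * x[i - 1] - x[i] + forcing
            for i in range(n)]

def rk4_step(x, forcing, dt):
    """Advance the state by one classical Runge-Kutta step."""
    def shifted(k, scale):
        return [xi + scale * ki for xi, ki in zip(x, k)]
    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency(shifted(k1, dt / 2), forcing)
    k3 = lorenz96_tendency(shifted(k2, dt / 2), forcing)
    k4 = lorenz96_tendency(shifted(k3, dt), forcing)
    return [xi + dt / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]

# Assumed setup: 40 grid points, forcing F = 8, one slightly perturbed point.
F = 8.0
state = [F] * 40
state[0] += 0.01
for _ in range(100):
    state = rk4_step(state, F, 0.01)

# Energy-like observable: sum of x_i^2 / 2.
energy = sum(v * v for v in state) / 2
print(energy)
```

The uniform state x_i = F is a fixed point of these equations, which is why the sketch perturbs one grid point before integrating.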
Abstract:
A new Bayesian algorithm for retrieving surface rain rate from Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI) over the ocean is presented, along with validations against estimates from the TRMM Precipitation Radar (PR). The Bayesian approach offers a rigorous basis for optimally combining multichannel observations with prior knowledge. While other rain-rate algorithms have been published that are based at least partly on Bayesian reasoning, this is believed to be the first self-contained algorithm that fully exploits Bayes’s theorem to yield not just a single rain rate, but rather a continuous posterior probability distribution of rain rate. To advance the understanding of theoretical benefits of the Bayesian approach, sensitivity analyses have been conducted based on two synthetic datasets for which the “true” conditional and prior distribution are known. Results demonstrate that even when the prior and conditional likelihoods are specified perfectly, biased retrievals may occur at high rain rates. This bias is not the result of a defect of the Bayesian formalism, but rather represents the expected outcome when the physical constraint imposed by the radiometric observations is weak owing to saturation effects. It is also suggested that both the choice of the estimators and the prior information are crucial to the retrieval. In addition, the performance of the Bayesian algorithm herein is found to be comparable to that of other benchmark algorithms in real-world applications, while having the additional advantage of providing a complete continuous posterior probability distribution of surface rain rate.
Abstract:
We consider problems of splitting and connectivity augmentation in hypergraphs. In a hypergraph G = (V + s, E), to split two edges su, sv is to replace them with a single edge uv. We are interested in doing this in such a way as to preserve a defined level of connectivity in V. The splitting technique is often used as a way of adding new edges into a graph or hypergraph, so as to augment the connectivity to some prescribed level. We begin by providing a short history of work done in this area. Then several preliminary results are given in a general form so that they may be used to tackle several problems. We then analyse the hypergraphs G = (V + s, E) for which there is no split preserving the local-edge-connectivity present in V. We provide two structural theorems, one of which implies a slight extension to Mader’s classical splitting theorem. We also provide a characterisation of the hypergraphs for which there is no such “good” split and a splitting result concerned with a specialisation of the local-connectivity function. We then use our splitting results to provide an upper bound on the smallest number of size-two edges we must add to any given hypergraph to ensure that in the resulting hypergraph we have λ(x, y) ≥ r(x, y) for all x, y in V, where r is an integer-valued, symmetric requirement function on V × V. This is the so-called “local-edge-connectivity augmentation problem” for hypergraphs. We also provide an extension to a theorem of Szigeti about augmenting to satisfy a requirement r, but using hyperedges. Next, in a result born of collaborative work with Zoltán Király from Budapest, we show that the local-connectivity augmentation problem is NP-complete for hypergraphs. Lastly we concern ourselves with an augmentation problem that includes a locational constraint.
The premise is that we are given a hypergraph H = (V, E) with a bipartition P = {P1, P2} of V and asked to augment it with size-two edges, so that the result is k-edge-connected and has no new edge contained in some P(i). We consider the splitting technique and describe the obstacles that prevent us from forming “good” splits. From this we deduce results about which hypergraphs have a complete Pk-split. This leads to a minimax result on the optimal number of edges required and a polynomial algorithm to provide an optimal augmentation.
Abstract:
A multivariable hyperstable robust adaptive decoupling control algorithm based on a neural network is presented for the control of nonlinear multivariable coupled systems with unknown parameters and structure. The Popov theorem is used in the design of the controller. The modelling errors, coupling action and other uncertainties of the system are identified on-line by a neural network. The identified results are taken as compensation signals such that the robust adaptive control of nonlinear systems is realised. Simulation results are given.
Abstract:
This paper presents the theoretical development of a nonlinear adaptive filter based on a concept of filtering by approximated densities (FAD). The most common procedures for nonlinear estimation apply the extended Kalman filter. As opposed to conventional techniques, the proposed recursive algorithm does not require any linearisation. The prediction uses a maximum entropy principle subject to constraints. Thus, the densities created are of an exponential type and depend on a finite number of parameters. The filtering yields recursive equations involving these parameters. The update applies Bayes' theorem. Through simulation on a generic exponential model, the proposed nonlinear filter is implemented and the results prove to be superior to those of the extended Kalman filter and a class of nonlinear filters based on partitioning algorithms.
Abstract:
We introduce the perspex machine which unifies projective geometry and Turing computation and results in a supra-Turing machine. We show two ways in which the perspex machine unifies symbolic and non-symbolic AI. Firstly, we describe concrete geometrical models that map perspexes onto neural networks, some of which perform only symbolic operations. Secondly, we describe an abstract continuum of perspex logics that includes both symbolic logics and a new class of continuous logics. We argue that an axiom in symbolic logic can be the conclusion of a perspex theorem. That is, the atoms of symbolic logic can be the conclusions of sub-atomic theorems. We argue that perspex space can be mapped onto the spacetime of the universe we inhabit. This allows us to discuss how a robot might be conscious, feel, and have free will in a deterministic, or semi-deterministic, universe. We ground the reality of our universe in existence. On a theistic point, we argue that preordination and free will are compatible. On a theological point, we argue that it is not heretical for us to give robots free will. Finally, we give a pragmatic warning as to the double-edged risks of creating robots that do, or alternatively do not, have free will.