104 results for Data Streams Distribution
Abstract:
This paper maps the carbonate geochemistry of the Makgadikgadi Pans region of northern Botswana from moderate-resolution (500 m pixels) remotely sensed data, to assess the impact of various geomorphological processes on surficial carbonate distribution. Previous palaeo-environmental studies have demonstrated that the pans have experienced several highstands during the Quaternary, forming calcretes around shoreline embayments. The pans are also a significant regional source of dust, and some workers have suggested that surficial carbonate distributions may be controlled, in part, by wind regime. Field studies of carbonate deposits in the region have also highlighted the importance of fluvial and groundwater processes in calcrete formation. However, due to the large area involved and problems of accessibility, the carbonate distribution across the entire Makgadikgadi basin remains poorly understood. The MODIS instrument permits mapping of carbonate distribution over large areas; comparison with estimates from Landsat Thematic Mapper data shows reasonable agreement, and there is good agreement with estimates from laboratory analysis of field samples. The results suggest that palaeo-lake highstands, reconstructed here using the SRTM 3 arc-second digital elevation model, have left behind surficial carbonate deposits, which can be mapped by the MODIS instrument. Copyright (c) 2006 John Wiley & Sons, Ltd.
Abstract:
Many time series are measured monthly, either as averages or totals, and such data often exhibit seasonal variability: the values of the series are consistently larger for some months of the year than for others. A typical series of this type is the number of deaths each month attributed to SIDS (Sudden Infant Death Syndrome). Seasonality can be modelled in a number of ways. This paper describes and discusses various methods for modelling seasonality in SIDS data, though much of the discussion is relevant to other seasonally varying data. There are two main approaches: fitting a circular probability distribution to the data, or using regression-based techniques to model the mean seasonal behaviour. Both are discussed in this paper.
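As an illustration of the regression-based approach mentioned in this abstract, the following is a minimal sketch of a single-harmonic (cosinor) fit to twelve monthly means. It is not taken from the paper: the function name `fit_cosinor`, the single-harmonic form and the assumption of one value per calendar month are all choices made here for illustration.

```python
import math

def fit_cosinor(monthly_means):
    """Least-squares fit of y_m = level + b*cos(2*pi*m/12) + c*sin(2*pi*m/12).

    monthly_means: 12 values, index 0 = January (hypothetical input layout).
    With one value per month the design is orthogonal, so the least-squares
    coefficients have the closed form used below.
    """
    if len(monthly_means) != 12:
        raise ValueError("expected exactly 12 monthly values")
    angles = [2 * math.pi * m / 12 for m in range(12)]
    level = sum(monthly_means) / 12
    b = sum(y * math.cos(a) for y, a in zip(monthly_means, angles)) / 6
    c = sum(y * math.sin(a) for y, a in zip(monthly_means, angles)) / 6
    amplitude = math.hypot(b, c)
    # Position of the seasonal peak, in months after the start of the series.
    peak_month = (math.atan2(c, b) * 12 / (2 * math.pi)) % 12
    return level, amplitude, peak_month

# Synthetic example: mean level 10, amplitude 3, peak at index 2 (March).
example = [10 + 3 * math.cos(2 * math.pi * (m - 2) / 12) for m in range(12)]
level, amplitude, peak_month = fit_cosinor(example)
```

The amplitude relative to the mean level gives a simple summary of how strong the seasonal effect is; real monthly counts would usually also need an adjustment for unequal month lengths, which this sketch omits.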
Abstract:
The images taken by the Heliospheric Imagers (HIs), part of the SECCHI imaging package onboard the pair of STEREO spacecraft, provide information on the radial and latitudinal evolution of the plasma compressed inside corotating interaction regions (CIRs). A plasma density wave imaged by the HI instrument onboard STEREO-B was found to propagate towards STEREO-A, enabling a comparison between simultaneous remote-sensing and in situ observations of its structure to be performed. In situ measurements made by STEREO-A show that the plasma density wave is associated with the passage of a CIR. The magnetic field compressed after the CIR stream interface (SI) is found to have a planar distribution. Minimum variance analysis of the magnetic field vectors shows that the SI is inclined at 54° to the orbital plane of the STEREO-A spacecraft. This inclination of the CIR SI is comparable to the inclination of the associated plasma density wave observed by HI. A small-scale magnetic cloud with a flux rope topology and radial extent of 0.08 AU is also embedded prior to the SI. The pitch-angle distribution of suprathermal electrons measured by the STEREO-A SWEA instrument shows that an open magnetic field topology in the cloud replaced the heliospheric current sheet locally. These observations confirm that HI observes CIRs in difference images when a small-scale transient is caught up in the compression region.
Abstract:
While over-dispersion in capture–recapture studies is well known to lead to poor estimation of population size, current diagnostic tools to detect the presence of heterogeneity have not been specifically developed for capture–recapture studies. To address this, a simple and efficient method of testing for over-dispersion in zero-truncated count data is developed and evaluated. The proposed method generalizes an over-dispersion test previously suggested for un-truncated count data and may also be used for testing residual over-dispersion in zero-inflation data. Simulations suggest that the asymptotic distribution of the test statistic is standard normal and that this approximation is also reasonable for small sample sizes. The method is also shown to be more efficient than an existing test for over-dispersion adapted for the capture–recapture setting. Studies with zero-truncated and zero-inflated count data are used to illustrate the test procedures.
Abstract:
This contribution investigates the problem of estimating the size of a population, also known as the missing cases problem. Suppose a registration system aims to identify all cases having a certain characteristic, such as a specific disease (cancer, heart disease, ...), a disease-related condition (HIV, heroin use, ...) or a specific behavior (driving a car without a license). Every case in such a registration system has a notification history, in that it may have been identified several times (at least once), which can be understood as a particular capture-recapture situation. Typically, cases that have never been listed on any occasion are left out, and it is the frequency of these cases that one wants to estimate. In this paper, modelling concentrates on the counting distribution, i.e. the distribution of the variable that counts how often a given case has been identified by the registration system. Besides very simple models such as the binomial or Poisson distribution, finite (nonparametric) mixtures of these are considered, providing rather flexible modelling tools. Estimation is done by maximum likelihood by means of the EM algorithm. A case study on heroin users in Bangkok in the year 2001 completes the contribution.
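To make the estimation idea concrete, here is a minimal sketch of the EM algorithm for the simplest model named in the abstract, a single homogeneous Poisson counting distribution with the zero class unobserved. It is an illustration under that assumption only, not the mixture-model procedure of the paper, and the name `em_truncated_poisson` and its parameters are hypothetical.

```python
import math

def em_truncated_poisson(counts, tol=1e-10, max_iter=10000):
    """EM for a Poisson rate when cases with zero identifications are missing.

    counts: number of identifications for each observed case (all >= 1).
    Returns (rate, estimated population size including unseen cases).
    """
    n = len(counts)
    total = sum(counts)
    if total <= n:
        raise ValueError("need at least one case identified more than once")
    lam = total / n  # starting value: mean of the observed counts
    for _ in range(max_iter):
        p0 = math.exp(-lam)
        n0 = n * p0 / (1.0 - p0)    # E-step: expected number of unseen cases
        new_lam = total / (n + n0)  # M-step: Poisson MLE on the completed data
        converged = abs(new_lam - lam) < tol
        lam = new_lam
        if converged:
            break
    p0 = math.exp(-lam)
    return lam, n / (1.0 - p0)

# Made-up frequencies: 50 cases seen once, 30 twice, 15 three times, 5 four times.
rate, population = em_truncated_poisson([1] * 50 + [2] * 30 + [3] * 15 + [4] * 5)
```

At convergence the rate satisfies the zero-truncated Poisson likelihood equation, in which the rate divided by one minus its zero probability equals the observed mean count. Finite mixtures follow the same E-step/M-step pattern with additional component weights.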
Abstract:
Population size estimation with discrete or nonparametric mixture models is considered, and reliable ways of construction of the nonparametric mixture model estimator are reviewed and set into perspective. Construction of the maximum likelihood estimator of the mixing distribution is done for any number of components up to the global nonparametric maximum likelihood bound using the EM algorithm. In addition, the estimators of Chao and Zelterman are considered with some generalisations of Zelterman’s estimator. All computations are done with CAMCR, a special software developed for population size estimation with mixture models. Several examples and data sets are discussed and the estimators illustrated. Problems using the mixture model-based estimators are highlighted.
Abstract:
Background: Molecular tools may help to uncover closely related and still diverging species from a wide variety of taxa and provide insight into the mechanisms, pace and geography of marine speciation. There is a certain controversy on the phylogeography and speciation modes of species-groups with an Eastern Atlantic-Western Indian Ocean distribution, with previous studies suggesting that older events (Miocene) and/or more recent (Pleistocene) oceanographic processes could have influenced the phylogeny of marine taxa. The spiny lobster genus Palinurus allows for testing among speciation hypotheses, since it has a particular distribution with two groups of three species each in the Northeastern Atlantic (P. elephas, P. mauritanicus and P. charlestoni) and Southeastern Atlantic and Southwestern Indian Oceans (P. gilchristi, P. delagoae and P. barbarae). In the present study, we obtain a more complete understanding of the phylogenetic relationships among these species through a combined dataset with both nuclear and mitochondrial markers, by testing alternative hypotheses on both the mutation rate and tree topology under the recently developed approximate Bayesian computation (ABC) methods. Results: Our analyses support a North-to-South speciation pattern in Palinurus with all the South-African species forming a monophyletic clade nested within the Northern Hemisphere species. Coalescent-based ABC methods allowed us to reject the previously proposed hypothesis of a Middle Miocene speciation event related with the closure of the Tethyan Seaway. Instead, divergence times obtained for Palinurus species using the combined mtDNA-microsatellite dataset and standard mutation rates for mtDNA agree with known glaciation-related processes occurring during the last 2 my. Conclusion: The Palinurus speciation pattern is a typical example of a series of rapid speciation events occurring within a group, with very short branches separating different species. 
Our results support the hypothesis that recent climate change-related oceanographic processes have influenced the phylogeny of marine taxa, with most Palinurus species originating during the last two million years. The present study highlights the value of new coalescent-based statistical methods such as ABC for testing different speciation hypotheses using molecular data.
Abstract:
Estimation of population size with missing zero-class is an important problem that is encountered in epidemiological assessment studies. Fitting a Poisson model to the observed data by the method of maximum likelihood and estimation of the population size based on this fit is an approach that has been widely used for this purpose. In practice, however, the Poisson assumption is seldom satisfied. Zelterman (1988) has proposed a robust estimator for unclustered data that works well in a wide class of distributions applicable for count data. In the work presented here, we extend this estimator to clustered data. The estimator requires fitting a zero-truncated homogeneous Poisson model by maximum likelihood and thereby using a Horvitz-Thompson estimator of population size. This was found to work well when the data follow the hypothesized homogeneous Poisson model. However, when the true distribution deviates from the hypothesized model, the population size was found to be underestimated. In search of a more robust estimator, we focused on three models that use all clusters with exactly one case, those clusters with exactly two cases and those with exactly three cases to estimate the probability of the zero-class and thereby use data collected on all the clusters in the Horvitz-Thompson estimator of population size. Loss in efficiency associated with gain in robustness was examined based on a simulation study. As a trade-off between gain in robustness and loss in efficiency, the model that uses data collected on clusters with at most three cases to estimate the probability of the zero-class was found to be preferred in general. In applications, we recommend obtaining estimates from all three models and making a choice considering the estimates from the three models, robustness and the loss in efficiency. (© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)
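For orientation, the unclustered Zelterman (1988) estimator that this abstract takes as its starting point can be sketched as follows. The Poisson rate is estimated from the frequencies of ones and twos only, and the resulting zero-class probability is plugged into a Horvitz-Thompson estimator. This sketch covers only the unclustered case, not the clustered extension developed in the paper, and the function name `zelterman_estimate` is hypothetical.

```python
import math

def zelterman_estimate(counts):
    """Zelterman-type population size estimate for unclustered count data.

    counts: observed positive counts, one per observed unit.
    """
    n = len(counts)
    f1 = sum(1 for c in counts if c == 1)  # units observed exactly once
    f2 = sum(1 for c in counts if c == 2)  # units observed exactly twice
    if f1 == 0 or f2 == 0:
        raise ValueError("estimator needs units with counts of one and of two")
    rate = 2.0 * f2 / f1   # rate estimate based on ones and twos only
    p0 = math.exp(-rate)   # implied probability of the unobserved zero class
    return n / (1.0 - p0)  # Horvitz-Thompson estimate of population size

# Made-up data: 60 units seen once, 30 seen twice, 10 seen three times.
estimate = zelterman_estimate([1] * 60 + [2] * 30 + [3] * 10)
```

Because only the two lowest frequencies enter the rate estimate, heterogeneity among units with larger counts affects the result less than it would in a full maximum likelihood fit, at some cost in efficiency.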
Abstract:
This paper presents a simple Bayesian approach to sample size determination in clinical trials. It is required that the trial should be large enough to ensure that the data collected will provide convincing evidence either that an experimental treatment is better than a control or that it fails to improve upon control by some clinically relevant difference. The method resembles standard frequentist formulations of the problem, and indeed in certain circumstances involving 'non-informative' prior information it leads to identical answers. In particular, unlike many Bayesian approaches to sample size determination, use is made of an alternative hypothesis that an experimental treatment is better than a control treatment by some specified magnitude. The approach is introduced in the context of testing whether a single stream of binary observations is consistent with a given success rate p(0). Next the case of comparing two independent streams of normally distributed responses is considered, first under the assumption that their common variance is known and then for unknown variance. Finally, the more general situation in which a large sample is to be collected and analysed according to the asymptotic properties of the score statistic is explored. Copyright (C) 2007 John Wiley & Sons, Ltd.
Abstract:
This investigation deals with the question of when a particular population can be considered to be disease-free. The motivation is the case of BSE where specific birth cohorts may present distinct disease-free subpopulations. The specific objective is to develop a statistical approach suitable for documenting freedom from disease, in particular, freedom from BSE in birth cohorts. The approach is based upon a geometric waiting time distribution for the occurrence of positive surveillance results and formalizes the relationship between design prevalence, cumulative sample size and statistical power. The simple geometric waiting time model is further modified to account for the diagnostic sensitivity and specificity associated with the detection of disease. This is exemplified for BSE using two different models for the diagnostic sensitivity. The model is furthermore modified in such a way that a set of different values for the design prevalence in the surveillance streams can be accommodated (prevalence heterogeneity) and a general expression for the power function is developed. For illustration, numerical results for BSE suggest that currently (data status September 2004) a birth cohort of Danish cattle born after March 1999 is free from BSE with probability (power) of 0.8746 or 0.8509, depending on the choice of a model for the diagnostic sensitivity.
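The relationship between design prevalence, cumulative sample size and power described in this abstract can be sketched in its simplest form as follows. The sketch assumes independent samples, perfect specificity and one constant sensitivity shared by all streams; these are simplifications made here, and the paper's sensitivity models are not reproduced. The function name `detection_power` and the input layout are hypothetical.

```python
def detection_power(streams, sensitivity=1.0):
    """Probability of at least one positive result under the design prevalence.

    streams: list of (design_prevalence, sample_size) pairs, one pair per
             surveillance stream, which allows prevalence heterogeneity.
    sensitivity: assumed probability that a diseased animal tests positive.
    """
    prob_all_negative = 1.0
    for prevalence, sample_size in streams:
        # Under a geometric waiting time model, each sample is independently
        # positive with probability prevalence * sensitivity.
        prob_all_negative *= (1.0 - prevalence * sensitivity) ** sample_size
    return 1.0 - prob_all_negative

# Made-up example: one stream, design prevalence 1%, 100 animals tested.
power_single = detection_power([(0.01, 100)])
# Made-up example: two streams with different prevalences, 90% sensitivity.
power_mixed = detection_power([(0.01, 100), (0.001, 1000)], sensitivity=0.9)
```

If every sample in a cohort tests negative and the power is high, the design prevalence becomes implausible, which is the sense in which a cohort is declared free from disease with a stated probability.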
Abstract:
A detailed investigation of spore release and dispersal from an isolated colony of Phascum cuspidatum Hedw. indicated that approximately 98% of the spores originally present remained within the colony. The spatial distribution of colonies of P. cuspidatum and Pottia truncata (Hedw.) Fürer. in relation to those of the previous year was investigated by mapping the occurrence of colonies in five permanent quadrats for each species during two successive years. Phascum cuspidatum reoccurred in three quadrats during the second year, and P. truncata in only one, in the latter case apparently due to invasion by other mosses, principally Barbula hornschuchiana Schultz. A substantial proportion of the second year colonies overlapped in position with the first year colonies, particularly in P. cuspidatum. The results are discussed in relation to data on spore dispersal and other aspects of the life-history of these annual or short-lived shuttle mosses.
Abstract:
The Cape Floristic Region is exceptionally species-rich both for its area and latitude, and this diversity is highly unevenly distributed among genera. The modern flora is hypothesized to result largely from recent (post-Oligocene) speciation, and it has long been speculated that particular species-poor lineages pre-date this burst of speciation. Here, we employ molecular phylogenetic data in combination with fossil calibrations to estimate the minimum duration of Cape occupation by 14 unrelated putative relicts. Estimates vary widely between lineages (7-101 Myr ago), and when compared with the estimated timing of onset of the modern flora's radiation, it is clear that many, but possibly not all, of these lineages pre-date its establishment. Statistical comparisons of diversities with lineage age show that low species diversity of many of the putative relicts results from a lower rate of diversification than in dated Cape radiations. In other putative relicts, however, we cannot reject the possibility that they diversify at the same underlying rate as the radiations, but have been present in the Cape for insufficient time to accumulate higher diversity. Although the extremes in diversity of currently dated Cape lineages fall outside expectations under a single underlying diversification rate, sampling of all Cape lineages would be required to reject this null hypothesis.
Abstract:
Heterogeneity in lifetime data may be modelled by multiplying an individual's hazard by an unobserved frailty. We test for the presence of frailty of this kind in univariate and bivariate data with Weibull distributed lifetimes, using statistics based on the ordered Cox-Snell residuals from the null model of no frailty. The form of the statistics is suggested by outlier testing in the gamma distribution. We find through simulation that the sum of the k largest or k smallest order statistics, for suitably chosen k, provides a powerful test when the frailty distribution is assumed to be gamma or positive stable, respectively. We provide recommended values of k for sample sizes up to 100 and simple formulae for estimated critical values for tests at the 5% level.
Abstract:
1. Jerdon's courser Rhinoptilus bitorquatus is a nocturnally active cursorial bird that is only known to occur in a small area of scrub jungle in Andhra Pradesh, India, and is listed as critically endangered by the IUCN. Information on its habitat requirements is needed urgently to underpin conservation measures. We quantified the habitat features that correlated with the use of different areas of scrub jungle by Jerdon's coursers, and developed a model to map potentially suitable habitat over large areas from satellite imagery and facilitate the design of surveys of Jerdon's courser distribution. 2. We used 11 arrays of 5-m long tracking strips consisting of smoothed fine soil to detect the footprints of Jerdon's coursers, and measured tracking rates (tracking events per strip night). We counted the number of bushes and trees, and described other attributes of vegetation and substrate in a 10-m square plot centred on each strip. We obtained reflectance data from Landsat 7 satellite imagery for the pixel within which each strip lay. 3. We used logistic regression models to describe the relationship between tracking rate by Jerdon's coursers and characteristics of the habitat around the strips, using ground-based survey data and satellite imagery. 4. Jerdon's coursers were most likely to occur where the density of large (>2 m tall) bushes was in the range 300-700 ha(-1) and where the density of smaller bushes was less than 1000 ha(-1). This habitat was detectable using satellite imagery. 5. Synthesis and applications. The occurrence of Jerdon's courser is strongly correlated with the density of bushes and trees, and is in turn affected by grazing with domestic livestock, woodcutting and mechanical clearance of bushes to create pasture, orchards and farmland. It is likely that there is an optimal level of grazing and woodcutting that would maintain or create suitable conditions for the species. 
Knowledge of the species' distribution is incomplete and there is considerable pressure from human use of apparently suitable habitats. Hence, distribution mapping is a high conservation priority. A two-step procedure is proposed, involving the use of ground surveys of bush density to calibrate satellite image-based mapping of potential habitat. These maps could then be used to select priority areas for Jerdon's courser surveys. The use of tracking strips to study habitat selection and distribution has potential in studies of other scarce and secretive species.