93 resultados para grouping estimators


Relevância:

10.00% 10.00%

Publicador:

Resumo:

We agree with Duckrow and Albano [Phys. Rev. E 67, 063901 (2003)] and Quian Quiroga [Phys. Rev. E 67, 063902 (2003)] that mutual information (MI) is a useful measure of dependence for electroencephalogram (EEG) data, but we show that the improvement seen in the performance of MI on extracting dependence trends from EEG is more dependent on the type of MI estimator rather than any embedding technique used. In an independent study we conducted in search for an optimal MI estimator, and in particular for EEG applications, we examined the performance of a number of MI estimators on the data set used by Quian Quiroga in their original study, where the performance of different dependence measures on real data was investigated [Phys. Rev. E 65, 041903 (2002)]. We show that for EEG applications the best performance among the investigated estimators is achieved by k-nearest neighbors, which supports the conjecture by Quian Quiroga in Phys. Rev. E 67, 063902 (2003) that the nearest neighbor estimator is the most precise method for estimating MI.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Population size estimation with discrete or nonparametric mixture models is considered, and reliable ways of construction of the nonparametric mixture model estimator are reviewed and set into perspective. Construction of the maximum likelihood estimator of the mixing distribution is done for any number of components up to the global nonparametric maximum likelihood bound using the EM algorithm. In addition, the estimators of Chao and Zelterman are considered with some generalisations of Zelterman’s estimator. All computations are done with CAMCR, a special software developed for population size estimation with mixture models. Several examples and data sets are discussed and the estimators illustrated. Problems using the mixture model-based estimators are highlighted.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The experiment asks whether constancy in hearing precedes or follows grouping. Listeners heard speech-like sounds comprising 8 auditory-filter shaped noise-bands that had temporal envelopes corresponding to those arising in these filters when a speech message is played. The „context‟ words in the message were “next you‟ll get _to click on”, into which a “sir” or “stir” test word was inserted. These test words were from an 11-step continuum that was formed by amplitude modulation. Listeners identified the test words appropriately and quite consistently, even though they had the „robotic‟ quality typical of this type of 8-band speech. The speech-like effects of these sounds appears to be a consequence of auditory grouping. Constancy was assessed by comparing the influence of room reflections on the test word across conditions where the context had either the same level of reflections, or where it had a much lower level. Constancy effects were obtained with these 8-band sounds, but only in „matched‟ conditions, where the room reflections were in the same bands in both the context and the test word. This was not the case in a comparison „mismatched‟ condition, and here, no constancy effects were found. It would appear that this type of constancy in hearing precedes the across-channel grouping whose effects are so apparent in these sounds. This result is discussed in terms of the ubiquity of grouping across different levels of representation.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper derives an efficient algorithm for constructing sparse kernel density (SKD) estimates. The algorithm first selects a very small subset of significant kernels using an orthogonal forward regression (OFR) procedure based on the D-optimality experimental design criterion. The weights of the resulting sparse kernel model are then calculated using a modified multiplicative nonnegative quadratic programming algorithm. Unlike most of the SKD estimators, the proposed D-optimality regression approach is an unsupervised construction algorithm and it does not require an empirical desired response for the kernel selection task. The strength of the D-optimality OFR is owing to the fact that the algorithm automatically selects a small subset of the most significant kernels related to the largest eigenvalues of the kernel design matrix, which counts for the most energy of the kernel training data, and this also guarantees the most accurate kernel weight estimate. The proposed method is also computationally attractive, in comparison with many existing SKD construction algorithms. Extensive numerical investigation demonstrates the ability of this regression-based approach to efficiently construct a very sparse kernel density estimate with excellent test accuracy, and our results show that the proposed method compares favourably with other existing sparse methods, in terms of test accuracy, model sparsity and complexity, for constructing kernel density estimates.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

A new Bayesian algorithm for retrieving surface rain rate from Tropical Rainfall Measuring Mission (TRMM) Microwave Imager (TMI) over the ocean is presented, along with validations against estimates from the TRMM Precipitation Radar (PR). The Bayesian approach offers a rigorous basis for optimally combining multichannel observations with prior knowledge. While other rain-rate algorithms have been published that are based at least partly on Bayesian reasoning, this is believed to be the first self-contained algorithm that fully exploits Bayes’s theorem to yield not just a single rain rate, but rather a continuous posterior probability distribution of rain rate. To advance the understanding of theoretical benefits of the Bayesian approach, sensitivity analyses have been conducted based on two synthetic datasets for which the “true” conditional and prior distribution are known. Results demonstrate that even when the prior and conditional likelihoods are specified perfectly, biased retrievals may occur at high rain rates. This bias is not the result of a defect of the Bayesian formalism, but rather represents the expected outcome when the physical constraint imposed by the radiometric observations is weak owing to saturation effects. It is also suggested that both the choice of the estimators and the prior information are crucial to the retrieval. In addition, the performance of the Bayesian algorithm herein is found to be comparable to that of other benchmark algorithms in real-world applications, while having the additional advantage of providing a complete continuous posterior probability distribution of surface rain rate.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

This paper proposes a three-shot improvement scheme for the hard-decision based method (HDM), an implementation solution for linear decorrelating detector (LDD) in asynchronous DS/CDMA systems. By taking advantage of the preceding (already reconstructed) bit and the matched filter output for the following two bits, the coupling between temporally adjacent bits (TABs), which always exists for asynchronous systems, is greatly suppressed and the performance of the original HDM is substantially improved. This new scheme requires no signaling overhead yet offers nearly the same performance as those more complicated methods. Also, it can easily accommodate the change in the number of active users in the channel, as no symbol/bit grouping is involved. Finally, the influence of synchronisation errors is investigated.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Automatic keyword or keyphrase extraction is concerned with assigning keyphrases to documents based on words from within the document. Previous studies have shown that in a significant number of cases author-supplied keywords are not appropriate for the document to which they are attached. This can either be because they represent what the author believes the paper is about not what it actually is, or because they include keyphrases which are more classificatory than explanatory e.g., “University of Poppleton” instead of “Knowledge Discovery in Databases”. Thus, there is a need for a system that can generate appropriate and diverse range of keyphrases that reflect the document. This paper proposes a solution that examines the synonyms of words and phrases in the document to find the underlying themes, and presents these as appropriate keyphrases. The primary method explores taking n-grams of the source document phrases, and examining the synonyms of these, while the secondary considers grouping outputs by their synonyms. The experiments undertaken show the primary method produces good results and that the secondary method produces both good results and potential for future work.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Two so-called “integrated” polarimetric rate estimation techniques, ZPHI (Testud et al., 2000) and ZZDR (Illingworth and Thompson, 2005), are evaluated using 12 episodes of the year 2005 observed by the French C-band operational Trappes radar, located near Paris. The term “integrated” means that the concentration parameter of the drop size distribution is assumed to be constant over some area and the algorithms retrieve it using the polarimetric variables in that area. The evaluation is carried out in ideal conditions (no partial beam blocking, no ground-clutter contamination, no bright band contamination, a posteriori calibration of the radar variables ZH and ZDR) using hourly rain gauges located at distances less than 60 km from the radar. Also included in the comparison, for the sake of benchmarking, is a conventional Z = 282R1.66 estimator, with and without attenuation correction and with and without adjustment by rain gauges as currently done operationally at Météo France. Under those ideal conditions, the two polarimetric algorithms, which rely solely on radar data, appear to perform as well if not better, pending on the measurements conditions (attenuation, rain rates, …), than the conventional algorithms, even when the latter take into account rain gauges through the adjustment scheme. ZZDR with attenuation correction is the best estimator for hourly rain gauge accumulations lower than 5 mm h−1 and ZPHI is the best one above that threshold. A perturbation analysis has been conducted to assess the sensitivity of the various estimators with respect to biases on ZH and ZDR, taking into account the typical accuracy and stability that can be reasonably achieved with modern operational radars these days (1 dB on ZH and 0.2 dB on ZDR). A +1 dB positive bias on ZH (radar too hot) results in a +14% overestimation of the rain rate with the conventional estimator used in this study (Z = 282R^1.66), a -19% underestimation with ZPHI and a +23% overestimation with ZZDR. Additionally, a +0.2 dB positive bias on ZDR results in a typical rain rate under- estimation of 15% by ZZDR.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Statistical graphics are a fundamental, yet often overlooked, set of components in the repertoire of data analytic tools. Graphs are quick and efficient, yet simple instruments of preliminary exploration of a dataset to understand its structure and to provide insight into influential aspects of inference such as departures from assumptions and latent patterns. In this paper, we present and assess a graphical device for choosing a method for estimating population size in capture-recapture studies of closed populations. The basic concept is derived from a homogeneous Poisson distribution where the ratios of neighboring Poisson probabilities multiplied by the value of the larger neighbor count are constant. This property extends to the zero-truncated Poisson distribution which is of fundamental importance in capture–recapture studies. In practice however, this distributional property is often violated. The graphical device developed here, the ratio plot, can be used for assessing specific departures from a Poisson distribution. For example, simple contaminations of an otherwise homogeneous Poisson model can be easily detected and a robust estimator for the population size can be suggested. Several robust estimators are developed and a simulation study is provided to give some guidance on which should be used in practice. More systematic departures can also easily be detected using the ratio plot. In this paper, the focus is on Gamma mixtures of the Poisson distribution which leads to a linear pattern (called structured heterogeneity) in the ratio plot. More generally, the paper shows that the ratio plot is monotone for arbitrary mixtures of power series densities.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

The study explores the uptake of livestock vaccination among poor farming communities in Tamil Nadu State, India by revisiting innovation diffusion theory. Overall, 601 farmers participated in the study. We found the adoption of particular vaccines was strongly influenced by socio-cultural grouping i.e. caste, rather than other factors such as income, age, education-level or gender. Adoption was also related to specific knowledge frames regarding disease causality, rather than any wider ethno-veterinary beliefs. Thus, the adoption of livestock vaccination is unlikely to improve without knowledge transfer activities, which acknowledge both social divisions and local epistemologies regarding animal health. Copyright © 2010 John Wiley & Sons, Ltd.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Three experiments measured constancy in speech perception, using natural-speech messages or noise-band vocoder versions of them. The eight vocoder-bands had equally log-spaced center-frequencies and the shapes of corresponding “auditory” filters. Consequently, the bands had the temporal envelopes that arise in these auditory filters when the speech is played. The “sir” or “stir” test-words were distinguished by degrees of amplitude modulation, and played in the context; “next you’ll get _ to click on.” Listeners identified test-words appropriately, even in the vocoder conditions where the speech had a “noise-like” quality. Constancy was assessed by comparing the identification of test-words with low or high levels of room reflections across conditions where the context had either a low or a high level of reflections. Constancy was obtained with both the natural and the vocoded speech, indicating that the effect arises through temporal-envelope processing. Two further experiments assessed perceptual weighting of the different bands, both in the test word and in the context. The resulting weighting functions both increase monotonically with frequency, following the spectral characteristics of the test-word’s [s]. It is suggested that these two weighting functions are similar because they both come about through the perceptual grouping of the test-word’s bands.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Automatic keyword or keyphrase extraction is concerned with assigning keyphrases to documents based on words from within the document. Previous studies have shown that in a significant number of cases author-supplied keywords are not appropriate for the document to which they are attached. This can either be because they represent what the author believes a paper is about not what it actually is, or because they include keyphrases which are more classificatory than explanatory e.g., “University of Poppleton” instead of “Knowledge Discovery in Databases”. Thus, there is a need for a system that can generate an appropriate and diverse range of keyphrases that reflect the document. This paper proposes two possible solutions that examine the synonyms of words and phrases in the document to find the underlying themes, and presents these as appropriate keyphrases. Using three different freely available thesauri, the work undertaken examines two different methods of producing keywords and compares the outcomes across multiple strands in the timeline. The primary method explores taking n-grams of the source document phrases, and examining the synonyms of these, while the secondary considers grouping outputs by their synonyms. The experiments undertaken show the primary method produces good results and that the secondary method produces both good results and potential for future work. In addition, the different qualities of the thesauri are examined and it is concluded that the more entries in a thesaurus, the better it is likely to perform. The age of the thesaurus or the size of each entry does not correlate to performance.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

References (20)Cited By (1)Export CitationAboutAbstract Proper scoring rules provide a useful means to evaluate probabilistic forecasts. Independent from scoring rules, it has been argued that reliability and resolution are desirable forecast attributes. The mathematical expectation value of the score allows for a decomposition into reliability and resolution related terms, demonstrating a relationship between scoring rules and reliability/resolution. A similar decomposition holds for the empirical (i.e. sample average) score over an archive of forecast–observation pairs. This empirical decomposition though provides a too optimistic estimate of the potential score (i.e. the optimum score which could be obtained through recalibration), showing that a forecast assessment based solely on the empirical resolution and reliability terms will be misleading. The differences between the theoretical and empirical decomposition are investigated, and specific recommendations are given how to obtain better estimators of reliability and resolution in the case of the Brier and Ignorance scoring rule.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Sampling strategies for monitoring the status and trends in wildlife populations are often determined before the first survey is undertaken. However, there may be little information about the distribution of the population and so the sample design may be inefficient. Through time, as data are collected, more information about the distribution of animals in the survey region is obtained but it can be difficult to incorporate this information in the survey design. This paper introduces a framework for monitoring motile wildlife populations within which the design of future surveys can be adapted using data from past surveys whilst ensuring consistency in design-based estimates of status and trends through time. In each survey, part of the sample is selected from the previous survey sample using simple random sampling. The rest is selected with inclusion probability proportional to predicted abundance. Abundance is predicted using a model constructed from previous survey data and covariates for the whole survey region. Unbiased design-based estimators of status and trends and their variances are derived from two-phase sampling theory. Simulations over the short and long-term indicate that in general more precise estimates of status and trends are obtained using this mixed strategy than a strategy in which all of the sample is retained or all selected with probability proportional to predicted abundance. Furthermore the mixed strategy is robust to poor predictions of abundance. Estimates of status are more precise than those obtained from a rotating panel design.

Relevância:

10.00% 10.00%

Publicador:

Resumo:

Seamless phase II/III clinical trials combine traditional phases II and III into a single trial that is conducted in two stages, with stage 1 used to answer phase II objectives such as treatment selection and stage 2 used for the confirmatory analysis, which is a phase III objective. Although seamless phase II/III clinical trials are efficient because the confirmatory analysis includes phase II data from stage 1, inference can pose statistical challenges. In this paper, we consider point estimation following seamless phase II/III clinical trials in which stage 1 is used to select the most effective experimental treatment and to decide if, compared with a control, the trial should stop at stage 1 for futility. If the trial is not stopped, then the phase III confirmatory part of the trial involves evaluation of the selected most effective experimental treatment and the control. We have developed two new estimators for the treatment difference between these two treatments with the aim of reducing bias conditional on the treatment selection made and on the fact that the trial continues to stage 2. We have demonstrated the properties of these estimators using simulations