966 results for overlap probability


Abstract:

Accurate determination of same-sex twin zygosity is important for medical, scientific and personal reasons. Determination may be based upon questionnaire data, blood group, enzyme isoforms and fetal membrane examination, but assignment of zygosity must ultimately be confirmed by genotypic data. Here methods are reviewed for calculating average probabilities of correctly concluding a twin pair is monozygotic, given they share the same genotypes across all loci for commonly utilized multiplex short tandem repeat (STR) kits.
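The probability described above can be sketched with Bayes' rule. This is a minimal illustration, not the reviewed methods themselves: it assumes Hardy-Weinberg proportions, independent loci, unrelated parents, no mutation or typing error, and a prior probability of monozygosity supplied by the user. The function names and the allele frequencies in the usage note are hypothetical.

```python
from math import prod


def dz_concordance(allele_freqs):
    """Probability that full siblings (e.g. DZ twins) share a genotype at one locus.

    Standard sibling match probability under Hardy-Weinberg proportions:
    (1 + 2*S2 + 2*S2**2 - S4) / 4, where S2 and S4 are the sums of the
    squared and fourth-power allele frequencies.
    """
    s2 = sum(p ** 2 for p in allele_freqs)
    s4 = sum(p ** 4 for p in allele_freqs)
    return (1 + 2 * s2 + 2 * s2 ** 2 - s4) / 4


def posterior_mz(loci, prior_mz=0.5):
    """Posterior probability of monozygosity given concordance at every locus.

    `loci` is a list of allele-frequency lists, one per locus (assumed
    independent). MZ twins are concordant with probability 1 by assumption.
    """
    likelihood_dz = prod(dz_concordance(freqs) for freqs in loci)
    return prior_mz / (prior_mz + (1 - prior_mz) * likelihood_dz)
```

For example, `posterior_mz([[0.5, 0.5]] * 10)` uses ten hypothetical biallelic loci; real STR loci have many more alleles and are correspondingly more informative per locus.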

Abstract:

Sampling design is critical to the quality of quantitative research, yet it does not always receive appropriate attention in nursing research. The current article details how balancing probability techniques with practical considerations produced a representative sample of Australian nursing homes (NHs). Budgetary, logistical, and statistical constraints were managed by excluding some NHs (e.g., those too difficult to access) from the sampling frame; a stratified, random sampling methodology yielded a final sample of 53 NHs from a population of 2,774. In testing the adequacy of representation of the study population, chi-square tests for goodness of fit generated nonsignificant results for distribution by distance from major city and type of organization. A significant result for state/territory was expected and was easily corrected for by the application of weights. The current article provides recommendations for conducting high-quality, probability-based samples and stresses the importance of testing the representativeness of achieved samples.
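The representativeness check and the weighting correction mentioned above can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, the weights are simple post-stratification ratios (the abstract does not say which weighting scheme was used), and any counts passed in are the reader's own, not the study's data.

```python
def chi_square_stat(sample_counts, population_shares):
    """Pearson goodness-of-fit statistic of sample counts against population shares."""
    n = sum(sample_counts)
    return sum(
        (observed - n * share) ** 2 / (n * share)
        for observed, share in zip(sample_counts, population_shares)
    )


def poststratification_weights(sample_counts, population_shares):
    """Weight per stratum = population share / sample share.

    Units in under-represented strata receive weights above 1.
    """
    n = sum(sample_counts)
    return [
        share / (observed / n)
        for observed, share in zip(sample_counts, population_shares)
    ]
```

The statistic is compared with a chi-square critical value on (number of strata minus 1) degrees of freedom, for example 5.991 at the 5% level with three strata.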

Abstract:

Whether a statistician wants to complement a probability model for observed data with a prior distribution and carry out fully probabilistic inference, or base the inference only on the likelihood function, may be a fundamental question in theory, but in practice it may well be of less importance if the likelihood contains much more information than the prior. Maximum likelihood inference can be justified as a Gaussian approximation at the posterior mode, using flat priors. However, in situations where parametric assumptions in standard statistical models would be too rigid, more flexible model formulation, combined with fully probabilistic inference, can be achieved using hierarchical Bayesian parametrization. This work includes five articles, all of which apply probability modeling to various problems involving incomplete observation. Three of the papers apply maximum likelihood estimation and two of them hierarchical Bayesian modeling. Because maximum likelihood may be presented as a special case of Bayesian inference, but not the other way round, in the introductory part of this work we present a framework for probability-based inference using only Bayesian concepts. We also re-derive some results presented in the original articles using the toolbox equipped herein, to show that they are also justifiable under this more general framework. Here the assumption of exchangeability and de Finetti's representation theorem are applied repeatedly for justifying the use of standard parametric probability models with conditionally independent likelihood contributions. It is argued that this same reasoning can be applied also under sampling from a finite population. The main emphasis here is on probability-based inference under incomplete observation due to study design. This is illustrated using a generic two-phase cohort sampling design as an example.
The alternative approaches presented for analysis of such a design are full likelihood, which utilizes all observed information, and conditional likelihood, which is restricted to a completely observed set, conditioning on the rule that generated that set. Conditional likelihood inference is also applied for a joint analysis of prevalence and incidence data, a situation subject to both left censoring and left truncation. Other topics covered are model uncertainty and causal inference using posterior predictive distributions. We formulate a non-parametric monotonic regression model for one or more covariates and a Bayesian estimation procedure, and apply the model in the context of optimal sequential treatment regimes, demonstrating that inference based on posterior predictive distributions is feasible also in this case.
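The statement that maximum likelihood corresponds to a Gaussian approximation at the posterior mode under a flat prior can be illustrated with a toy binomial model. This sketch is not taken from the thesis; the function name and the binomial setting are assumptions chosen for simplicity.

```python
def laplace_approx_binomial(k, n):
    """Gaussian (Laplace) approximation to the posterior of a binomial proportion.

    With a flat prior on p, the log posterior equals the log likelihood
    k*log(p) + (n-k)*log(1-p) up to a constant, so the posterior mode is
    the maximum likelihood estimate k/n. The approximating variance is the
    inverse of the negative second derivative at the mode.
    """
    if not 0 < k < n:
        raise ValueError("need 0 < k < n for an interior mode")
    mode = k / n
    curvature = k / mode ** 2 + (n - k) / (1 - mode) ** 2
    return mode, 1.0 / curvature
```

The returned variance equals p(1 - p)/n evaluated at the mode, which is the usual large-sample variance of the maximum likelihood estimate.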

Abstract:

Hydrologic impacts of climate change are usually assessed by downscaling the General Circulation Model (GCM) output of large-scale climate variables to local-scale hydrologic variables. Such an assessment is characterized by uncertainty resulting from the ensembles of projections generated with multiple GCMs, which is known as intermodel or GCM uncertainty. Ensemble averaging with the assignment of weights to GCMs based on model evaluation is one of the methods to address such uncertainty and is used in the present study for regional-scale impact assessment. GCM outputs of large-scale climate variables are downscaled to subdivisional-scale monsoon rainfall. Weights are assigned to the GCMs on the basis of model performance and model convergence, which are evaluated with the Cumulative Distribution Functions (CDFs) generated from the downscaled GCM output (for both 20th Century [20C3M] and future scenarios) and observed data. The ensemble averaging approach, with the assignment of weights to GCMs, is characterized by the uncertainty caused by partial ignorance, which stems from nonavailability of the outputs of some of the GCMs for a few scenarios (in the Intergovernmental Panel on Climate Change [IPCC] data distribution center for Assessment Report 4 [AR4]). This uncertainty is modeled with imprecise probability, i.e., the probability being represented as an interval gray number. Furthermore, the CDF generated with one GCM is entirely different from that with another and therefore the use of multiple GCMs results in a band of CDFs. Representing this band of CDFs with a single-valued weighted mean CDF may be misleading. Such a band of CDFs can only be represented with an envelope that contains all the CDFs generated with a number of GCMs. An imprecise CDF represents such an envelope, which not only contains the CDFs generated with all the available GCMs but also to an extent accounts for the uncertainty resulting from the missing GCM output.
This concept of imprecise probability is also validated in the present study. The imprecise CDFs of monsoon rainfall are derived for three 30-year time slices, 2020s, 2050s and 2080s, with A1B, A2 and B1 scenarios. The model is demonstrated with the prediction of monsoon rainfall in the Orissa meteorological subdivision, which shows a possible decreasing trend in the future.
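The idea of an envelope around a band of CDFs can be sketched as a pointwise minimum and maximum over empirical CDFs. This is a simplified illustration with hypothetical function names: it bounds only the CDFs of the models that are available, and does not reproduce the study's interval treatment of missing GCM output or its weighting of models.

```python
from bisect import bisect_right


def empirical_cdf(samples):
    """Return a function x -> fraction of samples less than or equal to x."""
    ordered = sorted(samples)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def cdf_envelope(model_samples, grid):
    """Lower and upper bounds of the band of CDFs, evaluated on `grid`.

    `model_samples` holds one list of values (e.g. downscaled rainfall)
    per model.
    """
    cdfs = [empirical_cdf(samples) for samples in model_samples]
    lower = [min(cdf(x) for cdf in cdfs) for x in grid]
    upper = [max(cdf(x) for cdf in cdfs) for x in grid]
    return lower, upper
```

Any weighted mean of the individual CDFs lies between the two returned curves, which is why the envelope conveys more about intermodel spread than a single averaged curve.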

Abstract:

Acute anterior uveitis (AAU) involves inflammation of the iris and ciliary body of the eye. It occurs both in isolation and as a complication of ankylosing spondylitis (AS). It is strongly associated with HLA-B*27, but previous studies have suggested that further genetic factors may confer additional risk. We sought to investigate this using the Illumina Exomechip microarray, to compare 1504 cases with AS and AAU, 1805 with AS but no AAU and 21 133 healthy controls. We also used a heterogeneity test to test the differences in effect size between AS with AAU and AS without AAU. In the analysis comparing AS+AAU+ cases versus controls, HLA-B*27 and HLA-A*02:01 were significantly associated with the presence of AAU (P < 10^-300 and P = 6 × 10^-8, respectively). Secondary independent associations with PSORS1C3 (P = 4.7 × 10^-5) and TAP2 (P = 1.1 × 10^-5) were observed in the major histocompatibility complex. There was a new suggestive association with a low-frequency variant at zinc-finger protein 154 (ZNF154) in the AS without AAU versus control analysis (P = 2.2 × 10^-6). Heterogeneity testing showed that rs30187 in ERAP1 has a larger effect on AAU compared with that in AS alone. These findings also suggest that variants in ERAP1 have a differential impact on the risk of AAU when compared with AS, and hence the genetic risk for AAU differs from AS.
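A heterogeneity test on effect sizes can take several forms, and the abstract does not say which one was used. One common form, shown here as an assumption, is a z-test on the difference between two effect estimates (for example log odds ratios). It treats the two estimates as independent, which is only approximate when both analyses share the same controls. The function name is hypothetical.

```python
from math import erfc, sqrt


def heterogeneity_z(beta1, se1, beta2, se2):
    """Two-sided z-test for a difference between two effect estimates.

    Assumes the estimates are independent and approximately normal.
    Returns the z statistic and the two-sided p-value.
    """
    z = (beta1 - beta2) / sqrt(se1 ** 2 + se2 ** 2)
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value
```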

Abstract:

We derive a very general expression for the survival probability and the first passage time distribution for a particle executing Brownian motion in full phase space with an absorbing boundary condition at a point in the position space, which is valid irrespective of the statistical nature of the dynamics. The expression, together with Jensen's inequality, naturally leads to a lower bound to the actual survival probability and an approximate first passage time distribution. These are expressed in terms of the position-position, velocity-velocity, and position-velocity variances. Knowledge of these variances enables one to compute a lower bound to the survival probability and consequently the first passage distribution function. As examples, we compute these for a Gaussian Markovian process and, in the case of a non-Markovian process, with an exponentially decaying friction kernel and also with a power law friction kernel. Our analysis shows that the survival probability decays exponentially at long times irrespective of the nature of the dynamics, with an exponent equal to the transition state rate constant.
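The link between the two quantities above is that the first passage time density is the negative time derivative of the survival probability. The sketch below applies that relation numerically to any survival function; it does not implement the paper's bound, and the function name and step size are assumptions.

```python
from math import exp


def first_passage_density(survival, t, h=1e-5):
    """First passage time density F(t) = -dS/dt by central difference.

    `survival` is any callable S(t) giving the probability that the
    particle has not yet been absorbed at time t.
    """
    return -(survival(t + h) - survival(t - h)) / (2 * h)
```

For the long-time form S(t) = exp(-k t) quoted in the abstract, this returns approximately k exp(-k t), so the same rate constant k governs both the survival probability and the first passage density.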

Abstract:

We consider the problem of detecting statistically significant sequential patterns in multineuronal spike trains. These patterns are characterized by ordered sequences of spikes from different neurons with specific delays between spikes. We have previously proposed a data-mining scheme to efficiently discover such patterns, which occur often enough in the data. Here we propose a method to determine the statistical significance of such repeating patterns. The novelty of our approach is that we use a compound null hypothesis that not only includes models of independent neurons but also models where neurons have weak dependencies. The strength of interaction among the neurons is represented in terms of certain pair-wise conditional probabilities. We specify our null hypothesis by putting an upper bound on all such conditional probabilities. We construct a probabilistic model that captures the counting process and use this to derive a test of significance for rejecting such a compound null hypothesis. The structure of our null hypothesis also allows us to rank-order different significant patterns. We illustrate the effectiveness of our approach using spike trains generated with a simulator.
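The logic of testing against a null hypothesis specified by an upper bound can be illustrated with a much simpler stand-in than the counting-process model the authors construct. The sketch assumes, hypothetically, that each of `n_trials` opportunities produces the pattern independently with probability at most `p_bound`. Because the upper tail of a binomial grows with the success probability, evaluating it at the bound gives a p-value that is valid for every model in the compound null.

```python
from math import comb


def upper_tail_pvalue(count, n_trials, p_bound):
    """P(X >= count) for X ~ Binomial(n_trials, p_bound).

    This is the largest tail probability over all success probabilities
    p <= p_bound, so it is conservative for the whole compound null.
    """
    return sum(
        comb(n_trials, k) * p_bound ** k * (1 - p_bound) ** (n_trials - k)
        for k in range(count, n_trials + 1)
    )
```

A smaller bound makes the null stricter, so the bound at which a pattern stops being significant offers one way to rank-order patterns by strength.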

Abstract:

A straightforward computation of the list of the words (the 'tail words' of the list) that are distributionally most similar to a given word (the 'head word' of the list) leads to the question: How semantically similar to the head word are the tail words; that is: how similar are their meanings to its meaning? And can we do better? The experiment was done on nearly 18,000 most frequent nouns in a Finnish newsgroup corpus. These nouns are considered to be distributionally similar to the extent that they occur in the same direct dependency relations with the same nouns, adjectives and verbs. The extent of the similarity of their computational representations is quantified with the information radius. The semantic classification of head-tail pairs is intuitive; some tail words seem to be semantically similar to the head word, some do not. Each such pair is also associated with a number of further distributional variables. Individually, their overlap for the semantic classes is large, but the trained classification-tree models have some success in using combinations to predict the semantic class. The training data consists of a random sample of 400 head-tail pairs with the tail word ranked among the 20 distributionally most similar to the head word, excluding names. The models are then tested on a random sample of another 100 such pairs. The best success rates range from 70% to 92% of the test pairs, where a success means that the model predicted my intuitive semantic class of the pair. This seems somewhat promising when distributional similarity is used to capture semantically similar words. This analysis also includes a general discussion of several different similarity formulas, arranged in three groups: those that apply to sets with graded membership, those that apply to the members of a vector space, and those that apply to probability mass functions.
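The information radius compares two probability mass functions through their average. The sketch below uses the form with a factor of one half on each term (the Jensen-Shannon divergence, in bits); some texts omit that factor, which doubles the value. The dictionary representation over dependency-relation features and the function name are assumptions for illustration.

```python
from math import log2


def information_radius(p, q):
    """Information radius of two pmfs given as dicts feature -> probability.

    Computed as 0.5 * D(p || m) + 0.5 * D(q || m) with m = (p + q) / 2,
    using base-2 logarithms. Zero-probability terms contribute nothing.
    """
    total = 0.0
    for key in set(p) | set(q):
        pk, qk = p.get(key, 0.0), q.get(key, 0.0)
        mk = 0.5 * (pk + qk)
        if pk > 0:
            total += 0.5 * pk * log2(pk / mk)
        if qk > 0:
            total += 0.5 * qk * log2(qk / mk)
    return total
```

With this convention the value is 0 for identical distributions and 1 for distributions with no shared features, so smaller values mean greater distributional similarity.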

Abstract:

The probability that a random process crosses an arbitrary level for the first time is expressed as a Gram-Charlier series, the leading term of which is the Poisson approximation. The coefficients of this series are related to the moments of the number of level crossings. The results are applicable to both stationary and non-stationary processes. Some numerical results are presented for the response process of a linear single-degree-of-freedom oscillator under Gaussian white noise excitation.
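The leading Poisson term can be sketched for the simplest case. The code assumes a stationary, zero-mean Gaussian process and a single upper barrier, and uses Rice's formula for the mean upcrossing rate; the higher-order Gram-Charlier corrections and the non-stationary case treated in the paper are not included. Function names are hypothetical.

```python
from math import exp, pi


def upcrossing_rate(level, sigma_x, sigma_v):
    """Mean rate of upcrossings of `level` (Rice's formula).

    sigma_x and sigma_v are the standard deviations of the process and of
    its time derivative for a stationary zero-mean Gaussian process.
    """
    return (sigma_v / (2 * pi * sigma_x)) * exp(-level ** 2 / (2 * sigma_x ** 2))


def first_passage_probability(rate, duration):
    """Poisson approximation: crossings are treated as independent rare events,
    so the probability of at least one crossing in `duration` is 1 - exp(-rate * duration).
    """
    return 1.0 - exp(-rate * duration)
```

The approximation is most accurate for high levels, where crossings are rare and nearly independent.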

Abstract:

We report numerical and analytic results for the spatial survival probability for fluctuating one-dimensional interfaces with Edwards-Wilkinson or Kardar-Parisi-Zhang dynamics in the steady state. Our numerical results are obtained from analysis of steady-state profiles generated by integrating a spatially discretized form of the Edwards-Wilkinson equation to long times. We show that the survival probability exhibits scaling behavior in its dependence on the system size and the "sampling interval" used in the measurement for both "steady-state" and "finite" initial conditions. Analytic results for the scaling functions are obtained from a path-integral treatment of a formulation of the problem in terms of one-dimensional Brownian motion. A "deterministic approximation" is used to obtain closed-form expressions for survival probabilities from the formally exact analytic treatment. The resulting approximate analytic results provide a fairly good description of the numerical data.
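The numerical procedure described above can be sketched with an explicit Euler step for a spatially discretized Edwards-Wilkinson equation on a periodic lattice with unit spacing. The parameter names, the Euler scheme, and the choice of the spatial mean as the reference level for survival are assumptions made here; the paper's exact discretization and measurement protocol are not given in the abstract.

```python
import random


def ew_step(h, nu, dt, noise, rng):
    """One Euler step of dh/dt = nu * d2h/dx2 + eta on a periodic lattice.

    `noise` is the noise strength D, with <eta eta> = 2 D delta(x) delta(t).
    Stability of the explicit scheme requires nu * dt <= 0.5.
    """
    n = len(h)
    amplitude = (2 * noise * dt) ** 0.5
    return [
        h[i]
        + dt * nu * (h[(i + 1) % n] + h[i - 1] - 2 * h[i])
        + amplitude * rng.gauss(0, 1)
        for i in range(n)
    ]


def spatial_survival(h, x):
    """Fraction of starting sites from which the profile stays on the same
    side of its spatial mean over the next x lattice sites (periodic)."""
    n = len(h)
    mean = sum(h) / n
    above = [value > mean for value in h]
    survived = sum(
        all(above[(i + j) % n] == above[i] for j in range(1, x + 1))
        for i in range(n)
    )
    return survived / n
```

A steady-state estimate would iterate `ew_step` from a flat profile for a long time and then average `spatial_survival` over many independent profiles.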

Abstract:

The probability distribution of the eigenvalues of a second-order stochastic boundary value problem is considered. The solution is characterized in terms of the zeros of an associated initial value problem. It is further shown that the probability distribution is related to the solution of a first-order nonlinear stochastic differential equation. Solutions of this equation based on the theory of Markov processes and also on the closure approximation are presented. A string with stochastic mass distribution is considered as an example for numerical work. The theoretical probability distribution functions are compared with digital simulation results. The comparison is found to be reasonably good.
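The connection between eigenvalues and an initial value problem can be illustrated with a shooting method. The sketch assumes a fixed-fixed string on [0, 1] governed by y'' + lambda * m(x) * y = 0, integrates from y(0) = 0, y'(0) = 1 with a classical Runge-Kutta scheme, and looks for values of lambda at which y(1) = 0. The equation form, function names, and step count are assumptions, not the paper's formulation.

```python
from math import pi


def end_value(lam, mass, steps=400):
    """Integrate y'' = -lam * mass(x) * y from x = 0 to 1 and return y(1)."""
    h = 1.0 / steps
    x, y, v = 0.0, 0.0, 1.0

    def deriv(x, y, v):
        return v, -lam * mass(x) * y

    for _ in range(steps):
        k1y, k1v = deriv(x, y, v)
        k2y, k2v = deriv(x + h / 2, y + h / 2 * k1y, v + h / 2 * k1v)
        k3y, k3v = deriv(x + h / 2, y + h / 2 * k2y, v + h / 2 * k2v)
        k4y, k4v = deriv(x + h, y + h * k3y, v + h * k3v)
        y += h / 6 * (k1y + 2 * k2y + 2 * k3y + k4y)
        v += h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        x += h
    return y


def eigenvalue_in(lo, hi, mass, tol=1e-8):
    """Bisection for a lambda in [lo, hi] at which y(1) changes sign."""
    f_lo = end_value(lo, mass)
    if f_lo * end_value(hi, mass) > 0:
        raise ValueError("y(1) does not change sign on the interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = end_value(mid, mass)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)
```

For a uniform string, `eigenvalue_in(5, 15, lambda x: 1.0)` recovers the first eigenvalue pi squared. Repeating the calculation over random realizations of `mass` gives a simulated eigenvalue distribution of the kind the abstract compares with theory.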

Abstract:

In this paper, a novel genetic algorithm is developed by generating artificial chromosomes with probability control to solve the machine scheduling problems. Generating artificial chromosomes for Genetic Algorithm (ACGA) is closely related to Evolutionary Algorithms Based on Probabilistic Models (EAPM). The artificial chromosomes are generated by a probability model that extracts the gene information from the current population. ACGA is considered a hybrid algorithm because both the conventional genetic operators and a probability model are integrated. The ACGA proposed in this paper further employs the "evaporation concept" applied in Ant Colony Optimization (ACO) to solve the permutation flowshop problem. The "evaporation concept" is used to reduce the effect of past experience and to explore new alternative solutions. In this paper, we propose three different methods for the probability of evaporation. This probability of evaporation is applied as soon as a job is assigned to a position in the permutation flowshop problem. Experimental results show that our ACGA with the evaporation concept gives better performance than some algorithms in the literature.
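The general mechanism can be sketched as a position-by-job count matrix from which new permutations are sampled. The evaporation rule shown here (scaling the chosen entry by a fixed factor right after assignment) is a hypothetical stand-in, since the abstract does not describe the three evaporation methods the paper proposes. All function and parameter names are assumptions.

```python
import random


def position_job_counts(population, n_jobs):
    """counts[pos][job] = number of chromosomes with `job` at position `pos`."""
    counts = [[0.0] * n_jobs for _ in range(n_jobs)]
    for chromosome in population:
        for pos, job in enumerate(chromosome):
            counts[pos][job] += 1.0
    return counts


def sample_artificial_chromosome(counts, rng, evaporation=0.1, eps=1e-6):
    """Sample a permutation position by position, proportionally to counts.

    After a job is assigned, its entry is reduced by the evaporation
    factor so that later samples rely less on past experience.
    Note that `counts` is modified in place.
    """
    n = len(counts)
    remaining = list(range(n))
    chromosome = []
    for pos in range(n):
        # eps keeps every unassigned job selectable even with a zero count
        weights = [counts[pos][job] + eps for job in remaining]
        job = rng.choices(remaining, weights=weights, k=1)[0]
        chromosome.append(job)
        remaining.remove(job)
        counts[pos][job] *= 1.0 - evaporation
    return chromosome
```

In a hybrid scheme of the kind described, such sampled chromosomes would be mixed into the population alongside offspring from the usual crossover and mutation operators.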

Abstract:

The phenomenological theory of hemispherical growth in the context of phase formation with more than one component is presented. The model discusses in a unified manner both instantaneous and progressive nucleation (at the substrate) as well as arbitrary growth rates (e.g. constant and diffusion controlled growth rates). A generalized version of the Avrami ansatz (a mean field description) is used to tackle the "overlap" aspects arising from the growing multicentres of the many components involved, observing that the nucleation is confined to the substrate plane only. The time evolution of the total extent of macrogrowth as well as those of the individual components are discussed explicitly for the case of two phases. The asymptotic expressions for macrogrowth are derived. Such analysis depicts a saturation limit (i.e. the maximum extent of growth possible) for the slower growing component and its dependence on the kinetic parameters which, in the electrochemical context, can be controlled through potential. The significance of this model in the context of multicomponent alloy deposition and possible future directions for further development are pointed out.
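The basic Avrami ansatz that the paper generalizes can be sketched for a single component growing as circles in the substrate plane at a constant radial rate. This is the textbook single-component case only; the multicomponent, hemispherical, and diffusion-controlled extensions discussed in the abstract are not reproduced. Function and parameter names are assumptions.

```python
from math import exp, pi


def avrami_coverage(extended):
    """Actual covered fraction from the extended (overlap-ignoring) fraction."""
    return 1.0 - exp(-extended)


def extended_area_instantaneous(t, n0, k):
    """Extended area when n0 nuclei per unit area all form at t = 0
    and each grows as a circle of radius k * t."""
    return pi * n0 * (k * t) ** 2


def extended_area_progressive(t, a, k):
    """Extended area for a constant nucleation rate `a` per unit area and time:
    the integral of a * pi * k**2 * (t - s)**2 over s from 0 to t."""
    return pi * a * k ** 2 * t ** 3 / 3
```

The exponential corrects for overlap between growing centres: the extended area grows without bound, while the actual coverage saturates at 1.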