997 results for Bayesian aggregation


Relevance: 60.00%

Abstract:

Zero-inflated discrete and continuous models have a wide range of applications, and their properties are well known. Although there is work on zero-deflated and zero-modified discrete models, the usual formulation of zero-inflated continuous models, a mixture of a continuous density and a Dirac mass, prevents them from being generalized to cover zero-deflation. An alternative formulation of zero-inflated continuous models, which can easily be generalized to the zero-deflated case, is presented here. Estimation is first addressed under the classical paradigm, and several methods for obtaining the maximum likelihood estimators are proposed. The point estimation problem is also considered from the Bayesian point of view. Classical and Bayesian hypothesis tests for determining whether data are zero-inflated or zero-deflated are presented. The estimation and testing methods are also evaluated through simulation studies and applied to aggregated precipitation data. The various methods agree that the data are zero-deflated, demonstrating the relevance of the proposed model. We then consider the clustering of samples of zero-deflated data. Because such data are strongly non-normal, common methods for determining the number of clusters can be expected to perform poorly. We argue that Bayesian clustering, based on the marginal distribution of the observations, would account for the particular features of the model, which would translate into better performance. Several clustering methods are compared in a simulation study, and the proposed method is applied to aggregated precipitation data from 28 measurement stations in British Columbia.
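As a sketch of the usual formulation this abstract refers to, the mixture of a Dirac mass at zero and a continuous density can be written as follows. The symbols \pi, g and \delta_0 are notation assumed here for illustration; the abstract itself gives no formula.

```latex
f(x) = \pi\,\delta_0(x) + (1 - \pi)\,g(x), \qquad 0 \le \pi \le 1, \quad x \ge 0
```

Here g is a continuous density on the positive reals and \delta_0 is the Dirac mass at zero, so P(X = 0) = \pi. Because g alone already assigns probability zero to the point 0, no valid mixing weight can produce fewer zeros than g does, which is one way to see why this form cannot express zero-deflation.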

Relevance: 40.00%

Abstract:

We investigate whether relative contributions of genetic and shared environmental factors are associated with an increased risk of melanoma. Data from the Queensland Familial Melanoma Project comprising 15,907 subjects arising from 1,912 families were analyzed to estimate the additive genetic, common and unique environmental contributions to variation in the age at onset of melanoma. Two complementary approaches for analyzing correlated time-to-onset family data were considered: the generalized estimating equations (GEE) method, in which one can estimate relationship-specific dependence simultaneously with regression coefficients that describe the average population response to changing covariates; and a subject-specific Bayesian mixed model, in which heterogeneity in regression parameters is explicitly modeled and the different components of variation may be estimated directly. The proportional hazards and Weibull models were utilized, as both provide natural frameworks for estimating relative risks while adjusting for simultaneous effects of other covariates. A simple Markov chain Monte Carlo method for covariate imputation of missing data was used, and the implementation of the Bayesian model was based on Gibbs sampling using the freeware package BUGS. In addition, we used a Bayesian model to investigate the relative contribution of genetic and environmental effects on the expression of naevi and freckles, which are known risk factors for melanoma.
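To illustrate why a Weibull model yields relative risks directly, a standard textbook parameterization of the Weibull proportional hazards model is sketched below. The symbols \lambda, \gamma, \beta and x are assumed notation, and the abstract does not state which parameterization the authors used.

```latex
h(t \mid x) = \lambda\,\gamma\,t^{\gamma - 1} \exp\!\left(x^{\top}\beta\right),
\qquad
\frac{h(t \mid x_j + 1)}{h(t \mid x_j)} = \exp(\beta_j)
```

Under this form the hazard ratio for a one-unit increase in covariate x_j, holding the other covariates fixed, is \exp(\beta_j) at every age t, which is the sense in which the model adjusts for the simultaneous effects of other covariates.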

Relevance: 30.00%

Abstract:

We have developed a real-time imaging method for two-color wide-field fluorescence microscopy using a combined approach that integrates multi-spectral imaging and a Bayesian image reconstruction technique. To enable simultaneous observation of two dyes (primary and secondary), we exploit their spectral properties, which allow parallel recording in both channels. The key advantage of this technique is the use of a single wavelength of light to excite both the primary dye and the secondary dye. The primary and secondary dyes give rise to the fluorescence and bleed-through signals, respectively, which after normalization were merged to obtain two-color 3D images. To realize real-time imaging, we employed maximum likelihood (ML) and maximum a posteriori (MAP) techniques on a high-performance computing platform (GPU). The results show a two-fold improvement in contrast, while the signal-to-background ratio (SBR) is improved by a factor of 4. We report speed-ups by factors of 52 and 350 for 2D and 3D images, respectively. Using this system, we have studied real-time protein aggregation in yeast cells and HeLa cells, which exhibit a dot-like protein distribution. The proposed technique has the ability to temporally resolve rapidly occurring biological events.

Relevance: 30.00%

Abstract:

Background: An important challenge for transcript counting methods such as Serial Analysis of Gene Expression (SAGE), "Digital Northern" or Massively Parallel Signature Sequencing (MPSS) is to carry out statistical analyses that account for the within-class variability, i.e., variability due to the intrinsic biological differences among sampled individuals of the same class, and not only variability due to technical sampling error. Results: We introduce a Bayesian model that accounts for the within-class variability by means of a mixture distribution. We show that the previously available approaches of aggregation in pools ("pseudo-libraries") and the Beta-Binomial model are particular cases of the mixture model. We illustrate our method with a brain tumor vs. normal comparison using SAGE data from public databases. We show examples of tags regarded as differentially expressed with high significance if the within-class variability is ignored, but clearly not so significant if one accounts for it. Conclusion: Using available information about biological replicates, one can transform a list of candidate transcripts showing differential expression into a more reliable one. Our method is freely available, under GPL/GNU copyleft, through a user-friendly web-based online tool or as R language scripts at the supplemental website.
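The effect of ignoring within-class variability can be sketched numerically. The example below is not the authors' mixture model: it only compares a pooled binomial (all replicates share one tag proportion) with a Beta-Binomial (the proportion varies between replicates). The library size `n` and the Beta parameters `alpha` and `beta` are hypothetical values chosen for illustration.

```python
from math import exp, lgamma, log


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_choose(n, k):
    """Logarithm of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def binomial_pmf(k, n, p):
    """P(K = k) when every replicate shares the same proportion p."""
    return exp(log_choose(n, k) + k * log(p) + (n - k) * log(1 - p))


def beta_binomial_pmf(k, n, alpha, beta):
    """P(K = k) when the proportion itself follows Beta(alpha, beta)."""
    return exp(
        log_choose(n, k)
        + log_beta(k + alpha, n - k + beta)
        - log_beta(alpha, beta)
    )


def mean_and_variance(pmf_values):
    """Mean and variance of a distribution given as [P(0), P(1), ...]."""
    mean = sum(k * p for k, p in enumerate(pmf_values))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf_values))
    return mean, var


# Hypothetical settings: 50 tags sampled, mean tag proportion 0.1.
n = 50
alpha, beta = 2.0, 18.0
p = alpha / (alpha + beta)

bb = [beta_binomial_pmf(k, n, alpha, beta) for k in range(n + 1)]
bi = [binomial_pmf(k, n, p) for k in range(n + 1)]

bb_mean, bb_var = mean_and_variance(bb)
bi_mean, bi_var = mean_and_variance(bi)

print(f"binomial:      mean={bi_mean:.3f} variance={bi_var:.3f}")
print(f"beta-binomial: mean={bb_mean:.3f} variance={bb_var:.3f}")
```

Both distributions have the same mean, but the Beta-Binomial has a larger variance, so a difference that looks highly significant under the pooled model can be much less so once between-replicate variation is allowed for.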

Relevance: 30.00%

Abstract:

Temporal replicate counts are often aggregated to improve model fit by reducing zero-inflation and count variability; in the case of migration counts collected hourly throughout a migration, aggregation also allows one to ignore nonindependence. However, aggregation can represent a loss of potentially useful information on the hourly or seasonal distribution of counts, which might impact our ability to estimate reliable trends. We simulated 20-year hourly raptor migration count datasets with a known rate of change to test the effect of aggregating hourly counts to daily or annual totals on our ability to recover the known trend. We simulated data for three types of species, to test whether results varied with species abundance or migration strategy: a commonly detected species, e.g., Northern Harrier, Circus cyaneus; a rarely detected species, e.g., Peregrine Falcon, Falco peregrinus; and a species typically counted in large aggregations with overdispersed counts, e.g., Broad-winged Hawk, Buteo platypterus. We compared accuracy and precision of estimated trends across species and count types (hourly/daily/annual) using hierarchical models that assumed a Poisson, negative binomial (NB) or zero-inflated negative binomial (ZINB) count distribution. We found little benefit of modeling zero-inflation or of modeling the hourly distribution of migration counts. For the rare species, trends analyzed using daily totals and an NB or ZINB data distribution resulted in a higher probability of detecting an accurate and precise trend. In contrast, trends of the common and overdispersed species benefited from aggregation to annual totals, and for the overdispersed species in particular, trends estimated using annual totals were more precise, and resulted in lower probabilities of estimating a trend (1) in the wrong direction, or (2) with credible intervals that excluded the true trend, as compared with hourly and daily counts.
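The claim that aggregation reduces the share of zeros can be illustrated with a toy simulation. This is a minimal sketch, not the authors' simulation design: it draws plain Poisson hourly counts with no trend, and the number of days, hours per day and mean rate are hypothetical values.

```python
import random
from math import exp


def poisson_sample(rng, lam):
    """Draw one Poisson(lam) value using Knuth's product-of-uniforms method."""
    threshold = exp(-lam)
    k = 0
    product = rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def simulate_hourly_counts(rng, n_days, hours_per_day, mean_per_hour):
    """Return a list of days, each a list of hourly counts."""
    return [
        [poisson_sample(rng, mean_per_hour) for _ in range(hours_per_day)]
        for _ in range(n_days)
    ]


def zero_fraction(values):
    """Fraction of entries that are exactly zero."""
    values = list(values)
    return sum(1 for v in values if v == 0) / len(values)


rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical rare species: 0.2 birds per hour, 8 count hours, 60 days.
hourly = simulate_hourly_counts(rng, n_days=60, hours_per_day=8, mean_per_hour=0.2)

hourly_flat = [count for day in hourly for count in day]
daily = [sum(day) for day in hourly]  # aggregate hourly counts to daily totals

hourly_zero = zero_fraction(hourly_flat)
daily_zero = zero_fraction(daily)

print(f"zero fraction, hourly counts: {hourly_zero:.2f}")
print(f"zero fraction, daily totals:  {daily_zero:.2f}")
```

A daily total is zero only when every hour in that day is zero, so the zero fraction of the daily series can never exceed that of the hourly series. The cost is that the within-day distribution of counts is no longer available to the model.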

Relevance: 20.00%

Abstract:

Bayesian Belief Networks (BBNs) are emerging as valuable tools for investigating complex ecological problems. In a BBN, the important variables in a problem are identified and causal relationships are represented graphically. Underpinning this is the probabilistic framework in which variables can take on a finite range of mutually exclusive states. Associated with each variable is a conditional probability table (CPT), showing the probability of a variable attaining each of its possible states conditioned on all possible combinations of its parents. Whilst the variables (nodes) are connected, the CPT attached to each node can be quantified independently. This allows each variable to be populated with the best data available, including expert opinion, simulation results or observed data. It also allows the information to be easily updated as better data become available.

This paper reports on the process of developing a BBN to better understand the initial rapid growth phase (initiation) of a marine cyanobacterium, Lyngbya majuscula, in Moreton Bay, Queensland. Anecdotal evidence suggests that Lyngbya blooms in this region have increased in severity and extent over the past decade. Lyngbya has been associated with acute dermatitis and a range of other health problems in humans. Blooms have been linked to ecosystem degradation and have also damaged commercial and recreational fisheries. However, the causes of blooms are as yet poorly understood.
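The role of a CPT can be made concrete with a small sketch. The node names, states and probabilities below are hypothetical and are not taken from the Lyngbya network; the example also assumes the two parents are independent root nodes, so the child's marginal distribution is a weighted sum over the rows of its CPT.

```python
from itertools import product

# Hypothetical prior distributions for two parent nodes.
p_nutrients = {"low": 0.7, "high": 0.3}
p_light = {"low": 0.4, "high": 0.6}

# Hypothetical CPT for the child node: one row per combination of parent
# states, each row giving the probability of every child state.
cpt_bloom = {
    ("low", "low"): {"no": 0.95, "yes": 0.05},
    ("low", "high"): {"no": 0.80, "yes": 0.20},
    ("high", "low"): {"no": 0.70, "yes": 0.30},
    ("high", "high"): {"no": 0.30, "yes": 0.70},
}


def marginal_child(parent_priors, cpt):
    """Marginal distribution of a child given independent parent priors."""
    marginal = {}
    for states in product(*(prior.keys() for prior in parent_priors)):
        # Joint probability of this combination of parent states.
        weight = 1.0
        for prior, state in zip(parent_priors, states):
            weight *= prior[state]
        # Add this CPT row's contribution to each child state.
        for child_state, prob in cpt[states].items():
            marginal[child_state] = marginal.get(child_state, 0.0) + weight * prob
    return marginal


bloom = marginal_child([p_nutrients, p_light], cpt_bloom)
print(bloom)
```

Because each CPT row is specified on its own, a row can be replaced with a better estimate (for example, observed data in place of expert opinion) without touching the rest of the network, which is the independence property the abstract describes.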