902 results for "Deterministic imputation"


Relevance: 10.00%

Abstract:

A collection of spherical obstacles in the unit ball in Euclidean space is said to be avoidable for Brownian motion if there is a positive probability that Brownian motion diffusing from some point in the ball will avoid all the obstacles and reach the boundary of the ball. The centres of the spherical obstacles are generated according to a Poisson point process, while the radius of an obstacle is a deterministic function. If avoidable configurations are generated with positive probability, Lundh calls this percolation diffusion. An integral condition for percolation diffusion is derived in terms of the intensity of the point process and the function that determines the radii of the obstacles.

Relevance: 10.00%

Abstract:

Human genetic variation contributes to differences in susceptibility to HIV-1 infection. To search for novel host resistance factors, we performed a genome-wide association study (GWAS) in hemophilia patients highly exposed to potentially contaminated factor VIII infusions. Individuals with hemophilia A and a documented history of factor VIII infusions before the introduction of viral inactivation procedures (1979-1984) were recruited from 36 hemophilia treatment centers (HTCs), and their genome-wide genetic variants were compared with those from matched HIV-infected individuals. Homozygous carriers of known CCR5 resistance mutations were excluded. Single nucleotide polymorphisms (SNPs) and inferred copy number variants (CNVs) were tested using logistic regression. In addition, we performed a pathway enrichment analysis, a heritability analysis, and a search for epistatic interactions with CCR5 Δ32 heterozygosity. A total of 560 HIV-uninfected cases were recruited: 36 (6.4%) were homozygous for CCR5 Δ32 or m303. After quality control and SNP imputation, we tested 1 081 435 SNPs and 3686 CNVs for association with HIV-1 serostatus in 431 cases and 765 HIV-infected controls. No SNP or CNV reached genome-wide significance. The additional analyses did not reveal any strong genetic effect. Highly exposed, yet uninfected hemophiliacs form an ideal study group to investigate host resistance factors. Using a genome-wide approach, we did not detect any significant associations between SNPs and HIV-1 susceptibility, indicating that common genetic variants of major effect are unlikely to explain the observed resistance phenotype in this population.

Relevance: 10.00%

Abstract:

Performance prediction and application behavior modeling have been the subject of extensive research that aims to estimate application performance with acceptable precision. A novel approach to predicting the performance of parallel applications is based on the concept of Parallel Application Signatures, which consists in extracting an application's most relevant parts (phases) and the number of times they repeat (weights). By executing these phases on a target machine and multiplying each phase's execution time by its weight, an estimate of the application's total execution time can be made. One of the problems is that the performance of an application depends on the program workload. Every type of workload affects differently how an application performs on a given system, and so affects the signature's execution time. Since the workloads used in most scientific parallel applications have well-known dimensions and data ranges, and the behavior of these applications is mostly deterministic, a model of how a program's workload affects its performance can be obtained. We create a new methodology to model how a program's workload affects the parallel application signature. Using regression analysis, we are able to generalize each phase's execution-time and weight functions to predict an application's performance on a target system for any type of workload within a predefined range. We validate our methodology using a synthetic program, benchmark applications, and well-known real scientific applications.
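The phases-and-weights scheme described in this abstract can be sketched as follows. The phase names, the linear workload model, and all numbers below are illustrative assumptions, not the authors' actual signature tooling: per-phase execution times measured at a few calibration workloads are generalized by regression, and the total time for a new workload is the weighted sum of the predicted phase times.

```python
import numpy as np

# Calibration workload sizes for which phases were timed on the target machine
workloads = np.array([100.0, 200.0, 400.0, 800.0])

# Hypothetical per-phase measured execution times (seconds) at those workloads
phase_times = {
    "phase_a": np.array([0.50, 1.00, 2.00, 4.00]),  # scales linearly with workload
    "phase_b": np.array([0.20, 0.30, 0.50, 0.90]),  # linear with an offset
}
# Per-phase repetition counts (weights); held constant here for simplicity
phase_weights = {"phase_a": 10, "phase_b": 5}

def fit_phase_models(ws, times_by_phase):
    """Fit a linear regression per phase: t(w) = a*w + b."""
    return {name: np.polyfit(ws, t, deg=1) for name, t in times_by_phase.items()}

def predict_total_time(models, weights, w):
    """Signature estimate: sum over phases of predicted phase time * weight."""
    return sum(np.polyval(coef, w) * weights[name] for name, coef in models.items())

models = fit_phase_models(workloads, phase_times)
est = predict_total_time(models, phase_weights, w=600.0)
print(round(est, 2))
```

In this toy setup both phases are exactly linear in the workload, so the regression interpolates the unseen workload w=600 exactly; real phases would need the paper's per-phase regression over the predefined workload range.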

Relevance: 10.00%

Abstract:

DREAM is an initiative that allows researchers to assess how well their methods or approaches can describe and predict networks of interacting molecules [1]. Each year, recently acquired datasets are released to predictors ahead of publication. Researchers typically have about three months to predict the masked data or network of interactions, using any predictive method. Predictions are assessed prior to an annual conference where the best predictions are unveiled and discussed. Here we present the strategy we used to make a winning prediction for the DREAM3 phosphoproteomics challenge. We used Amelia II, a multiple-imputation software package developed by Gary King, James Honaker and Matthew Blackwell [2] in the context of the social sciences, to predict the 476 out of 4624 measurements that had been masked for the challenge. To choose the best possible multiple-imputation parameters for the challenge, we evaluated how transforming the data and varying the imputation parameters affected the ability to predict additionally masked data. We discuss the accuracy of our findings and show that multiple imputation applied to this dataset is a powerful method for accurately estimating the missing data. We postulate that multiple-imputation methods might become an integral part of experimental design, as a means to achieve cost savings or to increase the number of samples that can be handled for a given cost.
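The parameter-selection idea in this abstract (mask extra known entries, impute them, score the error) can be sketched as below. The data and the deliberately simple column-mean imputer are illustrative assumptions; the authors used the Amelia II multiple-imputation package (an R library), not this code.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=(40, 6))  # stand-in "measurements"

def mask_entries(x, frac, rng):
    """Randomly hide a fraction of entries (NaN); return masked copy and mask."""
    m = x.copy()
    idx = rng.random(x.shape) < frac
    m[idx] = np.nan
    return m, idx

def impute_column_mean(x):
    """Fill each NaN with its column mean -- a simple baseline imputer."""
    filled = x.copy()
    col_means = np.nanmean(x, axis=0)
    rows, cols = np.where(np.isnan(x))
    filled[rows, cols] = col_means[cols]
    return filled

# Hide 10% of known entries, impute them, and score the imputer on exactly
# those additionally masked cells -- the evaluation loop used for tuning.
masked, hidden = mask_entries(data, frac=0.10, rng=rng)
imputed = impute_column_mean(masked)
rmse = np.sqrt(np.mean((imputed[hidden] - data[hidden]) ** 2))
print(rmse > 0)
```

In the challenge setting this loop would be repeated for each candidate data transformation and imputation parameter set, keeping the configuration with the lowest error on the additionally masked cells.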

Relevance: 10.00%

Abstract:

BACKGROUND: After age, sex is the most important risk factor for coronary artery disease (CAD). The mechanism through which women are protected from CAD is still largely unknown, but the observed sex difference suggests the involvement of the reproductive steroid hormone signaling system. Genetic association studies of the gene encoding Estrogen Receptor α (ESR1) have shown conflicting results, although only a limited range of variation in the gene has been investigated. METHODS AND RESULTS: We exploited information made available by advanced new methods and resources in complex disease genetics to revisit the question of ESR1's role in risk of CAD. We performed a meta-analysis of 14 genome-wide association studies (CARDIoGRAM discovery analysis, N ≈ 87,000) to search for population-wide and sex-specific associations between CAD risk and common genetic variants throughout the coding, noncoding, and flanking regions of ESR1. In addition to samples from the MIGen (N ≈ 6000), WTCCC (N ≈ 7400), and Framingham (N ≈ 3700) studies, we extended this search to a larger number of common and uncommon variants by imputation into a panel of haplotypes constructed using data from the 1000 Genomes Project. Despite the widespread expression of ERα in vascular tissues, we found no evidence for involvement of common or low-frequency genetic variation throughout the ESR1 gene in modifying risk of CAD, either in the general population or as a function of sex. CONCLUSIONS: We suggest that future research on the genetic basis of sex-related differences in CAD risk should initially prioritize other genes in the reproductive steroid hormone biosynthesis system.

Relevance: 10.00%

Abstract:

Abstract: Since the end of perestroika, a new identity discourse has taken hold in Russian linguistics, one which in its extreme forms rests on a strict determinism of thought by language. The funding organizations of scientific research back projects studying the relationship between Russian grammar and the "Russian national character". New objects of knowledge come to light: "the Russian linguistic image of the world", "linguistic personality", "linguoculturology". This kind of discourse builds up an imaginary, comforting collective identity, which relies on the principles that 1) all people who speak the same language think the same way; and 2) languages, hence collective kinds of thought, are hermetically closed to each other, and therefore untranslatable. This neo-Humboldtian trend in contemporary Russian linguistics is unaware of its historical origins: German Romanticism in its opposition to the Enlightenment, the evolutionist positivism of Auguste Comte, and the deterministic linguistics of 1930s Germany.

Relevance: 10.00%

Abstract:

MOTIVATION: Understanding gene regulation in biological processes and modeling the robustness of underlying regulatory networks is an important problem that is currently being addressed by computational systems biologists. Lately, there has been a renewed interest in Boolean modeling techniques for gene regulatory networks (GRNs). However, due to their deterministic nature, it is often difficult to identify whether these modeling approaches are robust to the addition of stochastic noise that is widespread in gene regulatory processes. Stochasticity in Boolean models of GRNs has been addressed relatively sparingly in the past, mainly by flipping the expression of genes between different expression levels with a predefined probability. This stochasticity in nodes (SIN) model leads to over-representation of noise in GRNs and hence to non-correspondence with biological observations. RESULTS: In this article, we introduce the stochasticity in functions (SIF) model for simulating stochasticity in Boolean models of GRNs. By providing biological motivation behind the use of the SIF model and applying it to the T-helper and T-cell activation networks, we show that the SIF model provides more biologically robust results than the existing SIN model of stochasticity in GRNs. AVAILABILITY: Algorithms are made available in our Boolean modeling toolbox, GenYsis. The software binaries can be downloaded from http://si2.epfl.ch/~garg/genysis.html.
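A toy asynchronous Boolean network can illustrate the contrast between the two noise models named in this abstract. The three-gene network, the noise rates, and the exact semantics below are one plausible reading of SIN versus SIF for illustration only; the precise definitions are in the paper and its GenYsis implementation.

```python
import random

# A tiny 3-gene Boolean network: each rule maps the current state (g0, g1, g2)
# to the next value of one gene.
RULES = [
    lambda s: s[1] and not s[2],  # g0 <- g1 AND NOT g2
    lambda s: s[0] or s[2],       # g1 <- g0 OR g2
    lambda s: not s[0],           # g2 <- NOT g0
]

def step_sin(state, p, rng):
    """SIN (assumed reading): update one gene deterministically, then flip
    every gene's expression independently with probability p -- noise hits
    nodes regardless of whether their function was evaluated."""
    i = rng.randrange(len(RULES))
    nxt = list(state)
    nxt[i] = bool(RULES[i](state))
    return [(not x) if rng.random() < p else x for x in nxt]

def step_sif(state, p, rng):
    """SIF (assumed reading): update one gene; with probability p its update
    function misfires and yields the complemented output. Noise is tied to
    function evaluation, so genes not being updated are untouched."""
    i = rng.randrange(len(RULES))
    out = bool(RULES[i](state))
    nxt = list(state)
    nxt[i] = (not out) if rng.random() < p else out
    return nxt

rng = random.Random(7)
state = [True, False, False]
for _ in range(20):
    state = step_sif(state, p=0.01, rng=rng)
print(state)
```

Under this reading, SIN perturbs all nodes at every step, which is why it over-represents noise, while SIF confines noise to the function being evaluated.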

Relevance: 10.00%

Abstract:

The observation of a non-random phylogenetic distribution of traits in communities provides evidence for niche-based community assembly. Environment may influence the phylogenetic structure of communities because traits determining how species respond to prevailing conditions can be phylogenetically conserved. In this study, we investigate the variation of butterfly species richness and of phylogenetic α- and β-diversities along temperature and plant species richness gradients. Our study indicates that butterfly richness is independently positively correlated with temperature and plant species richness in the study area. However, the variation of phylogenetic α- and β-diversities is correlated only with temperature. The significant phylogenetic clustering at high elevation suggests that cold temperature filters butterfly lineages, leading to communities mostly composed of closely related species adapted to those climatic conditions. These results suggest that, in the colder and more severe conditions at high elevations, deterministic processes, and not purely stochastic events, drive the assembly of butterfly communities.

Relevance: 10.00%

Abstract:

Low concentrations of elements in geochemical analyses have the peculiarity of being compositional data and, for a given level of significance, are likely to be beyond the capabilities of laboratories to distinguish between minute concentrations and complete absence, thus preventing laboratories from reporting extremely low concentrations of the analyte. Instead, what is reported is the detection limit, which is the minimum concentration that conclusively differentiates between presence and absence of the element. A spatially distributed exhaustive sample is employed in this study to generate unbiased sub-samples, which are further censored to observe the effect that different detection limits and sample sizes have on the inference of population distributions starting from geochemical analyses having specimens below detection limit (nondetects). The isometric logratio transformation is used to convert the compositional data in the simplex to samples in real space, thus allowing the practitioner to properly borrow from the large source of statistical techniques valid only in real space. The bootstrap method is used to numerically investigate the reliability of inferring several distributional parameters employing different forms of imputation for the censored data. The case study illustrates that, in general, best results are obtained when imputations are made using the distribution best fitting the readings above detection limit, and exposes the problems of other more widely used practices. When the sample is spatially correlated, it is necessary to combine the bootstrap with stochastic simulation.
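The imputation strategy this study found best (replace nondetects by draws from the distribution that best fits the readings above the detection limit) can be sketched as follows. This is an illustration, not the study's code: the synthetic lognormal data, the detection limit, and the crude moment fit are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
true = rng.lognormal(mean=0.0, sigma=1.0, size=5000)  # synthetic concentrations
dl = 0.4                                              # detection limit
observed = true[true >= dl]
n_censored = int((true < dl).sum())

# Crude lognormal fit on the detected readings (a careful fit would account
# for the truncation below the detection limit).
mu, sigma = np.log(observed).mean(), np.log(observed).std()

# Impute each nondetect with a fitted-model draw restricted to [0, dl)
imputed = np.empty(0)
while imputed.size < n_censored:
    d = rng.lognormal(mu, sigma, size=4 * n_censored)
    imputed = np.concatenate([imputed, d[d < dl]])
imputed = imputed[:n_censored]

completed = np.concatenate([observed, imputed])  # full sample for inference
print(completed.size, n_censored)
```

In the study's workflow this imputation would be embedded inside a bootstrap loop over sub-samples, with the data first moved to real space via the isometric logratio transformation.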

Relevance: 10.00%

Abstract:

Natural variation in DNA sequence contributes to individual differences in quantitative traits. While multiple studies have shown genetic control over gene expression variation, few additional cellular traits have been investigated. Here, we investigated the natural variation of NADPH oxidase-dependent hydrogen peroxide (H2O2) release, which is the joint effect of reactive oxygen species (ROS) production, superoxide metabolism and degradation, and is related to a number of human disorders. We assessed the normal variation of H2O2 release in lymphoblastoid cell lines (LCLs) in a family-based 3-generation cohort (CEPH-HapMap) and in 3 population-based cohorts (KORA, GenCord, HapMap). Substantial individual variation was observed, with an estimated heritability of 45% in the CEPH-HapMap cohort. We identified 2 genome-wide significant loci, on Hsa12 and Hsa15, in genome-wide linkage analysis. Next, we performed a genome-wide association study (GWAS) for the combined KORA-GenCord cohorts (n = 279) using enhanced marker resolution by imputation (>1.4 million SNPs). We found 5 significant associations (p < 5.00×10⁻⁸) and 54 suggestive associations (p < 1.00×10⁻⁵), one of which confirmed the linked region on Hsa15. To replicate our findings, we performed a GWAS using 58 HapMap individuals and ∼2.1 million SNPs. We identified 40 genome-wide significant and 302 suggestive SNPs, and confirmed signals on Hsa1, Hsa12, and Hsa15. Genetic loci within 900 kb of the known candidate gene p67phox on Hsa1 were identified in the GWAS of both cohorts. We did not find replication of individual SNPs across all cohorts, but we did find replication within the same genomic regions. Finally, a highly significant decrease in H2O2 release was observed in Down Syndrome (DS) individuals (p < 2.88×10⁻¹²). Taken together, our results show strong evidence of genetic control of H2O2 release in LCLs of healthy and DS cohorts, and suggest that cellular phenotypes, which are themselves complex, may be used as proxies for the dissection of complex disorders.

Relevance: 10.00%

Abstract:

First: a continuous-time version of Kyle's model (Kyle 1985), known as Back's model (Back 1992), of asset pricing with asymmetric information is studied. A larger class of price processes and of noise-trader processes is considered. The price process, as in Kyle's model, is allowed to depend on the path of the market order. The noise traders' process is an inhomogeneous Lévy process. Solutions are found via the Hamilton-Jacobi-Bellman equations. When the insider is risk-neutral, the price pressure is constant and there is no equilibrium in the presence of jumps. If the insider is risk-averse, there is no equilibrium in the presence of either jumps or drifts. The case where the release time of the information is unknown is also analysed: a general relation is established between the problem of finding an equilibrium and the enlargement of filtrations. When the announcement time is random, the market is not fully efficient, and an equilibrium exists if the sensitivity of prices with respect to global demand decreases in time in accordance with the distribution of the random time. Second: power variations. The asymptotic behaviour of the power variation of processes of the form ∫_0^t u(s-) dS(s) is considered, where S is an α-stable process with index of stability 0 < α < 2 and the integral is an Itô integral. Stable convergence of the corresponding fluctuations is established. These results provide statistical tools to infer the process u from discrete observations. Third: a bond market is studied where short rates r(t) evolve as an integral of g(t-s)σ(s) with respect to the stochastic Wiener measure W(ds), where g and σ are deterministic. Processes of this type are particular cases of ambit processes, and are in general not semimartingales.

Relevance: 10.00%

Abstract:

In the forensic examination of DNA mixtures, the question of how to set the total number of contributors (N) presents a topic of ongoing interest. Part of the discussion gravitates around issues of bias, in particular when assessments of the number of contributors are not made prior to considering the genotypic configuration of potential donors. Further complication may stem from the observation that, in some cases, there may be numbers of contributors that are incompatible with the set of alleles seen in the profile of a mixed crime stain, given the genotype of a potential contributor. In such situations, procedures that take a single, fixed number of contributors as their output can lead to inferential impasses. Assessing the number of contributors within a probabilistic framework can help avoid such complications. Using elements of decision theory, this paper analyses two strategies for inference on the number of contributors. One procedure is deterministic and focuses on the minimum number of contributors required to 'explain' an observed set of alleles. The other procedure is probabilistic, using Bayes' theorem, and provides a probability distribution over a set of numbers of contributors, based on the set of observed alleles as well as their respective rates of occurrence. The discussion concentrates on mixed stains of varying quality (i.e., different numbers of loci for which genotyping information is available). A so-called qualitative interpretation is pursued, since quantitative information such as peak area and height data is not taken into account. The competing procedures are compared using a standard scoring rule that penalizes the degree of divergence between a given agreed value for N (the number of contributors) and the actual value taken by N.
Using only modest assumptions and a discussion with reference to a casework example, this paper reports on analyses using simulation techniques and graphical models (i.e., Bayesian networks) to point out that setting the number of contributors to a mixed crime stain in probabilistic terms is, for the conditions assumed in this study, preferable to a decision policy that uses categorical assumptions about N.
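The two strategies can be contrasted on a deliberately simplified single-locus toy: the deterministic rule returns the minimum N compatible with the observed distinct alleles, while the probabilistic rule returns a posterior over N. Uniform allele frequencies, independent allele draws, and a uniform prior on N are assumptions of this sketch only; the paper's analysis uses allele occurrence rates and Bayesian networks.

```python
from math import comb

def min_contributors(n_alleles):
    """Deterministic rule: each contributor carries at most 2 alleles."""
    return -(-n_alleles // 2)  # ceil(n_alleles / 2)

def p_distinct_alleles(k, n_contrib, n_pop_alleles):
    """P(exactly k distinct alleles among 2*n_contrib independent draws)
    with equally frequent alleles, via inclusion-exclusion (surjections)."""
    n_draws = 2 * n_contrib
    total = n_pop_alleles ** n_draws
    surj = sum((-1) ** j * comb(k, j) * (k - j) ** n_draws for j in range(k + 1))
    return comb(n_pop_alleles, k) * surj / total

def posterior_n(k, n_pop_alleles, n_max=6):
    """Posterior over N with a uniform prior on 1..n_max."""
    lik = {n: p_distinct_alleles(k, n, n_pop_alleles) for n in range(1, n_max + 1)}
    z = sum(lik.values())
    return {n: v / z for n, v in lik.items()}

k_observed = 5  # distinct alleles seen in the mixed stain at one locus
print(min_contributors(k_observed))           # deterministic answer
post = posterior_n(k_observed, n_pop_alleles=10)
print({n: round(p, 3) for n, p in post.items()})
```

The posterior makes explicit how much probability mass lies on values of N above the deterministic minimum, which is exactly the information a categorical choice of N discards.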

Relevance: 10.00%

Abstract:

As stated in Aitchison (1986), a proper study of relative variation in a compositional data set should be based on logratios, and dealing with logratios excludes dealing with zeros. Nevertheless, it is clear that zero observations might be present in real data sets, either because the corresponding part is completely absent (essential zeros) or because it is below the detection limit (rounded zeros). Because the second kind of zero is usually understood as "a trace too small to measure", it seems reasonable to replace it by a suitable small value, and this has been the traditional approach. As stated, e.g., by Tauber (1999) and by Martín-Fernández, Barceló-Vidal, and Pawlowsky-Glahn (2000), the principal problem in compositional data analysis is related to rounded zeros. One should be careful to use a replacement strategy that does not seriously distort the general structure of the data. In particular, the covariance structure of the involved parts (and thus the metric properties) should be preserved, as otherwise further analysis on subpopulations could be misleading. Following this point of view, a non-parametric imputation method is introduced in Martín-Fernández, Barceló-Vidal, and Pawlowsky-Glahn (2000). This method is analyzed in depth by Martín-Fernández, Barceló-Vidal, and Pawlowsky-Glahn (2003), where it is shown that the theoretical drawbacks of the additive zero replacement method proposed in Aitchison (1986) can be overcome using a new multiplicative approach on the non-zero parts of a composition. The new approach has reasonable properties from a compositional point of view. In particular, it is "natural" in the sense that it recovers the "true" composition if replacement values are identical to the missing values, and it is coherent with the basic operations on the simplex. This coherence implies that the covariance structure of subcompositions with no zeros is preserved.
As a generalization of the multiplicative replacement, in the same paper a substitution method for missing values in compositional data sets is introduced.
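The multiplicative replacement described above can be sketched in a few lines. The single replacement value δ below is an illustrative assumption (the method allows a different δ per part, typically chosen below each detection limit); what matters is that the non-zero parts are rescaled multiplicatively, so their mutual ratios, and hence the covariance structure of zero-free subcompositions, are preserved.

```python
import numpy as np

def multiplicative_replacement(x, delta, kappa=1.0):
    """Replace zeros in a composition x (closed to kappa) by delta, scaling
    the non-zero parts by (1 - total imputed mass / kappa) so the closure
    is preserved and ratios between non-zero parts are unchanged."""
    x = np.asarray(x, dtype=float)
    zeros = x == 0
    return np.where(zeros, delta, x * (1 - delta * zeros.sum() / kappa))

comp = np.array([0.0, 0.3, 0.5, 0.2])  # one rounded zero, closed to 1
rep = multiplicative_replacement(comp, delta=0.01)
print(rep, rep.sum())
```

Here the zero becomes 0.01 and the remaining parts are each multiplied by 0.99, so the result still sums to 1 and the ratio 0.3 : 0.5 : 0.2 is intact; an additive replacement would shift those ratios.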

Relevance: 10.00%

Abstract:

All of the imputation techniques usually applied for replacing values below the detection limit in compositional data sets have adverse effects on the variability. In this work we propose a modification of the EM algorithm that is applied using the additive log-ratio transformation. This new strategy is applied to a compositional data set and the results are compared with the usual imputation techniques.
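A minimal sketch of the additive log-ratio (alr) transformation under which the proposed EM-based imputation operates; the EM step itself is omitted, and the choice of the last part as divisor is the usual convention, assumed here.

```python
import numpy as np

def alr(x):
    """alr: map a D-part composition to R^(D-1), dividing by the last part."""
    x = np.asarray(x, dtype=float)
    return np.log(x[..., :-1] / x[..., -1:])

def alr_inv(y):
    """Inverse alr: map coordinates back to the simplex (closed to 1)."""
    y = np.asarray(y, dtype=float)
    z = np.concatenate([np.exp(y), np.ones(y.shape[:-1] + (1,))], axis=-1)
    return z / z.sum(axis=-1, keepdims=True)

c = np.array([0.2, 0.3, 0.5])
print(np.allclose(alr_inv(alr(c)), c))  # round trip recovers the composition
```

Working in alr coordinates moves the data into unconstrained real space, where the EM algorithm's normal-theory expectation and maximization steps are well defined; imputed values are then mapped back to the simplex.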

Relevance: 10.00%

Abstract:

BACKGROUND: Estimates of drug resistance incidence to modern first-line combination antiretroviral therapies against human immunodeficiency virus (HIV) type 1 are complicated by limited availability of genotypic drug resistance tests (GRTs) and uncertain timing of resistance emergence. METHODS: Five first-line combinations were studied (all paired with lamivudine or emtricitabine): efavirenz (EFV) plus zidovudine (AZT) (n = 524); EFV plus tenofovir (TDF) (n = 615); lopinavir (LPV) plus AZT (n = 573); LPV plus TDF (n = 301); and ritonavir-boosted atazanavir (ATZ/r) plus TDF (n = 250). Virological treatment outcomes were classified into 3 risk strata for emergence of resistance, based on whether undetectable HIV RNA levels were maintained during therapy and, if not, whether viral loads were >500 copies/mL during treatment. Probabilities for presence of resistance mutations were estimated from GRTs (n = 2876) according to risk stratum and therapy received at the time of testing. On the basis of these data, events of resistance emergence were imputed for each individual and were assessed using survival analysis. Imputation was repeated 100 times, and results were summarized by median values (2.5th-97.5th percentile range). RESULTS: Six years after treatment initiation, EFV plus AZT showed the highest cumulative resistance incidence (16%); all other regimens remained below 11%. Confounder-adjusted Cox regression confirmed that first-line EFV plus AZT (reference) was associated with a higher median hazard for resistance emergence, compared with the other treatments: EFV plus TDF (hazard ratio [HR], 0.57; range, 0.42-0.76), LPV plus AZT (HR, 0.63; range, 0.45-0.89), LPV plus TDF (HR, 0.55; range, 0.33-0.83), ATZ/r plus TDF (HR, 0.43; range, 0.17-0.83). Two-thirds of resistance events were associated with detectable HIV RNA levels ≤500 copies/mL during treatment, and only one-third with virological failure (HIV RNA level >500 copies/mL).
CONCLUSIONS: The inclusion of TDF instead of AZT and ATZ/r was correlated with lower rates of resistance emergence, most likely because of improved tolerability and pharmacokinetics resulting from a once-daily dosage.