980 resultados para stochastic approximation algorithm
Resumo:
Background: Research in epistasis or gene-gene interaction detection for human complex traits has grown over the last few years. It has been marked by promising methodological developments, improved translation efforts of statistical epistasis to biological epistasis and attempts to integrate different omics information sources into the epistasis screening to enhance power. The quest for gene-gene interactions poses severe multiple-testing problems. In this context, the maxT algorithm is one technique to control the false-positive rate. However, the memory needed by this algorithm rises linearly with the amount of hypothesis tests. Gene-gene interaction studies will require a memory proportional to the squared number of SNPs. A genome-wide epistasis search would therefore require terabytes of memory. Hence, cache problems are likely to occur, increasing the computation time. In this work we present a new version of maxT, requiring an amount of memory independent from the number of genetic effects to be investigated. This algorithm was implemented in C++ in our epistasis screening software MBMDR-3.0.3. We evaluate the new implementation in terms of memory efficiency and speed using simulated data. The software is illustrated on real-life data for Crohn’s disease. Results: In the case of a binary (affected/unaffected) trait, the parallel workflow of MBMDR-3.0.3 analyzes all gene-gene interactions with a dataset of 100,000 SNPs typed on 1000 individuals within 4 days and 9 hours, using 999 permutations of the trait to assess statistical significance, on a cluster composed of 10 blades, containing each four Quad-Core AMD Opteron(tm) Processor 2352 2.1 GHz. In the case of a continuous trait, a similar run takes 9 days. Our program found 14 SNP-SNP interactions with a multiple-testing corrected p-value of less than 0.05 on real-life Crohn’s disease (CD) data. Conclusions: Our software is the first implementation of the MB-MDR methodology able to solve large-scale SNP-SNP interactions problems within a few days, without using much memory, while adequately controlling the type I error rates. A new implementation to reach genome-wide epistasis screening is under construction. In the context of Crohn’s disease, MBMDR-3.0.3 could identify epistasis involving regions that are well known in the field and could be explained from a biological point of view. This demonstrates the power of our software to find relevant phenotype-genotype higher-order associations.
Resumo:
It is well known the relationship between source separation and blind deconvolution: If a filtered version of an unknown i.i.d. signal is observed, temporal independence between samples can be used to retrieve the original signal, in the same manner as spatial independence is used for source separation. In this paper we propose the use of a Genetic Algorithm (GA) to blindly invert linear channels. The use of GA is justified in the case of small number of samples, where other gradient-like methods fails because of poor estimation of statistics.
Resumo:
This paper proposes a very fast method for blindly initial- izing a nonlinear mapping which transforms a sum of random variables. The method provides a surprisingly good approximation even when the basic assumption is not fully satis¯ed. The method can been used success- fully for initializing nonlinearity in post-nonlinear mixtures or in Wiener system inversion, for improving algorithm speed and convergence.
Resumo:
Mouse NK cells express MHC class I-specific inhibitory Ly49 receptors. Since these receptors display distinct ligand specificities and are clonally distributed, their expression generates a diverse NK cell receptor repertoire specific for MHC class I molecules. We have previously found that the Dd (or Dk)-specific Ly49A receptor is usually expressed from a single allele. However, a small fraction of short-term NK cell clones expressed both Ly49A alleles, suggesting that the two Ly49A alleles are independently and randomly expressed. Here we show that the genes for two additional Ly49 receptors (Ly49C and Ly49G2) are also expressed in a (predominantly) mono-allelic fashion. Since single NK cells can co-express multiple Ly49 receptors, we also investigated whether mono-allelic expression from within the tightly linked Ly49 gene cluster is coordinate or independent. Our clonal analysis suggests that the expression of alleles of distinct Ly49 genes is not coordinate. Thus Ly49 alleles are apparently independently and randomly chosen for stable expression, a process that directly restricts the number of Ly49 receptors expressed per single NK cell. We propose that the Ly49 receptor repertoire specific for MHC class I is generated by an allele-specific, stochastic gene expression process that acts on the entire Ly49 gene cluster.
Resumo:
Postprint (published version)
Resumo:
We show that the dipole, a system usually proposed to model relaxation phenomena, exhibits a maximum in the signal-to-noise ratio at a nonzero noise level, thus indicating the appearance of stochastic resonance. The phenomenon occurs in two different situations, i.e., when the minimum of the potential of the dipole remains fixed in time and when it switches periodically between two equilibrium points. We have also found that the signal-to-noise ratio has a maximum for a certain value of the amplitude of the oscillating field.
Resumo:
Simulated-annealing-based conditional simulations provide a flexible means of quantitatively integrating diverse types of subsurface data. Although such techniques are being increasingly used in hydrocarbon reservoir characterization studies, their potential in environmental, engineering and hydrological investigations is still largely unexploited. Here, we introduce a novel simulated annealing (SA) algorithm geared towards the integration of high-resolution geophysical and hydrological data which, compared to more conventional approaches, provides significant advancements in the way that large-scale structural information in the geophysical data is accounted for. Model perturbations in the annealing procedure are made by drawing from a probability distribution for the target parameter conditioned to the geophysical data. This is the only place where geophysical information is utilized in our algorithm, which is in marked contrast to other approaches where model perturbations are made through the swapping of values in the simulation grid and agreement with soft data is enforced through a correlation coefficient constraint. Another major feature of our algorithm is the way in which available geostatistical information is utilized. Instead of constraining realizations to match a parametric target covariance model over a wide range of spatial lags, we constrain the realizations only at smaller lags where the available geophysical data cannot provide enough information. Thus we allow the larger-scale subsurface features resolved by the geophysical data to have much more due control on the output realizations. Further, since the only component of the SA objective function required in our approach is a covariance constraint at small lags, our method has improved convergence and computational efficiency over more traditional methods. Here, we present the results of applying our algorithm to the integration of porosity log and tomographic crosshole georadar data to generate stochastic realizations of the local-scale porosity structure. Our procedure is first tested on a synthetic data set, and then applied to data collected at the Boise Hydrogeophysical Research Site.
Resumo:
Context: Ovarian tumors (OT) typing is a competency expected from pathologists, with significant clinical implications. OT however come in numerous different types, some rather rare, with the consequence of few opportunities for practice in some departments. Aim: Our aim was to design a tool for pathologists to train in less common OT typing. Method and Results: Representative slides of 20 less common OT were scanned (Nano Zoomer Digital Hamamatsu®) and the diagnostic algorithm proposed by Young and Scully applied to each case (Young RH and Scully RE, Seminars in Diagnostic Pathology 2001, 18: 161-235) to include: recognition of morphological pattern(s); shortlisting of differential diagnosis; proposition of relevant immunohistochemical markers. The next steps of this project will be: evaluation of the tool in several post-graduate training centers in Europe and Québec; improvement of its design based on evaluation results; diffusion to a larger public. Discussion: In clinical medicine, solving many cases is recognized as of utmost importance for a novice to become an expert. This project relies on the virtual slides technology to provide pathologists with a learning tool aimed at increasing their skills in OT typing. After due evaluation, this model might be extended to other uncommon tumors.
Resumo:
Quantifying the spatial configuration of hydraulic conductivity (K) in heterogeneous geological environments is essential for accurate predictions of contaminant transport, but is difficult because of the inherent limitations in resolution and coverage associated with traditional hydrological measurements. To address this issue, we consider crosshole and surface-based electrical resistivity geophysical measurements, collected in time during a saline tracer experiment. We use a Bayesian Markov-chain-Monte-Carlo (McMC) methodology to jointly invert the dynamic resistivity data, together with borehole tracer concentration data, to generate multiple posterior realizations of K that are consistent with all available information. We do this within a coupled inversion framework, whereby the geophysical and hydrological forward models are linked through an uncertain relationship between electrical resistivity and concentration. To minimize computational expense, a facies-based subsurface parameterization is developed. The Bayesian-McMC methodology allows us to explore the potential benefits of including the geophysical data into the inverse problem by examining their effect on our ability to identify fast flowpaths in the subsurface, and their impact on hydrological prediction uncertainty. Using a complex, geostatistically generated, two-dimensional numerical example representative of a fluvial environment, we demonstrate that flow model calibration is improved and prediction error is decreased when the electrical resistivity data are included. The worth of the geophysical data is found to be greatest for long spatial correlation lengths of subsurface heterogeneity with respect to wellbore separation, where flow and transport are largely controlled by highly connected flowpaths.
Resumo:
We present a novel scheme for the appearance of stochastic resonance when the dynamics of a Brownian particle takes place in a confined medium. The presence of uneven boundaries, giving rise to an entropic contribution to the potential, may upon application of a periodic driving force result in an increase of the spectral amplification at an optimum value of the ambient noise level. The entropic stochastic resonance, characteristic of small-scale systems, may constitute a useful mechanism for the manipulation and control of single molecules and nanodevices.
Resumo:
We study biased, diffusive transport of Brownian particles through narrow, spatially periodic structures in which the motion is constrained in lateral directions. The problem is analyzed under the perspective of the Fick-Jacobs equation, which accounts for the effect of the lateral confinement by introducing an entropic barrier in a one-dimensional diffusion. The validity of this approximation, based on the assumption of an instantaneous equilibration of the particle distribution in the cross section of the structure, is analyzed by comparing the different time scales that characterize the problem. A validity criterion is established in terms of the shape of the structure and of the applied force. It is analytically corroborated and verified by numerical simulations that the critical value of the force up to which this description holds true scales as the square of the periodicity of the structure. The criterion can be visualized by means of a diagram representing the regions where the Fick-Jacobs description becomes inaccurate in terms of the scaled force versus the periodicity of the structure.
Resumo:
3 Summary 3. 1 English The pharmaceutical industry has been facing several challenges during the last years, and the optimization of their drug discovery pipeline is believed to be the only viable solution. High-throughput techniques do participate actively to this optimization, especially when complemented by computational approaches aiming at rationalizing the enormous amount of information that they can produce. In siiico techniques, such as virtual screening or rational drug design, are now routinely used to guide drug discovery. Both heavily rely on the prediction of the molecular interaction (docking) occurring between drug-like molecules and a therapeutically relevant target. Several softwares are available to this end, but despite the very promising picture drawn in most benchmarks, they still hold several hidden weaknesses. As pointed out in several recent reviews, the docking problem is far from being solved, and there is now a need for methods able to identify binding modes with a high accuracy, which is essential to reliably compute the binding free energy of the ligand. This quantity is directly linked to its affinity and can be related to its biological activity. Accurate docking algorithms are thus critical for both the discovery and the rational optimization of new drugs. In this thesis, a new docking software aiming at this goal is presented, EADock. It uses a hybrid evolutionary algorithm with two fitness functions, in combination with a sophisticated management of the diversity. EADock is interfaced with .the CHARMM package for energy calculations and coordinate handling. A validation was carried out on 37 crystallized protein-ligand complexes featuring 11 different proteins. The search space was defined as a sphere of 15 R around the center of mass of the ligand position in the crystal structure, and conversely to other benchmarks, our algorithms was fed with optimized ligand positions up to 10 A root mean square deviation 2MSD) from the crystal structure. This validation illustrates the efficiency of our sampling heuristic, as correct binding modes, defined by a RMSD to the crystal structure lower than 2 A, were identified and ranked first for 68% of the complexes. The success rate increases to 78% when considering the five best-ranked clusters, and 92% when all clusters present in the last generation are taken into account. Most failures in this benchmark could be explained by the presence of crystal contacts in the experimental structure. EADock has been used to understand molecular interactions involved in the regulation of the Na,K ATPase, and in the activation of the nuclear hormone peroxisome proliferatoractivated receptors a (PPARa). It also helped to understand the action of common pollutants (phthalates) on PPARy, and the impact of biotransformations of the anticancer drug Imatinib (Gleevec®) on its binding mode to the Bcr-Abl tyrosine kinase. Finally, a fragment-based rational drug design approach using EADock was developed, and led to the successful design of new peptidic ligands for the a5ß1 integrin, and for the human PPARa. In both cases, the designed peptides presented activities comparable to that of well-established ligands such as the anticancer drug Cilengitide and Wy14,643, respectively. 3.2 French Les récentes difficultés de l'industrie pharmaceutique ne semblent pouvoir se résoudre que par l'optimisation de leur processus de développement de médicaments. Cette dernière implique de plus en plus. de techniques dites "haut-débit", particulièrement efficaces lorsqu'elles sont couplées aux outils informatiques permettant de gérer la masse de données produite. Désormais, les approches in silico telles que le criblage virtuel ou la conception rationnelle de nouvelles molécules sont utilisées couramment. Toutes deux reposent sur la capacité à prédire les détails de l'interaction moléculaire entre une molécule ressemblant à un principe actif (PA) et une protéine cible ayant un intérêt thérapeutique. Les comparatifs de logiciels s'attaquant à cette prédiction sont flatteurs, mais plusieurs problèmes subsistent. La littérature récente tend à remettre en cause leur fiabilité, affirmant l'émergence .d'un besoin pour des approches plus précises du mode d'interaction. Cette précision est essentielle au calcul de l'énergie libre de liaison, qui est directement liée à l'affinité du PA potentiel pour la protéine cible, et indirectement liée à son activité biologique. Une prédiction précise est d'une importance toute particulière pour la découverte et l'optimisation de nouvelles molécules actives. Cette thèse présente un nouveau logiciel, EADock, mettant en avant une telle précision. Cet algorithme évolutionnaire hybride utilise deux pressions de sélections, combinées à une gestion de la diversité sophistiquée. EADock repose sur CHARMM pour les calculs d'énergie et la gestion des coordonnées atomiques. Sa validation a été effectuée sur 37 complexes protéine-ligand cristallisés, incluant 11 protéines différentes. L'espace de recherche a été étendu à une sphère de 151 de rayon autour du centre de masse du ligand cristallisé, et contrairement aux comparatifs habituels, l'algorithme est parti de solutions optimisées présentant un RMSD jusqu'à 10 R par rapport à la structure cristalline. Cette validation a permis de mettre en évidence l'efficacité de notre heuristique de recherche car des modes d'interactions présentant un RMSD inférieur à 2 R par rapport à la structure cristalline ont été classés premier pour 68% des complexes. Lorsque les cinq meilleures solutions sont prises en compte, le taux de succès grimpe à 78%, et 92% lorsque la totalité de la dernière génération est prise en compte. La plupart des erreurs de prédiction sont imputables à la présence de contacts cristallins. Depuis, EADock a été utilisé pour comprendre les mécanismes moléculaires impliqués dans la régulation de la Na,K ATPase et dans l'activation du peroxisome proliferatoractivated receptor a (PPARa). Il a également permis de décrire l'interaction de polluants couramment rencontrés sur PPARy, ainsi que l'influence de la métabolisation de l'Imatinib (PA anticancéreux) sur la fixation à la kinase Bcr-Abl. Une approche basée sur la prédiction des interactions de fragments moléculaires avec protéine cible est également proposée. Elle a permis la découverte de nouveaux ligands peptidiques de PPARa et de l'intégrine a5ß1. Dans les deux cas, l'activité de ces nouveaux peptides est comparable à celles de ligands bien établis, comme le Wy14,643 pour le premier, et le Cilengitide (PA anticancéreux) pour la seconde.
Resumo:
A consistent extension of local spin density approximation (LSDA) to account for mass and dielectric mismatches in nanocrystals is presented. The extension accounting for variable effective mass is exact. Illustrative comparisons with available configuration interaction calculations show that the approach is also very reliable when it comes to account for dielectric mismatches. The modified LSDA is as fast and computationally low demanding as LSDA. Therefore, it is a tool suitable to study large particle systems in inhomogeneous media without much effort.
Resumo:
Low-copy-number molecules are involved in many functions in cells. The intrinsic fluctuations of these numbers can enable stochastic switching between multiple steady states, inducing phenotypic variability. Herein we present a theoretical and computational study based on Master Equations and Fokker-Planck and Langevin descriptions of stochastic switching for a genetic circuit of autoactivation. We show that in this circuit the intrinsic fluctuations arising from low-copy numbers, which are inherently state-dependent, drive asymmetric switching. These theoretical results are consistent with experimental data that have been reported for the bistable system of the gallactose signaling network in yeast. Our study unravels that intrinsic fluctuations, while not required to describe bistability, are fundamental to understand stochastic switching and the dynamical relative stability of multiple states.