945 results for Probability Distribution Function


Relevance:

80.00%

Publisher:

Abstract:

One of the important problems in machine learning is determining the complexity of the model to be learned. Too much complexity leads to overfitting, which corresponds to finding structures that do not actually exist in the data, while too little complexity leads to underfitting, meaning the model is not expressive enough to capture all the structures present in the data. For some probabilistic models, model complexity translates into the introduction of one or more latent variables whose role is to explain the generative process of the data. Various approaches exist for identifying the appropriate number of latent variables in a model. This thesis focuses on Bayesian nonparametric methods for determining the number of latent variables to use as well as their dimensionality. The popularization of Bayesian nonparametric statistics within the machine learning community is fairly recent. Their main appeal is that they provide highly flexible models whose complexity adjusts in proportion to the amount of available data. In recent years, research on Bayesian nonparametric learning methods has focused on three main aspects: the construction of new models, the development of inference algorithms, and applications. This thesis presents our contributions to these three research topics in the context of learning latent variable models. First, we introduce the Pitman-Yor process mixture of Gaussians, a model for learning infinite mixtures of Gaussians. We also present an inference algorithm for discovering the latent components of the model, which we evaluate on two concrete robotics applications. Our results show that the proposed approach outperforms classical learning approaches in both performance and flexibility. Second, we propose the extended cascading Indian buffet process, a model serving as a prior probability distribution over the space of directed acyclic graphs. In the context of Bayesian networks, this prior makes it possible to identify both the presence of latent variables and the network structure among them. A Markov chain Monte Carlo inference algorithm is used for evaluation on structure identification and density estimation problems. Finally, we propose the Indian chefs process, a model more general than the extended cascading Indian buffet process for learning graphs and orders. The advantage of the new model is that it admits connections between observable variables and takes the ordering of the variables into account. We present a reversible-jump Markov chain Monte Carlo inference algorithm for jointly learning graphs and orders. The evaluation is carried out on density estimation and independence testing problems. This model is the first Bayesian nonparametric model capable of learning Bayesian networks with a completely arbitrary structure.
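
A minimal sketch of the Pitman-Yor-style cluster assignment (a generalized Chinese-restaurant construction) that underlies infinite mixtures like the one introduced above; the discount and concentration values are illustrative assumptions, and the Gaussian component parameters of the actual mixture model are omitted.

```python
import numpy as np

def pitman_yor_assignments(n, alpha=1.0, d=0.5, rng=None):
    """Sample cluster assignments from a Pitman-Yor process via its
    generalized Chinese-restaurant construction.
    alpha: concentration parameter, d: discount parameter (0 <= d < 1)."""
    rng = np.random.default_rng(rng)
    counts = []   # number of points already assigned to each existing cluster
    labels = []
    for i in range(n):
        k = len(counts)
        # existing cluster j is chosen with probability (n_j - d) / (i + alpha),
        # a new cluster is opened with probability (alpha + k*d) / (i + alpha)
        probs = np.array([c - d for c in counts] + [alpha + k * d]) / (i + alpha)
        choice = rng.choice(k + 1, p=probs)
        if choice == k:
            counts.append(1)      # open a new cluster
        else:
            counts[choice] += 1
        labels.append(choice)
    return np.array(labels)

# The number of clusters grows with the data, which is the behaviour the
# thesis exploits to let model complexity adapt to the data set size.
print(len(set(pitman_yor_assignments(500, alpha=1.0, d=0.5, rng=0))))
```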

Relevance:

80.00%

Publisher:

Abstract:

Context. Recent observations of brown dwarf spectroscopic variability in the infrared infer the presence of patchy cloud cover. Aims. This paper proposes a mechanism for producing inhomogeneous cloud coverage due to the depletion of cloud particles through the Coulomb explosion of dust in atmospheric plasma regions. Charged dust grains Coulomb-explode when the electrostatic stress of the grain exceeds its mechanical tensile stress, which results in grains below a critical radius a < a_crit^Coul being broken up. Methods. This work outlines the criteria required for the Coulomb explosion of dust clouds in substellar atmospheres, the effect on the dust particle size distribution function, and the resulting radiative properties of the atmospheric regions. Results. Our results show that for an atmospheric plasma region with an electron temperature of T_e = 10 eV (≈10⁵ K), the critical grain radius varies from 10⁻⁷ to 10⁻⁴ cm, depending on the grains' tensile strength. Higher critical radii up to 10⁻³ cm are attainable for higher electron temperatures. We find that the process produces a bimodal particle size distribution composed of stable nanoscale seed particles and dust particles with a ≥ a_crit^Coul, with the intervening particle sizes defining a region devoid of dust. As a result, the dust population is depleted, and the clouds become optically thin in the wavelength range 0.1–10 μm, with a characteristic peak that shifts to longer wavelengths as more sub-micrometer particles are destroyed. Conclusions. In an atmosphere populated with a distribution of plasma volumes, this will yield regions of contrasting radiative properties, thereby giving a source of inhomogeneous cloud coverage. The results presented here may also be relevant for dust in supernova remnants and protoplanetary disks.
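
A back-of-the-envelope sketch of the stress balance behind the Coulomb-explosion criterion described above: a grain breaks up when the electrostatic surface stress ε₀E²/2, with E ≈ φ/a, exceeds its tensile strength. The floating-potential estimate φ ≈ 2.5 T_e/e and the tensile-strength values used here are illustrative assumptions, not figures taken from the paper.

```python
import numpy as np

EPS0 = 8.854e-12   # vacuum permittivity, F/m

def critical_radius_m(Te_eV, tensile_strength_Pa, phi_factor=2.5):
    """Critical grain radius from eps0*(phi/a)^2/2 = sigma,
    i.e. a_crit = phi * sqrt(eps0 / (2*sigma)).
    phi ~ phi_factor * Te (volts) is an assumed floating-potential estimate."""
    phi = phi_factor * Te_eV               # grain surface potential, V
    return phi * np.sqrt(EPS0 / (2.0 * tensile_strength_Pa))

# Te = 10 eV; loose aggregates (~1e3 Pa) vs compact grains (~1e9 Pa)
for sigma in (1e3, 1e6, 1e9):
    a = critical_radius_m(10.0, sigma)
    print(f"sigma = {sigma:.0e} Pa  ->  a_crit ~ {a * 100:.1e} cm")
# Spans roughly 1e-7 to 1e-4 cm, consistent with the range quoted in the abstract.
```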

Relevance:

80.00%

Publisher:

Abstract:

Abstract. Two ideas taken from Bayesian optimization and classifier systems are presented for personnel scheduling based on choosing a suitable scheduling rule from a set for each person's assignment. Unlike our previous work using genetic algorithms, in which learning is implicit, the learning in both approaches is explicit, i.e. we are able to identify building blocks directly. To achieve this, the Bayesian optimization algorithm builds a Bayesian network of the joint probability distribution of the rules used to construct solutions, while the adapted classifier system assigns each rule a strength value that is constantly updated according to its usefulness in the current situation. Computational results on 52 real data instances of nurse scheduling demonstrate the success of both approaches. It is also suggested that the learning mechanism in the proposed approaches might be suitable for other scheduling problems.
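
A minimal sketch of the kind of per-rule strength bookkeeping a classifier-system approach like the one above might use; the update rule, learning rate, reward definition, and rule names are illustrative assumptions, not the authors' exact formulation.

```python
class RuleStrengths:
    """Keep a strength value per scheduling rule and update it from feedback."""
    def __init__(self, rules, initial=1.0, learning_rate=0.2):
        self.strength = {r: initial for r in rules}
        self.lr = learning_rate

    def select(self):
        # Greedy selection; a stochastic (roulette-wheel) choice is equally common.
        return max(self.strength, key=self.strength.get)

    def update(self, rule, reward):
        # Move the rule's strength towards the observed reward
        # (e.g. the quality of the assignment the rule produced).
        s = self.strength[rule]
        self.strength[rule] = s + self.lr * (reward - s)

# Usage on hypothetical rules, with reward assumed on a 0..1 scale.
rules = ["highest_penalty_first", "least_flexible_first", "random_feasible"]
rs = RuleStrengths(rules)
rs.update("least_flexible_first", reward=0.8)
print(rs.select())
```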

Relevance:

80.00%

Publisher:

Abstract:

In the past decade, systems that extract information from millions of Internet documents have become commonplace. Knowledge graphs -- structured knowledge bases that describe entities, their attributes and the relationships between them -- are a powerful tool for understanding and organizing this vast amount of information. However, a significant obstacle to knowledge graph construction is the unreliability of the extracted information, due to noise and ambiguity in the underlying data or errors made by the extraction system, together with the complexity of reasoning about the dependencies between these noisy extractions. My dissertation addresses these challenges by exploiting the interdependencies between facts to improve the quality of the knowledge graph in a scalable framework. I introduce a new approach called knowledge graph identification (KGI), which resolves the entities, attributes and relationships in the knowledge graph by incorporating uncertain extractions from multiple sources, entity co-references, and ontological constraints. I define a probability distribution over possible knowledge graphs and infer the most probable knowledge graph using a combination of probabilistic and logical reasoning. Such probabilistic models are frequently dismissed due to scalability concerns, but my implementation of KGI maintains tractable performance on large problems through the use of hinge-loss Markov random fields, which have a convex inference objective. This allows inference over large knowledge graphs with 4M facts and 20M ground constraints in 2 hours. To scale the solution further, I develop a distributed approach to the KGI problem which runs in parallel across multiple machines, reducing inference time by 90%. Finally, I extend my model to the streaming setting, where a knowledge graph is continuously updated by incorporating newly extracted facts. I devise a general approach for approximately updating inference in convex probabilistic models, and quantify the approximation error by defining and bounding inference regret for online models. Together, my work retains the attractive features of probabilistic models while providing the scalability necessary for large-scale knowledge graph construction. These models have been applied to a number of real-world knowledge graph projects, including the NELL project at Carnegie Mellon and the Google Knowledge Graph.
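
A toy sketch of why hinge-loss Markov random fields keep the inference objective convex: each ground rule over [0, 1]-valued (soft) facts contributes a weighted hinge-loss potential max(0, ...) raised to a power, so MAP inference is a convex minimization. The rule, weights, prior, and the naive single-variable gradient loop below are illustrative assumptions, not the dissertation's actual rule set or solver.

```python
def hinge_loss_potential(body, head, weight, p=2):
    """Weighted hinge-loss potential for a soft rule 'body => head':
    phi = w * max(0, body - head)^p, with body and head in [0, 1].
    Convex in the relaxed truth values."""
    return weight * max(0.0, body - head) ** p

# Toy setup: infer the truth value x of a single unknown fact, given soft
# evidence 0.9 and 0.8 for two supporting facts (hypothetical values),
# a rule pushing x up, and a prior potential pulling x towards 0.
evidence = min(0.9, 0.8)          # soft conjunction of the evidence (min, an illustrative choice)
w_rule, w_prior, p = 2.0, 0.5, 2

def objective(x):
    return hinge_loss_potential(evidence, x, w_rule, p) + w_prior * x ** p

# Naive projected gradient descent on the single variable x in [0, 1].
x = 0.5
for _ in range(200):
    eps = 1e-4
    grad = (objective(x + eps) - objective(x - eps)) / (2 * eps)
    x = min(1.0, max(0.0, x - 0.05 * grad))
print(round(x, 3))   # settles between the rule's upward push and the prior's downward pull
```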

Relevance:

80.00%

Publisher:

Abstract:

We implement conditional moment closure (CMC) for the simulation of chemical reactions in laminar chaotic flows. The CMC approach predicts the expected concentration of reactive species, conditional upon the concentration of a corresponding nonreactive scalar. Closure is obtained by neglecting the difference between the local concentration of the reactive scalar and its conditional average. We first use a Monte Carlo method to calculate the evolution of the moments of a conserved scalar; we then reconstruct the corresponding probability density function and dissipation rate. Finally, the concentrations of the reactive scalars are determined. The results are compared with, and show excellent agreement with, full numerical simulations of the reaction processes in a chaotic laminar flow. This is a preprint of an article published in AIChE Journal, copyright (2007) American Institute of Chemical Engineers: http://www3.interscience.wiley.com/
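
A small sketch of the moment-to-PDF step: given the Monte Carlo mean and variance of the conserved scalar, a presumed beta distribution is one common way to reconstruct its PDF. The beta-PDF form and the numerical moments are my assumptions for illustration; the paper's own reconstruction procedure is not reproduced here.

```python
import numpy as np
from scipy.stats import beta

def presumed_beta_pdf(mean, var, z):
    """Reconstruct a presumed PDF of a conserved scalar Z in [0, 1]
    from its first two moments via a method-of-moments beta fit."""
    # mean = a/(a+b),  var = ab/((a+b)^2 (a+b+1));  requires var < mean*(1-mean)
    s = mean * (1.0 - mean) / var - 1.0
    a, b = mean * s, (1.0 - mean) * s
    return beta.pdf(z, a, b)

# Example with moments that might come out of the Monte Carlo step (illustrative values).
z = np.linspace(0.0, 1.0, 201)
pdf = presumed_beta_pdf(mean=0.3, var=0.02, z=z)
print(np.trapz(pdf, z))   # ~1, i.e. a properly normalised PDF
```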

Relevance:

80.00%

Publisher:

Abstract:

International audience

Relevance:

80.00%

Publisher:

Abstract:

Doctoral thesis, Universidade de Brasília, Faculdade de Tecnologia, Programa de Pós-Graduação em Geotecnia, 2015.

Relevance:

80.00%

Publisher:

Abstract:

We derive and solve models for coagulation with mass loss arising, for example, from industrial processes in which growing inclusions are lost from the melt by colliding with the wall of the vessel. We consider a variety of loss laws and a variety of coagulation kernels, deriving exact results where possible, and more generally reducing the equations to similarity solutions valid in the large-time limit. One notable result is the effect that mass removal has on gelation: for small loss rates, gelation is delayed, whilst above a critical threshold, gelation is completely prevented. Finally, by forming an exact explicit solution for a more general initial cluster size distribution function, we illustrate how numerical results from earlier work can be interpreted in the light of the theory presented herein.
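
As a reference point for the models discussed above, one standard discrete coagulation equation augmented with a removal term is shown below; the notation (kernel K_{i,j}, loss rate L_k) is generic and does not reproduce the specific kernels and loss laws treated in the paper.

```latex
\frac{\mathrm{d}c_k}{\mathrm{d}t}
  = \underbrace{\tfrac{1}{2}\sum_{i+j=k} K_{i,j}\,c_i c_j}_{\text{gain by coagulation}}
  \;-\; \underbrace{c_k \sum_{j\ge 1} K_{k,j}\,c_j}_{\text{loss by coagulation}}
  \;-\; \underbrace{L_k\,c_k}_{\text{mass removal (e.g. to the vessel wall)}},
```

where c_k(t) is the concentration of clusters of mass k. Gelation corresponds to a finite-time breakdown of mass conservation in the coagulation terms alone; the removal term L_k c_k competes with this, which is how a sufficiently large loss rate can delay or, above a threshold, prevent gelation.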

Relevance:

80.00%

Publisher:

Abstract:

Plasma processes such as ion nitriding and cathodic cage plasma nitriding are used to harden steel surfaces. Ion nitriding is already established in industry, while cathodic cage plasma nitriding is at the industrial implementation stage. Both processes depend on plasma parameters such as the electron and ion temperatures (Te, Ti), the species densities (ne, ni), and the distribution functions of these species. In the present work, the plasma used in these two processes was observed by Optical Emission Spectroscopy (OES) in order to identify the species present in the treatment environment and to quantify them in relative terms. Plasmas of typical N2-H2 mixtures were monitored in order to follow the evolution of these species during the process. In addition, a systematic OES study of leaks was carried out, tracking the evolution of the contaminant species that arise from the influx of atmospheric air into the nitriding chamber and identifying the conditions under which these species are sufficiently reduced. Finally, the aim is to describe the physical mechanism acting in both treatment techniques, ion nitriding and cathodic cage plasma nitriding.

Relevance:

80.00%

Publisher:

Abstract:

The X-Ray Powder Diffraction (XRPD) laboratory is a facility located at the Servicios Centrales de Apoyo a la Investigación (SCAI) of the University of Malaga (UMA), http://www.scai.uma.es/. The facility houses three XRPD diffractometers and a diffractometer for high-resolution thin-film measurements. X'Pert PRO MPD from PANalytical: a Bragg-Brentano (theta/2theta) reflection-geometry diffractometer that provides high-resolution XRPD data with strictly monochromatic CuKα1 radiation (λ=1.54059 Å) [Ge(111) primary monochromator] and an automatic sample changer. It also offers parallel monochromatic CuKα1 radiation (λ=1.54059 Å) with a hybrid Ge(220) monochromator for capillary and multipurpose (bulk samples up to 1 kg) sample holders. The HTK1200N chamber from Anton Paar allows high-resolution high-temperature patterns to be collected. EMPYREAN from PANalytical: this diffractometer works in reflection and transmission geometries with a theta/theta goniometer, using CuKα1,2 radiation (λ=1.5418 Å), a focusing X-ray mirror and an ultra-fast PIXCEL 3D detector with 1D and 2D data-collection modes (microstructural and preferred-orientation analysis). The TTK450N chamber allows studies at low temperature and up to 450 ºC. A D8 ADVANCE (BRUKER) was installed in April 2014. It is the first diffractometer in Europe equipped with a Johansson Ge(111) primary monochromator, which gives strictly monochromatic Mo radiation (λ=0.7093 Å) [1]. It works in transmission mode (with a sample changer) in this high-resolution configuration. XRPD data suitable for PDF (Pair Distribution Function) analysis can be collected with a capillary sample holder thanks to the high-energy and high-resolution capabilities of this diffractometer. It also has an MHC-trans humidity chamber from Anton Paar working in transmission mode with MoKα1 (measurements can be collected at 5 to 95% relative humidity (from 20 to 80 ºC) and up to 150 ºC [2]). Furthermore, this diffractometer has an XRK900 reaction chamber from Anton Paar (which uses CuKα1,2 radiation in reflection mode), allowing data collection from room temperature to 900 ºC under up to 10 bar of different gases. Finally, a D8 DISCOVER A25 from BRUKER was installed in December 2014. It has a five-axis Eulerian cradle and optics suitable for high-resolution thin-film data collection in in-plane and out-of-plane modes. In summary, high-resolution thin-film, microstructural, rocking-curve, Small Angle X-ray Scattering (SAXS), Grazing-Incidence SAXS (GISAXS), Ultra Grazing-Incidence Diffraction (Ultra-GID) and microdiffraction measurements can be performed with the appropriate optics and sample holders. [1] L. León-Reina, M. García-Maté, G. Álvarez-Pinazo, I. Santacruz, O. Vallcorba, A.G. De la Torre, M.A.G. Aranda, "Accuracy in Rietveld quantitative phase analysis: a comparative study of strictly monochromatic Mo and Cu radiations", J. Appl. Crystallogr. 2016, 49, 722-735. [2] J. Aríñez-Soriano, J. Albalad, C. Vila-Parrondo, J. Pérez-Carvajal, S. Rodríguez-Hermida, A. Cabeza, F. Busqué, J. Juanhuix, I. Imaz, D. Maspoch, "Single-crystal and humidity-controlled powder diffraction study of the breathing effect in a metal-organic framework upon water adsorption/desorption", Chem. Commun., 2016, DOI: 10.1039/C6CC02908F.

Relevance:

80.00%

Publisher:

Abstract:

The change in the carbonaceous skeleton of nanoporous carbons during their activation has received limited attention, unlike its counterpart process in the presence of an inert atmosphere. Here we adopt a multi-method approach to elucidate this change in a poly(furfuryl alcohol)-derived carbon activated using cyclic application of oxygen saturation at 250 °C before its removal (with carbon) at 800 °C in argon. The methods used include helium pycnometry, synchrotron-based X-ray diffraction (XRD) and associated radial distribution function (RDF) analysis, transmission electron microscopy (TEM) and, uniquely, electron energy-loss spectroscopy spectrum-imaging (EELS-SI), electron nanodiffraction and fluctuation electron microscopy (FEM). Helium pycnometry indicates the solid skeleton of the carbon densifies during activation from 78% to 93% of graphite. RDF analysis, EELS-SI, and FEM all suggest this densification comes through an in-plane growth of sp2 carbon out to the medium range without commensurate increase in order normal to the plane. This process could be termed ‘graphenization’. The exact way in which this process occurs is not clear, but TEM images of the carbon before and after activation suggest it may come through removal of the more reactive carbon, breaking constraining cross-links and creating space that allows the remaining carbon material to migrate in an annealing-like process.

Relevance:

80.00%

Publisher:

Abstract:

The present study provides a methodology that gives a predictive character to computer simulations based on detailed models of the geometry of a porous medium. We use the software FLUENT to investigate the flow of a viscous Newtonian fluid through a random fractal medium, a simplified two-dimensional disordered porous medium representing a petroleum reservoir. This fractal model is formed by obstacles of various sizes whose size distribution function follows a power law, with an exponent defined as the fractal fragmentation dimension Dff of the model, characterizing the fragmentation process of these obstacles. The obstacles are randomly placed in a rectangular channel. The modeling process incorporates modern concepts, such as scaling laws, to analyze the influence of the heterogeneity found in the porosity and permeability fields, so as to characterize the medium in terms of its fractal properties. This procedure allows us to numerically analyze measurements of the permeability k and the drag coefficient Cd and to propose power-law-type relationships for these properties under various modeling schemes. The purpose of this research is to study the variability introduced by these heterogeneities; the velocity field and other details of the viscous fluid dynamics are obtained by numerically solving the continuity and Navier-Stokes equations at the pore level, and we observe how the fractal fragmentation dimension of the model affects its hydrodynamic properties. Two classes of models were considered: models with constant porosity (MPC) and models with varying porosity (MPV). The results allowed us to find numerical relationships between the permeability, the drag coefficient and the fractal fragmentation dimension of the medium. Based on these numerical results, we propose scaling relations and algebraic expressions involving the relevant parameters of the phenomenon. Analytical equations were determined for Dff as a function of the geometrical parameters of the models. We also found that the permeability and the drag coefficient are inversely proportional to one another. The difference in behavior is most striking in the MPV class of models; the fact that the porosity varies in these models is an additional factor that plays a significant role in the flow analysis. Finally, the results proved satisfactory and consistent, demonstrating the effectiveness of the methodology for all the applications analyzed in this study.
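
A small sketch of generating obstacle sizes whose distribution follows a power law governed by a fragmentation-type exponent, in the spirit of the fractal model described above; the exponent convention, size bounds, and sampling scheme are illustrative assumptions, not the study's actual construction.

```python
import numpy as np

def sample_power_law_sizes(n, d_ff, r_min, r_max, rng=None):
    """Draw n obstacle radii with a bounded power-law (Pareto-type) size
    distribution, N(>r) ~ r^(-d_ff), via inverse-transform sampling."""
    rng = np.random.default_rng(rng)
    u = rng.random(n)
    a, b = r_min ** (-d_ff), r_max ** (-d_ff)
    # Inverse CDF of the bounded power law with exponent d_ff.
    return (a - u * (a - b)) ** (-1.0 / d_ff)

# Example: 1000 obstacles for a model with an assumed fragmentation
# dimension of 1.5, to be placed (elsewhere) at random in a rectangular channel.
radii = sample_power_law_sizes(1000, d_ff=1.5, r_min=0.01, r_max=0.2, rng=42)
print(radii.min(), radii.max(), radii.mean())
```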

Relevance:

80.00%

Publisher:

Abstract:

Mobile and wireless networks have long exploited mobility predictions, focused on predicting the future location of given users, to perform more efficient network resource management. In this paper, we present a new approach in which we provide predictions as a probability distribution of the likelihood of moving to a set of future locations. This approach provides wireless services with a greater amount of knowledge and enables them to perform more effectively. We present a framework for the evaluation of this new type of predictor, and develop two new predictors, HEM and G-Stat. We evaluate our predictors' accuracy in predicting future cells for mobile users, using two large geolocation data sets, from MDC [11], [12] and Crawdad [13]. We show that our predictors can successfully predict with an average inaccuracy as low as 2.2% in certain scenarios.
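
A minimal sketch of what a distribution-valued next-cell predictor can look like: a simple frequency (first-order Markov) model that returns a probability distribution over candidate future cells rather than a single guess. This generic baseline is for illustration only; it is not the HEM or G-Stat predictor from the paper, and the trajectory data below is hypothetical.

```python
from collections import Counter, defaultdict

class NextCellPredictor:
    """First-order Markov predictor over cell IDs that outputs a
    probability distribution over the next cell instead of a point estimate."""
    def __init__(self):
        self.transitions = defaultdict(Counter)

    def update(self, current_cell, next_cell):
        self.transitions[current_cell][next_cell] += 1

    def predict(self, current_cell):
        counts = self.transitions[current_cell]
        total = sum(counts.values())
        if total == 0:
            return {}                      # no history for this cell yet
        return {cell: n / total for cell, n in counts.items()}

# Usage on a toy trajectory of cell IDs.
trace = ["A", "B", "A", "B", "C", "A", "B"]
p = NextCellPredictor()
for cur, nxt in zip(trace, trace[1:]):
    p.update(cur, nxt)
print(p.predict("A"))   # {'B': 1.0}: every observed move from A went to B
```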