969 results for Probability distribution functions
Abstract:
Various physical systems have dynamics that can be modeled by percolation processes. Percolation is used to study issues ranging from fluid diffusion through disordered media to the fragmentation of a computer network caused by hacker attacks. A common feature of all of these systems is the presence of two non-coexistent regimes associated with certain properties of the system. For example, a disordered medium may or may not allow a fluid to flow, depending on its porosity. The change from one regime to the other characterizes the percolation phase transition. The standard way of analyzing this transition uses the order parameter, a variable related to some characteristic of the system that is zero in one of the regimes and nonzero in the other. The proposal introduced in this thesis is that this phase transition can be investigated without the explicit use of the order parameter, through the Shannon entropy instead. This entropy is a measure of the degree of uncertainty in the information content of a probability distribution. The proposal is evaluated in the context of cluster formation in random graphs, and we apply the method to both classical (Erdős-Rényi) percolation and explosive percolation. It is based on the computation of the entropy contained in the cluster size probability distribution, and the results show that the critical point of the transition relates to the derivatives of the entropy. Furthermore, the difference between the smooth and abrupt aspects of the classical and explosive percolation transitions, respectively, is reinforced by the observation that the entropy has a maximum at the critical point of the classical transition, while that correspondence does not occur in explosive percolation.
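The entropy of a cluster size distribution can be illustrated with a minimal Python sketch. It is not the thesis's code: the function names are hypothetical, the random graph is built by adding uniformly random edges (self-loops and repeated edges are allowed, which only approximates the Erdős-Rényi model), and p(s) is assumed to be the fraction of clusters that have size s, which may differ from the definition used in the thesis.

```python
import math
import random
from collections import Counter


def find(parent, i):
    """Return the root of node i, shortening the path along the way."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def cluster_sizes(n, num_edges, rng):
    """Add num_edges uniformly random edges to n isolated nodes.

    Returns the list of cluster (connected component) sizes.
    Assumption: self-loops and repeated edges are not filtered out.
    """
    parent = list(range(n))
    size = [1] * n
    for _ in range(num_edges):
        a = find(parent, rng.randrange(n))
        b = find(parent, rng.randrange(n))
        if a != b:
            if size[a] < size[b]:
                a, b = b, a
            parent[b] = a          # attach the smaller cluster to the larger
            size[a] += size[b]
    return [size[i] for i in range(n) if parent[i] == i]


def cluster_size_entropy(sizes):
    """Shannon entropy (in nats) of p(s) = fraction of clusters of size s."""
    counts = Counter(sizes)
    total = len(sizes)
    return -sum((c / total) * math.log(c / total) for c in counts.values())


if __name__ == "__main__":
    rng = random.Random(1)
    n = 20000
    for ratio in (0.2, 0.4, 0.5, 0.6, 0.8):
        sizes = cluster_sizes(n, int(ratio * n), rng)
        print(ratio, round(cluster_size_entropy(sizes), 3))
```

Sweeping the edge-to-node ratio as in the `__main__` block gives an entropy curve that can be inspected around ratio 0.5, where the classical Erdős-Rényi transition occurs (mean degree 1).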
Abstract:
The aim of this paper is to suggest a simple methodology that renewable power generators can use to bid in Spanish markets so as to minimize the cost of their imbalances. As is well known, the optimal bid depends on the probability distribution function of the energy to be produced, on the probability distribution function of the future system imbalance, and on its expected cost. We assume simple methods for estimating each of these parameters and, using actual data from 2014, we test the potential economic benefit for a wind generator of using our optimal bid instead of just the expected power generation. We find evidence that the savings for Spanish wind generators would range from 7% to 26%.
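Why the optimal bid can differ from the expected generation is easiest to see in a simplified setting. The sketch below is not the paper's methodology: it assumes fixed, known per-MWh penalties `c_short` (production below the bid) and `c_long` (production above the bid), ignores the distribution of the system imbalance, and represents the production distribution by a list of samples. Under those assumptions the cost-minimizing bid is the quantile of order c_long / (c_short + c_long) of the production distribution. All names are hypothetical.

```python
import math


def expected_imbalance_cost(bid, samples, c_short, c_long):
    """Average imbalance cost of a bid over sampled production values."""
    total = 0.0
    for x in samples:
        total += c_short * max(bid - x, 0.0)   # produced less than bid
        total += c_long * max(x - bid, 0.0)    # produced more than bid
    return total / len(samples)


def quantile_bid(samples, c_short, c_long):
    """Bid minimizing the average cost above: an empirical quantile."""
    q = c_long / (c_short + c_long)
    ordered = sorted(samples)
    # Smallest order statistic whose empirical CDF value reaches q.
    idx = min(len(ordered) - 1, max(0, math.ceil(q * len(ordered)) - 1))
    return ordered[idx]


if __name__ == "__main__":
    production = [10.0, 20.0, 30.0, 40.0, 50.0]  # illustrative values, MWh
    bid = quantile_bid(production, c_short=3.0, c_long=1.0)
    mean = sum(production) / len(production)
    print("quantile bid:", bid,
          "cost:", expected_imbalance_cost(bid, production, 3.0, 1.0))
    print("mean bid:", mean,
          "cost:", expected_imbalance_cost(mean, production, 3.0, 1.0))
```

When a shortfall is penalized more heavily than a surplus, the quantile bid falls below the mean, which is the kind of asymmetry that makes bidding the expected generation suboptimal.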
Abstract:
Recent theoretical advances predict the existence, deep into the glass phase, of a novel phase transition, the so-called Gardner transition. This transition is associated with the emergence of a complex free energy landscape composed of many marginally stable sub-basins within a glass metabasin. In this study, we explore several methods to detect numerically the Gardner transition in a simple structural glass former, the infinite-range Mari-Kurchan model. The transition point is robustly located from three independent approaches: (i) the divergence of the characteristic relaxation time, (ii) the divergence of the caging susceptibility, and (iii) the abnormal tail in the probability distribution function of cage order parameters. We show that the numerical results are fully consistent with the theoretical expectation. The methods we propose may also be generalized to more realistic numerical models as well as to experimental systems.
Abstract:
Prior research has established that the idiosyncratic volatility of security prices exhibits a positive trend. This trend and other factors have made the merits of investment diversification and portfolio construction more compelling. A new optimization technique, a greedy algorithm, is proposed to optimize the weights of assets in a portfolio. The main benefits of using this algorithm are to: a) increase the efficiency of the portfolio optimization process, b) implement large-scale optimizations, and c) improve the resulting optimal weights. In addition, the technique utilizes a novel approach to the construction of a time-varying covariance matrix. This involves the application of a modified integrated dynamic conditional correlation GARCH (IDCC-GARCH) model to account for the dynamics of the conditional covariance matrices that are employed. The stochastic aspects of the expected return of the securities are integrated into the technique through Monte Carlo simulations. Instead of representing the expected returns as deterministic values, they are assigned simulated values based on their historical measures. The time series of the securities are fitted to a probability distribution that matches the time-series characteristics, using the Anderson-Darling goodness-of-fit criterion. Simulated and actual data sets are used to further generalize the results. Employing the S&P500 securities as the base, 2000 simulated data sets are created using Monte Carlo simulation. In addition, the Russell 1000 securities are used to generate 50 sample data sets. The results indicate an increase in risk-return performance. Choosing the Value-at-Risk (VaR) as the criterion and the Crystal Ball portfolio optimizer, a commercial product currently available on the market, as the comparison for benchmarking, the new greedy technique clearly outperforms others using a sample of the S&P500 and the Russell 1000 securities.
The resulting improvements in performance are consistent among five securities selection methods (maximum, minimum, random, absolute minimum, and absolute maximum) and three covariance structures (unconditional, orthogonal GARCH, and integrated dynamic conditional GARCH).
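The Anderson-Darling criterion mentioned above can be sketched in a few lines of Python. This is a generic illustration, not the procedure used in the work: it computes the standard statistic A² against fully specified candidate CDFs and picks the candidate with the smallest value, without the critical-value corrections that apply when parameters are estimated from the same data. The function names and the candidate set are assumptions.

```python
import math
from statistics import NormalDist


def anderson_darling(samples, cdf):
    """Anderson-Darling statistic A^2 of samples against a given CDF."""
    xs = sorted(samples)
    n = len(xs)
    eps = 1e-12  # clamp so that log() never receives exactly 0 or 1
    u = [min(1.0 - eps, max(eps, cdf(x))) for x in xs]
    total = sum(
        (2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
        for i in range(n)
    )
    return -n - total / n


def choose_distribution(samples, candidates):
    """Return the name of the candidate CDF with the smallest A^2."""
    return min(candidates, key=lambda name: anderson_darling(samples, candidates[name]))


if __name__ == "__main__":
    returns = [-0.021, -0.013, -0.004, 0.000, 0.003, 0.008, 0.011, 0.019]  # illustrative
    fitted = NormalDist.from_samples(returns)
    candidates = {
        "fitted normal": fitted.cdf,
        "standard normal": NormalDist(0.0, 1.0).cdf,
    }
    print(choose_distribution(returns, candidates))
```

A smaller A² indicates a closer match between the empirical and candidate distributions, with extra weight on the tails, which is one reason the statistic is popular for return series.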
Abstract:
We calculate the first two moments and full probability distribution of the work performed on a system of bosonic particles in a two-mode Bose-Hubbard Hamiltonian when the self-interaction term is varied instantaneously or with a finite-time ramp. In the instantaneous case, we show how the irreversible work scales differently depending on whether the system is driven to the Josephson or Fock regime of the bosonic Josephson junction. In the finite-time case, we use optimal control techniques to substantially decrease the irreversible work to negligible values. Our analysis can be implemented in present-day experiments with ultracold atoms and we show how to relate the work statistics to that of the population imbalance of the two modes.
Abstract:
Palladium nanoparticles have been immobilized into an amino-functionalized metal-organic framework (MOF), MIL-101Cr-NH2, to form Pd@MIL-101Cr-NH2. Four materials with different loadings of palladium have been prepared (denoted as 4, 8, 12, and 16 wt% Pd@MIL-101Cr-NH2). The effects of catalyst loading and the size and distribution of the Pd nanoparticles on the catalytic performance have been studied. The catalysts were characterized by using scanning electron microscopy (SEM), transmission electron microscopy (TEM), Fourier-transform infrared (FTIR) spectroscopy, powder X-ray diffraction (PXRD), N2-sorption isotherms, elemental analysis, and thermogravimetric analysis (TGA). To better characterize the palladium nanoparticles and their distribution in MIL-101Cr-NH2, electron tomography was employed to reconstruct the 3D volume of 8 wt% Pd@MIL-101Cr-NH2 particles. The pair distribution functions (PDFs) of the samples were extracted from total scattering experiments using high-energy X-rays (60 keV). The catalytic activity of the four MOF materials with different loadings of palladium nanoparticles was studied in the Suzuki-Miyaura cross-coupling reaction. The best catalytic performance was obtained with the MOF that contained 8 wt% palladium nanoparticles. The metallic palladium nanoparticles were homogeneously distributed, with an average size of 2.6 nm. Excellent yields were obtained for a wide scope of substrates under remarkably mild conditions (water, aerobic conditions, room temperature, catalyst loading as low as 0.15 mol%). The material can be recycled at least 10 times without alteration of its catalytic properties.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
One of the important problems in machine learning is determining the complexity of the model to be learned. Too much complexity leads to overfitting, which amounts to finding structures that do not really exist in the data, whereas too little complexity leads to underfitting, meaning that the model is not expressive enough to capture all the structures present in the data. For some probabilistic models, model complexity takes the form of one or more hidden variables whose role is to explain the generative process of the data. Various approaches exist for identifying the appropriate number of hidden variables in a model. This thesis focuses on Bayesian nonparametric methods for determining the number of hidden variables to use as well as their dimensionality. The popularization of Bayesian nonparametric statistics within the machine learning community is fairly recent. Their main appeal is that they offer highly flexible models whose complexity scales with the amount of available data. In recent years, research on Bayesian nonparametric learning methods has focused on three main aspects: the construction of new models, the development of inference algorithms, and applications. This thesis presents our contributions to these three research topics in the context of learning hidden-variable models. First, we introduce the Pitman-Yor process mixture of Gaussians, a model for learning infinite mixtures of Gaussians. We also present an inference algorithm for discovering the hidden components of the model, which we evaluate on two concrete robotics applications.
Our results show that the proposed approach outperforms classical learning approaches in both performance and flexibility. Second, we propose the extended cascading Indian buffet process, a model that serves as a prior probability distribution over the space of directed acyclic graphs. In the context of Bayesian networks, this prior makes it possible to identify both the presence of hidden variables and the network structure among them. A Markov chain Monte Carlo inference algorithm is used for evaluation on structure identification and density estimation problems. Finally, we propose the Indian chefs process, a model more general than the extended cascading Indian buffet process, for learning graphs and orders. The advantage of the new model is that it allows connections between observable variables and takes the order of the variables into account. We present a reversible-jump Markov chain Monte Carlo inference algorithm for jointly learning graphs and orders. The evaluation is carried out on density estimation and independence testing problems. This model is the first Bayesian nonparametric model capable of learning Bayesian networks with a completely arbitrary structure.
Abstract:
This thesis is set in the context of the industrial and economic optimization of structural elements made of ultra-high performance fibre-reinforced concrete (UHPFRC; BFUP in French), with the aim of guaranteeing their ductility at the structural level while adjusting the fibre content and optimizing the manufacturing process. The model developed explicitly describes the contribution of the fibre reinforcement in tension at the local level, as a hardening phase followed by a softening phase. The constitutive law depends on the fibre density, the orientation of the fibres with respect to the principal tensile directions, their aspect ratio, and other usual material parameters related to the fibres, the cementitious matrix and their interaction. Fibre orientation is accounted for through a normal probability law in one or two variables, which can reproduce any orientation obtained from a computation representative of the casting of fresh UHPFRC or determined by experimental analysis on a prototype. Finally, the model reproduces cracking in UHPFRC following the principle of smeared and rotating crack models. The constitutive law is implemented in a finite element structural analysis package, so that it can be used as a predictive tool for the reliability and overall ductility of UHPFRC elements. Two experimental campaigns were carried out, one at Université Laval in Québec and the other at Ifsttar, Marne-la-Vallée. The first validates the model's ability to reproduce the global behaviour under typical tensile and bending loads in simple structural elements for which the preferential fibre orientation was determined by tomography.
The second experimental campaign demonstrates the model's capabilities in an optimization approach, for the manufacture of relatively complex ribbed plates of potential industrial interest, for which different manufacturing methods and UHPFRC mixes with higher or lower fibre contents were considered. The distribution and orientation of the fibres were checked by mechanical tests on extracted samples. The model's predictions were compared with the global structural behaviour and the ductility observed experimentally. The model could thus be qualified against the usual analytical engineering methods, taking statistical variability into account. Avenues for improvement and further development were identified.
Abstract:
Two ideas taken from Bayesian optimization and classifier systems are presented for personnel scheduling based on choosing a suitable scheduling rule from a set for each person's assignment. Unlike our previous work using genetic algorithms, whose learning is implicit, the learning in both approaches is explicit, i.e. we are able to identify building blocks directly. To achieve this, the Bayesian optimization algorithm builds a Bayesian network of the joint probability distribution of the rules used to construct solutions, while the adapted classifier system assigns each rule a strength value that is constantly updated according to its usefulness in the current situation. Computational results from 52 real data instances of nurse scheduling demonstrate the success of both approaches. It is also suggested that the learning mechanism in the proposed approaches might be suitable for other scheduling problems.
Abstract:
In the past decade, systems that extract information from millions of Internet documents have become commonplace. Knowledge graphs -- structured knowledge bases that describe entities, their attributes and the relationships between them -- are a powerful tool for understanding and organizing this vast amount of information. However, a significant obstacle to knowledge graph construction is the unreliability of the extracted information, due to noise and ambiguity in the underlying data or errors made by the extraction system and the complexity of reasoning about the dependencies between these noisy extractions. My dissertation addresses these challenges by exploiting the interdependencies between facts to improve the quality of the knowledge graph in a scalable framework. I introduce a new approach called knowledge graph identification (KGI), which resolves the entities, attributes and relationships in the knowledge graph by incorporating uncertain extractions from multiple sources, entity co-references, and ontological constraints. I define a probability distribution over possible knowledge graphs and infer the most probable knowledge graph using a combination of probabilistic and logical reasoning. Such probabilistic models are frequently dismissed due to scalability concerns, but my implementation of KGI maintains tractable performance on large problems through the use of hinge-loss Markov random fields, which have a convex inference objective. This allows the inference of large knowledge graphs using 4M facts and 20M ground constraints in 2 hours. To further scale the solution, I develop a distributed approach to the KGI problem which runs in parallel across multiple machines, reducing inference time by 90%. Finally, I extend my model to the streaming setting, where a knowledge graph is continuously updated by incorporating newly extracted facts. 
I devise a general approach for approximately updating inference in convex probabilistic models, and quantify the approximation error by defining and bounding inference regret for online models. Together, my work retains the attractive features of probabilistic models while providing the scalability necessary for large-scale knowledge graph construction. These models have been applied on a number of real-world knowledge graph projects, including the NELL project at Carnegie Mellon and the Google Knowledge Graph.
Abstract:
International audience
Abstract:
Quark transverse spin and transverse momentum distributions are two current research frontiers in understanding the spin structure of the nucleon. The goal of the research reported in this dissertation is to extract new information on the quark transversity distribution and the novel transverse-momentum-dependent Sivers function in the neutron. A semi-inclusive deep inelastic scattering experiment was performed in Hall A of Jefferson Lab using a 5.9 GeV electron beam and a transversely polarized ^{3}He target. The scattered electrons and the produced hadrons (pions, kaons, and protons) were detected in coincidence with two large magnetic spectrometers. By regularly flipping the spin direction of the transversely polarized target, the single-spin asymmetry (SSA) of the semi-inclusive deep inelastic reaction ^{3}He^{\uparrow}(e,e'h^{\pm})X was measured over the kinematic range 0.13 < x < 0.41 and 1.3 < Q^{2} < 3.1 (GeV)^{2}. The SSA contains several different azimuthal angular modulations, which are convolutions of the quark distribution functions in the nucleon and the quark fragmentation functions into hadrons. It is from the extraction of the various "moments" of these azimuthal angular distributions (the Collins moment and the Sivers moment) that we obtain information on the quark transversity distribution and the novel T-odd Sivers function. In this dissertation, I first introduce the theoretical background and experimental status of nucleon spin and the physics of SSAs. I will then present the experimental setup and data collection of the JLab E06-010 experiment. Details of the data analysis will be discussed next, with emphasis on kaon particle identification and the Ring-Imaging Cherenkov detector, which were my major responsibilities in this experiment. Finally, results on the kaon Collins and Sivers moments extracted with the Maximum Likelihood method will be presented and interpreted.
I will conclude with a discussion on the future prospects for this research.
Abstract:
In this paper we construct a model for two simultaneous processes: compaction, by which clusters are restructured, and growth of clusters by pairwise coagulation. The model has the form of a multicomponent aggregation problem in which the components are cluster mass and cluster diameter. Following suitable approximations, exact explicit solutions are derived which may be useful for the verification of simulations of such systems. Numerical simulations are presented to illustrate typical behaviour and to show the accuracy of the approximations made in deriving the model. The solutions are then simplified using asymptotic techniques to show the relevant timescales of the kinetic processes and to elucidate the shape of the cluster distribution functions at large times.
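For orientation, pairwise coagulation on its own is classically described by the one-component Smoluchowski equation, given below in its standard continuous form. This is background only, not the paper's model, which tracks cluster diameter as a second component and includes compaction; the symbols n(m,t) for the cluster mass distribution and K for the coagulation kernel are the usual textbook notation, assumed here.

```latex
\frac{\partial n(m,t)}{\partial t}
  = \frac{1}{2}\int_{0}^{m} K(m', m-m')\, n(m',t)\, n(m-m',t)\, \mathrm{d}m'
  \;-\; n(m,t)\int_{0}^{\infty} K(m,m')\, n(m',t)\, \mathrm{d}m'
```

The first term counts clusters of mass m created by merging two smaller clusters, and the second counts clusters of mass m lost by merging with any other cluster.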