5 results for Discrete Markov Random Field Modeling
in DRUM (Digital Repository at the University of Maryland)
Abstract:
In the past decade, systems that extract information from millions of Internet documents have become commonplace. Knowledge graphs -- structured knowledge bases that describe entities, their attributes and the relationships between them -- are a powerful tool for understanding and organizing this vast amount of information. However, two significant obstacles to knowledge graph construction are the unreliability of the extracted information, due to noise and ambiguity in the underlying data or errors made by the extraction system, and the complexity of reasoning about the dependencies between these noisy extractions. My dissertation addresses these challenges by exploiting the interdependencies between facts to improve the quality of the knowledge graph in a scalable framework. I introduce a new approach called knowledge graph identification (KGI), which resolves the entities, attributes and relationships in the knowledge graph by incorporating uncertain extractions from multiple sources, entity co-references, and ontological constraints. I define a probability distribution over possible knowledge graphs and infer the most probable knowledge graph using a combination of probabilistic and logical reasoning. Such probabilistic models are frequently dismissed due to scalability concerns, but my implementation of KGI maintains tractable performance on large problems through the use of hinge-loss Markov random fields, which have a convex inference objective. This allows the inference of large knowledge graphs using 4M facts and 20M ground constraints in 2 hours. To further scale the solution, I develop a distributed approach to the KGI problem which runs in parallel across multiple machines, reducing inference time by 90%. Finally, I extend my model to the streaming setting, where a knowledge graph is continuously updated by incorporating newly extracted facts. 
I devise a general approach for approximately updating inference in convex probabilistic models, and quantify the approximation error by defining and bounding inference regret for online models. Together, my work retains the attractive features of probabilistic models while providing the scalability necessary for large-scale knowledge graph construction. These models have been applied to a number of real-world knowledge graph projects, including the NELL project at Carnegie Mellon and the Google Knowledge Graph.
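To illustrate why a convex inference objective keeps this kind of model tractable, here is a minimal sketch of MAP inference in a toy hinge-loss Markov random field. It is not the dissertation's implementation: the two facts, the weights, the squared-hinge form, and the projected gradient solver are all assumptions chosen for illustration. Each potential is a weighted squared hinge over a linear function of truth values in [0, 1], so the energy is convex and simple gradient steps with clipping reach the global minimum.

```python
def map_inference(potentials, n_vars, step=0.1, iters=500):
    """Minimize sum_j w_j * max(0, offset_j + a_j . y)^2 over y in [0, 1]^n.

    Projected gradient descent; a hypothetical stand-in for a real HL-MRF solver.
    """
    y = [0.5] * n_vars
    for _ in range(iters):
        grad = [0.0] * n_vars
        for weight, coeffs, offset in potentials:
            value = offset + sum(c * y[i] for i, c in coeffs.items())
            if value > 0.0:  # the hinge contributes only when the rule is violated
                for i, c in coeffs.items():
                    grad[i] += 2.0 * weight * value * c
        # Gradient step followed by projection back onto the unit box.
        y = [min(1.0, max(0.0, yi - step * gi)) for yi, gi in zip(y, grad)]
    return y


# Hypothetical toy problem: fact A was extracted with confidence 0.9,
# and an ontological rule says A implies B.
A, B = 0, 1
potentials = [
    (2.0, {A: -1.0}, 0.9),          # extraction evidence: penalize 0.9 - y_A > 0
    (0.5, {A: 1.0}, 0.0),           # weak prior that A is false
    (0.5, {B: 1.0}, 0.0),           # weak prior that B is false
    (1.0, {A: 1.0, B: -1.0}, 0.0),  # rule A -> B: penalize y_A - y_B > 0
]
truth = map_inference(potentials, n_vars=2)
```

For these illustrative weights the minimizer is y_A = 10.8/17 (about 0.635) and y_B = 7.2/17 (about 0.424): the extraction pulls A up, and the rule carries part of that belief over to B even though B was never extracted directly.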
Abstract:
A primary goal of this dissertation is to understand the links between mathematical models that describe crystal surfaces at three fundamental length scales: the scale of individual atoms, the scale of collections of atoms forming crystal defects, and the macroscopic scale. Characterizing connections between different classes of models is a critical task for gaining insight into the physics they describe, a long-standing objective in applied analysis, and also highly relevant in engineering applications. The key concept I use in each problem addressed in this thesis is coarse graining, which is a strategy for connecting fine representations or models with coarser representations. Often this idea is invoked to reduce a large discrete system to an appropriate continuum description, e.g. individual particles are represented by a continuous density. While there is no general theory of coarse graining, one closely related mathematical approach is asymptotic analysis, i.e. the description of limiting behavior as some parameter becomes very large or very small. In the case of crystalline solids, it is natural to consider cases where the number of particles is large or where the lattice spacing is small. Limits such as these often make explicit the nature of links between models capturing different scales, and, once established, provide a means of improving either our understanding or the models themselves. Finding appropriate variables whose limits illustrate the important connections between models is no easy task, however. This is one area where computer simulation is extremely helpful, as it allows us to see the results of complex dynamics and gather clues regarding the roles of different physical quantities. On the other hand, connections between models enable the development of novel multiscale computational schemes, so understanding can assist computation and vice versa. Some of these ideas are demonstrated in this thesis. 
The important outcomes of this thesis include: (1) a systematic derivation of the step-flow model of Burton, Cabrera, and Frank, with corrections, from atomistic solid-on-solid-type models in 1+1 dimensions; (2) the inclusion of an atomistically motivated transport mechanism in an island dynamics model allowing for a more detailed account of mound evolution; and (3) the development of a hybrid discrete-continuum scheme for simulating the relaxation of a faceted crystal mound. Central to all of these modeling and simulation efforts is the presence of steps composed of individual layers of atoms on vicinal crystal surfaces. Consequently, a recurring theme in this research is the observation that mesoscale defects play a crucial role in crystal morphological evolution.
Abstract:
In support of the achievement goal theory (AGT), empirical research has demonstrated psychosocial benefits of the mastery-oriented learning climate. In this study, we examined the effects of perceived coaching behaviors on various indicators of psychosocial well-being (competitive anxiety, self-esteem, perceived competence, enjoyment, and future intentions for participation), as mediated by perceptions of the coach-initiated motivational climate, achievement goal orientations and perceptions of sport-specific skills efficacy. Using a pre-post test design, 1,464 boys, ages 10-15 (M = 12.84 years, SD = 1.44), who participated in a series of 12 football skills clinics were surveyed from various locations across the United States. Using structural equation modeling (SEM) path analysis and hierarchical regression analysis, the cumulative direct and indirect effects of the perceived coaching behaviors on the psychosocial variables at post-test were parsed out to determine what types of coaching behaviors are more conducive to the positive psychosocial development of youth athletes. The study demonstrated that how coaching behaviors are perceived impacts the athletes’ perceptions of the motivational climate and achievement goal orientations, as well as self-efficacy beliefs. These effects in turn affect the athletes’ self-esteem, general competence, sport-specific competence, competitive anxiety, enjoyment, and intentions to remain involved in the sport. The findings also clarify how young boys internalize and interpret coaches’ messages through modification of achievement goal orientations and sport-specific efficacy beliefs.
Abstract:
Hydroxyl radical (OH) is the primary oxidant in the troposphere, initiating the removal of numerous atmospheric species including greenhouse gases, pollutants that are detrimental to human health, and ozone-depleting substances. Because of the complexity of OH chemistry, models vary widely in their OH chemistry schemes and resulting methane (CH4) lifetimes. The current state of knowledge concerning global OH abundances is often contradictory. This body of work encompasses three projects that investigate tropospheric OH from a modeling perspective, with the goal of improving the tropospheric community’s knowledge of the atmospheric lifetime of CH4. First, measurements taken during the airborne CONvective TRansport of Active Species in the Tropics (CONTRAST) field campaign are used to evaluate OH in global models. A box model constrained to measured variables is utilized to infer concentrations of OH along the flight track. Results are used to evaluate global model performance, argue against the existence of a proposed “OH Hole” in the tropical Western Pacific, and investigate implications of high O3/low H2O filaments on chemical transport to the stratosphere. While methyl chloroform-based estimates of global mean OH suggest that models are overestimating OH, we report evidence that these models are actually underestimating OH in the tropical Western Pacific. The second project examines OH within global models to diagnose differences in CH4 lifetime. I developed an approach to quantify the roles of OH precursor field differences (O3, H2O, CO, NOx, etc.) using a neural network method. This technique enables us to approximate the change in CH4 lifetime resulting from variations in individual precursor fields. The dominant factors driving CH4 lifetime differences between models are O3, CO, and J(O3-O1D). My third project evaluates the effect of climate change on global fields of OH using an empirical model. 
Observations of H2O and O3 from satellite instruments are combined with a simulation of tropical expansion to derive changes in global mean OH over the past 25 years. We find that increasing H2O and increasing width of the tropics tend to increase global mean OH, countering the increasing CH4 sink and resulting in well-buffered global tropospheric OH concentrations.
Abstract:
This dissertation proposes statistical methods to formulate, estimate and apply complex transportation models. Two main problems are part of the analyses conducted and presented in this dissertation. The first method solves an econometric problem and is concerned with the joint estimation of models that contain both discrete and continuous decision variables. The use of ordered models along with a regression is proposed and their effectiveness is evaluated with respect to unordered models. Procedures to calculate and optimize the log-likelihood functions of both discrete-continuous approaches are derived, and difficulties associated with the estimation of unordered models are explained. Numerical approximation methods based on the Genz algorithm are implemented in order to solve the multidimensional integral associated with the unordered modeling structure. The problems deriving from the lack of smoothness of the probit model around the maximum of the log-likelihood function, which makes the optimization and the calculation of standard deviations very difficult, are carefully analyzed. A methodology to perform out-of-sample validation in the context of a joint model is proposed. Comprehensive numerical experiments have been conducted on both simulated and real data. In particular, the discrete-continuous models are estimated and applied to vehicle ownership and use models on data extracted from the 2009 National Household Travel Survey. The second part of this work offers a comprehensive statistical analysis of free-flow speed distribution; the method is applied to data collected on a sample of roads in Italy. A linear mixed model that includes speed quantiles in its predictors is estimated. Results show that there is no road effect in the analysis of free-flow speeds, which is particularly important for model transferability. 
A very general framework to predict random effects with few observations and incomplete access to model covariates is formulated and applied to predict the distribution of free-flow speed quantiles. The speed distribution of most road sections is successfully predicted; jack-knife estimates are calculated and used to explain why some sections are poorly predicted. Overall, this work contributes to the literature in transportation modeling by proposing econometric model formulations for discrete-continuous variables, more efficient methods for the calculation of multivariate normal probabilities, and random effects models for free-flow speed estimation that take into account the survey design. All methods are rigorously validated on both real and simulated data.
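The multivariate normal probabilities mentioned above are the costly part of estimating unordered probit-type models. As a rough illustration only, the sketch below applies the separation-of-variables transformation that underlies Genz's algorithm to the bivariate case, where a single one-dimensional integral remains. It assumes standard normal margins and uses a deterministic midpoint rule in place of the randomized sampling used in practice; the function name and parameters are hypothetical and are not taken from the dissertation.

```python
from math import asin, pi, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def bivariate_normal_cdf(b1, b2, rho, n_points=2000):
    """Approximate P(X1 <= b1, X2 <= b2) for standard normals with correlation rho.

    Assumes |rho| < 1 and that b1 is not so negative that cdf(b1) underflows to 0.
    """
    # Cholesky factor of [[1, rho], [rho, 1]] is [[1, 0], [rho, sqrt(1 - rho^2)]].
    l21, l22 = rho, sqrt(1.0 - rho * rho)
    e1 = STD_NORMAL.cdf(b1)  # probability mass allowed for the first coordinate
    total = 0.0
    for k in range(n_points):
        w = (k + 0.5) / n_points             # midpoint of the k-th cell of (0, 1)
        y1 = STD_NORMAL.inv_cdf(w * e1)      # first coordinate, conditioned on <= b1
        total += e1 * STD_NORMAL.cdf((b2 - l21 * y1) / l22)
    return total / n_points


# Check against the closed-form orthant probability 1/4 + arcsin(rho) / (2*pi).
estimate = bivariate_normal_cdf(0.0, 0.0, rho=0.5)
exact = 0.25 + asin(0.5) / (2.0 * pi)
```

In higher dimensions the same transformation leaves a (d-1)-dimensional integral over the unit cube, which is why sampling-based integration replaces the midpoint rule there.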