912 resultados para Call Graph
Resumo:
In the past decade, systems that extract information from millions of Internet documents have become commonplace. Knowledge graphs -- structured knowledge bases that describe entities, their attributes and the relationships between them -- are a powerful tool for understanding and organizing this vast amount of information. However, a significant obstacle to knowledge graph construction is the unreliability of the extracted information, due to noise and ambiguity in the underlying data or errors made by the extraction system and the complexity of reasoning about the dependencies between these noisy extractions. My dissertation addresses these challenges by exploiting the interdependencies between facts to improve the quality of the knowledge graph in a scalable framework. I introduce a new approach called knowledge graph identification (KGI), which resolves the entities, attributes and relationships in the knowledge graph by incorporating uncertain extractions from multiple sources, entity co-references, and ontological constraints. I define a probability distribution over possible knowledge graphs and infer the most probable knowledge graph using a combination of probabilistic and logical reasoning. Such probabilistic models are frequently dismissed due to scalability concerns, but my implementation of KGI maintains tractable performance on large problems through the use of hinge-loss Markov random fields, which have a convex inference objective. This allows the inference of large knowledge graphs using 4M facts and 20M ground constraints in 2 hours. To further scale the solution, I develop a distributed approach to the KGI problem which runs in parallel across multiple machines, reducing inference time by 90%. Finally, I extend my model to the streaming setting, where a knowledge graph is continuously updated by incorporating newly extracted facts. I devise a general approach for approximately updating inference in convex probabilistic models, and quantify the approximation error by defining and bounding inference regret for online models. Together, my work retains the attractive features of probabilistic models while providing the scalability necessary for large-scale knowledge graph construction. These models have been applied on a number of real-world knowledge graph projects, including the NELL project at Carnegie Mellon and the Google Knowledge Graph.
Resumo:
This paper presents an investigation of a simple generic hyper-heuristic approach upon a set of widely used constructive heuristics (graph coloring heuristics) in timetabling. Within the hyperheuristic framework, a Tabu Search approach is employed to search for permutations of graph heuristics which are used for constructing timetables in exam and course timetabling problems. This underpins a multi-stage hyper-heuristic where the Tabu Search employs permutations upon a different number of graph heuristics in two stages. We study this graph-based hyper-heuristic approach within the context of exploring fundamental issues concerning the search space of the hyper-heuristic (the heuristic space) and the solution space. Such issues have not been addressed in other hyper-heuristic research. These approaches are tested on both exam and course benchmark timetabling problems and are compared with the fine-tuned bespoke state-of-the-art approaches. The results are within the range of the best results reported in the literature. The approach described here represents a significantly more generally applicable approach than the current state of the art in the literature. Future work will extend this hyper-heuristic framework by employing methodologies which are applicable on a wider range of timetabling and scheduling problems.
Resumo:
This paper is concerned with the hybridization of two graph coloring heuristics (Saturation Degree and Largest Degree), and their application within a hyperheuristic for exam timetabling problems. Hyper-heuristics can be seen as algorithms which intelligently select appropriate algorithms/heuristics for solving a problem. We developed a Tabu Search based hyper-heuristic to search for heuristic lists (of graph heuristics) for solving problems and investigated the heuristic lists found by employing knowledge discovery techniques. Two hybrid approaches (involving Saturation Degree and Largest Degree) including one which employs Case Based Reasoning are presented and discussed. Both the Tabu Search based hyper-heuristic and the hybrid approaches are tested on random and real-world exam timetabling problems. Experimental results are comparable with the best state-of-the-art approaches (as measured against established benchmark problems). The results also demonstrate an increased level of generality in our approach.
Resumo:
Este artículo de investigación propone la aplicación de un modelo de gestión del conocimiento para mejorar el programa de capacitación en el Call Center de la empresa Promociones y Cobranzas Beta S.A., ubicada en la zona centro de Bogotá, Cra. 10 No. 16 -39 piso 13. La investigación se realizó durante el año 2015 utilizando como metodología revisión documental, entrevistas a los coordinadores del área y encuestas con un grupo de asesores del Call Center. La aplicación de dichas herramientas generó información acerca del proceso de capacitación actual. A partir de algunas teorías de gestión del conocimiento se buscó determinar el ciclo de las capacitaciones que mejor se adaptara a la necesidad del Call Center de Promociones y Cobranzas Beta S.A., de tal forma que la comunicación con los asesores de cobranzas fuera más efectiva y el conocimiento mantuviera una actualización permanente. La aplicación del modelo permitirá a la gerencia de la compañía, ampliar el conocimiento y mejorar en el servicio con los clientes atendidos, reflejando cumplimiento de las metas de cobro establecidas en el área. Palabras clave: Gestión del conocimiento, programa de capacitación, Call Center
Resumo:
An earlier Case-based Reasoning (CBR) approach developed by the authors for educational course timetabling problems employed structured cases to represent the complex relationships between courses. Previous solved cases represented by attribute graphs were organized hierarchically into a decision tree. The retrieval searches for graph isomorphism among these attribute graphs. In this paper, the approach is further developed to solve a wider range of problems. We also attempt to retrieve those graphs that have common similar structures but also have some differences. Costs that are assigned to these differences have an input upon the similarity measure. A large number of experiments are performed consisting of different randomly produced timetabling problems and the results presented here strongly indicate that a CBR approach could provide a significant step forward in the development of automated system to solve difficult timetabling problems. They show that using relatively little effort, we can retrieve these structurally similar cases to provide high quality timetables for new timetabling problems.
Resumo:
This paper presents an investigation of a simple generic hyper-heuristic approach upon a set of widely used constructive heuristics (graph coloring heuristics) in timetabling. Within the hyperheuristic framework, a Tabu Search approach is employed to search for permutations of graph heuristics which are used for constructing timetables in exam and course timetabling problems. This underpins a multi-stage hyper-heuristic where the Tabu Search employs permutations upon a different number of graph heuristics in two stages. We study this graph-based hyper-heuristic approach within the context of exploring fundamental issues concerning the search space of the hyper-heuristic (the heuristic space) and the solution space. Such issues have not been addressed in other hyper-heuristic research. These approaches are tested on both exam and course benchmark timetabling problems and are compared with the fine-tuned bespoke state-of-the-art approaches. The results are within the range of the best results reported in the literature. The approach described here represents a significantly more generally applicable approach than the current state of the art in the literature. Future work will extend this hyper-heuristic framework by employing methodologies which are applicable on a wider range of timetabling and scheduling problems.
Resumo:
Edge-labeled graphs have proliferated rapidly over the last decade due to the increased popularity of social networks and the Semantic Web. In social networks, relationships between people are represented by edges and each edge is labeled with a semantic annotation. Hence, a huge single graph can express many different relationships between entities. The Semantic Web represents each single fragment of knowledge as a triple (subject, predicate, object), which is conceptually identical to an edge from subject to object labeled with predicates. A set of triples constitutes an edge-labeled graph on which knowledge inference is performed. Subgraph matching has been extensively used as a query language for patterns in the context of edge-labeled graphs. For example, in social networks, users can specify a subgraph matching query to find all people that have certain neighborhood relationships. Heavily used fragments of the SPARQL query language for the Semantic Web and graph queries of other graph DBMS can also be viewed as subgraph matching over large graphs. Though subgraph matching has been extensively studied as a query paradigm in the Semantic Web and in social networks, a user can get a large number of answers in response to a query. These answers can be shown to the user in accordance with an importance ranking. In this thesis proposal, we present four different scoring models along with scalable algorithms to find the top-k answers via a suite of intelligent pruning techniques. The suggested models consist of a practically important subset of the SPARQL query language augmented with some additional useful features. The first model called Substitution Importance Query (SIQ) identifies the top-k answers whose scores are calculated from matched vertices' properties in each answer in accordance with a user-specified notion of importance. The second model called Vertex Importance Query (VIQ) identifies important vertices in accordance with a user-defined scoring method that builds on top of various subgraphs articulated by the user. Approximate Importance Query (AIQ), our third model, allows partial and inexact matchings and returns top-k of them with a user-specified approximation terms and scoring functions. In the fourth model called Probabilistic Importance Query (PIQ), a query consists of several sub-blocks: one mandatory block that must be mapped and other blocks that can be opportunistically mapped. The probability is calculated from various aspects of answers such as the number of mapped blocks, vertices' properties in each block and so on and the most top-k probable answers are returned. An important distinguishing feature of our work is that we allow the user a huge amount of freedom in specifying: (i) what pattern and approximation he considers important, (ii) how to score answers - irrespective of whether they are vertices or substitution, and (iii) how to combine and aggregate scores generated by multiple patterns and/or multiple substitutions. Because so much power is given to the user, indexing is more challenging than in situations where additional restrictions are imposed on the queries the user can ask. The proposed algorithms for the first model can also be used for answering SPARQL queries with ORDER BY and LIMIT, and the method for the second model also works for SPARQL queries with GROUP BY, ORDER BY and LIMIT. We test our algorithms on multiple real-world graph databases, showing that our algorithms are far more efficient than popular triple stores.
Resumo:
In this thesis, we define the spectrum problem for packings (coverings) of G to be the problem of finding all graphs H such that a maximum G-packing (minimum G- covering) of the complete graph with the leave (excess) graph H exists. The set of achievable leave (excess) graphs in G-packings (G-coverings) of the complete graph is called the spectrum of leave (excess) graphs for G. Then, we consider this problem for trees with up to five edges. We will prove that for any tree T with up to five edges, if the leave graph in a maximum T-packing of the complete graph Kn has i edges, then the spectrum of leave graphs for T is the set of all simple graphs with i edges. In fact, for these T and i and H any simple graph with i edges, we will construct a maximum T-packing of Kn with the leave graph H. We will also show that for any tree T with k ≤ 5 edges, if the excess graph in a minimum T-covering of the complete graph Kn has i edges, then the spectrum of excess graphs for T is the set of all simple graphs and multigraphs with i edges, except for the case that T is a 5-star, for which the graph formed by four multiple edges is not achievable when n = 12.
Resumo:
Call & Response is the newsletter of the South Carolina African American Heritage Commission, whose mission is to identify and promote the preservation of historic sites, structures, buildings, and culture of the African American experience in South Carolina. This is volume X number 2 and includes a message from the chair, list of board members, spotlight on Vitoria A. Smalls and the agenda for the annual conference and awards ceremony.
Resumo:
The Reverend Joseph McKeen (1757-1807) was the first president of Bowdoin College, Brunswick, Maine, USA, (founded 1794). McKeen is famous for his inaugural address in which he calls students to serve the common good. His view of common good is a deeply theological view, coloured by the theological era in which he lived and worked. This study examines the idea of common in the light of McKeen’s college sermons, taking note of the following subjects: Scottish Common Sense Realism; The Nature of True Virtue; The Controversy with Unitarianism; and Science and Mathematics. McKeen’s view of common good is not simply a political view. He is not merely a republican, expressing his views on the future of the republic in a classical political way. He is also, indeed primarily, a pastor and theologian.
Resumo:
This dissertation investigates the connection between spectral analysis and frame theory. When considering the spectral properties of a frame, we present a few novel results relating to the spectral decomposition. We first show that scalable frames have the property that the inner product of the scaling coefficients and the eigenvectors must equal the inverse eigenvalues. From this, we prove a similar result when an approximate scaling is obtained. We then focus on the optimization problems inherent to the scalable frames by first showing that there is an equivalence between scaling a frame and optimization problems with a non-restrictive objective function. Various objective functions are considered, and an analysis of the solution type is presented. For linear objectives, we can encourage sparse scalings, and with barrier objective functions, we force dense solutions. We further consider frames in high dimensions, and derive various solution techniques. From here, we restrict ourselves to various frame classes, to add more specificity to the results. Using frames generated from distributions allows for the placement of probabilistic bounds on scalability. For discrete distributions (Bernoulli and Rademacher), we bound the probability of encountering an ONB, and for continuous symmetric distributions (Uniform and Gaussian), we show that symmetry is retained in the transformed domain. We also prove several hyperplane-separation results. With the theory developed, we discuss graph applications of the scalability framework. We make a connection with graph conditioning, and show the in-feasibility of the problem in the general case. After a modification, we show that any complete graph can be conditioned. We then present a modification of standard PCA (robust PCA) developed by Cand\`es, and give some background into Electron Energy-Loss Spectroscopy (EELS). We design a novel scheme for the processing of EELS through robust PCA and least-squares regression, and test this scheme on biological samples. Finally, we take the idea of robust PCA and apply the technique of kernel PCA to perform robust manifold learning. We derive the problem and present an algorithm for its solution. There is also discussion of the differences with RPCA that make theoretical guarantees difficult.
Resumo:
Graphs are powerful tools to describe social, technological and biological networks, with nodes representing agents (people, websites, gene, etc.) and edges (or links) representing relations (or interactions) between agents. Examples of real-world networks include social networks, the World Wide Web, collaboration networks, protein networks, etc. Researchers often model these networks as random graphs. In this dissertation, we study a recently introduced social network model, named the Multiplicative Attribute Graph model (MAG), which takes into account the randomness of nodal attributes in the process of link formation (i.e., the probability of a link existing between two nodes depends on their attributes). Kim and Lesckovec, who defined the model, have claimed that this model exhibit some of the properties a real world social network is expected to have. Focusing on a homogeneous version of this model, we investigate the existence of zero-one laws for graph properties, e.g., the absence of isolated nodes, graph connectivity and the emergence of triangles. We obtain conditions on the parameters of the model, so that these properties occur with high or vanishingly probability as the number of nodes becomes unboundedly large. In that regime, we also investigate the property of triadic closure and the nodal degree distribution.
Resumo:
The search for patterns or motifs in data represents an area of key interest to many researchers. In this paper we present the Motif Tracking Algorithm, a novel immune inspired pattern identification tool that is able to identify unknown motifs which repeat within time series data. The power of the algorithm is derived from its use of a small number of parameters with minimal assumptions. The algorithm searches from a completely neutral perspective that is independent of the data being analysed and the underlying motifs. In this paper the motif tracking algorithm is applied to the search for patterns within sequences of low level system calls between the Linux kernel and the operating system’s user space. The MTA is able to compress data found in large system call data sets to a limited number of motifs which summarise that data. The motifs provide a resource from which a profile of executed processes can be built. The potential for these profiles and new implications for security research are highlighted. A higher level system call language for measuring similarity between patterns of such calls is also suggested.
Resumo:
In this dissertation I draw a connection between quantum adiabatic optimization, spectral graph theory, heat-diffusion, and sub-stochastic processes through the operators that govern these processes and their associated spectra. In particular, we study Hamiltonians which have recently become known as ``stoquastic'' or, equivalently, the generators of sub-stochastic processes. The operators corresponding to these Hamiltonians are of interest in all of the settings mentioned above. I predominantly explore the connection between the spectral gap of an operator, or the difference between the two lowest energies of that operator, and certain equilibrium behavior. In the context of adiabatic optimization, this corresponds to the likelihood of solving the optimization problem of interest. I will provide an instance of an optimization problem that is easy to solve classically, but leaves open the possibility to being difficult adiabatically. Aside from this concrete example, the work in this dissertation is predominantly mathematical and we focus on bounding the spectral gap. Our primary tool for doing this is spectral graph theory, which provides the most natural approach to this task by simply considering Dirichlet eigenvalues of subgraphs of host graphs. I will derive tight bounds for the gap of one-dimensional, hypercube, and general convex subgraphs. The techniques used will also adapt methods recently used by Andrews and Clutterbuck to prove the long-standing ``Fundamental Gap Conjecture''.