24 results for Graph mining
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
We present a computer-assisted analysis of combinatorial properties of the Cayley graphs of certain finitely generated groups. Given a group with a finite set of generators, we study the density of the corresponding Cayley graph, that is, the least upper bound for the average vertex degree (= number of incident edges) of any finite subgraph. It is known that an m-generated group is amenable if and only if the density of the corresponding Cayley graph equals 2m. We test amenable and non-amenable groups, and also groups for which amenability is unknown. In the latter class we focus on Richard Thompson’s group F.
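To make the notion of density concrete, here is a minimal sketch that is not taken from the paper. It assumes the group Z^2 with its two standard generators (so m = 2), which is amenable, and measures the average vertex degree of square boxes in its Cayley graph. The function name `average_degree_of_box` is hypothetical.

```python
from itertools import product

# Standard generators of Z^2 (an assumption for this example), so m = 2.
GENERATORS = [(1, 0), (0, 1)]

def average_degree_of_box(n):
    """Average vertex degree of the n-by-n box viewed as a finite subgraph
    of the Cayley graph of Z^2."""
    vertices = set(product(range(n), repeat=2))
    edges = 0
    for (x, y) in vertices:
        # Using only the positive generators counts each undirected edge once.
        for (dx, dy) in GENERATORS:
            if (x + dx, y + dy) in vertices:
                edges += 1
    return 2 * edges / len(vertices)

for n in (2, 10, 50):
    print(n, average_degree_of_box(n))
```

For this family the average degree is 4(n - 1)/n, which approaches 2m = 4 as the box grows, in line with the amenability criterion quoted in the abstract.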
Abstract:
Consider a model with parameter phi, and an auxiliary model with parameter theta. Let phi be randomly sampled from a given density over the known parameter space. Monte Carlo methods can be used to draw simulated data and compute the corresponding estimate of theta, say theta_tilde. A large set of tuples (phi, theta_tilde) can be generated in this manner. Nonparametric methods may be used to fit the function E(phi|theta_tilde=a), using these tuples. It is proposed to estimate phi using the fitted E(phi|theta_tilde=theta_hat), where theta_hat is the auxiliary estimate, using the real sample data. This is a consistent and asymptotically normally distributed estimator, under certain assumptions. Monte Carlo results for dynamic panel data and vector autoregressions show that this estimator can have very attractive small sample properties. Confidence intervals can be constructed using the quantiles of the phi for which theta_tilde is close to theta_hat. Such confidence intervals are found to have very accurate coverage.
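The procedure described above can be sketched on a toy problem. Everything specific here is an assumption made for illustration and does not come from the paper: the model is Normal(phi, 1), the auxiliary statistic is the sample mean, phi is drawn from a uniform density on [0, 2], and the nonparametric fit is a Nadaraya-Watson kernel regression with a hand-picked bandwidth.

```python
import math
import random

def auxiliary_estimate(sample):
    """Auxiliary statistic theta: here simply the sample mean (an assumption)."""
    return sum(sample) / len(sample)

def simulate(phi, n, rng):
    """Toy model (an assumption): n i.i.d. draws from Normal(phi, 1)."""
    return [rng.gauss(phi, 1.0) for _ in range(n)]

def build_tuples(n_draws, n_obs, rng):
    """Draw phi from a uniform density on [0, 2]; pair it with theta_tilde."""
    tuples = []
    for _ in range(n_draws):
        phi = rng.uniform(0.0, 2.0)
        theta_tilde = auxiliary_estimate(simulate(phi, n_obs, rng))
        tuples.append((phi, theta_tilde))
    return tuples

def kernel_fit(tuples, a, bandwidth):
    """Nadaraya-Watson estimate of E(phi | theta_tilde = a), Gaussian kernel."""
    weights = [math.exp(-0.5 * ((t - a) / bandwidth) ** 2) for _, t in tuples]
    total = sum(weights)
    return sum(w * p for w, (p, _) in zip(weights, tuples)) / total

rng = random.Random(0)
tuples = build_tuples(n_draws=2000, n_obs=50, rng=rng)

# Stand-in for the real sample: data generated with a "true" phi of 1.0.
real_sample = simulate(1.0, 50, rng)
theta_hat = auxiliary_estimate(real_sample)
phi_hat = kernel_fit(tuples, theta_hat, bandwidth=0.1)
print(theta_hat, phi_hat)
```

Because the fitted value is a weighted average of the sampled phi, it always stays inside the assumed parameter space [0, 2].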
Abstract:
We survey the main theoretical aspects of models for Mobile Ad Hoc Networks (MANETs). We present theoretical characterizations of mobile network structural properties, different dynamic graph models of MANETs, and finally we give detailed summaries of a few selected articles. In particular, we focus on articles dealing with connectivity of mobile networks, and on articles which show that mobility can be used to propagate information between nodes of the network while at the same time maintaining small transmission distances, and thus saving energy.
Abstract:
Graph pebbling is a network model for studying whether or not a given supply of discrete pebbles can satisfy a given demand via pebbling moves. A pebbling move across an edge of a graph takes two pebbles from one endpoint and places one pebble at the other endpoint; the other pebble is lost in transit as a toll. It has been shown that deciding whether a supply can meet a demand on a graph is NP-complete. The pebbling number of a graph is the smallest t such that every supply of t pebbles can satisfy every demand of one pebble. Deciding if the pebbling number is at most k is $\Pi_2^P$-complete. In this paper we develop a tool, called the Weight Function Lemma, for computing upper bounds and sometimes exact values for pebbling numbers with the assistance of linear optimization. With this tool we are able to calculate the pebbling numbers of much larger graphs than in previous algorithms, and much more quickly as well. We also obtain results for many families of graphs, in many cases by hand, with much simpler and remarkably shorter proofs than given in previously existing arguments (certificates typically of size at most the number of vertices times the maximum degree), especially for highly symmetric graphs. Here we apply the Weight Function Lemma to several specific graphs, including the Petersen, Lemke, 4th weak Bruhat, Lemke squared, and two random graphs, as well as to a number of infinite families of graphs, such as trees, cycles, graph powers of cycles, cubes, and some generalized Petersen and Coxeter graphs. This partly answers a question of Pachter et al. by computing the pebbling exponent of cycles to within an asymptotically small range. It is conceivable that this method yields an approximation algorithm for graph pebbling.
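The pebbling move defined above can be illustrated with a short sketch. This is a brute-force search over configurations, not the Weight Function Lemma or the linear optimization approach of the paper. The names `pebbling_move` and `can_reach` are hypothetical, and graphs are assumed to be given as adjacency dictionaries with vertices numbered from 0.

```python
from collections import deque

def pebbling_move(config, u, v):
    """Move across edge (u, v): remove two pebbles from u, add one to v."""
    if config[u] < 2:
        raise ValueError("need at least two pebbles on the source vertex")
    new = list(config)
    new[u] -= 2
    new[v] += 1
    return tuple(new)

def can_reach(adj, config, target):
    """Brute force: can some sequence of moves place a pebble on target?

    Terminates because every move lowers the total number of pebbles by one.
    """
    start = tuple(config)
    seen = {start}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if cur[target] >= 1:
            return True
        for u, nbrs in adj.items():
            if cur[u] >= 2:
                for v in nbrs:
                    nxt = pebbling_move(cur, u, v)
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return False

# Path on three vertices: 0 - 1 - 2
path = {0: [1], 1: [0, 2], 2: [1]}
print(can_reach(path, (4, 0, 0), 2))
print(can_reach(path, (3, 0, 0), 2))
```

On this path, four pebbles on vertex 0 can deliver one pebble to vertex 2, while three cannot, which is why three pebbles are not enough for this graph in the worst case. The search is exponential in general and only meant for very small graphs.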
Abstract:
This project carries out research both on finding predictors via clustering techniques and on reviewing free data mining software. The research is based on a case study in which, in addition to the free KDD software used by the scientific community, a new free tool for pre-processing the data is presented. The predictors are intended for the e-learning domain, since the data from which they are to be inferred are student grades from different e-learning environments. Through our case study, not only are clustering algorithms tested, but additional goals are also proposed.
Abstract:
HEMOLIA (a project under the European Community’s 7th Framework Programme) is a new generation Anti-Money Laundering (AML) intelligent multi-agent alert and investigation system which, in addition to the traditional financial data, makes extensive use of modern society’s huge telecom data source, thereby opening up a new dimension of capabilities to all Money Laundering fighters (FIUs, LEAs) and Financial Institutes (Banks, Insurance Companies, etc.). This Master's thesis project was carried out at AIA, one of the partners of the HEMOLIA project, in Barcelona. The objective of this thesis is to find the clusters in a network drawn from the financial data. An extensive literature survey has been carried out and several standard algorithms related to networks have been studied and implemented. The clustering problem is an NP-hard problem, and several algorithms such as K-Means and hierarchical clustering are used to study problems in sociology, evolution, anthropology, etc. However, these algorithms have certain drawbacks which make them very difficult to implement. The thesis suggests (a) a possible improvement to the K-Means algorithm, (b) a novel approach to the clustering problem using Genetic Algorithms and (c) a new algorithm for finding the cluster of a node using a Genetic Algorithm.
Abstract:
Almost 182 million citizens of the European Union (= 37.5% of the total population) live in approximately 130 border and cross-border regions. These regions contribute significantly to the process of European integration. This importance is documented by the 2007-2013 Structural Funds package, which was presented by the European Commission and recently approved by the European Parliament. Whereas the EU spent some EUR 4,875 million on cross-border, transnational and interregional cooperation under the Interreg initiative for the 2000-2006 period, European territorial cooperation will become one of the three objectives of the Structural Funds and will receive EUR 7.75 billion (EUR 5.57 billion for cross-border cooperation alone) for the 2007-2013 period (European Commission, 2006a, 2006b). In addition, a new set of rules for establishing a "European Grouping of Territorial Cooperation" (EGTC) has been adopted, which will facilitate cross-border, transnational and interregional cooperation in the EU. This paper deals with the structures of institutionalization, decision-making and implementation, and with the policies of the "Greater Region" / "Großregion" (hereafter: GR or Greater Region).
Abstract:
The objective of the PANACEA ICT-2007.2.2 EU project is to build a platform that automates the stages involved in the acquisition, production, updating and maintenance of the large language resources required by, among others, MT systems. The development of a Corpus Acquisition Component (CAC) for extracting monolingual and bilingual data from the web is one of the most innovative building blocks of PANACEA. The CAC, which is the first stage in the PANACEA pipeline for building Language Resources, adopts an efficient and distributed methodology to crawl for web documents with rich textual content in specific languages and predefined domains. The CAC includes modules that can acquire parallel data from sites with in-domain content available in more than one language. In order to extrinsically evaluate the CAC methodology, we have conducted several experiments that used crawled parallel corpora for the identification and extraction of parallel sentences using sentence alignment. The corpora were then successfully used for domain adaptation of Machine Translation Systems.
Abstract:
Recently, several anonymization algorithms have appeared for privacy preservation on graphs. Some of them are based on randomization techniques and on k-anonymity concepts. We can use both of them to obtain an anonymized graph with a given k-anonymity value. In this paper we compare algorithms based on both techniques in order to obtain an anonymized graph with a desired k-anonymity value. We want to analyze the complexity of these methods to generate anonymized graphs and the quality of the resulting graphs.
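As a small illustration of what a k-anonymity value of a graph can mean, the sketch below assumes the degree-based notion, in which a graph is k-anonymous if every vertex shares its degree with at least k - 1 other vertices. The abstract does not state which notion the paper uses, so this choice and the function name `degree_k_anonymity` are assumptions.

```python
from collections import Counter

def degree_k_anonymity(adj):
    """Largest k such that every degree value is shared by at least k vertices
    (degree-based k-anonymity, assumed here for illustration)."""
    degrees = [len(nbrs) for nbrs in adj.values()]
    return min(Counter(degrees).values())

# 4-cycle: every vertex has degree 2.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
# Star: the centre is the only vertex with degree 3.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}

print(degree_k_anonymity(cycle))
print(degree_k_anonymity(star))
```

Under this notion the cycle is 4-anonymous, while the centre of the star is uniquely identifiable by its degree, so the star is only 1-anonymous.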
Abstract:
This article briefly presents the chapters of an interdisciplinary study that aims to understand the context of the ban on iron mining in Goa at the end of 2012 and to provide the information needed to guide and manage future decision-making on mining activity. The first six chapters study the abiotic environment, the biotic environment, material flows, social aspects, economic aspects and, finally, political aspects. The last two chapters, in contrast, assess and manage the environmental impacts of mining: on the one hand through a DPSIR analysis, and on the other by proposing three scenarios that integrate the different variables and encourage participation in decision-making. Extensive research was carried out through data collection, interviews and visits to the study areas of interest in order to understand the mining conflict in Goa.
Abstract:
The main objective of this Master's thesis is to discover more about Girona’s image as a tourism destination from the perspective of different agents, and to study how their promotion or opinions differ. To meet this objective, three components of Girona’s destination image are studied: the attribute-based component, the holistic component, and the affective component. A lot of research has been done on tourism destination image, but much less on the destination of Girona. Some studies have already focused on Girona as a tourist destination, but they used a different type of sample and different methodological steps. This study is new among destination studies in the sense that it is based only on textual online data and follows a methodology based on text mining. Text mining is a methodology that allows people to extract relevant information from texts. After this information is extracted, some multivariate statistical analyses are carried out with the aim of discovering more about Girona’s tourism image.