993 resultados para graph approach
Resumo:
HEMOLIA (a project under European community’s 7th framework programme) is a new generation Anti-Money Laundering (AML) intelligent multi-agent alert and investigation system which in addition to the traditional financial data makes extensive use of modern society’s huge telecom data source, thereby opening up a new dimension of capabilities to all Money Laundering fighters (FIUs, LEAs) and Financial Institutes (Banks, Insurance Companies, etc.). This Master-Thesis project is done at AIA, one of the partners for the HEMOLIA project in Barcelona. The objective of this thesis is to find the clusters in a network drawn by using the financial data. An extensive literature survey has been carried out and several standard algorithms related to networks have been studied and implemented. The clustering problem is a NP-hard problem and several algorithms like K-Means and Hierarchical clustering are being implemented for studying several problems relating to sociology, evolution, anthropology etc. However, these algorithms have certain drawbacks which make them very difficult to implement. The thesis suggests (a) a possible improvement to the K-Means algorithm, (b) a novel approach to the clustering problem using the Genetic Algorithms and (c) a new algorithm for finding the cluster of a node using the Genetic Algorithm.
Resumo:
During the Early Toarcian, major paleoenvironnemental and paleoceanographical changes occurred, leading to an oceanic anoxic event (OAE) and to a perturbation of the carbon isotope cycle. Although the standard biochronology of the Lower Jurassic is essentially based upon ammonites, in recent years biostratigraphy based on calcareous nannofossils and dinoflagellate cysts is increasingly used to date Jurassic rocks. However, the precise dating and correlation of the Early Toarcian OAE, and of the associated delta C-13 anomaly in different settings of the western Tethys, are still partly problematic, and it is still unclear whether these events are synchronous or not. In order to allow more accurate correlations of the organic rich levels recorded in the Lower Toarcian OAE, this account proposes a new biozonation based on a quantitative biochronology approach, the Unitary Associations (UA), applied to calcareous nannofossils. This study represents the first attempt to apply the UA method to Jurassic nannofossils. The study incorporates eighteen sections distributed across western Tethys and ranging from the Pliensbachian to Aalenian, comprising 1220 samples and 72 calcareous nannofossil taxa. The BioGraph [Savary, J., Guex, J., 1999. Discrete biochronological scales and unitary associations: description of the Biograph Computer program. Memoires de Geologie de Lausanne 34, 282 pp] and UA-Graph (Copyright Hammer O., Guex and Savary, 2002) softwares provide a discrete biochronological framework based upon multi-taxa concurrent range zones in the different sections. The optimized dataset generates nine UAs using the co-occurrences of 56 taxa. These UAs are grouped into six Unitary Association Zones (UA-Z), which constitute a robust biostratigraphic synthesis of all the observed or deduced biostratigraphic relationships between the analysed taxa. The UA zonation proposed here is compared to ``classic'' calcareous nannofossil biozonations, which are commonly used for the southern and the northern sides of Tethys. The biostratigraphic resolution of the UA-Zones varies from one nannofossil subzone or part of it to several subzones, and can be related to the pattern of calcareous nannoplankton originations and extinctions during the studied time interval. The Late Pliensbachian - Early Toarcian interval (corresponding to the UA-Z II) represents a major step in the Jurassic nannoplankton radiation. The recognized UA-Zones are also compared to the carbon isotopic negative excursion and TOC maximum in five sections of central Italy, Germany and England, with the aim of providing a more reliable correlation tool for the Early Toarcian OAE, and of the associated isotopic anomaly, between the southern and northern part of western Tethys. The results of this work show that the TOC maximum and delta C-13 negative excursion correspond to the upper part of the UA-Z II (i.e., UA 3) in the sections analysed. This suggests that the Early Toarcian OAE was a synchronous event within the western Tethys. (c) 2006 Elsevier B.V. All rights reserved.
Resumo:
Segmenting ultrasound images is a challenging problemwhere standard unsupervised segmentation methods such asthe well-known Chan-Vese method fail. We propose in thispaper an efficient segmentation method for this class ofimages. Our proposed algorithm is based on asemi-supervised approach (user labels) and the use ofimage patches as data features. We also consider thePearson distance between patches, which has been shown tobe robust w.r.t speckle noise present in ultrasoundimages. Our results on phantom and clinical data show avery high similarity agreement with the ground truthprovided by a medical expert.
Resumo:
La hiérarchie de Wagner constitue à ce jour la plus fine classification des langages ω-réguliers. Par ailleurs, l'approche algébrique de la théorie de langages formels montre que ces ensembles ω-réguliers correspondent précisément aux langages reconnaissables par des ω-semigroupes finis pointés. Ce travail s'inscrit dans ce contexte en fournissant une description complète de la contrepartie algébrique de la hiérarchie de Wagner, et ce par le biais de la théorie descriptive des jeux de Wadge. Plus précisément, nous montrons d'abord que le degré de Wagner d'un langage ω-régulier est effectivement un invariant syntaxique. Nous définissons ensuite une relation de réduction entre ω-semigroupes pointés par le biais d'un jeu infini de type Wadge. La collection de ces structures algébriques ordonnée par cette relation apparaît alors comme étant isomorphe à la hiérarchie de Wagner, soit un quasi bon ordre décidable de largeur 2 et de hauteur ω. Nous exposons par la suite une procédure de décidabilité de cette hiérarchie algébrique : on décrit une représentation graphique des ω-semigroupes finis pointés, puis un algorithme sur ces structures graphiques qui calcule le degré de Wagner de n'importe quel élément. Ainsi le degré de Wagner de tout langage ω-régulier peut être calculé de manière effective directement sur son image syntaxique. Nous montrons ensuite comment construire directement et inductivement une structure de n''importe quel degré. Nous terminons par une description détaillée des invariants algébriques qui caractérisent tous les degrés de cette hiérarchie. Abstract The Wagner hierarchy is known so far to be the most refined topological classification of ω-rational languages. Also, the algebraic study of formal languages shows that these ω-rational sets correspond precisely to the languages recognizable by finite pointed ω-semigroups. Within this framework, we provide a construction of the algebraic counterpart of the Wagner hierarchy. We adopt a hierarchical game approach, by translating the Wadge theory from the ω-rational language to the ω-semigroup context. More precisely, we first show that the Wagner degree is indeed a syntactic invariant. We then define a reduction relation on finite pointed ω-semigroups by means of a Wadge-like infinite two-player game. The collection of these algebraic structures ordered by this reduction is then proven to be isomorphic to the Wagner hierarchy, namely a well-founded and decidable partial ordering of width 2 and height $\omega^\omega$. We also describe a decidability procedure of this hierarchy: we introduce a graph representation of finite pointed ω-semigroups allowing to compute their precise Wagner degrees. The Wagner degree of every ω-rational language can therefore be computed directly on its syntactic image. We then show how to build a finite pointed ω-semigroup of any given Wagner degree. We finally describe the algebraic invariants characterizing every Wagner degree of this hierarchy.
Resumo:
Temporary streams are those water courses that undergo the recurrent cessation of flow or the complete drying of their channel. The structure and composition of biological communities in temporary stream reaches are strongly dependent on the temporal changes of the aquatic habitats determined by the hydrological conditions. Therefore, the structural and functional characteristics of aquatic fauna to assess the ecological quality of a temporary stream reach cannot be used without taking into account the controls imposed by the hydrological regime. This paper develops methods for analysing temporary streams' aquatic regimes, based on the definition of six aquatic states that summarize the transient sets of mesohabitats occurring on a given reach at a particular moment, depending on the hydrological conditions: Hyperrheic, Eurheic, Oligorheic, Arheic, Hyporheic and Edaphic. When the hydrological conditions lead to a change in the aquatic state, the structure and composition of the aquatic community changes according to the new set of available habitats. We used the water discharge records from gauging stations or simulations with rainfall-runoff models to infer the temporal patterns of occurrence of these states in the Aquatic States Frequency Graph we developed. The visual analysis of this graph is complemented by the development of two metrics which describe the permanence of flow and the seasonal predictability of zero flow periods. Finally, a classification of temporary streams in four aquatic regimes in terms of their influence over the development of aquatic life is updated from the existing classifications, with stream aquatic regimes defined as Permanent, Temporary-pools, Temporary-dry and Episodic. While aquatic regimes describe the long-term overall variability of the hydrological conditions of the river section and have been used for many years by hydrologists and ecologists, aquatic states describe the availability of mesohabitats in given periods that determine the presence of different biotic assemblages. This novel concept links hydrological and ecological conditions in a unique way. All these methods were implemented with data from eight temporary streams around the Mediterranean within the MIRAGE project. Their application was a precondition to assessing the ecological quality of these streams.
Resumo:
The use of domain-specific languages (DSLs) has been proposed as an approach to cost-e ectively develop families of software systems in a restricted application domain. Domain-specific languages in combination with the accumulated knowledge and experience of previous implementations, can in turn be used to generate new applications with unique sets of requirements. For this reason, DSLs are considered to be an important approach for software reuse. However, the toolset supporting a particular domain-specific language is also domain-specific and is per definition not reusable. Therefore, creating and maintaining a DSL requires additional resources that could be even larger than the savings associated with using them. As a solution, di erent tool frameworks have been proposed to simplify and reduce the cost of developments of DSLs. Developers of tool support for DSLs need to instantiate, customize or configure the framework for a particular DSL. There are di erent approaches for this. An approach is to use an application programming interface (API) and to extend the basic framework using an imperative programming language. An example of a tools which is based on this approach is Eclipse GEF. Another approach is to configure the framework using declarative languages that are independent of the underlying framework implementation. We believe this second approach can bring important benefits as this brings focus to specifying what should the tool be like instead of writing a program specifying how the tool achieves this functionality. In this thesis we explore this second approach. We use graph transformation as the basic approach to customize a domain-specific modeling (DSM) tool framework. The contributions of this thesis includes a comparison of di erent approaches for defining, representing and interchanging software modeling languages and models and a tool architecture for an open domain-specific modeling framework that e ciently integrates several model transformation components and visual editors. We also present several specific algorithms and tool components for DSM framework. These include an approach for graph query based on region operators and the star operator and an approach for reconciling models and diagrams after executing model transformation programs. We exemplify our approach with two case studies MICAS and EFCO. In these studies we show how our experimental modeling tool framework has been used to define tool environments for domain-specific languages.
Resumo:
Complex networks can arise naturally and spontaneously from all things that act as a part of a larger system. From the patterns of socialization between people to the way biological systems organize themselves, complex networks are ubiquitous, but are currently poorly understood. A number of algorithms, designed by humans, have been proposed to describe the organizational behaviour of real-world networks. Consequently, breakthroughs in genetics, medicine, epidemiology, neuroscience, telecommunications and the social sciences have recently resulted. The algorithms, called graph models, represent significant human effort. Deriving accurate graph models is non-trivial, time-intensive, challenging and may only yield useful results for very specific phenomena. An automated approach can greatly reduce the human effort required and if effective, provide a valuable tool for understanding the large decentralized systems of interrelated things around us. To the best of the author's knowledge this thesis proposes the first method for the automatic inference of graph models for complex networks with varied properties, with and without community structure. Furthermore, to the best of the author's knowledge it is the first application of genetic programming for the automatic inference of graph models. The system and methodology was tested against benchmark data, and was shown to be capable of reproducing close approximations to well-known algorithms designed by humans. Furthermore, when used to infer a model for real biological data the resulting model was more representative than models currently used in the literature.
Resumo:
A complex network is an abstract representation of an intricate system of interrelated elements where the patterns of connection hold significant meaning. One particular complex network is a social network whereby the vertices represent people and edges denote their daily interactions. Understanding social network dynamics can be vital to the mitigation of disease spread as these networks model the interactions, and thus avenues of spread, between individuals. To better understand complex networks, algorithms which generate graphs exhibiting observed properties of real-world networks, known as graph models, are often constructed. While various efforts to aid with the construction of graph models have been proposed using statistical and probabilistic methods, genetic programming (GP) has only recently been considered. However, determining that a graph model of a complex network accurately describes the target network(s) is not a trivial task as the graph models are often stochastic in nature and the notion of similarity is dependent upon the expected behavior of the network. This thesis examines a number of well-known network properties to determine which measures best allowed networks generated by different graph models, and thus the models themselves, to be distinguished. A proposed meta-analysis procedure was used to demonstrate how these network measures interact when used together as classifiers to determine network, and thus model, (dis)similarity. The analytical results form the basis of the fitness evaluation for a GP system used to automatically construct graph models for complex networks. The GP-based automatic inference system was used to reproduce existing, well-known graph models as well as a real-world network. Results indicated that the automatically inferred models exemplified functional similarity when compared to their respective target networks. This approach also showed promise when used to infer a model for a mammalian brain network.
Object-Oriented Genetic Programming for the Automatic Inference of Graph Models for Complex Networks
Resumo:
Complex networks are systems of entities that are interconnected through meaningful relationships. The result of the relations between entities forms a structure that has a statistical complexity that is not formed by random chance. In the study of complex networks, many graph models have been proposed to model the behaviours observed. However, constructing graph models manually is tedious and problematic. Many of the models proposed in the literature have been cited as having inaccuracies with respect to the complex networks they represent. However, recently, an approach that automates the inference of graph models was proposed by Bailey [10] The proposed methodology employs genetic programming (GP) to produce graph models that approximate various properties of an exemplary graph of a targeted complex network. However, there is a great deal already known about complex networks, in general, and often specific knowledge is held about the network being modelled. The knowledge, albeit incomplete, is important in constructing a graph model. However it is difficult to incorporate such knowledge using existing GP techniques. Thus, this thesis proposes a novel GP system which can incorporate incomplete expert knowledge that assists in the evolution of a graph model. Inspired by existing graph models, an abstract graph model was developed to serve as an embryo for inferring graph models of some complex networks. The GP system and abstract model were used to reproduce well-known graph models. The results indicated that the system was able to evolve models that produced networks that had structural similarities to the networks generated by the respective target models.
Resumo:
Embedded systems are usually designed for a single or a specified set of tasks. This specificity means the system design as well as its hardware/software development can be highly optimized. Embedded software must meet the requirements such as high reliability operation on resource-constrained platforms, real time constraints and rapid development. This necessitates the adoption of static machine codes analysis tools running on a host machine for the validation and optimization of embedded system codes, which can help meet all of these goals. This could significantly augment the software quality and is still a challenging field.Embedded systems are usually designed for a single or a specified set of tasks. This specificity means the system design as well as its hardware/software development can be highly optimized. Embedded software must meet the requirements such as high reliability operation on resource-constrained platforms, real time constraints and rapid development. This necessitates the adoption of static machine codes analysis tools running on a host machine for the validation and optimization of embedded system codes, which can help meet all of these goals. This could significantly augment the software quality and is still a challenging field.Embedded systems are usually designed for a single or a specified set of tasks. This specificity means the system design as well as its hardware/software development can be highly optimized. Embedded software must meet the requirements such as high reliability operation on resource-constrained platforms, real time constraints and rapid development. This necessitates the adoption of static machine codes analysis tools running on a host machine for the validation and optimization of embedded system codes, which can help meet all of these goals. This could significantly augment the software quality and is still a challenging field.Embedded systems are usually designed for a single or a specified set of tasks. This specificity means the system design as well as its hardware/software development can be highly optimized. Embedded software must meet the requirements such as high reliability operation on resource-constrained platforms, real time constraints and rapid development. This necessitates the adoption of static machine codes analysis tools running on a host machine for the validation and optimization of embedded system codes, which can help meet all of these goals. This could significantly augment the software quality and is still a challenging field.This dissertation contributes to an architecture oriented code validation, error localization and optimization technique assisting the embedded system designer in software debugging, to make it more effective at early detection of software bugs that are otherwise hard to detect, using the static analysis of machine codes. The focus of this work is to develop methods that automatically localize faults as well as optimize the code and thus improve the debugging process as well as quality of the code.Validation is done with the help of rules of inferences formulated for the target processor. The rules govern the occurrence of illegitimate/out of place instructions and code sequences for executing the computational and integrated peripheral functions. The stipulated rules are encoded in propositional logic formulae and their compliance is tested individually in all possible execution paths of the application programs. An incorrect sequence of machine code pattern is identified using slicing techniques on the control flow graph generated from the machine code.An algorithm to assist the compiler to eliminate the redundant bank switching codes and decide on optimum data allocation to banked memory resulting in minimum number of bank switching codes in embedded system software is proposed. A relation matrix and a state transition diagram formed for the active memory bank state transition corresponding to each bank selection instruction is used for the detection of redundant codes. Instances of code redundancy based on the stipulated rules for the target processor are identified.This validation and optimization tool can be integrated to the system development environment. It is a novel approach independent of compiler/assembler, applicable to a wide range of processors once appropriate rules are formulated. Program states are identified mainly with machine code pattern, which drastically reduces the state space creation contributing to an improved state-of-the-art model checking. Though the technique described is general, the implementation is architecture oriented, and hence the feasibility study is conducted on PIC16F87X microcontrollers. The proposed tool will be very useful in steering novices towards correct use of difficult microcontroller features in developing embedded systems.
Resumo:
This paper describes a new statistical, model-based approach to building a contact state observer. The observer uses measurements of the contact force and position, and prior information about the task encoded in a graph, to determine the current location of the robot in the task configuration space. Each node represents what the measurements will look like in a small region of configuration space by storing a predictive, statistical, measurement model. This approach assumes that the measurements are statistically block independent conditioned on knowledge of the model, which is a fairly good model of the actual process. Arcs in the graph represent possible transitions between models. Beam Viterbi search is used to match measurement history against possible paths through the model graph in order to estimate the most likely path for the robot. The resulting approach provides a new decision process that can be use as an observer for event driven manipulation programming. The decision procedure is significantly more robust than simple threshold decisions because the measurement history is used to make decisions. The approach can be used to enhance the capabilities of autonomous assembly machines and in quality control applications.
Resumo:
Biological systems exhibit rich and complex behavior through the orchestrated interplay of a large array of components. It is hypothesized that separable subsystems with some degree of functional autonomy exist; deciphering their independent behavior and functionality would greatly facilitate understanding the system as a whole. Discovering and analyzing such subsystems are hence pivotal problems in the quest to gain a quantitative understanding of complex biological systems. In this work, using approaches from machine learning, physics and graph theory, methods for the identification and analysis of such subsystems were developed. A novel methodology, based on a recent machine learning algorithm known as non-negative matrix factorization (NMF), was developed to discover such subsystems in a set of large-scale gene expression data. This set of subsystems was then used to predict functional relationships between genes, and this approach was shown to score significantly higher than conventional methods when benchmarking them against existing databases. Moreover, a mathematical treatment was developed to treat simple network subsystems based only on their topology (independent of particular parameter values). Application to a problem of experimental interest demonstrated the need for extentions to the conventional model to fully explain the experimental data. Finally, the notion of a subsystem was evaluated from a topological perspective. A number of different protein networks were examined to analyze their topological properties with respect to separability, seeking to find separable subsystems. These networks were shown to exhibit separability in a nonintuitive fashion, while the separable subsystems were of strong biological significance. It was demonstrated that the separability property found was not due to incomplete or biased data, but is likely to reflect biological structure.
Resumo:
Goal modelling is a well known rigorous method for analysing problem rationale and developing requirements. Under the pressures typical of time-constrained projects its benefits are not accessible. This is because of the effort and time needed to create the graph and because reading the results can be difficult owing to the effects of crosscutting concerns. Here we introduce an adaptation of KAOS to meet the needs of rapid turn around and clarity. The main aim is to help the stakeholders gain an insight into the larger issues that might be overlooked if they make a premature start into implementation. The method emphasises the use of obstacles, accepts under-refined goals and has new methods for managing crosscutting concerns and strategic decision making. It is expected to be of value to agile as well as traditional processes.
Resumo:
Aspect-oriented programming (AOP) is a promising technology that supports separation of crosscutting concerns (i.e., functionality that tends to be tangled with, and scattered through the rest of the system). In AOP, a method-like construct named advice is applied to join points in the system through a special construct named pointcut. This mechanism supports the modularization of crosscutting behavior; however, since the added interactions are not explicit in the source code, it is hard to ensure their correctness. To tackle this problem, this paper presents a rigorous coverage analysis approach to ensure exercising the logic of each advice - statements, branches, and def-use pairs - at each affected join point. To make this analysis possible, a structural model based on Java bytecode - called PointCut-based Del-Use Graph (PCDU) - is proposed, along with three integration testing criteria. Theoretical, empirical, and exploratory studies involving 12 aspect-oriented programs and several fault examples present evidence of the feasibility and effectiveness of the proposed approach. (C) 2010 Elsevier Inc. All rights reserved.
Resumo:
Automatic summarization of texts is now crucial for several information retrieval tasks owing to the huge amount of information available in digital media, which has increased the demand for simple, language-independent extractive summarization strategies. In this paper, we employ concepts and metrics of complex networks to select sentences for an extractive summary. The graph or network representing one piece of text consists of nodes corresponding to sentences, while edges connect sentences that share common meaningful nouns. Because various metrics could be used, we developed a set of 14 summarizers, generically referred to as CN-Summ, employing network concepts such as node degree, length of shortest paths, d-rings and k-cores. An additional summarizer was created which selects the highest ranked sentences in the 14 systems, as in a voting system. When applied to a corpus of Brazilian Portuguese texts, some CN-Summ versions performed better than summarizers that do not employ deep linguistic knowledge, with results comparable to state-of-the-art summarizers based on expensive linguistic resources. The use of complex networks to represent texts appears therefore as suitable for automatic summarization, consistent with the belief that the metrics of such networks may capture important text features. (c) 2008 Elsevier Inc. All rights reserved.