842 results for Graph Based Algorithms
Abstract:
HEMOLIA (a project under the European Community's 7th Framework Programme) is a new-generation Anti-Money Laundering (AML) intelligent multi-agent alert and investigation system which, in addition to traditional financial data, makes extensive use of modern society's huge telecom data source, thereby opening up a new dimension of capabilities to all money-laundering fighters (FIUs, LEAs) and financial institutions (banks, insurance companies, etc.). This Master's thesis project was carried out at AIA, one of the partners of the HEMOLIA project, in Barcelona. The objective of this thesis is to find the clusters in a network built from the financial data. An extensive literature survey has been carried out, and several standard network algorithms have been studied and implemented. Clustering is an NP-hard problem, and several algorithms such as K-Means and hierarchical clustering are used to study problems in sociology, evolution, anthropology, etc. However, these algorithms have certain drawbacks which make them very difficult to implement. The thesis suggests (a) a possible improvement to the K-Means algorithm, (b) a novel approach to the clustering problem using Genetic Algorithms and (c) a new algorithm for finding the cluster of a node using a Genetic Algorithm.
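For orientation, the baseline K-Means procedure that the thesis sets out to improve can be sketched as below. This is a minimal illustration of the standard Lloyd iteration on plain coordinate data, not the thesis's improved variant or its genetic-algorithm approach. The function names and the farthest-first initialisation are assumptions made here so that the example is deterministic.

```python
import numpy as np

def farthest_first_init(points, k):
    """Pick k starting centres: the first point, then repeatedly the point
    farthest from every centre chosen so far (a deterministic heuristic)."""
    centres = [points[0]]
    for _ in range(1, k):
        dists = np.min(
            [np.linalg.norm(points - c, axis=1) for c in centres], axis=0
        )
        centres.append(points[np.argmax(dists)])
    return np.array(centres, dtype=float)

def kmeans(points, k, max_iter=100):
    """Standard Lloyd iteration: assign points to the nearest centre,
    then move each centre to the mean of its members."""
    points = np.asarray(points, dtype=float)
    centres = farthest_first_init(points, k)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(max_iter):
        # Assignment step: index of the nearest centre for every point.
        dists = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # Update step: empty clusters keep their previous centre.
        new_centres = centres.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                new_centres[j] = members.mean(axis=0)
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return labels, centres

if __name__ == "__main__":
    pts = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
    labels, centres = kmeans(pts, 2)
    print(labels, centres)
```

The result depends on the starting centres and on the choice of k, which is one of the commonly cited drawbacks of K-Means.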
Abstract:
BACKGROUND: Tests for recent infections (TRIs) are important for HIV surveillance. We have shown that a patient's antibody pattern in a confirmatory line immunoassay (Inno-Lia) also yields information on time since infection. We have published algorithms which, with a certain sensitivity and specificity, distinguish between incident (≤ 12 months) and older infection. In order to use these algorithms like other TRIs, i.e., based on their windows, we now determined their window periods. METHODS: We classified Inno-Lia results of 527 treatment-naïve patients with HIV-1 infection ≤ 12 months according to incidence by 25 algorithms. The time after which all infections were ruled older, i.e. the algorithm's window, was determined by linear regression of the proportion ruled incident as a function of time since infection. Window-based incident infection rates (IIR) were determined utilizing the relationship 'Prevalence = Incidence × Duration' in four annual cohorts of HIV-1 notifications. Results were compared to performance-based IIR also derived from Inno-Lia results, but utilizing the relationship 'incident = true incident + false incident', and also to the IIR derived from the BED incidence assay. RESULTS: Window periods varied between 45.8 and 130.1 days and correlated well with the algorithms' diagnostic sensitivity (R² = 0.962; P<0.0001). Among the 25 algorithms, the mean window-based IIR among the 748 notifications of 2005/06 was 0.457 compared to 0.453 obtained for performance-based IIR with a model not correcting for selection bias. Evaluation of BED results using a window of 153 days yielded an IIR of 0.669. Window-based IIR and performance-based IIR increased by 22.4% and 30.6%, respectively, in 2008, while 2009 and 2010 showed a return to baseline for both methods.
CONCLUSIONS: IIR estimations by window- and performance-based evaluations of Inno-Lia algorithm results were similar and can be used together to assess IIR changes between annual HIV notification cohorts.
Abstract:
Segmenting ultrasound images is a challenging problem where standard unsupervised segmentation methods such as the well-known Chan-Vese method fail. We propose in this paper an efficient segmentation method for this class of images. Our proposed algorithm is based on a semi-supervised approach (user labels) and the use of image patches as data features. We also consider the Pearson distance between patches, which has been shown to be robust w.r.t. speckle noise present in ultrasound images. Our results on phantom and clinical data show a very high similarity agreement with the ground truth provided by a medical expert.
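As an illustration of the patch feature mentioned above, a Pearson distance can be computed as one minus the Pearson correlation coefficient of the two patches. The sketch below assumes that definition (the paper may use a variant), and the function name and the handling of constant patches are choices made here for the example.

```python
import numpy as np

def pearson_distance(patch_a, patch_b):
    """Return 1 - r, where r is the Pearson correlation of the two patches.

    The value is 0 for patches related by a positive affine intensity change
    and 2 for perfectly anti-correlated patches.
    """
    a = np.asarray(patch_a, dtype=float).ravel()
    b = np.asarray(patch_b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        # Constant patch: correlation is undefined; treated as uncorrelated
        # here (an assumption of this sketch).
        return 1.0
    return 1.0 - float(np.dot(a, b) / denom)

if __name__ == "__main__":
    patch = np.array([[1.0, 2.0], [3.0, 5.0]])
    print(pearson_distance(patch, 3.0 * patch + 7.0))  # close to 0
    print(pearson_distance(patch, -patch))             # close to 2
```

Because the distance ignores gain and offset, two patches with the same texture but different brightness are treated as similar, which is one plausible reason for its robustness to multiplicative noise.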
Abstract:
The state of the art to describe image quality in medical imaging is to assess the performance of an observer conducting a task of clinical interest. This can be done by using a model observer leading to a figure of merit such as the signal-to-noise ratio (SNR). Using the non-prewhitening (NPW) model observer, we objectively characterised the evolution of its figure of merit in various acquisition conditions. The NPW model observer usually requires the use of the modulation transfer function (MTF) as well as noise power spectra. However, although the computation of the MTF poses no problem when dealing with the traditional filtered back-projection (FBP) algorithm, this is not the case when using iterative reconstruction (IR) algorithms, such as adaptive statistical iterative reconstruction (ASIR) or model-based iterative reconstruction (MBIR). Given that the target transfer function (TTF) had already been shown to express the system resolution accurately even with non-linear algorithms, we decided to tune the NPW model observer, replacing the standard MTF by the TTF. The TTF was estimated using a custom-made phantom containing cylindrical inserts surrounded by water. The contrast differences between the inserts and water were plotted for each acquisition condition. Then, mathematical transformations were performed leading to the TTF. As expected, the first results showed that the TTF depends on the image contrast and noise levels for both ASIR and MBIR. Moreover, FBP also proved to be dependent on the contrast and noise when using the lung kernel. Those results were then introduced in the NPW model observer. We observed an enhancement of SNR every time we switched from FBP to ASIR to MBIR. IR algorithms greatly improve image quality, especially in low-dose conditions. Based on our results, the use of MBIR could lead to further dose reduction in several clinical applications.
Abstract:
The paper presents some contemporary approaches to spatial environmental data analysis. The main topics are concentrated on the decision-oriented problems of environmental spatial data mining and modeling: valorization and representativity of data with the help of exploratory data analysis, spatial predictions, probabilistic and risk mapping, development and application of conditional stochastic simulation models. The innovative part of the paper presents an integrated/hybrid model, machine learning (ML) residuals sequential simulations (MLRSS). The models are based on multilayer perceptron and support vector regression ML algorithms used for modeling long-range spatial trends and sequential simulations of the residuals. ML algorithms deliver non-linear solutions for the spatial non-stationary problems, which are difficult for the geostatistical approach. Geostatistical tools (variography) are used to characterize the performance of ML algorithms, by analyzing the quality and quantity of the spatially structured information extracted from data with ML algorithms. Sequential simulations provide efficient assessment of uncertainty and spatial variability. A case study from the Chernobyl fallout illustrates the performance of the proposed model. It is shown that probability mapping, provided by the combination of ML data-driven and geostatistical model-based approaches, can be efficiently used in the decision-making process. (C) 2003 Elsevier Ltd. All rights reserved.
Abstract:
In this research work we searched for open source libraries which support graph drawing and visualisation and can run in a browser. Subsequently, these libraries were evaluated to find out which one is best suited to this task. The result was that d3.js is the library with the greatest functionality, flexibility and customisability. Afterwards we developed an open source software tool which includes d3.js and is written in JavaScript so that it can run in the browser.
Abstract:
Social interactions are a very important component in people's lives. Social network analysis has become a common technique used to model and quantify the properties of social interactions. In this paper, we propose an integrated framework to explore the characteristics of a social network extracted from multimodal dyadic interactions. For our study, we used a set of videos belonging to the New York Times' Blogging Heads opinion blog. The social network is represented as an oriented graph, whose directed links are determined by the Influence Model. The links' weights are a measure of the "influence" a person has over the other. The states of the Influence Model encode automatically extracted audio/visual features from our videos using state-of-the-art algorithms. Our results are reported in terms of accuracy of audio/visual data fusion for speaker segmentation and centrality measures used to characterize the extracted social network.
Abstract:
The use of domain-specific languages (DSLs) has been proposed as an approach to cost-effectively develop families of software systems in a restricted application domain. Domain-specific languages, in combination with the accumulated knowledge and experience of previous implementations, can in turn be used to generate new applications with unique sets of requirements. For this reason, DSLs are considered to be an important approach for software reuse. However, the toolset supporting a particular domain-specific language is also domain-specific and is by definition not reusable. Therefore, creating and maintaining a DSL requires additional resources that could be even larger than the savings associated with using it. As a solution, different tool frameworks have been proposed to simplify and reduce the cost of developing DSLs. Developers of tool support for DSLs need to instantiate, customize or configure the framework for a particular DSL. There are different approaches for this. One approach is to use an application programming interface (API) and to extend the basic framework using an imperative programming language. An example of a tool based on this approach is Eclipse GEF. Another approach is to configure the framework using declarative languages that are independent of the underlying framework implementation. We believe this second approach can bring important benefits, as it brings focus to specifying what the tool should be like instead of writing a program specifying how the tool achieves this functionality. In this thesis we explore this second approach. We use graph transformation as the basic approach to customize a domain-specific modeling (DSM) tool framework.
The contributions of this thesis include a comparison of different approaches for defining, representing and interchanging software modeling languages and models, and a tool architecture for an open domain-specific modeling framework that efficiently integrates several model transformation components and visual editors. We also present several specific algorithms and tool components for the DSM framework. These include an approach for graph query based on region operators and the star operator, and an approach for reconciling models and diagrams after executing model transformation programs. We exemplify our approach with two case studies, MICAS and EFCO. In these studies we show how our experimental modeling tool framework has been used to define tool environments for domain-specific languages.
Abstract:
Identification of low-dimensional structures and main sources of variation from multivariate data are fundamental tasks in data analysis. Many methods aimed at these tasks involve solution of an optimization problem. Thus, the objective of this thesis is to develop computationally efficient and theoretically justified methods for solving such problems. Most of the thesis is based on a statistical model, where ridges of the density estimated from the data are considered as relevant features. Finding ridges, which are generalized maxima, necessitates development of advanced optimization methods. An efficient and convergent trust region Newton method for projecting a point onto a ridge of the underlying density is developed for this purpose. The method is utilized in a differential equation-based approach for tracing ridges and computing projection coordinates along them. The density estimation is done nonparametrically by using Gaussian kernels. This allows application of ridge-based methods with only mild assumptions on the underlying structure of the data. The statistical model and the ridge finding methods are adapted to two different applications. The first one is extraction of curvilinear structures from noisy data mixed with background clutter. The second one is a novel nonlinear generalization of principal component analysis (PCA) and its extension to time series data. The methods have a wide range of potential applications, where most of the earlier approaches are inadequate. Examples include identification of faults from seismic data and identification of filaments from cosmological data. Applicability of the nonlinear PCA to climate analysis and reconstruction of periodic patterns from noisy time series data are also demonstrated. Other contributions of the thesis include development of an efficient semidefinite optimization method for embedding graphs into the Euclidean space.
The method produces structure-preserving embeddings that maximize interpoint distances. It is primarily developed for dimensionality reduction, but also has potential applications in graph theory and various areas of physics, chemistry and engineering. Asymptotic behaviour of ridges and maxima of Gaussian kernel densities is also investigated when the kernel bandwidth approaches infinity. The results are applied to the nonlinear PCA and to finding significant maxima of such densities, which is a typical problem in visual object tracking.
Abstract:
The hyper-star interconnection network was proposed in 2002 to overcome the drawbacks of the hypercube and its variations concerning the network cost, which is defined as the product of the degree and the diameter. Some properties of the graph, such as connectivity, symmetry properties and embedding properties, have been studied by other researchers; routing and broadcasting algorithms have also been designed. This thesis studies the hyper-star graph from both the topological and the algorithmic points of view. For the topological properties, we try to establish relationships between hyper-star graphs and other known graphs. We also give a formal equation for the surface area of the graph. Another topological property we are interested in is the Hamiltonicity problem of this graph. For the algorithms, we design an all-port broadcasting algorithm and a single-port neighbourhood broadcasting algorithm for the regular form of the hyper-star graphs. These algorithms are both optimal time-wise. Furthermore, we prove that the folded hyper-star, a variation of the hyper-star, is maximally fault-tolerant.
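The network cost defined above (degree times diameter) can be checked by brute force on small instances. The sketch below builds a hypercube and a hyper-star graph and measures both quantities with breadth-first search. The hyper-star construction follows the commonly cited definition, which is an assumption here since the abstract does not restate it: nodes are n-bit strings with exactly k ones, and two nodes are adjacent when one is obtained from the other by swapping the first bit with a bit of different value. All function names are illustrative.

```python
from collections import deque
from itertools import combinations

def hypercube(n):
    """Adjacency of the n-dimensional hypercube: nodes are n-bit integers,
    neighbours differ in exactly one bit."""
    return {v: [v ^ (1 << i) for i in range(n)] for v in range(2 ** n)}

def hyper_star(n, k):
    """Adjacency of HS(n, k) under the definition assumed in the lead-in."""
    nodes = []
    for ones in combinations(range(n), k):
        bits = [0] * n
        for i in ones:
            bits[i] = 1
        nodes.append(tuple(bits))
    adj = {}
    for v in nodes:
        nbrs = []
        for i in range(1, n):
            if v[i] != v[0]:
                # Swap the first bit with the differing bit at position i.
                w = list(v)
                w[0], w[i] = w[i], w[0]
                nbrs.append(tuple(w))
        adj[v] = nbrs
    return adj

def eccentricity(adj, source):
    """Largest BFS distance from source (assumes a connected graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())

def network_cost(adj):
    """Network cost = maximum degree x diameter."""
    degree = max(len(nbrs) for nbrs in adj.values())
    diameter = max(eccentricity(adj, v) for v in adj)
    return degree * diameter

if __name__ == "__main__":
    print("hypercube Q3:", network_cost(hypercube(3)))
    print("hyper-star HS(4,2):", network_cost(hyper_star(4, 2)))
```

The brute-force diameter computation runs one BFS per node, so it is only meant for small graphs used to sanity-check closed-form expressions.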
Abstract:
A complex network is an abstract representation of an intricate system of interrelated elements where the patterns of connection hold significant meaning. One particular complex network is a social network whereby the vertices represent people and edges denote their daily interactions. Understanding social network dynamics can be vital to the mitigation of disease spread as these networks model the interactions, and thus avenues of spread, between individuals. To better understand complex networks, algorithms which generate graphs exhibiting observed properties of real-world networks, known as graph models, are often constructed. While various efforts to aid with the construction of graph models have been proposed using statistical and probabilistic methods, genetic programming (GP) has only recently been considered. However, determining that a graph model of a complex network accurately describes the target network(s) is not a trivial task as the graph models are often stochastic in nature and the notion of similarity is dependent upon the expected behavior of the network. This thesis examines a number of well-known network properties to determine which measures best allowed networks generated by different graph models, and thus the models themselves, to be distinguished. A proposed meta-analysis procedure was used to demonstrate how these network measures interact when used together as classifiers to determine network, and thus model, (dis)similarity. The analytical results form the basis of the fitness evaluation for a GP system used to automatically construct graph models for complex networks. The GP-based automatic inference system was used to reproduce existing, well-known graph models as well as a real-world network. Results indicated that the automatically inferred models exemplified functional similarity when compared to their respective target networks. This approach also showed promise when used to infer a model for a mammalian brain network.
Abstract:
The performance of a model-based diagnosis system can be affected by several sources of uncertainty, such as model errors, uncertainty in measurements, and disturbances. This uncertainty can be handled by means of interval models. The aim of this thesis is to propose a methodology for fault detection, isolation and identification based on interval models. The methodology includes algorithms that automatically obtain the symbolic expression of the residual generators, enhancing the structural isolability of the faults, in order to design the fault detection tests. These algorithms are based on the structural model of the system. The stages of fault detection, isolation, and identification are stated as constraint satisfaction problems in continuous domains and solved by means of interval-based consistency techniques. The qualitative fault isolation is enhanced by a reasoning in which the signs of the symptoms are derived from analytical redundancy relations or bond graph models of the system. An initial, empirical analysis of the differences between interval-based and statistical-based techniques is presented in this thesis. The performance and efficiency of the contributions are illustrated through several application examples, covering different levels of complexity.
Abstract:
Frequency recognition is an important task in many engineering fields such as audio signal processing and telecommunications engineering, for example in applications like Dual-Tone Multi-Frequency (DTMF) detection or the recognition of the carrier frequency of a Global Positioning System (GPS) signal. This paper presents the results of investigations on several common Fourier Transform-based frequency recognition algorithms implemented in real time on a Texas Instruments (TI) TMS320C6713 Digital Signal Processor (DSP) core. In addition, suitable metrics are evaluated in order to ascertain which of these selected algorithms is appropriate for audio signal processing.
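To make the DTMF use case concrete, the following offline sketch detects a key by comparing FFT magnitudes at the standard DTMF row and column frequencies. It is an illustration in NumPy, not the real-time DSP implementation the paper evaluates. The 8 kHz sampling rate, the 0.1 s tone length and the function names are assumptions chosen for the example.

```python
import numpy as np

# Standard DTMF frequency grid (Hz) and keypad layout.
ROW_FREQS = [697, 770, 852, 941]
COL_FREQS = [1209, 1336, 1477, 1633]
KEYS = ["123A", "456B", "789C", "*0#D"]

def dtmf_tone(key, fs=8000, duration=0.1):
    """Synthesise the two-tone signal for a single keypad character."""
    for r, row in enumerate(KEYS):
        if len(key) == 1 and key in row:
            c = row.index(key)
            t = np.arange(int(fs * duration)) / fs
            return (np.sin(2 * np.pi * ROW_FREQS[r] * t)
                    + np.sin(2 * np.pi * COL_FREQS[c] * t))
    raise ValueError(f"not a DTMF key: {key!r}")

def detect_key(signal, fs=8000):
    """Pick the strongest row and column frequency from the FFT magnitude."""
    spectrum = np.abs(np.fft.rfft(signal))

    def strongest(freqs):
        # Nearest FFT bin for each candidate frequency.
        bins = [int(round(f * len(signal) / fs)) for f in freqs]
        return int(np.argmax(spectrum[bins]))

    return KEYS[strongest(ROW_FREQS)][strongest(COL_FREQS)]

if __name__ == "__main__":
    print(detect_key(dtmf_tone("5")))
```

Only eight spectral values are needed, so an embedded implementation would typically evaluate just those bins rather than a full FFT; the full transform is used here for brevity.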
Abstract:
This paper formally transforms a new path-based neural branch prediction algorithm (FPP) into blocks of size two, yielding a lower-cost hardware solution while maintaining input-output characteristics similar to those of the original algorithm. The blocked solution, here referred to as the B2P algorithm, is obtained using graph theory and retiming methods. Verification approaches were exercised to show that the prediction performances obtained from the FPP and B2P algorithms differ by less than one misprediction per thousand instructions, using a known framework for branch prediction evaluation. For a chosen FPGA device, circuits generated from the B2P algorithm showed average area savings of over 25% against circuits for the FPP algorithm with similar time performance, thus making the proposed blocked predictor superior from a practical viewpoint.
Abstract:
In this paper, a fuzzy Markov random field (FMRF) model is used to segment land-objects into tree, grass, building, and road regions by fusing remotely sensed LIDAR data and co-registered color bands, i.e. a scanned aerial color (RGB) photo and a near infra-red (NIR) photo. An FMRF model is defined as a Markov random field (MRF) model in a fuzzy domain. Three optimization algorithms in the FMRF model, i.e. Lagrange multiplier (LM), iterated conditional mode (ICM), and simulated annealing (SA), are compared with respect to computational cost and segmentation accuracy. The results show that the FMRF model-based ICM algorithm balances computational cost and segmentation accuracy in land-cover segmentation from LIDAR data and co-registered bands.
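For readers unfamiliar with ICM, the sketch below shows the iteration on a conventional (crisp) MRF with a Potts smoothness prior and a Gaussian data term on a single-band image. It is not the fuzzy FMRF model of the paper and does not fuse LIDAR with color bands. The energy form, the parameters `beta` and `sigma`, and the function name are assumptions made for illustration.

```python
import numpy as np

def icm_segment(image, class_means, beta=1.0, sigma=1.0, n_sweeps=5):
    """Iterated conditional modes: visit each pixel in turn and give it the
    label that minimises its local energy, holding all other labels fixed."""
    image = np.asarray(image, dtype=float)
    means = np.asarray(class_means, dtype=float)
    # Initial labelling: nearest class mean per pixel.
    labels = np.argmin((image[..., None] - means) ** 2, axis=-1)
    rows, cols = image.shape
    for _ in range(n_sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                # Labels of the 4-connected neighbours inside the image.
                nbrs = [
                    labels[rr, cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols
                ]
                best_label, best_energy = labels[r, c], np.inf
                for k, mu in enumerate(means):
                    # Gaussian data term plus Potts penalty per disagreeing neighbour.
                    data_term = (image[r, c] - mu) ** 2 / (2.0 * sigma ** 2)
                    smooth_term = beta * sum(1 for n in nbrs if n != k)
                    energy = data_term + smooth_term
                    if energy < best_energy:
                        best_label, best_energy = k, energy
                if best_label != labels[r, c]:
                    labels[r, c] = best_label
                    changed = True
        if not changed:
            break  # local minimum of the energy reached
    return labels

if __name__ == "__main__":
    img = np.zeros((5, 5))
    img[2, 2] = 0.9  # isolated outlier inside a uniform region
    print(icm_segment(img, class_means=[0.0, 1.0]))
```

ICM is greedy and converges to a local minimum in a few sweeps, which is consistent with the abstract's framing of ICM as a trade-off between cost and accuracy, whereas simulated annealing explores more of the label space at a higher computational cost.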