869 resultados para Probabilistic Algorithms


Relevância:

20.00% 20.00%

Publicador:

Resumo:

HEMOLIA (a project under European community’s 7th framework programme) is a new generation Anti-Money Laundering (AML) intelligent multi-agent alert and investigation system which in addition to the traditional financial data makes extensive use of modern society’s huge telecom data source, thereby opening up a new dimension of capabilities to all Money Laundering fighters (FIUs, LEAs) and Financial Institutes (Banks, Insurance Companies, etc.). This Master-Thesis project is done at AIA, one of the partners for the HEMOLIA project in Barcelona. The objective of this thesis is to find the clusters in a network drawn by using the financial data. An extensive literature survey has been carried out and several standard algorithms related to networks have been studied and implemented. The clustering problem is a NP-hard problem and several algorithms like K-Means and Hierarchical clustering are being implemented for studying several problems relating to sociology, evolution, anthropology etc. However, these algorithms have certain drawbacks which make them very difficult to implement. The thesis suggests (a) a possible improvement to the K-Means algorithm, (b) a novel approach to the clustering problem using the Genetic Algorithms and (c) a new algorithm for finding the cluster of a node using the Genetic Algorithm.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper proposes a multicast implementation based on adaptive routing with anticipated calculation. Three different cost measures for a point-to-multipoint connection: bandwidth cost, connection establishment cost and switching cost can be considered. The application of the method based on pre-evaluated routing tables makes possible the reduction of bandwidth cost and connection establishment cost individually

Relevância:

20.00% 20.00%

Publicador:

Resumo:

In the first part of this research, three stages were stated for a program to increase the information extracted from ink evidence and maximise its usefulness to the criminal and civil justice system. These stages are (a) develop a standard methodology for analysing ink samples by high-performance thin layer chromatography (HPTLC) in reproducible way, when ink samples are analysed at different time, locations and by different examiners; (b) compare automatically and objectively ink samples; and (c) define and evaluate theoretical framework for the use of ink evidence in forensic context. This report focuses on the second of the three stages. Using the calibration and acquisition process described in the previous report, mathematical algorithms are proposed to automatically and objectively compare ink samples. The performances of these algorithms are systematically studied for various chemical and forensic conditions using standard performance tests commonly used in biometrics studies. The results show that different algorithms are best suited for different tasks. Finally, this report demonstrates how modern analytical and computer technology can be used in the field of ink examination and how tools developed and successfully applied in other fields of forensic science can help maximising its impact within the field of questioned documents.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The paper discusses maintenance challenges of organisations with a huge number of devices and proposes the use of probabilistic models to assist monitoring and maintenance planning. The proposal assumes connectivity of instruments to report relevant features for monitoring. Also, the existence of enough historical registers with diagnosed breakdowns is required to make probabilistic models reliable and useful for predictive maintenance strategies based on them. Regular Markov models based on estimated failure and repair rates are proposed to calculate the availability of the instruments and Dynamic Bayesian Networks are proposed to model cause-effect relationships to trigger predictive maintenance services based on the influence between observed features and previously documented diagnostics

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Miralls deformables més i més grans, amb cada cop més actuadors estan sent utilitzats actualment en aplicacions d'òptica adaptativa. El control dels miralls amb centenars d'actuadors és un tema de gran interès, ja que les tècniques de control clàssiques basades en la seudoinversa de la matriu de control del sistema es tornen massa lentes quan es tracta de matrius de dimensions tan grans. En aquesta tesi doctoral es proposa un mètode per l'acceleració i la paral.lelitzacó dels algoritmes de control d'aquests miralls, a través de l'aplicació d'una tècnica de control basada en la reducció a zero del components més petits de la matriu de control (sparsification), seguida de l'optimització de l'ordenació dels accionadors de comandament atenent d'acord a la forma de la matriu, i finalment de la seva posterior divisió en petits blocs tridiagonals. Aquests blocs són molt més petits i més fàcils de fer servir en els càlculs, el que permet velocitats de càlcul molt superiors per l'eliminació dels components nuls en la matriu de control. A més, aquest enfocament permet la paral.lelització del càlcul, donant una com0onent de velocitat addicional al sistema. Fins i tot sense paral. lelització, s'ha obtingut un augment de gairebé un 40% de la velocitat de convergència dels miralls amb només 37 actuadors, mitjançant la tècnica proposada. Per validar això, s'ha implementat un muntatge experimental nou complet , que inclou un modulador de fase programable per a la generació de turbulència mitjançant pantalles de fase, i s'ha desenvolupat un model complert del bucle de control per investigar el rendiment de l'algorisme proposat. Els resultats, tant en la simulació com experimentalment, mostren l'equivalència total en els valors de desviació després de la compensació dels diferents tipus d'aberracions per als diferents algoritmes utilitzats, encara que el mètode proposat aquí permet una càrrega computacional molt menor. El procediment s'espera que sigui molt exitós quan s'aplica a miralls molt grans.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

BACKGROUND: Solexa/Illumina short-read ultra-high throughput DNA sequencing technology produces millions of short tags (up to 36 bases) by parallel sequencing-by-synthesis of DNA colonies. The processing and statistical analysis of such high-throughput data poses new challenges; currently a fair proportion of the tags are routinely discarded due to an inability to match them to a reference sequence, thereby reducing the effective throughput of the technology. RESULTS: We propose a novel base calling algorithm using model-based clustering and probability theory to identify ambiguous bases and code them with IUPAC symbols. We also select optimal sub-tags using a score based on information content to remove uncertain bases towards the ends of the reads. CONCLUSION: We show that the method improves genome coverage and number of usable tags as compared with Solexa's data processing pipeline by an average of 15%. An R package is provided which allows fast and accurate base calling of Solexa's fluorescence intensity files and the production of informative diagnostic plots.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Abstract This thesis proposes a set of adaptive broadcast solutions and an adaptive data replication solution to support the deployment of P2P applications. P2P applications are an emerging type of distributed applications that are running on top of P2P networks. Typical P2P applications are video streaming, file sharing, etc. While interesting because they are fully distributed, P2P applications suffer from several deployment problems, due to the nature of the environment on which they perform. Indeed, defining an application on top of a P2P network often means defining an application where peers contribute resources in exchange for their ability to use the P2P application. For example, in P2P file sharing application, while the user is downloading some file, the P2P application is in parallel serving that file to other users. Such peers could have limited hardware resources, e.g., CPU, bandwidth and memory or the end-user could decide to limit the resources it dedicates to the P2P application a priori. In addition, a P2P network is typically emerged into an unreliable environment, where communication links and processes are subject to message losses and crashes, respectively. To support P2P applications, this thesis proposes a set of services that address some underlying constraints related to the nature of P2P networks. The proposed services include a set of adaptive broadcast solutions and an adaptive data replication solution that can be used as the basis of several P2P applications. Our data replication solution permits to increase availability and to reduce the communication overhead. The broadcast solutions aim, at providing a communication substrate encapsulating one of the key communication paradigms used by P2P applications: broadcast. Our broadcast solutions typically aim at offering reliability and scalability to some upper layer, be it an end-to-end P2P application or another system-level layer, such as a data replication layer. Our contributions are organized in a protocol stack made of three layers. In each layer, we propose a set of adaptive protocols that address specific constraints imposed by the environment. Each protocol is evaluated through a set of simulations. The adaptiveness aspect of our solutions relies on the fact that they take into account the constraints of the underlying system in a proactive manner. To model these constraints, we define an environment approximation algorithm allowing us to obtain an approximated view about the system or part of it. This approximated view includes the topology and the components reliability expressed in probabilistic terms. To adapt to the underlying system constraints, the proposed broadcast solutions route messages through tree overlays permitting to maximize the broadcast reliability. Here, the broadcast reliability is expressed as a function of the selected paths reliability and of the use of available resources. These resources are modeled in terms of quotas of messages translating the receiving and sending capacities at each node. To allow a deployment in a large-scale system, we take into account the available memory at processes by limiting the view they have to maintain about the system. Using this partial view, we propose three scalable broadcast algorithms, which are based on a propagation overlay that tends to the global tree overlay and adapts to some constraints of the underlying system. At a higher level, this thesis also proposes a data replication solution that is adaptive both in terms of replica placement and in terms of request routing. At the routing level, this solution takes the unreliability of the environment into account, in order to maximize reliable delivery of requests. At the replica placement level, the dynamically changing origin and frequency of read/write requests are analyzed, in order to define a set of replica that minimizes communication cost.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Part I of this series of articles focused on the construction of graphical probabilistic inference procedures, at various levels of detail, for assessing the evidential value of gunshot residue (GSR) particle evidence. The proposed models - in the form of Bayesian networks - address the issues of background presence of GSR particles, analytical performance (i.e., the efficiency of evidence searching and analysis procedures) and contamination. The use and practical implementation of Bayesian networks for case pre-assessment is also discussed. This paper, Part II, concentrates on Bayesian parameter estimation. This topic complements Part I in that it offers means for producing estimates useable for the numerical specification of the proposed probabilistic graphical models. Bayesian estimation procedures are given a primary focus of attention because they allow the scientist to combine (his/her) prior knowledge about the problem of interest with newly acquired experimental data. The present paper also considers further topics such as the sensitivity of the likelihood ratio due to uncertainty in parameters and the study of likelihood ratio values obtained for members of particular populations (e.g., individuals with or without exposure to GSR).

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Wireless “MIMO” systems, employing multiple transmit and receive antennas, promise a significant increase of channel capacity, while orthogonal frequency-division multiplexing (OFDM) is attracting a good deal of attention due to its robustness to multipath fading. Thus, the combination of both techniques is an attractive proposition for radio transmission. The goal of this paper is the description and analysis of a new and novel pilot-aided estimator of multipath block-fading channels. Typical models leading to estimation algorithms assume the number of multipath components and delays to be constant (and often known), while their amplitudes are allowed to vary with time. Our estimator is focused instead on the more realistic assumption that the number of channel taps is also unknown and varies with time following a known probabilistic model. The estimation problem arising from these assumptions is solved using Random-Set Theory (RST), whereby one regards the multipath-channel response as a single set-valued random entity.Within this framework, Bayesian recursive equations determine the evolution with time of the channel estimator. Due to the lack of a closed form for the solution of Bayesian equations, a (Rao–Blackwellized) particle filter (RBPF) implementation ofthe channel estimator is advocated. Since the resulting estimator exhibits a complexity which grows exponentially with the number of multipath components, a simplified version is also introduced. Simulation results describing the performance of our channel estimator demonstrate its effectiveness.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

BACKGROUND: Tests for recent infections (TRIs) are important for HIV surveillance. We have shown that a patient's antibody pattern in a confirmatory line immunoassay (Inno-Lia) also yields information on time since infection. We have published algorithms which, with a certain sensitivity and specificity, distinguish between incident (< = 12 months) and older infection. In order to use these algorithms like other TRIs, i.e., based on their windows, we now determined their window periods. METHODS: We classified Inno-Lia results of 527 treatment-naïve patients with HIV-1 infection < = 12 months according to incidence by 25 algorithms. The time after which all infections were ruled older, i.e. the algorithm's window, was determined by linear regression of the proportion ruled incident in dependence of time since infection. Window-based incident infection rates (IIR) were determined utilizing the relationship 'Prevalence = Incidence x Duration' in four annual cohorts of HIV-1 notifications. Results were compared to performance-based IIR also derived from Inno-Lia results, but utilizing the relationship 'incident = true incident + false incident' and also to the IIR derived from the BED incidence assay. RESULTS: Window periods varied between 45.8 and 130.1 days and correlated well with the algorithms' diagnostic sensitivity (R(2) = 0.962; P<0.0001). Among the 25 algorithms, the mean window-based IIR among the 748 notifications of 2005/06 was 0.457 compared to 0.453 obtained for performance-based IIR with a model not correcting for selection bias. Evaluation of BED results using a window of 153 days yielded an IIR of 0.669. Window-based IIR and performance-based IIR increased by 22.4% and respectively 30.6% in 2008, while 2009 and 2010 showed a return to baseline for both methods. CONCLUSIONS: IIR estimations by window- and performance-based evaluations of Inno-Lia algorithm results were similar and can be used together to assess IIR changes between annual HIV notification cohorts.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Abstract Sitting between your past and your future doesn't mean you are in the present. Dakota Skye Complex systems science is an interdisciplinary field grouping under the same umbrella dynamical phenomena from social, natural or mathematical sciences. The emergence of a higher order organization or behavior, transcending that expected of the linear addition of the parts, is a key factor shared by all these systems. Most complex systems can be modeled as networks that represent the interactions amongst the system's components. In addition to the actual nature of the part's interactions, the intrinsic topological structure of underlying network is believed to play a crucial role in the remarkable emergent behaviors exhibited by the systems. Moreover, the topology is also a key a factor to explain the extraordinary flexibility and resilience to perturbations when applied to transmission and diffusion phenomena. In this work, we study the effect of different network structures on the performance and on the fault tolerance of systems in two different contexts. In the first part, we study cellular automata, which are a simple paradigm for distributed computation. Cellular automata are made of basic Boolean computational units, the cells; relying on simple rules and information from- the surrounding cells to perform a global task. The limited visibility of the cells can be modeled as a network, where interactions amongst cells are governed by an underlying structure, usually a regular one. In order to increase the performance of cellular automata, we chose to change its topology. We applied computational principles inspired by Darwinian evolution, called evolutionary algorithms, to alter the system's topological structure starting from either a regular or a random one. The outcome is remarkable, as the resulting topologies find themselves sharing properties of both regular and random network, and display similitudes Watts-Strogtz's small-world network found in social systems. Moreover, the performance and tolerance to probabilistic faults of our small-world like cellular automata surpasses that of regular ones. In the second part, we use the context of biological genetic regulatory networks and, in particular, Kauffman's random Boolean networks model. In some ways, this model is close to cellular automata, although is not expected to perform any task. Instead, it simulates the time-evolution of genetic regulation within living organisms under strict conditions. The original model, though very attractive by it's simplicity, suffered from important shortcomings unveiled by the recent advances in genetics and biology. We propose to use these new discoveries to improve the original model. Firstly, we have used artificial topologies believed to be closer to that of gene regulatory networks. We have also studied actual biological organisms, and used parts of their genetic regulatory networks in our models. Secondly, we have addressed the improbable full synchronicity of the event taking place on. Boolean networks and proposed a more biologically plausible cascading scheme. Finally, we tackled the actual Boolean functions of the model, i.e. the specifics of how genes activate according to the activity of upstream genes, and presented a new update function that takes into account the actual promoting and repressing effects of one gene on another. Our improved models demonstrate the expected, biologically sound, behavior of previous GRN model, yet with superior resistance to perturbations. We believe they are one step closer to the biological reality.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This monthly report from the Iowa Department of Transportation is about the water quality management of Iowa's rivers, streams and lakes.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Recently, kernel-based Machine Learning methods have gained great popularity in many data analysis and data mining fields: pattern recognition, biocomputing, speech and vision, engineering, remote sensing etc. The paper describes the use of kernel methods to approach the processing of large datasets from environmental monitoring networks. Several typical problems of the environmental sciences and their solutions provided by kernel-based methods are considered: classification of categorical data (soil type classification), mapping of environmental and pollution continuous information (pollution of soil by radionuclides), mapping with auxiliary information (climatic data from Aral Sea region). The promising developments, such as automatic emergency hot spot detection and monitoring network optimization are discussed as well.