948 resultados para BAYESIAN NETWORK
Resumo:
Modern computer systems are plagued with stability and security problems: applications lose data, web servers are hacked, and systems crash under heavy load. Many of these problems or anomalies arise from rare program behavior caused by attacks or errors. A substantial percentage of the web-based attacks are due to buffer overflows. Many methods have been devised to detect and prevent anomalous situations that arise from buffer overflows. The current state-of-art of anomaly detection systems is relatively primitive and mainly depend on static code checking to take care of buffer overflow attacks. For protection, Stack Guards and I-leap Guards are also used in wide varieties.This dissertation proposes an anomaly detection system, based on frequencies of system calls in the system call trace. System call traces represented as frequency sequences are profiled using sequence sets. A sequence set is identified by the starting sequence and frequencies of specific system calls. The deviations of the current input sequence from the corresponding normal profile in the frequency pattern of system calls is computed and expressed as an anomaly score. A simple Bayesian model is used for an accurate detection.Experimental results are reported which show that frequency of system calls represented using sequence sets, captures the normal behavior of programs under normal conditions of usage. This captured behavior allows the system to detect anomalies with a low rate of false positives. Data are presented which show that Bayesian Network on frequency variations responds effectively to induced buffer overflows. It can also help administrators to detect deviations in program flow introduced due to errors.
Resumo:
The management of information in engineering organisations is facing a particular challenge in the ever-increasing volume of information. It has been recognised that an effective methodology is required to evaluate information in order to avoid information overload and to retain the right information for reuse. By using, as a starting point, a number of the current tools and techniques which attempt to obtain ‘the value’ of information, it is proposed that an assessment or filter mechanism for information is needed to be developed. This paper addresses this issue firstly by briefly reviewing the information overload problem, the definition of value, and related research work on the value of information in various areas. Then a “characteristic” based framework of information evaluation is introduced using the key characteristics identified from related work as an example. A Bayesian Network diagram method is introduced to the framework to build the linkage between the characteristics and information value in order to quantitatively calculate the quality and value of information. The training and verification process for the model is then described using 60 real engineering documents as a sample. The model gives a reasonable accurate result and the differences between the model calculation and training judgements are summarised as the potential causes are discussed. Finally, several further issues including the challenge of the framework and the implementations of this evaluation assessment method are raised.
Resumo:
The genetic analysis workshop 15 (GAW15) problem 1 contained baseline expression levels of 8793 genes in immortalised B cells from 194 individuals in 14 Centre d’Etude du Polymorphisme Humane (CEPH) Utah pedigrees. Previous analysis of the data showed linkage and association and evidence of substantial individual variations. In particular, correlation was examined on expression levels of 31 genes and 25 target genes corresponding to two master regulatory regions. In this analysis, we apply Bayesian network analysis to gain further insight into these findings. We identify strong dependences and therefore provide additional insight into the underlying relationships between the genes involved. More generally, the approach is expected to be applicable for integrated analysis of genes on biological pathways.
Resumo:
In order to overcome divergence of estimation with the same data, the proposed digital costing process adopts an integrated design of information system to design the process knowledge and costing system together. By employing and extending a widely used international standard, industry foundation classes, the system can provide an integrated process which can harvest information and knowledge of current quantity surveying practice of costing method and data. Knowledge of quantification is encoded from literatures, motivation case and standards. It can reduce the time consumption of current manual practice. The further development will represent the pricing process in a Bayesian Network based knowledge representation approach. The hybrid types of knowledge representation can produce a reliable estimation for construction project. In a practical term, the knowledge management of quantity surveying can improve the system of construction estimation. The theoretical significance of this study lies in the fact that its content and conclusion make it possible to develop an automatic estimation system based on hybrid knowledge representation approach.
Resumo:
Consideration of a wide range of plausible crime scenarios during any crime investigation is important to seek convincing evidence and hence to minimize the likelihood of miscarriages of justice. It is equally important for crime investigators to be able to employ effective and efficient evidence-collection strategies that are likely to produce the most conclusive information under limited available resources. An intelligent decision support system that can assist human investigators by automatically constructing plausible scenarios, and reasoning with the likely best investigating actions will clearly be very helpful in addressing these challenging problems. This paper presents a system for creating scenario spaces from given evidence, based on an integrated application of techniques for compositional modelling and Bayesian network-based evidence evaluation. Methods of analysis are also provided by the use of entropy to exploit the synthesized scenario spaces in order to prioritize investigating actions and hypotheses. These theoretical developments are illustrated by realistic examples of serious crime investigation.
Resumo:
This paper proposes a fuzzy classification system for the risk of infestation by weeds in agricultural zones considering the variability of weeds. The inputs of the system are features of the infestation extracted from estimated maps by kriging for the weed seed production and weed coverage, and from the competitiveness, inferred from narrow and broad-leaved weeds. Furthermore, a Bayesian network classifier is used to extract rules from data which are compared to the fuzzy rule set obtained on the base of specialist knowledge. Results for the risk inference in a maize crop field are presented and evaluated by the estimated yield loss. © 2009 IEEE.
Resumo:
A automação na gestão e análise de dados tem sido um fator crucial para as empresas que necessitam de soluções eficientes em um mundo corporativo cada vez mais competitivo. A explosão do volume de informações, que vem se mantendo crescente nos últimos anos, tem exigido cada vez mais empenho em buscar estratégias para gerenciar e, principalmente, extrair informações estratégicas valiosas a partir do uso de algoritmos de Mineração de Dados, que comumente necessitam realizar buscas exaustivas na base de dados a fim de obter estatísticas que solucionem ou otimizem os parâmetros do modelo de extração do conhecimento utilizado; processo que requer computação intensiva para a execução de cálculos e acesso frequente à base de dados. Dada a eficiência no tratamento de incerteza, Redes Bayesianas têm sido amplamente utilizadas neste processo, entretanto, à medida que o volume de dados (registros e/ou atributos) aumenta, torna-se ainda mais custoso e demorado extrair informações relevantes em uma base de conhecimento. O foco deste trabalho é propor uma nova abordagem para otimização do aprendizado da estrutura da Rede Bayesiana no contexto de BigData, por meio do uso do processo de MapReduce, com vista na melhora do tempo de processamento. Para tanto, foi gerada uma nova metodologia que inclui a criação de uma Base de Dados Intermediária contendo todas as probabilidades necessárias para a realização dos cálculos da estrutura da rede. Por meio das análises apresentadas neste estudo, mostra-se que a combinação da metodologia proposta com o processo de MapReduce é uma boa alternativa para resolver o problema de escalabilidade nas etapas de busca em frequência do algoritmo K2 e, consequentemente, reduzir o tempo de resposta na geração da rede.
Resumo:
In the collective imaginaries a robot is a human like machine as any androids in science fiction. However the type of robots that you will encounter most frequently are machinery that do work that is too dangerous, boring or onerous. Most of the robots in the world are of this type. They can be found in auto, medical, manufacturing and space industries. Therefore a robot is a system that contains sensors, control systems, manipulators, power supplies and software all working together to perform a task. The development and use of such a system is an active area of research and one of the main problems is the development of interaction skills with the surrounding environment, which include the ability to grasp objects. To perform this task the robot needs to sense the environment and acquire the object informations, physical attributes that may influence a grasp. Humans can solve this grasping problem easily due to their past experiences, that is why many researchers are approaching it from a machine learning perspective finding grasp of an object using information of already known objects. But humans can select the best grasp amongst a vast repertoire not only considering the physical attributes of the object to grasp but even to obtain a certain effect. This is why in our case the study in the area of robot manipulation is focused on grasping and integrating symbolic tasks with data gained through sensors. The learning model is based on Bayesian Network to encode the statistical dependencies between the data collected by the sensors and the symbolic task. This data representation has several advantages. It allows to take into account the uncertainty of the real world, allowing to deal with sensor noise, encodes notion of causality and provides an unified network for learning. Since the network is actually implemented and based on the human expert knowledge, it is very interesting to implement an automated method to learn the structure as in the future more tasks and object features can be introduced and a complex network design based only on human expert knowledge can become unreliable. Since structure learning algorithms presents some weaknesses, the goal of this thesis is to analyze real data used in the network modeled by the human expert, implement a feasible structure learning approach and compare the results with the network designed by the expert in order to possibly enhance it.
Resumo:
Automatic identification and extraction of bone contours from X-ray images is an essential first step task for further medical image analysis. In this paper we propose a 3D statistical model based framework for the proximal femur contour extraction from calibrated X-ray images. The automatic initialization is solved by an estimation of Bayesian network algorithm to fit a multiple component geometrical model to the X-ray data. The contour extraction is accomplished by a non-rigid 2D/3D registration between a 3D statistical model and the X-ray images, in which bone contours are extracted by a graphical model based Bayesian inference. Preliminary experiments on clinical data sets verified its validity
Resumo:
Systems biology techniques are a topic of recent interest within the neurological field. Computational intelligence (CI) addresses this holistic perspective by means of consensus or ensemble techniques ultimately capable of uncovering new and relevant findings. In this paper, we propose the application of a CI approach based on ensemble Bayesian network classifiers and multivariate feature subset selection to induce probabilistic dependences that could match or unveil biological relationships. The research focuses on the analysis of high-throughput Alzheimer's disease (AD) transcript profiling. The analysis is conducted from two perspectives. First, we compare the expression profiles of hippocampus subregion entorhinal cortex (EC) samples of AD patients and controls. Second, we use the ensemble approach to study four types of samples: EC and dentate gyrus (DG) samples from both patients and controls. Results disclose transcript interaction networks with remarkable structures and genes not directly related to AD by previous studies. The ensemble is able to identify a variety of transcripts that play key roles in other neurological pathologies. Classical statistical assessment by means of non-parametric tests confirms the relevance of the majority of the transcripts. The ensemble approach pinpoints key metabolic mechanisms that could lead to new findings in the pathogenesis and development of AD
Resumo:
This paper proposes a new multi-objective estimation of distribution algorithm (EDA) based on joint modeling of objectives and variables. This EDA uses the multi-dimensional Bayesian network as its probabilistic model. In this way it can capture the dependencies between objectives, variables and objectives, as well as the dependencies learnt between variables in other Bayesian network-based EDAs. This model leads to a problem decomposition that helps the proposed algorithm to find better trade-off solutions to the multi-objective problem. In addition to Pareto set approximation, the algorithm is also able to estimate the structure of the multi-objective problem. To apply the algorithm to many-objective problems, the algorithm includes four different ranking methods proposed in the literature for this purpose. The algorithm is applied to the set of walking fish group (WFG) problems, and its optimization performance is compared with an evolutionary algorithm and another multi-objective EDA. The experimental results show that the proposed algorithm performs significantly better on many of the problems and for different objective space dimensions, and achieves comparable results on some compared with the other algorithms.
Resumo:
Probabilistic modeling is the de�ning characteristic of estimation of distribution algorithms (EDAs) which determines their behavior and performance in optimization. Regularization is a well-known statistical technique used for obtaining an improved model by reducing the generalization error of estimation, especially in high-dimensional problems. `1-regularization is a type of this technique with the appealing variable selection property which results in sparse model estimations. In this thesis, we study the use of regularization techniques for model learning in EDAs. Several methods for regularized model estimation in continuous domains based on a Gaussian distribution assumption are presented, and analyzed from di�erent aspects when used for optimization in a high-dimensional setting, where the population size of EDA has a logarithmic scale with respect to the number of variables. The optimization results obtained for a number of continuous problems with an increasing number of variables show that the proposed EDA based on regularized model estimation performs a more robust optimization, and is able to achieve signi�cantly better results for larger dimensions than other Gaussian-based EDAs. We also propose a method for learning a marginally factorized Gaussian Markov random �eld model using regularization techniques and a clustering algorithm. The experimental results show notable optimization performance on continuous additively decomposable problems when using this model estimation method. Our study also covers multi-objective optimization and we propose joint probabilistic modeling of variables and objectives in EDAs based on Bayesian networks, speci�cally models inspired from multi-dimensional Bayesian network classi�ers. It is shown that with this approach to modeling, two new types of relationships are encoded in the estimated models in addition to the variable relationships captured in other EDAs: objectivevariable and objective-objective relationships. An extensive experimental study shows the e�ectiveness of this approach for multi- and many-objective optimization. With the proposed joint variable-objective modeling, in addition to the Pareto set approximation, the algorithm is also able to obtain an estimation of the multi-objective problem structure. Finally, the study of multi-objective optimization based on joint probabilistic modeling is extended to noisy domains, where the noise in objective values is represented by intervals. A new version of the Pareto dominance relation for ordering the solutions in these problems, namely �-degree Pareto dominance, is introduced and its properties are analyzed. We show that the ranking methods based on this dominance relation can result in competitive performance of EDAs with respect to the quality of the approximated Pareto sets. This dominance relation is then used together with a method for joint probabilistic modeling based on `1-regularization for multi-objective feature subset selection in classi�cation, where six di�erent measures of accuracy are considered as objectives with interval values. The individual assessment of the proposed joint probabilistic modeling and solution ranking methods on datasets with small-medium dimensionality, when using two di�erent Bayesian classi�ers, shows that comparable or better Pareto sets of feature subsets are approximated in comparison to standard methods.
Resumo:
Following the Integrated Water Resources Management approach, the European Water Framework Directive demands Member States to develop water management plans at the catchment level. Those plans have to integrate the different interests and must be developed with stakeholder participation. To face these requirements, managers need tools to assess the impacts of possible management alternatives on natural and socio-economic systems. These tools should ideally be able to address the complexity and uncertainties of the water system, while serving as a platform for stakeholder participation. The objective of our research was to develop a participatory integrated assessment model, based on the combination of a crop model, an economic model and a participatory Bayesian network, with an application in the middle Guadiana sub-basin, in Spain. The methodology is intended to capture the complexity of water management problems, incorporating the relevant sectors, as well as the relevant scales involved in water management decision making. The integrated model has allowed us testing different management, market and climate change scenarios and assessing the impacts of such scenarios on the natural system (crops), on the socio-economic system (farms) and on the environment (water resources). Finally, this integrated assessment modelling process has allowed stakeholder participation, complying with the main requirements of current European water laws.
Resumo:
A participatory modelling process has been conducted in two areas of the Guadiana river (the upper and the middle sub-basins), in Spain, with the aim of providing support for decision making in the water management field. The area has a semi-arid climate where irrigated agriculture plays a key role in the economic development of the region and accounts for around 90% of water use. Following the guidelines of the European Water Framework Directive, we promote stakeholder involvement in water management with the aim to achieve an improved understanding of the water system and to encourage the exchange of knowledge and views between stakeholders in order to help building a shared vision of the system. At the same time, the resulting models, which integrate the different sectors and views, provide some insight of the impacts that different management options and possible future scenarios could have. The methodology is based on a Bayesian network combined with an economic model and, in the middle Guadiana sub-basin, with a crop model. The resulting integrated modelling framework is used to simulate possible water policy, market and climate scenarios to find out the impacts of those scenarios on farm income and on the environment. At the end of the modelling process, an evaluation questionnaire was filled by participants in both sub-basins. Results show that this type of processes are found very helpful by stakeholders to improve the system understanding, to understand each others views and to reduce conflict when it exists. In addition, they found the model an extremely useful tool to support management. The graphical interface, the quantitative output and the explicit representation of uncertainty helped stakeholders to better understand the implications of the scenario tested. Finally, the combination of different types of models was also found very useful, as it allowed exploring in detail specific aspects of the water management problems.