931 results for bayesian networks
Abstract:
The main objective of this doctoral thesis is to deepen the analysis and design of an intelligent system for predicting and controlling surface finish in a high-speed milling process, based fundamentally on Bayesian classifiers, with the aim of developing a methodology that facilitates the design of this type of system. The system, whose purpose is to enable the prediction and control of surface roughness, consists of a model learned from experimental data with Bayesian networks, which will help in understanding the dynamic processes involved in machining and the interactions among the relevant variables. Since artificial neural networks are widely used models in material cutting processes, a milling model built with them is also included, in which the geometry and hardness of the material were introduced as novel variables not previously studied in this context. An important contribution of this thesis is therefore these two models for predicting surface roughness, which are compared on several aspects: the influence of the new variables, performance evaluation indicators, and interpretability. One of the main problems in modelling with Bayesian classifiers is understanding the enormous posterior probability tables they produce. We introduce an explanation method that generates a set of rules obtained from decision trees. These trees are induced from a simulated data set generated from the posterior probabilities of the class variable, computed with the Bayesian network learned from a training data set. Finally, we contribute to the multi-objective field for the case in which some of the objectives cannot be quantified as real numbers, but only as interval-valued functions. 
This often occurs in machine learning applications, especially those based on supervised classification. Specifically, the ideas of dominance and Pareto front are extended to this setting. Its application to surface roughness prediction studies the case of simultaneously maximizing the sensitivity and specificity of the classifier induced from the Bayesian network, rather than only maximizing the correct classification rate. The intervals for these two objectives come from an honest estimation method for both, such as k-fold cross-validation or bootstrap.---ABSTRACT---The main objective of this PhD thesis is to examine in greater depth the analysis and design of an intelligent system for surface roughness prediction and control in the end-milling machining process, based fundamentally on Bayesian network classifiers, with the aim of developing a methodology that makes this type of system easier to design. The system, whose purpose is to make surface roughness prediction and control possible, consists of a model learnt from experimental data with the aid of Bayesian networks, which will help in understanding the dynamic processes involved in the machining and the interactions among the relevant variables. Since artificial neural networks are models widely used in material cutting processes, we also include an end-milling model using them, where the geometry and hardness of the workpiece are introduced as novel variables not studied so far within this context. Thus, an important contribution of this thesis is these two models for surface roughness prediction, which are then compared with respect to different aspects: influence of the new variables, performance evaluation metrics, and interpretability. One of the main problems with Bayesian classifier-based modelling is understanding the enormous posterior probability tables produced. 
We introduce an explanation method that generates a set of rules obtained from decision trees. Such trees are induced from a simulated data set generated from the posterior probabilities of the class variable, calculated with the Bayesian network learned from a training data set. Finally, we contribute to the multi-objective field in the case where some of the objectives cannot be quantified as real numbers but only as interval-valued functions. This often occurs in machine learning applications, especially those based on supervised classification. Specifically, the dominance and Pareto front ideas are extended to this setting. Its application to surface roughness prediction studies the case of simultaneously maximizing the sensitivity and specificity of the induced Bayesian network classifier, rather than only maximizing the correct classification rate. The intervals for these two objectives come from an honest estimation method for both objectives, such as k-fold cross-validation or bootstrap.
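The extension of Pareto dominance to interval-valued objectives can be illustrated with a minimal sketch. The abstract does not state which interval ordering the thesis uses, so the ordering below (an interval is at least as good as another when both of its bounds are at least as high) is an assumption, and the names `interval_geq`, `dominates` and `pareto_front` and all numeric values are hypothetical.

```python
from typing import List, Tuple

# One objective estimate, e.g. a cross-validated (lower, upper) range.
Interval = Tuple[float, float]


def interval_geq(a: Interval, b: Interval) -> bool:
    """Assumed ordering: a is at least as good as b if both bounds are >=."""
    return a[0] >= b[0] and a[1] >= b[1]


def dominates(x: List[Interval], y: List[Interval]) -> bool:
    """x dominates y: no worse on every objective and not identical to y."""
    return all(interval_geq(a, b) for a, b in zip(x, y)) and x != y


def pareto_front(solutions: List[List[Interval]]) -> List[List[Interval]]:
    """Keep the solutions that no other solution dominates."""
    return [s for s in solutions if not any(dominates(t, s) for t in solutions)]


# Hypothetical classifiers described by (sensitivity, specificity) intervals.
clf_a = [(0.80, 0.90), (0.70, 0.80)]
clf_b = [(0.75, 0.85), (0.65, 0.78)]  # worse than clf_a on both objectives
clf_c = [(0.60, 0.95), (0.72, 0.82)]  # wider sensitivity interval: incomparable with clf_a

front = pareto_front([clf_a, clf_b, clf_c])
print(front)
```

Under this ordering two overlapping intervals can be incomparable, which is why `clf_c` stays on the front alongside `clf_a`; a stricter or looser interval ordering would give a different front.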
Abstract:
The inner workings of the brain remain a mystery to this day, and understanding them is one of the main challenges facing modern science. The cerebral cortex is the area of the brain where the highest-level brain processes take place, such as imagination, judgement and abstract thought. Pyramidal neurons, a specific type of neuron, account for about 80% of the roughly 10,000 million neurons that make up the cerebral cortex, which makes them a prime target in the study of how the brain works. Neuronal morphology, and more specifically dendritic morphology, determines how neurons process information and the connection patterns among them, and computational models are essential tools for studying its role in brain function. In this work we have created a computational model, with more than 50 variables related to dendritic morphology, capable of simulating the growth of complete basal dendritic arborizations from reconstructions of real pyramidal neurons, covering everything from the number of dendrites to the growth of the dendritic trees. Unlike previous work, our model based on Bayesian networks considers the dendritic arborization as a whole, taking into account the interactions between dendrites and automatically detecting the relationships among the morphological variables that characterize the arborization. In addition, analysing the Bayesian networks can help identify previously unknown relationships among morphological variables. Motivated by the study of the orientation of basal dendrites, this work introduces a generalized L1 regularization, applied to learning the multivariate von Mises distribution, one of the main multivariate directional probability distributions. 
A multivariate circular distance is also proposed, which can be used to estimate the Kullback-Leibler divergence between two samples of circular data. We compare the models with and without regularization in the study of basal dendrite orientation in human neurons, finding that, in general, the regularized model obtains better results. Sampling, fitting and plotting of the multivariate von Mises distribution are implemented in a new R package called mvCircular.---ABSTRACT---The inner workings of the brain are, as of today, a mystery. Understanding the brain is one of the main challenges faced by current science. The cerebral cortex is the region of the brain where all higher brain processes, such as imagination, judgement and abstract reasoning, take place. Pyramidal neurons, a specific type of neuron, constitute approximately 80% of the more than 10,000 million neurons that make up the cerebral cortex. This makes the study of pyramidal neurons crucial for understanding how the brain works. Neuron morphology, and specifically dendritic morphology, determines how information is processed in the neurons, as well as the connection patterns among neurons. Computational models are one of the main tools for studying dendritic morphology and its role in brain function. We have built a computational model that contains more than 50 morphological variables of the dendritic arborizations. This model is able to simulate the growth of complete dendritic arborizations from real neuron reconstructions, starting with the number of basal dendrites and ending with the growth of the dendritic trees. 
One of the main differences between our approach, based mainly on Bayesian networks, and other state-of-the-art models is that we model the whole dendritic arborization instead of focusing on individual trees, which lets us take into account the interactions between dendrites and automatically detect relationships between the morphological variables that characterize the arborization. Moreover, subsequent analysis of the relationships in the model can help to identify new relations between morphological variables. Motivated by the study of basal dendrite orientation, this work also introduces a generalized L1 regularization applied to the multivariate von Mises distribution, one of the most widely used distributions in multivariate directional statistics. We also propose a circular multivariate distance that can be used to estimate the Kullback-Leibler divergence between two circular data samples. We compare the regularized and unregularized models on the basal dendrite orientation of human neurons and show that the regularized model achieves better results than the non-regularized von Mises model. Sampling, fitting and plotting functions for the multivariate von Mises distribution are implemented in a new R package called mvCircular.
Abstract:
Accurate forecasting in transport logistics is vitally important for the proper planning and optimization of means and resources. To date, port planning studies have relied mainly on three kinds of model: empirical models, which have been used to plan new terminals and develop master plans when no initial data are available; analytical models, more closely related to queuing theory and waiting times, with complex mathematical formulations that need to be simplified to make the model manageable and practical; and simulation models, which require a significant investment in complex software and development to obtain acceptable results. Data Mining (DM) is a modern interdisciplinary field that encompasses techniques that operate automatically (requiring minimal human intervention) and that are also efficient at working with the large amounts of information available in the databases of many practical problems. The practical application of these disciplines extends to numerous commercial and research fields in problems of prediction, classification or diagnosis. Among the techniques available in data mining, artificial neural networks (ANN) and probabilistic or Bayesian networks (BN) make it possible to jointly model all the information relevant to a given problem. The present work analyses two applications of these techniques to the port sector, specifically to containers. The doctoral thesis develops ANN as a tool for obtaining future traffic and resource forecasts for different ports from operating variables, yielding continuous values. For Bayesian networks (BN), a similar piece of work is carried out, yielding discrete values (an interval). 
The main result is the possibility of using both ANN and BN to estimate future physical parameters, as well as the relationship among them in a terminal, so that resources can be assigned correctly and the productive efficiency of the terminal thereby increased. As a final step, a short-term study of the complementarity of the two models is carried out, which confirms that the results obtained are acceptable. It can therefore be concluded that these prediction methods can be of great help in port planning.---ABSTRACT---Correct forecasting of assets in the field of transportation logistics is a matter of vital importance for suitable planning and optimization of the necessary means and resources. Up to now, port planning studies have basically used empirical models, to deal with the planning of new terminals or the development of master plans when no initial data are available; analytical models, more closely connected to queuing theory and waiting times, with very complicated mathematical formulations that require significant simplification to yield a practical, easy-to-handle model; or simulation models, which require a significant investment in computer codes and complex developments to produce acceptable results. Data Mining (DM) is a modern interdisciplinary field that includes those techniques that operate automatically (almost no human intervention is required) and are highly efficient when dealing with practical problems characterized by huge databases containing significant amounts of information. The practical application of these disciplines extends to many commercial and research fields, dealing with forecasting, classification or diagnosis problems. Among the different Data Mining techniques, Artificial Neural Networks (ANN) and probabilistic or Bayesian networks (BN) allow the joint modelling of all the relevant information for a given problem. 
This PhD work analyses their application to two practical cases in the ports field, specifically to container terminals. It details how ANN have been developed as a tool to produce traffic and resource forecasts for several ports, based on operating variables, to obtain continuous values. For the Bayesian networks (BN) case, a similar development has been carried out, obtaining discrete values (an interval). The main finding is that ANN and BN can be used to estimate the future physical parameters of ports or terminals, as well as the relationship between them within a specific terminal, which allows a correct assignment of the necessary means and thus an increase in the terminal's productive efficiency. The final step is a short-term complementarity study of both models, carried out in order to verify the results obtained. It can thus be stated that these prediction methods can be a very useful tool in port planning.
Abstract:
Bayesian networks are a widely used model for representing conditional dependence relationships in multivariate data. Learning them from a data set or from experts has been studied in depth since their conception. In certain scenarios, however, a common model associated with data partitions or sets of experts is required. This is the problem of model fusion or aggregation. Work and results on the aggregation of Bayesian networks are varied in nature, although scarce compared with those on learning. This document proposes two methods for aggregating Gaussian networks, defined as Bayesian networks that model a multivariate Gaussian distribution. The methods presented are effective and accurate, and produce networks with fewer parameters than the individually obtained models. They also constitute a novel approach, incorporating notions traditionally explored separately in the state of the art. Future applications in scalable environments make these methods especially attractive, given their simplicity and the gain in compactness of the resulting representation.---ABSTRACT---Bayesian networks are a widely used model for the representation of conditional dependence relationships among variables in multivariate data. The task of learning them from a data set or from experts has been studied in depth since their conception. However, situations arise in which a consensus model must be obtained from several data partitions or a set of experts. This situation is referred to as model fusion or aggregation. Results on Bayesian network aggregation, although rich in variety, have been scarce compared with those on the learning task. In this context, two methods are proposed for the aggregation of Gaussian Bayesian networks, that is, Bayesian networks whose underlying modelled distribution is a multivariate Gaussian. 
Both methods are effective and precise, and produce networks with fewer parameters than the models obtained by individual learning. They constitute a novel approach, given that they incorporate notions traditionally explored separately in the state of the art. Future applications in scalable computing environments make such methods especially attractive, given their simplicity and the gain in sparsity of the resulting model.
Abstract:
The world economic structure, with decentralized centres of production and consumption and the consequent increase in freight traffic worldwide, creates considerable problems and challenges for the freight transport sector. This situation has led maritime transport to become the cheapest and most suitable mode for transporting goods globally. Seaports are thus nodes of capital importance in the supply chain, serving as the link between two transport systems, sea and land. The increase in activity at seaports produces three undesirable effects: greater road congestion, a lack of open space in port facilities, and a significant environmental impact on the seaports. Dry ports arise to encourage the use of each transport mode in the segments where it is most competitive, and to mitigate these problems by moving part of the activity inland. Moreover, implementing dry ports makes it possible to separate each of the links of the transport chain, so that the most polluting modes with the lowest transport capacity have routes that are as short as possible, or are used only for transporting goods of high added value. Dry ports thus present an opportunity to strengthen intermodal solutions as part of an integrated, sustainable transport chain, promoting rail freight transport. However, their potential is not being exploited, because there is no location-planning methodology that is simple to use and gives clear results for decision-making based on the engineering criteria defined by technical specialists. 
Deciding where to locate a dry port requires an exhaustive analysis of the whole logistics chain, with the aim of transferring as much traffic as possible to the most energy-efficient modes, which are less harmful to the environment. However, this decision must also guarantee the sustainability of the location itself. This doctoral thesis aims to lay the theoretical foundations for developing a Decision Support Tool that makes it possible to establish the most suitable location for building dry ports. The first step is the development of a methodology for evaluating the sustainability and quality of the locations of current dry ports using the following techniques: DELPHI methodology, Bayesian Networks, Multi-Criteria Analysis and Geographic Information Systems. Recognising that determining the most suitable location for different types of facility is an important geographical problem, with significant environmental, social, economic, locational and territorial-accessibility repercussions, a set of 40 variables is considered (grouped into 17 factors and these, in turn, into 4 criteria) that make it possible to evaluate the sustainability of the locations. Multi-Criteria Analysis is used as a way of establishing a score through a scoring algorithm. 
This algorithm is fed by: 1) ratings for each variable extracted from geographic information analysed with ArcGIS (Criteria Assessment Score); 2) the weights of the factors, obtained through a DELPHI questionnaire, a technique characterized by its ability to reach consensus in a group of experts from very different specialities: logistics, sustainability, environmental impact, transport planning and geography; and 3) the weights of the variables, for which Bayesian Networks are used, an important methodological contribution since it is a novel application of this technique. The weights are obtained by exploiting the classification capability of Bayesian Networks, specifically a network built with a greedy algorithm called K2, which makes it possible to prioritize each variable according to the relationships established within the set of variables. The main advantage of using this technique is that it reduces the arbitrariness in setting weights from which Multi-Criteria Analysis techniques usually suffer. As a case study, the sustainability of the 10 existing dry ports in Spain is evaluated. The results of the DELPHI questionnaire show that, when seeking the location of a dry port, greater importance is given to the aspects considered in classical theories of industrial location, mainly economic and accessibility aspects. The remaining factors should not be overlooked, however, as the questionnaire makes clear, since none of the factors has a weight small enough to be disregarded. By contrast, the results of applying Bayesian Networks show a greater importance of the environmental variables, so the sustainability of the locations demands great respect for the natural and urban environment in which they sit. 
Finally, the practical application shows that the location of the dry ports currently existing in Spain is of modest quality, which appears to reflect political decisions more than technical criteria. Policies should therefore be undertaken to generate a collaborative-competitive logistics model in which the different factors considered in this research are evaluated.---ABSTRACT---The global economic structure, with its decentralized production and the consequent increase in freight traffic all over the world, creates considerable problems and challenges for the freight transport sector. This situation has led shipping to become the most suitable and cheapest way to transport goods. Thus, ports are nodes of critical importance in the logistics supply chain, as the link between two transport systems, sea and land. The increase in activity at seaports is producing three undesirable effects: increasing road congestion, lack of open space in port installations and a significant environmental impact on seaports. These adverse effects can be mitigated by moving part of the activity inland. Implementation of dry ports is a possible solution and would also provide an opportunity to strengthen intermodal solutions as part of an integrated and more sustainable transport chain, acting as a link between road and railway networks. In this sense, implementation of dry ports allows the links of the transport chain to be separated, thus facilitating the shortest possible routes for the lowest-capacity and most polluting means of transport. Thus, the decision of where to locate a dry port demands a thorough analysis of the whole logistics supply chain, with the objective of transferring the largest possible volume of goods from road to more energy-efficient means of transport, such as rail or short-sea shipping, which are less harmful to the environment. However, the decision of where to locate a dry port must also ensure the sustainability of the site. 
Thus, the main goal of this dissertation is to research the variables influencing the sustainability of dry port location and how this sustainability can be evaluated. With this objective, we present a methodology for assessing the sustainability of locations using Multi-Criteria Decision Analysis (MCDA) and Bayesian Networks (BNs). MCDA is used as a way to establish a score, whilst BNs were chosen to eliminate arbitrariness in setting the weightings, using a technique that allows us to prioritize each variable according to the relationships established in the set of variables. To determine the relationships between all the variables involved in the decision, which gives us the importance of each factor and variable, we built a BN using the K2 algorithm. To obtain the scores of each variable, we used complete cartography analysed with ArcGIS. Recognising that choosing the most appropriate location for a dry port is a multidisciplinary geographical problem, with significant economic, social and environmental implications, we consider 41 variables (grouped into 17 factors) that respond to this need. As a case study, the sustainability of all 10 existing dry ports in Spain has been evaluated. In this set of logistics platforms, we found that the most important variables for achieving sustainability are those related to environmental protection, so the sustainability of the locations requires great respect for the natural environment and the urban environment in which they are set.
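The scoring step, which combines per-variable scores with factor weights and variable weights, can be sketched as a two-level weighted sum. The abstract does not give the actual formula, so the aggregation below is an assumption, and the function name `location_score`, the factor names and every number are hypothetical.

```python
import math
from typing import Dict, List, Tuple

# factor name -> (factor weight, [(variable weight, variable score), ...])
Factors = Dict[str, Tuple[float, List[Tuple[float, float]]]]


def location_score(factors: Factors) -> float:
    """Assumed aggregation: sum over factors of
    factor_weight * sum(variable_weight * variable_score)."""
    total = 0.0
    for factor_weight, variables in factors.values():
        factor_score = sum(w * s for w, s in variables)
        total += factor_weight * factor_score
    return total


# Hypothetical inputs: factor weights stand in for the expert questionnaire,
# variable weights for the Bayesian network, scores for the GIS analysis.
example: Factors = {
    "environmental": (0.4, [(0.5, 3.0), (0.5, 1.0)]),
    "economic": (0.6, [(1.0, 2.0)]),
}

score = location_score(example)
print(round(score, 3))
```

Keeping the two weight levels separate mirrors the description above, where factor weights and variable weights come from different sources and can be revised independently.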
Abstract:
This thesis presents the design and application of a methodology for determining the parameters for planning logistics nodes and infrastructure in a territory, also considering their impact on the different territorial components, as well as on population development, economic development and the environment, thereby representing an advance in comprehensive territorial planning. The proposed methodology is based on Data Mining, which allows patterns to be discovered behind large volumes of previously processed data. The particular characteristics of data about a territory and its components make territorial studies an ideal field for applying some Data Mining techniques, such as decision trees and Bayesian networks. Decision trees make it possible to represent and categorize schematically a series of predictor variables that help in analysing a target variable. Bayesian networks represent, in a directed acyclic graph, a probabilistic model of variables arranged as parents and children, together with the statistical inference that makes it possible to determine the probability that a stated hypothesis is true; that is, they allow joint probability models to be built that present graphically the relevant dependencies in a data set. As with decision trees, the division of the territory into different administrative units makes Bayesian networks a potential tool for defining the physical characteristics of a specific type of logistics infrastructure, taking into account the territorial, population and economic characteristics of the area where its development is proposed and the possible synergies with other logistics nodes and infrastructure. 
The case study selected for applying the methodology was the Republic of Panama, since this country has some singular characteristics, notably: its high concentration of population in Panama City, which in turn has concentrated the country's economic activity; its high percentage of protected areas, which has limited the structuring of the territory; and the Panama Canal and the container ports adjacent to it. The methodology is divided into three main phases: Phase 1: Determination of the working scenario. 1. Review of the state of the art. 2. Determination and collection of the study variables. Phase 2: Development of the artificial intelligence model. 3. Construction of the decision trees. 4. Construction of the Bayesian networks. Phase 3: Conclusions. 5. Determination of the conclusions. Regarding the planning model applied to the case study, once the methodology had been applied, a model was established consisting of 47 variables that define the logistics planning of Panama; the remaining variables are defined from these, that is, once these are known, the rest follow from them. This planning model, established through the Bayesian network, considers the aspects of sustainable planning (economic, social and environmental) that create synergy with the planning of logistics nodes and infrastructure.---ABSTRACT---The thesis presents the design and application of a methodology for determining the parameters for planning logistics nodes and infrastructure in a territory, also considering their impact on the different territorial components, as well as on population growth, economic development and the environment. The proposed methodology is based on Data Mining, which allows the discovery of patterns behind large volumes of previously processed data. 
The particular characteristics of territorial data make territorial studies an ideal field of knowledge for applying some Data Mining techniques, such as Decision Trees and Bayesian Networks. Decision trees schematically categorize a series of predictor variables for an analysed target variable. Bayesian Networks represent, as a directed acyclic graph, a probabilistic model of variables arranged as parents and children, and support statistical inference to determine the probability that a hypothesis holds. The case study for the application of the methodology is the Republic of Panama. This country has some unique features: a high population concentration in Panama City, a concentration of economic activity, a high percentage of protected areas, and the Panama Canal. The methodology is divided into three main phases: Phase 1: Definition of the working scenario. 1. Review of the state of the art. 2. Determination of the variables. Phase 2: Development of the artificial intelligence model. 3. Construction of decision trees. 4. Construction of Bayesian Networks. Phase 3: Conclusions. 5. Determination of the conclusions. The application of the methodology to the case study established a model composed of 47 variables that define the logistics planning for Panama. This planning model, established through the Bayesian network, considers aspects of sustainable planning and simulates the synergies with the planning of logistics nodes and infrastructure.
Abstract:
Liquefied Natural Gas (LNG) has gradually become an important option for diversifying the Brazilian energy matrix. LNG carriers are responsible for transporting LNG from liquefaction plants to regasification plants. Given the importance, as well as the hazardous nature, of the transport and loading and unloading operations of LNG carriers, not only a good maintenance plan but also a system for detecting the failures that may occur during these processes becomes necessary. This work presents a fault diagnosis method for the loading and unloading operation of LNG carriers using Bayesian Networks together with reliability analysis techniques, such as Failure Modes and Effects Analysis (FMEA) and Fault Tree Analysis (FTA). From the readings of sensors present in the loading and unloading system, the proposed method indicates which components are most likely to have failed. The method provides a well-structured approach to building the Bayesian Networks used in diagnosing system faults.
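Ranking components by their probability of failure given a sensor reading can be shown with the smallest possible network, a single fault node with one sensor child. This is only a sketch of Bayes' rule on such a network, not the method of the work above; the component names, priors and alarm likelihoods are all hypothetical.

```python
from typing import Dict


def posterior_given_alarm(
    prior: Dict[str, float], alarm_likelihood: Dict[str, float]
) -> Dict[str, float]:
    """Bayes' rule: P(state | alarm) is proportional to P(alarm | state) * P(state)."""
    joint = {state: prior[state] * alarm_likelihood[state] for state in prior}
    evidence = sum(joint.values())  # P(alarm)
    return {state: p / evidence for state, p in joint.items()}


# Hypothetical single-fault model: which component, if any, has failed.
prior = {"pump": 0.02, "valve": 0.01, "none": 0.97}
# Hypothetical probability that a low-pressure alarm fires in each state.
alarm_likelihood = {"pump": 0.90, "valve": 0.60, "none": 0.01}

posterior = posterior_given_alarm(prior, alarm_likelihood)
# States ordered from most to least probable once the alarm is observed.
ranking = sorted(posterior, key=posterior.get, reverse=True)
print(ranking, {k: round(v, 3) for k, v in posterior.items()})
```

A realistic diagnostic network would have many component and sensor nodes, with structure taken from analyses such as FMEA and FTA as the abstract describes; the ranking step at the end stays the same.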
Abstract:
A theoretical model was developed to investigate the relationships among subordinate-manager gender combinations, perceived leadership style, experienced frustration and optimism, organization-based self-esteem and organizational commitment. The model was tested within the context of a probabilistic structural model, a discrete Bayesian network, using cross-sectional data from a global pharmaceutical company. The Bayesian network allowed forward inference to assess the relative influence of gender combination and leadership style on the emotions, self-esteem and commitment consequence variables. Further, diagnostics from backward inference were used to assess the relative influence of variables antecedent to organizational commitment. The results showed that gender combination was independent of leadership style and had a direct impact on subordinates' levels of frustration and optimism. Female manager-female subordinate had the largest probability of optimism, while male manager teamed with a male subordinate had the largest probability of frustration. Furthermore, having a female manager teamed up with a male subordinate resulted in the lowest possibility of frustration. However, the findings show that the gender issue is not simply female managers versus male managers, but is concerned with the interaction of the subordinate-manager gender combination and leadership style in a nonlinear manner. (C) 2003 Elsevier Inc. All rights reserved.
Resumo:
This thesis presents an investigation into the application of methods of uncertain reasoning to the biological classification of river water quality. Existing biological methods for reporting river water quality are critically evaluated, and the adoption of a discrete biological classification scheme advocated. Reasoning methods for managing uncertainty are explained, in which the Bayesian and Dempster-Shafer calculi are cited as primary numerical schemes. Elicitation of qualitative knowledge on benthic invertebrates is described. The specificity of benthic response to changes in water quality leads to the adoption of a sensor model of data interpretation, in which a reference set of taxa provide probabilistic support for the biological classes. The significance of sensor states, including that of absence, is shown. Novel techniques of directly eliciting the required uncertainty measures are presented. Bayesian and Dempster-Shafer calculi were used to combine the evidence provided by the sensors. The performance of these automatic classifiers was compared with the expert's own discrete classification of sampled sites. Variations of sensor data weighting, combination order and belief representation were examined for their effect on classification performance. The behaviour of the calculi under evidential conflict and alternative combination rules was investigated. Small variations in evidential weight and the inclusion of evidence from sensors absent from a sample improved classification performance of Bayesian belief and support for singleton hypotheses. For simple support, inclusion of absent evidence decreased classification rate. The performance of Dempster-Shafer classification using consonant belief functions was comparable to Bayesian and singleton belief. 
Recommendations are made for further work in biological classification using uncertain reasoning methods, including the combination of multiple-expert opinion, the use of Bayesian networks, and the integration of classification software within a decision support system for water quality assessment.
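For readers unfamiliar with the Dempster-Shafer calculus, the sketch below shows Dempster's rule of combination for two mass functions. The frame of discernment (`good`, `fair`, `poor`) and the mass values are hypothetical and do not come from the thesis; they stand in for two sensors that each give simple support to a set of classes.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to incompatible hypotheses counts as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; combination undefined")
    # Renormalise by the non-conflicting mass.
    return {s: v / (1.0 - conflict) for s, v in combined.items()}


# Hypothetical frame and simple-support masses, for illustration only.
FRAME = frozenset({"good", "fair", "poor"})
sensor_1 = {frozenset({"good"}): 0.6, FRAME: 0.4}
sensor_2 = {frozenset({"good", "fair"}): 0.5, FRAME: 0.5}

if __name__ == "__main__":
    print(dempster_combine(sensor_1, sensor_2))
```

Mass left on the whole frame represents ignorance, which is the main difference from a Bayesian combination, where all belief must be distributed over singleton classes.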
Resumo:
Based on Bayesian Networks, methods were created that address protein sequence-based bacterial subcellular location prediction. Distinct predictive algorithms for the eight bacterial subcellular locations were created. Several variant methods were explored. These variations included differences in the number of residues considered within the query sequence - which ranged from the N-terminal 10 residues to the whole sequence - and residue representation - which took the form of amino acid composition, percentage amino acid composition, or normalised amino acid composition. The accuracies of the best performing networks were then compared to PSORTB. All individual location methods outperform PSORTB except for the Gram+ cytoplasmic protein predictor, for which accuracies were essentially equal, and for outer membrane protein prediction, where PSORTB outperforms the binary predictor. The method described here is an important new approach to method development for subcellular location prediction. It is also a new, potentially valuable tool for candidate subunit vaccine selection.
Resumo:
Accurate protein structure prediction remains an active objective of research in bioinformatics. Membrane proteins comprise approximately 20% of most genomes. They are, however, poorly tractable targets of experimental structure determination. Their analysis using bioinformatics thus makes an important contribution to their on-going study. Using a method based on Bayesian Networks, which provides a flexible and powerful framework for statistical inference, we have addressed the alignment-free discrimination of membrane from non-membrane proteins. The method successfully identifies prokaryotic and eukaryotic α-helical membrane proteins at 94.4% accuracy, β-barrel proteins at 72.4% accuracy, and distinguishes assorted non-membranous proteins with 85.9% accuracy. The method here is an important potential advance in the computational analysis of membrane protein structure. It represents a useful tool for the characterisation of membrane proteins with a wide variety of potential applications.
Resumo:
We describe a novel and potentially important tool for candidate subunit vaccine selection through in silico reverse-vaccinology. A set of Bayesian networks able to make individual predictions for specific subcellular locations is implemented in three pipelines with different architectures: a parallel implementation with a confidence level-based decision engine and two serial implementations with a hierarchical decision structure, one initially rooted by prediction between membrane types and another rooted by soluble versus membrane prediction. The parallel pipeline outperformed the serial pipeline, but took twice as long to execute. The soluble-rooted serial pipeline outperformed the membrane-rooted predictor. Assessment using genomic test sets was more equivocal, as many more predictions are made by the parallel pipeline, yet the serial pipeline identifies 22 more of the 74 proteins of known location.
Resumo:
Bacterial lipoproteins have many important functions and represent a class of possible vaccine candidates. The prediction of lipoproteins from sequence is thus an important task for computational vaccinology. Naïve-Bayesian networks were trained to identify SpaseII cleavage sites and their preceding signal sequences using a set of 199 distinct lipoprotein sequences. A comprehensive range of sequence models was used to identify the best model for lipoprotein signal sequences. The best performing sequence model was found to be 10 residues in length, including the conserved cysteine lipid attachment site and the nine residues prior to it. The sensitivity of prediction for LipPred was 0.979, while the specificity was 0.742. Here, we describe LipPred, a web server for lipoprotein prediction, available at the URL: http://www.jenner.ac.uk/LipPred/. LipPred is the most accurate method available for the detection of SpaseII-cleaved lipoprotein signal sequences and the prediction of their cleavage sites.
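The 10-residue sequence model described above can be pictured as a window ending on a cysteine. The sketch below only enumerates such candidate windows; it is not LipPred's code, the function name `candidate_windows` is hypothetical, the example sequence is made up, and the scoring of each window by a trained classifier is omitted.

```python
def candidate_windows(sequence, length=10):
    """Yield (cysteine_index, window) for every cysteine that has at least
    length - 1 residues before it; the window ends on the cysteine."""
    for index, residue in enumerate(sequence):
        if residue == "C" and index >= length - 1:
            yield index, sequence[index - length + 1:index + 1]


if __name__ == "__main__":
    # Made-up sequence, used only to show the windowing.
    for position, window in candidate_windows("MKKLLIAAALLAGCSS"):
        print(position, window)
```

Each window would then be passed to a classifier that decides whether the cysteine is a plausible lipid attachment site.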
Resumo:
Our approach to knowledge presentation is based on the idea of an expert system shell. First, we build a graph shell of both possible dependencies and possible actions. Then, reasoning by means of loglinear models, we activate some nodes and some directed links. In this way, a Bayesian network and networks presenting loglinear models are generated.
Resumo:
Feature selection is important in the medical field for many reasons. However, selecting important variables is difficult in the presence of censoring, a distinctive feature of survival data analysis. This paper proposes an approach to deal with the censoring problem in endovascular aortic repair survival data through Bayesian networks. It was merged and embedded with a hybrid feature selection process that combines Cox's univariate analysis with machine learning approaches such as ensemble artificial neural networks to select the most relevant predictive variables. The proposed algorithm was compared with common survival variable selection approaches such as the least absolute shrinkage and selection operator (LASSO) and Akaike information criterion (AIC) methods. The results showed that it was capable of dealing with high censoring in the datasets. Moreover, ensemble classifiers increased the area under the ROC curves of the two datasets collected from two centers located in the United Kingdom separately. Furthermore, ensembles constructed with center 1 data improved the concordance index of center 2 prediction compared to the model built with a single network. Although the final reduced model obtained using the neural networks and their ensembles is larger than those of the other methods, it outperformed the others in both concordance index and sensitivity for center 2 prediction. This indicates that the reduced model is more powerful for cross-center prediction.
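Because the concordance index is the main comparison metric here, a minimal sketch of how it handles censoring may help. This is a plain implementation of Harrell's C-index, not the paper's code; the function name and the toy data are assumptions made for illustration.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index. A pair (i, j) is comparable only when subject i has
    an observed event (events[i] is truthy) strictly before times[j]; it is
    concordant when subject i also has the higher predicted risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data: follow-up times, event indicators (0 = censored), predicted risks.
    print(concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.5, 0.1]))
```

Censored subjects still contribute as the later member of a pair, which is how the index uses partial information from patients whose outcome was not observed.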