959 results for Causal Tree Method
Abstract:
We have developed an alignment-free method that calculates phylogenetic distances using a maximum-likelihood approach for a model of sequence change on patterns that are discovered in unaligned sequences. To evaluate the phylogenetic accuracy of our method, and to conduct a comprehensive comparison of existing alignment-free methods (freely available as Python package decaf+py at http://www.bioinformatics.org.au), we have created a data set of reference trees covering a wide range of phylogenetic distances. Amino acid sequences were evolved along the trees and input to the tested methods; from their calculated distances we inferred trees whose topologies we compared to the reference trees. We find our pattern-based method statistically superior to all other tested alignment-free methods. We also demonstrate the general advantage of alignment-free methods over an approach based on automated alignments when sequences violate the assumption of collinearity. Similarly, we compare methods on empirical data from an existing alignment benchmark set that we used to derive reference distances and trees. Our pattern-based approach yields distances that show a linear relationship to reference distances over a substantially longer range than other alignment-free methods. The pattern-based approach outperforms the other alignment-free methods, and its phylogenetic accuracy is statistically indistinguishable from that of alignment-based distances.
Abstract:
The Tree Augmented Naïve Bayes (TAN) classifier relaxes the sweeping independence assumptions of the Naïve Bayes approach by taking account of conditional probabilities. It does this in a limited sense, by incorporating the conditional probability of each attribute given the class and (at most) one other attribute. The method of boosting has previously proven very effective in improving the performance of Naïve Bayes classifiers and in this paper, we investigate its effectiveness on application to the TAN classifier.
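The factorization implied by this description can be written out as a short worked equation. The symbols below ($C$ for the class, $X_i$ for the attributes, $\mathrm{pa}(i)$ for the index of the single optional attribute parent of $X_i$) are notation assumed here for illustration and do not come from the abstract.

```latex
P(C, X_1, \dots, X_n) \;=\; P(C)\,\prod_{i=1}^{n} P\bigl(X_i \mid C,\, X_{\mathrm{pa}(i)}\bigr)
```

When no attribute has an attribute parent, each factor reduces to $P(X_i \mid C)$ and the expression becomes the usual Naïve Bayes factorization.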
Abstract:
Indexing high-dimensional datasets has attracted extensive attention from researchers over the last decade. Since R-tree-type index structures are known to suffer from the curse of dimensionality, Pyramid-tree-type index structures, which are based on the B-tree, have been proposed to break the curse of dimensionality. However, for high-dimensional data, the number of pyramids is often insufficient to discriminate between data points, and the effectiveness of the approach degrades dramatically as dimensionality increases. In this paper, we focus on one particular aspect of the curse of dimensionality: the surface of a hypercube in a high-dimensional space approaches 100% of the total hypercube volume as the number of dimensions approaches infinity. We propose a new indexing method based on the surface of dimensionality. We prove that the Pyramid-tree technology is a special case of our method. The results of our experiments demonstrate the clear superiority of our novel method.
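The surface-concentration effect mentioned in this abstract can be checked numerically. The sketch below is only an illustration of that geometric fact, not the authors' indexing method; the function name `shell_fraction` and the shell thickness `eps` are assumptions introduced here. For a unit hypercube, the inner cube that stays at least `eps` away from every face has side `1 - 2*eps`, so the fraction of volume within `eps` of the surface is `1 - (1 - 2*eps)**dims`.

```python
def shell_fraction(dims: int, eps: float) -> float:
    """Fraction of a unit hypercube's volume lying within eps of its surface.

    Assumes a unit hypercube [0, 1]^dims and 0 <= eps <= 0.5 (illustrative
    parameters, not taken from the abstract).
    """
    if dims < 1:
        raise ValueError("dims must be at least 1")
    if not 0.0 <= eps <= 0.5:
        raise ValueError("eps must lie in [0, 0.5]")
    inner_side = 1.0 - 2.0 * eps       # side length of the interior cube
    return 1.0 - inner_side ** dims    # everything outside the interior cube


if __name__ == "__main__":
    # The shell fraction grows towards 1 as the dimension increases.
    for d in (1, 2, 10, 50, 100):
        print(d, shell_fraction(d, eps=0.05))
```

With a shell thickness of 5% of the side length, the fraction is 10% in one dimension but exceeds 99.99% in 100 dimensions, which is the behaviour the abstract refers to.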
Abstract:
In this paper we present an efficient k-means clustering algorithm for two-dimensional data. The proposed algorithm reorganizes the dataset into a nested binary tree. Data items are compared at each node with only the two nearest means with respect to each dimension and assigned to the one that has the closer mean. The main intuition of our research is as follows. We build the nested binary tree. Then we scan the data in raster order by in-order traversal of the tree. Lastly, we compare the data item at each node to only the two nearest means to assign it to the intended cluster. In this way we are able to reduce the computational cost significantly by reducing the number of comparisons with means and by minimizing use of the Euclidean distance formula. Our results show that our method can perform the clustering operation much faster than the classical ones. © Springer-Verlag Berlin Heidelberg 2005
Abstract:
Government agencies responsible for riparian environments are assessing the combined utility of field survey and remote sensing for mapping and monitoring indicators of riparian zone condition. The objective of this work was to compare the Tropical Rapid Appraisal of Riparian Condition (TRARC) method to a satellite image based approach. TRARC was developed for rapid assessment of the environmental condition of savanna riparian zones. The comparison assessed mapping accuracy, representativeness of TRARC assessment, cost-effectiveness, and suitability for multi-temporal analysis. Two multi-spectral QuickBird images captured in 2004 and 2005 and coincident field data covering sections of the Daly River in the Northern Territory, Australia, were used in this work. Both field and image data were processed to map riparian health indicators (RHIs) including percentage canopy cover, organic litter, canopy continuity, stream bank stability, and extent of tree clearing. Spectral vegetation indices, image segmentation and supervised classification were used to produce RHI maps. QuickBird image data were used to examine whether the spatial distribution of TRARC transects provided a representative sample of ground based RHI measurements. Results showed that TRARC transects were required to cover at least 3% of the study area to obtain a representative sample. The mapping accuracy and costs of the image based approach were compared to those of the ground based TRARC approach. Results showed that TRARC was more cost-effective at smaller scales (1-100 km), while image based assessment becomes more feasible at regional scales (100-1000 km). Finally, the ability to use both the image and field based approaches for multi-temporal analysis of RHIs was assessed. Change detection analysis demonstrated that image data can provide detailed information on gradual change, while the TRARC method was only able to identify more gross scale changes. In conclusion, results from both methods were considered to complement each other if used at appropriate spatial scales.
Abstract:
Objectives: Effective skin antisepsis and disinfection of medical devices are key factors in preventing many healthcare-acquired infections associated with skin microorganisms, particularly Staphylococcus epidermidis. The aim of this study was to investigate the antimicrobial efficacy of chlorhexidine digluconate (CHG), a widely used antiseptic in clinical practice, alone and in combination with tea tree oil (TTO), eucalyptus oil (EO) and thymol against planktonic and biofilm cultures of S. epidermidis. Methods: Antimicrobial susceptibility assays against S. epidermidis in a suspension and in a biofilm mode of growth were performed with broth microdilution and ATP bioluminescence methods, respectively. Synergy of antimicrobial agents was evaluated with the chequerboard method. Results: CHG exhibited antimicrobial activity against S. epidermidis in both suspension and biofilm (MIC 2–8 mg/L). Of the essential oils, thymol exhibited the greatest antimicrobial efficacy (0.5–4 g/L) against S. epidermidis in suspension and biofilm, followed by TTO (2–16 g/L) and EO (4–64 g/L). MICs of CHG and EO were reduced against S. epidermidis biofilm when in combination (MIC of 8 reduced to 0.25–1 mg/L and MIC of 32–64 reduced to 4 g/L for CHG and EO, respectively). Furthermore, the combination of EO with CHG demonstrated synergistic activity against S. epidermidis biofilm with a fractional inhibitory concentration index of <0.5. Conclusions: The results from this study suggest that there may be a role for essential oils, in particular EO, for improved skin antisepsis when combined with CHG.
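The fractional inhibitory concentration index (FICI) used in chequerboard assays is the sum, over the two agents, of the MIC in combination divided by the MIC alone. The sketch below works through that arithmetic; the function name `fici` is hypothetical, and the pairing of values (CHG 8 to 1 mg/L, EO 32 to 4 g/L) is one assumed combination picked from the ranges quoted in the abstract, not a result reported by the authors.

```python
def fici(mic_a_alone: float, mic_a_combo: float,
         mic_b_alone: float, mic_b_combo: float) -> float:
    """Fractional inhibitory concentration index for two agents.

    Each agent contributes (MIC in combination) / (MIC alone); the units
    cancel within each ratio, so agents may be measured in different units.
    """
    if mic_a_alone <= 0 or mic_b_alone <= 0:
        raise ValueError("MICs of the agents alone must be positive")
    return mic_a_combo / mic_a_alone + mic_b_combo / mic_b_alone


if __name__ == "__main__":
    # Assumed pairing from the quoted ranges: CHG 8 -> 1 mg/L, EO 32 -> 4 g/L.
    index = fici(mic_a_alone=8, mic_a_combo=1, mic_b_alone=32, mic_b_combo=4)
    print(index, "synergy" if index < 0.5 else "no synergy by this threshold")
```

Under these assumed values the index is 1/8 + 4/32 = 0.25, which falls below the 0.5 threshold for synergy cited in the abstract.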
Abstract:
Hazard and operability (HAZOP) studies on chemical process plants are very time consuming, and often tedious, tasks. The requirement for HAZOP studies is that a team of experts systematically analyse every conceivable process deviation, identifying possible causes and any hazards that may result. The systematic nature of the task, and the fact that some team members may be unoccupied for much of the time, can lead to tedium, which in turn may lead to serious errors or omissions. Fault trees are an aid to HAZOP: they present the system failure logic graphically so that the study team can readily assimilate their findings. Fault trees are also useful for identifying design weaknesses, and may additionally be used to estimate the likelihood of hazardous events occurring. The one drawback of fault trees is that they are difficult to generate by hand, because of the sheer size and complexity of modern process plants. The work in this thesis proposes a computer-based method to aid the development of fault trees for chemical process plants. The aim is to produce concise, structured fault trees that are easy for analysts to understand. Standard plant input-output equation models for major process units are modified so that they include ancillary units and pipework. This results in a reduction in the number of nodes required to represent a plant. Control loops and protective systems are modelled as operators which act on process variables. This modelling maintains the functionality of loops, making fault tree generation easier and improving the structure of the fault trees produced. A method, called event ordering, is proposed which allows the magnitude of deviations of controlled or measured variables to be defined in terms of the control loops and protective systems with which they are associated.
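The remark that fault trees can be used to estimate the likelihood of hazardous events can be illustrated with a small evaluator. This is a minimal sketch under the assumption that basic events are statistically independent; it is not the fault-tree generation method proposed in the thesis. The tree encoding, the function name `event_probability`, and the example events and probabilities are all hypothetical.

```python
from math import prod

# A node is either a basic-event probability (a float) or a tuple
# (gate, children) where gate is "AND" or "OR". This encoding is an
# assumption made for the sketch.


def event_probability(node) -> float:
    """Probability of a fault-tree node, assuming independent basic events."""
    if isinstance(node, (int, float)):
        return float(node)
    gate, children = node
    probs = [event_probability(child) for child in children]
    if gate == "AND":
        # All inputs must fail together.
        return prod(probs)
    if gate == "OR":
        # At least one input fails: complement of "none fails".
        return 1.0 - prod(1.0 - p for p in probs)
    raise ValueError(f"unknown gate: {gate!r}")


# Hypothetical example: the top event occurs if both a pump (0.1) and its
# backup (0.2) fail, or if a valve sticks (0.05).
EXAMPLE_TREE = ("OR", [("AND", [0.1, 0.2]), 0.05])

if __name__ == "__main__":
    print(event_probability(EXAMPLE_TREE))
```

For the example tree the AND gate gives 0.02, and the OR gate then gives 1 - 0.98 × 0.95 ≈ 0.069. Real plants have shared causes between events, so the independence assumption would need checking before such numbers are used.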
Abstract:
Hierarchical knowledge structures are frequently used within clinical decision support systems as part of the model for generating intelligent advice. The nodes in the hierarchy inevitably have varying influence on the decision-making processes, which needs to be reflected by parameters. If the model has been elicited from human experts, it is not feasible to ask them to estimate the parameters because there will be so many in even moderately sized structures. This paper describes how the parameters could be obtained from data instead, using only a small number of cases. The original method [1] is applied to a particular web-based clinical decision support system called GRiST, which uses its hierarchical knowledge to quantify the risks associated with mental-health problems. The knowledge was elicited from multidisciplinary mental-health practitioners, but the tree has several thousand nodes, all requiring an estimation of their relative influence on the assessment process. The method described in the paper shows how they can be obtained from about 200 cases instead. It greatly reduces the experts’ elicitation tasks and has the potential for being generalised to similar knowledge-engineering domains where relative weightings of node siblings are part of the parameter space.
Abstract:
In multicriteria decision problems many values must be assigned, such as the importance of the different criteria and the values of the alternatives with respect to subjective criteria. Since these assignments are approximate, it is very important to analyze the sensitivity of results when small modifications of the assignments are made. When solving a multicriteria decision problem, it is desirable to choose a decision function that leads to a solution as stable as possible. We propose here a method based on genetic programming that produces better decision functions than the commonly used ones. The theoretical expectations are validated by case studies. © 2003 Elsevier B.V. All rights reserved.
Abstract:
Data mining projects that use decision trees to classify test cases usually rank the classified cases by the probabilities that the trees provide. A better method is needed for ranking test cases that have already been classified by a binary decision tree, because these probabilities are not always accurate and reliable enough. One reason is that the probability estimates computed by existing decision tree algorithms are always the same for all the different cases in a particular leaf of the decision tree. This is only one reason why the probability estimates given by decision tree algorithms cannot be used as an accurate means of deciding whether a test case has been correctly classified. Isabelle Alvarez has proposed a new method that could be used to rank the test cases that were classified by a binary decision tree [Alvarez, 2004]. In this paper we give the results of a comparison of different ranking methods that are based on the probability estimate, the sensitivity of a particular case, or both.
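The limitation described here, that every case in a leaf receives the same probability estimate, can be seen in a toy example. The sketch below uses a hypothetical one-split tree with invented training counts. The distance to the split threshold is included only as an illustrative per-case score that breaks ties; it is not claimed to be the method of [Alvarez, 2004].

```python
# Hypothetical one-split decision tree on a single numeric feature.
THRESHOLD = 5.0

# Invented training counts per leaf: (negatives, positives).
LEAF_COUNTS = {"left": (8, 2), "right": (1, 9)}


def leaf_of(x: float) -> str:
    """Route a case to its leaf."""
    return "left" if x < THRESHOLD else "right"


def leaf_probability(x: float) -> float:
    """Positive-class frequency of the leaf; identical for all cases in it."""
    negatives, positives = LEAF_COUNTS[leaf_of(x)]
    return positives / (negatives + positives)


def distance_to_split(x: float) -> float:
    """Illustrative per-case score: how far the case lies from the threshold."""
    return abs(x - THRESHOLD)


if __name__ == "__main__":
    for case in (5.1, 9.0):
        print(case, leaf_probability(case), distance_to_split(case))
```

Both cases land in the right-hand leaf and receive the same estimate, so ranking by probability alone cannot separate them, although one lies very close to the decision boundary and the other far from it.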
Abstract:
The problem of recognition on a finite set of events is considered. The generalization ability of classifiers for this problem is studied within the Bayesian approach. A method for specifying a non-uniform prior distribution on recognition tasks is suggested. It takes into account the assumed degree of intersection between classes. The results of the analysis are applied to the pruning of classification trees.
Abstract:
Stylization is a method of ornamental plant use usually applied in urban open space and garden design based on aesthetic considerations. Stylization can be seen as a nature-imitating ornamental plant application which evokes the scenery, rather than an ecological plant application which assists the processes and functions observed in nature. From a different point of view, stylization of natural or semi-natural habitats can sometimes serve as a method for preserving the physiognomy of the plant associations that may be affected by the climate change of the 21st century. The vulnerability of Hungarian habitats has thus far been examined by researchers only from the botanical point of view and not in terms of landscape design value. In Hungary, coniferous forests are edaphic and classified on this basis. The General National Habitat Classification System (Á-NÉR) distinguishes calcareous Scots pine forests and acidofrequent coniferous forests. The latter seem to be highly sensitive to climate change according to ecological models. The physiognomy and species pool of their subtypes are strongly determined by the dominant coniferous species, which can be Norway spruce (Picea abies) or Scots pine (Pinus sylvestris). We discuss the methodology of stylization of climate-sensitive habitats and briefly refer to acidofrequent coniferous forests as a case study. In the course of stylization, the water-demanding coniferous and deciduous tree species of the studied habitat should be substituted by drought-tolerant ones with similar characteristics. A list of the proposed taxa is given.
Abstract:
Stylization is a common method of ornamental plant use that imitates nature and evokes the scenery. This paper presents a previously unexplored aspect of stylization: the method offers the possibility of preserving the physiognomy of habitats that seem likely to vanish due to future climate change. The novelty of the method also rests on the fact that the vulnerability of Hungarian habitats has so far been examined only from the botanical and ecological points of view, and not in terms of landscape design value. In Hungary, acidofrequent mixed forests appear to be highly sensitive to climate change according to ecological models. We discuss the methodology of stylization of climate-sensitive habitats and briefly refer to acidofrequent mixed forests as a case study. We propose that the water-demanding coniferous and deciduous tree species of the studied habitat be substituted by drought-tolerant ones with similar characteristics, and we present an optionally expandable list of these taxa. On this basis, the authors suggest experimental investigation of those proposed taxa whose higher drought tolerance is supported by observations only.
Abstract:
In 2005 we began a multi-year intensive monitoring and assessment study of tropical hardwood hammocks within two distinct hydrologic regions in Everglades National Park, under funding from the CERP Monitoring and Assessment Program. Serving as an Annual Report for 2010, this document reports in detail on the population dynamics and status of tropical hardwood hammocks in Shark Slough and adjacent marl prairies during a 4-year period between 2005 and 2009. The 2005-09 period saw a marked drawdown in marsh water levels (July 2006 - July 2008) and an active hurricane season in 2005, with two hurricanes, Katrina and Wilma, making landfall over south Florida. Thus much of our focus here is on the responses of these forests to annual variation in marsh water level, and on recovery from disturbance. Most of the data are from 16 rectangular permanent plots of 225-625 m², with all trees mapped and tagged, and bi-annual sampling of the tree, sapling, shrub, and herb layers in a nested design. At each visit, canopy photos were taken and later analyzed to determine interannual variation in leaf area index and canopy openness. Three of the plots were sampled at 2-month intervals to gain a better idea of seasonal dynamics in litterfall and litter turnover. Changes in canopy structure were monitored through a vertical line intercept method.
Abstract:
OBJECTIVE: To demonstrate the application of causal inference methods to observational data in the obstetrics and gynecology field, particularly causal modeling and semi-parametric estimation. BACKGROUND: Human immunodeficiency virus (HIV)-positive women are at increased risk for cervical cancer and its treatable precursors. Determining whether potential risk factors such as hormonal contraception are true causes is critical for informing public health strategies as longevity increases among HIV-positive women in developing countries. METHODS: We developed a causal model of the factors related to combined oral contraceptive (COC) use and cervical intraepithelial neoplasia 2 or greater (CIN2+) and modified the model to fit the observed data, drawn from women in a cervical cancer screening program at HIV clinics in Kenya. Assumptions required for substantiation of a causal relationship were assessed. We estimated the population-level association using semi-parametric methods: g-computation, inverse probability of treatment weighting, and targeted maximum likelihood estimation. RESULTS: We identified 2 plausible causal paths from COC use to CIN2+: via HPV infection and via increased disease progression. Study data enabled estimation of the latter only with strong assumptions of no unmeasured confounding. Of 2,519 women under 50 screened per protocol, 219 (8.7%) were diagnosed with CIN2+. Marginal modeling suggested a 2.9% (95% confidence interval 0.1%, 6.9%) increase in prevalence of CIN2+ if all women under 50 were exposed to COC; the significance of this association was sensitive to method of estimation and exposure misclassification. CONCLUSION: Use of causal modeling enabled clear representation of the causal relationship of interest and the assumptions required to estimate that relationship from the observed data. Semi-parametric estimation methods provided flexibility and reduced reliance on correct model form. Although selected results suggest an increased prevalence of CIN2+ associated with COC, evidence is insufficient to conclude causality. Priority areas for future studies to better satisfy causal criteria are identified.
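Two of the estimators named in this abstract, g-computation and inverse probability of treatment weighting (IPTW), can be sketched on a toy table with one binary confounder. The counts below are invented for illustration and are not the study data; the names `TOY_COUNTS`, `g_computation` and `iptw` are hypothetical, and targeted maximum likelihood estimation is not shown. With fully stratified (saturated) estimates, as here, the two approaches return the same marginal risk.

```python
# Invented toy counts, NOT study data:
# (confounder stratum, exposure) -> (number of women, number with outcome)
TOY_COUNTS = {
    (0, 1): (100, 10),
    (0, 0): (300, 15),
    (1, 1): (200, 40),
    (1, 0): (100, 10),
}


def _stratum_sizes(counts):
    """Total number of subjects in each confounder stratum."""
    sizes = {}
    for (stratum, _exposure), (n, _events) in counts.items():
        sizes[stratum] = sizes.get(stratum, 0) + n
    return sizes


def g_computation(counts, exposure):
    """Marginal risk if everyone had the given exposure (standardization)."""
    sizes = _stratum_sizes(counts)
    total = sum(sizes.values())
    risk = 0.0
    for stratum, size in sizes.items():
        n, events = counts[(stratum, exposure)]
        # Stratum-specific risk, averaged over the confounder distribution.
        risk += (size / total) * (events / n)
    return risk


def iptw(counts, exposure):
    """Same marginal risk via inverse probability of treatment weights."""
    sizes = _stratum_sizes(counts)
    total = sum(sizes.values())
    weighted_events = 0.0
    for stratum, size in sizes.items():
        n, events = counts[(stratum, exposure)]
        propensity = n / size          # P(exposure | stratum)
        weighted_events += events / propensity
    return weighted_events / total


if __name__ == "__main__":
    for name, estimator in (("g-computation", g_computation), ("IPTW", iptw)):
        difference = estimator(TOY_COUNTS, 1) - estimator(TOY_COUNTS, 0)
        print(name, round(difference, 4))
```

Both estimators rely on the assumption the abstract highlights, namely that there is no unmeasured confounding beyond the strata used. In practice they diverge once parametric models replace the saturated stratum estimates, which is one reason the abstract reports sensitivity to the method of estimation.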