999 results for causal discovery


Relevance:

100.00%

Publisher:

Abstract:

This paper presents an ensemble MML approach for the discovery of causal models. The component learners are built on MML causal induction methods, and six different ensemble causal induction algorithms are proposed. Our experimental results reveal that (1) the ensemble MML causal induction approach achieves better results than any single learner in terms of learning accuracy and correctness; (2) among all the ensemble causal induction algorithms examined, the weighted voting without seeding algorithm outperforms the rest; and (3) the ensemble CI algorithms appear to alleviate the local minimum problem. The only drawback of this method is that the time complexity increases by a factor of δ, where δ is the ensemble size.
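
As a rough illustration of weighted voting over component learners, the sketch below combines the directed edges proposed by several learned models. The function name `weighted_vote`, the edge-set representation, the weights and the 0.5 threshold are assumptions made for this example; the abstract does not describe the paper's actual voting scheme or how weights are chosen.

```python
from collections import defaultdict


def weighted_vote(models, weights, threshold=0.5):
    """Keep each directed edge whose weighted share of votes exceeds `threshold`.

    models  -- list of sets of (cause, effect) tuples, one set per component learner
    weights -- one non-negative weight per learner (hypothetical, e.g. a quality score)
    """
    if len(models) != len(weights):
        raise ValueError("models and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    # Accumulate the weight of every learner that proposed a given edge.
    score = defaultdict(float)
    for edges, weight in zip(models, weights):
        for edge in edges:
            score[edge] += weight

    # An edge survives only if its normalised score is above the threshold.
    return {edge for edge, s in score.items() if s / total > threshold}


learners = [
    {("A", "B"), ("B", "C")},
    {("A", "B")},
    {("A", "B"), ("C", "B")},
]
consensus = weighted_vote(learners, weights=[1.0, 2.0, 1.0])
print(consensus)
```

Note that voting edge by edge does not by itself guarantee an acyclic result, so a real system would need an additional check on the combined structure.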

Relevance:

80.00%

Publisher:

Abstract:

Efficiently inducing precise causal models that accurately reflect given data sets is the ultimate goal of causal discovery. The algorithms proposed by Dai et al. have demonstrated the ability of the Minimum Message Length (MML) principle to discover Linear Causal Models from training data. To further improve efficiency, this paper incorporates Hoeffding bounds into the learning process. At each step of causal discovery, if a small number of data items is enough to distinguish the better model from the rest, the computation cost is reduced by ignoring the remaining data items. Experiments with data sets from related benchmark models indicate that the new algorithm achieves a speedup over previous work in terms of learning efficiency while preserving discovery accuracy.
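
The early-stopping idea can be sketched with the standard Hoeffding bound: after n observations of a quantity with range R, the sample mean is within ε = sqrt(R² ln(1/δ) / (2n)) of the true mean with probability at least 1 − δ. The code below is a minimal sketch under the assumption that each candidate model has a bounded per-item cost and that lower cost is better; the function names and the decision rule are illustrative and are not taken from the paper.

```python
import math


def hoeffding_epsilon(value_range, delta, n):
    """Hoeffding bound on the deviation of a sample mean after n observations."""
    if n <= 0 or not 0.0 < delta < 1.0:
        raise ValueError("need n > 0 and 0 < delta < 1")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_decide_early(mean_best, mean_second, value_range, delta, n):
    """True if the observed gap between the two lowest mean costs exceeds epsilon.

    mean_best and mean_second are hypothetical average per-item costs of the
    best and second-best candidate models on the n items seen so far.
    """
    gap = mean_second - mean_best
    return gap > hoeffding_epsilon(value_range, delta, n)


eps = hoeffding_epsilon(value_range=1.0, delta=0.05, n=100)
print(round(eps, 4))
print(can_decide_early(10.0, 10.5, value_range=1.0, delta=0.05, n=100))
```

Because ε shrinks as n grows, a clear winner can be accepted after few items, while close candidates force the search to look at more data.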

Relevance:

80.00%

Publisher:

Abstract:

Discovering a precise causal structure that accurately reflects the given data is one of the most essential tasks in data mining and machine learning. One successful causal discovery approach is the information-theoretic approach using the Minimum Message Length principle [19]. This paper presents an improved version of the MML discovery algorithm together with further experimental results. We introduce a new encoding scheme for measuring the cost of describing the causal structure. The Stirling function is also applied to further reduce the computational complexity, so the algorithm runs more efficiently. The experimental results for the current version of the discovery system show that (1) the current version is capable of discovering what was discovered by the previous system; (2) the current system is capable of discovering more complicated causal models with a large number of variables; and (3) the new version works more efficiently than the previous version in terms of time complexity.
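
The function named in the abstract is presumably Stirling's approximation, ln n! ≈ n ln n − n + ½ ln(2πn), which replaces a factorial with a closed-form expression. The abstract does not say where in the encoding it is applied, so the sketch below only shows the standard formula and compares it with the exact value from `math.lgamma`; the function name is an assumption for this example.

```python
import math


def stirling_log_factorial(n):
    """Stirling's approximation to ln(n!) for n >= 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return n * math.log(n) - n + 0.5 * math.log(2.0 * math.pi * n)


for n in (5, 50, 500):
    exact = math.lgamma(n + 1)  # ln(n!) computed via the log-gamma function
    approx = stirling_log_factorial(n)
    print(n, round(exact, 4), round(approx, 4))
```

The absolute error is roughly 1/(12n), so the approximation becomes more accurate as n grows.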

Relevance:

80.00%

Publisher:

Abstract:

This paper presents a Minimal Causal Model Inducer that can be used for reliable knowledge discovery. The minimal-model semantics of causal discovery is an essential concept for identifying a best-fitting model, in the sense of a model that is satisfactorily consistent with the given data and is the simpler, less expressive one. Consistency is one of the major measures of reliability in knowledge discovery. Developing an algorithm able to derive a minimal model is therefore an interesting topic in the area of reliable knowledge discovery. The various causal induction algorithms and tools developed so far cannot guarantee that the derived model is minimal and consistent. It was proved that the MML induction approach introduced by Wallace, Keven and Honghua Dai is a minimal causal model learner. In this paper, we further prove that the developed minimal causal model learner is reliable in the sense of satisfactory consistency. The experimental results obtained from tests on a number of both artificial and real models confirm this theoretical result.

Relevance:

70.00%

Publisher:

Abstract:

Software reuse is an important topic due to its potential benefits in increasing product quality and decreasing cost. Although more and more people are aware that nontechnical issues as well as technical ones are important to the success of software reuse, it is still uncertain which factors have a direct effect on the success of reuse. In this paper, we apply a causal discovery algorithm to the software reuse survey data [2]. An ensemble strategy is incorporated to locate a probable causal model structure for software reuse and to find all the factors that have a direct effect on the success of reuse. Our discovery results reinforce some conclusions of Morisio et al. and yield some new conclusions that might significantly improve the odds of a reuse project succeeding.

Relevance:

70.00%

Publisher:

Abstract:

Efficiently inducing precise causal models that accurately reflect given data sets is the ultimate goal of causal discovery. The algorithm proposed by Wallace et al. [10] has demonstrated its ability to discover Linear Causal Models from data. To explore ways to improve efficiency, this research examines three different encoding schemes and four search strategies. The experimental results reveal that (1) the specifying-parents encoding method is the best of the three encoding methods examined; and (2) in the discovery of linear causal models, local hill climbing works very well compared to other, more sophisticated methods such as Markov Chain Monte Carlo (MCMC), Genetic Algorithm (GA) and Parallel MCMC search.

Relevance:

70.00%

Publisher:

Abstract:

This paper presents an examination report on the performance of the improved MML-based causal model discovery algorithm. We first describe our improvement to the causal discovery algorithm, which introduces a new encoding scheme for measuring the cost of describing the causal structure. The Stirling function is also applied to further reduce the computational complexity, so the algorithm runs more efficiently. This is followed by a detailed examination of the performance of the improved discovery algorithm. The experimental results for the current version of the discovery system show that (1) the current version is capable of discovering what was discovered by the previous system; (2) the current system is capable of discovering more complicated causal networks with a large number of variables; and (3) the new version works more efficiently than the previous version in terms of time complexity.

Relevance:

70.00%

Publisher:

Abstract:

Determining the causal relation among attributes in a domain is a key task in data mining and knowledge discovery. In this paper, we apply a causal discovery algorithm to the business traveler expenditure survey data [1]. A general class of causal models is adopted to discover the causal relationships among continuous and discrete variables, so that all the factors that have a direct effect on the expense pattern of travelers can be detected. Our discovery results reinforce some conclusions of the rough set analysis and yield some new conclusions that might significantly improve the understanding of the expenditure behavior of business travelers.

Relevance:

70.00%

Publisher:

Abstract:

Automatic causal discovery is a challenging research problem of extraordinary significance in scientific research and in many real-world problems where recovering causes, effects and their causal relationships is an essential task. This paper first introduces causality and perspectives on causal discovery. It then analyzes the three major approaches proposed in recent decades for the automatic discovery of causal models from given data. Finally, it analyzes the capability and applicability of the different approaches and concludes with their potential and directions for future research. © 2013 IEEE.

Relevance:

60.00%

Publisher:

Abstract:

Determining the causal relation among attributes in a domain is a key task in data mining and knowledge discovery. The Minimum Message Length (MML) principle has demonstrated its ability to discover linear causal models from training data. To explore ways to improve efficiency, this paper proposes a novel Markov Blanket identification algorithm based on the Lasso estimator. For each variable, the algorithm first generates a Lasso tree, which represents a pruned candidate set of possible feature sets. The Minimum Message Length principle is then employed to evaluate all the candidate feature sets, and the feature set with the minimum message length is chosen as the Markov Blanket. Our experimental results demonstrate the ability of this algorithm. In addition, the algorithm can be used to prune the search space of causal discovery and thereby further reduce the computational cost of score-based causal discovery algorithms.

Relevance:

60.00%

Publisher:

Abstract:

Determining the causal structure of a domain is frequently a key task in the area of Data Mining and Knowledge Discovery. This paper introduces ensemble learning into linear causal model discovery, then examines several algorithms based on different ensemble strategies, including Bagging, AdaBoost and GASEN. Experimental results show that (1) an ensemble discovery algorithm can achieve better accuracy than an individual causal discovery algorithm; (2) among all the ensemble discovery algorithms examined, the BWV algorithm, which uses a simple Bagging strategy, works excellently compared to other, more sophisticated ensemble strategies; and (3) the ensemble method can also improve the stability of parameter estimation. In addition, the ensemble discovery algorithm is amenable to parallel and distributed processing, which is important for data mining in large data sets.

Relevance:

40.00%

Publisher:

Abstract:

Determining the causal structure of a domain is a key task in the area of Data Mining and Knowledge Discovery. The algorithm proposed by Wallace et al. [15] has demonstrated a strong ability to discover Linear Causal Models from given data sets. However, some experiments showed that this algorithm has difficulty discovering linear relations with small deviation, and that it occasionally gives a negative message length, which should not be allowed. In this paper, a more efficient and precise MML encoding scheme is proposed to describe the model structure and the nodes in a Linear Causal Model. The estimation of the different parameters is also derived. Empirical results show that the new algorithm outperforms the previous MML-based algorithm in terms of both speed and precision.

Relevance:

40.00%

Publisher:

Abstract:

One major difficulty frustrating the application of linear causal models is that they are not easily adapted to cope with discrete data. This is unfortunate, since most real problems involve both continuous and discrete variables. In this paper, we consider a class of graphical models that allow both continuous and discrete variables, and we propose a parameter estimation method and a structure discovery algorithm based on Minimum Message Length and parameter estimation. Experimental results are given to demonstrate the potential of this method in applications.

Relevance:

40.00%

Publisher:

Abstract:

This thesis made an outstanding contribution to automating the discovery of linear causal models. It introduced a highly efficient discovery algorithm that implements new encoding, ensemble and acceleration strategies. Theoretical research and experimental work showed that the new discovery algorithm outperforms the previous system in both accuracy and efficiency.