9 results for minimum message length

in Deakin Research Online - Australia


Relevance:

100.00%

Abstract:

Efficiently inducing precise causal models that accurately reflect given data sets is the ultimate goal of causal discovery. The algorithms proposed by Dai et al. have demonstrated the ability of the Minimum Message Length (MML) principle to discover Linear Causal Models from training data. To further improve efficiency, this paper incorporates Hoeffding bounds into the learning process. At each step of causal discovery, if a small number of data items is enough to distinguish the better model from the rest, the computation cost is reduced by ignoring the other data items. Experiments with data sets from related benchmark models indicate that the new algorithm achieves a speedup over previous work in terms of learning efficiency while preserving discovery accuracy.
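The early-stopping idea in this abstract can be illustrated with the standard Hoeffding inequality. The sketch below is not the authors' algorithm: the function names, the per-item cost interpretation, and the default `delta` are assumptions made for illustration. It only shows how a bound of this kind decides whether the observed gap between two candidate models is already large enough to stop reading data.

```python
import math


def hoeffding_epsilon(value_range, n, delta):
    """Hoeffding bound: with probability at least 1 - delta, the mean of n
    observations of a variable with the given range lies within this
    distance of the true mean."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_stop_early(mean_best, mean_second, value_range, n, delta=0.05):
    """Hypothetical stopping rule: the two arguments are average per-item
    costs (lower is better) of the best and second-best candidate models
    after n data items. Stop once their gap exceeds the bound."""
    gap = mean_second - mean_best
    return gap > hoeffding_epsilon(value_range, n, delta)


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the paper.
    print(hoeffding_epsilon(1.0, 200, 0.05))
    print(can_stop_early(1.0, 1.2, 1.0, 200))
```

The bound shrinks as `n` grows, so a clear winner is confirmed after few items while close candidates force the learner to read more data.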

Relevance:

100.00%

Abstract:

Determining the causal relations among attributes in a domain is a key task in data mining and knowledge discovery. The Minimum Message Length (MML) principle has demonstrated its ability to discover linear causal models from training data. To improve efficiency, this paper proposes a novel Markov Blanket identification algorithm based on the Lasso estimator. For each variable, the algorithm first generates a Lasso tree, which represents a pruned set of candidate feature sets. The Minimum Message Length principle is then employed to evaluate all of these candidate feature sets, and the feature set with the minimum message length is chosen as the Markov Blanket. Our experimental results demonstrate the effectiveness of this algorithm. In addition, the algorithm can be used to prune the search space of causal discovery and so further reduce the computational cost of score-based causal discovery algorithms.

Relevance:

100.00%

Abstract:

The Information Bottleneck method can be used as a dimensionality reduction approach by grouping “similar” features together [1]. In application, a natural question is how many “feature groups” are appropriate. The dependency on prior knowledge restricts the application of many Information Bottleneck algorithms. In this paper we alleviate this dependency by formulating the parameter determination as a model selection problem and solving it using the minimum message length principle. An efficient encoding scheme is designed to describe the information bottleneck solutions and the original data, and the minimum message length principle is then used to automatically determine the optimal cardinality value. Empirical results in the document clustering scenario indicate that the proposed method works well for determining the optimal parameter value for the information bottleneck method.

Relevance:

100.00%

Abstract:

Discovering a precise causal structure that accurately reflects the given data is one of the most essential tasks in data mining and machine learning. One of the successful causal discovery approaches is the information-theoretic approach using the Minimum Message Length Principle [19]. This paper presents an improved version and further experimental results of the MML discovery algorithm. We introduce a new encoding scheme for measuring the cost of describing the causal structure. The Stirling function is also applied to further reduce the computational complexity, so that the algorithm works more efficiently. The experimental results of the current version of the discovery system show that: (1) the current version is capable of discovering what was discovered by the previous system; (2) the current system is capable of discovering more complicated causal models with a large number of variables; (3) the new version works more efficiently than the previous version in terms of time complexity.
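Assuming the function mentioned in this abstract refers to Stirling's approximation of the factorial (the abstract does not say how it is used), the sketch below shows why it is attractive in message length calculations: the log-factorial terms that arise when counting structures can be replaced by a closed-form expression. The function name is hypothetical, and `math.lgamma` serves only as a reference value.

```python
import math


def stirling_log_factorial(n):
    """Stirling's approximation: ln(n!) ~ n ln n - n + 0.5 ln(2 pi n)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return n * math.log(n) - n + 0.5 * math.log(2.0 * math.pi * n)


if __name__ == "__main__":
    for n in (5, 50, 500):
        exact = math.lgamma(n + 1)  # ln(n!) via the log-gamma function
        approx = stirling_log_factorial(n)
        print(n, exact, approx, exact - approx)
```

The absolute error is roughly 1/(12n), so the approximation becomes more accurate as the number of items grows.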

Relevance:

100.00%

Abstract:

One major difficulty frustrating the application of linear causal models is that they are not easily adapted to cope with discrete data. This is unfortunate, since most real problems involve both continuous and discrete variables. In this paper, we consider a class of graphical models that allow both continuous and discrete variables, and propose a parameter estimation method and a structure discovery algorithm based on Minimum Message Length and parameter estimation. Experimental results are given to demonstrate the potential of this method.

Relevance:

80.00%

Abstract:

Determining the causal structure of a domain is a key task in the area of Data Mining and Knowledge Discovery. The algorithm proposed by Wallace et al. [15] has demonstrated a strong ability to discover Linear Causal Models from given data sets. However, some experiments showed that this algorithm has difficulty discovering linear relations with small deviation, and that it occasionally gives a negative message length, which should not be allowed. In this paper, a more efficient and precise MML encoding scheme is proposed to describe the model structure and the nodes in a Linear Causal Model. The estimation of the different parameters is also derived. Empirical results show that the new algorithm outperforms the previous MML-based algorithm in terms of both speed and precision.

Relevance:

80.00%

Abstract:

Roll forming of ultra-high strength steels (UHSS) and other high strength alloys is an advanced manufacturing methodology that can cold form these materials into complex three-dimensional shapes for lightweight structural applications. Due to their high strength, most of these materials have reduced ductility, which excludes conventional sheet forming methods under cold forming conditions. Roll forming is possible because of its low strains and incremental forming characteristic. Recent research investigates the development of high strength nano-structured aluminum sheet and titanium alloys, as well as their behaviour in roll forming with regard to formability, material behaviour and shape defects. The development of new materials is often limited to small scale samples due to the high preparation costs. In contrast, industrial application needs larger scale tests for validation, especially in roll forming, where a minimum sheet length is required to feed the sample through the roll forming machine. This work describes a novel technique for studying roll forming of a short length of experimental material. DP780 steel strips (500 mm to 1300 mm in length) were welded between two mild steel carrier sheets of similar width and thickness, giving an overall strip length of 2 m. Roll forming trials were performed, and longitudinal edge strain, bow and springback were determined on the welded samples and on samples formed of full length DP780 strip before and after cut off. The experimental results show that this method gives a reasonable approach for predicting material behavior in roll forming transverse to the rolling direction. In contrast, significant differences in longitudinal bow were observed between the welded sections and the sections formed of full length DP780 strip; this indicates that the applicability of the method is limited with regard to predicting longitudinal material behavior in roll forming.

Relevance:

30.00%

Abstract:

Methods are presented for calculating the minimum sample sizes necessary to obtain precise estimates of fungal spore dimensions. Using previously published spore-length data sets for Peronospora species, we demonstrate that 41 to 71 spores need to be measured to estimate the mean length with a reasonable level of statistical precision and resolution. This is extended with examples for calculating the minimum number of spore lengths to measure when matching an undetermined specimen to a known species. Although applied only to spore-length data, all described methods can be applied to any morphometric data that satisfy certain statistical assumptions.
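The abstract does not state which formulas the authors use. As a rough illustration only, the textbook normal-approximation rule for estimating a mean, n = (z * s / E)^2 rounded up, can be sketched as follows. The function name and the input values are hypothetical and are not taken from the Peronospora data.

```python
import math
from statistics import NormalDist


def min_sample_size(std_dev, margin, confidence=0.95):
    """Smallest n such that a normal-approximation confidence interval for
    the mean has half-width at most `margin`, given an assumed standard
    deviation. A generic textbook rule, not the paper's method."""
    if std_dev <= 0 or margin <= 0 or not 0 < confidence < 1:
        raise ValueError("std_dev and margin must be positive, confidence in (0, 1)")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil((z * std_dev / margin) ** 2)


if __name__ == "__main__":
    # Hypothetical spore-length spread and desired precision, in micrometres.
    print(min_sample_size(std_dev=2.0, margin=0.5))
```

Halving the acceptable margin roughly quadruples the required number of measurements, which is why the target precision has to be fixed before counting begins.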

Relevance:

30.00%

Abstract:

1. The lack of consensus concerning the impact of telomere length (TL) dynamics on survival emphasizes the need for additional studies to evaluate the effect of TL on key life-history processes.
2. Using both cross-sectional and longitudinal data, we therefore explored age-specific TL dynamics in a squamate reptile: the frillneck lizard (Chlamydosaurus kingii).
3. Our cross-sectional analyses revealed that young lizards had short TL, TL increased in medium-aged lizards, but TL decreased in older age cohorts, revealing a curvilinear relationship between TL and frillneck lizard age.
4. Neither our cross-sectional nor our longitudinal analyses revealed any association between TL dynamics and lizard survival.
5. We observed a significant positive relationship between TL and telomerase expression (TE), suggesting that TE is a significant determinant of frillneck lizard TL dynamics.
6. Importantly, our longitudinal analyses revealed a positive relationship between initial TL and telomere attrition rate within individual lizards; that is, lizards with short initial telomeres were subjected to reduced telomere attrition rates compared to lizards with long initial TL.
7. Our results strongly suggest that TL and TE dynamics in frillneck lizards are not associated with lizard survival but rather reflect an adaptation to maintain TL above a critical minimum length in order to sustain cellular homeostasis.