46 results for bayesian bottleneck

in Deakin Research Online - Australia


Relevance:

60.00%

Publisher:

Abstract:

Fire is a major disturbance process in many ecosystems world-wide, resulting in spatially and temporally dynamic landscapes. For populations occupying such environments, fire-induced landscape change is likely to influence population processes, and genetic patterns and structure among populations. The Mallee Emu-wren Stipiturus mallee is an endangered passerine whose global distribution is confined to fire-prone, semi-arid mallee shrublands in south-eastern Australia. This species, with poor capacity for dispersal, has undergone a precipitous reduction in distribution and numbers in recent decades. We used genetic analyses of 11 length-variable, nuclear loci to examine population structure and processes within this species, across its global range. Populations of the Mallee Emu-wren exhibited a low to moderate level of genetic diversity, and evidence of bottlenecks and genetic drift. Bayesian clustering methods revealed weak genetic population structure across the species' range. The direct effects of large fires, together with associated changes in the spatial and temporal patterns of suitable habitat, have the potential to cause population bottlenecks, serial local extinctions and subsequent recolonisation, all of which may interact to erode and homogenise genetic diversity in this species. Movement among temporally and spatially shifting habitat appears to maintain long-term genetic connectivity. A plausible explanation for the observed genetic patterns is that, following extensive fires, recolonisation exceeds in-situ survival as the primary driver of population recovery in this species. These findings suggest that dynamic, fire-dominated landscapes can drive genetic homogenisation of populations of species with low mobility and specialised habitat that otherwise would be expected to show strongly structured populations. Such effects must be considered when formulating management actions to conserve species in fire-prone systems.

Relevance:

20.00%

Publisher:

Abstract:

The Information Bottleneck method can be used as a dimensionality reduction approach by grouping “similar” features together [1]. In application, a natural question is how many “feature groups” are appropriate. The dependency on prior knowledge restricts the applications of many Information Bottleneck algorithms. In this paper we alleviate this dependency by formulating the parameter determination as a model selection problem, and solve it using the minimum message length principle. An efficient encoding scheme is designed to describe the information bottleneck solutions and the original data, then the minimum message length principle is incorporated to automatically determine the optimal cardinality value. Empirical results in the documentation clustering scenario indicate that the proposed method works well for determining the optimal parameter value for the information bottleneck method.

Relevance:

20.00%

Publisher:

Abstract:

This paper formulates the problem of learning Bayesian network structures from data as determining the structure that best approximates the probability distribution indicated by the data. A new metric, the Penalized Mutual Information metric, is proposed, and an evolutionary algorithm is designed to search for the best structure among alternatives. The experimental results show that this approach is reliable and promising.

Relevance:

20.00%

Publisher:

Abstract:

Supervised machine learning techniques generally require that the training set on which learning is based contain sufficient examples representative of the target concept, as well as known counter-examples of the concept; however, in many application domains it is not possible to supply a set of labeled counter-examples. This paper proposes an objective function based on Bayesian likelihoods of necessity and sufficiency. This function can be used to guide search towards the discovery of a concept description given only a set of labeled positive examples of the target concept and a corpus of unlabeled examples. Results of experiments performed on several datasets from the UCI repository show that the technique achieves comparable accuracy to conventional supervised learning techniques, despite the fact that the latter require a set of labeled counter-examples to be supplied. The technique can be applied in many domains in which the provision of labeled counter-examples is problematic.

Relevance:

20.00%

Publisher:

Abstract:

Mineral Prospectivity Mapping is the process of combining maps containing different geoscientific data sets to produce a single map depicting areas ranked according to their potential to host mineral deposits of a particular type. This paper outlines two approaches for deriving a function which can be used to assign to each cell in the study area a value representing the posterior probability that the cell contains a deposit of the sought-after mineral. One approach is based on estimating probability density functions (pdfs); the second uses multilayer perceptrons (MLPs). Results are provided from applying these approaches to geoscientific datasets covering a region in North Western Victoria, Australia. The results demonstrate that while both the Bayesian approach and the MLP approach yield similar results when the number of input dimensions is small, the Bayesian approach rapidly becomes unstable as the number of input dimensions increases, with the resulting maps displaying high sensitivity to the number of mixtures used to model the distributions. However, despite the fact that Bayesian assigned values cannot be interpreted as posterior probabilities in high dimensional input spaces, the pixel favorability rankings produced by the two methods are similar.
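
The pdf-based approach in the abstract above rests on Bayes' rule: a cell's posterior probability of hosting a deposit is obtained from class-conditional densities and a prior. The following is only a minimal sketch of that rule for a single input dimension. The function names, the Gaussian densities, and the numeric values are assumptions made for illustration and are not taken from the paper, which models the distributions with mixtures.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution (illustrative stand-in for an estimated pdf)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior_deposit(x, prior, pdf_deposit, pdf_barren):
    """Bayes' rule: P(deposit | x) from two class-conditional densities and a prior.

    prior       -- assumed prior probability that a cell hosts a deposit
    pdf_deposit -- callable returning p(x | deposit)
    pdf_barren  -- callable returning p(x | no deposit)
    """
    numerator = prior * pdf_deposit(x)
    denominator = numerator + (1.0 - prior) * pdf_barren(x)
    if denominator == 0.0:
        # Both densities underflowed to zero; the data say nothing, so fall back to the prior.
        return prior
    return numerator / denominator


# Hypothetical densities for one geoscientific measurement.
def deposit_density(x):
    return gaussian_pdf(x, mean=2.0, std=1.0)


def barren_density(x):
    return gaussian_pdf(x, mean=0.0, std=1.0)


def cell_posterior(x, prior=0.01):
    """Posterior for one cell under the hypothetical densities above."""
    return posterior_deposit(x, prior, deposit_density, barren_density)
```

Where the two densities are equal, the posterior stays at the prior; it rises above the prior only where the deposit density dominates. With more input dimensions the densities become harder to estimate, which is consistent with the instability the abstract reports.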

Relevance:

20.00%

Publisher:

Abstract:

Clustering of multivariate data is a commonly used technique in ecology, and many approaches to clustering are available. The results from a clustering algorithm are uncertain, but few clustering approaches explicitly acknowledge this uncertainty. One exception is Bayesian mixture modelling, which treats all results probabilistically, and allows comparison of multiple plausible classifications of the same data set. We used this method, implemented in the AutoClass program, to classify catchments (watersheds) in the Murray Darling Basin (MDB), Australia, based on their physiographic characteristics (e.g. slope, rainfall, lithology). The most likely classification found nine classes of catchments. Members of each class were aggregated geographically within the MDB. Rainfall and slope were the two most important variables that defined classes. The second-most likely classification was very similar to the first, but had one fewer class. Increasing the nominal uncertainty of continuous data resulted in a most likely classification with five classes, which were again aggregated geographically. Membership probabilities suggested that a small number of cases could be members of either of two classes. Such cases were located on the edges of groups of catchments that belonged to one class, with a group belonging to the second-most likely class adjacent. A comparison of the Bayesian approach to a distance-based deterministic method showed that the Bayesian mixture model produced solutions that were more spatially cohesive and intuitively appealing. The probabilistic presentation of results from the Bayesian classification allows richer interpretation, including decisions on how to treat cases that are intermediate between two or more classes, and whether to consider more than one classification. The explicit consideration and presentation of uncertainty makes this approach useful for ecological investigations, where both data and expectations are often highly uncertain.

Relevance:

20.00%

Publisher:

Abstract:

A major challenge facing freshwater ecologists and managers is the development of models that link stream ecological condition to catchment scale effects, such as land use. Previous attempts to make such models have followed two general approaches. The bottom-up approach employs mechanistic models, which can quickly become too complex to be useful. The top-down approach employs empirical models derived from large data sets, and has often suffered from large amounts of unexplained variation in stream condition.

We believe that the lack of success of both modelling approaches may be at least partly explained by scientists considering too wide a breadth of catchment type. Thus, we believe that by stratifying large sets of catchments into groups of similar types prior to modelling, both types of models may be improved. This paper describes preliminary work using a Bayesian classification software package, ‘Autoclass’ (Cheeseman and Stutz 1996) to create classes of catchments within the Murray Darling Basin based on physiographic data.

Autoclass uses a model-based classification method that employs finite mixture modelling and trades off model fit versus complexity, leading to a parsimonious solution. The software provides information on the posterior probability that the classification is ‘correct’ and also probabilities for alternative classifications. The importance of each attribute in defining the individual classes is calculated and presented, assisting description of the classes. Each case is ‘assigned’ to a class based on membership probability, but the probability of membership of other classes is also provided. This feature deals very well with cases that do not fit neatly into a larger class. Lastly, Autoclass requires the user to specify the measurement error of continuous variables.

Catchments were derived from the Australian digital elevation model. Physiographic data were derived from national spatial data sets. There was very little information on measurement errors for the spatial data, and so a conservative error of 5% of data range was adopted for all continuous attributes. The incorporation of uncertainty into spatial data sets remains a research challenge.

The results of the classification were very encouraging. The software found nine classes of catchments in the Murray Darling Basin. The classes grouped together geographically, and followed altitude and latitude gradients, despite the fact that these variables were not included in the classification. Descriptions of the classes reveal very different physiographic environments, ranging from dry and flat catchments (i.e. lowlands), through to wet and hilly catchments (i.e. mountainous areas). Rainfall and slope were two important discriminators between classes. These two attributes, in particular, will affect the ways in which the stream interacts with the catchment, and can thus be expected to modify the effects of land use change on ecological condition. Thus, realistic models of the effects of land use change on streams would differ between the different types of catchments, and sound management practices will differ.

A small number of catchments were assigned to their primary class with relatively low probability. These catchments lie on the boundaries of groups of catchments, with the second most likely class being an adjacent group. The locations of these ‘uncertain’ catchments show that the Bayesian classification dealt well with cases that do not fit neatly into larger classes.

Although the results are intuitive, we cannot yet assess whether the classifications described in this paper would assist the modelling of catchment scale effects on stream ecological condition. It is most likely that catchment classification and modelling will be an iterative process, where the needs of the model are used to guide classification, and the results of classifications used to suggest further refinements to models.
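
The membership probabilities described in this abstract can be illustrated with a small finite-mixture sketch. This is not the Autoclass implementation: the one-dimensional Gaussian classes, their weights, and the rainfall values are hypothetical, chosen only to show how a case lying between two classes receives split membership.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a 1-D normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def membership_probabilities(x, classes):
    """Posterior probability that case x belongs to each mixture class.

    classes -- list of (weight, mean, std) tuples; weights are assumed positive
               and to sum to 1.
    """
    joint = [w * normal_pdf(x, m, s) for w, m, s in classes]
    total = sum(joint)
    return [j / total for j in joint]


# Hypothetical classes on annual rainfall (mm): a dry class and a wet class.
RAINFALL_CLASSES = [(0.5, 300.0, 50.0), (0.5, 600.0, 50.0)]

# A catchment well inside the dry class is assigned to it almost with certainty.
clear_case = membership_probabilities(320.0, RAINFALL_CLASSES)

# A catchment midway between the classes is an 'uncertain' case with split membership.
boundary_case = membership_probabilities(450.0, RAINFALL_CLASSES)
```

Reporting the full probability vector, rather than only the most likely class, is what lets boundary catchments be flagged as uncertain instead of being forced into one class.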

Relevance:

20.00%

Publisher:

Abstract:

The Information Bottleneck method aims to extract a compact representation which preserves the maximum relevant information. The sub-optimality of the agglomerative Information Bottleneck (aIB) algorithm restricts the applications of the Information Bottleneck method. In this paper, the concept of density-based chains is adopted to evaluate the information loss among the neighbors of an element, rather than the information loss between pairs of elements. The DaIB algorithm is then presented to alleviate the sub-optimality problem in aIB while simultaneously keeping the useful hierarchical clustering tree-structure. The experimental results on the benchmark data sets show that the DaIB algorithm can obtain more relevant information and higher precision than the aIB algorithm, and the paired t-test indicates that these improvements are statistically significant.
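
For context on the pairwise information loss that aIB evaluates, the sketch below uses the commonly stated aIB merge criterion: the loss from merging two clusters is their combined weight times the weighted Jensen-Shannon divergence between their conditional distributions over the relevance variable. It is a greedy single step only, it is not the DaIB algorithm from the abstract, and the function names and toy data are assumptions.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats; terms with p[i] == 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def merge_cost(w_i, p_i, w_j, p_j):
    """Information lost by merging two clusters.

    w_i, w_j -- cluster probabilities p(z_i), p(z_j); both must be positive
    p_i, p_j -- conditional distributions p(y | z_i), p(y | z_j) as lists
    """
    w = w_i + w_j
    a, b = w_i / w, w_j / w
    mixture = [a * x + b * y for x, y in zip(p_i, p_j)]
    js = a * kl_divergence(p_i, mixture) + b * kl_divergence(p_j, mixture)
    return w * js


def best_merge(weights, conditionals):
    """Return (cost, i, j) for the pair whose merge loses the least information."""
    best = None
    for i in range(len(weights)):
        for j in range(i + 1, len(weights)):
            cost = merge_cost(weights[i], conditionals[i], weights[j], conditionals[j])
            if best is None or cost < best[0]:
                best = (cost, i, j)
    return best


# Toy example: clusters 0 and 1 have the same p(y | z), so merging them loses
# (almost exactly) no information and they are merged first.
toy_weights = [0.4, 0.3, 0.3]
toy_conditionals = [[0.9, 0.1], [0.9, 0.1], [0.1, 0.9]]
toy_best = best_merge(toy_weights, toy_conditionals)
```

Because each step commits to the locally cheapest pair, the resulting hierarchy need not be globally optimal, which is the sub-optimality the abstract refers to.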

Relevance:

20.00%

Publisher:

Abstract:

Clustering with the agglomerative Information Bottleneck (aIB) algorithm suffers from the sub-optimality problem: it cannot guarantee that as much relative information as possible is preserved. To handle this problem, we introduce a density connectivity chain, by which we consider not only the information between two data elements, but also the information among the neighbors of a data element. Based on this idea, we propose DCIB, a Density Connectivity Information Bottleneck algorithm that applies the Information Bottleneck method to quantify the relative information during the clustering procedure. As a hierarchical algorithm, the DCIB algorithm produces a pruned clustering tree-structure and yields clustering results of different sizes in a single execution. The experimental results on documentation clustering indicate that the DCIB algorithm can preserve more relative information and achieve higher precision than the aIB algorithm.

Relevance:

20.00%

Publisher:

Abstract:

The thesis examined the inter-rater reliability and procedural validity of four computerised Bayesian belief networks (BBNs) which were developed to assist with the diagnosis of psychotic disorders. The results of this research indicated that BBNs can significantly improve diagnostic reliability and may represent an important advance over current diagnostic methods. The professional portfolio investigated, through the presentation of case studies and review of literature relevant to each case study, how comorbidity and context of depression may impact on cognitive behavioural therapy treatment.