808 results for Semi-supervised clustering


Relevance:

20.00% 20.00%

Publisher:

Abstract:

We present some additions to a fuzzy variable radius niche technique called Dynamic Niche Clustering (DNC) (Gan and Warwick, 1999; 2000; 2001) that enable the identification and creation of niches of arbitrary shape through a mechanism called Niche Linkage. We show that by using this mechanism it is possible to attain better feature extraction from the underlying population.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper describes recent developments and improvements made to the variable radius niching technique called Dynamic Niche Clustering (DNC). DNC is a fitness-sharing-based technique that employs a separate population of overlapping fuzzy niches with independent radii, which operate in the decoded parameter space and are maintained alongside the normal GA population. We describe a speedup process that can be applied to the initial generation and that greatly reduces the complexity of the initial stages. A split operator is also introduced to counteract the excessive growth of niches, and it is shown that this improves the overall robustness of the technique. Finally, the effect of local elitism is documented and compared to the performance of the basic DNC technique on a selection of 2D test functions. The paper concludes with a view to future work to be undertaken on the technique.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Recent experimental evidence underlines the importance of reduced diffusivity in amorphous semi-solid or glassy atmospheric aerosols. This paper investigates the impact of diffusivity on the ageing of multi-component reactive organic particles approximating atmospheric cooking aerosols. We apply and extend the recently developed KMSUB model in a study of a 12-component mixture containing oleic and palmitoleic acids. We demonstrate that changes in the diffusivity may explain the evolution of chemical loss rates in ageing semi-solid particles, and we resolve surface and bulk processes under transient reaction conditions considering diffusivities altered by oligomerisation. This new model treatment allows prediction of the ageing of mixed organic multi-component aerosols over atmospherically relevant timescales and conditions. We illustrate the impact of changing diffusivity on the chemical half-life of reactive components in semi-solid particles, and we demonstrate how solidification and crust formation at the particle surface can affect the chemical transformation of organic aerosols.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A UK field experiment compared a complete factorial combination of three backgrounds (cvs Mercia, Maris Huntsman and Maris Widgeon), three alleles at the Rht-B1 locus as Near Isogenic Lines (NILs: rht-B1a (tall), Rht-B1b (semi-dwarf), Rht-B1c (severe dwarf)) and four nitrogen (N) fertilizer application rates (0, 100, 200 and 350 kg N/ha). Linear+exponential functions were fitted to grain yield (GY) and nitrogen-use efficiency (NUE; GY/available N) responses to N rate. Averaged over N rate and background, Rht-B1b conferred significantly (P<0.05) greater GY, NUE, N uptake efficiency (NUpE; N in above ground crop / available N) and N utilization efficiency (NUtEg; GY / N in above ground crop) compared with rht-B1a and Rht-B1c. However, the economically optimal N rate (Nopt) for N:grain price ratios of 3.5:1 to 10:1 was also greater for Rht-B1b, and because NUE, NUpE and NUtEg all declined with N rate, Rht-B1b failed to increase NUE or its components at Nopt. The adoption of semi-dwarf lines in temperate and humid regions, and the greater N rates that such adoption justifies economically, greatly increases land-use efficiency, but not necessarily NUE.
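
The three efficiency measures defined in this abstract are linked by a simple identity, which is why NUpE and NUtEg can be read as the components of NUE. The symbols N_crop (N in above ground crop) and N_avail (available N) are shorthand introduced here for illustration and are not the authors' notation:

```latex
\mathrm{NUE}
  = \frac{\mathrm{GY}}{N_{\mathrm{avail}}}
  = \underbrace{\frac{N_{\mathrm{crop}}}{N_{\mathrm{avail}}}}_{\mathrm{NUpE}}
    \times
    \underbrace{\frac{\mathrm{GY}}{N_{\mathrm{crop}}}}_{\mathrm{NUtEg}}
```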

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The K-Means algorithm for cluster analysis is one of the most influential and popular data mining methods. Its straightforward parallel formulation is well suited to distributed memory systems with reliable interconnection networks. However, in large-scale geographically distributed systems the straightforward parallel algorithm can be rendered useless by a single communication failure or by high latency in communication paths. This work proposes a fully decentralised algorithm (Epidemic K-Means) which does not require global communication and is intrinsically fault tolerant. The proposed distributed K-Means algorithm provides a clustering solution which can approximate the solution of an ideal centralised algorithm over the aggregated data as closely as desired. A comparative performance analysis is carried out against state-of-the-art distributed K-Means algorithms based on sampling methods. The experimental analysis confirms that the proposed algorithm is a practical and accurate distributed K-Means implementation for networked systems of very large and extreme scale.
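
The idea of replacing global communication with local exchanges can be illustrated with a minimal sketch. It is not the authors' Epidemic K-Means; it assumes 1-D data, a simulated network in which any two nodes can exchange messages, and plain pairwise averaging as the gossip primitive. All function names are hypothetical.

```python
import random

def local_stats(points, centroids):
    """Per-cluster sums and counts of one node's 1-D points (nearest-centroid assignment)."""
    sums = [0.0] * len(centroids)
    counts = [0.0] * len(centroids)
    for p in points:
        k = min(range(len(centroids)), key=lambda j: abs(p - centroids[j]))
        sums[k] += p
        counts[k] += 1.0
    return sums, counts

def gossip_average(vectors, rounds, rng):
    """Pairwise averaging: each exchange replaces two nodes' vectors by their mean."""
    vectors = [list(v) for v in vectors]
    for _ in range(rounds):
        i, j = rng.sample(range(len(vectors)), 2)
        mean = [(a + b) / 2.0 for a, b in zip(vectors[i], vectors[j])]
        vectors[i], vectors[j] = mean, list(mean)
    return vectors

def gossip_kmeans_step(node_data, centroids, rounds, rng):
    """One K-Means update; returns one centroid estimate per node, using no global reduction."""
    k = len(centroids)
    stats = []
    for points in node_data:
        sums, counts = local_stats(points, centroids)
        stats.append(sums + counts)  # concatenate into one vector per node
    averaged = gossip_average(stats, rounds, rng)
    # The ratio of averaged sums to averaged counts approximates the global cluster mean.
    return [
        [v[j] / v[k + j] if v[k + j] > 0 else centroids[j] for j in range(k)]
        for v in averaged
    ]

node_data = [[0.0, 1.0], [10.0, 11.0], [0.5, 10.5], [1.5, 9.5]]
estimates = gossip_kmeans_step(node_data, [0.0, 10.0], rounds=200, rng=random.Random(0))
```

Because pairwise averaging preserves the network-wide sum of every vector component, each node's estimate approaches the centralised centroid update as the number of exchanges grows, which mirrors the "as closely as desired" property described in the abstract.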

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This dissertation deals with aspects of sequential data assimilation (in particular ensemble Kalman filtering) and numerical weather forecasting. In the first part, the recently formulated Ensemble Kalman-Bucy filter (EnKBF) is revisited. It is shown that the previously used numerical integration scheme fails when the magnitude of the background error covariance grows beyond that of the observational error covariance in the forecast window. Therefore, we present a suitable integration scheme that handles the stiffening of the differential equations involved and does not incur additional computational expense. Moreover, a transform-based alternative to the EnKBF is developed: under this scheme, the operations are performed in the ensemble space instead of in the state space. Advantages of this formulation are explained. For the first time, the EnKBF is implemented in an atmospheric model. The second part of this work deals with ensemble clustering, a phenomenon that arises when performing data assimilation using deterministic ensemble square root filters (EnSRFs) in highly nonlinear forecast models. Namely, an M-member ensemble detaches into an outlier and a cluster of M-1 members. Previous works may suggest that this issue represents a failure of EnSRFs; this work dispels that notion. It is shown that ensemble clustering can also be reversed by nonlinear processes, in particular the alternation between nonlinear expansion and compression of the ensemble for different regions of the attractor. Some EnSRFs that use random rotations have been developed to overcome this issue; these formulations are analyzed and their advantages and disadvantages with respect to common EnSRFs are discussed. The third and last part contains the implementation of the Robert-Asselin-Williams (RAW) filter in an atmospheric model.
The RAW filter is an improvement to the widely popular Robert-Asselin filter that successfully suppresses spurious computational waves while avoiding any distortion in the mean value of the function. Using statistical significance tests both at the local and field level, it is shown that the climatology of the SPEEDY model is not modified by the changed time stepping scheme; hence, no retuning of the parameterizations is required. It is found that the accuracy of the medium-term forecasts is increased by using the RAW filter.
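
As a rough illustration of how the RAW filter differs from the Robert-Asselin filter, the sketch below applies one filter update to scalar values from three consecutive leapfrog time levels. It follows the commonly published form of the filter rather than anything stated in this abstract; the function name and the default values of `nu` and `alpha` are assumptions chosen for illustration.

```python
def raw_filter(x_prev_filtered, x_curr, x_next, nu=0.2, alpha=0.53):
    """Apply one RAW filter update to three consecutive leapfrog time levels.

    Returns the filtered values at the current and next time levels.
    alpha = 1 recovers the classic Robert-Asselin filter (only x_curr is modified).
    """
    # Filter displacement: a scaled second difference in time.
    d = 0.5 * nu * (x_prev_filtered - 2.0 * x_curr + x_next)
    # Split the displacement between the two newest levels, with opposite signs.
    return x_curr + alpha * d, x_next - (1.0 - alpha) * d

ra_curr, ra_next = raw_filter(1.0, 2.0, 1.0, nu=0.2, alpha=1.0)
raw_curr, raw_next = raw_filter(1.0, 2.0, 1.0, nu=0.2, alpha=0.5)
```

With `alpha = 0.5` the two corrections cancel, so the sum of the three time levels is unchanged by the filter; this is the sense in which the mean value is not distorted.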

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Ensemble clustering (EC) can arise in data assimilation with ensemble square root filters (EnSRFs) using non-linear models: an M-member ensemble splits into a single outlier and a cluster of M−1 members. The stochastic Ensemble Kalman Filter does not present this problem. Modifications to the EnSRFs by a periodic resampling of the ensemble through random rotations have been proposed to address it. We introduce a metric to quantify the presence of EC and present evidence to dispel the notion that EC leads to filter failure. Starting from a univariate model, we show that EC is not a permanent but a transient phenomenon; it occurs intermittently in non-linear models. We perform a series of data assimilation experiments using a standard EnSRF and an EnSRF modified by resampling through random rotations. The modified EnSRF alleviates issues associated with EC at the cost of traceability of individual ensemble trajectories, and it cannot use some of the algorithms that enhance the performance of the standard EnSRF. In the non-linear regimes of low-dimensional models, the analysis root mean square error of the standard EnSRF slowly grows with ensemble size if the size is larger than the dimension of the model state. However, we do not observe this problem in a more complex model that uses an ensemble size much smaller than the dimension of the model state, along with inflation and localisation. Overall, we find that transient EC does not handicap the performance of the standard EnSRF.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

We study boundary value problems posed in a semistrip for the elliptic sine-Gordon equation, which is the paradigm of an elliptic integrable PDE in two variables. We use the method introduced by one of the authors, which provides a substantial generalization of the inverse scattering transform and can be used for the analysis of boundary as opposed to initial-value problems. We first express the solution in terms of a 2 by 2 matrix Riemann-Hilbert problem whose "jump matrix" depends on both the Dirichlet and the Neumann boundary values. For a well-posed problem one of these boundary values is an unknown function. This unknown function is characterised in terms of the so-called global relation, but in general this characterisation is nonlinear. We then concentrate on the case in which the prescribed boundary conditions are zero along the unbounded sides of a semistrip and constant along the bounded side. This corresponds to a case of the so-called linearisable boundary conditions; however, a major difficulty for this problem is the existence of non-integrable singularities of the function q_y at the two corners of the semistrip. These singularities are generated by the discontinuities of the boundary condition at these corners. Motivated by the recent solution of the analogous problem for the modified Helmholtz equation, we introduce an appropriate regularisation which overcomes this difficulty. Furthermore, by mapping the basic Riemann-Hilbert problem to an equivalent modified Riemann-Hilbert problem, we show that the solution can be expressed in terms of a 2 by 2 matrix Riemann-Hilbert problem whose jump matrix depends explicitly on the width of the semistrip L, on the constant value d of the solution along the bounded side, and on the residues at the given poles of a certain spectral function denoted by h. The determination of the function h remains open.