808 results for Semi-supervised clustering


Relevance:

20.00% 20.00%

Publisher:

Abstract:

The K-Means algorithm for cluster analysis is one of the most influential and popular data mining methods. Its straightforward parallel formulation is well suited for distributed memory systems with reliable interconnection networks, such as massively parallel processors and clusters of workstations. However, in large-scale geographically distributed systems the straightforward parallel algorithm can be rendered useless by a single communication failure or high latency in communication paths. The lack of scalable and fault tolerant global communication and synchronisation methods in large-scale systems has hindered the adoption of the K-Means algorithm for applications in large networked systems such as wireless sensor networks, peer-to-peer systems and mobile ad hoc networks. This work proposes a fully distributed K-Means algorithm (EpidemicK-Means) which does not require global communication and is intrinsically fault tolerant. The proposed distributed K-Means algorithm provides a clustering solution which can approximate the solution of an ideal centralised algorithm over the aggregated data as closely as desired. A comparative performance analysis is carried out against the state-of-the-art sampling methods and shows that the proposed method overcomes the limitations of the sampling-based approaches for skewed cluster distributions. The experimental analysis confirms that the proposed algorithm is very accurate and fault tolerant under unreliable network conditions (message loss and node failures) and is suitable for asynchronous networks of very large and extreme scale.
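The idea of replacing a global reduction with pairwise exchanges can be illustrated with a small sketch. It is not the EpidemicK-Means algorithm from the abstract: it assumes 1-D data, a simulated network held in one process, and a plain push-pull averaging gossip. The names `local_stats`, `gossip_average` and `gossip_kmeans_step` are hypothetical.

```python
import random


def local_stats(points, centroids):
    """Per-cluster [sum, count] of one node's 1-D points (nearest-centroid assignment)."""
    stats = [[0.0, 0.0] for _ in centroids]
    for x in points:
        j = min(range(len(centroids)), key=lambda c: abs(x - centroids[c]))
        stats[j][0] += x
        stats[j][1] += 1.0
    return stats


def gossip_average(values, rounds, rng):
    """Push-pull gossip: a random pair of nodes replaces both vectors with their average."""
    vals = [list(v) for v in values]
    for _ in range(rounds):
        i, j = rng.sample(range(len(vals)), 2)
        avg = [(a + b) / 2.0 for a, b in zip(vals[i], vals[j])]
        vals[i], vals[j] = avg, list(avg)
    return vals


def gossip_kmeans_step(node_data, centroids, rounds=2000, seed=0):
    """One K-Means update in which every node estimates the new centroids by gossip.

    Pairwise averaging preserves the network-wide mean of the sums and counts,
    so the ratio sum/count at each node approaches the centralised centroid.
    """
    rng = random.Random(seed)
    flat = []
    for pts in node_data:
        flat.append([v for pair in local_stats(pts, centroids) for v in pair])
    estimates = gossip_average(flat, rounds, rng)
    result = []
    for est in estimates:
        new = []
        for c in range(len(centroids)):
            s, n = est[2 * c], est[2 * c + 1]
            # Keep the old centroid if this node has no mass for the cluster yet.
            new.append(s / n if n > 0 else centroids[c])
        result.append(new)
    return result
```

In this sketch no node ever sees the full data set, and a lost exchange only slows convergence instead of stalling the iteration, which is the property the abstract attributes to epidemic-style protocols.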

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The proteome of Salmonella enterica serovar Typhimurium was characterized by 2-dimensional HPLC mass spectrometry to provide a platform for subsequent proteomic investigations of low level multiple antibiotic resistance (MAR). Bacteria (2.15 ± 0.23 × 10^10 cfu; mean ± s.d.) were harvested from liquid culture and proteins differentially fractionated, on the basis of solubility, into preparations representative of the cytosol, cell envelope and outer membrane proteins (OMPs). These preparations were digested by treatment with trypsin and peptides separated into fractions (n = 20) by strong cation exchange chromatography (SCX). Tryptic peptides in each SCX fraction were further separated by reversed-phase chromatography and detected by mass spectrometry. Peptides were assigned to proteins and consensus rank listings compiled using SEQUEST. A total of 816 ± 11 individual proteins were identified which included 371 ± 33, 565 ± 15 and 262 ± 5 from the cytosolic, cell envelope and OMP preparations, respectively. A significant correlation was observed (r^2 = 0.62 ± 0.10; P < 0.0001) between consensus rank position for duplicate cell preparations and an average of 74 ± 5% of proteins were common to both replicates. A total of 34 outer membrane proteins were detected, 20 of these from the OMP preparation. A range of proteins (n = 20) previously associated with the mar locus in E. coli were also found including the key MAR effectors AcrA, TolC and OmpF.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

A neurofuzzy classifier identification algorithm is introduced for two-class problems. The initial fuzzy base construction is based on fuzzy clustering utilizing a Gaussian mixture model (GMM) and the analysis of variance (ANOVA) decomposition. The expectation maximization (EM) algorithm is applied to determine the parameters of the fuzzy membership functions. The neurofuzzy model is then identified via the supervised subspace orthogonal least squares (OLS) algorithm. Finally, a logistic regression model is applied to produce the class probability. The effectiveness of the proposed neurofuzzy classifier has been demonstrated using a real data set.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper aims to develop a mathematical model based on semi-group theory that makes it possible to improve quality of service (QoS), including the reduction of the carbon path, in a pervasive environment of a Mobile Virtual Network Operator (MVNO). The paper generalises a Machine to Machine (M2M) interrelationship mathematical model based on semi-group theory. It demonstrates that, using available technology and a solid mathematical model, it is possible to streamline relationships between building agents and to control pervasive spaces so as to reduce the carbon footprint through the reduction of GHG.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Global communication requirements and load imbalance of some parallel data mining algorithms are the major obstacles to exploiting the computational power of large-scale systems. This work investigates how non-uniform data distributions can be exploited to remove the global communication requirement and to reduce the communication cost in parallel data mining algorithms and, in particular, in the k-means algorithm for cluster analysis. In the straightforward parallel formulation of the k-means algorithm, data and computation loads are uniformly distributed over the processing nodes. This approach has excellent load balancing characteristics that may suggest it could scale up to large and extreme-scale parallel computing systems. However, at each iteration step the algorithm requires a global reduction operation which hinders the scalability of the approach. This work studies a different parallel formulation of the algorithm where the requirement of global communication is removed, while maintaining the same deterministic nature of the centralised algorithm. The proposed approach exploits a non-uniform data distribution which can be either found in real-world distributed applications or can be induced by means of multi-dimensional binary search trees. The approach can also be extended to accommodate an approximation error which allows a further reduction of the communication costs. The effectiveness of the exact and approximate methods has been tested in a parallel computing system with 64 processors and in simulations with 1024 processing elements.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The ability to create accurate geometric models of neuronal morphology is important for understanding the role of shape in information processing. Despite a significant amount of research on automating neuron reconstructions from image stacks obtained via microscopy, in practice most data are still collected manually. This paper describes Neuromantic, an open source system for three-dimensional digital tracing of neurites. Neuromantic reconstructions are comparable in quality to those of existing commercial and freeware systems while balancing speed and accuracy of manual reconstruction. The combination of semi-automatic tracing, intuitive editing, and the ability to visualize large image stacks on standard computing platforms provides a versatile tool that can help address the reconstruction availability bottleneck. Practical considerations for reducing the computational time and space requirements of the extended algorithm are also discussed.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This work proposes a unified neurofuzzy modelling scheme. To begin with, the initial fuzzy base construction method is based on fuzzy clustering utilising a Gaussian mixture model (GMM) combined with the analysis of variance (ANOVA) decomposition in order to obtain more compact univariate and bivariate membership functions over the subspaces of the input features. The mean and covariance of the Gaussian membership functions are found by the expectation maximisation (EM) algorithm, with the merit of revealing the underlying density distribution of system inputs. The resultant set of membership functions forms the basis of the generalised fuzzy model (GFM) inference engine. The model structure and parameters of this neurofuzzy model are identified via supervised subspace orthogonal least squares (OLS) learning. Finally, instead of providing a deterministic class label as model output by convention, a logistic regression model is applied to present the classifier’s output, in which the sigmoid type of logistic transfer function scales the outputs of the neurofuzzy model to the class probability. Experimental validation results are presented to demonstrate the effectiveness of the proposed neurofuzzy modelling scheme.
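The EM step that fits Gaussian membership functions can be illustrated with a minimal sketch. It assumes a 1-D input and a fixed number of mixture components, and it does not reproduce the ANOVA decomposition, the OLS learning or the logistic output stage of the scheme above. The function names `em_gmm_1d` and `membership` are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, iterations=50):
    """Fit a 1-D Gaussian mixture by EM, starting from the given parameters."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    for _ in range(iterations):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            dens = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a degenerate component
    return means, variances, weights


def membership(x, mean, var):
    """Unnormalised Gaussian membership value in (0, 1], peaking at the mean."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var))
```

Each fitted (mean, variance) pair then defines one membership function over the input, so the fuzzy sets follow the density of the data instead of a hand-chosen grid.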

Relevance:

20.00% 20.00%

Publisher:

Abstract:

The planning of semi-autonomous vehicles in traffic scenarios is a relatively new problem that contributes towards the goal of making road travel by vehicles free of human drivers. An algorithm needs to ensure optimal real time planning of multiple vehicles (moving in either direction along a road), in the presence of a complex obstacle network. Unlike other approaches, here we assume that speed lanes are not present and that different lanes do not need to be maintained for inbound and outbound traffic. Our basic hypothesis is to carry forward the planning task to ensure that a sufficient distance is maintained by each vehicle from all other vehicles, obstacles and road boundaries. We present here a 4-layer planning algorithm that consists of road selection (for selecting the individual roads of traversal to reach the goal), pathway selection (a strategy to avoid and/or overtake obstacles, road diversions and other blockages), pathway distribution (to select the position of a vehicle at every instance of time in a pathway), and trajectory generation (for generating a curve, smooth enough, to allow for the maximum possible speed). Cooperation between vehicles is handled separately at the different levels, the aim being to maximize the separation between vehicles. Simulated results exhibit behaviours of smooth, efficient and safe driving of vehicles in multiple scenarios; along with typical vehicle behaviours including following and overtaking.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper represents the second part of a study of semi-geostrophic (SG) geophysical fluid dynamics. SG dynamics shares certain attractive properties with the better known and more widely used quasi-geostrophic (QG) model, but is also a good prototype for balanced models that are more accurate than QG dynamics. The development of such balanced models is an area of great current interest. The goal of the present work is to extend a central body of QG theory, concerning the evolution of disturbances to prescribed basic states, to SG dynamics. Part 1 was based on the pseudomomentum; Part 2 is based on the pseudoenergy. A pseudoenergy invariant is a conserved quantity, of second order in disturbance amplitude relative to a prescribed steady basic state, which is related to the time symmetry of the system. We derive such an invariant for the semi-geostrophic equations, and use it to obtain: (i) a linear stability theorem analogous to Arnol'd's ‘first theorem’; and (ii) a small-amplitude local conservation law for the invariant, obeying the group-velocity property in the WKB limit. The results are analogous to their quasi-geostrophic forms, and reduce to those forms in the limit of small Rossby number. The results are derived for both the f-plane Boussinesq form of semi-geostrophic dynamics, and its extension to β-plane compressible flow by Magnusdottir & Schubert. Novel features particular to semi-geostrophic dynamics include apparently unnoticed lateral boundary stability criteria. Unlike the boundary stability criteria found in the first part of this study, however, these boundary criteria do not necessarily preclude the construction of provably stable basic states. The interior semi-geostrophic dynamics has an underlying Hamiltonian structure, which guarantees that symmetries in the system correspond naturally to the system's invariants. This is an important motivation for the theoretical approach used in this study. 
The connection between symmetries and conservation laws is made explicit using Noether's theorem applied to the Eulerian form of the Hamiltonian description of the interior dynamics.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

There exists a well-developed body of theory based on quasi-geostrophic (QG) dynamics that is central to our present understanding of large-scale atmospheric and oceanic dynamics. An important question is the extent to which this body of theory may generalize to more accurate dynamical models. As a first step in this process, we here generalize a set of theoretical results, concerning the evolution of disturbances to prescribed basic states, to semi-geostrophic (SG) dynamics. SG dynamics, like QG dynamics, is a Hamiltonian balanced model whose evolution is described by the material conservation of potential vorticity, together with an invertibility principle relating the potential vorticity to the advecting fields. SG dynamics has features that make it a good prototype for balanced models that are more accurate than QG dynamics. In the first part of this two-part study, we derive a pseudomomentum invariant for the SG equations, and use it to obtain: (i) linear and nonlinear generalized Charney–Stern theorems for disturbances to parallel flows; (ii) a finite-amplitude local conservation law for the invariant, obeying the group-velocity property in the WKB limit; and (iii) a wave-mean-flow interaction theorem consisting of generalized Eliassen–Palm flux diagnostics, an elliptic equation for the stream-function tendency, and a non-acceleration theorem. All these results are analogous to their QG forms. The pseudomomentum invariant – a conserved second-order disturbance quantity that is associated with zonal symmetry – is constructed using a variational principle in a similar manner to the QG calculations. Such an approach is possible when the equations of motion under the geostrophic momentum approximation are transformed to isentropic and geostrophic coordinates, in which the ageostrophic advection terms are no longer explicit. Symmetry-related wave-activity invariants such as the pseudomomentum then arise naturally from the Hamiltonian structure of the SG equations. 
We avoid use of the so-called ‘massless layer’ approach to the modelling of isentropic gradients at the lower boundary, preferring instead to incorporate explicitly those boundary contributions into the wave-activity and stability results. This makes the analogy with QG dynamics most transparent. This paper treats the f-plane Boussinesq form of SG dynamics, and its recent extension to β-plane, compressible flow by Magnusdottir & Schubert. In the limit of small Rossby number, the results reduce to their respective QG forms. Novel features particular to SG dynamics include apparently unnoticed lateral boundary stability criteria in (i), and the necessity of including additional zonal-mean eddy correlation terms besides the zonal-mean potential vorticity fluxes in the wave-mean-flow balance in (iii). In the companion paper, wave-activity conservation laws and stability theorems based on the SG form of the pseudoenergy are presented.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Many modern statistical applications involve inference for complex stochastic models, where it is easy to simulate from the models, but impossible to calculate likelihoods. Approximate Bayesian computation (ABC) is a method of inference for such models. It replaces calculation of the likelihood by a step which involves simulating artificial data for different parameter values, and comparing summary statistics of the simulated data with summary statistics of the observed data. Here we show how to construct appropriate summary statistics for ABC in a semi-automatic manner. We aim for summary statistics which will enable inference about certain parameters of interest to be as accurate as possible. Theoretical results show that optimal summary statistics are the posterior means of the parameters. Although these cannot be calculated analytically, we use an extra stage of simulation to estimate how the posterior means vary as a function of the data; and we then use these estimates of our summary statistics within ABC. Empirical results show that our approach is a robust method for choosing summary statistics that can result in substantially more accurate ABC analyses than the ad hoc choices of summary statistics that have been proposed in the literature. We also demonstrate advantages over two alternative methods of simulation-based inference.
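The two-stage idea in this abstract can be sketched on a toy problem. The sketch makes several assumptions that are not in the abstract: the model is a Gaussian with unknown mean and unit variance, the prior is uniform on [-5, 5], the only data feature is the sample mean, and the posterior mean is estimated by a simple linear regression fitted to pilot simulations. All function names are hypothetical.

```python
import random


def simulate(theta, n, rng):
    """Toy model (an assumption): n draws from Normal(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def feature(data):
    """Single data feature used by the regression: the sample mean."""
    return sum(data) / len(data)


def fit_summary(n, pilot_runs, rng):
    """Pilot stage: regress theta on the feature to estimate the posterior mean."""
    thetas, feats = [], []
    for _ in range(pilot_runs):
        theta = rng.uniform(-5.0, 5.0)  # draw from the assumed prior
        thetas.append(theta)
        feats.append(feature(simulate(theta, n, rng)))
    mx = sum(feats) / len(feats)
    my = sum(thetas) / len(thetas)
    sxx = sum((x - mx) ** 2 for x in feats)
    sxy = sum((x - mx) * (y - my) for x, y in zip(feats, thetas))
    b = sxy / sxx
    a = my - b * mx
    # The fitted value a + b * feature(data) serves as the summary statistic.
    return lambda data: a + b * feature(data)


def abc_rejection(observed, summary, runs, eps, rng):
    """Rejection ABC: keep prior draws whose simulated summary is within eps of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(runs):
        theta = rng.uniform(-5.0, 5.0)
        if abs(summary(simulate(theta, len(observed), rng)) - s_obs) < eps:
            accepted.append(theta)
    return accepted
```

The accepted values approximate a sample from the posterior. In this toy case the regression adds little over the raw sample mean; the construction matters when the data are high-dimensional and no obvious low-dimensional summary exists.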

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper, dual-hop amplify-and-forward (AF) cooperative systems in the presence of high-power amplifier (HPA) nonlinearity at semi-blind relays are investigated. Based on the modified AF cooperative system model taking into account the HPA nonlinearity, the expression for the output signal-to-noise ratio (SNR) at the destination node is derived, where the interference due to both the AF relaying mechanism and the HPA nonlinearity is characterized. The performance of the AF cooperative system under study is evaluated in terms of average symbol error probability (SEP), which is derived using the moment-generating function (MGF) approach, considering transmissions over Nakagami-m fading channels. Numerical results are provided and show the effects on performance of system parameters such as the HPA parameters, the number of relays, the quadrature amplitude modulation (QAM) order and the Nakagami parameters.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Semi-open street roofs protect pedestrians from intense sunshine and rain. Their effects on natural ventilation of urban canopy layers (UCL) are less well understood. This paper investigates two idealized urban models consisting of 4 (2×2) or 16 (4×4) buildings under a neutral atmospheric condition with parallel (0°) or non-parallel (15°, 30°, 45°) approaching wind. The aspect ratio (building height (H) / street width (W)) is 1 and the building width is B=3H. Computational fluid dynamics (CFD) simulations were first validated against experimental data, confirming that the standard k-ε model predicted airflow velocity better than the RNG k-ε model, the realizable k-ε model and the Reynolds stress model. Three ventilation indices were numerically analyzed for ventilation assessment: flow rates across street roofs and openings, to show the mechanisms of air exchange; age of air, to show how long external air takes to reach a place after entering the UCL; and purging flow rate, to quantify the net UCL ventilation capacity induced by mean flows and turbulence. Five semi-open roof types are studied: walls hung above street roofs (coverage ratio λa=100%) at z=1.5H, 1.2H, 1.1H ('Hung1.5H', 'Hung1.2H', 'Hung1.1H' types); walls partly covering street roofs (λa=80%) at z=H ('Partly-covered' type); and walls fully covering street roofs (λa=100%) at z=H ('Fully-covered' type). They generally provide worse UCL ventilation than the open street roof type because of the decreased roof ventilation. The 'Hung1.1H', 'Hung1.2H' and 'Hung1.5H' types are better designs than the 'Fully-covered' and 'Partly-covered' types. A larger urban size contains a larger UCL volume and requires a longer time to ventilate. The methodologies and ventilation indices are confirmed to be effective for quantifying UCL ventilation.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

This book sets out the findings of research conducted into the links between ecosystem services and poverty alleviation in Southern Africa. It follows from extensive primary research conducted in the region, as well as intensive engagement with researchers, policy-makers and relevant institutions in several countries in southern Africa, as part of the Ecosystem Services and Poverty Alleviation Programme led by DFI, NERC and ESRC.