10 results for k nearest neighbour

in CentAUR: Central Archive University of Reading - UK


Relevance:

90.00%

Publisher:

Abstract:

The length and time scales accessible to optical tweezers make them an ideal tool for the examination of colloidal systems. Embedded high-refractive-index tracer particles in an index-matched hard sphere suspension provide 'handles' within the system to investigate the mechanical behaviour. Passive observations of the motion of a single probe particle give information about the linear response behaviour of the system, which can be linked to the macroscopic frequency-dependent viscous and elastic moduli of the suspension. Separate 'dragging' experiments allow observation of a sample's nonlinear response to an applied stress on a particle-by-particle basis. Optical force measurements have given new data about the dynamics of phase transitions and particle interactions; an example in this study is the transition from liquid-like to solid-like behaviour, and the emergence of a yield stress and other effects attributable to nearest-neighbour caging. The forces needed to break such cages and the frequency of these cage-breaking events are investigated in detail for systems close to the glass transition.

Relevance:

90.00%

Publisher:

Abstract:

The complexity of current and emerging architectures provides users with options about how best to use the available resources, but makes predicting performance challenging. In this work a benchmark-driven model is developed for a simple shallow water code on a Cray XE6 system, to explore how deployment choices such as domain decomposition and core affinity affect performance. The resource sharing present in modern multi-core architectures adds various levels of heterogeneity to the system. Shared resources often include cache, memory, network controllers and in some cases floating point units (as in the AMD Bulldozer), which means that the access time depends on the mapping of application tasks and on the core's location within the system. Heterogeneity further increases with the use of hardware accelerators such as GPUs and the Intel Xeon Phi, where many specialist cores are attached to general-purpose cores. This trend for shared resources and non-uniform cores is expected to continue into the exascale era. The complexity of these systems means that various runtime scenarios are possible, and it has been found that under-populating nodes, altering the domain decomposition and non-standard task-to-core mappings can dramatically alter performance. Finding this out, however, is often a process of trial and error. To better inform this process, a performance model was developed for a simple regular grid-based kernel code, shallow. The code comprises two distinct types of work: loop-based array updates and nearest-neighbour halo exchanges. Separate performance models were developed for each part, both based on a similar methodology. Application-specific benchmarks were run to measure performance for different problem sizes under different execution scenarios. These results were then fed into a performance model that derives resource usage for a given deployment scenario, with interpolation between results as necessary.
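The interpolation step described in this abstract can be illustrated with a minimal sketch. It assumes plain linear interpolation over problem size; the abstract does not say which interpolation scheme was used, and the benchmark figures and the name `predict_time` below are invented for illustration only.

```python
from bisect import bisect_left

# Hypothetical benchmark results: (grid points per task, seconds per time step).
# These numbers are made up for illustration; they are not from the paper.
BENCHMARKS = [(64, 0.010), (128, 0.042), (256, 0.180), (512, 0.760)]


def predict_time(size, benchmarks=BENCHMARKS):
    """Linearly interpolate a measured time for an unmeasured problem size.

    Sizes outside the measured range are clamped to the nearest measurement.
    """
    sizes = [s for s, _ in benchmarks]
    times = [t for _, t in benchmarks]
    if size <= sizes[0]:
        return times[0]
    if size >= sizes[-1]:
        return times[-1]
    hi = bisect_left(sizes, size)  # first measured size >= requested size
    lo = hi - 1
    frac = (size - sizes[lo]) / (sizes[hi] - sizes[lo])
    return times[lo] + frac * (times[hi] - times[lo])
```

A fuller model would presumably keep one such table per execution scenario (for example per task-to-core mapping) and one for the halo exchanges, then sum the two predictions.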

Relevance:

80.00%

Publisher:

Abstract:

Data such as digitized aerial photographs, electrical conductivity and yield are intensive and relatively inexpensive to obtain compared with collecting soil data by sampling. If such ancillary data are co-regionalized with the soil data they should be suitable for co-kriging. The latter requires that information for both variables is co-located at several locations; this is rarely so for soil and ancillary data. To solve this problem, we have derived values for the ancillary variable at the soil sampling locations by averaging the values within a radius of 15 m, taking the nearest-neighbour value, kriging over 5 m blocks, and punctual kriging. The cross-variograms from these data with clay content and also the pseudo cross-variogram were used to co-krige to validation points and the root mean squared errors (RMSEs) were calculated. In general, the data averaged within 15 m and the punctually kriged values resulted in more accurate predictions.
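Two of the four co-location strategies listed in this abstract (averaging within a 15 m radius and taking the nearest-neighbour value) are simple enough to sketch. The function name `colocate` and the tuple layout are assumptions made for illustration; the kriging-based strategies are not shown.

```python
import math


def colocate(sample_xy, ancillary, radius=15.0):
    """Derive ancillary values at one soil sampling location.

    sample_xy: (x, y) of the soil sample, in metres.
    ancillary: list of (x, y, value) tuples for the ancillary variable.
    Returns (radius_mean, nearest_value); radius_mean is None when no
    ancillary observation lies within the radius.
    """
    sx, sy = sample_xy
    dists = [(math.hypot(x - sx, y - sy), v) for x, y, v in ancillary]
    nearest_value = min(dists, key=lambda dv: dv[0])[1]
    inside = [v for d, v in dists if d <= radius]
    radius_mean = sum(inside) / len(inside) if inside else None
    return radius_mean, nearest_value
```

For example, with ancillary points at 2 m, 8 m and 28 m from the sample, the radius mean uses only the first two values while the nearest-neighbour value uses only the first.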

Relevance:

80.00%

Publisher:

Abstract:

Locality to other nodes on a peer-to-peer overlay network can be established by means of a set of landmarks shared among the participating nodes. Each node independently collects a set of latency measures to landmark nodes, which are used as a multi-dimensional feature vector. Each peer node uses the feature vector to generate a unique scalar index which is correlated to its topological locality. A popular dimensionality reduction technique is the space-filling Hilbert’s curve, as it possesses good locality-preserving properties. However, there exists little comparison between Hilbert’s curve and other techniques for dimensionality reduction. This work carries out a quantitative analysis of their properties. Linear and non-linear techniques for scaling the landmark vectors to a single dimension are investigated. Hilbert’s curve, Sammon’s mapping and Principal Component Analysis have been used to generate a 1D space with locality-preserving properties. This work provides empirical evidence to support the use of Hilbert’s curve in the context of locality preservation when generating peer identifiers by means of landmark vector analysis. A comparative analysis is carried out with an artificial 2D network model and with a realistic network topology model with a typical power-law distribution of node connectivity in the Internet. Nearest-neighbour analysis confirms Hilbert’s curve to be very effective in both artificial and realistic network topologies. Nevertheless, the results in the realistic network model show that there is scope for improvement and that better techniques to preserve locality information are required.
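To make the idea of reducing a landmark vector to a scalar index concrete, here is a minimal sketch for the two-landmark case. `hilbert_index` is the standard iterative mapping from a grid cell to its position along a Hilbert curve; `peer_index`, the 256 ms latency ceiling and the 16-by-16 grid are hypothetical choices for illustration, not details from the abstract.

```python
def hilbert_index(n, x, y):
    """Map cell (x, y) of an n-by-n grid (n a power of two) to its
    position along the Hilbert curve, in the range 0 .. n*n - 1."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def peer_index(latency_a_ms, latency_b_ms, max_ms=256.0, n=16):
    """Quantise two landmark latencies onto the grid and return a scalar index.

    max_ms and n are illustrative assumptions; latencies above max_ms are
    clamped to the last cell.
    """
    def cell(v):
        return min(n - 1, int(max(0.0, v) / max_ms * n))

    return hilbert_index(n, cell(latency_a_ms), cell(latency_b_ms))
```

Cells that are consecutive along the curve are always adjacent on the grid, which is the locality-preserving property the abstract refers to; the converse does not always hold, so some nearby peers can still receive distant indices.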

Relevance:

80.00%

Publisher:

Abstract:

We agree with Duckrow and Albano [Phys. Rev. E 67, 063901 (2003)] and Quian Quiroga et al. [Phys. Rev. E 67, 063902 (2003)] that mutual information (MI) is a useful measure of dependence for electroencephalogram (EEG) data, but we show that the improvement seen in the performance of MI on extracting dependence trends from EEG depends more on the type of MI estimator than on any embedding technique used. In an independent study that we conducted in search of an optimal MI estimator, in particular for EEG applications, we examined the performance of a number of MI estimators on the data set used by Quian Quiroga et al. in their original study, where the performance of different dependence measures on real data was investigated [Phys. Rev. E 65, 041903 (2002)]. We show that for EEG applications the best performance among the investigated estimators is achieved by k-nearest neighbors, which supports the conjecture by Quian Quiroga et al. in Phys. Rev. E 67, 063902 (2003) that the nearest neighbor estimator is the most precise method for estimating MI.
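The abstract does not say which k-nearest-neighbor MI estimator was used. As an illustration only, the sketch below assumes the widely used Kraskov-Stögbauer-Grassberger estimator (its first variant) for paired scalar samples, written as a brute-force O(N²) loop in plain Python. The function names are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def ksg_mutual_information(xs, ys, k=3):
    """Estimate MI (in nats) between paired scalar samples xs and ys.

    Uses I = psi(k) + psi(N) - mean(psi(n_x + 1) + psi(n_y + 1)), where
    n_x and n_y count the points strictly closer than the k-th joint-space
    neighbour (max-norm) in each marginal.
    """
    n = len(xs)
    total = 0.0
    for i in range(n):
        # Max-norm distances from point i to every other point in joint space.
        joint = sorted(
            max(abs(xs[i] - xs[j]), abs(ys[i] - ys[j]))
            for j in range(n) if j != i
        )
        eps = joint[k - 1]  # distance to the k-th nearest neighbour
        nx = sum(1 for j in range(n) if j != i and abs(xs[i] - xs[j]) < eps)
        ny = sum(1 for j in range(n) if j != i and abs(ys[i] - ys[j]) < eps)
        total += digamma_int(nx + 1) + digamma_int(ny + 1)
    return digamma_int(k) + digamma_int(n) - total / n


rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
mi_dependent = ksg_mutual_information(x, x)        # y is a copy of x
mi_independent = ksg_mutual_information(x, noise)  # y is unrelated to x
```

The estimate for the independent pair should be close to zero (it can be slightly negative), while the estimate for the identical pair is bounded only by the sample size. For real EEG series a tree-based neighbour search would be needed to keep the cost manageable.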

Relevance:

80.00%

Publisher:

Abstract:

We agree with Duckrow and Albano [Phys. Rev. E 67, 063901 (2003)] and Quian Quiroga [Phys. Rev. E 67, 063902 (2003)] that mutual information (MI) is a useful measure of dependence for electroencephalogram (EEG) data, but we show that the improvement seen in the performance of MI on extracting dependence trends from EEG depends more on the type of MI estimator than on any embedding technique used. In an independent study that we conducted in search of an optimal MI estimator, in particular for EEG applications, we examined the performance of a number of MI estimators on the data set used by Quian Quiroga in their original study, where the performance of different dependence measures on real data was investigated [Phys. Rev. E 65, 041903 (2002)]. We show that for EEG applications the best performance among the investigated estimators is achieved by k-nearest neighbors, which supports the conjecture by Quian Quiroga in Phys. Rev. E 67, 063902 (2003) that the nearest neighbor estimator is the most precise method for estimating MI.

Relevance:

80.00%

Publisher:

Abstract:

We compare a number of models of post-war US output growth in terms of the degree and pattern of non-linearity they impart to the conditional mean, where we condition on either the previous period's growth rate or the previous two periods' growth rates. The conditional means are estimated non-parametrically using a nearest-neighbour technique on data simulated from the models. In this way, we condense the complex, dynamic responses that may be present into graphical displays of the implied conditional mean.
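A nearest-neighbour estimate of a conditional mean can be sketched as follows for the case of conditioning on one lag. The abstract does not give the exact estimator or the models, so the unweighted k-neighbour average and the linear AR(1) process with coefficient 0.5 below are stand-in assumptions chosen so that the true conditional mean is known.

```python
import random


def knn_conditional_mean(pairs, x0, k):
    """Estimate E[y | x = x0] by averaging y over the k observations
    whose conditioning value x lies closest to x0."""
    nearest = sorted(pairs, key=lambda p: abs(p[0] - x0))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def simulate_ar1(n, phi=0.5, sigma=1.0, seed=0):
    """Simulate a stand-in AR(1) series: y_t = phi * y_{t-1} + noise."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n):
        y.append(phi * y[-1] + rng.gauss(0.0, sigma))
    return y


series = simulate_ar1(2000)
pairs = list(zip(series[:-1], series[1:]))  # (previous value, current value)
estimate = knn_conditional_mean(pairs, x0=1.0, k=100)
```

Evaluating `knn_conditional_mean` over a grid of `x0` values gives the kind of curve that can be plotted to show how far a model's conditional mean departs from a straight line; for this linear stand-in the curve should stay close to `0.5 * x0`.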

Relevance:

80.00%

Publisher:

Abstract:

Information on the breeding biology of the White-headed Vulture Trigonoceps occipitalis is limited and published data are few. Within the Kruger National Park in north-east South Africa there is a regionally important population of about 60 White-headed Vulture pairs, of which 22 pairs were monitored for five years between 2008 and 2012 to determine key aspects of their breeding biology. Across 73 pair/years the mean productivity of 55 breeding attempts was 0.69 chicks per pair. Median egg-laying date across all of the Kruger National Park was 27 June, but northern nests were approximately 30 d later than southern nests. Mean (SD) nearest-neighbour distance was 9 976 ± 7 965 m and inter-nest distances ranged from 1 400 m to more than 20 km, but this did not differ significantly between habitat types. Breeding productivity did not differ significantly between habitat types. The results presented here are the first for this species in Kruger National Park and provide details against which future comparisons can be made.

Relevance:

80.00%

Publisher:

Abstract:

Advances in hardware technologies allow data to be captured and processed in real time, and the resulting high-throughput data streams require novel data mining approaches. The research area of Data Stream Mining (DSM) is developing data mining algorithms that allow us to analyse these continuous streams of data in real time. The creation and real-time adaptation of classification models from data streams is one of the most challenging DSM tasks. Current classifiers for streaming data address this problem by using incremental learning algorithms. However, even though these algorithms are fast, they are challenged by high-velocity data streams, where data instances arrive at a fast rate. This is problematic if the application requires little or no delay between changes in the patterns of the stream and the absorption of these patterns by the classifier. Problems of scalability to Big Data of traditional data mining algorithms for static (non-streaming) datasets have been addressed through the development of parallel classifiers. However, there is very little work on the parallelisation of data stream classification techniques. In this paper we investigate K-Nearest Neighbours (KNN) as the basis for a real-time adaptive and parallel methodology for scalable data stream classification tasks.
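The abstract does not describe the methodology itself. As background, a generic sliding-window KNN classifier for streams can be sketched as below; this is a common baseline and not the paper's parallel method. The class name and the window size are assumptions.

```python
import math
from collections import Counter, deque


class SlidingWindowKNN:
    """KNN over the most recent instances of a stream (illustrative baseline)."""

    def __init__(self, k=3, window=1000):
        self.k = k
        # A bounded deque drops the oldest instance automatically, which is
        # what lets the classifier forget outdated patterns.
        self.window = deque(maxlen=window)

    def learn(self, features, label):
        self.window.append((tuple(features), label))

    def predict(self, features):
        if not self.window:
            return None
        nearest = sorted(
            self.window, key=lambda item: math.dist(item[0], features)
        )[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]
```

Each prediction scans the whole window, and the distance computations for different stored instances are independent of one another, which is one reason KNN lends itself to parallelisation.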

Relevance:

80.00%

Publisher:

Abstract:

An important application of Big Data Analytics is the real-time analysis of streaming data. Streaming data poses unique challenges for data mining algorithms, such as concept drift, the need to analyse the data on the fly because data streams are unbounded, and the need for scalable algorithms because of the potentially high throughput of data. Real-time classification algorithms that are fast and adaptive to concept drift exist; however, most approaches are not naturally parallel and are thus limited in their scalability. This paper presents work on the Micro-Cluster Nearest Neighbour (MC-NN) classifier. MC-NN is based on an adaptive statistical data summary built from Micro-Clusters. MC-NN is very fast and adaptive to concept drift whilst maintaining the parallel properties of the base KNN classifier. MC-NN is also competitive with existing data stream classifiers in terms of accuracy and speed.
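The general idea of classifying against statistical summaries rather than raw instances can be sketched as follows. This is a heavily simplified illustration and not the published MC-NN algorithm: the abstract gives no details of how micro-clusters are created, split or removed, so the fixed `max_distance` rule and all names below are assumptions.

```python
import math


class MicroCluster:
    """Running summary of the instances absorbed so far: count and per-feature sum."""

    def __init__(self, features, label):
        self.n = 1
        self.linear_sum = list(features)
        self.label = label

    def absorb(self, features):
        self.n += 1
        self.linear_sum = [s + f for s, f in zip(self.linear_sum, features)]

    def centroid(self):
        return [s / self.n for s in self.linear_sum]


class MicroClusterNN:
    """Nearest-centroid classifier over micro-clusters (simplified sketch)."""

    def __init__(self, max_distance=2.0):
        # Hypothetical rule: start a new cluster when an instance is farther
        # than max_distance from every cluster of its own class.
        self.max_distance = max_distance
        self.clusters = []

    def predict(self, features):
        if not self.clusters:
            return None
        nearest = min(
            self.clusters, key=lambda c: math.dist(c.centroid(), features)
        )
        return nearest.label

    def learn(self, features, label):
        same = [c for c in self.clusters if c.label == label]
        if same:
            nearest = min(same, key=lambda c: math.dist(c.centroid(), features))
            if math.dist(nearest.centroid(), features) <= self.max_distance:
                nearest.absorb(features)
                return
        self.clusters.append(MicroCluster(features, label))
```

Because each cluster stores only a count and a sum, updates are constant-time per instance, and prediction cost grows with the number of clusters rather than the number of instances seen.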