15 results for K-means

in QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast


Relevance:

60.00%

Publisher:

Abstract:

The identification and classification of network traffic and protocols is a vital step in many quality of service and security systems. Traffic classification strategies must evolve, alongside the protocols utilising the Internet, to overcome the use of ephemeral or masquerading port numbers and transport layer encryption. This research expands the concept of using machine learning on the initial statistics of a flow of packets to determine its underlying protocol. Recognising the need for efficient training/retraining of a classifier and the requirement for fast classification, the authors investigate a new application of k-means clustering referred to as 'two-way' classification. The 'two-way' classification uniquely analyses a bidirectional flow as two unidirectional flows and is shown, through experiments on real network traffic, to improve classification accuracy by as much as 18% when measured against similar proposals. It achieves this accuracy while generating fewer clusters, that is, fewer comparisons are needed to classify a flow. A 'two-way' classification offers a new way to improve accuracy and efficiency of machine learning statistical classifiers while still maintaining the fast training times associated with k-means.
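
To make the idea of treating a bidirectional flow as two unidirectional flows concrete, here is a minimal sketch rather than the authors' implementation. It assumes each flow is described by one feature vector per direction, clusters each direction separately with plain k-means, and labels each cluster with its majority protocol. The abstract does not say how the two directions are combined; letting the direction with the closer centroid decide is purely an illustrative assumption, as are all function names and the farthest-point initialisation.

```python
from collections import Counter

import numpy as np


def nearest(points, centroids):
    """Return, for each point, the index of and distance to its closest centroid."""
    d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1), d.min(axis=1)


def kmeans(points, k, n_iter=20):
    """Lloyd's k-means with deterministic farthest-point initialisation (an assumption)."""
    points = np.asarray(points, dtype=float)
    seeds = [points[0]]
    while len(seeds) < k:
        _, dist = nearest(points, np.array(seeds))
        seeds.append(points[dist.argmax()])  # next seed: point farthest from current seeds
    centroids = np.array(seeds)
    for _ in range(n_iter):
        assign, _ = nearest(points, centroids)
        for j in range(k):
            if np.any(assign == j):  # an empty cluster keeps its old centroid
                centroids[j] = points[assign == j].mean(axis=0)
    return centroids


def majority(items):
    """Most common item, or None for an empty cluster."""
    counts = Counter(items)
    return counts.most_common(1)[0][0] if counts else None


def train_direction(features, protocols, k):
    """Cluster one direction of the training flows and label each cluster."""
    features = np.asarray(features, dtype=float)
    centroids = kmeans(features, k)
    assign, _ = nearest(features, centroids)
    labels = [majority(p for p, a in zip(protocols, assign) if a == j)
              for j in range(k)]
    return centroids, labels


def classify(fwd, bwd, model_fwd, model_bwd):
    """Classify one flow; the direction with the closer centroid decides (assumption)."""
    best = None
    for vec, (centroids, labels) in ((fwd, model_fwd), (bwd, model_bwd)):
        idx, dist = nearest(np.asarray([vec], dtype=float), centroids)
        if best is None or dist[0] < best[0]:
            best = (dist[0], labels[idx[0]])
    return best[1]
```

Classifying a flow costs one distance computation per centroid in each direction, which is why a model with fewer clusters needs fewer comparisons.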

Relevance:

60.00%

Publisher:

Abstract:

The environmental quality of land can be assessed by calculating relevant threshold values, which differentiate between concentrations of elements resulting from geogenic and diffuse anthropogenic sources and concentrations generated by point sources of elements. A simple process allowing the calculation of these typical threshold values (TTVs) was applied across a region of highly complex geology (Northern Ireland) to six elements of interest; arsenic, chromium, copper, lead, nickel and vanadium. Three methods for identifying domains (areas where a readily identifiable factor can be shown to control the concentration of an element) were used: k-means cluster analysis, boxplots and empirical cumulative distribution functions (ECDF). The ECDF method was most efficient at determining areas of both elevated and reduced concentrations and was used to identify domains in this investigation. Two statistical methods for calculating normal background concentrations (NBCs) and upper limits of geochemical baseline variation (ULBLs), currently used in conjunction with legislative regimes in the UK and Finland respectively, were applied within each domain. The NBC methodology was constructed to run within a specific legislative framework, and its use on this soil geochemical data set was influenced by the presence of skewed distributions and outliers. In contrast, the ULBL methodology was found to calculate more appropriate TTVs that were generally more conservative than the NBCs. TTVs indicate what a "typical" concentration of an element would be within a defined geographical area and should be considered alongside the risk that each of the elements pose in these areas to determine potential risk to receptors.
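
For readers unfamiliar with the ECDF mentioned above, the following sketch shows how an empirical cumulative distribution function is computed from a set of measured concentrations. The function name is hypothetical, and how breaks in the curve are turned into domains is not described in the abstract, so that step is not attempted here.

```python
import numpy as np


def ecdf(values):
    """Return sorted values and the fraction of observations <= each value."""
    x = np.sort(np.asarray(values, dtype=float))
    y = np.arange(1, len(x) + 1) / len(x)
    return x, y
```

Plotting `y` against `x` (often with a logarithmic x-axis for geochemical data) gives the curve in which changes of slope can be inspected.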

Relevance:

60.00%

Publisher:

Abstract:

We present a novel method for the light-curve characterization of Pan-STARRS1 Medium Deep Survey (PS1 MDS) extragalactic sources into stochastic variables (SVs) and burst-like (BL) transients, using multi-band image-differencing time-series data. We select detections in difference images associated with galaxy hosts using a star/galaxy catalog extracted from the deep PS1 MDS stacked images, and adopt a maximum a posteriori formulation to model their difference-flux time-series in four Pan-STARRS1 photometric bands gP1, rP1, iP1, and zP1. We use three deterministic light-curve models to fit BL transients; a Gaussian, a Gamma distribution, and an analytic supernova (SN) model, and one stochastic light-curve model, the Ornstein-Uhlenbeck process, in order to fit variability that is characteristic of active galactic nuclei (AGNs). We assess the quality of fit of the models band-wise and source-wise, using their estimated leave-out-one cross-validation likelihoods and corrected Akaike information criteria. We then apply a K-means clustering algorithm on these statistics, to determine the source classification in each band. The final source classification is derived as a combination of the individual filter classifications, resulting in two measures of classification quality, from the averages across the photometric filters of (1) the classifications determined from the closest K-means cluster centers, and (2) the square distances from the clustering centers in the K-means clustering spaces. For a verification set of AGNs and SNe, we show that SV and BL occupy distinct regions in the plane constituted by these measures. We use our clustering method to characterize 4361 extragalactic image difference detected sources, in the first 2.5 yr of the PS1 MDS, into 1529 BL, and 2262 SV, with a purity of 95.00% for AGNs, and 90.97% for SN based on our verification sets. 
We combine our light-curve classifications with their nuclear or off-nuclear host galaxy offsets, to define a robust photometric sample of 1233 AGNs and 812 SNe. With these two samples, we characterize their variability and host galaxy properties, and identify simple photometric priors that would enable their real-time identification in future wide-field synoptic surveys.
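
The corrected Akaike information criterion used above as one of the fit-quality statistics has a standard closed form, AICc = 2k - 2 ln L + 2k(k + 1)/(n - k - 1), where k is the number of fitted parameters, n the number of data points and L the maximised likelihood. The sketch below only evaluates that formula; the function name is hypothetical, and the way the authors feed the statistic into K-means is not reproduced.

```python
def aicc(log_likelihood, n_params, n_obs):
    """Corrected Akaike information criterion for a fitted model."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc needs n_obs > n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    correction = 2 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    return aic + correction
```

Lower values indicate a better trade-off between goodness of fit and model complexity, so the statistic can be compared across the competing light-curve models for the same band.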

Relevance:

60.00%

Publisher:

Abstract:

Energy efficiency is an essential requirement for all contemporary computing systems. We thus need tools to measure the energy consumption of computing systems and to understand how workloads affect it. Significant recent research effort has targeted direct power measurements on production computing systems using on-board sensors or external instruments. These direct methods have in turn guided studies of software techniques to reduce energy consumption via workload allocation and scaling. Unfortunately, direct energy measurements are hampered by the low power sampling frequency of power sensors. The coarse granularity of power sensing limits our understanding of how power is allocated in systems and our ability to optimize energy efficiency via workload allocation.
We present ALEA, a tool to measure power and energy consumption at the granularity of basic blocks, using a probabilistic approach. ALEA provides fine-grained energy profiling via statistical sampling, which overcomes the limitations of power sensing instruments. Compared to state-of-the-art energy measurement tools, ALEA provides finer granularity without sacrificing accuracy. ALEA achieves low overhead energy measurements with mean error rates between 1.4% and 3.5% in 14 sequential and parallel benchmarks tested on both Intel and ARM platforms. The sampling method caps execution time overhead at approximately 1%. ALEA is thus suitable for online energy monitoring and optimization. Finally, ALEA is a user-space tool with a portable, machine-independent sampling method. We demonstrate two use cases of ALEA, where we reduce the energy consumption of a k-means computational kernel by 37% and an ocean modelling code by 33%, compared to high-performance execution baselines, by varying the power optimization strategy between basic blocks.
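
As a rough illustration of how statistical sampling can attribute energy to code regions, the sketch below is a generic sampling estimator and not ALEA's actual algorithm. It assumes a list of samples taken at random instants during a run, each recording which block was executing and the power reading at that moment. The function name and the input layout are hypothetical.

```python
from collections import defaultdict


def estimate_block_energy(samples, total_time_s):
    """Estimate energy in joules per block from (block_id, power_watts) samples."""
    count = defaultdict(int)
    power_sum = defaultdict(float)
    n = 0
    for block, power in samples:
        count[block] += 1
        power_sum[block] += power
        n += 1
    if n == 0:
        raise ValueError("at least one sample is required")
    energy = {}
    for block in count:
        # Share of samples that hit the block estimates its share of run time.
        time_s = total_time_s * count[block] / n
        mean_power = power_sum[block] / count[block]
        energy[block] = time_s * mean_power
    return energy
```

The estimate improves with the number of samples, which is the usual trade-off against the sampling overhead.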

Relevance:

60.00%

Publisher:

Abstract:

This paper examines the use of trajectory distance measures and clustering techniques to define normal and abnormal trajectories in the context of pedestrian tracking in public spaces. In order to detect abnormal trajectories, what is meant by a normal trajectory in a given scene is first defined. Then every trajectory that deviates from this normality is classified as abnormal. By combining Dynamic Time Warping and a modified K-Means algorithm for arbitrary-length data series, we have developed an algorithm for trajectory clustering and abnormality detection. The final system performs with an overall accuracy of 83% and 75% when tested on two different standard datasets.
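
Dynamic Time Warping is what lets trajectories of different lengths be compared. The sketch below is the textbook dynamic-programming form of the distance for 2D trajectories. It is not the authors' modified K-Means, and the function name and the choice of Euclidean point distance are assumptions.

```python
import math


def dtw_distance(traj_a, traj_b):
    """Dynamic Time Warping distance between two sequences of (x, y) points."""
    n, m = len(traj_a), len(traj_b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of the first i points of a and first j of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(traj_a[i - 1], traj_b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # repeat a point of b
                                    cost[i][j - 1],      # repeat a point of a
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]
```

A trajectory whose DTW distance to every cluster of normal trajectories exceeds some threshold could then be flagged as abnormal. The threshold itself would have to be chosen per scene.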

Relevance:

30.00%

Publisher:

Abstract:

The results of PVT measurements of the liquid phase within the temperature range of (298 to 393) K and up to 35 MPa are presented for some aliphatic esters. Measurements were made by means of a vibrating-tube densimeter, model DMA 512P from Anton Paar. The calibration of the densimeter was performed with water and n-heptane as reference fluids. The experimental PVT data have been correlated by a Tait equation. This equation gives excellent results when used to predict the density of the esters using the method proposed by Thomson et al. (AIChE J. 1982, 28, 671-676). Isothermal compressibilities, isobaric expansivities, thermal pressure coefficients, and changes in the isobaric heat capacity have been calculated from the volumetric data.
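
The abstract does not state which form of the Tait equation was fitted. One commonly used form, given here only as an assumption, expresses the density at pressure p relative to the density at a reference pressure p_0, with a constant C and a temperature-dependent parameter B(T). The derived properties listed above then follow from their standard definitions.

```latex
% Assumed Tait form (the paper's exact parameterisation is not given in the abstract)
\rho(T,p) = \frac{\rho_0(T)}{1 - C \ln\dfrac{B(T) + p}{B(T) + p_0}}

% Isothermal compressibility, by differentiating the form above at constant T
\kappa_T = \frac{1}{\rho}\left(\frac{\partial \rho}{\partial p}\right)_T
         = \frac{C}{\bigl(B(T) + p\bigr)\left(1 - C \ln\dfrac{B(T) + p}{B(T) + p_0}\right)}

% Isobaric expansivity and thermal pressure coefficient (standard definitions)
\alpha_p = -\frac{1}{\rho}\left(\frac{\partial \rho}{\partial T}\right)_p ,
\qquad
\gamma_V = \left(\frac{\partial p}{\partial T}\right)_V = \frac{\alpha_p}{\kappa_T}
```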

Relevance:

30.00%

Publisher:

Abstract:

We report the novel observation that engagement of β2 integrins on human neutrophils is accompanied by increased levels of the small GTPases Rap1 and Rap2 in a membrane-enriched fraction and a concomitant decrease of these proteins in a granule-enriched fraction. In parallel, we observed a similar time-dependent decrease of gelatinase B (a marker of specific and gelatinase B-containing granules) but not myeloperoxidase (a marker of azurophil granules) in the granule fraction, and release of lactoferrin (a marker of specific granules) in the extracellular medium. Furthermore, inhibition of Src tyrosine kinases, or phosphoinositide 3-kinase with PP1 or LY294002, respectively, blocked β2 integrin-induced degranulation and the redistribution of Rap1 and Rap2 to a membrane-enriched fraction. Consequently, the β2 integrin-dependent exocytosis of specific and gelatinase B-containing granules occurs via a Src tyrosine kinase/phosphoinositide 3-kinase signaling pathway and is responsible for the translocation of Rap1 and Rap2 to the plasma membrane in human neutrophils.

Relevance:

30.00%

Publisher:

Abstract:

TlCu2-xFexSe2 is a p-type metal for x < 0.5 which crystallizes in a body-centred tetragonal structure. The metal atoms are situated in ab-planes, ~7 Å apart, while the metal-metal distance within the plane is ~2.75 Å. Due to the large difference in cation distances, the solid solutions show magnetic properties of mainly two-dimensional character. The SQUID measurements performed for x = 0.27 give the c-axis as the easy axis of magnetization, but also show clear hysteresis effects at 10 K, indicating a partly ferromagnetic coupling. The magnetic ordering temperature Tc is 55(5) K as found from both SQUID and Mössbauer spectra. At T << Tc the magnetic hyperfine fields are distributed with a maximum at about 30 T, which are compared to the measured magnetic moment per iron atom, which is 0.97 μB/Fe as found from SQUID measurements. The experimental results are compared to results using other methods on isostructural Tl selenides.

Relevance:

30.00%

Publisher:

Abstract:

This paper investigates the uplink achievable rates of massive multiple-input multiple-output (MIMO) antenna systems in Ricean fading channels, using maximal-ratio combining (MRC) and zero-forcing (ZF) receivers, assuming perfect and imperfect channel state information (CSI). In contrast to previous relevant works, the fast fading MIMO channel matrix is assumed to have an arbitrary-rank deterministic component as well as a Rayleigh-distributed random component. We derive tractable expressions for the achievable uplink rate in the large-antenna limit, along with approximating results that hold for any finite number of antennas. Based on these analytical results, we obtain the scaling law that the users' transmit power should satisfy, while maintaining a desirable quality of service. In particular, it is found that regardless of the Ricean K-factor, in the case of perfect CSI, the approximations converge to the same constant value as the exact results, as the number of base station antennas, M, grows large, while the transmit power of each user can be scaled down proportionally to 1/M. If CSI is estimated with uncertainty, the same result holds true but only when the Ricean K-factor is non-zero. Otherwise, if the channel experiences Rayleigh fading, we can only cut the transmit power of each user proportionally to 1/√M. In addition, we show that with an increasing Ricean K-factor, the uplink rates will converge to fixed values for both MRC and ZF receivers.

Relevance:

30.00%

Publisher:

Abstract:

The water and sewerage sectors' combined emissions account for just over 1% of total UK emissions, while household water heating accounts for a further 5%. Energy use, particularly electricity, is the largest source of emissions in the sector. Water efficiency measures should therefore result in reduced emissions from a lower demand for water and wastewater treatment and pumping, as well as from decreased domestic water heating. Northern Ireland Water (NI Water) is actively pursuing measures to reduce its carbon footprint. This paper investigated the carbon impacts of implementing a household water efficiency programme in Northern Ireland. Assuming water savings of 59.6 L/prop/day and 15% uptake among households, carbon savings of 0.6% of NI Water's current net operational emissions are achievable from reduced treatment and pumping. Adding the carbon savings from reduced household water heating gives savings equivalent to 6.2% of current net operational emissions. Cost savings to NI Water are estimated as 300,000 per year. The cost of the water efficiency devices is approximately 1.6 million, but may be higher depending on the number of devices distributed relative to the number installed. This paper has shown clear carbon benefits to water efficiency, but further research is needed to examine social and cost impacts. © IWA Publishing 2013.