681 results for cluster computing
Abstract:
Spatial data mining has recently emerged from a number of real applications, such as real-estate marketing, urban planning, weather forecasting, medical image analysis, and road traffic accident analysis, and it demands efficient solutions to many new, expensive, and complicated problems. In this paper, we investigate the problem of evaluating the top k distinguished “features” for a “cluster” based on weighted proximity relationships between the cluster and the features. We measure proximity in an average fashion to address possibly non-uniform data distribution within a cluster. Combining a standard multi-step paradigm with new lower and upper proximity bounds, we present an efficient algorithm to solve the problem. The algorithm is implemented in several different modes. Our experimental results not only compare these modes but also illustrate the efficiency of the algorithm.
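The abstract does not reproduce the algorithm, but the multi-step paradigm it refers to is the classic filter-and-refine pattern: cheap bounds prune candidates before the expensive exact measure is computed. A minimal illustrative Python sketch, in which `features`, `lower_bound`, `upper_bound`, and `exact_score` are hypothetical stand-ins for the paper's proximity bounds and exact proximity measure:

```python
import heapq

def top_k_features(features, k, lower_bound, upper_bound, exact_score):
    """Multi-step filter-and-refine: prune features whose cheap upper
    bound cannot beat the k-th best lower bound, then rank the
    survivors with the expensive exact proximity measure."""
    # Step 1: cheap lower/upper bounds for every candidate feature.
    bounds = [(f, lower_bound(f), upper_bound(f)) for f in features]
    # The k-th largest lower bound is the pruning threshold.
    threshold = heapq.nlargest(k, (lb for _, lb, _ in bounds))[-1]
    # Step 2: keep only features whose upper bound can still qualify.
    survivors = [f for f, _, ub in bounds if ub >= threshold]
    # Step 3: exact (expensive) evaluation on the reduced candidate set.
    return heapq.nlargest(k, survivors, key=exact_score)
```

The point of the bounds is that the exact proximity, which requires touching the cluster's points, is computed only for features that could still enter the top k.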
Abstract:
We develop a simplified implementation of the Hoshen-Kopelman cluster counting algorithm adapted for honeycomb networks. In our implementation of the algorithm we assume that all nodes in the network are occupied and that links between nodes can be intact or broken. The algorithm counts how many clusters there are in the network and determines which nodes belong to each cluster. The network information is stored in two data sets: the first describes the connectivity of the nodes and the second the state of the links. The algorithm finds all clusters in a single scan across the network, after which cluster relabeling operates on a vector whose size is much smaller than the size of the network. By counting the number of clusters of each size, the algorithm determines the cluster size probability distribution, from which the mean cluster size parameter can be estimated. Although our implementation of the Hoshen-Kopelman algorithm works only for networks with a honeycomb (hexagonal) structure, it can easily be adapted to networks with arbitrary connectivity between the nodes (triangular, square, etc.). The proposed adaptation of the Hoshen-Kopelman cluster counting algorithm is applied to studying the thermal degradation of a graphene-like honeycomb membrane by means of Molecular Dynamics simulation with a Langevin thermostat. ACM Computing Classification System (1998): F.2.2, I.5.3.
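As a rough illustration of the idea (not the paper's honeycomb-specific code), the core of Hoshen-Kopelman is union-find label merging over the intact links, with relabeling confined to the label vector rather than the network itself. A minimal Python sketch, assuming the network is given as a node count plus a list of intact links:

```python
from collections import Counter

def find(labels, i):
    # Path-halving root lookup on the label vector.
    while labels[i] != i:
        labels[i] = labels[labels[i]]
        i = labels[i]
    return i

def hoshen_kopelman(n_nodes, intact_links):
    """One scan over the intact links merges node labels; a second
    pass canonicalizes them, giving every node its cluster id."""
    labels = list(range(n_nodes))
    for a, b in intact_links:          # single scan across the network
        ra, rb = find(labels, a), find(labels, b)
        labels[max(ra, rb)] = min(ra, rb)
    roots = [find(labels, i) for i in range(n_nodes)]
    sizes = Counter(roots)             # cluster id -> cluster size
    mean_size = sum(s * s for s in sizes.values()) / n_nodes
    return roots, sizes, mean_size
```

For an actual honeycomb membrane, `intact_links` would be derived from the lattice geometry and the broken-bond state; the mean cluster size computed here is the standard second moment of the size distribution, Σ s²·n_s / N, valid when all N nodes are occupied.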
Abstract:
In this paper we evaluate and compare two representative and popular distributed processing engines for large-scale big data analytics: Spark and the graph-based engine GraphLab. We design a benchmark suite including representative algorithms and datasets to compare the performance of the computing engines in terms of running time, memory and CPU usage, and network and I/O overhead. The benchmark suite is tested on both a local computer cluster and virtual machines in the cloud. By varying the number of computers and the amount of memory, we examine the scalability of the computing engines with increasing computing resources (such as CPU and memory). We also run a cross-evaluation of generic and graph-based analytic algorithms over graph-processing and generic platforms to identify the potential performance degradation if only one processing engine is available. We observe that both computing engines scale well with increasing computing resources. While GraphLab largely outperforms Spark for graph algorithms, its running time is close to Spark's for non-graph algorithms. Additionally, the running time of Spark for graph algorithms on cloud virtual machines is observed to increase by almost 100% compared to local computer clusters.
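For context, a benchmark of this kind boils down to timing the same workload on each engine under identical resources. A minimal PySpark-flavoured sketch of such a timing harness follows; the dataset path and the word-count workload are illustrative assumptions, not the paper's suite:

```python
import time
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("bench-wordcount").getOrCreate()
sc = spark.sparkContext

def timed(label, fn):
    # Wall-clock timing of one benchmark workload.
    t0 = time.perf_counter()
    result = fn()
    print(f"{label}: {time.perf_counter() - t0:.2f} s")
    return result

lines = sc.textFile("hdfs:///data/corpus.txt")   # hypothetical dataset path
counts = timed("wordcount", lambda: (
    lines.flatMap(str.split)                     # line -> words
         .map(lambda w: (w, 1))
         .reduceByKey(lambda a, b: a + b)        # shuffle + aggregate
         .count()))
spark.stop()
```

The same workload, re-run while varying the number of executors and their memory, yields the scalability curves the paper reports.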
Abstract:
Current trends in broadband mobile networks point towards placing different capabilities at the edge of the mobile network in a centralised way. On one hand, the split of the eNB between baseband processing units and remote radio heads makes it possible to process some of the protocols in centralised premises, likely with virtualised resources. On the other hand, mobile edge computing makes use of processing and storage capabilities close to the air interface in order to deploy optimised services with minimum delay. The confluence of both trends is a hot topic in the definition of future 5G networks. The full centralisation of both technologies in cloud data centres imposes stringent requirements on the fronthaul connections in terms of throughput and latency. Therefore, cells with limited network access would not be able to offer these types of services. This paper proposes a solution for these cases, based on placing processing and storage capabilities close to the remote units, which is especially well suited to the deployment of clusters of small cells. The proposed cloud-enabled small cells include a highly efficient microserver with a limited set of virtualised resources offered to the cluster of small cells. As a result, a light data centre is created and used in common for deploying centralised eNB and mobile edge computing functionalities. The paper covers the proposed architecture, with special focus on the integration of both aspects, and possible scenarios of application.
Abstract:
Modern scientific discoveries are driven by an insatiable demand for computational resources. High-Performance Computing (HPC) systems aggregate computing power to deliver considerably higher performance than a typical desktop computer can provide, in order to solve large problems in science, engineering, or business. An HPC room in a datacenter is a complex controlled environment that hosts thousands of computing nodes consuming electrical power in the megawatt range, virtually all of which is transformed into heat. Although a datacenter contains sophisticated cooling systems, our studies provide quantitative evidence of thermal bottlenecks under real-life production workloads, showing significant spatial and temporal heterogeneity in both temperature and power. Minor thermal issues or anomalies can therefore start a chain of events that unbalances the heat generated by the computing nodes against the heat removed by the cooling system, giving rise to thermal hazards. Although thermal anomalies are rare events, detecting or predicting them in time is vital to avoid damage to IT and facility equipment and outages of the datacenter, with severe societal and business losses. For this reason, automated approaches to detecting thermal anomalies in datacenters have considerable potential. This thesis analyzes and characterizes the power and thermal behaviour of a Tier0 datacenter (CINECA) during production and under abnormal thermal conditions. A Deep Learning (DL)-powered thermal hazard prediction framework is then proposed. The proposed models are validated against real thermal hazard events reported for the studied HPC cluster while in production. To the best of my knowledge, this thesis is the first empirical study of thermal anomaly detection and prediction techniques on a real large-scale HPC system. For this thesis, I used a large-scale dataset of monitoring data from tens of thousands of sensors, covering around 24 months at a sampling interval of around 20 seconds.
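The abstract does not specify the model architecture, so purely as an illustration of DL-based anomaly detection on sensor telemetry, here is a minimal PyTorch sketch: an autoencoder trained on windows of normal sensor readings, with the reconstruction error used as the anomaly score. All layer sizes and names are my assumptions, not the thesis' design:

```python
import torch
import torch.nn as nn

class ThermalAE(nn.Module):
    """Dense autoencoder over one window of per-node sensor readings."""
    def __init__(self, n_sensors, latent=8):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_sensors, 32), nn.ReLU(),
                                 nn.Linear(32, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 32), nn.ReLU(),
                                 nn.Linear(32, n_sensors))

    def forward(self, x):
        return self.dec(self.enc(x))

def anomaly_scores(model, windows):
    # High reconstruction error -> reading pattern unlike normal operation.
    with torch.no_grad():
        return ((model(windows) - windows) ** 2).mean(dim=1)
```

Training would minimise the same reconstruction error on anomaly-free periods; a threshold on the score (for instance, a high percentile of validation-set errors) then flags candidate thermal anomalies.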
Abstract:
Embedding intelligence in extreme edge devices allows distilling raw data acquired from sensors into actionable information directly on IoT end-nodes. This computing paradigm, in which end-nodes no longer depend entirely on the Cloud, offers undeniable benefits and drives a large research area (TinyML) aimed at deploying leading Machine Learning (ML) algorithms on micro-controller-class devices. To fit the limited memory storage capability of these tiny platforms, full-precision Deep Neural Networks (DNNs) are compressed by representing their data in byte and sub-byte integer formats, yielding Quantized Neural Networks (QNNs). However, the current generation of micro-controller systems can barely cope with the computing requirements of QNNs. This thesis tackles the challenge from many perspectives, presenting solutions at both the software and hardware levels and exploiting parallelism, heterogeneity, and software programmability to guarantee high flexibility and high energy-performance proportionality. The first contribution, PULP-NN, is an optimized software computing library for QNN inference on parallel ultra-low-power (PULP) clusters of RISC-V processors, showing one order of magnitude improvement in performance and energy efficiency compared to current State-of-the-Art (SoA) STM32 micro-controller systems (MCUs) based on ARM Cortex-M cores. The second contribution is XpulpNN, a set of RISC-V domain-specific instruction set architecture (ISA) extensions for sub-byte integer arithmetic. The solution, including the ISA extensions and the micro-architecture to support them, achieves energy efficiency comparable to dedicated DNN accelerators and surpasses the efficiency of SoA ARM Cortex-M based MCUs, such as the low-end STM32M4 and the high-end STM32H7 devices, by up to three orders of magnitude. To overcome the Von Neumann bottleneck while guaranteeing the highest flexibility, the final contribution integrates an Analog In-Memory Computing accelerator into the PULP cluster, creating a fully programmable heterogeneous fabric that demonstrates end-to-end inference of SoA MobileNetV2 models, with two orders of magnitude performance improvement over current SoA analog/digital solutions.
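To make the sub-byte arithmetic concrete, here is an illustrative NumPy sketch of int4 packing and a widened dot product; this is the kind of operation that XpulpNN-style ISA extensions execute natively, instead of paying the unpack/compute/repack overhead in software. The helper names are mine, not the library's:

```python
import numpy as np

def pack_int4(vals):
    """Pack signed 4-bit values (range [-8, 7]) two per byte.
    Assumes an even number of values."""
    u = (np.asarray(vals, dtype=np.int8) & 0x0F).astype(np.uint8)
    return (u[0::2] | (u[1::2] << 4)).astype(np.uint8)

def unpack_int4(packed):
    lo = (packed & 0x0F).astype(np.int8)
    hi = (packed >> 4).astype(np.int8)
    # Sign-extend the 4-bit fields.
    lo[lo > 7] -= 16
    hi[hi > 7] -= 16
    return np.stack([lo, hi], axis=1).reshape(-1)

def dot_int4(pa, pb):
    # Widen to int32 accumulators, as a hardware MAC unit would.
    return int(np.dot(unpack_int4(pa).astype(np.int32),
                      unpack_int4(pb).astype(np.int32)))
```

For example, `dot_int4(pack_int4([1, -2, 3, 7]), pack_int4([2, 2, -1, 1]))` evaluates to 2. A convolution inner loop in a QNN spends essentially all of its time in dot products of this shape, which is why native sub-byte MAC instructions pay off.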
Abstract:
Over the years, research efforts in High-Performance Computing have produced important results in raising performance, both in terms of the number of operations executed per unit of time and by introducing or improving the parallel algorithms found in the literature. These milestones have brought changes to the internal structure of the machines: the architectures of the processors employed have evolved, and GPUs have been adopted as additional computing resources. The consequence of a continuous increase in performance is having to cope with a large energy expenditure, since the machines used in HPC are designed to carry out intense computational activity over very long periods of time; the energy needed to power each node and dissipate the generated heat entails high costs. Among the various solutions proposed to limit energy consumption, the one that has attracted the most interest, both in research and on the market, is the integration of RISC (Reduced Instruction Set Computer) CPUs, since they can achieve satisfactory performance with a lower energy budget than CISC (Complex Instruction Set Computer) CPUs. This thesis presents a performance analysis of Monte Cimone, a cluster composed of 8 compute nodes based on the RISC-V architecture and distributed across 4 dual-board blades. Benchmarks are run to evaluate: the performance of long- and short-distance data exchange; the performance on problems with low spatial locality; and the performance on graph problems, specifically breadth-first search and single-source shortest paths.
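As an illustration of the last of those kernels, a serial breadth-first search, the heart of Graph500-style graph benchmarks, can be sketched in a few lines of Python. The real benchmark partitions the graph across the cluster's nodes and reports traversed edges per second (TEPS), which this single-node sketch does not attempt:

```python
from collections import deque

def bfs(adj, source):
    """Level-order traversal over an adjacency-list dict, returning the
    BFS parent tree (node -> parent); the source is its own parent."""
    parent = {source: source}
    frontier = deque([source])
    while frontier:
        u = frontier.popleft()
        for v in adj[u]:
            if v not in parent:     # first visit fixes the parent
                parent[v] = u
                frontier.append(v)
    return parent
```

Kernels like this stress exactly what the thesis measures: irregular memory access with little spatial locality, and, in the distributed version, fine-grained communication between nodes.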
Abstract:
The new social panorama resulting from the aging of the Brazilian population is leading to significant transformations within healthcare. Using a cluster analysis strategy, we sought to describe the specific care demands of the elderly population in terms of frailty components. This was a cross-sectional study based on a review of medical records, conducted at the geriatric outpatient clinic of Hospital de Clínicas, Universidade Estadual de Campinas (Unicamp). Ninety-eight elderly users of this clinic were evaluated using cluster analysis and instruments for assessing their overall geriatric status and frailty characteristics. The variables that most strongly influenced the formation of clusters were age, functional capacity, cognitive capacity, presence of comorbidities, and number of medications used. Three main groups of elderly people could be identified: one with good cognitive and functional performance but a high prevalence of comorbidities (mean age 77.9 years, cognitive impairment in 28.6%, and a mean of 7.4 comorbidities); a second with more advanced age, greater cognitive impairment, and greater dependence (mean age 88.5 years, cognitive impairment in 84.6%, and a mean of 7.1 comorbidities); and a third, younger group with poor cognitive performance and a greater number of comorbidities but functionally independent (mean age 78.5 years, cognitive impairment in 89.6%, and a mean of 7.4 comorbidities). These data characterize the profile of this population and can be used as the basis for developing efficient strategies aimed at diminishing functional dependence, poor self-rated health, and impaired quality of life.
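The study does not publish its clustering code, but the analysis pattern it describes (standardize the influential variables, partition patients into three groups, profile each group) looks roughly like the scikit-learn sketch below. The file name, column layout, and the choice of k-means are my assumptions, not the paper's method:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# X: one row per patient, columns (hypothetical order):
#    [age, functional score, cognitive score, n_comorbidities, n_medications]
X = np.loadtxt("geriatric_records.csv", delimiter=",")  # hypothetical file
Xz = StandardScaler().fit_transform(X)   # put all variables on one scale
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(Xz)
for g in range(3):                        # profile each cluster
    print(f"cluster {g}: n={np.sum(labels == g)}, "
          f"mean age={X[labels == g, 0].mean():.1f}")
```

Standardizing first matters: without it, variables on large scales (such as age in years) would dominate the distance metric and hence the clusters.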
Abstract:
The [Ru3O(Ac)6(py)2(CH3OH)]+ cluster provides an effective electrocatalytic species for the oxidation of methanol under mild conditions. This complex exhibits characteristic electrochemical waves at -1.02, 0.15 and 1.18 V, associated with the successive Ru3(III,II,II)/Ru3(III,III,II), Ru3(III,III,II)/Ru3(III,III,III) and Ru3(III,III,III)/Ru3(IV,III,III) redox couples, respectively. Above 1.7 V, formation of two Ru(IV) centers enhances the 2-electron oxidation of the methanol ligand, yielding formaldehyde, in agreement with the theoretical evolution of the HOMO levels as a function of the oxidation states. This work illustrates an important strategy for improving the efficiency of oxidation catalysis: using a multicentered redox catalyst and accessing its multiple higher oxidation states.
Abstract:
Background: Identifying clusters of acute paracoccidioidomycosis cases could potentially help in identifying the environmental factors that influence the incidence of this mycosis. However, unlike for other endemic mycoses, there are no published reports of clusters of paracoccidioidomycosis. Methodology/Principal Findings: A retrospective cluster detection test was applied to verify whether an excess of acute-form (AF) paracoccidioidomycosis cases occurred in time and/or space in Botucatu, an endemic area in São Paulo State. The SaTScan v7.0.3 scan test was set to find clusters within a maximum temporal window of 1 year. The temporal test indicated a significant cluster in 1985 (P<0.005). This cluster comprised 10 cases, whereas 2.19 were expected for this year in this area. The age and clinical presentation of these cases were typical of AF paracoccidioidomycosis. The space-time test confirmed the temporal cluster in 1985 and showed the localities where the risk was higher in that year. The cluster suggests that some particular conditions arose in the preceding years in those localities. Analysis of climate variables showed that soil water storage was atypically high in 1982/83 (~2.11/2.5 SD above the mean) and that the absolute air humidity in 1984, the year preceding the cluster, was much higher than normal (~1.6 SD above the mean), conditions that may have favored, respectively, antecedent fungal growth in the soil and conidia liberation in 1984, the probable year of exposure. These climatic anomalies in this area were due to the 1982/83 El Niño event, the strongest in the last 50 years. Conclusions/Significance: We describe the first cluster of AF paracoccidioidomycosis, which was potentially linked to a climatic anomaly caused by the 1982/83 El Niño Southern Oscillation. This finding is important because it may help to clarify the conditions that favor Paracoccidioides brasiliensis survival and growth in the environment and that enhance human exposure, thus allowing the development of preventive measures.
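For readers unfamiliar with scan statistics, the temporal test reported here rests on a Poisson likelihood-ratio comparison of observed versus expected cases in each candidate window (10 observed versus 2.19 expected in 1985). A simplified, purely temporal Kulldorff-style sketch in Python; SaTScan itself also scans spatial and space-time windows and derives p-values by Monte Carlo replication:

```python
import numpy as np

def poisson_llr(c, E, C):
    """Kulldorff's Poisson log-likelihood ratio for a window with c
    observed and E expected cases, out of C total cases."""
    if c <= E:
        return 0.0                     # only excesses count as clusters
    inside = c * np.log(c / E)
    outside = (C - c) * np.log((C - c) / (C - E)) if C > c else 0.0
    return inside + outside

def temporal_scan(cases_per_year, population_per_year):
    """Score each 1-year window; expectation is proportional to the
    population at risk under the uniform-risk null hypothesis."""
    C = sum(cases_per_year)
    P = sum(population_per_year)
    expected = [C * p / P for p in population_per_year]
    scores = [poisson_llr(c, E, C) for c, E in zip(cases_per_year, expected)]
    return int(np.argmax(scores)), max(scores)  # most likely cluster year
```

The year with the highest ratio is the "most likely cluster"; its significance is assessed by rerunning the scan on many randomized datasets.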
Abstract:
Context. Abundance variations in moderately metal-rich globular clusters can give clues about the formation and chemical enrichment of globular clusters. Aims. CN, CH, Na, Mg, and Al indices are measured in spectra of 89 stars of the template metal-rich globular cluster M71, and the implications for internal mixing are discussed. Methods. Stars from the turn-off up to the Red Giant Branch (0.87 < log g < 4.65), observed with the GMOS multi-object spectrograph at the Gemini-North telescope, are analyzed. Radial velocities, colours, effective temperatures, gravities, and spectral indices are determined for the sample. Results. Previous findings of CN bimodality and a CN-CH anticorrelation in stars of M71 are confirmed. We also find CN-Na and Al-Na correlations, as well as an Mg2-Al anticorrelation. Conclusions. A combination of convective mixing and primordial pollution by AGB or massive stars in the early stages of globular cluster formation is required to explain the observations.
Abstract:
Context. It is not known how many globular clusters may remain undetected towards the Galactic bulge. Aims. One of the aims of the VISTA Variables in the Via Lactea (VVV) Survey is to accurately measure the physical parameters of the known globular clusters in the inner regions of the Milky Way and to search for new ones, hidden in regions of large extinction. Methods. From deep near-infrared images, we derive deep JHKs-band photometry of a region surrounding the known globular cluster UKS 1 and reveal a new low-mass globular cluster candidate that we name VVV CL001. Results. We use the horizontal-branch red clump to measure E(B-V) ≈ 2.2 mag, (m - M)_0 = 16.01 mag, and D = 15.9 kpc for the globular cluster UKS 1. On the basis of near-infrared colour-magnitude diagrams, we also find that VVV CL001 has E(B-V) ≈ 2.0 and is at least as metal-poor as UKS 1, although its distance remains uncertain. Conclusions. Our finding confirms the prediction that the central region of the Milky Way harbours more globular clusters. VVV CL001 and UKS 1 are good candidates for a physical binary cluster pair, but follow-up observations are needed to determine whether they are located at the same distance and have similar radial velocities.
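As a quick consistency check, the quoted distance follows directly from the dereddened distance modulus:

$$ D = 10^{(m-M)_0/5 + 1}\ \mathrm{pc} = 10^{16.01/5 + 1}\ \mathrm{pc} \approx 1.59 \times 10^{4}\ \mathrm{pc} \approx 15.9\ \mathrm{kpc}. $$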
Abstract:
The A1763 superstructure at z = 0.23 contains the first galaxy filament to be directly detected using mid-infrared observations. Our previous work has shown that the frequency of starbursting galaxies, as characterized by 24 μm emission, is much higher within the filament than either at the center of the rich galaxy cluster or in the field surrounding the system. New Very Large Array and XMM-Newton data are presented here. We use the radio and X-ray data to examine the fraction and location of active galaxies, both active galactic nuclei (AGNs) and starbursts (SBs). The radio/far-infrared correlation, X-ray point-source locations, IRAC colors, and quasar positions are all used to gain an understanding of the presence of dominant AGNs. We find very few MIPS-selected galaxies that are clearly dominated by AGN activity. Most radio-selected members within the filament are SBs. Within the supercluster, three of the eight spectroscopic members detected both in the radio and in the mid-infrared are radio-bright AGNs. They are found at or near the core of A1763. The five SBs are located further along the filament. We calculate the physical properties of the known wide-angle tail (WAT) source, which is the brightest cluster galaxy of A1763. A second double-lobed source is found along the filament, well outside the virial radius of either cluster. The velocity offset of the WAT from the X-ray centroid and the bend of the WAT in the intracluster medium are both consistent with ram pressure stripping, indicative of streaming motions along the direction of the filament. We consider this further evidence of the cluster-feeding nature of the galaxy filament.
Abstract:
The mass function of cluster-size halos and their redshift distribution are computed for 12 distinct accelerating cosmological scenarios and confronted with the predictions of the conventional flat Lambda CDM model. The comparison with Lambda CDM is performed in two steps. First, we determine the free parameters of all models through a joint analysis involving the latest cosmological data: type Ia supernovae, the cosmic microwave background shift parameter, and baryon acoustic oscillations. Apart from a braneworld-inspired cosmology, we find that the derived Hubble relations of the remaining models reproduce the Lambda CDM results with approximately the same degree of statistical confidence. Second, in order to distinguish the different dark energy models from the expectations of Lambda CDM, we analyze the predicted cluster-size halo redshift distribution on the basis of two future cluster surveys: (i) an X-ray survey based on the eROSITA satellite, and (ii) a Sunyaev-Zel'dovich survey based on the South Pole Telescope. We find that the predictions of 8 of the 12 dark energy models can be clearly distinguished from the Lambda CDM cosmology, while the predictions of the remaining 4 are statistically equivalent to those of the Lambda CDM model, as far as the expected cluster mass function and redshift distribution are concerned. The present analysis suggests that this technique is highly competitive with independent tests probing the late-time evolution of the Universe and the associated dark energy effects.
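The abstract does not name the mass function adopted; analyses of this kind typically start from a Press-Schechter-type expression (often with a refined multiplicity function such as Sheth-Tormen), in which the comoving number density of halos of mass $M$ is driven by the variance $\sigma(M)$ of the linear density field:

$$ \frac{dn}{dM} = \sqrt{\frac{2}{\pi}}\,\frac{\bar\rho}{M^{2}}\,\frac{\delta_c}{\sigma(M)} \left|\frac{d\ln\sigma}{d\ln M}\right| \exp\!\left[-\frac{\delta_c^{2}}{2\sigma^{2}(M)}\right], $$

where $\bar\rho$ is the mean background matter density and $\delta_c \simeq 1.686$ is the linearly extrapolated collapse threshold. Each dark energy model enters through its growth factor inside $\sigma(M, z)$, which is what lets cluster counts discriminate between models that share nearly identical Hubble relations.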
Abstract:
We discuss the properties of homogeneous and isotropic flat cosmologies in which the present accelerating stage is powered only by the gravitationally induced creation of cold dark matter (CCDM) particles (Omega_m = 1). For some matter creation rates proposed in the literature, we show that the main cosmological functions, such as the scale factor of the universe, the Hubble expansion rate, the growth factor, and the cluster formation rate, are analytically defined. The best CCDM scenario has only one free parameter, and our joint analysis involving baryon acoustic oscillations + cosmic microwave background (CMB) + SNe Ia data yields tilde-Omega_m = 0.28 +/- 0.01 (1 sigma), where tilde-Omega_m is the observed matter density parameter. In particular, this implies that the model has no dark energy, but the part of the matter that is effectively clustering is in good agreement with the latest determinations from large-scale structure. The growth of perturbations and the formation of galaxy clusters in such scenarios are also investigated. Despite the fact that both scenarios may share the same Hubble expansion, we find that matter creation cosmologies predict stronger small-scale dynamics, implying a faster growth rate of perturbations than in the usual Lambda CDM cosmology. These results point to the possibility of a crucial observational test confronting CCDM with Lambda CDM scenarios through a more detailed analysis involving CMB, weak lensing, and large-scale structure data.
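For reference, the CCDM framework referred to here modifies the standard fluid description: particle creation at a rate $\Gamma$ adds a source term to the matter continuity equation and a negative creation pressure, which is what drives the acceleration. In standard notation:

$$ \dot\rho_m + 3H\rho_m = \Gamma\,\rho_m, \qquad p_c = -\frac{\Gamma}{3H}\,\rho_m . $$

In one-free-parameter scenarios of this kind (e.g. that of Lima, Jesus & Oliveira), taking $\Gamma = 3\alpha H\,\rho_{c0}/\rho_m$ gives $\rho_m = \rho_{c0}\left[\alpha + (1-\alpha)\,a^{-3}\right]$, so the Hubble expansion is formally identical to that of flat Lambda CDM with tilde-Omega_m $= 1-\alpha$; the models then differ only at the perturbative level, as the abstract notes.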