Digital Library

791 results for cluster algorithms

THE VORONOI TESSELLATION CLUSTER FINDER IN 2+1 DIMENSIONS

Relevance:

30.00%

Publisher:

Abstract:

We present a detailed description of the Voronoi Tessellation (VT) cluster finder algorithm in 2+1 dimensions, which improves on past implementations of this technique. The need for cluster finder algorithms able to produce reliable cluster catalogs up to redshift 1 or beyond and down to 10^13.5 solar masses is paramount, especially in light of upcoming surveys aiming at cosmological constraints from galaxy cluster number counts. We build the VT in photometric redshift shells and use the two-point correlation function of the galaxies in the field both to determine the density threshold for detection of cluster candidates and to establish their significance. This allows us to detect clusters in a self-consistent way without any assumptions about their astrophysical properties. We apply the VT to mock catalogs which extend to redshift 1.4, reproducing the ΛCDM cosmology and the clustering properties observed in the Sloan Digital Sky Survey data. An objective estimate of the cluster selection function in terms of the completeness and purity as a function of mass and redshift is as important as having a reliable cluster finder. We measure these quantities by matching the VT cluster catalog with the mock truth table. We show that the VT can produce a cluster catalog with completeness and purity > 80% for the redshift range up to ~1 and mass range down to ~10^13.5 solar masses.

Towards improving cluster-based feature selection with a simplified silhouette filter

Relevance:

30.00%

Publisher:

Abstract:

This paper proposes a filter-based algorithm for feature selection. The filter is based on the partitioning of the set of features into clusters. The number of clusters, and consequently the cardinality of the subset of selected features, is automatically estimated from data. The computational complexity of the proposed algorithm is also investigated. A variant of this filter that considers feature-class correlations is also proposed for classification problems. Empirical results involving ten datasets illustrate the performance of the developed algorithm, which in general has obtained competitive results in terms of classification accuracy when compared to state-of-the-art algorithms that find clusters of features. We show that, if computational efficiency is an important issue, then the proposed filter may be preferred over its counterparts, thus becoming eligible to join a pool of feature selection algorithms to be used in practice. As an additional contribution of this work, a theoretical framework is used to formally analyze some properties of feature selection methods that rely on finding clusters of features. (C) 2011 Elsevier Inc. All rights reserved.
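The abstract does not define the simplified silhouette named in the title. As a rough illustration only, the sketch below uses the commonly cited centroid-based form: for each item, a is the distance to its own cluster centroid, b is the distance to the nearest other centroid, and the score is (b - a) / max(a, b), averaged over all items. The function names and the use of Euclidean distance are assumptions, not details taken from the paper, which clusters features rather than generic points.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def simplified_silhouette(points, labels):
    """Average centroid-based silhouette of a labelled partition (assumed form)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    cents = {lab: centroid(ps) for lab, ps in clusters.items()}

    total = 0.0
    for p, lab in zip(points, labels):
        a = math.dist(p, cents[lab])  # distance to own centroid
        b = min(math.dist(p, c) for other, c in cents.items() if other != lab)
        denom = max(a, b)
        total += 0.0 if denom == 0 else (b - a) / denom
    return total / len(points)
```

Because only distances to centroids are needed, the cost grows with the number of items times the number of clusters rather than with the square of the number of items, which is the usual motivation for this simplification.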

Multidimensional cluster stability analysis from a Brazilian Bradyrhizobium sp RFLP/PCR data set

Relevance:

30.00%

Publisher:

Abstract:

The taxonomy of the N2-fixing bacteria belonging to the genus Bradyrhizobium is still poorly refined, mainly due to conflicting results obtained by the analysis of the phenotypic and genotypic properties. This paper presents an application of a method aiming at the identification of possible new clusters within a Brazilian collection of 119 Bradyrhizobium strains showing phenotypic characteristics of B. japonicum and B. elkanii. The stability was studied as a function of the number of restriction enzymes used in the RFLP-PCR analysis of three ribosomal regions with three restriction enzymes per region. The method proposed here uses clustering algorithms with distances calculated by average-linkage clustering. The stability analysis is performed by introducing perturbations through sub-sampling techniques. The method showed efficacy in the grouping of the species B. japonicum and B. elkanii. Furthermore, two new clusters were clearly defined, indicating possible new species, and sub-clusters within each detected cluster. (C) 2008 Elsevier B.V. All rights reserved.

QK-Means: A clustering technique based on community detection and K-Means for deployment of cluster head nodes

Relevance:

30.00%

Publisher:

Abstract:

Wireless Sensor Networks (WSNs) are a special kind of ad-hoc network, usually deployed in a monitoring field in order to detect some physical phenomenon. Due to the low dependability of individual nodes, small radio coverage and large areas to be monitored, the organization of nodes in small clusters is generally used. Moreover, a large number of WSN nodes is usually deployed in the monitoring area to increase WSN dependability. Therefore, the best cluster head positioning is a desirable characteristic in a WSN. In this paper, we propose a hybrid clustering algorithm based on community detection in complex networks and the traditional K-means clustering technique: the QK-Means algorithm. Simulation results show that QK-Means detects communities and sub-communities, so that the lost message rate is decreased and WSN coverage is increased. © 2012 IEEE.
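For orientation, the traditional K-means step that the abstract builds on can be sketched as follows. This is plain Lloyd-style K-means only, not QK-Means: the community-detection stage is not described in the abstract and is not reproduced here. The function name, the naive "first k points" initialization and the Euclidean distance are assumptions made for the illustration.

```python
import math


def kmeans(points, k, iters=100):
    """Plain K-means; returns (centers, assignment of each point to a center)."""
    # Naive deterministic initialization (an assumption): the first k points.
    centers = [list(p) for p in points[:k]]
    assign = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_assign = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        if new_assign == assign:
            break  # converged: no point changed cluster
        assign = new_assign
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return centers, assign
```

Read in the WSN setting, the points would be node coordinates and the returned centers would be candidate cluster-head positions; that reading is an interpretation, not a claim about the paper's simulation setup.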

Collaborative fuzzy clustering algorithms: some refinements and design guidelines

Relevance:

30.00%

Publisher:

Abstract:

There are some variants of the widely used Fuzzy C-Means (FCM) algorithm that support clustering data distributed across different sites. Those methods have been studied under different names, like collaborative and parallel fuzzy clustering. In this study, we offer some augmentation of the two FCM-based clustering algorithms used to cluster distributed data by arriving at some constructive ways of determining essential parameters of the algorithms (including the number of clusters) and forming a set of systematically structured guidelines such as a selection of the specific algorithm depending on the nature of the data environment and the assumptions being made about the number of clusters. A thorough complexity analysis, including space, time, and communication aspects, is reported. A series of detailed numeric experiments is used to illustrate the main ideas discussed in the study.
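As background for the FCM-based methods mentioned above, the standard single-site Fuzzy C-Means iteration can be sketched as below. It alternates the usual membership update, u_ij = 1 / sum_k (d_ij / d_kj)^(2/(m-1)), with the weighted-mean update of the prototypes. The collaborative and parallel variants studied in the paper are not reproduced; the function name, the fixed iteration count and the Euclidean distance are assumptions.

```python
import math


def fcm(points, centers, m=2.0, iters=50):
    """Standard Fuzzy C-Means; returns (centers, memberships u[point][cluster])."""
    c = len(centers)
    dims = len(points[0])
    centers = [list(v) for v in centers]
    u = []
    for _ in range(iters):
        # Membership update for every point.
        u = []
        for x in points:
            d = [math.dist(x, v) for v in centers]
            zeros = [di == 0 for di in d]
            if any(zeros):
                # Point coincides with one or more centers: split membership there.
                row = [1.0 / sum(zeros) if z else 0.0 for z in zeros]
            else:
                exp = 2.0 / (m - 1.0)
                row = [
                    1.0 / sum((d[i] / d[k]) ** exp for k in range(c))
                    for i in range(c)
                ]
            u.append(row)
        # Prototype update: mean of the points weighted by membership^m.
        for i in range(c):
            w = [u[j][i] ** m for j in range(len(points))]
            total = sum(w)
            if total > 0:
                centers[i] = [
                    sum(wj * x[dim] for wj, x in zip(w, points)) / total
                    for dim in range(dims)
                ]
    return centers, u
```

The fuzzifier m and the number of clusters are exactly the kind of parameters the abstract says need a principled choice; the defaults here are placeholders.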

Large-scale coupled-cluster calculations

Relevance:

30.00%

Publisher:

Abstract:

Coupled-cluster theory provides one of the most successful concepts in electronic-structure theory. This work covers the parallelization of coupled-cluster energies, gradients, and second derivatives and its application to selected large-scale chemical problems, besides the more practical aspects such as the publication and support of the quantum-chemistry package ACES II MAB and the design and development of a computational environment optimized for coupled-cluster calculations. The main objective of this thesis was to extend the range of applicability of coupled-cluster models to larger molecular systems and their properties and therefore to bring large-scale coupled-cluster calculations into the day-to-day routine of computational chemistry. A straightforward strategy for the parallelization of CCSD and CCSD(T) energies, gradients, and second derivatives has been outlined and implemented for closed-shell and open-shell references. Starting from the highly efficient serial implementation of the ACES II MAB computer code, an adaptation for affordable workstation clusters has been obtained by parallelizing the most time-consuming steps of the algorithms. Benchmark calculations for systems with up to 1300 basis functions and the presented applications show that the resulting algorithm for energies, gradients and second derivatives at the CCSD and CCSD(T) level of theory exhibits good scaling with the number of processors and substantially extends the range of applicability. Within the framework of the 'High accuracy Extrapolated Ab initio Thermochemistry' (HEAT) protocols, effects of increased basis-set size and higher excitations in the coupled-cluster expansion were investigated. The HEAT scheme was generalized for molecules containing second-row atoms in the case of vinyl chloride. This allowed the different reported experimental values to be discriminated.
In the case of the benzene molecule it was shown that even for molecules of this size chemical accuracy can be achieved. Near-quantitative agreement with experiment (about 2 ppm deviation) for the prediction of fluorine-19 nuclear magnetic shielding constants can be achieved by employing the CCSD(T) model together with large basis sets at accurate equilibrium geometries if vibrational averaging and temperature corrections via second-order vibrational perturbation theory are considered. Applying a very similar level of theory for the calculation of the carbon-13 NMR chemical shifts of benzene resulted in quantitative agreement with experimental gas-phase data. The NMR chemical shift study for the bridgehead 1-adamantyl cation at the CCSD(T) level resolved earlier discrepancies of lower-level theoretical treatment. The equilibrium structure of diacetylene has been determined based on the combination of experimental rotational constants of thirteen isotopic species and zero-point vibrational corrections calculated at various quantum-chemical levels. These empirical equilibrium structures agree to within 0.1 pm irrespective of the theoretical level employed. High-level quantum-chemical calculations on the hyperfine structure parameters of the cyanopolyynes were found to be in excellent agreement with experiment. Finally, the theoretically most accurate determination of the molecular equilibrium structure of ferrocene to date is presented.

Cluster analysis and display of genome-wide expression patterns

Relevance:

30.00%

Publisher:

Abstract:

A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression. The output is displayed graphically, conveying the clustering and the underlying expression data simultaneously in a form intuitive for biologists. We have found in the budding yeast Saccharomyces cerevisiae that clustering gene expression data groups together efficiently genes of known similar function, and we find a similar tendency in human data. Thus patterns seen in genome-wide expression experiments can be interpreted as indications of the status of cellular processes. Also, coexpression of genes of known function with poorly characterized or novel genes may provide a simple means of gaining leads to the functions of many genes for which information is not available currently.

Stages of motivational readiness for physical activity: A comparison of different algorithms of classification

Relevance:

30.00%

Publisher:

Enhanced diabetes care to patients of south Asian ethnic origin (the United Kingdom Asian Diabetes Study): a cluster randomised controlled trial

Relevance:

30.00%

Publisher:

Abstract:

Background - Delivery of high-quality, evidence-based health care to deprived sectors of the community is a major goal for society. We investigated the effectiveness of a culturally sensitive, enhanced care package in UK general practices for improvement of cardiovascular risk factors in patients of south Asian origin with type 2 diabetes. Methods - In this cluster randomised controlled trial, 21 inner-city practices in the UK were assigned by simple randomisation to intervention (enhanced care including additional time with practice nurse and support from a link worker and diabetes-specialist nurse [nine practices; n=868]) or control (standard care [12 practices; n=618]) groups. All adult patients of south Asian origin with type 2 diabetes were eligible. Prescribing algorithms with clearly defined targets were provided for all practices. Primary outcomes were changes in blood pressure, total cholesterol, and glycaemic control (haemoglobin A1c) after 2 years. Analysis was by intention to treat. This trial is registered, number ISRCTN 38297969. Findings - We recorded significant differences between treatment groups in diastolic blood pressure (-1·91 [95% CI -2·88 to -0·94] mm Hg, p=0·0001) and mean arterial pressure (-1·36 [-2·49 to -0·23] mm Hg, p=0·0180), after adjustment for confounders and clustering. We noted no significant differences between groups for total cholesterol (0·03 [-0·04 to 0·11] mmol/L), systolic blood pressure (-0·33 [-2·41 to 1·75] mm Hg), or HbA1c (-0·15% [-0·33 to 0·03]). Economic analysis suggests that the nurse-led intervention was not cost effective (incremental cost-effectiveness ratio £28,933 per QALY gained).
Across the whole study population over the 2 years of the trial, systolic blood pressure, diastolic blood pressure, and cholesterol decreased significantly by 4·9 (95% CI 4·0–5·9) mm Hg, 3·8 (3·2–4·4) mm Hg, and 0·45 (0·40–0·51) mmol/L, respectively, and we recorded a small and non-significant increase for haemoglobin A1c (0·04% [-0·04 to 0·13], p=0·290). Interpretation - We recorded additional, although small, benefits from our culturally tailored care package that were greater than the secular changes achieved in the UK in recent years. Stricter targets in general practice and further measures to motivate patients are needed to achieve best possible health-care outcomes in south Asian patients with diabetes. Funding - Pfizer, Sanofi-Aventis, Servier Laboratories UK, Merck Sharp & Dohme/Schering-Plough, Takeda UK, Roche, Merck Pharma, Daiichi-Sankyo UK, Boehringer Ingelheim, Eli Lilly, Novo Nordisk, Bristol-Myers Squibb, Solvay Health Care, and Assurance Medical Society UK.

Ionized cluster beam deposition, and thin film analysis

Relevance:

30.00%

Publisher:

Abstract:

In 1972 the ionized cluster beam (ICB) deposition technique was introduced as a new method for thin film deposition. At that time the use of clusters was postulated to be able to enhance film nucleation and adatom surface mobility, resulting in high quality films. Although a few researchers reported singly ionized clusters containing 10^2-10^3 atoms, others were unable to repeat their work. The consensus now is that film effects in the early investigations were due to self-ion bombardment rather than clusters. Subsequently in recent work (early 1992) synthesis of large clusters of zinc without the use of a carrier gas was demonstrated by Gspann and repeated in our laboratory. Clusters resulted from very significant changes in two source parameters. Crucible pressure was increased from the earlier 2 Torr to several thousand Torr and a converging-diverging nozzle 18 mm long and 0.4 mm in diameter at the throat was used in place of the 1 mm x 1 mm nozzle used in the early work. While this is practical for zinc and other high vapor pressure materials it remains impractical for many materials of industrial interest such as gold, silver, and aluminum. The work presented here describes results using gold and silver at pressures of around 1 and 50 Torr in order to study the effect of the pressure and nozzle shape. Significant numbers of large clusters were not detected. Deposited films were studied by atomic force microscopy (AFM) for roughness analysis, and X-ray diffraction. Nanometer size islands of zinc deposited on flat silicon substrates by ICB were also studied by atomic force microscopy and the number of atoms/cm^2 was calculated and compared to data from Rutherford backscattering spectrometry (RBS). To improve the agreement between data from AFM and RBS, convolution and deconvolution algorithms were implemented to study and simulate the interaction between tip and sample in atomic force microscopy.
The deconvolution algorithm takes into account the physical volume occupied by the tip, resulting in an image that is a more accurate representation of the surface. One method increasingly used to study the deposited films, both during the growth process and following it, is ellipsometry. Ellipsometry is a surface analytical technique used to determine the optical properties and thickness of thin films. In situ measurements can be made through the windows of a deposition chamber. A method for determining the optical properties of a film that is sensitive only to the growing film and accommodates underlying interfacial layers, multiple unknown underlayers, and other unknown substrates was developed. This method is carried out by making an initial ellipsometry measurement well past the real interface and by defining a virtual interface in the vicinity of this measurement.

Algorithms for super-resolution of images based on Sparse Representation and Manifolds

Relevance:

30.00%

Publisher:

Abstract:

Image super-resolution is defined as a class of techniques that enhance the spatial resolution of images. Super-resolution methods can be subdivided into single and multi image methods. This thesis focuses on developing algorithms based on mathematical theories for single image super-resolution problems. Indeed, in order to estimate an output image, we adopt a mixed approach: i.e., we use both a dictionary of patches with sparsity constraints (typical of learning-based methods) and regularization terms (typical of reconstruction-based methods). Although the existing methods already perform well, they do not take into account the geometry of the data to regularize the solution, cluster data samples (samples are often clustered using algorithms with the Euclidean distance as a dissimilarity metric), or learn dictionaries (they are often learned using PCA or K-SVD). Thus, state-of-the-art methods still suffer from shortcomings. In this work, we proposed three new methods to overcome these deficiencies. First, we developed SE-ASDS (a structure tensor based regularization term) in order to improve the sharpness of edges. SE-ASDS achieves much better results than many state-of-the-art algorithms. Then, we proposed AGNN and GOC algorithms for determining a local subset of training samples from which a good local model can be computed for reconstructing a given input test sample, where we take into account the underlying geometry of the data. AGNN and GOC methods outperform spectral clustering, soft clustering, and geodesic distance based subset selection in most settings. Next, we proposed aSOB strategy which takes into account the geometry of the data and the dictionary size. The aSOB strategy outperforms both PCA and PGA methods. Finally, we combine all our methods into a single algorithm, named G2SR. Our proposed G2SR algorithm shows better visual and quantitative results when compared to the results of state-of-the-art methods.

A process evaluation of a cluster randomised trial to reduce potentially inappropriate prescribing in older people in primary care (OPTI-SCRIPT study)

Relevance:

30.00%

Publisher:

Abstract:

Background
The OPTI-SCRIPT cluster randomised controlled trial (RCT) found that a three-phase multifaceted intervention including academic detailing with a pharmacist, GP-led medicines reviews, supported by web-based pharmaceutical treatment algorithms, and tailored patient information leaflets, was effective in reducing potentially inappropriate prescribing (PIP) in Irish primary care. We report a process evaluation exploring the implementation of the intervention, the experiences of those participating in the study and lessons for future implementation.

Methods
The OPTI-SCRIPT trial included 21 GP practices and 196 patients. The process evaluation used mixed methods. Quantitative data were collected from all GP practices and semi-structured interviews were conducted with GPs from intervention and control groups, and a purposive sample of patients from the intervention group. All interviews were transcribed verbatim and analysed using a thematic analysis.

Results
Despite receiving a standardised academic detailing session, intervention delivery varied among GP practices. Just over 70 % of practices completed medicines review as recommended with the patient present. Only single-handed practices conducted reviews without patients present, highlighting the influence of practice characteristics and resources on variation. Medications were more likely to be completely stopped or switched to another more appropriate medication when reviews were conducted with patients present. The patient information leaflets were not used by any of the intervention practices. Both GP (32 %) and patient (40 %) recruitment rates were modest. For those who did participate, overall, the experience was positively viewed, with GPs and patients referring to the value of medication reviews to improve prescribing and reduce unnecessary medications. Lack of time in busy GP practices and remuneration were identified as organisational barriers to future implementation.

Conclusions
The OPTI-SCRIPT intervention was positively viewed by both GPs and patients, both of whom valued the study’s objectives. Patient information leaflets were not a successful component of the intervention. Academic detailing and medication reviews are important components in changing PIP, and having patients present during the review process seems to be a more effective approach for decreasing PIP.

Models and algorithms for real-world optimization problems

Relevance:

30.00%

Publisher:

Abstract:

This thesis deals with the efficient solution of optimization problems of practical interest. The first part of the thesis deals with bin packing problems. The bin packing problem (BPP) is one of the oldest and most fundamental combinatorial optimization problems. The bin packing problem and its generalizations arise often in real-world applications, from manufacturing industry, logistics and transportation of goods, and scheduling. After an introductory chapter, I will present two applications of two of the most natural extensions of bin packing: Chapter 2 will be dedicated to an application of bin packing in two dimensions to a problem of scheduling a set of computational tasks on a computer cluster, while Chapter 3 deals with the generalization of BPP in three dimensions that arises frequently in logistics and transportation, often complemented with additional constraints on the placement of items and characteristics of the solution, like, for example, guarantees on the stability of the items, to avoid potential damage to the transported goods, on the distribution of the total weight of the bins, and on compatibility with loading and unloading operations. The second part of the thesis, and in particular Chapter 4, considers the Transmission Expansion Problem (TEP), where an electrical transmission grid must be expanded so as to satisfy future energy demand at the minimum cost, while maintaining some guarantees of robustness to potential line failures. These problems are gaining importance in a world where a shift towards renewable energy can impose a significant geographical reallocation of generation capacities, resulting in the necessity of expanding current power transmission grids.
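To make the basic bin packing problem concrete, here is a minimal sketch of the classic first-fit decreasing heuristic for the one-dimensional case. It is a textbook heuristic chosen for illustration, not a method taken from the thesis, which treats two- and three-dimensional generalizations with extra constraints. The function name and the integer item sizes in the usage example are assumptions.

```python
def first_fit_decreasing(sizes, capacity):
    """Pack items into bins of equal capacity using first-fit decreasing.

    Returns a list of bins, each bin being the list of item sizes placed in it.
    """
    bins = []   # contents of each open bin
    loads = []  # current total size in each open bin
    for size in sorted(sizes, reverse=True):  # largest items first
        if size > capacity:
            raise ValueError(f"item of size {size} exceeds capacity {capacity}")
        for i, load in enumerate(loads):
            if load + size <= capacity:
                # Place the item in the first bin that still has room.
                bins[i].append(size)
                loads[i] += size
                break
        else:
            # No open bin fits: open a new one.
            bins.append([size])
            loads.append(size)
    return bins


packing = first_fit_decreasing([4, 8, 1, 4, 2, 1], capacity=10)
print(packing)  # [[8, 2], [4, 4, 1, 1]]
```

In the scheduling application mentioned for Chapter 2, one dimension per resource would be needed, so this one-dimensional version only conveys the general idea of the greedy placement rule.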

A Distributed Support based on Kubeflow to Ease MLOps of Federated Learning Algorithms

Relevance:

30.00%

Publisher:

Abstract:

The main objective of my thesis work is to exploit the Google-native, open-source platform Kubeflow, specifically Kubeflow Pipelines, to execute a scalable Federated Learning (FL) ML process in a simplified, 5G-like test architecture hosting a Kubernetes cluster. I apply the widely adopted FedAVG algorithm and FedProx, its optimization, relying on the ML platform's capabilities to ease the development and production cycle of this specific FL process. FL algorithms are increasingly promising and increasingly adopted, both in cloud application development and in 5G communication enhancement, where data from the monitoring of the underlying telco infrastructure is used and training and data aggregation are executed at edge nodes to optimize the global model of the algorithm (which could be used, for example, for resource provisioning to reach an agreed QoS for the underlying network slice). After a study of the available papers and scientific articles related to FL, and with the help of the CTTC, which suggested that I study and use Kubeflow to run the algorithm, we found that this approach to deploying the whole FL cycle was not documented and might be worth investigating in more depth. This study may demonstrate the efficiency of the Kubeflow platform itself for developing new FL algorithms that will support new applications, and in particular it tests the performance of the FedAVG algorithm in a simulated client-to-cloud communication, using an MNIST dataset for FL as a benchmark.
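The aggregation step at the heart of FedAVG can be sketched in a few lines: the server replaces the global model with the average of the client models, weighted by how many training samples each client holds. The sketch below assumes each client model is a flat list of floats and uses hypothetical names; it says nothing about how the thesis wires this step into Kubeflow Pipelines, and it omits local training and the FedProx proximal term.

```python
def fedavg(client_weights, client_sizes):
    """Sample-weighted average of client parameter vectors (FedAVG aggregation)."""
    if len(client_weights) != len(client_sizes) or not client_weights:
        raise ValueError("need one sample count per client and at least one client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[d] * n for w, n in zip(client_weights, client_sizes)) / total
        for d in range(dim)
    ]


# Two clients: the second holds three times as much data, so it dominates.
global_model = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_model)  # [2.5, 3.5]
```

In a full round, each client would first train locally starting from the previous global model and only then report its updated parameters and sample count for this aggregation.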

Cluster Analysis To Identify Elderly People's Profiles: A Healthcare Strategy Based On Frailty Characteristics.

Relevance:

20.00%

Publisher:

Abstract:

The new social panorama resulting from aging of the Brazilian population is leading to significant transformations within healthcare. Through the cluster analysis strategy, it was sought to describe the specific care demands of the elderly population, using frailty components. Cross-sectional study based on reviewing medical records, conducted in the geriatric outpatient clinic, Hospital de Clínicas, Universidade Estadual de Campinas (Unicamp). Ninety-eight elderly users of this clinic were evaluated using cluster analysis and instruments for assessing their overall geriatric status and frailty characteristics. The variables that most strongly influenced the formation of clusters were age, functional capacities, cognitive capacity, presence of comorbidities and number of medications used. Three main groups of elderly people could be identified: one with good cognitive and functional performance but with high prevalence of comorbidities (mean age 77.9 years, cognitive impairment in 28.6% and mean of 7.4 comorbidities); a second with more advanced age, greater cognitive impairment and greater dependence (mean age 88.5 years old, cognitive impairment in 84.6% and mean of 7.1 comorbidities); and a third younger group with poor cognitive performance and greater number of comorbidities but functionally independent (mean age 78.5 years old, cognitive impairment in 89.6% and mean of 7.4 comorbidities). These data characterize the profile of this population and can be used as the basis for developing efficient strategies aimed at diminishing functional dependence, poor self-rated health and impaired quality of life.
