23 resultados para Data Driven Clustering
em Reposit
Resumo:
We present a measurement of the semileptonic mixing asymmetry for B0 mesons, asld, using two independent decay channels: B0→μ +D -X, with D -→K +π -π -; and B0→μ +D *-X, with D * -→D ̄0π -, D ̄0→ K +π - (and charge conjugate processes). We use a data sample corresponding to 10.4fb -1 of pp̄ collisions at √s=1.96TeV, collected with the D0 experiment at the Fermilab Tevatron collider. We extract the charge asymmetries in these two channels as a function of the visible proper decay length of the B0 meson, correct for detector-related asymmetries using data-driven methods, and account for dilution from charge-symmetric processes using Monte Carlo simulation. The final measurement combines four signal visible proper decay length regions for each channel, yielding asld=[0.68±0.45(stat)±0.14(syst)]%. This is the single most precise measurement of this parameter, with uncertainties smaller than the current world average of B factory measurements. © 2012 American Physical Society.
Resumo:
A measurement of differential cross sections for the production of a pair of isolated photons in proton-proton collisions at root s = 7 TeV is presented. The data sample corresponds to an integrated luminosity of 5.0 fb(-1) collected with the CMS detector. A data-driven isolation template method is used to extract the prompt diphoton yield. The measured cross section for two isolated photons, with transverse energy above 40 and 25 GeV respectively, in the pseudorapidity range vertical bar eta vertical bar < 2.5, vertical bar eta vertical bar (sic) [1.44, 1.57] and with an angular separation Delta R > 0.45, is 17.2 +/-0.2 (stat) +/-1.9 (syst) +/- 0.4 (lumi) pb. Differential cross sections are measured as a function of the diphoton invariant mass, the diphoton transverse momentum, the azimuthal angle difference between the two photons, and the cosine of the polar angle in the Collins-Soper reference frame of the diphoton system. The results are compared to theoretical predictions at leading, next-to-leading, and next-to-next-to-leading order in quantum chromodynamics.
Resumo:
It's believed that the simple Su-Schrieffer-Heeger Hamiltonian can not predict the insulator to metal transition of transpolyacetylene (t-PA). The soliton lattice configuration at a doping level y=6% still has a semiconductor gap. Disordered distributions of solitons close the gap, but the electronic states around the Fermi energy are localized. However, within the same framework, it is possible to show that a cluster of solitons can produce dramatic changes in the electronic structure, allowing an insulator-to-metal transition.
Resumo:
Non-technical losses identification has been paramount in the last decade. Since we have datasets with hundreds of legal and illegal profiles, one may have a method to group data into subprofiles in order to minimize the search for consumers that cause great frauds. In this context, a electric power company may be interested in to go deeper a specific profile of illegal consumer. In this paper, we introduce the Optimum-Path Forest (OPF) clustering technique to this task, and we evaluate the behavior of a dataset provided by a brazilian electric power company with different values of an OPF parameter. © 2011 IEEE.
Resumo:
The taxonomy of the N(2)-fixing bacteria belonging to the genus Bradyrhizobium is still poorly refined, mainly due to conflicting results obtained by the analysis of the phenotypic and genotypic properties. This paper presents an application of a method aiming at the identification of possible new clusters within a Brazilian collection of 119 Bradryrhizobium strains showing phenotypic characteristics of B. japonicum and B. elkanii. The stability was studied as a function of the number of restriction enzymes used in the RFLP-PCR analysis of three ribosomal regions with three restriction enzymes per region. The method proposed here uses Clustering algorithms with distances calculated by average-linkage clustering. Introducing perturbations using sub-sampling techniques makes the stability analysis. The method showed efficacy in the grouping of the species B. japonicum and B. elkanii. Furthermore, two new clusters were clearly defined, indicating possible new species, and sub-clusters within each detected cluster. (C) 2008 Elsevier B.V. All rights reserved.
Resumo:
The development of strategies for structural health monitoring (SHM) has become increasingly important because of the necessity of preventing undesirable damage. This paper describes an approach to this problem using vibration data. It involves a three-stage process: reduction of the time-series data using principle component analysis (PCA), the development of a data-based model using an auto-regressive moving average (ARMA) model using data from an undamaged structure, and the classification of whether or not the structure is damaged using a fuzzy clustering approach. The approach is applied to data from a benchmark structure from Los Alamos National Laboratory, USA. Two fuzzy clustering algorithms are compared: fuzzy c-means (FCM) and Gustafson-Kessel (GK) algorithms. It is shown that while both fuzzy clustering algorithms are effective, the GK algorithm marginally outperforms the FCM algorithm. (C) 2008 Elsevier Ltd. All rights reserved.
Resumo:
Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)
Resumo:
The frequency of adenine mononucleotides (A), dinucleotides (AA) and clusters, and the positions of clusters, were studied in 502 molecules of the 5S rRNA.All frequencies were reduced in the evolutive lines of vertebrates, plants and fungi, in parallel with increasing organismic complexity. No change was observed in invertebrates. All frequencies were increased in mitochondria, plastids and mycoplasmas. The presumed relatives to the ancestors of the organelles, Rhodobacteria alfa and Cyanobacteria, showed intermediate values, relative to the eubacterial averages. Firmibacterid showed very high number of cluster sites.Clusters were more frequent in single-stranded regions in all organisms. The routes of organelles and mycoplasmas accummulated clusters at faster rates in double-stranded regions. Rates of change were higher for AA and clusters than for A in plants, vertebrates and organeltes, higher for cluster sites and A in mycoplasmas, and higher for AA and A in fungi. These data indicated that selection pressures acted more strongly on adenine clustering than on adenine frequency.It is proposed that AA and clusters, as sites of lower informational content. have the property of tolerating positional variation in the sites of other molecules (or other regions of the same molecule) that interact with the adenines. This reasoning was consistent with the degrees of genic polymorphism. low in plants and vertebrates and high in invertebrates. In the eubacteria endosymbiontic or parasitic to eukaryotes, the more tolerant RNA would be better adapted to interactions with the homologous nucleus-derived ribosomal proteins: the intermediate values observed in their precursors were interpreted as preadaptive.Among other groups, only the Deinococcus-Thermus eubacteria showed excessive AA and cluster contents, possibly related to their peculiar tolerance to mutagens, and the Ciliates showed excessive AA contents, indicative of retention of primitive characters.
PHYLOGENETIC STUDIES OF SOME SPECIES OF THE GENUS COFFEA .2. NUMERICAL-ANALYSIS OF ISOENZYMATIC DATA
Resumo:
Thirteen species of Coffea were studied for five enzymes systems, including alpha and beta esterase, alkaline phosphatase, acid phosphatase, malate dehydrogenase and acid dehydrogenase. Three coefficients of similarity: Simple Matching, Jaccard and Ochiai and three different clustering methods: Single Linkage, Complete Linkage and Unweighted Pair Group, using Arithmetic Averages (UPGMA) were used to analyse the data.The phylogenetic relationships among the twelve diploid species and between them and the tetraploid species C. arabica showed that similarity among species of the same subsection is not always greater than among species of different subsections. In addition, although there are several similarity groups in common, established by isoenzymatic polymorphism, morphological characteristics, chemical data, crossability and geographic distribution, there is no common trend among the phylogenetic relationships as indicated by all these different evaluating procedures.
Resumo:
In this paper we focus on providing coordinated visual strategies to assist users in performing tasks driven by the presence of temporal and spatial attributes. We introduce temporal visualization techniques targeted at such tasks, and illustrate their use with an application involving a climate classification process. The climate classification requires extensive Processing of a database containing daily rain precipitation values collected along over fifty years at several spatial locations in the São Paulo state, Brazil. We identify user exploration tasks typically conducted as part of the data preparation required in this process, and then describe how such tasks may be assisted by the multiple visual techniques provided. Issues related to the use of the multiple techniques by an end-user are also discussed.
Resumo:
An intelligent system that emulates human decision behaviour based on visual data acquisition is proposed. The approach is useful in applications where images are used to supply information to specialists who will choose suitable actions. An artificial neural classifier aids a fuzzy decision support system to deal with uncertainty and imprecision present in available information. Advantages of both techniques are exploited complementarily. As an example, this method was applied in automatic focus checking and adjustment in video monitor manufacturing. Copyright © 2005 IFAC.
Resumo:
The significant volume of work accidents in the cities causes an expressive loss to society. The development of Spatial Data Mining technologies presents a new perspective for the extraction of knowledge from the correlation between conventional and spatial attributes. One of the most important techniques of the Spatial Data Mining is the Spatial Clustering, which clusters similar spatial objects to find a distribution of patterns, taking into account the geographical position of the objects. Applying this technique to the health area, will provide information that can contribute towards the planning of more adequate strategies for the prevention of work accidents. The original contribution of this work is to present an application of tools developed for Spatial Clustering which supply a set of graphic resources that have helped to discover knowledge and support for management in the work accidents area. © 2011 IEEE.
Resumo:
Structural Health Monitoring (SHM) denotes a system with the ability to detect and interpret adverse changes in a structure. One of the critical challenges for practical implementation of SHM system is the ability to detect damage under changing environmental conditions. This paper aims to characterize the temperature, load and damage effects in the sensor measurements obtained with piezoelectric transducer (PZT) patches. Data sets are collected on thin aluminum specimens under different environmental conditions and artificially induced damage states. The fuzzy clustering algorithm is used to organize the sensor measurements into a set of clusters, which can attribute the variation in sensor data due to temperature, load or any induced damage.
Resumo:
Nowadays, organizations face the problem of keeping their information protected, available and trustworthy. In this context, machine learning techniques have also been extensively applied to this task. Since manual labeling is very expensive, several works attempt to handle intrusion detection with traditional clustering algorithms. In this paper, we introduce a new pattern recognition technique called Optimum-Path Forest (OPF) clustering to this task. Experiments on three public datasets have showed that OPF classifier may be a suitable tool to detect intrusions on computer networks, since it outperformed some state-of-the-art unsupervised techniques. © 2012 IEEE.
Resumo:
Many topics related to association mining have received attention in the research community, especially the ones focused on the discovery of interesting knowledge. A promising approach, related to this topic, is the application of clustering in the pre-processing step to aid the user to find the relevant associative patterns of the domain. In this paper, we propose nine metrics to support the evaluation of this kind of approach. The metrics are important since they provide criteria to: (a) analyze the methodologies, (b) identify their positive and negative aspects, (c) carry out comparisons among them and, therefore, (d) help the users to select the most suitable solution for their problems. Some experiments were done in order to present how the metrics can be used and their usefulness. © 2013 Springer-Verlag GmbH.