639 results for Fano Partitions
Abstract:
K-Means is a popular clustering algorithm that adopts an iterative refinement procedure to determine data partitions and to compute their associated centres of mass, called centroids. The straightforward implementation of the algorithm is often referred to as 'brute force', since it computes a proximity measure from each data point to each centroid at every iteration of the K-Means process. Efficient implementations of the K-Means algorithm have been predominantly based on multi-dimensional binary search trees (KD-Trees). A combination of an efficient data structure and geometrical constraints makes it possible to reduce the number of distance computations required at each iteration. In this work we present a general space partitioning approach for improving the efficiency and the scalability of the K-Means algorithm. We propose to adopt approximate hierarchical clustering methods, rather than KD-Trees, to generate binary space partitioning trees. In the experimental analysis, we have tested the performance of the proposed Binary Space Partitioning K-Means (BSP-KM) when a divisive clustering algorithm is used. We have carried out extensive experimental tests to compare the proposed approach with the one based on KD-Trees (KD-KM) across a wide range of the parameter space. BSP-KM is more scalable than KD-KM, while keeping the deterministic nature of the 'brute force' algorithm. In particular, the proposed space partitioning approach has been shown to overcome the well-known limitation of KD-Trees in high-dimensional spaces and can also be adopted to improve the efficiency of other algorithms in which KD-Trees have been used.
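To make the baseline concrete, here is a minimal sketch of the 'brute force' Lloyd iteration the abstract contrasts against; the per-iteration point-to-centroid distance matrix it builds is exactly the cost that tree-based variants such as KD-KM and the proposed BSP-KM aim to prune. The function and parameter names are illustrative, not taken from the paper.

```python
import numpy as np

def kmeans_brute_force(X, k, n_iter=100, seed=0):
    """Plain Lloyd iteration: every point is compared to every centroid."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # n_points x k squared-distance matrix: the work tree-based methods avoid
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels

# Example: centroids, labels = kmeans_brute_force(np.random.rand(1000, 8), k=10)
```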
Abstract:
Recent laboratory observations and advances in theoretical quantum chemistry allow a reappraisal of the fundamental mechanisms that determine the water vapour self-continuum absorption throughout the infrared and millimetre-wave spectral regions. Starting from a framework that partitions bimolecular interactions between water molecules into free-pair states, true bound dimers and quasi-bound dimers, we present a critical review of recent observations, continuum models and theoretical predictions. In the near-infrared bands of the water monomer, we propose that spectral features in recent laboratory-derived self-continuum spectra can be well explained as being due to a combination of true bound and quasi-bound dimers, when the spectrum of quasi-bound dimers is approximated as double the broadened spectrum of the water monomer. Such a representation can explain both the wavenumber variation and the temperature dependence. Recent observations of the self-continuum absorption in the windows between these near-infrared bands indicate that widely used continuum models can underestimate the true strength by around an order of magnitude. An existing far-wing model does not appear able to explain the discrepancy, and although a dimer explanation is possible, currently available observations do not allow a compelling case to be made. In the 8–12 micron window, recent observations indicate that modern continuum models fail to properly represent the temperature dependence, the wavelength variation, or both. The temperature dependence is suggestive of a transition from the dominance of true bound dimers at lower temperatures to quasi-bound dimers at higher temperatures. In the mid- and far-infrared spectral regions, recent theoretical calculations indicate that true bound dimers may explain at least 20% to 40% of the observed self-continuum. The possibility that quasi-bound dimers could contribute an additional amount of the same size is discussed. Most recent theoretical considerations agree that water dimers are likely to be the dominant contributor to the self-continuum in the millimetre-wave spectral range.
Abstract:
This paper re-examines the relative importance of sector and regional effects in determining property returns. Using the largest property database currently available in the world, we decompose the returns on individual properties into a national effect, common to all properties, and a number of sector and regional factors. Unlike previous studies, however, we categorise the individual property data into an ever finer set of property types and regions, from a simple 3-by-3 classification up to a 10-by-63 sector/region classification. In this way we can test the impact that a finer classification has on the sector and regional effects. We confirm the findings of previous studies that sector-specific effects have a greater influence on property returns than regional effects. We also find that the impact of the sector effect is robust across different classifications of sectors and regions. Nonetheless, the more refined sector and regional partitions uncover some interesting sector and regional differences that were obscured in previous studies. All of this has important implications for property portfolio construction and analysis.
Abstract:
We extend extreme learning machine (ELM) classifiers to complex Reproducing Kernel Hilbert Spaces (RKHS), where the input/output variables as well as the optimization variables are complex-valued. A new family of classifiers, called complex-valued ELM (CELM) and suitable for complex-valued multiple-input–multiple-output processing, is introduced. In the proposed method, the associated Lagrangian is computed using induced RKHS kernels, adopting a Wirtinger calculus approach formulated as a constrained optimization problem, similarly to the conventional ELM classifier formulation. When training the CELM, the Karush–Kuhn–Tucker (KKT) theorem is used to solve the dual optimization problem, which consists of simultaneously satisfying the smallest-training-error and smallest-output-weight-norm criteria. The proposed formulation also addresses aspects of quaternary classification within a Clifford algebra context. For 2D complex-valued inputs, user-defined complex-coupled hyper-planes divide the classifier input space into four partitions. For 3D complex-valued inputs, the formulation generates three pairs of complex-coupled hyper-planes through orthogonal projections; the six hyper-planes then divide the 3D space into eight partitions. It is shown that the CELM problem formulation is equivalent to solving six real-valued ELM tasks, which are induced by projecting the chosen complex kernel across the different user-defined coordinate planes. A classification example of powdered samples on the basis of their terahertz spectral signatures is used to demonstrate the advantages of the CELM classifiers over their SVM counterparts. The proposed classifiers retain the advantages of their ELM counterparts, in that they can perform multiclass classification with lower computational complexity than SVM classifiers. Furthermore, because they can perform classification tasks quickly, the proposed formulations are of interest for real-time applications.
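For orientation, the sketch below shows a conventional real-valued ELM classifier of the kind the abstract extends: random hidden weights, a closed-form regularised least-squares solve for the output weights, and one-hot targets for multiclass problems. The activation, ridge term and all names are illustrative assumptions, not the authors' CELM formulation.

```python
import numpy as np

def elm_train(X, y, n_hidden=200, ridge=1e-3, seed=0):
    """Real-valued ELM: random hidden layer + regularised least-squares output weights."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    T = (y[:, None] == classes[None, :]).astype(float)   # one-hot targets
    W = rng.normal(size=(X.shape[1], n_hidden))           # random input weights (never trained)
    b = rng.normal(size=n_hidden)                         # random biases
    H = np.tanh(X @ W + b)                                 # hidden-layer output matrix
    # Closed-form solve balancing small training error and small output-weight norm
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ T)
    return W, b, beta, classes

def elm_predict(model, X):
    W, b, beta, classes = model
    return classes[(np.tanh(X @ W + b) @ beta).argmax(axis=1)]
```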
Abstract:
The Land surface Processes and eXchanges (LPX) model is a fire-enabled dynamic global vegetation model that performs well globally but has problems representing fire regimes and vegetative mix in savannas. Here we focus on improving the fire module. To improve the representation of ignitions, we introduced a treatment of lightning that allows the fraction of ground strikes to vary spatially and seasonally, realistically partitions the strike distribution between wet and dry days, and varies the number of dry days with strikes. Fuel availability and moisture content were improved by implementing decomposition rates specific to individual plant functional types (PFTs) and litter classes, and litter drying rates driven by atmospheric water content. To improve water extraction by grasses, we use realistic plant-specific treatments of deep roots. To improve fire responses, we introduced adaptive bark thickness and post-fire resprouting for tropical and temperate broadleaf trees. All improvements are based on extensive analyses of relevant observational data sets. We test model performance for Australia, first evaluating the parameterisations separately and then measuring overall behaviour against standard benchmarks. Changes to the lightning parameterisation produce a more realistic simulation of fires in southeastern and central Australia. Implementation of PFT-specific decomposition rates enhances performance in central Australia. Changes in fuel drying improve fire in northern Australia, while changes in rooting depth produce a more realistic simulation of fuel availability and structure in central and northern Australia. The introduction of adaptive bark thickness and resprouting produces more realistic fire regimes in Australian savannas. We also show that the model simulates biomass recovery rates consistent with observations from several regions of the world characterised by resprouting vegetation. The new model (LPX-Mv1) improves the simulation of observed vegetation composition and mean annual burnt area by 33% and 18%, respectively, compared with LPX.
Abstract:
The use of economic incentives for biodiversity (mostly compensation and reward for environmental services, including payment for ES) has been widely supported in recent decades, and such incentives have become the main innovative policy tools for biodiversity conservation worldwide. These policy tools are often based on the insight that rational actors perfectly weigh the costs and benefits of adopting certain behaviors, so that well-crafted economic incentives and disincentives will lead to socially desirable development scenarios. This rationalist mode of thought has provided interesting insights and results, but it also misjudges the context in which 'real individuals' come to decisions, and the multitude of factors influencing development sequences. In this study, our goal is to examine how these policies can take advantage of unintended behavioral reactions that might in turn affect, either positively or negatively, overall policy performance. Using a natural field experiment in rural Madagascar, we test the effect of the origin of income ('low effort' money vs. 'high effort' money) on spending decisions (necessity vs. superior goods) and on subsequent pro-social preferences (future pro-environmental behavior). Our results show that money obtained under low effort leads to different consumption patterns than money obtained under high effort: superior goods are more salient in the case of low-effort money. In parallel, money obtained under low effort leads to subsequently higher pro-social behavior. Compensation and reward policies for ecosystem services may therefore mobilize knowledge of behavioral biases to improve their design and foster positive spillovers on their development goals.
Abstract:
Phylogenetic analyses of chloroplast DNA sequences, morphology, and combined data have provided consistent support for many of the major branches within the angiosperm clade Dipsacales. Here we use sequences from three mitochondrial loci to test the existing broad-scale phylogeny and to attempt to resolve several relationships that have remained uncertain. Parsimony, maximum likelihood, and Bayesian analyses of a combined mitochondrial data set recover trees broadly consistent with previous studies, although resolution and support are lower than in the largest chloroplast analyses. Combining chloroplast and mitochondrial data results in a generally well-resolved and very strongly supported topology, but the previously recognized problem areas remain. To investigate why these relationships have been difficult to resolve, we conducted a series of experiments using different data partitions and heterogeneous substitution models. More complex modeling schemes are usually favored regardless of the partitions recognized, but model choice had little effect on topology or support values. In contrast, there are consistent but weakly supported differences in the topologies recovered from coding and non-coding matrices. These conflicts directly correspond to relationships that were poorly resolved in analyses of the full combined chloroplast-mitochondrial data set. We suggest that incongruent signal has contributed to our inability to confidently resolve these problem areas. (c) 2007 Elsevier Inc. All rights reserved.
Abstract:
Broad-scale phylogenetic analyses of the angiosperms and of the Asteridae have failed to confidently resolve relationships among the major lineages of the campanulid Asteridae (i.e., the euasterid II of APG II, 2003). To address this problem we assembled presently available sequences for a core set of 50 taxa, representing the diversity of the four largest lineages (Apiales, Aquifoliales, Asterales, Dipsacales) as well as the smaller "unplaced" groups (e.g., Bruniaceae, Paracryphiaceae, Columelliaceae). We constructed four data matrices for phylogenetic analysis: a chloroplast coding matrix (atpB, matK, ndhF, rbcL), a chloroplast non-coding matrix (rps16 intron, trnT-F region, trnV-atpE IGS), a combined chloroplast dataset (all seven chloroplast regions), and a combined genome matrix (seven chloroplast regions plus 18S and 26S rDNA). Bayesian analyses of these datasets using mixed substitution models produced often well-resolved and supported trees. Consistent with more weakly supported results from previous studies, our analyses support the monophyly of the four major clades and the relationships among them. Most importantly, Asterales are inferred to be sister to a clade containing Apiales and Dipsacales. Paracryphiaceae is consistently placed sister to the Dipsacales. However, the exact relationships of Bruniaceae, Columelliaceae, and an Escallonia clade depended upon the dataset. Areas of poor resolution in the combined analyses may be partly explained by conflict between the coding and non-coding data partitions. We discuss the implications of these results for our understanding of campanulid phylogeny and evolution, paying special attention to how our findings bear on character evolution and biogeography in Dipsacales.
Abstract:
In this paper, we present an algorithm for cluster analysis that integrates aspects of cluster ensembles and multi-objective clustering. The algorithm is based on a Pareto-based multi-objective genetic algorithm with a special crossover operator, which uses clustering validation measures as objective functions. The proposed algorithm can deal with data sets presenting different types of clusters, without requiring expertise in cluster analysis. Its result is a concise set of partitions representing alternative trade-offs among the objective functions. We compare the results obtained with our algorithm, in the context of gene expression data sets, to those achieved with Multi-Objective Clustering with automatic K-determination (MOCK), the algorithm most closely related to ours. (C) 2009 Elsevier B.V. All rights reserved.
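The Pareto-based selection step can be illustrated as follows: candidate partitions are scored on two validation objectives (a compactness term and a separation term here, both to be minimised) and only the non-dominated trade-offs are kept. The objectives and helper names are illustrative assumptions, not the measures used by the authors or by MOCK.

```python
import numpy as np

def compactness(X, labels):
    # mean squared distance of points to their cluster centroid (smaller is better)
    return np.mean([((X[labels == c] - X[labels == c].mean(axis=0)) ** 2).sum(axis=1).mean()
                    for c in np.unique(labels)])

def neg_separation(X, labels):
    # negative minimum distance between cluster centroids (smaller is better);
    # assumes the partition has at least two clusters
    cents = np.array([X[labels == c].mean(axis=0) for c in np.unique(labels)])
    d = np.linalg.norm(cents[:, None] - cents[None, :], axis=2)
    return -d[np.triu_indices(len(cents), k=1)].min()

def pareto_front(scores):
    # scores: (n_partitions, n_objectives), all objectives to be minimised;
    # returns the indices of the non-dominated partitions
    scores = np.asarray(scores)
    keep = []
    for i, s in enumerate(scores):
        dominated = any(np.all(t <= s) and np.any(t < s)
                        for j, t in enumerate(scores) if j != i)
        if not dominated:
            keep.append(i)
    return keep

# Example: pareto_front([(compactness(X, L), neg_separation(X, L)) for L in candidate_labelings])
```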
Abstract:
Clustering quality or validation indices allow the quality of a clustering to be evaluated in order to support the selection of a specific partition or clustering structure in its natural unsupervised environment, where the real solution is unknown or not available. In this paper, we investigate the use of quality indices, mostly based on the concepts of clusters' compactness and separation, for the evaluation of clustering results (partitions in particular). This work intends to offer a general perspective on the appropriate use of quality indices for the purpose of clustering evaluation. After presenting some commonly used indices, as well as indices recently proposed in the literature, key issues regarding the practical use of quality indices are addressed. A general methodological approach is presented which considers the identification of appropriate index thresholds. This general approach is compared with the simple use of quality indices for evaluating a clustering solution.
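As a concrete illustration of index-based evaluation, the sketch below scores candidate partitions with two widely used compactness/separation indices available in scikit-learn (silhouette and Davies-Bouldin); these are common examples of such indices, not necessarily the ones examined in the paper, and the function name is illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score, davies_bouldin_score

def score_partitions(X, k_values=(2, 3, 4, 5, 6)):
    """Score k-means partitions of X with two unsupervised quality indices."""
    results = {}
    for k in k_values:
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        results[k] = {
            "silhouette": silhouette_score(X, labels),          # higher is better
            "davies_bouldin": davies_bouldin_score(X, labels),  # lower is better
        }
    return results

# Example: score_partitions(np.random.rand(300, 4))
```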
Abstract:
Habitually, capuchin monkeys access encased hard foods by using their canines and premolars and/or by pounding the food on hard surfaces. In contrast, the wild bearded capuchins (Cebus libidinosus) of Boa Vista (Brazil) routinely crack palm fruits with tools. We measured the size, weight, structure, and peak-force-at-failure of the four palm fruit species most frequently processed with tools by the wild capuchin monkeys living in Boa Vista. Moreover, for each nut species we identified whether peak-force-at-failure was consistently associated with greater weight/volume, endocarp thickness, and structural complexity. The goals of this study were (a) to investigate whether these palm fruits are difficult, or impossible, to access other than with tools and (b) to collect data on the physical properties of palm fruits that are comparable to those available for the nuts cracked open with tools by wild chimpanzees. Results showed that the four nut species differ in peak-force-at-failure and that peak-force-at-failure is positively associated with greater weight (and consequently volume) and apparently with structural complexity (i.e. more kernels and thus more partitions); finally, for three out of four nut species shell thickness is also positively associated with greater volume. The finding that the nuts exploited by capuchins with tools have very high resistance values supports the idea that tool use is indeed mandatory to crack them open. Finally, the peak-force-at-failure of the piassava nuts is similar to that reported for the very tough panda nuts cracked open by wild chimpanzees; this highlights the ecological importance of tool use for exploiting high-resistance foods in this capuchin species.
Abstract:
We consider conditions which allow the embedding of linear hypergraphs of fixed size. In particular, we prove that any k-uniform hypergraph H of positive uniform density contains all linear k-uniform hypergraphs of a given size. More precisely, we show that for all integers l >= k >= 2 and every d > 0 there exists ϱ > 0 for which the following holds: if H is a sufficiently large k-uniform hypergraph with the property that the density of H induced on every vertex subset of size ϱn is at least d, then H contains every linear k-uniform hypergraph F with l vertices. The main ingredient in the proof of this result is a counting lemma for linear hypergraphs, which establishes that the straightforward extension of graph epsilon-regularity to hypergraphs suffices for counting linear hypergraphs. We also consider some related problems. (C) 2009 Elsevier Inc. All rights reserved.
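For readability, the quantified embedding statement above can be written out in display form (with ϱ denoting the constant whose existence is asserted, and n the number of vertices of H):

```latex
\[
\forall\, \ell \ge k \ge 2,\ \forall\, d > 0\ \ \exists\, \varrho > 0:\quad
\text{if } H \text{ is a sufficiently large } k\text{-uniform hypergraph on } n \text{ vertices and}
\]
\[
d\bigl(H[S]\bigr) \ge d \ \text{ for every } S \subseteq V(H) \text{ with } |S| = \varrho n,
\quad\text{then } F \subseteq H \text{ for every linear } k\text{-uniform } F \text{ with } \ell \text{ vertices.}
\]
```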
Abstract:
A branched covering of degree d between closed surfaces determines a collection of partitions of d, called the branch data. In this work we show that any branch data are realized by an indecomposable primitive branched covering on a connected closed surface N with chi(N) <= 0. This shows that decomposable and indecomposable realizations may coexist. Moreover, we characterize the branch data of a decomposable primitive branched covering. Bibliography: 20 titles.
Abstract:
Although the traveling salesman problem looks very simple, it is an important combinatorial problem. In this thesis I have tried to find the shortest tour in which each city is visited exactly once before returning to the starting city. I have tried to solve the traveling salesman problem using a multilevel graph partitioning approach. The traveling salesman problem is itself very difficult, since it belongs to the class of NP-complete problems, and the multilevel graph partitioning problem used to attack it also belongs to that class. I have used a k-means partitioning algorithm, which divides the problem into multiple partitions; each partition is solved separately, and its solution is used to improve the overall tour by applying the Lin-Kernighan algorithm to it. Through this approach I obtained optimal solutions, which shows that solving the traveling salesman problem through a graph partitioning scheme works well for this NP-complete problem and that this intractable problem can be solved within a few minutes.
Keywords: Graph Partitioning Scheme, Traveling Salesman Problem.
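As a toy illustration of the partition-then-solve idea, the sketch below clusters the cities with k-means, builds a short tour inside each cluster, and chains the clusters together; a greedy nearest-neighbour pass stands in for the Lin-Kernighan improvement step used in the thesis, and all names are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def nn_tour(points, idx):
    """Greedy nearest-neighbour tour over the point indices in idx."""
    tour, remaining = [idx[0]], list(idx[1:])
    while remaining:
        last = points[tour[-1]]
        nxt = min(remaining, key=lambda j: np.linalg.norm(points[j] - last))
        tour.append(nxt)
        remaining.remove(nxt)
    return tour

def partitioned_tsp(points, n_parts=4):
    """Partition cities with k-means, tour each partition, then concatenate."""
    km = KMeans(n_clusters=n_parts, n_init=10, random_state=0).fit(points)
    # visit the clusters left-to-right by centroid x-coordinate, then tour each locally
    cluster_order = np.argsort(km.cluster_centers_[:, 0])
    tour = []
    for c in cluster_order:
        idx = list(np.where(km.labels_ == c)[0])
        if idx:
            tour.extend(nn_tour(points, idx))
    return tour  # a complete but sub-optimal tour; a local-search pass would refine it

# Example: partitioned_tsp(np.random.rand(200, 2), n_parts=5)
```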
Abstract:
This work was carried out in collaboration with the Swedish Armed Forces (Försvarsmakten) and addresses the possibilities for forensic examination of the Amazon Kindle e-book reader. The literature review shows that previous research on the subject is severely limited. The work therefore aims to answer how data can be extracted from a Kindle, what data of forensic interest a Kindle may contain, where this information is stored and whether this differs between models and firmware versions, and whether it is sufficient to examine only the part of the memory that is accessible to the user or whether additional privileges should be obtained to access the full memory area. To this end, three different Kindle models were filled with information. Images were then acquired from them, both of the user partition alone and of the full memory area after a privilege escalation had been performed. The acquired data were analysed and the results are presented. The results show that information of forensic interest, such as notes, visited web pages and documents, can be recovered, which is why there is value in performing forensic examinations of Amazon Kindles. There are differences in what information can be recovered and where it is stored on the different devices. The devices have four partitions, of which only one can be accessed without privilege escalation, so there is an advantage in acquiring images of the full memory area. In addition, a method is presented for bypassing a device's passcode lock and thereby gaining full access to it even when it is locked.