821 resultados para Grid-based clustering approach
Resumo:
Most traditional data mining algorithms struggle to cope with the sheer scale of data efficiently. In this paper, we propose a general framework to accelerate existing clustering algorithms to cluster large-scale datasets which contain large numbers of attributes, items, and clusters. Our framework makes use of locality sensitive hashing (LSH) to significantly reduce the cluster search space. We also theoretically prove that our framework has a guaranteed error bound in terms of the clustering quality. This framework can be applied to a set of centroid-based clustering algorithms that assign an object to the most similar cluster, and we adopt the popular K-Modes categorical clustering algorithm to present how the framework can be applied. We validated our framework with five synthetic datasets and a real world Yahoo! Answers dataset. The experimental results demonstrate that our framework is able to speed up the existing clustering algorithm between factors of 2 and 6, while maintaining comparable cluster purity.
Resumo:
La délégation du pouvoir de gestion aux administrateurs et aux gestionnaires, une caractéristique intrinsèque à la gestion efficace de grandes entreprises dans un contexte de capitalisme, confère une grande discrétion à l’équipe de direction. Cette discrétion, si elle n’est pas surveillée, peut mener à des comportements opportunistes envers la corporation, les actionnaires et les autres fournisseurs de capital qui n’ont pas de pouvoir de gestion. Les conflits entre ces deux classes d’agents peuvent émerger à la fois de décisions de gouvernance générale ou de transactions particulières (ie. offre publique d’achat). Dans les cas extrêmes, ces conflits peuvent mener à la faillite de la firme. Dans les cas plus typiques, ils mènent l’extraction de bénéfices privés pour les administrateurs et gestionnaires, l’expropriation des actionnaires, et des réductions de valeur pour la firme. Nous prenons le point de vue d’un petit actionnaire minoritaire pour explorer les méchanismes de gouvernance disponibles au Canada et aux États‐Unis. Après une synthèse dans la Partie 1 des théories sous‐jacentes à l’étude du pouvoir dans la corporation (séparation de la propriété et du contrôle et les conflits d’agence), nous concentrons notre analyse dans la Partie 2 sur les différents types de méchanismes (1) de gouvernance interne, (2) juridiques et (3) marchands, qui confèrent du pouvoir aux deux classes d’agents. Nous examinons comment les intérêts de ces deux classes peuvent être réalignés afin de prévenir et résoudre les conflits au sein de la firme. La Partie 3 explore un équilibre dynamique de pouvoir corporatif qui cherche à minimiser le potentiel d’opportunisme toute en préservant une quantité de discrétion suffisante pour la gestion efficace de la firme. Nous analysons des moyens pour renforcer les protections des actionnaires minoritaires et proposons un survol des pistes de réforme possibles.
Resumo:
The article examines the structure of the collaboration networks of research groups where Slovenian and Spanish PhD students are pursuing their doctorate. The units of analysis are student-supervisor dyads. We use duocentred networks, a novel network structure appropriate for networks which are centred around a dyad. A cluster analysis reveals three typical clusters of research groups. Those which are large and belong to several institutions are labelled under a bridging social capital label. Those which are small, centred in a single institution but have high cohesion are labelled as bonding social capital. Those which are small and with low cohesion are called weak social capital groups. Academic performance of both PhD students and supervisors are highest in bridging groups and lowest in weak groups. Other variables are also found to differ according to the type of research group. At the end, some recommendations regarding academic and research policy are drawn
Resumo:
We developed a stochastic simulation model incorporating most processes likely to be important in the spread of Phytophthora ramorum and similar diseases across the British landscape (covering Rhododendron ponticum in woodland and nurseries, and Vaccinium myrtillus in heathland). The simulation allows for movements of diseased plants within a realistically modelled trade network and long-distance natural dispersal. A series of simulation experiments were run with the model, representing an experiment varying the epidemic pressure and linkage between natural vegetation and horticultural trade, with or without disease spread in commercial trade, and with or without inspections-with-eradication, to give a 2 x 2 x 2 x 2 factorial started at 10 arbitrary locations spread across England. Fifty replicate simulations were made at each set of parameter values. Individual epidemics varied dramatically in size due to stochastic effects throughout the model. Across a range of epidemic pressures, the size of the epidemic was 5-13 times larger when commercial movement of plants was included. A key unknown factor in the system is the area of susceptible habitat outside the nursery system. Inspections, with a probability of detection and efficiency of infected-plant removal of 80% and made at 90-day intervals, reduced the size of epidemics by about 60% across the three sectors with a density of 1% susceptible plants in broadleaf woodland and heathland. Reducing this density to 0.1% largely isolated the trade network, so that inspections reduced the final epidemic size by over 90%, and most epidemics ended without escape into nature. Even in this case, however, major wild epidemics developed in a few percent of cases. Provided the number of new introductions remains low, the current inspection policy will control most epidemics. However, as the rate of introduction increases, it can overwhelm any reasonable inspection regime, largely due to spread prior to detection. (C) 2009 Elsevier B.V. All rights reserved.
Resumo:
It is well known that there is a dynamic relationship between cerebral blood flow (CBF) and cerebral blood volume (CBV). With increasing applications of functional MRI, where the blood oxygen-level-dependent signals are recorded, the understanding and accurate modeling of the hemodynamic relationship between CBF and CBV becomes increasingly important. This study presents an empirical and data-based modeling framework for model identification from CBF and CBV experimental data. It is shown that the relationship between the changes in CBF and CBV can be described using a parsimonious autoregressive with exogenous input model structure. It is observed that neither the ordinary least-squares (LS) method nor the classical total least-squares (TLS) method can produce accurate estimates from the original noisy CBF and CBV data. A regularized total least-squares (RTLS) method is thus introduced and extended to solve such an error-in-the-variables problem. Quantitative results show that the RTLS method works very well on the noisy CBF and CBV data. Finally, a combination of RTLS with a filtering method can lead to a parsimonious but very effective model that can characterize the relationship between the changes in CBF and CBV.
Resumo:
With the fast development of wireless communications, ZigBee and semiconductor devices, home automation networks have recently become very popular. Since typical consumer products deployed in home automation networks are often powered by tiny and limited batteries, one of the most challenging research issues is concerning energy reduction and the balancing of energy consumption across the network in order to prolong the home network lifetime for consumer devices. The introduction of clustering and sink mobility techniques into home automation networks have been shown to be an efficient way to improve the network performance and have received significant research attention. Taking inspiration from nature, this paper proposes an Ant Colony Optimization (ACO) based clustering algorithm specifically with mobile sink support for home automation networks. In this work, the network is divided into several clusters and cluster heads are selected within each cluster. Then, a mobile sink communicates with each cluster head to collect data directly through short range communications. The ACO algorithm has been utilized in this work in order to find the optimal mobility trajectory for the mobile sink. Extensive simulation results from this research show that the proposed algorithm significantly improves home network performance when using mobile sinks in terms of energy consumption and network lifetime as compared to other routing algorithms currently deployed for home automation networks.
Resumo:
Starting from the idea that economic systems fall into complexity theory, where its many agents interact with each other without a central control and that these interactions are able to change the future behavior of the agents and the entire system, similar to a chaotic system we increase the model of Russo et al. (2014) to carry out three experiments focusing on the interaction between Banks and Firms in an artificial economy. The first experiment is relative to Relationship Banking where, according to the literature, the interaction over time between Banks and Firms are able to produce mutual benefits, mainly due to reduction of the information asymmetry between them. The following experiment is related to information heterogeneity in the credit market, where the larger the bank, the higher their visibility in the credit market, increasing the number of consult for new loans. Finally, the third experiment is about the effects on the credit market of the heterogeneity of prices that Firms faces in the goods market.
Resumo:
Lo scopo del clustering è quindi quello di individuare strutture nei dati significative, ed è proprio dalla seguente definizione che è iniziata questa attività di tesi , fornendo un approccio innovativo ed inesplorato al cluster, ovvero non ricercando la relazione ma ragionando su cosa non lo sia. Osservando un insieme di dati ,cosa rappresenta la non relazione? Una domanda difficile da porsi , che ha intrinsecamente la sua risposta, ovvero l’indipendenza di ogni singolo dato da tutti gli altri. La ricerca quindi dell’indipendenza tra i dati ha portato il nostro pensiero all’approccio statistico ai dati , in quanto essa è ben descritta e dimostrata in statistica. Ogni punto in un dataset, per essere considerato “privo di collegamenti/relazioni” , significa che la stessa probabilità di essere presente in ogni elemento spaziale dell’intero dataset. Matematicamente parlando , ogni punto P in uno spazio S ha la stessa probabilità di cadere in una regione R ; il che vuol dire che tale punto può CASUALMENTE essere all’interno di una qualsiasi regione del dataset. Da questa assunzione inizia il lavoro di tesi, diviso in più parti. Il secondo capitolo analizza lo stato dell’arte del clustering, raffrontato alla crescente problematica della mole di dati, che con l’avvento della diffusione della rete ha visto incrementare esponenzialmente la grandezza delle basi di conoscenza sia in termini di attributi (dimensioni) che in termini di quantità di dati (Big Data). Il terzo capitolo richiama i concetti teorico-statistici utilizzati dagli algoritimi statistici implementati. Nel quarto capitolo vi sono i dettagli relativi all’implementazione degli algoritmi , ove sono descritte le varie fasi di investigazione ,le motivazioni sulle scelte architetturali e le considerazioni che hanno portato all’esclusione di una delle 3 versioni implementate. Nel quinto capitolo gli algoritmi 2 e 3 sono confrontati con alcuni algoritmi presenti in letteratura, per dimostrare le potenzialità e le problematiche dell’algoritmo sviluppato , tali test sono a livello qualitativo , in quanto l’obbiettivo del lavoro di tesi è dimostrare come un approccio statistico può rivelarsi un’arma vincente e non quello di fornire un nuovo algoritmo utilizzabile nelle varie problematiche di clustering. Nel sesto capitolo saranno tratte le conclusioni sul lavoro svolto e saranno elencati i possibili interventi futuri dai quali la ricerca appena iniziata del clustering statistico potrebbe crescere.
Resumo:
This publication offers concrete suggestions for implementing an integrative and learning-oriented approach to agricultural extension with the goal of fostering sustainable development. It targets governmental and non-governmental organisations, development agencies, and extension staff working in the field of rural development. The book looks into the conditions and trends that influence extension today, and outlines new challenges and necessary adaptations. It offers a basic reflection on the goals, the criteria for success and the form of a state-of-the-art approach to extension. The core of the book consists of a presentation of Learning for Sustainability (LforS), an example of an integrative, learning-oriented approach that is based on three crucial elements: stakeholder dialogue, knowledge management, and organizational development. Awareness raising and capacity building, social mobilization, and monitoring & evaluation are additional building blocks. The structure and organisation of the LforS approach as well as a selection of appropriate methods and tools are presented. The authors also address key aspects of developing and managing a learning-oriented extension approach. The book illustrates how LforS can be implemented by presenting two case studies, one from Madagascar and one from Mongolia. It addresses conceptual questions and at the same time it is practice-oriented. In contrast to other extension approaches, LforS does not limit its focus to production-related aspects and the development of value chains: it also addresses livelihood issues in a broad sense. With its focus on learning processes LforS seeks to create a better understanding of the links between different spheres and different levels of decision-making; it also seeks to foster integration of the different actors’ perspectives.