808 results for Agglomerative Hierarchical Clustering
Abstract:
In public administration, both in Hungary and abroad, a significant percentage of projects slip behind schedule, fail to deliver what was originally expected, are over-administered according to the professional participants, and leave staff activities hard to follow. Most of these problems stem from the coexistence of the project organization with the hierarchical-functional office organization, and from cooperation between them that is difficult to synchronize. In this article the author presents a methodology proven in practice which, under a given set of conditions, largely eliminates these shortcomings and reduces project activities to a series of activities that fit into the organization's daily operations. The method started from a practical problem, the integration of the former APEH and VP systems into NAV, but in the author's opinion it can also be applied to other organizations built on functional foundations.
Abstract:
We build a multiple hierarchical model of a representative democracy in which, for instance, voters elect county representatives, county representatives elect district representatives, district representatives elect state representatives, and state representatives elect a prime minister. We use our model to show that the policy determined by the final representative can become more extreme as the number of hierarchical levels increases because of increased opportunities for gerrymandering. Thus, a sufficiently large number of voters gives a district maker an advantage, enabling her to implement her favorite policy. We also show that the range of implementable policies increases with the depth of the hierarchical system. Consequently, districting by a candidate in a hierarchical legislative system can be viewed as a type of policy implementation device.
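The effect described in this abstract can be sketched with a toy example. The sketch below is not the authors' model: it assumes 27 voters with one-dimensional ideal points 1..27, equal-sized districts, and that every group elects its median member (a median-voter simplification). The district layouts are hypothetical, drawn by an imagined district maker who prefers high policies.

```python
from statistics import median_low

def winner(node):
    """Return the policy elected by a (possibly nested) group.

    A number is a voter's ideal point; a list is a group whose members
    (voters or lower-level representatives) elect their median member.
    The median-voter rule is an assumption made for this illustration.
    """
    if isinstance(node, (int, float)):
        return node
    return median_low([winner(child) for child in node])

voters = list(range(1, 28))  # hypothetical ideal points 1..27

# No hierarchy: all voters elect the final representative directly.
direct = winner(voters)

# One level: three districts of nine voters each.
one_level = winner([
    [27, 26, 25, 24, 23, 1, 2, 3, 4],
    [22, 21, 20, 19, 18, 5, 6, 7, 8],
    [9, 10, 11, 12, 13, 14, 15, 16, 17],
])

# Two levels: three super-districts, each made of three districts of three.
two_levels = winner([
    [[27, 26, 1], [25, 24, 2], [3, 4, 5]],
    [[23, 22, 6], [21, 20, 7], [8, 9, 10]],
    [[11, 12, 13], [14, 15, 16], [17, 18, 19]],
])

print(direct, one_level, two_levels)
```

Under these assumptions the elected policy moves away from the overall median as levels are added, because each extra level lets the district maker win with a smaller core of like-minded voters.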
Abstract:
Phylogenetic analyses were performed on six genera and 46 species of the Neotropical palm tribe Geonomeae. The analyses were based on two low-copy nuclear DNA sequences from the genes encoding phosphoribulokinase and RNA polymerase II. The basal node of the tribe was polytomous. Pholidostachys formed a monophyletic group. The currently accepted genera Calyptronoma and Calyptrogyne formed a well-supported clade, with Calyptronoma resolved as paraphyletic to Calyptrogyne. Geonoma formed a strongly supported monophyletic group consisting of two main clades. The genetic distinctness between Geonoma macrostachys varieties was evaluated at local and regional scales using inter-simple sequence repeat (ISSR) markers. Clustering, ordination, and AMOVA suggested a lack of genetic distinctness between varieties at the regional level. A hierarchical AMOVA revealed that the genetic diversity mainly lies among the four localities sampled. A significant genetic differentiation between sympatric varieties occurred in one locality only. The current taxonomy of G. macrostachys, which recognizes only one species, was therefore supported. The preferred habitat of sympatric G. macrostachys varieties with respect to edaphic, topographic, and light factors was studied in three Peruvian lowland forests. The two varieties were mostly encountered in different physiographically defined habitats, with variety acaulis occurring more often in floodplain forest and variety macrostachys in the tierra firme. Comparison-of-means tests revealed that nine to eleven of the 16 environmental variables differed significantly between varieties. Edaphic factors, mainly soil texture and K content, contributed more than light conditions to distinguishing the habitats occupied by the two varieties in all three study sites. It is concluded that habitat differentiation plays a role in the coexistence of these closely related taxa.
Abstract:
This research establishes new optimization methods for pattern recognition and classification of different white blood cells in actual patient data to enhance the diagnostic process. Beckman-Coulter Corporation supplied flow cytometry data from numerous patients, which were used as training sets to exploit the different physiological characteristics of the samples provided. Support Vector Machines (SVM) and Artificial Neural Networks (ANN) were used as promising pattern classification techniques to identify different white blood cell samples and to provide medical doctors with diagnostic references for a specific disease state, leukemia. The results show that a neural network classifier that is well configured and trained with cross-validation can perform better than support vector classifiers alone for this type of data. Furthermore, a new unsupervised learning algorithm, the Density-based Adaptive Window Clustering algorithm (DAWC), was designed to process large volumes of data and find the locations of high-density data clusters in real time. It reduces the computational load to ∼O(N) computations, making the algorithm more attractive and faster than current hierarchical algorithms.
Abstract:
With the recent explosion in the complexity and amount of digital multimedia data, there has been a huge impact on the operations of various organizations in distinct areas, such as government services, education, medical care, business, entertainment, etc. To satisfy the growing demand of multimedia data management systems, an integrated framework called DIMUSE is proposed and deployed for distributed multimedia applications to offer a full scope of multimedia related tools and provide appealing experiences for the users. This research mainly focuses on video database modeling and retrieval by addressing a set of core challenges. First, a comprehensive multimedia database modeling mechanism called Hierarchical Markov Model Mediator (HMMM) is proposed to model high dimensional media data including video objects, low-level visual/audio features, as well as historical access patterns and frequencies. The associated retrieval and ranking algorithms are designed to support not only the general queries, but also the complicated temporal event pattern queries. Second, system training and learning methodologies are incorporated such that user interests are mined efficiently to improve the retrieval performance. Third, video clustering techniques are proposed to continuously increase the searching speed and accuracy by architecting a more efficient multimedia database structure. A distributed video management and retrieval system is designed and implemented to demonstrate the overall performance. The proposed approach is further customized for a mobile-based video retrieval system to solve the perception subjectivity issue by considering individual user's profile. Moreover, to deal with security and privacy issues and concerns in distributed multimedia applications, DIMUSE also incorporates a practical framework called SMARXO, which supports multilevel multimedia security control. 
SMARXO efficiently combines role-based access control (RBAC), XML, and an object-relational database management system (ORDBMS) to achieve proficient security control. A distributed multimedia management system named DMMManager (Distributed MultiMedia Manager) is developed with the proposed framework DEMUR to support multimedia capturing, analysis, retrieval, authoring, and presentation in one single framework.
Abstract:
The rapid growth of the Internet and advances in Web technologies have given users access to large amounts of online music data, including acoustic signals, lyrics, style/mood labels, and user-assigned tags. This progress has made music listening more fun, but it has raised the issue of how to organize the data and, more generally, how computer programs can assist users in their music experience. An important subject in computer-aided music listening is music retrieval, i.e., efficiently helping users locate the music they are looking for. Traditionally, songs were organized in a hierarchical structure such as genre->artist->album->track to facilitate navigation. However, users' intentions are often hard to capture in such a simply organized structure. Users may want to listen to music of a particular mood, style, or topic, and/or any songs similar to some given music samples. This motivated us to work on a user-centric music retrieval system to improve users' satisfaction with the system. Traditional music information retrieval research was mainly concerned with classification, clustering, identification, and similarity search of acoustic music data by way of feature extraction algorithms and machine learning techniques. More recently, music information retrieval research has focused on utilizing other types of data, such as lyrics, user-access patterns, and user-defined tags, and on targeting non-genre categories for classification, such as mood labels and styles. This dissertation focused on investigating and developing effective data mining techniques for (1) organizing and annotating music data with styles, moods, and user-assigned tags; (2) performing effective analysis of music data with features from diverse information sources; and (3) recommending songs to users utilizing both content features and user access patterns.
Abstract:
Due to rapid advances in computing and sensing technologies, enormous amounts of data are being generated every day in various applications. The integration of data mining and data visualization has been widely used to analyze these massive and complex data sets and to discover hidden patterns. For both data mining and visualization to be effective, it is important to include visualization techniques in the mining process and to present the discovered patterns in a more comprehensive visual view. In this dissertation, four related problems are studied to explore the integration of data mining and data visualization: dimensionality reduction for visualizing high-dimensional datasets, visualization-based clustering evaluation, interactive document mining, and multiple clusterings exploration. In particular, we 1) propose an efficient feature selection method (reliefF + mRMR) for preprocessing high-dimensional datasets; 2) present DClusterE to integrate cluster validation with user interaction and provide rich visualization tools for users to examine document clustering results from multiple perspectives; 3) design two interactive document summarization systems to involve users' efforts and generate customized summaries from 2D sentence layouts; and 4) propose a new framework which organizes the different input clusterings into a hierarchical tree structure and allows for interactive exploration of multiple clustering solutions.
Abstract:
Driven by cross-strait political and economic forces, Taiwan has been one of the major source markets for the Hong Kong tourism industry since 1987. The major purposes of this study were to investigate (1) the influential factors of travel motivation, (2) the clusters of travel motivations, and (3) the market segmentation of clusters of Taiwanese tourists visiting Hong Kong. Through ten travel agents, self-report surveys were distributed to collect data from 366 Taiwanese travelers. Factor analysis identified four push factors and six pull factors as travel motivations. Cluster analysis then found five groups, each possessing a unique profile (location difference, visiting frequency, travel satisfaction, and destination loyalty). Suggestions for developing effective marketing strategies to attract Taiwanese tourists to Hong Kong are also provided.
Abstract:
Online Social Network (OSN) services provided by Internet companies bring people together to chat and to share and enjoy information. Meanwhile, those services (which can be regarded as social media) generate huge amounts of data every day, every hour, even every minute and every second. Researchers are currently interested in analyzing OSN data, extracting interesting patterns from it, and applying those patterns to real-world applications. However, the sheer scale of OSN data makes it difficult to analyze effectively. This dissertation focuses on applying data mining and information retrieval techniques to mine two key components of social media data: users and user-generated content. Specifically, it aims at addressing three problems related to social media users and content: (1) how does one organize the users and the content? (2) how does one summarize the textual content so that users do not have to go over every post to capture the general idea? (3) how does one identify the influential users in social media to benefit other applications, e.g., marketing campaigns? The contributions of this dissertation are briefly summarized as follows. (1) It provides a comprehensive and versatile data mining framework to analyze the users and user-generated content of social media. (2) It designs a hierarchical co-clustering algorithm to organize the users and content. (3) It proposes multi-document summarization methods to extract core information from social network content. (4) It introduces three important dimensions of social influence and a dynamic influence model for identifying influential users.
Abstract:
Intense precipitation events (IPE) have been causing great social and economic losses in the affected regions. In the Amazon, these events can have serious impacts, primarily for populations living on the margins of its countless rivers, because elevated water levels generally lead to floods and/or inundations. Thus, the main objective of this research is to study IPE through Extreme Value Theory (EVT), to estimate return periods of these events, and to identify the regions of the Brazilian Amazon where IPE have the largest values. The study was performed using daily rainfall data from the hydrometeorological network managed by the National Water Agency (Agência Nacional de Água) and from the Meteorological Data Bank for Education and Research (Banco de Dados Meteorológicos para Ensino e Pesquisa) of the National Institute of Meteorology (Instituto Nacional de Meteorologia), covering the period 1983-2012. First, homogeneous rainfall regions were determined through cluster analysis, using the hierarchical agglomerative Ward method. Then synthetic series were created to represent the homogeneous regions. Next, EVT was applied to these series through the Generalized Extreme Value (GEV) distribution and the Generalized Pareto Distribution (GPD). The goodness of fit of these distributions was evaluated with the Kolmogorov-Smirnov test, which compares the empirical cumulative distributions with the theoretical ones. Finally, the composition technique was used to characterize the prevailing atmospheric patterns for the occurrence of IPE. The results suggest that the Brazilian Amazon has six homogeneous rainfall regions. More severe IPE are expected to occur in the south and on the Amazon coast. 
More intense rainfall events are expected during the rainy or transition seasons of each sub-region, with total daily precipitation of 146.1, 143.1 and 109.4 mm (GEV) and 201.6, 209.5 and 152.4 mm (GPD) at least once a year in the south, on the coast, and in the northwest of the Brazilian Amazon, respectively. For southern Amazonia, the composition analysis revealed that IPE are associated with the configuration and formation of the South Atlantic Convergence Zone. Along the coast, intense precipitation events are associated with mesoscale systems, such as squall lines. In northwestern Amazonia, IPE are apparently associated with the Intertropical Convergence Zone and/or local convection.
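The regionalization step named in this abstract, hierarchical agglomerative clustering with Ward's method, can be sketched as follows. This is a minimal pure-Python illustration, not the study's procedure: the station values are invented, the two features (wet-season and dry-season mean rainfall) are an assumption, and the function names are hypothetical. At each step it merges the two clusters whose union gives the smallest increase in within-cluster sum of squares.

```python
from itertools import combinations

def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist

def ward_clusters(points, k):
    """Agglomerate singleton clusters with Ward's criterion until k remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters

# Invented stations: (wet-season mean, dry-season mean) rainfall in mm/month.
stations = [
    (300.0, 50.0), (310.0, 55.0), (295.0, 45.0),
    (150.0, 120.0), (155.0, 125.0),
    (80.0, 10.0), (85.0, 12.0),
]

regions = ward_clusters(stations, 3)
groups = sorted(sorted(region) for region in regions)
for group in groups:
    print(group)
```

This naive version recomputes every pairwise cost at each merge, which is fine for a handful of stations; a library implementation would be the practical choice for a real station network.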
Abstract:
This paper proposes a method to evaluate hierarchical image segmentation procedures, enabling comparisons between different hierarchical algorithms and between these and other (non-hierarchical) segmentation techniques (as well as edge detectors). The proposed method builds on the edge-based segmentation evaluation approach by considering a set of reference human segmentations as a sample drawn from the population of different levels of detail that may be used in segmenting an image. Our main point is that, since a hierarchical sequence of segmentations approximates such a population, those segmentations in the sequence that best capture each human segmentation's level of detail should provide the basis for the evaluation of the hierarchical sequence as a whole. A small computational experiment is carried out to show the feasibility of our approach.
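The idea of scoring a hierarchy by its best-matching level for each human reference can be sketched as below. This is only an illustration under stated assumptions, not the paper's procedure: boundaries are represented as sets of hypothetical edge-pixel identifiers, matching is exact set intersection, the per-level score is the F-measure, and the per-reference best scores are averaged.

```python
def f_measure(predicted, reference):
    """F-measure between two sets of boundary pixels (exact matching)."""
    true_pos = len(predicted & reference)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(reference)
    return 2 * precision * recall / (precision + recall)

def evaluate_hierarchy(levels, references):
    """Average, over references, of the best F-measure found in the hierarchy."""
    best_scores = [
        max(f_measure(level, ref) for level in levels) for ref in references
    ]
    return sum(best_scores) / len(best_scores)

# Hypothetical hierarchy: each finer level adds boundary pixels to the coarser one.
levels = [
    {1, 2},
    {1, 2, 3, 4},
    {1, 2, 3, 4, 5, 6, 7, 8},
]

# Hypothetical human segmentations drawn at different levels of detail.
references = [
    {1, 2},
    {1, 2, 3, 5},
]

score = evaluate_hierarchy(levels, references)
print(score)
```

Here the coarse reference is matched perfectly by the first level, while the more detailed one is matched best by the middle level, so neither reference penalizes the hierarchy for levels drawn at a different granularity.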
Abstract:
Acknowledgements MW and RVD have been supported by the German Federal Ministry for Education and Research via the BMBF Young Investigators Group CoSy-CC2 (grant no. 01LN1306A). JFD thanks the Stordalen Foundation and BMBF (project GLUES) for financial support. JK acknowledges the IRTG 1740 funded by DFG and FAPESP. Coupled climate network analysis has been performed using the Python package pyunicorn (Donges et al., 2015a) that is available at https://github.com/pik-copan/pyunicorn.