891 results for Hierarchical clustering model
Abstract:
In this paper we address the issue of locating hierarchical facilities in the presence of congestion. Two hierarchical models are presented, in which lower-level servers attend to requests first and some of the served customers are then referred to higher-level servers. In the first model, the objective is to find the minimum number of servers, and their locations, that will cover a given region within a distance or time standard. The second model is cast as a Maximal Covering Location formulation. A heuristic procedure is then presented together with computational experience. Finally, some extensions of these models that address other types of spatial configurations are offered.
Abstract:
Analysis of variance is commonly used in morphometry to ascertain differences in parameters between several populations. Failure to detect significant differences between populations (type II error) may be due to suboptimal sampling and lead to erroneous conclusions; the concept of statistical power allows one to avoid such failures through adequate sampling. Several examples are given from the morphometry of the nervous system, showing the use of the power of a hierarchical analysis of variance test for the choice of appropriate sample and subsample sizes. In the first case chosen, neuronal densities in the human visual cortex, we find the number of observations to be of little effect. For dendritic spine densities in the visual cortex of mice and humans, the effect is somewhat larger. A substantial effect is shown in our last example, dendritic segmental lengths in monkey lateral geniculate nucleus. It is in the nature of the hierarchical model that sample size is always more important than subsample size. The relative weight to be attributed to subsample size thus depends on the relative magnitude of the between-observations variance compared to the between-individuals variance.
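The statement that sample size matters more than subsample size can be illustrated with the textbook variance formula for a balanced two-level nested design. The following is a minimal sketch, not the paper's power computation: it assumes a balanced design, and the function name and the numeric variance components are hypothetical values chosen only for illustration.

```python
def variance_of_mean(var_between, var_within, n_individuals, n_obs_per_individual):
    """Variance of a population mean in a balanced two-level nested design.

    var_between: between-individuals variance component
    var_within:  between-observations (within-individual) variance component
    """
    # The between-individuals term shrinks only with the number of individuals;
    # the within-individual term shrinks with the total number of observations.
    return (var_between / n_individuals
            + var_within / (n_individuals * n_obs_per_individual))


# Hypothetical variance components, for illustration only.
base = variance_of_mean(4.0, 1.0, n_individuals=5, n_obs_per_individual=10)
more_individuals = variance_of_mean(4.0, 1.0, n_individuals=10, n_obs_per_individual=10)
more_observations = variance_of_mean(4.0, 1.0, n_individuals=5, n_obs_per_individual=20)
print(base, more_individuals, more_observations)
```

Doubling the number of individuals halves the whole variance, whereas doubling the observations per individual only reduces the second term. How much that second term matters depends on the ratio of the two variance components, which is the dependence the abstract describes.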
Abstract:
The coverage and volume of geo-referenced datasets are extensive and incessantly growing. The systematic capture of geo-referenced information generates large volumes of spatio-temporal data to be analyzed. Clustering and visualization play a key role in exploratory data analysis and in the extraction of knowledge embedded in these data. However, new challenges in visualization and clustering are posed by the special characteristics of these data: complex structures, a large quantity of samples, variables involved in a temporal context, high dimensionality and large variability in cluster shapes. The central aim of my thesis is to propose new algorithms and methodologies for clustering and visualization, in order to assist knowledge extraction from spatio-temporal geo-referenced data, thus improving decision-making processes. I present two original algorithms, one for clustering, the Fuzzy Growing Hierarchical Self-Organizing Networks (FGHSON), and one for exploratory visual data analysis, the Tree-structured Self-organizing Maps Component Planes. In addition, I present methodologies that, combined with FGHSON and the Tree-structured SOM Component Planes, allow the integration of space and time seamlessly and simultaneously in order to extract knowledge embedded in a temporal context. The originality of the FGHSON lies in its capability to reflect the underlying structure of a dataset in a hierarchical fuzzy way. A hierarchical fuzzy representation of clusters is crucial when data include complex structures with large variability of cluster shapes, variances, densities and number of clusters. The most important characteristics of the FGHSON include: (1) It does not require an a priori setup of the number of clusters. (2) The algorithm executes several self-organizing processes in parallel; hence, when dealing with large datasets, the processes can be distributed, reducing the computational cost. (3) Only three parameters are necessary to set up the algorithm. In the case of the Tree-structured SOM Component Planes, the novelty of the algorithm lies in its ability to create a structure that allows the visual exploratory data analysis of large high-dimensional datasets. This algorithm creates a hierarchical structure of Self-Organizing Map Component Planes, arranging similar variables' projections in the same branches of the tree. Hence, similarities in variables' behavior can be easily detected (e.g. local correlations, maximal and minimal values and outliers). Both FGHSON and the Tree-structured SOM Component Planes were applied to several agroecological problems, proving to be very efficient in the exploratory analysis and clustering of spatio-temporal datasets. In this thesis I also tested three soft competitive learning algorithms: two well-known unsupervised soft competitive algorithms, namely the Self-Organizing Maps (SOMs) and the Growing Hierarchical Self-Organizing Maps (GHSOMs), and our original contribution, the FGHSON. Although the algorithms presented here have been used in several areas, to my knowledge there is no work applying and comparing the performance of those techniques on spatio-temporal geospatial data, as is done in this thesis. I propose original methodologies to explore spatio-temporal geo-referenced datasets through time. Our approach uses time windows to capture temporal similarities and variations by using the FGHSON clustering algorithm. The developed methodologies are used in two case studies. In the first, the objective was to find similar agroecozones through time, and in the second it was to find similar environmental patterns shifted in time. Several results presented in this thesis have led to new contributions to agroecological knowledge, for instance in sugar cane and blackberry production. Finally, in the framework of this thesis we developed several software tools: (1) a Matlab toolbox that implements the FGHSON algorithm, and (2) a program called BIS (Bio-inspired Identification of Similar agroecozones), an interactive graphical user interface tool which integrates the FGHSON algorithm with Google Earth in order to show zones with similar agroecological characteristics.
Abstract:
A systematic assessment of global neural network connectivity through direct electrophysiological assays has remained technically infeasible, even in simpler systems like dissociated neuronal cultures. We introduce an improved algorithmic approach based on Transfer Entropy to reconstruct structural connectivity from network activity monitored through calcium imaging. We focus in this study on the inference of excitatory synaptic links. Based on information theory, our method requires no prior assumptions on the statistics of neuronal firing and neuronal connections. The performance of our algorithm is benchmarked on surrogate time series of calcium fluorescence generated by the simulated dynamics of a network with known ground-truth topology. We find that the functional network topology revealed by Transfer Entropy depends qualitatively on the time-dependent dynamic state of the network (bursting or non-bursting). Thus, by conditioning with respect to the global mean activity, we improve the performance of our method. This allows us to focus the analysis on specific dynamical regimes of the network in which the inferred functional connectivity is shaped by monosynaptic excitatory connections, rather than by collective synchrony. Our method can discriminate between actual causal influences between neurons and spurious non-causal correlations due to light scattering artifacts, which inherently affect the quality of fluorescence imaging. Compared to other reconstruction strategies such as cross-correlation or Granger Causality methods, our method based on improved Transfer Entropy is remarkably more accurate. In particular, it provides a good estimation of the excitatory network clustering coefficient, allowing for discrimination between weakly and strongly clustered topologies. 
Finally, we demonstrate the applicability of our method to analyses of real recordings of in vitro disinhibited cortical cultures where we suggest that excitatory connections are characterized by an elevated level of clustering compared to a random graph (although not extreme) and can be markedly non-local.
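For readers unfamiliar with Transfer Entropy, the basic quantity can be sketched as follows. This is a plain plug-in estimator for discrete series with a history length of one, not the improved variant described in the abstract: it omits the conditioning on global mean activity and any handling of calcium fluorescence signals. The function name and the choice of history length are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    TE = sum p(x_next, x_now, y_now) * log2( p(x_next | x_now, y_now) / p(x_next | x_now) )
    where x is the target series and y is the source series.
    """
    triples = Counter()
    for t in range(len(target) - 1):
        triples[(target[t + 1], target[t], source[t])] += 1
    total = sum(triples.values())

    # Marginal counts needed for the two conditional probabilities.
    count_now_src = Counter()    # (x_now, y_now)
    count_next_now = Counter()   # (x_next, x_now)
    count_now = Counter()        # x_now
    for (x_next, x_now, y_now), c in triples.items():
        count_now_src[(x_now, y_now)] += c
        count_next_now[(x_next, x_now)] += c
        count_now[x_now] += c

    te = 0.0
    for (x_next, x_now, y_now), c in triples.items():
        p_joint = c / total
        p_given_both = c / count_now_src[(x_now, y_now)]
        p_given_self = count_next_now[(x_next, x_now)] / count_now[x_now]
        te += p_joint * log2(p_given_both / p_given_self)
    return te
```

When `target` is a one-step delayed copy of a random binary `source`, the estimate in the source-to-target direction is close to one bit, while the reverse direction stays near zero. That asymmetry is what makes the measure usable for inferring directed links.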
Abstract:
We present a simple model of communication in networks with hierarchical branching. We analyze the behavior of the model from the viewpoint of critical systems under different situations. For certain values of the parameters, a continuous phase transition between a sparse and a congested regime is observed and accurately described by an order parameter and the power spectra. At the critical point the behavior of the model is totally independent of the number of hierarchical levels. Also scaling properties are observed when the size of the system varies. The presence of noise in the communication is shown to break the transition. The analytical results are a useful guide to forecasting the main features of real networks.
Abstract:
We present a generator of random networks where both the degree-dependent clustering coefficient and the degree distribution are tunable. Following the same philosophy as in the configuration model, the degree distribution and the clustering coefficient for each class of nodes of degree k are fixed ad hoc and a priori. The algorithm generates corresponding topologies by applying first a closure of triangles and second the classical closure of remaining free stubs. The procedure unveils a universal relation between clustering and degree-degree correlations for all networks, where the level of assortativity establishes an upper limit to the level of clustering. Maximum assortativity ensures no restriction on the decay of the clustering coefficient whereas disassortativity sets a stronger constraint on its behavior. Correlation measures in real networks are seen to observe this structural bound.
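The degree-dependent clustering coefficient that the generator takes as input can be measured on an existing graph as sketched below. This is only the measurement side, under the usual definition (local clustering averaged over nodes of the same degree k); it is not the generator itself, and the function name and edge-list input format are assumptions.

```python
from collections import defaultdict


def clustering_by_degree(edges):
    """Return {k: mean local clustering coefficient of nodes with degree k}.

    edges: iterable of undirected (u, v) pairs; self-loops are ignored.
    Nodes of degree below 2 are skipped because their clustering is undefined.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    sums = defaultdict(float)
    counts = defaultdict(int)
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        # Each link between two neighbours is counted from both ends, so halve.
        links = sum(len(adj[n] & nbrs) for n in nbrs) / 2
        sums[k] += links / (k * (k - 1) / 2)
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}
```

For a 4-clique on nodes 0 to 3 with an extra triangle 3-4-5 attached at node 3, the degree-3 and degree-2 nodes are fully clustered while the degree-5 hub has a lower coefficient.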
Abstract:
A recent method used to optimize biased neural networks with low levels of activity is applied to a hierarchical model. As a consequence, the performance of the system is strongly enhanced. The steps to achieve optimization are analyzed in detail.
Abstract:
Background: The trithorax group (trxG) and Polycomb group (PcG) proteins are responsible for the maintenance of stable transcriptional patterns of many developmental regulators. They bind to specific regions of DNA and direct the post-translational modifications of histones, playing a role in the dynamics of chromatin structure. Results: We have performed genome-wide expression studies of trx and ash2 mutants in Drosophila melanogaster. Using computational analysis of our microarray data, we have identified 25 clusters of genes potentially regulated by TRX. Most of these clusters consist of genes that encode structural proteins involved in cuticle formation. This organization appears to be a distinctive feature of the regulatory networks of TRX and other chromatin regulators, since we have observed the same arrangement in clusters after experiments performed with ASH2, as well as in experiments performed by others with NURF, dMyc, and ASH1. We have also found many of these clusters to be significantly conserved in D. simulans, D. yakuba, D. pseudoobscura and partially in Anopheles gambiae. Conclusion: The analysis of genes governed by chromatin regulators has led to the identification of clusters of functionally related genes conserved in other insect species, suggesting this chromosomal organization is biologically important. Moreover, our results indicate that TRX and other chromatin regulators may act globally on chromatin domains that contain transcriptionally co-regulated genes.
Abstract:
This work is concerned with the development and application of novel unsupervised learning methods, with two target applications in mind: the analysis of forensic case data and the classification of remote sensing images. First, a method based on a symbolic optimization of the inter-sample distance measure is proposed to improve the flexibility of spectral clustering algorithms, and applied to the problem of forensic case data. This distance is optimized using a loss function related to the preservation of neighborhood structure between the input space and the space of principal components, and solutions are found using genetic programming. Results are compared to a variety of state-of-the-art clustering algorithms. Subsequently, a new large-scale clustering method based on a joint optimization of feature extraction and classification is proposed and applied to various databases, including two hyperspectral remote sensing images. The algorithm makes use of a functional model (e.g., a neural network) for clustering, which is trained by stochastic gradient descent. Results indicate that such a technique can easily scale to huge databases, can avoid the so-called out-of-sample problem, and can compete with or even outperform existing clustering algorithms on both artificial data and real remote sensing images. This is verified on small databases as well as very large problems.
Summary: This research work concerns the development and application of so-called unsupervised learning methods. The applications targeted by these methods are the analysis of forensic data and the classification of hyperspectral images in remote sensing. First, an unsupervised classification methodology based on the symbolic optimization of an inter-sample distance measure is proposed. This measure is obtained by optimizing a cost function related to the preservation of the neighborhood structure of a point between the space of the initial variables and the space of the principal components. This method is applied to the analysis of forensic data and compared with a range of existing methods. Second, a method based on a joint optimization of the variable selection and classification tasks is implemented in a neural network and applied to various databases, including two hyperspectral images. The neural network is trained with a stochastic gradient algorithm, which makes this technique applicable to very high resolution images. The results of applying the latter show that such a technique can classify very large databases without difficulty and gives results that compare favorably with existing methods.
Abstract:
To study the stress-induced effects caused by wounding from a new perspective, a metabolomic strategy based on HPLC-MS has been devised for the model plant Arabidopsis thaliana. To detect induced metabolites and precisely localise these compounds among the numerous constitutive metabolites, HPLC-MS analyses were performed in a two-step strategy. In the first step, rapid direct TOF-MS measurements of the crude leaf extract were performed with a ballistic gradient on a short LC column. The HPLC-MS data were investigated by multivariate analysis as total mass spectra (TMS). Principal components analysis (PCA) and hierarchical cluster analysis (HCA) on principal coordinates were combined for data treatment. PCA and HCA demonstrated a clear clustering of plant specimens and selected the most discriminating ions from the complete data analysis, leading to the specific detection of discrete induced ions (m/z values). Furthermore, pools of plants with homogeneous behaviour were constituted for confirmatory analysis. In the second step, long high-resolution LC profilings on a UPLC-TOF-MS system were run on the pooled samples. This made it possible to precisely localise the putative biological markers induced by wounding, through specific extraction of the accurate m/z values detected in the screening procedure with the TMS spectra.
Abstract:
T cell receptor (TCR-CD3) triggering involves both receptor clustering and conformational changes at the cytoplasmic tails of the CD3 subunits. The mechanism by which TCRalphabeta ligand binding confers conformational changes to CD3 is unknown. By using well-defined ligands, we showed that induction of the conformational change requires both multivalent engagement and the mobility restriction of the TCR-CD3 imposed by the plasma membrane. The conformational change is elicited by cooperative rearrangements of two TCR-CD3 complexes and does not require accompanying changes in the structure of the TCRalphabeta ectodomains. This conformational change at CD3 reverts upon ligand dissociation and is required for T cell activation. Thus, our permissive geometry model provides a molecular mechanism that rationalizes how the information of ligand binding to TCRalphabeta is transmitted to the CD3 subunits and to the intracellular signaling machinery.
Abstract:
OBJECTIVE: Hierarchical modeling has been proposed as a solution to the multiple exposure problem. We estimate associations between metabolic syndrome and different components of antiretroviral therapy using both conventional and hierarchical models. STUDY DESIGN AND SETTING: We use discrete time survival analysis to estimate the association between metabolic syndrome and cumulative exposure to 16 antiretrovirals from four drug classes. We fit a hierarchical model where the drug class provides a prior model of the association between metabolic syndrome and exposure to each antiretroviral. RESULTS: One thousand two hundred and eighteen patients were followed for a median of 27 months, with 242 cases of metabolic syndrome (20%) at a rate of 7.5 cases per 100 patient years. Metabolic syndrome was more likely to develop in patients exposed to stavudine, but was less likely to develop in those exposed to atazanavir. The estimate for exposure to atazanavir increased from hazard ratio of 0.06 per 6 months' use in the conventional model to 0.37 in the hierarchical model (or from 0.57 to 0.81 when using spline-based covariate adjustment). CONCLUSION: These results are consistent with trials that show the disadvantage of stavudine and advantage of atazanavir relative to other drugs in their respective classes. The hierarchical model gave more plausible results than the equivalent conventional model.
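The way a hierarchical model pulls an extreme estimate towards its drug class can be sketched with the standard normal-normal shrinkage formula on the log hazard ratio scale. This is an illustration of the general idea only, not the model fitted in the study: the function name, the standard error, and the class-level prior mean and standard deviation below are all hypothetical, and the output is not intended to reproduce the reported 0.37.

```python
from math import exp, log


def shrink_log_hr(log_hr, se, prior_mean, prior_sd):
    """Posterior mean of a log hazard ratio under a normal prior.

    log_hr, se:           conventional estimate and its standard error
    prior_mean, prior_sd: class-level prior on the log hazard ratio
    """
    precision_data = 1.0 / se ** 2
    precision_prior = 1.0 / prior_sd ** 2
    weight = precision_data / (precision_data + precision_prior)
    # Precision-weighted average of the data estimate and the prior mean.
    return weight * log_hr + (1.0 - weight) * prior_mean


# Hypothetical inputs: an imprecise conventional estimate of HR = 0.06
# and a class-level prior centred on HR = 0.8.
shrunk = shrink_log_hr(log(0.06), se=1.0, prior_mean=log(0.8), prior_sd=0.5)
print(exp(shrunk))
```

The less precise the conventional estimate is relative to the prior, the further the result moves towards the class mean, which is why implausibly extreme estimates based on little data are moderated most.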
Abstract:
We uncover the global organization of clustering in real complex networks. To this end, we ask whether triangles in real networks organize as in maximally random graphs with given degree and clustering distributions, or as in maximally ordered graph models where triangles are forced into modules. The answer comes by way of exploring m-core landscapes, where the m-core is defined, akin to the k-core, as the maximal subgraph with edges participating in at least m triangles. This property defines a set of nested subgraphs that, unlike k-cores, is able to distinguish between hierarchical and modular architectures. We find that the clustering organization in real networks is neither completely random nor ordered although, surprisingly, it is more random than modular. This supports the idea that the structure of real networks may in fact be the outcome of self-organized processes based on local optimization rules, in contrast to global optimization principles.
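The m-core definition above lends itself to a simple pruning procedure: repeatedly delete any edge that takes part in fewer than m triangles of the remaining graph until no such edge is left. The sketch below is a straightforward, unoptimized reading of that definition, not the authors' implementation; the function name and the edge-list input format are assumptions.

```python
def m_core(edges, m):
    """Return the edge set of the m-core as a set of frozenset({u, v}).

    edges: iterable of undirected (u, v) pairs; self-loops are ignored.
    An edge survives only if its endpoints share at least m common
    neighbours in the surviving subgraph (i.e. it lies in >= m triangles).
    """
    adj = {}
    for u, v in edges:
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)

    changed = True
    while changed:
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                # Skip edges already removed earlier in this sweep.
                if v in adj[u] and len(adj[u] & adj[v]) < m:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True
    return {frozenset((u, v)) for u in adj for v in adj[u]}
```

Because removing one edge can destroy triangles that supported other edges, the sweep is repeated until it reaches a fixed point. Running it for increasing m yields the nested sequence of subgraphs the abstract refers to.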
Abstract:
We present an integrated hierarchical model of psychopathology that more accurately captures empirical patterns of comorbidity between clinical syndromes and personality disorders. In order to verify the structural validity of the proposed model, this study aimed to analyze the convergence between the Restructured Clinical (RC) scales and Personality scales (PSY-5) of the MMPI-2-RF and the Clinical Syndrome and Personality Disorder scales of the MCMI-III. The MMPI-2-RF and MCMI-III were administered to a clinical sample of 377 outpatients (167 men and 210 women). The structural hypothesis was assessed using a Confirmatory Factor Analytic design with four common superordinate factors. An independent-cluster-basis solution was proposed based on maximum likelihood estimation and the application of several fit indices. The fit of the proposed model can be considered good, all the more so given its complexity.
Abstract:
This paper is a study of the concept of priority and its use together with the notion of hierarchy in academic writing and theoretical models of translation. Hierarchies and priorities can be implicit or explicit, prescribed, suggested or described. The paper starts, chronologically, with Nida and Levý’s hierarchical accounts of translation and follows their legacy in scholars as different as Newmark and Gutt. The concept of priorities is also hinted at in didactic models (Nord) as well as in norm-theoretical accounts of translation (Toury and Chesterman) within Descriptive Translation Studies. All of these authors are analyzed and commented on. The paper calls for a more systematic and straightforward account of translational priorities, and proposes a few conceptual tools that stem from this research model, including the concepts of ambition and richness of a translation. Finally, the paper concludes with an adaptation of Lakoff and Johnson’s view of prototypicality and its potential usefulness in research into and the understanding of translation.