846 results for high dimensional growing self organizing map with randomness
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Abstract:
The Asteraceae, one of the largest families among the angiosperms, is chemically characterised by the production of sesquiterpene lactones (SLs). A total of 1,111 SLs, extracted from 658 species, 161 genera, 63 subtribes and 15 tribes of Asteraceae, were represented and registered in two dimensions in SISTEMATX, an in-house software system, and were associated with their botanical sources. Eleven blocks of descriptors (Constitutional, Functional groups, BCUT, Atom-centred, 2D autocorrelations, Topological, Geometrical, RDF, 3D-MoRSE, GETAWAY and WHIM) were used as input data to separate the botanical occurrences through self-organising maps. The maps generated with each descriptor block separated the Asteraceae tribes, with total index values between 66.7% and 83.6%. The analysis of the results shows evident similarities among the Heliantheae, Helenieae and Eupatorieae tribes, as well as between the Anthemideae and Inuleae tribes. These observations agree with the systematic classifications proposed by Bremer, which rely mainly on morphological and molecular data; the chemical markers therefore partially corroborate these classifications. The results demonstrate that the atom-centred and RDF descriptors can be used as a tool for taxonomic classification at low hierarchical levels, such as tribes. Descriptors obtained from fragments or from the two-dimensional representation of the SL structures were sufficient to obtain significant results, and better results were not achieved by using descriptors derived from three-dimensional representations of SLs. Such models based on physico-chemical properties can project newly designed SLs, similar structures from the literature or even unreported structures into a two-dimensional chemical space. Therefore, the generated SOMs can predict the most probable tribe in which a biologically active molecule can be found, according to the Bremer classification.
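As an illustration of the kind of workflow this abstract describes, the sketch below trains a self-organising map on one block of molecular descriptors and labels each map unit by the most frequent tribe that falls on it. The descriptor matrix, the tribe labels and the use of the third-party MiniSom package are assumptions for illustration, not details taken from the study.

```python
# Minimal sketch (not the authors' code): map compounds described by one
# descriptor block onto a SOM and label each neuron by its majority tribe.
# X (n_compounds, n_descriptors) and `tribes` are hypothetical inputs,
# assumed already scaled; requires the MiniSom package.
from collections import Counter

from minisom import MiniSom

def train_tribe_som(X, tribes, grid=(10, 10), n_iter=5000, seed=0):
    som = MiniSom(grid[0], grid[1], X.shape[1],
                  sigma=1.0, learning_rate=0.5, random_seed=seed)
    som.random_weights_init(X)
    som.train_random(X, n_iter)

    # group the compounds landing on each neuron, keep the majority tribe
    hits = {}
    for x, tribe in zip(X, tribes):
        hits.setdefault(som.winner(x), []).append(tribe)
    labels = {cell: Counter(ts).most_common(1)[0][0] for cell, ts in hits.items()}
    return som, labels

# Usage (hypothetical): som, labels = train_tribe_som(X, tribes)
# labels.get(som.winner(new_sl_descriptors))  -> most probable tribe
```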
Abstract:
The Peer-to-Peer (P2P) network paradigm is drawing the attention of both end users and researchers for its features. P2P networks shift from the classic client-server approach to a high level of decentralization in which there is no central control and all the nodes should be able not only to request services but also to provide them to other peers. While on one hand such a high level of decentralization might lead to interesting properties like scalability and fault tolerance, on the other hand it introduces many new problems to deal with. A key feature of many P2P systems is openness, meaning that everybody is potentially able to join a network with no need for subscription or payment systems. The combination of openness and lack of central control makes it feasible for a user to free-ride, that is, to increase their own benefit by using services without allocating resources to satisfy other peers' requests. One of the main goals when designing a P2P system is therefore to achieve cooperation between users. Given that P2P systems are based on simple local interactions of many peers having partial knowledge of the whole system, an interesting way to achieve desired properties at the system scale might be to obtain them as emergent properties of the many interactions occurring at the local node level. Two methods are typically used to face the problem of cooperation in P2P networks: 1) engineering emergent properties when designing the protocol; 2) studying the system as a game and applying Game Theory techniques, especially to find Nash equilibria in the game and to reach them, making the system stable against possible deviant behaviours. In this work we present an evolutionary framework to enforce cooperative behaviour in P2P networks that is an alternative to both the methods mentioned above. Our approach is based on an evolutionary algorithm inspired by computational sociology and evolutionary game theory, in which each peer periodically tries to copy another peer that is performing better. The proposed algorithms, called SLAC and SLACER, draw inspiration from tag systems originating in computational sociology; the main idea behind the algorithms is that low-performance nodes copy high-performance ones. The algorithm is run locally by every node and leads to an evolution of the network both from the topology and from the nodes' strategy point of view. Initial tests with a simple Prisoner's Dilemma application show how SLAC is able to bring the network to a state of high cooperation independently of the initial network conditions. Interesting results are obtained when studying the effect of cheating nodes on the SLAC algorithm: in some cases, selfish nodes that rationally exploit the system for their own benefit can actually improve system performance from the point of view of cooperation formation. The final step is to apply our results to more realistic scenarios. We focused on studying and improving the BitTorrent protocol. BitTorrent was chosen not only for its popularity but also because it has many points in common with the SLAC and SLACER algorithms, ranging from the game-theoretical inspiration (a tit-for-tat-like mechanism) to the swarm topology.
We found fairness, understood as the ratio between uploaded and downloaded data, to be a weakness of the original BitTorrent protocol, and we drew on the understanding of cooperation formation and maintenance mechanisms gained from the development and analysis of SLAC and SLACER to improve fairness and to tackle free-riding and cheating in BitTorrent. We produced an extension of BitTorrent, called BitFair, which has been evaluated through simulation and has shown its ability to enforce fairness and to counter free-riding and cheating nodes.
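For readers unfamiliar with SLAC, the sketch below illustrates the copy-and-rewire step described above: a worse-performing node copies a better-performing node's strategy and neighbourhood, with occasional mutation. It is a schematic reading of the abstract, not the thesis's protocol; the data structures and parameter values are illustrative.

```python
# Schematic SLAC-style update (illustrative, not the thesis's implementation).
import random

def slac_step(nodes, links, strategy, utility, mutate_rate=0.01):
    """nodes: list of ids; links: id -> set of neighbour ids;
    strategy: id -> 'cooperate'/'defect'; utility: id -> recent payoff."""
    i, j = random.sample(nodes, 2)
    if utility[j] > utility[i]:
        # the low-performance node copies the high-performance one:
        # same strategy, same neighbourhood, plus a link to the copied node
        strategy[i] = strategy[j]
        links[i] = set(links[j]) | {j}
        utility[i] = 0.0                      # payoff restarts after copying
    # mutation keeps the population exploring new strategies and topologies
    if random.random() < mutate_rate:
        strategy[i] = random.choice(["cooperate", "defect"])
    if random.random() < mutate_rate:
        links[i] = {random.choice(nodes)}     # rewire to a random node
```

For brevity the sketch treats links as one-directional; in an undirected network the copied neighbourhood would also be mirrored on the neighbours' link sets.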
Abstract:
A prevalent claim is that we live in a knowledge economy. When we talk about a knowledge economy, we generally mean the concept of a "knowledge-based economy", indicating the use of knowledge and technologies to produce economic benefits. Knowledge is thus both the tool and the raw material (people's skills) for producing some kind of product or service. In this kind of environment economic organization is undergoing several changes. For example, authority relations are less important, legal and ownership-based definitions of the boundaries of the firm are becoming irrelevant, and there are only few constraints on the set of coordination mechanisms. Hence what characterises a knowledge economy is the growing importance of human capital in productive processes (Foss, 2005) and the increasing knowledge intensity of jobs (Hodgson, 1999). Economic processes are also highly intertwined with social processes: they are likely to be informal and reciprocal rather than formal and negotiated. Another important point is the problem of the division of labour: as economic activity becomes mainly intellectual and requires the integration of specific and idiosyncratic skills, the task of dividing the job and assigning it to the most appropriate individuals becomes arduous, a "supervisory problem" (Hodgson, 1999) emerges, and traditional hierarchical control may prove increasingly ineffective. Not only does the specificity of know-how make it awkward to monitor the execution of tasks; more importantly, top-down integration of skills may be difficult because ‘the nominal supervisors will not know the best way of doing the job – or even the precise purpose of the specialist job itself – and the worker will know better’ (Hodgson, 1999). We therefore expect the organization of the economic activity of specialists to be, at least partially, self-organized. The aim of this thesis is to bridge studies from computer science, and in particular from Peer-to-Peer (P2P) networks, to organization theories. We think that the P2P paradigm fits well with the organizational problems arising in all those situations in which a central authority is not possible. We believe that P2P networks show a number of characteristics similar to firms working in a knowledge-based economy, and hence that the methodology used for studying P2P networks can be applied to organization studies. There are three main characteristics we think P2P networks have in common with firms involved in the knowledge economy: decentralization (in a pure P2P system every peer is an equal participant, with no central authority governing the actions of the single peers); cost of ownership (P2P computing implies shared ownership, reducing the cost of owning the systems and the content, and the cost of maintaining them); and self-organization (the process in a system leading to the emergence of global order within the system without the presence of another system dictating this order). These characteristics are also present in the kind of firm that we try to address, and that is why we have carried over the techniques we adopted for studies in computer science (Marcozzi et al., 2005; Hales et al., 2007 [39]) to management science.
Abstract:
Network Theory is a prolific and lively field, especially where it meets Biology. New concepts from this theory find application in areas where extensive datasets are already available for analysis, without the need to invest money to collect them. The only tools necessary to accomplish an analysis are easily accessible: a computing machine and a good algorithm. As these two tools progress, thanks to technological advances and human effort, ever larger datasets can be analysed. The aim of this paper is twofold. The first is to provide an overview of one of these concepts, which originates at the meeting point between Network Theory and Statistical Mechanics: the entropy of a network ensemble. This quantity has been described from different angles in the literature; our approach tries to be a synthesis of the different points of view. The second part of the work is devoted to presenting a parallel algorithm that can evaluate this quantity over an extensive dataset. Finally, the algorithm will also be used to analyse high-throughput data coming from biology.
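The abstract does not reproduce the definitions it synthesises; one common formulation is the Shannon entropy of a canonical ensemble in which each link (i, j) is present independently with probability p_ij, S = -Σ_{i<j} [p_ij ln p_ij + (1 - p_ij) ln(1 - p_ij)]. The sketch below evaluates that expression for a hypothetical probability matrix; the vectorised sum only hints at the parallelism discussed in the paper.

```python
# Minimal sketch (not the paper's parallel algorithm): entropy of a canonical
# network ensemble with independent link probabilities p[i, j] (symmetric).
import numpy as np

def ensemble_entropy(p):
    """S = -sum_{i<j} [p ln p + (1-p) ln(1-p)], in nats."""
    iu = np.triu_indices_from(p, k=1)         # count each undirected pair once
    q = np.clip(p[iu], 1e-12, 1 - 1e-12)      # guard against log(0)
    return float(-np.sum(q * np.log(q) + (1 - q) * np.log(1 - q)))

# Example: an Erdos-Renyi-like ensemble on 100 nodes with uniform p = 0.1.
p = np.full((100, 100), 0.1)
print(ensemble_entropy(p))
```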
Abstract:
In biostatistical applications, interest often focuses on the estimation of the distribution of the time T between two consecutive events. If the initial event time is observed and the subsequent event time is only known to be larger or smaller than an observed monitoring time C, then the data are described by the well-known singly-censored current status model, also known as interval-censored data, case I. We extend this current status model by allowing the presence of a time-dependent covariate process, which is partly observed, and by allowing C to depend on T through the observed part of this time-dependent process. Because of the high dimension of the covariate process, no globally efficient estimators exist with good practical performance at moderate sample sizes. We follow the approach of Robins and Rotnitzky (1992) by modelling the censoring variable, given the time variable and the covariate process, i.e., the missingness process, under the restriction that it satisfies coarsening at random. We propose a generalization of the simple current status estimator of the distribution of T and of smooth functionals of the distribution of T, which is based on an estimate of the missingness process. In this estimator the covariates enter only through the estimate of the missingness process. Due to the coarsening at random assumption, the estimator has the interesting property that if we estimate the missingness process more nonparametrically, then we improve its efficiency. We show that by local estimation of an optimal model or optimal function of the covariates for the missingness process, the generalized current status estimator for smooth functionals becomes locally efficient, meaning that it is efficient if the right model or covariate is consistently estimated, and that it is consistent and asymptotically normal in general. Estimation of the optimal model requires estimation of the conditional distribution of T given the covariates. Any prior knowledge of this conditional distribution can be used at this stage without any risk of losing root-n consistency. We also propose locally efficient one-step estimators. Finally, we present some simulation results.
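For context, the "simple current status estimator" that the paper generalises can be sketched as follows: with monitoring times C_i and indicators Δ_i = 1{T_i ≤ C_i}, the nonparametric maximum likelihood estimator of F(t) = P(T ≤ t) is the isotonic (non-decreasing) regression of the Δ_i on the C_i. The code below illustrates this on simulated data using scikit-learn; it does not implement the locally efficient, covariate-adjusted estimator proposed in the paper.

```python
# Sketch of the simple current status estimator (not the paper's locally
# efficient estimator): isotonic regression of delta = 1{T <= C} on C.
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)
n = 500
T = rng.exponential(1.0, size=n)              # unobserved event times
C = rng.uniform(0.0, 3.0, size=n)             # observed monitoring times
delta = (T <= C).astype(float)                # current status indicators

iso = IsotonicRegression(y_min=0.0, y_max=1.0, increasing=True,
                         out_of_bounds="clip")
iso.fit(C, delta)

grid = np.linspace(0.0, 3.0, 7)
print(np.c_[grid, iso.predict(grid), 1 - np.exp(-grid)])  # estimate vs true F
```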
Abstract:
This paper describes the basic tools for working with wireless sensors. TinyOS has a component-based architecture which enables rapid innovation and implementation while minimizing code size, as required by the severe memory constraints inherent in sensor networks. TinyOS's component library includes network protocols, distributed services, sensor drivers, and data acquisition tools, all of which can be used as-is or be further refined for a custom application. TinyOS was originally developed as a research project at the University of California, Berkeley, but has since grown to have an international community of developers and users. Some algorithms concerning packet routing are shown. In-car entertainment systems can be based on wireless sensors in order to obtain information from the Internet, but routing protocols must be implemented to avoid bottleneck problems. Ant Colony algorithms are particularly useful in such cases; therefore they can be embedded into the sensors to perform this routing task.
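The abstract does not spell out the routing rule, so the sketch below shows a generic ant-colony next-hop choice (probability proportional to pheromone^α · heuristic^β) together with evaporation and reinforcement. It is written in Python for readability rather than in TinyOS's nesC, and all names and parameters are illustrative rather than taken from the paper.

```python
# Generic ant-colony routing sketch (not TinyOS/nesC and not a protocol from
# the paper): probabilistic next-hop choice plus pheromone update.
import random

def choose_next_hop(neighbours, tau, eta, alpha=1.0, beta=2.0):
    """tau: pheromone per neighbour; eta: heuristic, e.g. 1/estimated delay."""
    weights = [(tau[n] ** alpha) * (eta[n] ** beta) for n in neighbours]
    r, acc = random.random() * sum(weights), 0.0
    for node, w in zip(neighbours, weights):
        acc += w
        if r <= acc:
            return node
    return neighbours[-1]

def update_pheromone(tau, path, rho=0.1, deposit=1.0):
    # evaporate everywhere, then reinforce the nodes on a successful route
    for n in tau:
        tau[n] *= (1.0 - rho)
    for n in path:
        tau[n] += deposit / len(path)
```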
Abstract:
This thesis analyzes the morphological evolution of assemblies of living neurons as they self-organize from collections of separated cells into elaborate, clustered networks. In particular, it contributes the design and implementation of a graph-based unsupervised segmentation algorithm with a very low computational cost. The processing automatically retrieves the whole network structure from large-scale phase-contrast images taken at high resolution throughout the entire life of a cultured neuronal network. The network structure is represented by a mathematical object (a matrix) in which nodes are the identified neurons or neuron clusters, and links are the reconstructed connections between them. The algorithm is also able to extract other relevant morphological information characterizing neurons and neurites. More importantly, and at variance with other segmentation methods that require fluorescence imaging and immunocytochemistry techniques, our measures are non-invasive and allow us to carry out a fully longitudinal analysis during the maturation of a single culture. In turn, a systematic statistical analysis of a group of topological observables makes it possible to quantify and track the progression of the main network characteristics during the self-organization process of the culture. Our results point to the existence of a particular state corresponding to a small-world network configuration, in which several relevant micro- and meso-scale graph properties emerge.
Finally, we identify the main physical processes taking place during the cultures' morphological transformations, and embed them into a simplified growth model that quantitatively reproduces the overall set of experimental observations.
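As an illustration of how the small-world state mentioned above could be quantified from a reconstructed connectivity matrix, the sketch below compares clustering and path length with an Erdős-Rényi reference of the same size, using networkx. The adjacency matrix A and the choice of reference model are assumptions for illustration, not the thesis's actual pipeline.

```python
# Minimal sketch (not the thesis's analysis code): small-world index of a
# reconstructed culture network; A is a hypothetical adjacency matrix.
import networkx as nx
import numpy as np

def small_world_index(A, seed=0):
    G = nx.from_numpy_array(np.asarray(A))
    G = G.subgraph(max(nx.connected_components(G), key=len)).copy()
    C, L = nx.average_clustering(G), nx.average_shortest_path_length(G)

    # random reference with the same number of nodes and edges
    R = nx.gnm_random_graph(G.number_of_nodes(), G.number_of_edges(), seed=seed)
    R = R.subgraph(max(nx.connected_components(R), key=len)).copy()
    C_r, L_r = nx.average_clustering(R), nx.average_shortest_path_length(R)

    return (C / C_r) / (L / L_r)    # values well above 1 suggest small-world
```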
Abstract:
The generative topographic mapping (GTM) model was introduced by Bishop et al. (1998, Neural Comput. 10(1), 215-234) as a probabilistic reformulation of the self-organizing map (SOM). It offers a number of advantages compared with the standard SOM, and has already been used in a variety of applications. In this paper we report on several extensions of the GTM, including an incremental version of the EM algorithm for estimating the model parameters, the use of local subspace models, extensions to mixed discrete and continuous data, semi-linear models which permit the use of high-dimensional manifolds whilst avoiding computational intractability, Bayesian inference applied to hyper-parameters, and an alternative framework for the GTM based on Gaussian processes. All of these developments directly exploit the probabilistic structure of the GTM, thereby allowing the underlying modelling assumptions to be made explicit. They also highlight the advantages of adopting a consistent probabilistic framework for the formulation of pattern recognition algorithms.
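For reference, one EM iteration of the standard GTM (before the extensions listed above) can be written compactly as below. The shapes, the regulariser and the variable names are illustrative assumptions, and the incremental EM variant and the other extensions reported in the paper are not shown.

```python
# Compact sketch of one batch EM step for a standard GTM-like model (not the
# paper's extensions). X: (N, D) data; Phi: (K, M) RBF design matrix over the
# K latent grid points; W: (M, D) weights; beta: inverse noise variance.
import numpy as np

def gtm_em_step(X, Phi, W, beta, alpha=1e-3):
    N, D = X.shape
    Y = Phi @ W                                          # (K, D) mixture centres

    # E-step: responsibilities R[k, n] = p(latent point k | x_n)
    sq = ((X[None, :, :] - Y[:, None, :]) ** 2).sum(-1)  # (K, N) squared dists
    logp = -0.5 * beta * sq
    logp -= logp.max(axis=0, keepdims=True)              # numerical stability
    R = np.exp(logp)
    R /= R.sum(axis=0, keepdims=True)

    # M-step: regularised weighted least squares for W, closed form for beta
    G = np.diag(R.sum(axis=1))
    A = Phi.T @ G @ Phi + (alpha / beta) * np.eye(Phi.shape[1])
    W_new = np.linalg.solve(A, Phi.T @ R @ X)
    sq_new = ((X[None, :, :] - (Phi @ W_new)[:, None, :]) ** 2).sum(-1)
    beta_new = N * D / (R * sq_new).sum()
    return W_new, beta_new, R
```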
Abstract:
This thesis describes the Generative Topographic Mapping (GTM) --- a non-linear latent variable model, intended for modelling continuous, intrinsically low-dimensional probability distributions, embedded in high-dimensional spaces. It can be seen as a non-linear form of principal component analysis or factor analysis. It also provides a principled alternative to the self-organizing map --- a widely established neural network model for unsupervised learning --- resolving many of its associated theoretical problems. An important, potential application of the GTM is visualization of high-dimensional data. Since the GTM is non-linear, the relationship between data and its visual representation may be far from trivial, but a better understanding of this relationship can be gained by computing the so-called magnification factor. In essence, the magnification factor relates the distances between data points, as they appear when visualized, to the actual distances between those data points. There are two principal limitations of the basic GTM model. The computational effort required will grow exponentially with the intrinsic dimensionality of the density model. However, if the intended application is visualization, this will typically not be a problem. The other limitation is the inherent structure of the GTM, which makes it most suitable for modelling moderately curved probability distributions of approximately rectangular shape. When the target distribution is very different to that, the aim of maintaining an 'interpretable' structure, suitable for visualizing data, may come in conflict with the aim of providing a good density model. The fact that the GTM is a probabilistic model means that results from probability theory and statistics can be used to address problems such as model complexity. Furthermore, this framework provides solid ground for extending the GTM to wider contexts than that of this thesis.
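The magnification factor mentioned above can be illustrated, for any smooth latent-to-data mapping y(z), as sqrt(det(JᵀJ)) with J the Jacobian of the mapping. The sketch below estimates it by finite differences for a user-supplied mapping, which is a generic stand-in rather than the closed-form GTM expression derived in the thesis.

```python
# Sketch (not the thesis's closed-form derivation): local magnification factor
# of a smooth map z -> y(z) from an L-dimensional latent space to a
# D-dimensional data space, estimated via a finite-difference Jacobian.
import numpy as np

def magnification_factor(mapping, z, eps=1e-5):
    z = np.asarray(z, dtype=float)
    cols = []
    for l in range(z.size):
        dz = np.zeros_like(z)
        dz[l] = eps
        cols.append((mapping(z + dz) - mapping(z - dz)) / (2 * eps))
    J = np.stack(cols, axis=1)                 # (D, L) Jacobian estimate
    return float(np.sqrt(np.linalg.det(J.T @ J)))

# Example: a linear map from 2-D latent space into 3-D data space.
A = np.array([[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]])
print(magnification_factor(lambda z: A @ z, np.array([0.3, 0.7])))  # ~7.0
```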
Abstract:
Recently there has been an outburst of interest in extending topographic maps of vectorial data to more general data structures, such as sequences or trees. However, there is no general consensus as to how best to process sequences using topographic maps, and this topic remains an active focus of neurocomputational research. The representational capabilities and internal representations of the models are not well understood. Here, we rigorously analyze a generalization of the self-organizing map (SOM) for processing sequential data, the recursive SOM (RecSOM) (Voegtlin, 2002), as a nonautonomous dynamical system consisting of a set of fixed input maps. We argue that contractive fixed-input maps are likely to produce Markovian organizations of receptive fields on the RecSOM map. We derive bounds on the parameter β (weighting the importance of importing past information when processing sequences) under which contractiveness of the fixed-input maps is guaranteed. Some generalizations of SOM contain a dynamic module responsible for processing temporal contexts as an integral part of the model. We show that Markovian topographic maps of sequential data can be produced using a simple fixed (nonadaptable) dynamic module externally feeding a standard topographic model designed to process static vectorial data of fixed dimensionality (e.g., SOM). However, by allowing trainable feedback connections, one can obtain Markovian maps with superior memory depth and topography preservation. We elaborate on the importance of non-Markovian organizations in topographic maps of sequential data. © 2006 Massachusetts Institute of Technology.
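To make the role of β concrete, the sketch below shows one RecSOM-style presentation step in the spirit of Voegtlin (2002): each unit combines its distance to the current input (weighted by α) with its distance to the previous activation map (weighted by β), and both weight sets move towards their respective targets under a neighbourhood function. It is a simplified illustration with assumed parameter values, not the code analysed in the paper.

```python
# Schematic RecSOM update (simplified; illustrative only).
import numpy as np

def recsom_step(x, y_prev, w, c, alpha=2.0, beta=0.5, lr=0.1, sigma=1.0):
    """x: (d,) input; y_prev: (K,) previous activations;
    w: (K, d) input weights; c: (K, K) context weights (updated in place)."""
    d = alpha * ((w - x) ** 2).sum(axis=1) + beta * ((c - y_prev) ** 2).sum(axis=1)
    y = np.exp(-d)                              # unit activations
    winner = int(np.argmax(y))

    # Gaussian neighbourhood around the winner (1-D map for simplicity)
    idx = np.arange(w.shape[0])
    h = np.exp(-((idx - winner) ** 2) / (2.0 * sigma ** 2))

    w += lr * h[:, None] * (x - w)              # pull input weights towards x
    c += lr * h[:, None] * (y_prev - c)         # pull context weights towards y(t-1)
    return y
```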
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08