875 results for document clustering


Relevance:

20.00% 20.00%

Publisher:

Abstract:

This paper deals with Takagi-Sugeno (TS) fuzzy model identification of nonlinear systems using fuzzy clustering. In particular, an extended fuzzy Gustafson-Kessel (EGK) clustering algorithm, using robust competitive agglomeration (RCA), is developed for automatically constructing a TS fuzzy model from system input-output data. The EGK algorithm can automatically determine the 'optimal' number of clusters from the training data set. It is shown that the EGK approach is relatively insensitive to initialization and is less susceptible to local minima, a benefit derived from its agglomerate property. This issue is often overlooked in the current literature on nonlinear identification using conventional fuzzy clustering. Furthermore, the robust statistical concepts underlying the EGK algorithm help to alleviate the difficulty of cluster identification in the construction of a TS fuzzy model from noisy training data. A new hybrid identification strategy is then formulated, which combines the EGK algorithm with a locally weighted, least-squares method for the estimation of local sub-model parameters. The efficacy of this new approach is demonstrated through function approximation examples and also by application to the identification of an automatic voltage regulation (AVR) loop for a simulated 3 kVA laboratory micro-machine system.
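
The local sub-model estimation step mentioned in this abstract can be illustrated with a minimal sketch. It is not the EGK algorithm or the authors' hybrid strategy: the cluster memberships below are plain Gaussian functions around hand-picked centers (an assumption standing in for clustering output), the input is one-dimensional, and the names `memberships`, `fit_local_models`, and `predict` are hypothetical. Each local linear model is fitted by weighted least squares, with the memberships as weights.

```python
import math

def memberships(x, centers, width):
    """Normalized Gaussian memberships of x in each cluster (assumed stand-in
    for memberships that a fuzzy clustering step would provide)."""
    raw = [math.exp(-((x - c) / width) ** 2) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]

def fit_local_models(xs, ys, centers, width):
    """Fit one linear sub-model y = a*x + b per cluster by weighted least squares."""
    models = []
    for i in range(len(centers)):
        w = [memberships(x, centers, width)[i] for x in xs]
        sw = sum(w)
        sx = sum(wk * x for wk, x in zip(w, xs))
        sy = sum(wk * y for wk, y in zip(w, ys))
        sxx = sum(wk * x * x for wk, x in zip(w, xs))
        sxy = sum(wk * x * y for wk, x, y in zip(w, xs, ys))
        # Closed-form solution of the weighted normal equations.
        a = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
        b = (sy - a * sx) / sw
        models.append((a, b))
    return models

def predict(x, centers, width, models):
    """TS-style output: membership-weighted blend of the local linear models."""
    mu = memberships(x, centers, width)
    return sum(m * (a * x + b) for m, (a, b) in zip(mu, models))

# Toy function approximation: y = x^2 on [-2, 2] with three local models.
centers, width = [-1.5, 0.0, 1.5], 1.0
xs = [i / 10 for i in range(-20, 21)]
ys = [x * x for x in xs]
models = fit_local_models(xs, ys, centers, width)
approx = predict(1.0, centers, width, models)
```

Weighting each fit by membership keeps every sub-model focused on the region its cluster covers, which is the general idea behind locally weighted estimation.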

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Support vector machine (SVM) is a powerful technique for data classification. Despite its sound theoretical foundations and high classification accuracy, a standard SVM is not suitable for classifying large data sets, because its training complexity depends heavily on the size of the data set. This paper presents a novel SVM classification approach for large data sets that uses minimum enclosing ball clustering. After the training data are partitioned by the proposed clustering method, the cluster centers are used for a first-stage SVM classification. The clusters whose centers are support vectors, and the clusters that contain different classes, are then used for a second-stage SVM classification. At this stage most of the data have been removed. Experimental results show that the proposed approach achieves good classification accuracy compared with the classic SVM, while training is significantly faster than with several other SVM classifiers.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

In this paper we study the classification of spatiotemporal patterns of one-dimensional cellular automata (CA), where the classification covers CA rules together with their initial conditions. We propose an exploratory analysis method based on the normalized compression distance (NCD) of spatiotemporal patterns, which is used as the dissimilarity measure for hierarchical clustering. Our approach differs in the following respects. First, the classification of spatiotemporal patterns is comparative, because the NCD explicitly evaluates the difference in compressibility between two objects, e.g., strings corresponding to spatiotemporal patterns. This is in contrast to all other measures applied so far in a similar context, because they are essentially univariate. Second, Kolmogorov complexity, which underlies the NCD, was used in the classification of CA with respect to their spatiotemporal pattern. Third, our method is semiautomatic, allowing us to investigate hundreds or thousands of CA rules or initial conditions simultaneously to gain insights into their organizational structure. Our numerical results are not only plausible, confirming previous classification attempts, but also shed light on the intricate influence of random initial conditions on the classification results.
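
The NCD used in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes `zlib` as the compressor, generates elementary CA patterns from a single seeded cell with periodic boundaries, and uses the standard definition NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size. The helper names and the chosen rules are hypothetical.

```python
import zlib

def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, used as a stand-in for C(.)."""
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

def ca_pattern(rule: int, width: int = 64, steps: int = 64) -> bytes:
    """Spatiotemporal pattern of an elementary CA as newline-joined rows of 0/1."""
    row = [0] * width
    row[width // 2] = 1  # single seeded cell (an assumed initial condition)
    rows = []
    for _ in range(steps):
        rows.append("".join(map(str, row)))
        # Wolfram rule lookup: neighborhood (left, center, right) indexes a bit of `rule`.
        row = [
            (rule >> (row[(i - 1) % width] * 4 + row[i] * 2 + row[(i + 1) % width])) & 1
            for i in range(width)
        ]
    return "\n".join(rows).encode()

# Pairwise dissimilarities between the patterns of a few rules.
rules = [30, 90, 110, 250]
patterns = {r: ca_pattern(r) for r in rules}
dist = {(a, b): ncd(patterns[a], patterns[b]) for a in rules for b in rules}
```

The resulting `dist` table is the kind of dissimilarity matrix a hierarchical clustering routine would take as input. Because a real compressor only approximates Kolmogorov complexity, the values are not exactly symmetric and the distance of a pattern to itself is small but not exactly zero.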

Relevance:

20.00% 20.00%

Publisher:

Abstract:

RNA polymerase I (Pol I) produces large ribosomal RNAs (rRNAs). In this study, we show that the Rpa49 and Rpa34 Pol I subunits, which do not have counterparts in Pol II and Pol III complexes, are functionally conserved, using heterospecific complementation of the human and Schizosaccharomyces pombe orthologues in Saccharomyces cerevisiae. Deletion of RPA49 leads to the disappearance of nucleolar structure, but nucleolar assembly can be restored by decreasing ribosomal gene copy number from 190 to 25. Statistical analysis of Miller spreads in the absence of Rpa49 demonstrates a fourfold decrease in Pol I loading rate per gene and decreased contact between adjacent Pol I complexes. Therefore, the Rpa34 and Rpa49 Pol I–specific subunits are essential for nucleolar assembly and for the high polymerase loading rate associated with frequent contact between adjacent enzymes. Together, our data suggest that localized rRNA production results in spatially constrained rRNA production, which is instrumental for nucleolar assembly.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Laying hens generally choose to aggregate, but the extent to which the environments in which we house them impact on social group dynamics is not known. In this paper the effect of pen environment on spatial clustering is considered. Twelve groups of four laying hens were studied under three environmental conditions: wire floor (W), shavings (Sh) and perches, peat, nestbox and shavings (PPN). Groups experienced each environment twice, for five weeks each time, in a systematic order that varied from group to group. Video recordings were made one day per week for 30 weeks. To determine level of clustering, we recorded positional data from a randomly selected 20-min excerpt per video (a total of 20 min x 360 videos analysed). On screen, pens were divided into six equal areas. In addition, PPN pens were divided into an additional four (sub) areas, to account for the use of perches (one area per half perch). Every 5 s, we recorded the location of each bird and calculated location use over time, feeding synchrony and cluster scores for each environment. Feeding synchrony and cluster scores were compared against unweighted and weighted (according to observed proportional location use) Poisson distributions to distinguish between resource and social attraction.

Relevance:

20.00% 20.00%

Publisher:

Abstract:

Underpinning current models of the mechanisms of the action of radiation is a central role for DNA damage and in particular double-strand breaks (DSBs). For radiations of different LET, there is a need to know the exact yields and distributions of DSBs in human cells. Most measurements of DSB yields within cells now rely on pulsed-field gel electrophoresis as the technique of choice. Previous measurements of DSB yields have suggested that the yields are remarkably similar for different types of radiation, with RBE values less than or equal to 1.0. More recent studies in mammalian cells, however, have suggested that both the yield and the spatial distribution of DSBs are influenced by radiation quality. RBE values for DSBs induced by high-LET radiations are greater than 1.0, and the distributions are nonrandom. Underlying this is the interaction of particle tracks with the higher-order chromosomal structures within cell nuclei. Further studies are needed to relate nonrandom distributions of DSBs to their rejoining kinetics. At the molecular level, we need to determine the involvement of clustering of damaged bases with strand breakage, and the relationship between higher-order clustering over sizes of kilobase pairs and above and localized clustering at the DNA level. Overall, these studies will allow us to elucidate whether the nonrandom distributions of breaks produced by high-LET particle tracks have any consequences for their repair and biological effectiveness. © 2001 by Radiation Research Society.