42 resultados para Data clustering. Fuzzy C-Means. Cluster centers initialization. Validation indices
em University of Queensland eSpace - Australia
Resumo:
In this paper a methodology for integrated multivariate monitoring and control of biological wastewater treatment plants during extreme events is presented. To monitor the process, on-line dynamic principal component analysis (PCA) is performed on the process data to extract the principal components that represent the underlying mechanisms of the process. Fuzzy c-means (FCM) clustering is used to classify the operational state. Performing clustering on scores from PCA solves computational problems as well as increases robustness due to noise attenuation. The class-membership information from FCM is used to derive adequate control set points for the local control loops. The methodology is illustrated by a simulation study of a biological wastewater treatment plant, on which disturbances of various types are imposed. The results show that the methodology can be used to determine and co-ordinate control actions in order to shift the control objective and improve the effluent quality.
Resumo:
The integration of geo-information from multiple sources and of diverse nature in developing mineral favourability indexes (MFIs) is a well-known problem in mineral exploration and mineral resource assessment. Fuzzy set theory provides a convenient framework to combine and analyse qualitative and quantitative data independently of their source or characteristics. A novel, data-driven formulation for calculating MFIs based on fuzzy analysis is developed in this paper. Different geo-variables are considered fuzzy sets and their appropriate membership functions are defined and modelled. A new weighted average-type aggregation operator is then introduced to generate a new fuzzy set representing mineral favourability. The membership grades of the new fuzzy set are considered as the MFI. The weights for the aggregation operation combine the individual membership functions of the geo-variables, and are derived using information from training areas and L, regression. The technique is demonstrated in a case study of skarn tin deposits and is used to integrate geological, geochemical and magnetic data. The study area covers a total of 22.5 km(2) and is divided into 349 cells, which include nine control cells. Nine geo-variables are considered in this study. Depending on the nature of the various geo-variables, four different types of membership functions are used to model the fuzzy membership of the geo-variables involved. (C) 2002 Elsevier Science Ltd. All rights reserved.
Resumo:
The texture segmentation techniques are diversified by the existence of several approaches. In this paper, we propose fuzzy features for the segmentation of texture image. For this purpose, a membership function is constructed to represent the effect of the neighboring pixels on the current pixel in a window. Using these membership function values, we find a feature by weighted average method for the current pixel. This is repeated for all pixels in the window treating each time one pixel as the current pixel. Using these fuzzy based features, we derive three descriptors such as maximum, entropy, and energy for each window. To segment the texture image, the modified mountain clustering that is unsupervised and fuzzy c-means clustering have been used. The performance of the proposed features is compared with that of fractal features.
Resumo:
In this paper an approach to extreme event control in wastewater treatment plant operation by use of automatic supervisory control is discussed. The framework presented is based on the fact that different operational conditions manifest themselves as clusters in a multivariate measurement space. These clusters are identified and linked to specific and corresponding events by use of principal component analysis and fuzzy c-means clustering. A reduced system model is assigned to each type of extreme event and used to calculate appropriate local controller set points. In earlier work we have shown that this approach is applicable to wastewater treatment control using look-up tables to determine current set points. In this work we focus on the automatic determination of appropriate set points by use of steady state and dynamic predictions. The performance of a relatively simple steady-state supervisory controller is compared with that of a model predictive supervisory controller. Also, a look-up table approach is included in the comparison, as it provides a simple and robust alternative to the steady-state and model predictive controllers, The methodology is illustrated in a simulation study.
Resumo:
It is generally accepted that two major gene pools exist in cultivated common bean (Phaseolus vulgaris L.), a Middle American and an Andean one. Some evidence, based on unique phaseolin morphotypes and AFLP analysis, suggests that at least one more gene pool exists in cultivated common bean. To investigate this hypothesis, 1072 accessions from a common bean core collection from the primary centres of origin, held at CIAT, were investigated. Various agronomic and morphological attributes (14 categorical and 11 quantitative) were measured. Multivariate analyses, consisting of homogeneity analysis and clustering for categorical data, clustering and ordination techniques for quantitative data and nonlinear principal component analysis for mixed data, were undertaken. The results of most analyses supported the existence of the two major gene pools. However, the analysis of categorical data of protein types showed an additional minor gene pool. The minor gene pool is designated North Andean and includes phaseolin types CH, S and T; lectin types 312, Pr, B and K; and mostly A5, A6 and A4 types alpha-amylase inhibitor. Analysis of the combined categorical data of protein types and some plant categorical data also suggested that some other germplasm with C type phaseolin are distinguished from the major gene pools.
Resumo:
beta-turns are important topological motifs for biological recognition of proteins and peptides. Organic molecules that sample the side chain positions of beta-turns have shown broad binding capacity to multiple different receptors, for example benzodiazepines. beta-turns have traditionally been classified into various types based on the backbone dihedral angles (phi 2, psi 2, phi 3 and psi 3). Indeed, 57-68% of beta-turns are currently classified into 8 different backbone families (Type I, Type II, Type I', Type II', Type VIII, Type VIa1, Type VIa2 and Type VIb and Type IV which represents unclassified beta-turns). Although this classification of beta-turns has been useful, the resulting beta-turn types are not ideal for the design of beta-turn mimetics as they do not reflect topological features of the recognition elements, the side chains. To overcome this, we have extracted beta-turns from a data set of non-homologous and high-resolution protein crystal structures. The side chain positions, as defined by C-alpha-C-beta vectors, of these turns have been clustered using the kth nearest neighbor clustering and filtered nearest centroid sorting algorithms. Nine clusters were obtained that cluster 90% of the data, and the average intra-cluster RMSD of the four C-alpha-C-beta vectors is 0.36. The nine clusters therefore represent the topology of the side chain scaffold architecture of the vast majority of beta-turns. The mean structures of the nine clusters are useful for the development of beta-turn mimetics and as biological descriptors for focusing combinatorial chemistry towards biologically relevant topological space.
Resumo:
A progressive spatial query retrieves spatial data based on previous queries (e.g., to fetch data in a more restricted area with higher resolution). A direct query, on the other side, is defined as an isolated window query. A multi-resolution spatial database system should support both progressive queries and traditional direct queries. It is conceptually challenging to support both types of query at the same time, as direct queries favour location-based data clustering, whereas progressive queries require fragmented data clustered by resolutions. Two new scaleless data structures are proposed in this paper. Experimental results using both synthetic and real world datasets demonstrate that the query processing time based on the new multiresolution approaches is comparable and often better than multi-representation data structures for both types of queries.
Resumo:
Recent efforts in the characterization of air-water flows properties have included some clustering process analysis. A cluster of bubbles is defined as a group of two or more bubbles, with a distinct separation from other bubbles before and after the cluster. The present paper compares the results of clustering processes two hydraulic structures. That is, a large-size dropshaft and a hydraulic jump in a rectangular horizontal channel. The comparison highlighted some significant differences in clustering production and structures. Both dropshaft and hydraulic jump flows are complex turbulent shear flows, and some clustering index may provide some measure of the bubble-turbulence interactions and associated energy dissipation.
Resumo:
The c-myb gene is the cellular homologue of the v-myb oncogenes carried by the avian leukaemia viruses AMV and E26. It encodes a transcription factor (c-Myb), as does each of the viral oncogenes, which recognises the core DNA sequence C/T-A-A-C-G/T-G via a repeated helix-turn-helix-like motif. c-myb is expressed in immature haemopoietic cells, as well as immature cells of the gastro-intestinal epithelium and is down-regulated with differentiation. Enforced expression of activated or even normal forms of Myb can transform haemopoietic cells, most often of the myeloid lineage, in vitro and in vivo. Although many genes have been identified which are likely to be regulated by c-Myb, the critical target genes involved in Myb's transforming activity are not known. Together with data showing increased c-myb expression in colonic tumours, these observations raise the possibility that c-myb may play a role in human malignant disease. (C) 1998 Elsevier Science Ltd. All rights reserved.
Resumo:
An unusual saltwater population of the "freshwater" crocodilian, Crocodylus johnstoni, was studied in the estuary of the Limmen Bight River in Australia's Northern Territory and compared with populations in permanently freshwater habitats. Crocodiles in the river were found across a large salinity gradient, from fresh water to a salinity of 24 mg.ml-1, more than twice the body fluid concentration. Plasma osmolarity, concentrations of plasma Na+, Cl-, and K+, and exchangeable Na+ pools were all remarkably constant across the salinity spectrum and were not substantially higher or more variable than those in crocodiles from permanently freshwater habitats. Body fluid volumes did not vary; condition factor and hydration status of crocodiles were not correlated with salinity and were not different from those of crocodiles from permanently fresh water. C. johnstoni clearly has considerable powers of osmoregulation in waters of low to medium salinity. Whether this osmoregulatory competence, extends to continuously hyperosmotic environments is not known, but distributional data suggest that C. johnstoni in hyperosmotic conditions may require periodic access to hypoosmotic water. The study demonstrates a physiological capacity for colonisation of at least some estuarine waters by this normally stenohaline freshwater crocodilian.
Resumo:
Cylindrospermopsis raciborskii is a toxic-bloom-forming cyanobacterium that is commonly found in tropical to subtropical climatic regions worldwide, but it is also recognized as a common component of cyanobacterial communities in temperate climates. Genetic profiles of C. raciborskii were examined in 19 cultured isolates originating from geographically diverse regions of Australia and represented by two distinct morphotypes. A 609-bp region of rpoC1, a DNA-dependent RNA polymerase gene, was amplified by PCR from these isolates with cyanobacterium-specific primers. Sequence analysis revealed that all isolates belonged to the same species, including morphotypes with straight or coiled trichomes. Additional rpoC1 gene sequences obtained for a range of cyanobacteria highlighted clustering of C. raciborskii with other heterocyst-producing cyanobacteria (orders Nostocales and Stigonematales). In contrast, randomly amplified polymorphic DNA and short tandemly repeated repetitive sequence profiles revealed a greater level of genetic heterogeneity among C. raciborskii isolates than did rpoC1 gene analysis, and unique band profiles were also found among each of the cyanobacterial genera examined. A PCR test targeting a region of the rpoC1 gene unique to C. raciborskii was developed for the specific identification of C. raciborskii from both purified genomic DNA and environmental samples. The PCR was evaluated with a number of cyanobacterial isolates, but a PCR-positive result was only achieved with C, raciborskii. This method provides an accurate alternative to traditional morphological identification of C. raciborskii.
Resumo:
Computational models complement laboratory experimentation for efficient identification of MHC-binding peptides and T-cell epitopes. Methods for prediction of MHC-binding peptides include binding motifs, quantitative matrices, artificial neural networks, hidden Markov models, and molecular modelling. Models derived by these methods have been successfully used for prediction of T-cell epitopes in cancer, autoimmunity, infectious disease, and allergy. For maximum benefit, the use of computer models must be treated as experiments analogous to standard laboratory procedures and performed according to strict standards. This requires careful selection of data for model building, and adequate testing and validation. A range of web-based databases and MHC-binding prediction programs are available. Although some available prediction programs for particular MHC alleles have reasonable accuracy, there is no guarantee that all models produce good quality predictions. In this article, we present and discuss a framework for modelling, testing, and applications of computational methods used in predictions of T-cell epitopes. (C) 2004 Elsevier Inc. All rights reserved.
Resumo:
Bull sharks (Carcharhinus leucas) were captured across a salinity gradient from freshwater (FW) to seawater (SW). Across all salinities, C leucas were hyperosmotic to the environment. Plasma osmolarity in FW-captured animals (642 +/- 7 mosM) was significantly reduced compared to SW-captured animals (1067 +/- 21 mosM). In FW animals, sodium, chloride and urea were 208 +/- 3, 203 +/- 3 and 192 +/- 2 mmol l(-1), respectively. Plasma sodium, chloride and urea in SW-captured C leucas were 289 +/- 3, 296 +/- 6 and 370 +/- 10 mmol l(-1), respectively. The increase in plasma osmolarity between FW and SW was not linear. Between FW (3 mosM) and 24%o SW (676 mosM), plasma osmolarity increased by 22% or 0.92% per 1parts per thousand rise in salinity. Between 24%o and 33parts per thousand, plasma osmolarity increased by 33% or 4.7% per 1 parts per thousand rise in salinity, largely due to a sharp increase in plasma urea between 28parts per thousand and 33parts per thousand. C. leucas moving between FW and SW appear to be faced with three major osmoregulatory challenges, these occur between 0-10parts per thousand, 11-20parts per thousand and 21-33parts per thousand. A comparison between C leucas captured in FW and estuarine environments (20-28%o) in the Brisbane River revealed no difference in the mass of rectal glands between these animals. However, a comparison of rectal gland mass between FW animals captured in the Brisbane River and Rio San Juan/Lake Nicaragua showed that animals in the latter system had a significantly smaller rectal gland mass at a given length than animals in the Brisbane River. The physiological challenges and mechanisms required for C leucas moving between FW and SW, as well as the ecological implications of these data are discussed. (C) 2004 Elsevier Inc. All rights reserved.
Resumo:
Background: Pain is defined as both a sensory and an emotional experience. Acute postoperative tooth extraction pain is assessed and treated as a physiological (sensory) pain while chronic pain is a biopsychosocial problem. The purpose of this study was to assess whether psychological and social changes Occur in the acute pain state. Methods: A biopsychosocial pain questionnaire was completed by 438 subjects (165 males, 273 females) with acute postoperative pain at 24 hours following the surgical extraction of teeth and compared with 273 subjects (78 males, 195 females) with chronic orofacial pain. Statistical methods used a k-means cluster analysis. Results: Three clusters were identified in the acute pain group: 'unaffected', 'disabled' and 'depressed, anxious and disabled'. Psychosocial effects showed 24.8 per cent feeling 'distress/suffering' and 15.1 per cent 'sad and depressed'. Females reported higher pain intensity and more distress, depression and inadequate medication for pain relief (p
Resumo:
In this paper, we present ICICLE (Image ChainNet and Incremental Clustering Engine), a prototype system that we have developed to efficiently and effectively retrieve WWW images based on image semantics. ICICLE has two distinguishing features. First, it employs a novel image representation model called Weight ChainNet to capture the semantics of the image content. A new formula, called list space model, for computing semantic similarities is also introduced. Second, to speed up retrieval, ICICLE employs an incremental clustering mechanism, ICC (Incremental Clustering on ChainNet), to cluster images with similar semantics into the same partition. Each cluster has a summary representative and all clusters' representatives are further summarized into a balanced and full binary tree structure. We conducted an extensive performance study to evaluate ICICLE. Compared with some recently proposed methods, our results show that ICICLE provides better recall and precision. Our clustering technique ICC facilitates speedy retrieval of images without sacrificing recall and precision significantly.