23 resultados para Clustering analysis
Resumo:
This paper clarifies the role of alternative optimal solutions in the clustering of multidimensional observations using data envelopment analysis (DEA). The paper shows that alternative optimal solutions corresponding to several units produce different groups with different sizes and different decision making units (DMUs) at each class. This implies that a specific DMU may be grouped into different clusters when the corresponding DEA model has multiple optimal solutions. © 2011 Elsevier B.V. All rights reserved.
Resumo:
Ant colony optimisation algorithms model the way ants use pheromones for marking paths to important locations in their environment. Pheromone traces are picked up, followed, and reinforced by other ants but also evaporate over time. Optimal paths attract more pheromone and less useful paths fade away. The main innovation of the proposed Multiple Pheromone Ant Clustering Algorithm (MPACA) is to mark objects using many pheromones, one for each value of each attribute describing the objects in multidimensional space. Every object has one or more ants assigned to each attribute value and the ants then try to find other objects with matching values, depositing pheromone traces that link them. Encounters between ants are used to determine when ants should combine their features to look for conjunctions and whether they should belong to the same colony. This paper explains the algorithm and explores its potential effectiveness for cluster analysis. © 2014 Springer International Publishing Switzerland.
Resumo:
Biological experiments often produce enormous amount of data, which are usually analyzed by data clustering. Cluster analysis refers to statistical methods that are used to assign data with similar properties into several smaller, more meaningful groups. Two commonly used clustering techniques are introduced in the following section: principal component analysis (PCA) and hierarchical clustering. PCA calculates the variance between variables and groups them into a few uncorrelated groups or principal components (PCs) that are orthogonal to each other. Hierarchical clustering is carried out by separating data into many clusters and merging similar clusters together. Here, we use an example of human leukocyte antigen (HLA) supertype classification to demonstrate the usage of the two methods. Two programs, Generating Optimal Linear Partial Least Square Estimations (GOLPE) and Sybyl, are used for PCA and hierarchical clustering, respectively. However, the reader should bear in mind that the methods have been incorporated into other software as well, such as SIMCA, statistiXL, and R.
Resumo:
Emerging vehicular comfort applications pose a host of completely new set of requirements such as maintaining end-to-end connectivity, packet routing, and reliable communication for internet access while on the move. One of the biggest challenges is to provide good quality of service (QoS) such as low packet delay while coping with the fast topological changes. In this paper, we propose a clustering algorithm based on minimal path loss ratio (MPLR) which should help in spectrum efficiency and reduce data congestion in the network. The vehicular nodes which experience minimal path loss are selected as the cluster heads. The performance of the MPLR clustering algorithm is calculated by rate of change of cluster heads, average number of clusters and average cluster size. Vehicular traffic models derived from the Traffic Wales data are fed as input to the motorway simulator. A mathematical analysis for the rate of change of cluster head is derived which validates the MPLR algorithm and is compared with the simulated results. The mathematical and simulated results are in good agreement indicating the stability of the algorithm and the accuracy of the simulator. The MPLR system is also compared with V2R system with MPLR system performing better. © 2013 IEEE.
Resumo:
Descriptions of vegetation communities are often based on vague semantic terms describing species presence and dominance. For this reason, some researchers advocate the use of fuzzy sets in the statistical classification of plant species data into communities. In this study, spatially referenced vegetation abundance values collected from Greek phrygana were analysed by ordination (DECORANA), and classified on the resulting axes using fuzzy c-means to yield a point data-set representing local memberships in characteristic plant communities. The fuzzy clusters matched vegetation communities noted in the field, which tended to grade into one another, rather than occupying discrete patches. The fuzzy set representation of the community exploited the strengths of detrended correspondence analysis while retaining richer information than a TWINSPAN classification of the same data. Thus, in the absence of phytosociological benchmarks, meaningful and manageable habitat information could be derived from complex, multivariate species data. We also analysed the influence of the reliability of different surveyors' field observations by multiple sampling at a selected sample location. We show that the impact of surveyor error was more severe in the Boolean than the fuzzy classification. © 2007 Springer.
Resumo:
PURPOSE: Two common approaches to identify subgroups of patients with bipolar disorder are clustering methodology (mixture analysis) based on the age of onset, and a birth cohort analysis. This study investigates if a birth cohort effect will influence the results of clustering on the age of onset, using a large, international database. METHODS: The database includes 4037 patients with a diagnosis of bipolar I disorder, previously collected at 36 collection sites in 23 countries. Generalized estimating equations (GEE) were used to adjust the data for country median age, and in some models, birth cohort. Model-based clustering (mixture analysis) was then performed on the age of onset data using the residuals. Clinical variables in subgroups were compared. RESULTS: There was a strong birth cohort effect. Without adjusting for the birth cohort, three subgroups were found by clustering. After adjusting for the birth cohort or when considering only those born after 1959, two subgroups were found. With results of either two or three subgroups, the youngest subgroup was more likely to have a family history of mood disorders and a first episode with depressed polarity. However, without adjusting for birth cohort (three subgroups), family history and polarity of the first episode could not be distinguished between the middle and oldest subgroups. CONCLUSION: These results using international data confirm prior findings using single country data, that there are subgroups of bipolar I disorder based on the age of onset, and that there is a birth cohort effect. Including the birth cohort adjustment altered the number and characteristics of subgroups detected when clustering by age of onset. Further investigation is needed to determine if combining both approaches will identify subgroups that are more useful for research.
Resumo:
Objective In this study, we have used a chemometrics-based method to correlate key liposomal adjuvant attributes with in-vivo immune responses based on multivariate analysis. Methods The liposomal adjuvant composed of the cationic lipid dimethyldioctadecylammonium bromide (DDA) and trehalose 6,6-dibehenate (TDB) was modified with 1,2-distearoyl-sn-glycero-3-phosphocholine at a range of mol% ratios, and the main liposomal characteristics (liposome size and zeta potential) was measured along with their immunological performance as an adjuvant for the novel, postexposure fusion tuberculosis vaccine, Ag85B-ESAT-6-Rv2660c (H56 vaccine). Partial least square regression analysis was applied to correlate and cluster liposomal adjuvants particle characteristics with in-vivo derived immunological performances (IgG, IgG1, IgG2b, spleen proliferation, IL-2, IL-5, IL-6, IL-10, IFN-γ). Key findings While a range of factors varied in the formulations, decreasing the 1,2-distearoyl-sn-glycero-3-phosphocholine content (and subsequent zeta potential) together built the strongest variables in the model. Enhanced DDA and TDB content (and subsequent zeta potential) stimulated a response skewed towards a cell mediated immunity, with the model identifying correlations with IFN-γ, IL-2 and IL-6. Conclusion This study demonstrates the application of chemometrics-based correlations and clustering, which can inform liposomal adjuvant design.
Resumo:
To determine the factors influencing the distribution of β-amyloid (Aβ) deposits in Alzheimer's disease (AD), the spatial patterns of the diffuse, primitive, and classic Aβ deposits were studied from the superior temporal gyrus (STG) to sector CA4 of the hippocampus in six sporadic cases of the disease. In cortical gyri and in the CA sectors of the hippocampus, the Aβ deposits were distributed either in clusters 200-6400 μm in diameter that were regularly distributed parallel to the tissue boundary or in larger clusters greater than 6400 μm in diameter. In some regions, smaller clusters of Aβ deposits were aggregated into larger 'superclusters'. In many cortical gyri, the density of Aβ deposits was positively correlated with distance below the gyral crest. In the majority of regions, clusters of the diffuse, primitive, and classic deposits were not spatially correlated with each other. In two cases, double immunolabelled to reveal the Aβ deposits and blood vessels, the classic Aβ deposits were clustered around the larger diameter vessels. These results suggest a complex pattern of Aβ deposition in the temporal lobe in sporadic AD. A regular distribution of Aβ deposit clusters may reflect the degeneration of specific cortico-cortical and cortico-hippocampal pathways and the influence of the cerebral blood vessels. Large-scale clustering may reflect the aggregation of deposits in the depths of the sulci and the coalescence of smaller clusters.