31 results for cluster analysis
in Aston University Research Archive
Abstract:
A cluster analysis was performed on 78 cases of Alzheimer's disease (AD) to identify possible pathological subtypes of the disease. Data on 47 neuropathological variables, including features of the gross brain and the density and distribution of senile plaques (SP) and neurofibrillary tangles (NFT), were used to describe each case. Cluster analysis is a multivariate statistical method which groups together the AD cases with the most similar neuropathological characteristics. The majority of cases (83%) were clustered into five such groups. The analysis suggested that an initial division of the 78 cases could be made into two major groups: (1) a large group (68%) in which the distribution of SP and NFT was restricted to a relatively small number of brain regions, and (2) a smaller group (15%) in which the lesions were more widely disseminated throughout the neocortex. Each of these groups could be subdivided on the degree of capillary amyloid angiopathy (CAA) present. In addition, those cases with a restricted development of SP/NFT and CAA could be divided further into an early and a late onset form. Familial AD cases did not cluster as a separate group but were either distributed between four of the five groups or were cases with unique combinations of pathological features not closely related to any of the groups. It was concluded that multivariate statistical methods may be of value in the classification of AD into subtypes. © 1994 Springer-Verlag.
Abstract:
Two contrasting multivariate statistical methods, viz., principal components analysis (PCA) and cluster analysis were applied to the study of neuropathological variations between cases of Alzheimer's disease (AD). To compare the two methods, 78 cases of AD were analyzed, each characterised by measurements of 47 neuropathological variables. Both methods of analysis revealed significant variations between AD cases. These variations were related primarily to differences in the distribution and abundance of senile plaques (SP) and neurofibrillary tangles (NFT) in the brain. Cluster analysis classified the majority of AD cases into five groups which could represent subtypes of AD. However, PCA suggested that variation between cases was more continuous with no distinct subtypes. Hence, PCA may be a more appropriate method than cluster analysis in the study of neuropathological variations between AD cases.
Abstract:
This thesis describes the development of an inexpensive and efficient clustering technique for multivariate data analysis. The technique starts from a multivariate data matrix and ends with a graphical representation of the data and a pattern recognition discriminant function. It also yields a distances frequency distribution that might be useful in detecting clustering in the data or in estimating parameters that help discriminate between the different populations in the data. The technique can also be used in feature selection. It is essentially a means of discovering data structure by revealing the component parts of the data. The thesis offers three distinct contributions to cluster analysis and pattern recognition techniques. The first contribution is the introduction of a transformation function in the technique of nonlinear mapping. The second contribution is the use of a distances frequency distribution instead of a distances time-sequence in nonlinear mapping. The third contribution is the formulation of a new generalised and normalised error function, together with its optimal step size formula for gradient method minimisation. The thesis consists of five chapters. The first chapter is the introduction. The second chapter describes multidimensional scaling as the origin of the nonlinear mapping technique. The third chapter describes the first development step in the technique of nonlinear mapping, that is, the introduction of the "transformation function". The fourth chapter describes the second development step of the nonlinear mapping technique: the use of a distances frequency distribution instead of a distances time-sequence. The chapter also includes the formulation of the new generalised and normalised error function. Finally, the fifth chapter, the conclusion, evaluates all the developments and proposes a new program for cluster analysis and pattern recognition that integrates all the new features.
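As background for the error function mentioned above, the following is a minimal sketch of the classical Sammon stress, the error measure most commonly associated with nonlinear mapping. It is an assumption that the thesis builds on this family of measures; the abstract does not give the generalised and normalised error function, the transformation function, or the step size formula, so none of those are reproduced here. The function name `sammon_stress` and the sample points are hypothetical.

```python
import math
from itertools import combinations


def sammon_stress(high, low):
    """Classical Sammon stress between original points and their mapped images.

    high: list of points in the original space (tuples of floats).
    low:  list of the corresponding points in the mapped space.
    Assumes all original points are distinct, so no original distance is zero.
    This is the textbook measure, not the thesis's generalised error function.
    """
    pairs = list(combinations(range(len(high)), 2))
    d_high = [math.dist(high[i], high[j]) for i, j in pairs]
    d_low = [math.dist(low[i], low[j]) for i, j in pairs]
    scale = sum(d_high)  # normalising constant: sum of original distances
    return sum((dh - dl) ** 2 / dh for dh, dl in zip(d_high, d_low)) / scale


# Illustrative points only: one original distance of 5 is mapped to 10.
example = sammon_stress([(0, 0), (3, 4)], [(0,), (10,)])
```

A mapping that preserves every pairwise distance gives a stress of zero; gradient methods of the kind the thesis describes adjust the mapped points to reduce this value.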
Abstract:
PURPOSE: Two common approaches to identify subgroups of patients with bipolar disorder are clustering methodology (mixture analysis) based on the age of onset, and a birth cohort analysis. This study investigates if a birth cohort effect will influence the results of clustering on the age of onset, using a large, international database. METHODS: The database includes 4037 patients with a diagnosis of bipolar I disorder, previously collected at 36 collection sites in 23 countries. Generalized estimating equations (GEE) were used to adjust the data for country median age, and in some models, birth cohort. Model-based clustering (mixture analysis) was then performed on the age of onset data using the residuals. Clinical variables in subgroups were compared. RESULTS: There was a strong birth cohort effect. Without adjusting for the birth cohort, three subgroups were found by clustering. After adjusting for the birth cohort or when considering only those born after 1959, two subgroups were found. With results of either two or three subgroups, the youngest subgroup was more likely to have a family history of mood disorders and a first episode with depressed polarity. However, without adjusting for birth cohort (three subgroups), family history and polarity of the first episode could not be distinguished between the middle and oldest subgroups. CONCLUSION: These results using international data confirm prior findings using single country data, that there are subgroups of bipolar I disorder based on the age of onset, and that there is a birth cohort effect. Including the birth cohort adjustment altered the number and characteristics of subgroups detected when clustering by age of onset. Further investigation is needed to determine if combining both approaches will identify subgroups that are more useful for research.
Abstract:
Principal components analysis (PCA) has been described for over 50 years; however, it is rarely applied to the analysis of epidemiological data. In this study PCA was critically appraised in its ability to reveal relationships between pulsed-field gel electrophoresis (PFGE) profiles of methicillin-resistant Staphylococcus aureus (MRSA) in comparison to the more commonly employed cluster analysis and representation by dendrograms. The PFGE type following SmaI chromosomal digest was determined for 44 multidrug-resistant hospital-acquired methicillin-resistant S. aureus (MR-HA-MRSA) isolates, two multidrug-resistant community-acquired MRSA (MR-CA-MRSA), 50 hospital-acquired MRSA (HA-MRSA) isolates (from the University Hospital Birmingham, NHS Trust, UK) and 34 community-acquired MRSA (CA-MRSA) isolates (from general practitioners in Birmingham, UK). Strain relatedness was determined using Dice band-matching with UPGMA clustering and PCA. The results indicated that PCA revealed relationships between MRSA strains which were more strongly correlated with known epidemiology, most likely because, unlike cluster analysis, PCA does not have the constraint of generating a hierarchic classification. In addition, PCA provides the opportunity for further analysis to identify key polymorphic bands within complex genotypic profiles, which is not always possible with dendrograms. Here we provide a detailed description of a PCA method for the analysis of PFGE profiles to complement further the epidemiological study of infectious disease. © 2005 Elsevier B.V. All rights reserved.
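The Dice band-matching and UPGMA step named in the abstract above can be sketched roughly as follows. This is a minimal illustration, assuming each PFGE profile is reduced to a set of band positions; the isolate names and band positions are hypothetical, and the function names `dice` and `upgma` are not taken from the study or from any gel-analysis software.

```python
from itertools import combinations


def dice(a, b):
    """Dice similarity between two sets of band positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def upgma(profiles):
    """Average-linkage (UPGMA) clustering on Dice distances.

    profiles: dict mapping isolate name -> set of band positions.
    Returns the merges in order as (cluster_a, cluster_b, distance),
    where each cluster is a tuple of isolate names.
    """
    names = list(profiles)
    # Distance between two isolates is 1 minus their Dice similarity.
    dist = {frozenset((x, y)): 1 - dice(profiles[x], profiles[y])
            for x, y in combinations(names, 2)}
    clusters = [(n,) for n in names]
    merges = []

    def avg_dist(c1, c2):
        # UPGMA: unweighted mean over all between-cluster isolate pairs.
        total = sum(dist[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: avg_dist(*p))
        merges.append((c1, c2, avg_dist(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical band positions (illustrative only, not data from the study).
profiles = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3, 5},
    "C": {6, 7, 8, 9},
    "D": {6, 7, 8, 10},
}
merges = upgma(profiles)
```

The list of merges is what a dendrogram draws. Every isolate is forced into a nested hierarchy, which is the constraint the abstract contrasts with PCA.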
Abstract:
This study contributes to knowledge in two areas. First, it rigorously investigates the relationships between climate perception and a number of factors believed to affect it, classified into three types, in order to test the hypothesis that qualification and personal factors, in contrast to situational factors, play an important role in shaping climate perception. Second, it reclusters the items of HAY, a widely applied scale for measuring climate, to overcome cross-cultural differences between Kuwaiti and American society and to derive modified dimensions of climate for a civil service organisation in Kuwait. The study also carries out a diagnostic test of the climate of the Ministry of Public Health (MoPH) in Kuwait, aiming to diagnose the perceived characteristics of the organisation, and suggests a number of areas that need attention if improvement is to be introduced. Statistical and computing facilities were used extensively so that the analysis would better represent the field data, and the main survey achieved a very high response rate, which bears on the reliability of the findings. Three main field studies are included: the first administered the main questionnaire; the second measured the "should be" climate as judged by MoPH experts using the DELPHI technique; and the third was an extensive meeting with the top management team of MoPH. The results of the first stage were subjected to CLUSTER analysis to reconstruct the HAY tool, and the results of the second and third stages were compared with those of the first.
Abstract:
This paper describes how the statistical technique of cluster analysis and the machine learning technique of rule induction can be combined to explore a database. The ways in which such an approach alleviates the problems associated with other techniques for data analysis are discussed. We report the results of experiments carried out on a database from the medical diagnosis domain. Finally we describe the future developments which we plan to carry out to build on our current work.
Abstract:
Purpose - The purpose of the work discussed in this paper is to understand, analyse and benchmark the "Packing and Filling" processes within BASF. A benchmarking project is described in detail which aimed to cover sites in different countries that supplied many different variants of finished goods in order to establish best practice and then to generate some options for their implementation. Design/methodology/approach - The project used an adaptation of accepted benchmarking methodology combined with other techniques (such as rich picture generation, and cluster analysis) to maximise the insight generated. Findings - The findings of the research showed that one of the main factors affecting the process was how third parties were used (e.g. extent and nature of out-sourcing, and its degree of centralisation). Research limitations/implications - The exercise was challenged by the selection of suitably similar benchmarking candidates because the environment was complex and highly varied; the paper explains practical solutions for dealing with this challenge. Practical limitations - Strategic and tactical options are outlined at the end of the paper and will have applicability to other organisations and industries that are looking to find the answers to frequently asked questions about how to successfully implement an internal process benchmarking project in a large complex organisation that has high variety in end products and delivery methods. Originality/value - The methodology described in this paper is of a proprietary and unique nature. The paper is structured around some key questions commonly asked of benchmarking, and the answers are provided via a real in-depth case study from BASF that spans 4 sites in 3 countries using 15 different filling lines. © Emerald Group Publishing Limited.
Abstract:
This accessible, practice-oriented and compact text provides a hands-on introduction to the principles of market research. Using the market research process as a framework, the authors explain how to collect and describe the necessary data and present the most important and frequently used quantitative analysis techniques, such as ANOVA, regression analysis, factor analysis, and cluster analysis. An explanation is provided of the theoretical choices a market researcher has to make with regard to each technique, as well as how these are translated into actions in IBM SPSS Statistics. This includes a discussion of what the outputs mean and how they should be interpreted from a market research perspective. Each chapter concludes with a case study that illustrates the process based on real-world data. A comprehensive web appendix includes additional analysis techniques, datasets, video files and case studies. Several mobile tags in the text allow readers to quickly browse related web content using a mobile device.
Abstract:
This exploratory paper, developing a conceptual model of owner-manager characteristics and access to finance, aims to investigate whether the concept of strategic groups plays a role in the process of small and medium-sized enterprises (SMEs) accessing finance. Strategic groups are groups of firms making similar patterns of investments in order to achieve their goals. This paper explores how strategic groups, which represent a classification of SMEs based upon their realised strategies, help to provide an understanding of the success of SMEs in raising finance. The data, from a representative survey of 400 SMEs conducted by the Barclays Bank Telephone Research Unit, were subjected to two-stage cluster analysis and thus codified into strategic groups using the natural rhythm of the data, rather than any subjective and value-laden categories being imposed by the authors. The findings show clear differentiation between strategic groups of SMEs, the characteristics of their owner-managers, and the financing strategies adopted. As such, the paper develops a novel typology of strategic groups of SMEs which, therefore, informs their financing strategies, as well as advising other stakeholders.
Abstract:
This paper introduces a method for the analysis of regional linguistic variation. The method identifies individual and common patterns of spatial clustering in a set of linguistic variables measured over a set of locations based on a combination of three statistical techniques: spatial autocorrelation, factor analysis, and cluster analysis. To demonstrate how to apply this method, it is used to analyze regional variation in the values of 40 continuously measured, high-frequency lexical alternation variables in a 26-million-word corpus of letters to the editor representing 206 cities from across the United States.
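To make the spatial autocorrelation step above concrete, here is a minimal sketch of global Moran's I, one common measure of whether nearby locations have similar values. The abstract does not say which autocorrelation statistic the method uses, so the choice of Moran's I is an assumption, as are the function names and the toy chain of four locations.

```python
def morans_i(values, weights):
    """Global Moran's I for values observed at n locations.

    values:  list of measurements, one per location.
    weights: dict mapping (i, j) index pairs to spatial weights w_ij.
    Assumes the values are not all equal, so the variance term is non-zero.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]  # deviations from the mean
    total_weight = sum(weights.values())
    cross = sum(w * z[i] * z[j] for (i, j), w in weights.items())
    return (n / total_weight) * cross / sum(d * d for d in z)


def chain_weights(n):
    """Hypothetical neighbourhood: locations on a line, adjacent ones linked."""
    w = {}
    for i in range(n - 1):
        w[(i, i + 1)] = 1.0
        w[(i + 1, i)] = 1.0
    return w


# Illustrative values only, not data from the corpus described above.
clustered = morans_i([1, 1, -1, -1], chain_weights(4))    # similar neighbours
alternating = morans_i([1, -1, 1, -1], chain_weights(4))  # dissimilar neighbours
```

Positive values indicate that similar measurements sit next to each other, which is the kind of regional pattern the method then summarises with factor analysis and cluster analysis.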
Abstract:
A set of full-color images of objects is described for use in experiments investigating the effects of in-depth rotation on the identification of three-dimensional objects. The corpus contains up to 11 perspective views of 70 nameable objects. We also provide ratings of the "goodness" of each view, based on Thurstonian scaling of subjects' preferences in a paired-comparison experiment. An exploratory cluster analysis on the scaling solutions indicates that the amount of information available in a given view generally is the major determinant of the goodness of the view. For instance, objects with an elongated front-back axis tend to cluster together, and the front and back views of these objects, which do not reveal the object's major surfaces and features, are evaluated as the worst views.
Abstract:
Burkholderia cepacia is an opportunistic pathogen that colonises the lungs of cystic fibrosis (CF) patients, with a frequently fatal outcome. Antibiotic resistance is common and highly transmissible epidemic strains have been described in the UK. 37 B. cepacia isolates from clinical and botanical sources were characterised via metabolic capabilities, antibiotic sensitivity, fatty acid methyl ester (FAME) profiles, restriction digest analysis of chromosomal DNA by pulsed-field gel electrophoresis (PFGE) (with the use of two separate restriction enzymes) and outer membrane protein (OMP) profiles. This revealed isolates of the UK CF epidemic strain to form a distinct group with a specific OMP profile. Cluster analysis of PFGE and FAME profiles revealed the species Burkholderia gladioli and Burkholderia vietnamiensis to be more closely related to each other and to laboratory strains of B. cepacia than to the CF epidemic strain considered a member of the latter species. The epidemic strain of B. cepacia may therefore be worthy of species definition in its own right. All the strains studied showed a high level of resistance to antibiotics, including the carbapenems. Considering this, carbapenemase production by isolates of B. cepacia was investigated. A metallo-β-lactamase from a clinical strain of B. cepacia was isolated and partially purified using Cibacron blue F3GA-coupled agarose. The resulting preparation showed a single band of β-lactamase activity (pI 8.45) after analytical isoelectric focusing. The enzyme was particularly effective in the hydrolysis of imipenem. Meropenem, biapenem, cephaloridine, ceftazidime, benzylpenicillin, ampicillin and carbenicillin were hydrolysed at a lower rate. An unusual inhibition profile was noted. Inhibition by the metal ion chelators ethylenediaminetetraacetic acid and o-phenanthroline was reversed by addition of zinc, indicating a metallo-enzyme, whilst >90% inhibition was attainable with 0.1 mM concentrations of tazobactam and clavulanic acid. A study of 8 other clinical isolates showed an enzyme of pI 8.45 to be present and inducible by imipenem in each case. This enzyme was designated PCM-I (Pseudomonas cepacia metalloenzyme I).
Abstract:
The manufacture of copper alloy flat rolled metals involves hot and cold rolling operations, together with annealing and other secondary processes, to transform castings (mainly slabs and cakes) into such shapes as strip, plate, sheet, etc. Production is mainly to customer orders in a wide range of specifications for dimensions and properties. However, order quantities are often small and so process planning plays an important role in this industry. Much research work has been done in the past in relation to the technology of flat rolling and the details of the operations; however, there is little or no evidence of any research into the planning of processes for this type of manufacture. Practical observation in a number of rolling mills has established the type of manual process planning traditionally used in this industry. This manual approach, however, has inherent drawbacks, being particularly dependent on the individual planners who gain their knowledge over a long span of practical experience. The introduction of the retrieval CAPP approach to this industry was a first step to reduce these problems. But this could not provide a long-term answer because of the need for an experienced planner to supervise generation of any plan. It also fails to take account of the dynamic nature of the parameters involved in the planning, such as the availability of resources, operation conditions and variations in the costs. The alternative is the use of a generative approach to planning in the rolling mill context. In this thesis, generative methods are developed for the selection of optimal routes for single orders and then for batches of orders, bearing in mind equipment restrictions, production costs and material yield. The batch order process planning involves the use of a special cluster analysis algorithm for optimal grouping of the orders. This research concentrates on cold-rolling operations. A prototype model of the proposed CAPP system, including both single order and batch order planning options, has been developed and tested on real order data in the industry. The results were satisfactory and compared very favourably with the existing manual and retrieval methods.