835 results for Cluster Analysis. Information Theory. Entropy. Cross Information Potential. Complex Data
Abstract:
A system of cluster analysis for genome-wide expression data from DNA microarray hybridization is described that uses standard statistical algorithms to arrange genes according to similarity in pattern of gene expression. The output is displayed graphically, conveying the clustering and the underlying expression data simultaneously in a form intuitive for biologists. We have found in the budding yeast Saccharomyces cerevisiae that clustering gene expression data efficiently groups together genes of known similar function, and we find a similar tendency in human data. Thus patterns seen in genome-wide expression experiments can be interpreted as indications of the status of cellular processes. Also, coexpression of genes of known function with poorly characterized or novel genes may provide a simple means of gaining leads to the functions of the many genes for which information is not currently available.
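A minimal sketch of the kind of correlation-based hierarchical clustering this abstract describes follows. The synthetic expression matrix, parameters, and library calls are illustrative assumptions, not the authors' data or pipeline.

```python
# Sketch: hierarchical clustering of gene expression profiles using
# 1 - Pearson correlation as the dissimilarity, as is common in this setting.
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
expression = rng.normal(size=(20, 8))   # 20 genes x 8 conditions (synthetic)

# Condensed distance matrix between gene profiles
dist = pdist(expression, metric="correlation")
tree = linkage(dist, method="average")  # average-linkage agglomeration

# The dendrogram leaf order places genes with similar patterns side by side,
# which is what the graphical display exploits
order = dendrogram(tree, no_plot=True)["leaves"]
print("gene display order:", order)
```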
Abstract:
Optimal currency area theory suggests that business cycle co-movement is a sufficient condition for monetary union, particularly if there are low levels of labour mobility between potential members of the monetary union. Previous studies of co-movement of business cycle variables (mainly authored by Artis and Zhang in the late 1990s) found that there was a core of member states in the EU that could be grouped together as having similar business cycle co-movements, but these studies always used Germany as the country against which to compare. In this study, the analysis of Artis and Zhang is extended and updated by correlating against both German and euro area macroeconomic aggregates and by using more recent techniques in cluster analysis, namely model-based clustering techniques.
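As a hedged illustration of model-based clustering of the sort mentioned above, the sketch below fits Gaussian mixtures to hypothetical correlation features and selects the number of clusters by BIC; the feature construction and data are invented for illustration, not the study's dataset.

```python
# Sketch: model-based clustering via Gaussian mixtures with BIC model choice.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Rows: countries; columns: cycle correlations with German and euro-area
# aggregates (synthetic stand-ins)
features = rng.uniform(-1, 1, size=(15, 2))

# Fit mixtures with 1..4 components and keep the BIC-optimal model
models = [GaussianMixture(n_components=k, random_state=0).fit(features)
          for k in range(1, 5)]
best = min(models, key=lambda m: m.bic(features))
print("chosen k:", best.n_components, "labels:", best.predict(features))
```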
Abstract:
"Physics; Reactor technology--TID-4500, 36th ed."
Abstract:
This paper describes the application of a new technique, rough clustering, to the problem of market segmentation. Rough clustering produces solutions different from those of k-means analysis because of the possibility of multiple cluster membership of objects. Traditional clustering methods generate extensional descriptions of groups, which show which objects are members of each cluster. Clustering techniques based on rough sets theory generate intensional descriptions, which outline the main characteristics of each cluster. In this study, a rough cluster analysis was conducted on a sample of 437 responses from a larger study of the relationship between shopping orientation (the general predisposition of consumers toward the act of shopping) and intention to purchase products via the Internet. The cluster analysis was based on five measures of shopping orientation: enjoyment, personalization, convenience, loyalty, and price. The rough clusters obtained provide interpretations of the different shopping orientations present in the data without the restriction of attempting to fit each object into only one segment. Such descriptions can be an aid to marketers attempting to identify potential segments of consumers.
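A compact sketch of rough k-means (in the spirit of Lingras and West) illustrates the multiple-membership idea: objects nearly equidistant from two centroids enter both upper approximations. The threshold, weights, and the variant of the centroid update below are illustrative assumptions, not the paper's exact algorithm.

```python
# Sketch: rough k-means with lower/upper approximations per cluster.
import numpy as np

def rough_kmeans(X, k=3, threshold=1.25, w_lower=0.7, iters=20, seed=0):
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)].copy()
    for _ in range(iters):
        lower = [[] for _ in range(k)]   # certain members
        upper = [[] for _ in range(k)]   # possible members (incl. boundary)
        for x in X:
            d = np.linalg.norm(centroids - x, axis=1)
            nearest = int(np.argmin(d))
            # Centroids almost as close as the nearest one share the object
            # in their upper approximations (multiple cluster membership)
            ties = [j for j in range(k) if j != nearest
                    and d[j] <= threshold * max(d[nearest], 1e-12)]
            upper[nearest].append(x)
            if ties:
                for j in ties:
                    upper[j].append(x)
            else:
                lower[nearest].append(x)
        for j in range(k):
            lo, up = lower[j], upper[j]
            if lo and up:   # weighted mix of certain and possible members
                centroids[j] = (w_lower * np.mean(lo, axis=0)
                                + (1 - w_lower) * np.mean(up, axis=0))
            elif up:
                centroids[j] = np.mean(up, axis=0)
    return centroids, lower, upper

rng = np.random.default_rng(1)
pts = np.vstack([rng.normal(m, 0.5, size=(30, 2)) for m in (0, 2, 4)])
cents, lower, upper = rough_kmeans(pts)
print([len(u) - len(l) for l, u in zip(lower, upper)])  # boundary sizes
```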
Abstract:
Visual field assessment is a core component of glaucoma diagnosis and monitoring, and the Standard Automated Perimetry (SAP) test is still considered the gold standard of visual field assessment. Although SAP is a subjective assessment with many pitfalls, it is used routinely in the diagnosis of visual field loss in glaucoma. The multifocal visual evoked potential (mfVEP) is a newly introduced method for objective visual field assessment. Several analysis protocols have been tested to identify early visual field losses in glaucoma patients using the mfVEP technique; some were successful in detecting field defects comparable to standard SAP visual field assessment, while others were not very informative and needed further adjustment and research. In this study, we implemented a novel analysis approach and evaluated its validity and whether it could be used effectively for early detection of visual field defects in glaucoma. OBJECTIVES: The purpose of this study is to examine the effectiveness of a new analysis method for the multifocal visual evoked potential (mfVEP) when used for objective assessment of the visual field in glaucoma patients, compared to the gold standard technique. METHODS: Three groups were tested in this study: normal controls (38 eyes), glaucoma patients (36 eyes) and glaucoma suspect patients (38 eyes). All subjects had two standard Humphrey visual field (HFA) 24-2 tests and a single mfVEP test undertaken in one session. Analysis of the mfVEP results was done using the new analysis protocol, the Hemifield Sector Analysis (HSA) protocol; analysis of the HFA was done using the standard grading system. RESULTS: Analysis of the mfVEP results showed a statistically significant difference between the 3 groups in the mean signal-to-noise ratio (SNR) (ANOVA, p<0.001 with a 95% CI). The differences between superior and inferior hemispheres were statistically significant in all 11 sectors in the glaucoma patient group (t-test, p<0.001), partially significant in 5/11 sectors (t-test, p<0.01), and not significant in most sectors of the normal group (only 1/11 was significant; t-test, p<0.9). Sensitivity and specificity of the HSA protocol in detecting glaucoma were 97% and 86% respectively, while for glaucoma suspects they were 89% and 79%. DISCUSSION: The results showed that the new analysis protocol was able to confirm already existing field defects detected by standard HFA and to differentiate between the 3 study groups, with a clear distinction between normal subjects and patients with suspected glaucoma; the distinction between normal subjects and glaucoma patients was especially clear and significant. CONCLUSION: The new HSA protocol used in mfVEP testing can be used to detect glaucomatous visual field defects in both glaucoma and glaucoma suspect patients. Using this protocol can provide information about focal visual field differences across the horizontal midline, which can be utilized to differentiate between glaucoma and normal subjects. Sensitivity and specificity of the mfVEP test showed very promising results and correlated with other anatomical changes in glaucoma field loss.
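The sector-wise hemifield comparison described here lends itself to a short sketch: paired t-tests of superior versus inferior SNR per sector, plus sensitivity/specificity from a confusion matrix. All numbers below are synthetic stand-ins, not the study's measurements.

```python
# Sketch: per-sector hemifield t-tests and sensitivity/specificity.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_eyes, n_sectors = 36, 11
superior = rng.normal(1.0, 0.3, size=(n_eyes, n_sectors))
inferior = rng.normal(1.4, 0.3, size=(n_eyes, n_sectors))  # simulated defect

# Paired t-test per sector across eyes
for s in range(n_sectors):
    t, p = stats.ttest_rel(superior[:, s], inferior[:, s])
    print(f"sector {s + 1}: t = {t:.2f}, p = {p:.4f}")

# Sensitivity / specificity of a binary call against ground truth
truth = np.array([1] * 36 + [0] * 38)             # glaucoma vs. normal eyes
pred = truth.copy()
pred[:2] = 0                                      # toy misses
pred[-5:] = 1                                     # toy false alarms
tp = np.sum((pred == 1) & (truth == 1)); fn = np.sum((pred == 0) & (truth == 1))
tn = np.sum((pred == 0) & (truth == 0)); fp = np.sum((pred == 1) & (truth == 0))
print("sensitivity:", tp / (tp + fn), "specificity:", tn / (tn + fp))
```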
Abstract:
The paper treats the task of cluster analysis of a given assembly of objects on the basis of the information contained in the description table of these objects. Various methods of cluster analysis are briefly considered. A heuristic method and rules for classification of the given assembly of objects are presented for the cases when their division into classes and the number of classes are not known. The algorithm is checked on a test example and on two program products (PP): learning systems and software for company management. An analysis of the results is presented.
Abstract:
CONCLUSIONS: The new HSA protocol used in mfVEP testing can be applied to detect glaucomatous visual field defects in both glaucoma and glaucoma suspect patients. Using this protocol can provide information about focal visual field differences across the horizontal midline, which can be utilized to differentiate between glaucoma and normal subjects. Sensitivity and specificity of the mfVEP test showed very promising results and correlated with other anatomical changes in glaucoma field loss. PURPOSE: The multifocal visual evoked potential (mfVEP) is a newly introduced method for objective visual field assessment. Several analysis protocols have been tested to identify early visual field losses in glaucoma patients using the mfVEP technique; some were successful in detecting field defects comparable to standard automated perimetry (SAP) visual field assessment, while others were not very informative and needed further adjustment and research. In this study we implemented a novel analysis approach and evaluated its validity and whether it could be used effectively for early detection of visual field defects in glaucoma. METHODS: Three groups were tested in this study: normal controls (38 eyes), glaucoma patients (36 eyes) and glaucoma suspect patients (38 eyes). All subjects had two standard Humphrey field analyzer (HFA) 24-2 tests and a single mfVEP test undertaken in one session. Analysis of the mfVEP results was done using the new analysis protocol, the hemifield sector analysis (HSA) protocol; analysis of the HFA was done using the standard grading system. RESULTS: Analysis of the mfVEP results showed a statistically significant difference between the three groups in the mean signal-to-noise ratio (ANOVA test, p < 0.001 with a 95% confidence interval). The differences between superior and inferior hemispheres were statistically significant in the glaucoma patient group in all 11 sectors (t-test, p < 0.001), partially significant in 5/11 sectors (t-test, p < 0.01), and not significant in most sectors of the normal group (1/11 sectors was significant; t-test, p < 0.9). Sensitivity and specificity of the HSA protocol in detecting glaucoma were 97% and 86%, respectively, and for glaucoma suspect patients the values were 89% and 79%, respectively.
Abstract:
A Flood Vulnerability Index (FloodVI) was developed using Principal Component Analysis (PCA) and a new aggregation method based on Cluster Analysis (CA). PCA simplifies a large number of variables into a few uncorrelated factors representing the social, economic, physical and environmental dimensions of vulnerability. CA groups areas that have the same characteristics in terms of vulnerability into vulnerability classes. The grouping of the areas determines their classification, contrary to other aggregation methods in which the areas' classification determines their grouping. While other aggregation methods distribute the areas into classes in an artificial manner, by imposing a certain probability of an area belonging to a certain class under the assumption that the aggregation measure used is normally distributed, CA does not constrain the distribution of the areas by the classes. FloodVI was designed at the neighbourhood level and was applied to the Portuguese municipality of Vila Nova de Gaia, where several flood events have taken place in the recent past. The FloodVI sensitivity was assessed using three different aggregation methods: the sum of component scores, the first component score, and the weighted sum of component scores. The results highlight the sensitivity of the FloodVI to different aggregation methods. The sum of component scores and the weighted sum of component scores show similar results; the first component score aggregation method classifies almost all areas as having medium vulnerability; and the results obtained using CA show a distinct differentiation of vulnerability in which hot spots can be clearly identified. The information provided by records of previous flood events corroborates the results obtained with CA: the inundated areas with greater damages are those identified as high and very high vulnerability areas by CA. This supports the conclusion that CA provides a reliable FloodVI.
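A minimal sketch of the PCA-plus-cluster-analysis aggregation idea follows: reduce vulnerability indicators to a few components, then let clustering define the vulnerability classes. The indicator matrix, number of components, and number of classes are illustrative assumptions.

```python
# Sketch: PCA for dimension reduction, then clustering to form classes.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)
indicators = rng.normal(size=(60, 12))   # 60 neighbourhoods x 12 indicators

scores = PCA(n_components=4).fit_transform(
    StandardScaler().fit_transform(indicators))
classes = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(scores)

# Rank clusters by mean leading-component score so the class labels read
# from low to very high vulnerability
order = np.argsort([scores[classes == c, 0].mean() for c in range(5)])
print("vulnerability class per area:", np.argsort(order)[classes])
```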
Abstract:
Cluster analysis of ranking data, which occurs in consumer questionnaires, voting forms or other inquiries of preferences, attempts to identify typical groups of rank choices. Empirically measured rankings are often incomplete, i.e. different numbers of filled rank positions cause heterogeneity in the data. We propose a mixture approach for clustering of heterogeneous rank data. Rankings of different lengths can be described and compared by means of a single probabilistic model. A maximum entropy approach avoids hidden assumptions about missing rank positions. Parameter estimators and an efficient EM algorithm for unsupervised inference are derived for the ranking mixture model. Experiments on both synthetic data and real-world data demonstrate significantly improved parameter estimates on heterogeneous data when the incomplete rankings are included in the inference process.
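To make the EM machinery concrete, the sketch below fits a mixture in which each component is a simple categorical distribution over first choices. This is a deliberate simplification of the paper's maximum-entropy ranking model; all names and data are invented for illustration.

```python
# Sketch: EM for a mixture of categorical "first choice" distributions.
import numpy as np

def em_first_choice(first_choices, n_items, k=2, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    pi = np.full(k, 1.0 / k)                     # mixture weights
    theta = rng.dirichlet(np.ones(n_items), k)   # per-cluster item probs
    x = np.asarray(first_choices)
    for _ in range(iters):
        # E-step: responsibilities r[n, j] proportional to pi[j] * theta[j, x[n]]
        r = pi * theta[:, x].T
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights and categorical parameters
        pi = r.mean(axis=0)
        for j in range(k):
            theta[j] = np.bincount(x, weights=r[:, j], minlength=n_items)
            theta[j] /= theta[j].sum()
    return pi, theta

pi, theta = em_first_choice([0, 0, 1, 2, 2, 2, 1, 0], n_items=3)
print(pi.round(2), theta.round(2))
```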
Abstract:
This paper, which emerged from work supported by EPSRC grant GR/S84354/01, proposes a method of determining principal curves, using spline functions, in principal component analysis (PCA) for the representation of non-linear behaviour in process monitoring. Although principal curves are well established, they are difficult to implement in practice if a large number of variables are analysed. The significant contribution of this paper is that the proposed method has minimal complexity, assuming simple spline geometry, thus enabling efficient computation. The paper provides a foundation for further work where multiple curves may be required to represent underlying non-linear information in complex data.
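A single-pass sketch of the spline-based principal-curve idea: order the data along the first principal component, then smooth each coordinate with a spline over that ordering. The paper's actual method iterates and controls complexity differently; everything below is illustrative.

```python
# Sketch: one smoothing pass toward a principal curve via splines.
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(4)
t = np.linspace(0, 3, 200)
X = np.c_[t, np.sin(t)] + rng.normal(0, 0.1, size=(200, 2))  # curved data

Xc = X - X.mean(0)
pc1 = np.linalg.svd(Xc, full_matrices=False)[2][0]  # leading direction
proj = Xc @ pc1                                     # 1-D ordering parameter
order = np.argsort(proj)

# Smooth each coordinate as a spline function of the projection
curve = np.column_stack([
    UnivariateSpline(proj[order], X[order, d], s=2.0)(proj[order])
    for d in range(2)
])
print(curve[:3])
```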
Abstract:
This paper proposes the use of an improved covariate unit root test which exploits the cross-sectional dependence information when the panel data null hypothesis of a unit root is rejected. More explicitly, to increase the power of the test, we suggest the utilization of more than one covariate and offer several ways to select the ‘best’ covariates from the set of potential covariates represented by the individuals in the panel. Employing our methods, we investigate the Prebisch-Singer hypothesis for nine commodity prices. Our results show that this hypothesis holds for all but the price of petroleum.
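In the spirit of a covariate-augmented unit root regression (the idea this paper builds on), the sketch below regresses the first difference of a series on its lagged level plus a stationary covariate and reports the t-statistic on the lagged level. Critical values for such statistics are non-standard, and this sketch does not reproduce the paper's covariate-selection procedure; the data are synthetic.

```python
# Sketch: covariate-augmented Dickey-Fuller style regression via OLS.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 200
y = np.cumsum(rng.normal(size=n))   # candidate unit-root series
x = rng.normal(size=n)              # stationary covariate

dy = np.diff(y)
X = sm.add_constant(np.column_stack([y[:-1], x[1:]]))
fit = sm.OLS(dy, X).fit()
print("t-stat on lagged level:", fit.tvalues[1])
```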
Abstract:
A realistic representation of North Atlantic tropical cyclone tracks is crucial, as it allows, for example, explaining potential changes in US landfalling systems. Here we present a tentative study examining the ability of recent climate models to represent North Atlantic tropical cyclone tracks. Tracks from two types of climate models are evaluated: explicit tracks are obtained from tropical cyclones simulated in regional or global climate models with moderate to high horizontal resolution (1° to 0.25°), and downscaled tracks are obtained using a downscaling technique with large-scale environmental fields from a subset of these models. For both configurations, tracks are objectively separated into four groups using a clustering technique, leading to a zonal and a meridional separation of the tracks. The meridional separation largely captures the separation between deep tropical and subtropical, hybrid or baroclinic cyclones, while the zonal separation segregates Gulf of Mexico and Cape Verde storms. The seasonality, intensity and power dissipation index of the tracks in each cluster are documented for both configurations. Our results show that, except for the seasonality, the downscaled tracks better capture the observed characteristics of the clusters. We also use three different idealized scenarios to examine the possible future changes of tropical cyclone tracks under 1) warming sea surface temperature, 2) increasing carbon dioxide, and 3) a combination of the two. The response to each scenario is highly variable depending on the simulation considered. Finally, we examine the role of each cluster in these future changes and find no preponderant contribution of any single cluster over the others.
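A toy sketch of separating tracks into four groups from simple summary descriptors (genesis point and mean position) follows. Published track clusterings often use regression-mixture models, so plain k-means on synthetic tracks is only an illustration of the idea.

```python
# Sketch: k-means on simple track descriptors to form four track clusters.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(6)
# Synthetic tracks: random start points plus random-walk increments
starts = np.c_[rng.uniform(-80, -20, 100), rng.uniform(10, 35, 100)]
steps = rng.normal([0.5, 0.3], 0.4, size=(100, 10, 2)).cumsum(axis=1)
tracks = starts[:, None, :] + steps   # 100 tracks x 10 positions x (lon, lat)

# Descriptors: genesis position and mean position along the track
features = np.c_[tracks[:, 0, :], tracks.mean(axis=1)]
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(features)
print("cluster sizes:", np.bincount(labels))
```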
Abstract:
Background: The validity of ensemble averaging on event-related potential (ERP) data has been questioned, due to its assumption that the ERP is identical across trials. Thus, there is a need for preliminary testing for cluster structure in the data. New method: We propose a complete pipeline for the cluster analysis of ERP data. To increase the signal-to-noise ratio (SNR) of the raw single trials, we used a denoising method based on Empirical Mode Decomposition (EMD). Next, we used a bootstrap-based method to determine the number of clusters, through a measure called the Stability Index (SI). We then used a clustering algorithm based on a Genetic Algorithm (GA) to define initial cluster centroids for subsequent k-means clustering. Finally, we visualised the clustering results through a scheme based on Principal Component Analysis (PCA). Results: After validating the pipeline on simulated data, we tested it on data from two experiments – a P300 speller paradigm on a single subject and a language processing study on 25 subjects. Results revealed evidence for the existence of 6 clusters in one experimental condition from the language processing study. Further, a two-way chi-square test revealed an influence of subject on cluster membership.
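One stage of such a pipeline, the bootstrap check for cluster structure, can be sketched compactly. The agreement score below is a generic stand-in for the paper's Stability Index, and the EMD denoising and GA seeding stages are omitted; the data are simulated.

```python
# Sketch: bootstrap stability of k-means solutions to choose k.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

def stability(X, k, n_boot=20, seed=0):
    rng = np.random.default_rng(seed)
    ref = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores = []
    for _ in range(n_boot):
        idx = rng.choice(len(X), len(X), replace=True)
        lab = KMeans(n_clusters=k, n_init=10,
                     random_state=0).fit_predict(X[idx])
        # Agreement between the reference partition (restricted to the
        # bootstrap sample) and the bootstrap partition
        scores.append(adjusted_rand_score(ref[idx], lab))
    return float(np.mean(scores))

rng = np.random.default_rng(7)
X = np.vstack([rng.normal(m, 0.3, size=(40, 5)) for m in (0.0, 2.0, 4.0)])
print({k: round(stability(X, k), 2) for k in (2, 3, 4, 5)})
```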
Abstract:
The present paper presents a theoretical analysis of a cross flow heat exchanger with a new flow arrangement comprising several tube rows. The thermal performance of the proposed flow arrangement is compared with that of a typical counter cross flow arrangement used in the chemical, refrigeration, automotive and air conditioning industries. The thermal performance comparison has been performed in terms of the following parameters: heat exchanger effectiveness and efficiency, dimensionless entropy generation, entransy dissipation number, and dimensionless local temperature differences. It is also shown that the uniformity of the temperature difference field leads to a higher thermal performance of the heat exchanger; in the present case this is accomplished through a different organization of the in-tube fluid circuits in the heat exchanger. The relation between the recently introduced "entransy dissipation number" and the conventional thermal effectiveness has been obtained in terms of the "number of transfer units". A case study has been solved to quantitatively obtain the temperature difference distribution over two-row units involving the proposed arrangement and the counter cross flow one. It has been shown that the proposed arrangement presents better thermal performance regardless of the comparison parameter.
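For context, the classical effectiveness-NTU relation for a pure counter-flow exchanger is shown below; it is quoted for reference only, since the paper's relation between the entransy dissipation number and effectiveness for the proposed multi-row arrangement is not reproduced here.

```latex
% Standard counter-flow effectiveness-NTU relation (reference only)
\varepsilon = \frac{1 - \exp\!\left[-\mathrm{NTU}\,(1 - C_r)\right]}
                   {1 - C_r \exp\!\left[-\mathrm{NTU}\,(1 - C_r)\right]},
\qquad C_r = \frac{C_{\min}}{C_{\max}}
```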
Abstract:
People often use tools to search for information. In order to improve the quality of an information search, it is important to understand how internal information, which is stored in the user's mind, and external information, represented by the interface of tools, interact with each other. How information is distributed between internal and external representations significantly affects information search performance. However, few studies have examined the relationship between types of interface and types of search task in the context of information search. For a distributed information search task, how data are distributed, represented, and formatted significantly affects the user's search performance in terms of response time and accuracy. Guided by UFuRT (User, Function, Representation, Task), a human-centered process, I propose a search model and a task taxonomy. The model defines its relationship with other existing information models. The taxonomy clarifies the legitimate operations for each type of search task on relational data. Based on the model and taxonomy, I have also developed prototypes of interfaces for the search tasks of relational data. These prototypes were used for experiments. The experiments described in this study are of a within-subject design with a sample of 24 participants recruited from the graduate schools located in the Texas Medical Center. Participants performed one-dimensional nominal search tasks over nominal, ordinal, and ratio displays, and searched one-dimensional nominal, ordinal, interval, and ratio tasks over table and graph displays. Participants also performed the same task and display combinations for two-dimensional searches. Distributed cognition theory has been adopted as a theoretical framework for analyzing and predicting the search performance on relational data. It has been shown that the representation dimensions and data scales, as well as the search task types, are the main factors determining search efficiency and effectiveness. In particular, the more external representations used, the better the search task performance; the results also suggest that the ideal search performance occurs when the question type and the corresponding data scale representation match. The implications of the study lie in contributing to the effective design of search interfaces for relational data, especially laboratory results, which are often used in healthcare activities.