40 results for Automated data analysis


Relevance:

100.00%

Publisher:

Abstract:

Objective: Recently, much research has proposed using nature-inspired algorithms to perform complex machine learning tasks. Ant colony optimization (ACO) is one such algorithm; it is based on swarm intelligence and derived from a model inspired by the collective foraging behavior of ants. Taking advantage of ACO traits such as self-organization and robustness, this paper investigates ant-based algorithms for gene expression data clustering and associative classification. Methods and material: An ant-based clustering algorithm (Ant-C) and an ant-based association rule mining algorithm (Ant-ARM) are proposed for gene expression data analysis. The proposed algorithms make use of natural behaviors of ants, such as cooperation and adaptation, to allow a flexible, robust search for a good candidate solution. Results: Ant-C was tested on three datasets selected from the Stanford Genomic Resource Database and achieved relatively high accuracy compared to other classical clustering methods. Ant-ARM was tested on the acute lymphoblastic leukemia (ALL)/acute myeloid leukemia (AML) dataset and generated about 30 classification rules with high accuracy. Conclusions: Ant-C can generate an optimal number of clusters without incorporating any other algorithms such as K-means or agglomerative hierarchical clustering. For associative classification, while a few well-known algorithms such as Apriori, FP-growth and Magnum Opus are unable to mine any association rules from the ALL/AML dataset within a reasonable period of time, Ant-ARM is able to extract associative classification rules.

Relevance:

100.00%

Publisher:

Abstract:

DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT

Relevance:

100.00%

Publisher:

Abstract:

A recent novel approach to the visualisation and analysis of datasets, and one which is particularly applicable to those of a high dimension, is discussed in the context of real applications. A feed-forward neural network is utilised to effect a topographic, structure-preserving, dimension-reducing transformation of the data, with an additional facility to incorporate different degrees of associated subjective information. The properties of this transformation are illustrated on synthetic and real datasets, including the 1992 UK Research Assessment Exercise for funding in higher education. The method is compared and contrasted to established techniques for feature extraction, and related to topographic mappings, the Sammon projection and the statistical field of multidimensional scaling.

Relevance:

100.00%

Publisher:

Abstract:

Consideration of the influence of test technique and data analysis method is important for data comparison and design purposes. The paper highlights the effects of replication interval, crack growth rate averaging and curve-fitting procedures on crack growth rate results for a Ni-base alloy. It is shown that an upper bound crack growth rate line is not appropriate for use in fatigue design, and that the derivative of a quadratic fit to the a vs N data looks promising. However, this type of averaging, or curve fitting, is not useful in developing an understanding of microstructure/crack tip interactions. For this purpose, simple replica-to-replica growth rate calculations are preferable. © 1988.
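As a rough illustration of the two averaging choices the abstract contrasts, the sketch below computes a growth rate da/dN from the derivative of a quadratic fit to a vs N data, and separately from simple replica-to-replica differences. The function names and the data values are hypothetical and chosen only for demonstration; they are not taken from the paper.

```python
import numpy as np

def growth_rate_quadratic(cycles, crack_length):
    """Fit a = c2*N^2 + c1*N + c0 and return da/dN = 2*c2*N + c1 at each N."""
    c2, c1, _c0 = np.polyfit(cycles, crack_length, deg=2)
    return 2.0 * c2 * np.asarray(cycles, dtype=float) + c1

def growth_rate_replica(cycles, crack_length):
    """Replica-to-replica (secant) rate between successive measurements."""
    return np.diff(crack_length) / np.diff(cycles)

# Hypothetical, exactly quadratic data: a = 1e-8*N^2 + 1e-4*N + 0.05
N = np.array([0.0, 1000.0, 2000.0, 3000.0, 4000.0])
a = 1e-8 * N**2 + 1e-4 * N + 0.05

rates_fit = growth_rate_quadratic(N, a)      # smoothed rate at each N
rates_replica = growth_rate_replica(N, a)    # one rate per interval
```

The fitted derivative smooths over the whole record, while the replica-to-replica rates keep interval-level variation, which is the property the abstract prefers for studying microstructure/crack tip interactions.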

Relevance:

100.00%

Publisher:

Abstract:

Developers of interactive software are confronted by an increasing variety of software tools to help engineer the interactive aspects of software applications. Not only do these tools fall into different categories in terms of functionality, but within each category there is a growing number of competing tools with similar, although not identical, features. Choice of user interface development tool (UIDT) is therefore becoming increasingly complex.

Relevance:

100.00%

Publisher:

Abstract:

DUE TO INCOMPLETE PAPERWORK, ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT

Relevance:

100.00%

Publisher:

Abstract:

This article is aimed primarily at eye care practitioners who are undertaking advanced clinical research, and who wish to apply analysis of variance (ANOVA) to their data. ANOVA is a data analysis method of great utility and flexibility. This article describes why and how ANOVA was developed, the basic logic which underlies the method and the assumptions that the method makes for it to be validly applied to data from clinical experiments in optometry. The application of the method to the analysis of a simple data set is then described. In addition, the methods available for making planned comparisons between treatment means and for making post hoc tests are evaluated. The problem of determining the number of replicates or patients required in a given experimental situation is also discussed. Copyright (C) 2000 The College of Optometrists.
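The basic logic of a one-way ANOVA can be sketched in a few lines: the total variation is split into a between-treatment part and a within-treatment part, and the F statistic is the ratio of their mean squares. The function name and the three small groups below are hypothetical illustrations, not data from the article.

```python
import numpy as np

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way, fixed-effects ANOVA."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    k = len(groups)          # number of treatments
    n = all_values.size      # total number of observations

    # Variation of the treatment means about the grand mean
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    # Variation of observations about their own treatment mean
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical measurements for three treatments
f_stat, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

Judging significance still requires comparing F with the F distribution on (df_between, df_within) degrees of freedom, using tables or a statistics library, and the method's assumptions still need to be checked for the data at hand.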

Relevance:

100.00%

Publisher:

Abstract:

The use of quantitative methods has become increasingly important in the study of neuropathology and especially in neurodegenerative disease. Disorders such as Alzheimer's disease (AD) and the frontotemporal dementias (FTD) are characterized by the formation of discrete, microscopic, pathological lesions which play an important role in pathological diagnosis. This chapter reviews the advantages and limitations of the different methods of quantifying pathological lesions in histological sections including estimates of density, frequency, coverage, and the use of semi-quantitative scores. The sampling strategies by which these quantitative measures can be obtained from histological sections, including plot or quadrat sampling, transect sampling, and point-quarter sampling, are described. In addition, data analysis methods commonly used to analyse quantitative data in neuropathology, including analysis of variance (ANOVA), polynomial curve fitting, multiple regression, classification trees, and principal components analysis (PCA), are discussed. These methods are illustrated with reference to quantitative studies of a variety of neurodegenerative disorders.

Relevance:

100.00%

Publisher:

Abstract:

Diabetes patients may suffer from poor health, long-term treatment and chronic complications. Reducing the hospitalization rate is a crucial problem for health care centers. This study combines the bagging method, using decision trees as base classifiers, with cost-sensitive analysis to classify diabetes patients. Real patient data collected from a regional hospital in Thailand were analyzed. The relevant factors were selected and used to construct base decision tree models to classify diabetes and non-diabetes patients. The bagging method was then applied to improve accuracy. Finally, asymmetric classification cost matrices were used to provide alternative models for diabetes data analysis.

Relevance:

90.00%

Publisher:

Abstract:

This paper describes how the statistical technique of cluster analysis and the machine learning technique of rule induction can be combined to explore a database. The ways in which such an approach alleviates the problems associated with other techniques for data analysis are discussed. We report the results of experiments carried out on a database from the medical diagnosis domain. Finally we describe the future developments which we plan to carry out to build on our current work.

Relevance:

90.00%

Publisher:

Abstract:

Principal component analysis (PCA) is a ubiquitous technique for data analysis and processing, but one which is not based upon a probability model. In this paper we demonstrate how the principal axes of a set of observed data vectors may be determined through maximum-likelihood estimation of parameters in a latent variable model closely related to factor analysis. We consider the properties of the associated likelihood function, giving an EM algorithm for estimating the principal subspace iteratively, and discuss the advantages conveyed by the definition of a probability density function for PCA.
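A minimal sketch of an EM iteration of this kind follows, assuming the standard probabilistic PCA model t = W x + mu + noise, with isotropic Gaussian noise of variance sigma2 and a standard normal latent x. The function name, the iteration count, the random initialisation and the synthetic data are all assumptions made for illustration; the paper's own formulation may differ in detail.

```python
import numpy as np

def ppca_em(data, q, n_iter=200, seed=0):
    """EM for probabilistic PCA; returns (W, sigma2, mu) with W of shape (d, q)."""
    rng = np.random.default_rng(seed)
    n, d = data.shape
    mu = data.mean(axis=0)                 # ML estimate of the mean
    centered = data - mu
    W = rng.standard_normal((d, q))        # arbitrary starting point
    sigma2 = 1.0
    for _ in range(n_iter):
        # E-step: posterior moments of the latent variables
        M_inv = np.linalg.inv(W.T @ W + sigma2 * np.eye(q))
        Ex = centered @ W @ M_inv          # rows are the posterior means <x_n>
        sum_Exx = n * sigma2 * M_inv + Ex.T @ Ex   # sum over n of <x_n x_n^T>
        # M-step: re-estimate the loading matrix and the noise variance
        W_new = (centered.T @ Ex) @ np.linalg.inv(sum_Exx)
        sigma2 = (
            np.sum(centered ** 2)
            - 2.0 * np.sum(Ex * (centered @ W_new))
            + np.trace(sum_Exx @ W_new.T @ W_new)
        ) / (n * d)
        W = W_new
    return W, sigma2, mu

# Synthetic data (hypothetical): one strong direction plus small isotropic noise
rng = np.random.default_rng(1)
direction = np.array([1.0, 2.0, 2.0]) / 3.0          # unit vector
latent = rng.standard_normal((500, 1))
data = 3.0 * latent * direction + 0.1 * rng.standard_normal((500, 3))

W, sigma2, mu = ppca_em(data, q=1)
```

The columns of W span the estimated principal subspace but are not necessarily orthonormal, so they are best compared with conventional PCA through the subspace they span rather than entry by entry.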

Relevance:

90.00%

Publisher:

Abstract:

This paper examines the source country determinants of FDI into Japan. The paper highlights certain methodological and theoretical weaknesses in the previous literature and offers some explanations for hitherto ambiguous results. Specifically, the paper highlights the importance of panel data analysis, and the identification of fixed effects in the analysis rather than simply pooling the data. Indeed, we argue that many of the results reported elsewhere are a feature of this mis-specification. To this end, pooled, fixed effects and random effects estimates are compared. The results suggest that FDI into Japan is inversely related to trade flows, such that trade and FDI are substitutes. Moreover, the results also suggest that FDI increases with home country political and economic stability. The paper also shows that previously reported results, regarding the importance of exchange rates, relative borrowing costs and labour costs in explaining FDI flows, are sensitive to the econometric specification and estimation approach. The paper also discusses the importance of these results within a policy context. In recent years Japan has sought to attract FDI, though many firms still complain of barriers to inward investment penetration in Japan. The results show that cultural and geographic distance are only of marginal importance in explaining FDI, and that the results are consistent with the market-seeking explanation of FDI. As such, the attitude to risk in the source country is strongly related to the size of FDI flows to Japan. © 2007 The Authors Journal compilation © 2007 Blackwell Publishing Ltd.
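The difference between pooling the data and identifying fixed effects can be illustrated with a single regressor. In the sketch below, the pooled estimate ignores country-specific intercepts, while the fixed-effects (within) estimate removes them by demeaning each country's observations. The function names and the synthetic panel are hypothetical; they do not reproduce the paper's data, variables or specification.

```python
import numpy as np

def pooled_ols_slope(x, y):
    """Slope from one regression of y on x over all observations."""
    x_c = x - x.mean()
    y_c = y - y.mean()
    return float(x_c @ y_c / (x_c @ x_c))

def fixed_effects_slope(x, y, groups):
    """Within estimator: demean x and y inside each group, then regress."""
    x_w = np.asarray(x, dtype=float).copy()
    y_w = np.asarray(y, dtype=float).copy()
    for g in np.unique(groups):
        mask = groups == g
        x_w[mask] -= x_w[mask].mean()
        y_w[mask] -= y_w[mask].mean()
    return float(x_w @ y_w / (x_w @ x_w))

# Synthetic panel (hypothetical): 3 source countries, 4 periods each.
# True within-country slope is -0.5; country intercepts rise with x.
groups = np.repeat(np.arange(3), 4)
x = 10.0 * groups + np.tile(np.arange(4.0), 3)
y = 20.0 * groups - 0.5 * x

slope_pooled = pooled_ols_slope(x, y)
slope_fe = fixed_effects_slope(x, y, groups)
```

In this constructed example the pooled slope is positive even though the relationship within each country is negative, which shows how pooling can drive a result when the unobserved country effects are correlated with the regressor.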

Relevance:

90.00%

Publisher:

Abstract:

Analysis of variance (ANOVA) is the most efficient method available for the analysis of experimental data. Analysis of variance is a method of considerable complexity and subtlety, with many different variations, each of which applies in a particular experimental context. Hence, it is possible to apply the wrong type of ANOVA to data and, therefore, to draw an erroneous conclusion from an experiment. This article reviews the types of ANOVA most likely to arise in clinical experiments in optometry including the one-way ANOVA ('fixed' and 'random effect' models), two-way ANOVA in randomised blocks, three-way ANOVA, and factorial experimental designs (including the varieties known as 'split-plot' and 'repeated measures'). For each ANOVA, the appropriate experimental design is described, a statistical model is formulated, and the advantages and limitations of each type of design are discussed. In addition, the problems of non-conformity to the statistical model and determination of the number of replications are considered. © 2002 The College of Optometrists.