219 results for multivariate classification
Abstract:
This paper describes a novel system for automatic classification of images obtained from Anti-Nuclear Antibody (ANA) pathology tests on Human Epithelial type 2 (HEp-2) cells using the Indirect Immunofluorescence (IIF) protocol. The IIF protocol on HEp-2 cells has been the hallmark method to identify the presence of ANAs, due to its high sensitivity and the large range of antigens that can be detected. However, it suffers from numerous shortcomings, such as being subjective as well as time and labour intensive. Computer Aided Diagnostic (CAD) systems have been developed to address these problems by automatically classifying a HEp-2 cell image into one of its known patterns (e.g. speckled, homogeneous). Most of the existing CAD systems use handpicked features to represent a HEp-2 cell image, which may only work in limited scenarios. We propose a novel automatic cell image classification method termed Cell Pyramid Matching (CPM), which comprises regional histograms of visual words coupled with the Multiple Kernel Learning framework. We present a study of several variations of generating histograms and show the efficacy of the system on two publicly available datasets: the ICPR HEp-2 cell classification contest dataset and the SNPHEp-2 dataset.
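The idea of regional histograms of visual words can be illustrated with a minimal sketch. It assumes each image patch has already been assigned a visual-word index, and it uses a hypothetical layout of one whole-image region plus four quadrants; the region layout used by CPM and the Multiple Kernel Learning step are not shown here, and the function names are illustrative only.

```python
from collections import Counter


def word_histogram(words, vocab_size):
    """Return an L1-normalised histogram of visual-word indices."""
    total = len(words)
    if total == 0:
        return [0.0] * vocab_size
    counts = Counter(words)
    return [counts.get(w, 0) / total for w in range(vocab_size)]


def regional_histograms(word_map, vocab_size):
    """Histograms for the whole map followed by its four quadrants.

    word_map is a 2D list of visual-word indices, one per patch.
    The quadrant layout is an assumption made for illustration.
    """
    rows, cols = len(word_map), len(word_map[0])
    mid_r, mid_c = rows // 2, cols // 2
    regions = [
        (0, rows, 0, cols),        # whole image
        (0, mid_r, 0, mid_c),      # top-left
        (0, mid_r, mid_c, cols),   # top-right
        (mid_r, rows, 0, mid_c),   # bottom-left
        (mid_r, rows, mid_c, cols) # bottom-right
    ]
    histograms = []
    for r0, r1, c0, c1 in regions:
        words = [word_map[r][c] for r in range(r0, r1) for c in range(c0, c1)]
        histograms.append(word_histogram(words, vocab_size))
    return histograms
```

Each regional histogram could then feed its own kernel, with a learned combination of kernels deciding how much each region contributes to the final classification.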
Abstract:
Existing multi-model approaches for image set classification extract local models by clustering each image set individually only once, with fixed clusters used for matching with other image sets. However, this may cause the two closest clusters to represent different characteristics of an object, due to different undesirable environmental conditions (such as variations in illumination and pose). To address this problem, we propose to constrain the clustering of each query image set by forcing the clusters to resemble the clusters in the gallery image sets. We first define a Frobenius norm distance between subspaces over Grassmann manifolds based on reconstruction error. We then extract local linear subspaces from a gallery image set via sparse representation. For each local linear subspace, we adaptively construct the corresponding closest subspace from the samples of a probe image set by joint sparse representation. We show that by minimising the sparse representation reconstruction error, we approach the nearest point on a Grassmann manifold. Experiments on Honda, ETH-80 and Cambridge-Gesture datasets show that the proposed method consistently outperforms several other recent techniques, such as Affine Hull based Image Set Distance (AHISD), Sparse Approximated Nearest Points (SANP) and Manifold Discriminant Analysis (MDA).
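A reconstruction-error distance between two linear subspaces can be sketched as follows. This is an illustrative assumption rather than the paper's exact definition: each subspace is given by an orthonormal basis, and the distance is the Frobenius norm of the residual left when one basis is projected onto the span of the other. The function names are hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def orthonormalise(vectors):
    """Gram-Schmidt: return an orthonormal basis for the span of the vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            coeff = dot(w, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        if norm > 1e-12:  # drop vectors that are linearly dependent
            basis.append([wi / norm for wi in w])
    return basis


def reconstruction_distance(basis_a, basis_b):
    """Frobenius norm of B - A A^T B for orthonormal bases A and B.

    For each basis vector b, the squared residual after projecting onto
    span(A) is ||b||^2 minus the sum of squared inner products with A.
    """
    squared = 0.0
    for b in basis_b:
        squared += dot(b, b) - sum(dot(a, b) ** 2 for a in basis_a)
    return math.sqrt(max(squared, 0.0))
```

Under this sketch, identical subspaces are at distance zero, and two planes in three dimensions that share exactly one direction are at distance one.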
Abstract:
The concentrations of Na, K, Ca, Mg, Ba, Sr, Fe, Al, Mn, Zn, Pb, Cu, Ni, Cr, Co, Se, U and Ti were determined in the osteoderms and/or flesh of estuarine crocodiles (Crocodylus porosus) captured in three adjacent catchments within the Alligator Rivers Region (ARR) of northern Australia. Results from multivariate analysis of variance showed that when all metals were considered simultaneously, catchment effects were significant (P≤0.05). Despite considerable within-catchment variability, linear discriminant analysis (LDA) showed that differences in elemental signatures in the osteoderms and/or flesh of C. porosus amongst the catchments were sufficient to classify individuals accurately to their catchment of occurrence. Using cross-validation, the accuracy of classifying a crocodile to its catchment of occurrence was 76% for osteoderms and 60% for flesh. These data suggest that osteoderms provide better predictive accuracy than flesh for discriminating crocodiles amongst catchments. Combining the osteoderm and flesh results gave no improvement in classification accuracy (67%). Based on the discriminant function coefficients for the osteoderm data, Ca, Co, Mg and U were the most important elements for discriminating amongst the three catchments. For flesh data, Ca, K, Mg, Na, Ni and Pb were the most important metals for discriminating amongst the catchments. Reasons for differences in the elemental signatures of crocodiles between catchments are generally not interpretable, due to limited data on surface water and sediment chemistry of the catchments or chemical composition of dietary items of C. porosus. From a wildlife management perspective, the provenance or source catchment(s) of 'problem' crocodiles captured at settlements or recreational areas along the ARR coastline may be established using catchment-specific elemental signatures.
If the incidence of problem crocodiles can be reduced in settled or recreational areas by effective management at their source, then public safety concerns about these predators may be moderated, as well as the cost of their capture and removal. Copyright © 2002 Elsevier Science B.V.
Abstract:
Multivariate predictive models are widely used tools for assessment of aquatic ecosystem health and models have been successfully developed for the prediction and assessment of aquatic macroinvertebrates, diatoms, local stream habitat features and fish. We evaluated the ability of a modelling method based on the River InVertebrate Prediction and Classification System (RIVPACS) to accurately predict freshwater fish assemblage composition and assess aquatic ecosystem health in rivers and streams of south-eastern Queensland, Australia. The predictive model was developed, validated and tested in a region of comparatively high environmental variability due to the unpredictable nature of rainfall and river discharge. We concluded that the model provided sufficiently accurate and precise predictions of species composition and was sensitive enough to distinguish test sites impacted by several common types of human disturbance (particularly impacts associated with catchment land use and associated local riparian, in-stream habitat and water quality degradation). The total number of fish species available for prediction was low in comparison to similar applications of multivariate predictive models based on other indicator groups, yet the accuracy and precision of our model were comparable to outcomes from such studies. In addition, our model, developed for sites sampled on one occasion and in one season only (winter), was able to accurately predict fish assemblage composition at sites sampled during other seasons and years, provided that they were not subject to unusually extreme environmental conditions (e.g. extended periods of low flow that restricted fish movement or resulted in habitat desiccation and local fish extinctions).
Abstract:
Data in germplasm collections contain a mixture of data types: binary, multistate and quantitative. Given the multivariate nature of these data, the pattern analysis methods of classification and ordination have been identified as suitable techniques for statistically evaluating the available diversity. The proximity (or resemblance) measure, which is in part the basis of the complementary nature of classification and ordination techniques, is often specific to particular data types. The use of a combined resemblance matrix has an advantage over data type specific proximity measures. This measure accommodates the different data types without manipulating them to be of a specific type. Descriptors are partitioned into their data types and an appropriate proximity measure is used on each. The separate proximity matrices, after range standardisation, are added as a weighted average and the combined resemblance matrix is then used for classification and ordination. Germplasm evaluation data for 831 accessions of groundnut (Arachis hypogaea L.) from the Australian Tropical Field Crops Genetic Resource Centre, Biloela, Queensland were examined. Data for four binary, five ordered multistate and seven quantitative descriptors have been documented. The interpretative value of different weightings - equal and unequal weighting of data types to obtain a combined resemblance matrix - was investigated by using principal co-ordinate analysis (ordination) and hierarchical cluster analysis. Equal weighting of data types was found to be more valuable for these data as the results provided a greater insight into the patterns of variability available in the Australian groundnut germplasm collection. The complementary nature of pattern analysis techniques enables plant breeders to identify relevant accessions in relation to the descriptors which distinguish amongst them.
This additional information may provide plant breeders with a more defined entry point into the germplasm collection for identifying sources of variability for their plant improvement program, thus improving the utilisation of germplasm resources.
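The construction of a combined resemblance matrix described in this abstract (one proximity matrix per data type, range standardisation, then a weighted average) can be sketched as below. The two proximity measures, mismatch fraction for binary descriptors and mean absolute difference for quantitative ones, are assumptions chosen for illustration and are not necessarily those used in the study; all function names are hypothetical.

```python
def mismatch_fraction(a, b):
    """Fraction of binary descriptors on which two accessions differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_abs_difference(a, b):
    """Mean absolute difference over quantitative descriptors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def proximity_matrix(rows, measure):
    """Pairwise proximity matrix for one data type."""
    return [[measure(a, b) for b in rows] for a in rows]


def range_standardise(matrix):
    """Rescale all entries of a matrix to the interval [0, 1]."""
    values = [v for row in matrix for v in row]
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in matrix]
    return [[(v - lo) / span for v in row] for row in matrix]


def combine_resemblance(matrices, weights):
    """Weighted average of range-standardised proximity matrices."""
    if len(matrices) != len(weights):
        raise ValueError("one weight is needed per proximity matrix")
    total = sum(weights)
    scaled = [range_standardise(m) for m in matrices]
    n = len(scaled[0])
    return [
        [sum(w * m[i][j] for m, w in zip(scaled, weights)) / total
         for j in range(n)]
        for i in range(n)
    ]
```

Passing equal weights, for example `[1, 1]`, corresponds to equal weighting of data types, while unequal weights let one data type dominate the combined matrix that is then handed to ordination or cluster analysis.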
Abstract:
Time series classification has been extensively explored in many fields of study. Most methods are based on the historical or current information extracted from data. However, if interest is in a specific future time period, methods that directly relate to forecasts of time series are much more appropriate. An approach to time series classification is proposed based on a polarization measure of forecast densities of time series. By fitting autoregressive models, forecast replicates of each time series are obtained via the bias-corrected bootstrap, and a stationarity correction is considered when necessary. Kernel estimators are then employed to approximate forecast densities, and discrepancies of forecast densities of pairs of time series are estimated by a polarization measure, which evaluates the extent to which two densities overlap. Following the distributional properties of the polarization measure, a discriminant rule and a clustering method are proposed to conduct the supervised and unsupervised classification, respectively. The proposed methodology is applied to both simulated and real data sets, and the results show desirable properties.
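The last two steps of this pipeline, kernel estimation of forecast densities and a measure of how much two densities overlap, can be sketched as follows. The sketch assumes the forecast replicates are already available (the autoregressive fit and bias-corrected bootstrap are not shown), and it uses the overlap coefficient, the integral of the pointwise minimum of two densities, as a stand-in; the paper's polarization measure may be defined differently. The bandwidth and grid size are arbitrary illustrative defaults.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a Gaussian kernel density estimate as a callable."""
    scale = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return scale * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def density_overlap(samples_a, samples_b, bandwidth=0.5, grid_points=2001):
    """Overlap coefficient of two kernel density estimates.

    Integrates min(f, g) with the trapezoidal rule on a grid covering both
    samples; values near 1 mean near-identical forecast densities and
    values near 0 mean almost no overlap.
    """
    lo = min(min(samples_a), min(samples_b)) - 5.0 * bandwidth
    hi = max(max(samples_a), max(samples_b)) + 5.0 * bandwidth
    step = (hi - lo) / (grid_points - 1)
    f = gaussian_kde(samples_a, bandwidth)
    g = gaussian_kde(samples_b, bandwidth)
    total = 0.0
    for i in range(grid_points):
        x = lo + i * step
        weight = 0.5 if i in (0, grid_points - 1) else 1.0
        total += weight * min(f(x), g(x))
    return total * step
```

A pairwise matrix of such overlap values could then drive either a discriminant rule or a clustering method, in the spirit of the supervised and unsupervised procedures the abstract mentions.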
Abstract:
A catchment-scale multivariate statistical analysis of hydrochemistry enabled assessment of interactions between alluvial groundwater and Cressbrook Creek, an intermittent drainage system in southeast Queensland, Australia. Hierarchical cluster analyses and principal component analysis were applied to time-series data to evaluate the hydrochemical evolution of groundwater during periods of extreme drought and severe flooding. A simple three-dimensional geological model was developed to conceptualise the catchment morphology and the stratigraphic framework of the alluvium. The alluvium forms a two-layer system with a basal coarse-grained layer overlain by a clay-rich low-permeability unit. In the upper and middle catchment, alluvial groundwater is chemically similar to streamwater, particularly near the creek (reflected by high HCO3/Cl and K/Na ratios and low salinities), indicating a high degree of connectivity. In the lower catchment, groundwater is more saline with lower HCO3/Cl and K/Na ratios, notably during dry periods. Groundwater salinity substantially decreased following severe flooding in 2011, notably in the lower catchment, confirming that flooding is an important mechanism for both recharge and maintaining groundwater quality. The integrated approach used in this study enabled effective interpretation of hydrological processes and can be applied to a variety of hydrological settings to synthesise and evaluate large hydrochemical datasets.
Abstract:
The proliferation of news reports published on websites and the sharing of news among social media users necessitate effective techniques for analysing the image, text and video data related to news topics. This paper presents the first study to classify affective facial images on emerging news topics. The proposed system dynamically monitors and selects the current hot (of great interest) news topics with strong affective interestingness using textual keywords in news articles and social media discussions. Images from the selected hot topics are extracted and classified into three emotion categories (positive, neutral and negative) based on facial expressions of subjects in the images. Performance evaluations on two facial image datasets collected from real-world resources demonstrate the applicability and effectiveness of the proposed system in affective classification of facial images in news reports. Facial expression shows high consistency with the affective textual content in news reports for positive emotion, while only low correlation was observed for neutral and negative emotions. The system can be directly used for applications, such as assisting editors in choosing photos with a proper affective semantic for a certain topic during news report preparation.
Abstract:
Next Generation Sequencing (NGS) has revolutionised molecular biology, resulting in an explosion of data sets and an increasing role in clinical practice. Such applications necessarily require rapid identification of the organism as a prelude to annotation and further analysis. NGS data consist of a substantial number of short sequence reads, given context through downstream assembly and annotation, a process requiring reads consistent with the assumed species or species group. Highly accurate results have been obtained for restricted sets using SVM classifiers, but such methods are difficult to parallelise and success depends on careful attention to feature selection. This work examines the problem at very large scale, using a mix of synthetic and real data with a view to determining the overall structure of the problem and the effectiveness of parallel ensembles of simpler classifiers (principally random forests) in addressing the challenges of large scale genomics.
Abstract:
The growing demand for electricity in New Zealand has led to the construction of new hydro-dams or power stations that have had environmental, social and cultural effects. These effects may drive increases in electricity prices, as such prices reflect the cost of running existing power stations as well as building new ones. This study uses Canterbury and Central Otago as case studies because both regions face similar issues in building new hydro-dams and ever-increasing electricity prices that will eventually prompt households to buy power at higher prices. One way for households to respond to these price changes is to generate their own electricity through microgeneration technologies (MGT). The objective of this study is to investigate public perception and preferences regarding MGT and to analyze the factors that influence people's decision to adopt such new technologies in New Zealand. The study uses a multivariate probit approach to examine households' willingness to adopt any one MGT system or a combination of the MGT systems. Our findings provide valuable information for policy makers and marketers who wish to promote effective microgeneration technologies.
Abstract:
Within the field of Information Systems, a good proportion of research is concerned with the work organisation and this has, to some extent, restricted the kind of application areas given consideration. Yet, it is clear that information and communication technology deployments beyond the work organisation are acquiring increased importance in our lives. With this in mind, we offer a field study of the appropriation of an online play space known as Habbo Hotel. Habbo Hotel, as a site of media convergence, incorporates social networking and digital gaming functionality. Our research highlights the ethical problems such a dual classification of technology may bring. We focus upon a particular set of activities undertaken within and facilitated by the space – scamming. Scammers dupe members with respect to their ‘Furni’, virtual objects that have online and offline economic value. Through our analysis we show that sometimes, online activities are bracketed off from those defined as offline and that this can be related to how the technology is classified by members – as a social networking site and/or a digital game. In turn, this may affect members’ beliefs about rights and wrongs. We conclude that given increasing media convergence, the way forward is to continue the project of educating people regarding the difficulties of determining rights and wrongs, and how rights and wrongs may be acted out with respect to new technologies of play online and offline.