46 resultados para multivariate discriminant analysis

em QUB Research Portal - Research Directory and Institutional Repository for Queen's University Belfast


Relevância:

100.00% 100.00%

Publicador:

Resumo:

A geostatistical version of the classical Fisher rule (linear discriminant analysis) is presented.This method is applicable when a large dataset of multivariate observations is available within a domain split in several known subdomains, and it assumes that the variograms (or covariance functions) are comparable between subdomains, which only differ in the mean values of the available variables. The method consists on finding the eigen-decomposition of the matrix W-1B, where W is the matrix of sills of all direct- and cross-variograms, and B is the covariance matrix of the vectors of weighted means within each subdomain, obtained by generalized least squares. The method is used to map peat blanket occurrence in Northern Ireland, with data from the Tellus
survey, which requires a minimal change to the general recipe: to use compositionally-compliant variogram tools and models, and work with log-ratio transformed data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This paper introduces the application of linear multivariate statistical techniques, including partial least squares (PLS), canonical correlation analysis (CCA) and reduced rank regression (RRR), into the area of Systems Biology. This new approach aims to extract the important proteins embedded in complex signal transduction pathway models.The analysis is performed on a model of intracellular signalling along the janus-associated kinases/signal transducers and transcription factors (JAK/STAT) and mitogen activated protein kinases (MAPK) signal transduction pathways in interleukin-6 (IL6) stimulated hepatocytes, which produce signal transducer and activator of transcription factor 3 (STAT3).A region of redundancy within the MAPK pathway that does not affect the STAT3 transcription was identified using CCA. This is the core finding of this analysis and cannot be obtained by inspecting the model by eye. In addition, RRR was found to isolate terms that do not significantly contribute to changes in protein concentrations, while the application of PLS does not provide such a detailed picture by virtue of its construction.This analysis has a similar objective to conventional model reduction techniques with the advantage of maintaining the meaning of the states prior to and after the reduction process. A significant model reduction is performed, with a marginal loss in accuracy, offering a more concise model while maintaining the main influencing factors on the STAT3 transcription.The findings offer a deeper understanding of the reaction terms involved, confirm the relevance of several proteins to the production of Acute Phase Proteins and complement existing findings regarding cross-talk between the two signalling pathways.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

A compositional multivariate approach is used to analyse regional scale soil geochemical data obtained as part of the Tellus Project generated by the Geological Survey Northern Ireland (GSNI). The multi-element total concentration data presented comprise XRF analyses of 6862 rural soil samples collected at 20cm depths on a non-aligned grid at one site per 2 km2. Censored data were imputed using published detection limits. Using these imputed values for 46 elements (including LOI), each soil sample site was assigned to the regional geology map provided by GSNI initially using the dominant lithology for the map polygon. Northern Ireland includes a diversity of geology representing a stratigraphic record from the Mesoproterozoic, up to and including the Palaeogene. However, the advance of ice sheets and their meltwaters over the last 100,000 years has left at least 80% of the bedrock covered by superficial deposits, including glacial till and post-glacial alluvium and peat. The question is to what extent the soil geochemistry reflects the underlying geology or superficial deposits. To address this, the geochemical data were transformed using centered log ratios (clr) to observe the requirements of compositional data analysis and avoid closure issues. Following this, compositional multivariate techniques including compositional Principal Component Analysis (PCA) and minimum/maximum autocorrelation factor (MAF) analysis method were used to determine the influence of underlying geology on the soil geochemistry signature. PCA showed that 72% of the variation was determined by the first four principal components (PC’s) implying “significant” structure in the data. Analysis of variance showed that only 10 PC’s were necessary to classify the soil geochemical data. To consider an improvement over PCA that uses the spatial relationships of the data, a classification based on MAF analysis was undertaken using the first 6 dominant factors. Understanding the relationship between soil geochemistry and superficial deposits is important for environmental monitoring of fragile ecosystems such as peat. To explore whether peat cover could be predicted from the classification, the lithology designation was adapted to include the presence of peat, based on GSNI superficial deposit polygons and linear discriminant analysis (LDA) undertaken. Prediction accuracy for LDA classification improved from 60.98% based on PCA using 10 principal components to 64.73% using MAF based on the 6 most dominant factors. The misclassification of peat may reflect degradation of peat covered areas since the creation of superficial deposit classification. Further work will examine the influence of underlying lithologies on elemental concentrations in peat composition and the effect of this in classification analysis.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Statistics are regularly used to make some form of comparison between trace evidence or deploy the exclusionary principle (Morgan and Bull, 2007) in forensic investigations. Trace evidence are routinely the results of particle size, chemical or modal analyses and as such constitute compositional data. The issue is that compositional data including percentages, parts per million etc. only carry relative information. This may be problematic where a comparison of percentages and other constraint/closed data is deemed a statistically valid and appropriate way to present trace evidence in a court of law. Notwithstanding an awareness of the existence of the constant sum problem since the seminal works of Pearson (1896) and Chayes (1960) and the introduction of the application of log-ratio techniques (Aitchison, 1986; Pawlowsky-Glahn and Egozcue, 2001; Pawlowsky-Glahn and Buccianti, 2011; Tolosana-Delgado and van den Boogaart, 2013) the problem that a constant sum destroys the potential independence of variances and covariances required for correlation regression analysis and empirical multivariate methods (principal component analysis, cluster analysis, discriminant analysis, canonical correlation) is all too often not acknowledged in the statistical treatment of trace evidence. Yet the need for a robust treatment of forensic trace evidence analyses is obvious. This research examines the issues and potential pitfalls for forensic investigators if the constant sum constraint is ignored in the analysis and presentation of forensic trace evidence. Forensic case studies involving particle size and mineral analyses as trace evidence are used to demonstrate the use of a compositional data approach using a centred log-ratio (clr) transformation and multivariate statistical analyses.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper introduces a new technique for palmprint recognition based on Fisher Linear Discriminant Analysis (FLDA) and Gabor filter bank. This method involves convolving a palmprint image with a bank of Gabor filters at different scales and rotations for robust palmprint features extraction. Once these features are extracted, FLDA is applied for dimensionality reduction and class separability. Since the palmprint features are derived from the principal lines, wrinkles and texture along the palm area. One should carefully consider this fact when selecting the appropriate palm region for the feature extraction process in order to enhance recognition accuracy. To address this problem, an improved region of interest (ROI) extraction algorithm is introduced. This algorithm allows for an efficient extraction of the whole palm area by ignoring all the undesirable parts, such as the fingers and background. Experiments have shown that the proposed method yields attractive performances as evidenced by an Equal Error Rate (EER) of 0.03%.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

A study combining high resolution mass spectrometry (liquid chromatography-quadrupole time-of-flight-mass spectrometry, UPLC-QTof-MS) and chemometrics for the analysis of post-mortem brain tissue from subjects with Alzheimer’s disease (AD) (n = 15) and healthy age-matched controls (n = 15) was undertaken. The huge potential of this metabolomics approach for distinguishing AD cases is underlined by the correct prediction of disease status in 94–97% of cases. Predictive power was confirmed in a blind test set of 60 samples, reaching 100% diagnostic accuracy. The approach also indicated compounds significantly altered in concentration following the onset of human AD. Using orthogonal partial least-squares discriminant analysis (OPLS-DA), a multivariate model was created for both modes of acquisition explaining the maximum amount of variation between sample groups (Positive Mode-R2 = 97%; Q2 = 93%; root mean squared error of validation (RMSEV) = 13%; Negative Mode-R2 = 99%; Q2 = 92%; RMSEV = 15%). In brain extracts, 1264 and 1457 ions of interest were detected for the different modes of acquisition (positive and negative, respectively). Incorporation of gender into the model increased predictive accuracy and decreased RMSEV values. High resolution UPLC-QTof-MS has not previously been employed to biochemically profile post-mortem brain tissue, and the novel methods described and validated herein prove its potential for making new discoveries related to the etiology, pathophysiology, and treatment of degenerative brain disorders.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Brain tissue from so-called Alzheimer's disease (AD) mouse models has previously been examined using H-1 NMR-metabolomics, but comparable information concerning human AD is negligible. Since no animal model recapitulates all the features of human AD we undertook the first H-1 NMR-metabolomics investigation of human AD brain tissue. Human post-mortem tissue from 15 AD subjects and 15 age-matched controls was prepared for analysis through a series of lyophilised, milling, extraction and randomisation steps and samples were analysed using H-1 NMR. Using partial least squares discriminant analysis, a model was built using data obtained from brain extracts. Analysis of brain extracts led to the elucidation of 24 metabolites. Significant elevations in brain alanine (15.4 %) and taurine (18.9 %) were observed in AD patients (p ≤ 0.05). Pathway topology analysis implicated either dysregulation of taurine and hypotaurine metabolism or alanine, aspartate and glutamate metabolism. Furthermore, screening of metabolites for AD biomarkers demonstrated that individual metabolites weakly discriminated cases of AD [receiver operating characteristic (ROC) AUC <0.67; p < 0.05]. However, paired metabolites ratios (e.g. alanine/carnitine) were more powerful discriminating tools (ROC AUC = 0.76; p < 0.01). This study further demonstrates the potential of metabolomics for elucidating the underlying biochemistry and to help identify AD in patients attending the memory clinic

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Abstract Honey is a high value food commodity with recognized nutraceutical properties. A primary driver of the value of honey is its floral origin. The feasibility of applying multivariate data analysis to various chemical parameters for the discrimination of honeys was explored. This approach was applied to four authentic honeys with different floral origins (rata, kamahi, clover and manuka) obtained from producers in New Zealand. Results from elemental profiling, stable isotope analysis, metabolomics (UPLC-QToF MS), and NIR, FT-IR, and Raman spectroscopic fingerprinting were analyzed. Orthogonal partial least square discriminant analysis (OPLS-DA) was used to determine which technique or combination of techniques provided the best classification and prediction abilities. Good prediction values were achieved using metabolite data (for all four honeys, Q2 = 0.52; for manuka and clover, Q2 = 0.76) and the trace element/isotopic data (for manuka and clover, Q2 = 0.65), while the other chemical parameters showed promise when combined (for manuka and clover, Q2 = 0.43).

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The histological grading of cervical intraepithelial neoplasia (CIN) remains subjective, resulting in inter- and intra-observer variation and poor reproducibility in the grading of cervical lesions. This study has attempted to develop an objective grading system using automated machine vision. The architectural features of cervical squamous epithelium are quantitatively analysed using a combination of computerized digital image processing and Delaunay triangulation analysis; 230 images digitally captured from cases previously classified by a gynaecological pathologist included normal cervical squamous epithelium (n = 30), koilocytosis (n = 46), CIN 1 (n = 52), CIN 2 (n = 56), and CIN 3 (n=46). Intra- and inter-observer variation had kappa values of 0.502 and 0.415, respectively. A machine vision system was developed in KS400 macro programming language to segment and mark the centres of all nuclei within the epithelium. By object-oriented analysis of image components, the positional information of nuclei was used to construct a Delaunay triangulation mesh. Each mesh was analysed to compute triangle dimensions including the mean triangle area, the mean triangle edge length, and the number of triangles per unit area, giving an individual quantitative profile of measurements for each case. Discriminant analysis of the geometric data revealed the significant discriminatory variables from which a classification score was derived. The scoring system distinguished between normal and CIN 3 in 98.7% of cases and between koilocytosis and CIN 1 in 76.5% of cases, but only 62.3% of the CIN cases were classified into the correct group, with the CIN 2 group showing the highest rate of misclassification. Graphical plots of triangulation data demonstrated the continuum of morphological change from normal squamous epithelium to the highest grade of CIN, with overlapping of the groups originally defined by the pathologists. This study shows that automated location of nuclei in cervical biopsies using computerized image analysis is possible. Analysis of positional information enables quantitative evaluation of architectural features in CIN using Delaunay triangulation meshes, which is effective in the objective classification of CIN. This demonstrates the future potential of automated machine vision systems in diagnostic histopathology. Copyright (C) 2000 John Wiley and Sons, Ltd.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Genome-scale metabolic models promise important insights into cell function. However, the definition of pathways and functional network modules within these models, and in the biochemical literature in general, is often based on intuitive reasoning. Although mathematical methods have been proposed to identify modules, which are defined as groups of reactions with correlated fluxes, there is a need for experimental verification. We show here that multivariate statistical analysis of the NMR-derived intra- and extracellular metabolite profiles of single-gene deletion mutants in specific metabolic pathways in the yeast Saccharomyces cerevisiae identified outliers whose profiles were markedly different from those of the other mutants in their respective pathways. Application of flux coupling analysis to a metabolic model of this yeast showed that the deleted gene in an outlying mutant encoded an enzyme that was not part of the same functional network module as the other enzymes in the pathway. We suggest that metabolomic methods such as this, which do not require any knowledge of how a gene deletion might perturb the metabolic network, provide an empirical method for validating and ultimately refining the predicted network structure.