867 results for Nonparametric discriminant analysis
Abstract:
A novel combined near- and mid-infrared (NIR and MIR) spectroscopic method has been researched and developed for the analysis of complex substances such as the Traditional Chinese Medicine (TCM), Illicium verum Hook. F. (IVHF), and its noxious adulterant, Illicium lanceolatum A.C. Smith (ILACS). Three types of spectral matrix were submitted for classification with the use of the linear discriminant analysis (LDA) method. The data were pretreated with either the successive projections algorithm (SPA) or the discrete wavelet transform (DWT) method. The SPA method performed somewhat better, principally because it required fewer spectral features for its pretreatment model. Thus, the NIR and MIR matrices, as well as the combined NIR/MIR matrix, were pretreated by the SPA method and then analysed by LDA. This approach enabled the prediction and classification of the IVHF, ILACS and mixed samples. The MIR spectral data produced somewhat better classification rates than the NIR data. However, the best results were obtained from the combined NIR/MIR data matrix with 95–100% correct classifications for calibration, validation and prediction. Principal component analysis (PCA) of the three types of spectral data supported the results obtained with the LDA classification method.
Abstract:
A novel near-infrared spectroscopy (NIRS) method has been researched and developed for the simultaneous analyses of the chemical components and associated properties of mint (Mentha haplocalyx Briq.) tea samples. The common analytes were: total polysaccharide content, total flavonoid content, total phenolic content, and total antioxidant activity. To resolve the NIRS data matrix for such analyses, least squares support vector machines was found to be the best chemometrics method for prediction, although it was closely followed by the radial basis function/partial least squares model. Interestingly, the commonly used partial least squares was unsatisfactory in this case. Additionally, principal component analysis and hierarchical cluster analysis were able to distinguish the mint samples according to their four geographical provinces of origin, and this was further facilitated with the use of the chemometrics classification methods: K-nearest neighbors, linear discriminant analysis, and partial least squares discriminant analysis. In general, given the potential savings in sampling and analysis time as well as in the costs of special analytical reagents required for the standard individual methods, NIRS offered a very attractive alternative for the simultaneous analysis of mint samples.
Abstract:
Solid materials can exist in different physical structures without a change in chemical composition. This phenomenon, known as polymorphism, has several implications on pharmaceutical development and manufacturing. Various solid forms of a drug can possess different physical and chemical properties, which may affect processing characteristics and stability, as well as the performance of a drug in the human body. Therefore, knowledge and control of the solid forms is fundamental to maintain safety and high quality of pharmaceuticals. During manufacture, harsh conditions can give rise to unexpected solid phase transformations and therefore change the behavior of the drug. Traditionally, pharmaceutical production has relied on time-consuming off-line analysis of production batches and finished products. This has led to poor understanding of processes and drug products. Therefore, new powerful methods that enable real time monitoring of pharmaceuticals during manufacturing processes are greatly needed. The aim of this thesis was to apply spectroscopic techniques to solid phase analysis within different stages of drug development and manufacturing, and thus, provide a molecular level insight into the behavior of active pharmaceutical ingredients (APIs) during processing. Applications to polymorph screening and different unit operations were developed and studied. A new approach to dissolution testing, which involves simultaneous measurement of drug concentration in the dissolution medium and in-situ solid phase analysis of the dissolving sample, was introduced and studied. Solid phase analysis was successfully performed during different stages, enabling a molecular level insight into the occurring phenomena. Near-infrared (NIR) spectroscopy was utilized in screening of polymorphs and processing-induced transformations (PITs). Polymorph screening was also studied with NIR and Raman spectroscopy in tandem. 
Quantitative solid phase analysis during fluidized bed drying was performed with in-line NIR and Raman spectroscopy and partial least squares (PLS) regression, and different dehydration mechanisms were studied using in-situ spectroscopy and partial least squares discriminant analysis (PLS-DA). In-situ solid phase analysis with Raman spectroscopy during dissolution testing enabled analysis of dissolution as a whole, and provided a scientific explanation for changes in the dissolution rate. It was concluded that the methods applied and studied provide better process understanding and knowledge of the drug products, and therefore, a way to achieve better quality.
Abstract:
The use of near infrared (NIR) hyperspectral imaging and hyperspectral image analysis for distinguishing between hard, intermediate and soft maize kernels from inbred lines was evaluated. NIR hyperspectral images of two sets (12 and 24 kernels) of whole maize kernels were acquired using a Spectral Dimensions MatrixNIR camera with a spectral range of 960-1662 nm and a sisuChema SWIR (short wave infrared) hyperspectral pushbroom imaging system with a spectral range of 1000-2498 nm. Exploratory principal component analysis (PCA) was used on absorbance images to remove background, bad pixels and shading. On the cleaned images, PCA could be used effectively to find histological classes including glassy (hard) and floury (soft) endosperm. PCA illustrated a distinct difference between glassy and floury endosperm along principal component (PC) three on the MatrixNIR and PC two on the sisuChema with two distinguishable clusters. Subsequently, partial least squares discriminant analysis (PLS-DA) was applied to build a classification model. The PLS-DA model from the MatrixNIR image (12 kernels) resulted in a root mean square error of prediction (RMSEP) value of 0.18. This was repeated on the MatrixNIR image of the 24 kernels, which resulted in an RMSEP of 0.18. The sisuChema image yielded an RMSEP value of 0.29. The reproducible results obtained with the different data sets indicate that the method proposed in this paper has real potential for future classification uses.
Abstract:
Current earthquake early warning systems usually make magnitude and location predictions and send out a warning to the users based on those predictions. We describe an algorithm that assesses the validity of the predictions in real-time. Our algorithm monitors the envelopes of horizontal and vertical acceleration, velocity, and displacement. We compare the observed envelopes with the ones predicted by Cua & Heaton's envelope ground motion prediction equations (Cua 2005). We define a "test function" as the logarithm of the ratio between observed and predicted envelopes at every second in real-time. Once the envelopes deviate beyond an acceptable threshold, we declare a misfit. Kurtosis and skewness of a time evolving test function are used to rapidly identify a misfit. Real-time kurtosis and skewness calculations are also inputs to both probabilistic (Logistic Regression and Bayesian Logistic Regression) and nonprobabilistic (Least Squares and Linear Discriminant Analysis) models that ultimately decide if there is an unacceptable level of misfit. This algorithm is designed to work at a wide range of amplitude scales. When tested with synthetic and actual seismic signals from past events, it works for both small and large events.
Semantic Discriminant mapping for classification and browsing of remote sensing textures and objects
Abstract:
We present a new approach based on Discriminant Analysis to map a high dimensional image feature space onto a subspace which has the following advantages: 1. each dimension corresponds to a semantic likelihood, 2. an efficient and simple multiclass classifier is proposed and 3. it is low dimensional. This mapping is learnt from a given set of labeled images with a class groundtruth. In the new space a classifier is naturally derived which performs as well as a linear SVM. We will show that projecting images in this new space provides a database browsing tool which is meaningful to the user. Results are presented on a remote sensing database with eight classes, made available online. The output semantic space is a low dimensional feature space which opens perspectives for other recognition tasks. © 2005 IEEE.
Abstract:
BACKGROUND: Nonparametric Bayesian techniques have been developed recently to extend the sophistication of factor models, allowing one to infer the number of appropriate factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP), the Indian Buffet Process (IBP), and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB) analysis. RESULTS: Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV), Rhino virus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB) approval from Duke University. Comparisons are made between several alternative means of performing nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD), closely related non-Bayesian approaches. CONCLUSIONS: Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data.
Abstract:
This paper introduces a new technique for palmprint recognition based on Fisher Linear Discriminant Analysis (FLDA) and a Gabor filter bank. This method involves convolving a palmprint image with a bank of Gabor filters at different scales and rotations for robust palmprint feature extraction. Once these features are extracted, FLDA is applied for dimensionality reduction and class separability. Since the palmprint features are derived from the principal lines, wrinkles and texture along the palm area, one should carefully consider this fact when selecting the appropriate palm region for the feature extraction process in order to enhance recognition accuracy. To address this problem, an improved region of interest (ROI) extraction algorithm is introduced. This algorithm allows for an efficient extraction of the whole palm area by ignoring all the undesirable parts, such as the fingers and background. Experiments have shown that the proposed method performs well, as evidenced by an Equal Error Rate (EER) of 0.03%.
Abstract:
Brain tissue from so-called Alzheimer's disease (AD) mouse models has previously been examined using H-1 NMR-metabolomics, but comparable information concerning human AD is negligible. Since no animal model recapitulates all the features of human AD, we undertook the first H-1 NMR-metabolomics investigation of human AD brain tissue. Human post-mortem tissue from 15 AD subjects and 15 age-matched controls was prepared for analysis through a series of lyophilisation, milling, extraction and randomisation steps, and samples were analysed using H-1 NMR. Using partial least squares discriminant analysis, a model was built using data obtained from brain extracts. Analysis of brain extracts led to the elucidation of 24 metabolites. Significant elevations in brain alanine (15.4 %) and taurine (18.9 %) were observed in AD patients (p ≤ 0.05). Pathway topology analysis implicated either dysregulation of taurine and hypotaurine metabolism or alanine, aspartate and glutamate metabolism. Furthermore, screening of metabolites for AD biomarkers demonstrated that individual metabolites weakly discriminated cases of AD [receiver operating characteristic (ROC) AUC <0.67; p < 0.05]. However, paired metabolite ratios (e.g. alanine/carnitine) were more powerful discriminating tools (ROC AUC = 0.76; p < 0.01). This study further demonstrates the potential of metabolomics for elucidating the underlying biochemistry and helping to identify AD in patients attending the memory clinic.
Abstract:
Statistics are regularly used to make some form of comparison between trace evidence or deploy the exclusionary principle (Morgan and Bull, 2007) in forensic investigations. Trace evidence routinely consists of the results of particle size, chemical or modal analyses and as such constitutes compositional data. The issue is that compositional data, including percentages, parts per million etc., only carry relative information. This may be problematic where a comparison of percentages and other constrained/closed data is deemed a statistically valid and appropriate way to present trace evidence in a court of law. Notwithstanding an awareness of the existence of the constant sum problem since the seminal works of Pearson (1896) and Chayes (1960) and the introduction of the application of log-ratio techniques (Aitchison, 1986; Pawlowsky-Glahn and Egozcue, 2001; Pawlowsky-Glahn and Buccianti, 2011; Tolosana-Delgado and van den Boogaart, 2013), the problem that a constant sum destroys the potential independence of variances and covariances required for correlation regression analysis and empirical multivariate methods (principal component analysis, cluster analysis, discriminant analysis, canonical correlation) is all too often not acknowledged in the statistical treatment of trace evidence. Yet the need for a robust treatment of forensic trace evidence analyses is obvious. This research examines the issues and potential pitfalls for forensic investigators if the constant sum constraint is ignored in the analysis and presentation of forensic trace evidence. Forensic case studies involving particle size and mineral analyses as trace evidence are used to demonstrate the use of a compositional data approach using a centred log-ratio (clr) transformation and multivariate statistical analyses.
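The centred log-ratio transformation named in this abstract can be sketched in a few lines. This is a generic illustration of the standard definition, clr(x)_i = ln(x_i) - mean_j ln(x_j), and not the code used in the study; the function names and the choice to reject zero parts outright (rather than impute them) are assumptions.

```python
import numpy as np

def closure(x):
    """Rescale each composition (row) so that its parts sum to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def clr(x):
    """Centred log-ratio transform of one or more compositions.

    Each part is divided by the geometric mean of its row before taking
    logs, which is the same as subtracting the row mean of the logs.
    Zero or negative parts are rejected here; how to handle zeros is a
    separate modelling decision that this sketch does not address.
    """
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("clr requires strictly positive parts")
    logx = np.log(x)
    return logx - logx.mean(axis=-1, keepdims=True)
```

Because only ratios between parts enter the result, `clr` gives the same values whether a sample is expressed in percentages, proportions or parts per million, and each transformed row sums to zero. The transformed values can then be passed to principal component, cluster or discriminant analysis.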
Abstract:
Master's dissertation, Qualidade em Análises, Faculdade de Ciências e Tecnologia, Univ. do Algarve, 2013
Abstract:
Short summary: This study was undertaken to assess the diversity of plant resources utilized by the local population in south-western Madagascar, the social, ecological and biophysical conditions that drive their uses and availability, and possible alternative strategies for their sustainable use in the region. The study region, the ‘Mahafaly region’, located in south-western Madagascar, is one of the country’s most economically, educationally and climatically disadvantaged regions. With an arid steppe climate, agricultural production is limited by low water availability and a low level of soil nutrients and soil organic carbon. The region comprises the recently extended Tsimanampetsotsa National Park, with numerous sacred and community forests, which are threatened by slash and burn agriculture and overexploitation of forest resources. The present study analyzed the availability of wild yams and medicinal plants, and their importance for the livelihood of the local population in this region. An ethnobotanical survey was conducted recording the diversity, local knowledge and use of wild yams and medicinal plants utilized by the local communities in five villages in the Mahafaly region. 250 households were randomly selected, followed by semi-structured interviews on the socio-economic characteristics of the households. Data allowed us to characterize sociocultural and socioeconomic factors that determine the local use of wild yams and medicinal plants, and to identify their role in the livelihoods of local people. Species-environment relationships and the current spatial distribution of the wild yams were investigated and predicted using ordination methods and a niche based habitat modelling approach. Species response curves along edaphic gradients allowed us to understand the species requirements on habitat conditions.
We thus investigated various alternative methods to enhance wild yam regeneration for local conservation and sustainable use in the Mahafaly region. Altogether, six species of wild yams and a total of 214 medicinal plant species from 68 families and 163 genera were identified in the study region. Results of the cluster and discriminant analysis indicated a clear pattern of resource use, resulting in two groups of households characterized by differences in livestock numbers, off-farm activities, agricultural land and harvests. A generalized linear model highlighted that economic factors significantly affect the collection intensity of wild yams, while the use of medicinal plants depends to a higher degree on socio-cultural factors. The gradient analysis on the distribution of the wild yam species revealed a clear pattern for species habitats. Species models based on NPMR (Nonparametric Multiplicative Regression analysis) indicated the importance of vegetation structure, human interventions, and soil characteristics in determining wild yam species distribution. The prediction of the current availability of wild yam resources showed that abundant wild yam resources are scarce and face high harvest intensity. Experiments on yam cultivation revealed that germination of seeds was enhanced by using pre-germination treatments before planting, and that vegetative regeneration performed better with the upper part of the tubers (corms) than with the sets of tubers. In-situ regeneration was possible for the upper parts of the wild tubers, but the success depended significantly on the type of soil. The use of manure (10-20 t ha⁻¹) increased the yield of D. alata and D. alatipes by 40%. We thus suggest the promotion of other cultivated varieties of D. alata found in regions neighbouring the Mahafaly Plateau.
Abstract:
The species related to Vriesea paraibica (Bromeliaceae, Tillandsioideae) have controversial taxonomic limits. For several decades, this group has been identified in herbarium collections as V. x morreniana, an artificial hybrid that does not grow in natural habitats. The aim of this study was to assess the morphological variation in the V. paraibica complex through morphometric analyses of natural populations. Two sets of analyses were performed: the first involved six natural populations (G1) and the second was carried out on taxa that emerged from the first analysis, but using material from herbarium collections (G2). Univariate ANOVA was used, as well as discriminant analysis of 16 morphometric variables in G1 and 18 in G2. The results of the analyses of the two groups were similar and led to the selection of diagnostic traits of four species. Lengths of the lower and median floral bracts were significant for the separation of red and yellow floral bracts. Vriesea paraibica and V. interrogatoria have red bracts; these two species are differentiated by the widths of the lower and median portions of the inflorescence and by scape length. These structures are larger in the former and smaller in the latter. Of the species with yellow floral bracts, V. eltoniana is distinguished by longer leaf blades and scapes and V. flava is characterized by its shorter sepal lengths. © 2009 The Linnean Society of London, Botanical Journal of the Linnean Society, 2009, 159, 163-181.
Abstract:
This work presents a novel approach to increase the recognition power of Multiscale Fractal Dimension (MFD) techniques when applied to image classification. The proposal uses Functional Data Analysis (FDA) with the aim of enhancing the precision of the MFD technique, achieving a more representative descriptor vector capable of recognizing and characterizing objects in an image more precisely. FDA is applied to signatures extracted by using the Bouligand-Minkowski MFD technique to generate a descriptor vector from them. To evaluate the improvement obtained, an experiment using two datasets of objects was carried out: a dataset of character shapes (26 characters of the Latin alphabet) carrying different levels of controlled noise, and a dataset of fish image contours. A comparison with the well-known Fourier and wavelet descriptor methods was performed with the aim of verifying the performance of the FDA method. The descriptor vectors were submitted to the Linear Discriminant Analysis (LDA) classification method, and we compared the rate of correct classification among the descriptor methods. The results demonstrate that FDA outperforms the literature methods (Fourier and wavelets) in the processing of information extracted from the MFD signature. In this way, the proposed method can be considered an interesting choice for pattern recognition and image classification using fractal analysis.
Abstract:
A new method for the characterization and analysis of aggregate particles in asphaltic mixtures is reported. By relying on multiscale representation of the particles, curvature estimation, and discriminant analysis for optimal separation of the categories of mixtures, a particularly effective and comprehensive methodology is obtained. The potential of the methodology is illustrated with respect to three important types of particles used in asphaltic mixtures, namely basalt, gabbro, and gravel. The obtained results show that gravel particles are markedly distinct from the other two types of particles, with the gabbro category showing intermediate geometrical properties. The importance of each considered measurement in the discrimination between the three categories of particles was also quantified in terms of the adopted discriminant analysis.