867 results for Nonparametric discriminant analysis
Abstract:
Automatic taxonomic categorisation of 23 species of dinoflagellates was demonstrated using field-collected specimens. These dinoflagellates have been responsible for the majority of toxic and noxious phytoplankton blooms which have occurred in the coastal waters of the European Union in recent years and have a severe impact on the aquaculture industry. The performance of human 'expert' ecologists/taxonomists in identifying these species was compared to that achieved by 2 artificial neural network classifiers (multilayer perceptron and radial basis function networks) and 2 other statistical techniques, k-Nearest Neighbour and Quadratic Discriminant Analysis. The neural network classifiers outperform the classical statistical techniques. Over extended trials, the human experts averaged 85% while the radial basis network achieved a best performance of 83%, the multilayer perceptron 66%, k-Nearest Neighbour 60%, and the Quadratic Discriminant Analysis 56%.
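As an illustration of one of the classical techniques compared in this abstract, the following is a minimal k-Nearest Neighbour sketch. It is not the authors' implementation: the Euclidean distance, the value of k, and the two-feature toy data with made-up class names are all assumptions chosen only to keep the example small.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Pair every training point's distance to the query with its label.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-feature measurements for two made-up classes.
train_features = [
    (1.0, 1.1), (1.2, 0.9), (0.9, 1.0),
    (3.0, 3.2), (3.1, 2.9), (2.8, 3.0),
]
train_labels = [
    "species_a", "species_a", "species_a",
    "species_b", "species_b", "species_b",
]

prediction = knn_predict(train_features, train_labels, (1.1, 1.0), k=3)
print(prediction)
```

A real comparison such as the one reported would use many more features per specimen and held-out test data to estimate accuracy.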
Abstract:
The detection of dense harmful algal blooms (HABs) by satellite remote sensing is usually based on analysis of chlorophyll-a as a proxy. However, this approach does not provide information about the potential harm of a bloom, nor can it identify the dominant species. The developed HAB risk classification method employs a fully automatic data-driven approach to identify key characteristics of water leaving radiances and derived quantities, and to classify pixels into “harmful”, “non-harmful” and “no bloom” categories using Linear Discriminant Analysis (LDA). Discrimination accuracy is increased through the use of spectral ratios of water leaving radiances, absorption and backscattering. To reduce the false alarm rate, data that cannot be reliably classified are automatically labelled as “unknown”. This method can be trained on different HAB species or extended to new sensors and then applied to generate independent HAB risk maps; these can be fused with other sensors to fill gaps or improve spatial or temporal resolution. The HAB discrimination technique has obtained accurate results on MODIS and MERIS data, correctly identifying 89% of Phaeocystis globosa HABs in the southern North Sea and 88% of Karenia mikimotoi blooms in the Western English Channel. A linear transformation of the ocean colour discriminants is used to estimate harmful cell counts, demonstrating greater accuracy than if based on chlorophyll-a; this will facilitate its integration into a HAB early warning system operating in the southern North Sea.
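The "unknown" label described in this abstract can be illustrated with a reject option applied to class probabilities. The abstract does not state which rejection rule was used, so the posterior threshold below, the function name and the example probabilities are assumptions for illustration only.

```python
def classify_with_reject(posteriors, threshold=0.8):
    """Return the most probable class, or "unknown" if it is not probable enough.

    posteriors: dict mapping class name to posterior probability
    threshold: hypothetical minimum probability for a reliable decision
    """
    best_class = max(posteriors, key=posteriors.get)
    if posteriors[best_class] < threshold:
        return "unknown"
    return best_class


# Hypothetical per-pixel class probabilities from a discriminant classifier.
confident_pixel = {"harmful": 0.92, "non-harmful": 0.05, "no bloom": 0.03}
ambiguous_pixel = {"harmful": 0.45, "non-harmful": 0.40, "no bloom": 0.15}

print(classify_with_reject(confident_pixel))
print(classify_with_reject(ambiguous_pixel))
```

Raising the threshold trades coverage for fewer false alarms, which matches the stated motivation for the "unknown" category.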
Abstract:
The histological grading of cervical intraepithelial neoplasia (CIN) remains subjective, resulting in inter- and intra-observer variation and poor reproducibility in the grading of cervical lesions. This study has attempted to develop an objective grading system using automated machine vision. The architectural features of cervical squamous epithelium are quantitatively analysed using a combination of computerized digital image processing and Delaunay triangulation analysis; 230 images digitally captured from cases previously classified by a gynaecological pathologist included normal cervical squamous epithelium (n = 30), koilocytosis (n = 46), CIN 1 (n = 52), CIN 2 (n = 56), and CIN 3 (n=46). Intra- and inter-observer variation had kappa values of 0.502 and 0.415, respectively. A machine vision system was developed in KS400 macro programming language to segment and mark the centres of all nuclei within the epithelium. By object-oriented analysis of image components, the positional information of nuclei was used to construct a Delaunay triangulation mesh. Each mesh was analysed to compute triangle dimensions including the mean triangle area, the mean triangle edge length, and the number of triangles per unit area, giving an individual quantitative profile of measurements for each case. Discriminant analysis of the geometric data revealed the significant discriminatory variables from which a classification score was derived. The scoring system distinguished between normal and CIN 3 in 98.7% of cases and between koilocytosis and CIN 1 in 76.5% of cases, but only 62.3% of the CIN cases were classified into the correct group, with the CIN 2 group showing the highest rate of misclassification. Graphical plots of triangulation data demonstrated the continuum of morphological change from normal squamous epithelium to the highest grade of CIN, with overlapping of the groups originally defined by the pathologists. 
This study shows that automated location of nuclei in cervical biopsies using computerized image analysis is possible. Analysis of positional information enables quantitative evaluation of architectural features in CIN using Delaunay triangulation meshes, which is effective in the objective classification of CIN. This demonstrates the future potential of automated machine vision systems in diagnostic histopathology. Copyright (C) 2000 John Wiley and Sons, Ltd.
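The triangle measurements named in this abstract (mean triangle area, mean edge length, triangles per unit area) can be sketched as follows. This is a hedged illustration, not the KS400 macro: it assumes the triangulation has already been computed and is supplied as index triples, and it counts each shared edge once, which the abstract does not specify.

```python
import math


def triangle_area(a, b, c):
    """Area of a triangle from its vertex coordinates (cross-product formula)."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0


def mesh_profile(points, triangles, region_area):
    """Summarise a triangulated mesh of nucleus centres.

    points: list of (x, y) coordinates
    triangles: list of index triples into points (assumed precomputed)
    region_area: area of the region examined, in the same squared units
    """
    areas = []
    edge_lengths = {}
    for i, j, k in triangles:
        areas.append(triangle_area(points[i], points[j], points[k]))
        for u, v in ((i, j), (j, k), (k, i)):
            edge = (min(u, v), max(u, v))  # shared edges are stored once
            edge_lengths[edge] = math.dist(points[u], points[v])
    return {
        "mean_triangle_area": sum(areas) / len(areas),
        "mean_edge_length": sum(edge_lengths.values()) / len(edge_lengths),
        "triangles_per_unit_area": len(triangles) / region_area,
    }


# Toy mesh: a unit square split into two triangles along one diagonal.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 3)]
profile = mesh_profile(points, triangles, region_area=1.0)
print(profile)
```

A profile like this, computed per case, is the kind of feature vector that a discriminant analysis could then use to separate the diagnostic groups.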
Abstract:
Logistic regression and Gaussian mixture model (GMM) classifiers have been trained to estimate the probability of acute myocardial infarction (AMI) in patients based upon the concentrations of a panel of cardiac markers. The panel consists of two new markers, fatty acid binding protein (FABP) and glycogen phosphorylase BB (GPBB), in addition to the traditional cardiac troponin I (cTnI), creatine kinase MB (CKMB) and myoglobin. The effect of using principal component analysis (PCA) and Fisher discriminant analysis (FDA) to preprocess the marker concentrations was also investigated. The need for classifiers to give an accurate estimate of the probability of AMI is argued and three categories of performance measure are described, namely discriminatory ability, sharpness, and reliability. Numerical performance measures for each category are given and applied. The optimum classifier, based solely upon the samples taken on admission, was the logistic regression classifier using FDA preprocessing. This gave an accuracy of 0.85 (95% confidence interval: 0.78-0.91) and a normalised Brier score of 0.89. When samples at both admission and a further time, 1-6 h later, were included, the performance increased significantly, showing that logistic regression classifiers can indeed use the information from the five cardiac markers to accurately and reliably estimate the probability of AMI. © Springer-Verlag London Limited 2008.
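For readers unfamiliar with the Brier score mentioned in this abstract, the sketch below computes the plain (unnormalised) version, where lower is better. The abstract reports a normalised score but does not say how it was normalised, so no normalisation is attempted here; the probabilities and outcomes are made up.

```python
def brier_score(predicted_probabilities, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    if len(predicted_probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have the same length")
    squared_errors = [
        (p - y) ** 2 for p, y in zip(predicted_probabilities, outcomes)
    ]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical predicted probabilities of AMI and observed outcomes (1 = AMI).
predicted = [0.9, 0.2, 0.7, 0.1]
observed = [1, 0, 1, 0]
score = brier_score(predicted, observed)
print(round(score, 4))
```

A classifier that always predicts 0.5 scores 0.25 on this measure, which gives a rough reference point when reading plain Brier scores.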
Abstract:
The concentration of organic acids in anaerobic digesters is one of the most critical parameters for monitoring and advanced control of anaerobic digestion processes. Thus, a reliable online-measurement system is absolutely necessary. A novel approach to obtaining these measurements indirectly and online using UV/vis spectroscopic probes, in conjunction with powerful pattern recognition methods, is presented in this paper. A UV/vis spectroscopic probe from S::CAN is used in combination with a custom-built dilution system to monitor the absorption of fully fermented sludge over a spectrum from 200 to 750 nm. Advanced pattern recognition methods are then used to map the non-linear relationship between measured absorption spectra and laboratory measurements of organic acid concentrations. Linear discriminant analysis, generalized discriminant analysis (GerDA), support vector machines (SVM), relevance vector machines, random forest and neural networks are investigated for this purpose and their performance compared. To validate the approach, online measurements have been taken at a full-scale 1.3-MW industrial biogas plant. Results show that whereas some of the methods considered do not yield satisfactory results, accurate prediction of organic acid concentration ranges can be obtained with both GerDA and SVM-based classifiers, with classification rates in excess of 87% achieved on test data.
Abstract:
A study combining high resolution mass spectrometry (liquid chromatography-quadrupole time-of-flight-mass spectrometry, UPLC-QTof-MS) and chemometrics for the analysis of post-mortem brain tissue from subjects with Alzheimer’s disease (AD) (n = 15) and healthy age-matched controls (n = 15) was undertaken. The huge potential of this metabolomics approach for distinguishing AD cases is underlined by the correct prediction of disease status in 94–97% of cases. Predictive power was confirmed in a blind test set of 60 samples, reaching 100% diagnostic accuracy. The approach also indicated compounds significantly altered in concentration following the onset of human AD. Using orthogonal partial least-squares discriminant analysis (OPLS-DA), a multivariate model was created for both modes of acquisition explaining the maximum amount of variation between sample groups (Positive Mode-R2 = 97%; Q2 = 93%; root mean squared error of validation (RMSEV) = 13%; Negative Mode-R2 = 99%; Q2 = 92%; RMSEV = 15%). In brain extracts, 1264 and 1457 ions of interest were detected for the different modes of acquisition (positive and negative, respectively). Incorporation of gender into the model increased predictive accuracy and decreased RMSEV values. High resolution UPLC-QTof-MS has not previously been employed to biochemically profile post-mortem brain tissue, and the novel methods described and validated herein prove its potential for making new discoveries related to the etiology, pathophysiology, and treatment of degenerative brain disorders.
Abstract:
In our genome scan for schizophrenia genes in 265 Irish pedigrees, marker D5S818 in 5q22 produced the second best result of the first 223 markers tested (P = 0.002). We then tested an additional 13 markers and the evidence suggests the presence of a vulnerability locus for schizophrenia in region 5q22-31. This region appears to be distinct from those chromosome 5 regions studied in two prior reports, but the same as that producing positive results in the report by Wildenauer and colleagues found elsewhere in this issue. The largest pairwise heterogeneity LOD (H-LOD) score was found with marker D5S393 (max 3.04, P = 0.0005), assuming a narrow phenotypic category, and a genetic model with intermediate heterozygotic liability. In marked contrast to the H-LOD scores from our sample with markers from the regions of interest on chromosomes 6p and 8p, expanding the disease definition to include schizophrenia spectrum or nonspectrum disorders produced substantially smaller scores, with a number of markers failing to yield positive values at any recombination fraction. Using multipoint H-LODs, the strongest evidence for linkage occurs under the narrow phenotypic definition and recessive genetic model, with a peak at marker D5S804 (max 3.35, P = 0.0002). Multipoint nonparametric linkage analysis produced a peak in the same location (max z = 2.84, P = 0.002) with the narrow phenotypic definition. This putative vulnerability locus appears to be segregating in 10-25% of the families studied, but this estimate is tentative. Comparison of individual family multipoint H-LOD scores at the regions of interest on chromosomes 6p, 8p and 5q showed that only a minority of families yield high LOD scores in two or three regions.
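The idea behind a heterogeneity LOD score can be sketched with a deliberately simplified model. The code below assumes phase-known meioses counted per family and an admixture model in which a fraction alpha of families is linked; the study itself analysed full pedigrees under specific genetic models, so this is a teaching sketch with made-up counts, not a reconstruction of the reported analysis.

```python
import math


def likelihood_ratio(recombinants, meioses, theta):
    """Likelihood at recombination fraction theta relative to no linkage (0.5)."""
    if not 0.0 < theta <= 0.5:
        raise ValueError("theta must lie in (0, 0.5]")
    linked = theta ** recombinants * (1.0 - theta) ** (meioses - recombinants)
    unlinked = 0.5 ** meioses
    return linked / unlinked


def two_point_lod(recombinants, meioses, theta):
    """Classical LOD score: log10 of the likelihood ratio."""
    return math.log10(likelihood_ratio(recombinants, meioses, theta))


def heterogeneity_lod(family_counts, theta, alpha):
    """Admixture H-LOD: each family is linked with probability alpha."""
    return sum(
        math.log10(alpha * likelihood_ratio(r, n, theta) + (1.0 - alpha))
        for r, n in family_counts
    )


# Hypothetical (recombinants, meioses) counts for four families.
families = [(0, 8), (1, 10), (4, 8), (5, 10)]
print(round(heterogeneity_lod(families, theta=0.1, alpha=0.5), 3))
```

In this simplified model, alpha plays the role of the proportion of families in which the locus is segregating, the quantity the abstract estimates at 10-25%.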
Abstract:
In our genomic scan of 265 Irish families with schizophrenia, we have thus far generated modest evidence for the presence of vulnerability genes in three chromosomal regions, i.e., 5q21-q31, 6p24-p22, and 8p22-p21. Outside of those regions, of all markers tested to date, D10S674 produced one of the highest pairwise heterogeneity lod (H-LOD) scores, 3.2 (P = 0.0004), when initially tested on a subset of 88 families. We then tested a total of 12 markers across a region of 32 centimorgans in region 10p15-p11 of all 265 families. The strongest evidence for linkage occurred assuming an intermediate phenotypic definition, and a recessive genetic model. The largest pairwise H-LOD score was found with marker D10S2443 (maximum 1.95, P = 0.005). Using multipoint H-LODs, we found a broad peak (maximum 1.91, P = 0.006) extending over the 11 centimorgans from marker D10S674 to marker D10S1426. Multipoint nonparametric linkage analysis produced a much broader peak, but with the maximum in the same location near D10S2443 (maximum z = 1.88, P = 0.03). Based on estimates from the multipoint analysis, this putative vulnerability locus appears to be segregating in 5-15% of the families studied, but this estimate should be viewed with caution. When evaluated in the context of our genome scan results, the evidence suggests the possibility of a fourth vulnerability locus for schizophrenia in these Irish families, in region 10p15-p11.
Abstract:
Purpose: The purpose of this paper is to present an artificial neural network (ANN) model that predicts the condition level of earthmoving trucks using simple predictors; the model’s performance is compared to the respective predictive accuracy of the statistical method of discriminant analysis (DA).
Design/methodology/approach: An ANN-based predictive model is developed. The condition level predictors selected are the capacity, age, kilometers travelled and maintenance level. The relevant data set was provided by two Greek construction companies and includes the characteristics of 126 earthmoving trucks.
Findings: Data processing identifies a particularly strong connection of kilometers travelled and maintenance level with the earthmoving trucks’ condition level. Moreover, the validation process reveals that the predictive efficiency of the proposed ANN model is very high. Similar findings emerge from the application of DA to the same data set using the same predictors.
Originality/value: Sound prediction of earthmoving trucks’ condition level reduces downtime and its adverse impact on earthmoving duration and cost, while also enhancing the effectiveness of maintenance and replacement policies. This research proves that a sound condition level prediction for earthmoving trucks is achievable through the utilization of easy-to-collect data and provides a comparative evaluation of the results of two widely applied predictive methods.
Abstract:
Only long-term home oxygen therapy has been shown in randomised controlled trials to increase survival in chronic obstructive pulmonary disease (COPD). There have been no trials assessing the effect of inhaled corticosteroids and long-acting bronchodilators, alone or in combination, on mortality in patients with COPD, despite their known benefit in reducing symptoms and exacerbations. The "TOwards a Revolution in COPD Health" (TORCH) survival study is aiming to determine the impact of salmeterol/fluticasone propionate (SFC) combination and the individual components on the survival of COPD patients. TORCH is a multicentre, randomised, double-blind, parallel-group, placebo-controlled study. Approximately 6,200 patients with moderate-to-severe COPD were randomly assigned to b.i.d. treatment with either SFC (50/500 microg), fluticasone propionate (500 microg), salmeterol (50 microg) or placebo for 3 yrs. The primary end-point is all-cause mortality; secondary end-points are COPD morbidity relating to rate of exacerbations and health status, using the St George's Respiratory Questionnaire. Other end-points include other mortality and exacerbation end-points, requirement for long-term oxygen therapy, and clinic lung function. Safety end-points include adverse events, with additional information on bone fractures. The first patient was recruited in September 2000 and results should be available in 2006. This paper describes the "TOwards a Revolution in COPD Health" study and explains the rationale behind it.
Abstract:
In this study, 137 corn distillers dried grains with solubles (DDGS) samples from a range of different geographical origins (Jilin Province of China, Heilongjiang Province of China, USA and Europe) were collected and analysed. Different near infrared spectrometers combined with different chemometric packages were used in two independent laboratories to investigate the feasibility of classifying geographical origin of DDGS. Based on the same dataset, one laboratory developed a partial least squares discriminant analysis model and another laboratory developed an orthogonal partial least squares discriminant analysis model. Results showed that both models could perfectly classify DDGS samples from different geographical origins. These promising results encourage the development of larger scale efforts to produce datasets which can be used to differentiate the geographical origin of DDGS and such efforts are required to provide higher level food security measures on a global scale.
Abstract:
1. The population density and age structure of two species of heather psyllid Strophingia ericae and Strophingia cinereae, feeding on Calluna vulgaris and Erica cinerea, respectively, were sampled using standardized methods at locations throughout Britain. Locations were chosen to represent the full latitudinal and altitudinal range of the host plants.
2. The paper explains how spatial variation in thermal environment, insect life-history characteristics and physiology, and plant distribution, interact to provide the mechanisms that determine the range and abundance of Strophingia spp.
3. Strophingia ericae and S. cinereae, despite the similarity in the spatial distribution patterns of their host plants within Britain, display strongly contrasting geographical ranges and corresponding life-history strategies. Strophingia ericae is found on its host plant throughout Britain but S. cinereae is restricted to low elevation sites south of the Mersey-Humber line and occupies only part of the latitudinal and altitudinal range of its host plant. There is no evidence to suggest that S. ericae has reached its potential altitudinal or latitudinal limit in the UK, even though its host plant appears to reach its altitudinal limit.
4. There was little difference in the ability of the two Strophingia spp. to survive short-term exposure to temperatures as low as -15 degrees C and low winter temperatures probably do not limit distribution in S. cinereae.
5. Population density of S. ericae was not related to altitude but showed a weak correlation with latitude. The spread of larval instars present at a site, measured as an index of instar homogeneity, was significantly correlated with a range of temperature related variables, of which May mean temperature and length of growing season above 3 degrees C (calculated using the Lennon and Turner climatic model) were the most significant. Factor analysis did not improve the level of correlation significantly above those obtained for single climatic variables. The data confirmed that S. ericae has a 1 year life cycle at the lowest elevations and a 2 year life cycle at the higher elevations. However, there was no evidence, as previously suggested, for an abrupt change from a 1 to a 2 year life cycle in S. ericae with increasing altitudes or latitudes.
6. By contrast with S. ericae, S. cinereae had an obligatory 1 year life cycle, its population decreased with altitude and the index of instar homogeneity showed little correlation with single temperature variables. Moreover, it occupied only part of the range of its host plant and its spatial distribution in the UK could be predicted with 96% accuracy using selected variables in discriminant analysis.
7. The life histories of the congeneric heather psyllids reflect adaptations that allow them to exploit host plants with different distributions in climatic and thereby geographical space. Strophingia ericae has a flexible life history that enables it to exploit C. vulgaris throughout its European boreal temperate range. Strophingia cinereae has a less flexible life history and is adapted for living on an oceanic temperate host. While the geographic ranges of the two Strophingia spp. overlap within the UK, the psyllids appear to respond differently to variation in their thermal environment.
Abstract:
The aim of the study was to investigate the potential of a metabolomics platform to distinguish between pigs treated with ronidazole, dimetridazole and metronidazole and non-medicated animals (controls), at two withdrawal periods (day 0 and 5). Livers from each animal were biochemically profiled using UHPLC–QTof-MS in ESI+ mode of acquisition. Several Orthogonal Partial Least Squares-Discriminant Analysis models were generated from the acquired mass spectrometry data. The models classified animals into the two groups, control and treated. A total of 42 ions of interest explained the variation in ESI+. It was possible to find the identity of 3 of the ions and to positively classify 4 of the ionic features, which can be used as potential biomarkers of illicit 5-nitroimidazole abuse. Further evidence of the toxic mechanisms of 5-nitroimidazole drugs has been revealed, which may be of substantial importance as metronidazole is widely used in human medicine.
Abstract:
With the rapid development of internet-of-things (IoT), face scrambling has been proposed for privacy protection during IoT-targeted image/video distribution. Consequently in these IoT applications, biometric verification needs to be carried out in the scrambled domain, presenting significant challenges in face recognition. Since face models become chaotic signals after scrambling/encryption, a typical solution is to utilize traditional data-driven face recognition algorithms. While chaotic pattern recognition is still a challenging task, in this paper we propose a new ensemble approach – Many-Kernel Random Discriminant Analysis (MK-RDA) to discover discriminative patterns from chaotic signals. We also incorporate a salience-aware strategy into the proposed ensemble method to handle chaotic facial patterns in the scrambled domain, where random selections of features are made on semantic components via salience modelling. In our experiments, the proposed MK-RDA was tested rigorously on three human face datasets: the ORL face dataset, the PIE face dataset and the PUBFIG wild face dataset. The experimental results successfully demonstrate that the proposed scheme can effectively handle chaotic signals and significantly improve the recognition accuracy, making our method a promising candidate for secure biometric verification in emerging IoT applications.
Abstract:
The main aims of the work presented in this thesis were to contribute to knowledge of the composition of human amniotic fluid (AF), collected in the 2nd trimester of pregnancy, and to investigate possible changes in its composition caused by prenatal disorders, using metabonomics and thereby seeking to define new biomarkers of maternal and fetal disease. After an introduction describing the state of the art related to this work (Chapter 1) and the principles of the analytical methodologies used (Chapter 2), followed by a description of the experimental aspects of this thesis (Chapter 3), results are presented for the characterisation of the chemical composition of AF (healthy pregnancy) by nuclear magnetic resonance (NMR) spectroscopy, and for the monitoring of its stability during storage and after freeze-thaw cycles (Chapter 4). AF samples stored at -20°C showed significant changes, which became less pronounced (but still measurable) at -70°C, the recommended temperature for AF storage. Compositional changes were also observed after 1-2 freeze-thaw cycles (to be taken into account when reusing samples), as well as at room temperature (indicating a maximum period of 4 h for handling and analysing AF). The acquisition of high-resolution 1H NMR spectra and hyphenated NMR (LC-NMR/MS) enabled the detection of 75 compounds in 2nd trimester AF, 6 of which were detected in AF for the first time. Diffusion experiments (DOSY) further allowed the characterisation of the diffusion rates and average molecular masses of the most abundant proteins. Chapter 5 describes the study of the effects of fetal malformations (FM) and chromosomal disorders (CD) on the composition of 2nd trimester AF.
The extension of this work to the study of the effects on AF of disorders occurring in the 3rd trimester of pregnancy is described in Chapter 6, namely preterm delivery (PTD), pre-eclampsia (PE), intrauterine growth restriction (IUGR), premature rupture of membranes (PROM) and gestational diabetes mellitus (GDM). To complement these studies, a preliminary analysis of 2nd trimester maternal urine was carried out for the study of FM and GDM, described in Chapter 7. To interpret the analytical data, obtained by 1H NMR spectroscopy, ultra-performance liquid chromatography coupled to mass spectrometry (UPLC-MS) and mid-infrared (MIR) spectroscopy, partial least squares discriminant analysis and orthogonal partial least squares discriminant analysis (PLS-DA and OPLS-DA) and spectral correlation were used. After Monte Carlo cross-validation (MCCV), the AF PLS-DA models distinguished FM from controls (sensitivities 69-85%, specificities 80-95%, classification rates 80-90%), revealing metabolic variations in energy metabolism, in amino acid and sugar metabolism, and possible changes in renal function. A large impact of FM on the metabolic profile of maternal urine (measured by UPLC-MS) was also observed, although the PLS-DA models had lower sensitivity (40-60%), probably owing to the small number of samples and the greater variability of urine composition (relative to AF). Possible markers related to the occurrence of FM were suggested, including lactate, glucose, leucine, valine, glutamine, glutamate, glycoproteins, and conjugates of glucuronic acid and/or sulfate and endogenous and/or exogenous compounds (<1 M) (the latter visible only in urine). In AF, metabolic variations due to several types of chromosomal disorders (CD) were also observed, but of smaller magnitude.
The metabolic profiles of AF associated with pre-PTD produced models that, despite low predictive power, suggested early changes in the functioning of the fetoplacental unit, hyperglycaemia and oxidative stress. The models obtained for the pre-IUGR, pre-PE, pre-PROM and pre-diagnosis GDM groups (AF and maternal urine) showed low predictive power, indicating the small impact of these conditions on the composition of 2nd trimester AF and/or urine. The results obtained demonstrate the potential of analysing the metabolic profiles of 2nd trimester AF (and, although based on fewer studies, of maternal urine) for the development of new and complementary diagnostic methods, namely for FM and PTD.
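The sensitivity, specificity and classification rate quoted for the PLS-DA models in this abstract are standard confusion-matrix summaries. The sketch below shows how they are computed for a single validation split; the labels and predictions are made up, and in a Monte Carlo cross-validation these figures would be collected over many random splits, a step omitted here.

```python
def classification_metrics(true_labels, predicted_labels, positive="FM"):
    """Sensitivity, specificity and overall classification rate for two classes."""
    tp = fn = tn = fp = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == positive:
            if guess == positive:
                tp += 1
            else:
                fn += 1
        else:
            if guess == positive:
                fp += 1
            else:
                tn += 1
    return {
        "sensitivity": tp / (tp + fn),   # affected cases correctly detected
        "specificity": tn / (tn + fp),   # controls correctly recognised
        "classification_rate": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical true groups and model predictions for one validation split.
true_labels = ["FM", "FM", "FM", "FM",
               "control", "control", "control", "control", "control"]
predicted_labels = ["FM", "FM", "FM", "control",
                    "control", "control", "control", "control", "FM"]
metrics = classification_metrics(true_labels, predicted_labels)
print(metrics)
```

Reporting ranges such as 69-85% rather than single values reflects the spread of these metrics across resampled splits.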