46 results for Scheduling algorithms and analysis


Relevance:

100.00%

Publisher:

Abstract:

Nicotine in a smoky indoor air environment can be determined using graphitized carbon black as a solid sorbent in quartz tubes. The temperature stability, high purity, and heat absorption characteristics of the sorbent, as well as the permeability of the quartz tubes to microwaves, enable thermal desorption by microwaves after active sampling. Permeation and dynamic dilution procedures for generating nicotine in the vapor phase at low and high concentrations are used to evaluate the performance of the sampler. Tube preparation is described and the microwave desorption temperature is measured. The breakthrough volume is determined to allow sampling at 0.1-1 L/min for defined periods of time. The procedure is tested for the determination of gas- and particulate-phase nicotine in sidestream smoke produced in an experimental chamber.


Background: Pulseless electrical activity (PEA) cardiac arrest is defined as a cardiac arrest (CA) presenting with residual organized electrical activity on the electrocardiogram. In recent decades, the incidence of PEA has steadily increased compared to other types of CA such as ventricular fibrillation or pulseless ventricular tachycardia. PEA is frequently induced by reversible conditions. The "4 (or 5) H's" and "4 (or 5) T's" are proposed as a mnemonic to assess for Hypoxia, Hypovolemia, Hypo-/Hyperkalaemia, Hypothermia, Thrombosis (cardiac or pulmonary), cardiac Tamponade, Toxins, and Tension pneumothorax. Other pathologies (intracranial haemorrhage, severe sepsis, myocardial contraction dysfunction) have been identified as potential causes of PEA, but their respective probabilities and frequencies are unclear and they are not yet included in the resuscitation guidelines. The aim of this study was to analyse the aetiologies of PEA out-of-hospital cardiac arrest (OHCA) in order to evaluate the relative frequency of each cause and thereby improve the management of patients suffering a PEA cardiac arrest. Method: This retrospective study was based on data routinely and prospectively collected for each PEMS intervention. All adult patients treated from January 1st 2002 to December 31st 2012 by the PEMS for out-of-hospital cardiac arrest, with PEA as the first recorded rhythm, and admitted to the emergency department (ED) of the Lausanne University Hospital were included. The aetiologies of PEA cardiac arrest were classified into subgroups based on the classical H's & T's classification, supplemented by four additional subgroups: trauma, intra-cranial haemorrhage (ICH), non-ischemic cardiomyopathy (NIC) and undetermined cause. Results: 1866 OHCA were treated by the PEMS. PEA was the first recorded rhythm in 240 adult patients (13.8%). After exclusion of 96 patients, 144 patients with a PEA cardiac arrest admitted to the ED were included in the analysis. The mean age was 63.8 ± 20.0 years, 58.3% were men and the survival rate at 48 hours was 29%. 32 different causes of OHCA PEA were established for 119 patients. For 25 patients (17.4%), we were unable to attribute a specific cause for the PEA cardiac arrest. Hypoxia (23.6%), acute coronary syndrome (12.5%) and trauma (12.5%) were the three most frequent causes. Pulmonary embolism, hypovolemia, intoxication and hyperkalaemia occurred in less than 10% of cases (7.6%, 5.6%, 3.5% and 2.1%, respectively). Non-ischemic cardiomyopathy and intra-cranial haemorrhage occurred in 8.3% and 6.9% of cases, respectively. Conclusions: According to our results, intra-cranial haemorrhage and non-ischemic cardiomyopathy represent noticeable causes of PEA in OHCA, with a prevalence equalling or exceeding that of the classical 4 H's and 4 T's aetiologies. These two pathologies are potentially accessible to simple diagnostic procedures (native CT scan or echocardiography) and should be included in the 4 H's and 4 T's mnemonic.
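The subgroup classification described in the Method section can be sketched as a simple lookup. This is a minimal illustration only: the category labels follow the abstract, while the function itself and its behaviour for unlisted causes are assumptions.

```python
# Reversible causes from the classical "4 H's & 4 T's" mnemonic.
HS = {"Hypoxia", "Hypovolemia", "Hypo-/Hyperkalaemia", "Hypothermia"}
TS = {"Thrombosis (cardiac or pulmonary)", "cardiac Tamponade",
      "Toxins", "Tension pneumothorax"}
# Additional subgroups used in the study.
EXTRA = {"trauma", "intra-cranial haemorrhage", "non-ischemic cardiomyopathy"}

def classify_aetiology(cause):
    """Assign a recorded cause of PEA to a subgroup: 'H', 'T',
    'extra', or 'undetermined' when no specific cause was attributed."""
    if cause in HS:
        return "H"
    if cause in TS:
        return "T"
    if cause in EXTRA:
        return "extra"
    return "undetermined"
```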


ABSTRACT: BACKGROUND: Serologic testing algorithms for recent HIV seroconversion (STARHS) provide important information for HIV surveillance. We have shown that a patient's antibody reaction in a confirmatory line immunoassay (INNO-LIA HIV I/II Score, Innogenetics) provides information on the duration of infection. Here, we sought to further investigate the diagnostic specificity of various Inno-Lia algorithms and to identify factors affecting it. METHODS: Plasma samples of 714 selected patients of the Swiss HIV Cohort Study, infected for longer than 12 months and representing all viral clades and stages of chronic HIV-1 infection, were tested blindly by Inno-Lia and classified as either incident (up to 12 months) or older infection by 24 different algorithms. Of the total, 524 patients received HAART, 308 had HIV-1 RNA below 50 copies/mL, and 620 were infected by an HIV-1 non-B clade. Using logistic regression analysis, we evaluated factors that might affect the specificity of these algorithms. RESULTS: HIV-1 RNA <50 copies/mL was associated with significantly lower reactivity to all five HIV-1 antigens of the Inno-Lia and impaired the specificity of most algorithms. Among 412 patients either untreated or with HIV-1 RNA ≥50 copies/mL despite HAART, the median specificity of the algorithms was 96.5% (range 92.0-100%). The only factor that significantly promoted false-incident results in this group was age, with false-incident results increasing by a few percent per additional year. HIV-1 clade, HIV-1 RNA, CD4 percentage, sex, disease stage, and testing modalities had no significant effect. Results were similar among the 190 untreated patients. CONCLUSIONS: The specificity of most Inno-Lia algorithms was high and not affected by HIV-1 variability, advanced disease, and other factors promoting false-recent results in other STARHS. Specificity should be good in any group of untreated HIV-1 patients.
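The Inno-Lia-based algorithms themselves are not specified in this abstract; the sketch below only illustrates the general idea of classifying an infection as incident or older from antigen band reactivity, with an entirely hypothetical scoring rule and threshold, plus the specificity measure the study evaluates (fraction of long-standing infections not misclassified as incident).

```python
def classify_infection(band_scores, threshold=12):
    """Classify an infection as 'incident' (recent) or 'older' from five
    antigen band intensities (0-4 each). Illustrative rule: weak total
    reactivity is taken as a marker of recent infection.
    The threshold value is hypothetical."""
    total = sum(band_scores)
    return "incident" if total < threshold else "older"

def specificity(results, truth="older"):
    """Specificity on a panel of truly long-standing infections:
    the fraction NOT misclassified as incident (false-recent)."""
    correct = sum(1 for r in results if r == truth)
    return correct / len(results)
```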


Machine Learning for geospatial data: algorithms, software tools and case studies

Abstract: The thesis is devoted to the analysis, modeling and visualisation of spatial environmental data using machine learning algorithms. In a broad sense, machine learning can be considered a subfield of artificial intelligence; it mainly concerns the development of techniques and algorithms that allow computers to learn from data. In this thesis, machine learning algorithms are adapted to learn from spatial environmental data and to make spatial predictions. Why machine learning?
In a few words: most machine learning algorithms are universal, adaptive, nonlinear, robust and efficient modeling tools. They can find solutions to classification, regression, and probability density modeling problems in high-dimensional geo-feature spaces, composed of geographical space and additional relevant spatially referenced features. They are well suited to implementation as predictive engines in decision support systems, for purposes of environmental data mining including pattern recognition, modeling and prediction as well as automatic data mapping. Their efficiency is competitive with geostatistical models in low-dimensional geographical spaces, but they are indispensable in high-dimensional geo-feature spaces. The most important and popular machine learning algorithms and models of interest for geo- and environmental sciences are presented in detail, from a theoretical description of the concepts to the software implementation. The main algorithms and models considered are the following: the multi-layer perceptron (a workhorse of machine learning), general regression neural networks, probabilistic neural networks, self-organising (Kohonen) maps, Gaussian mixture models, radial basis function networks, and mixture density networks. This set of models covers machine learning tasks such as classification, regression, and density estimation. Exploratory data analysis (EDA) is an initial and very important part of data analysis. In this thesis, the concepts of exploratory spatial data analysis (ESDA) are considered using both a traditional geostatistical approach, experimental variography, and machine learning. Experimental variography is a basic tool for the geostatistical analysis of anisotropic spatial correlations which helps to understand the presence of spatial patterns, at least those described by two-point statistics.
A machine learning approach to ESDA is presented through the k-nearest neighbours (k-NN) method, which is simple and has very good interpretation and visualization properties. An important part of the thesis deals with a current hot topic, namely the automatic mapping of geospatial data. General regression neural networks (GRNN) are proposed as an efficient model to solve this task. The performance of the GRNN model is demonstrated on Spatial Interpolation Comparison (SIC) 2004 data, where the GRNN model significantly outperformed all other approaches, especially under emergency conditions. The thesis consists of four chapters with the following structure: theory, applications, software tools, and how-to-do-it examples. An important part of the work is a collection of software tools, Machine Learning Office. The Machine Learning Office tools were developed over the last 15 years and have been used both for many teaching courses, including international workshops in China, France, Italy, Ireland and Switzerland, and for carrying out fundamental and applied research projects. The case studies considered cover a wide spectrum of real-life low- and high-dimensional geo- and environmental problems, such as air, soil and water pollution by radionuclides and heavy metals, classification of soil types and hydro-geological units, decision-oriented mapping with uncertainties, and assessment and susceptibility mapping of natural hazards (landslides, avalanches). Complementary tools useful for exploratory data analysis and visualisation were developed as well. The software is user-friendly and easy to use.
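The general regression neural network proposed above for automatic mapping is, in essence, Nadaraya-Watson kernel regression. A minimal sketch of its prediction step for 2-D spatial data follows; the Gaussian kernel and the single bandwidth `sigma` are the usual choices for this model, not details taken from the thesis.

```python
import math

def grnn_predict(train_xy, train_z, query, sigma=1.0):
    """General Regression Neural Network prediction: a Gaussian-weighted
    average of the training targets, with weights decaying with the
    squared distance from the query location."""
    num = den = 0.0
    for (x, y), z in zip(train_xy, train_z):
        d2 = (x - query[0]) ** 2 + (y - query[1]) ** 2
        w = math.exp(-d2 / (2.0 * sigma ** 2))
        num += w * z
        den += w
    return num / den
```

In practice `sigma` is tuned by cross-validation; having a single parameter and no iterative training is what makes the model attractive for emergency mapping situations.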


The molecular basis of glycopeptide-intermediate S. aureus (GISA) isolates is not well defined, though it frequently involves phenotypes such as thickened cell walls and decreased autolysis. We have exploited an isogenic pair of teicoplanin-susceptible (strain MRGR3) and teicoplanin-resistant (strain 14-4) methicillin-resistant S. aureus strains for detailed transcriptomic profiling and analysis of altered autolytic properties. Strain 14-4 displayed markedly deficient Triton X-100-triggered autolysis compared to its teicoplanin-susceptible parent, although microarray analysis paradoxically did not reveal significant reductions in expression levels of the major autolytic genes atl, lytM, and lytN, except for sle1, which showed a slight decrease. The most striking paradox was a more-than-twofold increase in expression of the cidABC operon in 14-4 compared to MRGR3, which was correlated with decreased expression of the autolysis negative regulators lytSR and lrgAB. In contrast, the autolysis-deficient phenotype of 14-4 was correlated with both increased expression of negative autolysis regulators (arlRS, mgrA, and sarA) and decreased expression of positive regulators (agr RNAII and RNAIII). Quantitative bacteriolytic assays and zymographic analysis of concentrated culture supernatants showed a striking reduction in Atl-derived, extracellular bacteriolytic hydrolase activities in 14-4 compared to MRGR3. This observed difference was independent of the source of cell wall substrate (MRGR3 or 14-4) used for analysis. Collectively, our results suggest that the altered autolytic properties of 14-4 are apparently not driven by significant changes in the transcription of key autolytic effectors. Instead, our analysis points to alternate regulatory mechanisms impacting autolysis effectors, which may include changes in posttranscriptional processing or export.


Purpose: Recently, morphometric measurements of the ascending aorta have been performed with ECG-gated MDCT to help the development of future endovascular therapies (TCT) [1]. However, the variability of these measurements remains unknown. It would be interesting to know the impact of CAD (computer-aided diagnosis), with automated segmentation of the vessel and automatic diameter measurements, on the management of ascending aorta aneurysms. Methods and Materials: Thirty patients referred for ECG-gated CT thoracic angiography (64-row CT scanner) were evaluated. Measurements of the maximum and minimum ascending aorta diameters were obtained automatically with a commercially available CAD and semi-manually by two observers separately. The CAD algorithms segment the IV-enhanced lumen of the ascending aorta into perpendicular planes along the centreline. The CAD then determines the largest and the smallest diameters. Both observers repeated the automatic and the semi-manual measurements in a different session at least one month after the first measurements. The Bland and Altman method was used to study the inter- and intraobserver variability. A Wilcoxon signed-rank test was also used to analyse differences between observers. Results: Interobserver variability for semi-manual measurements between the first and second observers was 1.2 and 1.0 mm for the maximal and minimal diameter, respectively. Intraobserver variability of each observer ranged from 0.8 to 1.2 mm, the lowest variability being produced by the more experienced observer. CAD variability could be as low as 0.3 mm, showing that it can perform better than human observers. However, when used in non-optimal conditions (streak artefacts from contrast in the superior vena cava or weak lumen enhancement), CAD variability can be as high as 0.9 mm, reaching the variability of semi-manual measurements. Furthermore, there were significant differences between the two observers for maximal and minimal diameter measurements (p<0.001). There was also a significant difference between the first observer and CAD for maximal diameter measurements, with the former underestimating the diameter compared to the latter (p<0.001). As for minimal diameters, they were higher when measured by the second observer than when measured by CAD (p<0.001). Neither the difference in mean minimal diameter between the first observer and CAD nor the difference in mean maximal diameter between the second observer and CAD was significant (p=0.20 and 0.06, respectively). Conclusion: CAD algorithms can lessen the variability of diameter measurements in the follow-up of ascending aorta aneurysms. Nevertheless, in non-optimal conditions, it may be necessary to correct the measurements manually. Improvements to the algorithms will help avoid such situations.
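The Bland and Altman method used above reduces paired measurements to a bias and limits of agreement. A minimal sketch follows; the 1.96 factor assumes approximately normally distributed differences, and the numbers in the example are invented, not study data.

```python
import statistics

def bland_altman(a, b):
    """Bland-Altman agreement between two sets of paired measurements:
    returns (bias = mean difference a-b, lower and upper 95% limits of
    agreement = bias -/+ 1.96 * sample SD of the differences)."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd
```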


In the Ballabeina study, we investigated age- and BMI-group-related differences in aerobic fitness (20 m shuttle run), agility (obstacle course), dynamic balance (balance beam), static balance (balance platform), and physical activity (PA, accelerometers) in 613 children (M age = 5.1 years, SD = 0.6). Normal weight (NW) children performed better than overweight (OW) children in aerobic fitness, agility, and dynamic balance (all p < .001), while OW children had better static balance (p < .001). BMI-group-related differences in aerobic fitness and agility were larger in older children (p for interaction with age = .01), in favor of the NW children. PA did not differ between NW and OW children (p ≥ .1), but did differ between NW and obese children (p < .05). BMI-group-related differences in physical fitness can thus already be present in preschool-age children.


Background: The variety of DNA microarray formats and datasets presently available offers an unprecedented opportunity to perform insightful comparisons of heterogeneous data. Cross-species studies, in particular, have the power of identifying conserved, functionally important molecular processes. Validation of discoveries can now often be performed in readily available public data, which frequently requires cross-platform studies. Cross-platform and cross-species analyses require matching probes on different microarray formats. This can be achieved using the information in microarray annotations and additional molecular biology databases, such as orthology databases. Although annotations and other biological information are stored using modern database models (e.g., relational), they are very often distributed and shared as tables in text files, i.e. flat file databases. This common flat database format thus provides a simple and robust solution to flexibly integrate various sources of information and a basis for the combined analysis of heterogeneous gene expression profiles. Results: We provide annotationTools, a Bioconductor-compliant R package to annotate microarray experiments and integrate heterogeneous gene expression profiles using annotation and other molecular biology information available as flat file databases. First, annotationTools contains a specialized set of functions for mining this widely used database format in a systematic manner. It thus offers a straightforward solution for annotating microarray experiments.
Second, building on these basic functions and relying on the combination of information from several databases, it provides tools to easily perform cross-species analyses of gene expression data. Here, we present two example applications of annotationTools that are of direct relevance for the analysis of heterogeneous gene expression profiles, namely a cross-platform mapping of probes and a cross-species mapping of orthologous probes using different orthology databases. We also show how to perform an explorative comparison of disease-related transcriptional changes in human patients and in a genetic mouse model. Conclusion: The R package annotationTools provides a simple solution to handle microarray annotation and orthology tables, as well as other flat molecular biology databases. Thereby, it allows easy integration and analysis of heterogeneous microarray experiments across different technological platforms or species.
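annotationTools itself is an R/Bioconductor package; the Python sketch below only illustrates the underlying idea of mining flat-file tables and chaining lookups for cross-species probe mapping (probe → gene → ortholog → probe on the other platform). All identifiers here are invented placeholders, not real annotation entries.

```python
def build_lookup(flat_table, key_col, value_col):
    """Index a flat file database (rows already split into columns,
    first row = header) by one column, mimicking how annotation
    tables are mined."""
    header = flat_table[0]
    k, v = header.index(key_col), header.index(value_col)
    return {row[k]: row[v] for row in flat_table[1:]}

def map_probes(probes, probe2gene, gene2ortholog, ortholog2probe):
    """Cross-species probe mapping by chained lookups; probes that
    cannot be mapped at any step yield None."""
    out = []
    for p in probes:
        gene = probe2gene.get(p)
        orth = gene2ortholog.get(gene)
        out.append(ortholog2probe.get(orth))
    return out
```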


Because of the various matrices available for forensic investigations, the development of versatile analytical approaches allowing the simultaneous determination of drugs is challenging. The aim of this work was to assess a liquid chromatography-tandem mass spectrometry (LC-MS/MS) platform allowing the rapid quantification of colchicine in body fluids and tissues collected in the context of a fatal overdose. For this purpose, filter paper was used as a sampling support and was associated with an automated 96-well plate extraction performed by the LC autosampler itself. The developed method features a 7-min total run time including automated filter paper extraction (2 min) and chromatographic separation (5 min). The sample preparation was reduced to a minimum regardless of the matrix analyzed. This platform was fully validated for dried blood spots (DBS) in the toxic concentration range of colchicine. The DBS calibration curve was applied successfully to quantification in all other matrices (body fluids and tissues) except for bile, where an excessive matrix effect was found. The distribution of colchicine for a fatal overdose case was reported as follows: peripheral blood, 29 ng/ml; urine, 94 ng/ml; vitreous humour and cerebrospinal fluid, < 5 ng/ml; pericardial fluid, 14 ng/ml; brain, < 5 pg/mg; heart, 121 pg/mg; kidney, 245 pg/mg; and liver, 143 pg/mg. Although filter paper is usually employed for DBS, we report here the extension of this alternative sampling support to the analysis of other body fluids and tissues. The developed platform represents a rapid and versatile approach for drug determination in multiple forensic media.
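The LC-MS/MS workflow above cannot be reproduced in code, but the quantification step rests on ordinary calibration-curve arithmetic: fit a line through the responses of calibration standards, then back-calculate the unknown. A minimal sketch; the concentrations and instrument responses in the example are invented, not values from the case.

```python
def fit_calibration(concs, responses):
    """Least-squares calibration line response = a * conc + b
    fitted from calibration standards."""
    n = len(concs)
    mx = sum(concs) / n
    my = sum(responses) / n
    a = sum((x - mx) * (y - my) for x, y in zip(concs, responses)) / \
        sum((x - mx) ** 2 for x in concs)
    b = my - a * mx
    return a, b

def quantify(response, a, b):
    """Back-calculate analyte concentration from an instrument response."""
    return (response - b) / a
```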


BACKGROUND: Three non-synonymous single nucleotide polymorphisms (Q223R, K109R and K656N) of the leptin receptor gene (LEPR) have been tested for association with obesity-related outcomes in multiple studies, showing inconclusive results. We performed a systematic review and meta-analysis of the association of the three LEPR variants with BMI. In addition, we analysed 15 SNPs within the LEPR gene in the CoLaus study, assessing the interaction of the variants with sex. METHODOLOGY/PRINCIPAL FINDINGS: We searched electronic databases, including population-based studies that investigated the association between the LEPR variants Q223R, K109R and K656N and obesity-related phenotypes in healthy, unrelated subjects. We furthermore performed meta-analyses of the genotype and allele frequencies in case-control studies. Results were stratified by SNP and by potential effect modifiers. CoLaus data were analysed by logistic and linear regressions and tested for interaction with sex. The meta-analysis of published data did not show an overall association between any of the tested LEPR variants and overweight. However, the choice of a BMI cut-off value to distinguish cases from controls was crucial to explain heterogeneity in Q223R. Differences in allele frequencies across ethnic groups are compatible with natural selection of the derived alleles in Q223R and K109R and of the ancestral allele in K656N in Asians. In CoLaus, the rs10128072, rs3790438 and rs3790437 variants showed interaction with sex in their association with overweight, waist circumference and fat mass in linear regressions. CONCLUSIONS: Our systematic review and analysis of primary data from the CoLaus study did not show an overall association between LEPR SNPs and overweight. Most studies were underpowered to detect small effect sizes. Potential effect modification by sex, population stratification, as well as the role of natural selection should be addressed in future genetic association studies.
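The pooling step of such a meta-analysis is typically inverse-variance weighting of per-study effect sizes. A minimal fixed-effect sketch follows; the log odds ratios and variances in the example are invented, and the published analysis may have used a random-effects model instead.

```python
import math

def fixed_effect_meta(log_ors, variances):
    """Inverse-variance fixed-effect pooling of per-study log odds
    ratios: each study is weighted by 1/variance; returns the pooled
    log OR and its standard error."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, log_ors)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, se
```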


The coverage and volume of geo-referenced datasets are extensive and incessantly growing. The systematic capture of geo-referenced information generates large volumes of spatio-temporal data to be analyzed. Clustering and visualization play a key role in exploratory data analysis and in the extraction of knowledge embedded in these data. However, new challenges in visualization and clustering are posed by the special characteristics of these data: for instance, their complex structures, large number of samples, variables involved in a temporal context, high dimensionality, and large variability in cluster shapes. The central aim of my thesis is to propose new algorithms and methodologies for clustering and visualization, in order to assist knowledge extraction from spatio-temporal geo-referenced data, thus improving decision-making processes. I present two original algorithms, one for clustering, the Fuzzy Growing Hierarchical Self-Organizing Networks (FGHSON), and the second for exploratory visual data analysis, the Tree-structured Self-Organizing Maps Component Planes. In addition, I present methodologies that, combined with the FGHSON and the Tree-structured SOM Component Planes, allow the integration of space and time seamlessly and simultaneously in order to extract knowledge embedded in a temporal context. The originality of the FGHSON lies in its capability to reflect the underlying structure of a dataset in a hierarchical fuzzy way. A hierarchical fuzzy representation of clusters is crucial when data include complex structures with large variability of cluster shapes, variances, densities and numbers of clusters. The most important characteristics of the FGHSON include: (1) it does not require an a priori setup of the number of clusters; (2) the algorithm executes several self-organizing processes in parallel, so that when dealing with large datasets the processes can be distributed, reducing the computational cost; (3) only three parameters are necessary to set up the algorithm.
In the case of the Tree-structured SOM Component Planes, the novelty of the algorithm lies in its ability to create a structure that allows visual exploratory analysis of large high-dimensional datasets. This algorithm creates a hierarchical structure of Self-Organizing Map Component Planes, arranging the projections of similar variables in the same branches of the tree. Hence, similarities in the variables' behavior can be easily detected (e.g. local correlations, maximal and minimal values, and outliers). Both the FGHSON and the Tree-structured SOM Component Planes were applied to several agroecological problems, proving to be very efficient in the exploratory analysis and clustering of spatio-temporal datasets. In this thesis I also tested three soft competitive learning algorithms: two well-known unsupervised soft competitive algorithms, namely the Self-Organizing Maps (SOMs) and the Growing Hierarchical Self-Organizing Maps (GHSOMs), and a third that is our original contribution, the FGHSON. Although the algorithms presented here have been used in several areas, to my knowledge there is no work applying and comparing the performance of these techniques on spatio-temporal geospatial data, as is presented in this thesis. I propose original methodologies to explore spatio-temporal geo-referenced datasets through time. Our approach uses time windows to capture temporal similarities and variations by using the FGHSON clustering algorithm. The developed methodologies are used in two case studies.
In the first, the objective was to find similar agroecozones through time; in the second, it was to find similar environmental patterns shifted in time. Several results presented in this thesis have led to new contributions to agroecological knowledge, for instance in sugar cane and blackberry production. Finally, in the framework of this thesis we developed several software tools: (1) a Matlab toolbox that implements the FGHSON algorithm, and (2) a program called BIS (Bio-inspired Identification of Similar agroecozones), an interactive graphical user interface tool which integrates the FGHSON algorithm with Google Earth in order to show zones with similar agroecological characteristics.
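The defining trait of the FGHSON is that samples receive graded cluster memberships. The abstract does not give the exact update rule, so the sketch below shows the standard fuzzy c-means membership formula that hierarchical fuzzy clustering networks build on; the fuzzifier m=2 is the conventional default, assumed here.

```python
def fuzzy_memberships(point, centroids, m=2.0):
    """Fuzzy membership of a sample to each cluster centroid (fuzzy
    c-means rule): membership_i is proportional to d_i^(-2/(m-1)),
    normalised so the memberships sum to 1."""
    d2 = [sum((p - c) ** 2 for p, c in zip(point, cen)) for cen in centroids]
    if 0.0 in d2:  # sample coincides with a centroid: crisp membership
        return [1.0 if d == 0.0 else 0.0 for d in d2]
    inv = [d ** (-1.0 / (m - 1)) for d in d2]  # d2 is squared distance
    s = sum(inv)
    return [x / s for x in inv]
```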


Abstract: This work is concerned with the development and application of novel unsupervised learning methods, having in mind two target applications: the analysis of forensic case data and the classification of remote sensing images. First, a method based on a symbolic optimization of the inter-sample distance measure is proposed to improve the flexibility of spectral clustering algorithms, and applied to the problem of forensic case data. This distance is optimized using a loss function related to the preservation of neighborhood structure between the input space and the space of principal components, and solutions are found using genetic programming. Results are compared to a variety of state-of-the-art clustering algorithms. Subsequently, a new large-scale clustering method based on a joint optimization of feature extraction and classification is proposed and applied to various databases, including two hyperspectral remote sensing images. The algorithm makes use of a functional model (e.g., a neural network) for clustering, which is trained by stochastic gradient descent. Results indicate that such a technique can easily scale to huge databases, can avoid the so-called out-of-sample problem, and can compete with or even outperform existing clustering algorithms on both artificial data and real remote sensing images. This is verified on small databases as well as on very large problems.
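The loss function described above rewards distance measures that keep each sample's neighborhood intact after projection. A minimal sketch of one such neighborhood-preservation score follows; the k-NN overlap criterion is a common formulation, assumed here rather than taken from the thesis.

```python
def knn_indices(points, i, k):
    """Indices of the k nearest neighbours of points[i] (Euclidean)."""
    d = sorted((sum((a - b) ** 2 for a, b in zip(points[i], p)), j)
               for j, p in enumerate(points) if j != i)
    return {j for _, j in d[:k]}

def neighborhood_preservation(X, Y, k=2):
    """Average overlap of k-NN sets between the input space X and a
    projected space Y: 1.0 means every sample keeps exactly the same
    neighbours after projection."""
    n = len(X)
    return sum(len(knn_indices(X, i, k) & knn_indices(Y, i, k)) / k
               for i in range(n)) / n
```

A genetic-programming search would then evolve candidate distance expressions and score each by how well this kind of criterion is preserved.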


Multi-centre data repositories like the Alzheimer's Disease Neuroimaging Initiative (ADNI) offer a unique research platform, but pose questions concerning the comparability of results when a range of imaging protocols and data processing algorithms is used. The variability is mainly due to the non-quantitative character of the widely used structural T1-weighted magnetic resonance (MR) images. Although the stability of the main effect of Alzheimer's disease (AD) on brain structure across platforms and field strengths has been addressed in previous studies using multi-site MR images, there are only sparse empirically based recommendations for the processing and analysis of pooled multi-centre structural MR data acquired at different magnetic field strengths (MFS). Aiming to minimise potential systematic bias when using ADNI data, we investigate the specific contributions of spatial registration strategies and the impact of MFS on voxel-based morphometry (VBM) in AD. We perform a whole-brain analysis within the framework of Statistical Parametric Mapping, testing for main effects of various diffeomorphic spatial registration strategies and of MFS, and for their interaction with disease status. Beyond the confirmation of medial temporal lobe volume loss in AD, we detect a significant impact of spatial registration strategy on the estimation of AD-related atrophy. Additionally, we report a significant effect of MFS on the assessment of brain anatomy (i) in the cerebellum, (ii) the precentral gyrus and (iii) the thalamus bilaterally, showing no interaction with disease status. We provide empirical evidence in support of pooling data in multi-centre VBM studies irrespective of disease status or MFS.
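The interaction test mentioned above asks whether the field-strength effect differs by disease status; in its simplest 2×2 form this is a difference of differences of cell means. The function below is a toy illustration of that contrast, not the actual voxel-wise SPM analysis.

```python
def interaction_effect(cell_means):
    """Interaction contrast of a 2x2 factorial design (factor A = field
    strength, factor B = disease status): (A1B1 - A1B2) - (A2B1 - A2B2).
    Zero means the effect of B is identical at both levels of A."""
    (a1b1, a1b2), (a2b1, a2b2) = cell_means
    return (a1b1 - a1b2) - (a2b1 - a2b2)
```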