29 results for Vector spaces -- Problems, exercises, etc.

at Université de Lausanne, Switzerland


Relevance:

40.00%

Publisher:

Abstract:

This paper presents general problems and approaches for spatial data analysis using machine learning algorithms. Machine learning is a very powerful approach to adaptive data analysis, modelling and visualisation. The key feature of machine learning algorithms is that they learn from empirical data and can be used in cases when the modelled environmental phenomena are hidden, nonlinear, noisy and highly variable in space and in time. Most machine learning algorithms are universal and adaptive modelling tools developed to solve basic problems of learning from data: classification/pattern recognition, regression/mapping and probability density modelling. In the present report some of the widely used machine learning algorithms, namely artificial neural networks (ANN) of different architectures and Support Vector Machines (SVM), are adapted to the problems of the analysis and modelling of geo-spatial data. Machine learning algorithms have an important advantage over traditional models of spatial statistics when problems are considered in high-dimensional geo-feature spaces, when the dimension of the space exceeds 5. Such features are usually generated, for example, from digital elevation models, remote sensing images, etc. An important extension of the models concerns the consideration of real-space constraints such as geomorphology, networks, and other natural structures. Recent developments in semi-supervised learning can improve the modelling of environmental phenomena by taking geo-manifolds into account. An important part of the study deals with the analysis of relevant variables and model inputs. This problem is approached by using different nonlinear feature selection/feature extraction tools.
To demonstrate the application of machine learning algorithms, several interesting case studies are considered: digital soil mapping using SVM; automatic mapping of soil and water system pollution using ANN; natural hazards risk analysis (avalanches, landslides); assessments of renewable resources (wind fields) with SVM and ANN models, etc. The dimensionality of the spaces considered varies from 2 to more than 30. Figures 1, 2, 3 demonstrate some results of the studies and their outputs. Finally, the results of environmental mapping are discussed and compared with traditional models of geostatistics.

Relevance:

30.00%

Publisher:

Abstract:

Indirect calorimetry based on respiratory exchange measurement has been successfully used from the beginning of the century to obtain an estimate of heat production (energy expenditure) in human subjects and animals. The errors inherent to this classical technique can stem from various sources: 1) model of calculation and assumptions, 2) calorimetric factors used, 3) technical factors and 4) human factors. The physiological and biochemical factors influencing the interpretation of calorimetric data include a change in the size of the bicarbonate and urea pools and the accumulation or loss (via breath, urine or sweat) of intermediary metabolites (gluconeogenesis, ketogenesis). More recently, respiratory gas exchange data have been used to estimate substrate utilization rates in various physiological and metabolic situations (fasting, post-prandial state, etc.). It should be recalled that indirect calorimetry provides an index of overall substrate disappearance rates, which is incorrectly assumed to be equivalent to substrate "oxidation" rates. Unfortunately, there is no adequate gold standard to validate whole-body substrate "oxidation" rates, in contrast to the "validation" of heat production by indirect calorimetry through the use of direct calorimetry under strict thermal equilibrium conditions. Tracer techniques using stable (or radioactive) isotopes represent an independent way of assessing substrate utilization rates. When carbohydrate metabolism is measured with both techniques, indirect calorimetry generally provides glucose "oxidation" rates consistent with those from isotopic tracers, but only when certain metabolic processes (such as gluconeogenesis and lipogenesis) are minimal and/or when the respiratory quotients are not at the extremes of the physiological range.
However, it is believed that the tracer techniques underestimate true glucose "oxidation" rates because they fail to account for glycogenolysis in the tissue storing glucose, since this escapes the systemic circulation. A major advantage of isotopic techniques is that they are able to estimate (given certain assumptions) various metabolic processes (such as gluconeogenesis) in a noninvasive way. Furthermore, when a fourth substrate (such as ethanol) is administered in addition to the 3 macronutrients, isotopic quantification of substrate "oxidation" allows one to eliminate the inherent assumptions made by indirect calorimetry. In conclusion, isotopic tracer techniques and indirect calorimetry should be considered complementary techniques, in particular since the tracer techniques require the measurement of carbon dioxide production obtained by indirect calorimetry. However, it should be kept in mind that the assessment of substrate oxidation by indirect calorimetry may involve large errors, in particular over a short period of time. By indirect calorimetry, energy expenditure (heat production) is calculated with substantially less error than substrate oxidation rates.
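As a rough illustration of the heat-production calculation discussed in this abstract, the sketch below computes the respiratory quotient and energy expenditure from gas exchange rates. The abstract does not say which equation the authors use; the abbreviated Weir equation (without the urinary nitrogen term) is assumed here as one commonly used choice, and the function names and input values are hypothetical.

```python
def respiratory_quotient(vo2_l_min, vco2_l_min):
    """Respiratory quotient: CO2 produced divided by O2 consumed."""
    if vo2_l_min <= 0:
        raise ValueError("oxygen consumption must be positive")
    return vco2_l_min / vo2_l_min


def energy_expenditure_kcal_min(vo2_l_min, vco2_l_min):
    """Energy expenditure in kcal/min from gas exchange in L/min.

    Assumption: abbreviated Weir equation, ignoring urinary nitrogen.
    """
    return 3.941 * vo2_l_min + 1.106 * vco2_l_min


# Hypothetical resting values, for illustration only.
vo2, vco2 = 0.25, 0.20
rq = respiratory_quotient(vo2, vco2)
ee_day = energy_expenditure_kcal_min(vo2, vco2) * 1440  # minutes per day
print(f"RQ = {rq:.2f}, EE = {ee_day:.0f} kcal/day")
```

Substrate "oxidation" rates need additional assumptions beyond these two numbers, which is consistent with the abstract's point that they carry larger errors than energy expenditure.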

Relevance:

30.00%

Publisher:

Abstract:

Due to their performance-enhancing properties, the use of anabolic steroids (e.g. testosterone, nandrolone, etc.) is banned in elite sports. Therefore, doping control laboratories accredited by the World Anti-Doping Agency (WADA) screen urine for these prohibited substances, among others. It is particularly challenging to detect misuse of naturally occurring anabolic steroids such as testosterone (T), which is a popular ergogenic agent in sports and society. To screen for misuse of these compounds, drug testing laboratories monitor the urinary concentrations of endogenous steroid metabolites and their ratios, which constitute the steroid profile, and compare them with reference ranges to detect unnaturally high values. However, the interpretation of the steroid profile is difficult due to large inter-individual variances, various confounding factors and the different endogenous steroids on the market, which influence the steroid profile in various ways. A support vector machine (SVM) algorithm was developed to statistically evaluate urinary steroid profiles composed of an extended range of steroid profile metabolites. This model makes the interpretation of the analytical data in the quest for deviating steroid profiles feasible and shows its versatility towards different kinds of misused endogenous steroids. The SVM model outperforms the current biomarkers with respect to detection sensitivity and accuracy, particularly when it is coupled to individual data as stored in the Athlete Biological Passport.

Relevance:

30.00%

Publisher:

Abstract:

The paper proposes an approach aimed at detecting optimal model parameter combinations to achieve the most representative description of uncertainty in the model performance. A classification problem is posed to find the regions of good-fitting models according to the values of a cost function. Support Vector Machine (SVM) classification in the parameter space is applied to decide whether a forward model simulation is to be computed for a particular generated model. SVM is particularly designed to tackle classification problems in high-dimensional space in a non-parametric and non-linear way. SVM decision boundaries determine the regions that are subject to the largest uncertainty in the cost function classification and, therefore, provide guidelines for further iterative exploration of the model space. The proposed approach is illustrated by a synthetic example of fluid flow through porous media, which features a highly variable response depending on the combination of parameter values.

Relevance:

30.00%

Publisher:

Abstract:

Recently, kernel-based machine learning methods have gained great popularity in many data analysis and data mining fields: pattern recognition, biocomputing, speech and vision, engineering, remote sensing, etc. The paper describes the use of kernel methods to approach the processing of large datasets from environmental monitoring networks. Several typical problems of the environmental sciences and their solutions provided by kernel-based methods are considered: classification of categorical data (soil type classification), mapping of continuous environmental and pollution information (pollution of soil by radionuclides), and mapping with auxiliary information (climatic data from the Aral Sea region). Promising developments, such as automatic emergency hot spot detection and monitoring network optimization, are discussed as well.

Relevance:

30.00%

Publisher:

Abstract:

A haplotype is an m-long binary vector. The XOR-genotype of two haplotypes is the m-vector of their coordinate-wise XOR. We study the following problem: Given a set of XOR-genotypes, reconstruct their haplotypes so that the set of resulting haplotypes can be mapped onto a perfect phylogeny (PP) tree. The question is motivated by studying population evolution in human genetics, and is a variant of the perfect phylogeny haplotyping problem that has received intensive attention recently. Unlike the latter problem, in which the input is "full" genotypes, here we assume less informative input, which may be more economical to obtain experimentally. Building on ideas of Gusfield, we show how to solve the problem in polynomial time, by a reduction to the graph realization problem. The actual haplotypes are not uniquely determined by the tree they map onto, and the tree itself may or may not be unique. We show that tree uniqueness implies uniquely determined haplotypes, up to inherent degrees of freedom, and give a sufficient condition for the uniqueness. To actually determine the haplotypes given the tree, additional information is necessary. We show that two or three full genotypes suffice to reconstruct all the haplotypes, and present a linear algorithm for identifying those genotypes.
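The definition at the start of this abstract can be made concrete with a short sketch. The function name and the example vectors are hypothetical; the code only illustrates the coordinate-wise XOR and one reason the input is less informative than full genotypes, and is not the reconstruction algorithm of the paper.

```python
def xor_genotype(h1, h2):
    """Coordinate-wise XOR of two binary haplotypes of equal length m."""
    if len(h1) != len(h2):
        raise ValueError("haplotypes must have the same length m")
    if any(bit not in (0, 1) for bit in (*h1, *h2)):
        raise ValueError("haplotypes must be binary vectors")
    return [a ^ b for a, b in zip(h1, h2)]


# Two hypothetical haplotypes with m = 5 sites.
h1 = [0, 1, 1, 0, 1]
h2 = [1, 1, 0, 0, 1]
g = xor_genotype(h1, h2)  # 1 marks a site where the haplotypes differ

# Flipping the same sites in both haplotypes leaves the XOR-genotype
# unchanged, so the genotype alone cannot pin down the haplotypes.
v = [1, 0, 0, 1, 0]
shifted = xor_genotype(xor_genotype(h1, v), xor_genotype(h2, v))
print(g, shifted == g)
```

This invariance is a simple instance of the "inherent degrees of freedom" mentioned in the abstract: some extra information, such as a few full genotypes, is needed to fix the actual haplotypes.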

Relevance:

30.00%

Publisher:

Abstract:

A haplotype is an m-long binary vector. The XOR-genotype of two haplotypes is the m-vector of their coordinate-wise XOR. We study the following problem: Given a set of XOR-genotypes, reconstruct their haplotypes so that the set of resulting haplotypes can be mapped onto a perfect phylogeny (PP) tree. The question is motivated by studying population evolution in human genetics and is a variant of the PP haplotyping problem that has received intensive attention recently. Unlike the latter problem, in which the input is "full" genotypes, here we assume less informative input, which may be more economical to obtain experimentally. Building on ideas of Gusfield, we show how to solve the problem in polynomial time by a reduction to the graph realization problem. The actual haplotypes are not uniquely determined by the tree they map onto, and the tree itself may or may not be unique. We show that tree uniqueness implies uniquely determined haplotypes, up to inherent degrees of freedom, and give a sufficient condition for the uniqueness. To actually determine the haplotypes given the tree, additional information is necessary. We show that two or three full genotypes suffice to reconstruct all the haplotypes and present a linear algorithm for identifying those genotypes.

Relevance:

30.00%

Publisher:

Abstract:

ABSTRACT The use of gene therapy in the treatment of degenerative eye diseases, particularly retinitis pigmentosa, appears very promising (Acland et al. 2001). Among the vectors developed, lentiviral vectors (derived from the human virus HIV-1) allow transduction of photoreceptors after subretinal injection in mice during the first days of life. However, gene transfer efficiency in this cell type is markedly more limited after injection in adults (Kostic et al. 2003). The aim of our study is to determine whether a physical barrier that forms during development, located between the photoreceptors and the pigment epithelium as well as between the photoreceptors themselves, is responsible for the reduced mass entry of the virus into photoreceptors, thereby limiting its efficiency in adult mice. Previous research in rabbits described the ability of specific enzymes such as Chondroitinase ABC and Neuraminidase X to modify the structure of the matrix surrounding the photoreceptors (Inter Photoreceptor Matrix, IPM) by digesting some of its constituents following their injection into the subretinal space (Yao et al. 1990). Considering the IPM as a physical barrier capable of reducing photoreceptor transduction efficiency in adult mice, we administered different enzymes together with the subretinal injection of lentiviral vectors in order to improve viral transduction by weakening the IPM, thereby making it more permeable to diffusion of the virus. Subretinal injection of Neuraminidase X and Chondroitinase ABC in mice induces structural modifications of the IPM, which manifest respectively as the unmasking or the disappearance of peanut agglutinin binding sites on photoreceptors.
Simultaneous injection of Neuraminidase X with the viral vector containing the therapeutic transgene significantly increases the number of transduced photoreceptors (approximately fivefold). We in fact demonstrated that the enzymatic treatment mainly increases diffusion of the lentivirus in the space between the pigment epithelium and the photoreceptors. Treatment with Chondroitinase ABC, for its part, yields only a slight, non-significant improvement in transduction. This study shows that a better understanding of the IPM, and of substances capable of modifying it (enzymes, drugs, etc.), could help in devising new strategies to improve the distribution of viral vectors in the adult retina.

Relevance:

30.00%

Publisher:

Abstract:

This thesis develops a comprehensive and flexible statistical framework for the analysis and detection of space, time and space-time clusters of environmental point data. The developed clustering methods were applied to both simulated datasets and real-world environmental phenomena; however, only the cases of forest fires in the Canton of Ticino (Switzerland) and in Portugal are expounded in this document. Normally, environmental phenomena can be modelled as stochastic point processes where each event, e.g. the forest fire ignition point, is characterised by its spatial location and occurrence in time. Additionally, information such as burned area, ignition causes, land use, topographic, climatic and meteorological features, etc., can also be used to characterise the studied phenomenon. Thereby, the space-time pattern characterisation represents a powerful tool to understand the distribution and behaviour of the events and their correlation with underlying processes, for instance, socio-economic, environmental and meteorological factors. Consequently, we propose a methodology based on the adaptation and application of statistical and fractal point process measures for both global (e.g. the Morisita Index, the Box-counting fractal method, the multifractal formalism and Ripley's K-function) and local (e.g. Scan Statistics) analysis. Many measures describing the space-time distribution of environmental phenomena have been proposed in a wide variety of disciplines; nevertheless, most of these measures are of global character and do not consider complex spatial constraints, high variability and the multivariate nature of the events.
Therefore, we propose a statistical framework that takes into account the complexities of the geographical space in which phenomena take place, by introducing the Validity Domain concept and carrying out clustering analyses on data with different constrained geographical spaces, thereby assessing the relative degree of clustering of the real distribution. Moreover, exclusively for the forest fire case, this research proposes two new methodologies for defining and mapping the Wildland-Urban Interface (WUI), described as the interaction zone between burnable vegetation and anthropogenic infrastructures, and for predicting fire ignition susceptibility. In this regard, the main objective of this Thesis was to carry out basic statistical/geospatial research with a strong application part, to analyse and describe complex phenomena and to overcome unsolved methodological problems in the characterisation of space-time patterns, in particular those of forest fire occurrences. Thus, this Thesis provides a response to the increasing demand for both environmental monitoring and management tools for the assessment of natural and anthropogenic hazards and risks, sustainable development, retrospective success analysis, etc. The major contributions of this work were presented at national and international conferences and published in 5 scientific journals. National and international collaborations were also established and successfully accomplished. -- This thesis develops a complete and flexible statistical methodology for the analysis and detection of spatial, temporal and spatio-temporal structures in environmental data represented as point patterns. The methods developed here were applied to simulated datasets as well as to real environmental phenomena; however, only the cases of forest fires in the Canton of Ticino (Switzerland) and in Portugal are explained in this document.
Normally, environmental phenomena can be modelled as stochastic point processes in which each event, e.g. a forest fire ignition point, is determined by its spatial location and its occurrence in time. In addition, information such as burned area, ignition causes, land use, and topographic, climatic and meteorological characteristics, etc., can also be used to characterise the phenomenon studied. Consequently, defining the spatio-temporal structure is a powerful tool for understanding the distribution of the phenomenon and its correlation with underlying processes such as socio-economic, environmental and meteorological factors. We therefore propose a methodology based on the adaptation and application of statistical and fractal point-process measures for global analysis (e.g. the Morisita index, the box-counting fractal dimension, the multifractal formalism and Ripley's K-function) and local analysis (e.g. the scan statistic). Many measures describing the spatio-temporal structures of environmental phenomena can be found in the literature. Nevertheless, most of these measures are global in character and do not consider complex spatial constraints or the high variability and multivariate nature of the events. To this end, the methodology proposed here takes into account the complexities of the geographical space in which the phenomenon takes place, by introducing the concept of the Validity Domain and applying spatial analysis measures to data with different geographical constraints. This makes it possible to assess the relative degree of spatial/temporal clustering of the structures of the observed phenomenon.
In addition, exclusively for the forest fire case, this research also proposes two new methodologies for defining and mapping peri-urban zones, described as anthropogenic spaces close to wild vegetation or forest, and for predicting fire ignition susceptibility. In this regard, the main objective of this Thesis was to carry out statistical/geospatial research with a strong application to real cases, in order to analyse and describe complex environmental phenomena and to overcome unresolved methodological problems in the characterisation of spatio-temporal structures, particularly those of forest fire occurrences. Thus, this Thesis responds to the growing demand from environmental management and monitoring for tools to assess natural and anthropogenic risks and hazards. The major contributions of this work were presented at national and international conferences and were also published in 5 international peer-reviewed journals. National and international collaborations were also established and successfully accomplished.
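Of the global clustering measures named in this abstract, the Morisita index is the simplest to sketch. The version below is a minimal illustration on the unit square with a regular grid of quadrats; it does not implement the thesis's Validity Domain concept, and the function name, grid size and point sets are hypothetical.

```python
from collections import Counter


def morisita_index(points, cells_per_side):
    """Morisita index for 2-D points in the unit square.

    The square is split into Q = cells_per_side**2 equal quadrats and
    I = Q * sum(n_i * (n_i - 1)) / (N * (N - 1)),
    where n_i is the count in quadrat i and N the total number of points.
    """
    n_total = len(points)
    if n_total < 2:
        raise ValueError("at least two points are required")
    counts = Counter()
    for x, y in points:
        # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
        col = min(int(x * cells_per_side), cells_per_side - 1)
        row = min(int(y * cells_per_side), cells_per_side - 1)
        counts[(row, col)] += 1
    # Empty quadrats contribute zero, so only occupied cells are summed.
    pair_sum = sum(n * (n - 1) for n in counts.values())
    n_quadrats = cells_per_side ** 2
    return n_quadrats * pair_sum / (n_total * (n_total - 1))


# Hypothetical patterns: all events in one quadrat vs. evenly spread events.
clustered = [(0.1, 0.1)] * 8
regular = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)] * 2
print(morisita_index(clustered, 2), morisita_index(regular, 2))
```

Values near 1 are expected for a random pattern, values above 1 indicate clustering and values below 1 indicate a regular pattern. Computing the index over a range of quadrat sizes shows how clustering changes with scale.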

Relevance:

20.00%

Publisher:

Abstract:

Shrews of the genus Crocidura from Sicily revealed a new karyotype from Europe: 2n = 36, NF = 56, NFa = 52. With reference to the revision of Vesmanis (1976), this shrew is provisionally attributed to C. caudata Miller, 1901 and it is proposed to call it the "Sicilian shrew". Its chromosome complement is similar to that of shrews from Canary Islands and a species from Burundi (Central Africa), suggesting that it might have split off from a line of Paleotropical origin. Following these findings, the modern concept of Mediterranean island colonization by shrews must be revised. The distinctive characteristics of Mediterranean shrews should also be revised.

Relevance:

20.00%

Publisher:

Abstract:

Schizophrenia has long been considered with pessimism, but the recent interest in the early phase of psychotic disorders has modified this often unjustified perception. The literature has demonstrated the benefit of programs specialised in the treatment of early psychosis, which are being developed in many countries. It is, however, important to match them to local needs as well as to the structure of local health services. This paper reviews the elements that justify such a development in Lausanne, Switzerland, and describes its various components.

Relevance:

20.00%

Publisher:

Abstract:

Staphylococcus aureus harbors redundant adhesins mediating tissue colonization and infection. To evaluate their intrinsic role outside of the staphylococcal background, a system was designed to express them in Lactococcus lactis subsp. cremoris 1363. This bacterium is devoid of virulence factors and has a known genetic background. A new Escherichia coli-L. lactis shuttle and expression vector was constructed for this purpose. First, the high-copy-number lactococcal plasmid pIL253 was equipped with the oriColE1 origin, generating pOri253 that could replicate in E. coli. Second, the lactococcal promoters P23 or P59 were inserted at one end of the pOri253 multicloning site. Gene expression was assessed by a luciferase reporter system. The plasmid carrying P23 (named pOri23) expressed luciferase constitutively at a level 10,000 times greater than did the P59-containing plasmid. Transcription was absent in E. coli. The staphylococcal clumping factor A (clfA) gene was cloned into pOri23 and used as a model system. Lactococci carrying pOri23-clfA produced an unaltered and functional 130-kDa ClfA protein attached to their cell walls. This was indicated both by the presence of the protein in Western blots of solubilized cell walls and by the ability of ClfA-positive lactococci to clump in the presence of plasma. ClfA-positive lactococci had clumping titers (titer of 4,112) similar to those of S. aureus Newman in soluble fibrinogen and bound equally well to solid-phase fibrinogen. These experiments provide a new way to study individual staphylococcal pathogenic factors and might complement both classical knockout mutagenesis and modern in vivo expression technology and signature tag mutagenesis.

Relevance:

20.00%

Publisher:

Abstract:

The algorithmic approach to data modelling has developed rapidly in recent years; in particular, methods based on data mining and machine learning have been used in a growing number of applications. These methods follow a data-driven methodology, aiming at providing the best possible generalization and predictive abilities instead of concentrating on the properties of the data model. One of the most successful groups of such methods is known as Support Vector algorithms. Following the fruitful developments in applying Support Vector algorithms to spatial data, this paper introduces a new extension of the traditional support vector regression (SVR) algorithm. This extension allows for the simultaneous modelling of environmental data at several spatial scales. The joint influence of environmental processes presenting different patterns at different scales is learned automatically from data, providing the optimum mixture of short- and large-scale models. The method is adaptive to the spatial scale of the data. With this advantage, it can provide an efficient means to model local anomalies that may typically arise at an early phase of an environmental emergency. However, the proposed approach still requires some prior knowledge of the possible existence of such short-scale patterns. This is a possible limitation of the method for its implementation in early warning systems. The purpose of this paper is to present the multi-scale SVR model and to illustrate its use with an application to the mapping of Cs137 activity given the measurements taken in the region of Briansk following the Chernobyl accident.
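The abstract does not give the formulation of the multi-scale SVR. One common way to let a kernel method see several spatial scales at once, assumed here purely for illustration, is a non-negative weighted sum of Gaussian kernels with different bandwidths. In the sketch below the function names, bandwidths and weights are hypothetical; in the paper the mixture is said to be learned from data rather than fixed by hand.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel between two points with bandwidth sigma."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def multiscale_kernel(x, y, sigmas=(0.5, 5.0), weights=(0.5, 0.5)):
    """Weighted sum of Gaussian kernels, one per spatial scale.

    Assumption: fixed, non-negative weights. A non-negative sum of
    valid kernels is itself a valid kernel.
    """
    if len(sigmas) != len(weights):
        raise ValueError("one weight per bandwidth is required")
    return sum(w * gaussian_kernel(x, y, s) for w, s in zip(weights, sigmas))


# Hypothetical locations in km: a close pair and a more distant pair.
origin = (0.0, 0.0)
near = (0.1, 0.0)
far = (2.0, 0.0)
print(multiscale_kernel(origin, near), multiscale_kernel(origin, far))
```

For the distant pair the short-scale term is almost zero and only the large-scale term contributes, while for the close pair both terms are active. This is the mechanism by which a short-scale component can capture local anomalies on top of a smooth regional trend.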

Relevance:

20.00%

Publisher:

Abstract:

Brain metastases (BM) occur in 20-50% of NSCLC and 50-80% of SCLC. In this review, we will look at evidence-based medicine data and give some perspectives on the management of BM. We will address the problems of multiple BM, single BM and prophylactic cranial irradiation. Recursive Partitioning Analysis (RPA) is a powerful prognostic tool to facilitate treatment decisions. In multiple BM, the use of corticosteroids was established more than 40 years ago by a single randomized controlled trial (RCT). The palliative effect is high (_80%), as is the rate of side effects. Whole brain radiotherapy (WBRT) was evaluated in many RCTs with a high (60-90%) response rate; several RT regimens are equivalent, but a very high dose per fraction should be avoided. In multiple BM from SCLC, the effect of WBRT is comparable to that in NSCLC, but chemotherapy (CXT), although advocated, is probably less effective than RT. Single BM from NSCLC occurs in 30% of all BM cases; several prognostic classifications including RPA are very useful. Several options are available in single BM: WBRT, surgery (SX), radiosurgery (RS) or any combination of these. All were studied in RCTs and will be reviewed: the addition of WBRT to SX or RS gives better neurological tumour control, has little or no impact on survival, and may be more toxic. However, omitting WBRT after SX alone gives a higher risk of cerebro-spinal fluid dissemination. Prophylactic cranial irradiation (PCI) has a major role in SCLC. In limited disease, meta-analyses have shown a positive impact of PCI in decreasing brain relapse and improving survival, especially for patients in complete remission. Surprisingly, this has recently been confirmed in extensive disease as well. Experience with PCI for NSCLC is still limited, but RCTs suggest a reduction of BM with no impact on survival. The toxicity of PCI is a matter of debate, as neurological or neuro-cognitive impairment is already present prior to PCI in almost half of patients.
However, RT toxicity is probably related to total dose and dose per fraction. Perspectives: Future research should concentrate on: 1) combined modalities in multiple BM; 2) exploration of treatments in oligo-metastases; 3) further exploration of PCI in NSCLC; 4) exploration of new, toxicity-sparing radiotherapy techniques (IMRT, tomotherapy, etc.).