3 results for knn-menetelmä

at Université de Lausanne, Switzerland


Relevance:

10.00%

Abstract:

Digitalization and the networking of society have changed it decisively. Ways of thinking and acting in medicine are also influenced to a considerable extent by the new information technologies. What is the nature of this influence? Is the doctor-patient relationship changing? Will we end up with the "transparent patient"? Are there effects on the professions in the healthcare system? Might this technology even change the social systems fundamentally? Will we soon be ruled by intelligent technical systems? Should we (without hesitation) do everything we (technically) can or could? Where are the limits, and where must we apply the brakes? Do we need to think about a new ethics, or "has it all been here before"? The contributions in this book address these and similar questions. In a first part, future scenarios are developed as an introduction and stimulus, namely for the fields of medical imaging, medical robotics, and telemedicine. The second part reflects generally on ethical aspects of information technology, initially without direct reference to medicine. This reference is established in the third part. The focus there is on telemedicine, because it represents a suitable model for the ethical discussion of technology and medicine. The fourth part contains a discussion on the topic "Will the new developments of information technology in medicine become a curse or a blessing for society?". Finally, a fifth part summarizes the results and insights in the form of theses. These "Dresden Theses on Ethical Aspects of Telemedicine" emerged after the event, in a process lasting several months in which the editors and the authors were involved.

Relevance:

10.00%

Abstract:

BACKGROUND: With the large amount of biological data that is currently publicly available, many investigators combine multiple data sets to increase the sample size and potentially also the power of their analyses. However, technical differences ("batch effects") as well as differences in sample composition between the data sets may significantly affect the ability to draw generalizable conclusions from such studies. FOCUS: The current study focuses on the construction of classifiers and the use of cross-validation to estimate their performance. In particular, we investigate the impact of batch effects and of differences in sample composition between batches on the accuracy of the classification performance estimate obtained via cross-validation. The focus on estimation bias is the main difference from previous studies, which have mostly examined predictive performance and how it relates to the presence of batch effects. DATA: We work on simulated data sets. To obtain realistic intensity distributions, we use real gene expression data as the basis for our simulation. Random samples from this expression matrix are selected and assigned to group 1 (e.g., 'control') or group 2 (e.g., 'treated'). We introduce batch effects and select some features to be differentially expressed between the two groups. We consider several scenarios, most importantly different levels of confounding between groups and batch effects. METHODS: We focus on well-known classifiers: logistic regression, Support Vector Machines (SVM), k-nearest neighbors (kNN) and Random Forests (RF). Feature selection is performed with the Wilcoxon test or the lasso. Parameter tuning and feature selection, as well as the estimation of each classifier's prediction performance, are performed within a nested cross-validation scheme. The estimated classification performance is then compared to what is obtained when the classifier is applied to independent data.
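The nested scheme described above can be sketched minimally with scikit-learn. This is an illustration under stated assumptions, not the study's actual pipeline: the data here are purely synthetic Gaussians (not resampled gene expression), the batch effect is a simple additive shift fully confounded with the group label, and only a kNN classifier with inner-loop tuning of `n_neighbors` is shown. Under full confounding, the outer-loop estimate reflects batch separation as much as biology, which is exactly the optimistic bias the study investigates.

```python
# Minimal sketch (synthetic data, hypothetical parameters): nested CV with a
# batch effect fully confounded with the class label.
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n, p = 120, 50
X = rng.normal(size=(n, p))           # pure noise features
y = np.repeat([0, 1], n // 2)
batch = y.copy()                      # full confounding: batch == group
X[batch == 1] += 0.8                  # additive batch effect on every feature

inner = StratifiedKFold(5, shuffle=True, random_state=0)   # tuning loop
outer = StratifiedKFold(5, shuffle=True, random_state=1)   # estimation loop
model = GridSearchCV(
    Pipeline([("scale", StandardScaler()),
              ("knn", KNeighborsClassifier())]),
    {"knn__n_neighbors": [3, 5, 7]}, cv=inner)

# The outer loop estimates performance; because the batch effect tracks the
# labels, this estimate is optimistic relative to independent, batch-free data.
cv_acc = cross_val_score(model, X, y, cv=outer).mean()
print(round(cv_acc, 2))
```

Applying the same fitted model to data drawn without the batch shift would expose the gap between the cross-validated estimate and the true generalization accuracy.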

Relevance:

10.00%

Abstract:

The present research deals with an important public health threat: the pollution created by radon gas accumulation inside dwellings. The spatial modeling of indoor radon in Switzerland is particularly complex and challenging because of the many influencing factors that should be taken into account. Indoor radon data analysis must be addressed from both a statistical and a spatial point of view. As a multivariate process, it was important at first to define the influence of each factor. In particular, it was important to define the influence of geology, which is closely associated with indoor radon. This association was indeed observed for the Swiss data, but was not proved to be the sole determinant for the spatial modeling. The statistical analysis of the data, at both the univariate and multivariate level, was followed by an exploratory spatial analysis. Many tools proposed in the literature were tested and adapted, including fractality, declustering and moving-window methods. The use of the Quantité Morisita Index (QMI) was proposed as a procedure to evaluate data clustering as a function of the radon level. The existing declustering methods were revised and applied in an attempt to approach the global histogram parameters. The exploratory phase came along with the definition of multiple scales of interest for indoor radon mapping in Switzerland. The analysis was done with a top-down resolution approach, from regional to local levels, in order to find the appropriate scales for modeling. In this sense, the data partition was optimized in order to cope with the stationarity conditions of geostatistical models. Common methods of spatial modeling such as K Nearest Neighbors (KNN), variography and General Regression Neural Networks (GRNN) were proposed as exploratory tools. In the following section, different spatial interpolation methods were applied to a particular dataset.
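The QMI mentioned above builds on the classical Morisita index, a quadrat-count diagnostic of spatial clustering: values near 1 indicate a random scatter, values well above 1 indicate clustering. The following is a minimal sketch on invented point patterns, not the thesis's QMI procedure; the unit-square study domain, cell count, and data are all illustrative assumptions.

```python
# Minimal sketch of the classical Morisita quadrat index on a fixed domain.
# Hypothetical data: a uniform scatter vs. a tight cluster on the unit square.
import numpy as np

def morisita_index(x, y, n_cells=4, domain=((0, 1), (0, 1))):
    """Morisita index on an n_cells x n_cells quadrat grid over `domain`."""
    counts, _, _ = np.histogram2d(x, y, bins=n_cells, range=domain)
    n = counts.ravel()          # points per quadrat
    N = n.sum()
    Q = n.size                  # number of quadrats
    return Q * np.sum(n * (n - 1)) / (N * (N - 1))

rng = np.random.default_rng(42)
xu, yu = rng.uniform(0, 1, 500), rng.uniform(0, 1, 500)          # random
xc, yc = rng.normal(0.5, 0.05, 500), rng.normal(0.5, 0.05, 500)  # clustered
print(morisita_index(xu, yu), morisita_index(xc, yc))
```

Evaluating the index separately on points above and below a radon threshold, in the spirit of the QMI, would show whether high-radon dwellings cluster more strongly than the sampling pattern as a whole.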
A bottom-up approach to method complexity was adopted, and the results were analyzed together in order to find common definitions of continuity and neighborhood parameters. Additionally, a data filter based on cross-validation (the CVMF) was tested with the purpose of reducing noise at the local scale. At the end of the chapter, a series of tests for data consistency and method robustness was performed. This led to conclusions about the importance of data splitting and the limitations of generalization methods for reproducing statistical distributions. The last section was dedicated to modeling methods with probabilistic interpretations. Data transformation and simulation thus allowed the use of multi-Gaussian models and helped take the uncertainty of the indoor radon pollution data into consideration. The categorization transform was presented as a solution for modeling extreme values through classification. Simulation scenarios were proposed, including an alternative proposal for the reproduction of the global histogram based on the sampling domain. Sequential Gaussian simulation (SGS) was presented as the method giving the most complete information, while classification performed in a more robust way. An error measure was defined in relation to the decision function used for hardening the data classification. Among the classification methods, probabilistic neural networks (PNN) showed themselves better adapted to modeling high-threshold categorization and to automation. Support vector machines (SVM), on the contrary, performed well under balanced category conditions. In general, it was concluded that no particular prediction or estimation method is better under all conditions of scale and neighborhood definition. Simulations should be the basis, while other methods can provide complementary information to accomplish efficient indoor radon decision making.
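Multi-Gaussian approaches such as SGS require the data to be Gaussian first, which is typically achieved with a rank-based normal-score transform. A minimal sketch under stated assumptions follows: the lognormal "radon" values are synthetic stand-ins, not the Swiss data, and this shows only the forward transform, not the simulation or back-transform pipeline of the thesis.

```python
# Minimal sketch (synthetic data): rank-based normal-score transform that
# maps skewed, radon-like values onto a standard normal scale.
import numpy as np
from scipy.stats import norm

def normal_score(z):
    """Map values to standard-normal scores via their ranks."""
    ranks = np.argsort(np.argsort(z))          # 0..n-1 rank of each value
    p = (ranks + 0.5) / len(z)                 # plotting positions in (0, 1)
    return norm.ppf(p)                         # Gaussian quantiles

rng = np.random.default_rng(7)
radon = rng.lognormal(mean=4.0, sigma=1.0, size=1000)  # skewed, radon-like
y = normal_score(radon)
print(round(y.mean(), 2), round(y.std(), 2))           # ~0 and ~1
```

Simulated Gaussian values are then back-transformed through the empirical quantiles of the original data, which is how simulation outputs regain the skewed units of the measured concentrations.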