969 resultados para multivariate methods


Relevância:

30.00% 30.00%

Publicador:

Resumo:

When it comes to information sets in real life, often pieces of the whole set may not be available. This problem can find its origin in various reasons, describing therefore different patterns. In the literature, this problem is known as Missing Data. This issue can be fixed in various ways, from not taking into consideration incomplete observations, to guessing what those values originally were, or just ignoring the fact that some values are missing. The methods used to estimate missing data are called Imputation Methods. The work presented in this thesis has two main goals. The first one is to determine whether any kind of interactions exists between Missing Data, Imputation Methods and Supervised Classification algorithms, when they are applied together. For this first problem we consider a scenario in which the databases used are discrete, understanding discrete as that it is assumed that there is no relation between observations. These datasets underwent processes involving different combina- tions of the three components mentioned. The outcome showed that the missing data pattern strongly influences the outcome produced by a classifier. Also, in some of the cases, the complex imputation techniques investigated in the thesis were able to obtain better results than simple ones. The second goal of this work is to propose a new imputation strategy, but this time we constrain the specifications of the previous problem to a special kind of datasets, the multivariate Time Series. We designed new imputation techniques for this particular domain, and combined them with some of the contrasted strategies tested in the pre- vious chapter of this thesis. The time series also were subjected to processes involving missing data and imputation to finally propose an overall better imputation method. In the final chapter of this work, a real-world example is presented, describing a wa- ter quality prediction problem. The databases that characterized this problem had their own original latent values, which provides a real-world benchmark to test the algorithms developed in this thesis.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Elemental analysis can become an important piece of evidence to assist the solution of a case. The work presented in this dissertation aims to evaluate the evidential value of the elemental composition of three particular matrices: ink, paper and glass. In the first part of this study, the analytical performance of LIBS and LA-ICP-MS methods was evaluated for paper, writing inks and printing inks. A total of 350 ink specimens were examined including black and blue gel inks, ballpoint inks, inkjets and toners originating from several manufacturing sources and/or batches. The paper collection set consisted of over 200 paper specimens originating from 20 different paper sources produced by 10 different plants. Micro-homogeneity studies show smaller variation of elemental compositions within a single source (i.e., sheet, pen or cartridge) than the observed variation between different sources (i.e., brands, types, batches). Significant and detectable differences in the elemental profile of the inks and paper were observed between samples originating from different sources (discrimination of 87 – 100% of samples, depending on the sample set under investigation and the method applied). These results support the use of elemental analysis, using LA-ICP-MS and LIBS, for the examination of documents and provide additional discrimination to the currently used techniques in document examination. In the second part of this study, a direct comparison between four analytical methods (µ-XRF, solution-ICP-MS, LA-ICP-MS and LIBS) was conducted for glass analyses using interlaboratory studies. The data provided by 21 participants were used to assess the performance of the analytical methods in associating glass samples from the same source and differentiating different sources, as well as the use of different match criteria (confidence interval (±6s, ±5s, ±4s, ±3s, ±2s), modified confidence interval, t-test (sequential univariate, p=0.05 and p=0.01), t-test with Bonferroni correction (for multivariate comparisons), range overlap, and Hotelling’s T2 tests. Error rates (Type 1 and Type 2) are reported for the use of each of these match criteria and depend on the heterogeneity of the glass sources, the repeatability between analytical measurements, and the number of elements that were measured. The study provided recommendations for analytical performance-based parameters for µ-XRF and LA-ICP-MS as well as the best performing match criteria for both analytical techniques, which can be applied now by forensic glass examiners.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Adaptability and invisibility are hallmarks of modern terrorism, and keeping pace with its dynamic nature presents a serious challenge for societies throughout the world. Innovations in computer science have incorporated applied mathematics to develop a wide array of predictive models to support the variety of approaches to counterterrorism. Predictive models are usually designed to forecast the location of attacks. Although this may protect individual structures or locations, it does not reduce the threat—it merely changes the target. While predictive models dedicated to events or social relationships receive much attention where the mathematical and social science communities intersect, models dedicated to terrorist locations such as safe-houses (rather than their targets or training sites) are rare and possibly nonexistent. At the time of this research, there were no publically available models designed to predict locations where violent extremists are likely to reside. This research uses France as a case study to present a complex systems model that incorporates multiple quantitative, qualitative and geospatial variables that differ in terms of scale, weight, and type. Though many of these variables are recognized by specialists in security studies, there remains controversy with respect to their relative importance, degree of interaction, and interdependence. Additionally, some of the variables proposed in this research are not generally recognized as drivers, yet they warrant examination based on their potential role within a complex system. This research tested multiple regression models and determined that geographically-weighted regression analysis produced the most accurate result to accommodate non-stationary coefficient behavior, demonstrating that geographic variables are critical to understanding and predicting the phenomenon of terrorism. This dissertation presents a flexible prototypical model that can be refined and applied to other regions to inform stakeholders such as policy-makers and law enforcement in their efforts to improve national security and enhance quality-of-life.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Neuroimaging research involves analyses of huge amounts of biological data that might or might not be related with cognition. This relationship is usually approached using univariate methods, and, therefore, correction methods are mandatory for reducing false positives. Nevertheless, the probability of false negatives is also increased. Multivariate frameworks have been proposed for helping to alleviate this balance. Here we apply multivariate distance matrix regression for the simultaneous analysis of biological and cognitive data, namely, structural connections among 82 brain regions and several latent factors estimating cognitive performance. We tested whether cognitive differences predict distances among individuals regarding their connectivity pattern. Beginning with 3,321 connections among regions, the 36 edges better predicted by the individuals' cognitive scores were selected. Cognitive scores were related to connectivity distances in both the full (3,321) and reduced (36) connectivity patterns. The selected edges connect regions distributed across the entire brain and the network defined by these edges supports high-order cognitive processes such as (a) (fluid) executive control, (b) (crystallized) recognition, learning, and language processing, and (c) visuospatial processing. This multivariate study suggests that one widespread, but limited number, of regions in the human brain, supports high-level cognitive ability differences. Hum Brain Mapp, 2016. © 2016 Wiley Periodicals, Inc.