941 results for Multivariate Statistics
Abstract:
Adaptability and invisibility are hallmarks of modern terrorism, and keeping pace with its dynamic nature presents a serious challenge for societies throughout the world. Innovations in computer science have incorporated applied mathematics to develop a wide array of predictive models to support the variety of approaches to counterterrorism. Predictive models are usually designed to forecast the location of attacks. Although this may protect individual structures or locations, it does not reduce the threat; it merely changes the target. While predictive models dedicated to events or social relationships receive much attention where the mathematical and social science communities intersect, models dedicated to terrorist locations such as safe-houses (rather than their targets or training sites) are rare and possibly nonexistent. At the time of this research, there were no publicly available models designed to predict locations where violent extremists are likely to reside. This research uses France as a case study to present a complex systems model that incorporates multiple quantitative, qualitative and geospatial variables that differ in terms of scale, weight, and type. Though many of these variables are recognized by specialists in security studies, there remains controversy with respect to their relative importance, degree of interaction, and interdependence. Additionally, some of the variables proposed in this research are not generally recognized as drivers, yet they warrant examination based on their potential role within a complex system. This research tested multiple regression models and determined that geographically weighted regression analysis produced the most accurate result to accommodate non-stationary coefficient behavior, demonstrating that geographic variables are critical to understanding and predicting the phenomenon of terrorism.
This dissertation presents a flexible prototypical model that can be refined and applied to other regions to inform stakeholders such as policy-makers and law enforcement in their efforts to improve national security and enhance quality-of-life.
Abstract:
This thesis builds a framework for evaluating downside risk from multivariate data via a special class of risk measures (RM). The peculiarity of the analysis lies in getting rid of strong data distributional assumptions and in orientation towards the most critical data in risk management: those with asymmetries and heavy tails. At the same time, under typical assumptions, such as the ellipticity of the data probability distribution, the conformity with classical methods is shown. The constructed class of RM is a multivariate generalization of the coherent distortion RM, which possess valuable properties for a risk manager. The design of the framework is twofold. The first part contains new computational geometry methods for the high-dimensional data. The developed algorithms demonstrate computability of geometrical concepts used for constructing the RM. These concepts bring visuality and simplify interpretation of the RM. The second part develops models for applying the framework to actual problems. The spectrum of applications varies from robust portfolio selection up to broader spheres, such as stochastic conic optimization with risk constraints or supervised machine learning.
Abstract:
There are only a few insights concerning the influence that agronomic and management variability may have on superficial scald (SS) in pears. Abate Fétel pears were picked during three seasons (2018, 2019 and 2020) from thirty commercial orchards in the Emilia Romagna region, Italy. Using a multivariate statistical approach, high heterogeneity between farms for SS development after cold storage with regular atmosphere was demonstrated. Indeed, some factors seem to affect SS in all growing seasons: high yields, soil texture, improper irrigation and nitrogen management, use of plant growth regulators, late harvest, precipitation, calcium and cow manure, presence of nets, orchard age, training system and rootstock. Afterwards, we explored the spatiotemporal variability of fruit attributes in two pear orchards. Environmental and physiological spatial variables were recorded by a portable RTK GPS. High spatial variability of the SS index was observed. Through a geostatistical approach, some characteristics, including soil electrical conductivity and fruit size, have been shown to be negatively correlated with SS. Moreover, regression tree analyses were applied, suggesting the presence of threshold values of antioxidant capacity, total phenolic content, and acidity against SS. High pulp firmness and IAD values before storage, denoting a more immature fruit, appeared to be correlated with low SS. Finally, a convolutional neural network (CNN) was tested to detect SS and the starch pattern index (SPI) in pears for portable device applications. Preliminary statistics showed that the model for SS had low accuracy but good precision, and the CNN for SPI performed well compared to the Ctifl and Laimburg scales. The major conclusion is that Abate Fétel pears can potentially be stored in different cold rooms, according to their origin and quality features, ensuring the best fruit quality for the final consumers.
These results might lead to a substantial improvement in the Italian pear industry.
Abstract:
In acquired immunodeficiency syndrome (AIDS) studies it is quite common to observe viral load measurements collected irregularly over time. Moreover, these measurements can be subject to upper and/or lower detection limits, depending on the quantification assays. A complication arises when these continuous repeated measures have a heavy-tailed behavior. For such data structures, we propose a robust structure for a censored linear model based on the multivariate Student's t-distribution. To compensate for the autocorrelation existing among irregularly observed measures, a damped exponential correlation structure is employed. An efficient expectation-maximization-type algorithm is developed for computing the maximum likelihood estimates, obtaining as a by-product the standard errors of the fixed effects and the log-likelihood function. The proposed algorithm uses closed-form expressions at the E-step that rely on formulas for the mean and variance of a truncated multivariate Student's t-distribution. The methodology is illustrated through an application to a human immunodeficiency virus-AIDS (HIV-AIDS) study and several simulation studies.
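The abstract above names a damped exponential correlation structure but does not give its formula. As a minimal sketch, assuming the commonly used parameterization in which the correlation between measurements taken at times t_j and t_k is phi1 ** (|t_j - t_k| ** phi2), the matrix for irregularly spaced visits could be built as follows. The function name, the visit times and the parameter values are hypothetical and are not taken from the paper.

```python
def dec_correlation(times, phi1, phi2):
    """Build a damped-exponential-style correlation matrix.

    Assumed form (not stated in the abstract):
        corr(j, k) = phi1 ** (|t_j - t_k| ** phi2),  0 <= phi1 < 1,  phi2 >= 0
    """
    if not (0.0 <= phi1 < 1.0) or phi2 < 0.0:
        raise ValueError("expected 0 <= phi1 < 1 and phi2 >= 0")
    n = len(times)
    # Start from ones so the diagonal is exactly 1 for any phi2.
    corr = [[1.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j + 1, n):
            lag = abs(times[j] - times[k])
            value = phi1 ** (lag ** phi2)
            corr[j][k] = value
            corr[k][j] = value  # keep the matrix symmetric
    return corr


# Hypothetical, irregularly spaced visit times (arbitrary units).
visit_times = [0.0, 0.5, 2.0, 3.5]
R = dec_correlation(visit_times, phi1=0.6, phi2=1.0)
```

Under this assumed form, phi2 = 1 makes the correlation decay geometrically with the time lag, while phi2 = 0 gives the same correlation phi1 for every pair of distinct visit times.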
Abstract:
Resistant hypertension (RHTN) includes patients with controlled blood pressure (BP) (CRHTN) and uncontrolled BP (UCRHTN). In fact, RHTN patients are more likely to have target organ damage (TOD), and resistin, leptin and adiponectin may affect BP control in these subjects. We assessed the relationship between adipokine levels and arterial stiffness, left ventricular hypertrophy (LVH) and microalbuminuria (MA). This cross-sectional study included CRHTN (n=51) and UCRHTN (n=38) patients for evaluating body mass index, ambulatory blood pressure monitoring, plasma adiponectin, leptin and resistin concentrations, pulse wave velocity (PWV), MA and echocardiography. Leptin and resistin levels were higher in UCRHTN, whereas adiponectin levels were lower in this same subgroup. Similarly, arterial stiffness, LVH and MA were higher in the UCRHTN subgroup. Adiponectin levels negatively correlated with PWV (r=-0.42, P<0.01) and MA (r=-0.48, P<0.01) only in UCRHTN. Leptin was positively correlated with PWV (r=0.37, P=0.02) in the UCRHTN subgroup, whereas resistin was not correlated with TOD in either subgroup. Adiponectin is associated with arterial stiffness and renal injury in UCRHTN patients, whereas leptin is associated with arterial stiffness in the same subgroup. Taken together, our results showed that those adipokines may contribute to vascular and renal damage in UCRHTN patients.
Abstract:
Conventional reflectance spectroscopy (NIRS) and hyperspectral imaging (HI) in the near-infrared region (1000-2500 nm) are evaluated and compared, using, as the case study, the determination of relevant properties related to the quality of natural rubber. Mooney viscosity (MV) and plasticity indices (PI) (PI0 - original plasticity, PI30 - plasticity after accelerated aging, and PRI - the plasticity retention index after accelerated aging) of rubber were determined using multivariate regression models. Two hundred and eighty-six samples of rubber were measured using conventional and hyperspectral near-infrared imaging reflectance instruments in the range of 1000-2500 nm. The sample set was split into regression (n = 191) and external validation (n = 95) sub-sets. Three instruments were employed for data acquisition: a line scanning hyperspectral camera and two conventional FT-NIR spectrometers. Sample heterogeneity was evaluated using hyperspectral images obtained with a resolution of 150 × 150 μm and principal component analysis. The probed sample area (5 cm²; 24,000 pixels) to achieve representativeness was found to be equivalent to the average of 6 spectra for a 1 cm diameter probing circular window of one FT-NIR instrument. The other spectrophotometer can probe the whole sample in only one measurement. The results show that the rubber properties can be determined with very similar accuracy and precision by partial least squares (PLS) regression models regardless of whether HI-NIR or conventional FT-NIR produces the spectral datasets. The best Root Mean Square Errors of Prediction (RMSEPs) of external validation for MV, PI0, PI30, and PRI were 4.3, 1.8, 3.4, and 5.3%, respectively.
Though the quantitative results provided by the three instruments can be considered equivalent, the hyperspectral imaging instrument presents a number of advantages, being about 6 times faster than conventional bulk spectrometers, producing robust spectral data by ensuring sample representativeness, and minimizing the effect of the presence of contaminants.
Abstract:
Ten common doubts of chemistry students and professionals about their statistical applications are discussed. The use of the N-1 denominator instead of N is described for the standard deviation. The statistical meanings of the denominators of the root mean square error of calibration (RMSEC) and root mean square error of validation (RMSEV) are given for researchers using multivariate calibration methods. The reason why scientists and engineers use the average instead of the median is explained. Several problematic aspects of regression and correlation are treated. The popular use of triplicate experiments in teaching and research laboratories is seen to have its origin in statistical confidence intervals. Nonparametric statistics and bootstrapping methods round out the discussion.
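The N-1 versus N point in the abstract above can be made concrete with a short calculation. The following is only an illustrative sketch with made-up triplicate values, not an example from the paper. It computes the standard deviation with both denominators and compares the results with `statistics.pstdev` (divides by N) and `statistics.stdev` (divides by N-1) from the Python standard library.

```python
import math
import statistics

# Hypothetical triplicate measurements (illustrative values only).
replicates = [10.1, 9.8, 10.4]

n = len(replicates)
mean = sum(replicates) / n
sum_sq_dev = sum((x - mean) ** 2 for x in replicates)

# Denominator N: treats the three values as the whole population.
sd_population = math.sqrt(sum_sq_dev / n)

# Denominator N - 1: one degree of freedom is spent estimating the mean,
# which is the usual choice when the values are a sample.
sd_sample = math.sqrt(sum_sq_dev / (n - 1))

print(f"N denominator:   {sd_population:.4f}")
print(f"N-1 denominator: {sd_sample:.4f}")
```

With only three replicates the two estimates differ noticeably, and the gap shrinks as N grows.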
Abstract:
We estimated the prevalence of chronic diseases and other health problems reported by adolescents in relation to social and demographic variables and nutritional status. This cross-sectional population-based survey analyzed data from the Health Survey in Campinas, São Paulo State, Brazil, 2008. We used descriptive statistics and tested associations between variables with the chi-square test. Prevalence of chronic diseases among adolescents was 19.17%, with asthma showing the highest prevalence (7.59%), followed by heart disease (1.96%), hypertension (1.07%), and diabetes (0.21%). Prevalence rates were 61.53% for health problems, 40.39% for allergy, and 24.83% for frequent headache or migraine. After multivariate analysis using Poisson regression, the factors associated with chronic disease were age 15 to 19 years (PR = 1.38), not attending school (PR = 1.46), having children (PR = 1.84), and obesity (PR = 1.54). Female gender (PR = 1.12) was statistically associated with health problems. The study illustrates that adolescence is a life stage in which chronic disease and health problems can occur.
Abstract:
Universidade Estadual de Campinas. Faculdade de Educação Física
Abstract:
Universidade Estadual de Campinas. Faculdade de Educação Física
Abstract:
The aim of the present study was to evaluate the effect of soil characteristics (pH, macro- and micro-nutrients), environmental factors (temperature, humidity, period of the year and time of day of collection) and meteorological conditions (rain, sun, cloud and cloud/rain) on the flavonoid content of leaves of Passiflora incarnata L., Passifloraceae. The total flavonoid contents of leaf samples harvested from plants cultivated or collected under different conditions were quantified by high-performance liquid chromatography with ultraviolet detection (HPLC-UV/PAD). Chemometric treatment of the data by principal component (PCA) and hierarchic cluster analyses (HCA) showed that the samples did not present a specific classification in relation to the environmental and soil variables studied, and that the environmental variables were not significant in describing the data set. However, the levels of the elements Fe, B and Cu present in the soil showed an inverse correlation with the total flavonoid contents of the leaves of P. incarnata.
Abstract:
Background: Genome-wide association studies (GWAS) are becoming the approach of choice to identify genetic determinants of complex phenotypes and common diseases. The astonishing amount of generated data and the use of distinct genotyping platforms with variable genomic coverage are still analytical challenges. Imputation algorithms combine information from directly genotyped markers with the haplotypic structure of the population of interest to infer a badly genotyped or missing marker, and are considered a near-zero-cost approach to allow the comparison and combination of data generated in different studies. Several reports stated that imputed markers have an overall acceptable accuracy, but no published report has performed a pairwise comparison of imputed and empirical association statistics of a complete set of GWAS markers. Results: In this report we identified a total of 73 imputed markers that yielded a nominally statistically significant association at P < 10⁻⁵ for type 2 diabetes mellitus and compared them with results obtained based on empirical allelic frequencies. Interestingly, despite their overall high correlation, association statistics based on imputed frequencies were discordant in 35 of the 73 (47%) associated markers, considerably inflating the type I error rate of imputed markers. We comprehensively tested several quality thresholds, the haplotypic structure underlying imputed markers and the use of flanking markers as predictors of inaccurate association statistics derived from imputed markers. Conclusions: Our results suggest that association statistics from imputed markers that show a specific minor allele frequency (MAF) range, are located in weak linkage disequilibrium blocks, or strongly deviate from local patterns of association are prone to inflated false positive association signals.
The present study highlights the potential of imputation procedures and proposes simple procedures for selecting the best imputed markers for follow-up genotyping studies.
Abstract:
The existence of juxtaposed regions of distinct cultures, in spite of the fact that people's beliefs tend to become more similar to each other's as the individuals interact repeatedly, is a puzzling phenomenon in the social sciences. Here we study an extreme version of the frequency-dependent bias model of social influence in which an individual adopts the opinion shared by the majority of the members of its extended neighborhood, which includes the individual itself. This is a variant of the majority-vote model in which the individual retains its opinion in case there is a tie among the neighbors' opinions. We assume that the individuals are fixed in the sites of a square lattice of linear size L and that they interact with their nearest neighbors only. Within a mean-field framework, we derive the equations of motion for the density of individuals adopting a particular opinion in the single-site and pair approximations. Although the single-site approximation predicts a single opinion domain that takes over the entire lattice, the pair approximation yields a qualitatively correct picture with the coexistence of different opinion domains and a strong dependence on the initial conditions. Extensive Monte Carlo simulations indicate the existence of a rich distribution of opinion domains or clusters, the number of which grows with L² whereas the size of the largest cluster grows with ln L². The analysis of the sizes of the opinion domains shows that they obey a power-law distribution for not too large sizes but that they are exponentially distributed in the limit of very large clusters. In addition, as in another well-known social influence model, Axelrod's model, we found that these opinion domains are unstable to the effect of a thermal-like noise.
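The update rule described in the abstract above can be sketched in a few lines. This is only an illustrative reading of the rule, not the authors' simulation code. It assumes two opinions coded as +1 and -1, periodic boundary conditions, and random sequential updates, none of which is stated in the abstract. The function names are hypothetical.

```python
import random


def majority_update(lattice, i, j):
    """Opinion adopted by site (i, j): the majority among the site itself
    and its four nearest neighbours (periodic boundaries assumed)."""
    size = len(lattice)
    total = (
        lattice[i][j]
        + lattice[(i - 1) % size][j]
        + lattice[(i + 1) % size][j]
        + lattice[i][(j - 1) % size]
        + lattice[i][(j + 1) % size]
    )
    # Five votes of +1/-1 never sum to zero. When the four neighbours are
    # split 2-2, the site's own vote decides, so it keeps its opinion.
    return 1 if total > 0 else -1


def sweep(lattice, rng):
    """One Monte Carlo sweep: size*size random single-site updates.
    Returns the number of sites that changed opinion."""
    size = len(lattice)
    changed = 0
    for _ in range(size * size):
        i, j = rng.randrange(size), rng.randrange(size)
        new_opinion = majority_update(lattice, i, j)
        if new_opinion != lattice[i][j]:
            lattice[i][j] = new_opinion
            changed += 1
    return changed


# Small demonstration with a random initial condition (seeded for repeatability).
rng = random.Random(0)
size = 16
lattice = [[rng.choice((-1, 1)) for _ in range(size)] for _ in range(size)]
for _ in range(50):
    if sweep(lattice, rng) == 0:
        break  # no flips in this sweep
```

Counting the site's own vote is what implements the tie rule: with an even neighbour split the site keeps its current opinion, which is the feature that distinguishes this variant from a plain majority vote among neighbours.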
Abstract:
Background: The aim of this study was to estimate the prevalence of fibromyalgia, as well as to assess the major symptoms of this syndrome in an adult, low socioeconomic status population assisted by the primary health care system in a city in Brazil. Methods: We cross-sectionally sampled individuals assisted by the public primary health care system (n = 768, 35-60 years old). Participants were interviewed by phone and screened about pain. They were then invited to be clinically assessed (304 accepted). Pain was estimated using a Visual Analogue Scale (VAS). Fibromyalgia was assessed using the Fibromyalgia Impact Questionnaire (FIQ), as well as screening for tender points using dolorimetry. Statistical analyses included Bayesian statistics and the Kruskal-Wallis ANOVA test (significance level = 5%). Results: From the phone-interview screening, we divided participants (n = 768) into three groups: No Pain (NP) (n = 185); Regional Pain (RP) (n = 388) and Widespread Pain (WP) (n = 106). Among those participating in the clinical assessments (304 subjects), the prevalence of fibromyalgia (FM) was 4.4% (95% confidence interval [2.6%; 6.3%]). Symptoms of pain (VAS and FIQ), feeling well, job ability, fatigue, morning tiredness, stiffness, anxiety and depression were statistically different among the groups. In multivariate analyses we found that individuals with FM and WP had significantly higher impairment than those with RP and NP. FM and WP were similarly disabling. Similarly, RP was not significantly different from NP. Conclusion: Fibromyalgia is prevalent in the low socioeconomic status population assisted by the public primary health care system. Prevalence (4.4%) was similar to that reported in other studies of more socioeconomically diverse populations. FM and WP have a significant impact on individuals' well-being.