977 results for Higgs boson, statistics, multivariate methods, ATLAS
Abstract:
Companies that aim to secure and improve their position in an increasingly competitive market need to stay up to date and in constant evolution. In this continuous pursuit of evolution, they invest in Research & Development (R&D) projects and in their human capital to foster creativity and organizational innovation. People play a fundamental role in the development of innovation, but for it to flourish consistently, commitment and creativity in the generation of ideas are required. Creativity is thinking up the new; innovation is making it happen. However, finding people with these qualities is not always easy, and it is often necessary to stimulate these skills and characteristics so that people become genuinely creative. Undergraduate programs can be an important tool for developing these aspects, characteristics, and skills by using teaching methods and practices that support the development of creativity, since the teaching-learning environment weighs significantly in people's formation. The objective of this study is to identify which factors most influence the development of creativity in an undergraduate business administration program, analyzing the influence of the instructors' pedagogical practices and the students' internal barriers. The theoretical framework draws mainly on the work of Alencar, Fleith, Torrance, and Wechsler. This cross-sectional, quantitative survey targeted students of the Business Administration program at a confessional university in Greater São Paulo, who answered 465 questionnaires composed of three scales. For teaching practices, the Teaching Practices in Relation to Creativity scale was adapted. For internal barriers, the Barriers to Personal Creativity scale was adapted. To analyze the perceived development of creativity, a scale based on the reference characteristics of a creative person was constructed and validated.
Descriptive statistics and exploratory factor analyses were performed in the Statistical Package for the Social Sciences (SPSS), while the confirmatory factor analyses and the measurement of the influence of pedagogical practices and internal barriers on the perceived development of creativity were carried out by structural equation modeling using the Partial Least Squares (PLS) algorithm in Smart PLS 2.0. The results showed that pedagogical practices and the students' internal barriers explain 40% of the perceived development of creativity, with pedagogical practices exerting the greater influence. The research also showed that the subject area and the semester in which the student is enrolled have no influence on any of the three constructs; only the instructor influences the pedagogical practices.
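The headline figure in this abstract, that the two predictors explain 40% of the perceived development of creativity, is an R² (variance explained) statistic from the structural model. As a minimal sketch of how R² is computed from observed versus model-predicted construct scores (all numbers below are invented for illustration; the study itself used Smart PLS 2.0 on 465 responses):

```python
# Toy illustration of the R-squared (coefficient of determination) statistic:
# R^2 = 1 - SS_res / SS_tot. The scores below are made up, not study data.

def r_squared(observed, predicted):
    """Share of variance in `observed` accounted for by `predicted`."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)   # total variation
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_res / ss_tot

observed  = [3.0, 4.0, 2.5, 5.0, 3.5, 4.5]   # hypothetical construct scores
predicted = [3.2, 3.8, 3.0, 4.4, 3.4, 4.2]   # hypothetical model predictions
r2 = r_squared(observed, predicted)
```

An R² of 0.40, as reported above, would mean that 60% of the variation in the outcome construct is left unexplained by the model.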
Abstract:
A major problem in modern probabilistic modeling is the huge computational complexity involved in typical calculations with multivariate probability distributions when the number of random variables is large. Because exact computations are infeasible in such cases and Monte Carlo sampling techniques may reach their limits, there is a need for methods that allow for efficient approximate computations. One of the simplest approximations is based on the mean field method, which has a long history in statistical physics. The method is widely used, particularly in the growing field of graphical models. Researchers from disciplines such as statistical physics, computer science, and mathematical statistics are studying ways to improve this and related methods and are exploring novel application areas. Leading approaches include the variational approach, which goes beyond factorizable distributions to achieve systematic improvements; the TAP (Thouless-Anderson-Palmer) approach, which incorporates correlations by including effective reaction terms in the mean field theory; and the more general methods of graphical models. Bringing together ideas and techniques from these diverse disciplines, this book covers the theoretical foundations of advanced mean field methods, explores the relation between the different approaches, examines the quality of the approximation obtained, and demonstrates their application to various areas of probabilistic modeling.
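The naive mean field method the abstract starts from can be stated compactly: each variable is updated to be consistent with the average effect of its neighbors. For an Ising-type model this is the self-consistency iteration m_i = tanh(β(Σ_j J_ij m_j + h_i)). A minimal sketch with arbitrary illustrative couplings and fields (not from the book):

```python
import math

# Naive mean-field fixed-point iteration for a tiny Ising-type model:
# m_i = tanh(beta * (sum_j J_ij * m_j + h_i)). Values are illustrative only.

def mean_field(J, h, beta, iters=200):
    n = len(h)
    m = [0.1] * n                                  # initial magnetizations
    for _ in range(iters):                         # fixed-point sweeps
        m = [math.tanh(beta * (sum(J[i][j] * m[j] for j in range(n)) + h[i]))
             for i in range(n)]
    return m

J = [[0.0, 0.5, 0.2],     # symmetric pairwise couplings
     [0.5, 0.0, 0.3],
     [0.2, 0.3, 0.0]]
h = [0.1, -0.2, 0.05]     # external fields
m = mean_field(J, h, beta=1.0)
```

The TAP approach mentioned above improves on this by adding an Onsager reaction term to each update; the variational view interprets the same iteration as coordinate ascent on a factorized approximating distribution.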
Abstract:
This accessible, practice-oriented and compact text provides a hands-on introduction to the principles of market research. Using the market research process as a framework, the authors explain how to collect and describe the necessary data and present the most important and frequently used quantitative analysis techniques, such as ANOVA, regression analysis, factor analysis, and cluster analysis. An explanation is provided of the theoretical choices a market researcher has to make with regard to each technique, as well as how these are translated into actions in IBM SPSS Statistics. This includes a discussion of what the outputs mean and how they should be interpreted from a market research perspective. Each chapter concludes with a case study that illustrates the process based on real-world data. A comprehensive web appendix includes additional analysis techniques, datasets, video files and case studies. Several mobile tags in the text allow readers to quickly browse related web content using a mobile device.
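Of the techniques listed, one-way ANOVA is the simplest to show end to end: it compares between-group to within-group variability via an F ratio, the same quantity SPSS reports in its ANOVA output table. A hand-computed sketch on made-up group data:

```python
# One-way ANOVA F statistic computed by hand on invented data:
# F = (between-group mean square) / (within-group mean square).

def anova_f(groups):
    k = len(groups)                                # number of groups
    n = sum(len(g) for g in groups)                # total observations
    grand = sum(sum(g) for g in groups) / n        # grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

groups = [[5.1, 4.8, 5.5],    # e.g., ratings under three marketing stimuli
          [6.2, 6.0, 6.4],
          [4.0, 4.3, 3.9]]
f = anova_f(groups)
```

A large F (relative to the F distribution with k-1 and n-k degrees of freedom) indicates that at least one group mean differs from the others.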
Abstract:
Dissolved organic matter (DOM) in groundwater and surface water samples from the Florida coastal Everglades was studied using excitation–emission matrix fluorescence modeled through parallel factor analysis (EEM-PARAFAC). DOM in both surface and groundwater from the eastern Everglades S332 basin reflected a terrestrial-derived fingerprint through dominantly higher abundances of humic-like PARAFAC components. In contrast, surface water DOM from northeastern Florida Bay featured a microbial-derived DOM signature based on the higher abundance of microbial humic-like and protein-like components consistent with its marine source. Surprisingly, groundwater DOM from northeastern Florida Bay reflected a terrestrial-derived source, except for samples from the central Florida Bay well, which mirrored a combination of terrestrial and marine end-member origins. Furthermore, surface water and groundwater displayed the effects of different degradation pathways, such as photodegradation and biodegradation, as exemplified by two PARAFAC components seemingly indicative of such degradation processes. Finally, Principal Component Analysis of the EEM-PARAFAC data was able to distinguish and classify most of the samples according to DOM origins and the degradation processes experienced, except for a small overlap of S332 surface water and groundwater, implying rather active surface-to-ground water interaction at some sites, particularly during the rainy season. This study highlights that EEM-PARAFAC can be used successfully to trace and differentiate DOM from diverse sources across both horizontal and vertical flow profiles, and as such could be a convenient and useful tool for better understanding hydrological interactions and carbon biogeochemical cycling.
Abstract:
Constant technology advances have caused a data explosion in recent years. Accordingly, modern statistical and machine learning methods must be adapted to deal with complex and heterogeneous data types. This is particularly true for analyzing biological data. For example, DNA sequence data can be viewed as categorical variables, with each nucleotide taking one of four categories. Gene expression data, depending on the quantification technology, can be continuous numbers or counts. With the advancement of high-throughput technology, the abundance of such data has become unprecedentedly rich. Therefore, efficient statistical approaches are crucial in this big-data era.
Previous statistical methods for big data often aim to find low-dimensional structures in the observed data. For example, a factor analysis model assumes a latent Gaussian-distributed multivariate vector; under this assumption, the factor model produces a low-rank estimate of the covariance of the observed variables. Another example is the latent Dirichlet allocation model for documents, which assumes mixture proportions of topics represented by a Dirichlet-distributed variable. This dissertation proposes several novel extensions of these statistical methods, developed to address challenges in big data. The new methods are applied in multiple real-world applications, including the construction of condition-specific gene co-expression networks, estimating shared topics among newsgroups, analysis of promoter sequences, analysis of political-economic risk data, and estimating population structure from genotype data.
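The low-rank structure a factor model imposes can be made concrete: with loadings Λ (p variables by k factors, k ≪ p) and diagonal unique variances Ψ, the implied covariance is Σ = ΛΛᵀ + Ψ. A small sketch with illustrative (not estimated) loadings:

```python
# Covariance implied by a one-factor model: Sigma = Lambda @ Lambda.T + Psi.
# Off-diagonal entries come entirely from the rank-k (here rank-1) term,
# which is where the dimension reduction lives. Numbers are illustrative.

def implied_covariance(Lam, psi_diag):
    p, k = len(Lam), len(Lam[0])
    sigma = [[sum(Lam[i][r] * Lam[j][r] for r in range(k)) for j in range(p)]
             for i in range(p)]                    # Lambda @ Lambda.T
    for i in range(p):
        sigma[i][i] += psi_diag[i]                 # add unique variances
    return sigma

Lam = [[0.9], [0.8], [0.7], [0.6]]    # p = 4 observed variables, k = 1 factor
psi = [0.19, 0.36, 0.51, 0.64]        # chosen so each variance equals 1
sigma = implied_covariance(Lam, psi)
```

With p variables, the model needs only p·k loadings plus p unique variances instead of p(p+1)/2 free covariance parameters, which is the sense in which the estimate is low-rank.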
Abstract:
Continuous variables are among the major data types collected by survey organizations. They can be incomplete, so that the data collectors need to fill in the missing values, or they can contain sensitive information that needs protection from re-identification. One approach to protecting continuous microdata is to sum the values according to different cells of features. In this thesis, I present novel methods of multiple imputation (MI) that can be applied to impute missing values and to synthesize confidential values for continuous and magnitude data.
The first method is for limiting the disclosure risk of the continuous microdata whose marginal sums are fixed. The motivation for developing such a method comes from the magnitude tables of non-negative integer values in economic surveys. I present approaches based on a mixture of Poisson distributions to describe the multivariate distribution so that the marginals of the synthetic data are guaranteed to sum to the original totals. At the same time, I present methods for assessing disclosure risks in releasing such synthetic magnitude microdata. The illustration on a survey of manufacturing establishments shows that the disclosure risks are low while the information loss is acceptable.
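The fixed-marginal constraint described above has a classical probabilistic handle: conditional on their sum, independent Poisson counts with means μᵢ are jointly multinomial with probabilities μᵢ/Σμ. A hedged sketch of synthesizing cells that are guaranteed to reproduce the original total (the thesis itself uses a mixture of Poisson distributions; the plain-Poisson/multinomial version below only illustrates the sum-preservation idea):

```python
import random

# Conditional on the total, independent Poisson cells with means mus are
# multinomial(total, mus / sum(mus)). Allocating the fixed total this way
# guarantees the synthetic cells sum to the original marginal. Illustrative.

def synthesize_with_fixed_total(total, mus, seed=7):
    rng = random.Random(seed)
    probs = [m / sum(mus) for m in mus]
    counts = [0] * len(mus)
    for _ in range(total):                         # allocate one unit at a time
        counts[rng.choices(range(len(mus)), weights=probs)[0]] += 1
    return counts

original_cells = [120, 45, 300, 35]    # made-up magnitude cells (e.g., shipments)
total = sum(original_cells)            # published marginal that must be preserved
synthetic = synthesize_with_fixed_total(total, original_cells)
```

Using the original cells themselves as the means here is only for brevity; in practice the means would come from a fitted model, which is what creates the distance between synthetic and confidential values that limits disclosure risk.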
The second method is for releasing synthetic continuous microdata by a nonstandard MI method. Traditionally, MI fits a model on the confidential values and then generates multiple synthetic datasets from this model. Its disclosure risk tends to be high, especially when the original data contain extreme values. I present a nonstandard MI approach conditioned on protective intervals. Its basic idea is to estimate the model parameters from these intervals rather than from the confidential values. The encouraging results of simple simulation studies suggest the potential of this new approach for limiting the posterior disclosure risk.
The third method is for imputing missing values in continuous and categorical variables. It extends a hierarchically coupled mixture model with local dependence. The new method separates the variables into non-focused (e.g., almost fully observed) and focused (e.g., frequently missing) ones. The sub-model structure for the focused variables is more complex than that for the non-focused ones. At the same time, their cluster indicators are linked together by tensor factorization, and the focused continuous variables depend locally on non-focused values. The model properties suggest that moving the strongly associated non-focused variables to the side of the focused ones can help to improve estimation accuracy, which is examined in several simulation studies. Finally, this method is applied to data from the American Community Survey.
Abstract:
Thesis (Ph.D.)--University of Washington, 2016-08
Abstract:
This dissertation proposes statistical methods to formulate, estimate, and apply complex transportation models. Two main problems are part of the analyses conducted and presented in this dissertation. The first method solves an econometric problem and is concerned with the joint estimation of models that contain both discrete and continuous decision variables. The use of ordered models along with a regression is proposed, and their effectiveness is evaluated with respect to unordered models. Procedures to calculate and optimize the log-likelihood functions of both discrete-continuous approaches are derived, and the difficulties associated with the estimation of unordered models are explained. Numerical approximation methods based on the Genz algorithm are implemented in order to solve the multidimensional integral associated with the unordered modeling structure. The problems deriving from the lack of smoothness of the probit model around the maximum of the log-likelihood function, which makes the optimization and the calculation of standard deviations very difficult, are carefully analyzed. A methodology to perform out-of-sample validation in the context of a joint model is proposed. Comprehensive numerical experiments have been conducted on both simulated and real data. In particular, the discrete-continuous models are estimated and applied to vehicle ownership and use models on data extracted from the 2009 National Household Travel Survey. The second part of this work offers a comprehensive statistical analysis of free-flow speed distribution; the method is applied to data collected on a sample of roads in Italy. A linear mixed model that includes speed quantiles in its predictors is estimated. Results show that there is no road effect in the analysis of free-flow speeds, which is particularly important for model transferability.
A very general framework to predict random effects with few observations and incomplete access to model covariates is formulated and applied to predict the distribution of free-flow speed quantiles. The speed distribution of most road sections is successfully predicted; jack-knife estimates are calculated and used to explain why some sections are poorly predicted. Ultimately, this work contributes to the literature in transportation modeling by proposing econometric model formulations for discrete-continuous variables, more efficient methods for the calculation of multivariate normal probabilities, and random effects models for free-flow speed estimation that take into account the survey design. All methods are rigorously validated on both real and simulated data.
Abstract:
Background: The evidence base on end-of-life care in acute stroke is limited, particularly with regard to recognising dying and related decision-making. There is also limited evidence to support the use of end-of-life care pathways (standardised care plans) for patients who are dying after stroke. Aim: This study aimed to explore the clinical decision-making involved in placing patients on an end-of-life care pathway, evaluate predictors of care pathway use, and investigate the role of families in decision-making. The study also aimed to examine experiences of end-of-life care pathway use for stroke patients, their relatives and the multi-disciplinary health care team. Methods: A mixed methods design was adopted. Data were collected in four Scottish acute stroke units. Case-notes were identified prospectively from 100 consecutive stroke deaths and reviewed. Multivariate analysis was performed on case-note data. Semi-structured interviews were conducted with 17 relatives of stroke decedents and 23 healthcare professionals, using a modified grounded theory approach to collect and analyse data. The VOICES survey tool was also administered to the bereaved relatives and data were analysed using descriptive statistics and thematic analysis of free-text responses. Results: Relatives often played an important role in influencing aspects of end-of-life care, including decisions to use an end-of-life care pathway. Some relatives experienced enduring distress with their perceived responsibility for care decisions. Relatives felt unprepared for and were distressed by prolonged dying processes, which were often associated with severe dysphagia. Pro-active information-giving by staff was reported as supportive by relatives. Healthcare professionals generally avoided discussing place of care with families. 
Decisions to use an end-of-life care pathway were not predicted by patients’ demographic characteristics; decisions were generally made in consultation with families and the extended health care team, and were made within regular working hours. Conclusion: Distressing stroke-related issues were more prominent in participants’ accounts than concerns with the end-of-life care pathway used. Relatives sometimes perceived themselves as responsible for important clinical decisions. Witnessing prolonged dying processes was difficult for healthcare professionals and families, particularly in relation to the management of persistent major swallowing difficulties.
Abstract:
Adaptability and invisibility are hallmarks of modern terrorism, and keeping pace with its dynamic nature presents a serious challenge for societies throughout the world. Innovations in computer science have incorporated applied mathematics to develop a wide array of predictive models to support the variety of approaches to counterterrorism. Predictive models are usually designed to forecast the location of attacks. Although this may protect individual structures or locations, it does not reduce the threat; it merely changes the target. While predictive models dedicated to events or social relationships receive much attention where the mathematical and social science communities intersect, models dedicated to terrorist locations such as safe-houses (rather than their targets or training sites) are rare and possibly nonexistent. At the time of this research, there were no publicly available models designed to predict locations where violent extremists are likely to reside. This research uses France as a case study to present a complex systems model that incorporates multiple quantitative, qualitative, and geospatial variables that differ in terms of scale, weight, and type. Though many of these variables are recognized by specialists in security studies, there remains controversy with respect to their relative importance, degree of interaction, and interdependence. Additionally, some of the variables proposed in this research are not generally recognized as drivers, yet they warrant examination based on their potential role within a complex system. This research tested multiple regression models and determined that geographically weighted regression analysis produced the most accurate result, accommodating non-stationary coefficient behavior and demonstrating that geographic variables are critical to understanding and predicting the phenomenon of terrorism.
This dissertation presents a flexible prototypical model that can be refined and applied to other regions to inform stakeholders such as policy-makers and law enforcement in their efforts to improve national security and enhance quality-of-life.
Abstract:
In acquired immunodeficiency syndrome (AIDS) studies, it is quite common to observe viral load measurements collected irregularly over time. Moreover, these measurements can be subject to upper and/or lower detection limits, depending on the quantification assays. A complication arises when these continuous repeated measures have heavy-tailed behavior. For such data structures, we propose a robust structure for a censored linear model based on the multivariate Student's t-distribution. To accommodate the autocorrelation among irregularly observed measures, a damped exponential correlation structure is employed. An efficient expectation-maximization-type algorithm is developed for computing the maximum likelihood estimates, obtaining as by-products the standard errors of the fixed effects and the log-likelihood function. The proposed algorithm uses closed-form expressions at the E-step that rely on formulas for the mean and variance of a truncated multivariate Student's t-distribution. The methodology is illustrated through an application to a Human Immunodeficiency Virus (HIV)-AIDS study and several simulation studies.
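The damped exponential correlation structure mentioned here has a compact form: corr(tᵢ, tⱼ) = φ^(|tᵢ − tⱼ|^θ) with 0 < φ < 1. It reduces to AR(1)-type exponential decay when θ = 1 and approaches compound symmetry as θ → 0, which is why it suits irregularly spaced measurements. A sketch with illustrative parameter values (not estimates from the paper):

```python
# Damped exponential correlation: corr = phi ** (|lag| ** theta).
# theta = 1 gives AR(1)-style decay; theta -> 0 flattens toward a constant
# correlation. phi and theta below are arbitrary illustrative values.

def damped_exp_corr(lag, phi, theta):
    return phi ** (abs(lag) ** theta)

lags = [0.0, 0.5, 1.0, 2.0, 4.0]   # irregular gaps between viral load visits
row = [damped_exp_corr(t, phi=0.6, theta=0.8) for t in lags]
```

Because the correlation depends only on the actual time gap, two visits half a year apart are modeled as more correlated than two visits four years apart, regardless of how many visits fall in between.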
Abstract:
Frankfurters are widely consumed all over the world, and their production requires a wide range of meat and non-meat ingredients. Because of these characteristics, frankfurters are products that can easily be adulterated with lower-value meats and undeclared species. Such adulterations are often difficult to detect, because the adulterant components are usually very similar to the authentic product. In this work, FT-Raman spectroscopy was employed as a rapid technique for assessing the quality of frankfurters. Based on the information provided by the Raman spectra, a multivariate classification model was developed to identify the frankfurter type. The aim was to study three types of frankfurters (chicken, turkey, and mixed meat) according to their Raman spectra, based on the vibrational bands of fats. The classification model was built using partial least squares discriminant analysis (PLS-DA), and its performance was evaluated in terms of sensitivity, specificity, accuracy, efficiency, and the Matthews correlation coefficient. The PLS-DA models gave sensitivity and specificity values on the test set in the range of 88%-100%, showing the good performance of the classification models. This work shows that Raman spectroscopy with chemometric tools can be used as an analytical tool in the quality control of frankfurters.
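The evaluation metrics named in this abstract are all functions of a confusion matrix. A self-contained sketch computing sensitivity, specificity, accuracy, and the Matthews correlation coefficient for one class from a made-up confusion matrix (not the paper's actual test-set counts):

```python
import math

# Binary classification metrics from a confusion matrix (tp, fp, tn, fn).
# MCC ranges from -1 to 1 and stays informative under class imbalance.
# The counts below are invented for illustration.

def metrics(tp, fp, tn, fn):
    sens = tp / (tp + fn)                          # true positive rate
    spec = tn / (tn + fp)                          # true negative rate
    acc = (tp + tn) / (tp + fp + tn + fn)
    mcc = ((tp * tn - fp * fn) /
           math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return sens, spec, acc, mcc

sens, spec, acc, mcc = metrics(tp=45, fp=3, tn=47, fn=5)
```

For a three-class PLS-DA model such as the one described (chicken, turkey, mixed meat), these metrics are typically computed one-versus-rest for each class.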
Abstract:
Ten common doubts of chemistry students and professionals about their statistical applications are discussed. The use of the N-1 denominator instead of N is described for the standard deviation. The statistical meaning of the denominators of the root mean square error of calibration (RMSEC) and root mean square error of validation (RMSEV) are given for researchers using multivariate calibration methods. The reason why scientists and engineers use the average instead of the median is explained. Several problematic aspects about regression and correlation are treated. The popular use of triplicate experiments in teaching and research laboratories is seen to have its origin in statistical confidence intervals. Nonparametric statistics and bootstrapping methods round out the discussion.
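The first doubt listed, why the standard deviation divides by N − 1 rather than N, comes down to Bessel's correction: dividing the squared deviations from the *sample* mean by N systematically underestimates the population variance, and N − 1 removes that bias. A short sketch on an invented triplicate measurement:

```python
import math

# Sample standard deviation with ddof = 1 (divide by N - 1, Bessel's
# correction) versus the biased ddof = 0 version (divide by N).
# The replicate values are invented for illustration.

def std(xs, ddof):
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (len(xs) - ddof))

replicates = [10.12, 10.08, 10.15]     # a triplicate, as discussed above
s_n1 = std(replicates, ddof=1)         # the usual reported value
s_n  = std(replicates, ddof=0)         # always smaller on the same data
```

The gap between the two is largest for small N, which is exactly the triplicate regime common in teaching and research laboratories.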
Abstract:
Determining the sex of mosquitoes of the family Culicidae is important in faunal and epidemiological studies, since only the females are competent vectors of pathogens. Sexual dimorphism of the genitalia and cephalic appendages is, in general, easily visible in culicids. The wings can also be dimorphic and could thus complement the sexing procedure; however, this distinction is not easily noticeable by direct observation. To formally describe wing sexual dimorphism in Aedes scapularis, a culicid that is vectorially competent for arboviruses and filariae, wings of males and females were compared using geometric morphometrics and multivariate statistical analysis. In these analyses, populations from the municipalities of São Paulo and Pariquera-Açu (São Paulo State) were sampled. Wing shape showed clear sexual dimorphism, which yielded 100% accuracy in blind reclassification tests, regardless of geographic origin. Wing size, in contrast, was sexually dimorphic only in the São Paulo population. Apparently, wing shape is evolutionarily more stable than wing size, an interpretation consistent with Dujardin's (2008b) theory that insect wing shape is composed of quantitative genetic characters and is little influenced by non-genetic factors, whereas wing size is predominantly determined by plasticity arising from environmental influences.
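A blind reclassification test of the kind reported here can be sketched with a simple nearest-centroid rule: each wing is assigned to the sex whose mean shape it lies closest to, and accuracy is scored on specimens not used to form the means. The 2-D "shape coordinates" below are synthetic stand-ins for the morphometric variables actually used, and the classifier is a generic illustration, not the study's discriminant method:

```python
# Nearest-centroid reclassification on toy 2-D shape coordinates.
# Real geometric-morphometric data would be higher-dimensional landmark
# configurations; everything below is invented for illustration.

def nearest_centroid(train, labels, point):
    cents = {}
    for lab in set(labels):
        pts = [p for p, l in zip(train, labels) if l == lab]
        cents[lab] = [sum(c) / len(pts) for c in zip(*pts)]   # mean shape
    # assign to the label whose centroid is closest (squared distance)
    return min(cents, key=lambda lab: sum((a - b) ** 2
                                          for a, b in zip(cents[lab], point)))

train = [(0.1, 0.2), (0.2, 0.1), (0.9, 1.0), (1.0, 0.9)]
labels = ["male", "male", "female", "female"]
assigned = nearest_centroid(train, labels, (0.15, 0.15))
```

A 100% blind-reclassification rate, as found for wing shape above, means every held-out specimen lands on the correct side of this kind of decision rule.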
Abstract:
Background: The aim of this study was to estimate the prevalence of fibromyalgia, as well as to assess the major symptoms of this syndrome, in an adult, low-socioeconomic-status population assisted by the primary health care system in a city in Brazil. Methods: We cross-sectionally sampled individuals assisted by the public primary health care system (n = 768, 35-60 years old). Participants were interviewed by phone and screened about pain. They were then invited to be clinically assessed (304 accepted). Pain was estimated using a Visual Analogue Scale (VAS). Fibromyalgia was assessed using the Fibromyalgia Impact Questionnaire (FIQ), as well as by screening for tender points using dolorimetry. Statistical analyses included Bayesian statistics and the Kruskal-Wallis ANOVA test (significance level = 5%). Results: Based on the phone-interview screening, we divided participants (n = 768) into three groups: No Pain (NP) (n = 185), Regional Pain (RP) (n = 388), and Widespread Pain (WP) (n = 106). Among those participating in the clinical assessments (304 subjects), the prevalence of fibromyalgia was 4.4% (95% confidence interval [2.6%; 6.3%]). Symptoms of pain (VAS and FIQ), feeling well, job ability, fatigue, morning tiredness, stiffness, anxiety, and depression were statistically different among the groups. In multivariate analyses, we found that individuals with FM and WP had significantly higher impairment than those with RP and NP. FM and WP were similarly disabling; likewise, RP was not significantly different from NP. Conclusion: Fibromyalgia is prevalent (4.4%) in the low-socioeconomic-status population assisted by the public primary health care system, a prevalence similar to that reported in studies of more socioeconomically diverse populations. Individuals with FM and WP experience a significant impact on their well-being.