906 results for Exploratory statistical data analysis
Abstract:
Class exercise for analysing qualitative data, based on a set of transcripts augmented by videos from a web site. Discussion covers not only how the data are coded but also interview bias and dimensions of analysis. Designed as an introduction.
Abstract:
This dissertation addresses conjugal violence against women by their male partners; its empirical object of investigation is the diagnosis of the competency needs of PSP officers for providing effective assistance to women who are victims of conjugal violence. Specifically, it seeks to establish a profile of professional competencies, in terms of knowledge, skills and attitudes, that a PSP professional must possess to provide this assistance effectively, with dignity, and with respect for all aspects of these victims' rights. Scientifically, carrying out a diagnosis of competency needs involves several stages that aim to define the competencies required, identify those that are lacking, and set out those currently held. To this end, the views of the different actors involved in this crime were gathered through three distinct samples: Specialists, Victims and Police. Specifically, these were fourteen recognised national specialists in the field of domestic violence, one hundred women victims of this crime who had filed a complaint with the PSP, and one hundred professionals of this security force who act on this crime. The procedure that delimits these stages embodies the specific objectives. Accordingly, the specialists were asked to outline the necessary competencies, the victims to describe the shortcomings of the assistance received, and the officers to list their current competencies. Triangulating the data obtained made it possible to produce the competency diagnosis and to answer the starting question: which competencies exist, and which prove necessary, in PSP officers for assisting women who are victims of conjugal violence? Several further questions arose in answering this starting question: Which competencies do PSP officers need in order to respond effectively to this crime? Are these victims satisfied with the assistance provided by the police officers who responded to the report of the crime? Do these officers feel prepared to intervene competently in this type of crime? Methodologically, after exploratory research, a cross-sectional, qualitative-quantitative and quantitative methodology was used, with data collection based on the Delphi method, surveys using semi-structured and structured questionnaires, and content analysis of statistical data. The analysis of the results of the competency needs diagnosis led to the conclusion that there is one set of competencies that must be improved and another that must be acquired by PSP officers, as defined by the officers themselves and by the victims.
Abstract:
The influence matrix is used in ordinary least-squares applications for monitoring statistical multiple-regression analyses. Concepts related to the influence matrix provide diagnostics on the influence of individual data on the analysis - the analysis change that would occur by leaving one observation out, and the effective information content (degrees of freedom for signal) in any sub-set of the analysed data. In this paper, the corresponding concepts have been derived in the context of linear statistical data assimilation in numerical weather prediction. An approximate method to compute the diagonal elements of the influence matrix (the self-sensitivities) has been developed for a large-dimension variational data assimilation system (the four-dimensional variational system of the European Centre for Medium-Range Weather Forecasts). Results show that, in the boreal spring 2003 operational system, 15% of the global influence is due to the assimilated observations in any one analysis, and the complementary 85% is the influence of the prior (background) information, a short-range forecast containing information from earlier assimilated observations. About 25% of the observational information is currently provided by surface-based observing systems, and 75% by satellite systems. Low-influence data points usually occur in data-rich areas, while high-influence data points are in data-sparse areas or in dynamically active regions. Background-error correlations also play an important role: high correlation diminishes the observation influence and amplifies the importance of the surrounding real and pseudo observations (prior information in observation space). Incorrect specifications of background and observation-error covariance matrices can be identified, interpreted and better understood by the use of influence-matrix diagnostics for the variety of observation types and observed variables used in the data assimilation system. 
Copyright © 2004 Royal Meteorological Society
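The ordinary least-squares concepts that this abstract starts from can be illustrated with a minimal sketch. It covers only the simple straight-line regression case, not the variational data assimilation system of the paper, and the data values and function names are hypothetical. It computes the self-sensitivities (diagonal of the influence matrix), their sum (degrees of freedom for signal), and the change in each fitted value that would occur by leaving that observation out.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def self_sensitivities(x):
    """Diagonal elements of the influence (hat) matrix for the straight-line model."""
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return [1.0 / n + (xi - mean_x) ** 2 / sxx for xi in x]


# Hypothetical data: four clustered points and one isolated point.
x = [0.0, 1.0, 2.0, 3.0, 10.0]
y = [0.1, 0.9, 2.2, 2.8, 10.5]

intercept, slope = fit_line(x, y)
fitted = [intercept + slope * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

h = self_sensitivities(x)
dof_signal = sum(h)  # trace of the influence matrix

# Change in the fitted value at point i if observation i were left out.
loo_change = [hi * ei / (1.0 - hi) for hi, ei in zip(h, residuals)]
```

In this toy data set the isolated point carries the largest self-sensitivity, which mirrors the abstract's remark that high-influence observations tend to sit in data-sparse areas.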
Abstract:
As in any field of scientific inquiry, advancements in the field of second language acquisition (SLA) rely in part on the interpretation and generalizability of study findings using quantitative data analysis and inferential statistics. While statistical techniques such as ANOVA and t-tests are widely used in second language research, this article reviews a class of newer statistical models that have not yet been widely adopted in the field but have garnered interest in other fields of language research. The class of statistical models called mixed-effects models is introduced, and the potential benefits of these models for the second language researcher are discussed. A simple example of mixed-effects data analysis using the statistical software package R (R Development Core Team, 2011) is provided as an introduction to the use of these statistical techniques, and to exemplify how such analyses can be reported in research articles. It is concluded that mixed-effects models provide the second language researcher with a powerful tool for the analysis of a variety of types of second language acquisition data.
Abstract:
The paper analyses the impact of a priori determinants of the biosecurity behaviour of farmers in Great Britain. We use a dataset collected through a stratified telephone survey of 900 cattle and sheep farmers in Great Britain (400 in England and 250 each in Wales and Scotland), which took place between 25 March 2010 and 18 June 2010. The survey was stratified by farm type, farm size and region. To test the influence of a priori determinants on biosecurity behaviour we used a behavioural economics method, structural equation modelling (SEM) with observed and latent variables. SEM is a statistical technique for testing and estimating causal relationships amongst variables, some of which may be latent, using a combination of statistical data and qualitative causal assumptions. Thirteen latent variables were identified and extracted, expressing the behaviour and the underlying determining factors. The variables were: experience, economic factors, organic certification of farm, membership in a cattle/sheep health scheme, perceived usefulness of biosecurity information sources, knowledge about biosecurity measures, perceived importance of specific biosecurity strategies, perceived effect (on farm business in the past five years) of welfare/health regulation, perceived effect of severe outbreaks of animal diseases, attitudes towards livestock biosecurity, attitudes towards animal welfare, influence on decision to apply biosecurity measures and biosecurity behaviour. The SEM model applied to the Great Britain sample has an adequate fit according to the measures of absolute, incremental and parsimonious fit. The results suggest that farmers' perceived importance of specific biosecurity strategies, organic certification of farm, knowledge about biosecurity measures, attitudes towards animal welfare, perceived usefulness of biosecurity information sources, perceived effect on business during the past five years of severe outbreaks of animal diseases, membership in a cattle/sheep health scheme, attitudes towards livestock biosecurity, influence on decision to apply biosecurity measures, experience and economic factors significantly influence behaviour (overall explaining 64% of the variance in behaviour).
Abstract:
An important application of Big Data Analytics is the real-time analysis of streaming data. Streaming data imposes unique challenges on data mining algorithms, such as concept drift, the need to analyse the data on the fly because data streams are unbounded, and the need for scalable algorithms because data throughput is potentially high. Real-time classification algorithms that are fast and adaptive to concept drift exist; however, most approaches are not naturally parallel and are thus limited in their scalability. This paper presents work on the Micro-Cluster Nearest Neighbour (MC-NN) classifier. MC-NN is based on an adaptive statistical data summary built from Micro-Clusters. MC-NN is very fast and adaptive to concept drift whilst maintaining the parallel properties of the base KNN classifier. MC-NN is also competitive with existing data stream classifiers in terms of accuracy and speed.
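To make the idea of a nearest-neighbour classifier built on micro-cluster summaries concrete, here is a heavily simplified sketch. It is not the MC-NN algorithm as published: the class names, the update rule (absorb a point on a correct prediction, open a new cluster otherwise) and the toy stream are all assumptions made for illustration, and the sketch omits any mechanism for splitting or removing clusters.

```python
import math


class MicroCluster:
    """Statistical summary of the points absorbed so far: count, linear sum, label."""

    def __init__(self, point, label):
        self.n = 1
        self.linear_sum = list(point)
        self.label = label

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def absorb(self, point):
        # Incremental update: no raw stream points need to be stored.
        self.n += 1
        self.linear_sum = [s + p for s, p in zip(self.linear_sum, point)]


class MicroClusterNN:
    """Hypothetical simplified classifier: predict the label of the nearest centroid."""

    def __init__(self):
        self.clusters = []

    def _nearest(self, point):
        if not self.clusters:
            return None
        return min(self.clusters, key=lambda c: math.dist(point, c.centroid()))

    def predict(self, point):
        nearest = self._nearest(point)
        return None if nearest is None else nearest.label

    def learn(self, point, label):
        nearest = self._nearest(point)
        if nearest is not None and nearest.label == label:
            nearest.absorb(point)  # correct prediction: refine the summary
        else:
            self.clusters.append(MicroCluster(point, label))  # adapt to new concept


# Hypothetical two-class stream, processed one labelled point at a time.
stream = [((0.0, 0.0), "a"), ((0.2, 0.1), "a"), ((5.0, 5.0), "b"),
          ((5.1, 4.9), "b"), ((0.1, 0.3), "a")]
model = MicroClusterNN()
for point, label in stream:
    model.learn(point, label)
```

Because each summary holds only a count and a linear sum, the memory cost depends on the number of clusters rather than on the length of the stream.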
Abstract:
Astronomy has evolved almost exclusively by the use of spectroscopic and imaging techniques, operated separately. With the development of modern technologies, it is possible to obtain data cubes in which one combines both techniques simultaneously, producing images with spectral resolution. Extracting information from them can be quite complex, and hence the development of new methods of data analysis is desirable. We present a method for analysing data cubes (data from single field observations, containing two spatial dimensions and one spectral dimension) that uses Principal Component Analysis (PCA) to express the data in a form of reduced dimensionality, facilitating efficient information extraction from very large data sets. PCA transforms the system of correlated coordinates into a system of uncorrelated coordinates ordered by principal components of decreasing variance. The new coordinates are referred to as eigenvectors, and the projections of the data on to these coordinates produce images we will call tomograms. The association of the tomograms (images) with the eigenvectors (spectra) is important for the interpretation of both. The eigenvectors are mutually orthogonal, and this information is fundamental for their handling and interpretation. When the data cube shows objects that present uncorrelated physical phenomena, the eigenvectors' orthogonality may be instrumental in separating and identifying them. By handling eigenvectors and tomograms, one can enhance features, extract noise, compress data, extract spectra, etc. We applied the method, for illustration purposes only, to the central region of the low ionization nuclear emission region (LINER) galaxy NGC 4736, and demonstrate that it has a type 1 active nucleus, not known before. Furthermore, we show that it is displaced from the centre of its stellar bulge.
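The relationship between an eigenvector (a spectrum) and its tomogram (an image) can be sketched on a toy cube. This is only an illustration under stated assumptions: the 2 x 2 pixel, 3 channel cube is invented, the function name is hypothetical, and power iteration is used here simply to avoid external libraries; the paper does not say how its eigenvectors are computed.

```python
import math


def leading_eigenvector(spectra, iterations=100):
    """Return (mean spectrum, first principal component) of a list of pixel spectra."""
    n = len(spectra)
    m = len(spectra[0])
    mean = [sum(s[k] for s in spectra) / n for k in range(m)]
    centred = [[s[k] - mean[k] for k in range(m)] for s in spectra]
    # Covariance between spectral channels, estimated over all pixels.
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(m)]
           for i in range(m)]
    # Power iteration converges to the eigenvector of largest variance.
    v = [1.0 / (k + 1) for k in range(m)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


# Hypothetical data cube indexed as cube[y][x][channel].
cube = [
    [[1.0, 1.0, 1.0], [1.0, 2.0, 1.1]],
    [[1.0, 3.0, 0.9], [1.0, 4.0, 1.0]],
]
ny, nx = len(cube), len(cube[0])
spectra = [cube[y][x] for y in range(ny) for x in range(nx)]

mean, eigvec = leading_eigenvector(spectra)

# Tomogram: projection of each pixel's mean-subtracted spectrum on the eigenvector.
tomogram = [[sum((cube[y][x][k] - mean[k]) * eigvec[k] for k in range(len(mean)))
             for x in range(nx)]
            for y in range(ny)]
```

Here the eigenvector is dominated by the middle channel, the only one that varies strongly, and the tomogram shows which pixels carry that spectral feature.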
Abstract:
When missing data occur in studies designed to compare the accuracy of diagnostic tests, a common, though naive, practice is to base the comparison of sensitivity, specificity, as well as of positive and negative predictive values on some subset of the data that fits into methods implemented in standard statistical packages. Such methods are usually valid only under the strong missing completely at random (MCAR) assumption and may generate biased and less precise estimates. We review some models that use the dependence structure of the completely observed cases to incorporate the information of the partially categorized observations into the analysis and show how they may be fitted via a two-stage hybrid process involving maximum likelihood in the first stage and weighted least squares in the second. We indicate how computational subroutines written in R may be used to fit the proposed models and illustrate the different analysis strategies with observational data collected to compare the accuracy of three distinct non-invasive diagnostic methods for endometriosis. The results indicate that even when the MCAR assumption is plausible, the naive partial analyses should be avoided.
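For reference, the four accuracy measures named in this abstract follow directly from a 2 x 2 table of test result against true disease status. The sketch below computes them from fully observed cases only, which is the naive analysis the abstract cautions against when data are missing; it does not implement the two-stage models the authors review. The counts and the function name are hypothetical.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Diagnostic accuracy measures from the counts of a complete 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased subjects correctly detected
        "specificity": tn / (tn + fp),  # healthy subjects correctly cleared
        "ppv": tp / (tp + fp),          # positive results that are true
        "npv": tn / (tn + fn),          # negative results that are true
    }


# Hypothetical counts for one diagnostic test, complete cases only.
measures = accuracy_measures(tp=40, fp=10, fn=20, tn=130)
```

Any subject with a missing test result or missing disease status is simply absent from these counts, which is why the resulting estimates are usually valid only under the MCAR assumption.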
Abstract:
This thesis develops and evaluates statistical methods for different types of genetic analyses, including quantitative trait loci (QTL) analysis, genome-wide association study (GWAS), and genomic evaluation. The main contribution of the thesis is to provide novel insights in modeling genetic variance, especially via random effects models. In variance component QTL analysis, a full likelihood model accounting for uncertainty in the identity-by-descent (IBD) matrix was developed. It was found to be able to correctly adjust the bias in genetic variance component estimation and gain power in QTL mapping in terms of precision. Double hierarchical generalized linear models, and a non-iterative simplified version, were implemented and applied to fit data of an entire genome. These whole genome models were shown to have good performance in both QTL mapping and genomic prediction. A re-analysis of a publicly available GWAS data set identified significant loci in Arabidopsis that control phenotypic variance instead of mean, which validated the idea of variance-controlling genes. The works in the thesis are accompanied by R packages available online, including a general statistical tool for fitting random effects models (hglm), an efficient generalized ridge regression for high-dimensional data (bigRR), a double-layer mixed model for genomic data analysis (iQTL), a stochastic IBD matrix calculator (MCIBD), a computational interface for QTL mapping (qtl.outbred), and a GWAS analysis tool for mapping variance-controlling loci (vGWAS).
Abstract:
In this field study, we first addressed communication among people through verbal codes, the relations between language and thought, and the view of the world perceived by the individual or the community through the use of language. We then examined the school system, which is committed to transmitting a particular worldview characteristic of a given nationality and/or to maintaining the values and ways of life that identify a certain community. These considerations justify the importance given to literacy and to mass literacy programmes in developing countries such as Brazil. With this in mind, we set out to investigate the everyday vocabulary of thirty-seven MOBRAL students in Nova Friburgo, relating it to the informants' social indices (place of origin, years lived in the geographical area considered, age, sex and occupation), to thematic variables (food, health/illness, occupation/chores, life expectations, life memories, leisure/entertainment), chosen after a preliminary survey of the informants' living conditions and interests, and to linguistic variables, that is, word classes. This exploratory study also sought to determine to what extent the written material of MOBRAL's continued-reading books and of class A (Jornal do Brasil) and class C (O Dia and Última Hora) newspapers relates to the vocabulary used by the MOBRAL students in Nova Friburgo. To survey the interviewees' vocabulary, fifty speech samples were recorded following the methodology used in sociolinguistic studies. The data obtained from this recorded corpus were analysed quantitatively using the computer program known as SPSS. The study of the relations among the variables (morphological classification, theme, age, sex, occupation) led to the construction of multivariate contingency tables. The analysis of the results yielded several conclusions, such as the constant use of nouns and verbs in the utterances. Although the technique of eliciting available words was introduced during the interviews, the number of nouns in this study was not altered, because the informants did not name things in isolation but did so in complete utterances. Based on this result, some suggestions of pedagogical interest were proposed for MOBRAL's use: first, not to emphasise nouns (the static process of language) at the expense of verbs (the dynamic process); second, to use sentences in literacy strategies. In examining the relations among the variables, the large variation detected was due to theme. When the variety of words used by the interviewees in Nova Friburgo was investigated, it was found that the 11,337 occurrences of nouns comprised 2,222 different nouns and 1,590 vocabulary items; the 17,604 occurrences of verbs comprised 2,365 different verbs and 588 vocabulary items; and the 1,980 occurrences of adjectives comprised 660 different adjectives and 488 vocabulary items. It was concluded that this group's vocabulary may differ from that of other areas and of other social strata embedded in other contexts, but it is neither limited nor restricted. It expresses their view and expectations of the world around them. A further recommendation to MOBRAL: the words to be taught should be gathered from the various communities where the literacy classes operate, and the motivation for their selection should be tied to the adults' everyday needs, with the generative word integrated into sentences.
Abstract:
The research aimed to analyse and understand, taking into account the cultural aspects of the organisation, how the staff of the Instituto de Criminalística perceive risks in their internal and external environments. Integrating the dimensions investigated (culture, risk, perception and management), the last from a brief phenomenological perspective, made it possible to reveal the extent to which the values of the police organisation interfere with risk perception, adding a contribution to the scarce publications on the subject. In search of the culture of the police organisation, a historical survey traced the lineage of the police and of criminalistics. On the methodological side, publications on the subject proved rare, which required a specific design for the case study, considering its current context and taking an exploratory approach to the focus investigated. A qualitative approach, accompanied by statistical data obtained from the institution itself, aided understanding of the problem, which involves values and risks. Spreadsheets were designed, adjusted to the reality of the unit and directed at the managers of the internal and external sections, forming a risk map of the Institute and resulting in an unprecedented document setting out a typology of risks in forensic work. Bibliographic and documentary surveys, interviews and questionnaires ensured the integrity of the research. In practical terms, the identification of risks (including the map) and the clarifications about the culture will help administrators plan management actions aimed at reducing the instabilities and tensions that contribute to accidents and losses in the field of criminalistics. It was concluded that the perception and the very management of risks are sensitive to the influence of cultural values, requiring managers and those they manage, acting together, to build an environment that reduces instabilities and tensions, with emphasis on the human element, whose essential nature represented the point of departure, of arrival and of resumption in organisational management.
Abstract:
Attracting and retaining talent through inflated salaries can be costly and not necessarily effective. Employer Branding (EB), which consists of companies' efforts to promote characteristics and attributes that make them distinctive and desirable as employers, is beginning to attract the interest of both companies and researchers in Human Resources and Management Practices. In light of the recent and scarce international literature, this exploratory study sought to identify which aspects of Employer Branding matter most to individuals in their intention to remain at a company after an internship period. The analyses considered 443 questionnaires answered by interns at a large multinational company in the financial sector, using the employer attractiveness scale (Berthon et al., 2005), which considers five dimensions of Employer Branding: development, social, interest, application and economic. Statistical tests showed that demographic variables such as gender, type of university funding (public or private) and level of financial responsibility influence how individuals value each of the dimensions. In addition, although all dimensions were generally considered important, the Logistic Regression results for intention to remain showed that, for this sample, financial matters stand out from the other variables. Finally, the data analysis reveals aspects that can inform proposals for realigning messaging and/or practices by companies that wish to attract and retain interns efficiently for their workforce. Furthermore, the results of this research contribute to theory by discussing the existing categorisations of Employer Branding dimensions and by suggesting that there is room for new classifications to be proposed.
Abstract:
In recent decades the public sector has come under pressure to improve its performance, and Information Technology (IT) has increasingly been used as a tool to reach that goal. It has therefore become important for public organizations, particularly institutions of higher education, to determine which factors influence the acceptance and use of technology, since these affect the success of its implementation and the desired organizational results. The Technology Acceptance Model (TAM), which is based on the constructs perceived usefulness and perceived ease of use, was used as the basis for this study. However, because integrated management systems are complex to implement, organizational factors were added to seek further explanation of the acceptance of such systems. Five constructs related to critical success factors in implementing ERP systems were thus added to the TAM: top management support, communication, training, cooperation and technological complexity (BUENO and SALMERON, 2008). This leads to the following research problem: what factors influence the acceptance and use of the SIE/academic module at the Federal University of Pará, as perceived by its teaching and technical users? The purpose of this study was to identify the influence of organizational factors and behavioral antecedents on the behavioral intention to use the SIE/academic module at UFPA from the perspective of teaching and technical users. This is applied, exploratory and descriptive quantitative research based on a survey; data were collected through a structured questionnaire applied to a sample of 229 teachers and 30 technical and administrative staff. Data analysis was carried out using descriptive statistics and structural equation modeling with the partial least squares (PLS) technique. The measurement model was assessed first, verifying reliability and convergent and discriminant validity for all indicators and constructs. The structural model was then analyzed using the bootstrap resampling technique. In assessing statistical significance, all hypotheses were supported. The coefficient of determination (R²) was high or medium for five of the six endogenous variables, and the model explains 47.3% of the variation in behavioral intention. Among the antecedents of behavioral intention (BI) analyzed in this study, perceived usefulness has the greatest effect on behavioral intention, followed by ease of use (PEU) and attitude (AT). Among the organizational aspects (critical success factors) studied, technological complexity (TC) and training (ERT) had the greatest effect on behavioral intention to use, although these effects were lower than those produced by the behavioral factors originating from TAM. Top management support (TMS) showed the least effect of all variables on intention to use (BI), followed by communication (COM) and cooperation (CO), which exert a low effect on behavioral intention (BI). Therefore, as in other studies, the TAM constructs proved adequate for the present research. The study thus contributes evidence that the Technology Acceptance Model can be applied to predict the acceptance of integrated management systems, even in public. Keywords: Technology
Abstract:
The study of complex systems has become a prestigious, although relatively young, area of science. Its importance is demonstrated by the diversity of applications that studies have already provided in fields such as biology, economics and climatology. In physics, the complex-systems approach is creating paradigms that markedly influence new methods, bringing to Statistical Physics macroscopic-level problems no longer restricted to classical studies such as those of thermodynamics. The present work compares and verifies statistical data on clusters of sonic (DT), gamma ray (GR), induction (ILD), neutron (NPHI) and density (RHOB) well-log profiles, physical quantities measured during exploratory drilling that are of fundamental importance for locating, identifying and characterizing oil reservoirs. The software used for comparing and verifying the data comprised Statistica, Matlab R2006a, Origin 6.1 and Fortran; the oil-well profiles, from the Namorado School Field, were provided by ANP (National Petroleum Agency). It was possible to demonstrate the importance of the detrended fluctuation analysis (DFA) method, which proved quite satisfactory in this work, leading to the conclusion that the H (Hurst exponent) data produce spatial data with greater congestion. We therefore find that it is possible to find spatial patterns using the Hurst coefficient. The profiles of 56 wells confirmed the existence of spatial patterns of Hurst exponents, i.e. parameter B. The profiles do not directly verify the geological lithology, but they reveal a non-random spatial distribution.
Abstract:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)