977 results for statistical techniques
Abstract:
An interesting fact about language cognition is that stimulation involving incongruence in the merge operation between verb and complement has often been related to a negative event-related potential (ERP) of augmented amplitude and a latency of ca. 400 ms, the N400. Using an automatic ERP latency and amplitude estimator to facilitate the recognition of waves with a low signal-to-noise ratio, the objective of the present study was to characterize the N400 statistically in 24 volunteers. Stimulation consisted of 80 experimental sentences (40 congruous and 40 incongruous), generated in Brazilian Portuguese, involving two distinct local verb-argument combinations (nominal object and pronominal object series). For each volunteer, the EEG was acquired simultaneously at 20 derivations, topographically localized according to the 10-20 International System. A computerized routine for automatic N400-peak marking (based on the ascending zero-crossing of the first waveform derivative) was applied to the estimated individual ERP waveform for congruous and incongruous sentences in both series at all ERP topographic derivations. Peak-to-peak N400 amplitude was significantly augmented (P < 0.05; one-sided Wilcoxon signed-rank test) due to incongruence in derivations F3, T3, C3, Cz, T5, P3, Pz, and P4 for the nominal object series and in P3, Pz and P4 for the pronominal object series. The results also indicated high inter-individual variability in ERP waveforms, suggesting that the usual procedure of grand averaging might not be a generally adequate approach. Hence, signal processing statistical techniques that allow waveform analysis at low signal-to-noise ratios should be applied in neurolinguistic ERP studies.
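A minimal sketch of the paired, one-sided Wilcoxon signed-rank test the study applies to peak-to-peak N400 amplitudes. The amplitude values below are simulated, not the study's data.

```python
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(0)
n_subjects = 24  # as in the study

# Hypothetical peak-to-peak amplitudes (µV) per subject and condition
congruous = rng.normal(5.0, 1.0, n_subjects)
incongruous = congruous + rng.normal(2.0, 1.0, n_subjects)  # larger N400

# H1: incongruous amplitude > congruous amplitude (one-sided, paired)
stat, p = wilcoxon(incongruous, congruous, alternative="greater")
significant = p < 0.05
```

With real data the same call would be repeated per derivation (F3, T3, C3, ...), one test per electrode site.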
Abstract:
The influence of some process variables on the productivity of the fractions (liquid yield times fraction percent) obtained from SCFE of a Brazilian mineral coal using isopropanol and ethanol as primary solvents is analyzed using statistical techniques. A full factorial 2³ experimental design was adopted to investigate the effects of the process variables (temperature, pressure and cosolvent concentration) on the extraction products. The extracts were analyzed by the Preparative Liquid Chromatography-8 fractions method (PLC-8), a reliable, non-destructive solvent fractionation method developed especially for coal-derived liquids. Empirical statistical modeling was carried out in order to reproduce the experimental data. The correlations obtained were always greater than 0.98. Four specific process criteria were used to allow process optimization. The results show that it is not possible to maximize both extract productivity and purity (through minimization of the heavy fraction content) simultaneously by manipulating these process variables.
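A hedged sketch of a full factorial 2³ design of the kind described: three coded factors (temperature, pressure, cosolvent concentration) at two levels each, with a main-effects empirical model fitted by least squares. All response values are invented for illustration.

```python
import itertools
import numpy as np

# Coded design matrix: all 2^3 = 8 combinations of -1/+1 factor levels
design = np.array(list(itertools.product([-1, 1], repeat=3)), dtype=float)

# Hypothetical productivity responses for the 8 runs
y = np.array([12.1, 14.0, 13.2, 15.1, 14.9, 17.0, 16.2, 18.1])

# Empirical model: y = b0 + b1*T + b2*P + b3*C (main effects only)
X = np.column_stack([np.ones(8), design])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Correlation between fitted and observed responses (cf. the >0.98 reported)
r = np.corrcoef(X @ coef, y)[0, 1]
```

Because the coded design is orthogonal, each coefficient is simply half the difference between the mean response at the high and low levels of that factor.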
Abstract:
Several eco-toxicological studies have shown that insectivorous mammals, due to their feeding habits, easily accumulate high amounts of pollutants relative to other mammal species. To assess the bio-accumulation levels of toxic metals and their influence on essential metals, we quantified the concentration of 19 elements (Ca, K, Fe, B, P, S, Na, Al, Zn, Ba, Rb, Sr, Cu, Mn, Hg, Cd, Mo, Cr and Pb) in bones of 105 greater white-toothed shrews (Crocidura russula) from a polluted (Ebro Delta) and a control (Medas Islands) area. Since the chemical contents of a bio-indicator are essentially compositional data, the conventional statistical analyses currently used in eco-toxicology can give misleading results. Therefore, to improve the interpretation of the data obtained, we used statistical techniques for compositional data analysis to define groups of metals and to evaluate the relationships between them from an inter-population viewpoint. Hypothesis testing on adequate balance-coordinates allowed us to confirm intuition-based hypotheses and some previous results. The main statistical goal was to test the equality of the means of the balance-coordinates for the two defined populations. After checking normality, one-way ANOVA or Mann-Whitney tests were carried out for the inter-group balances.
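A hedged sketch of the core idea: compute a single ilr "balance" between two groups of compositional parts (here labelled toxic vs. essential metals) and compare the two populations with a Mann-Whitney test. The compositions, groupings and sample sizes below are simulated, not the study's data.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def balance(x, idx_num, idx_den):
    """ilr balance between two disjoint groups of compositional parts."""
    r, s = len(idx_num), len(idx_den)
    g_num = np.exp(np.mean(np.log(x[:, idx_num]), axis=1))  # geometric means
    g_den = np.exp(np.mean(np.log(x[:, idx_den]), axis=1))
    return np.sqrt(r * s / (r + s)) * np.log(g_num / g_den)

rng = np.random.default_rng(1)
# Simulated 4-part compositions (e.g. two toxic, two essential metals);
# each row sums to 1, as compositional data require
polluted = rng.dirichlet([4, 4, 10, 10], size=30)   # toxic-enriched area
control = rng.dirichlet([1, 1, 10, 10], size=30)    # reference area

b_pol = balance(polluted, [0, 1], [2, 3])
b_ctl = balance(control, [0, 1], [2, 3])
stat, p = mannwhitneyu(b_pol, b_ctl, alternative="two-sided")
```

The balance is a log-ratio, so results are invariant to whether concentrations are expressed as shares or raw amounts, which is exactly why compositional methods avoid the spurious correlations of raw-concentration analyses.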
Abstract:
As in any field of scientific inquiry, advancements in the field of second language acquisition (SLA) rely in part on the interpretation and generalizability of study findings using quantitative data analysis and inferential statistics. While statistical techniques such as ANOVA and t-tests are widely used in second language research, this article reviews a class of newer statistical models that has not yet been widely adopted in the field but has garnered interest in other fields of language research. Mixed-effects models are introduced, and the potential benefits of these models for the second language researcher are discussed. A simple example of mixed-effects data analysis using the statistical software package R (R Development Core Team, 2011) is provided as an introduction to the use of these statistical techniques and to exemplify how such analyses can be reported in research articles. It is concluded that mixed-effects models provide the second language researcher with a powerful tool for the analysis of a variety of types of second language acquisition data.
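The article's worked example is in R; a comparable sketch in Python with statsmodels is shown below. The data are simulated: learners contribute random intercepts, and a hypothetical condition effect of 1.2 is the fixed effect of interest.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n_subj, n_items = 20, 10

# Long-format data: each subject responds to items in two conditions
subj = np.repeat(np.arange(n_subj), n_items * 2)
cond = np.tile(np.repeat([0, 1], n_items), n_subj)
subj_eff = rng.normal(0, 1.0, n_subj)[subj]           # random intercepts
score = 5.0 + 1.2 * cond + subj_eff + rng.normal(0, 0.5, len(subj))

df = pd.DataFrame({"subject": subj, "condition": cond, "score": score})

# Random-intercept mixed model: score ~ condition, grouped by subject
model = smf.mixedlm("score ~ condition", df, groups=df["subject"]).fit()
fixed_effect = model.params["condition"]              # recovers ~1.2
```

Because subject variability is absorbed by the random intercepts, the fixed-effect estimate stays precise even though scores differ substantially across learners, which is the practical advantage the article highlights over repeated t-tests or by-subject aggregation.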
Abstract:
This paper presents a comparison of principal component (PC) regression and regularized expectation maximization (RegEM) for reconstructing European summer and winter surface air temperature over the past millennium. Reconstruction is performed within a surrogate climate using the National Center for Atmospheric Research (NCAR) Climate System Model (CSM) 1.4 and the climate model ECHO-G 4, assuming different white and red noise scenarios to define the distortion of pseudoproxy series. We show how sensitivity tests lead to valuable “a priori” information that provides a basis for improving real-world proxy reconstructions. Our results emphasize the need to carefully test and evaluate reconstruction techniques with respect to the temporal resolution and the spatial scale they are applied to. Furthermore, we demonstrate that uncertainties inherent to the predictand and predictor data have to be taken into account more rigorously. The comparison of the two statistical techniques, in the specific experimental setting presented here, indicates that more skilful results are achieved with RegEM, as low-frequency variability is better preserved. We further detect seasonal differences in reconstruction skill at the continental scale; e.g., the target temperature average is more adequately reconstructed for summer than for winter. For the specific predictor network given in this paper, both techniques underestimate the target temperature variations to an increasing extent as more noise is added to the signal, albeit less so with RegEM than with PC regression. We conclude that climate field reconstruction techniques can be improved and need to be further optimized in future applications.
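Of the two techniques compared, PC regression is the simpler to sketch. The following toy version builds a synthetic pseudoproxy network (a common signal plus white noise, mimicking the paper's noise scenarios), extracts leading principal components, and regresses the target on their scores. All dimensions and noise levels are invented.

```python
import numpy as np

rng = np.random.default_rng(3)
n_years, n_proxies = 200, 15

signal = rng.normal(0, 1, n_years)                  # "true" target series
# Pseudoproxies: signal distorted by white noise, one column per proxy
proxies = signal[:, None] + rng.normal(0, 1.0, (n_years, n_proxies))

# PCA via SVD of the centered proxy matrix
Xc = proxies - proxies.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 3
scores = U[:, :k] * S[:k]                           # leading PC scores

# Least-squares regression of the target on the retained PCs
A = np.column_stack([np.ones(n_years), scores])
coef, *_ = np.linalg.lstsq(A, signal, rcond=None)
reconstruction = A @ coef
corr = np.corrcoef(reconstruction, signal)[0, 1]
```

Raising the noise variance in `proxies` degrades `corr`, which is the underestimation-with-noise behaviour the paper documents for both techniques.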
Abstract:
OBJECTIVES In dental research, multiple site observations within patients, or observations taken at various time intervals, are commonplace. These clustered observations are not independent; statistical analysis should be amended accordingly. This study aimed to assess whether adjustment for clustering effects during statistical analysis was undertaken in five specialty dental journals. METHODS Thirty recent consecutive issues of the Orthodontics (OJ), Periodontology (PJ), Endodontology (EJ), Maxillofacial (MJ) and Paediatric Dentistry (PDJ) journals were hand searched. Articles requiring adjustment for clustering effects were identified and the statistical techniques used were scrutinized. RESULTS Of 559 studies considered to have inherent clustering effects, adjustment for this was made in the statistical analysis in 223 (39.1%). Studies published in the Periodontology specialty accounted for clustering effects in the statistical analysis more often than articles published in other journals (OJ vs. PJ: OR=0.21, 95% CI: 0.12, 0.37, p<0.001; MJ vs. PJ: OR=0.02, 95% CI: 0.00, 0.07, p<0.001; PDJ vs. PJ: OR=0.14, 95% CI: 0.07, 0.28, p<0.001; EJ vs. PJ: OR=0.11, 95% CI: 0.06, 0.22, p<0.001). A positive correlation was found between the increasing prevalence of clustering effects in individual specialty journals and correct statistical handling of clustering (r=0.89). CONCLUSIONS The majority of the studies examined in the 5 dental specialty journals (60.9%) failed to account for clustering effects in statistical analysis where indicated, raising the possibility of inappropriately decreased p-values and the risk of inappropriate inferences.
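The journal comparisons above are reported as odds ratios with 95% confidence intervals. A minimal sketch of that computation from a 2×2 table (adjusted / not adjusted, by journal) is shown below; the counts are hypothetical, not the study's, and the Woolf (log-OR) interval is one standard choice.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR and Woolf 95% CI for the 2x2 table [[a, b], [c, d]]."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi

# Hypothetical counts: 20/80 studies adjusted in journal A vs. 60/40 in B
or_, lo, hi = odds_ratio_ci(20, 80, 60, 40)
```

An interval entirely below 1, as here, mirrors the paper's finding that the comparison journals adjusted for clustering significantly less often than the Periodontology journal.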
Abstract:
An efficient and reliable automated model that can map physical Soil and Water Conservation (SWC) structures on cultivated land was developed using very high spatial resolution imagery obtained from Google Earth, together with ArcGIS, ERDAS IMAGINE, the SDC Morphology Toolbox for MATLAB, and statistical techniques. The model was developed using the following procedures: (1) a high-pass spatial filter algorithm was applied to detect linear features, (2) morphological processing was used to remove unwanted linear features, (3) the raster format was vectorized, (4) the vectorized linear features were split per hectare (ha) and each line was classified according to its compass direction, and (5) the sum of all vector lengths per direction class per ha was calculated. Finally, the direction class with the greatest length was selected from each ha to predict the physical SWC structures. The model was calibrated and validated on the Ethiopian Highlands, where it correctly mapped 80% of the existing structures, and was then tested at sites with different topography. The results show that the developed model is feasible for automated mapping of physical SWC structures and is therefore useful for predicting and mapping them across diverse areas.
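Steps (1) and (2) of the pipeline can be sketched on a tiny synthetic raster: a high-pass filter (here, image minus its local mean) enhances linear features, and morphological opening with a line-shaped structuring element removes fragments shorter than the element. This is an illustrative stand-in, not the authors' exact filters or parameters.

```python
import numpy as np
from scipy import ndimage

# Synthetic raster: one long "bund"-like linear feature, one short fragment
img = np.zeros((40, 40))
img[10, 5:35] = 1.0    # long feature (30 px)
img[25, 18:21] = 1.0   # short fragment (3 px) to be removed

# (1) High-pass filtering: subtract a local-mean (low-pass) image,
# then threshold to a binary mask of candidate linear features
lowpass = ndimage.uniform_filter(img, size=5)
edges = (img - lowpass) > 0.2

# (2) Morphological opening with a horizontal line element keeps only
# features at least as long as the structuring element
selem = np.ones((1, 7), dtype=bool)
opened = ndimage.binary_opening(edges, structure=selem)
```

In the real model the structuring element would be applied at multiple orientations, since steps (4)-(5) classify the surviving lines by compass direction.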
Abstract:
This paper introduces a method for the analysis of regional linguistic variation. The method identifies individual and common patterns of spatial clustering in a set of linguistic variables measured over a set of locations based on a combination of three statistical techniques: spatial autocorrelation, factor analysis, and cluster analysis. To demonstrate how to apply this method, it is used to analyze regional variation in the values of 40 continuously measured, high-frequency lexical alternation variables in a 26-million-word corpus of letters to the editor representing 206 cities from across the United States.
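The first of the three techniques the method combines is spatial autocorrelation. A minimal sketch of global Moran's I for one linguistic variable over a set of locations is shown below; the toy one-dimensional "map" and values are invented, not the corpus data.

```python
import numpy as np

def morans_i(values, weights):
    """Global Moran's I for values under a symmetric spatial weight matrix."""
    z = values - values.mean()
    n = len(values)
    num = n * (weights * np.outer(z, z)).sum()
    den = weights.sum() * (z ** 2).sum()
    return num / den

# Toy map: 10 locations on a line; neighbors are adjacent locations
n = 10
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0

clustered = np.array([1, 1, 1, 1, 1, 9, 9, 9, 9, 9], dtype=float)
alternating = np.array([1, 9, 1, 9, 1, 9, 1, 9, 1, 9], dtype=float)

i_clustered = morans_i(clustered, W)      # positive: spatial clustering
i_alternating = morans_i(alternating, W)  # negative: spatial dispersion
```

Variables whose values show significant positive Moran's I, like `clustered` here, are the ones worth carrying into the factor and cluster analysis stages.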
Abstract:
The topic of this thesis is the development of knowledge-based statistical software. The shortcomings of conventional statistical packages are discussed to illustrate the need to develop software able to exhibit a greater degree of statistical expertise, thereby reducing the misuse of statistical methods by those not well versed in the art of statistical analysis. Some of the issues involved in the development of knowledge-based software are presented, and a review is given of some of the systems developed so far. The majority of these have moved away from conventional architectures by adopting what can be termed an expert-systems approach. The thesis then proposes an approach based upon the concept of semantic modelling. By representing some of the semantic meaning of data, it is conceived that a system could examine a request to apply a statistical technique and check whether the use of the chosen technique is semantically sound, i.e., whether the results obtained will be meaningful. Current systems, in contrast, can only perform what can be considered syntactic checks. The prototype system implemented to explore the feasibility of such an approach is presented; it has been designed as an enhanced variant of a conventional-style statistical package. This involved developing a semantic data model to represent some of the statistically relevant knowledge about data, and identifying sets of requirements that should be met for the application of the statistical techniques to be valid. The areas of statistics covered in the prototype are measures of association and tests of location.
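The semantic-soundness idea can be illustrated with a toy check: before a technique runs, verify that the declared measurement level of each variable supports it. The variable levels, technique names and rules below are invented examples, not the thesis's actual semantic data model.

```python
# Ordered measurement levels, lowest to highest
LEVELS = {"nominal": 0, "ordinal": 1, "interval": 2, "ratio": 3}

# Hypothetical minimum level each technique requires of its variables
REQUIREMENTS = {
    "pearson_correlation": "interval",
    "spearman_correlation": "ordinal",
    "chi_square_association": "nominal",
}

def semantically_valid(technique, variable_levels):
    """True if every variable meets the technique's minimum level."""
    needed = LEVELS[REQUIREMENTS[technique]]
    return all(LEVELS[lvl] >= needed for lvl in variable_levels)

ok = semantically_valid("pearson_correlation", ["ratio", "interval"])
bad = semantically_valid("pearson_correlation", ["nominal", "ratio"])
```

A purely syntactic package would happily compute a Pearson correlation on coded category labels; the semantic check above is the kind of guard the thesis argues such software should add.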
Abstract:
With the advent of new technologies it is increasingly easy to find data of different kinds from ever more accurate sensors that measure the most disparate physical quantities with different methodologies. The collection of data thus becomes progressively more important, taking the form of archiving, cataloguing, and online and offline consultation of information. Over time, the amount of data collected can become so large that it contains information that cannot easily be explored manually or with basic statistical techniques. Such Big Data therefore become the object of more advanced investigation techniques, such as Machine Learning and Deep Learning. In this work, some applications to precision livestock farming and to the heat stress experienced by dairy cows are described. Experimental Italian and German barns were involved in the training and testing of the Random Forest algorithm, obtaining predictions of milk production as a function of the microclimatic conditions of the previous days with satisfactory accuracy. Furthermore, in order to provide an objective method for identifying production drops, a Robust Statistics technique was used in comparison with the Wood model, which is typically used as an analytical model of the lactation curve. Its application to some sample lactations, and the results obtained, allow us to be confident about the use of this method in the future.
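A hedged sketch of the core modelling step: a Random Forest regressor predicting milk yield from recent microclimatic conditions. The THI-style features, lag structure and yields below are simulated, not the experimental barn data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(4)
n_days = 300

# Hypothetical temperature-humidity index (THI) today and the day before
thi_today = rng.uniform(55, 85, n_days)
thi_lag1 = rng.uniform(55, 85, n_days)

# Simulated response: heat stress on the previous day depresses yield
milk = 30.0 - 0.15 * np.maximum(thi_lag1 - 68, 0) + rng.normal(0, 0.5, n_days)

X = np.column_stack([thi_today, thi_lag1])
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, milk)
r2 = model.score(X, milk)  # in-sample fit only; real work needs held-out data
```

Feature importances from such a model can then indicate which lag days of the microclimate matter most, which is the kind of dependence on "the previous days" the work reports.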
Abstract:
Long-term monitoring of acoustical environments is gaining popularity thanks to the relevant amount of scientific and engineering insight that it provides. The increasing interest is due to the constant growth of storage capacity and of the computational power needed to process large amounts of data. In this perspective, machine learning (ML) provides a broad family of data-driven statistical techniques for dealing with large databases. Nowadays, the conventional praxis of sound level meter measurements limits the global description of a sound scene to an energetic point of view: the equivalent continuous level Leq is the main metric used to define an acoustic environment. Finer analyses involve the use of statistical levels; however, acoustic percentiles are based on temporal assumptions that are not always reliable. A statistical approach based on the study of the occurrences of sound pressure levels brings a different perspective to the analysis of long-term monitoring. Depicting a sound scene through the most probable sound pressure level, rather than through portions of energy, yields more specific information about the activity carried out during the measurements; the statistical mode of the occurrences can capture the typical behaviour of specific kinds of sound sources. The present work proposes an ML-based method to identify, separate and measure coexisting sound sources in real-world scenarios. It is based on long-term monitoring and is addressed to acousticians focused on the analysis of environmental noise in manifold contexts. The method is based on clustering analysis: two algorithms, Gaussian Mixture Model and K-means clustering, form the core of a process for investigating different active spaces monitored through sound level meters. The procedure has been applied in two different contexts: university lecture halls and offices.
The proposed method shows robust and reliable results in describing the acoustic scenario and it could represent an important analytical tool for acousticians.
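A hedged sketch of the clustering core: fit a two-component Gaussian Mixture to the distribution of sound pressure levels so that coexisting sources (e.g. background ventilation vs. speech in a lecture hall) separate into components, each with its own most probable level. The levels below are simulated, not the monitored data.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)
# Simulated 1-s sound pressure levels (dB) from two coexisting sources
background = rng.normal(42.0, 2.0, 2000)   # quiet ventilation noise
speech = rng.normal(62.0, 3.0, 1000)       # lecture activity
levels = np.concatenate([background, speech]).reshape(-1, 1)

# Two-component GMM: each component models one source's level distribution
gmm = GaussianMixture(n_components=2, random_state=0).fit(levels)
means = np.sort(gmm.means_.ravel())        # per-source most probable levels
```

K-means on the same one-dimensional data gives similar centroids; the GMM additionally yields per-component weights and variances, i.e. how often each source is active and how variable it is.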
Abstract:
This paper presents two techniques for evaluating soil mechanical resistance to penetration as an auxiliary method to support decision-making in subsoiling operations. The decision is based on the volume of soil mobilized as a function of the critical soil resistance to penetration considered in each case. The first method, probabilistic, uses statistical techniques to define the volume of soil to be mobilized. The other method, deterministic, determines the percentage of soil to be mobilized and its spatial distribution. In both cases, curves are plotted of the percentage of experimental data with soil mechanical resistance to penetration equal to or larger than the established critical level, i.e., the volume of soil to be mobilized as a function of the critical level. The deterministic method additionally plots the spatial distribution of the data with resistance to penetration equal to or larger than the critical level. The comparison between the mobilized-soil curves produced by the two methods as a function of the critical level showed that they can be considered equivalent, the deterministic method having the advantage of showing the spatial distribution of the critical points.
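A minimal sketch of the probabilistic idea: for each candidate critical level, the fraction of measured points with resistance at or above that level estimates the volume of soil to be mobilized. The penetrometer readings below are simulated, not the paper's field data.

```python
import numpy as np

rng = np.random.default_rng(6)
# Hypothetical cone penetrometer readings (MPa) across the field
resistance = rng.normal(2.0, 0.6, 500)

# Mobilized-soil curve: fraction of points >= each candidate critical level
critical_levels = np.linspace(1.0, 3.5, 6)
mobilized_fraction = np.array(
    [np.mean(resistance >= c) for c in critical_levels]
)
```

The curve is necessarily non-increasing: a stricter critical level flags fewer points, so less soil needs to be mobilized. The deterministic variant would instead map which grid positions exceed the level.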
Abstract:
This paper describes the statistical analysis techniques used, and the statistical accessibility, in a sample of the original articles published between 1996 and 2006 in two fruit-science research journals: the Revista Brasileira de Fruticultura (RBF) and the French journal Fruits. In total, 986 articles were classified into 16 categories of statistical analysis, ordered by increasing complexity. Over the period analyzed, an increase in the use of more sophisticated analyses was observed over time in both journals. The papers published in RBF applied more complex statistical techniques more frequently, with greater use of randomized block designs, factorial arrangements, split plots and hierarchical models, and of Tukey's test for multiple comparisons of means. In the papers published in Fruits, other parametric tests and Duncan's test predominated. The SAS statistical package was the most used in the articles published in both journals. Readers of RBF needed a higher level of statistical knowledge to access most of the articles published in the period, compared with readers of the French journal.
Abstract:
To analyze differences in sociodemographic and health-related characteristics between individuals with and without a residential telephone line. Data from the 2003 Health Survey (ISA-Capital), a cross-sectional study carried out in São Paulo, SP, in that year, were analyzed. Residents who had a residential telephone line were compared with those who reported not having one, according to sociodemographic, lifestyle, health-status and health-service-utilization variables. The biases associated with non-coverage of the population without telephones were estimated, and their reduction after post-stratification adjustments was verified. Of the 1,878 respondents over 18 years of age, 80.1% had a residential telephone line. Comparing the groups, the main sociodemographic differences of individuals without a residential line were: younger age, a higher proportion of black and mixed-race individuals, a lower proportion of married respondents, and higher proportions of unemployed and less educated respondents. Residents without a residential telephone line underwent fewer health examinations, and smoked and drank more. This group also used fewer medications, rated its own health as worse, and used the Unified Health System (SUS) more. When the population without telephones was excluded from the analysis, the estimates for dental visits, alcoholism, medication use and use of the SUS for Pap smears showed the greatest bias. After post-stratification adjustment, the bias of the estimates decreased for the variables associated with having a residential telephone line. The exclusion of residents without a telephone line is one of the main limitations of surveys conducted by this means. However, the use of post-stratification statistical adjustment techniques allows non-coverage biases to be reduced.
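A hedged sketch of the post-stratification adjustment the study relies on: reweight telephone-sample respondents so that stratum shares match known population shares. The strata, shares and outcome rates below are invented for illustration only.

```python
import numpy as np

# Population share and telephone-sample share per (hypothetical) stratum
pop_share = np.array([0.30, 0.45, 0.25])      # e.g. low / mid / high schooling
sample_share = np.array([0.15, 0.45, 0.40])   # phone owners skew more educated

# Post-stratification weight per stratum
weights = pop_share / sample_share

# Hypothetical per-stratum rates of some outcome (e.g. SUS usage)
outcome_rate = np.array([0.80, 0.55, 0.30])

naive = np.sum(sample_share * outcome_rate)             # unadjusted estimate
adjusted = np.sum(sample_share * weights * outcome_rate)  # reweighted estimate
```

Because the under-covered stratum has the highest outcome rate here, the naive telephone-only estimate is biased downward; the weights pull the estimate back toward the population value, which is exactly the bias reduction the study verified.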