Abstract:
The cotton strip assay (CSA) is an established technique for measuring soil microbial activity. The technique involves burying cotton strips and measuring their tensile strength after a certain time. This gives a measure of the rotting rate, R, of the cotton strips; R is then a measure of soil microbial activity. This paper examines properties of the technique and indicates how the assay can be optimised. Humidity conditioning of the cotton strips before measuring their tensile strength reduced the within- and between-day variance and enabled the distribution of the tensile strength measurements to approximate normality. The test data came from a three-way factorial experiment (two soils, two temperatures, three moisture levels). The cotton strips were buried in the soil for intervals of time ranging up to 6 weeks, which enabled the rate of loss of cotton tensile strength with time to be studied under a range of conditions. An inverse cubic model accounted for greater than 90% of the total variation within each treatment combination. This offers support for summarising the decomposition process by a single parameter R. The approximate variance of the decomposition rate was estimated from a function incorporating the variance of tensile strength and the derivative of the rate of decomposition, R, with respect to tensile strength. This variance function has a minimum when the measured strength is approximately 2/3 of the original strength. The estimates of R are almost unbiased and relatively robust against the cotton strips being left in the soil for more or less than the optimal time. We conclude that the rotting rate R should be measured using the inverse cubic equation, and that the cotton strips should be left in the soil until their strength has been reduced to about 2/3 of its original value.
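The delta-method variance argument above can be sketched numerically. In this hedged example, a hypothetical inverse cubic parameterisation (s0/s)^3 = 1 + (R t)^3 is assumed purely for illustration; the paper's fitted model may place the variance minimum at a slightly different strength (it reports roughly 2/3 of the original):

```python
import numpy as np

def rotting_rate(s, s0=1.0, t=1.0):
    """Decomposition rate R under an ASSUMED inverse cubic loss model,
    (s0/s)**3 = 1 + (R*t)**3; the paper's exact parameterisation may differ."""
    return ((s0 / s) ** 3 - 1.0) ** (1.0 / 3.0) / t

# Delta-method approximation: var(R) ~ (dR/ds)**2 * var(s), so up to the
# constant var(s) we can locate the strength that minimises var(R).
s = np.linspace(0.4, 0.99, 10_000)   # measured strength as a fraction of s0
ds = 1e-6
dRds = (rotting_rate(s + ds) - rotting_rate(s - ds)) / (2 * ds)
var_R = dRds ** 2

s_min = s[np.argmin(var_R)]
print(f"variance of R is minimised near s/s0 = {s_min:.3f}")
```

For this particular assumed form the minimum falls near s/s0 = 2^(-1/3) ≈ 0.79; the exact location depends on the decay curve actually fitted, which is why the paper's recommendation of about 2/3 is tied to its own inverse cubic fit.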
Abstract:
A catchment-scale multivariate statistical analysis of hydrochemistry enabled assessment of interactions between alluvial groundwater and Cressbrook Creek, an intermittent drainage system in southeast Queensland, Australia. Hierarchical cluster analyses and principal component analysis were applied to time-series data to evaluate the hydrochemical evolution of groundwater during periods of extreme drought and severe flooding. A simple three-dimensional geological model was developed to conceptualise the catchment morphology and the stratigraphic framework of the alluvium. The alluvium forms a two-layer system with a basal coarse-grained layer overlain by a clay-rich low-permeability unit. In the upper and middle catchment, alluvial groundwater is chemically similar to streamwater, particularly near the creek (reflected by high HCO3/Cl and K/Na ratios and low salinities), indicating a high degree of connectivity. In the lower catchment, groundwater is more saline with lower HCO3/Cl and K/Na ratios, notably during dry periods. Groundwater salinity substantially decreased following severe flooding in 2011, notably in the lower catchment, confirming that flooding is an important mechanism for both recharge and maintaining groundwater quality. The integrated approach used in this study enabled effective interpretation of hydrological processes and can be applied to a variety of hydrological settings to synthesise and evaluate large hydrochemical datasets.
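A minimal sketch of the workflow described above, hierarchical cluster analysis plus principal component analysis on standardised samples, using synthetic stand-in data (the analyte columns and all values are hypothetical, not the Cressbrook Creek dataset):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
# Hypothetical hydrochemical matrix: rows = water samples, columns = analytes
# (e.g. Na, K, Cl, HCO3); two synthetic groups stand in for fresh vs saline water.
X = np.vstack([rng.normal(0, 1, (20, 4)), rng.normal(4, 1, (20, 4))])

# Standardise so each analyte contributes equally to the distances.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Hierarchical cluster analysis (Ward linkage), cut into two groups.
labels = fcluster(linkage(Z, method="ward"), t=2, criterion="maxclust")

# PCA via SVD of the standardised matrix; explained variance per component.
U, sing, Vt = np.linalg.svd(Z, full_matrices=False)
explained = sing ** 2 / np.sum(sing ** 2)
print(labels, explained[:2])
```

With well-separated groups, the first principal component captures most of the variance and the cluster cut recovers the two water types.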
Abstract:
Using surface charts at 0330 GMT, the movement of the monsoon trough during the months June to September 1990 at two fixed longitudes, namely 79 degrees E and 85 degrees E, is studied. The probability distribution of trough position shows that the median, mean and mode occur at progressively more northern latitudes, especially at 85 degrees E, with a pronounced mode that is close to the northernmost limit reached by the trough. A spectral analysis of the fluctuating latitudinal position of the trough is carried out using the FFT and the Maximum Entropy Method (MEM). Both methods show significant peaks around 7.5 and 2.6 days, and a less significant one around 40-50 days. The two peaks at the shorter periods are more prominent at the eastern longitude. MEM shows an additional peak around 15 days. A study of the weather systems that occurred during the season shows them to have a duration of around 3 days and an interval between systems of around 9 days, suggesting a possible correlation with the dominant short periods observed in the spectrum of trough position.
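The FFT half of the spectral analysis described above can be sketched as follows; the daily latitude series, its oscillation amplitude, and the noise level are synthetic stand-ins, and the MEM step is omitted:

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic daily trough-latitude series for June-September (122 days):
# a 7.5-day oscillation (hypothetical amplitude) plus observational noise.
days = np.arange(122)
lat = 21 + 2.0 * np.sin(2 * np.pi * days / 7.5) + rng.normal(0, 0.5, 122)

# FFT periodogram of the demeaned series.
spec = np.abs(np.fft.rfft(lat - lat.mean())) ** 2
freq = np.fft.rfftfreq(122, d=1.0)           # cycles per day

peak = np.argmax(spec[1:]) + 1               # skip the zero frequency
print(f"dominant period ~ {1 / freq[peak]:.1f} days")
```

The dominant periodogram peak lands near the injected 7.5-day period, up to the frequency resolution of a 122-day record.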
Abstract:
In genetic epidemiology, population-based disease registries are commonly used to collect genotype or other risk factor information concerning affected subjects and their relatives. This work presents two new approaches to the statistical inference of ascertained data: conditional and full likelihood approaches for a disease with a variable age at onset phenotype, using familial data obtained from a population-based registry of incident cases. The aim is to obtain statistically reliable estimates of the general population parameters. The statistical analysis of familial data with variable age at onset becomes more complicated when some of the study subjects are non-susceptible, that is to say, these subjects never get the disease. A statistical model for a variable age at onset with long-term survivors is proposed for studies of familial aggregation, using a latent variable approach, as well as for prospective genetic association studies with candidate genes. In addition, we explore the possibility of a genetic explanation for the observed increase in the incidence of Type 1 diabetes (T1D) in Finland in recent decades and the hypothesis of non-Mendelian transmission of T1D-associated genes. Both classical and Bayesian statistical inference were used in the modelling and estimation. Although this work contains five studies with different statistical models, they all concern data obtained from nationwide registries of T1D and the genetics of T1D. In the analyses of T1D data, non-Mendelian transmission of T1D susceptibility alleles was not observed. Furthermore, non-Mendelian transmission of T1D susceptibility genes did not provide a plausible explanation for the increase in T1D incidence in Finland. Instead, the Human Leucocyte Antigen associations with T1D were confirmed in the population-based analysis, which combines T1D registry information, a reference sample of healthy subjects, and birth cohort information for the Finnish population.
Finally, substantial familial variation in susceptibility to T1D nephropathy was observed. The presented studies show the benefits of sophisticated statistical modelling in exploring risk factors for complex diseases.
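A minimal sketch of a long-term-survivor (cure) likelihood of the kind described above, assuming a susceptible fraction p, an exponential age at onset, and a fixed follow-up age; the distributional choices and all parameter values are illustrative assumptions, not the thesis's actual model:

```python
import numpy as np

rng = np.random.default_rng(2)
p_true, lam_true, followup = 0.3, 0.05, 60.0
n = 5000

# Simulate: each subject is susceptible with probability p; susceptible
# subjects get an Exponential(lam) age at onset; onset is observed only
# if it occurs before the end of follow-up.
susceptible = rng.random(n) < p_true
onset = rng.exponential(1 / lam_true, n)
observed = susceptible & (onset < followup)
case_ages = onset[observed]
n_cens = n - observed.sum()

def loglik(p, lam):
    # Observed onset at age a contributes p * lam * exp(-lam * a);
    # a subject with no onset by the follow-up age c contributes
    # (1 - p) + p * exp(-lam * c): never susceptible, or not yet affected.
    ll_case = np.log(p * lam) - lam * case_ages
    ll_cens = np.log((1 - p) + p * np.exp(-lam * followup))
    return ll_case.sum() + n_cens * ll_cens

# Crude grid-search maximum likelihood over (p, lam).
ps = np.linspace(0.05, 0.95, 91)
lams = np.linspace(0.01, 0.20, 96)
grid = np.array([[loglik(p, l) for l in lams] for p in ps])
i, j = np.unravel_index(grid.argmax(), grid.shape)
p_hat, lam_hat = ps[i], lams[j]
print(f"p_hat = {p_hat:.2f}, lam_hat = {lam_hat:.3f}")
```

The key point is that subjects who remain unaffected are a mixture of the never-susceptible and the censored susceptible, which the second likelihood term makes explicit.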
Abstract:
Bacteria play an important role in many ecological systems. The molecular characterization of bacteria using either cultivation-dependent or cultivation-independent methods reveals the large scale of bacterial diversity in natural communities, and the vastness of subpopulations within a species or genus. Understanding how bacterial diversity varies across different environments and also within populations should provide insights into many important questions of bacterial evolution and population dynamics. This thesis presents novel statistical methods for analyzing bacterial diversity using widely employed molecular fingerprinting techniques. The first objective of this thesis was to develop Bayesian clustering models to identify bacterial population structures. Bacterial isolates were identified using multilocus sequence typing (MLST), and Bayesian clustering models were used to explore the evolutionary relationships among isolates. Our method involves the inference of genetic population structures via an unsupervised clustering framework in which the dependence between loci is represented using graphical models. The population dynamics that generate such a population stratification were investigated using a stochastic model, in which homologous recombination between subpopulations can be quantified within a gene flow network. The second part of the thesis focuses on cluster analysis of community compositional data produced by two different cultivation-independent analyses: terminal restriction fragment length polymorphism (T-RFLP) analysis and fatty acid methyl ester (FAME) analysis. The cluster analysis aims to group bacterial communities that are similar in composition, which is an important step towards understanding the overall influences of environmental and ecological perturbations on bacterial diversity.
A common feature of T-RFLP and FAME data is zero-inflation, meaning that zero values are observed much more frequently than would be expected, for example, from a Poisson distribution in the discrete case or a Gaussian distribution in the continuous case. We provide two strategies for modeling zero-inflation in the clustering framework, which were validated on both synthetic and complex empirical data sets. We show in the thesis that our model, which takes into account dependencies between loci in MLST data, can produce better clustering results than methods that assume independent loci. Furthermore, computer algorithms that are efficient in analyzing large-scale data were adopted to meet the increasing computational need. Our method for detecting homologous recombination in subpopulations may provide a theoretical criterion for defining bacterial species. The clustering of bacterial community data, including T-RFLP and FAME profiles, provides an initial effort towards discovering the evolutionary dynamics that structure and maintain bacterial diversity in the natural environment.
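The zero-inflation idea above can be illustrated with a zero-inflated Poisson fit on synthetic counts; the grid-search maximum likelihood estimator and all parameter values are assumptions for illustration, not the thesis's clustering model:

```python
import numpy as np

rng = np.random.default_rng(3)
pi_true, lam_true, n = 0.4, 3.0, 20_000

# Zero-inflated Poisson: a structural zero with probability pi,
# otherwise a Poisson(lam) count.
x = np.where(rng.random(n) < pi_true, 0, rng.poisson(lam_true, n))

# Sufficient statistics for the ZIP log-likelihood
# (dropping the constant -sum(log x!)).
n_zero = int(np.sum(x == 0))
n_pos = x.size - n_zero
sum_x = int(x.sum())

def loglik(pi, lam):
    # Zeros arise either structurally (pi) or from the Poisson (exp(-lam)).
    ll_zero = n_zero * np.log(pi + (1 - pi) * np.exp(-lam))
    ll_pos = n_pos * (np.log(1 - pi) - lam) + sum_x * np.log(lam)
    return ll_zero + ll_pos

pis = np.linspace(0.05, 0.95, 91)
lams = np.linspace(0.5, 6.0, 111)
grid = np.array([[loglik(a, b) for b in lams] for a in pis])
i, j = np.unravel_index(grid.argmax(), grid.shape)
pi_hat, lam_hat = pis[i], lams[j]
print(f"pi_hat = {pi_hat:.2f}, lam_hat = {lam_hat:.2f}")
```

The zero-probability term mixes structural zeros with Poisson zeros, which is exactly the excess a plain Poisson model cannot absorb.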
Abstract:
Early detection of (pre-)signs of ulceration on a diabetic foot is valuable for clinical practice. Hyperspectral imaging is a promising technique for the detection and classification of such (pre-)signs. However, the number of spectral bands should be limited to avoid overfitting, which is critical for pixel classification with hyperspectral image data. The goal was to design a detector/classifier based on spectral imaging (SI) with a small number of optical bandpass filters. The performance and stability of the design were also investigated. The selection of the bandpass filters boils down to a feature selection problem. A dataset was built containing reflectance spectra of 227 skin spots from 64 patients, measured with a spectrometer. Each skin spot was annotated manually by clinicians as "healthy" or as a specific (pre-)sign of ulceration. Statistical analysis of the data set showed that the number of required filters is between 3 and 7, depending on additional constraints on the filter set. The stability analysis revealed that shot noise was the most critical factor affecting the classification performance. It indicated that this impact could be avoided in future SI systems with a camera sensor whose saturation level is higher than 10^6, or by post-processing of the images.
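Treating band selection as a feature selection problem can be sketched with greedy forward selection and a deliberately crude nearest-centroid score; the synthetic spectra, the informative band indices, and the scoring rule are all hypothetical, not the paper's method:

```python
import numpy as np

rng = np.random.default_rng(4)
n, n_bands = 200, 16

# Synthetic reflectance "spectra": only bands 2, 7 and 11 carry class
# signal (hypothetical stand-ins for informative wavelengths).
y = rng.integers(0, 2, n)
X = rng.normal(0, 1, (n, n_bands))
X[:, [2, 7, 11]] += 1.5 * y[:, None]

def score(cols):
    # Training accuracy of a nearest-centroid classifier on the chosen
    # bands -- a deliberately crude criterion, for illustration only.
    Z = X[:, cols]
    c0, c1 = Z[y == 0].mean(axis=0), Z[y == 1].mean(axis=0)
    pred = np.linalg.norm(Z - c1, axis=1) < np.linalg.norm(Z - c0, axis=1)
    return (pred == y).mean()

# Greedy forward selection of up to 3 bands.
selected = []
for _ in range(3):
    acc, band = max((score(selected + [b]), b)
                    for b in range(n_bands) if b not in selected)
    selected.append(band)
print(sorted(selected), score(selected))
```

A real filter-set design would replace the training-accuracy score with cross-validated performance and add the paper's constraints on feasible filter combinations.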
Abstract:
The absorption produced by the audience in concert halls is considered a random variable. Beranek's proposal [L. L. Beranek, Music, Acoustics and Architecture (Wiley, New York, 1962), p. 543] that audience absorption is proportional to the area they occupy and not to their number is subjected to a statistical hypothesis test. A two-variable linear regression model of the absorption, with audience area and residual area as regressor variables, is postulated for concert halls without added absorptive materials. Since Beranek's contention amounts to the statement that audience absorption is independent of the seating density, the test of the hypothesis lies in categorizing halls by seating density and examining for significant differences among the slopes of the regression planes of the different categories. Such a test shows that Beranek's hypothesis can be accepted. It is also shown that the audience area is a better predictor of the absorption than the audience number. The absorption coefficients and their 95% confidence limits are given for the audience and residual areas. A critique of the regression model is presented.
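The slope-comparison test described above amounts to an F test between nested regression models, with and without a category-by-area interaction. Below is a hedged sketch on synthetic halls simulated under the null hypothesis of a common slope (all names and values are invented, and the residual-area regressor is omitted for brevity):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n = 40

# Synthetic halls in two seating-density categories sharing a COMMON
# slope, mimicking the null hypothesis that absorption per unit
# audience area does not depend on seating density.
area = rng.uniform(200, 1500, n)
cat = rng.integers(0, 2, n).astype(float)
absorption = 50 + 0.9 * area + rng.normal(0, 40, n)

def rss(X):
    beta, *_ = np.linalg.lstsq(X, absorption, rcond=None)
    return float(np.sum((absorption - X @ beta) ** 2))

ones = np.ones(n)
X_null = np.column_stack([ones, area, cat])              # common slope
X_full = np.column_stack([ones, area, cat, cat * area])  # per-category slopes

# F test for the interaction term (1 extra parameter, n - 4 residual df).
F = (rss(X_null) - rss(X_full)) / (rss(X_full) / (n - 4))
p = stats.f.sf(F, 1, n - 4)
print(f"F = {F:.2f}, p = {p:.3f}")
```

A non-significant interaction term corresponds to accepting Beranek's hypothesis that the slope (absorption per unit audience area) is the same across density categories.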
Abstract:
It is well known that wrist pulse signals contain information about a person's state of health, and hence diagnosis based on pulse signals has long been of great importance. In this paper, the efficacy of signal processing techniques in extracting useful information from wrist pulse signals is demonstrated using signals recorded under two different experimental conditions, viz. a before-lunch condition and an after-lunch condition. We have used Pearson's product-moment correlation coefficient, which is an effective measure of phase synchronization, in making a statistical analysis of wrist pulse signals. Contour plots and box plots are used to illustrate various differences. Two-sample t-tests show that the correlations differ in a statistically significant way between the groups. The results show that the correlation coefficient is effective in distinguishing the changes taking place after having lunch. This paper demonstrates the ability of wrist pulse signals to detect changes occurring under two different conditions. The study assumes importance in view of the limited literature available on the analysis of wrist pulse signals in the case of food intake, and also in view of its potential health care applications.
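A hedged sketch of the two analysis steps mentioned above, Pearson's r as a synchrony measure and a two-sample t-test between conditions, using synthetic stand-ins for the pulse data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

# Phase synchrony via Pearson's r between two simultaneously recorded
# signals (synthetic sinusoids standing in for pulse waveforms).
t = np.linspace(0, 10, 1000)
sig_a = np.sin(2 * np.pi * t) + rng.normal(0, 0.2, t.size)
sig_b = np.sin(2 * np.pi * t + 0.3) + rng.normal(0, 0.2, t.size)
r, _ = stats.pearsonr(sig_a, sig_b)

# Group comparison: per-subject correlation coefficients before and
# after lunch (hypothetical values), compared with a two-sample t-test.
before = rng.normal(0.60, 0.08, 30)
after = rng.normal(0.75, 0.08, 30)
t_stat, p_value = stats.ttest_ind(before, after)
print(f"r = {r:.2f}, t = {t_stat:.2f}, p = {p_value:.2g}")
```

A small phase lag leaves r high, while a genuine shift in the distribution of per-subject correlations shows up as a significant t statistic.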
Abstract:
The stress release model, a stochastic version of the elastic rebound theory, is applied to the large events from four synthetic earthquake catalogs generated by models with various levels of disorder in the distribution of fault zone strength (Ben-Zion, 1996). They include models with uniform properties (U), a Parkfield-type asperity (A), fractal brittle properties (F), and multi-size-scale heterogeneities (M). The results show that the degree of regularity or predictability in the assumed fault properties, based on both the Akaike information criterion and simulations, follows the order U, F, A, and M, which is in good agreement with that obtained by pattern recognition techniques applied to the full set of synthetic data. Data simulated from the best-fitting stress release models reproduce, both visually and in distributional terms, the main features of the original catalogs. The differences in character and in the quality of prediction between the four cases are shown to depend on two main aspects: the parameter controlling the sensitivity to departures from the mean stress level, and the frequency-magnitude distribution, which differs substantially between the four cases. In particular, it is shown that the predictability of the data is strongly affected by the form of the frequency-magnitude distribution, being greatly reduced if a pure Gutenberg-Richter form is assumed to hold out to high magnitudes.
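The stress release model's conditional intensity and point-process log-likelihood can be sketched as follows. The exp(a + b*(t - c*S(t))) form and the 10^(0.75*M) stress scaling follow common conventions in the stress release literature; the catalog, the parameter values, and the scaling are assumptions here, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic catalog: 25 event times (years) with magnitudes in [5, 7].
times = np.sort(rng.uniform(0, 100, 25))
mags = rng.uniform(5.0, 7.0, 25)
drops = 10 ** (0.75 * (mags - 5.0))          # stress released per event
cum_drop = np.concatenate([[0.0], np.cumsum(drops)])

def log_likelihood(a, b, c, horizon=100.0):
    """Point-process log-likelihood for conditional intensity
    lambda(t) = exp(a + b*(t - c*S(t))), where S(t) is the total
    stress released by events before time t."""
    # S(t) on a fine grid, via the number of events before each grid time.
    grid = np.linspace(0.0, horizon, 20_000)
    S_grid = cum_drop[np.searchsorted(times, grid)]
    lam = np.exp(a + b * (grid - c * S_grid))
    # Trapezoidal approximation of the integrated intensity.
    integral = np.sum((lam[:-1] + lam[1:]) / 2 * np.diff(grid))
    # Log intensity at each event uses the stress released *before* it.
    log_lam_ev = a + b * (times - c * cum_drop[:-1])
    return log_lam_ev.sum() - integral

ll = log_likelihood(a=-1.5, b=0.02, c=0.5)
aic = 2 * 3 - 2 * ll                         # AIC with k = 3 parameters
print(f"logL = {ll:.1f}, AIC = {aic:.1f}")
```

Maximising this log-likelihood over (a, b, c) for each catalog and comparing AIC values is the kind of model comparison the abstract describes.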
Abstract:
The stress release model, a stochastic version of the elastic-rebound theory, is applied to the historical earthquake data from three strong earthquake-prone regions of China, including North China, Southwest China, and the Taiwan seismic regions. The results show that the seismicity along a plate boundary (Taiwan) is more active than in intraplate regions (North and Southwest China). The degree of predictability or regularity of seismic events in these seismic regions, based on both the Akaike information criterion (AIC) and fitted sensitivity parameters, follows the order Taiwan, Southwest China, and North China, which is further identified by numerical simulations. (c) 2004 Elsevier Ltd. All rights reserved.