46 results for Statistical analysis methods
in CentAUR: Central Archive University of Reading - UK
Abstract:
For the tracking of extrema associated with weather systems to be applied to a broad range of fields, it is necessary to remove a background field that represents the slowly varying, large spatial scales. The sensitivity of the tracking analysis to the form of background field removed is explored for the Northern Hemisphere winter storm tracks for three contrasting fields from an integration of the U.K. Met Office's (UKMO) Hadley Centre Climate Model (HadAM3). Several methods are explored for the removal of a background field, ranging from the simple subtraction of the climatology to the more sophisticated removal of the planetary scales. Two temporal filters are also considered in the form of a 2-6-day Lanczos filter and a 20-day high-pass Fourier filter. The analysis indicates that the simple subtraction of the climatology tends to change the nature of the systems to the extent that there is a redistribution of the systems relative to the climatological background, resulting in very similar statistical distributions for both positive and negative anomalies. The optimal planetary wave filter removes total wavenumbers less than or equal to a number in the range 5-7, resulting in distributions more easily related to particular types of weather system. For the temporal filters, the 2-6-day bandpass filter is found to have a detrimental impact on the individual weather systems, resulting in the storm tracks having a weak waveguide type of behavior. The 20-day high-pass temporal filter is less aggressive than the 2-6-day filter and produces results falling between those of the climatological and 2-6-day filters.
Abstract:
The Representative Soil Sampling Scheme of England and Wales has recorded information on the soil of agricultural land in England and Wales since 1969. It is a valuable source of information about the soil in the context of monitoring for sustainable agricultural development. Changes in soil nutrient status and pH were examined over the period 1971-2001. Several methods of statistical analysis were applied to data from the surveys during this period. The main focus here is on the data for 1971, 1981, 1991 and 2001. The results of examining change over time in general show that levels of potassium in the soil have increased, those of magnesium have remained fairly constant, those of phosphorus have declined and pH has changed little. Future sampling needs have been assessed in the context of monitoring, to determine the mean at a given level of confidence and tolerable error and to detect change in the mean over time at these same levels over periods of 5 and 10 years. The results of a non-hierarchical multivariate classification suggest that England and Wales could be stratified to optimize future sampling and analysis. To monitor soil quality and health more generally than for agriculture, more of the country should be sampled and a wider range of properties recorded.
Abstract:
Background: We report an analysis of a protein network of functionally linked proteins, identified from a phylogenetic statistical analysis of complete eukaryotic genomes. Phylogenetic methods identify pairs of proteins that co-evolve on a phylogenetic tree, and have been shown to have a high probability of correctly identifying known functional links. Results: The eukaryotic correlated evolution network we derive displays the familiar power law scaling of connectivity. We introduce the use of explicit phylogenetic methods to reconstruct the ancestral presence or absence of proteins at the interior nodes of a phylogeny of eukaryote species. We find that the connectivity distribution of proteins at the point they arise on the tree and join the network follows a power law, as does the connectivity distribution of proteins at the time they are lost from the network. Proteins resident in the network acquire connections over time, but we find no evidence that 'preferential attachment' - the phenomenon of newly acquired connections in the network being more likely to be made to proteins with large numbers of connections - influences the network structure. We derive a 'variable rate of attachment' model in which proteins vary in their propensity to form network interactions independently of how many connections they have or of the total number of connections in the network, and show how this model can produce apparent power-law scaling without preferential attachment. Conclusion: A few simple rules can explain the topological structure and evolutionary changes to protein-interaction networks: most change is concentrated in satellite proteins of low connectivity and small phenotypic effect, and proteins differ in their propensity to form attachments. 
Given these rules of assembly, power law scaled networks naturally emerge from simple principles of selection, yielding protein interaction networks that retain a high degree of robustness on short time scales and evolvability on longer evolutionary time scales.
Abstract:
In designing modern office buildings, building spaces are frequently zoned by introducing internal partitioning, which may have a significant influence on the room air environment. This internal partitioning was studied by means of model test, numerical simulation, and statistical analysis as the final stage. In this paper, the results produced from the statistical analysis are summarized and presented.
Abstract:
Baking and 2-g mixograph analyses were performed for 55 cultivars (19 spring and 36 winter wheat) from various quality classes from the 2002 harvest in Poland. An instrumented 2-g direct-drive mixograph was used to study the mixing characteristics of the wheat cultivars. A number of parameters were extracted automatically from each mixograph trace and correlated with baking volume and flour quality parameters (protein content and high molecular weight glutenin subunit [HMW-GS] composition by SDS-PAGE) using multiple linear regression statistical analysis. Principal component analysis of the mixograph data discriminated between four flour quality classes, and predictions of baking volume were obtained using several selected mixograph parameters, chosen using a best subsets regression routine, giving R² values of 0.862-0.866. In particular, three new spring wheat strains (CHD 502a-c) recently registered in Poland were highly discriminated and predicted to give high baking volume on the basis of two mixograph parameters: peak bandwidth and 10-min bandwidth.
Abstract:
We are developing computational tools supporting the detailed analysis of the dependence of neural electrophysiological response on dendritic morphology. We approach this problem by combining simulations of faithful models of neurons (experimental real-life morphological data with known models of channel kinetics) with algorithmic extraction of morphological and physiological parameters and statistical analysis. In this paper, we present a novel method for the automatic recognition of spike trains in voltage traces, which eliminates the need for human intervention. This enables classification of waveforms with consistent criteria across all the analyzed traces, and so amounts to a reduction of the noise in the data. The method allows the automatic extraction of relevant physiological parameters necessary for further statistical analysis. To illustrate the usefulness of this procedure for analyzing voltage traces, we characterized the influence of the somatic current injection level on several electrophysiological parameters in a set of modeled neurons. This application suggests that such algorithmic processing of physiological data extracts parameters in a form suitable for further investigation of structure-activity relationships in single neurons.
Abstract:
A precipitation downscaling method is presented using precipitation from a general circulation model (GCM) as predictor. The method extends a previous method from monthly to daily temporal resolution. The simplest form of the method corrects for biases in wet-day frequency and intensity. A more sophisticated variant also takes account of flow-dependent biases in the GCM. The method is flexible and simple to implement. It is proposed here as a correction of GCM output for applications where sophisticated methods are not available, or as a benchmark for the evaluation of other downscaling methods. Applied to output from reanalyses (ECMWF, NCEP) in the region of the European Alps, the method is capable of reducing large biases in the precipitation frequency distribution, even for high quantiles. The two variants exhibit similar performance, but the ideal choice of method can depend on the GCM/reanalysis, and it is recommended to test the methods in each case. Limitations of the method are found in small areas with unresolved topographic detail that influences higher-order statistics (e.g. high quantiles). When used as a benchmark for three regional climate models (RCMs), the corrected reanalysis and the RCMs perform similarly in many regions, but the added value of the latter is evident for high quantiles in some small regions.
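The correction of wet-day frequency and intensity mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 1.0 mm wet-day threshold, and the toy data are assumptions made for illustration. The idea shown is to pick a model threshold so that the model produces the observed fraction of wet days, then rescale the model's wet-day amounts so that the mean wet-day intensity matches the observations.

```python
def fit_wet_day_correction(model, obs, wet_threshold=1.0):
    """Fit a threshold and scale factor from daily precipitation series.

    Hypothetical helper: `wet_threshold` (same units as the data) defines
    an observed wet day and is an assumed value, not taken from the abstract.
    """
    if not model or not obs:
        raise ValueError("both series must be non-empty")
    obs_wet = [x for x in obs if x >= wet_threshold]
    # Number of model days that should count as wet to match the observed frequency.
    n_target = round(len(obs_wet) / len(obs) * len(model))
    if n_target == 0:
        raise ValueError("no wet days in the observations")
    # Model threshold: the n_target-th largest model value.
    model_threshold = sorted(model, reverse=True)[n_target - 1]
    model_wet = [x for x in model if x >= model_threshold]
    # Mean amount above the respective threshold on wet days.
    obs_excess = sum(x - wet_threshold for x in obs_wet) / len(obs_wet)
    model_excess = sum(x - model_threshold for x in model_wet) / len(model_wet)
    if model_excess == 0:
        raise ValueError("model wet days have no spread above the threshold")
    return model_threshold, obs_excess / model_excess


def apply_wet_day_correction(model, model_threshold, scale, wet_threshold=1.0):
    """Set sub-threshold days to dry and rescale the remaining wet-day amounts."""
    return [
        wet_threshold + scale * (x - model_threshold) if x >= model_threshold else 0.0
        for x in model
    ]


# Toy data: the model drizzles too often and underestimates heavy days.
model_series = [0.5, 0.5, 1.0, 2.0, 3.0, 4.0, 6.0, 8.0]
obs_series = [0.0, 0.0, 0.0, 0.0, 2.0, 3.0, 5.0, 10.0]
threshold, factor = fit_wet_day_correction(model_series, obs_series)
corrected = apply_wet_day_correction(model_series, threshold, factor)
```

A flow-dependent variant, as described in the abstract, would presumably fit separate parameters per circulation type; that step is not sketched here.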
Abstract:
Data from various stations having different measurement record periods between 1988 and 2007 are analyzed to investigate the surface ozone concentration, long-term trends, and seasonal changes in and around Ireland. Time series statistical analysis is performed on the monthly mean data using seasonal and trend decomposition procedures and the Box-Jenkins approach (autoregressive integrated moving average). In general, ozone concentrations in the Irish region are found to have a negative trend at all sites except at the coastal sites of Mace Head and Valentia. Data from the most polluted Dublin city site have shown a very strong negative trend of −0.33 ppb/yr with a 95% confidence limit of 0.17 ppb/yr (i.e., −0.33 ± 0.17) for the period 2002−2007, and for the site near the city of Cork, the trend is found to be −0.20 ± 0.11 ppb/yr over the same period. The negative trend for other sites is more pronounced when the data span is considered from around the year 2000 to 2007. The rural sites of Wexford and Monaghan have also shown a very strong negative trend of −0.99 ± 0.13 and −0.58 ± 0.12 ppb/yr, respectively, for the period 2000−2007. Mace Head, a site that is representative of ozone changes in the air advected from the Atlantic to Europe in the marine planetary boundary layer, has shown a positive trend of about +0.16 ± 0.04 ppb per annum over the entire period 1988−2007, but this positive trend has reduced during recent years (e.g., in the period 2001−2007). Cluster analyses of back trajectories are performed for the stations having a long record of data, Mace Head and Lough Navar. 
For Mace Head, the northern and western clean air sectors have shown a similar positive trend (+0.17 ± 0.02 ppb/yr for the northern sector and +0.18 ± 0.02 ppb/yr for the western sector) for the whole period, but partial analysis for the clean western sector at Mace Head shows different trends during different time periods with a decrease in the positive trend since 1988 indicating a deceleration in the ozone trend for Atlantic air masses entering Europe.
Abstract:
Peak picking is an early key step in MS data analysis. We compare three commonly used approaches to peak picking and discuss their merits by means of statistical analysis. Methods investigated encompass signal-to-noise ratio, continuous wavelet transform, and a correlation-based approach using a Gaussian template. Functionality of the three methods is illustrated and discussed in a practical context using a mass spectral data set created with MALDI-TOF technology. Sensitivity and specificity are investigated using a manually defined reference set of peaks. As an additional criterion, the robustness of the three methods is assessed by a perturbation analysis and illustrated using ROC curves.
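Of the three approaches compared above, the signal-to-noise criterion is the simplest to illustrate. The sketch below is an assumption-laden example rather than the procedure evaluated in the paper: the function name, the median baseline, the MAD-based noise estimate, and the threshold of 3 are all illustrative choices.

```python
from statistics import median


def pick_peaks_snr(intensities, snr_threshold=3.0):
    """Return indices of local maxima whose signal-to-noise ratio passes a threshold.

    Hypothetical sketch: the baseline is the median intensity and the noise is
    the median absolute deviation (MAD) scaled by 1.4826, which makes it
    comparable to a standard deviation for Gaussian noise.
    """
    baseline = median(intensities)
    noise = 1.4826 * median(abs(x - baseline) for x in intensities)
    if noise == 0:
        raise ValueError("noise estimate is zero; cannot compute a ratio")
    peaks = []
    for i in range(1, len(intensities) - 1):
        y = intensities[i]
        # A peak must rise above its left neighbour and not fall below its right one.
        is_local_max = y > intensities[i - 1] and y >= intensities[i + 1]
        if is_local_max and (y - baseline) / noise >= snr_threshold:
            peaks.append(i)
    return peaks


# Toy spectrum with two clear peaks on a noisy baseline.
spectrum = [1.0, 1.2, 0.9, 1.1, 9.0, 1.0, 0.8, 1.2, 1.1, 7.5, 1.0, 0.9]
peak_indices = pick_peaks_snr(spectrum)
```

Small local maxima in the baseline are rejected because their ratio to the noise estimate stays well below the threshold, which is the trade-off between sensitivity and specificity that the abstract examines.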
Abstract:
Recent interest in the validation of general circulation models (GCMs) has been devoted to objective methods. A small number of authors have used the direct synoptic identification of phenomena together with a statistical analysis to perform the objective comparison between various datasets. This paper describes a general method for performing the synoptic identification of phenomena that can be used for an objective analysis of atmospheric, or oceanographic, datasets obtained from numerical models and remote sensing. Methods usually associated with image processing have been used to segment the scene and to identify suitable feature points to represent the phenomena of interest. This is performed for each time level. A technique from dynamic scene analysis is then used to link the feature points to form trajectories. The method is fully automatic and should be applicable to a wide range of geophysical fields. An example will be shown of results obtained from this method using data obtained from a run of the Universities Global Atmospheric Modelling Project GCM.
Abstract:
Assaying a large number of genetic markers from patients in clinical trials is now possible in order to tailor drugs with respect to efficacy. The statistical methodology for analysing such massive data sets is challenging. The most popular type of statistical analysis is to use a univariate test for each genetic marker, once all the data from a clinical study have been collected. This paper presents a sequential method for conducting an omnibus test for detecting gene-drug interactions across the genome, thus allowing informed decisions at the earliest opportunity and overcoming the multiple testing problems from conducting many univariate tests. We first propose an omnibus test for a fixed sample size. This test is based on combining F-statistics that test for an interaction between treatment and the individual single nucleotide polymorphism (SNP). As SNPs tend to be correlated, we use permutations to calculate a global p-value. We extend our omnibus test to the sequential case. In order to control the type I error rate, we propose a sequential method that uses permutations to obtain the stopping boundaries. The results of a simulation study show that the sequential permutation method is more powerful than alternative sequential methods that control the type I error rate, such as the inverse-normal method. The proposed method is flexible as we do not need to assume a mode of inheritance and can also adjust for confounding factors. An application to real clinical data illustrates that the method is computationally feasible for a large number of SNPs. Copyright (c) 2007 John Wiley & Sons, Ltd.
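The fixed-sample omnibus test described above combines per-marker statistics and uses permutations to obtain a single global p-value that respects correlation between markers. The following is a simplified sketch under stated assumptions, not the authors' method: the per-SNP F-statistic is replaced by a simpler score (the between-arm difference in marker-response covariance), genotype rows are permuted as whole units to keep marker correlation intact, and all function names are hypothetical. The sequential stopping boundaries are not sketched.

```python
import random


def interaction_scores(y, treat, geno):
    """One score per marker; a large magnitude suggests a treatment-by-marker interaction.

    Stand-in for the per-SNP F-statistic (assumption). `treat` holds 0/1 arm
    labels and `geno` is a list of per-patient rows of numeric marker codes.
    """
    n = len(y)
    arm_mean = {}
    for arm in (0, 1):
        values = [y[i] for i in range(n) if treat[i] == arm]
        arm_mean[arm] = sum(values) / len(values)
    # Residualise the response on treatment arm, then flip the sign in the
    # control arm so that the score contrasts the two arms.
    w = [(y[i] - arm_mean[treat[i]]) * (1 if treat[i] == 1 else -1) for i in range(n)]
    scores = []
    for j in range(len(geno[0])):
        g_mean = sum(row[j] for row in geno) / n
        scores.append(sum(w[i] * (geno[i][j] - g_mean) for i in range(n)))
    return scores


def omnibus_statistic(y, treat, geno):
    """Combine the per-marker scores into one statistic (sum of squares)."""
    return sum(s * s for s in interaction_scores(y, treat, geno))


def permutation_global_p(y, treat, geno, n_perm=999, seed=0):
    """Global p-value from permuting whole genotype rows across patients."""
    rng = random.Random(seed)
    observed = omnibus_statistic(y, treat, geno)
    rows = list(geno)  # shallow copy, so the caller's list order is untouched
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(rows)
        if omnibus_statistic(y, treat, rows) >= observed:
            exceed += 1
    # Counting the observed arrangement itself keeps the p-value above zero.
    return (exceed + 1) / (n_perm + 1)
```

Because each permutation moves a patient's entire genotype row, correlated markers stay correlated in every permuted data set, which is what lets a single global p-value replace many separately corrected univariate tests.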