964 results for statistic
Nonlinear system identification is considered using a generalized kernel regression model. Unlike the standard kernel model, which employs a fixed common variance for all the kernel regressors, each kernel regressor in the generalized kernel model has an individually tuned diagonal covariance matrix that is determined by maximizing the correlation between the training data and the regressor using a repeated guided random search based on boosting optimization. An efficient construction algorithm based on orthogonal forward regression with leave-one-out (LOO) test statistic and local regularization (LR) is then used to select a parsimonious generalized kernel regression model from the resulting full regression matrix. The proposed modeling algorithm is fully automatic and the user is not required to specify any criterion to terminate the construction procedure. Experimental results involving two real data sets demonstrate the effectiveness of the proposed nonlinear system identification approach.
A modified radial basis function (RBF) neural network and its identification algorithm based on observational data with heterogeneous noise are introduced. The Box-Cox transformed system output is represented by the RBF neural network. To identify the model from observational data, the singular value decomposition of the full regression matrix, consisting of basis functions formed from the system input data, is first carried out. A new fast identification method is then developed using the Gauss-Newton algorithm to derive the required Box-Cox transformation, based on a maximum likelihood estimator (MLE) for a model base spanned by the largest eigenvectors. Finally, the Box-Cox transformation-based RBF neural network is identified from the derived optimal Box-Cox transformation, using an orthogonal forward regression algorithm with a pseudo-PRESS statistic to select a sparse RBF model with good generalisation. The proposed algorithm and its efficacy are demonstrated with numerical examples.
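For reference, the Box-Cox transformation named in this abstract can be sketched as follows. This is only the textbook form of the transform and its inverse, not the paper's identification algorithm; the function names `box_cox` and `inverse_box_cox` are illustrative assumptions.

```python
import math


def box_cox(y, lam):
    """Textbook Box-Cox transform of a strictly positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires a strictly positive value")
    if lam == 0:
        # Limiting case of (y**lam - 1) / lam as lam tends to zero.
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inverse_box_cox(z, lam):
    """Map a transformed value z back to the original scale."""
    if lam == 0:
        return math.exp(z)
    return (lam * z + 1.0) ** (1.0 / lam)
```

With `lam = 1` the transform is a shift by one, and with `lam = 0` it is the natural logarithm, which is why the parameter is usually estimated from the data.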
In this correspondence, new robust nonlinear model construction algorithms for a large class of linear-in-the-parameters models are introduced to enhance model robustness via combined parameter regularization and new robust structural selective criteria. In parallel to parameter regularization, we use two classes of robust model selection criteria, based on either experimental design criteria that optimize model adequacy or the predicted residual sums of squares (PRESS) statistic that optimizes model generalization capability. Three robust identification algorithms are introduced: A-optimality combined with the regularized orthogonal least squares algorithm, D-optimality combined with the regularized orthogonal least squares algorithm, and the PRESS statistic combined with the regularized orthogonal least squares algorithm. A common characteristic of these algorithms is that the inherent computational efficiency associated with the orthogonalization scheme in orthogonal least squares or regularized orthogonal least squares has been extended such that the new algorithms are computationally efficient. Numerical examples are included to demonstrate the effectiveness of the algorithms.
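To illustrate what the PRESS statistic measures, here is a minimal sketch for an ordinary least squares straight-line fit. It assumes no regularization and no orthogonalization, so it is not the algorithm of this abstract; all function names are illustrative. It uses the standard identity that a leave-one-out residual equals the ordinary residual divided by one minus the leverage, which avoids refitting the model once per data point.

```python
def fit_line(xs, ys):
    """Least squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def press_statistic(xs, ys):
    """PRESS from a single fit, using the leverage of each point."""
    n = len(xs)
    intercept, slope = fit_line(xs, ys)
    x_mean = sum(xs) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    total = 0.0
    for x, y in zip(xs, ys):
        residual = y - (intercept + slope * x)
        leverage = 1.0 / n + (x - x_mean) ** 2 / sxx
        total += (residual / (1.0 - leverage)) ** 2
    return total


def press_brute_force(xs, ys):
    """Same quantity by explicitly refitting with each point left out."""
    total = 0.0
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        total += (ys[i] - (a + b * xs[i])) ** 2
    return total
```

The two functions should agree up to rounding, which is the property that makes PRESS cheap enough to use as a model selection criterion.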
A large fraction of papers in the climate literature includes erroneous uses of significance tests. A Bayesian analysis is presented to highlight the meaning of significance tests and why typical misuse occurs. The significance statistic is not a quantitative measure of how confident we can be of the ‘reality’ of a given result. It is concluded that a significance test very rarely provides useful quantitative information.
The polar winter stratospheric vortex is a coherent structure that undergoes different types of deformation that can be revealed by the geometric invariant moments. Three moments are used—the aspect ratio, the centroid latitude, and the area of the vortex based on stratospheric data from the 40-yr ECMWF Re-Analysis (ERA-40) project—to study sudden stratospheric warmings. Hierarchical clustering combined with data image visualization techniques is used as well. Using the gap statistic, three optimal clusters are obtained based on the three geometric moments considered here. The 850-K potential vorticity field, as well as the vertical profiles of polar temperature and zonal wind, provides evidence that the clusters represent, respectively, the undisturbed (U), displaced (D), and split (S) states of the polar vortex. This systematic method for identifying and characterizing the state of the polar vortex using objective methods is useful as a tool for analyzing observations and as a test for climate models to simulate the observations. The method correctly identifies all previously identified major warmings and also identifies significant minor warmings where the atmosphere is substantially disturbed but does not quite meet the criteria to qualify as a major stratospheric warming.
Reliability analysis of probabilistic forecasts, in particular through the rank histogram or Talagrand diagram, is revisited. Two shortcomings are pointed out: Firstly, a uniform rank histogram is but a necessary condition for reliability. Secondly, if the forecast is assumed to be reliable, an indication is needed of how far a histogram is expected to deviate from uniformity merely due to randomness. Concerning the first shortcoming, it is suggested that forecasts be grouped or stratified along suitable criteria, and that reliability be analyzed individually for each forecast stratum. A reliable forecast should have uniform histograms for all individual forecast strata, not only for all forecasts as a whole. As to the second shortcoming, instead of the observed frequencies, the probability of the observed frequency is plotted, providing an indication of the likelihood of the result under the hypothesis that the forecast is reliable. Furthermore, a Goodness-Of-Fit statistic is discussed which is essentially the reliability term of the Ignorance score. The discussed tools are applied to medium range forecasts for 2 m-temperature anomalies at several locations and lead times. The forecasts are stratified along the expected ranked probability score. Those forecasts which feature a high expected score turn out to be particularly unreliable.
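A rank histogram of the kind discussed in this abstract can be built as sketched below. This is the basic construction only, assuming no ties between observation and ensemble members; it does not include the stratification or the probability plot that the abstract proposes. The function name is illustrative.

```python
def rank_histogram(ensembles, observations):
    """Count how often the observation falls at each rank in its ensemble.

    ensembles: list of equally sized lists of ensemble member values.
    observations: list of verifying observations, one per ensemble.
    Returns a list of len(members) + 1 counts.
    """
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        # Rank = number of members strictly below the observation.
        rank = sum(1 for m in members if m < obs)
        counts[rank] += 1
    return counts
```

For a reliable forecast each of the bins is expected to be equally populated on average, so strong U shapes or slopes in the counts point to dispersion or bias problems.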
Background: Association mapping, initially developed in human disease genetics, is now being applied to plant species. The model species Arabidopsis provided some of the first examples of association mapping in plants, identifying previously cloned flowering time genes, despite high population sub-structure. More recently, association genetics has been applied to barley, where breeding activity has resulted in a high degree of population sub-structure. A major genotypic division within barley is that between winter- and spring-sown varieties, which differ in their requirement for vernalization to promote subsequent flowering. To date, all attempts to validate association genetics in barley by identifying major flowering time loci that control vernalization requirement (VRN-H1 and VRN-H2) have failed. Here, we validate the use of association genetics in barley by identifying VRN-H1 and VRN-H2, despite their prominent role in determining population sub-structure. Results: By taking barley as a typical inbreeding crop, and seasonal growth habit as a major partitioning phenotype, we develop an association mapping approach which successfully identifies VRN-H1 and VRN-H2, the underlying loci largely responsible for this agronomic division. We find that a combination of Structured Association followed by Genomic Control, used to correct for population structure and inflation of the test statistic, resolved significant associations only with VRN-H1 and the VRN-H2 candidate genes, as well as two genes closely linked to VRN-H1 (HvCSFs1 and HvPHYC). Conclusion: We show that, after employing appropriate statistical methods to correct for population sub-structure, the genome-wide partitioning effect of allelic status at VRN-H1 and VRN-H2 does not result in the high levels of spurious association expected to occur in highly structured samples. Furthermore, we demonstrate that both VRN-H1 and the candidate VRN-H2 genes can be identified using association mapping. 
Discrimination between intragenic VRN-H1 markers was achieved, indicating that candidate causative polymorphisms may be discerned and prioritised within a larger set of positive associations. This proof of concept study demonstrates the feasibility of association mapping in barley, even within highly structured populations. A major advantage of this method is that it does not require large numbers of genome-wide markers, and is therefore suitable for fine mapping and candidate gene evaluation, especially in species for which large numbers of genetic markers are either unavailable or too costly.
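The Genomic Control step mentioned in this abstract is commonly implemented as a rescaling of 1-degree-of-freedom chi-square test statistics by an inflation factor estimated from their median. The sketch below shows that common form, assuming chi-square statistics are already available for each marker; it is not taken from the study itself, and the names are illustrative.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Return the inflation factor and the deflated test statistics."""
    inflation = statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
    # By convention the statistics are never inflated, only deflated.
    inflation = max(inflation, 1.0)
    return inflation, [stat / inflation for stat in chi2_stats]
```

An inflation factor well above 1 suggests that population structure is pushing the test statistics upward across the genome.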
Optimal estimation (OE) improves sea surface temperature (SST) estimated from satellite infrared imagery in the “split-window”, in comparison to SST retrieved using the usual multi-channel (MCSST) or non-linear (NLSST) estimators. This is demonstrated using three months of observations of the Advanced Very High Resolution Radiometer (AVHRR) on the first Meteorological Operational satellite (Metop-A), matched in time and space to drifter SSTs collected on the global telecommunications system. There are 32,175 matches. The prior for the OE is forecast atmospheric fields from the Météo-France global numerical weather prediction system (ARPEGE), the forward model is RTTOV8.7, and a reduced state vector comprising SST and total column water vapour (TCWV) is used. Operational NLSST coefficients give mean and standard deviation (SD) of the difference between satellite and drifter SSTs of 0.00 and 0.72 K. The “best possible” NLSST and MCSST coefficients, empirically regressed on the data themselves, give zero mean difference and SDs of 0.66 K and 0.73 K respectively. Significant contributions to the global SD arise from regional systematic errors (biases) of several tenths of kelvin in the NLSST. With no bias corrections to either prior fields or forward model, the SSTs retrieved by OE minus drifter SSTs have mean and SD of − 0.16 and 0.49 K respectively. The reduction in SD below the “best possible” regression results shows that OE deals with structural limitations of the NLSST and MCSST algorithms. Using simple empirical bias corrections to improve the OE, retrieved minus drifter SSTs are obtained with mean and SD of − 0.06 and 0.44 K respectively. Regional biases are greatly reduced, such that the absolute bias is less than 0.1 K in 61% of 10°-latitude by 30°-longitude cells. OE also allows a statistic of the agreement between modelled and measured brightness temperatures to be calculated. 
We show that this measure is more efficient than the current system of confidence levels at identifying reliable retrievals, and that the best 75% of satellite SSTs by this measure have negligible bias and retrieval error of order 0.25 K.
This paper considers the effect of using a GARCH filter on the properties of the BDS test statistic as well as a number of other issues relating to the application of the test. It is found that, for certain values of the user-adjustable parameters, the finite sample distribution of the test is far-removed from asymptotic normality. In particular, when data generated from some completely different model class are filtered through a GARCH model, the frequency of rejection of iid falls, often substantially. The implication of this result is that it might be inappropriate to use non-rejection of iid of the standardised residuals of a GARCH model as evidence that the GARCH model ‘fits’ the data.
Variations in lake area and depth reflect climatically induced changes in the water balance of overflowing as well as closed lakes. A new global data base of lake status has been assembled, and is used to compare two simulations for 6 ka (6000 yr ago) made with successive R15 versions of the NCAR Community Climate Model (CCM). Simulated water balance was expressed as anomalies of annual precipitation minus evaporation (P-E); observed water balance as anomalies of lake status. Comparisons were made visually, by comparing regional averages, and by a statistic that compares the signs of simulated P-E anomalies (smoothly interpolated to the lake sites) with the status anomalies. Both CCM0 and CCM1 showed enhanced Northern-Hemisphere monsoons at 6 ka. Both underestimated the effect, but CCM1 fitted the spatial patterns better. In the northern mid- and high-latitudes the two versions differed more, and fitted the data less satisfactorily. CCM1 performed better than CCM0 in North America and central Eurasia, but not in Europe. Both models (especially CCM0) simulated excessive aridity in interior Eurasia. The models were systematically wrong in the southern mid-latitudes. Problems may have been caused by inadequate treatment of changes in sea-surface conditions in both models. Palaeolake status data will continue to provide a benchmark for the evaluation of modelling improvements.
The presence of 10 virulence genes was examined using polymerase chain reaction (PCR) in 365 European O157 and non-O157 Escherichia coli isolates associated with verotoxin production. Strain-specific PCR data were analysed using hierarchical clustering. The resulting dendrogram clearly separated O157 from non-O157 strains. The former clustered typical high-risk seropathotype (SPT) A strains from all regions, including Sweden and Spain, which were homogenous by Cramer's V statistic, and strains with less typical O157 features mostly from Hungary. The non-O157 strains divided into a high-risk SPTB harbouring O26, O111 and O103 strains, a group pathogenic to pigs, and a group with few virulence genes other than for verotoxin. The data demonstrate SPT designation and selected PCR separated verotoxigenic E. coli of high and low risk to humans; although more virulence genes or pulsed-field gel electrophoresis will need to be included to separate high-risk strains further for epidemiological tracing.
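Cramér's V, the association measure named in this abstract, can be computed from a contingency table as sketched here. The sketch assumes every row and column total is non-zero; the table layout (for example regions by gene presence) is a hypothetical illustration, not the study's data.

```python
import math


def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    smaller_dim = min(len(table), len(col_totals))
    return math.sqrt(chi2 / (n * (smaller_dim - 1)))
```

Values near 0 indicate that the row groups have similar column profiles, and values near 1 indicate that the groups are almost perfectly separated.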
Greater self-complexity has been suggested as a protective factor for people under stress (Linville, 1985). Two different measures have been proposed to assess individual self-complexity: Attneave’s H statistic (1959) and a composite index of two components of self-complexity (SC; Rafaeli-Mor et al., 1999). Using mood-incongruent recall, i.e., recalling positive events while in negative mood, the present study compared validity of the two measures through reanalysis of Sakaki’s (2004) data. Results indicated that H statistic did not predict performance of mood-incongruent recall. In contrast, greater SC was associated with better mood-incongruent recall even when the effect of H statistic was controlled.
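The H statistic is usually written as H = log2(n) - (sum of n_i * log2(n_i)) / n, where n is the total number of traits and n_i is the number of traits sharing the same pattern of membership across self-aspects. The sketch below follows that usual form; the input format and names are assumptions for illustration, not the study's materials.

```python
import math
from collections import Counter


def attneave_h(traits, self_aspects):
    """H statistic from a trait sort.

    traits: list of all trait names offered to the participant.
    self_aspects: dict mapping each self-aspect name to the set of
        traits the participant placed in it.
    """
    # Group traits by the exact set of self-aspects they appear in.
    patterns = Counter(
        frozenset(name for name, members in self_aspects.items() if trait in members)
        for trait in traits
    )
    n = len(traits)
    weighted = sum(count * math.log2(count) for count in patterns.values())
    return math.log2(n) - weighted / n
```

H grows when traits are spread over many distinct membership patterns and falls to zero when every trait is used in the same way.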
This study has investigated serial (temporal) clustering of extra-tropical cyclones simulated by 17 climate models that participated in CMIP5. Clustering was estimated by calculating the dispersion (ratio of variance to mean) of 30 December-February counts of Atlantic storm tracks passing near each grid point. Results from single historical simulations of 1975-2005 were compared to those from historical ERA40 reanalyses for 1958-2001 and single future model projections of 2069-2099 under the RCP4.5 climate change scenario. Models were generally able to capture the broad features in reanalyses reported previously: underdispersion/regularity (i.e. variance less than mean) in the western core of the Atlantic storm track surrounded by overdispersion/clustering (i.e. variance greater than mean) to the north and south and over western Europe. Regression of counts onto North Atlantic Oscillation (NAO) indices revealed that much of the overdispersion in the historical reanalyses and model simulations can be accounted for by NAO variability. Future changes in dispersion were generally found to be small and not consistent across models. The overdispersion statistic, for any 30 year sample, is prone to large amounts of sampling uncertainty that obscures the climate change signal. For example, the projected increase in dispersion for storm counts near London in the CNRMCM5 model is 0.1 compared to a standard deviation of 0.25. Projected changes in the mean and variance of NAO are insufficient to create changes in overdispersion that are discernible above natural sampling variations.
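The dispersion measure described in this abstract, the ratio of variance to mean of seasonal counts, can be sketched as follows. The use of the sample variance is an assumption, and the function name is illustrative.

```python
import statistics


def dispersion_ratio(counts):
    """Variance-to-mean ratio of a sequence of seasonal counts."""
    mean = statistics.mean(counts)
    if mean == 0:
        raise ValueError("dispersion is undefined when the mean count is zero")
    # Sample variance (n - 1 denominator) divided by the mean.
    return statistics.variance(counts) / mean
```

A ratio above 1 corresponds to overdispersion (clustering) and a ratio below 1 to underdispersion (regularity), matching the definitions given in the abstract.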
Cobalt is one of the main components of cast metal alloys widely used in dentistry, making up 45 to 70% of numerous prosthetic works. There is evidence that metal elements cause systemic and local toxicity. The purpose of the present study was to evaluate the effects of cobalt on the junctional epithelium and reduced enamel epithelium of the upper first molar in rats during lactation. To this end, 1-day-old rats were used whose mothers received 300 mg of cobalt chloride per liter of distilled water in the drinker during lactation. After 21 days, the rat pups were killed with an anesthetic overdose. The heads were separated, fixed in "alfac", decalcified and embedded in paraffin. Frontal sections stained with hematoxylin and eosin were used. Karyometric methods were used to estimate the following parameters: largest, smallest and mean diameters, D/d ratio, perimeter, area, volume, volume/area ratio, eccentricity, form coefficient and contour index. Stereologic methods were used to evaluate the cytoplasm/nucleus ratio, cell and cytoplasm volume, cell number density, external surface/basal membrane ratio, thickness of the epithelial layers and surface density. All the collected data were subjected to statistical analysis by the non-parametric Wilcoxon-Mann-Whitney test. Karyometry showed smaller values in the nuclei of the studied tissues for diameters, perimeter, area, volume and volume/area ratio. Stereologically, the junctional epithelium and the reduced enamel epithelium showed smaller cells with scarce cytoplasm, reflected in the greater number of cells per mm³ of tissue. In this study, cobalt caused epithelial atrophy, indicating a direct action on the junctional and enamel epithelium.
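The Wilcoxon-Mann-Whitney test used in this study is built on a rank-sum statistic, which can be sketched as below. The sketch computes only the U statistics, with average ranks for ties; it does not compute a p-value, and the function names are illustrative.

```python
def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples."""
    ranks = midranks(list(sample_a) + list(sample_b))
    n_a, n_b = len(sample_a), len(sample_b)
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    return u_a, n_a * n_b - u_a
```

The two U values always sum to the product of the sample sizes, and a value near zero for one group means nearly all of its measurements lie below those of the other group.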
The information concerning the molecular events taking place in onlay bone grafts is still incipient. The objective of the present study is to correlate the effects of perforation of the resident bone bed with (1) the timing of onlay autogenous graft revascularization; (2) the maintenance of volume/density of the graft (assessed through tomography); and (3) the occurrence of bone remodeling proteins (using immunohistochemistry) delivered in the graft. Thirty-six New Zealand White rabbits were subjected to iliac crest onlay bone grafting on both sides of the mandible. The bone bed was drill-perforated on one side with the aim of accelerating revascularization, whereas on the other side it was kept intact. After graft fixation and flap suturing, all animals underwent tomography of both mandible sites. Six animals were sacrificed at each of 3, 5, 7, 10, 20 and 60 days after surgery. A second tomography was taken just before sacrifice. Histological slides were prepared from each grafted site for both immunohistochemistry analysis [osteopontin, osteocalcin, type I collagen and vascular endothelial growth factor (VEGF) antibodies] and histometric analysis. The bone volume values measured on tomography showed no statistically significant difference (P >= 0.05) between perforated and intact sites. Grafts placed on perforated beds showed higher bone density values compared with non-perforated ones at 3 days (P <= 0.05). This correlation was inverted at 60 days postoperatively. The findings from VEGF labeling revealed a tendency for earlier revascularization in the perforated group. The early revascularization of bone grafts accelerated the bone remodeling process (osteocalcin, type I collagen and osteopontin), which led to increased bone deposition at 10 days. The extended osteoblast differentiation process at intermediate stages in the perforated group contributed to a denser bone at 60 days.