924 resultados para Multivariate statistics
Resumo:
Compositional data, also called multiplicative ipsative data, are common in survey research instruments in areas such as time use, budget expenditure and social networks. Compositional data are usually expressed as proportions of a total, whose sum can only be 1. Owing to their constrained nature, statistical analysis in general, and estimation of measurement quality with a confirmatory factor analysis model for multitrait-multimethod (MTMM) designs in particular are challenging tasks. Compositional data are highly non-normal, as they range within the 0-1 interval. One component can only increase if some other(s) decrease, which results in spurious negative correlations among components which cannot be accounted for by the MTMM model parameters. In this article we show how researchers can use the correlated uniqueness model for MTMM designs in order to evaluate measurement quality of compositional indicators. We suggest using the additive log ratio transformation of the data, discuss several approaches to deal with zero components and explain how the interpretation of MTMM designs di ers from the application to standard unconstrained data. We show an illustration of the method on data of social network composition expressed in percentages of partner, family, friends and other members in which we conclude that the faceto-face collection mode is generally superior to the telephone mode, although primacy e ects are higher in the face-to-face mode. Compositions of strong ties (such as partner) are measured with higher quality than those of weaker ties (such as other network members)
Resumo:
Given an observed test statistic and its degrees of freedom, one may compute the observed P value with most statistical packages. It is unknown to what extent test statistics and P values are congruent in published medical papers. Methods: We checked the congruence of statistical results reported in all the papers of volumes 409–412 of Nature (2001) and a random sample of 63 results from volumes 322–323 of BMJ (2001). We also tested whether the frequencies of the last digit of a sample of 610 test statistics deviated from a uniform distribution (i.e., equally probable digits).Results: 11.6% (21 of 181) and 11.1% (7 of 63) of the statistical results published in Nature and BMJ respectively during 2001 were incongruent, probably mostly due to rounding, transcription, or type-setting errors. At least one such error appeared in 38% and 25% of the papers of Nature and BMJ, respectively. In 12% of the cases, the significance level might change one or more orders of magnitude. The frequencies of the last digit of statistics deviated from the uniform distribution and suggested digit preference in rounding and reporting.Conclusions: this incongruence of test statistics and P values is another example that statistical practice is generally poor, even in the most renowned scientific journals, and that quality of papers should be more controlled and valued
Resumo:
ABSRACT This thesis focuses on the monitoring, fault detection and diagnosis of Wastewater Treatment Plants (WWTP), which are important fields of research for a wide range of engineering disciplines. The main objective is to evaluate and apply a novel artificial intelligent methodology based on situation assessment for monitoring and diagnosis of Sequencing Batch Reactor (SBR) operation. To this end, Multivariate Statistical Process Control (MSPC) in combination with Case-Based Reasoning (CBR) methodology was developed, which was evaluated on three different SBR (pilot and lab-scales) plants and validated on BSM1 plant layout.
Resumo:
This paper provides for the first time an objective short-term (8 yr) climatology of African convective weather systems based on satellite imagery. Eight years of infrared International Satellite Cloud Climatology Project-European Space Agency's Meteorological Satellite (ISCCP-Meteosat) satellite imagery has been analyzed using objective feature identification, tracking, and statistical techniques for the July, August, and September periods and the region of Africa and the adjacent Atlantic ocean. This allows various diagnostics to be computed and used to study the distribution of mesoscale and synoptic-scale convective weather systems from mesoscale cloud clusters and squall lines to tropical cyclones. An 8-yr seasonal climatology (1983-90) and the seasonal cycle of this convective activity are presented and discussed. Also discussed is the dependence of organized convection for this region, on the orography, convective, and potential instability and vertical wind shear using European Centre for Medium-Range Weather Forecasts reanalysis data.
Resumo:
Turbulence statistics obtained by direct numerical simulations are analysed to investigate spatial heterogeneity within regular arrays of building-like cubical obstacles. Two different array layouts are studied, staggered and square, both at a packing density of λp=0.25 . The flow statistics analysed are mean streamwise velocity ( u− ), shear stress ( u′w′−−−− ), turbulent kinetic energy (k) and dispersive stress fraction ( u˜w˜ ). The spatial flow patterns and spatial distribution of these statistics in the two arrays are found to be very different. Local regions of high spatial variability are identified. The overall spatial variances of the statistics are shown to be generally very significant in comparison with their spatial averages within the arrays. Above the arrays the spatial variances as well as dispersive stresses decay rapidly to zero. The heterogeneity is explored further by separately considering six different flow regimes identified within the arrays, described here as: channelling region, constricted region, intersection region, building wake region, canyon region and front-recirculation region. It is found that the flow in the first three regions is relatively homogeneous, but that spatial variances in the latter three regions are large, especially in the building wake and canyon regions. The implication is that, in general, the flow immediately behind (and, to a lesser extent, in front of) a building is much more heterogeneous than elsewhere, even in the relatively dense arrays considered here. Most of the dispersive stress is concentrated in these regions. Considering the experimental difficulties of obtaining enough point measurements to form a representative spatial average, the error incurred by degrading the sampling resolution is investigated. It is found that a good estimate for both area and line averages can be obtained using a relatively small number of strategically located sampling points.
Resumo:
A stochastic parameterization scheme for deep convection is described, suitable for use in both climate and NWP models. Theoretical arguments and the results of cloud-resolving models, are discussed in order to motivate the form of the scheme. In the deterministic limit, it tends to a spectrum of entraining/detraining plumes and is similar to other current parameterizations. The stochastic variability describes the local fluctuations about a large-scale equilibrium state. Plumes are drawn at random from a probability distribution function (pdf) that defines the chance of finding a plume of given cloud-base mass flux within each model grid box. The normalization of the pdf is given by the ensemble-mean mass flux, and this is computed with a CAPE closure method. The characteristics of each plume produced are determined using an adaptation of the plume model from the Kain-Fritsch parameterization. Initial tests in the single column version of the Unified Model verify that the scheme is effective in producing the desired distributions of convective variability without adversely affecting the mean state.
Resumo:
Resumo:
Investigation of preferred structures of planetary wave dynamics is addressed using multivariate Gaussian mixture models. The number of components in the mixture is obtained using order statistics of the mixing proportions, hence avoiding previous difficulties related to sample sizes and independence issues. The method is first applied to a few low-order stochastic dynamical systems and data from a general circulation model. The method is next applied to winter daily 500-hPa heights from 1949 to 2003 over the Northern Hemisphere. A spatial clustering algorithm is first applied to the leading two principal components (PCs) and shows significant clustering. The clustering is particularly robust for the first half of the record and less for the second half. The mixture model is then used to identify the clusters. Two highly significant extratropical planetary-scale preferred structures are obtained within the first two to four EOF state space. The first pattern shows a Pacific-North American (PNA) pattern and a negative North Atlantic Oscillation (NAO), and the second pattern is nearly opposite to the first one. It is also observed that some subspaces show multivariate Gaussianity, compatible with linearity, whereas others show multivariate non-Gaussianity. The same analysis is also applied to two subperiods, before and after 1978, and shows a similar regime behavior, with a slight stronger support for the first subperiod. In addition a significant regime shift is also observed between the two periods as well as a change in the shape of the distribution. The patterns associated with the regime shifts reflect essentially a PNA pattern and an NAO pattern consistent with the observed global warming effect on climate and the observed shift in sea surface temperature around the mid-1970s.