71 results for Sampling (Statistics)
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
We study the relationship between stable sampling sequences for bandlimited functions in $L^p(\R^n)$ and the Fourier multipliers in $L^p$. In the case that the sequence is a lattice and the spectrum is a fundamental domain for the lattice, the connection is complete. In the case of irregular sequences there is still a partial relationship.
Abstract:
Several methods have been suggested to estimate non-linear models with interaction terms in the presence of measurement error. Structural equation models eliminate measurement error bias, but require large samples. Ordinary least squares regression on summated scales, regression on factor scores and partial least squares are appropriate for small samples but do not correct measurement error bias. Two-stage least squares regression does correct measurement error bias, but the results depend strongly on the choice of instrumental variable. This article discusses the old disattenuated regression method as an alternative for correcting measurement error in small samples. The method is extended to the case of interaction terms and is illustrated on a model that examines the interaction effect of innovation and style of use of budgets on business performance. Alternative reliability estimates that can be used to disattenuate the estimates are discussed. A comparison is made with the alternative methods. Methods that do not correct for measurement error bias perform very similarly to one another and considerably worse than disattenuated regression.
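As a rough illustration of the idea behind disattenuation, the classical correction for attenuation in the single-predictor case can be sketched as follows. The function names and the reliability values (0.8 and 0.9) are assumptions chosen for the example. The extension to interaction terms described in the abstract is not reproduced here.

```python
import math


def disattenuated_correlation(r_xy, rel_x, rel_y):
    """Classical correction for attenuation: r / sqrt(rel_x * rel_y)."""
    return r_xy / math.sqrt(rel_x * rel_y)


def disattenuated_slope(slope, rel_x):
    """Under classical measurement error in the predictor, the observed
    OLS slope is the true slope times the predictor's reliability."""
    return slope / rel_x


# Hypothetical values: observed correlation 0.6, observed slope 0.5,
# reliabilities 0.8 (predictor) and 0.9 (outcome).
corrected_r = disattenuated_correlation(0.6, rel_x=0.8, rel_y=0.9)
corrected_b = disattenuated_slope(0.5, rel_x=0.8)
print(round(corrected_r, 4), round(corrected_b, 4))
```

The lower the assumed reliability, the larger the correction, which is why the choice of reliability estimate matters in small samples.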
Abstract:
One of the main questions to solve when analysing geographically aggregated information is the design of territorial units adjusted to the objectives of the study. This is related to reducing the effects of the Modifiable Areal Unit Problem (MAUP). In this paper an optimisation model to solve regionalisation problems is proposed. This model seeks to reduce the disadvantages found in previous work on automated regionalisation tools.
Abstract:
Distance-based regression is a prediction method that consists of two steps: from the distances between observations we obtain latent variables, which then become the regressors in an ordinary least squares linear model. The distances are computed from the original predictors using a suitable dissimilarity function. Since, in general, the regressors are related non-linearly to the response, selecting them with the usual F test is not possible. In this work we propose a solution to this predictor selection problem by defining generalized test statistics and adapting a nonparametric bootstrap method to estimate the p-values. We include a numerical example with automobile insurance data.
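The two steps (distances to latent variables, then ordinary least squares) can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' procedure: only the first latent dimension is kept, it is obtained by classical multidimensional scaling via power iteration, the dissimilarity is the absolute difference, and all function names are hypothetical. The bootstrap-based predictor selection is not shown.

```python
import math


def principal_coordinate(dist, iterations=200):
    """First principal coordinate (classical MDS) of a distance matrix."""
    n = len(dist)
    sq = [[-0.5 * d * d for d in row] for row in dist]
    row_mean = [sum(r) / n for r in sq]
    grand = sum(row_mean) / n
    # Double centring turns squared distances into an inner-product matrix.
    b = [[sq[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
         for i in range(n)]
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):  # power iteration for the leading eigenvector
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eig = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return [math.sqrt(max(eig, 0.0)) * x for x in v]


def distance_based_fit(x, y, dissimilarity):
    """Step 1: distances -> latent variable. Step 2: OLS of y on it."""
    dist = [[dissimilarity(a, b) for b in x] for a in x]
    latent = principal_coordinate(dist)  # has mean zero by construction
    y_mean = sum(y) / len(y)
    slope = (sum(l * (v - y_mean) for l, v in zip(latent, y))
             / sum(l * l for l in latent))
    return [y_mean + slope * l for l in latent]


# Toy data (hypothetical): with an absolute-difference dissimilarity the
# latent variable is the centred predictor, so the fit matches linear OLS.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]
fitted = distance_based_fit(x, y, lambda a, b: abs(a - b))
print([round(f, 6) for f in fitted])
```

With other dissimilarity functions the latent variables no longer coincide with the original predictor, which is the situation in which the usual F test stops being applicable.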
Abstract:
The present study discusses retention criteria for principal components analysis (PCA) applied to Likert scale items typical of psychological questionnaires. The main aim is to recommend that applied researchers refrain from relying only on the eigenvalue-greater-than-one criterion; alternative procedures are suggested for adjusting for sampling error. An additional objective is to add evidence on the consequences of applying this rule when PCA is used with discrete variables. The experimental conditions were studied by means of Monte Carlo sampling, including several sample sizes, different numbers of variables and answer alternatives, and four non-normal distributions. The results suggest that even when all the items, and thus the underlying dimensions, are independent, eigenvalues greater than one are frequent and can explain up to 80% of the variance in the data, meeting the empirical criterion. The consequences of using Kaiser's rule are illustrated with a clinical psychology example. The size of the eigenvalues turned out to be a function of the sample size and the number of variables, which is also the case for parallel analysis, as previous research shows. To encourage the application of alternative criteria, an R package was developed for deciding the number of principal components to retain by means of confidence intervals constructed about the eigenvalues corresponding to a lack of relationship between discrete variables.
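The central observation, that independent discrete items still produce eigenvalues above one, can be checked with a small Monte Carlo sketch. The settings below (100 respondents, 10 items, 5 uniformly distributed answer options, 100 replications) and all function names are assumptions for illustration. They are not the study's conditions and this is not the procedure of its R package.

```python
import math
import random


def correlation_matrix(data):
    """Pearson correlation matrix of the columns of `data` (list of rows)."""
    n, p = len(data), len(data[0])
    cols = [[row[j] for row in data] for j in range(p)]
    centred = [[v - sum(c) / n for v in c] for c in cols]
    norms = [math.sqrt(sum(v * v for v in c)) for c in centred]
    return [[sum(a * b for a, b in zip(centred[i], centred[j]))
             / (norms[i] * norms[j]) for j in range(p)] for i in range(p)]


def largest_eigenvalue(matrix, iterations=200):
    """Power iteration on a symmetric positive semi-definite matrix."""
    p = len(matrix)
    v = [float(i + 1) for i in range(p)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit-length vector v.
    return sum(v[i] * sum(matrix[i][j] * v[j] for j in range(p))
               for i in range(p))


def simulate_first_eigenvalues(n_subjects=100, n_items=10, n_options=5,
                               replications=100, seed=1):
    """First eigenvalue of the correlation matrix of independent items."""
    rng = random.Random(seed)
    values = []
    for _ in range(replications):
        data = [[rng.randint(1, n_options) for _ in range(n_items)]
                for _ in range(n_subjects)]
        values.append(largest_eigenvalue(correlation_matrix(data)))
    return sorted(values)


eigs = simulate_first_eigenvalues()
share_above_one = sum(e > 1.0 for e in eigs) / len(eigs)
upper_95 = eigs[int(0.95 * len(eigs)) - 1]  # empirical 95th percentile
print(share_above_one, round(upper_95, 3))
```

An observed eigenvalue would then be compared with an upper bound such as `upper_95` rather than with the fixed value of one, in the spirit of parallel analysis.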
Abstract:
This research examined the dynamics of flow in work and non-work contexts in order to identify differences and similarities in this motivational experience. Sixty employees from a variety of occupations completed a flow diary six times a day for twenty-one consecutive days (6982 records). The data were analysed at the between-subject and within-subject levels, and linear models (i.e. linear regression) and non-linear models (i.e. a catastrophe model) were used to assess how well the challenge-skills balance predicts flow. Work and non-work contexts showed two fundamental differences: greater fluctuations in flow in the latter (larger standard deviations in the variables skills, enjoyment, interest and absorption) and a different meaning of challenge. In addition, the predictive capacity of the non-linear model was clearly greater than that of its linear counterpart (42% versus 19% for non-work; 44% versus 33% for work). Flow, in both work and non-work contexts, shows non-linear dynamics that combine gradual and abrupt changes. Research and intervention concerned with this process should focus on the challenge variable, which has proved key to understanding these complex dynamics in flow.
Abstract:
This study examined the independent effect of skewness and kurtosis on the robustness of the linear mixed model (LMM), with the Kenward-Roger (KR) procedure, when group distributions are different, sample sizes are small, and sphericity cannot be assumed. Methods: A Monte Carlo simulation study considering a split-plot design involving three groups and four repeated measures was performed. Results: The results showed that when group distributions are different, the effect of skewness on KR robustness is greater than that of kurtosis for the corresponding values. Furthermore, the pairings of skewness and kurtosis with group size were found to be relevant variables when applying this procedure. Conclusions: With total sample sizes of 45 and 60, KR is a suitable option for analyzing data when the distributions are: (a) mesokurtic and not highly or extremely skewed, and (b) symmetric with different degrees of kurtosis. With total sample sizes of 30, it is adequate when group sizes are equal and the distributions are: (a) mesokurtic and slightly or moderately skewed, and sphericity is assumed; and (b) symmetric with a moderate or high/extreme violation of kurtosis. Alternative analyses should be considered when the distributions are highly or extremely skewed and sample sizes are small.
Abstract:
The most suitable method for the estimation of size diversity is investigated. Size diversity is computed on the basis of the Shannon diversity expression adapted for continuous variables, such as size. It takes the form of an integral involving the probability density function (pdf) of the size of the individuals. Different approaches for the estimation of the pdf are compared: parametric methods, which assume that the data come from a given family of pdfs, and nonparametric methods, where the pdf is estimated using some kind of local evaluation. Exponential, generalized Pareto, normal, and log-normal distributions have been used to generate simulated samples using parameters estimated from real samples. Nonparametric methods include discrete computation of data histograms based on size intervals and continuous kernel estimation of the pdf. The kernel approach gives an accurate estimation of size diversity, whilst parametric methods are only useful when the reference distribution has a shape similar to the real one. Special attention is given to data standardization. The division of data by the sample geometric mean is proposed as the most suitable standardization method, which shows additional advantages: the same size diversity value is obtained when using original size or log-transformed data, and size measurements with different dimensionality (lengths, areas, volumes or biomasses) may be immediately compared with the simple addition of ln k, where k is the dimensionality (1, 2, or 3, respectively). Thus, kernel estimation, after data standardization by division by the sample geometric mean, emerges as the most reliable and generalizable method of size diversity evaluation.
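A minimal sketch of the kernel-based estimate described above: sizes are divided by their geometric mean, a Gaussian kernel density is evaluated on a grid, and the continuous Shannon expression -∫ p(x) ln p(x) dx is integrated numerically. The Gaussian kernel, the rule-of-thumb bandwidth, the grid limits and the simulated log-normal sample are all assumptions made for this example, not choices taken from the paper.

```python
import math
import random


def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))


def size_diversity(sizes, grid_points=400):
    """Kernel estimate of -∫ p(x) ln p(x) dx for sizes standardized
    by division by their geometric mean."""
    gm = geometric_mean(sizes)
    x = [s / gm for s in sizes]
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    h = 1.06 * sd * n ** (-0.2)  # rule-of-thumb bandwidth (assumption)
    lo, hi = min(x) - 4 * h, max(x) + 4 * h
    step = (hi - lo) / (grid_points - 1)
    const = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    total = 0.0
    for k in range(grid_points):
        g = lo + k * step
        p = const * sum(math.exp(-0.5 * ((g - v) / h) ** 2) for v in x)
        f = -p * math.log(p) if p > 0 else 0.0
        weight = 0.5 if k in (0, grid_points - 1) else 1.0  # trapezoid rule
        total += weight * f * step
    return total


rng = random.Random(0)
sizes_mm = [rng.lognormvariate(0.0, 0.5) for _ in range(200)]  # simulated
sizes_um = [1000.0 * s for s in sizes_mm]  # same sizes in other units
d_mm = size_diversity(sizes_mm)
d_um = size_diversity(sizes_um)
print(round(d_mm, 4), round(d_um, 4))
```

Because the standardized values do not depend on the measurement unit, both calls return the same diversity up to rounding error.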
Abstract:
Nonparametric tests comprise a set of statistical tests whose common denominator is the absence of assumptions about the probability law followed by the population from which the sample was drawn. For this reason they are commonly referred to as distribution-free tests. The article describes and works through the nonparametric tests, highlighting their rationale and the indications for their use in the case of a single sample (chi-square), two samples with independent data (Mann-Whitney U), two samples with related data (Wilcoxon T), several samples with independent data (Kruskal-Wallis H) and several samples with related data (Friedman).
Abstract:
Parametric tests are a type of statistical significance test that quantifies the association or independence between a quantitative variable and a categorical one. Parametric tests require certain conditions to be met before they can be applied: a Normal distribution of the quantitative variable in the groups being compared, homogeneity of variances in the populations from which the groups are drawn, and a sample size n of no fewer than 30. If these conditions are not met, nonparametric statistical tests must be used instead. Parametric tests fall into two classes: the t test (for one sample, or for two related or independent samples) and the ANOVA test (for more than two independent samples).
Abstract:
This paper presents an initial attempt to tackle the ever so "tricky" points encountered when dealing with energy accounting, and then illustrates how such a system of accounting can be used when assessing metabolic changes in societies. The paper is divided into four main sections. The first three present a general discussion of the main issues encountered when conducting energy analyses. The last section then combines this heuristic approach with its formalization, in quantitative terms, for the analysis of possible energy scenarios. Section one covers the broader issue of how to account for the relevant categories used when accounting for Joules of energy, emphasizing the clear distinction between Primary Energy Sources (PES), the physical entities that are exploited to derive usable energy forms (energy carriers), and Energy Carriers (EC), the actual useful energy that is transmitted for the appropriate end uses within a society. Section two sheds light on the concept of Energy Return on Investment (EROI). Here it is emphasized that a certain amount of energy carriers must already be available in order to extract or exploit Primary Energy Sources and thereby generate a net supply of energy carriers. It is pointed out that the current trend of intense energy supply has only been possible thanks to the heavy use of, and dependence on, fossil energy. Section three follows up on the discussion of EROI, indicating that a single numeric indicator such as an output/input ratio is not sufficient for assessing the performance of energetic systems. Rather, it underlines an integrated approach that incorporates (i) how big the net supply of Joules of EC can be, given an amount of extracted PES (the external constraints); (ii) how much EC needs to be invested to extract an amount of PES; and (iii) the power level that it takes for both processes to succeed.
Section four, ultimately, puts the theoretical concepts into play, assessing how the metabolic performance of societies can be accounted for within this analytical framework.
Abstract:
"See the abstract at the beginning of the document in the attached file."
Abstract:
The literature related to skew–normal distributions has grown rapidly in recent years, but at the moment few applications concern the description of natural phenomena with this type of probability model, or the interpretation of their parameters. The skew–normal family of distributions represents an extension of the normal family to which a parameter (λ) has been added to regulate the skewness. The development of this theoretical field has followed the general tendency in Statistics towards more flexible methods to represent features of the data as adequately as possible, and to reduce unrealistic assumptions such as the normality that underlies most methods of univariate and multivariate analysis. In this paper, the shape of the frequency distribution of the logratio ln(Cl−/Na+), whose components are related to water composition for 26 wells, has been investigated. Samples have been collected around the active center of Vulcano island (Aeolian archipelago, southern Italy) from 1977 up to now at time intervals of about six months. Data of the logratio have been tentatively modeled by evaluating the performance of the skew–normal model for each well. Values of the λ parameter have been compared by considering the temperature and spatial position of the sampling points. Preliminary results indicate that changes in λ values can be related to the nature of the environmental processes affecting the data.
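For reference, the skew–normal density with skewness parameter λ is 2 φ(z) Φ(λz), where φ and Φ are the standard normal density and distribution function. It reduces to the normal density when λ = 0. The sketch below evaluates it with the Python standard library. The `loc` and `scale` arguments and the function names are assumptions added for convenience and do not come from the abstract.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def skew_normal_pdf(x, lam, loc=0.0, scale=1.0):
    """Density (2 / scale) * phi(z) * Phi(lam * z), with z = (x - loc) / scale."""
    z = (x - loc) / scale
    return 2.0 / scale * normal_pdf(z) * normal_cdf(lam * z)


# lam = 0 gives the normal density; lam > 0 shifts mass to the right.
for lam in (0.0, 3.0):
    print(lam, [round(skew_normal_pdf(x, lam), 4) for x in (-1.0, 0.0, 1.0)])
```

Fitting λ for each well, as the study does, would additionally require a likelihood-based estimation step that is not shown here.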