70 results for Multivariate statistical methods
Resumo:
The Baix Empordà-Selva-Gavarres aquifer system is related to the fault set that created the tectonic basins of the Empordà and Selva areas (NE Spain) during the Neogene. In this work, we describe the hydrogeological, hydrochemical and isotopic (3H, δD, δ18O, and the 87Sr/86Sr ratio) characteristics of groundwater in this system in order to illustrate the relevance of fault zones to groundwater flow paths and recharge. We identify two flow systems with distinct hydrochemistry and isotopic signatures. A local flow system originates at the Gavarres Range and flows towards the Baix Empordà and Selva basins, with an approximate residence time of 20 years. A regional flow system has been identified only in the Selva basin. It is related to the main fault zones, which act as preferential flow paths. Its recharge is located in higher mountain ranges, namely the Transversal and Guilleries Ranges, with residence times longer than 50 years. Isotopic data have also revealed mixing processes between both flow systems and rainfall recharge, while principal component analysis has identified the main processes that control the hydrochemistry of each flow system.
Resumo:
A new statistical parallax method using the Maximum Likelihood principle is presented, allowing the simultaneous determination of a luminosity calibration, kinematic characteristics and spatial distribution of a given sample. This method has been developed for the exploitation of the Hipparcos data and presents several improvements over previous methods: the effects of sample selection, observational errors, galactic rotation and interstellar absorption are taken into account as an intrinsic part of the formulation (as opposed to external corrections). Furthermore, the method is able to identify and characterize physically distinct groups in inhomogeneous samples, thus avoiding biases due to unidentified components. Moreover, the implementation used by the authors relies extensively on numerical methods, which avoids the need to simplify the equations and thus the bias such simplifications could introduce. Several examples of application using simulated samples are presented, to be followed by applications to real samples in forthcoming articles.
Resumo:
Every year, flash floods cause economic losses and seriously disrupt daily activity in Catalonia (NE Spain). Sometimes catastrophic damage and casualties occur. When a long-term analysis of floods is undertaken, a question arises regarding the changing roles of vulnerability and hazard in the evolution of risk. This paper provides some information to address this question, based on an analysis of all the floods that have occurred in Barcelona county (Catalonia) since the 14th century, together with the flooded area, urban evolution, impacts and weather conditions for the most severe events. With this objective, historical floods have been identified and classified, and the flash floods among them have been characterised. In addition, the main meteorological factors associated with recent flash floods in this city and neighbouring regions are well known. Rainfall trends that could explain the historical evolution of flood hazard occurrence in this city have also been analysed. Finally, the influence of urban development on vulnerability to floods has been identified. Barcelona city was selected for its long continuous data series (a daily rainfall series since 1854 and, since 1921, one of the longest rainfall rate series in Europe) and for the accurate historical archive information available (since the Roman Empire for urban evolution). The evolution of flood occurrence shows oscillations in the earlier and later modern-age periods that can be attributed to climatic variability, evolution of the perception threshold and changes in vulnerability. A large increase in vulnerability can be assumed for the period 1850–1900. 
The analysis of the time evolution of the Barcelona rainfall series (1854–2000) shows that no trend exists, although the impact of flash floods has changed over this time because of changes in urban planning. The number of catastrophic flash floods has diminished, although the number of extraordinary ones has increased.
Resumo:
Objective: Health status measures usually have an asymmetric distribution and a high percentage of respondents with the best possible score (ceiling effect), especially when they are assessed in the general population. Different methods that take the ceiling effect into account have been proposed to model this type of variable: tobit models, Censored Least Absolute Deviations (CLAD) models and two-part models, among others. The objective of this work was to describe the tobit model and compare it with the Ordinary Least Squares (OLS) model, which ignores the ceiling effect. Methods: Two different data sets were used to compare both models: a) real data from the European Study of Mental Disorders (ESEMeD), used to model the EQ5D index, one of the utility measures most commonly used for the evaluation of health status; and b) data obtained from simulation. Cross-validation was used to compare the predicted values of the tobit and OLS models. The following estimators were compared: the percentage of absolute error (R1), the percentage of squared error (R2), the Mean Squared Error (MSE) and the Mean Absolute Prediction Error (MAPE). Different datasets were created for different values of the error variance and different percentages of individuals with ceiling effect. The estimates of the coefficients, the percentage of explained variance and the plots of residuals versus predicted values obtained under each model were compared. Results: With regard to the results of the ESEMeD study, the predicted values obtained with the OLS model and those obtained with the tobit model were very similar. The regression coefficients of the linear model were consistently smaller than those from the tobit model. 
In the simulation study, we observed that when the error variance was small (s=1), the tobit model produced unbiased estimates of the coefficients and accurate predicted values, especially when the percentage of individuals with the highest possible score was small. However, when the error variance was greater (s=10 or s=20), the percentage of explained variance for the tobit model and the predicted values were more similar to those obtained with an OLS model. Conclusions: The proportion of variability accounted for by the models and the percentage of individuals with the highest possible score have an important effect on the performance of the tobit model in comparison with the linear model.
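The difference between the two models compared above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: it assumes a single covariate, normally distributed errors and a known ceiling, and the names `tobit_loglik` and `ols_slope` are hypothetical. Observations at the ceiling contribute the probability that the latent score reaches the ceiling, whereas OLS treats them as exact values.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(beta0, beta1, sigma, xs, ys, ceiling):
    """Log-likelihood of a tobit model censored from above at `ceiling`.

    Assumed latent model: y* = beta0 + beta1 * x + e, e ~ N(0, sigma^2),
    with observed y = min(y*, ceiling).
    """
    ll = 0.0
    for x, y in zip(xs, ys):
        mu = beta0 + beta1 * x
        if y >= ceiling:
            # Censored case: P(y* >= ceiling) = Phi((mu - ceiling) / sigma)
            ll += math.log(norm_cdf((mu - ceiling) / sigma))
        else:
            # Uncensored case: ordinary normal density of the residual
            ll += math.log(norm_pdf((y - mu) / sigma) / sigma)
    return ll


def ols_slope(xs, ys):
    """Least-squares slope of y on x, ignoring any censoring."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Toy data (hypothetical): latent score equals x, observed score is capped at 6.
xs = [float(x) for x in range(11)]
ys = [min(x, 6.0) for x in xs]
slope = ols_slope(xs, ys)  # smaller than the latent slope of 1
```

In this toy example the OLS slope is attenuated because the capped observations flatten the fitted line, which is in line with the smaller linear-model coefficients reported in the abstract. In practice the tobit log-likelihood would be maximised numerically over the three parameters.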
Resumo:
Medical textbooks generally concentrate almost exclusively on methodology for analysing fixed measures, that is, measures taken at a single moment. Nevertheless, the evolution of a measurement over time and the correct interpretation of missing values are very important and can sometimes provide the key information for the results obtained. Time series analysis and spectral analysis, that is, the analysis of time series in the frequency domain, can therefore be regarded as appropriate tools for this kind of study. In this work the frequency of the pulsatile secretion of luteinizing hormone (LH), which regulates the fertile life of women, was analysed in order to determine whether significant frequencies could be identified by Fourier analysis. Detecting the frequencies at which the pulsatile secretion of LH takes place is quite difficult because of random errors in measurement and sampling: pulsatile secretions of small amplitude go undetected and are disregarded. In physiology it is accepted that cyclical patterns exist in the secretion of LH; the results of this research confirm this pattern and determine its frequency, which is presented in the periodogram corresponding to each cycle studied. The results can be used as a key pattern for choosing future sampling frequencies in order to "catch" the significant peaks of luteinizing hormone and to inform the timing of fertility treatment for women.
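As a rough illustration of how a periodogram reveals a dominant frequency, the following Python sketch computes a raw periodogram with a direct discrete Fourier transform. It is a minimal sketch under stated assumptions: the series is evenly sampled with no missing values, the function name `periodogram` is hypothetical, and the input is a synthetic sine wave rather than LH data.

```python
import cmath
import math


def periodogram(series):
    """Return (frequency, power) pairs for an evenly sampled series.

    Frequencies are in cycles per sampling interval, from 1/n up to 1/2.
    The mean is removed first so that the zero-frequency term is excluded.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [v - mean for v in series]
    result = []
    for k in range(1, n // 2 + 1):
        # Discrete Fourier coefficient at frequency k / n
        coef = sum(
            c * cmath.exp(-2j * math.pi * k * t / n)
            for t, c in enumerate(centered)
        )
        result.append((k / n, abs(coef) ** 2 / n))
    return result


# Synthetic example (not real hormone data): 4 full cycles over 32 samples.
n_samples = 32
signal = [math.sin(2.0 * math.pi * 4 * t / n_samples) for t in range(n_samples)]
peak_frequency, peak_power = max(periodogram(signal), key=lambda fp: fp[1])
```

The frequency with the largest power is the candidate pulse frequency; its reciprocal gives the period in sampling intervals, which is the kind of information that could guide the choice of a future sampling rate. Assessing whether a peak is statistically significant requires an additional test that this sketch does not include.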
Resumo:
In this article we address the use and importance of the statistical tools used mainly in medical studies in oncology and hematology, which are also applicable to many other fields, whether medical, experimental or industrial. The aim of this work is to present, clearly and precisely, the statistical methodology needed to analyse study data rigorously and concisely with respect to the working hypotheses posed by the researchers. The chosen measure of treatment response and the chosen type of study determine the statistical methods to be used in analysing the study data, as well as the sample size. With correct application of statistical analysis and adequate planning, it is possible to determine whether the relationship found between exposure to a treatment and an outcome is due to chance or, on the contrary, reflects a non-random relationship that could establish causality. We have studied the main types of design most commonly used in medical studies, such as clinical trials and observational studies (cohort, case-control, prevalence and ecological studies). We also present a section on how to calculate the sample size of a study, guidance on which statistical test should be used, measures of effect size such as the odds ratio (OR) and relative risk (RR), and survival analysis. Examples are given in most sections of the article, together with the most relevant bibliography.
Resumo:
Majolica pottery is one of the most characteristic types of tableware produced during the Medieval and Renaissance periods. Majolica technology was introduced to the Iberian Peninsula by Islamic artisans during Medieval times, and its production and popularity rapidly spread throughout Spain and eventually to other locations in Europe and the Americas. The prestige and importance of Spanish majolica was very high. Consequently, this ware was imported profusely to the Americas during the Spanish Colonial period. Nowadays, majolica pottery serves as an important horizon marker at Spanish colonial sites. A preliminary study of Spanish-produced majolica was conducted on a set of 246 samples from the 12 primary majolica production centers on the Iberian Peninsula. The samples were analyzed by neutron activation analysis (NAA), and the resulting data were interpreted using an array of multivariate statistical procedures. Our results show a clear discrimination between different production centers. In some cases, our data make it possible to distinguish among sherds coming from the same production location, suggesting that different workshops or groups of workshops were responsible for production of this pre-industrial pottery.
Resumo:
Nowadays it is difficult to discuss statistical procedures for the quantitative analysis of data without referring to computing applied to research. These computing resources are often based on software packages designed to help the researcher in the data analysis phase. At present, one of the most refined and complete packages is SPSS (Statistical Package for the Social Sciences). SPSS is a software package for carrying out the statistical analysis of data. It is a very powerful statistical application, of which various versions have been developed since its beginnings in the 1970s. The computer output presented in this fact sheet corresponds to version 11.0.1. Although its appearance has changed since its beginnings, the way it works remains very similar across versions. Before starting to use the SPSS applications, it is important to become familiar with some of the windows we will use most. On entering SPSS, the first thing we find is the data editor. This window essentially displays the data we enter. The data editor includes two options: the data view and the variable view. These options can be selected using the two tabs at the bottom. The data view contains the main menu and the data matrix. This matrix is structured with cases in rows and variables in columns.
Resumo:
The present study explores the statistical properties of a randomization test based on the random assignment of the intervention point in a two-phase (AB) single-case design. The focus is on randomization distributions constructed with the values of the test statistic for all possible random assignments and used to obtain p-values. The shape of those distributions is investigated for each specific data division defined by the moment at which the intervention is introduced. Another aim of the study was to test the detection of nonexistent effects (i.e., the production of false alarms) in autocorrelated data series, in which the assumption of exchangeability between observations may be untenable. In this way, it was possible to compare nominal and empirical Type I error rates in order to obtain evidence on the statistical validity of the randomization test for each individual data division. The results suggest that when either of the two phases has considerably fewer measurement times, Type I errors may be too probable and, hence, the decision-making process carried out by applied researchers may be jeopardized.
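The logic of a randomization test on the intervention point can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure evaluated in the study: the test statistic is assumed to be the difference between phase means (B minus A), the test is one-sided, a minimum of three measurements per phase is assumed, and the function name and data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def ab_randomization_test(data, actual_start, min_phase=3):
    """One-sided randomization test for an AB single-case design.

    `actual_start` is the index of the first observation in phase B.
    Every admissible intervention point (leaving at least `min_phase`
    observations in each phase) yields one value of the test statistic;
    the p-value is the proportion of those values that are at least as
    large as the observed one.
    """
    n = len(data)
    candidates = range(min_phase, n - min_phase + 1)
    if actual_start not in candidates:
        raise ValueError("actual_start is not an admissible intervention point")

    def statistic(start):
        # Difference between the phase B mean and the phase A mean
        return mean(data[start:]) - mean(data[:start])

    observed = statistic(actual_start)
    distribution = [statistic(start) for start in candidates]
    p_value = sum(1 for d in distribution if d >= observed) / len(distribution)
    return observed, p_value


# Hypothetical series: a level shift after the sixth measurement.
series = [1, 2, 1, 2, 1, 2, 8, 9, 8, 9, 8, 9]
observed, p_value = ab_randomization_test(series, actual_start=6)
```

With 12 measurements and at least three per phase there are only seven admissible intervention points, so the smallest attainable p-value is 1/7. This shows why the number of possible assignments limits what the test can detect, and the sketch does nothing to address the autocorrelation problem discussed in the abstract.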
Resumo:
Interdependence is the main feature of dyadic relationships and, in recent years, various statistical procedures have been proposed for quantifying and testing this social attribute in different dyadic designs. The purpose of this paper is to develop several functions for these kinds of statistical tests in an R package, known as nonindependence, for use by applied social researchers. A Graphical User Interface (GUI) is also developed to facilitate the use of the functions included in this package. Examples drawn from psychological research and simulated data are used to illustrate how the software works.
Resumo:
The present work focuses on the skew-symmetry index as a measure of social reciprocity. This index is based on the correspondence between the amount of behaviour that each individual addresses to its partners and what it receives from them in return. Although the skew-symmetry index enables researchers to describe social groups, statistical inferential tests are required. The main aim of the present study is to propose an overall statistical technique for testing symmetry in experimental conditions, calculating the skew-symmetry statistic (Φ) at group level. Sampling distributions for the skew-symmetry statistic have been estimated by means of a Monte Carlo simulation in order to allow researchers to make statistical decisions. Furthermore, this study will allow researchers to choose the optimal experimental conditions for carrying out their research, as the power of the statistical test has been estimated. This statistical test could be used in experimental social psychology studies in which researchers can control the group size and the number of interactions within dyads.
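The computation of the statistic and of a simulated sampling distribution can be sketched in Python. This sketch assumes the commonly used definition Φ = tr(KᵀK) / tr(XᵀX), where X is the sociomatrix of behaviour counts and K = (X − Xᵀ)/2 is its skew-symmetric part. It also assumes that, under the null hypothesis of reciprocity, each interaction within a dyad is equally likely to go in either direction. The function names are hypothetical and the simulation settings are not those of the study.

```python
import random


def skew_symmetry(x):
    """Skew-symmetry statistic of a square sociomatrix (list of lists).

    Returns 0 for a perfectly symmetric matrix and 0.5 when all behaviour
    within every dyad flows in one direction only.
    """
    n = len(x)
    numerator = sum(((x[i][j] - x[j][i]) / 2.0) ** 2
                    for i in range(n) for j in range(n))
    denominator = sum(x[i][j] ** 2 for i in range(n) for j in range(n))
    return numerator / denominator


def null_distribution(group_size, per_dyad, reps=2000, seed=0):
    """Monte Carlo values of the statistic under the assumed null model."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        x = [[0] * group_size for _ in range(group_size)]
        for i in range(group_size):
            for j in range(i + 1, group_size):
                # Each interaction goes from i to j with probability 0.5
                i_to_j = sum(rng.random() < 0.5 for _ in range(per_dyad))
                x[i][j] = i_to_j
                x[j][i] = per_dyad - i_to_j
        values.append(skew_symmetry(x))
    return values


def monte_carlo_p_value(observed, null_values):
    """Proportion of simulated values at least as large as the observed one."""
    return sum(v >= observed for v in null_values) / len(null_values)
```

For example, `monte_carlo_p_value(skew_symmetry(matrix), null_distribution(4, 10))` would give an approximate p-value for a hypothetical four-member group with ten interactions per dyad. Repeating the simulation under an alternative model with unequal directional probabilities is one way power could be estimated for a given group size and number of interactions.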
Resumo:
The present work deals with quantifying group characteristics. Specifically, dyadic measures of interpersonal perceptions were used to forecast group performance. Forty-six groups of students, 24 of four and 22 of five people, were studied in a real educational assignment context, and marks were gathered as an indicator of group performance. Our results show that dyadic measures of interpersonal perceptions account for final marks. By means of linear regression analysis, 85% and 85.6% of group performance was explained for group sizes of four and five, respectively. The corresponding percentages reported in the scientific literature for the individualistic approach do not exceed 18%. The results of the present study support the utility of dyadic approaches for predicting group performance in social contexts.
Resumo:
Workgroup diversity can be conceptualized as variety, separation, or disparity. Thus, the proper operationalization of diversity depends on how a diversity dimension has been defined. Analytically, minimal diversity must be obtained when there are no differences on an attribute among the members of a group; maximal diversity, however, has a different shape for each conceptualization of diversity. Previous work on diversity indexes indicated maximum values for variety (e.g., Blau's index and Teachman's index), separation (e.g., standard deviation and mean Euclidean distance), and disparity (e.g., coefficient of variation and the Gini coefficient of concentration), although these maximum values are not valid for all group characteristics (i.e., group size and group size parity) and attribute scales (i.e., number of categories). We demonstrate analytically appropriate upper boundaries for conditional diversity determined by some specific group characteristics, avoiding the bias related to absolute diversity. This will allow applied researchers to make better interpretations regarding the relationship between group diversity and group outcomes.
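The gap between an absolute and a conditional maximum can be illustrated with Blau's index, 1 − Σ p_k², where p_k is the proportion of group members in category k. The sketch below is an illustration under stated assumptions rather than the derivation given in the paper: the conditional maximum is obtained from the standard argument that the index is largest when members are spread as evenly as possible over the categories, and the function names are hypothetical.

```python
def blau_index(counts):
    """Blau's index of variety for a list of category counts."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)


def max_blau_absolute(n_categories):
    """Absolute maximum, reached only when all proportions equal 1/K."""
    return (n_categories - 1) / n_categories


def max_blau_given_size(group_size, n_categories):
    """Maximum attainable for a given group size and number of categories.

    Members are allocated as evenly as possible: `r` categories receive
    q + 1 members and the remaining categories receive q members.
    """
    q, r = divmod(group_size, n_categories)
    counts = [q + 1] * r + [q] * (n_categories - r)
    return blau_index(counts)


# A group of five members cannot be split evenly over three categories,
# so its attainable maximum lies below the absolute maximum of 2/3.
conditional_max = max_blau_given_size(5, 3)
absolute_max = max_blau_absolute(3)
```

Dividing an observed index by the conditional maximum rather than the absolute one avoids understating the diversity of groups whose size makes the absolute maximum unreachable.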
Resumo:
This paper briefly reviews the problem of signal estimation in time series of data obtained from ERP recordings. It focuses on the lower-frequency components, such as the CNV. It proposes the alternative use of the smoothing techniques of Exploratory Data Analysis (EDA) to improve the resulting estimate, compared with the technique of simple averaging across trials.
Resumo:
Sickness absence (SA) is an important social, economic and public health issue. Identifying and understanding the determinants of variability in SA duration, whether biological, regulatory or health services-related, is essential for better management of SA. The conditional frailty model (CFM) is useful when repeated SA events occur within the same individual, as it allows simultaneous analysis of event dependence and heterogeneity due to unknown, unmeasured, or unmeasurable factors. However, its use may encounter computational limitations when applied to very large data sets, as may frequently occur in the analysis of SA duration. To overcome the computational issue, we propose a Poisson-based conditional frailty model (CFPM) for repeated SA events that accounts for both event dependence and heterogeneity. To demonstrate the usefulness of the proposed model in the SA duration context, we used data from all non-work-related SA episodes that occurred in Catalonia (Spain) in 2007, initiated by a diagnosis of either neoplasm or mental and behavioral disorders. As expected, the CFPM results were very similar to those of the CFM for both diagnosis groups. The CPU time for the CFPM was substantially shorter than that for the CFM. The CFPM is a suitable alternative to the CFM in survival analysis with recurrent events, especially with large databases.