63 results for statistical methods
in Consorci de Serveis Universitaris de Catalunya (CSUC), Spain
Abstract:
Within the scope of the European project Hydroptimet (INTERREG IIIB-MEDOCC programme), a limited area model (LAM) intercomparison is performed for intense events that caused extensive damage to people and territory. As the comparison is limited to single case studies, the work is not meant to provide a measure of the different models' skill, but to identify the key model factors for producing good forecasts of this kind of meteorological phenomenon. This work focuses on the Spanish flash-flood event known as the "Montserrat-2000" event. The study uses forecast data from seven operational LAMs, placed at the partners' disposal via the Hydroptimet ftp site, and observed data from the Catalonia rain gauge network. To improve the event analysis, satellite rainfall estimates have also been considered. For statistical evaluation of quantitative precipitation forecasts (QPFs), several non-parametric skill scores based on contingency tables have been used. Furthermore, for each model run it has been possible to identify the Catalonia regions affected by misses and false alarms using the contingency table elements. Moreover, the standard "eyeball" analysis of forecast and observed precipitation fields has been supported by a state-of-the-art diagnostic method, the contiguous rain area (CRA) analysis. This method quantifies the spatial shift of the forecast error and identifies the error sources that affected each model's forecasts. High-resolution modelling and domain size seem to play a key role in providing a skillful forecast. Further work, including verification against a wider observational data set, is needed to support this statement.
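The non-parametric skill scores mentioned in this abstract can be illustrated with a minimal sketch; the counts below are invented for illustration and are not the study's data, and the score definitions follow the common verification conventions rather than the paper's exact formulation.

```python
# Hedged sketch: skill scores from a 2x2 forecast/observation contingency
# table, as commonly used in QPF verification (illustrative counts only).

def skill_scores(hits, misses, false_alarms, correct_negatives):
    """Return POD, FAR and the equitable threat score (ETS)."""
    n = hits + misses + false_alarms + correct_negatives
    pod = hits / (hits + misses)                 # probability of detection
    far = false_alarms / (hits + false_alarms)   # false alarm ratio
    # hits expected by chance for a random forecast with the same marginals
    hits_random = (hits + misses) * (hits + false_alarms) / n
    ets = (hits - hits_random) / (hits + misses + false_alarms - hits_random)
    return pod, far, ets

pod, far, ets = skill_scores(hits=30, misses=10, false_alarms=20,
                             correct_negatives=140)
```

The same table elements (misses, false alarms) can then be mapped back to the regions where they occurred, as described above.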
Abstract:
Trees are a great bank of data, sometimes called for this reason the "silent witnesses" of the past. Because the annual formation of rings is normally influenced directly by climate parameters (generally changes in temperature and moisture or precipitation) and other environmental factors, changes that occurred in the past are "written" in the tree "archives" and can be "decoded" to interpret what had happened before, mainly for past climate reconstruction. Using dendrochronological methods to obtain samples of Pinus nigra from the Catalonian Pre-Pyrenees region, cores of 15 trees with a total time span of about 100-250 years were analysed for tree ring width (TRW) patterns; the series showed quite high correlation between them (0.71-0.84), corresponding to a common response to environmental changes in their annual growth. After different trials with the raw TRW data for standardization, in order to remove the negative exponential growth-curve dependency, double detrending (power transformation and a 32-year smoothing line) was selected as the best method for obtaining the indexes for further analysis. Analysing the cross-correlations between the obtained tree ring width indexes and climate data, significant correlations (p<0.05) were observed at some lags; for example, annual precipitation at lag -1 (previous year) had a negative correlation with TRW growth in the Pallars region. Significant correlation coefficients lie between 0.27 and 0.51 (with positive or negative signs) in many cases; for the recent (but very short period) climate data of the Seu d'Urgell meteorological station, some significant correlation coefficients of the order of 0.9 were observed. These results confirm the hypothesis of using dendrochronological data as a climate signal for further analysis, such as reconstruction of past climate or prediction for the future at the same locality.
Abstract:
Background: Molecular tools may help to uncover closely related and still diverging species from a wide variety of taxa and provide insight into the mechanisms, pace and geography of marine speciation. There is some controversy over the phylogeography and speciation modes of species groups with an Eastern Atlantic-Western Indian Ocean distribution, with previous studies suggesting that older (Miocene) events and/or more recent (Pleistocene) oceanographic processes could have influenced the phylogeny of marine taxa. The spiny lobster genus Palinurus allows testing among speciation hypotheses, since it has a particular distribution with two groups of three species each in the Northeastern Atlantic (P. elephas, P. mauritanicus and P. charlestoni) and the Southeastern Atlantic and Southwestern Indian Oceans (P. gilchristi, P. delagoae and P. barbarae). In the present study, we obtain a more complete understanding of the phylogenetic relationships among these species through a combined dataset with both nuclear and mitochondrial markers, by testing alternative hypotheses on both the mutation rate and tree topology under the recently developed approximate Bayesian computation (ABC) methods. Results: Our analyses support a North-to-South speciation pattern in Palinurus, with all the South African species forming a monophyletic clade nested within the Northern Hemisphere species. Coalescent-based ABC methods allowed us to reject the previously proposed hypothesis of a Middle Miocene speciation event related to the closure of the Tethyan Seaway. Instead, divergence times obtained for Palinurus species using the combined mtDNA-microsatellite dataset and standard mutation rates for mtDNA agree with known glaciation-related processes occurring during the last 2 My. Conclusion: The Palinurus speciation pattern is a typical example of a series of rapid speciation events occurring within a group, with very short branches separating the different species.
Our results support the hypothesis that recent climate change-related oceanographic processes have influenced the phylogeny of marine taxa, with most Palinurus species originating during the last two million years. The present study highlights the value of new coalescent-based statistical methods such as ABC for testing different speciation hypotheses using molecular data.
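The ABC machinery referred to here can be illustrated with a minimal rejection-sampling sketch. The toy model (estimating a normal mean) and all numbers stand in for the coalescent simulations and summary statistics actually used in such studies.

```python
# Hedged sketch of ABC rejection sampling: draw a parameter from the prior,
# simulate data under it, and keep draws whose summary statistic falls within
# a tolerance of the observed one. Toy normal-mean model, invented numbers.
import random

random.seed(1)

def simulate(mu, n=50):
    return [random.gauss(mu, 1.0) for _ in range(n)]

observed = simulate(2.0)                      # "observed" data, true mu = 2
obs_mean = sum(observed) / len(observed)      # summary statistic

accepted = []
for _ in range(2000):
    mu = random.uniform(-5, 5)                # flat prior on the parameter
    sim = simulate(mu)
    if abs(sum(sim) / len(sim) - obs_mean) < 0.1:   # tolerance epsilon
        accepted.append(mu)

posterior_mean = sum(accepted) / len(accepted)
```

Competing hypotheses (e.g. alternative divergence times) can be compared by counting how often simulations under each model are accepted.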
Abstract:
We present in this paper the results of applying several visual methods to a group of sites, dated between the sixth and first centuries BC, in the ager Tarraconensis (Tarragona, Spain), the hinterland of the Roman colony of Tarraco. The difficulty of interpreting the diverse results in a combined way has been resolved by using statistical methods, namely Principal Components Analysis (PCA) and K-means clustering. These methods have allowed us to classify the sites according to the visual structure of the landscape that contains them and the visual relationships that may have existed among them.
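The PCA-then-cluster workflow can be sketched as below; the "visibility" variables, the synthetic site data and the choice of k = 2 are all invented for illustration and do not reproduce the study's dataset.

```python
# Hedged sketch: PCA (via eigen-decomposition of the covariance matrix)
# followed by a minimal K-means on the component scores. Synthetic data:
# two invented "site types" with wide vs restricted views.
import numpy as np

rng = np.random.default_rng(0)
wide = rng.normal([8.0, 6.0, 7.0], 0.5, size=(10, 3))        # 3 visibility vars
restricted = rng.normal([2.0, 1.5, 2.5], 0.5, size=(10, 3))
X = np.vstack([wide, restricted])

# PCA: project centred data onto the two leading eigenvectors.
Xc = X - X.mean(axis=0)
eigval, eigvec = np.linalg.eigh(np.cov(Xc.T))   # ascending eigenvalues
scores = Xc @ eigvec[:, ::-1][:, :2]            # first two principal components

# Minimal K-means (k=2) on the PCA scores, seeded with two extreme points.
centers = scores[[0, -1]].copy()
for _ in range(20):
    labels = np.argmin(((scores[:, None] - centers[None]) ** 2).sum(-1), axis=1)
    centers = np.array([scores[labels == k].mean(axis=0) for k in range(2)])
```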
Abstract:
Background: The repertoire of statistical methods for the descriptive analysis of the burden of a disease has been expanded and implemented in statistical software packages in recent years. The purpose of this paper is to present a web-based tool, REGSTATTOOLS (http://regstattools.net), intended to provide analysis of the burden of cancer, or of other disease registry data. Three software applications are included in REGSTATTOOLS: SART (analysis of disease rates and their time trends), RiskDiff (analysis of percent changes in rates due to demographic factors and the risk of developing or dying from a disease) and WAERS (relative survival analysis). Results: We show a real-data application through the assessment of the burden of tobacco-related cancer incidence in two Spanish regions in the period 1995-2004. Using SART, we show that lung cancer is the most common of those cancers, with rising incidence trends among women. We compared 2000-2004 data with those of 1995-1999 to assess percent changes in the number of cases as well as relative survival, using RiskDiff and WAERS, respectively. We show that the net increase in lung cancer cases among women was mainly attributable to an increased risk of developing lung cancer, whereas in men it is attributable to the increase in population size. Among men, lung cancer relative survival was higher in 2000-2004 than in 1995-1999, whereas it was similar among women when these periods were compared. Conclusions: Unlike other similar applications, REGSTATTOOLS does not require local software installation; it is simple to use, fast and easy to interpret. It is a set of web-based statistical tools intended for the automated calculation of population indicators that any professional in the health or social sciences may require.
Abstract:
A statistical indentation method has been employed to study the hardness of fire-refined high-conductivity copper, using the nanoindentation technique. The Joslin and Oliver approach was used with the aim of separating the hardness (H) contribution of the copper matrix from that of inclusions and grain boundaries. This approach relies on a large array of imprints (around 400 indentations) performed at 150 nm of indentation depth. A statistical study using a cumulative distribution function fit and simulated Gaussian distributions shows that H for each phase can be extracted when the indentation depth is much smaller than the size of the secondary phases. It is found that the thermal treatment produces a hardness increase, due to the partial re-dissolution of the inclusions (mainly Pb and Sn) into the matrix.
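The grid-indentation idea of separating per-phase hardness from a large array of imprints can be illustrated with a deliberately crude sketch: all hardness values below are invented, and a simple threshold split stands in for the cumulative-distribution-function fit with Gaussian components used in the study.

```python
# Hedged sketch: a large array of hardness values drawn from two phases
# (invented numbers, GPa) is separated into per-phase mean hardness and
# area fractions. A fixed threshold replaces the study's CDF/Gaussian fit.
import random

random.seed(0)
matrix = [random.gauss(1.0, 0.05) for _ in range(300)]      # copper matrix
inclusions = [random.gauss(2.0, 0.10) for _ in range(100)]  # harder phase
H = matrix + inclusions

threshold = 1.5                      # chosen between the two modes
phase1 = [h for h in H if h < threshold]
phase2 = [h for h in H if h >= threshold]
H1 = sum(phase1) / len(phase1)       # estimated matrix hardness
H2 = sum(phase2) / len(phase2)       # estimated inclusion hardness
f1 = len(phase1) / len(H)            # estimated matrix fraction of imprints
```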
Abstract:
In the current study, we evaluated various robust statistical methods for comparing two independent groups. Two simulation scenarios were generated: one of equal population means and another of population mean differences. In each scenario, 33 experimental conditions were used, varying sample size, standard deviation and asymmetry. For each condition, 5000 replications per group were generated. The results show an adequate type I error rate but low power for the confidence intervals. In general, for the two scenarios studied (population mean differences and no population mean differences) across the conditions analysed, the Mann-Whitney U-test demonstrated strong performance, with the Yuen-Welch t-test performing slightly worse.
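The simulation design for estimating the empirical type I error rate can be sketched as follows. A small pure-Python Mann-Whitney U test (normal approximation, no tie correction, continuous data assumed) stands in for the implementations used in the study, and the replication count is reduced from 5000 to keep the sketch fast.

```python
# Hedged sketch: empirical type I error rate of the Mann-Whitney U test
# under equal population means (toy version of the study's simulation).
import random
from math import sqrt, erf

random.seed(0)

def mann_whitney_p(x, y):
    """Two-sided p-value via the normal approximation (no ties assumed)."""
    nx, ny = len(x), len(y)
    rank_of = {v: r for r, v in enumerate(sorted(x + y), start=1)}
    u = sum(rank_of[v] for v in x) - nx * (nx + 1) / 2
    z = (u - nx * ny / 2) / sqrt(nx * ny * (nx + ny + 1) / 12)
    return 1 - erf(abs(z) / sqrt(2))   # 2 * (1 - Phi(|z|))

reps, alpha, rejections = 1000, 0.05, 0
for _ in range(reps):
    a = [random.gauss(0, 1) for _ in range(30)]
    b = [random.gauss(0, 1) for _ in range(30)]  # same population as a
    if mann_whitney_p(a, b) < alpha:
        rejections += 1
type_i_rate = rejections / reps      # should be close to alpha
```

Power under mean differences is estimated the same way, drawing `b` from a shifted population.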
Abstract:
Report for the scientific sojourn at the University of Reading, United Kingdom, from January until May 2008. The main objectives have been, first, to infer population structure and parameters of demographic models using a total of 13 microsatellite loci to genotype approximately 30 individuals per population in 10 Palinurus elephas populations from both Mediterranean and Atlantic waters; and second, to develop statistical methods to identify discrepant loci, possibly under selection, and to implement those methods in the R software environment. It is important to consider that calculating the probability distribution of the demographic and mutational parameters for a full genetic data set is numerically difficult for complex demographic histories (Stephens 2003). Approximate Bayesian Computation (ABC), based on summary statistics to infer posterior distributions of variable parameters without explicit likelihood calculations, can surmount this difficulty. This would allow gathering information on different demographic prior values (i.e. effective population sizes, migration rate, microsatellite mutation rate, mutational processes) and assaying the sensitivity of inferences to demographic priors by assuming different priors.
Abstract:
The main purpose of the Log2XML application is to transform log files in field-separated text format into a standardized XML format. So that the application can work with logs from different systems or applications, it provides a template system (specifying the field order and the separator character) that defines the minimal structure needed to extract the information from any type of log based on field separators. Finally, the application allows the extracted information to be processed to generate reports and statistics. The project also explores the Grails technology in depth.
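The template idea (field order plus separator character) can be sketched in a few lines; the abstract describes a Grails application, so this Python re-implementation, its function name and the sample log line are purely illustrative.

```python
# Hedged sketch: turn separator-based log lines into XML using a template
# that gives the field order and the separator character.
import xml.etree.ElementTree as ET

def log_to_xml(lines, fields, sep):
    """Convert delimited log lines to an XML string, one <entry> per line."""
    root = ET.Element("log")
    for line in lines:
        entry = ET.SubElement(root, "entry")
        for name, value in zip(fields, line.rstrip("\n").split(sep)):
            ET.SubElement(entry, name).text = value
    return ET.tostring(root, encoding="unicode")

xml = log_to_xml(["127.0.0.1|GET /index.html|200"],
                 fields=["host", "request", "status"], sep="|")
```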
Abstract:
This study is a comparison of AU Press with three other traditional (non-open-access) Canadian university presses. The analysis is based on actual physical book sales on Amazon.com and Amazon.ca. The statistical methods include sampling the sales rankings of randomly selected books from each press. Results suggest that there is no significant difference in the ranking of printed books sold by AU Press compared with the traditional university presses. However, AU Press can demonstrate a significantly larger readership for its books, as evidenced by thousands of downloads of the open electronic versions.
Abstract:
This presentation aims to make understandable the use and application context of two webometrics techniques, log analysis and Google Analytics, which currently coexist in the Virtual Library of the UOC. First, a comprehensive introduction to webometrics is provided; then the case of the UOC's Virtual Library is analysed, focusing on the assimilation of these techniques and the considerations underlying their use, and covering holistically the process of gathering, processing and exploiting the data. Finally, guidelines are provided for the interpretation of the metric variables obtained.
Abstract:
This study examines how structural determinants influence intermediary factors of child health inequities and how they operate through the communities where children live. In particular, we explore individual, family and community level characteristics associated with a composite indicator that quantitatively measures intermediary determinants of early childhood health in Colombia. We use data from the 2010 Colombian Demographic and Health Survey (DHS). Adopting the conceptual framework of the Commission on Social Determinants of Health (CSDH), three dimensions related to child health are represented in the index: behavioural factors, psychosocial factors and the health system. In order to generate the weights of the variables and take into account the discrete nature of the data, principal component analysis (PCA) using polychoric correlations is employed in the index construction. Weighted multilevel models are used to examine community effects. The results show that the effect of household SES is attenuated when community characteristics are included, indicating the importance that the level of community development may have in mediating individual and family characteristics. The findings indicate that there is significant variance in intermediary determinants of child health between communities, especially for those determinants linked to the health system, even after controlling for individual, family and community characteristics. These results likely reflect that, whilst the community context can exert a greater influence on intermediary factors linked directly to health, in the case of psychosocial factors and parents' behaviours the family context can be more important. This underlines the importance of distinguishing between community and family intervention programmes.
Abstract:
In an earlier investigation (Burger et al., 2000), five sediment cores near the Rodrigues Triple Junction in the Indian Ocean were studied by applying classical statistical methods (fuzzy c-means clustering, linear mixing model, principal component analysis) to extract endmembers and evaluate the spatial and temporal variation of geochemical signals. Three main factors of sedimentation were expected by the marine geologists: a volcano-genetic, a hydro-hydrothermal and an ultra-basic factor. The display of fuzzy membership values and/or factor scores versus depth provided consistent results for two factors only; the ultra-basic component could not be identified. The reason may be that only traditional statistical methods were applied, i.e. the untransformed components were used, with the cosine-theta coefficient as similarity measure. During the last decade considerable progress was made in compositional data analysis, and many case studies were published using new tools for the exploratory analysis of such data. It therefore makes sense to check whether the application of suitable data transformations, reduction of the D-part simplex to two or three factors and visual interpretation of the factor scores would lead to a revision of the earlier results and to answers to open questions. In this paper we follow the lines of the paper by R. Tolosana-Delgado et al. (2005), starting with a problem-oriented interpretation of the biplot scattergram, extracting compositional factors, ilr-transforming the components and visualizing the factor scores in a spatial context: the compositional factors are plotted versus depth (time) of the core samples in order to facilitate the identification of the expected sources of the sedimentary process. Keywords: compositional data analysis, biplot, deep sea sediments
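The ilr transformation at the core of this workflow can be sketched for a 3-part composition; the sequential-binary-partition basis chosen below is one common convention and the example numbers are invented, so the paper's actual basis may differ.

```python
# Hedged sketch: isometric logratio (ilr) coordinates of a 3-part
# composition, using the sequential binary partition {x1|x2}, {x1,x2|x3}.
from math import log, sqrt

def ilr_3part(x1, x2, x3):
    """Return the two ilr coordinates of a 3-part composition."""
    z1 = sqrt(1 / 2) * log(x1 / x2)
    z2 = sqrt(2 / 3) * log(sqrt(x1 * x2) / x3)   # balance of {x1,x2} vs x3
    return z1, z2

z1, z2 = ilr_3part(0.2, 0.3, 0.5)   # invented composition summing to 1
```

A key property, and the reason logratios suit closed data, is scale invariance: the coordinates depend only on the ratios of parts, so (0.2, 0.3, 0.5) and (2, 3, 5) map to the same point.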
Abstract:
The statistical analysis of compositional data is commonly used in geological studies. As is well known, compositions should be treated using logratios of parts, which are difficult to use correctly in standard statistical packages. In this paper we describe the new features of our freeware package, named CoDaPack, which implements most of the basic statistical methods suitable for compositional data. An example using real data is presented to illustrate the use of the package.
Abstract:
The statistical analysis of compositional data should be carried out using logratios of parts, which are difficult to use correctly in standard statistical packages. For this reason a freeware package, named CoDaPack, was created. This software implements most of the basic statistical methods suitable for compositional data. In this paper we describe the new version of the package, now called CoDaPack3D. It is developed in Visual Basic for Applications (associated with Excel©), Visual Basic and OpenGL, and it is oriented towards users with a minimum knowledge of computers, with the aim of being simple and easy to use. This new version includes new graphical output in 2D and 3D. These outputs can be zoomed and, in 3D, rotated. A customization menu is also included, and outputs can be saved in jpeg format. This new version also includes interactive help, and all dialog windows have been improved in order to facilitate their use. To use CoDaPack one accesses Excel© and introduces the data in a standard spreadsheet, organized as a matrix where Excel© rows correspond to the observations and columns to the parts. The user executes macros that return numerical or graphical results. There are two kinds of numerical results, new variables and descriptive statistics, both of which appear on the same sheet. Graphical output appears in independent windows. In the present version there are 8 menus, with a total of 38 submenus which, after some dialogue, directly call the corresponding macro. The dialogues ask the user to input the variables and further parameters needed, as well as where to put the results. The web site http://ima.udg.es/CoDaPack contains this freeware package; only Microsoft Excel© under Microsoft Windows© is required to run the software. Keywords: compositional data analysis, software