957 results for Multivariate statistical methods
Abstract:
When a social survey is carried out over a large territory, there is always the desire to apply analyses similar to those of the survey to smaller populations or territories, evidently using the survey's own data. The objective of this article is to show how each stratum of a stratified sample can constitute a sampling frame for carrying out such analyses with full guarantees of precision or, at least, with calculable and acceptable guarantees, without increasing the sample size of the general survey.
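As a minimal illustration of the idea (with entirely hypothetical data), the following Python sketch treats a single stratum as its own sampling frame and attaches a calculable precision, a confidence interval, to the stratum-level estimate:

```python
import numpy as np
from scipy import stats

# Hypothetical stratum: responses from one stratum of a stratified sample.
rng = np.random.default_rng(42)
stratum = rng.normal(loc=5.0, scale=2.0, size=120)  # e.g., 120 respondents

n = stratum.size
mean = stratum.mean()
se = stratum.std(ddof=1) / np.sqrt(n)  # standard error of the stratum mean

# 95% confidence interval: the "calculable guarantee of precision".
t = stats.t.ppf(0.975, df=n - 1)
print(f"stratum mean = {mean:.2f} +/- {t * se:.2f} (95% CI)")
```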
Abstract:
A plant species' genetic population structure is the result of a complex combination of its life history, ecological preferences, position in the ecosystem and historical factors. As a result, many different statistical methods exist that measure different aspects of species' genetic structure. However, little is known about how these methods are interrelated and how they relate to a species' ecology and life history. In this study, we used the IntraBioDiv amplified fragment length polymorphism data set from 27 high-alpine species to calculate eight genetic summary statistics, which we jointly correlated with a set of six ecological and life-history traits. We found a large amount of redundancy among the calculated summary statistics and a significant association with the matrix of species traits. In a multivariate analysis, two main aspects of population structure were visible among the 27 species. The first aspect is related to the species' dispersal capacities and the second is most likely related to the species' postglacial recolonization of the Alps. Furthermore, we found that some summary statistics, most importantly Mantel's r and Jost's D, behave differently than theory predicts. We therefore advise caution against drawing overly strong conclusions from these statistics.
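Mantel's r, one of the summary statistics examined, correlates two distance matrices and assesses significance by permutation. A minimal Python sketch, with hypothetical genetic and geographic distance matrices in place of the study's data, follows:

```python
import numpy as np

def mantel_r(a, b):
    """Pearson correlation between the off-diagonal entries of two
    square symmetric distance matrices."""
    iu = np.triu_indices_from(a, k=1)
    return np.corrcoef(a[iu], b[iu])[0, 1]

def mantel_test(a, b, n_perm=999, rng=None):
    """Permutation Mantel test: permute rows/columns of one matrix."""
    if rng is None:
        rng = np.random.default_rng(0)
    r_obs = mantel_r(a, b)
    count = 0
    for _ in range(n_perm):
        p = rng.permutation(a.shape[0])
        if mantel_r(a[np.ix_(p, p)], b) >= r_obs:
            count += 1
    return r_obs, (count + 1) / (n_perm + 1)

# Hypothetical distance matrices for 27 sampled populations.
rng = np.random.default_rng(1)
x = rng.random((27, 2))
geo = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
gen = geo + rng.normal(scale=0.1, size=geo.shape)
gen = (gen + gen.T) / 2          # keep it symmetric
np.fill_diagonal(gen, 0)
print(mantel_test(gen, geo))     # (r, permutation p-value)
```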
Abstract:
The objectives of this work were to evaluate the genotype x environment (GxE) interaction for popcorn and to compare two multivariate analysis methods. Nine popcorn cultivars were sown on four dates one month apart in each of the agricultural years 1998/1999 and 1999/2000. The experiments were carried out using randomized block designs with four replicates. The cv. Zélia contributed the least to the GxE interaction. The cv. Viçosa performed similarly to cv. Rosa-claro. Optimal GxE performance was obtained by cv. CMS 42 in a favorable mega-environment, and by cv. CMS 43 in an unfavorable environment. Multivariate analysis supported the results from the method of Eberhart & Russell. The graphic analysis of the Additive Main effects and Multiplicative Interaction (AMMI) model was simple, allowing conclusions about stability, genotypic performance, genetic divergence between cultivars, and the environments that optimize cultivar performance. The graphic analysis of the Genotype main effects and Genotype x Environment interaction (GGE) method added to AMMI information on environmental stratification, defining mega-environments and the cultivars that performed best in them. Both methods are adequate to explain genotype x environment interactions.
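The AMMI decomposition fits additive main effects and then applies a singular value decomposition to the interaction residuals. A sketch of this step in Python, with hypothetical yield data standing in for the trial results:

```python
import numpy as np

# Hypothetical matrix of genotype-by-environment mean yields
# (9 cultivars x 8 sowing environments, mirroring the study design).
rng = np.random.default_rng(7)
y = rng.normal(loc=3.0, scale=0.4, size=(9, 8))

# AMMI: remove additive main effects, then decompose the interaction by SVD.
grand = y.mean()
gen_eff = y.mean(axis=1, keepdims=True) - grand   # genotype main effects
env_eff = y.mean(axis=0, keepdims=True) - grand   # environment main effects
interaction = y - grand - gen_eff - env_eff

u, s, vt = np.linalg.svd(interaction, full_matrices=False)
# Biplot coordinates on the first interaction principal component (IPCA1):
ipca1_gen = u[:, 0] * np.sqrt(s[0])   # cultivar scores
ipca1_env = vt[0] * np.sqrt(s[0])     # environment scores
print(ipca1_gen.round(3), ipca1_env.round(3), sep="\n")
```

Cultivars with IPCA1 scores near zero are the stable ones; plotting cultivar against environment scores gives the biplot the abstract refers to.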
Abstract:
We present in this paper the results of the application of several visual analysis methods to a group of locations, dated between the 6th and 1st centuries BC, in the ager Tarraconensis (Tarragona, Spain), the hinterland of the Roman colony of Tarraco. The difficulty of interpreting the diverse results in a combined way was resolved by means of statistical methods, namely Principal Components Analysis (PCA) and K-means clustering. These methods allowed us to classify the sites according to the visual structure of the landscape that contains them and the visual relationships that could exist among them.
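A minimal Python sketch of the PCA-plus-K-means workflow, using hypothetical visibility variables in place of the actual GIS measurements:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Hypothetical table: one row per site, columns are visibility variables
# (e.g., viewshed area, intervisibility counts) from the GIS analyses.
rng = np.random.default_rng(0)
sites = rng.random((40, 6))

# Standardize, reduce to two principal components, then cluster the scores.
scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(sites))
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scores)
print(labels)   # cluster membership = candidate site classification
```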
Abstract:
Research on condition monitoring of electric motors has been extensive for several decades. Research and development at universities and in industry has provided the means for predictive condition monitoring, and many different devices and systems have been developed and are widely used in industry, transportation and civil engineering. In addition, many methods have been developed and reported in scientific forums to improve existing methods for the automatic analysis of faults. These methods, however, are not widely used as part of condition monitoring systems. The main reasons are, firstly, that many methods are presented in scientific papers without their performance being evaluated in different conditions and, secondly, that the methods include parameters so case-specific that implementing a system using them would be far from straightforward. In this thesis, some of these methods are evaluated theoretically and tested with simulations and with a drive in a laboratory. A new automatic analysis method for bearing fault detection is introduced. In the first part of this work, the generation of the bearing-fault-originated signal is explained, and its influence on the stator current is assessed with qualitative and quantitative estimation. The feasibility of the stator current measurement as a bearing fault indicator is verified experimentally with a running 15 kW induction motor. The second part of this work concentrates on bearing fault analysis using the vibration measurement signal. The performance of a micromachined silicon accelerometer chip in conjunction with envelope spectrum analysis of the cyclic bearing fault is tested experimentally. Furthermore, different methods for creating feature extractors for bearing fault classification are researched, and an automatic fault classifier using multivariate statistical discrimination and fuzzy logic is introduced. It is often important that the on-line condition monitoring system is integrated with the industrial communications infrastructure. Two types of sensor solution are tested in the thesis: the first is a sensor with calculation capacity, for example for producing the envelope spectra; the other collects the measurement data in memory so that another device can read the data via a field bus. The data communications requirements depend strongly on the type of sensor solution selected. If the data is already analysed in the sensor, data communications are needed only for the results; otherwise, all measurement data must be transferred. The classification method can be complex if the data is analysed at a management-level computer, but if the analysis is made in the sensor itself, it must be simple due to the restricted calculation and memory capacity.
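The envelope spectrum analysis mentioned above demodulates the vibration signal so that the periodic bearing-defect impacts appear as a spectral peak. A minimal Python sketch with a simulated signal (the 87 Hz defect frequency and other parameters are illustrative, not from the thesis):

```python
import numpy as np
from scipy.signal import hilbert

# Hypothetical vibration signal: a 3 kHz carrier amplitude-modulated at a
# bearing defect frequency of 87 Hz, sampled at 10 kHz for one second.
fs, f_defect = 10_000, 87.0
t = np.arange(0, 1.0, 1 / fs)
signal = (1 + 0.5 * np.cos(2 * np.pi * f_defect * t)) * np.sin(2 * np.pi * 3_000 * t)

# Envelope spectrum: magnitude spectrum of the Hilbert-transform envelope.
envelope = np.abs(hilbert(signal))
spectrum = np.abs(np.fft.rfft(envelope - envelope.mean()))
freqs = np.fft.rfftfreq(len(envelope), 1 / fs)
print(f"peak at {freqs[np.argmax(spectrum)]:.1f} Hz")  # ~87 Hz, the defect frequency
```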
Abstract:
To enable a mathematically and physically sound execution of the fatigue test and a correct interpretation of its results, statistical evaluation methods are used to assist in the analysis of fatigue testing data. The main objective of this work is to develop step-by-step instructions for the statistical analysis of laboratory fatigue data. The scope of this project is to provide practical cases answering the several questions raised in the treatment of test data, applying the methods and formulae in the document IIW-XIII-2138-06 (Best Practice Guide on the Statistical Analysis of Fatigue Data). Generally, the questions in the data sheets involve these aspects: estimation of the necessary sample size, verification of the statistical equivalence of the collated sets of data, and determination of characteristic curves in different cases. The series of comprehensive examples given in this thesis demonstrates the various statistical methods and develops a sound procedure for creating reliable calculation rules for fatigue analysis.
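One recurring task in such guides is fitting a mean S-N curve and shifting it to a characteristic (design) curve. A hedged Python sketch with hypothetical test data; the shift factor k below is a placeholder, not the exact IIW formula:

```python
import numpy as np
from scipy import stats

# Hypothetical fatigue results: stress range (MPa) and cycles to failure.
s = np.array([200, 180, 160, 150, 140, 130, 120, 110])
n_cycles = np.array([1.2e5, 2.3e5, 4.9e5, 7.1e5, 1.1e6, 1.8e6, 3.2e6, 5.5e6])

# Basquin-type fit: log N = a + b log S (ordinary least squares).
x, y = np.log10(s), np.log10(n_cycles)
fit = stats.linregress(x, y)
resid_sd = np.sqrt(np.sum((y - (fit.intercept + fit.slope * x)) ** 2) / (len(y) - 2))

# Characteristic curve: mean curve shifted down by k residual standard
# deviations (k depends on sample size and survival probability).
k = 2.0
log_n_char = fit.intercept + fit.slope * np.log10(150) - k * resid_sd
print(f"characteristic life at 150 MPa: {10 ** log_n_char:.3g} cycles")
```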
Abstract:
Surface roughness is one of the quality criteria for paper. It is measured with devices that physically probe the paper surface and with optical instruments. These measurements require laboratory conditions, but the paper industry needs faster, on-line measurements. Paper surface roughness can be expressed as a single roughness value for a sample. In this work, the sample is divided into significant regions, and a separate roughness value is computed for each region. Several methods have been used to measure roughness; here, a generally accepted statistical method is used in addition to the distance transform. In paper surface roughness measurement there has been a need to divide the analysed sample into regions on the basis of roughness, since region division makes it possible to delimit the clearly rougher areas of the sample. The distance transform produces regions, which are then analysed. These regions are merged into connected regions with various segmentation methods; algorithms based on the PNN (Pairwise Nearest Neighbor) method and on merging neighbouring regions are used. A split-and-merge approach is also examined. Validation of segmented images has usually been done by human inspection. The approach of this work is to compare the generally accepted statistical method with the segmentation results: a high correlation between these results indicates successful segmentation. The results of the different experiments are compared using hypothesis testing. Two sample series, measured with OptiTopo and with a profilometer, are analysed. The starting parameters of the distance transform, which were varied during the experiments, were the number and location of seed points; the same parameter changes were applied to all algorithms used for region merging. After the distance transform, the correlation was stronger for the samples measured with the profilometer than for those measured with OptiTopo. For the segmented OptiTopo samples, the correlation improved more strongly than for the profilometer samples. The correlation was best for the results produced by the PNN method.
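A minimal Python sketch of the distance-transform step, using a hypothetical roughness map (the thesis's own thresholds, seed-point parameters, and PNN merging are not reproduced here):

```python
import numpy as np
from scipy import ndimage

# Hypothetical roughness map: a 2-D array of local roughness values.
rng = np.random.default_rng(3)
roughness = ndimage.gaussian_filter(rng.random((128, 128)), sigma=4)

# Mark clearly rougher areas, then use the Euclidean distance transform
# to describe how far each smooth pixel lies from a rough region.
rough_mask = roughness > np.percentile(roughness, 80)
dist = ndimage.distance_transform_edt(~rough_mask)

# Label connected rough regions; a merging step (e.g., PNN) would follow.
labels, n_regions = ndimage.label(rough_mask)
print(n_regions, dist.max())
```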
Abstract:
Background: The repertoire of statistical methods dealing with the descriptive analysis of the burden of a disease has been expanded and implemented in statistical software packages in recent years. The purpose of this paper is to present a web-based tool, REGSTATTOOLS (http://regstattools.net), intended to provide analyses of the burden of cancer, or of other groups of diseases, from registry data. Three software applications are included in REGSTATTOOLS: SART (analysis of disease rates and their time trends), RiskDiff (analysis of percent changes in the rates due to demographic factors and to the risk of developing or dying from a disease) and WAERS (relative survival analysis). Results: We show a real-data application through the assessment of the burden of tobacco-related cancer incidence in two Spanish regions in the period 1995-2004. Using SART, we show that lung cancer is the most common of these cancers, with rising incidence trends among women. We compared 2000-2004 data with those of 1995-1999 to assess percent changes in the number of cases as well as relative survival, using RiskDiff and WAERS, respectively. We show that the net increase in lung cancer cases among women was mainly attributable to an increased risk of developing lung cancer, whereas in men it was attributable to the increase in population size. Among men, lung cancer relative survival was higher in 2000-2004 than in 1995-1999, whereas it was similar among women across these periods. Conclusions: Unlike other similar applications, REGSTATTOOLS does not require local software installation, and it is simple to use, fast and easy to interpret. It is a set of web-based statistical tools intended for automated calculation of population indicators that any professional in the health or social sciences may require.
Abstract:
Discriminant analysis is a statistical method used to determine which variables, measured on objects or individuals, best explain the differences among the groups to which those objects or individuals belong. It is a technique that allows us to check to what extent the independent variables considered in the research correctly classify the subjects or objects. The main elements of the procedure for carrying out discriminant analysis are shown and explained, together with its application using the statistical package SPSS, version 18: the development of the statistical model, the conditions for applying the analysis, the estimation and interpretation of the discriminant functions, the classification methods and the validation of the results.
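The article works in SPSS; purely as an illustrative counterpart, the same steps, fitting the discriminant functions and validating the classification, can be sketched in Python with hypothetical data:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# Hypothetical data: 3 groups, 4 predictor variables per subject.
rng = np.random.default_rng(5)
X = np.vstack([rng.normal(loc=m, size=(30, 4)) for m in (0.0, 1.0, 2.0)])
y = np.repeat([0, 1, 2], 30)

lda = LinearDiscriminantAnalysis()
lda.fit(X, y)
print(lda.coef_)                          # discriminant function coefficients
print(cross_val_score(lda, X, y).mean())  # cross-validated classification rate
```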
Abstract:
A statistical indentation method has been employed to study the hardness of fire-refined high-conductivity copper, using the nanoindentation technique. The Joslin and Oliver approach was used with the aim of separating the hardness (H) contribution of the copper matrix from that of inclusions and grain boundaries. This approach relies on a large array of imprints (around 400 indentations) performed at 150 nm of indentation depth. A statistical study using a cumulative distribution function fit and simulated Gaussian distributions shows that H for each phase can be extracted when the indentation depth is much lower than the size of the secondary phases. It is found that the thermal treatment produces a hardness increase, due to the partial re-dissolution of the inclusions (mainly Pb and Sn) in the matrix.
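The deconvolution idea, separating per-phase hardness values out of the distribution from a large indentation array, can be sketched with a Gaussian mixture fit in Python (hypothetical hardness values; not the authors' exact cumulative-distribution procedure):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Hypothetical hardness values from a large indentation grid:
# a soft matrix phase plus a harder inclusion/boundary population.
rng = np.random.default_rng(2)
h = np.concatenate([rng.normal(1.1, 0.08, 320),   # matrix (GPa)
                    rng.normal(1.6, 0.15, 80)])   # inclusions / boundaries

gmm = GaussianMixture(n_components=2, random_state=0).fit(h.reshape(-1, 1))
for mean, var, w in zip(gmm.means_.ravel(), gmm.covariances_.ravel(), gmm.weights_):
    print(f"H = {mean:.2f} GPa, sd = {np.sqrt(var):.2f}, fraction = {w:.2f}")
```

The recovered component means play the role of the per-phase hardness values; the separation only works when each indent samples a single phase, which is why the shallow 150 nm depth matters.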
Abstract:
Seaports play an important part in the wellbeing of a nation. Many nations are highly dependent on foreign trade, and most trade is carried by sea vessels. This study is part of a larger research project in which a simulation model is required for further analyses of Finnish macro-logistical networks. The objective of this study is to create a system dynamics simulation model that gives an accurate forecast of the development of demand for Finnish seaports up to 2030. The emphasis of this study is to show how a detailed harbor demand system dynamics model can be created with the help of statistical methods. The forecasting methods used were ARIMA (autoregressive integrated moving average) and regression models. The resulting simulation model gives a forecast with confidence intervals and allows different scenarios to be studied. The building process was found to be a useful one, and the model can be expanded to be more detailed. Required capacity for other parts of the Finnish logistical system could easily be included in the model.
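A minimal Python sketch of the ARIMA part of such a forecast, using statsmodels with hypothetical throughput data (the model order and series are illustrative, not those of the study):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical annual cargo throughput for one seaport (million tonnes).
rng = np.random.default_rng(4)
years = pd.date_range("1990", periods=25, freq="YS")
tonnes = pd.Series(30 + 0.8 * np.arange(25) + rng.normal(0, 1.5, 25), index=years)

# ARIMA(1,1,0) fit and a forecast out to 2030 with confidence intervals.
model = ARIMA(tonnes, order=(1, 1, 0)).fit()
forecast = model.get_forecast(steps=16)
print(forecast.predicted_mean.tail(3))
print(forecast.conf_int().tail(3))  # the interval band used in scenario runs
```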
Abstract:
In order to elucidate the traditional classification of archaeological artefacts, a multielemental analytical method for the characterisation of their micro- and macro-chemical constituents, combined with multivariate statistical analysis for classification, was used. Instrumental thermal neutron activation analysis was applied for elemental chemical determination, together with three statistical methods: discriminant, cluster and modified cluster analysis. The statistical results obtained for the samples from the Iquiri, Quinari and Xapuri archaeological phases were in good agreement with the conventional archaeological classification. The Iaco and Jacuru archaeological phases were not characterised as homogeneous groups. The Iquiri phase was the most distinct in relation to the other analysed groups. A homogeneous group comprising 54% of the samples collected at the Los Angeles site was also found; this could be characterised as a new archaeological phase.
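As an illustrative stand-in for the cluster analysis step (the study's exact variant is not reproduced), a hierarchical clustering of standardized elemental concentrations can be sketched in Python:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.stats import zscore

# Hypothetical elemental concentrations: one row per artefact sample,
# one column per element determined by neutron activation analysis.
rng = np.random.default_rng(6)
conc = np.vstack([rng.normal(m, 0.3, size=(15, 5)) for m in (1.0, 2.0, 3.5)])

# Standardize, then agglomerative (Ward) clustering into candidate phases.
z = zscore(conc, axis=0)
groups = fcluster(linkage(z, method="ward"), t=3, criterion="maxclust")
print(groups)   # cluster labels to compare against the archaeological phases
```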
Abstract:
Compositional data (concentrations) are common in the geosciences. Neglecting their character may lead to erroneous conclusions: spurious correlation (K. Pearson, 1897) has disastrous consequences. On the basis of the pioneering work by J. Aitchison in the 1980s, a methodology free of these drawbacks is now available. The geometry of the simplex allows the representation of compositions using orthogonal coordinates, to which the usual statistical methods can be applied, thus facilitating computation and analysis. The use of (log-)ratios precludes interpreting single concentrations in disregard of their relative character. A hydro-chemical data set is used to illustrate the point.
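The centred log-ratio (clr) transform is one standard way, in the Aitchison framework, of moving compositions into coordinates where the usual statistical methods apply. A minimal Python sketch with hypothetical hydro-chemical data:

```python
import numpy as np

def clr(x):
    """Centred log-ratio transform of compositions (rows sum to a constant)."""
    logx = np.log(x)
    return logx - logx.mean(axis=1, keepdims=True)

# Hypothetical hydro-chemical compositions, closed to sum to 1.
conc = np.array([[60.0, 25.0, 15.0],
                 [55.0, 30.0, 15.0],
                 [70.0, 20.0, 10.0]])
comp = conc / conc.sum(axis=1, keepdims=True)

# Usual statistics (means, PCA, ...) are then applied to the clr coordinates.
print(clr(comp).round(3))
```

Because clr works on ratios, it is invariant to the closure constant, which is exactly the relative-character point the abstract makes.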
Abstract:
In the current study, we evaluated various robust statistical methods for comparing two independent groups. Two simulation scenarios were generated: one of equal population means and another of population mean differences. In each scenario, 33 experimental conditions were used as a function of sample size, standard deviation and asymmetry. For each condition, 5000 replications per group were generated. The results show an adequate Type I error rate but not a high power for the confidence intervals. In general, for the two scenarios studied (population mean differences and no population mean differences) across the different conditions analysed, the Mann-Whitney U-test demonstrated strong performance, with the Yuen-Welch t-test performing slightly worse.
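A minimal Python sketch of such a simulation under the null scenario, estimating Type I error rates; Welch's t-test stands in here for the Yuen-Welch variant, which additionally trims the means:

```python
import numpy as np
from scipy import stats

# Null scenario (equal population means): estimate the Type I error rate
# of the Mann-Whitney U test and the Welch t-test over many replications.
rng = np.random.default_rng(8)
n_rep, n, alpha = 5000, 30, 0.05
rej_u = rej_t = 0
for _ in range(n_rep):
    a, b = rng.normal(size=n), rng.normal(size=n)
    rej_u += stats.mannwhitneyu(a, b).pvalue < alpha
    rej_t += stats.ttest_ind(a, b, equal_var=False).pvalue < alpha  # Welch

# Both rates should be close to the nominal alpha = 0.05 under the null.
print(f"Mann-Whitney: {rej_u / n_rep:.3f}, Welch t: {rej_t / n_rep:.3f}")
```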
Abstract:
Ten common doubts of chemistry students and professionals about statistical applications are discussed. The use of the N-1 denominator instead of N for the standard deviation is described. The statistical meaning of the denominators of the root mean square error of calibration (RMSEC) and root mean square error of validation (RMSEV) is given for researchers using multivariate calibration methods. The reason why scientists and engineers use the average instead of the median is explained. Several problematic aspects of regression and correlation are treated. The popular use of triplicate experiments in teaching and research laboratories is seen to have its origin in statistical confidence intervals. Nonparametric statistics and bootstrapping methods round out the discussion.
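A minimal Python sketch of two of these points, the N-1 denominator and the RMSEC/RMSEV denominators (the degrees-of-freedom correction for calibration varies between texts, so the value of p below is illustrative):

```python
import numpy as np

x = np.array([4.9, 5.1, 5.0, 5.3, 4.8])

# Sample standard deviation uses the N-1 denominator (ddof=1):
print(np.std(x, ddof=1))   # estimates sigma from a sample
print(np.std(x, ddof=0))   # N denominator: appropriate only for a full population

y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.9])

# RMSEV over an independent validation set uses N in the denominator:
rmsev = np.sqrt(np.mean((y_true - y_pred) ** 2))
# RMSEC over the calibration set corrects for fitted model parameters
# (denominator N - p, with p the degrees of freedom consumed by the fit):
p = 2
rmsec = np.sqrt(np.sum((y_true - y_pred) ** 2) / (len(y_true) - p))
print(rmsev, rmsec)
```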