993 resultados para multivariate methods
Resumo:
Many multivariate methods that are apparently distinct can be linked by introducing oneor more parameters in their definition. Methods that can be linked in this way arecorrespondence analysis, unweighted or weighted logratio analysis (the latter alsoknown as "spectral mapping"), nonsymmetric correspondence analysis, principalcomponent analysis (with and without logarithmic transformation of the data) andmultidimensional scaling. In this presentation I will show how several of thesemethods, which are frequently used in compositional data analysis, may be linkedthrough parametrizations such as power transformations, linear transformations andconvex linear combinations. Since the methods of interest here all lead to visual mapsof data, a "movie" can be made where where the linking parameter is allowed to vary insmall steps: the results are recalculated "frame by frame" and one can see the smoothchange from one method to another. Several of these "movies" will be shown, giving adeeper insight into the similarities and differences between these methods
Resumo:
Many multivariate methods that are apparently distinct can be linked by introducing oneor more parameters in their definition. Methods that can be linked in this way arecorrespondence analysis, unweighted or weighted logratio analysis (the latter alsoknown as "spectral mapping"), nonsymmetric correspondence analysis, principalcomponent analysis (with and without logarithmic transformation of the data) andmultidimensional scaling. In this presentation I will show how several of thesemethods, which are frequently used in compositional data analysis, may be linkedthrough parametrizations such as power transformations, linear transformations andconvex linear combinations. Since the methods of interest here all lead to visual mapsof data, a "movie" can be made where where the linking parameter is allowed to vary insmall steps: the results are recalculated "frame by frame" and one can see the smoothchange from one method to another. Several of these "movies" will be shown, giving adeeper insight into the similarities and differences between these methods.
Resumo:
Väitöstutkimuksessa on tarkasteltuinfrapunaspektroskopian ja monimuuttujaisten aineistonkäsittelymenetelmien soveltamista kiteytysprosessin monitoroinnissa ja kidemäisen tuotteen analysoinnissa. Parhaillaan kiteytysprosessitutkimuksessa maailmanlaajuisesti tutkitaan intensiivisesti erilaisten mittausmenetelmien soveltamista kiteytysprosessin ilmiöidenjatkuvaan mittaamiseen niin nestefaasista kuin syntyvistä kiteistäkin. Lisäksi tuotteen karakterisointi on välttämätöntä tuotteen laadun varmistamiseksi. Erityisesti lääkeaineiden valmistuksessa kiinnostusta tämäntyyppiseen tutkimukseen edistää Yhdysvaltain elintarvike- ja lääkeaineviraston (FDA) prosessianalyyttisiintekniikoihin (PAT) liittyvä ohjeistus, jossa määritellään laajasti vaatimukset lääkeaineiden valmistuksessa ja tuotteen karakterisoinnissa tarvittaville mittauksille turvallisten valmistusprosessien takaamiseksi. Jäähdytyskiteytyson erityisesti lääketeollisuudessa paljon käytetty erotusmenetelmä kiinteän raakatuotteen puhdistuksessa. Menetelmässä puhdistettava kiinteä raaka-aine liuotetaan sopivaan liuottimeen suhteellisen korkeassa lämpötilassa. Puhdistettavan aineen liukoisuus käytettävään liuottimeen laskee lämpötilan laskiessa, joten systeemiä jäähdytettäessä liuenneen aineen konsentraatio prosessissa ylittää liukoisuuskonsentraation. Tällaiseen ylikylläiseen systeemiin pyrkii muodostumaan uusia kiteitä tai olemassa olevat kiteet kasvavat. Ylikylläisyys on yksi tärkeimmistä kidetuotteen laatuun vaikuttavista tekijöistä. Jäähdytyskiteytyksessä syntyvän tuotteen ominaisuuksiin voidaan vaikuttaa mm. liuottimen valinnalla, jäähdytyprofiililla ja sekoituksella. Lisäksi kiteytysprosessin käynnistymisvaihe eli ensimmäisten kiteiden muodostumishetki vaikuttaa tuotteen ominaisuuksiin. Kidemäisen tuotteen laatu määritellään kiteiden keskimääräisen koon, koko- ja muotojakaumansekä puhtauden perusteella. Lääketeollisuudessa on usein vaatimuksena, että tuote edustaa tiettyä polymorfimuotoa, mikä tarkoittaa molekyylien kykyä järjestäytyä kidehilassa usealla eri tavalla. Edellä mainitut ominaisuudet vaikuttavat tuotteen jatkokäsiteltävyyteen, kuten mm. suodattuvuuteen, jauhautuvuuteen ja tabletoitavuuteen. Lisäksi polymorfiamuodolla on vaikutusta moniin tuotteen käytettävyysominaisuuksiin, kuten esim. lääkeaineen liukenemisnopeuteen elimistössä. Väitöstyössä on tutkittu sulfatiatsolin jäähdytyskiteytystä käyttäen useita eri liuotinseoksia ja jäähdytysprofiileja sekä tarkasteltu näiden tekijöiden vaikutustatuotteen laatuominaisuuksiin. Infrapunaspektroskopia on laajalti kemian alan tutkimuksissa sovellettava menetelmä. Siinä mitataan tutkittavan näytteenmolekyylien värähtelyjen aiheuttamia spektrimuutoksia IR alueella. Tutkimuksessa prosessinaikaiset mittaukset toteutettiin in-situ reaktoriin sijoitettavalla uppoanturilla käyttäen vaimennettuun kokonaisheijastukseen (ATR) perustuvaa Fourier muunnettua infrapuna (FTIR) spektroskopiaa. Jauhemaiset näytteet mitattiin off-line diffuusioheijastukseen (DRIFT) perustuvalla FTIR spektroskopialla. Monimuuttujamenetelmillä (kemometria) voidaan useita satoja, jopa tuhansia muuttujia käsittävä spektridata jalostaa kvalitatiiviseksi (laadulliseksi) tai kvantitatiiviseksi (määrälliseksi) prosessia kuvaavaksi informaatioksi. Väitöstyössä tarkasteltiin laajasti erilaisten monimuuttujamenetelmien soveltamista mahdollisimman monipuolisen prosessia kuvaavan informaation saamiseksi mitatusta spektriaineistosta. Väitöstyön tuloksena on ehdotettu kalibrointirutiini liuenneen aineen konsentraation ja edelleen ylikylläisyystason mittaamiseksi kiteytysprosessin aikana. Kalibrointirutiinin kehittämiseen kuuluivat aineiston hyvyyden tarkastelumenetelmät, aineiston esikäsittelymenetelmät, varsinainen kalibrointimallinnus sekä mallin validointi. Näin saadaan reaaliaikaista informaatiota kiteytysprosessin ajavasta voimasta, mikä edelleen parantaa kyseisen prosessin tuntemusta ja hallittavuutta. Ylikylläisyystason vaikutuksia syntyvän kidetuotteen laatuun seurattiin usein kiteytyskokein. Työssä on esitetty myös monimuuttujaiseen tilastolliseen prosessinseurantaan perustuva menetelmä, jolla voidaan ennustaa spontaania primääristä ytimenmuodostumishetkeä mitatusta spektriaineistosta sekä mahdollisesti päätellä ydintymisessä syntyvä polymorfimuoto. Ehdotettua menetelmää hyödyntäen voidaan paitsi ennakoida kideytimien muodostumista myös havaita mahdolliset häiriötilanteet kiteytysprosessin alkuhetkillä. Syntyvää polymorfimuotoa ennustamalla voidaan havaita ei-toivotun polymorfin ydintyminen,ja mahdollisesti muuttaa kiteytyksen ohjausta halutun polymorfimuodon saavuttamiseksi. Monimuuttujamenetelmiä sovellettiin myös kiteytyspanosten välisen vaihtelun määrittämiseen mitatusta spektriaineistosta. Tämäntyyppisestä analyysistä saatua informaatiota voidaan hyödyntää kiteytysprosessien suunnittelussa ja optimoinnissa. Väitöstyössä testattiin IR spektroskopian ja erilaisten monimuuttujamenetelmien soveltuvuutta kidetuotteen polymorfikoostumuksen nopeaan määritykseen. Jauhemaisten näytteiden luokittelu eri polymorfeja sisältäviin näytteisiin voitiin tehdä käyttäen tarkoitukseen soveltuvia monimuuttujaisia luokittelumenetelmiä. Tämä tarjoaa nopean menetelmän jauhemaisen näytteen polymorfikoostumuksen karkeaan arviointiin, eli siihen mitä yksittäistä polymorfia kyseinen näyte pääasiassa sisältää. Varsinainen kvantitatiivinen analyysi, eli sen selvittäminen paljonko esim. painoprosentteina näyte sisältää eri polymorfeja, vaatii kaikki polymorfit kattavan fysikaalisen kalibrointisarjan, mikä voi olla puhtaiden polymorfien huonon saatavuuden takia hankalaa.
Resumo:
Many multivariate methods that are apparently distinct can be linked by introducing one or more parameters in their definition. Methods that can be linked in this way are correspondence analysis, unweighted or weighted logratio analysis (the latter also known as "spectral mapping"), nonsymmetric correspondence analysis, principal component analysis (with and without logarithmic transformation of the data) and multidimensional scaling. In this presentation I will show how several of these methods, which are frequently used in compositional data analysis, may be linked through parametrizations such as power transformations, linear transformations and convex linear combinations. Since the methods of interest here all lead to visual maps of data, a "movie" can be made where where the linking parameter is allowed to vary in small steps: the results are recalculated "frame by frame" and one can see the smooth change from one method to another. Several of these "movies" will be shown, giving a deeper insight into the similarities and differences between these methods
Resumo:
The topology of real-world complex networks, such as in transportation and communication, is always changing with time. Such changes can arise not only as a natural consequence of their growth, but also due to major modi. cations in their intrinsic organization. For instance, the network of transportation routes between cities and towns ( hence locations) of a given country undergo a major change with the progressive implementation of commercial air transportation. While the locations could be originally interconnected through highways ( paths, giving rise to geographical networks), transportation between those sites progressively shifted or was complemented by air transportation, with scale free characteristics. In the present work we introduce the path-star transformation ( in its uniform and preferential versions) as a means to model such network transformations where paths give rise to stars of connectivity. It is also shown, through optimal multivariate statistical methods (i.e. canonical projections and maximum likelihood classification) that while the US highways network adheres closely to a geographical network model, its path-star transformation yields a network whose topological properties closely resembles those of the respective airport transportation network.
Resumo:
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Resumo:
A methodology to define favorable areas in petroleum and mineral exploration is applied, which consists in weighting the exploratory variables, in order to characterize their importance as exploration guides. The exploration data are spatially integrated in the selected area to establish the association between variables and deposits, and the relationships among distribution, topology, and indicator pattern of all variables. Two methods of statistical analysis were compared. The first one is the Weights of Evidence Modeling, a conditional probability approach (Agterberg, 1989a), and the second one is the Principal Components Analysis (Pan, 1993). In the conditional method, the favorability estimation is based on the probability of deposit and variable joint occurrence, with the weights being defined as natural logarithms of likelihood ratios. In the multivariate analysis, the cells which contain deposits are selected as control cells and the weights are determined by eigendecomposition, being represented by the coefficients of the eigenvector related to the system's largest eigenvalue. The two techniques of weighting and complementary procedures were tested on two case studies: 1. Recôncavo Basin, Northeast Brazil (for Petroleum) and 2. Itaiacoca Formation of Ribeira Belt, Southeast Brazil (for Pb-Zn Mississippi Valley Type deposits). The applied methodology proved to be easy to use and of great assistance to predict the favorability in large areas, particularly in the initial phase of exploration programs. © 1998 International Association for Mathematical Geology.
Resumo:
Current statistical methods for estimation of parametric effect sizes from a series of experiments are generally restricted to univariate comparisons of standardized mean differences between two treatments. Multivariate methods are presented for the case in which effect size is a vector of standardized multivariate mean differences and the number of treatment groups is two or more. The proposed methods employ a vector of independent sample means for each response variable that leads to a covariance structure which depends only on correlations among the $p$ responses on each subject. Using weighted least squares theory and the assumption that the observations are from normally distributed populations, multivariate hypotheses analogous to common hypotheses used for testing effect sizes were formulated and tested for treatment effects which are correlated through a common control group, through multiple response variables observed on each subject, or both conditions.^ The asymptotic multivariate distribution for correlated effect sizes is obtained by extending univariate methods for estimating effect sizes which are correlated through common control groups. The joint distribution of vectors of effect sizes (from $p$ responses on each subject) from one treatment and one control group and from several treatment groups sharing a common control group are derived. Methods are given for estimation of linear combinations of effect sizes when certain homogeneity conditions are met, and for estimation of vectors of effect sizes and confidence intervals from $p$ responses on each subject. Computational illustrations are provided using data from studies of effects of electric field exposure on small laboratory animals. ^
Resumo:
Measures of agro-ecosystems genetic variability are essential to sustain scientific-based actions and policies tending to protect the ecosystem services they provide. To build the genetic variability datum it is necessary to deal with a large number and different types of variables. Molecular marker data is highly dimensional by nature, and frequently additional types of information are obtained, as morphological and physiological traits. This way, genetic variability studies are usually associated with the measurement of several traits on each entity. Multivariate methods are aimed at finding proximities between entities characterized by multiple traits by summarizing information in few synthetic variables. In this work we discuss and illustrate several multivariate methods used for different purposes to build the datum of genetic variability. We include methods applied in studies for exploring the spatial structure of genetic variability and the association of genetic data to other sources of information. Multivariate techniques allow the pursuit of the genetic variability datum, as a unifying notion that merges concepts of type, abundance and distribution of variability at gene level.
Resumo:
Two contrasting multivariate statistical methods, viz., principal components analysis (PCA) and cluster analysis were applied to the study of neuropathological variations between cases of Alzheimer's disease (AD). To compare the two methods, 78 cases of AD were analyzed, each characterised by measurements of 47 neuropathological variables. Both methods of analysis revealed significant variations between AD cases. These variations were related primarily to differences in the distribution and abundance of senile plaques (SP) and neurofibrillary tangles (NFT) in the brain. Cluster analysis classified the majority of AD cases into five groups which could represent subtypes of AD. However, PCA suggested that variation between cases was more continuous with no distinct subtypes. Hence, PCA may be a more appropriate method than cluster analysis in the study of neuropathological variations between AD cases.
Resumo:
The Large Hadron Collider, located at the CERN laboratories in Geneva, is the largest particle accelerator in the world. One of the main research fields at LHC is the study of the Higgs boson, the latest particle discovered at the ATLAS and CMS experiments. Due to the small production cross section for the Higgs boson, only a substantial statistics can offer the chance to study this particle properties. In order to perform these searches it is desirable to avoid the contamination of the signal signature by the number and variety of the background processes produced in pp collisions at LHC. Much account assumes the study of multivariate methods which, compared to the standard cut-based analysis, can enhance the signal selection of a Higgs boson produced in association with a top quark pair through a dileptonic final state (ttH channel). The statistics collected up to 2012 is not sufficient to supply a significant number of ttH events; however, the methods applied in this thesis will provide a powerful tool for the increasing statistics that will be collected during the next LHC data taking.
Resumo:
Univariate statistical control charts, such as the Shewhart chart, do not satisfy the requirements for process monitoring on a high volume automated fuel cell manufacturing line. This is because of the number of variables that require monitoring. The risk of elevated false alarms, due to the nature of the process being high volume, can present problems if univariate methods are used. Multivariate statistical methods are discussed as an alternative for process monitoring and control. The research presented is conducted on a manufacturing line which evaluates the performance of a fuel cell. It has three stages of production assembly that contribute to the final end product performance. The product performance is assessed by power and energy measurements, taken at various time points throughout the discharge testing of the fuel cell. The literature review performed on these multivariate techniques are evaluated using individual and batch observations. Modern techniques using multivariate control charts on Hotellings T2 are compared to other multivariate methods, such as Principal Components Analysis (PCA). The latter, PCA, was identified as the most suitable method. Control charts such as, scores, T2 and DModX charts, are constructed from the PCA model. Diagnostic procedures, using Contribution plots, for out of control points that are detected using these control charts, are also discussed. These plots enable the investigator to perform root cause analysis. Multivariate batch techniques are compared to individual observations typically seen on continuous processes. Recommendations, for the introduction of multivariate techniques that would be appropriate for most high volume processes, are also covered.
Resumo:
Unraveling the effect of selection vs. drift on the evolution of quantitative traits is commonly achieved by one of two methods. Either one contrasts population differentiation estimates for genetic markers and quantitative traits (the Q(st)-F(st) contrast) or multivariate methods are used to study the covariance between sets of traits. In particular, many studies have focused on the genetic variance-covariance matrix (the G matrix). However, both drift and selection can cause changes in G. To understand their joint effects, we recently combined the two methods into a single test (accompanying article by Martin et al.), which we apply here to a network of 16 natural populations of the freshwater snail Galba truncatula. Using this new neutrality test, extended to hierarchical population structures, we studied the multivariate equivalent of the Q(st)-F(st) contrast for several life-history traits of G. truncatula. We found strong evidence of selection acting on multivariate phenotypes. Selection was homogeneous among populations within each habitat and heterogeneous between habitats. We found that the G matrices were relatively stable within each habitat, with proportionality between the among-populations (D) and the within-populations (G) covariance matrices. The effect of habitat heterogeneity is to break this proportionality because of selection for habitat-dependent optima. Individual-based simulations mimicking our empirical system confirmed that these patterns are expected under the selective regime inferred. We show that homogenizing selection can mimic some effect of drift on the G matrix (G and D almost proportional), but that incorporating information from molecular markers (multivariate Q(st)-F(st)) allows disentangling the two effects.
Resumo:
We conducted this study to determine the relative influence of various mechanical and patient-related factors on the incidence of dislocation after primary total hip asthroplasty (THA). Of 2,023 THAs, 21 patients who had at least 1 dislocation were compared with a control group of 21 patients without dislocation, matched for age, gender, pathology, and year of surgery. Implant positioning, seniority of the surgeon, American Society of Anesthesiologists (ASA) score, and diminished motor coordination were recorded. Data analysis included univariate and multivariate methods. The dislocation risk was 6.9 times higher if total anteversion was not between 40 degrees and 60 degrees and 10 times higher in patients with high ASA scores. Surgeons should pay attention to total anteversion (cup and stem) of THA. The ASA score should be part of the preoperative assessment of the dislocation risk.