970 results for Multivariate data
Abstract:
Background: Protein-energy malnutrition (PEM) is common in people with end stage kidney disease (ESKD) undergoing maintenance haemodialysis (MHD) and correlates strongly with mortality. To this day, there is no gold standard for detecting PEM in patients on MHD. Aim of Study: The aim of this study was to evaluate whether Nutritional Risk Screening 2002 (NRS-2002), handgrip strength measurement, mid-upper arm muscle area (MUAMA), triceps skin fold measurement (TSF), serum albumin, normalised protein catabolic rate (nPCR), Kt/V and eKt/V, dry body weight, body mass index (BMI), age and time since start on MHD are relevant for assessing PEM in patients on MHD. Methods: The predictive value of the selected parameters on mortality and on mortality or weight loss of more than 5% was assessed. Quantitative data analysis of the 12 parameters in the same patients on MHD in autumn 2009 (n = 64) and spring 2011 (n = 40) with paired statistical analysis and multivariate logistic regression analysis was performed. Results: Paired data analysis showed a significant reduction of dry body weight, BMI and nPCR. Kt/Vtot did not change, whereas eKt/V and handgrip strength measurements were significantly higher in spring 2011. No changes were detected in TSF, serum albumin, NRS-2002 and MUAMA. Serum albumin was shown to be the only predictor of death and of the combined endpoint “death or weight loss of more than 5%”. Conclusion: We now screen patients biannually for serum albumin, nPCR, Kt/V, handgrip measurement of the shunt-free arm, dry body weight, age and time since initiation of MHD.
Abstract:
PURPOSE To identify the influence of fixed prosthesis type on biologic and technical complication rates in the context of screw versus cement retention. Furthermore, a multivariate analysis was conducted to determine which factors, when considered together, influence the complication and failure rates of fixed implant-supported prostheses. MATERIALS AND METHODS Electronic searches of MEDLINE (PubMed), EMBASE, and the Cochrane Library were conducted. Selected inclusion and exclusion criteria were used to limit the search. Data were analyzed statistically with simple and multivariate random-effects Poisson regressions. RESULTS Seventy-three articles qualified for inclusion in the study. Screw-retained prostheses showed a tendency toward and significantly more technical complications than cemented prostheses with single crowns and fixed partial prostheses, respectively. Resin chipping and ceramic veneer chipping had high mean event rates, at 10.04 and 8.95 per 100 years, respectively, for full-arch screwed prostheses. For "all fixed prostheses" (prosthesis type not reported or not known), significantly fewer biologic and technical complications were seen with screw retention. Multivariate analysis revealed a significantly greater incidence of technical complications with cemented prostheses. Full-arch prostheses, cantilevered prostheses, and "all fixed prostheses" had significantly higher complication rates than single crowns. A significantly greater incidence of technical and biologic complications was seen with cemented prostheses. CONCLUSION Screw-retained fixed partial prostheses demonstrated a significantly higher rate of technical complications and screw-retained full-arch prostheses demonstrated a notably high rate of veneer chipping. When "all fixed prostheses" were considered, significantly higher rates of technical and biologic complications were seen for cement-retained prostheses. 
Multivariate Poisson regression analysis failed to show a significant difference between screw- and cement-retained prostheses with respect to the incidence of failure but demonstrated a higher rate of technical and biologic complications for cement-retained prostheses. The incidence of technical complications was more dependent upon prosthesis and retention type than prosthesis or abutment material.
Abstract:
Syndromic surveillance (SyS) systems currently exploit various sources of health-related data, most of which are collected for purposes other than surveillance (e.g. economic). Several European SyS systems use data collected during meat inspection for syndromic surveillance of animal health, as some diseases may be more easily detected post-mortem than at their point of origin or during the ante-mortem inspection upon arrival at the slaughterhouse. In this paper we use simulation to evaluate the performance of a quasi-Poisson regression (also known as an improved Farrington) algorithm for the detection of disease outbreaks during post-mortem inspection of slaughtered animals. When parameterizing the algorithm based on the retrospective analyses of 6 years of historic data, the probability of detection was satisfactory for large (range 83-445 cases) outbreaks but poor for small (range 20-177 cases) outbreaks. Varying the amount of historical data used to fit the algorithm can help increase the probability of detection for small outbreaks. However, while the use of a 0·975 quantile generated a low false-positive rate, in most cases more than 50% of outbreak cases had already occurred at the time of detection. High variance observed in the whole carcass condemnations time-series, and lack of flexibility in terms of the temporal distribution of simulated outbreaks resulting from low reporting frequency (monthly), constitute major challenges for early detection of outbreaks in the livestock population based on meat inspection data. Reporting frequency should be increased in the future to improve timeliness of the SyS system, while increased sensitivity may be achieved by integrating meat inspection data into a multivariate system simultaneously evaluating multiple sources of data on livestock health.
Abstract:
The purpose of this study is to investigate the effects of predictor variable correlations and patterns of missingness with dichotomous and/or continuous data in small samples when missing data is multiply imputed. Missing data of predictor variables is multiply imputed under three different multivariate models: the multivariate normal model for continuous data, the multinomial model for dichotomous data and the general location model for mixed dichotomous and continuous data. Subsequent to the multiple imputation process, Type I error rates of the regression coefficients obtained with logistic regression analysis are estimated under various conditions of correlation structure, sample size, type of data and patterns of missing data. The distributional properties of average mean, variance and correlations among the predictor variables are assessed after the multiple imputation process. For continuous predictor data under the multivariate normal model, Type I error rates are generally within the nominal values with samples of size n = 100. Smaller samples of size n = 50 resulted in more conservative estimates (i.e., lower than the nominal value). Correlation and variance estimates of the original data are retained after multiple imputation with less than 50% missing continuous predictor data. For dichotomous predictor data under the multinomial model, Type I error rates are generally conservative, which in part is due to the sparseness of the data. The correlation structure for the predictor variables is not well retained on multiply-imputed data from small samples with more than 50% missing data with this model. For mixed continuous and dichotomous predictor data, the results are similar to those found under the multivariate normal model for continuous data and under the multinomial model for dichotomous data. 
With all data types, a fully-observed variable included with variables subject to missingness in the multiple imputation process and subsequent statistical analysis provided liberal (larger than nominal values) Type I error rates under a specific pattern of missing data. It is suggested that future studies focus on the effects of multiple imputation in multivariate settings with more realistic data characteristics and a variety of multivariate analyses, assessing both Type I error and power.
Abstract:
Current statistical methods for estimation of parametric effect sizes from a series of experiments are generally restricted to univariate comparisons of standardized mean differences between two treatments. Multivariate methods are presented for the case in which effect size is a vector of standardized multivariate mean differences and the number of treatment groups is two or more. The proposed methods employ a vector of independent sample means for each response variable that leads to a covariance structure which depends only on correlations among the $p$ responses on each subject. Using weighted least squares theory and the assumption that the observations are from normally distributed populations, multivariate hypotheses analogous to common hypotheses used for testing effect sizes were formulated and tested for treatment effects which are correlated through a common control group, through multiple response variables observed on each subject, or both conditions. The asymptotic multivariate distribution for correlated effect sizes is obtained by extending univariate methods for estimating effect sizes which are correlated through common control groups. The joint distributions of vectors of effect sizes (from $p$ responses on each subject) from one treatment and one control group and from several treatment groups sharing a common control group are derived. Methods are given for estimation of linear combinations of effect sizes when certain homogeneity conditions are met, and for estimation of vectors of effect sizes and confidence intervals from $p$ responses on each subject. Computational illustrations are provided using data from studies of effects of electric field exposure on small laboratory animals.
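As a rough illustration of what "a vector of standardized mean differences" means, the sketch below computes one standardized difference per response variable for a single treatment group and a single control group. It uses the familiar pooled-standard-deviation form of the univariate standardized mean difference, applied response by response; this is an assumption made for illustration and not necessarily the estimator derived in the work summarized above. The function name `effect_size_vector` and the toy data are hypothetical.

```python
import math
from statistics import mean, variance

def effect_size_vector(treatment, control):
    """Return one standardized mean difference per response variable.

    `treatment` and `control` are lists of subjects; each subject is a
    sequence of p responses. Each group needs at least two subjects and
    every response needs non-zero pooled variance.
    """
    p = len(treatment[0])
    n_t, n_c = len(treatment), len(control)
    d = []
    for j in range(p):
        t = [row[j] for row in treatment]
        c = [row[j] for row in control]
        # Pooled sample variance of response j across the two groups.
        pooled_var = ((n_t - 1) * variance(t) + (n_c - 1) * variance(c)) / (n_t + n_c - 2)
        d.append((mean(t) - mean(c)) / math.sqrt(pooled_var))
    return d

# Toy data (made up): 3 subjects per group, p = 2 responses each.
treatment = [(3.0, 10.0), (4.0, 12.0), (5.0, 14.0)]
control = [(1.0, 10.0), (2.0, 12.0), (3.0, 14.0)]
d = effect_size_vector(treatment, control)
```

Because the p responses are measured on the same subjects, the components of this vector are correlated, which is the covariance structure the abstract refers to; the sketch does not estimate that covariance.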
Abstract:
The role of clinical chemistry has traditionally been to evaluate acutely ill or hospitalized patients. Traditional statistical methods have serious drawbacks in that they use univariate techniques. To demonstrate alternative methodology, a multivariate analysis of covariance model was developed and applied to the data from the Cooperative Study of Sickle Cell Disease (CSSCD). The purpose of developing the model for the laboratory data from the CSSCD was to evaluate the comparability of the results from the different clinics. Several variables were incorporated into the model in order to control for possible differences among the clinics that might confound any real laboratory differences. Differences for LDH, alkaline phosphatase and SGOT were identified which will necessitate adjustments by clinic whenever these data are used. In addition, aberrant clinic values for LDH, creatinine and BUN were also identified. The use of any statistical technique including multivariate analysis without thoughtful consideration may lead to spurious conclusions that may not be corrected for some time, if ever. However, the advantages of multivariate analysis far outweigh its potential problems. If its use increases as it should, the applicability to the analysis of laboratory data in prospective patient monitoring, quality control programs, and interpretation of data from cooperative studies could well have a major impact on the health and well being of a large number of individuals.
Abstract:
Individuals with disabilities face numerous barriers to participation due to biological and physical characteristics of the disability as well as social and environmental factors. Participation can be impacted on all levels, from societal to activities of daily living, exercise, education, and interpersonal relationships. This study evaluated the impact of pain, mood, depression, quality of life and fatigue on participation for individuals with mobility impairments. This cross-sectional study derives from self-report data collected from a sample of wheelchair users. Bivariate correlational and multivariate analyses were employed to examine the relationship between pain, quality of life, positive and negative mood, fatigue, and depression with participation while controlling for relevant socio-demographic variables (sex, age, time with disability, race, and education). Results from the 122 respondents with mobility impairments demonstrated that after controlling for socio-demographic characteristics in the full model, 20% of the variance in participation scores was accounted for by pain, quality of life, positive and negative mood, and depression. Notably, quality of life emerged as the single variable that was significantly related to participation in the full model. Contrary to other studies, pain did not appear to significantly impact participation outcomes for wheelchair users in this sample. Participation is an emerging area of interest among rehabilitation and disability researchers, and results of this study provide compelling evidence that several psychosocial factors are related to participation. This area of inquiry warrants further study, as many of the psychosocial variables identified in this study (mood, depression, quality of life) may be amenable to intervention, which may also positively influence participation.
Abstract:
Maximizing data quality may be especially difficult in trauma-related clinical research. Strategies are needed to improve data quality and assess the impact of data quality on clinical predictive models. This study had two objectives. The first was to compare missing data between two multi-center trauma transfusion studies: a retrospective study (RS) using medical chart data with minimal data quality review and the PRospective Observational Multi-center Major Trauma Transfusion (PROMMTT) study with standardized quality assurance. The second objective was to assess the impact of missing data on clinical prediction algorithms by evaluating blood transfusion prediction models using PROMMTT data. RS (2005-06) and PROMMTT (2009-10) investigated trauma patients receiving ≥ 1 unit of red blood cells (RBC) from ten Level I trauma centers. Missing data were compared for 33 variables collected in both studies using mixed effects logistic regression (including random intercepts for study site). Massive transfusion (MT) patients received ≥ 10 RBC units within 24h of admission. Correct classification percentages for three MT prediction models were evaluated using complete case analysis and multiple imputation based on the multivariate normal distribution. A sensitivity analysis for missing data was conducted to estimate the upper and lower bounds of correct classification using assumptions about missing data under best and worst case scenarios. Most variables (17/33=52%) had <1% missing data in RS and PROMMTT. Of the remaining variables, 50% demonstrated less missingness in PROMMTT, 25% had less missingness in RS, and 25% were similar between studies. Missing percentages for MT prediction variables in PROMMTT ranged from 2.2% (heart rate) to 45% (respiratory rate). For variables missing >1%, study site was associated with missingness (all p≤0.021). Survival time predicted missingness for 50% of RS and 60% of PROMMTT variables. 
Complete case proportions for the MT models ranged from 41% to 88%. Complete case analysis and multiple imputation demonstrated similar correct classification results. Sensitivity analysis upper-lower bound ranges for the three MT models were 59-63%, 36-46%, and 46-58%. Prospective collection of ten-fold more variables with data quality assurance reduced overall missing data. Study site and patient survival were associated with missingness, suggesting that data were not missing completely at random, and complete case analysis may lead to biased results. Evaluating clinical prediction model accuracy may be misleading in the presence of missing data, especially with many predictor variables. The proposed sensitivity analysis estimating correct classification under upper (best case scenario)/lower (worst case scenario) bounds may be more informative than multiple imputation, which provided results similar to complete case analysis.
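The idea of bounding correct classification under best and worst case assumptions can be sketched in a few lines. The abstract does not spell out how the bounds were computed, so the version below makes a simple assumption: every patient with missing predictor data is counted as correctly classified in the best case and as misclassified in the worst case. The function name `classification_bounds` and the counts in the usage lines are hypothetical and are not taken from the study.

```python
def classification_bounds(n_correct, n_complete, n_missing):
    """Return (lower, complete_case, upper) proportions of correct classification.

    n_correct  -- complete cases the model classified correctly
    n_complete -- patients with all predictors observed
    n_missing  -- patients with at least one predictor missing
    """
    if n_complete <= 0 or n_missing < 0 or not 0 <= n_correct <= n_complete:
        raise ValueError("counts are inconsistent")
    n_total = n_complete + n_missing
    lower = n_correct / n_total                 # worst case: all incomplete cases wrong
    upper = (n_correct + n_missing) / n_total   # best case: all incomplete cases right
    complete_case = n_correct / n_complete      # ignores incomplete cases entirely
    return lower, complete_case, upper

# Made-up counts: 80 complete cases (60 classified correctly), 20 incomplete.
lower, complete_case, upper = classification_bounds(60, 80, 20)
```

The width of the interval between `lower` and `upper` grows with the share of incomplete cases, which shows how much a single complete-case estimate can hide.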
Abstract:
The reduction in sea ice along the SE Greenland coast during the last century has severely impacted ice-rafting to this area. In order to reconstruct ice-rafting and oceanographic conditions in the area of Denmark Strait during the last ~150 years, we conducted a multiproxy study on three short (20 cm) sediment cores from outer Kangerdlugssuaq Trough (~300 m water depth). The proxy-based data obtained have been compared with historical and instrumental data to gain a better understanding of the ice sheet-ocean interactions in the area. A robust chronology has been developed based on 210Pb and 137Cs measurements on core PO175GKC#9 (~66.2°N, 32°W) and expanded to the two adjacent cores based on correlations between calcite weight percent records. Our proxy records include sea-ice and phytoplankton biomarkers, and a variety of mineralogical determinations based on the <2 mm sediment fraction, including identification with quantitative X-ray diffraction, ice-rafted debris counts on the 63-150 µm sand fraction, and source identifications based on the composition of Fe oxides in the 45-250 µm fraction. A multivariate statistical analysis indicated significant correlations between our proxy records and historical data, especially with the mean annual temperature data from Stykkishólmur (Iceland) and the storis index (historical observations of sea-ice export via the East Greenland Current). In particular, the biological proxies (calcite weight percent, IP25, and total organic carbon %) showed significant linkage with the storis index. Our records show two distinct intervals in the recent history of the SE Greenland coast. The first of these (AD 1850-1910) shows predominantly perennial sea-ice conditions in the area, while the second (AD 1910-1990) shows more seasonally open water conditions.
Abstract:
Measures of agro-ecosystem genetic variability are essential to support science-based actions and policies aimed at protecting the ecosystem services they provide. To build the genetic variability datum it is necessary to deal with a large number and different types of variables. Molecular marker data are high-dimensional by nature, and frequently additional types of information are obtained, such as morphological and physiological traits. Genetic variability studies are therefore usually associated with the measurement of several traits on each entity. Multivariate methods are aimed at finding proximities between entities characterized by multiple traits by summarizing information in a few synthetic variables. In this work we discuss and illustrate several multivariate methods used for different purposes to build the datum of genetic variability. We include methods applied in studies for exploring the spatial structure of genetic variability and the association of genetic data with other sources of information. Multivariate techniques allow the pursuit of the genetic variability datum as a unifying notion that merges concepts of type, abundance and distribution of variability at gene level.
Abstract:
This work is a multidisciplinary environmental study that provides new insights into the relationships between sediment organic matter characteristics and polybrominated diphenyl ether (PBDE) concentrations. The aim of the present multivariate study was to correlate factors influencing PBDE accumulation in sediment by using principal component analysis (PCA). Organic matter studies by Fourier Transform-Infrared spectroscopy and physicochemical analyses (Total Organic Carbon, pH, electrical conductivity) of sediment samples were considered for PCA. Samples were collected from an artificial irrigation network in the Mendoza River irrigation areas. PCA provided a comprehensive analysis of the studied variables, identifying two components that explained 63% of the data variance. Those factors were mainly associated with the degree of organic matter degradation, which represents a new insight into the relationships between organic matter in sediments and the fate of PBDEs. In this sense it was possible to determine that not only the content but also the type of organic matter (chemical structure) could be relevant when evaluating PBDE accumulation and transport in the environment. Typification of organic matter may be a useful tool to predict the areas where PBDEs are more likely to accumulate, as well as sediment transportation capability.
Abstract:
Sites 1085, 1086 and 1087 were drilled off South Africa during Ocean Drilling Program (ODP) Leg 175 to investigate the Benguela Current System. While previous studies have focused on reconstructing the Neogene palaeoceanographic and palaeoclimatic history of these sites, palynology has been largely ignored, except for the Late Pliocene and Quaternary. This study presents palynological data from the upper Middle Miocene to lower Upper Pliocene sediments in Holes 1085A, 1086A and 1087C that provide complementary information about the history of the area. Abundant and diverse marine palynomorphs (mainly dinoflagellate cysts), rare spores and pollen, and dispersed organic matter have been recovered. Multivariate statistical analysis of dispersed organic matter identified three palynofacies assemblages (A, B, C) in the most continuous hole (1085A), and they were defined primarily by amorphous organic matter (AOM), and to a lesser extent black debris, structured phytoclasts, degraded phytoclasts, and marine palynomorphs. Ecostratigraphic interpretation based on dinoflagellate cyst, spore-pollen and palynofacies data allowed us to identify several palaeoceanographic and palaeoclimatic signals. First, the late Middle Miocene was subtropical, and sediments contained the highest percentages of land-derived organic matter, even though they are rich in AOM (palynofacies assemblage A). Second, the Late Miocene was cool-temperate and characterized by periods of intensified upwelling, increase in productivity, abundant and diverse oceanic dinoflagellate cysts, and the highest percentages of AOM (palynofacies assemblage C). Third, the Early to early Late Pliocene was warm-temperate with some dry intervals (increase in grass pollen) and intensified upwelling. 
Fourth, the Neogene "carbonate crash" identified in other southern oceans was recognized in two palynofacies A samples in Hole 1085A that are nearly barren of dinoflagellate cysts: one Middle Miocene sample (590 mbsf, 13.62 Ma) and one Upper Miocene sample (355 mbsf, 6.5 Ma). Finally, the extremely low percentages of pollen suggest sparse vegetation on the adjacent landmass, and Namib desert conditions were already in existence during the late Middle Miocene.
Abstract:
The Global Ocean Sampling (GOS) expedition is currently the largest and geographically most comprehensive metagenomic dataset, including samples from the Atlantic, Pacific, and Indian Oceans. This study makes use of the wide range of environmental conditions and habitats encompassed within the GOS sites in order to investigate the ecological structuring of bacterial and archaeal taxon ranks. Community structures based on taxonomically classified 16S ribosomal RNA (rRNA) gene fragments at phylum, class, order, family, and genus rank levels were examined using multivariate statistical analysis, and the results were inspected in the context of oceanographic environmental variables and structured habitat classifications. At all taxon rank levels, community structures of neritic, oceanic, estuarine biomes, as well as other exotic biomes (salt marsh, lake, mangrove), were readily distinguishable from each other. A strong structuring of the communities with chlorophyll a concentration and a weaker yet significant structuring with temperature and salinity were observed. Furthermore, there were significant correlations between community structures and habitat classification. These results were used for further investigation of one-to-one relationships between taxa and environment and provided indications for ecological preferences shaped by primary production for both cultured and uncultured bacterial and archaeal clades.
Abstract:
Increasing amounts of data are collected in most areas of research and application. The degree to which these data can be accessed, analyzed, and retrieved is decisive for progress in fields such as scientific research or industrial production. We present a novel methodology supporting content-based retrieval and exploratory search in repositories of multivariate research data. In particular, our methods are able to describe two-dimensional functional dependencies in research data, e.g. the relationship between inflation and unemployment in economics. Our basic idea is to use feature vectors based on the goodness-of-fit of a set of regression models to describe the data mathematically. We call this approach Regressional Features and use it for content-based search and, since our approach motivates an intuitive definition of interestingness, for exploring the most interesting data. We apply our method to sizeable real-world research datasets, showing the usefulness of our approach for user-centered access to research data in a Digital Library system.
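The idea of describing a two-variable dataset by how well a set of regression models fits it can be sketched as follows. This is a minimal illustration and not the authors' implementation: the model set (straight-line fits of y against x, x squared, log x and 1/x), the use of R² as the goodness-of-fit score, the Euclidean distance used for retrieval, and all names are assumptions. The transforms require strictly positive x values.

```python
import math

def r_squared(u, y):
    """R^2 of the least-squares line y ~ a + b*u (squared Pearson correlation)."""
    n = len(u)
    mu, my = sum(u) / n, sum(y) / n
    suu = sum((a - mu) ** 2 for a in u)
    syy = sum((b - my) ** 2 for b in y)
    suy = sum((a - mu) * (b - my) for a, b in zip(u, y))
    if suu == 0 or syy == 0:
        return 0.0  # a constant series carries no dependency to describe
    return (suy * suy) / (suu * syy)

# Hypothetical model set: each entry transforms x before a straight-line fit.
MODELS = {
    "linear": lambda x: x,
    "square": lambda x: x * x,
    "logarithmic": math.log,
    "reciprocal": lambda x: 1.0 / x,
}

def regressional_features(xs, ys):
    """Feature vector: one goodness-of-fit score per model, in MODELS order."""
    return [r_squared([f(x) for x in xs], ys) for f in MODELS.values()]

def feature_distance(f1, f2):
    """Euclidean distance between two feature vectors, for similarity search."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
linear_features = regressional_features(xs, [2 * x + 1 for x in xs])
log_features = regressional_features(xs, [math.log(x) for x in xs])
```

Two datasets whose feature vectors are close under `feature_distance` show a similar functional shape regardless of their units or scale, which is what makes such vectors usable as a retrieval key.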