957 resultados para Multivariate statistical methods
Resumo:
Variable number of tandem repeats (VNTR) are genetic loci at which short sequence motifs are found repeated different numbers of times among chromosomes. To explore the potential utility of VNTR loci in evolutionary studies, I have conducted a series of studies to address the following questions: (1) What are the population genetic properties of these loci? (2) What are the mutational mechanisms of repeat number change at these loci? (3) Can DNA profiles be used to measure the relatedness between a pair of individuals? (4) Can DNA fingerprint be used to measure the relatedness between populations in evolutionary studies? (5) Can microsatellite and short tandem repeat (STR) loci which mutate stepwisely be used in evolutionary analyses?^ A large number of VNTR loci typed in many populations were studied by means of statistical methods developed recently. The results of this work indicate that there is no significant departure from Hardy-Weinberg expectation (HWE) at VNTR loci in most of the human populations examined, and the departure from HWE in some VNTR loci are not solely caused by the presence of population sub-structure.^ A statistical procedure is developed to investigate the mutational mechanisms of VNTR loci by studying the allele frequency distributions of these loci. Comparisons of frequency distribution data on several hundreds VNTR loci with the predictions of two mutation models demonstrated that there are differences among VNTR loci grouped by repeat unit sizes.^ By extending the ITO method, I derived the distribution of the number of shared bands between individuals with any kinship relationship. A maximum likelihood estimation procedure is proposed to estimate the relatedness between individuals from the observed number of shared bands between them.^ It was believed that classical measures of genetic distance are not applicable to analysis of DNA fingerprints which reveal many minisatellite loci simultaneously in the genome, because the information regarding underlying alleles and loci is not available. I proposed a new measure of genetic distance based on band sharing between individuals that is applicable to DNA fingerprint data.^ To address the concern that microsatellite and STR loci may not be useful for evolutionary studies because of the convergent nature of their mutation mechanisms, by a theoretical study as well as by computer simulation, I conclude that the possible bias caused by the convergent mutations can be corrected, and a novel measure of genetic distance that makes the correction is suggested. In summary, I conclude that hypervariable VNTR loci are useful in evolutionary studies of closely related populations or species, especially in the study of human evolution and the history of geographic dispersal of Homo sapiens. (Abstract shortened by UMI.) ^
Resumo:
OBJECTIVES: The aim of the study was to assess whether prospective follow-up data within the Swiss HIV Cohort Study can be used to predict patients who stop smoking; or among smokers who stop, those who start smoking again. METHODS: We built prediction models first using clinical reasoning ('clinical models') and then by selecting from numerous candidate predictors using advanced statistical methods ('statistical models'). Our clinical models were based on literature that suggests that motivation drives smoking cessation, while dependence drives relapse in those attempting to stop. Our statistical models were based on automatic variable selection using additive logistic regression with component-wise gradient boosting. RESULTS: Of 4833 smokers, 26% stopped smoking, at least temporarily; because among those who stopped, 48% started smoking again. The predictive performance of our clinical and statistical models was modest. A basic clinical model for cessation, with patients classified into three motivational groups, was nearly as discriminatory as a constrained statistical model with just the most important predictors (the ratio of nonsmoking visits to total visits, alcohol or drug dependence, psychiatric comorbidities, recent hospitalization and age). A basic clinical model for relapse, based on the maximum number of cigarettes per day prior to stopping, was not as discriminatory as a constrained statistical model with just the ratio of nonsmoking visits to total visits. CONCLUSIONS: Predicting smoking cessation and relapse is difficult, so that simple models are nearly as discriminatory as complex ones. Patients with a history of attempting to stop and those known to have stopped recently are the best candidates for an intervention.
Resumo:
Background: Disturbed interpersonal communication is a core problem in schizophrenia. Patients with schizophrenia often appear disconnected and "out of sync" when interacting with others. This may involve perception, cognition, motor behavior, and nonverbal expressiveness. Although well-known from clinical observation, mainstream research has neglected this area. Corresponding theoretical concepts, statistical methods, and assessment were missing. In recent research, however, it has been shown that objective, video-based measures of nonverbal behavior can be used to reliably quantify nonverbal behavior in schizophrenia. Newly developed algorithms allow for a calculation of movement synchrony. We found that the objective amount of movement of patients with schizophrenia during social interactions was closely related to the symptom profiles of these patients (Kupper et al., 2010). In addition and above the mere amount of movement, the degree of synchrony between patients and healthy interactants may be indicative of various problems in the domain of interpersonal communication and social cognition. Methods: Based on our earlier study, head movement synchrony was assessed objectively (using Motion Energy Analysis, MEA) in 378 brief, videotaped role-play scenes involving 27 stabilized outpatients diagnosed with paranoid-type schizophrenia. Results: Lower head movement synchrony was indicative of symptoms (negative symptoms, but also of conceptual disorganization and lack of insight), verbal memory, patients’ self-evaluation of competence, and social functioning. Many of these relationships remained significant even when corrected for the amount of movement of the patients. Conclusion: The results suggest that nonverbal synchrony may be an objective and sensitive indicator of the severity of symptoms, cognition and social functioning.
Resumo:
The diversity of populations in domestic species offers great opportunities to study genome response to selection. The recently published Sheep HapMap dataset is a great example of characterization of the world wide genetic diversity in sheep. In this study, we re-analyzed the Sheep HapMap dataset to identify selection signatures in worldwide sheep populations. Compared to previous analyses, we made use of statistical methods that (i) take account of the hierarchical structure of sheep populations, (ii) make use of linkage disequilibrium information and (iii) focus specifically on either recent or older selection signatures. We show that this allows pinpointing several new selection signatures in the sheep genome and distinguishing those related to modern breeding objectives and to earlier post-domestication constraints. The newly identified regions, together with the ones previously identified, reveal the extensive genome response to selection on morphology, color and adaptation to new environments.
Resumo:
1H high resolution magic angle spinning (HR-MAS) NMR spectroscopy was applied in combination with multivariate statistical analyses to study the metabolic response of whole cells to the treatment with a hexacationic ruthenium metallaprism [1]6+ as potential anticancer drug. Human ovarian cancer cells (A2780), the corresponding cisplatin resistant cells (A2780cisR), and human embryonic kidney cells (HEK-293) were each incubated for 24 h and 72 h with [1]6+ and compared to untreated cells. Different responses were obtained depending on the cell type and incubation time. Most pronounced changes were found for lipids, choline containing compounds, glutamate and glutathione, nucleotide sugars, lactate, and some amino acids. Possible contributions of these metabolites to physiologic processes are discussed. The time-dependent metabolic response patterns suggest that A2780 cells on one hand and HEK-293 cells and A2780cisR cells on the other hand may follow different cell death pathways and exist in different temporal stages thereof.
Resumo:
The role of endothelial progenitor cells (EPCs) in peripheral artery disease (PAD) remains unclear. We hypothesized that EPC mobilization and function play a central role in the development of endothelial dysfunction and directly influence the degree of atherosclerotic burden in peripheral artery vessels. The number of circulating EPCs, defined as CD34(+)/KDR(+) cells, were assessed by flow cytometry in 91 subjects classified according to a predefined sample size of 31 non-diabetic PAD patients, 30 diabetic PAD patients, and 30 healthy volunteers. Both PAD groups had undergone endovascular treatment in the past. As a functional parameter, EPC colony-forming units were determined ex vivo. Apart from a broad laboratory analysis, a series of clinical measures using the ankle-brachial index (ABI), flow-mediated dilatation (FMD) and carotid intima-media thickness (cIMT) were investigated. A significant reduction of EPC counts and proliferation indices in both PAD groups compared to healthy subjects were observed. Low EPC number and pathological findings in the clinical assessment were strongly correlated to the group allocation. Multivariate statistical analysis revealed these findings to be independent predictors of disease appearance. Linear regression analysis showed the ABI to be a predictor of circulating EPC number (p=0.02). Moreover, the functionality of EPCs was correlated by linear regression (p=0.017) to cIMT. The influence of diabetes mellitus on EPCs in our study has to be considered marginal in already disease-affected patients. This study demonstrated that EPCs could predict the prevalence and severity of symptomatic PAD, with ABI as the determinant of the state of EPC populations in disease-affected groups.
Resumo:
With the ongoing shift in the computer graphics industry toward Monte Carlo rendering, there is a need for effective, practical noise-reduction techniques that are applicable to a wide range of rendering effects and easily integrated into existing production pipelines. This course surveys recent advances in image-space adaptive sampling and reconstruction algorithms for noise reduction, which have proven very effective at reducing the computational cost of Monte Carlo techniques in practice. These approaches leverage advanced image-filtering techniques with statistical methods for error estimation. They are attractive because they can be integrated easily into conventional Monte Carlo rendering frameworks, they are applicable to most rendering effects, and their computational overhead is modest.
Resumo:
Despite many researches on development in education and psychology, not often is the methodology tested with real data. A major barrier to test the growth model is that the design of study includes repeated observations and the nature of the growth is nonlinear. The repeat measurements on a nonlinear model require sophisticated statistical methods. In this study, we present mixed effects model in a negative exponential curve to describe the development of children's reading skills. This model can describe the nature of the growth on children's reading skills and account for intra-individual and inter-individual variation. We also apply simple techniques including cross-validation, regression, and graphical methods to determine the most appropriate curve for data, to find efficient initial values of parameters, and to select potential covariates. We illustrate with an example that motivated this research: a longitudinal study of academic skills from grade 1 to grade 12 in Connecticut public schools. ^
Resumo:
Few studies have investigated causal pathways linking psychosocial factors to each other and to screening mammography. Conflicting hypotheses exist in the theoretic literature regarding the role and importance of subjective norms, a person's perceived social pressure to perform the behavior and his/her motivation to comply. The Theory of Reasoned Action (TRA) hypothesizes that subjective norms directly affect intention; while the Transtheoretical Model (TTM) hypothesizes that attitudes mediate the influence of subjective norms on stage of change. No one has examined which hypothesis best predicts the effect of subjective norms on mammography intention and stage of change. Two statistical methods are available for testing mediation, sequential regression analysis (SRA) and latent variable structural equation modeling (LVSEM); however, software to apply LVSEM to dichotomous variables like intention has only recently become available. No one has compared the methods to determine whether or not they yield similar results for dichotomous variables. ^ Study objectives were to: (1) determine whether the effect of subjective norms on mammography intention and stage of change are mediated by pros and cons; and (2) compare mediation results from the SRA and LVSEM approaches when the outcome is dichotomous. We conducted a secondary analysis of data from a national sample of women veterans enrolled in Project H.O.M.E. (H&barbelow;ealthy O&barbelow;utlook on the M&barbelow;ammography E&barbelow;xperience), a behavioral intervention trial. ^ Results showed that the TTM model described the causal pathways better than the TRA one; however, we found support for only one of the TTM causal mechanisms. Cons was the sole mediator. The mediated effect of subjective norms on intention and stage of change by cons was very small. These findings suggest that interventionists focus their efforts on reducing negative attitudes toward mammography when resources are limited. ^ Both the SRA and LVSEM methods provided evidence for complete mediation, and the direction, magnitude, and standard errors of the parameter estimates were very similar. Because SRA parameter estimates were not biased toward the null, we can probably assume negligible measurement error in the independent and mediator variables. Simulation studies are needed to further our understanding of how these two methods perform under different data conditions. ^
Resumo:
Linkage and association studies are major analytical tools to search for susceptibility genes for complex diseases. With the availability of large collection of single nucleotide polymorphisms (SNPs) and the rapid progresses for high throughput genotyping technologies, together with the ambitious goals of the International HapMap Project, genetic markers covering the whole genome will be available for genome-wide linkage and association studies. In order not to inflate the type I error rate in performing genome-wide linkage and association studies, multiple adjustment for the significant level for each independent linkage and/or association test is required, and this has led to the suggestion of genome-wide significant cut-off as low as 5 × 10 −7. Almost no linkage and/or association study can meet such a stringent threshold by the standard statistical methods. Developing new statistics with high power is urgently needed to tackle this problem. This dissertation proposes and explores a class of novel test statistics that can be used in both population-based and family-based genetic data by employing a completely new strategy, which uses nonlinear transformation of the sample means to construct test statistics for linkage and association studies. Extensive simulation studies are used to illustrate the properties of the nonlinear test statistics. Power calculations are performed using both analytical and empirical methods. Finally, real data sets are analyzed with the nonlinear test statistics. Results show that the nonlinear test statistics have correct type I error rates, and most of the studied nonlinear test statistics have higher power than the standard chi-square test. This dissertation introduces a new idea to design novel test statistics with high power and might open new ways to mapping susceptibility genes for complex diseases. ^
Resumo:
Next-generation DNA sequencing platforms can effectively detect the entire spectrum of genomic variation and is emerging to be a major tool for systematic exploration of the universe of variants and interactions in the entire genome. However, the data produced by next-generation sequencing technologies will suffer from three basic problems: sequence errors, assembly errors, and missing data. Current statistical methods for genetic analysis are well suited for detecting the association of common variants, but are less suitable to rare variants. This raises great challenge for sequence-based genetic studies of complex diseases.^ This research dissertation utilized genome continuum model as a general principle, and stochastic calculus and functional data analysis as tools for developing novel and powerful statistical methods for next generation of association studies of both qualitative and quantitative traits in the context of sequencing data, which finally lead to shifting the paradigm of association analysis from the current locus-by-locus analysis to collectively analyzing genome regions.^ In this project, the functional principal component (FPC) methods coupled with high-dimensional data reduction techniques will be used to develop novel and powerful methods for testing the associations of the entire spectrum of genetic variation within a segment of genome or a gene regardless of whether the variants are common or rare.^ The classical quantitative genetics suffer from high type I error rates and low power for rare variants. To overcome these limitations for resequencing data, this project used functional linear models with scalar response to develop statistics for identifying quantitative trait loci (QTLs) for both common and rare variants. To illustrate their applications, the functional linear models were applied to five quantitative traits in Framingham heart studies. ^ This project proposed a novel concept of gene-gene co-association in which a gene or a genomic region is taken as a unit of association analysis and used stochastic calculus to develop a unified framework for testing the association of multiple genes or genomic regions for both common and rare alleles. The proposed methods were applied to gene-gene co-association analysis of psoriasis in two independent GWAS datasets which led to discovery of networks significantly associated with psoriasis.^
Resumo:
Statistical methods are developed which assess survival data for two attributes; (1) prolongation of life, (2) quality of life. Health state transition probabilities correspond to prolongation of life and are modeled as a discrete-time semi-Markov process. Imbedded within the sojourn time of a particular health state are the quality of life transitions. They reflect events which differentiate perceptions of pain and suffering over a fixed time period. Quality of life transition probabilities are derived from the assumptions of a simple Markov process. These probabilities depend on the health state currently occupied and the next health state to which a transition is made. Utilizing the two forms of attributes the model has the capability to estimate the distribution of expected quality adjusted life years (in addition to the distribution of expected survival times). The expected quality of life can also be estimated within the health state sojourn time making more flexible the assessment of utility preferences. The methods are demonstrated on a subset of follow-up data from the Beta Blocker Heart Attack Trial (BHAT). This model contains the structure necessary to make inferences when assessing a general survival problem with a two dimensional outcome. ^
Resumo:
An investigation was undertaken to determine the chemical characterization of inhalable particulate matter in the Houston area, with special emphasis on source identification and apportionment of outdoor and indoor atmospheric aerosols using multivariate statistical analyses.^ Fine (<2.5 (mu)m) particle aerosol samples were collected by means of dichotomous samplers at two fixed site (Clear Lake and Sunnyside) ambient monitoring stations and one mobile monitoring van in the Houston area during June-October 1981 as part of the Houston Asthma Study. The mobile van allowed particulate sampling to take place both inside and outside of twelve homes.^ The samples collected for 12-h sampling on a 7 AM-7 PM and 7 PM-7 AM (CDT) schedule were analyzed for mass, trace elements, and two anions. Mass was determined gravimetrically. An energy-dispersive X-ray fluorescence (XRF) spectrometer was used for determination of elemental composition. Ion chromatography (IC) was used to determine sulfate and nitrate.^ Average chemical compositions of fine aerosol at each site were presented. Sulfate was found to be the largest single component in the fine fraction mass, comprising approximately 30% of the fine mass outdoors and 12% indoors, respectively.^ Principal components analysis (PCA) was applied to identify sources of aerosols and to assess the role of meteorological factors on the variation in particulate samples. The results suggested that meteorological parameters were not associated with sources of aerosol samples collected at these Houston sites.^ Source factor contributions to fine mass were calculated using a combination of PCA and stepwise multivariate regression analysis. It was found that much of the total fine mass was apparently contributed by sulfate-related aerosols. The average contributions to the fine mass coming from the sulfate-related aerosols were 56% of the Houston outdoor ambient fine particulate matter and 26% of the indoor fine particulate matter.^ Characterization of indoor aerosol in residential environments was compared with the results for outdoor aerosols. It was suggested that much of the indoor aerosol may be due to outdoor sources, but there may be important contributions from common indoor sources in the home environment such as smoking and gas cooking. ^
Resumo:
The infant mortality rate for non-Hispanic Black infants in the U.S. is 13.63 deaths per 1,000 live births while the IMR for non-Hispanic White persons in the U.S. is 5.76 deaths per 1,000 live births. Black women are 2 times as likely as White women to deliver preterm infants and Black women are 2 times as likely as White women to deliver low birth weight infants (weighing less than 2,500 grams at birth). Differential underlying risk factors among mothers of different racial/ethnic groups for delivering pre-term and low birth weight infants have been historically accepted as the cause of racial disparities in IMRs. However, differential underlying risk status may not be the only major causative factor. Differential or unequal access to and provision of care is widely speculated to be a leading contributing factor to the wide racial disparity in infant mortality.2 This paper conducts a systematic review of existing literature investigating racial disparities in obstetrical care provided by healthcare practitioners to evaluate whether inequities in healthcare services provided to pregnant mothers and their neonates exist. The search terms "racial disparities obstetrical care," "racial differences quality of prenatal care," and "infant mortality racial disparities" were entered into the EBSCO Medline, Ovid Medline, PubMed, and Academic Search Complete databases, and articles between years 1990–2011 were selected for abstract review. The only articles included were those that used statistical methods to assess whether racial inequalities were present in the obstetrical services provided to pregnant women. My literature search returned 5 articles. Four of the five studies yielded significant racial differences in obstetrical care. However, the one study that used a large, nationally representative valid sample did not represent significant differences. Thus, this review provides initial evidence for racial disparities in obstetrical care, but concludes that more studies are needed in this area. Not all of the studies reviewed were consistent in the use and measurement of services, and not all studies were significant. The policy and public health implications of possible racial disparities in obstetrical care include the need to develop standard of care protocols for ALL obstetrical patients across the United States to minimize and/or eliminate the inequities and differences in obstetrical services provided.^
Resumo:
In the biomedical studies, the general data structures have been the matched (paired) and unmatched designs. Recently, many researchers are interested in Meta-Analysis to obtain a better understanding from several clinical data of a medical treatment. The hybrid design, which is combined two data structures, may create the fundamental question for statistical methods and the challenges for statistical inferences. The applied methods are depending on the underlying distribution. If the outcomes are normally distributed, we would use the classic paired and two independent sample T-tests on the matched and unmatched cases. If not, we can apply Wilcoxon signed rank and rank sum test on each case. ^ To assess an overall treatment effect on a hybrid design, we can apply the inverse variance weight method used in Meta-Analysis. On the nonparametric case, we can use a test statistic which is combined on two Wilcoxon test statistics. However, these two test statistics are not in same scale. We propose the Hybrid Test Statistic based on the Hodges-Lehmann estimates of the treatment effects, which are medians in the same scale.^ To compare the proposed method, we use the classic meta-analysis T-test statistic on the combined the estimates of the treatment effects from two T-test statistics. Theoretically, the efficiency of two unbiased estimators of a parameter is the ratio of their variances. With the concept of Asymptotic Relative Efficiency (ARE) developed by Pitman, we show ARE of the hybrid test statistic relative to classic meta-analysis T-test statistic using the Hodges-Lemann estimators associated with two test statistics.^ From several simulation studies, we calculate the empirical type I error rate and power of the test statistics. The proposed statistic would provide effective tool to evaluate and understand the treatment effect in various public health studies as well as clinical trials.^