5 results for Variable length Markov chains
in DigitalCommons@The Texas Medical Center
Abstract:
The tobacco-specific nitrosamine 4-(methylnitrosamino)-1-(3-pyridyl)-1-butanone (NNK) is a known lung carcinogen. Because the cytokinesis-blocked micronucleus (CBMN) assay has been found to be extremely sensitive to NNK-induced genetic damage, it is a potentially important predictor of lung cancer risk. However, the association between lung cancer and NNK-induced genetic damage measured by the CBMN assay has not been rigorously examined. This research develops a methodology to model chromosomal changes under NNK-induced genetic damage in a logistic regression framework in order to predict the occurrence of lung cancer. Because these chromosomal changes are usually observed only briefly owing to laboratory cost and time, a resampling technique was applied to generate a Markov chain of normal and damaged cells for each individual. A joint likelihood was established between the resampled Markov chains and a logistic regression model that includes the transition probabilities of the chain as covariates. Maximum likelihood estimation was used to carry out the statistical tests for comparison. The ability of this approach to increase the discriminating power for predicting lung cancer was compared with a baseline "non-genetic" model. Our method offers a way to understand the association between dynamic cell information and lung cancer. Our study indicates that the extent of DNA damage/non-damage measured with the CBMN assay provides critical information for public health studies of lung cancer risk. This novel statistical method can simultaneously estimate the process of DNA damage/non-damage and its relationship with lung cancer for each individual.
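As a rough illustration of the "transition probabilities as covariates" idea above, the sketch below estimates the transition probabilities of a two-state (normal/damaged) chain by row-normalizing transition counts, which is the usual maximum likelihood estimate for a first-order chain. It does not reproduce the abstract's resampling step or joint likelihood. The function name `estimate_transition_matrix` and the cell sequences are hypothetical, chosen only for illustration.

```python
from collections import Counter


def estimate_transition_matrix(sequences, states):
    """Row-normalized transition counts for a first-order Markov chain."""
    counts = Counter()
    for seq in sequences:
        # Count each consecutive (from_state, to_state) pair.
        for a, b in zip(seq, seq[1:]):
            counts[(a, b)] += 1
    matrix = {}
    for a in states:
        row_total = sum(counts[(a, b)] for b in states)
        # A state never left gets an all-zero row rather than a division error.
        matrix[a] = {
            b: (counts[(a, b)] / row_total if row_total else 0.0)
            for b in states
        }
    return matrix


# Hypothetical observations for one individual (not data from the study).
states = ["normal", "damaged"]
cell_sequences = [
    ["normal", "normal", "damaged", "normal"],
    ["damaged", "damaged", "normal", "normal"],
]
P = estimate_transition_matrix(cell_sequences, states)
```

Under the abstract's framework, entries such as `P["normal"]["damaged"]` would be the kind of per-individual quantity that enters the logistic regression as a covariate.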
Abstract:
Calcium levels in spines play a significant role in determining the sign and magnitude of synaptic plasticity. The magnitude of calcium influx into spines is highly dependent on influx through N-methyl-D-aspartate (NMDA) receptors, and therefore depends on the number of postsynaptic NMDA receptors in each spine. We have previously calculated how the number of postsynaptic NMDA receptors determines the mean and variance of calcium transients in the postsynaptic density, and how this alters the shape of plasticity curves. However, the number of NMDA receptors in the postsynaptic density is not well known. Anatomical methods for estimating the number of NMDA receptors produce estimates that are very different from those produced by physiological techniques. The physiological techniques are based on the statistics of synaptic transmission, and it is difficult to estimate their precision experimentally. In this paper we use stochastic simulations to test the validity of a physiological estimation technique based on failure analysis. We find that the method is likely to underestimate the number of postsynaptic NMDA receptors, explain the source of the error, and re-derive a more precise estimation technique. We also show that neither the original failure analysis nor our improved formulas are robust to small estimation errors in key parameters.
Abstract:
Prostate cancer (PC) is a significant economic and health burden in the U.S. and Europe, but its causes are largely unknown. The most significant risk factors (after gender) are age and family history of the disease. A gene with high penetrance but low frequency on chromosome 1q, HPC1, has been suggested to cause a proportion of the familial aggregation of PC, but other more common genes, conferring less risk, are also thought to contribute to disease predisposition. We have pursued a strategy to study both types of genetic risk in PC. To identify high-penetrance genes, affected men from thirteen families have been genotyped for genetic linkage analysis at six microsatellite markers spanning 45 cM of 1q24-25. Neither LOD score nor non-parametric statistics provide significant support for HPC1 in this genomic region, although 3 of the families did combine to produce a LOD score of 0.9. These families will be included in a genome-wide search for other PC predisposition genes as part of a multinational collaboration. For the study of common genetic factors in PC development, leukocyte DNA samples from an unselected series of 55 patients and 67 controls have been examined for genetic differences in two other candidate genes, the androgen receptor gene, hAR, at Xq11-12, and the vitamin D receptor gene, hVDR, at 12q12-14. hAR was typed for two trinucleotide repeat length polymorphisms, (CAG)n and (GGC)n, encoding polyglutamine and polyglycine tracts, respectively, which have been implicated in PC susceptibility. These data, combined with similarly processed patients and controls from the U.K., show no consistent association of allele length with PC risk. A novel finding, however, has been a significant association between the number of GGC repeats and the length of time between diagnosis and relapse in stage T1-T4 Caucasian patients, irrespective of therapy and age of the patient. Of 49 patients who relapsed out of 108 entering the study, those with 16 or fewer GGC repeats had an average relapse-free period of 101 (±7.7) months, while for those with more than 16 repeats the period averaged 48 (±2.9) months, a 2.1-fold difference, or 4.4 years. The second gene, hVDR, was genotyped at two polymorphisms, a synonymous C/T substitution in exon 9 identified by differential TaqI enzymatic digestion and a variable-length polyA tract in the 3′ UTR. Although these polymorphisms are in strong linkage disequilibrium, only the polyA region showed a possible association with PC risk. Men homozygous for alleles with fewer than 18 A's had an increased risk (OR = 3.0, p = 0.0578) compared to controls. This result is opposite to the findings of others and may either indicate offsetting random errors which together balance out to no significant overall effect or reflect more complex genetic and/or environmental associations. Overall, this research suggests that single-gene familial predisposition may be less prominent in PC than in other cancers and that the characteristics of PC pathology may be useful in identifying the effects of common genetic factors.
Abstract:
The discrete-time Markov chain is commonly used to describe changes of health states for chronic diseases in a longitudinal study. Statistical inference for comparing treatment effects or finding determinants of disease progression usually requires estimation of transition probabilities. When the outcome data have missing observations, or when the variable of interest (called a latent variable) cannot be measured directly, the estimation of transition probabilities becomes more complicated. In the latter case, a surrogate variable that is easier to access and can gauge the characteristics of the latent one is usually used for data analysis. This dissertation research proposes methods to analyze longitudinal data (1) that have a categorical outcome with missing observations or (2) that use complete or incomplete surrogate observations to analyze a categorical latent outcome. For (1), different missing-data mechanisms were considered in empirical studies using methods that include the EM algorithm, Monte Carlo EM, and a procedure that is not a data augmentation method. For (2), the hidden Markov model with the forward-backward procedure was applied for parameter estimation. This method was also extended to cover the computation of standard errors. The proposed methods were demonstrated with a schizophrenia example. The relevance to public health, the strengths and limitations, and possible future research were also discussed.
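The forward half of the forward-backward procedure mentioned above can be sketched as follows. This is a minimal illustration under assumed inputs, not the dissertation's implementation: it computes the likelihood of a sequence of surrogate observations under a hidden Markov model with two latent states. The function name `forward_likelihood` and all probability values are hypothetical.

```python
def forward_likelihood(obs, init, trans, emit):
    """Likelihood of an observation sequence under an HMM (forward recursion).

    obs   : list of observed symbol indices
    init  : init[s] = P(first latent state is s)
    trans : trans[r][s] = P(next latent state is s | current is r)
    emit  : emit[s][o] = P(observe symbol o | latent state is s)
    """
    n = len(init)
    # alpha[s] = P(observations so far, current latent state = s)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][o]
            for s in range(n)
        ]
    return sum(alpha)


# Hypothetical two-state model with a binary surrogate observation.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]
likelihood = forward_likelihood([0, 1, 1], init, trans, emit)
```

In practice the recursion is usually carried out with scaling or in log space to avoid underflow on long sequences; that refinement is omitted here for brevity.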
Abstract:
In this dissertation, we propose a continuous-time Markov chain model to examine longitudinal data with a three-category outcome variable. The advantage of this model is that it permits a different number of measurements for each subject, and the duration between two consecutive measurement times can be irregular. Using the maximum likelihood principle, we can estimate the transition probability between two time points. By using the information provided by the independent variables, this model can also estimate the transition probability for each subject. The Monte Carlo simulation method will be used to investigate the goodness of model fit compared with that obtained from other models. A public health example will be used to demonstrate the application of this method.
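For a time-homogeneous continuous-time Markov chain with generator matrix Q, the transition probabilities over an interval of length t are given by the matrix exponential P(t) = exp(Qt), which is why irregular gaps between measurements pose no difficulty. The sketch below approximates P(t) in pure Python with a scaled Taylor series followed by repeated squaring. The three-state generator `Q` is a hypothetical example, and the abstract does not state how the dissertation parameterizes Q or incorporates covariates.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def transition_probabilities(q, t, terms=30, squarings=10):
    """Approximate P(t) = exp(Q t) by scaling, Taylor series, and squaring."""
    n = len(q)
    scale = t / (2 ** squarings)
    step = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # After this update, term equals step**k / k!
        term = [[x / k for x in row] for row in mat_mul(term, step)]
        result = [
            [result[i][j] + term[i][j] for j in range(n)] for i in range(n)
        ]
    # Undo the scaling: exp(Q t) = (exp(Q t / 2**m)) ** (2**m)
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


# Hypothetical generator for three outcome categories (each row sums to 0).
Q = [
    [-0.3, 0.2, 0.1],
    [0.1, -0.4, 0.3],
    [0.0, 0.2, -0.2],
]
P_half = transition_probabilities(Q, 0.5)   # gap of 0.5 time units
P_two = transition_probabilities(Q, 2.0)    # gap of 2.0 time units
```

Because each subject's likelihood contribution is a product of entries of P(t) evaluated at that subject's own gaps, the same Q can serve subjects with different numbers of visits.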