995 results for Liu type estimator
Abstract:
The multiple linear regression model plays a key role in statistical inference and has extensive applications in the business, environmental, physical and social sciences. Multicollinearity is a long-standing problem in multiple regression analysis: when the regressor variables are multicollinear, it becomes difficult to make precise statistical inferences about the regression coefficients. Several statistical methods address this problem; those discussed in this thesis are the ridge regression, Liu, two-parameter biased and LASSO estimators. First, an analytical comparison on the basis of risk was made among the ridge, Liu and LASSO estimators under the orthonormal regression model. I found that LASSO dominates the least squares, ridge and Liu estimators over a significant portion of the parameter space for large dimension. Second, a simulation study was conducted to compare the performance of the ridge, Liu and two-parameter biased estimators under the mean squared error criterion. I found that the two-parameter biased estimator performs better than its corresponding ridge regression estimator. Overall, the Liu estimator performs better than both the ridge and two-parameter biased estimators.
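As a rough illustration of the estimators compared in the abstract above, the following sketch uses their textbook forms: ridge as (X'X + kI)^-1 X'y, the Liu estimator as (X'X + I)^-1 (X'y + d b_OLS), and LASSO as soft-thresholding, which is its closed form only when X'X = I. The function names, the tuning values k, d and lam, and the simulated data are assumptions made for illustration; they are not taken from the thesis.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares: solve (X'X) b = X'y."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def ridge(X, y, k):
    """Ridge estimator (X'X + kI)^-1 X'y with biasing parameter k >= 0."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

def liu(X, y, d):
    """Liu estimator (X'X + I)^-1 (X'y + d * b_OLS); d = 1 recovers OLS."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + np.eye(p), X.T @ y + d * ols(X, y))

def lasso_orthonormal(X, y, lam):
    """LASSO for 0.5*||y - Xb||^2 + lam*||b||_1, valid only when X'X = I."""
    b = X.T @ y
    return np.sign(b) * np.maximum(np.abs(b) - lam, 0.0)

# Hypothetical orthonormal design: the Q factor of a random matrix.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(50, 4)))
beta = np.array([3.0, -2.0, 0.05, 0.0])   # made-up true coefficients
y = Q @ beta + 0.1 * rng.normal(size=50)

b = ols(Q, y)
print("OLS  ", b)
print("ridge", ridge(Q, y, k=0.5))             # equals b / (1 + k) here
print("Liu  ", liu(Q, y, d=0.3))               # equals (1 + d) / 2 * b here
print("LASSO", lasso_orthonormal(Q, y, lam=0.5))
```

Under an orthonormal design, ridge and Liu rescale every least squares coefficient by the same factor, whereas LASSO sets small coefficients exactly to zero. This difference is what a risk comparison in that setting turns on.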
Abstract:
Complex diseases such as cancer result from multiple genetic changes and environmental exposures. Owing to the rapid development of genotyping and sequencing technologies, we can now assess the causal effects of many genetic and environmental factors more accurately. Genome-wide association studies have localized many causal genetic variants predisposing to certain diseases. However, these studies explain only a small portion of the heritability of diseases. More advanced statistical models are urgently needed to identify and characterize additional genetic and environmental factors and their interactions, which will enable us to better understand the causes of complex diseases. In the past decade, thanks to increasing computational capabilities and novel statistical developments, Bayesian methods have been widely applied in genetics and genomics research and have demonstrated superiority over some standard approaches in certain research areas. Gene-environment and gene-gene interaction studies are among the areas where Bayesian methods may fully exert their advantages. This dissertation focuses on developing new Bayesian statistical methods for data analysis with complex gene-environment and gene-gene interactions, as well as extending some existing methods for gene-environment interactions to other related areas. It includes three sections: (1) deriving the Bayesian variable selection framework for hierarchical gene-environment and gene-gene interactions; (2) developing the Bayesian Natural and Orthogonal Interaction (NOIA) models for gene-environment interactions; and (3) extending the applications of two Bayesian statistical methods, which were developed for gene-environment interaction studies, to other related types of studies such as the adaptive borrowing of historical data.
We propose a Bayesian hierarchical mixture model framework that allows us to investigate genetic and environmental effects, gene-by-gene interactions (epistasis) and gene-by-environment interactions in the same model. It is well known that, in many practical situations, there exists a natural hierarchical structure between the main effects and interactions in the linear model. Here we propose a model that incorporates this hierarchical structure into the Bayesian mixture model, such that irrelevant interaction effects can be removed more efficiently, resulting in more robust, parsimonious and powerful models. We evaluate both the 'strong hierarchical' and 'weak hierarchical' models, which specify that both or one of the main effects of the interacting factors, respectively, must be present for an interaction to be included in the model. Extensive simulation results show that the proposed strong and weak hierarchical mixture models control the proportion of false positive discoveries and yield a powerful approach for identifying the predisposing main effects and interactions in studies with complex gene-environment and gene-gene interactions. We also compare these two models with the 'independent' model, which does not impose this hierarchical constraint, and observe their superior performance in most of the situations considered. The proposed models are applied to real data on gene-environment interactions from lung cancer and cutaneous melanoma case-control studies. Bayesian statistical models have the advantage of allowing useful prior information to be incorporated in the modeling process. Moreover, the Bayesian mixture model outperforms the multivariate logistic model in parameter estimation and variable selection in most cases.
Our proposed models impose the hierarchical constraints, which further improve the Bayesian mixture model by reducing the proportion of false positive findings among the identified interactions while successfully identifying the reported associations. This is practically appealing for studies that investigate causal factors among a moderate number of candidate genetic and environmental factors along with a relatively large number of interactions. The natural and orthogonal interaction (NOIA) models of genetic effects were previously developed to provide an analysis framework in which the estimates of effects for a quantitative trait are statistically orthogonal regardless of whether Hardy-Weinberg Equilibrium (HWE) holds within loci. Ma et al. (2012) recently developed a NOIA model for gene-environment interaction studies and showed the advantages of using the model for detecting the true main effects and interactions, compared with the usual functional model. In this project, we propose a novel Bayesian statistical model that combines the Bayesian hierarchical mixture model with the NOIA statistical model and the usual functional model. The proposed Bayesian NOIA model demonstrates more power at detecting the non-null effects, with higher marginal posterior probabilities. We also review two Bayesian statistical models (the Bayesian empirical shrinkage-type estimator and Bayesian model averaging) that were developed for gene-environment interaction studies. Inspired by these Bayesian models, we develop two novel statistical methods that can handle related problems such as borrowing data from historical studies. The proposed methods are analogous to the methods for gene-environment interactions in that they balance statistical efficiency and bias in a unified model.
Through extensive simulation studies, we compare the operating characteristics of the proposed models with those of existing models, including the hierarchical meta-analysis model. The results show that the proposed approaches adaptively borrow the historical data in a data-driven way. These novel models may have a broad range of statistical applications in both genetic/genomic and clinical studies.
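The 'strong', 'weak' and 'independent' hierarchy rules described in this abstract can be sketched as a simple filter on candidate pairwise interactions. This shows only the inclusion constraint, not the Bayesian mixture prior or any sampler. The function name, the factor labels, and the reading of 'weak' as "at least one main effect present" are assumptions made for illustration.

```python
from itertools import combinations

def allowed_interactions(factors, included_mains, rule):
    """Return the pairwise interactions permitted under a hierarchy rule.

    rule = "strong":      both main effects must be in the model
    rule = "weak":        at least one main effect must be in the model
                          (assumed reading of the 'weak hierarchical' model)
    rule = "independent": no constraint on the main effects
    """
    included = set(included_mains)
    allowed = []
    for a, b in combinations(factors, 2):
        n_present = (a in included) + (b in included)
        if rule == "strong":
            ok = n_present == 2
        elif rule == "weak":
            ok = n_present >= 1
        elif rule == "independent":
            ok = True
        else:
            raise ValueError(f"unknown rule: {rule!r}")
        if ok:
            allowed.append((a, b))
    return allowed

# Hypothetical example: two genes and one environmental exposure,
# with only the main effect of G1 currently in the model.
factors = ["G1", "G2", "E"]
for rule in ("strong", "weak", "independent"):
    print(rule, allowed_interactions(factors, ["G1"], rule))
```

The strong rule admits the fewest interactions and the independent rule admits all of them, which is consistent with the abstract's point that the hierarchical constraints help remove irrelevant interactions.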
Abstract:
The problem of estimating the number of motor units N in a muscle is embedded in a general stochastic model using the notion of thinning from point process theory. In the paper a new moment-type estimator for the number of motor units in a muscle is defined, which is derived using random sums with independently thinned terms. Asymptotic normality of the estimator is shown and its practical value is demonstrated with bootstrap and approximate confidence intervals for a data set from a 31-year-old healthy, right-handed female volunteer. Moreover, simulation results are presented and Monte Carlo based quantiles, means, and variances are calculated for N in {300, 600, 1000}.
Abstract:
It is of interest in some applications to determine whether there is a relationship between a hazard rate function (or a cumulative incidence function) and a mark variable which is only observed at uncensored failure times. We develop nonparametric tests for this problem when the mark variable is continuous. Tests are developed for the null hypothesis that the mark-specific hazard rate is independent of the mark versus ordered and two-sided alternatives expressed in terms of mark-specific hazard functions and mark-specific cumulative incidence functions. The test statistics are based on functionals of a bivariate test process equal to a weighted average of differences between a Nelson-Aalen-type estimator of the mark-specific cumulative hazard function and a nonparametric estimator of this function under the null hypothesis. The weight function in the test process can be chosen so that the test statistics are asymptotically distribution-free. Asymptotically correct critical values are obtained through a simple simulation procedure. The testing procedures are shown to perform well in numerical studies, and are illustrated with an AIDS clinical trial example. Specifically, the tests are used to assess whether the instantaneous or absolute risk of treatment failure depends on the amount of accumulation of drug resistance mutations in a subject's HIV virus. This assessment helps guide development of anti-HIV therapies that surmount the problem of drug resistance.
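For orientation, the ordinary (unmarked) Nelson-Aalen estimator of a cumulative hazard can be sketched as follows: at each distinct event time it adds the number of events divided by the number of subjects still at risk. This is only the classical building block, not the mark-specific estimator or the test process of the paper. The function name and the toy data are assumptions made for illustration.

```python
import numpy as np

def nelson_aalen(times, events):
    """Classical Nelson-Aalen cumulative hazard estimate.

    times  : follow-up times (event or censoring)
    events : 1/True if the failure was observed, 0/False if censored
    Returns the distinct event times and the cumulative hazard at each.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])
    cumulative = 0.0
    hazard = []
    for t in event_times:
        at_risk = np.sum(times >= t)               # still under observation
        failures = np.sum((times == t) & events)   # observed failures at t
        cumulative += failures / at_risk
        hazard.append(cumulative)
    return event_times, np.array(hazard)

# Toy data (made up): five subjects, two of them censored.
t, H = nelson_aalen([1, 2, 2, 3, 4], [1, 0, 1, 1, 0])
print(t, H)
```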
Abstract:
Most models in classical statistics rest on an assumption about the distribution of the data or about a distribution underlying the data. The validity of this assumption makes it possible to carry out inference, to construct confidence intervals, or to test the reliability of the model. Goodness-of-fit testing aims to ensure that the assumption conforms to, or is consistent with, the available data. In this thesis, we propose goodness-of-fit tests for the normal distribution in the setting of univariate and vector time series. We restrict ourselves to a class of linear time series, namely autoregressive moving average models (ARMA, or VARMA in the vector case). First, in the univariate case, we propose a generalization of the work of Ducharme and Lafaye de Micheaux (2004) to the case where the mean is unknown and estimated. We estimated the parameters by a method that is rarely used in the literature yet asymptotically efficient. Indeed, we rigorously showed that the estimator proposed by Brockwell and Davis (1991, Section 10.8) converges almost surely to the true unknown value of the parameter. In addition, we provide a rigorous proof of the invertibility of the variance-covariance matrix of the test statistic, based on certain properties of linear algebra. The result also applies to the case where the mean is assumed known and equal to zero. Finally, we propose an AIC-type method for selecting the dimension of the family of alternatives, and we study the asymptotic properties of this method. The tool proposed here is based on a specific family of orthogonal polynomials, namely the Legendre polynomials. Second, in the vector case, we propose a goodness-of-fit test for autoregressive moving average models with a structured parameterization.
The structured parameterization makes it possible to reduce the large number of parameters in these models or to take particular constraints into account. This project includes the standard case with no structured parameterization. The test we propose applies to any family of orthogonal functions. We illustrate this in the particular cases of Legendre and Hermite polynomials. In the particular case of Hermite polynomials, we show that the resulting test is invariant under affine transformations and that it is in fact a generalization of many existing tests in the literature. This project can be seen as a generalization of the first in three directions: the move from the univariate to the multivariate setting; the choice of an arbitrary family of orthogonal functions; and the possibility of specifying relations or constraints in the VARMA formulation. In each project we carried out a simulation study to evaluate the level and power of the proposed tests and to compare them with existing tests. Applications to real data are also provided. We applied the tests to the forecasting of the global mean annual temperature (univariate) and to Canadian labour market data (bivariate). This work has been presented at several conferences (see, for example, Tagne, Duchesne and Lafaye de Micheaux (2013a, 2013b, 2014) for more details). An article based on the first project has also been submitted to a peer-reviewed journal (see Duchesne, Lafaye de Micheaux and Tagne (2016)).
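To give a feel for a Legendre-based goodness-of-fit statistic, here is a minimal sketch of the classical Neyman smooth test for standard normality in the simplest setting: independent observations with known mean and variance. It is not the thesis's test, which works with ARMA/VARMA residuals and accounts for parameter estimation. The function name and the choice of K are assumptions made for illustration.

```python
import math
import numpy as np
from numpy.polynomial import legendre

def smooth_test_statistic(z, K):
    """Neyman-type smooth statistic of order K for H0: z_i i.i.d. N(0, 1).

    Each z_i is mapped to u_i = Phi(z_i), which is uniform on [0, 1] under H0.
    The statistic sums the squared, normalized sample means of the first K
    orthonormal shifted Legendre polynomials evaluated at the u_i.
    In this simple i.i.d. setting it is asymptotically chi-square with K
    degrees of freedom under H0.
    """
    z = np.asarray(z, dtype=float)
    n = z.size
    u = np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in z])
    stat = 0.0
    for k in range(1, K + 1):
        coeffs = np.zeros(k + 1)
        coeffs[k] = 1.0                      # selects the Legendre polynomial P_k
        # sqrt(2k + 1) * P_k(2u - 1) is orthonormal on [0, 1]
        L_k = math.sqrt(2 * k + 1) * legendre.legval(2.0 * u - 1.0, coeffs)
        stat += (L_k.sum() / math.sqrt(n)) ** 2
    return stat

rng = np.random.default_rng(1)
print(smooth_test_statistic(rng.normal(size=200), K=3))       # data drawn under H0
print(smooth_test_statistic(rng.exponential(size=200), K=3))  # clearly non-normal data
```

With estimated parameters or time-series residuals, the components are no longer asymptotically independent standard normals, which is why results such as the invertibility of the covariance matrix of the test statistic are needed.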
Abstract:
Background: Prostate cancer cells in primary tumors have been typed CD10(-)/CD13(-)/CD24(hi)/CD26(+)/CD38(lo)/CD44(-)/CD104(-). This CD phenotype suggests a lineage relationship between cancer cells and luminal cells. The Gleason grade of tumors is a descriptor of tumor glandular differentiation. Higher Gleason scores are associated with treatment failure. Methods: CD26(+) cancer cells were isolated from Gleason 3+3 (G3) and Gleason 4+4 (G4) tumors by cell sorting, and their gene expression or transcriptome was determined by Affymetrix DNA array analysis. Dataset analysis was used to determine gene expression similarities and differences between G3 and G4 as well as to prostate cancer cell lines and histologically normal prostate luminal cells. Results: The G3 and G4 transcriptomes were compared to those of non-cancer prostatic cell types, which included luminal, basal, stromal fibromuscular, and endothelial cells. A principal components analysis of the various transcriptome datasets indicated a closer relationship between luminal and G3 than between luminal and G4. Dataset comparison also showed that the cancer transcriptomes differed substantially from those of prostate cancer cell lines. Conclusions: Genes differentially expressed in cancer are potential biomarkers for cancer detection, and those differentially expressed between G3 and G4 are potential biomarkers for disease stratification given that G4 cancer is associated with poor outcomes. Differentially expressed genes likely contribute to the prostate cancer phenotype and constitute the signatures of these particular cancer cell types.
Abstract:
Transport of BPV-1 virus from the cell membrane to the nucleus was studied in vitro in CV-1 cells. At reduced temperature (4 °C), BPV-1 binding to CV-1 cells was unaffected, but there was no transport of virions across the cytosol. Electron microscopy showed BPV-1 virions in association with microtubules in the cytoplasm, a finding confirmed by co-immunoprecipitation of L1 protein and tubulin. Internalization of virus was unimpaired in cells treated with the microtubule-depolymerizing drug nocodazole, but virions were retained in cytoplasmic vesicles and not transported to the nucleus. We conclude that a microtubule transport mechanism in CV-1 cells moves intact BPV-1 virions from the cell surface to the nuclear membrane. (C) 2001 Academic Press.
Abstract:
Background: Previous studies have examined individual dietary and lifestyle factors in relation to type 2 diabetes, but the combined effects of these factors are largely unknown. Methods: We followed 84,941 female nurses from 1980 to 1996; these women were free of diagnosed cardiovascular disease, diabetes, and cancer at baseline. Information about their diet and lifestyle was updated periodically. A low-risk group was defined according to a combination of five variables: a body-mass index (the weight in kilograms divided by the square of the height in meters) of less than 25; a diet high in cereal fiber and polyunsaturated fat and low in trans fat and glycemic load (which reflects the effect of diet on the blood glucose level); engagement in moderate-to-vigorous physical activity for at least half an hour per day; no current smoking; and the consumption of an average of at least half a drink of an alcoholic beverage per day. Results: During 16 years of follow-up, we documented 3300 new cases of type 2 diabetes. Overweight or obesity was the single most important predictor of diabetes. Lack of exercise, a poor diet, current smoking, and abstinence from alcohol use were all associated with a significantly increased risk of diabetes, even after adjustment for the body-mass index. As compared with the rest of the cohort, women in the low-risk group (3.4 percent of the women) had a relative risk of diabetes of 0.09 (95 percent confidence interval, 0.05 to 0.17). A total of 91 percent of the cases of diabetes in this cohort (95 percent confidence interval, 83 to 95 percent) could be attributed to habits and forms of behavior that did not conform to the low-risk pattern. Conclusions: Our findings support the hypothesis that the majority of cases of type 2 diabetes could be prevented by the adoption of a healthier lifestyle.
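The link between the reported relative risk (0.09 for the 3.4 percent of women in the low-risk group) and the reported attributable percentage (91 percent) can be illustrated with Levin's crude formula for the population attributable fraction, p(RR - 1) / (1 + p(RR - 1)). Here "exposure" is taken to mean not belonging to the low-risk group. This is a back-of-the-envelope sketch under that assumption. The study's own estimate presumably came from adjusted models, so the agreement is illustrative only, and the function name is hypothetical.

```python
def attributable_fraction(p_exposed, rr_exposed):
    """Levin's crude population attributable fraction.

    p_exposed  : proportion of the population that is exposed
    rr_exposed : relative risk of the exposed versus the unexposed
    """
    excess = p_exposed * (rr_exposed - 1.0)
    return excess / (1.0 + excess)

p_low_risk = 0.034        # share of women in the low-risk group (from the abstract)
rr_low_vs_rest = 0.09     # relative risk, low-risk group versus the rest (from the abstract)

# Re-express with the low-risk group as the reference category.
rr_rest_vs_low = 1.0 / rr_low_vs_rest
paf = attributable_fraction(1.0 - p_low_risk, rr_rest_vs_low)
print(f"crude attributable fraction: {paf:.3f}")   # roughly 0.91
```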
Abstract:
To investigate the efficiency of encapsidation of plasmid by papillomavirus virus-like particles (PV VLPs), and the infectivity of the resultant PV pseudovirions, Cos-1 cells were transfected with an 8-kb plasmid incorporating a green fluorescent protein (GFP) reporter gene (pGSV), and infected with bovine PV (BPV-1) L1/L2 recombinant vaccinia virus to produce BPV-1 pseudovirions. Approximately 1 in 1.5 × 10^4 of dense (1.35 g/ml) PV pseudovirions and 0.3 in 10^4 of less-dense (1.29 g/ml) pseudovirions packaged an intact pGSV plasmid. The majority (>75%) of packaged plasmids contained deletions, and the deletions affected all tested genes. After exposure of Cos-1 cells to BPV-1 pseudovirions at an MOI of 40,000:1, 6% of cells expressed GFP, giving a calculated efficiency of delivery of the pGSV plasmid, by pseudovirions which had packaged an intact plasmid, of approximately 5%. Plasmid delivery was not effected by purified pGSV plasmid, was blocked by antiserum against BPV-1, and was not blocked by DNase treatment of pseudovirions, confirming that delivery was mediated by DNA within the pseudovirion. We conclude that a major limitation to the use of PV pseudovirions as a gene delivery system is that intact plasmid DNA is not efficiently selected for packaging by VLPs in cell-based pseudovirion production systems.
Abstract:
In this paper, an attempt was made to investigate a fundamental problem related to the flexural waves excited by rectangular transducers. Due to the disadvantages of the Green's function approach for solving this problem, a direct and effective method is proposed using a multiple integral transform method and contour integration technique. The explicit frequency domain solutions obtained from this newly developed method are convenient for understanding transducer behavior and theoretical optimization and experimental calibration of rectangular transducers. The time domain solutions can then be easily obtained by using the fast Fourier transform technique. (C) 2001 Elsevier Science B.V. All rights reserved.
Abstract:
The acousto-ultrasonic (AU) input-output characteristics for contact-type transmitting and receiving transducers coupled to composite laminated plates are considered in this paper. Combining a multiple integral transform method, an ordinary discrete layer theory for the laminates and some simplifying assumptions for the electro-mechanical transduction behaviour of the transducers, an analytical solution is developed which can deal with all the wave processes involved in the AU measurement system, i.e., wave generation, wave propagation and wave reception. The spectral response of the normal contact pressure sensed by the receiving transducer due to an arbitrary input pulse excited by the transmitting transducer is obtained. To validate the new analytical-numerical spectral technique in the low-frequency regime, the results are compared with Mindlin plate theory solutions. Based on the analytical results, numerical calculations are carried out to investigate the influence of various external parameters such as frequency content of the input pulse, transmitter/receiver spacing and transducer aperture on the output of the measurement system. The results show that the presented analytical-numerical procedure is an effective tool for understanding the input-output characteristics of the AU technique for laminated plates. (C) 2001 Elsevier Science Ltd. All rights reserved.
Abstract:
Polynucleotide immunisation with the E7 gene of human papillomavirus (HPV) type 16 induces only moderate levels of immune response, which may in part be due to limitation in E7 gene expression influenced by biased HPV codon usage. Here we compare, for expression and immunogenicity, polynucleotide expression plasmids encoding wild-type (pWE7) or synthetic codon-optimised (pHE7) HPV16 E7 DNA. Cos-1 cells transfected with pHE7 expressed higher levels of E7 protein than similar cells transfected with pWE7. C57BL/6 mice and F1 (C57 × FVB) E7 transgenic mice immunised intradermally with E7 plasmids produced high levels of anti-E7 antibody. pHE7 induced a significantly stronger E7-specific cytotoxic T-lymphocyte response than pWE7 and 100% tumour protection in C57BL/6 mice, but neither vaccine induced CTL in partially E7-tolerant K14E7 transgenic mice. The data indicate that immunogenicity of an E7 polynucleotide vaccine can be enhanced by codon modification. However, this may be insufficient for priming E7 responses in animals with split tolerance to E7 as a consequence of expression of E7 in somatic cells. (C) 2002 Elsevier Science (USA).