Digital Library

6 results for SCALE MODELS

in DigitalCommons@The Texas Medical Center

An empirical evaluation of the Random Forests classifier models for variable selection in a large-scale lung cancer case-control study

Relevance:

40.00%

Abstract:

Random Forests™ is reported to be one of the most accurate classification algorithms in complex data analysis. It shows excellent performance even when most predictors are noisy and the number of variables is much larger than the number of observations. In this thesis, Random Forests was applied to a large-scale lung cancer case-control study. A novel way of automatically selecting prognostic factors was proposed, and a synthetic positive control was used to validate the Random Forests method. Throughout this study we showed that Random Forests can deal with a large number of weak input variables without overfitting. It can account for non-additive interactions between these input variables. Random Forests can also be used for variable selection without being adversely affected by collinearities.

Random Forests can deal with large-scale data sets without rigorous data preprocessing, and it has a robust variable importance ranking measure. We propose a novel variable selection method in the context of Random Forests that uses the data noise level as the cut-off value to determine the subset of important predictors. This new approach enhanced the ability of the Random Forests algorithm to automatically identify important predictors for complex data. The cut-off value can also be adjusted based on the results of the synthetic positive control experiments.

When the data set had a high variables-to-observations ratio, Random Forests complemented the established logistic regression. This study suggests that Random Forests is well suited to such high-dimensional data. One can use Random Forests to select the important variables and then use logistic regression or Random Forests itself to estimate the effect size of the predictors and to classify new observations.

We also found that the mean decrease in accuracy is a more reliable variable ranking measure than the mean decrease in Gini.
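The abstract does not say how the noise level is estimated, so the following is only a minimal sketch of the general idea. It assumes that the mean decrease in accuracy is computed as a permutation importance and that the noise level is taken to be the largest importance observed among deliberately added random probe columns. The function names, the toy data, and the trivial `predict` rule (a stand-in for a fitted forest, not a Random Forests implementation) are all hypothetical.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    hits = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return hits / len(rows)

def mean_decrease_accuracy(predict, rows, labels, n_repeats=20, seed=0):
    """Permutation importance: average accuracy drop after shuffling one column."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            # Rebuild every row with the shuffled value in column j only.
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances

def select_above_noise(importances, noise_columns):
    """Keep real columns whose importance exceeds the largest noise-probe importance.

    Assumption: the noise level is the maximum importance among the probes.
    """
    cutoff = max(importances[j] for j in noise_columns)
    return [j for j, imp in enumerate(importances)
            if j not in noise_columns and imp > cutoff]

# Toy data (hypothetical): column 0 determines the label, column 1 is
# irrelevant, and column 2 is a random probe added to measure the noise level.
data_rng = random.Random(1)
labels = [i % 2 for i in range(40)]
rows = [[float(y), float(i % 5), data_rng.random()] for i, y in enumerate(labels)]

def predict(row):
    """Stand-in classifier that looks only at column 0."""
    return int(row[0] > 0.5)

importances = mean_decrease_accuracy(predict, rows, labels)
selected = select_above_noise(importances, noise_columns={2})
print(importances, selected)
```

In practice the `predict` stand-in would be replaced by a fitted forest, and the cut-off could be loosened or tightened depending on whether known synthetic positive controls are recovered.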


Concept, design and implementation of a cardiovascular gene-centric 50 k SNP array for large-scale genomic association studies.

Relevance:

30.00%

Abstract:

A wealth of genetic associations for cardiovascular and metabolic phenotypes in humans has been accumulating over the last decade, in particular a large number of loci derived from recent genome-wide association studies (GWAS). True complex disease-associated loci often exert modest effects, so their delineation currently requires integration of diverse phenotypic data from large studies to ensure robust meta-analyses. We have designed a gene-centric 50 K single nucleotide polymorphism (SNP) array to assess potentially relevant loci across a range of cardiovascular, metabolic and inflammatory syndromes. The array utilizes a "cosmopolitan" tagging approach to capture the genetic diversity across approximately 2,000 loci in populations represented in the HapMap and SeattleSNPs projects. The array content is informed by GWAS of vascular and inflammatory disease, expression quantitative trait loci implicated in atherosclerosis, pathway-based approaches and comprehensive literature searching. The custom flexibility of the array platform facilitated interrogation of loci at differing stringencies, according to a gene prioritization strategy that allows saturation of high-priority loci with a greater density of markers than the existing GWAS tools, particularly in African HapMap samples. We also demonstrate that the IBC array can be used to complement GWAS, increasing coverage in high-priority CVD-related loci across all major HapMap populations. DNA from over 200,000 extensively phenotyped individuals will be genotyped with this array, with a significant portion of the generated data being released into the academic domain, facilitating in silico replication attempts, analyses of rare variants and cross-cohort meta-analyses in diverse populations. These datasets will also facilitate more robust secondary analyses, such as explorations with alternative genetic models, epistasis and gene-environment interactions.


Large scale variation in Enterococcus faecalis illustrated by the genome analysis of strain OG1RF.

Relevance:

30.00%

Abstract:

BACKGROUND: Enterococcus faecalis has emerged as a major hospital pathogen. To explore its diversity, we sequenced E. faecalis strain OG1RF, which is commonly used for molecular manipulation and virulence studies. RESULTS: The 2,739,625 base pair chromosome of OG1RF was found to contain approximately 232 kilobases unique to this strain compared to V583, the only publicly available sequenced strain. Almost no mobile genetic elements were found in OG1RF. The 64 areas of divergence were classified into three categories. First, OG1RF carries 39 unique regions, including 2 CRISPR loci and a new WxL locus. Second, we found nine replacements where a sequence specific to V583 was substituted by a sequence specific to OG1RF. For example, the iol operon of OG1RF replaces a possible prophage and the vanB transposon in V583. Finally, we found 16 regions that were present in V583 but missing from OG1RF, including the proposed pathogenicity island, several probable prophages, and the cpsCDEFGHIJK capsular polysaccharide operon. OG1RF was more rapidly but less frequently lethal than V583 in the mouse peritonitis model and considerably outcompeted V583 in a murine model of urinary tract infections. CONCLUSION: E. faecalis OG1RF carries a number of unique loci compared to V583, but the almost complete lack of mobile genetic elements demonstrates that this is not a defining feature of the species. Additionally, OG1RF's effects in experimental models suggest that mediators of virulence may be diverse between different E. faecalis strains and that virulence is not dependent on the presence of mobile genetic elements.


Validity assessment of the Breast Cancer Risk Reduction Health Belief scale.

Relevance:

30.00%

Abstract:

BACKGROUND: Women at increased risk of breast cancer (BC) are not widely accepting of chemopreventive interventions, and ethnic minorities are underrepresented in related trials. Furthermore, there is no validated instrument to assess the health-seeking behavior of these women with respect to these interventions. METHODS: By using constructs from the Health Belief Model, the authors developed and refined, based on pilot data, the Breast Cancer Risk Reduction Health Belief (BCRRHB) scale using a population of 265 women at increased risk of BC who were largely medically underserved, of low socioeconomic status (SES), and ethnic minorities. Construct validity was assessed using principal components analysis with oblique rotation to extract factors, and generate and interpret summary scales. Internal consistency was determined using Cronbach alpha coefficients. RESULTS: Test-retest reliability for the pilot and final data was calculated to be r = 0.85. Principal components analysis yielded 16 components that explained 64% of the total variance, with communalities ranging from 0.50-0.75. Cronbach alpha coefficients for the extracted factors ranged from 0.45-0.77. CONCLUSIONS: Evidence suggests that the BCRRHB yields reliable and valid data that allows for the identification of barriers and enhancing factors associated with use of breast cancer chemoprevention in the study population. These findings allow for tailoring treatment plans and intervention strategies to the individual. Future research is needed to validate the scale for use in other female populations. Cancer 2009. (c) 2009 American Cancer Society.
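For readers unfamiliar with the internal-consistency statistic mentioned above, here is a minimal sketch of the standard Cronbach alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The response matrix is invented for illustration and is not data from the BCRRHB study.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    # Sample variance of each item across respondents.
    item_variances = [variance([r[i] for r in responses]) for i in range(k)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(r) for r in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical responses: 5 respondents answering 3 Likert-type items.
example = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 2],
    [3, 3, 4],
]
print(round(cronbach_alpha(example), 3))
```

Values closer to 1 indicate that the items of a factor vary together; the calculation needs at least two items and at least two respondents.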


Improving the accuracy of radiation pneumonitis dose response models

Relevance:

30.00%

Abstract:

The prognosis for lung cancer patients remains poor. Five-year survival rates have been reported to be 15%. Studies have shown that dose escalation to the tumor can lead to better local control and subsequently better overall survival. However, dose to lung tumor is limited by normal tissue toxicity. The most prevalent thoracic toxicity is radiation pneumonitis. In order to determine a safe dose that can be delivered to the healthy lung, researchers have turned to mathematical models predicting the rate of radiation pneumonitis. However, these models rely on simple metrics based on the dose-volume histogram and are not yet accurate enough to be used for dose escalation trials. The purpose of this work was to improve the fit of predictive risk models for radiation pneumonitis and to show the dosimetric benefit of using the models to guide patient treatment planning. The study was divided into three specific aims. The first two specific aims focused on improving the fit of the predictive model. In Specific Aim 1 we incorporated information about the spatial location of the lung dose distribution into a predictive model. In Specific Aim 2 we incorporated ventilation-based functional information into a predictive pneumonitis model. In the third specific aim, a proof-of-principle virtual simulation was performed in which a model-determined limit was used to scale the prescription dose. The data showed that, for our patient cohort, the fit of the model to the data was not improved by incorporating spatial information. Although we were not able to achieve a significant improvement in model fit using pre-treatment ventilation, we show some promising results indicating that ventilation imaging can provide useful information about lung function in lung cancer patients.

The virtual simulation trial demonstrated that using a personalized lung dose limit derived from a predictive model will result in a different prescription than what was achieved with the clinically used plan, thus demonstrating the utility of a normal tissue toxicity model in personalizing the prescription dose.
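The abstract does not state which dose-response model or dose metric was used. As a rough illustration of how a model-determined limit could scale a prescription, the sketch below assumes a logistic model in mean lung dose and assumes that, for a fixed plan, mean lung dose scales linearly with the prescription dose. The coefficients `B0` and `B1`, the 20% target risk, and the plan values are hypothetical placeholders, not fitted or published values.

```python
import math

def pneumonitis_risk(mean_lung_dose_gy, b0, b1):
    """Logistic dose-response: predicted pneumonitis probability at a given mean lung dose."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * mean_lung_dose_gy)))

def dose_limit_for_risk(target_risk, b0, b1):
    """Invert the logistic model to get the mean lung dose that yields target_risk."""
    logit = math.log(target_risk / (1.0 - target_risk))
    return (logit - b0) / b1

def scale_prescription(prescription_gy, planned_mld_gy, mld_limit_gy):
    """Scale the prescription so the mean lung dose lands exactly on the limit.

    Assumes mean lung dose is proportional to prescription dose for a fixed plan.
    """
    return prescription_gy * mld_limit_gy / planned_mld_gy

# Hypothetical model coefficients and plan values, for illustration only.
B0, B1 = -4.0, 0.1
mld_limit = dose_limit_for_risk(0.20, B0, B1)
scaled = scale_prescription(prescription_gy=60.0, planned_mld_gy=20.0, mld_limit_gy=mld_limit)
print(round(mld_limit, 2), round(scaled, 2))
```

With a personalized model, the limit and therefore the scaled prescription would differ from patient to patient, which is the effect the virtual simulation describes.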


The application of latent variable models to the assessment of determinants of HIV risk behavior

Relevance:

30.00%

Abstract:

Studies on the relationship between psychosocial determinants and HIV risk behaviors have produced little evidence to support hypotheses based on theoretical relationships. One limitation inherent in many articles in the literature is the method of measurement of the determinants and the analytic approach selected.

To reduce the misclassification associated with unit scaling of measures specific to internalized homonegativity, I evaluated the psychometric properties of the Reactions to Homosexuality scale in a confirmatory factor analytic framework. In addition, I assessed the measurement invariance of the scale across racial/ethnic classifications in a sample of men who have sex with men. The resulting measure contained eight items loading on three first-order factors. Invariance assessment identified metric and partial strong invariance between racial/ethnic groups in the sample.

Application of the updated measure to a structural model allowed for the exploration of direct and indirect effects of internalized homonegativity on unprotected anal intercourse. Pathways identified in the model show that drug and alcohol use at last sexual encounter, the number of sexual partners in the previous three months and sexual compulsivity all contribute directly to risk behavior. Internalized homonegativity reduced the likelihood of exposure to drugs, alcohol or higher numbers of partners. For men who developed compulsive sexual behavior as a coping strategy for internalized homonegativity, there was an increase in the prevalence odds of risk behavior.

In the final stage of the analysis, I conducted a latent profile analysis of the items in the updated Reactions to Homosexuality scale. This analysis identified five distinct profiles, which suggested that the construct was not homogeneous in samples of men who have sex with men. Lack of prior consideration of these distinct manifestations of internalized homonegativity may have contributed to the analytic difficulty in identifying a relationship between the trait and high-risk sexual practices.

