5 results for audio data classification
in DigitalCommons@The Texas Medical Center
Abstract:
Material Safety Data Sheets (MSDSs) are an integral component of occupational hazard communication systems. These documents are used to disseminate hazard information on chemical substances to workers. The primary purpose of this study was to investigate the comprehensibility of MSDSs by workers at an international level. A total of 117 employees of a multi-national petrochemical company participated; thirty-nine (39) each in the United States, Canada and the United Kingdom. The overall participation rate of those approached to participate was 82%. These countries were selected as they each utilize one of the three major existing hazard communication systems for fixed workplaces. The systems are the Occupational Safety and Health Administration's Hazard Communication Standard in the United States, the Workplace Hazardous Materials Information System (WHMIS) in Canada, and the compilation of several European Union directives addressing classification, labeling of substances and preparations, and MSDSs in Europe. A pretest-posttest randomized study design was used, with the posttest being comparable to an open book test. The results of this research indicated that only about two-thirds of the information on the MSDSs was comprehended by the workers, with a significant difference identified among study participants based on country comparisons. These data were fairly consistent with the results of previous MSDS comprehensibility studies conducted in the United States. There was no significant difference in the comprehension level among study participants when taking into account the international hazard communication standard that the MSDS complied with. Marginally, age, education level and experience level did not have a significant impact on the comprehension level. Participants did find MSDSs to be satisfactory in providing the information needed to protect them regardless of their views on the readability and formatting of MSDSs. 
The health-related information was the least comprehended: on the basis of the responses, less than half of it was understood. The findings from this research suggest that much work is still needed to make MSDSs more comprehensible on a global basis, particularly regarding health-related information.
Abstract:
In the United States, “binge” drinking among college students is an emerging public health concern due to the significant physical and psychological effects on young adults. The focus is on identifying interventions that can help decrease high-risk drinking behavior among this group of drinkers. One such intervention is motivational interviewing (MI), a client-centered therapy that aims at resolving client ambivalence by developing discrepancy and engaging the client in change talk. Of late, there is a growing interest in determining the active ingredients that influence the alliance between the therapist and the client. This study is a secondary analysis of the data obtained from the Southern Methodist Alcohol Research Trial (SMART) project, a dismantling trial of MI and feedback among heavy-drinking college students. The present project examines the relationship between therapist and client language in MI sessions on a sample of “binge” drinking college students. Of the 126 SMART tapes, 30 tapes (‘MI with feedback’ group = 15, ‘MI only’ group = 15) were randomly selected for this study. MISC 2.1, a mutually exclusive and exhaustive coding system, was used to code the audio/videotaped MI sessions. Therapist and client language were analyzed for communication characteristics. Overall, therapists adopted an MI-consistent style and clients were found to engage in change talk. Counselor acceptance, empathy, spirit, and complex reflections were all significantly related to client change talk (p-values ranged from 0.001 to 0.047). Additionally, therapist ‘advice without permission’ and MI-inconsistent therapist behaviors were strongly correlated with client sustain talk (p-values ranged from 0.006 to 0.048). 
Simple linear regression models showed a significant correlation between MI-consistent (MICO) therapist language (independent variable) and change talk (dependent variable), and between MI-inconsistent (MIIN) therapist language (independent variable) and sustain talk (dependent variable). The study has several limitations, such as small sample size, self-selection bias, poor inter-rater reliability for the global scales and the lack of a temporal measure of therapist and client language. Future studies might consider a larger sample size to obtain more statistical power. In addition, the correlation between therapist language, client language and drinking outcome needs to be explored.
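As an illustration of the kind of simple linear regression described above, the following minimal sketch fits one independent variable to one dependent variable by ordinary least squares. The function name, the variable names and the counts are hypothetical and are not taken from the SMART data; the abstract does not state which software or exact model specification was used.

```python
from math import sqrt


def simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical per-session counts (illustration only, not study data):
# MI-consistent therapist utterances and client change-talk utterances.
mico_counts = [1, 2, 3, 4, 5]
change_talk_counts = [2, 4, 5, 4, 5]

slope, intercept, r = simple_linear_regression(mico_counts, change_talk_counts)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.3f}")
```

The same function could be applied to MI-inconsistent counts and sustain talk; a significance test for the slope is omitted here for brevity.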
Abstract:
It is well accepted that tumorigenesis is a multi-step procedure involving aberrant functioning of genes regulating cell proliferation, differentiation, apoptosis, genome stability, angiogenesis and motility. To obtain a full understanding of tumorigenesis, it is necessary to collect information on all aspects of cell activity. Recent advances in high throughput technologies allow biologists to generate massive amounts of data, more than might have been imagined decades ago. These advances have made it possible to launch comprehensive projects such as The Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC), which systematically characterize the molecular fingerprints of cancer cells using gene expression, methylation, copy number, microRNA and SNP microarrays as well as next generation sequencing assays interrogating somatic mutation, insertion, deletion, translocation and structural rearrangements. Given the massive amount of data, a major challenge is to integrate information from multiple sources and formulate testable hypotheses. This thesis focuses on developing methodologies for integrative analyses of genomic assays profiled on the same set of samples. We have developed several novel methods for integrative biomarker identification and cancer classification. We introduce a regression-based approach to identify biomarkers predictive of therapy response or survival by integrating multiple assays, including gene expression, methylation and copy number data, through penalized regression. To identify key cancer-specific genes accounting for multiple mechanisms of regulation, we have developed the integIRTy software that provides robust and reliable inferences about gene alteration by automatically adjusting for sample heterogeneity as well as technical artifacts using Item Response Theory. 
To cope with the increasing need for accurate cancer diagnosis and individualized therapy, we have developed a robust and powerful algorithm called SIBER to systematically identify bimodally expressed genes using next generation RNAseq data. We have shown that prediction models built from these bimodal genes have the same accuracy as models built from all genes. Further, prediction models with dichotomized gene expression measurements based on their bimodal shapes still perform well. The effectiveness of outcome prediction using discretized signals paves the road for more accurate and interpretable cancer classification by integrating signals from multiple sources.
Abstract:
Maximizing data quality may be especially difficult in trauma-related clinical research. Strategies are needed to improve data quality and assess the impact of data quality on clinical predictive models. This study had two objectives. The first was to compare missing data between two multi-center trauma transfusion studies: a retrospective study (RS) using medical chart data with minimal data quality review and the PRospective Observational Multi-center Major Trauma Transfusion (PROMMTT) study with standardized quality assurance. The second objective was to assess the impact of missing data on clinical prediction algorithms by evaluating blood transfusion prediction models using PROMMTT data. RS (2005-06) and PROMMTT (2009-10) investigated trauma patients receiving ≥ 1 unit of red blood cells (RBC) from ten Level I trauma centers. Missing data were compared for 33 variables collected in both studies using mixed effects logistic regression (including random intercepts for study site). Massive transfusion (MT) patients received ≥ 10 RBC units within 24h of admission. Correct classification percentages for three MT prediction models were evaluated using complete case analysis and multiple imputation based on the multivariate normal distribution. A sensitivity analysis for missing data was conducted to estimate the upper and lower bounds of correct classification using assumptions about missing data under best and worst case scenarios. Most variables (17/33=52%) had <1% missing data in RS and PROMMTT. Of the remaining variables, 50% demonstrated less missingness in PROMMTT, 25% had less missingness in RS, and 25% were similar between studies. Missing percentages for MT prediction variables in PROMMTT ranged from 2.2% (heart rate) to 45% (respiratory rate). For variables missing >1%, study site was associated with missingness (all p≤0.021). Survival time predicted missingness for 50% of RS and 60% of PROMMTT variables. 
Complete case proportions for the MT models ranged from 41% to 88%. Complete case analysis and multiple imputation demonstrated similar correct classification results. Sensitivity analysis upper-lower bound ranges for the three MT models were 59-63%, 36-46%, and 46-58%. Prospective collection of ten-fold more variables with data quality assurance reduced overall missing data. Study site and patient survival were associated with missingness, suggesting that data were not missing completely at random, and complete case analysis may lead to biased results. Evaluating clinical prediction model accuracy may be misleading in the presence of missing data, especially with many predictor variables. The proposed sensitivity analysis estimating correct classification under upper (best case scenario)/lower (worst case scenario) bounds may be more informative than multiple imputation, which provided results similar to complete case analysis.
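One plausible reading of the best case/worst case sensitivity analysis can be sketched as follows. The sketch assumes that every patient excluded for missing predictors is counted as correctly classified for the upper bound and as misclassified for the lower bound; the abstract does not spell out the exact assumptions used, and the function name and counts below are hypothetical.

```python
def classification_bounds(n_correct_complete, n_complete, n_missing):
    """Bounds on the correct classification proportion under missing data.

    n_correct_complete: correctly classified patients among complete cases
    n_complete:         patients with all predictor variables observed
    n_missing:          patients excluded because a predictor is missing

    Returns (lower, complete_case, upper) as proportions.
    """
    if n_complete <= 0 or n_missing < 0:
        raise ValueError("need n_complete > 0 and n_missing >= 0")
    if not 0 <= n_correct_complete <= n_complete:
        raise ValueError("n_correct_complete must lie in [0, n_complete]")
    n_total = n_complete + n_missing
    # Worst case (assumption): every patient with missing data is misclassified.
    lower = n_correct_complete / n_total
    # Best case (assumption): every patient with missing data is classified correctly.
    upper = (n_correct_complete + n_missing) / n_total
    # Complete case analysis ignores the patients with missing data.
    complete_case = n_correct_complete / n_complete
    return lower, complete_case, upper


# Hypothetical counts for illustration only (not PROMMTT results).
lower, complete_case, upper = classification_bounds(60, 80, 20)
print(f"lower={lower:.2f}, complete case={complete_case:.2f}, upper={upper:.2f}")
```

Under these assumptions the width of the interval grows with the share of incomplete cases, which is why a model with many predictors can look accurate on complete cases while its bounds remain wide.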
Abstract:
Cervical cancer is the leading cause of death and disease from malignant neoplasms among women in developing countries. Even though the Pap smear has significantly decreased the number of deaths from cervical cancer in the past years, it has its limitations. Researchers have developed an automated screening machine which can potentially detect abnormal cases that are overlooked by conventional screening. The goal of quantitative cytology is to classify the patient's tissue sample based on quantitative measurements of the individual cells. It is also much cheaper and potentially can take less time. One of the major challenges of collecting cells with a cytobrush is the possibility of not sampling any existing dysplastic cells on the cervix. Being able to correctly classify patients who have disease without the presence of dysplastic cells could improve the accuracy of quantitative cytology algorithms. Subtle morphologic changes in normal-appearing tissues adjacent to or distant from malignant tumors have been shown to exist, but a comparison of various statistical methods, including many recent advances in the statistical learning field, has not previously been done. The objective of this thesis is to use different classification methods applied to quantitative cytology data for the detection of malignancy associated changes (MACs). In this thesis, Elastic Net was the best-performing algorithm. When we applied the Elastic Net algorithm to the test set, we combined the training set and validation set into a single "training" set and used 5-fold cross validation to choose the parameter for Elastic Net. The resulting model has a sensitivity of 47% at 80% specificity, an AUC of 0.52, and a partial AUC of 0.10 (95% CI 0.09-0.11).
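The "sensitivity at 80% specificity" figure reported above can be computed from any classifier's scores. The following is a minimal sketch, assuming a higher score means "more likely diseased" and that a patient is flagged when the score is at or above a threshold; the function name, labels and scores are hypothetical and do not come from the thesis data.

```python
def sensitivity_at_specificity(labels, scores, min_specificity=0.80):
    """Highest sensitivity achievable at a specificity of at least min_specificity.

    labels: 1 for diseased, 0 for non-diseased
    scores: classifier outputs; a case is flagged when score >= threshold
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    # A threshold above every score flags nobody: specificity 1, sensitivity 0.
    best = 0.0
    for threshold in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
        specificity = tn / n_neg
        # Small tolerance guards against floating-point rounding at the boundary.
        if specificity + 1e-12 >= min_specificity:
            best = max(best, tp / n_pos)
    return best


# Hypothetical labels and scores for illustration only.
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.9, 0.35, 0.5, 0.6, 0.7, 0.8]

sens = sensitivity_at_specificity(labels, scores, min_specificity=0.80)
print(f"sensitivity at >=80% specificity: {sens:.2f}")
```

Fixing the specificity first reflects a screening setting in which the false-positive rate is capped and methods are then compared on how many diseased patients they still detect.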