11 results for Nonparametric discriminant analysis

in DigitalCommons@The Texas Medical Center


Relevance:

80.00%

Publisher:

Abstract:

This study compared three body measurements, height, hip width (bitrochanteric) and foot length, in 120 Hispanic women who had their first birth by cesarean section (N = 60) or by spontaneous vaginal delivery (N = 60). The objective was to determine whether differences in these measurements could be useful in predicting cephalopelvic disproportion. Data were collected from two public hospitals in Houston, Texas, over a 10-month period from December 1994 to October 1995. The statistical technique used to evaluate the measures was discriminant analysis. Women who delivered by cesarean section were older, shorter, had shorter feet and delivered heavier infants. There were no differences in the bitrochanteric widths of the women or in the mean gestational age or Apgar scores of the infants. Significantly more of the mothers and infants were ill following cesarean section delivery. Maternal illness was usually infection; infant illness was primarily infection or respiratory difficulties. Discriminant analysis is a technique that classifies an entity into a group, and predicts its group membership, from a given set of variables. Using discriminant analysis with the probability of cesarean section set at 50 percent, the best combination for classifying who would have a cesarean section was height and hip width, which correctly classified 74.2 percent of those who needed surgery. When the probability of cesarean section was 10 percent and the probability of vaginal delivery was 90 percent, the best predictor of who would need operative delivery was height, hip width and age, correctly classifying 56.2 percent. 
In the population from which the study participants were selected, the incidence of cephalopelvic disproportion was low, approximately 1 percent. With the technologic assistance available in most of the developed world, further pursuit of different measures would likely be of little benefit in predicting and diagnosing disproportion. However, in areas of the world where much of obstetrics is "hands on", the availability of technology is extremely limited, and the incidence of disproportion is higher, anthropometric measures might be of some benefit.
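
The effect of the assumed group probabilities described in the abstract above can be illustrated with a minimal sketch of a one-variable, two-class linear discriminant. The heights below are hypothetical values chosen for illustration and are not the study's data; the function names are also assumptions. Lowering the assumed probability of cesarean section moves the decision threshold, so fewer cases are flagged.

```python
import math
from statistics import mean

def fit_lda_1d(x0, x1):
    """Fit a one-variable, two-class linear discriminant with a pooled variance."""
    m0, m1 = mean(x0), mean(x1)
    ss = sum((v - m0) ** 2 for v in x0) + sum((v - m1) ** 2 for v in x1)
    pooled_var = ss / (len(x0) + len(x1) - 2)
    return m0, m1, pooled_var

def classify(x, m0, m1, pooled_var, prior1):
    """Return 1 if the posterior log-odds of class 1 is positive, else 0."""
    log_odds = ((m1 - m0) / pooled_var) * (x - (m0 + m1) / 2)
    log_odds += math.log(prior1 / (1 - prior1))
    return 1 if log_odds > 0 else 0

# HYPOTHETICAL heights in cm (not the study's data).
# Class 0 = vaginal delivery, class 1 = cesarean section.
vaginal = [158, 160, 162, 155, 165, 159]
cesarean = [150, 153, 156, 149, 157, 152]
m0, m1, var = fit_lda_1d(vaginal, cesarean)

for prior in (0.5, 0.1):
    flagged = sum(classify(h, m0, m1, var, prior) for h in cesarean)
    print(f"prior={prior}: {flagged} of {len(cesarean)} cesarean cases flagged")
```

With a single predictor the rule reduces to a height threshold; the study combined several measurements, which this sketch does not attempt.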

Relevance:

80.00%

Publisher:

Abstract:

Recently it has been proposed that the evaluation of effects of pollutants on aquatic organisms can provide an early warning system of potential environmental and human health risks (NRC 1991). Unfortunately, there are few methods available to aquatic biologists for assessing the effects of pollutants on aquatic animal community health. The primary goal of this research was to develop and evaluate the feasibility of such a method. Specifically, the primary objective of this study was to develop a prototype rapid bioassessment technique similar to the Index of Biotic Integrity (IBI) for the upper Texas and Northwestern Gulf of Mexico coastal tributaries. The IBI consists of a series of "metrics" that describe specific attributes of the aquatic community. Each metric is given a score, and the scores are then summed to derive a total assessment of the "health" of the aquatic community. This IBI procedure may provide an additional assessment tool for professionals in water quality management. The experimental design consisted primarily of compiling previously collected data from monitoring conducted by the Texas Natural Resource Conservation Commission (TNRCC) at five bayous classified according to potential for anthropogenic impact and salinity regime. Standardized hydrological, chemical, and biological monitoring had been conducted in each of these watersheds. Candidate metrics for inclusion in the estuarine IBI were identified and evaluated using correlation analysis, cluster analysis, stepwise and normal discriminant analysis, and evaluation of cumulative distribution frequencies. Scores of each included metric were determined based on exceedances of specific percentiles. Individual scores were summed and a total IBI score and rank for the community computed. Results of these analyses yielded the proposed metrics and rankings listed in this report. 
Based on the results of this study, incorporation of an estuarine IBI method as a water quality assessment tool is warranted. Adopted metrics were correlated with seasonal trends and, to a lesser degree, with the salinity gradients observed during the study (0-25 ppt). Further refinement of this method is needed using a larger, more inclusive data set that includes additional habitat types, salinity ranges, and temporal variation.
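
The percentile-based scoring and summation described in the abstract above might look like the following sketch. The abstract does not state which percentiles or score values were used, so the quartile cut-offs, the 1/3/5 score scale, the metric names and all data values here are assumptions for illustration only.

```python
from statistics import quantiles

def score_metric(value, reference, higher_is_better=True):
    """Score one metric 1, 3 or 5 against the quartiles of reference values."""
    q1, _, q3 = quantiles(reference, n=4)
    if not higher_is_better:
        # Flip signs so that "larger is better" holds for every metric.
        value, q1, q3 = -value, -q3, -q1
    if value > q3:
        return 5
    if value > q1:
        return 3
    return 1

def ibi_total(site, reference, higher_is_better):
    """Sum the individual metric scores for one site."""
    return sum(score_metric(site[m], reference[m], higher_is_better[m]) for m in site)

# HYPOTHETICAL metrics and values, not taken from the TNRCC data.
reference = {
    "species_richness": [2, 4, 6, 8, 10, 12, 14],
    "percent_tolerant": [10, 20, 30, 40, 50, 60, 70],
}
higher_is_better = {"species_richness": True, "percent_tolerant": False}
site = {"species_richness": 13, "percent_tolerant": 65}

print("total IBI score:", ibi_total(site, reference, higher_is_better))
```

Sites can then be ranked by their total score; how the study converted totals into ranks is not given in the abstract.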

Relevance:

80.00%

Publisher:

Abstract:

The purpose of this study was to examine the relationship between enterotoxigenic Escherichia coli (ETEC) and travelers' diarrhea over a period of five years in Guadalajara, Mexico. Specifically, this study identified and characterized ETEC from travelers with diarrhea. The objectives were to study the colonization factor antigens, toxins and antibiotic sensitivity patterns in ETEC from 1992 to 1997 and to study the molecular epidemiology of ETEC by plasmid content and DNA restriction fragment patterns. In this survey of travelers' diarrhea in Guadalajara, Mexico, 928 travelers with diarrhea were screened for enteric pathogens between 1992 and 1997. ETEC were isolated in 195 (19.9%) of the patients, representing the most frequent enteric pathogen identified. A total of 31 antimicrobial susceptibility patterns were identified among ETEC isolates over the five-year period. The 195 ETEC isolates contained two to six plasmids each, which ranged in size from 2.0 to 23 kbp. Three different reproducible rRNA gene restriction patterns (ribotypes R-1 to R-3) were obtained among the 195 isolates with the enzyme HindIII. Colonization factor antigens (CFAs) were identified in 99 (51%) of the 195 ETEC strains studied. Cluster analyses of the observations from the four assays all confirmed five distinct groups of study-year ETEC strains. Each group had a >95% similarity level among strains within the group and a <60% similarity level between groups. In addition, discriminant analysis of the assay variables used in predicting the ETEC strains revealed a >80% relationship between both the plasmid and rRNA content of ETEC strains and study-year. These findings, based on laboratory observations of the differences in biochemical, antimicrobial susceptibility, plasmid and ribotype content, suggest a complex epidemiology for ETEC strains in a population with travelers' diarrhea. 
The findings of this study may have implications for our understanding of the epidemiology, transmission, treatment, control and prevention of the disease. It has been suggested that an ETEC vaccine for humans should contain the most prevalent CFAs. Therefore, it is important to know the prevalence of these factors in ETEC in various geographical areas. CFAs described in this dissertation may be used in different epidemiological studies in which the prevalence of CFAs and other properties of ETEC will be evaluated. Furthermore, in spite of an intense search of nearly 200 ETEC isolates for strains that may have a clonal relationship, we failed to identify such strains. However, further studies are in progress to construct suitable live vaccine strains and to introduce several CFAs into the same host organism by recombinant DNA techniques (Dr. Ann-Mari Svennerholm's lab). (Abstract shortened by UMI.)

Relevance:

80.00%

Publisher:

Abstract:

In population studies, most current methods focus on identifying one outcome-related SNP at a time by testing for differences in genotype frequencies between disease and healthy groups or among different population groups. However, testing a great number of SNPs simultaneously raises a multiple-testing problem and will give false-positive results. Although this problem can be dealt with effectively through approaches such as Bonferroni correction, permutation testing and false discovery rates, patterns of joint effects of several genes, each with a weak effect, might not be detectable. With the availability of high-throughput genotyping technology, searching for multiple scattered SNPs over the whole genome and modeling their joint effect on the target variable has become possible. Exhaustive search of all SNP subsets is computationally infeasible for millions of SNPs in a genome-wide study. Several effective feature selection methods combined with classification functions have been proposed to search for an optimal SNP subset in large data sets where the number of feature SNPs far exceeds the number of observations. In this study, we took two steps to achieve the goal. First, we selected 1000 SNPs through an effective filter method; then we performed feature selection wrapped around a classifier to identify an optimal SNP subset for predicting disease. We also developed a novel classification method, the sequential information bottleneck method, wrapped inside different search algorithms to identify an optimal subset of SNPs for classifying the outcome variable. This new method was compared with classical linear discriminant analysis in terms of classification performance. Finally, we performed chi-square tests to examine the relationship between each SNP and disease from another point of view. 
In general, our results show that filtering features using the harmonic mean of sensitivity and specificity (HMSS) through linear discriminant analysis (LDA) is better than using LDA training accuracy or mutual information in our study. Our results also demonstrate that an exhaustive search of small subsets of one, two or three SNPs, based on the best 100 composite 2-SNPs, can find an optimal subset, and that further inclusion of more SNPs through a heuristic algorithm does not always increase the performance of SNP subsets. Although sequential forward floating selection can be applied to prevent the nesting effect of forward selection, it does not always outperform the latter because of overfitting from observing more complex subset states. Our results also indicate that HMSS, as a criterion for evaluating the classification ability of a function, can be used on imbalanced data without modifying the original dataset, unlike classification accuracy. Our four studies suggest that Sequential Information Bottleneck (sIB), a new unsupervised technique, can be adopted to predict the outcome and that its ability to detect the target status is superior to that of traditional LDA in the study. From our results we can see that the best test probability-HMSS for predicting CVD, stroke, CAD and psoriasis through sIB is 0.59406, 0.641815, 0.645315 and 0.678658, respectively. In terms of group prediction accuracy, the highest test accuracy of sIB for diagnosing a normal status among controls can reach 0.708999, 0.863216, 0.639918 and 0.850275, respectively, in the four studies if the test accuracy among cases is required to be not less than 0.4. On the other hand, the highest test accuracy of sIB for diagnosing a disease among cases can reach 0.748644, 0.789916, 0.705701 and 0.749436, respectively, in the four studies if the test accuracy among controls is required to be at least 0.4. 
A further genome-wide association study through chi-square tests shows that no significant SNPs are detected at the cut-off level 9.09451E-08 in the Framingham heart study of CVD. Study results in WTCCC detect only two significant SNPs associated with CAD. In the genome-wide study of psoriasis, most of the top 20 SNP markers with impressive classification accuracy are also significantly associated with the disease through the chi-square test at the cut-off value 1.11E-07. Although our classification methods can achieve high accuracy in the study, complete descriptions of those classification results (95% confidence intervals or statistical tests of differences) require more cost-effective methods or a more efficient computing system, neither of which is currently available for our genome-wide study. We should also note that the purpose of this study is to identify subsets of SNPs with high prediction ability; SNPs with good discriminant power are not necessarily causal markers for the disease.
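
The HMSS criterion discussed in the abstract above is the harmonic mean of sensitivity and specificity. The sketch below computes it from confusion-matrix counts and contrasts it with plain accuracy on an imbalanced sample. The counts are hypothetical, and the function names are assumptions rather than the author's code.

```python
def hmss(tp, fn, tn, fp):
    """Harmonic mean of sensitivity and specificity.

    Assumes at least one case and one control, so neither denominator is zero.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    if sensitivity + specificity == 0:
        return 0.0
    return 2 * sensitivity * specificity / (sensitivity + specificity)

def accuracy(tp, fn, tn, fp):
    """Overall fraction of correct classifications."""
    return (tp + tn) / (tp + fn + tn + fp)

# HYPOTHETICAL imbalanced sample: 10 cases, 90 controls.
# A classifier that labels everyone a control looks accurate but detects no cases.
all_control = dict(tp=0, fn=10, tn=90, fp=0)
balanced = dict(tp=7, fn=3, tn=63, fp=27)

for name, counts in (("all_control", all_control), ("balanced", balanced)):
    print(name, "accuracy:", accuracy(**counts), "HMSS:", round(hmss(**counts), 3))
```

Because HMSS collapses to zero whenever either rate is zero, it penalizes a classifier that ignores the minority class, which is the property the abstract relies on for imbalanced data.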

Relevance:

80.00%

Publisher:

Abstract:

Using a retrospective cross-sectional approach, this study quantitatively analyzed foodborne illness data, restaurant inspection data, and census-derived socioeconomic and demographic data within Harris County, Texas, between 2005 and 2010. The main research question was the extent to which contextual and regulatory conditions distinguish outbreak from non-outbreak establishments within Harris County. Two groups of Harris County establishments were analyzed: outbreak and non-outbreak restaurants. STATA 11 was employed to determine the average profiles of each category across both the regulatory and socioeconomic (contextual) variables. Cross tabulations of all of the non-quantitative variables were also performed, and finally, a discriminant analysis was conducted to assess how well the variables were able to allocate the restaurants into their respective categories. Contextual and regulatory conditions were found to be minimally associated with the occurrence of foodborne outbreaks within Harris County. Across both categories (outbreak and non-outbreak establishments), the variables included were extremely similar in means and, where it was possible to observe them, in distributions. The variables analyzed in this study, both regulatory and contextual, were not found to significantly allocate the establishments into their correct outbreak or non-outbreak categories. The implication of these findings is that the regulatory processes and guidelines in place in Harris County do not effectively distinguish outbreak from non-outbreak restaurants. Additionally, no socioeconomic or racial/ethnic patterns are apparent in the incidence of foodborne disease in the county.

Relevance:

30.00%

Publisher:

Abstract:

The considerable search for synergistic agents in cancer research is motivated by the therapeutic benefits achieved by combining anti-cancer agents. Synergistic agents make it possible to reduce dosage while maintaining or enhancing a desired effect. Other favorable outcomes of synergistic agents include reduced toxicity and minimized or delayed drug resistance. Dose-response assessment and drug-drug interaction analysis play an important part in the drug discovery process; however, the analyses are often poorly done. This dissertation is an effort to notably improve dose-response assessment and drug-drug interaction analysis. The most commonly used method in published analyses is the Median-Effect Principle/Combination Index method (Chou and Talalay, 1984). The Median-Effect Principle/Combination Index method leads to inefficiency by ignoring important sources of variation inherent in dose-response data and discarding data points that do not fit the Median-Effect Principle. Previous work has shown that the conventional method yields a high rate of false positives (Boik, Boik, Newman, 2008; Hennessey, Rosner, Bast, Chen, 2010) and, in some cases, low power to detect synergy. There is a great need for improving the current methodology. We developed a Bayesian framework for dose-response modeling and drug-drug interaction analysis. First, we developed a hierarchical meta-regression dose-response model that accounts for various sources of variation and uncertainty and allows one to incorporate knowledge from prior studies into the current analysis, thus offering more efficient and reliable inference. Second, for cases in which parametric dose-response models do not fit the data, we developed a practical and flexible nonparametric regression method for meta-analysis of independently repeated dose-response experiments. 
Third, and lastly, we developed a method, based on Loewe additivity, that allows one to quantitatively assess interaction between two agents combined at a fixed dose ratio. The proposed method gives a comprehensive and honest account of uncertainty within drug interaction assessment. Extensive simulation studies show that the novel methodology improves the screening process for effective/synergistic agents and reduces the incidence of type I error. We consider a study that investigates the combined effect of DNA methylation inhibitors and histone deacetylation inhibitors in human ovarian cancer cell lines. The hypothesis is that the combination of DNA methylation inhibitors and histone deacetylation inhibitors will enhance antiproliferative activity in human ovarian cancer cell lines compared to treatment with each inhibitor alone. By applying the proposed Bayesian methodology, in vitro synergy was declared for the DNA methylation inhibitor 5-AZA-2'-deoxycytidine combined with one histone deacetylation inhibitor, either suberoylanilide hydroxamic acid or trichostatin A, in the cell lines HEY and SKOV3. This suggests potential new epigenetic therapies for cell growth inhibition in ovarian cancer.
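
For orientation, the Loewe additivity criterion mentioned in the abstract above is commonly written as an interaction index, d1/D1 + d2/D2, where D1 and D2 are the single-agent doses that produce the same effect as the combination (d1, d2). The sketch below computes a point estimate of that index using the median-effect equation for the single-agent curves. It is the conventional deterministic calculation, not the dissertation's Bayesian method, and it carries no uncertainty estimate; all parameter values and function names are hypothetical.

```python
def dose_for_effect(fa, dm, m):
    """Invert the median-effect equation fa / (1 - fa) = (D / Dm) ** m.

    fa: fraction affected (0 < fa < 1), dm: median-effect dose, m: slope.
    """
    return dm * (fa / (1.0 - fa)) ** (1.0 / m)

def interaction_index(d1, d2, fa, dm1, m1, dm2, m2):
    """Loewe interaction index for a combination (d1, d2) producing effect fa.

    Values below 1 suggest synergy, 1 additivity, above 1 antagonism.
    """
    return d1 / dose_for_effect(fa, dm1, m1) + d2 / dose_for_effect(fa, dm2, m2)

# HYPOTHETICAL single-agent parameters and combination doses.
index = interaction_index(d1=0.5, d2=1.0, fa=0.5, dm1=2.0, m1=1.0, dm2=4.0, m2=1.0)
print("interaction index:", index)
```

A point estimate like this can fall below 1 by chance alone, which is one reason the abstract argues for modeling the uncertainty explicitly.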

Relevance:

30.00%

Publisher:

Abstract:

A non-parametric method was developed and tested to compare the partial areas under two correlated Receiver Operating Characteristic (ROC) curves. Based on the theory of generalized U-statistics, mathematical formulas were derived for computing the ROC area and the variance and covariance between the portions of two ROC curves. A practical SAS application was also developed to facilitate the calculations. The accuracy of the non-parametric method was evaluated by comparing it to other methods. When we applied our method to the data from a published ROC analysis of CT images, our results were very close to the published ones. A hypothetical example was used to demonstrate the effects of two crossed ROC curves. The two ROC areas are the same; however, each portion of the area between the two ROC curves was found to be significantly different by the partial ROC curve analysis. To test the computation of large-scale ROC curves, such as those from a logistic regression model, we applied our method to a breast cancer study with Medicare claims data. It yielded the same ROC area as the SAS Logistic procedure. Our method also provides an alternative to the global summary of ROC area comparison by directly comparing the true-positive rates for two regression models and by determining the range of false-positive values where the models differ.
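
The link between the ROC area and U-statistics that the abstract above relies on can be shown with a small sketch: the full area under the empirical ROC curve equals the proportion of (diseased, non-diseased) pairs that the test score orders correctly, with ties counted as one half. This covers only the full area. The partial-area, variance and covariance formulas from the dissertation are not reproduced here, and the scores are hypothetical.

```python
def auc_mann_whitney(pos, neg):
    """Nonparametric ROC area as a two-sample U-statistic.

    pos: scores of diseased subjects, neg: scores of non-diseased subjects.
    Each pair contributes 1 if the diseased score is higher, 0.5 if tied.
    """
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# HYPOTHETICAL test scores.
diseased = [0.9, 0.8, 0.6, 0.55]
healthy = [0.7, 0.5, 0.4, 0.3]
print("ROC area:", auc_mann_whitney(diseased, healthy))
```

The double loop is quadratic in sample size; for large claims data sets a rank-based formulation would be the usual choice.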

Relevance:

30.00%

Publisher:

Abstract:

Improvements in the analysis of microarray images are critical for accurately quantifying gene expression levels. The acquisition of accurate spot intensities directly influences the results and interpretation of statistical analyses. This dissertation discusses the implementation of a novel approach to the analysis of cDNA microarray images. We use a stellar photometric model, the Moffat function, to quantify microarray spots from nylon microarray images. The inherent flexibility of the Moffat shape model makes it ideal for quantifying microarray spots. We apply our novel approach to a Wilms' tumor microarray study and compare our results with a fixed-circle segmentation approach for spot quantification. Our results suggest that different spot feature extraction methods can have an impact on the ability of statistical methods to identify differentially expressed genes. We also used the Moffat function to simulate a series of microarray images under various experimental conditions. These simulations were used to validate the performance of various statistical methods for identifying differentially expressed genes. Our simulation results indicate that tests taking into account the dependency between mean spot intensity and variance estimation, such as the smoothened t-test, can better identify differentially expressed genes, especially when the number of replicates and mean fold change are low. The analysis of the simulations also showed that overall, a rank sum test (Mann-Whitney) performed well at identifying differentially expressed genes. Previous work has suggested the strengths of nonparametric approaches for identifying differentially expressed genes. We also show that multivariate approaches, such as hierarchical and k-means cluster analysis along with principal components analysis, are only effective at classifying samples when replicate numbers and mean fold change are high. 
Finally, we show how our stellar shape model approach can be extended to the analysis of 2D-gel images by adapting the Moffat function to take into account the elliptical nature of spots in such images. Our results indicate that stellar shape models offer a previously unexplored approach for the quantification of 2D-gel spots.
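
As a point of reference for the abstract above, the Moffat profile from stellar photometry is usually written I(r) = I0 * (1 + (r / alpha)^2)^(-beta). The sketch below evaluates that standard form, with separate x and y widths as one simple way to allow elliptical spots. The parameterization, the axis-aligned ellipse and all numeric values are assumptions, since the abstract does not give the dissertation's exact model.

```python
def moffat(x, y, x0, y0, amplitude, alpha_x, alpha_y, beta, background=0.0):
    """Moffat intensity at pixel (x, y) for a spot centered at (x0, y0).

    alpha_x and alpha_y set the width along each axis (equal values give a
    circular spot); beta controls how quickly the wings fall off.
    """
    u = ((x - x0) / alpha_x) ** 2 + ((y - y0) / alpha_y) ** 2
    return background + amplitude * (1.0 + u) ** (-beta)

# HYPOTHETICAL spot: peak intensity 100, width 3 pixels, beta = 2.
center = moffat(0, 0, 0, 0, amplitude=100.0, alpha_x=3.0, alpha_y=3.0, beta=2.0)
one_width_out = moffat(3, 0, 0, 0, amplitude=100.0, alpha_x=3.0, alpha_y=3.0, beta=2.0)
print("center:", center, "at r = alpha:", one_width_out)
```

Fitting the parameters to an observed spot would require a nonlinear least-squares step, which is outside the scope of this sketch.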

Relevance:

30.00%

Publisher:

Abstract:

Background. Screening for colorectal cancer (CRC) is considered cost-effective, but screening compliance in the US remains low. There have been very few economic analyses of screening promotion strategies for colorectal cancer. The main aim of the current study is to conduct a cost-effectiveness analysis (CEA), and to examine the uncertainty in its results, of a tailored intervention to promote screening for CRC among patients of a multispecialty clinic in Houston, TX. Methods. The two intervention arms received a PC-based tailored program and web-based educational information to promote CRC screening. The incremental cost of implementing the tailored PC-based program was compared with the website-based education and with the status quo of no intervention for each unit of effect after 12 months of delivering the intervention. Uncertainty analysis of the point estimates of cost and effect was conducted using nonparametric bootstrapping. Results. The cost of implementing the web-based educational intervention was $36.00 per person and the cost of the tailored PC-based interactive intervention was $43.00 per person. The additional cost per person screened for the web-based strategy was $2374, and the effect of the tailored intervention was negative.
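
Nonparametric bootstrapping of cost and effect, as named in the Methods above, can be sketched as follows: each arm is resampled with replacement, and the differences in mean cost and mean effect are recorded for every replicate. The per-person costs are the figures quoted in the abstract, while the screening outcomes, the sample sizes and the function name are hypothetical. The study's actual procedure may differ.

```python
import random
from statistics import mean

def bootstrap_increments(cost_a, effect_a, cost_b, effect_b, n_boot=1000, seed=0):
    """Resample each arm with replacement; return (delta_cost, delta_effect) pairs."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_boot):
        ia = [rng.randrange(len(cost_a)) for _ in cost_a]
        ib = [rng.randrange(len(cost_b)) for _ in cost_b]
        d_cost = mean([cost_a[i] for i in ia]) - mean([cost_b[i] for i in ib])
        d_effect = mean([effect_a[i] for i in ia]) - mean([effect_b[i] for i in ib])
        pairs.append((d_cost, d_effect))
    return pairs

# Per-person costs are from the abstract; screening outcomes are HYPOTHETICAL.
tailored_cost = [43.0] * 40
web_cost = [36.0] * 40
tailored_screened = [1] * 10 + [0] * 30   # 1 = screened within 12 months
web_screened = [1] * 12 + [0] * 28

pairs = bootstrap_increments(tailored_cost, tailored_screened, web_cost, web_screened)
share = sum(1 for _, d_effect in pairs if d_effect > 0) / len(pairs)
print(f"share of replicates where the tailored arm screens more people: {share:.2f}")
```

The sketch reports the replicate pairs rather than a ratio because an incremental cost-effectiveness ratio is unstable whenever the effect difference is close to zero or negative, as it was for the tailored intervention.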

Relevance:

30.00%

Publisher:

Abstract:

Introduction. Despite the ban of lead-containing gasoline and paint, childhood lead poisoning remains a public health issue. Furthermore, a Medicaid-eligible child is 8 times more likely to have an elevated blood lead level (EBLL) than a non-Medicaid child, which is the primary reason for the early detection lead screening mandate for ages 12 and 24 months among the Medicaid population. Based on field observations, there was evidence that suggested a screening compliance issue. Objective. The purpose of this study was to analyze blood lead screening compliance in previously lead poisoned Medicaid children and test for an association between timely lead screening and timely childhood immunizations. The mean months between follow-up tests were also examined for a significant difference between the non-compliant and compliant lead screened children. Methods. Access to the surveillance data of all childhood lead poisoned cases in Bexar County was granted by the San Antonio Metropolitan Health District. A database was constructed and analyzed using descriptive statistics, logistic regression methods and non-parametric tests. Lead screening at 12 months of age was analyzed separately from lead screening at 24 months. The small portion of the population who were also related were included in one analysis and removed from a second analysis to check for significance. Gender, ethnicity, age of home, and having a sibling with an EBLL were ruled out as confounders for the association tests but ethnicity and age of home were adjusted in the nonparametric tests. Results. There was a strong significant association between lead screening compliance at 12 months and childhood immunization compliance, with or without including related children (p<0.00). However, there was no significant association between the two variables at the age of 24 months. 
Furthermore, there was no significant difference between the median of the mean months between follow-up blood tests among the non-compliant and compliant lead screened populations in the 12-month screening group, but there was a significant difference in the 24-month screening group (p<0.01). Discussion. Descriptive statistics showed that 61% and 56% of the previously lead poisoned Medicaid population did not receive their 12- and 24-month mandated lead screening on time, respectively. This suggests that their elevated blood lead level could have been diagnosed earlier in their childhood. Furthermore, a child who is compliant with their lead screening at 12 months of age is 2.36 times more likely to also receive their childhood immunizations on time compared to a child who was not compliant with their 12-month screening. Even though no statistically significant association was found for the 24-month group, the public health significance of a screening compliance issue is no less important. The Texas Medicaid program needs to enforce lead screening compliance because it is evident that there has been no monitoring system in place. Further recommendations include an increased focus on parental education and on the importance of taking children for wellness exams on time.