67 results for Variable sample size

in DigitalCommons@The Texas Medical Center


Relevance:

100.00%

Publisher:

Abstract:

The purpose of this study is to investigate the effects of predictor variable correlations and patterns of missingness with dichotomous and/or continuous data in small samples when missing data is multiply imputed. Missing data of predictor variables is multiply imputed under three different multivariate models: the multivariate normal model for continuous data, the multinomial model for dichotomous data and the general location model for mixed dichotomous and continuous data. Subsequent to the multiple imputation process, Type I error rates of the regression coefficients obtained with logistic regression analysis are estimated under various conditions of correlation structure, sample size, type of data and patterns of missing data. The distributional properties of average mean, variance and correlations among the predictor variables are assessed after the multiple imputation process. For continuous predictor data under the multivariate normal model, Type I error rates are generally within the nominal values with samples of size n = 100. Smaller samples of size n = 50 resulted in more conservative estimates (i.e., lower than the nominal value). Correlation and variance estimates of the original data are retained after multiple imputation with less than 50% missing continuous predictor data. For dichotomous predictor data under the multinomial model, Type I error rates are generally conservative, which in part is due to the sparseness of the data. The correlation structure for the predictor variables is not well retained on multiply-imputed data from small samples with more than 50% missing data with this model. For mixed continuous and dichotomous predictor data, the results are similar to those found under the multivariate normal model for continuous data and under the multinomial model for dichotomous data. With all data types, a fully-observed variable included with variables subject to missingness in the multiple imputation process and subsequent statistical analysis provided liberal (larger than nominal values) Type I error rates under a specific pattern of missing data. It is suggested that future studies focus on the effects of multiple imputation in multivariate settings with more realistic data characteristics and a variety of multivariate analyses, assessing both Type I error and power.

Relevance:

100.00%

Publisher:

Abstract:

This thesis project is motivated by the potential problem of using observational data to draw inferences about a causal relationship in observational epidemiology research when controlled randomization is not applicable. The instrumental variable (IV) method is one of the statistical tools used to overcome this problem. Mendelian randomization studies use genetic variants as IVs in genetic association studies. In this thesis, the IV method, as well as standard logistic and linear regression models, is used to investigate the causal association between risk of pancreatic cancer and the circulating levels of soluble receptor for advanced glycation end-products (sRAGE). Higher levels of serum sRAGE were found to be associated with a lower risk of pancreatic cancer in a previous observational study (255 cases and 485 controls). However, such a novel association may be biased by unknown confounding factors. In a case-control study, we aimed to use the IV approach to confirm or refute this observation in a subset of study subjects for whom the genotyping data were available (178 cases and 177 controls). A two-stage IV method using generalized method of moments-structural mean models (GMM-SMM) was applied and the relative risk (RR) was calculated. In the first stage analysis, we found that the single nucleotide polymorphism (SNP) rs2070600 of the receptor for advanced glycation end-products (AGER) gene meets all three general assumptions for a genetic IV in examining the causal association between sRAGE and risk of pancreatic cancer. The variant allele of SNP rs2070600 of the AGER gene was associated with lower levels of sRAGE, and it was associated neither with risk of pancreatic cancer nor with the confounding factors. It was a potentially strong IV (F statistic = 29.2). However, in the second stage analysis, the GMM-SMM model failed to converge due to non-concavity, probably because of the small sample size. Therefore, the IV analysis could not support the causality of the association between serum sRAGE levels and risk of pancreatic cancer. Nevertheless, these analyses suggest that rs2070600 was a potentially good genetic IV for testing the causality between the risk of pancreatic cancer and sRAGE levels. A larger sample size is required to conduct a credible IV analysis.
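
The first-stage F statistic mentioned in this abstract measures how strongly the instrument predicts the exposure. The sketch below is only an illustration under stated assumptions: it uses a single instrument, ordinary least squares, and invented toy data (the names `first_stage_f`, `genotype`, and `biomarker` are hypothetical and are not taken from the thesis). A commonly cited rule of thumb treats F above roughly 10 as a sign that the instrument is not weak.

```python
def first_stage_f(instrument, exposure):
    """Return (slope, F) for a simple linear regression of exposure on one instrument."""
    n = len(instrument)
    mean_z = sum(instrument) / n
    mean_x = sum(exposure) / n
    s_zz = sum((z - mean_z) ** 2 for z in instrument)
    s_zx = sum((z - mean_z) * (x - mean_x) for z, x in zip(instrument, exposure))
    slope = s_zx / s_zz
    intercept = mean_x - slope * mean_z
    # Residual and total sums of squares
    sse = sum((x - (intercept + slope * z)) ** 2 for z, x in zip(instrument, exposure))
    sst = sum((x - mean_x) ** 2 for x in exposure)
    ssr = sst - sse
    # One regressor, so the numerator has 1 degree of freedom and the denominator n - 2
    f_stat = ssr / (sse / (n - 2))
    return slope, f_stat


# Toy data (assumption): carriers of the variant allele have lower biomarker levels.
genotype = [0, 0, 0, 1, 1, 1]
biomarker = [5.0, 6.0, 7.0, 2.0, 3.0, 4.0]
slope, f_stat = first_stage_f(genotype, biomarker)
print(slope, f_stat)
```

With real case-control data the first stage would normally adjust for covariates and use a dedicated statistics package; the point here is only how the F statistic is formed from the regression sums of squares.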

Relevance:

90.00%

Publisher:

Abstract:

The distribution of the number of heterozygous loci in two randomly chosen gametes or in a random diploid zygote provides information regarding the nonrandom association of alleles among different genetic loci. Two alternative statistics may be employed for detection of nonrandom association of genes of different loci when observations are made on these distributions: observed variance of the number of heterozygous loci (s_k^2) and a goodness-of-fit criterion (X^2) to contrast the observed distribution with that expected under the hypothesis of random association of genes. It is shown, by simulation, that s_k^2 is statistically more efficient than X^2 to detect a given extent of nonrandom association. Asymptotic normality of s_k^2 is justified, and X^2 is shown to follow a chi-square distribution with partial loss of degrees of freedom arising because of estimation of parameters from the marginal gene frequency data. Whenever direct evaluations of linkage disequilibrium values are possible, tests based on maximum likelihood estimators of linkage disequilibria require a smaller sample size (number of zygotes or gametes) to detect a given level of nonrandom association in comparison with that required if such tests are conducted on the basis of s_k^2. Summarization of multilocus genotype (or haplotype) data into the different number of heterozygous loci classes thus amounts to appreciable loss of information.
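
As a rough illustration of the variance statistic described in this abstract, the sketch below counts heterozygous loci per individual and compares the observed sample variance of that count with the value expected if loci were independent, taken here as the sum of h_j(1 - h_j) over loci, where h_j is the observed heterozygosity at locus j. The function name, the 0/1 input layout, and the toy data are assumptions for illustration; the article's own estimators and test are not reproduced.

```python
from statistics import mean, variance


def heterozygosity_variance(indicators):
    """indicators[i][j] is 1 if individual i is heterozygous at locus j, else 0.

    Returns (observed sample variance of the per-individual count,
             variance expected under independence of loci).
    """
    counts = [sum(row) for row in indicators]      # heterozygous loci per individual
    observed = variance(counts)                    # sample variance (n - 1 denominator)
    n_loci = len(indicators[0])
    per_locus = [mean([row[j] for row in indicators]) for j in range(n_loci)]
    expected = sum(h * (1 - h) for h in per_locus)
    return observed, expected


# Toy data (assumption): two loci that are always heterozygous together.
sample = [[1, 1], [0, 0], [1, 1], [0, 0]]
observed, expected = heterozygosity_variance(sample)
print(observed, expected)
```

An observed variance well above the independence value, as in this toy sample, is the kind of pattern that points to nonrandom association between loci.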

Relevance:

90.00%

Publisher:

Abstract:

Limited research has been conducted evaluating programs that are designed to improve the outcomes of homeless adults with mental disorders and comorbid alcohol, drug and mental disorders. This study conducted such an evaluation in a community-based day treatment setting with clients of the Harris County Mental Health and Mental Retardation Authority's Bristow Clinic. The study population included all clients who received treatment at the clinic for a minimum of six months between January 1, 1995 and August 31, 1996. An electronic database was used to identify clients and to track their program involvement. A profile of the study participants and their level of program involvement was developed; it included an examination of the amount of time spent in clinical, social and other interventions, the type of interventions encountered and the number of interventions encountered. Results were analyzed to determine whether social, demographic and mental history affected levels of program involvement and the effects of the levels of program involvement on housing status and psychiatric functioning status. A total of 101 clients met the inclusion criteria. Of the 101 clients, 96 had a mental disorder, and five had comorbidity. Due to the limited numbers of participants with comorbidity, only those with mental disorders were included in the analysis. The study found the Bristow Clinic population to be primarily single, Black, male, between the ages of 31 and 40 years, and with a gross family income of less than $4,000. There were more persons residing on the streets at entry and at six months following treatment than in any other residential setting. The most prevalent psychiatric diagnoses were depressive disorders and schizophrenia. The Global Assessment of Functioning (GAF) scale, which was used to determine the degree of psychiatric functioning, revealed a modal GAF score of 31 to 40 at entry and following six months in treatment. The study found that the majority of clients spent less than 17 hours in treatment, had fewer than 51 encounters and had clinical, social, and other encounters. In regard to social and demographic factors and levels of program involvement, there were statistically significant associations between gender and ethnicity and the types of interventions encountered as well as the number of interventions encountered. There was also a statistically significant difference between the amount of time spent in clinical interventions and gender. Relative to outcomes measured, the study found female gender to be the only background variable that was significantly associated with improved housing status, and female gender and previous MHMRA involvement to be statistically associated with improvement in GAF score. The total time in other (not clinical or social) interventions and the total number of encounters with other interventions were also significantly associated with improvement in housing outcome. The analysis of previous services and levels of program involvement revealed significant associations between time spent in social and clinical interventions and previous hospitalizations and previous MHMRA involvement. Major limitations of this study include the small sample size, which may have resulted in very little power to detect differences, and the lack of generalizability of findings due to the site locations used in the study. Despite these limitations, the study makes an important contribution to the literature by documenting the levels of program involvement and the social and demographic factors necessary to produce outcomes of improved housing status and psychiatric functioning status.

Relevance:

90.00%

Publisher:

Abstract:

The objective of this cross-sectional pilot study was to understand the cultural and social influences associated with the participation and retention of Mexican American parents in research studies. Mexican American parents' participation is limited due to cultural barriers that researchers may not recognize. Successful recruitment and retention of participants is a critical element for prevention research, particularly for groups that are underrepresented and carry a high burden of disease (Dunika, Garza, Roosa, & Stoerzinger, 1997). The goal of this pilot study was to increase the understanding of research participation, recruitment and retention strategies among Mexican American adults using an instrument based on the Health Belief Model. This instrument was used to assess the cultural beliefs of Mexican American adults toward research participation. The dependent variable (research scenarios indexed by invasiveness) for each participant was compared to the independent variable (HBM scores) using chi-square analysis to see how the Health Belief Model constructs of perceived threat, perceived barriers, cues to action and perceived benefits are associated with how willing the participants are to participate in different risk levels of research. Descriptive statistics were used to assess the items on the instrument regarding acculturation, demographics, and sample size. This study expands on current knowledge of research participation and retention strategies and methods involving Mexican American parents. Using data from this study, researchers can observe relevant patterns from the participants' responses.

Relevance:

90.00%

Publisher:

Abstract:

In the United States, “binge” drinking among college students is an emerging public health concern due to the significant physical and psychological effects on young adults. The focus is on identifying interventions that can help decrease high-risk drinking behavior among this group of drinkers. One such intervention is motivational interviewing (MI), a client-centered therapy that aims at resolving client ambivalence by developing discrepancy and engaging the client in change talk. Of late, there is a growing interest in determining the active ingredients that influence the alliance between the therapist and the client. This study is a secondary analysis of the data obtained from the Southern Methodist Alcohol Research Trial (SMART) project, a dismantling trial of MI and feedback among heavy drinking college students. The present project examines the relationship between therapist and client language in MI sessions on a sample of “binge” drinking college students. Of the 126 SMART tapes, 30 tapes (‘MI with feedback’ group = 15, ‘MI only’ group = 15) were randomly selected for this study. MISC 2.1, a mutually exclusive and exhaustive coding system, was used to code the audio/videotaped MI sessions. Therapist and client language were analyzed for communication characteristics. Overall, therapists adopted an MI-consistent style and clients were found to engage in change talk. Counselor acceptance, empathy, spirit, and complex reflections were all significantly related to client change talk (p-values ranged from 0.001 to 0.047). Additionally, therapist ‘advice without permission’ and MI-inconsistent therapist behaviors were strongly correlated with client sustain talk (p-values ranged from 0.006 to 0.048). Simple linear regression models showed a significant correlation between MI-consistent (MICO) therapist language (independent variable) and change talk (dependent variable), and between MI-inconsistent (MIIN) therapist language (independent variable) and sustain talk (dependent variable). The study has several limitations, such as small sample size, self-selection bias, poor inter-rater reliability for the global scales and the lack of a temporal measure of therapist and client language. Future studies might consider a larger sample size to obtain more statistical power. In addition, the correlation between therapist language, client language and drinking outcome needs to be explored.

Relevance:

90.00%

Publisher:

Abstract:

Background. Each year thousands of people participate in mass health screenings for diabetes and hypertension, but little is known about whether or not those who receive higher than normal screening results obtain the recommended follow-up medical care, or what barriers they perceive to doing so. Methods. Study participants were recruited from attendees at three health fairs in low-income neighborhoods in Houston, Texas. Potential participants had higher than normal blood pressure (> 140/90 mmHg) or blood glucose readings (100 mg/dL fasting or 140 mg/dL random). Study participants were called at one, two, and three months and asked if they had obtained follow-up medical care; those who had not yet obtained follow-up care were asked to identify barriers. Using a modified Aday-Andersen model of health service access, the independent variables were individual and community characteristics and self-perceived need. The dependent variable was obtaining follow-up care, with barriers to care a secondary outcome. Results. Eighty-two study participants completed the initial questionnaire and 59 participants completed the study protocol. Forty-eight participants (59% under an intent to treat analysis, 81% of those completing the study protocol) obtained follow-up care. Those who completed the initial questionnaire and who reported a regular source of care were significantly more likely to obtain follow-up care. For those who completed the study protocol, the relationship between having a regular source of care and obtaining follow-up care approached but did not reach significance. For those who completed the initial questionnaire, self-described health status, when examined as a binary variable (good, very good, excellent, or poor, fair, not sure), was associated with obtaining follow-up care for those who rated their health as poor, fair, or not sure. While the group who completed the study protocol did not reach statistical significance, the same relationship between self-described health status of poor, fair, or not sure and obtaining follow-up care was present. The participants who completed the study protocol and described their blood pressure as OK or a little high were statistically more likely to get follow-up care than those who described it as high or very high. All those on oral medications for hypertension (12/12) and diabetes (4/4) who were told to obtain follow-up care did so; however, the small sample size allows this correlation to be of statistical significance only for those treating hypertension. The variables significantly associated with obtaining follow-up care were having a regular source of care, self-described health status of poor, fair, or not sure, self-described blood pressure of OK or a little high, and taking medication for blood pressure. At the follow-up telephone calls, 34 participants identified barriers to care; cost was a significant barrier reported by 16 participants, and 10 reported that they didn’t have time because they were working long hours after Hurricane Ike. The study included the offer of access assistance: information about nearby safety-net providers, a visit to or information from the Health Information Center at their Neighborhood Center location, or information from Project Safety Net (a searchable web site for safety net providers). Access assistance was offered at the health fairs and then again at follow-up telephone calls to those who had not yet obtained follow-up care. Of the 48 participants who reported obtaining follow-up care, 26 said they had made use of the access assistance to do so. The use of access assistance was associated with being Hispanic, not having health insurance or a regular source of care, and speaking Spanish. It was also associated with being worried about blood glucose. Conclusion. Access assistance, as a community enabling characteristic, may be useful in aiding low-income people in obtaining medical care.
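
The two follow-up percentages quoted in this abstract can be reproduced from the reported counts. The short sketch below assumes that the intent-to-treat figure counts everyone who completed the initial questionnaire in the denominator, which is what the reported 59% implies; the variable names are illustrative only.

```python
enrolled = 82       # completed the initial questionnaire
completed = 59      # completed the study protocol
followed_up = 48    # reported obtaining follow-up care

# Intent-to-treat: every enrolled participant stays in the denominator
itt_rate = followed_up / enrolled
# Completers only: denominator restricted to those who finished the protocol
completer_rate = followed_up / completed

print(f"intent to treat: {itt_rate:.1%}")
print(f"completers only: {completer_rate:.1%}")
```

The gap between the two rates shows how much the 23 participants who did not complete the protocol influence the conclusion.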

Relevance:

90.00%

Publisher:

Abstract:

Objective. To evaluate the host risk factors associated with rifamycin-resistant Clostridium difficile (C. diff) infection in hospitalized patients compared to rifamycin-susceptible C. diff infection. Background. C. diff is the most common definable cause of nosocomial diarrhea affecting elderly hospitalized patients taking antibiotics for prolonged durations. The epidemiology of Clostridium difficile associated disease is now changing with the reports of a new hypervirulent strain causing hospital outbreaks. This new strain is associated with increased disease severity and mortality. The conventional therapy for C. diff includes metronidazole and vancomycin, but high recurrence rates and treatment failures are now becoming a major concern. Rifamycin antibiotics are being developed as a new therapeutic option to treat C. diff infection after their efficacy was established in a few in vivo and in vitro studies. There are some recent studies that report an association between the hypervirulent strain and emerging rifamycin resistance. These findings point to the need for clinical studies to better understand the efficacy of rifamycin drugs against C. diff. Methods. This is a hospital-based, matched case-control study using de-identified data drawn from two prospective cohort studies involving C. diff patients at St Luke's Hospital. The C. diff isolates from these patients are screened for rifamycin resistance using agar dilution methods for minimum inhibitory concentrations (MIC) as part of Dr Zhi-Dong Jiang's study. Twenty-four rifamycin-resistant C. diff cases were identified, and each was matched with one rifamycin-susceptible C. diff control on the basis of ± 10 years of age and hospitalization 30 days before or after the case. De-identified data for the 48 subjects were obtained from Dr Kevin Garey's clinical study at St Luke's Hospital enrolling C. diff patients. The data were reviewed to gather information about host risk factors, outcome variables and relevant clinical characteristics. Results. Medical diagnosis at the time of admission (p = 0.0281) and history of chemotherapy (p = 0.022) were identified as significant risk factors, while hospital stay ranging from 1 week to 1 month and artificial feeding were identified as important outcome variables (p = 0.072 and p = 0.081, respectively). Horn's Index assessing the severity of underlying illness and duration of antibiotics for cases and controls showed no significant difference. Conclusion. The study was a small project designed to identify host risk factors and understand the clinical implications of rifamycin resistance. The study was underpowered and a larger sample size is needed to validate the results.

Relevance:

90.00%

Publisher:

Abstract:

Previous research has shown an association between mental health status and cigarette smoking. This study examined four specific mental health predictors and the outcome variable "any smoking", defined as smoking one or more cigarettes in the past 30 days. The population included active duty military members serving in the United States Army, Air Force, Navy and Marine Corps. The data were collected during the 2005 Department of Defense Survey of Health Related Behaviors Among Active Duty Military Personnel, a component of the Defense Lifestyle Assessment Program. The sample included 13,603 subjects. This cross-sectional prevalence study consisted of descriptive statistics, univariate analysis, and multivariate logistic regression analysis of the four mental health predictors and the "any smoking" outcome variable. Multivariate adjustment showed an association between the four mental health predictors and any smoking. This association is consistent with previous literature and can help guide public health officials in the development of smoking prevention and cessation programs.

Relevance:

90.00%

Publisher:

Abstract:

Introduction. Patient safety culture is the integration of interrelated practices that, once developed, is supported by both the culture and leadership of the organization (Sagan, 1993). The purpose of this study is to describe and examine the relationship between surgical residents’ perception of their leadership and the resulting organizational safety culture within their clinical setting. This assessment is important to understanding the extent to which leadership style affects the perception of the safety culture. Methods. A secondary dataset was used which included data from 68 surgical residents from two survey instruments, the Organizational Description Questionnaire (ODQ) and the Patient Safety Climate In Healthcare Organizations (PSCHO) Survey. Multiple regressions, followed by hierarchical regressions with the introduction of the Post Graduate Year (PGY) variable, examined the association between the leadership styles (Transactional and Transformational) and the organizational safety culture variables (Overall Emphasis on Safety, Senior management engagement, and Organizational resources for safety). Independent t-tests were conducted to assess whether males and females differ on the organizational safety culture variables and either leadership style. Results. The surgical residents perceived their organizational leadership to place greater emphasis on a transformational leadership culture style than on a transactional leadership culture style. The only significant association found was between Transformational leadership and Organizational resources for safety. PGY had no significant effect on the leadership or the safety culture perceived. No significant difference was found between females and males in regard to the safety culture or the leadership style. Discussion. These results support the premise of the study, which is that surgical residents perceive their existing leadership and organizational culture to be more transformational than transactional in nature. A significant association was found between the leadership perceived and one of the safety culture variables, Organizational resources for safety. The foundation for this association lies in the fact that surgical residents are personnel who are themselves part of the organizational resources. Although PGY differentiation did not seem to make a difference in the leadership perceived, this could be attributed to the small sample size. No gender differences were found, which supports the assumption that within a highly specialized group such as surgical residents there are no gender differences, since the highly specialized field draws a certain type of person with distinct characteristics. In future research, these survey tools can be used to gauge the survey audiences’ perception, and safety interventions can be developed based on the results.

Relevance:

90.00%

Publisher:

Abstract:

Objectives. This paper seeks to assess the effect on statistical power of regression model misspecification in a variety of situations. Methods and results. The effect of misspecification in regression can be approximated by evaluating the correlation between the correct specification and the misspecification of the outcome variable (Harris 2010). In this paper, three misspecified models (linear, categorical and fractional polynomial) were considered. In the first section, the mathematical method of calculating the correlation between correct and misspecified models with simple mathematical forms was derived and demonstrated. In the second section, data from the National Health and Nutrition Examination Survey (NHANES 2007-2008) were used to examine such correlations. Our study shows that, compared with linear or categorical models, the fractional polynomial models, with higher correlations, provided a better approximation of the true relationship, which was illustrated by LOESS regression. In the third section, we present the results of simulation studies that demonstrate that, overall, misspecification in regression can produce marked decreases in power with small sample sizes. However, the categorical model had the greatest power, ranging from 0.877 to 0.936 depending on sample size and outcome variable used. The power of the fractional polynomial model was close to that of the linear model, which ranged from 0.69 to 0.83, and appeared to be affected by the increased degrees of freedom of this model. Conclusion. Correlations between alternative model specifications can be used to provide a good approximation of the effect on statistical power of misspecification when the sample size is large. When model specifications have known simple mathematical forms, such correlations can be calculated mathematically. Actual public health data from NHANES 2007-2008 were used as examples to demonstrate the situations with unknown or complex correct model specification. Simulation of power for misspecified models confirmed the results based on correlation methods but also illustrated the effect of model degrees of freedom on power.
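
To make the correlation idea in this abstract concrete, the sketch below computes the Pearson correlation between a "correct" specification and a linear misspecification of the same predictor. Everything specific here is an assumption chosen for illustration: the true form is taken to be x squared, the predictor is spread evenly over the interval from 0 to 1, and the helper name `pearson` is hypothetical. The paper's own derivations and NHANES examples are not reproduced.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


# Assumed setup: predictor on an even grid over [0, 1]
xs = [i / 1000 for i in range(1001)]
correct = [x ** 2 for x in xs]     # assumed true functional form
linear = xs                        # misspecified (linear) form

r = pearson(correct, linear)
print(round(r, 3))
```

A correlation close to 1 means the linear term captures most of the signal carried by the true form, so little power would be lost; a lower correlation signals a larger expected loss.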

Relevance:

90.00%

Publisher:

Abstract:

This study was designed to identify some of the factors related to patterns of physician visits to nursing home residents. The relationship of ten resident and organizational characteristics to patterns of physician visits was investigated through secondary analysis of data abstracted from the 1973-74 National Nursing Home Survey of the National Center for Health Statistics. The study sample was composed of 11,135 of the 19,013 nursing home residents who participated in the survey. The analytic results revealed that all ten variables had a statistically significant relationship to patterns of physician visits, mainly due to the large sample size. The degrees of association between the variables, measured by the Cramér's V statistic, ranged from moderate to very weak. Certification status of the nursing home under Medicare and/or Medicaid was shown to be most strongly related to patterns of physician visits, followed by primary source of payment for nursing home care, and residence prior to nursing home admission. Several variables thought to be related to patterns of physician visits were found to have a very weak relationship: age of the resident, marital status, length of stay, primary diagnosis, number of chronic conditions, activities of daily living status, and levels of care. In order to get a more precise picture of the relative influence of certification status and primary source of payment when the other variables were statistically controlled, these two variables were combined into a single variable. The results revealed that the combined effects of certification status and primary source of payment were sustained, regardless of differences in the residents' personal, utilization, and health status characteristics, and the levels of care that they received. The results also indicated that the five groups created by combining the two variables differed in patterns of physician visits. For example, private pay residents in intermediate care facilities (ICFs) and non-certified facilities were more likely to receive unscheduled visits than private pay residents in skilled nursing homes (SNHs), residents in SNHs supported by Medicare or Medicaid, and residents in ICFs supported by Medicaid.
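
This abstract notes that statistical significance was driven mainly by the large sample, while Cramér's V showed associations that were moderate at best. The sketch below illustrates that contrast with invented 2 x 2 tables: multiplying every cell by 100 multiplies the chi-square statistic by 100 but leaves Cramér's V unchanged. The function names and the tables are assumptions for illustration, not data from the survey.

```python
from math import sqrt


def chi_square(table):
    """Pearson chi-square statistic and total count for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, n


def cramers_v(table):
    """Cramér's V: chi-square scaled by sample size and table dimensions."""
    stat, n = chi_square(table)
    k = min(len(table), len(table[0]))
    return sqrt(stat / (n * (k - 1)))


small = [[10, 20], [20, 10]]              # toy table (assumption)
large = [[1000, 2000], [2000, 1000]]      # same proportions, 100 times the sample

print(chi_square(small)[0], cramers_v(small))
print(chi_square(large)[0], cramers_v(large))
```

Because V is scaled by the sample size, it describes the strength of an association rather than how easily it can be detected, which is why it is a useful companion to a significance test in very large surveys.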

Relevance:

90.00%

Publisher:

Abstract:

Complex diseases, such as cancer, are caused by various genetic and environmental factors, and their interactions. Joint analysis of these factors and their interactions would increase the power to detect risk factors but is statistically challenging. Bayesian generalized linear models using Student-t prior distributions on coefficients are a novel method to simultaneously analyze genetic factors, environmental factors, and interactions. I performed simulation studies using three different disease models and demonstrated that the variable selection performance of Bayesian generalized linear models is comparable to that of Bayesian stochastic search variable selection, an improved method for variable selection when compared to standard methods. I further evaluated the variable selection performance of Bayesian generalized linear models using different numbers of candidate covariates and different sample sizes, and provided a guideline for the sample size required to achieve a high power of variable selection using Bayesian generalized linear models, considering different scales of number of candidate covariates. Polymorphisms in folate metabolism genes and nutritional factors have been previously associated with lung cancer risk. In this study, I simultaneously analyzed 115 tag SNPs in folate metabolism genes, 14 nutritional factors, and all possible genetic-nutritional interactions from 1239 lung cancer cases and 1692 controls using Bayesian generalized linear models stratified by never, former, and current smoking status. SNPs in MTRR were significantly associated with lung cancer risk across never, former, and current smokers. In never smokers, three SNPs in TYMS and three gene-nutrient interactions, including an interaction between SHMT1 and vitamin B12, an interaction between MTRR and total fat intake, and an interaction between MTR and alcohol use, were also identified as associated with lung cancer risk. These lung cancer risk factors are worthy of further investigation.

Relevance:

90.00%

Publisher:

Abstract:

BACKGROUND: Weight has been implicated as a risk factor for symptomatic community-acquired methicillin-resistant Staphylococcus aureus (CA-MRSA). Information from Texas Children's Hospital (TCH) in Houston, TX was used to implement a case-control study to assess weight-for-age percentile (WFA), race and seasonal exposure as risk factors. METHODS: A retrospective chart review to collect data from TCH was conducted covering the time period January 1st, 2008 to May 31st, 2011. Cases were confirmed and identified by the infectious disease department and were matched on a 1:1 ratio to controls that were seen by the emergency department for non-infected fractures from June 1st, 2008 to May 31st, 2011. Data abstraction was performed using TCH's electronic medical records (EMR) system (EPIC ®). RESULTS: Of 702 CA-MRSA identified cases, ages 9 to 16.99, 564 (80.3%) had the variable `weight' present in their EMR, were not duplicates and were not determined to be outliers. Cases were randomly matched to a pool of available controls (n=1864) according to age and gender, yielding 539 1:1 matched pairs (95.5% case matching success) with a total study sample size, N=1078. Case median age was 13.38 years, with the majority being White (66.05%) and male (59.4%). Adjusted conditional logistic regression analysis of the matched pairs identified the following risk factors for presenting with CA-MRSA infection among pediatric patients, ages 9 to 16.99 years: a) individual weight in the highest (75th-99.9th) WFA quartile (OR=1.36; 95% confidence interval [CI]=1.06-1.74; P=0.016), b) infection during summer months (OR=1.69; 95% CI=1.2-2.38; P=0.003), c) patients of African American race/ethnicity (OR=1.48; 95% CI=1.13-1.95; P=0.004). CONCLUSIONS: Pediatric patients, 9 to 16.99 years of age, in the highest WFA quartile (75th-99.9th), or of African American race had an associated increased risk of presenting with CA-MRSA infection. Furthermore, children in this population were at a higher risk of contracting CA-MRSA infection during the summer season.

Relevance:

90.00%

Publisher:

Abstract:

The purpose of this research is to develop a new statistical method to determine the minimum set of rows (R) in an R x C contingency table of discrete data that explains the dependence of observations. The statistical power of the method will be empirically determined by computer simulation to judge its efficiency over the presently existing methods. The method will be applied to data on DNA fragment length variation at six VNTR loci in over 72 populations from five major racial groups of humans (total sample size is over 15,000 individuals; each sample having at least 50 individuals). DNA fragment lengths grouped in bins will form the basis of studying inter-population DNA variation within the racial groups and, if that variation is significant, will provide a rigorous re-binning procedure for forensic computation of DNA profile frequencies that takes into account intra-racial DNA variation among populations.