21 results for Receiver-operating Characteristics
in DigitalCommons@The Texas Medical Center
Abstract:
A non-parametric method was developed and tested to compare the partial areas under two correlated Receiver Operating Characteristic (ROC) curves. Based on the theory of generalized U-statistics, mathematical formulas were derived for computing the ROC area and the variance and covariance between portions of two ROC curves. A practical SAS application was also developed to facilitate the calculations. The accuracy of the non-parametric method was evaluated by comparing it to other methods. Applying our method to the data from a published ROC analysis of CT images gave results very close to the published ones. A hypothetical example was used to demonstrate the effects of two crossed ROC curves: the two total ROC areas are the same, yet each portion of the area between the two ROC curves was found to be significantly different by the partial ROC curve analysis. For large-scale ROC computation, such as with a logistic regression model, we applied our method to a breast cancer study with Medicare claims data. It yielded the same ROC area as the SAS Logistic procedure. Our method also provides an alternative to the global summary of ROC area comparison by directly comparing the true-positive rates of two regression models and by determining the range of false-positive values where the models differ.
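As a rough illustration of the U-statistic idea behind the abstract above, the sketch below computes the full (not partial) ROC area as the Mann-Whitney statistic: the fraction of (non-diseased, diseased) score pairs in which the diseased case scores higher, with ties counted as one half. The function name and the toy scores are hypothetical, this is not the authors' SAS application, and the partial-area variance and covariance formulas are not reproduced here.

```python
def auc_mann_whitney(neg_scores, pos_scores):
    """Non-parametric ROC area as a two-sample U-statistic.

    neg_scores: test scores of non-diseased cases (hypothetical input).
    pos_scores: test scores of diseased cases (hypothetical input).
    """
    if not neg_scores or not pos_scores:
        raise ValueError("both groups must contain at least one score")
    total = 0.0
    for x in neg_scores:
        for y in pos_scores:
            if y > x:
                total += 1.0   # diseased case correctly ranked higher
            elif y == x:
                total += 0.5   # ties contribute one half
    return total / (len(neg_scores) * len(pos_scores))


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    print(auc_mann_whitney([1, 2, 3], [2, 3, 4]))
```

The double loop is quadratic in the sample size, which is acceptable for a sketch; a rank-based formulation would be preferable for large claims data sets.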
Abstract:
Critically ill and injured patients require pain relief and sedation to reduce the body's stress response and to facilitate painful diagnostic and therapeutic procedures. Presently, the level of sedation and analgesia is guided by clinical scores, which can be unreliable. There is, therefore, a need for an objective measure of sedation and analgesia. The Bispectral Index (BIS) and Patient State Index (PSI) were recently introduced into clinical practice as objective measures of the depth of analgesia and sedation. Aim. To compare the different measures of sedation and analgesia (BIS and PSI) to the standard and commonly used modified Ramsay Score (MRS) and determine if the monitors can be used interchangeably. Methods. MRS, BIS and PSI values were obtained in 50 postoperative cardiac surgery patients requiring analgesia and sedation from June to December 2004. The MRS, BIS and PSI values were assessed hourly for up to 6 h by a single observer. The relationship between BIS and PSI values was explored using scatter plots, and the correlation between MRS, BIS and PSI was determined using Spearman's correlation coefficient. Intra-class correlation (ICC) was used to determine the inter-rater reliability of MRS, BIS and PSI. Kappa statistics were used to further evaluate the agreement between BIS and PSI at light, moderate and deep levels of sedation. Results. There was a positive correlation between BIS and PSI values (Rho = 0.731, p<0.001). Intra-class correlation between BIS and PSI was 0.58, MRS and BIS 0.43 and MRS and PSI 0.27. Using Kappa statistics, agreement between MRS and BIS was 0.35 (95% CI: 0.27–0.43) and for MRS and PSI was 0.21 (95% CI: 0.15–0.28). The kappa statistic for BIS and PSI was 0.45 (95% CI: 0.37–0.52). Receiver operating characteristic (ROC) curves constructed to detect undersedation indicated an area under the curve (AUC) of 0.91 (95% CI = 0.87 to 0.94) for the BIS and 0.84 (95% CI = 0.79 to 0.88) for the PSI. 
For detection of oversedation, AUC for the BIS was 0.89 (95% CI = 0.84 to 0.92) and 0.80 (95% CI = 0.75 to 0.85) for the PSI. Conclusions. There is a statistically significant positive correlation between the BIS and PSI but poor correlation and poor test agreement between the MRS and BIS as well as MRS and PSI. Both the BIS and PSI demonstrated a high level of prediction for undersedation and oversedation; however, the BIS and PSI cannot be considered interchangeable monitors of sedation.
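The kappa agreement figures in the abstract above can be read against the standard definition of Cohen's kappa, (observed agreement − chance agreement) / (1 − chance agreement). The sketch below is a minimal, unweighted version for two sets of categorical ratings; the function name and the toy sedation levels are hypothetical, and the study may have used a different (for example weighted) variant.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equally long lists of category labels."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Proportion of cases on which the two measures agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from the marginal category frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Toy sedation levels, invented for illustration only.
    monitor_1 = ["light", "deep", "light", "deep"]
    monitor_2 = ["light", "deep", "deep", "light"]
    print(cohen_kappa(monitor_1, monitor_2))
```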
Abstract:
The main objective of this study was to determine the external validity of a clinical prediction rule developed by the European Multicenter Study on Human Spinal Cord Injury (EM-SCI) to predict ambulation outcomes 12 months after traumatic spinal cord injury. Data from the North American Clinical Trials Network (NACTN) data registry, with approximately 500 SCI cases, were used for this validity study. The predictive accuracy of the EM-SCI prognostic model was evaluated using calibration and discrimination based on 231 NACTN cases. The area under the receiver-operating-characteristics (ROC) curve (AUC) was 0.927 (95% CI 0.894 – 0.959) for the EM-SCI model when applied to the NACTN population. This is lower than the AUC of 0.956 (95% CI 0.936 – 0.976) reported for the EM-SCI population, but suggests that the EM-SCI clinical prediction rule distinguished well between those patients in the NACTN population who achieved independent ambulation and those who did not. The calibration curve suggests that the higher the prediction score, the higher the probability of walking, with the best prediction for AIS D patients. In conclusion, the EM-SCI clinical prediction rule was determined to be generalizable to the adult NACTN SCI population.
Abstract:
OBJECTIVE: We sought to evaluate the performance of the human papillomavirus high-risk DNA test in patients 30 years and older. MATERIALS AND METHODS: Screening (n=835) and diagnosis (n=518) groups were defined based on prior Papanicolaou smear results as part of a clinical trial for cervical cancer detection. We compared the Hybrid Capture II (HCII) test result with the worst histologic report. We used cervical intraepithelial neoplasia (CIN) 2/3 or worse as the reference of disease. We calculated sensitivities, specificities, positive and negative likelihood ratios (LR+ and LR-), receiver operating characteristic (ROC) curves, and areas under the ROC curves for the HCII test. We also considered alternative strategies, including Papanicolaou smear, a combination of Papanicolaou smear and the HCII test, a sequence of Papanicolaou smear followed by the HCII test, and a sequence of the HCII test followed by Papanicolaou smear. RESULTS: For the screening group, the sensitivity was 0.69 and the specificity was 0.93; the area under the ROC curve was 0.81. The LR+ and LR- were 10.24 and 0.34, respectively. For the diagnosis group, the sensitivity was 0.88 and the specificity was 0.78; the area under the ROC curve was 0.83. The LR+ and LR- were 4.06 and 0.14, respectively. Sequential testing showed little or no improvement over the combination testing. CONCLUSIONS: The HCII test in the screening group had a greater LR+ for the detection of CIN 2/3 or worse. HCII testing may be an additional screening tool for cervical cancer in women 30 years and older.
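The likelihood ratios in the abstract above follow the standard definitions LR+ = sensitivity / (1 − specificity) and LR− = (1 − sensitivity) / specificity. A minimal sketch, with a hypothetical function name, is:

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary diagnostic test."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must lie in [0, 1]")
    if not 0.0 < specificity < 1.0:
        raise ValueError("specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1.0 - specificity)   # how much a positive result raises the odds
    lr_neg = (1.0 - sensitivity) / specificity   # how much a negative result lowers the odds
    return lr_pos, lr_neg


if __name__ == "__main__":
    # Rounded screening-group values quoted in the abstract.
    print(likelihood_ratios(0.69, 0.93))
```

With the rounded screening-group inputs (0.69, 0.93) this gives roughly 9.86 and 0.33, close to but not identical to the reported 10.24 and 0.34; the difference presumably arises because the published ratios were computed from unrounded values.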
Abstract:
This dissertation explores phase I dose-finding designs in cancer trials from three perspectives: the alternative Bayesian dose-escalation rules, a design based on a time-to-dose-limiting toxicity (DLT) model, and a design based on a discrete-time multi-state (DTMS) model. We list alternative Bayesian dose-escalation rules and perform a simulation study for the intra-rule and inter-rule comparisons based on two statistical models to identify the most appropriate rule under certain scenarios. We provide evidence that all the Bayesian rules outperform the traditional "3+3" design in the allocation of patients and selection of the maximum tolerated dose. The design based on a time-to-DLT model uses patients' DLT information over multiple treatment cycles in estimating the probability of DLT at the end of treatment cycle 1. Dose-escalation decisions are made whenever a cycle-1 DLT occurs, or two months after the previous check point. Compared to the design based on a logistic regression model, the new design shows more safety benefits for trials in which more late-onset toxicities are expected. As a trade-off, the new design requires more patients on average. The design based on a discrete-time multi-state (DTMS) model has three important attributes: (1) Toxicities are categorized over a distribution of severity levels, (2) Early toxicity may inform dose escalation, and (3) No suspension is required between accrual cohorts. The proposed model accounts for the difference in the importance of the toxicity severity levels and for transitions between toxicity levels. We compare the operating characteristics of the proposed design with those from a similar design based on a fully-evaluated model that directly models the maximum observed toxicity level within the patients' entire assessment window. We describe settings in which, under comparable power, the proposed design shortens the trial. 
The proposed design offers more benefit compared to the alternative design as patient accrual becomes slower.
Abstract:
The main goal of this study was to relate physical changes in image quality measured by Modulation Transfer Function (MTF) to diagnostic accuracy. One hundred and fifty Kodak Min-R screen/film combination conventional craniocaudal mammograms obtained with the Pfizer Microfocus Mammographic system were selected from the files of the Department of Radiology at M.D. Anderson Hospital and Tumor Institute. The mammograms included 88 cases with a variety of benign diagnoses and 62 cases with a variety of malignant biopsy diagnoses. The average age of the patient population was 55 years. 70 cases presented calcifications, with 30 cases having calcifications smaller than 0.5 mm. 46 cases presented irregular bordered masses larger than 1 cm. 30 cases presented smooth bordered masses, with 20 larger than 1 cm. Four separate copies of the original images were made, each having a different change in the MTF, using a defocusing technique whereby copies of the original were obtained by light exposure through different thicknesses (spacing) of transparent film base. The mammograms were randomized and evaluated by three experienced mammographers for the degree of visibility of various anatomical breast structures and pathological lesions (masses and calcifications), subjective image quality, and mammographic interpretation. 3,000 separate evaluations were analyzed by several statistical techniques, including Receiver Operating Characteristic curve analysis, the McNemar test for differences between proportions, and the Landis et al. method of agreement weighted kappa for ordinal categorical data. Results from the statistical analysis show: (1) There were no statistically significant differences in the diagnostic accuracy of the observers when diagnosing from mammograms with the same MTF. (2) There were no statistically significant differences in diagnostic accuracy for each observer when diagnosing from mammograms with the different MTFs used in the study. 
(3) There were statistically significant differences in detail visibility between the copies and the originals. Detail visibility was better in the originals. (4) Feature interpretations were not significantly different between the originals and the copies. (5) Perception of image quality did not affect image interpretation. Continuation and improvement of this research can be accomplished by using a case population more sensitive to MTF changes (i.e., asymptomatic women with minimal breast cancer), involving more observers (including less experienced radiologists and experienced technologists), and using a minimum of 200 benign and 200 malignant cases.
Abstract:
Standard methods for testing safety data are needed to ensure the safe conduct of clinical trials. In particular, objective rules for reliably identifying unsafe treatments need to be put into place to help protect patients from unnecessary harm. Data monitoring committees (DMCs) are uniquely qualified to evaluate accumulating unblinded data and make recommendations about the continuing safe conduct of a trial. However, it is the trial leadership who must make the tough ethical decision about stopping a trial, and they could benefit from objective statistical rules that help them judge the strength of evidence contained in the blinded data. We design early stopping rules for harm that act as continuous safety screens for randomized controlled clinical trials with blinded treatment information, which could be used by anyone, including trial investigators and trial leadership. A Bayesian framework, with emphasis on the likelihood function, is used to allow for continuous monitoring without adjusting for multiple comparisons. Close collaboration between the statistician and the clinical investigators will be needed in order to design safety screens with good operating characteristics. Though the math underlying this procedure may be computationally intensive, implementation of the statistical rules will be easy, and the continuous screening will give suitably early warning should real problems emerge. Trial investigators and trial leadership need these safety screens to help them effectively monitor the ongoing safe conduct of clinical trials with blinded data.
Abstract:
Bayesian adaptive randomization (BAR) is an attractive approach for allocating more patients to the putatively superior arm based on the interim data while maintaining the good statistical properties attributed to randomization. Under this approach, patients are adaptively assigned to a treatment group based on the probability that the treatment is better. The basic randomization scheme can be modified by introducing a tuning parameter, by replacing the posterior probability with the estimated response probability, or by setting a boundary on the randomization probabilities. Under randomization settings composed of these modifications, operating characteristics, including type I error, power, sample size, imbalance of sample size, interim success rate, and overall success rate, were evaluated through simulation. All randomization settings have low and comparable type I errors. Increasing the tuning parameter decreases power but increases imbalance of sample size and interim success rate. Compared with settings using the posterior probability, settings using the estimated response rates have higher power and overall success rate, but less imbalance of sample size and lower interim success rate. Bounded settings have higher power but less imbalance of sample size than unbounded settings. All settings perform better in the Bayesian design than in the frequentist design. This simulation study provides practical guidance on how to implement the adaptive design.
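The tuning-parameter and boundary modifications described above can be sketched as follows. The abstract does not give the exact formula, so this example assumes one commonly used form: with p the posterior probability that arm A is better than arm B, the allocation probability to A is p^c / (p^c + (1 − p)^c) for a tuning parameter c, optionally clipped to a lower and upper bound. The function and parameter names are hypothetical.

```python
def bar_allocation_prob(p_a_better, tuning=0.5, lower=0.0, upper=1.0):
    """Probability of assigning the next patient to arm A.

    p_a_better: posterior probability that arm A is superior (assumed given).
    tuning:     c >= 0; c = 0 gives equal randomization, larger c is more adaptive.
    lower/upper: optional bounds on the randomization probability.
    """
    if not 0.0 <= p_a_better <= 1.0:
        raise ValueError("p_a_better must lie in [0, 1]")
    if tuning < 0:
        raise ValueError("tuning must be non-negative")
    if not 0.0 <= lower <= upper <= 1.0:
        raise ValueError("bounds must satisfy 0 <= lower <= upper <= 1")
    weight_a = p_a_better ** tuning
    weight_b = (1.0 - p_a_better) ** tuning
    prob = weight_a / (weight_a + weight_b)
    # Clip to the boundary so that neither arm is ever starved of patients.
    return min(max(prob, lower), upper)


if __name__ == "__main__":
    for c in (0.0, 0.5, 1.0):
        print(c, bar_allocation_prob(0.9, tuning=c, lower=0.1, upper=0.9))
```

Under this assumed form, a larger tuning parameter pushes allocation further toward the arm that currently looks better, which is consistent with the abstract's observation that it increases sample-size imbalance.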
Abstract:
Group sequential methods and response adaptive randomization (RAR) procedures have been applied in clinical trials for economic and ethical reasons. Group sequential methods can reduce the average sample size by inducing early stopping, but patients are allocated equally, so each has a 50% chance of being assigned to the inferior arm. RAR procedures tend to allocate more patients to the better arm; however, they require a larger sample size to attain a given power. This study sought to combine these two procedures. We applied the Bayesian decision theory approach to define our group sequential stopping rules and evaluated the operating characteristics under the RAR setting. The results showed that the Bayesian decision theory method was able to preserve the type I error rate as well as achieve favorable power; further, by comparison with the error spending function method, we concluded that the Bayesian decision theory approach was more effective in reducing the average sample size.
Abstract:
Although the area under the receiver operating characteristic curve (AUC) is the most popular measure of the performance of prediction models, it has limitations, especially when it is used to evaluate the added discrimination of a new biomarker in the model. Pencina et al. (2008) proposed two indices, the net reclassification improvement (NRI) and integrated discrimination improvement (IDI), to supplement the improvement in the AUC (IAUC). Their NRI and IDI are based on binary outcomes in case-control settings, which do not involve time-to-event outcomes. However, many disease outcomes are time-dependent and the onset time can be censored. Measuring the discrimination potential of a prognostic marker without considering time to event can lead to biased estimates. In this dissertation, we have extended the NRI and IDI to survival analysis settings and derived the corresponding sample estimators and asymptotic tests. Simulation studies were conducted to compare the performance of the time-dependent NRI and IDI with Pencina’s NRI and IDI. For illustration, we have applied the proposed method to a breast cancer study. Key words: Prognostic model, Discrimination, Time-dependent NRI and IDI
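For orientation, the binary-outcome, category-based NRI that this abstract takes as its starting point can be sketched as below: it adds the net proportion of events reclassified upward to the net proportion of non-events reclassified downward. This is only the standard binary version; the time-dependent extension derived in the dissertation is not shown. The function name and the toy risk categories are hypothetical.

```python
def category_nri(old_cat, new_cat, event):
    """Category-based net reclassification improvement for binary outcomes.

    old_cat, new_cat: ordinal risk categories under the old and new model.
    event:            truthy for subjects who experienced the event.
    """
    up_e = down_e = n_e = 0   # movements among events
    up_n = down_n = n_n = 0   # movements among non-events
    for old, new, is_event in zip(old_cat, new_cat, event):
        if is_event:
            n_e += 1
            up_e += new > old
            down_e += new < old
        else:
            n_n += 1
            up_n += new > old
            down_n += new < old
    if n_e == 0 or n_n == 0:
        raise ValueError("need at least one event and one non-event")
    # Upward moves are good for events, downward moves are good for non-events.
    return (up_e - down_e) / n_e + (down_n - up_n) / n_n


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    old = [0, 0, 1, 1, 1, 1, 0, 0]
    new = [1, 1, 1, 0, 0, 0, 0, 1]
    evt = [1, 1, 1, 1, 0, 0, 0, 0]
    print(category_nri(old, new, evt))
```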
Abstract:
Loneliness is a pervasive, rather common experience in American culture, particularly notable among adolescents. However, the phenomenon is not well documented in the cross-cultural psychiatric literature. For psychiatric epidemiology to encompass a wide array of psychopathologic phenomena, it is important to develop useful measures to characterize and classify both non-clinical and clinical dysfunction in diverse subgroups and cultures.^ The goal of this research was to examine the cross-cultural reliability and construct validity of a scale designed to measure loneliness. The Roberts Loneliness Scale (RLS-8) was administered to 4,060 adolescents ages 10-19 years enrolled in high schools along either side of the Texas-Tamaulipas border region between the U.S. and Mexico. Data collected in 1988 from a study focusing on substance use and psychological distress among adolescents in these regions were used to examine the operating characteristics of the RLS-8. A sample stratified by nationality and language, age, gender, and grade was used for analysis.^ Results indicated that in general the RLS-8 has moderate reliability in the U.S. sample, but not in the Mexican sample. Validity analyses demonstrated that there was evidence for convergent validity of the RLS-8 in the U.S. sample, but none in the Mexican sample. Discriminant validity of the measures in neither sample could be established. Based on the factor structure of the RLS-8, two subscales were created and analyzed for construct validity. Evidence for convergent validity was established for both subscales in both national samples. However, the discriminant validity of the measure remains unsubstantiated in both national samples. Also, the dimensionality of the scale is unresolved.^ One primary goal for future cross-cultural research would be to develop and test better defined culture-specific models of loneliness within the two cultures. 
From such scientific endeavor, measures of loneliness can be developed or reconstructed to classify the phenomenon in the same manner across cultures. Since estimates of prevalence and incidence are contingent upon reliable and valid screening or diagnostic measures, this objective would serve as an important foundation for future psychiatric epidemiologic inquiry into loneliness. ^
Abstract:
Breast cancer is the most common non-skin cancer and the second leading cause of cancer-related death in women in the United States. Studies on ipsilateral breast tumor relapse (IBTR) status and disease-specific survival will help guide clinical treatment and predict patient prognosis. After breast conservation therapy, patients with breast cancer may experience breast tumor relapse. This relapse is classified into two distinct types: true local recurrence (TR) and new ipsilateral primary tumor (NP). However, the methods used to classify the relapse types are imperfect and are prone to misclassification. In addition, some observed survival data (e.g., time to relapse and time from relapse to death) are strongly correlated with relapse types. The first part of this dissertation presents a Bayesian approach to (1) modeling the potentially misclassified relapse status and the correlated survival information, (2) estimating the sensitivity and specificity of the diagnostic methods, and (3) quantifying the covariate effects on event probabilities. A shared frailty was used to account for the within-subject correlation between survival times. The inference was conducted using a Bayesian framework via Markov Chain Monte Carlo simulation implemented in the software WinBUGS. Simulation was used to validate the Bayesian method and assess its frequentist properties. The new model has two important innovations: (1) it utilizes the additional survival times correlated with the relapse status to improve the parameter estimation, and (2) it provides tools to address the correlation between the two diagnostic methods conditional on the true relapse types. Prediction of patients at highest risk for IBTR after local excision of ductal carcinoma in situ (DCIS) remains a clinical concern. 
The goals of the second part of this dissertation were to evaluate a published nomogram from Memorial Sloan-Kettering Cancer Center, to determine the risk of IBTR in patients with DCIS treated with local excision, and to determine whether there is a subset of patients at low risk of IBTR. Patients who had undergone local excision from 1990 through 2007 at MD Anderson Cancer Center with a final diagnosis of DCIS (n=794) were included in this part. Clinicopathologic factors and the performance of the Memorial Sloan-Kettering Cancer Center nomogram for prediction of IBTR were assessed for 734 patients with complete data. The nomogram's predictions of 5- and 10-year IBTR probabilities were found to demonstrate imperfect calibration and discrimination, with an area under the receiver operating characteristic curve of .63 and a concordance index of .63. In conclusion, predictive models for IBTR in DCIS patients treated with local excision are imperfect. Our current ability to accurately predict recurrence based on clinical parameters is limited. The American Joint Committee on Cancer (AJCC) staging of breast cancer is widely used to determine prognosis, yet survival within each AJCC stage shows wide variation and remains unpredictable. For the third part of this dissertation, biologic markers were hypothesized to be responsible for some of this variation, and the addition of biologic markers to current AJCC staging was examined as a possible means of improving prognostication. The initial cohort included patients treated with surgery as first intervention at MDACC from 1997 to 2006. Cox proportional hazards models were used to create prognostic scoring systems. AJCC pathologic staging parameters and biologic tumor markers were investigated to devise the scoring systems. Surveillance Epidemiology and End Results (SEER) data were used as the external cohort to validate the scoring systems. 
Binary indicators for pathologic stage (PS), estrogen receptor status (E), and tumor grade (G) were summed to create PS+EG scoring systems devised to predict 5-year patient outcomes. These scoring systems facilitated separation of the study population into more refined subgroups than the current AJCC staging system. The ability of the PS+EG score to stratify outcomes was confirmed in both internal and external validation cohorts. The current study proposes and validates a new staging system by incorporating tumor grade and ER status into current AJCC staging. We recommend that biologic markers be incorporated into revised versions of the AJCC staging system for patients receiving surgery as the first intervention. Chapter 1 focuses on developing a Bayesian method to address misclassified relapse status and on its application to breast cancer data. Chapter 2 focuses on evaluation of a breast cancer nomogram for predicting risk of IBTR in patients with DCIS after local excision and states the clinical research problem. Chapter 3 focuses on validation of a novel staging system for disease-specific survival in patients with breast cancer treated with surgery as the first intervention.
Abstract:
Treating patients with combined agents is a growing trend in cancer clinical trials. Evaluating the synergism of multiple drugs is often the primary motivation for such drug-combination studies. Focusing on drug-combination studies in early-phase clinical trials, our research is composed of three parts: (1) We conduct a comprehensive comparison of four dose-finding designs in the two-dimensional toxicity probability space and propose using the Bayesian model averaging method to overcome the arbitrariness of the model specification and enhance the robustness of the design; (2) Motivated by a recent drug-combination trial at MD Anderson Cancer Center with a continuous-dose standard of care agent and a discrete-dose investigational agent, we propose a two-stage Bayesian adaptive dose-finding design based on an extended continual reassessment method; (3) By combining phase I and phase II clinical trials, we propose an extension of a single agent dose-finding design. We model the time-to-event toxicity and efficacy to direct dose finding in two-dimensional drug-combination studies. We conduct extensive simulation studies to examine the operating characteristics of the aforementioned designs and demonstrate the designs' good performance in various practical scenarios.
Abstract:
ACCURACY OF THE BRCAPRO RISK ASSESSMENT MODEL IN MALES PRESENTING TO MD ANDERSON FOR BRCA TESTING. Carolyn A. Garby, B.S. Supervisory Professor: Banu Arun, M.D. Hereditary Breast and Ovarian Cancer (HBOC) syndrome is due to mutations in the BRCA1 and BRCA2 genes. Women with HBOC have high risks of developing breast and ovarian cancers. Males with HBOC are commonly overlooked because male breast cancer is rare and other male cancer risks, such as prostate and pancreatic cancers, are relatively low. BRCA genetic testing is indicated for men, as it is currently estimated that 4-40% of male breast cancers result from a BRCA1 or BRCA2 mutation (Ottini, 2010) and management recommendations can be made based on genetic test results. Risk assessment models are available to provide the individualized likelihood of having a BRCA mutation. Only one study has been conducted to date to evaluate the accuracy of BRCAPro in males; it was based on a cohort of Italian males and used an older version of BRCAPro. The objective of this study is to determine if BRCAPro5.1 is a valid risk assessment model for males who present to MD Anderson Cancer Center for BRCA genetic testing. BRCAPro has been previously validated for determining the probability of carrying a BRCA mutation; however, it has not been further examined specifically in males. The total cohort consisted of 152 males who had undergone BRCA genetic testing. The cohort was stratified by indication for genetic counseling. Indications included having a known familial BRCA mutation, having a personal diagnosis of a BRCA-related cancer, or having a family history suggestive of HBOC. Overall there were 22 (14.47%) BRCA1+ males and 25 (16.45%) BRCA2+ males. Receiver operating characteristic curves were constructed for the cohort overall, for each particular indication, as well as for each cancer subtype. 
Our findings revealed that the BRCAPro5.1 model had perfect discriminating ability at a threshold of 56.2 for males with breast cancer; however, only 2 (4.35%) of 46 were found to have BRCA2 mutations. These results are significantly lower than the upper estimate (40%) reported in previous literature. BRCAPro does perform well in certain situations for men. Future investigation of male breast cancer and men at risk for BRCA mutations is necessary to provide a more accurate risk assessment.
Abstract:
My dissertation focuses mainly on Bayesian adaptive designs for phase I and phase II clinical trials. It includes three specific topics: (1) proposing a novel two-dimensional dose-finding algorithm for biological agents, (2) developing Bayesian adaptive screening designs to provide more efficient and ethical clinical trials, and (3) incorporating missing late-onset responses to make an early stopping decision. Treating patients with novel biological agents is becoming a leading trend in oncology. Unlike cytotoxic agents, for which toxicity and efficacy monotonically increase with dose, biological agents may exhibit non-monotonic patterns in their dose-response relationships. Using a trial with two biological agents as an example, we propose a phase I/II trial design to identify the biologically optimal dose combination (BODC), which is defined as the dose combination of the two agents with the highest efficacy and tolerable toxicity. A change-point model is used to reflect the fact that the dose-toxicity surface of the combinational agents may plateau at higher dose levels, and a flexible logistic model is proposed to accommodate the possible non-monotonic pattern for the dose-efficacy relationship. During the trial, we continuously update the posterior estimates of toxicity and efficacy and assign patients to the most appropriate dose combination. We propose a novel dose-finding algorithm to encourage sufficient exploration of untried dose combinations in the two-dimensional space. Extensive simulation studies show that the proposed design has desirable operating characteristics in identifying the BODC under various patterns of dose-toxicity and dose-efficacy relationships. Trials of combination therapies for the treatment of cancer are playing an increasingly important role in the battle against this disease. 
To more efficiently handle the large number of combination therapies that must be tested, we propose a novel Bayesian phase II adaptive screening design to simultaneously select among possible treatment combinations involving multiple agents. Our design is based on formulating the selection procedure as a Bayesian hypothesis testing problem in which the superiority of each treatment combination is equated to a single hypothesis. During the trial conduct, we use the current values of the posterior probabilities of all hypotheses to adaptively allocate patients to treatment combinations. Simulation studies show that the proposed design substantially outperforms the conventional multi-arm balanced factorial trial design. The proposed design yields a significantly higher probability of selecting the best treatment while at the same time allocating substantially more patients to efficacious treatments. The proposed design is most appropriate for trials that combine multiple agents and aim to identify the efficacious combination to be further investigated. The proposed Bayesian adaptive phase II screening design substantially outperformed the conventional complete factorial design. Our design allocates more patients to better treatments while at the same time providing higher power to identify the best treatment at the end of the trial. Phase II trials are usually single-arm trials conducted to test the efficacy of experimental agents and to decide whether the agents are promising enough to be sent to phase III trials. Interim monitoring is employed to stop the trial early for futility to avoid assigning an unacceptable number of patients to inferior treatments. We propose a Bayesian single-arm phase II design with continuous monitoring for estimating the response rate of the experimental drug. 
To address the issue of late-onset responses, we use a piece-wise exponential model to estimate the hazard function of time to response data and handle the missing responses using the multiple imputation approach. We evaluate the operating characteristics of the proposed method through extensive simulation studies. We show that the proposed method reduces the total length of the trial duration and yields desirable operating characteristics for different physician-specified lower bounds of response rate with different true response rates.