8 results for Test data generation
at Duke University
Abstract:
Timing-related defects are major contributors to test escapes and in-field reliability problems for very-deep submicrometer integrated circuits. Small delay variations induced by crosstalk, process variations, power-supply noise, as well as resistive opens and shorts can potentially cause timing failures in a design, thereby leading to quality and reliability concerns. We present a test-grading technique that uses the method of output deviations for screening small-delay defects (SDDs). A new gate-delay defect probability measure is defined to model delay variations for nanometer technologies. The proposed technique intelligently selects the best set of patterns for SDD detection from an n-detect pattern set generated using timing-unaware automatic test-pattern generation (ATPG). It offers significantly lower computational complexity and excites a larger number of long paths compared to a current generation commercial timing-aware ATPG tool. Our results also show that, for the same pattern count, the selected patterns provide more effective coverage ramp-up than timing-aware ATPG and a recent pattern-selection method for random SDDs potentially caused by resistive shorts, resistive opens, and process variations. © 2010 IEEE.
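As a rough illustration of the pattern-selection step described above, the sketch below ranks patterns from an n-detect set by an aggregate output-deviation score and keeps the top few. The function name, the scores, and the greedy top-k rule are illustrative assumptions, not the paper's exact procedure (which derives deviations from gate-delay defect probabilities propagated to observable outputs).

```python
# Hypothetical sketch: greedy selection of test patterns by output-deviation score.
# The scores and pattern IDs below are invented for demonstration only.

def select_patterns(deviation_scores, budget):
    """Keep the `budget` patterns with the largest aggregate output deviation.

    deviation_scores: dict mapping pattern id -> aggregate deviation metric
    budget: number of patterns to retain from the n-detect set
    """
    ranked = sorted(deviation_scores.items(), key=lambda kv: kv[1], reverse=True)
    return [pattern_id for pattern_id, _ in ranked[:budget]]

# Example: keep the 2 highest-deviation patterns out of an n-detect set of 4.
scores = {"p0": 0.12, "p1": 0.87, "p2": 0.45, "p3": 0.91}
print(select_patterns(scores, budget=2))  # ['p3', 'p1']
```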
Abstract:
Subspaces and manifolds are two powerful models for high dimensional signals. Subspaces model linear correlation and are a good fit to signals generated by physical systems, such as frontal images of human faces and multiple sources impinging at an antenna array. Manifolds model sources that are not linearly correlated, but where signals are determined by a small number of parameters. Examples are images of human faces under different poses or expressions, and handwritten digits with varying styles. However, there will always be some degree of model mismatch between the subspace or manifold model and the true statistics of the source. This dissertation exploits subspace and manifold models as prior information in various signal processing and machine learning tasks.
A near-low-rank Gaussian mixture model measures proximity to a union of linear or affine subspaces. This simple model can effectively capture the signal distribution when each class is near a subspace. This dissertation studies how the pairwise geometry between these subspaces affects classification performance. When model mismatch is vanishingly small, the probability of misclassification is determined by the product of the sines of the principal angles between subspaces. When the model mismatch is more significant, the probability of misclassification is determined by the sum of the squares of the sines of the principal angles. Reliability of classification is derived in terms of the distribution of signal energy across principal vectors. Larger principal angles lead to smaller classification error, motivating a linear transform that optimizes principal angles. This linear transformation, termed TRAIT, also preserves some specific features in each class, being complementary to a recently developed Low Rank Transform (LRT). Moreover, when the model mismatch is more significant, TRAIT shows superior performance compared to LRT.
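Stated schematically (with notation assumed here, not taken from the dissertation), the two regimes above relate the misclassification probability to the principal angles between a pair of subspaces through decreasing functions, so that larger angles yield a smaller error in either regime:

```latex
% Schematic restatement only (notation assumed): \theta_i are the principal angles
% between two subspaces, P_e the misclassification probability, and g_1, g_2 are
% decreasing functions, so larger principal angles give smaller error in both regimes.
\text{vanishing mismatch:}\quad P_e = g_1\!\Bigl(\prod_i \sin\theta_i\Bigr),
\qquad
\text{significant mismatch:}\quad P_e = g_2\!\Bigl(\sum_i \sin^2\theta_i\Bigr).
```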
The manifold model enforces a constraint on the freedom of data variation. Learning features that are robust to data variation is very important, especially when the size of the training set is small. A learning machine with a large number of parameters, e.g., a deep neural network, can describe a very complicated data distribution well. However, it is also more likely to be sensitive to small perturbations of the data, and to suffer from degraded performance when generalizing to unseen (test) data.
From the perspective of the complexity of function classes, such a learning machine has a huge capacity (complexity), which makes it prone to overfitting. The manifold model provides a way of regularizing the learning machine so as to reduce the generalization error and thereby mitigate overfitting. Two overfitting-prevention approaches are proposed, one from the perspective of data variation, the other from capacity/complexity control. In the first approach, the learning machine is encouraged to make decisions that vary smoothly for data points in local neighborhoods on the manifold. In the second approach, a graph adjacency matrix is derived for the manifold, and the learned features are encouraged to be aligned with the principal components of this adjacency matrix. Experimental results on benchmark datasets show a clear advantage for the proposed approaches when the training set is small.
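A minimal sketch of the kind of smoothness regularizer described in the first approach is given below, assuming a precomputed adjacency matrix over neighboring points on the manifold. The variable names and the specific penalty form are generic illustrations, not the dissertation's exact objective.

```python
# Illustrative graph-Laplacian smoothness penalty: encourages a model's outputs to
# vary smoothly across neighboring points on the data manifold. All names below
# are assumptions for demonstration, not the dissertation's formulation.

import numpy as np

def laplacian_smoothness_penalty(outputs, adjacency):
    """Compute 0.5 * sum_{i,j} A_ij * ||f(x_i) - f(x_j)||^2 via the graph Laplacian."""
    degrees = np.diag(adjacency.sum(axis=1))
    laplacian = degrees - adjacency
    # trace(F^T L F) equals the weighted pairwise squared-difference sum above.
    return np.trace(outputs.T @ laplacian @ outputs)

# Toy usage: three points, the first two are neighbors on the manifold.
adjacency = np.array([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 0.0, 0.0]])
outputs = np.array([[0.9], [0.8], [0.1]])  # model decisions for each point
print(laplacian_smoothness_penalty(outputs, adjacency))  # small: neighbors agree
```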
Stochastic optimization makes it possible to track a slowly varying subspace underlying streaming data. By approximating local neighborhoods using affine subspaces, a slowly varying manifold can be efficiently tracked as well, even with corrupted and noisy data. The more local neighborhoods are used, the better the approximation, but the higher the computational complexity. A multiscale approximation scheme is proposed, in which the local approximating subspaces are organized in a tree structure. Splitting and merging of the tree nodes then allows efficient control of the number of neighborhoods. The deviation of each datum from the learned model is estimated, yielding a series of statistics for anomaly detection. This framework extends the classical changepoint detection technique, which only works for one-dimensional signals. Simulations and experiments highlight the robustness and efficacy of the proposed approach in detecting an abrupt change in an otherwise slowly varying low-dimensional manifold.
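The per-datum deviation statistic can be illustrated with a small sketch: project each incoming point onto the currently tracked local subspace and report the residual energy. The fixed basis and variable names below are assumptions for demonstration; the actual framework updates the subspaces online and organizes them in a multiscale tree.

```python
# Minimal sketch of a "deviation from the learned model" statistic: the energy of
# each datum orthogonal to the tracked local subspace. The fixed basis is a toy
# stand-in for the subspaces that would be learned and updated online.

import numpy as np

def deviation_from_subspace(x, basis):
    """Norm of the component of x orthogonal to span(basis); basis columns are orthonormal."""
    residual = x - basis @ (basis.T @ x)
    return np.linalg.norm(residual)

# Toy usage: a 3-D stream whose nominal model is the xy-plane.
basis = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [0.0, 0.0]])
normal_point = np.array([2.0, -1.0, 0.05])
anomalous_point = np.array([2.0, -1.0, 3.0])
print(deviation_from_subspace(normal_point, basis))     # small residual
print(deviation_from_subspace(anomalous_point, basis))  # large residual flags a change
```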
Abstract:
Association studies of quantitative traits have often relied on methods in which a normal distribution of the trait is assumed. However, quantitative phenotypes from complex human diseases are often censored, highly skewed, or contaminated with outlying values. We recently developed a rank-based association method that takes into account censoring and makes no distributional assumptions about the trait. In this study, we applied our new method to age-at-onset data on ALDX1 and ALDX2. Both traits are highly skewed (skewness > 1.9) and often censored. We performed a whole genome association study of age at onset of the ALDX1 trait using Illumina single-nucleotide polymorphisms. Only slightly more than 5% of markers were significant. However, we identified two regions on chromosomes 14 and 15, which each have at least four significant markers clustering together. These two regions may harbor genes that regulate age at onset of ALDX1 and ALDX2. Future fine mapping of these two regions with densely spaced markers is warranted.
Abstract:
BACKGROUND: In a time-course microarray experiment, the expression level of each gene is observed across a number of time points in order to characterize the temporal trajectories of the gene-expression profiles. For many of these experiments, the scientific aim is the identification of genes for which the trajectories depend on an experimental or phenotypic factor. There is an extensive recent body of literature on statistical methodology for addressing this analytical problem. Most of the existing methods are based on estimating the time-course trajectories using parametric or non-parametric mean regression methods. The sensitivity of these regression methods to outliers, an issue that is well documented in the statistical literature, should be of concern when analyzing microarray data. RESULTS: In this paper, we propose a robust testing method for identifying genes whose expression time profiles depend on a factor. Furthermore, we propose a multiple testing procedure to adjust for multiplicity. CONCLUSIONS: Through an extensive simulation study, we illustrate the performance of our method. Finally, we report the results from applying our method to a case study and discuss potential extensions.
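For readers unfamiliar with multiplicity adjustment, the sketch below shows one standard procedure (Benjamini-Hochberg false discovery rate control) applied to hypothetical per-gene p-values. It is a generic example and not necessarily the multiple testing procedure proposed in the paper.

```python
# Generic illustration of a multiplicity adjustment (Benjamini-Hochberg FDR control);
# this is a standard textbook procedure, not the paper's specific proposal.

import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of rejected hypotheses at FDR level alpha."""
    p = np.asarray(p_values)
    m = len(p)
    order = np.argsort(p)
    thresholds = alpha * (np.arange(1, m + 1) / m)
    below = p[order] <= thresholds
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        cutoff = np.max(np.where(below)[0])  # largest k with p_(k) <= alpha*k/m
        rejected[order[:cutoff + 1]] = True
    return rejected

# Example: per-gene p-values from a hypothetical factor-dependence test.
print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.20, 0.74]))
```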
Abstract:
Real decision makers exhibit significant shortcomings in the generation of objectives for decisions that they face. Prior research has illustrated the magnitude of this shortcoming but not its causes. In this paper, we identify two distinct impediments to the generation of decision objectives: not thinking broadly enough about the range of relevant objectives, and not thinking deeply enough to articulate every objective within the range that is considered. To test these explanations and explore ways of stimulating a more comprehensive set of objectives, we present three experiments involving a variety of interventions: the provision of sample objectives, organization of objectives by category, and direct challenges to do better, with or without a warning that important objectives are missing. The use of category names and direct challenges with a warning both led to improvements in the quantity of objectives generated without impacting their quality; other interventions yielded less improvement. We conclude by discussing the relevance of our findings to decision analysis and offering prescriptive implications for the elicitation of decision objectives. © 2010 INFORMS.
Abstract:
Background: Acute febrile respiratory illnesses, including influenza, account for a large proportion of ambulatory care visits worldwide. In the developed world, these encounters commonly result in unwarranted antibiotic prescriptions; data from more resource-limited settings are lacking. The purpose of this study was to describe the epidemiology of influenza among outpatients in southern Sri Lanka and to determine if access to rapid influenza test results was associated with decreased antibiotic prescriptions.
Methods: In this pretest-posttest study, consecutive patients presenting from March 2013 to April 2014 to the Outpatient Department of the largest tertiary care hospital in southern Sri Lanka were surveyed for influenza-like illness (ILI). Patients meeting World Health Organization criteria for ILI (acute onset of fever ≥38.0°C and cough in the prior 7 days) were enrolled. Consenting patients were administered a structured questionnaire, physical examination, and nasal/nasopharyngeal sampling. Rapid influenza A/B testing (Veritor System, Becton Dickinson) was performed on all patients, but test results were only released to patients and clinicians during the second phase of the study (December 2013 to April 2014).
Results: We enrolled 397 patients with ILI, of whom 217 (54.7%) were adults ≥12 years and 188 (47.4%) were female. A total of 179 (45.8%) tested positive for influenza by rapid testing, with April to July 2013 and September to November 2013 being the periods with the highest proportion of ILI due to influenza. A total of 310 (78.1%) patients with ILI received a prescription for an antibiotic from their outpatient provider. The proportion of patients prescribed antibiotics decreased from 81.4% in the first phase to 66.3% in the second phase (p=.005); among rapid influenza-positive patients, antibiotic prescriptions decreased from 83.7% in the first phase to 56.3% in the second phase (p=.001). On multivariable analysis, availability of a positive rapid influenza test result to clinicians was associated with decreased antibiotic use (OR 0.20, 95% CI 0.05-0.82).
Conclusions: Influenza virus accounted for almost 50% of acute febrile respiratory illness in this study, but most patients were prescribed antibiotics. Providing rapid influenza test results to clinicians was associated with fewer antibiotic prescriptions, but overall prescription of antibiotics remained high. In this developing country setting, a multi-faceted approach that includes improved access to rapid diagnostic tests may help decrease antibiotic use and combat antimicrobial resistance.
Abstract:
BACKGROUND: Measurement of CD4+ T-lymphocytes (CD4) is a crucial parameter in the management of HIV patients, particularly in determining eligibility to initiate antiretroviral treatment (ART). A number of technologies exist for CD4 enumeration, with considerable variation in cost, complexity, and operational requirements. We conducted a systematic review of the performance of technologies for CD4 enumeration. METHODS AND FINDINGS: Studies were identified by searching the electronic databases MEDLINE and EMBASE using a pre-defined search strategy. Data on test accuracy and precision included bias and limits of agreement with a reference standard, and misclassification probabilities around CD4 thresholds of 200 and 350 cells/μl over a clinically relevant range. The secondary outcome measure was test imprecision, expressed as the percent coefficient of variation. Thirty-two studies evaluating 15 CD4 technologies were included, of which fewer than half presented data on bias and misclassification compared to the same reference technology. At CD4 counts <350 cells/μl, bias ranged from -35.2 to +13.1 cells/μl, while at counts >350 cells/μl, bias ranged from -70.7 to +47 cells/μl, compared to the BD FACSCount as a reference technology. Misclassification around the threshold of 350 cells/μl ranged from 1% to 29% for upward classification, resulting in under-treatment, and from 7% to 68% for downward classification, resulting in overtreatment. Fewer than half of these studies reported within-laboratory precision or reproducibility of the CD4 values obtained. CONCLUSIONS: A wide range of bias and percent misclassification around treatment thresholds was reported for the CD4 enumeration technologies included in this review, with few studies reporting assay precision. The lack of standardised methodology for test evaluation, including the use of different reference standards, is a barrier to assessing relative assay performance and could hinder the introduction of new point-of-care assays in countries where they are most needed.
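The outcome measures used in the review can be made concrete with a small sketch on invented CD4 counts: bias and 95% limits of agreement against a reference method, plus within-method imprecision as a percent coefficient of variation. The numbers and variable names are illustrative only and do not come from any of the included studies.

```python
# Illustrative computation of the review's outcome measures on made-up CD4 counts:
# bias and 95% limits of agreement versus a reference method (Bland-Altman style),
# and within-method imprecision as a percent coefficient of variation.

import numpy as np

test_counts = np.array([310.0, 505.0, 150.0, 420.0, 365.0])       # candidate assay
reference_counts = np.array([325.0, 480.0, 170.0, 400.0, 390.0])  # reference technology

differences = test_counts - reference_counts
bias = differences.mean()
limits_of_agreement = (bias - 1.96 * differences.std(ddof=1),
                       bias + 1.96 * differences.std(ddof=1))

replicates = np.array([352.0, 348.0, 361.0, 339.0])  # repeat measurements of one sample
percent_cv = 100.0 * replicates.std(ddof=1) / replicates.mean()

print(f"bias = {bias:.1f} cells/uL, limits of agreement = {limits_of_agreement}")
print(f"%CV = {percent_cv:.1f}%")
```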
Abstract:
BACKGROUND: The National Comprehensive Cancer Network and the American Society of Clinical Oncology have established guidelines for the treatment and surveillance of colorectal cancer (CRC), respectively. Given these guidelines, an accurate and efficient method is needed to measure receipt of care. METHODS: The accuracy and completeness of Veterans Health Administration (VA) administrative data were assessed by comparing them with data manually abstracted during the Colorectal Cancer Care Collaborative (C4) quality improvement initiative for 618 patients with stage I-III CRC. RESULTS: The VA administrative data contained gender, marital, and birth information for all patients, but race information was missing for 62.1% of patients. Percent agreement for demographic variables ranged from 98.1% to 100%. The kappa statistic for receipt of treatments ranged from 0.21 to 0.60, and there was 96.9% agreement on the date of surgical resection. The percentage of post-diagnosis surveillance events recorded in C4 that also appeared in VA administrative data was 76.0% for colonoscopy, 84.6% for physician visits, and 26.3% for carcinoembryonic antigen (CEA) tests. CONCLUSIONS: VA administrative data are accurate and complete for non-race demographic variables, receipt of CRC treatment, colonoscopy, and physician visits, but alternative data sources may be necessary to capture patient race and receipt of CEA tests.
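As a point of reference for the agreement statistics cited above, the sketch below computes percent agreement and Cohen's kappa between two invented binary records (e.g., administrative data versus manual abstraction). The data are made up and unrelated to the C4 cohort.

```python
# Toy illustration of the agreement statistics used in the abstract: simple percent
# agreement and Cohen's kappa between two binary sources. The vectors below are
# invented for demonstration only.

import numpy as np

admin = np.array([1, 1, 0, 1, 0, 1, 1, 0])       # treatment recorded in administrative data
abstracted = np.array([1, 1, 0, 0, 0, 1, 1, 1])  # treatment recorded by manual abstraction

percent_agreement = np.mean(admin == abstracted)

# Cohen's kappa: observed agreement corrected for agreement expected by chance.
p_observed = percent_agreement
p_expected = (admin.mean() * abstracted.mean()
              + (1 - admin.mean()) * (1 - abstracted.mean()))
kappa = (p_observed - p_expected) / (1 - p_expected)

print(f"percent agreement = {percent_agreement:.1%}, kappa = {kappa:.2f}")
```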