3 results for Process control -- Statistical methods
at Duke University
Abstract:
Constant advances in technology have caused a data explosion in recent years. Accordingly, modern statistical and machine learning methods must be adapted to deal with complex and heterogeneous data types. This is particularly true when analyzing biological data. For example, DNA sequence data can be viewed as categorical variables, with each nucleotide taking one of four categories. Gene expression data, depending on the quantification technology, may be continuous numbers or counts. With the advancement of high-throughput technology, the abundance of such data has become unprecedented. Efficient statistical approaches are therefore crucial in this big data era.
Previous statistical methods for big data often aim to find low-dimensional structures in the observed data. For example, a factor analysis model assumes a latent Gaussian-distributed multivariate vector. Under this assumption, a factor model produces a low-rank estimate of the covariance of the observed variables. Another example is the latent Dirichlet allocation model for documents, which assumes that the mixture proportions of topics are represented by a Dirichlet-distributed variable. This dissertation proposes several novel extensions to these earlier statistical methods, developed to address challenges in big data. The new methods are applied in multiple real-world applications, including construction of condition-specific gene co-expression networks, estimation of shared topics among newsgroups, analysis of promoter sequences, analysis of political-economics risk data, and estimation of population structure from genotype data.
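The low-rank covariance statement in the abstract above can be illustrated with a minimal sketch. It assumes the textbook linear-Gaussian factor model, in which the implied covariance is W W^T + diag(psi); it is not the dissertation's own method. The names `factor_covariance`, `matrix_rank`, `loadings` and `noise_vars`, and the example numbers, are hypothetical and chosen only for illustration.

```python
def factor_covariance(loadings, noise_vars):
    """Implied covariance W W^T + diag(psi) of a linear-Gaussian factor model.

    loadings   : p x k list of lists (hypothetical factor loading matrix W)
    noise_vars : length-p list of per-variable noise variances (psi)
    """
    p = len(loadings)
    k = len(loadings[0])
    cov = [
        [sum(loadings[i][f] * loadings[j][f] for f in range(k)) for j in range(p)]
        for i in range(p)
    ]
    for i in range(p):
        cov[i][i] += noise_vars[i]
    return cov


def matrix_rank(matrix, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    rows = [row[:] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) < tol:
            continue  # no usable pivot in this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, n_rows):
            factor = rows[r][col] / rows[rank][col]
            for c in range(col, n_cols):
                rows[r][c] -= factor * rows[rank][c]
        rank += 1
    return rank


# Illustrative example: 5 observed variables driven by 2 latent factors.
example_loadings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0], [0.5, 3.0]]
low_rank_part = factor_covariance(example_loadings, [0.0] * 5)
full_covariance = factor_covariance(example_loadings, [0.1] * 5)
print(matrix_rank(low_rank_part), matrix_rank(full_covariance))
```

In this sketch the W W^T term has rank equal to the number of factors, while the diagonal noise term restores full rank; this is the sense in which a factor model gives a low-rank-plus-diagonal description of the covariance.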
Abstract:
BACKGROUND: Guidance for appropriate utilisation of transthoracic echocardiograms (TTEs) can be incorporated into ordering prompts, potentially affecting the number of requests. METHODS: We incorporated data from the 2011 Appropriate Use Criteria for Echocardiography, the 2010 National Institute for Clinical Excellence Guideline on Chronic Heart Failure, and the American College of Cardiology Choosing Wisely list on TTE use for dyspnoea, oedema and valvular disease into electronic ordering systems at Durham Veterans Affairs Medical Center. Our primary outcome was TTE orders per month. Secondary outcomes included rates of outpatient TTE ordering per 100 visits and frequency of brain natriuretic peptide (BNP) ordering prior to TTE. Outcomes were measured for 20 months before and 12 months after the intervention. RESULTS: The number of TTEs ordered did not decrease (338±32 TTEs/month prior vs 320±33 afterwards, p=0.12). Rates of outpatient TTE ordering decreased minimally post intervention (2.28 per 100 primary care/cardiology visits prior vs 1.99 afterwards, p<0.01). Effects on TTE ordering and ordering rate significantly interacted with time from intervention (p<0.02 for both), as the small initial effects waned after 6 months. The percentage of TTE orders with preceding BNP increased (36.5% prior vs 42.2% after for inpatients, p=0.01; 10.8% prior vs 14.5% after for outpatients, p<0.01). CONCLUSIONS: At a single institution, ordering prompts for TTEs initially produced a minimal reduction in the number of TTEs ordered and increased BNP measurement, but the effect on TTEs ordered was likely insignificant from a utilisation standpoint and decayed over time.
Abstract:
Current state-of-the-art techniques for landmine detection in ground-penetrating radar (GPR) utilize statistical methods to identify characteristics of a landmine response. This research makes use of 2-D slices of data in which subsurface landmine responses have hyperbolic shapes. Various methods from the field of visual image processing are adapted to the 2-D GPR data, producing superior landmine detection results. This research goes on to develop a physics-based GPR augmentation method motivated by current advances in visual object detection. This GPR-specific augmentation is used to mitigate issues caused by insufficient training sets. This work shows that augmentation improves detection performance under training conditions that are normally very difficult. Finally, this work introduces the use of convolutional neural networks as a method to learn feature extraction parameters. These learned convolutional features outperform hand-designed features in GPR detection tasks. This work presents a number of methods, both borrowed from and motivated by the substantial work in visual image processing. The methods developed and presented in this work improve overall detection performance and introduce a way to improve the robustness of statistical classification.
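The hyperbolic shape mentioned in the abstract above follows from simple geometry, which a minimal sketch can make concrete. It assumes an idealized point target in a medium with constant wave velocity, so the two-way travel time is t(x) = 2 * sqrt(d^2 + (x - x0)^2) / v. This is a standard idealization, not the augmentation method developed in the work; the function names and all numeric values are hypothetical.

```python
import math


def two_way_travel_time(antenna_x, target_x, depth, velocity):
    """Two-way travel time (s) from an antenna at the surface to a point target.

    Assumes a homogeneous medium with constant wave velocity (m/s);
    positions and depth are in metres.
    """
    return 2.0 * math.hypot(depth, antenna_x - target_x) / velocity


def hyperbola_trace(antenna_positions, target_x, depth, velocity):
    """Arrival times along a scan line; they trace a hyperbola in (x, t)."""
    return [
        two_way_travel_time(x, target_x, depth, velocity)
        for x in antenna_positions
    ]


# Illustrative values only: target 0.1 m deep, assumed soil velocity 1e8 m/s.
positions = [-0.2, -0.1, 0.0, 0.1, 0.2]
for x, t in zip(positions, hyperbola_trace(positions, 0.0, 0.1, 1e8)):
    print(f"x = {x:+.1f} m, t = {t * 1e9:.3f} ns")
```

The arrival time is smallest directly above the target and grows symmetrically on either side, which is the hyperbolic signature seen in a 2-D GPR slice.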