6 results for LIMITED SETS
at Duke University
Abstract:
This paper describes a methodology for detecting anomalies from sequentially observed and potentially noisy data. The proposed approach consists of two main elements: 1) filtering, or assigning a belief or likelihood to each successive measurement based upon our ability to predict it from previous noisy observations and 2) hedging, or flagging potential anomalies by comparing the current belief against a time-varying and data-adaptive threshold. The threshold is adjusted based on the available feedback from an end user. Our algorithms, which combine universal prediction with recent work on online convex programming, do not require computing posterior distributions given all current observations and involve simple primal-dual parameter updates. At the heart of the proposed approach lie exponential-family models which can be used in a wide variety of contexts and applications, and which yield methods that achieve sublinear per-round regret against both static and slowly varying product distributions with marginals drawn from the same exponential family. Moreover, the regret against static distributions coincides with the minimax value of the corresponding online strongly convex game. We also prove bounds on the number of mistakes made during the hedging step relative to the best offline choice of the threshold with access to all estimated beliefs and feedback signals. We validate the theory on synthetic data drawn from a time-varying distribution over binary vectors of high dimensionality, as well as on the Enron email dataset. © 1963-2012 IEEE.
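As a rough illustration of the filter-then-hedge idea described above (not the authors' exact algorithm), the sketch below tracks a product-Bernoulli model of a binary stream, scores each observation by its negative log-likelihood, and nudges an adaptive threshold from optional user feedback; the function name, feedback signature, and update rules are illustrative assumptions.

```python
import numpy as np

def run_anomaly_detector(stream, eta=0.05, tau0=5.0, feedback=None):
    """Filter-and-hedge sketch: a product-Bernoulli (exponential-family)
    model is updated online, each point is scored by its negative
    log-likelihood, and an adaptive threshold is nudged from feedback.
    Names and update rules are illustrative, not the paper's."""
    p, tau, flags = None, tau0, []
    for t, x in enumerate(stream):
        x = np.asarray(x, dtype=float)
        if p is None:
            p = np.full_like(x, 0.5)          # uninformative initial marginals
        eps = 1e-6
        # filtering: belief (likelihood) of the new point under the current model
        nll = -np.sum(x * np.log(p + eps) + (1 - x) * np.log(1 - p + eps))
        flagged = nll > tau                   # hedging: compare against threshold
        flags.append(bool(flagged))
        if feedback is not None:
            fb = feedback(t, flagged)         # +1 true anomaly, -1 false alarm, 0 none
            if fb == -1 and flagged:
                tau += eta * nll              # false alarm: raise the threshold
            elif fb == 1 and not flagged:
                tau -= eta * nll              # missed anomaly: lower the threshold
        p = (1 - eta) * p + eta * x           # online update of the Bernoulli means
    return flags
```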
Abstract:
Some luxury goods manufacturers offer limited editions of their products, whereas others market multiple product lines. Researchers have found that reference groups shape consumer evaluations of these product categories. Yet little empirical research has examined how reference groups affect the product line decisions of firms. Indeed, in a field setting it is quite a challenge to isolate reference group effects from contextual effects and correlated effects. In this paper, we propose a parsimonious model that allows us to study how reference groups influence firm behavior and that lends itself to experimental analysis. With the aid of the model, we investigate the behavior of consumers in a laboratory setting where we can focus on the reference group effects after controlling for contextual and correlated effects. The experimental results show that in the presence of strong reference group effects, limited editions and multiple products can help improve firms' profits. Furthermore, the trends in our participants' purchase decisions suggest that they are capable of roughly two steps of introspective thinking at the outset of the game and then learn through reinforcement mechanisms. © 2010 INFORMS.
Abstract:
BACKGROUND: Sharing of epidemiological and clinical data sets among researchers is poor at best, to the detriment of science and the community at large. The purpose of this paper is therefore to (1) describe a novel Web application designed to share information on study data sets, focusing on epidemiological and clinical research in a collaborative environment, and (2) create a policy model placing this collaborative environment into the current scientific social context. METHODOLOGY: The Database of Databases application was developed based on feedback from epidemiologists and clinical researchers requiring a Web-based platform that would allow for sharing of information about epidemiological and clinical study data sets in a collaborative environment. This platform should ensure that researchers can modify the information. Model-based predictions of the number of publications and the funding resulting from combinations of different policy implementation strategies (for metadata and data sharing) were generated using System Dynamics modeling. PRINCIPAL FINDINGS: The application allows researchers to easily upload information about clinical study data sets, which is searchable and modifiable by other users in a wiki environment. All modifications are filtered by the database principal investigator in order to maintain quality control. The application has been extensively tested and currently contains 130 clinical study data sets from the United States, Australia, China and Singapore. Model results indicated that any policy implementation would be better than the current strategy, that metadata sharing is better than data sharing, and that combined policies achieve the best results in terms of publications. CONCLUSIONS: Based on our empirical observations and the resulting model, the social network environment surrounding the application can help epidemiologists and clinical researchers contribute to and search for metadata in a collaborative environment, thus potentially facilitating collaboration among research communities distributed around the globe.
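As a loose illustration of how policy scenarios might be compared with System Dynamics, the toy sketch below simulates a stock of shared data sets and the publications that reuse them under hypothetical rates; it is not the paper's calibrated model, and every function name and parameter is a placeholder assumption.

```python
def simulate_policy(share_rate, reuse_rate, years=10, dt=0.25):
    """Toy System Dynamics sketch (not the paper's calibrated model):
    a stock of shared data sets grows at a policy-dependent rate and
    publications accumulate in proportion to the data available for
    reuse; all rates are hypothetical placeholders."""
    datasets, publications, t = 10.0, 0.0, 0.0
    while t < years:
        datasets += share_rate * datasets * dt       # new shared data sets
        publications += reuse_rate * datasets * dt   # publications from reuse
        t += dt
    return datasets, publications

# e.g. compare a stronger sharing policy against the status quo
print(simulate_policy(share_rate=0.15, reuse_rate=0.05))
print(simulate_policy(share_rate=0.05, reuse_rate=0.05))
```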
Abstract:
Quantitative optical spectroscopy has the potential to provide an effective, low-cost, and portable solution for cervical pre-cancer screening in resource-limited communities. However, clinical studies to validate the use of this technology in resource-limited settings require low power consumption and good quality control that is minimally influenced by the operator or by variable environmental conditions in the field. The goal of this study was to evaluate the effects of two sources of potential error, calibration and pressure, on the extraction of absorption and scattering properties of normal cervical tissues in a resource-limited setting in Leogane, Haiti. Our results show that self-calibrated measurements improved scattering measurements through real-time correction of system drift, in addition to minimizing the time required for post-calibration. Variations in pressure (tested without the potential confounding effects of calibration error) caused local changes in vasculature and scatterer density that significantly impacted the tissue absorption and scattering properties. Future spectroscopic systems intended for clinical use, particularly where operator training is not viable and environmental conditions are unpredictable, should incorporate a real-time self-calibration channel and collect diffuse reflectance spectra at a consistent pressure to maximize data integrity.
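A minimal sketch of the real-time self-calibration idea (dividing out system drift measured on a concurrent calibration channel) is given below; the function name, inputs, and correction rule are assumptions for illustration, not the study's actual processing pipeline.

```python
import numpy as np

def self_calibrate(sample_raw, calib_raw, calib_ref):
    """Sketch of real-time self-calibration for diffuse reflectance:
    a calibration channel views a stable standard during every tissue
    measurement, so lamp/detector drift can be divided out on each
    acquisition. Names and the correction rule are assumptions, not
    the study's pipeline. `calib_ref` is the calibration-channel
    spectrum recorded at baseline."""
    sample_raw = np.asarray(sample_raw, dtype=float)
    drift = np.asarray(calib_raw, dtype=float) / np.asarray(calib_ref, dtype=float)
    return sample_raw / np.clip(drift, 1e-9, None)   # drift-corrected spectrum
```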
Abstract:
Most studies that apply qualitative comparative analysis (QCA) rely on macro-level data, but an increasing number of studies focus on units of analysis at the micro or meso level (i.e., households, firms, protected areas, communities, or local governments). For such studies, qualitative interview data are often the primary source of information. Yet, so far, no procedure is available that describes how to calibrate qualitative data as fuzzy sets. The authors propose a technique to do so and illustrate it using examples from a study of Guatemalan local governments. By spelling out the details of this important analytic step, the authors aim to contribute to the growing literature on best practice in QCA. © The Author(s) 2012.
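For context, the snippet below sketches the standard "direct method" of fuzzy-set calibration commonly used in QCA, mapping a raw score to set membership via three qualitative anchors; it is a generic illustration, not the authors' interview-coding procedure, and the anchor values in the example are hypothetical.

```python
import math

def direct_calibrate(raw, non_member, crossover, full_member):
    """Generic 'direct method' fuzzy-set calibration (a sketch, not the
    authors' interview-coding procedure): the three qualitative anchors
    are mapped to log-odds of -3, 0 and +3, and a logistic transform
    returns set membership in [0, 1]."""
    if raw >= crossover:
        scale = (full_member - crossover) / 3.0
    else:
        scale = (crossover - non_member) / 3.0
    log_odds = (raw - crossover) / scale
    return 1.0 / (1.0 + math.exp(-log_odds))

# e.g. a 1-7 interview coding scheme with hypothetical anchors at 2, 4 and 6
print(round(direct_calibrate(5, non_member=2, crossover=4, full_member=6), 2))
```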
Abstract:
MOTIVATION: Although many network inference algorithms have been presented in the bioinformatics literature, no suitable approach has been formulated for evaluating their effectiveness at recovering models of complex biological systems from limited data. To overcome this limitation, we propose an approach to evaluate network inference algorithms according to their ability to recover a complex functional network from biologically reasonable simulated data. RESULTS: We designed a simulator to generate data representing a complex biological system at multiple levels of organization: behaviour, neural anatomy, brain electrophysiology, and gene expression of songbirds. About 90% of the simulated variables are unregulated by other variables in the system and are included simply as distracters. We sampled the simulated data at intervals as one would sample from a biological system in practice, and then used the sampled data to evaluate the effectiveness of an algorithm we developed for functional network inference. We found that our algorithm is highly effective at recovering the functional network structure of the simulated system, including the irrelevance of unregulated variables, from sampled data alone. To assess the reproducibility of these results, we tested our inference algorithm on 50 separately simulated data sets, and it consistently recovered the complex functional network structure underlying the simulated data almost perfectly. To our knowledge, this is the first approach for evaluating the effectiveness of functional network inference algorithms at recovering models from limited data. Our simulation approach also enables researchers to design, a priori, experiments and data-collection protocols that are amenable to functional network inference.
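As a small illustration of how recovery of a known simulated network might be scored (a generic evaluation, not the paper's specific metric), the sketch below computes edge-level precision and recall against the ground-truth structure; the example edges are hypothetical.

```python
def edge_recovery_scores(true_edges, inferred_edges):
    """Sketch of scoring an inference algorithm against the known
    simulated network (a generic evaluation, not the paper's exact
    metric): edges are (regulator, target) pairs, and precision and
    recall measure how well the recovered functional network matches
    the ground truth."""
    true_set, inferred_set = set(true_edges), set(inferred_edges)
    tp = len(true_set & inferred_set)
    precision = tp / len(inferred_set) if inferred_set else 0.0
    recall = tp / len(true_set) if true_set else 0.0
    return precision, recall

# hypothetical example: ground truth from one simulated run vs. an algorithm's output
print(edge_recovery_scores({("gene1", "lfp"), ("lfp", "song")},
                           {("gene1", "lfp"), ("gene2", "song")}))
```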