2 results for pre-processing quality
in DRUM (Digital Repository at the University of Maryland)
Abstract:
In economics of information theory, credence products are those whose quality is difficult or impossible for consumers to assess, even after they have consumed the product (Darby & Karni, 1973). This dissertation is focused on the content, consumer perception, and power of online reviews for credence services. Economics of information theory has long assumed, without empirical confirmation, that consumers will discount the credibility of claims about credence quality attributes. The same theories predict that because credence services are by definition obscure to the consumer, reviews of credence services are incapable of signaling quality. Our research aims to question these assumptions. In the first essay we examine how the content and structure of online reviews of credence services systematically differ from the content and structure of reviews of experience services and how consumers judge these differences. We have found that online reviews of credence services have either less important or less credible content than reviews of experience services and that consumers do discount the credibility of credence claims. However, while consumers rationally discount the credibility of simple credence claims in a review, more complex argument structure and the inclusion of evidence attenuate this effect. In the second essay we ask, “Can online reviews predict the worst doctors?” We examine the power of online reviews to detect low quality, as measured by state medical board sanctions. We find that online reviews are somewhat predictive of a doctor’s suitability to practice medicine; however, not all the data are useful. Numerical or star ratings provide the strongest quality signal; user-submitted text provides some signal but is subsumed almost completely by ratings. Of the ratings variables in our dataset, we find that punctuality, rather than knowledge, is the strongest predictor of medical board sanctions. These results challenge the definition of credence products, which is a long-standing construct in economics of information theory. Our results also have implications for online review users, review platforms, and for the use of predictive modeling in the context of information systems research.
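As a concrete illustration of the predictive-modeling setup the second essay describes, here is a minimal sketch in Python. Everything in it is hypothetical: the file doctor_reviews.csv, the column names, and the choice of logistic regression are stand-ins for the dissertation's actual data and models, which the abstract does not specify.

```python
# Minimal sketch of a ratings-only sanction predictor, in the spirit of the
# second essay. Dataset, file path, and column names are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

df = pd.read_csv("doctor_reviews.csv")  # hypothetical: one row per doctor

# Ratings variables; the abstract reports that punctuality, rather than
# knowledge, is the strongest predictor of medical board sanctions.
X = df[["overall_stars", "punctuality", "knowledge"]]
y = df["sanctioned"]  # 1 if the state medical board sanctioned the doctor

model = LogisticRegression(max_iter=1000)
auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
print(f"Mean cross-validated AUC: {auc:.3f}")
```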
Abstract:
Natural language processing has achieved great success in a wide range of applications, producing both commercial language services and open-source language tools. However, most methods take a static or batch approach, assuming that the model has all the information it needs and makes a one-time prediction. In this dissertation, we study dynamic problems where the input comes in a sequence instead of all at once, and the output must be produced while the input is arriving. In these problems, predictions are often made based only on partial information. We see this dynamic setting in many real-time, interactive applications. These problems usually involve a trade-off between the amount of input received (cost) and the quality of the output prediction (accuracy). Therefore, the evaluation considers both objectives (e.g., plotting a Pareto curve). Our goal is to develop a formal understanding of sequential prediction and decision-making problems in natural language processing and to propose efficient solutions. Toward this end, we present meta-algorithms that take an existing batch model and produce a dynamic model to handle sequential inputs and outputs. We build our framework upon the theory of Markov Decision Processes (MDPs), which allows learning to trade off competing objectives in a principled way. The main machine learning techniques we use are from imitation learning and reinforcement learning, and we advance current techniques to tackle problems arising in our settings. We evaluate our algorithms on a variety of applications, including dependency parsing, machine translation, and question answering. We show that our approach achieves a better cost-accuracy trade-off than the batch approach and heuristic-based decision-making approaches.

We first propose a general framework for cost-sensitive prediction, where different parts of the input come at different costs. We formulate a decision-making process that selects pieces of the input sequentially, and the selection is adaptive to each instance. Our approach is evaluated on both standard classification tasks and a structured prediction task (dependency parsing). We show that it achieves prediction quality similar to methods that use all the input, while incurring a much smaller cost. Next, we extend the framework to problems where the input is revealed incrementally in a fixed order. We study two applications: simultaneous machine translation and quiz bowl (incremental text classification). We discuss challenges in this setting and show that adding domain knowledge eases the decision-making problem. A central theme throughout the chapters is an MDP formulation of a challenging problem with sequential input/output and trade-off decisions, accompanied by a learning algorithm that solves the MDP.
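To make the cost-accuracy trade-off concrete, here is a minimal sketch of the read-or-commit loop that an MDP formulation of this kind induces. The policy, cost function, and batch model below are hypothetical stand-ins; in the dissertation the policy would be learned with imitation or reinforcement learning rather than hard-coded as it is here.

```python
# Minimal sketch of sequential prediction with partial input: an agent
# consumes input one token at a time and decides, at each step, whether
# to WAIT for more input (paying a cost) or PREDICT with what it has.
from typing import Callable, List, Tuple

def run_episode(
    tokens: List[str],
    policy: Callable[[List[str]], str],       # partial input -> "WAIT" | "PREDICT"
    batch_model: Callable[[List[str]], str],  # an existing batch predictor
    cost_per_token: float = 1.0,
) -> Tuple[str, float]:
    """Consume input incrementally; stop when the policy commits.

    Returns the prediction and the total cost (amount of input consumed),
    the two axes of the cost-accuracy trade-off described above.
    """
    seen: List[str] = []
    for token in tokens:
        seen.append(token)
        if policy(seen) == "PREDICT":
            break
    return batch_model(seen), cost_per_token * len(seen)

# Toy usage: a heuristic policy that commits after three tokens, standing
# in for a learned policy; the batch model is likewise a placeholder.
prediction, cost = run_episode(
    tokens="what is the capital of france".split(),
    policy=lambda seen: "PREDICT" if len(seen) >= 3 else "WAIT",
    batch_model=lambda seen: f"answer given {len(seen)} tokens",
)
print(prediction, cost)
```

Sweeping the commit threshold (here, three tokens) over a range of values and recording (cost, accuracy) pairs is one simple way to plot the Pareto curve the abstract mentions.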