8 results for keyword spotting
at Indian Institute of Science - Bangalore - India
Abstract:
The problem of scaling up data integration, such that new sources can be quickly utilized as they are discovered, remains elusive: global schemas for integrated data are difficult to develop and expand, and schema and record matching techniques are limited by the fact that data and metadata are often under-specified and must be disambiguated by data experts. One promising approach is to avoid using a global schema and instead develop keyword search-based data integration, where the system lazily discovers associations enabling it to join together matches to keywords and return ranked results. The user is expected to understand the data domain and provide feedback about answers' quality. The system generalizes such feedback to learn how to correctly integrate data. A major open challenge is that under this model, the user only sees and offers feedback on a few "top" results: this result set must be carefully selected to include answers of high relevance and answers that are highly informative when feedback is given on them. Existing systems merely focus on predicting relevance, by composing the scores of various schema and record matching algorithms. In this paper, we show how to predict the uncertainty associated with a query result's score, as well as how informative feedback is on a given result. We build upon these foundations to develop an active learning approach to keyword search-based data integration, and we validate the effectiveness of our solution over real data from several very different domains.
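The following is a minimal, hypothetical Python sketch of the selection idea described in this abstract: ranking candidate answers by a mix of predicted relevance and score uncertainty so that the small "top" result set is useful both to the user and as active-learning feedback. The class, function names, and the linear trade-off are illustrative assumptions, not the paper's actual scoring model.

```python
# Hypothetical sketch: pick a small "top" result set that balances predicted
# relevance with how informative user feedback on each result would be.
# The linear trade-off below is an assumption, not the paper's method.
from dataclasses import dataclass

@dataclass
class Candidate:
    answer_id: str
    relevance: float    # composed score from schema/record matchers
    uncertainty: float  # predicted uncertainty of that score

def selection_score(c: Candidate, trade_off: float = 0.5) -> float:
    # Highly relevant answers help the user; highly uncertain answers are
    # the ones on which feedback is most informative for learning.
    return (1 - trade_off) * c.relevance + trade_off * c.uncertainty

def top_results(candidates, k=10, trade_off=0.5):
    return sorted(candidates, key=lambda c: selection_score(c, trade_off),
                  reverse=True)[:k]
```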
Abstract:
Substrates for 2D materials are important for tailoring their fundamental properties and realizing device applications. Aluminum nitride (AlN) films on silicon are promising large-area substrates for such devices in view of their high surface phonon energies and reasonably large dielectric constants. In this paper, epitaxial layers of AlN on 2-inch Si wafers have been investigated as a necessary first step to realize devices from exfoliated or transferred atomic layers. Significant thickness-dependent contrast enhancements are both predicted and observed for monolayers of graphene and MoS2 on AlN films as compared to the conventional SiO2 films on silicon, with calculated contrast values approaching 100% for graphene on AlN as compared to 8% for SiO2 at normal incidence. Quantitative estimates of experimentally measured contrast using reflectance spectroscopy show very good agreement with calculated values. Transistors of monolayer graphene on AlN films are demonstrated, indicating the feasibility of complete device fabrication on the identified layers.
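As a rough illustration of the contrast figures quoted above, here is a small Python sketch using the standard reflectance-contrast definition common in 2D-material visibility studies; the paper's actual values come from full multilayer (Fresnel) reflectance calculations, and the numbers below are placeholders, not measured values.

```python
# Standard optical-contrast definition (assumed here, not taken from the
# paper): C = (R_substrate - R_with_layer) / R_substrate.
def optical_contrast(r_substrate: float, r_with_layer: float) -> float:
    """Fractional change in reflectance when the 2D layer is added."""
    return (r_substrate - r_with_layer) / r_substrate

# Placeholder example: bare-substrate reflectance 0.30 dropping to 0.27
# with a monolayer corresponds to roughly 10% contrast.
print(optical_contrast(0.30, 0.27))  # ~0.1
```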
Abstract:
In this paper, we first describe a framework to model the sponsored search auction on the web as a mechanism design problem. Using this framework, we describe two well-known mechanisms for sponsored search auctions: Generalized Second Price (GSP) and Vickrey-Clarke-Groves (VCG). We then derive a new mechanism for the sponsored search auction which we call the optimal (OPT) mechanism. The OPT mechanism maximizes the search engine's expected revenue, while achieving Bayesian incentive compatibility and individual rationality of the advertisers. We then undertake a detailed comparative study of the mechanisms GSP, VCG, and OPT. We compute and compare the expected revenue earned by the search engine under the three mechanisms when the advertisers are symmetric and some special conditions are satisfied. We also compare the three mechanisms in terms of incentive compatibility, individual rationality, and computational complexity. Note to Practitioners: The advertiser-supported web site is one of the successful business models in the emerging web landscape. When an Internet user enters a keyword (i.e., a search phrase) into a search engine, the user gets back a page with results containing the links most relevant to the query and also sponsored links (also called paid advertisement links). When a sponsored link is clicked, the user is directed to the corresponding advertiser's web page. The advertiser pays the search engine in some appropriate manner for sending the user to its web page. For every search performed by any user on any keyword, the search engine faces the problem of matching a set of advertisers to the sponsored slots. In addition, the search engine also needs to decide on a price to be charged to each advertiser. Due to increasing demand for Internet advertising space, most search engines currently use auction mechanisms for this purpose. These are called sponsored search auctions. A significant percentage of the revenue of Internet giants such as Google, Yahoo!, and MSN comes from sponsored search auctions. In this paper, we study two auction mechanisms, GSP and VCG, which are quite popular in the sponsored search auction context, and pursue the objective of designing a mechanism that is superior to these two mechanisms. In particular, we propose a new mechanism which we call the OPT mechanism. This mechanism maximizes the search engine's expected revenue subject to achieving Bayesian incentive compatibility and individual rationality. Bayesian incentive compatibility guarantees that it is optimal for each advertiser to bid his/her true value provided that all other agents also bid their respective true values. Individual rationality ensures that the agents participate voluntarily in the auction since they are assured of gaining a non-negative payoff by doing so.
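To make the two baseline mechanisms concrete, the Python sketch below computes per-click payments under GSP and under the textbook VCG rule for a slot auction with known click-through rates. This is standard material, not the paper's OPT mechanism, and it assumes more bidders than slots and separable click-through rates.

```python
# Baseline pricing rules for a slot auction (textbook versions, not the
# paper's OPT mechanism). Bids are sorted in decreasing order; ctrs[j] is
# the click-through rate of slot j; we assume len(bids) > len(ctrs).
def gsp_payments(bids, ctrs):
    """GSP: the winner of slot j pays the (j+1)-th highest bid per click."""
    bids = sorted(bids, reverse=True)
    return [bids[j + 1] for j in range(len(ctrs))]

def vcg_payments(bids, ctrs):
    """VCG per-click payment for slot j:
    p_j = (1 / ctr_j) * sum_{s >= j} (ctr_s - ctr_{s+1}) * b_{s+1}."""
    bids = sorted(bids, reverse=True)
    m = len(ctrs)
    payments = []
    for j in range(m):
        total = 0.0
        for s in range(j, m):
            next_ctr = ctrs[s + 1] if s + 1 < m else 0.0
            total += (ctrs[s] - next_ctr) * bids[s + 1]
        payments.append(total / ctrs[j])
    return payments

# Example: 4 advertisers, 3 slots.
print(gsp_payments([10, 8, 5, 2], [0.3, 0.2, 0.1]))  # [8, 5, 2]
print(vcg_payments([10, 8, 5, 2], [0.3, 0.2, 0.1]))  # ~[5.0, 3.5, 2.0]
```

In this example the VCG per-click prices are never higher than the GSP prices for the same slots, which is one reason revenue comparisons between the two mechanisms are non-trivial.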
Abstract:
The keyword-based search technique suffers from the problems of synonymic and polysemic queries. Current approaches address only the problem of synonymic queries, in which different queries may express the same information requirement. The problem of polysemic queries, i.e., the same query having different intentions, still remains unaddressed. In this paper, we propose the notion of intent clusters, the members of which have the same intention. We develop a clustering algorithm that uses the user session information in query logs, in addition to query-URL entries, to identify clusters of queries having the same intention. The proposed approach has been studied through case examples from actual AOL log data, and the clustering algorithm is shown to be successful in discerning user intentions.
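The sketch below is an illustrative, hypothetical Python take on the overall idea: grouping queries whose clicked URLs overlap as a crude proxy for shared intent. It is not the paper's algorithm, and it omits the session information the paper also exploits.

```python
# Illustrative intent grouping from a click log (not the paper's algorithm).
from collections import defaultdict

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if (a | b) else 0.0

def intent_clusters(click_log, threshold=0.3):
    """click_log: iterable of (query, clicked_url) pairs."""
    urls_by_query = defaultdict(set)
    for query, url in click_log:
        urls_by_query[query].add(url)
    clusters = []  # list of (set_of_queries, set_of_clicked_urls)
    for query, urls in urls_by_query.items():
        for queries, cluster_urls in clusters:
            # Queries whose clicked-URL sets overlap enough are assumed
            # to share an intent.
            if jaccard(urls, cluster_urls) >= threshold:
                queries.add(query)
                cluster_urls |= urls
                break
        else:
            clusters.append(({query}, set(urls)))
    return [queries for queries, _ in clusters]
```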
Abstract:
In pay-per-click sponsored search auctions, which are currently extensively used by search engines, the auction for a keyword involves a certain number of advertisers (say k) competing for available slots (say m) to display their ads. This auction is typically conducted for a number of rounds (say T). There are click probabilities mu_ij associated with agent-slot pairs. The search engine's goal is to maximize social welfare, for example, the sum of values of the advertisers. The search engine does not know the true value of an advertiser for a click to her ad and also does not know the click probabilities mu_ij. A key problem for the search engine therefore is to learn these during the T rounds of the auction and also to ensure that the auction mechanism is truthful. Mechanisms for addressing such learning and incentive issues have recently been introduced and are referred to as multi-armed bandit (MAB) mechanisms. When m = 1, characterizations for truthful MAB mechanisms are available in the literature, and it has been shown that the regret for such mechanisms will be O(T^{2/3}). In this paper, we seek to derive a characterization in the realistic but nontrivial general case when m > 1 and obtain several interesting results.
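For intuition on where the O(T^{2/3}) bound comes from in the single-slot case, here is a rough explore-then-exploit Python sketch: explore for about T^(2/3) rounds, then commit. It is only a caricature of the MAB mechanisms discussed in the paper; in particular, the payment rule that makes such mechanisms truthful is omitted, and all names here are illustrative.

```python
# Rough explore-then-exploit sketch for the single-slot (m = 1) case.
# The ~T^(2/3)-length exploration phase is what leads to O(T^(2/3)) regret.
# Payments (the part that makes the mechanism truthful) are omitted.
import random

def explore_then_exploit(bids, true_ctrs, T):
    k = len(bids)
    explore_rounds = int(T ** (2 / 3))
    clicks, shown = [0] * k, [0] * k
    for t in range(explore_rounds):                    # round-robin exploration
        i = t % k
        shown[i] += 1
        clicks[i] += random.random() < true_ctrs[i]    # simulated click
    est_ctrs = [clicks[i] / shown[i] if shown[i] else 0.0 for i in range(k)]
    # Exploit: show, in all remaining rounds, the ad with the highest
    # estimated value per impression (bid times estimated click probability).
    winner = max(range(k), key=lambda i: bids[i] * est_ctrs[i])
    return winner, est_ctrs
```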
Abstract:
Bid optimization is now becoming quite popular in sponsored search auctions on the Web. Given a keyword and the maximum willingness to pay of each advertiser interested in the keyword, the bid optimizer generates a profile of bids for the advertisers with the objective of maximizing customer retention without compromising the revenue of the search engine. In this paper, we present a bid optimization algorithm that is based on a Nash bargaining model where the first player is the search engine and the second player is a virtual agent representing all the bidders. We make the realistic assumption that each bidder specifies a maximum willingness-to-pay value and a discrete, finite set of bid values. We show that the Nash bargaining solution for this problem always lies on a certain edge of the convex hull such that one endpoint of the edge is the vector of maximum willingness to pay of all the bidders. We show that the other endpoint of this edge can be computed as the solution of a linear programming problem. We also show how the solution can be transformed into a bid profile for the advertisers.
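For reference, the generic two-player Nash bargaining objective underlying such a model is shown below in LaTeX. This is the standard textbook formulation; the paper's exact utility functions and disagreement point are not stated in the abstract and are therefore left unspecified here.

```latex
% Standard two-player Nash bargaining objective (generic form):
% u_1 = search engine's utility, u_2 = virtual bidder agent's utility,
% (d_1, d_2) = disagreement point, \mathcal{F} = feasible utility set.
\max_{(u_1, u_2) \in \mathcal{F}} \; (u_1 - d_1)\,(u_2 - d_2)
```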
Abstract:
In pay-per-click sponsored search auctions, which are currently extensively used by search engines, the auction for a keyword involves a certain number of advertisers (say k) competing for available slots (say m) to display their advertisements (ads for short). A sponsored search auction for a keyword is typically conducted for a number of rounds (say T). There are click probabilities mu_ij associated with each agent-slot pair (agent i and slot j). The search engine would like to maximize the social welfare of the advertisers, that is, the sum of values of the advertisers for the keyword. However, the search engine does not know the true values advertisers have for a click to their respective advertisements and also does not know the click probabilities. A key problem for the search engine therefore is to learn these click probabilities during the initial rounds of the auction and also to ensure that the auction mechanism is truthful. Mechanisms for addressing such learning and incentive issues have recently been introduced. These mechanisms, due to their connection to the multi-armed bandit problem, are aptly referred to as multi-armed bandit (MAB) mechanisms. When m = 1, exact characterizations for truthful MAB mechanisms are available in the literature. Recent work has focused on the more realistic but non-trivial general case when m > 1, and a few promising results have started appearing. In this article, we consider this general case when m > 1 and prove several interesting results. Our contributions include: (1) When the mu_ij are unconstrained, we prove that any truthful mechanism must satisfy strong pointwise monotonicity and show that the regret will be Theta(T^{2/3}) for such mechanisms. (2) When the clicks on the ads follow a certain click precedence property, we show that weak pointwise monotonicity is necessary for MAB mechanisms to be truthful. (3) If the search engine has a certain coarse pre-estimate of the mu_ij values and wishes to update them during the course of the T rounds, we show that weak pointwise monotonicity and type-I separatedness are necessary, while weak pointwise monotonicity and type-II separatedness are sufficient, conditions for the MAB mechanisms to be truthful. (4) If the click probabilities are separable into agent-specific and slot-specific terms, we provide a characterization of MAB mechanisms that are truthful in expectation.
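For result (4), the separability assumption referred to above is, in its standard form, that each click probability factors into an advertiser-specific term and a slot-specific term. A LaTeX statement of this common assumption (not necessarily the paper's exact notation) is:

```latex
% Separable click probabilities: \rho_i is agent i's ad-specific term and
% \lambda_j is slot j's position-specific term.
\mu_{ij} = \rho_i \, \lambda_j
```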