18 resultados para Web search engines

em Indian Institute of Science - Bangalore - Índia


Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we first describe a framework to model the sponsored search auction on the web as a mechanism design problem. Using this framework, we describe two well-known mechanisms for sponsored search auction-Generalized Second Price (GSP) and Vickrey-Clarke-Groves (VCG). We then derive a new mechanism for sponsored search auction which we call optimal (OPT) mechanism. The OPT mechanism maximizes the search engine's expected revenue, while achieving Bayesian incentive compatibility and individual rationality of the advertisers. We then undertake a detailed comparative study of the mechanisms GSP, VCG, and OPT. We compute and compare the expected revenue earned by the search engine under the three mechanisms when the advertisers are symmetric and some special conditions are satisfied. We also compare the three mechanisms in terms of incentive compatibility, individual rationality, and computational complexity. Note to Practitioners-The advertiser-supported web site is one of the successful business models in the emerging web landscape. When an Internet user enters a keyword (i.e., a search phrase) into a search engine, the user gets back a page with results, containing the links most relevant to the query and also sponsored links, (also called paid advertisement links). When a sponsored link is clicked, the user is directed to the corresponding advertiser's web page. The advertiser pays the search engine in some appropriate manner for sending the user to its web page. Against every search performed by any user on any keyword, the search engine faces the problem of matching a set of advertisers to the sponsored slots. In addition, the search engine also needs to decide on a price to be charged to each advertiser. Due to increasing demands for Internet advertising space, most search engines currently use auction mechanisms for this purpose. These are called sponsored search auctions. A significant percentage of the revenue of Internet giants such as Google, Yahoo!, MSN, etc., comes from sponsored search auctions. In this paper, we study two auction mechanisms, GSP and VCG, which are quite popular in the sponsored auction context, and pursue the objective of designing a mechanism that is superior to these two mechanisms. In particular, we propose a new mechanism which we call the OPT mechanism. This mechanism maximizes the search engine's expected revenue subject to achieving Bayesian incentive compatibility and individual rationality. Bayesian incentive compatibility guarantees that it is optimal for each advertiser to bid his/her true value provided that all other agents also bid their respective true values. Individual rationality ensures that the agents participate voluntarily in the auction since they are assured of gaining a non-negative payoff by doing so.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In pay-per-click sponsored search auctions which are currently extensively used by search engines, the auction for a keyword involves a certain number of advertisers (say k) competing for available slots (say m) to display their advertisements (ads for short). A sponsored search auction for a keyword is typically conducted for a number of rounds (say T). There are click probabilities mu(ij) associated with each agent slot pair (agent i and slot j). The search engine would like to maximize the social welfare of the advertisers, that is, the sum of values of the advertisers for the keyword. However, the search engine does not know the true values advertisers have for a click to their respective advertisements and also does not know the click probabilities. A key problem for the search engine therefore is to learn these click probabilities during the initial rounds of the auction and also to ensure that the auction mechanism is truthful. Mechanisms for addressing such learning and incentives issues have recently been introduced. These mechanisms, due to their connection to the multi-armed bandit problem, are aptly referred to as multi-armed bandit (MAB) mechanisms. When m = 1, exact characterizations for truthful MAB mechanisms are available in the literature. Recent work has focused on the more realistic but non-trivial general case when m > 1 and a few promising results have started appearing. In this article, we consider this general case when m > 1 and prove several interesting results. Our contributions include: (1) When, mu(ij)s are unconstrained, we prove that any truthful mechanism must satisfy strong pointwise monotonicity and show that the regret will be Theta T7) for such mechanisms. (2) When the clicks on the ads follow a certain click precedence property, we show that weak pointwise monotonicity is necessary for MAB mechanisms to be truthful. (3) If the search engine has a certain coarse pre-estimate of mu(ij) values and wishes to update them during the course of the T rounds, we show that weak pointwise monotonicity and type-I separatedness are necessary while weak pointwise monotonicity and type-II separatedness are sufficient conditions for the MAB mechanisms to be truthful. (4) If the click probabilities are separable into agent-specific and slot-specific terms, we provide a characterization of MAB mechanisms that are truthful in expectation.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Our study concerns an important current problem, that of diffusion of information in social networks. This problem has received significant attention from the Internet research community in the recent times, driven by many potential applications such as viral marketing and sales promotions. In this paper, we focus on the target set selection problem, which involves discovering a small subset of influential players in a given social network, to perform a certain task of information diffusion. The target set selection problem manifests in two forms: 1) top-k nodes problem and 2) lambda-coverage problem. In the top-k nodes problem, we are required to find a set of k key nodes that would maximize the number of nodes being influenced in the network. The lambda-coverage problem is concerned with finding a set of k key nodes having minimal size that can influence a given percentage lambda of the nodes in the entire network. We propose a new way of solving these problems using the concept of Shapley value which is a well known solution concept in cooperative game theory. Our approach leads to algorithms which we call the ShaPley value-based Influential Nodes (SPINs) algorithms for solving the top-k nodes problem and the lambda-coverage problem. We compare the performance of the proposed SPIN algorithms with well known algorithms in the literature. Through extensive experimentation on four synthetically generated random graphs and six real-world data sets (Celegans, Jazz, NIPS coauthorship data set, Netscience data set, High-Energy Physics data set, and Political Books data set), we show that the proposed SPIN approach is more powerful and computationally efficient. Note to Practitioners-In recent times, social networks have received a high level of attention due to their proven ability in improving the performance of web search, recommendations in collaborative filtering systems, spreading a technology in the market using viral marketing techniques, etc. It is well known that the interpersonal relationships (or ties or links) between individuals cause change or improvement in the social system because the decisions made by individuals are influenced heavily by the behavior of their neighbors. An interesting and key problem in social networks is to discover the most influential nodes in the social network which can influence other nodes in the social network in a strong and deep way. This problem is called the target set selection problem and has two variants: 1) the top-k nodes problem, where we are required to identify a set of k influential nodes that maximize the number of nodes being influenced in the network and 2) the lambda-coverage problem which involves finding a set of influential nodes having minimum size that can influence a given percentage lambda of the nodes in the entire network. There are many existing algorithms in the literature for solving these problems. In this paper, we propose a new algorithm which is based on a novel interpretation of information diffusion in a social network as a cooperative game. Using this analogy, we develop an algorithm based on the Shapley value of the underlying cooperative game. The proposed algorithm outperforms the existing algorithms in terms of generality or computational complexity or both. Our results are validated through extensive experimentation on both synthetically generated and real-world data sets.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In pay-per click sponsored search auctions which are currently extensively used by search engines, the auction for a keyword involves a certain number of advertisers (say k) competing for available slots (say m) to display their ads. This auction is typically conducted for a number of rounds (say T). There are click probabilities mu_ij associated with agent-slot pairs. The search engine's goal is to maximize social welfare, for example, the sum of values of the advertisers. The search engine does not know the true value of an advertiser for a click to her ad and also does not know the click probabilities mu_ij s. A key problem for the search engine therefore is to learn these during the T rounds of the auction and also to ensure that the auction mechanism is truthful. Mechanisms for addressing such learning and incentives issues have recently been introduced and would be referred to as multi-armed-bandit (MAB) mechanisms. When m = 1,characterizations for truthful MAB mechanisms are available in the literature and it has been shown that the regret for such mechanisms will be O(T^{2/3}). In this paper, we seek to derive a characterization in the realistic but nontrivial general case when m > 1 and obtain several interesting results.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Users can rarely reveal their information need in full detail to a search engine within 1--2 words, so search engines need to "hedge their bets" and present diverse results within the precious 10 response slots. Diversity in ranking is of much recent interest. Most existing solutions estimate the marginal utility of an item given a set of items already in the response, and then use variants of greedy set cover. Others design graphs with the items as nodes and choose diverse items based on visit rates (PageRank). Here we introduce a radically new and natural formulation of diversity as finding centers in resistive graphs. Unlike in PageRank, we do not specify the edge resistances (equivalently, conductances) and ask for node visit rates. Instead, we look for a sparse set of center nodes so that the effective conductance from the center to the rest of the graph has maximum entropy. We give a cogent semantic justification for turning PageRank thus on its head. In marked deviation from prior work, our edge resistances are learnt from training data. Inference and learning are NP-hard, but we give practical solutions. In extensive experiments with subtopic retrieval, social network search, and document summarization, our approach convincingly surpasses recently-published diversity algorithms like subtopic cover, max-marginal relevance (MMR), Grasshopper, DivRank, and SVMdiv.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Owing to high evolutionary divergence, it is not always possible to identify distantly related protein domains by sequence search techniques. Intermediate sequences possess sequence features of more than one protein and facilitate detection of remotely related proteins. We have demonstrated recently the employment of Cascade PSI-BLAST where we perform PSI-BLAST for many 'generations', initiating searches from new homologues as well. Such a rigorous propagation through generations of PSI-BLAST employs effectively the role of intermediates in detecting distant similarities between proteins. This approach has been tested on a large number of folds and its performance in detecting superfamily level relationships is similar to 35% better than simple PSI-BLAST searches. We present a web server for this search method that permits users to perform Cascade PSI-BLAST searches against the Pfam, SCOP and SwissProt databases. The URL for this server is http://crick.mbu.iisc.ernet.in/similar to CASCADE/CascadeBlast.html.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

In this paper, we address a key problem faced by advertisers in sponsored search auctions on the web: how much to bid, given the bids of the other advertisers, so as to maximize individual payoffs? Assuming the generalized second price auction as the auction mechanism, we formulate this problem in the framework of an infinite horizon alternative-move game of advertiser bidding behavior. For a sponsored search auction involving two advertisers, we characterize all the pure strategy and mixed strategy Nash equilibria. We also prove that the bid prices will lead to a Nash equilibrium, if the advertisers follow a myopic best response bidding strategy. Following this, we investigate the bidding behavior of the advertisers if they use Q-learning. We discover empirically an interesting trend that the Q-values converge even if both the advertisers learn simultaneously.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Packet forwarding is a memory-intensive application requiring multiple accesses through a trie structure. With the requirement to process packets at line rates, high-performance routers need to forward millions of packets every second with each packet needing up to seven memory accesses. Earlier work shows that a single cache for the nodes of a trie can reduce the number of external memory accesses. It is observed that the locality characteristics of the level-one nodes of a trie are significantly different from those of lower level nodes. Hence, we propose a heterogeneously segmented cache architecture (HSCA) which uses separate caches for level-one and lower level nodes, each with carefully chosen sizes. Besides reducing misses, segmenting the cache allows us to focus on optimizing the more frequently accessed level-one node segment. We find that due to the nonuniform distribution of nodes among cache sets, the level-one nodes cache is susceptible t high conflict misses. We reduce conflict misses by introducing a novel two-level mapping-based cache placement framework. We also propose an elegant way to fit the modified placement function into the cache organization with minimal increase in access time. Further, we propose an attribute preserving trace generation methodology which emulates real traces and can generate traces with varying locality. Performanc results reveal that our HSCA scheme results in a 32 percent speedup in average memory access time over a unified nodes cache. Also, HSC outperforms IHARC, a cache for lookup results, with as high as a 10-fold speedup in average memory access time. Two-level mappin further enhances the performance of the base HSCA by up to 13 percent leading to an overall improvement of up to 40 percent over the unified scheme.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The existing internet computing resource, Biomolecules Segment Display Device (BSDD), has been updated with several additional useful features. An advanced option is provided to superpose the structural motifs obtained from a search on the Protein Data Bank (PDB) in order to see if the three-dimensional structures adopted by identical or similar sequence motifs are the same. Furthermore, the options to display structural aspects like inter- and intra-molecular interactions, ion-pairs, disulphide bonds, etc. have been provided.The updated resource is interfaced with an up-to-date copy of the public domain PDB as well as 25 and 90% non-redundant protein structures. Further, users can upload the three-dimensional atomic coordinates (PDB format) from the client machine. A free molecular graphics program, JMol, is interfaced with it to display the three-dimensional structures.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The operational life and reliability of I.C. engines are limited to a certain extent by the break down of the engine components due to wear. It is advantageous to know the condition of an engine and its components without disassembling for detailed measurements. This paper describes the possibility of employing chemical analysis of the used crank case oil to predict the wear of engine components. It is concluded that the acidity and carbon contents of the crank case oil play a significant role in assessing the wear of copper-lead bearings used for the big end of the connecting rod.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The operational life and reliability of I.C. engines are limited to a certain extent by the break down of the engine components due to wear. It is advantageous to know the condition of an engine and its components without disassembling for detailed measurements. This paper describes the possibility of employing chemical analysis of the used crank case oil to predict the wear of engine components. It is concluded that the acidity and carbon contents of the crank case oil play a significant role in assessing the wear of copper-lead bearings used for the big end of the connecting rod.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Business processes and application functionality are becoming available as internal web services inside enterprise boundaries as well as becoming available as commercial web services from enterprise solution vendors and web services marketplaces. Typically there are multiple web service providers offering services capable of fulfilling a particular functionality, although with different Quality of Service (QoS). Dynamic creation of business processes requires composing an appropriate set of web services that best suit the current need. This paper presents a novel combinatorial auction approach to QoS aware dynamic web services composition. Such an approach would enable not only stand-alone web services but also composite web services to be a part of a business process. The combinatorial auction leads to an integer programming formulation for the web services composition problem. An important feature of the model is the incorporation of service level agreements. We describe a software tool QWESC for QoS-aware web services composition based on the proposed approach.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In this paper we first describe a framework to model the sponsored search auction on the web as a mechanism design problem. Using this framework, we design a novel auction which we call the OPT (optimal) auction. The OPT mechanism maximizes the search engine's expected revenue while achieving Bayesian incentive compatibility and individual rationality of the advertisers. We show that the OPT mechanism is superior to two of the most commonly used mechanisms for sponsored search namely (1) GSP (Generalized Second Price) and (2) VCG (Vickrey-Clarke-Groves). We then show an important revenue equivalence result that the expected revenue earned by the search engine is the same for all the three mechanisms provided the advertisers are symmetric and the number of sponsored slots is strictly less than the number of advertisers.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

The protein-protein docking programs typically perform four major tasks: (i) generation of docking poses, (ii) selecting a subset of poses, (iii) their structural refinement and (iv) scoring, ranking for the final assessment of the true quaternary structure. Although the tasks can be integrated or performed in a serial order, they are by nature modular, allowing an opportunity to substitute one algorithm with another. We have implemented two modular web services, (i) PRUNE: to select a subset of docking poses generated during sampling search (http://pallab.serc.iisc.ernet.in/prune) and (ii) PROBE: to refine, score and rank them (http://pallab.serc.iisc.ernet.in/probe). The former uses a new interface area based edge-scoring function to eliminate > 95% of the poses generated during docking search. In contrast to other multi-parameter-based screening functions, this single parameter based elimination reduces the computational time significantly, in addition to increasing the chances of selecting native-like models in the top rank list. The PROBE server performs ranking of pruned poses, after structure refinement and scoring using a regression model for geometric compatibility, and normalized interaction energy. While web-service similar to PROBE is infrequent, no web-service akin to PRUNE has been described before. Both the servers are publicly accessible and free for use.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

With the emergence of Internet, the global connectivity of computers has become a reality. Internet has progressed to provide many user-friendly tools like Gopher, WAIS, WWW etc. for information publishing and access. The WWW, which integrates all other access tools, also provides a very convenient means for publishing and accessing multimedia and hypertext linked documents stored in computers spread across the world. With the emergence of WWW technology, most of the information activities are becoming Web-centric. Once the information is published on the Web, a user can access this information from any part of the world. A Web browser like Netscape or Internet Explorer is used as a common user interface for accessing information/databases. This will greatly relieve a user from learning the search syntax of individual information systems. Libraries are taking advantage of these developments to provide access to their resources on the Web. CDS/ISIS is a very popular bibliographic information management software used in India. In this tutorial we present details of integrating CDS/ISIS with the WWW. A number of tools are now available for making CDS/ISIS database accessible on the Internet/Web. Some of these are 1) the WAIS_ISIS Server. 2) the WWWISIS Server 3) the IQUERY Server. In this tutorial, we have explained in detail the steps involved in providing Web access to an existing CDS/ISIS database using the freely available software, WWWISIS. This software is developed, maintained and distributed by BIREME, the Latin American & Caribbean Centre on Health Sciences Information. WWWISIS acts as a server for CDS/ISIS databases in a WWW client/server environment. It supports functions for searching, formatting and data entry operations over CDS/ISIS databases. WWWISIS is available for various operating systems. We have tested this software on Windows '95, Windows NT and Red Hat Linux release 5.2 (Appolo) Kernel 2. 0. 36 on an i686. The testing was carried out using IISc's main library's OPAC containing more than 80,000 records and Current Contents issues (bibliographic data) containing more than 25,000 records. WWWISIS is fully compatible with CDS/ISIS 3.07 file structure. However, on a system running Unix or its variant, there is no guarantee of this compatibility. It is therefore safe to recreate the master and the inverted files, using utilities provided by BIREME, under Unix environment.