42 resultados para Web documents
em Indian Institute of Science - Bangalore - Índia
Resumo:
With the emergence of Internet, the global connectivity of computers has become a reality. Internet has progressed to provide many user-friendly tools like Gopher, WAIS, WWW etc. for information publishing and access. The WWW, which integrates all other access tools, also provides a very convenient means for publishing and accessing multimedia and hypertext linked documents stored in computers spread across the world. With the emergence of WWW technology, most of the information activities are becoming Web-centric. Once the information is published on the Web, a user can access this information from any part of the world. A Web browser like Netscape or Internet Explorer is used as a common user interface for accessing information/databases. This will greatly relieve a user from learning the search syntax of individual information systems. Libraries are taking advantage of these developments to provide access to their resources on the Web. CDS/ISIS is a very popular bibliographic information management software used in India. In this tutorial we present details of integrating CDS/ISIS with the WWW. A number of tools are now available for making CDS/ISIS database accessible on the Internet/Web. Some of these are 1) the WAIS_ISIS Server. 2) the WWWISIS Server 3) the IQUERY Server. In this tutorial, we have explained in detail the steps involved in providing Web access to an existing CDS/ISIS database using the freely available software, WWWISIS. This software is developed, maintained and distributed by BIREME, the Latin American & Caribbean Centre on Health Sciences Information. WWWISIS acts as a server for CDS/ISIS databases in a WWW client/server environment. It supports functions for searching, formatting and data entry operations over CDS/ISIS databases. WWWISIS is available for various operating systems. We have tested this software on Windows '95, Windows NT and Red Hat Linux release 5.2 (Appolo) Kernel 2. 0. 36 on an i686. The testing was carried out using IISc's main library's OPAC containing more than 80,000 records and Current Contents issues (bibliographic data) containing more than 25,000 records. WWWISIS is fully compatible with CDS/ISIS 3.07 file structure. However, on a system running Unix or its variant, there is no guarantee of this compatibility. It is therefore safe to recreate the master and the inverted files, using utilities provided by BIREME, under Unix environment.
Resumo:
Sequence-structure correlation studies are important in deciphering the relationships between various structural aspects, which may shed light on the protein-folding problem. The first step of this process is the prediction of secondary structure for a protein sequence of unknown three-dimensional structure. To this end, a web server has been created to predict the consensus secondary structure using well known algorithms from the literature. Furthermore, the server allows users to see the occurrence of predicted secondary structural elements in other structure and sequence databases and to visualize predicted helices as a helical wheel plot. The web server is accessible at http://bioserver1.physics.iisc.ernet.in/cssp/.
Resumo:
The existing internet computing resource, Biomolecules Segment Display Device (BSDD), has been updated with several additional useful features. An advanced option is provided to superpose the structural motifs obtained from a search on the Protein Data Bank (PDB) in order to see if the three-dimensional structures adopted by identical or similar sequence motifs are the same. Furthermore, the options to display structural aspects like inter- and intra-molecular interactions, ion-pairs, disulphide bonds, etc. have been provided.The updated resource is interfaced with an up-to-date copy of the public domain PDB as well as 25 and 90% non-redundant protein structures. Further, users can upload the three-dimensional atomic coordinates (PDB format) from the client machine. A free molecular graphics program, JMol, is interfaced with it to display the three-dimensional structures.
Resumo:
In this paper, we first describe a framework to model the sponsored search auction on the web as a mechanism design problem. Using this framework, we describe two well-known mechanisms for sponsored search auction-Generalized Second Price (GSP) and Vickrey-Clarke-Groves (VCG). We then derive a new mechanism for sponsored search auction which we call optimal (OPT) mechanism. The OPT mechanism maximizes the search engine's expected revenue, while achieving Bayesian incentive compatibility and individual rationality of the advertisers. We then undertake a detailed comparative study of the mechanisms GSP, VCG, and OPT. We compute and compare the expected revenue earned by the search engine under the three mechanisms when the advertisers are symmetric and some special conditions are satisfied. We also compare the three mechanisms in terms of incentive compatibility, individual rationality, and computational complexity. Note to Practitioners-The advertiser-supported web site is one of the successful business models in the emerging web landscape. When an Internet user enters a keyword (i.e., a search phrase) into a search engine, the user gets back a page with results, containing the links most relevant to the query and also sponsored links, (also called paid advertisement links). When a sponsored link is clicked, the user is directed to the corresponding advertiser's web page. The advertiser pays the search engine in some appropriate manner for sending the user to its web page. Against every search performed by any user on any keyword, the search engine faces the problem of matching a set of advertisers to the sponsored slots. In addition, the search engine also needs to decide on a price to be charged to each advertiser. Due to increasing demands for Internet advertising space, most search engines currently use auction mechanisms for this purpose. These are called sponsored search auctions. A significant percentage of the revenue of Internet giants such as Google, Yahoo!, MSN, etc., comes from sponsored search auctions. In this paper, we study two auction mechanisms, GSP and VCG, which are quite popular in the sponsored auction context, and pursue the objective of designing a mechanism that is superior to these two mechanisms. In particular, we propose a new mechanism which we call the OPT mechanism. This mechanism maximizes the search engine's expected revenue subject to achieving Bayesian incentive compatibility and individual rationality. Bayesian incentive compatibility guarantees that it is optimal for each advertiser to bid his/her true value provided that all other agents also bid their respective true values. Individual rationality ensures that the agents participate voluntarily in the auction since they are assured of gaining a non-negative payoff by doing so.
Resumo:
The identification of sequence (amino acids or nucleotides) motifs in a particular order in biological sequences has proved to be of interest. This paper describes a computing server, SSMBS, which can locate anddisplay the occurrences of user-defined biologically important sequence motifs (a maximum of five) present in a specific order in protein and nucleotide sequences. While the server can efficiently locate motifs specified using regular expressions, it can also find occurrences of long and complex motifs. The computation is carried out by an algorithm developed using the concepts of quantifiers in regular expressions. The web server is available to users around the clock at http://dicsoft1.physics.iisc.ernet.in/ssmbs/.
Resumo:
We report a hierarchical blind script identifier for 11 different Indian scripts. An initial grouping of the 11 scripts is accomplished at the first level of this hierarchy. At the subsequent level, we recognize the script in each group. The various nodes of this tree use different feature-classifier combinations. A database of 20,000 words of different font styles and sizes is collected and used for each script. Effectiveness of Gabor and Discrete Cosine Transform features has been independently, evaluated using nearest neighbor linear discriminant and support vector machine classifiers. The minimum and maximum accuracies obtained, using this hierarchical mechanism, are 92.2% and 97.6%, respectively.
Resumo:
Business processes and application functionality are becoming available as internal web services inside enterprise boundaries as well as becoming available as commercial web services from enterprise solution vendors and web services marketplaces. Typically there are multiple web service providers offering services capable of fulfilling a particular functionality, although with different Quality of Service (QoS). Dynamic creation of business processes requires composing an appropriate set of web services that best suit the current need. This paper presents a novel combinatorial auction approach to QoS aware dynamic web services composition. Such an approach would enable not only stand-alone web services but also composite web services to be a part of a business process. The combinatorial auction leads to an integer programming formulation for the web services composition problem. An important feature of the model is the incorporation of service level agreements. We describe a software tool QWESC for QoS-aware web services composition based on the proposed approach.
Resumo:
This paper describes an approach based on Zernike moments and Delaunay triangulation for localization of hand-written text in machine printed text documents. The Zernike moments of the image are first evaluated and we classify the text as hand-written using the nearest neighbor classifier. These features are independent of size, slant, orientation, translation and other variations in handwritten text. We then use Delaunay triangulation to reclassify the misclassified text regions. When imposing Delaunay triangulation on the centroid points of the connected components, we extract features based on the triangles and reclassify the text. We remove the noise components in the document as part of the preprocessing step so this method works well on noisy documents. The success rate of the method is found to be 86%. Also for specific hand-written elements such as signatures or similar text the accuracy is found to be even higher at 93%.
Resumo:
We propose a novel, language-neutral approach for searching online handwritten text using Frechet distance. Online handwritten data, which is available as a time series (x,y,t), is treated as representing a parameterized curve in two-dimensions and the problem of searching online handwritten text is posed as a problem of matching two curves in a two-dimensional Euclidean space. Frechet distance is a natural measure for matching curves. The main contribution of this paper is the formulation of a variant of Frechet distance that can be used for retrieving words even when only a prefix of the word is given as query. Extensive experiments on UNIPEN dataset(1) consisting of over 16,000 words written by 7 users show that our method outperforms the state-of-the-art DTW method. Experiments were also conducted on a Multilingual dataset, generated on a PDA, with encouraging results. Our approach can be used to implement useful, exciting features like auto-completion of handwriting in PDAs.
Resumo:
Extensible Markup Language ( XML) has emerged as a medium for interoperability over the Internet. As the number of documents published in the form of XML is increasing, there is a need for selective dissemination of XML documents based on user interests. In the proposed technique, a combination of Adaptive Genetic Algorithms and multi class Support Vector Machine ( SVM) is used to learn a user model. Based on the feedback from the users, the system automatically adapts to the user's preference and interests. The user model and a similarity metric are used for selective dissemination of a continuous stream of XML documents. Experimental evaluations performed over a wide range of XML documents, indicate that the proposed approach significantly improves the performance of the selective dissemination task, with respect to accuracy and efficiency.
Resumo:
Extensible Markup Language ( XML) has emerged as a medium for interoperability over the Internet. As the number of documents published in the form of XML is increasing, there is a need for selective dissemination of XML documents based on user interests. In the proposed technique, a combination of Adaptive Genetic Algorithms and multi class Support Vector Machine ( SVM) is used to learn a user model. Based on the feedback from the users, the system automatically adapts to the user's preference and interests. The user model and a similarity metric are used for selective dissemination of a continuous stream of XML documents. Experimental evaluations performed over a wide range of XML documents, indicate that the proposed approach significantly improves the performance of the selective dissemination task, with respect to accuracy and efficiency.
Resumo:
XML has emerged as a medium for interoperability over the Internet. As the number of documents published in the form of XML is increasing there is a need for selective dissemination of XML documents based on user interests. In the proposed technique, a combination of Self Adaptive Migration Model Genetic Algorithm (SAMCA)[5] and multi class Support Vector Machine (SVM) are used to learn a user model. Based on the feedback from the users the system automatically adapts to the user's preference and interests. The user model and a similarity metric are used for selective dissemination of a continuous stream of XML documents. Experimental evaluations performed over a wide range of XML documents indicate that the proposed approach significantly improves the performance of the selective dissemination task, with respect to accuracy and efficiency.
Resumo:
Encoding protein 3D structures into 1D string using short structural prototypes or structural alphabets opens a new front for structure comparison and analysis. Using the well-documented 16 motifs of Protein Blocks (PBs) as structural alphabet, we have developed a methodology to compare protein structures that are encoded as sequences of PBs by aligning them using dynamic programming which uses a substitution matrix for PBs. This methodology is implemented in the applications available in Protein Block Expert (PBE) server. PBE addresses common issues in the field of protein structure analysis such as comparison of proteins structures and identification of protein structures in structural databanks that resemble a given structure. PBE-T provides facility to transform any PDB file into sequences of PBs. PBE-ALIGNc performs comparison of two protein structures based on the alignment of their corresponding PB sequences. PBE-ALIGNm is a facility for mining SCOP database for similar structures based on the alignment of PBs. Besides, PBE provides an interface to a database (PBE-SAdb) of preprocessed PB sequences from SCOP culled at 95% and of all-against-all pairwise PB alignments at family and superfamily levels. PBE server is freely available at http://bioinformatics.univ-reunion.fr/ PBE/.
Resumo:
Owing to high evolutionary divergence, it is not always possible to identify distantly related protein domains by sequence search techniques. Intermediate sequences possess sequence features of more than one protein and facilitate detection of remotely related proteins. We have demonstrated recently the employment of Cascade PSI-BLAST where we perform PSI-BLAST for many 'generations', initiating searches from new homologues as well. Such a rigorous propagation through generations of PSI-BLAST employs effectively the role of intermediates in detecting distant similarities between proteins. This approach has been tested on a large number of folds and its performance in detecting superfamily level relationships is similar to 35% better than simple PSI-BLAST searches. We present a web server for this search method that permits users to perform Cascade PSI-BLAST searches against the Pfam, SCOP and SwissProt databases. The URL for this server is http://crick.mbu.iisc.ernet.in/similar to CASCADE/CascadeBlast.html.