982 results for Web documents
Abstract:
Introduction: Coordination through CVHL/BVCS gives Canadian health libraries access to information technology they could not offer individually, thereby enhancing the library services offered to Canadian health professionals. An example is the portal being developed. Portal best practices are of increasing interest (usability.gov; Wikipedia portals; JISC subject portal project; Stanford clinical portals), but conclusive research is not yet available. This paper identifies best practices for a portal bringing together knowledge for Canadian health professionals, supported through a network of libraries. Description: The portal for Canadian health professionals will include capabilities such as:
• Authentication
• Question referral
• Specialist “branch libraries”
• Integration of commercial resources, web resources and health systems data
• Cross-resource search engine
• Infrastructure to enable links from EHR and decision support systems
• Knowledge translation tools, such as highlighting of best evidence
Best practices will be determined by studying the capabilities of existing portals, including those of consortia/networks and individual institutions, and through a literature review. Outcomes: Best practices in portals will be reviewed. The collaboratively developed Virtual Library, currently the heart of cvhl.ca, is a unique database collecting high-quality, free web documents and sites relevant to Canadian health care. The evident strengths of the Virtual Library will be discussed in light of best practices. Discussion: Identification of best practices will support cost-benefit analysis of options and provide direction for CVHL/BVCS. Open discussion with stakeholders (libraries and professionals), informed by this review, will lead to adoption of the technical solutions that best support Canadian health libraries and their users.
Abstract:
The bibliographic profile of 125 undergraduate (licentiate) theses was analyzed, describing absolute quantities of several bibliometric variables, as well as within-document indexes and average lags of the references. The results show a consistent pattern across the 6 cohorts included in the sample (2001-2007), with variations that fall within the robust confidence intervals for the global central tendency. The median number of references per document was 52 (99% CI 47-55); the median percentage of journal articles cited was 55%, with a median age for journal references of 9 years. Other highlights of the bibliographic profile were the use of foreign-language references (median 61%) and the low reliance on open web documents (median 2%). A cluster analysis of the bibliometric indexes yielded a typology of 2 main profiles, almost evenly distributed: one with the makeup of a natural-science bibliographic profile and the second in the style of the humanities. In general, the number of references, proportion of papers, and age of the references are close to those of PhD dissertations and Master's theses, setting a rather high standard for undergraduate theses.
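The 99% confidence interval reported for the median is the kind of quantity a bootstrap gives directly. Below is a minimal sketch of that computation, assuming only a list of per-thesis reference counts; the counts here are simulated stand-ins, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-in for the 125 per-thesis reference counts.
ref_counts = rng.poisson(52, size=125)

def median_ci(data, level=0.99, n_boot=10_000):
    """Bootstrap confidence interval for the median."""
    medians = [np.median(rng.choice(data, size=len(data), replace=True))
               for _ in range(n_boot)]
    lo, hi = np.percentile(medians, [(1 - level) / 2 * 100,
                                     (1 + level) / 2 * 100])
    return np.median(data), lo, hi

m, lo, hi = median_ci(ref_counts)
print(f"median = {m:.0f}, 99% CI [{lo:.0f}, {hi:.0f}]")
```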
Abstract:
In many data mining applications, automated retrieval of text and image information is needed. This becomes essential with the growth of the Internet and digital libraries. Our approach is based on latent semantic indexing (LSI) and the corresponding term-by-document matrix suggested by Berry and his co-authors. Instead of using deterministic methods to find the required number of first "k" singular triplets, we propose a stochastic approach. First, we use a Monte Carlo method to sample and build a much smaller term-by-document matrix (e.g. a k x k matrix), from which we then find the first "k" triplets using standard deterministic methods. Second, we investigate how the problem can be reduced to finding the "k" largest eigenvalues using parallel Monte Carlo methods. We apply these methods to the initial matrix and also to the reduced one. The algorithms run on a cluster of workstations under MPI, and we present results of experiments arising in textual retrieval of Web documents, as well as a comparison of the proposed stochastic methods.
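The abstract does not spell out the sampling scheme, but one standard Monte Carlo route to this kind of reduction is norm-proportional column sampling followed by a deterministic SVD of the much smaller sample. The NumPy sketch below illustrates that general idea; the matrix, its dimensions and the sampling scheme are illustrative assumptions, not the authors' exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in term-by-document matrix A (terms x documents), e.g. tf-idf weights.
n_terms, n_docs, k = 2000, 1000, 50
A = rng.random((n_terms, n_docs))

# Monte Carlo column sampling: pick k documents with probability proportional
# to their squared column norm, rescaling so the sample approximates A.
col_norms = np.sum(A**2, axis=0)
probs = col_norms / col_norms.sum()
idx = rng.choice(n_docs, size=k, replace=True, p=probs)
S = A[:, idx] / np.sqrt(k * probs[idx])   # much smaller n_terms x k matrix

# A deterministic SVD of the small sample approximates the first k triplets.
U, sigma, Vt = np.linalg.svd(S, full_matrices=False)
print(sigma[:5])  # leading singular values of the approximation
```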
Abstract:
Direct marketing is becoming increasingly important in today's society. Some of it is done through e-mail, which companies see as an easy way to advertise themselves. I did this thesis work at WebDoc Systems. They have a product that creates web documents directly in the browser, a so-called CMS. The CMS has a module for sending mass e-mail, but this module does not function properly, and WebDoc Systems' customers are dissatisfied with that part of the product. The problems with the module were that it sometimes failed to send the e-mail and that no follow-up information on the mailing could be obtained. The goal of this work was to develop a Web service that could easily send e-mail to many recipients and just as easily display statistics on how a mailing has gone. The first step was a literature review, both to get a good picture of the available programming platforms and to be able to create a good application infrastructure. The next step was to implement this design and improve it over time using an iterative development methodology. The result was an application infrastructure that consists of three main parts and a plugin interface: a Web service application, a Web application and a Windows service application. The three parts cooperate with each other and share a database and plugins.
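The abstract mentions a plugin interface shared by the three applications without describing it. Purely as an illustration of the pattern, and in Python rather than the .NET stack the Windows service implies, a delivery-statistics plugin contract might look like the hypothetical sketch below.

```python
from abc import ABC, abstractmethod

class MailingPlugin(ABC):
    """Hypothetical contract the three applications load plugins against."""

    @abstractmethod
    def on_message_sent(self, recipient: str, message_id: str) -> None:
        """Called by the Web service once an e-mail is handed off for delivery."""

    @abstractmethod
    def on_bounce(self, recipient: str, reason: str) -> None:
        """Called when a delivery failure is reported back."""

class StatisticsPlugin(MailingPlugin):
    """Counts outcomes per mailing; a real plugin would write to the shared DB."""

    def __init__(self) -> None:
        self.sent = 0
        self.bounced = 0

    def on_message_sent(self, recipient: str, message_id: str) -> None:
        self.sent += 1

    def on_bounce(self, recipient: str, reason: str) -> None:
        self.bounced += 1
```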
Abstract:
Graph-structured databases are widely prevalent, and the problem of effective search and retrieval from such graphs has been receiving much attention recently. For example, the Web can be naturally viewed as a graph. Likewise, a relational database can be viewed as a graph where tuples are modeled as vertices connected via foreign-key relationships. Keyword search querying has emerged as one of the most effective paradigms for information discovery, especially over HTML documents in the World Wide Web. One of the key advantages of keyword search querying is its simplicity: users do not have to learn a complex query language and can issue queries without any prior knowledge about the structure of the underlying data. The purpose of this dissertation was to develop techniques for user-friendly, high-quality and efficient searching of graph-structured databases. Several ranked search methods on data graphs have been studied in recent years. Given a top-k keyword search query on a graph and some ranking criteria, a keyword proximity search finds the top-k answers, where each answer is a substructure of the graph containing all query keywords and illustrating the relationships between the keywords present in the graph. We applied keyword proximity search to the web and to the page graph of web documents to find top-k answers that satisfy the user's information need and increase user satisfaction. Another effective ranking mechanism applied to data graphs is authority-flow based ranking. Given a top-k keyword search query on a graph, an authority-flow based search finds the top-k answers, where each answer is a node in the graph ranked according to its relevance and importance to the query. We developed techniques that improve authority-flow based search on data graphs by creating a framework to explain and reformulate such searches, taking into consideration user preferences and feedback. We also applied the proposed graph search techniques to information discovery over biological databases. Our algorithms were experimentally evaluated for performance and quality. The quality of our method was compared to current approaches by means of user surveys.
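As a rough illustration of keyword proximity search (a deliberately simplified stand-in for the dissertation's algorithms, run on toy data), the sketch below scores each node by the sum of its shortest-path distances to the nearest occurrence of every query keyword and returns the k best-connected answer roots.

```python
import heapq

def proximity_search(graph, contains, keywords, k=3):
    """Toy keyword proximity search.
    graph: {node: [(neighbor, weight), ...]}; contains: {node: set of terms}.
    A node's score is the sum, over keywords, of its distance to the nearest
    node containing that keyword; lower totals rank higher."""
    def dijkstra(sources):
        dist = {s: 0 for s in sources}
        pq = [(0, s) for s in sources]
        while pq:
            d, u = heapq.heappop(pq)
            if d > dist.get(u, float("inf")):
                continue
            for v, w in graph[u]:
                if d + w < dist.get(v, float("inf")):
                    dist[v] = d + w
                    heapq.heappush(pq, (d + w, v))
        return dist

    # One multi-source shortest-path pass per keyword.
    per_kw = [dijkstra({n for n, ts in contains.items() if kw in ts})
              for kw in keywords]
    scores = {}
    for node in graph:
        ds = [d.get(node) for d in per_kw]
        if all(d is not None for d in ds):  # node reaches every keyword
            scores[node] = sum(ds)
    return sorted(scores.items(), key=lambda item: item[1])[:k]

g = {1: [(2, 1), (3, 2)], 2: [(1, 1), (3, 1)], 3: [(1, 2), (2, 1)]}
terms = {1: {"xml"}, 2: {"search"}, 3: {"xml", "search"}}
print(proximity_search(g, terms, ["xml", "search"]))  # [(3, 0), (1, 1), (2, 1)]
```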
Abstract:
In this thesis, we present the problems involved in exchanging business documents and propose a method to address them. We propose a methodology for adapting XML-based business standards to Semantic Web technologies by transforming documents defined in DTD or XML Schema into an ontological representation in OWL 2. We then propose an approach based on formal concept analysis for grouping ontology classes that share certain semantics, with the aim of improving the quality, readability and representation of the ontology. Finally, we propose ontology alignment to determine the semantic links between the heterogeneous business ontologies generated by the transformation process, to help companies communicate fruitfully.
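The abstract does not give the thesis's transformation rules, so the sketch below shows only the simplest flavor of an XSD-to-OWL mapping (named complex types become OWL classes, their child elements become properties, emitted as Turtle); it is an illustrative assumption, not the proposed methodology.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

def xsd_to_owl(xsd_path, base_iri="http://example.org/biz#"):
    """Naive XSD-to-OWL sketch: one owl:Class per named complexType,
    one property per child element. Real mappings handle far more."""
    lines = ["@prefix owl: <http://www.w3.org/2002/07/owl#> .",
             "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> ."]
    root = ET.parse(xsd_path).getroot()
    for ct in root.iter(f"{XSD}complexType"):
        name = ct.get("name")
        if not name:
            continue
        lines.append(f"<{base_iri}{name}> a owl:Class .")
        for el in ct.iter(f"{XSD}element"):
            prop = el.get("name")
            if prop:
                lines.append(f"<{base_iri}{prop}> a owl:DatatypeProperty ; "
                             f"rdfs:domain <{base_iri}{name}> .")
    return "\n".join(lines)

# Usage (with some business-document schema on disk):
# print(xsd_to_owl("invoice.xsd"))
```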
Abstract:
Interlinking text documents with Linked Open Data enables the Web of Data to be used as background knowledge within document-oriented applications such as search and faceted browsing. As a step towards interconnecting the Web of Documents with the Web of Data, we developed DBpedia Spotlight, a system for automatically annotating text documents with DBpedia URIs. DBpedia Spotlight allows users to configure the annotations to their specific needs through the DBpedia Ontology and quality measures such as prominence, topical pertinence, contextual ambiguity and disambiguation confidence. We compare our approach with the state of the art in disambiguation, and evaluate our results in light of three baselines and six publicly available annotation systems, demonstrating the competitiveness of our system. DBpedia Spotlight is shared as open source and deployed as a Web Service freely available for public use.
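Since the abstract notes that DBpedia Spotlight is deployed as a freely available Web Service, a minimal call can be sketched as below; the endpoint and response fields match the public demo service as commonly documented and may change, and the confidence parameter corresponds to the disambiguation-confidence measure mentioned above.

```python
import requests

# Public DBpedia Spotlight annotation endpoint (subject to change).
URL = "https://api.dbpedia-spotlight.org/en/annotate"

resp = requests.get(
    URL,
    params={"text": "Berlin is the capital of Germany.",
            "confidence": 0.5},  # disambiguation-confidence threshold
    headers={"Accept": "application/json"},
    timeout=10,
)
resp.raise_for_status()
for res in resp.json().get("Resources", []):
    print(res["@surfaceForm"], "->", res["@URI"])
```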
Abstract:
Introduction: Internet users increasingly search the World Wide Web for information relating to their health. This situation makes it necessary to create specialized tools capable of supporting users in their searches. Objective: To apply and compare strategies developed to investigate the use of the Portuguese version of Medical Subject Headings (MeSH) for constructing an automated classifier of Brazilian Portuguese-language web content as within or outside the field of healthcare, focusing on the lay public. Methods: 3658 Brazilian web pages were used to train the classifier and 606 Brazilian web pages were used to validate it. The proposed strategies were constructed using content-based vector methods for text classification, with Naive Bayes used for the task of classifying vector patterns whose features were obtained through the proposed strategies. Results: A strategy named InDeCS was developed specifically to adapt MeSH to the problem at hand. This approach achieved the best accuracy for this pattern classification task (sensitivity, specificity and area under the ROC curve all 0.94). Conclusions: Because of the significant results achieved by InDeCS, the tool has been successfully applied to the Brazilian healthcare search portal known as Busca Saúde. Furthermore, it could be shown that MeSH yields important results when used for classifying web content aimed at the lay public. The study also showed that MeSH was able to map mutable, non-deterministic characteristics of the web.
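InDeCS derives its feature vectors from MeSH before the Naive Bayes step; the sketch below shows only the general shape of such a pipeline in scikit-learn, with plain token counts and made-up toy pages standing in for the paper's MeSH-based features and corpus.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy stand-ins for labelled pages (1 = healthcare content, 0 = other).
pages = ["sintomas de diabetes e tratamento",
         "receita de bolo de chocolate",
         "vacina contra gripe efeitos colaterais",
         "resultados do campeonato de futebol"]
labels = [1, 0, 1, 0]

# The paper maps page text onto MeSH terms first; here we use raw token
# counts purely to illustrate the Naive Bayes classification step.
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(pages, labels)
print(clf.predict(["tratamento para gripe"]))  # expected: [1]
```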
Abstract:
XML Schema is one of the most used specifications for defining types of XML documents. It provides an extensive set of primitive data types, ways to extend and reuse definitions and an XML syntax that simplifies automatic manipulation. However, many features that make XML Schema Definitions (XSD) so interesting also make them rather cumbersome to read. Several tools to visualize and browse schema definitions have been proposed to cope with this issue. The novel approach proposed in this paper is to base XSD visualization and navigation on the XML document itself, using solely the web browser, without requiring a pre-processing step or an intermediate representation. We present the design and implementation of a web-based XML Schema browser called schem@Doc that operates over the XSD file itself. With this approach, XSD visualization is synchronized with the source file and always reflects its current state. This tool fits well in the schema development process and is easy to integrate in web repositories containing large numbers of XSD files.
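One way to realize "the browser does all the work" for an XML file is an xml-stylesheet processing instruction, which makes the browser apply an XSLT to the document it is displaying. Whether schem@Doc uses exactly this mechanism is not stated in the abstract, so the sketch below (file and stylesheet names are hypothetical) is just an illustration of the idea: once the one-line instruction is added, the live XSD itself is what the browser fetches and renders.

```python
# Stamp an xml-stylesheet processing instruction onto a schema so that a
# browser applies an XSLT (here a hypothetical schemaDoc.xsl) directly to it.
PI = '<?xml-stylesheet type="text/xsl" href="schemaDoc.xsl"?>\n'

with open("invoice.xsd", encoding="utf-8") as f:
    xsd = f.read()

# Assumes the file starts with an XML declaration (<?xml version="1.0"?>).
decl, sep, body = xsd.partition("?>")
with open("invoice.browsable.xsd", "w", encoding="utf-8") as f:
    f.write(decl + sep + "\n" + PI + body.lstrip())
```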
Abstract:
As we move closer to the practical concept of the Internet of Things and our reliance on public and private APIs increases, web services and their related topics have become utterly crucial to the informatics community. However, the question of which style of web services would best solve a particular problem can raise significant and multifarious debates. Two implementation styles stand out: the RPC-oriented style, represented by implementations of the SOAP protocol, and the hypermedia style, represented by implementations of the REST architectural style. Searching for examples of already established web services, we can find a handful of robust and reliable public and private SOAP APIs; nevertheless, RESTful services seem to be gaining popularity in the enterprise community. For the current generation of developers working on informatics solutions, REST seems to represent a fundamental and straightforward alternative and even a more deep-rooted approach than SOAP. But are they comparable? Does each approach have specific scenarios to which it is best suited? Such a study is briefly carried out in the present document's chapters, starting with the respective background study, followed by an analysis of the hypermedia approach and an instantiation of its architecture in a particular case study applied in a BPM context.
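To make the contrast concrete (against a purely hypothetical service at example.org; none of the URIs or operations below come from the document), an RPC/SOAP client POSTs an operation envelope to a single endpoint, while a hypermedia client addresses resources by URI and follows links found in their representations.

```python
import requests

BASE = "https://example.org"  # hypothetical service

# RPC/SOAP style: every call is an operation envelope POSTed to one endpoint.
soap_body = """<soap:Envelope
    xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body><GetOrder><id>42</id></GetOrder></soap:Body>
</soap:Envelope>"""
requests.post(f"{BASE}/orderService", data=soap_body,
              headers={"Content-Type": "text/xml", "SOAPAction": "GetOrder"})

# Hypermedia/REST style: the order is a resource; the representation's links
# tell the client which state transitions are currently available.
order = requests.get(f"{BASE}/orders/42",
                     headers={"Accept": "application/json"}).json()
cancel = next((l["href"] for l in order.get("links", [])
               if l["rel"] == "cancel"), None)
if cancel:
    requests.delete(cancel)
```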
Abstract:
Dissertation submitted to obtain the Master's degree in Computer Science (Engenharia Informática)
Abstract:
The objective of this practicum is to work with a tool, accessible via the internet, for remote editing and cataloguing of videos. The tool was developed by the company Vision Robotics. The report describes the experience gained through working with this tool, analyses its potential and the possibilities for improving it, and closes with some final reflections on the tool in general.
Abstract:
The purpose of this work is to report on whether certain web pages (specified in the TFC assignment) comply with web accessibility guidelines, and to determine whether the UOC made the right decision in using Dspace to manage its digital publications. The web pages in question belong to the UOC document repository, that is, a dissemination space where documentation in digital format can be stored and consulted.
Abstract:
The objective of this final-year project (Treball de Final de Carrera) is the automated analysis of five pages of the UOC document repository, together with an analysis of the various document repositories on the market, comparing them with Dspace, the repository the UOC currently uses. To carry out the analysis, a search was first conducted for the automated review tools available on the market. After evaluating them, the five best were selected and used to check compliance with the accessibility guidelines of WCAG 1.0 and 2.0. Subsequently, a search was conducted for the document repositories currently in existence, the three best were selected, and, using the TAW tool for automated review against WCAG 1.0 and 2.0, a comparison was made of how these repositories implement web accessibility. In conclusion, the UOC is using the best document repository currently available, Dspace, which is also the most widely used worldwide. The UOC's implementation of Dspace is sound from an accessibility standpoint, although there is room for improvement. The automated validation tools themselves could also be improved, since the results they produce when analysing the same pages differ considerably.