977 results for document administratif
Abstract:
This paper presents some developments in query expansion and document representation of our spoken document retrieval system and shows how various retrieval techniques affect performance for different sets of transcriptions derived from a common speech source. Modifications of the document representation are used, which combine several techniques for query expansion, knowledge-based on the one hand and statistics-based on the other. Taken together, these techniques can improve Average Precision by over 19% relative to a system similar to that which we presented at TREC-7. These new experiments have also confirmed that the degradation of Average Precision due to a word error rate (WER) of 25% is quite small (3.7% relative) and can be reduced to almost zero (0.2% relative). The overall improvement of the retrieval system can also be observed for seven different sets of transcriptions from different recognition engines with a WER ranging from 24.8% to 61.5%. We hope to repeat these experiments when larger document collections become available, in order to evaluate the scalability of these techniques.
Abstract:
This paper focuses on document data, one of the most significant sources for technology intelligence. To help organisations use their knowledge in documents effectively, this research aims to identify what organisations really want from documents and what might be possible to obtain from them. The research involves a literature review, a series of in-depth/on-site interviews and a descriptive analysis of document mining applications. The output of the research includes: a document mining framework; an analysis of the current condition of document mining in technology-based organisations together with their future requirements; and guidelines for introducing document mining into an organisation along with a discussion on the practical issues that are faced by users. Copyright © 2011 Inderscience Enterprises Ltd.
Abstract:
This research proposes a method for extracting technology intelligence (TI) systematically from a large set of document data. To do this, the internal and external sources in the form of documents, which might be valuable for TI, are first identified. Then the existing techniques and software systems applicable to document analysis are examined. Finally, based on the reviews, a document-mining framework designed for TI is suggested and guidelines for software selection are proposed. The research output is expected to support intelligence operatives in finding suitable techniques and software systems for getting value from document-mining and thus facilitate effective knowledge management. Copyright © 2012 Inderscience Enterprises Ltd.
Abstract:
Documents are commonly represented by document icons in graphical user interfaces. A document icon helps users retrieve documents, but it is difficult to distinguish one document from a collection of documents that the user has accessed. Our paper presents a document icon to which users can add subjective values and marks. We then describe a system, ex-explorer, with which users can browse and search the extended document icons. We found that it is easy to re-find documents to which users have added annotations or marks themselves.
Abstract:
ACM SIGIR; ACM SIGWEB
Abstract:
This survey was undertaken by the film crew accompanying Cary Grant when making the film "Charade" in 1963.
Abstract:
This article contributes to the debate on what form of preparation and support can enhance the intercultural student experience during the Year Abroad. It presents a credit-bearing and multi-modal module at a UK university designed to both prepare students prior to departure through a series of workshops and activities on an e-portfolio and help them engage in meta-reflection on intercultural issues during their stay. The presentation of the curricular components of the course and instances extracted from student blogs are contextualised within theoretical considerations on intercultural education and a holistic approach to student development. The longitudinal evolution of the module is presented in the context of an iterative approach leading to a cycle of revisions and amendments. With its pragmatic stance this article aims to address one of the concerns recently expressed about intercultural education, namely that although intercultural theories are suitably incorporated in the latest thinking on communicative competence, there is a lack of evidence-based practice.
Abstract:
We present a type system, StaXML, which employs the stacked type syntax to represent essential aspects of the potential roles of XML fragments in the structure of complete XML documents. The simplest application of this system is to enforce well-formedness upon the construction of XML documents without requiring the use of templates or balanced "gap plugging" operators; this allows it to be applied to programs written according to common imperative web scripting idioms, particularly the echoing of unbalanced XML fragments to an output buffer. The system can be extended to verify particular XML applications such as XHTML and to identify individual XML tags constructed from their lexical components. We also present StaXML for PHP, a prototype precompiler for the PHP4 scripting language which infers StaXML types for expressions without assistance from the programmer.
Abstract:
With the increasing demand for document transfer services such as the World Wide Web comes a need for better resource management to reduce the latency of documents in these systems. To address this need, we analyze the potential for document caching at the application level in document transfer services. We have collected traces of actual executions of Mosaic, reflecting over half a million user requests for WWW documents. Using those traces, we study the tradeoffs between caching at three levels in the system, and the potential for use of application-level information in the caching system. Our traces show that while a high hit rate in terms of URLs is achievable, a much lower hit rate is possible in terms of bytes, because most profitably-cached documents are small. We consider the performance of caching when applied at the level of individual user sessions, at the level of individual hosts, and at the level of a collection of hosts on a single LAN. We show that the performance gain achievable by caching at the session level (which is straightforward to implement) is nearly all of that achievable at the LAN level (where caching is more difficult to implement). However, when resource requirements are considered, LAN level caching becomes much more desirable, since it can achieve a given level of caching performance using a much smaller amount of cache space. Finally, we consider the use of organizational boundary information as an example of the potential for use of application-level information in caching. Our results suggest that distinguishing between documents produced locally and those produced remotely can provide useful leverage in designing caching policies, because of differences in the potential for sharing these two document types among multiple users.
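The gap between the URL hit rate and the byte hit rate described in this abstract can be illustrated with a small trace replay. This is only a minimal sketch: the LRU eviction policy, the function name `simulate_lru`, and the toy trace are assumptions made for illustration and are not taken from the paper.

```python
from collections import OrderedDict


def simulate_lru(trace, capacity_bytes):
    """Replay a list of (url, size_in_bytes) requests through a
    byte-bounded LRU cache (LRU is an assumption, not the paper's policy).

    Returns a pair (url_hit_rate, byte_hit_rate).
    """
    if not trace:
        return 0.0, 0.0
    cache = OrderedDict()  # url -> size, least recently used first
    used = 0
    hits = hit_bytes = total_bytes = 0
    for url, size in trace:
        total_bytes += size
        if url in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(url)  # mark as most recently used
            continue
        if size > capacity_bytes:
            continue  # document is too large to be cached at all
        while used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)  # evict LRU entry
            used -= evicted_size
        cache[url] = size
        used += size
    return hits / len(trace), hit_bytes / total_bytes


# Hypothetical trace: one small, popular page and one large document.
example_trace = [
    ("/a.html", 1000),
    ("/big.ps", 50000),
    ("/a.html", 1000),
    ("/a.html", 1000),
    ("/big.ps", 50000),
]
url_rate, byte_rate = simulate_lru(example_trace, capacity_bytes=10000)
```

In this toy trace the small page accounts for every hit, so the hit rate counted in requests is far higher than the hit rate counted in bytes, which matches the qualitative observation in the abstract.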
Abstract:
We analyzed the logs of our departmental HTTP server http://cs-www.bu.edu as well as the logs of the more popular Rolling Stones HTTP server http://www.stones.com. These servers have very different purposes; the former caters primarily to local clients, whereas the latter caters exclusively to remote clients all over the world. In both cases, our analysis showed that remote HTTP accesses were confined to a very small subset of documents. Using a validated analytical model of server popularity and file access profiles, we show that by disseminating the most popular documents on servers (proxies) closer to the clients, network traffic could be reduced considerably, while server loads are balanced. We argue that this process could be generalized so as to provide for an automated demand-based duplication of documents. We believe that such server-based information dissemination protocols will be more effective at reducing both network bandwidth and document retrieval times than client-based caching protocols [2].
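The observation that remote accesses concentrate on a very small subset of documents can be checked on any access log by simple counting. The sketch below is not the validated analytical model the abstract refers to; the function name `coverage_of_top_documents` and the sample log are hypothetical, and the code only measures what share of requests a proxy would absorb if it held the k most requested documents.

```python
from collections import Counter


def coverage_of_top_documents(requests, k):
    """Given a list of requested document paths, return the k most
    requested documents and the fraction of all requests they cover."""
    if not requests:
        return [], 0.0
    counts = Counter(requests)
    top = counts.most_common(k)  # [(document, request_count), ...]
    covered = sum(count for _, count in top)
    return [doc for doc, _ in top], covered / len(requests)


# Hypothetical log: ten requests spread over three documents.
sample_log = ["/index.html"] * 6 + ["/tour.html"] * 3 + ["/misc.html"]
top_docs, share = coverage_of_top_documents(sample_log, k=2)
```

When the share for a small k is already close to 1, replicating just those documents on a proxy near the clients would divert most requests away from the origin server.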
Abstract:
This deliverable outlines the design blueprints for the RAGE application scenario games and forms the rest of the scope for WP4’s tasks. The game designs have been developed in collaboration with application scenario partners in WP5, and informed by WP1, 2 & 3. Additionally, peer feedback has been provided by game developers across WP4. The designs outline the integration of the RAGE assets developed in WP2 and WP3. Each section provides in detail the game play descriptions, game dynamics and mechanics, pedagogies and technical implementation of the RAGE assets into the game applications, as described in detail in WP5’s application documents. The full description of the application objectives and associated learning outcomes has been provided in the project’s MS2 Application Scenario Outlines document.