954 results for Web archiving methods


Relevance:

100.00%

Publisher:

Abstract:

Web archiving is a discipline that originated in the field of library and information science and is foreign to the archival world in our country. The first part of this paper offers a brief state of the art on the archiving of web pages and, from an archival perspective, attempts to answer questions such as: What does web page archiving consist of? What is it for? Since when has it been practised? Which organizations practise it? How is the web captured and stored? The second part proposes a reflection on the application of web archiving from the standpoint of archival science. Keywords: digital preservation, web archiving, archival science, Internet, national libraries, electronic documents, information and communication technologies

Relevance:

100.00%

Publisher:

Abstract:

This report describes web archiving in the National Library of Finland. The National Library of Finland has been archiving the Finnish web on a regular basis since 2006. Web archiving is an important part of the Library's endeavours to collect and preserve Finnish published cultural heritage. In 2010, the amount of harvested data was 200 million files, or 25 terabytes. The report takes the reader through the relevant legislation; internal plans and policies; funding and its allocation; the practices of web archiving; arrangements for the use of the archive; and issues arising from data security, sensitive materials, etc.

Relevance:

100.00%

Publisher:

Abstract:

The Web is currently a privileged space of expression and activity for many communities, in which communication practices and documentary practices enrich one another. In its visible or invisible dimension, the Web is also a worldwide documentary reservoir characterized not only by the abundance of information circulating on it, but also by its diversity, complexity and ephemeral nature. Many current Web archiving projects approach this issue from the standpoint of preserving online publications, without considering it from an archival perspective. Only a few Web archiving projects aim to preserve the organizational or governmental Web. The archival value of the Web, particularly the organizational Web, does not seem to be recognized, despite sustained efforts by some national archives to disseminate policies for archiving the organizational Web. This thesis aims to develop a better understanding of the nature of Web archives and to document current practices for archiving the organizational Web. More specifically, the research seeks to answer the following three questions: (1) What do policies for archiving the organizational Web generally recommend? (2) What are the main characteristics of Web archives? (3) What practices for archiving the organizational Web have been put in place in organizations in Quebec?
To answer these questions, this exploratory and descriptive research adopted a qualitative approach based on three modes of data collection: the analysis of a corpus of 55 policies and complementary documents relating to the archiving of the organizational Web; the observation of 11 public websites of organizations in Quebec, together with the observation of a sample of 737 documents produced by these Web systems; and, finally, interviews with 21 participants involved in managing and archiving these websites. The research results show that the websites studied are the product of an organization's conduct of its online activities and, at the same time, document the objectives and manifestations of its presence on the Web. New types of documents specific to the organizational Web were identified. Documents that have migrated to the Web have acquired a different context of use and new characteristics. Current management methods must take into account the properties of documents in a Web environment. While some of the study sites do not archive their public website, others invest in doing so. However, the choices made do not always correspond to the recommendations in the Web archiving policies analysed, and they guarantee neither the permanence of the Web archives nor their long-term usability. This finding led us to propose a model policy adapted to the characteristics of Web archives. This model describes the essential components of a policy for archiving websites, as well as a range of measures the organization could put in place depending on the results of an analysis of the risks associated with the use of its public website in the conduct of its business.

Relevance:

100.00%

Publisher:

Abstract:

Current model-driven Web Engineering approaches (such as OO-H, UWE or WebML) provide a set of methods and supporting tools for the systematic design and development of Web applications. Each method addresses different concerns using separate models (content, navigation, presentation, business logic, etc.), and provides model compilers that produce most of the logic and Web pages of the application from these models. However, these proposals also have some limitations, especially for exchanging models or representing further modeling concerns, such as architectural styles, technology independence, or distribution. A possible solution to these issues is to make model-driven Web Engineering proposals interoperate, so that they can complement each other and exchange models between the different tools. MDWEnet is a recent initiative started by a small group of researchers working on model-driven Web Engineering (MDWE). Its goal is to improve current practices and tools for the model-driven development of Web applications for better interoperability. The proposal is based on the strengths of current model-driven Web Engineering methods, and the existing experience and knowledge in the field. This paper presents the background, motivation, scope, and objectives of MDWEnet. Furthermore, it reports on the MDWEnet results and achievements so far, and its future plan of action.

Relevance:

90.00%

Publisher:

Abstract:

This presentation aims to explain the use and application context of two webometrics techniques, log analysis and Google Analytics, which currently coexist in the Virtual Library of the UOC. It first provides a comprehensive introduction to webometrics and then analyses the case of the UOC's Virtual Library, focusing on the assimilation of these techniques and the considerations underlying their use, and covering holistically the process of gathering, processing and exploiting data. Finally, it provides guidelines for interpreting the metric variables obtained.

Relevance:

90.00%

Publisher:

Abstract:

This class introduces the basics of web mining and information retrieval including, for example, an introduction to the Vector Space Model and Text Mining. Guest Lecturer: Dr. Michael Granitzer. Optional: Modeling the Internet and the Web: Probabilistic Methods and Algorithms, Pierre Baldi, Paolo Frasconi, Padhraic Smyth, Wiley, 2003 (Chapter 4, Text Analysis).

Relevance:

90.00%

Publisher:

Abstract:

Presentation at the IIPC General Assembly, Reykjavik, 12 April, 2016

Relevance:

80.00%

Publisher:

Abstract:

The main purpose of the Log2XML application is to transform log files in text format with field separators into a standardized XML format. So that the application can work with logs from different systems or applications, it has a template system (specifying field order and separator character) that makes it possible to define the minimum structure needed to extract information from any type of log based on field separators. Finally, the application allows the extracted information to be processed to generate reports and statistics. The project also takes an in-depth look at the Grails technology.
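
As a rough illustration of the template idea described above (field order plus separator character), the following Python sketch converts delimited log lines into XML. It is not the Log2XML implementation, which the abstract places in the Grails ecosystem; the function name `log_to_xml`, the element names `log` and `entry`, and the sample log lines are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def log_to_xml(lines, fields, separator):
    """Convert delimited log lines into an XML tree.

    `fields` (the field order) and `separator` together stand in for the
    template mentioned in the abstract; both are hypothetical here.
    """
    root = ET.Element("log")
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        values = line.split(separator)
        entry = ET.SubElement(root, "entry")
        # Pair each value with its field name; extra values are ignored.
        for name, value in zip(fields, values):
            ET.SubElement(entry, name).text = value
    return root


# Hypothetical sample data and template.
sample = ["2016-04-12|GET|/index.html|200", "2016-04-12|GET|/missing|404"]
template_fields = ["date", "method", "path", "status"]

root = log_to_xml(sample, template_fields, "|")
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Changing only `template_fields` and the separator would let the same function read a differently structured log, which is the point of keeping the template separate from the conversion logic.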

Relevance:

80.00%

Publisher:

Abstract:

Considering personalization policies in the context of any web application modelling method requires revising its modelling activities, which must be adapted to take into account information about the user (usually gathered in a user model) when defining aspects such as navigation or presentation. Additionally, they must provide a set of techniques to populate that user model. Finally, because personalization policies usually change at a rapid pace, the modelling process should support not only static personalization rules (known at design time) but also the definition or change of these rules once the application has been deployed. This article presents, in the context of the Object Oriented Hypermedia Method (OO-H), a personalization framework that fulfils these requirements and is organized around four main concepts: (1) a set of design activities that capture the personalization requirements known at design time, (2) a mechanism for the specification of personalization rules, defined by means of an XML template, that decouples the definition of the personalization model from the remaining models, (3) an execution architecture that supports changing these rules at execution time and (4) an extensible repository that includes a set of mechanisms for recording user activity in the system. The extensibility of this repository makes it easier to adapt to the particular characteristics of any given application.

Relevance:

80.00%

Publisher:

Abstract:

This poster presentation from the May 2015 Florida Library Association Conference, along with the Everglades Explorer discovery portal at http://ee.fiu.edu, demonstrates how traditional bibliographic and curatorial principles can be applied to: 1) selection, cross-walking and aggregation of metadata linking end-users to widespread digital resources from multiple silos; 2) harvesting of select PDFs, HTML and media for web archiving and access; 3) selection of CMS domains, sub-domains and folders for targeted searching using an API. Choosing content for this discovery portal is comparable to the past scholarly practice of creating and publishing subject bibliographies, except that metadata and data are housed in relational databases. This new and yet traditional capacity coincides with: growth of bibliographic utilities (MarcEdit); evolution of open-source discovery systems (eXtensible Catalog); development of target-capable web crawling and archiving systems (Archive-it); and specialized search APIs (Google). At the same time, historical and technical changes, specifically the increasing fluidity and re-purposing of syndicated metadata, make this possible. It equally stems from the expansion of freely accessible digitized legacy and born-digital resources. Innovation principles helped frame the process by which the thematic Everglades discovery portal was created at Florida International University. The path to providing more effective searching and co-location of digital scientific, educational and historical material related to the Everglades is contextualized through five concepts found within Dyer and Christensen’s “The Innovator’s DNA: Mastering the five skills of disruptive innovators” (2011). The project also aligns with Ranganathan’s Laws of Library Science, especially the 4th Law, to “save the time of the user.”

Relevance:

40.00%

Publisher:

Abstract:

Constrained and unconstrained nonlinear optimization problems appear in many engineering areas. In some of these cases derivative-based optimization methods cannot be used because the objective function is not known, is too complex, or is non-smooth. Direct search methods might then be the most suitable optimization methods. An Application Programming Interface (API) including some of these methods was implemented using Java technology. This API can be accessed either by applications running on the same computer where it is installed or remotely through a LAN or the Internet, using web services. From the engineering point of view, the information needed from the API is the solution to the provided problem. Researchers in optimization methods, on the other hand, need more than the solution: additional information about the iterative process is also useful, such as the number of iterations, the value of the solution at each iteration, the stopping criteria, etc. This paper presents the features added to the API to give users access to the iterative process data.
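
To make the idea of exposing iterative process data concrete, here is a minimal Python sketch of one direct search method, compass (coordinate) search, that returns the iteration history and the stopping reason alongside the solution. It is not the Java API described in the abstract; the function name `compass_search`, its parameters, the stopping-reason labels, and the test objective are all assumptions chosen for illustration.

```python
def compass_search(f, x0, step=1.0, tol=1e-6, max_iter=1000):
    """Derivative-free compass search that records its iterative process.

    Returns (x, fx, history, reason), where history holds one
    (iteration, point, value) tuple per iteration and reason names the
    stopping criterion that ended the run.
    """
    x = list(x0)
    fx = f(x)
    history = [(0, list(x), fx)]
    reason = "max_iter"
    for k in range(1, max_iter + 1):
        improved = False
        # Poll along each coordinate direction, accept the first improvement.
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                ft = f(trial)
                if ft < fx:
                    x, fx, improved = trial, ft, True
                    break
            if improved:
                break
        if not improved:
            step /= 2.0  # no better neighbour: shrink the step size
        history.append((k, list(x), fx))
        if step < tol:
            reason = "step_below_tol"
            break
    return x, fx, history, reason


def objective(p):
    # Hypothetical non-smooth test function with its minimum at (1, -2).
    return abs(p[0] - 1.0) + abs(p[1] + 2.0)


x, fx, history, reason = compass_search(objective, [0.0, 0.0])
print(x, fx, len(history) - 1, reason)
```

An engineer would read only `x` and `fx`, while someone studying the method could inspect `history` and `reason`, which mirrors the two audiences the abstract distinguishes.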

Relevance:

40.00%

Publisher:

Abstract:

Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.

Relevance:

40.00%

Publisher:

Abstract:

PADICAT is the web archive created in 2005 in Catalonia (Spain) by the Library of Catalonia (BC), the National Library of Catalonia, with the aim of collecting, processing and providing permanent access to the digital heritage of Catalonia. Its harvesting strategy is based on the hybrid model (massive harvesting, .SPA top level domain; selective compilation of the web site output of Catalan organizations; focused harvesting of public events). The system provides open access to the whole collection on the Internet. We consider it necessary to complement the current search and visualization software with a new open source software tool, CAT (Curator Archiving Tool), composed of three modules aimed at effectively managing the processes of human cataloguing; publishing directories of the digital resources and special collections; and offering statistical information of added value to end users. Within the framework of the International Internet Preservation Consortium meeting (Vienna 2010), the progress in the development of this new tool, and the philosophy that has motivated its design, are presented to the international community.

Relevance:

40.00%

Publisher:

Relevance:

40.00%

Publisher:

Abstract:

The M-Coffee server is a web server that makes it possible to compute multiple sequence alignments (MSAs) by running several MSA methods and combining their output into one single model. This allows users to run all their methods of choice simultaneously without having to arbitrarily choose one of them. The MSA is delivered along with a local estimation of its consistency with the individual MSAs it was derived from. The computation of the consensus multiple alignment is carried out using a special mode of the T-Coffee package [Notredame, Higgins and Heringa (T-Coffee: a novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 2000; 302: 205-217); Wallace, O'Sullivan, Higgins and Notredame (M-Coffee: combining multiple sequence alignment methods with T-Coffee. Nucleic Acids Res. 2006; 34: 1692-1699)]. Given a set of sequences (DNA or proteins) in FASTA format, M-Coffee delivers a multiple alignment in the most common formats. M-Coffee is a freeware open source package distributed under a GPL license and is available either as a standalone package or as a web service from www.tcoffee.org.