10 results for Information Retrieval, Document Databases, Digital Libraries
in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo
Abstract:
Discusses the technological changes that affect learning organizations, as well as the human, technical, legal and sustainability aspects of creating, maintaining and using learning object repositories. It presents the concepts of information objects and learning objects and the functional requirements for storing them in Learning Management Systems. The role of metadata in learning object creation and retrieval is reviewed, followed by considerations about learning object repository models, community participation and collaborative strategies, and potential derived metrics and indicators. This desk research suggests that technical competencies, while critical to any learning object repository implementation, are not enough: an engaged community of interest must be established as a key support for a learning object repository project. To that end, researchers are applying Activity Theory (Vygotsky, Luria and Leontiev) to seek joint perceptions and actions involving learning object repository users, curators and managers, who are seen as critical assets for a successful proposal.
Abstract:
XML similarity evaluation has become a central issue in the database and information communities, its applications ranging over document clustering, version control, data integration and ranked retrieval. Various algorithms for comparing hierarchically structured data, XML documents in particular, have been proposed in the literature. Most of them make use of techniques for finding the edit distance between tree structures, XML documents being commonly modeled as Ordered Labeled Trees. Yet, a thorough investigation of current approaches led us to identify several similarity aspects, i.e., sub-tree related structural and semantic similarities, which are not sufficiently addressed while comparing XML documents. In this paper, we provide an integrated and fine-grained comparison framework to deal with both structural and semantic similarities in XML documents (detecting the occurrences and repetitions of structurally and semantically similar sub-trees), and to allow the end-user to adjust the comparison process according to her requirements. Our framework consists of four main modules for (i) discovering the structural commonalities between sub-trees, (ii) identifying sub-tree semantic resemblances, (iii) computing tree-based edit operations costs, and (iv) computing tree edit distance. Experimental results demonstrate higher comparison accuracy with respect to alternative methods, while timing experiments reflect the impact of semantic similarity on overall system performance.
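To make the idea of an edit distance between Ordered Labeled Trees concrete, the following is a minimal sketch of a restricted, top-down variant (in the style of Selkow's algorithm), where only whole sub-trees can be inserted or deleted and sibling order is preserved. It is not the four-module framework the abstract describes, and it ignores semantic similarity entirely. The `Node` class, the unit costs and the sample documents are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A node of an ordered labeled tree (hypothetical helper type)."""
    label: str
    children: List["Node"] = field(default_factory=list)


def size(tree: Node) -> int:
    """Number of nodes in the sub-tree; used as its insert/delete cost."""
    return 1 + sum(size(child) for child in tree.children)


def top_down_distance(a: Node, b: Node) -> int:
    """Top-down tree edit distance with unit costs (an assumption).

    Roots are matched (cost 1 if labels differ), then the two child
    sequences are aligned with a string-edit-style dynamic program in
    which substituting one child for another costs their recursive
    distance, and inserting or deleting a child costs its sub-tree size.
    """
    relabel = 0 if a.label == b.label else 1
    m, n = len(a.children), len(b.children)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + size(a.children[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + size(b.children[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + size(a.children[i - 1]),      # delete sub-tree
                d[i][j - 1] + size(b.children[j - 1]),      # insert sub-tree
                d[i - 1][j - 1]
                + top_down_distance(a.children[i - 1], b.children[j - 1]),
            )
    return relabel + d[m][n]


# Hypothetical XML-like documents modeled as trees.
doc1 = Node("book", [Node("title"), Node("author")])
doc2 = Node("book", [Node("title"), Node("author"), Node("year")])
doc3 = Node("article", [Node("title")])

print(top_down_distance(doc1, doc2))  # one leaf inserted
print(top_down_distance(doc1, doc3))  # root relabeled, one leaf deleted
```

A general tree edit distance (for example Zhang and Shasha's algorithm) also allows inserting and deleting inner nodes, which this restricted variant does not.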
Abstract:
The project "Por uma Biblioteca Brasiliana Digital" is part of the BRASILIANA USP project, an initiative of the Rectorate of the Universidade de São Paulo (USP) that aims to open up for research the largest Brasiliana collection held by a university by making it available on the internet. The work presented here results from the implementation of a digital library model that complies with standards for interoperability and information sharing. Specifically, we present the procedures and processes for describing the content of the various document types (books, periodicals, maps, engravings, etc.) and digital formats (PDF, MP3, JPEG, among others), as well as the consolidation of a scheme of management and administrative metadata covering the information and data produced by the Brasiliana Digital project.
Abstract:
This article deals with bibliographic research in the context of planning, carrying out and writing up a study, recommending the Internet as a research source, with the aim of guiding researchers in the use of the various search tools and digital libraries that help retrieve relevant information. The following stages are covered: choosing the subject and delimiting the topic, searching for the terminology of the field, exhaustively retrieving the works, locating the works, obtaining the documents, reading, selecting and taking notes on the documents and, finally, writing the paper.
Abstract:
Open to the public since June 2009, the Biblioteca Brasiliana Digital of the Universidade de São Paulo aims to open up for research the largest Brasiliana collection held by a university. It intends to make part of the University's holdings available online, serving as a useful and functional instrument for research on and study of Brazilian themes and culture, and to offer a technological management model that can be extended to other collections, holdings and institutions. This paper presents the results of implementing a metadata schema based on the Dublin Core format for describing rare and special works on the web. Specifically, it presents the procedures and processes for describing the content of the various document types (books, periodicals, engravings, etc.) and digital formats (PDF, JPEG, among others). Keywords: Digital libraries; Metadata; Dublin Core.
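As an illustration of what a Dublin Core based description can look like, here is a minimal sketch that serializes a few Dublin Core elements to XML with Python's standard library. The namespace URI is the standard Dublin Core element set namespace; the `record` wrapper element, the `build_dc_record` helper and all field values are hypothetical and are not taken from the Brasiliana Digital schema.

```python
import xml.etree.ElementTree as ET

# Standard namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_dc_record(fields):
    """Build an XML element holding repeated Dublin Core elements.

    `fields` maps a Dublin Core element name (e.g. "title") to a list of
    values. The <record> wrapper is an assumption made for this sketch.
    """
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")
    for name, values in fields.items():
        for value in values:
            element = ET.SubElement(record, f"{{{DC_NS}}}{name}")
            element.text = value
    return record


# Hypothetical description of a digitized rare book.
example = build_dc_record({
    "title": ["Example rare book title"],
    "type": ["Text"],
    "format": ["application/pdf"],
    "language": ["pt"],
})
xml_text = ET.tostring(example, encoding="unicode")
print(xml_text)
```

Because every Dublin Core element is optional and repeatable, the helper accepts a list of values per element rather than a single string.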
Abstract:
This paper describes the integration of information between the Digital Library of Historical Cartography and the Bibliographical Database (DEDALUS), both of the University of São Paulo (USP), to guarantee open, public access over the Internet to the maps in the collection and make them available to users everywhere. The digital library was designed by the Historical Cartography Studies Laboratory team (LECH/USP). It provides high-resolution map images on the Web, together with information about the maps such as technical-scientific data (projection, scale, coordinates), printing techniques and the material supports that made their circulation and cultural consumption possible. The Digital Library of Historical Cartography is accessible not only to historical cartography researchers, but also to students and the general public. Beyond being a source of information about maps, it seeks to be interactive, exchanging information and seeking dialogue with different branches of knowledge.
Abstract:
The classification of texts has become a major endeavor with so much electronic material available, for it is an essential task in several applications, including search engines and information retrieval. There are different ways to define similarity for grouping similar texts into clusters, as the concept of similarity may depend on the purpose of the task. For instance, in topic extraction similar texts are those within the same semantic field, whereas in author recognition stylistic features should be considered. In this study, we introduce ways to classify texts employing concepts of complex networks, which may be able to capture syntactic, semantic and even pragmatic features. The interplay between various metrics of the complex networks is analyzed with three applications, namely identification of machine translation (MT) systems, evaluation of the quality of machine-translated texts, and authorship recognition. We show that topological features of the networks representing texts can enhance the ability to identify MT systems in particular cases. For evaluating the quality of MT texts, on the other hand, high correlation was obtained with methods capable of capturing the semantics. This was expected because the gold standards used are themselves based on word co-occurrence. Notwithstanding, the Katz similarity, which involves both semantics and structure in the comparison of texts, achieved the highest correlation with the NIST measurement, indicating that in some cases the combination of both approaches can improve the ability to quantify quality in MT. In authorship recognition, again the topological features were relevant in some contexts, though for the books and authors analyzed good results were obtained with semantic features as well. Because hybrid approaches encompassing semantic and topological features have not been extensively used, we believe that the methodology proposed here may be useful to enhance text classification considerably, as it combines well-established strategies. (c) 2012 Elsevier B.V. All rights reserved.
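A small sketch may help picture what "topological features of the networks representing texts" can mean. The abstract does not say how the networks are built, so the construction below (an undirected link between every pair of adjacent words, with no stop-word removal or lemmatization) is an assumption, as are the function names and the toy input. It computes two common topological metrics, degree and local clustering coefficient.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency_network(text):
    """Link each word to the word that follows it (assumed construction)."""
    words = text.lower().split()
    graph = defaultdict(set)
    for left, right in zip(words, words[1:]):
        if left != right:  # skip self-loops from repeated words
            graph[left].add(right)
            graph[right].add(left)
    return dict(graph)


def degree(graph, node):
    """Number of distinct neighbors of a word."""
    return len(graph.get(node, set()))


def clustering_coefficient(graph, node):
    """Fraction of a word's neighbor pairs that are themselves linked."""
    neighbors = graph.get(node, set())
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
    return 2.0 * links / (k * (k - 1))


# Toy input: a sequence of placeholder "words".
network = build_adjacency_network("a b c a d")
for word in sorted(network):
    print(word, degree(network, word), clustering_coefficient(network, word))
```

Per-word values like these could then be aggregated (for example averaged) into a feature vector for a whole text, which is one plausible way to feed topological information to a classifier.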
Abstract:
The automatic disambiguation of word senses (i.e., the identification of which of the meanings is used in a given context for a word that has multiple meanings) is essential for such applications as machine translation and information retrieval, and represents a key step for developing the so-called Semantic Web. Humans disambiguate words in a straightforward fashion, but this does not apply to computers. In this paper we address the problem of Word Sense Disambiguation (WSD) by treating texts as complex networks, and show that word senses can be distinguished upon characterizing the local structure around ambiguous words. Our goal was not to obtain the best possible disambiguation system, but we nevertheless found that in half of the cases our approach outperforms traditional shallow methods. We show that the hierarchical connectivity and clustering of words are usually the most relevant features for WSD. The results reported here shed light on the relationship between semantic and structural parameters of complex networks. They also indicate that when combined with traditional techniques the complex network approach may be useful to enhance the discrimination of senses in large texts. Copyright (C) EPLA, 2012
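One loose way to picture "the local structure around ambiguous words" is to count how many words lie one hop, two hops, and so on from the ambiguous word in the text network. The sketch below does this with a breadth-first search. It is only an assumed simplification and not necessarily the hierarchical connectivity measure used in the paper; the `ring_sizes` name and the toy network around the word "bank" are hypothetical.

```python
from collections import deque


def ring_sizes(graph, source, max_hops=2):
    """Return [n_1, ..., n_max_hops], the number of nodes at each hop
    distance from `source`, found by breadth-first search."""
    distance = {source: 0}
    queue = deque([source])
    sizes = [0] * max_hops
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the outermost ring
        for neighbor in graph.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                sizes[distance[neighbor] - 1] += 1
                queue.append(neighbor)
    return sizes


# Hypothetical word network around the ambiguous word "bank".
toy_network = {
    "bank": {"river", "money"},
    "river": {"bank", "water"},
    "money": {"bank", "loan", "account"},
    "water": {"river"},
    "loan": {"money"},
    "account": {"money"},
}
print(ring_sizes(toy_network, "bank"))
```

Ring sizes computed separately for each occurrence context of a word could serve as features that differ between senses, under the assumption that different senses attract different neighborhoods.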
Abstract:
This paper reports the experience and the procedures adopted in analyzing and identifying the periodical titles received by the Library of the Instituto de Medicina Tropical de São Paulo of the Universidade de São Paulo since its creation. Data were collected from the bibliographic records in the Cataloging Module of the Bibliographic Database DEDALUS (Aleph 500, version 18.1) of the Universidade de São Paulo, following pre-established criteria. It concludes that, although the problems detected are of minor relevance relative to the collection analyzed, a comparative study between user needs and the collection available in the Library should be maintained, so that the periodicals meet users' information needs.
Abstract:
The article analyzes the operation of folksonomies and the possibility of applying this tool in information organization systems in the field of Information Science. To this end, the coherence of tags and the tagging features available on two websites, Last.fm and CiteULike, were analyzed. The analysis found inconsistencies and discrepancies in the tags used on both websites. However, the Last.fm system proved more functional than CiteULike's, achieving better performance. Finally, the article suggests combining folksonomies with ontologies, which would allow the creation of automated systems for organizing informational content fed by the users themselves.
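One kind of tag inconsistency that a coherence analysis might look for is the same concept written in several surface forms. The sketch below groups tags that become identical after a crude normalization. The abstract does not describe how the analysis was carried out, so this is only an assumed illustration; the function names and the sample tags are hypothetical and are not data from Last.fm or CiteULike.

```python
import re
from collections import defaultdict


def normalize_tag(tag):
    """Lowercase a tag and drop everything except ASCII letters and digits.

    This is deliberately crude: accented characters are dropped too, so a
    real analysis would need a gentler normalization.
    """
    return re.sub(r"[^a-z0-9]", "", tag.lower())


def find_tag_variants(tags):
    """Group tags that share a normalized form and have several spellings."""
    groups = defaultdict(set)
    for tag in tags:
        groups[normalize_tag(tag)].add(tag)
    return {
        key: sorted(variants)
        for key, variants in groups.items()
        if len(variants) > 1
    }


# Hypothetical user-supplied tags.
sample_tags = ["Hip-Hop", "hip hop", "hiphop", "rock", "Rock", "jazz"]
variants = find_tag_variants(sample_tags)
print(variants)
```

Groups found this way are candidates for mapping onto a single controlled term, which is one place where the combination of folksonomies with ontologies suggested in the abstract could apply.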