9 results for indexing

in Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)


Relevance: 20.00%

Abstract:

Owing to the widespread, multipurpose use of document images and the large number of document image repositories now available, robust information retrieval mechanisms and systems are increasingly in demand. This paper presents an approach to support the automatic generation of relationships among document images by exploiting Latent Semantic Indexing (LSI) and Optical Character Recognition (OCR). We developed the LinkDI (Linking of Document Images) service, which extracts and indexes the content of document images, computes its latent semantics, and defines relationships among images as hyperlinks. LinkDI was tested on document image repositories, and its performance was evaluated by comparing the quality of the relationships created among textual documents with that of the relationships created among their respective document images. Using the same document images, we ran further experiments to compare the performance of LinkDI with and without the LSI technique. Experimental results showed that LSI can mitigate the effects of typical OCR misrecognition, which reinforces the feasibility of using LinkDI to relate highly degraded OCR output.
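A minimal sketch of the general LSI idea mentioned in the abstract, assuming a plain term-count matrix and a truncated SVD computed with NumPy. The function name `lsi_similarity`, the whitespace tokenizer, and the choice of `k` are illustrative assumptions; the abstract does not describe how LinkDI itself is implemented.

```python
import numpy as np

def lsi_similarity(docs, k=2):
    """Cosine similarity between documents in a k-dimensional latent space.

    Illustrative only: whitespace tokenization and raw term counts are
    assumptions, not details taken from the LinkDI paper.
    """
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}

    # Term-document count matrix: one row per term, one column per document.
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1

    # Truncated SVD: keep only the k largest singular values.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    doc_vecs = (np.diag(s[:k]) @ Vt[:k]).T  # one row per document

    # Normalize rows so the dot product is the cosine similarity.
    norms = np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    unit = doc_vecs / norms
    return unit @ unit.T

if __name__ == "__main__":
    docs = [
        "latent semantic indexing of document images",
        "semantic indexing of scanned document images",
        "citrus viroid infects lime trees",
    ]
    print(np.round(lsi_similarity(docs, k=2), 3))
```

Document pairs whose similarity exceeds some threshold could then be linked; the threshold is a tuning choice that this sketch leaves open.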

Relevance: 20.00%

Abstract:

Searching a dataset for elements that are similar to a given query element is a core problem in applications that manage complex data, and it has been aided by metric access methods (MAMs). A growing number of applications require indices that can be built quickly and repeatedly while also providing faster responses to similarity queries. The increase in main memory capacity and its falling cost also motivate the use of memory-based MAMs. In this paper, we propose the Onion-tree, a new and robust dynamic memory-based MAM that slices the metric space into disjoint subspaces to provide quick indexing of complex data. It introduces three major characteristics: (i) a partitioning method that controls the number of disjoint subspaces generated at each node; (ii) a replacement technique that can change the leaf node pivots in insertion operations; and (iii) extended range and k-NN query algorithms to support the new partitioning method, including a new visit order of the subspaces in k-NN queries. Performance tests with both real-world and synthetic datasets showed that the Onion-tree is very compact. Comparisons of the Onion-tree with the MM-tree and a memory-based version of the Slim-tree showed that the Onion-tree was always faster to build the index. The experiments also showed that the Onion-tree significantly improved range and k-NN query processing performance and was the most efficient MAM, followed by the MM-tree, which in turn outperformed the Slim-tree in almost all the tests. (C) 2010 Elsevier B.V. All rights reserved.
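The sketch below shows the pruning principle that metric access methods in general rely on: the triangle inequality lets a range query discard an element without computing its distance to the query. It is not the Onion-tree. The single-pivot table and the names `build_pivot_table` and `range_query` are hypothetical, chosen only for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean distance; any function satisfying the metric axioms would do."""
    return math.dist(a, b)

def build_pivot_table(data, pivot, dist=euclidean):
    """Precompute the distance from the pivot to every element."""
    return [dist(pivot, x) for x in data]

def range_query(data, pivot, table, query, radius, dist=euclidean):
    """Return (elements within radius of query, number of distances computed).

    By the triangle inequality, |d(q, p) - d(p, x)| <= d(q, x), so any x with
    |d(q, p) - d(p, x)| > radius cannot be in the answer and is skipped.
    """
    d_qp = dist(query, pivot)
    results, computed = [], 0
    for x, d_px in zip(data, table):
        if abs(d_qp - d_px) > radius:
            continue  # pruned without computing d(q, x)
        computed += 1
        if dist(query, x) <= radius:
            results.append(x)
    return results, computed

if __name__ == "__main__":
    data = [(0, 0), (1, 1), (5, 5), (9, 9), (10, 0)]
    pivot = (0, 0)
    table = build_pivot_table(data, pivot)
    print(range_query(data, pivot, table, query=(1, 0), radius=1.5))
```

Tree-based MAMs apply the same bound recursively to whole subspaces rather than to single elements, which is where partitioning strategies such as the one described above come into play.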

Relevance: 20.00%

Abstract:

Traditional content-based image retrieval (CBIR) systems use low-level features such as the colors, shapes, and textures of images. However, users make queries based on semantics, which are not easily related to such low-level characteristics. Recent work on CBIR confirms that researchers have been trying to map low-level visual characteristics to high-level semantics. The relation between low-level characteristics and the textual information of images motivated this article, which proposes a model for the automatic classification and categorization of words associated with images. The proposal uses a self-organizing neural network architecture, which classifies textual information without prior learning. Experimental results compare the performance of the text-based approach with that of an image retrieval system based on low-level features. (c) 2008 Wiley Periodicals, Inc.

Relevance: 10.00%

Abstract:

Geographic Data Warehouses (GDW) are one of the main technologies used in decision-making processes and spatial analysis, and the literature proposes several conceptual and logical data models for GDW. However, little effort has been devoted to studying how spatial data redundancy affects SOLAP (Spatial On-Line Analytical Processing) query performance over GDW. In this paper, we investigate this issue. First, we compare redundant and non-redundant GDW schemas and conclude that redundancy is related to high performance losses. We also analyze the issue of indexing, aiming to improve SOLAP query performance on a redundant GDW. Comparisons of the SB-index approach, the star-join aided by R-tree, and the star-join aided by GiST indicate that the SB-index improves the elapsed time in query processing by 25% to 99% for SOLAP queries defined over the spatial predicates of intersection, enclosure, and containment and applied to roll-up and drill-down operations. We also investigate the impact of increasing data volume on performance. The increase did not impair the performance of the SB-index, which still greatly improved the elapsed time in query processing. Performance tests also show that the SB-index is far more compact than the star-join, requiring at most 0.20% of its volume. Moreover, we propose a specific enhancement of the SB-index to deal with spatial data redundancy. This enhancement improved performance by 80% to 91% for redundant GDW schemas.

Relevance: 10.00%

Abstract:

The aim of this study was to assess the nutritional status of the elderly in Fortaleza using different anthropometric indicators. This is a population-based, cross-sectional study with primary data collection. The anthropometric variables analyzed were body mass index (BMI), triceps skinfold thickness (TSF), and arm muscle circumference (AMC). Nutritional status was defined from the diagnoses obtained by analyzing the anthropometric variables: eutrophic (an elderly person for whom all three anthropometric variables (BMI, TSF, and AMC) simultaneously indicated eutrophy, according to the adopted standards) and non-eutrophic (all other elderly people). A total of 385 households were selected for the study sample, in which 483 elderly people (68% women) were interviewed. Regarding BMI, 47.3% of all the elderly were considered eutrophic. Women showed a higher proportion of excessive BMI values (21.9%) than men (13.5%). A statistically significant association was found between BMI adequacy and sex. TSF values showed that 54.4% of all the elderly were eutrophic. There was no statistically significant association between TSF adequacy and sex. Regarding AMC, men showed a higher prevalence of malnutrition (66.5%) than women (40.6%). A statistically significant association was found between AMC adequacy and sex. When nutritional status was assessed by means of the anthropometric variables, 83.9% of the men were considered non-eutrophic, as were most of the women (74.2%). A statistically significant association was observed between nutritional status and sex. The elderly of Fortaleza have a vulnerable nutritional status, given the prevalences of non-eutrophic

Relevance: 10.00%

Abstract:

Tibolone is used for hormone replacement in postmenopausal women, and isotibolone is considered the major degradation product of tibolone. Isotibolone can also be present in tibolone API raw materials as a result of inadequate synthesis. It must therefore be identified and quantified in the quality control of both API and drug products. In this work we present the indexing of an isotibolone X-ray diffraction pattern measured with synchrotron light (λ = 1.2407 Å) in transmission mode. The characterization of the isotibolone sample by IR spectroscopy, elemental analysis, and thermal analysis is also presented. The isotibolone crystallographic data are a = 6.8066 Å, b = 20.7350 Å, c = 6.4489 Å, β = 76.428°, V = 884.75 Å³, space group P2(1), ρ(o) = 1.187 g cm⁻³, and Z = 2. (C) 2009 International Centre for Diffraction Data. [DOI: 10.1154/1.3257612]
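As a consistency check on the reported cell, the volume of a monoclinic unit cell (the crystal system of space group P2(1)) follows from the three edge lengths and the angle β. The arithmetic below uses only the values quoted in the abstract and rounds intermediate results.

```latex
\begin{align*}
V &= a\,b\,c\,\sin\beta \\
  &= 6.8066 \times 20.7350 \times 6.4489 \times \sin(76.428^\circ)\ \text{\AA}^3 \\
  &\approx 910.16 \times 0.97208\ \text{\AA}^3 \\
  &\approx 884.75\ \text{\AA}^3
\end{align*}
```

The result agrees with the reported cell volume.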

Relevance: 10.00%

Abstract:

Introduction: Internet users increasingly turn to the World Wide Web to search for information relating to their health. This situation makes it necessary to create specialized tools capable of supporting users in their searches. Objective: To apply and compare strategies that were developed to investigate the use of the Portuguese version of Medical Subject Headings (MeSH) for constructing an automated classifier that separates Brazilian Portuguese-language web-based content within the field of healthcare from content outside it, focusing on the lay public. Methods: 3658 Brazilian web pages were used to train the classifier and 606 Brazilian web pages were used to validate it. The proposed strategies were constructed using content-based vector methods for text classification, with Naive Bayes used to classify the vector patterns whose characteristics were obtained through the proposed strategies. Results: A strategy named InDeCS was developed specifically to adapt MeSH to the problem that was put forward. This approach achieved better accuracy for this pattern classification task (0.94 sensitivity, specificity and area under the ROC curve). Conclusions: Because of the significant results achieved by InDeCS, this tool has been successfully applied to the Brazilian healthcare search portal known as Busca Saude. Furthermore, it could be shown that MeSH presents important results when used for the task of classifying web-based content focusing on the lay public. This study also showed that MeSH was able to map out mutable non-deterministic characteristics of the web. (c) 2010 Elsevier Inc. All rights reserved.

Relevance: 10.00%

Abstract:

Viroids have been used as "graft transmissible dwarfing agents" (GTDA) in several countries, mainly to reduce the growth of citrus trees and thus increase their density in orchards. In the State of Sao Paulo, Brazil, plants of the acid lime 'Tahiti' are usually grafted with a complex of GTDA, presumably viroids. The aim of the present work was the identification and molecular characterization of the viroids infecting trees of acid lime 'Tahiti' displaying "Quebra-galho" (bark-cracking). Viroids were identified and characterized by biological indexing in 'Etrog' citron, Northern-blot hybridization, RT-PCR, cloning, and complete sequencing of the RNA genomes. Citrus exocortis viroid (CEVd), Hop stunt viroid (HSVd), and Citrus dwarfing viroid (CDVd) were found in different combinations. Although we have not been able to infer a direct relationship between agronomical performance or symptom severity and the presence of a specific viroid or viroid combination, the differences in the severity of "Quebra-galho" symptoms among different trees are probably associated with the presence (or absence) of CEVd, with its interaction with other viroids perhaps determining the different phenotypes observed in the field.