202 results for "Indexação automática" (automatic indexing)
in the UNESP Institutional Repository - Universidade Estadual Paulista "Julio de Mesquita Filho"
Abstract:
Graduate Program in Information Science - FFC
Abstract:
The indexing process aims to synthetically represent the informational content of documents by a set of terms whose meanings indicate the themes or subjects they treat. With the emergence of the Web, research on automatic indexing received a major boost from the need to retrieve documents from this huge collection. Traditional indexing languages, used to translate the thematic content of documents into standardized terms, have always proved efficient in manual indexing. Ontologies open new perspectives for research on automatic indexing, offering a computer-processable language restricted to a particular domain. Using ontologies in the automatic indexing process provides a domain-specific language and a logical and conceptual framework for making inferences, whose relations allow the expansion of the terms extracted directly from the text of the document. This paper presents techniques for the construction and use of ontologies in the automatic indexing process. We conclude that the use of ontologies not only adds new features to the indexing process, but also allows us to envision new and advanced features for information retrieval systems.
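As a rough illustration of the term-expansion idea described above, here is a minimal Python sketch, assuming a toy ontology encoded as a dictionary of broader/related relations; the vocabulary, relation names, and helper functions are illustrative, not taken from the paper.

```python
# Sketch of ontology-assisted automatic indexing: terms extracted from the
# document text are expanded with related concepts from a domain ontology.
# The toy ontology and term lists below are illustrative assumptions.

# Toy domain ontology: each term maps to broader/related terms.
ONTOLOGY = {
    "indexing": {"broader": ["information organization"], "related": ["cataloging"]},
    "ontology": {"broader": ["knowledge representation"], "related": ["thesaurus"]},
}

def extract_terms(text, vocabulary):
    """Naive extraction: keep only words that belong to the domain vocabulary."""
    return {word for word in text.lower().split() if word in vocabulary}

def expand_terms(terms, ontology):
    """Expand extracted terms through the ontology's broader/related relations."""
    expanded = set(terms)
    for term in terms:
        relations = ontology.get(term, {})
        expanded.update(relations.get("broader", []))
        expanded.update(relations.get("related", []))
    return expanded

text = "Ontology driven indexing of documents"
terms = extract_terms(text, ONTOLOGY.keys())
print(expand_terms(terms, ONTOLOGY))
# {'indexing', 'ontology', 'information organization', 'cataloging',
#  'knowledge representation', 'thesaurus'}
```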
Abstract:
The terminological performance of the descriptors representing the Information Science domain in the SIBi/USP Controlled Vocabulary was evaluated in manual, automatic, and semi-automatic indexing processes. It can be concluded that, in order to perform better (i.e., to adequately represent the content of the corpus), the current Information Science descriptors of the SIBi/USP Controlled Vocabulary must be extended and put into context by means of terminological definitions so that users' information needs are fulfilled.
Abstract:
This paper proposes a methodology for automatic extraction of building roof contours from a Digital Elevation Model (DEM), generated through the regularization of an available laser point cloud. The methodology is based on two steps. First, in order to detect high objects (buildings, trees, etc.), the DEM is segmented through a recursive splitting technique followed by a Bayesian merging technique. The recursive splitting technique uses the quadtree structure to subdivide the DEM into homogeneous regions. In order to minimize the fragmentation commonly observed in the results of recursive splitting segmentation, a region merging technique based on the Bayesian framework is applied to the previously segmented data. The high-object polygons are then extracted by vectorization and polygonization techniques. Second, the building roof contours are identified among all the high objects extracted previously. Taking into account some roof properties and feature measurements (e.g., area, rectangularity, and angles between the principal axes of the roofs), an energy function was developed based on the Markov Random Field (MRF) model. The solution of this function, a polygon set corresponding to building roof contours, is found by a minimization technique such as the Simulated Annealing (SA) algorithm. Experiments carried out with laser scanning DEMs showed that the methodology works properly, delivering roof contours with approximately 90% shape accuracy and no false positives.
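A minimal sketch of the recursive splitting step, assuming a square DEM held in a NumPy array and a simple variance test as the homogeneity criterion; the threshold, the criterion, and the toy DEM are illustrative, and the Bayesian merging and MRF steps are not shown.

```python
import numpy as np

def quadtree_split(dem, x, y, size, threshold, regions):
    """Recursively split a DEM block into four quadrants until each
    block is homogeneous (here: height variance below a threshold)."""
    block = dem[y:y + size, x:x + size]
    if size <= 2 or block.var() < threshold:
        regions.append((x, y, size))          # homogeneous leaf region
        return
    half = size // 2
    for dx, dy in ((0, 0), (half, 0), (0, half), (half, half)):
        quadtree_split(dem, x + dx, y + dy, half, threshold, regions)

# Toy DEM: flat terrain with one raised "building" block.
dem = np.zeros((64, 64))
dem[16:32, 16:32] = 10.0
regions = []
quadtree_split(dem, 0, 0, 64, threshold=0.5, regions=regions)
print(len(regions), "homogeneous regions")
```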
Resumo:
This paper proposes a monoscopic method for automatic determination of building's heights in digital photographs areas, based on radial displacement of points in the plan image and geometry at the time the photo is obtained. Determination of the buildings' heights can be used to model the surface in urban areas, urban planning and management, among others. The proposed methodology employs a set of steps to detect arranged radially from the system of photogrammetric coordinates, which characterizes the lateral edges of buildings present in the photo. In a first stage is performed the reduction of the searching area through detection of shadows projected by buildings, generating sub-images of the areas around each of the detected shadow. Then, for each sub-image, the edges are automatically extracted, and tests of consistency are applied for it in order to be characterized as segments of straight arranged radially. Next, with the lateral edges selected and the knowledge of the flight height, the buildings' heights can be calculated. The experimental results obtained with real images showed that the proposed approach is suitable to perform the automatic identification of the buildings height in digital images.
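Monoscopic height determination of this kind rests on the classic relief-displacement relation for near-vertical photographs; below is a small sketch with illustrative numbers (the function name and all values are assumptions, not from the paper).

```python
def building_height(radial_disp, radial_dist_top, flying_height):
    """Classic relief-displacement relation for a vertical photograph:
    h = d * H / r, where d is the radial displacement between the top
    and base image points, r the radial distance from the nadir point
    to the top image point, and H the flying height above ground."""
    return radial_disp * flying_height / radial_dist_top

# Illustrative values: 2.1 mm displacement, 88.0 mm radial distance,
# 1500 m flying height -> ~35.8 m building height.
print(round(building_height(2.1, 88.0, 1500.0), 1))
```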
Abstract:
This article proposes a method for 3D road extraction from a stereopair of aerial images. The dynamic programming (DP) algorithm is used to carry out the optimization process in object-space, instead of in image-space as in traditional DP methodologies. This means that road centerlines are traced directly in object-space, which requires a mathematical relationship connecting road points in object- and image-space. This allows the integration of radiometric information from the images into the associated mathematical road model. As the approach depends on an initial approximation of each road, a few seed points are necessary to coarsely describe it. The proposed method usually yields good results, but large anomalies along the road can disturb its performance. The method can therefore be used in practical applications, although some local manual editing of the extracted road centerlines is to be expected.
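A schematic sketch of the DP optimization step, with the object-space road model and radiometric cost replaced by a toy cost matrix over discrete candidate positions; the names, the quadratic smoothness term, and all values are illustrative assumptions.

```python
import numpy as np

def dp_trace(costs, smooth_weight):
    """Dynamic-programming optimization of a road centerline: at each of
    n stations there are m candidate positions; pick one per station so
    that image cost plus a smoothness penalty is minimal."""
    n, m = costs.shape
    acc = costs[0].copy()                 # accumulated cost per candidate
    back = np.zeros((n, m), dtype=int)    # backpointers for path recovery
    positions = np.arange(m)
    for i in range(1, n):
        # transition penalty: squared jump between candidate positions
        trans = smooth_weight * (positions[None, :] - positions[:, None]) ** 2
        total = acc[:, None] + trans      # total[j, k]: come from j, go to k
        back[i] = total.argmin(axis=0)
        acc = total.min(axis=0) + costs[i]
    # backtrack the optimal path from the cheapest final candidate
    path = [int(acc.argmin())]
    for i in range(n - 1, 0, -1):
        path.append(int(back[i][path[-1]]))
    return path[::-1]

# Toy cost image: 5 stations, 7 lateral candidates, cheap band near column 3.
rng = np.random.default_rng(0)
costs = rng.random((5, 7)) + np.abs(np.arange(7) - 3)[None, :]
print(dp_trace(costs, smooth_weight=0.5))
```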
Abstract:
In this paper, a fully automatic strategy is proposed to reduce the complexity of the patterns (vegetation, buildings, soils, etc.) that interact with the object 'road' in color images, thus reducing the difficulty of automatically extracting this object. The proposed methodology consists of three sequential steps. In the first step, a point operator is applied to compute the artificiality index known as NandA (Natural and Artificial). The result is an image whose intensity attribute is the NandA response. The second step consists in automatically thresholding the image obtained in the previous step, resulting in a binary image. This image usually allows the separation of artificial from natural objects. The third step consists in applying a preexisting road seed extraction methodology to the previously generated binary image. Several experiments carried out with real images made it possible to verify the potential of the proposed methodology. A comparison of the results with those obtained by a similar road seed extraction methodology applied to gray-level images showed that the main benefit was a drastic reduction of the computational effort.
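A sketch of the first two steps under a strong assumption: the actual NandA formula is not reproduced here, so a crude low-saturation proxy stands in for the artificiality index, followed by automatic (Otsu) thresholding.

```python
import numpy as np
from skimage.filters import threshold_otsu

def artificiality_index(rgb):
    """Placeholder for the NandA point operator (its exact formula is not
    given here): as a stand-in, low color saturation is used as a crude
    proxy for 'artificial' (gray-ish roads) versus 'natural' (vegetation)."""
    rgb = rgb.astype(float)
    return 1.0 - (rgb.max(axis=2) - rgb.min(axis=2)) / (rgb.max(axis=2) + 1e-6)

def artificial_mask(rgb):
    """Steps 1-2 of the strategy: per-pixel index image, then automatic
    (Otsu) thresholding to a binary artificial/natural image."""
    index = artificiality_index(rgb)
    return index > threshold_otsu(index)

# Toy image: green 'vegetation' with a gray 'road' stripe.
img = np.zeros((32, 32, 3), dtype=np.uint8)
img[:, :, 1] = 200                      # green background
img[14:18, :, :] = 128                  # gray horizontal road
mask = artificial_mask(img)
print(mask[15, 0], mask[0, 0])          # True (road), False (vegetation)
```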
Abstract:
Nowadays, the assignment of subject descriptors, i.e. the indexing of book contents, is not always tied to the concrete context of each library, and in many cases this makes subject retrieval inadequate. This paper analyzes the main challenges and perspectives of book indexing and the advances in subject analysis in library catalogs, and examines the procedures, instruments, rules, and practices used in the analysis and representation of book contents. It also shows the interaction between teaching, research, and professional practice needed for students to develop competencies in the analysis, representation, and searching of information, as well as in the probably less evident principles of knowledge organization. The paper makes it evident that information management policies that are more quantitative than qualitative relegate the intellectual processing of content to a secondary role, thereby harming subject retrieval through the library catalog. Finally, it gathers a series of teaching proposals related to the assignment of subject descriptors in library contexts.
Abstract:
An indexing policy must consist of strategies that allow the retrieval goals of the information system to be achieved. The indexer's primary task is to understand the document by performing a conceptual analysis that adequately represents its content. Using reading as a social event/group verbal protocol, our aim is to contribute to the literature on indexing policy and to present proposals for teaching indexing policy to undergraduate and graduate students, in addition to a distance-learning experience aimed at training librarians in service. The results showed that the methodology can be used by information systems to gain access to the indexer's knowledge. We conclude that the indexer should be a target of investment by information systems, and we suggest that the indexer's experience also be used as a parameter for indexing policy.
Abstract:
Indexing automation has been discussed by researchers in the field of Information Science; however, the discussions have not been very clear about the use of indexing software. It is therefore necessary to understand indexing software and its application in the analysis of documentary contents. To this end, we propose to investigate both the consistency of indexing and the exhaustiveness and precision of information retrieval, by means of a comparative analysis between SISA (Sistema de Indizacion Semi-Automatico) automatic indexing and BIREME (Centro Latino-Americano e do Caribe de Informação em Ciencias da Saude) manual indexing. The aim of this paper is to contribute to the theoretical development of indexing automation and to the improvement of SISA. SISA was applied and evaluated based on the calculation of consistency indexes between the two types of indexing, and of exhaustiveness and precision indexes in information retrieval, through searches in the BDSISA and BIREME databases, composed of descriptors taken from SISA and from manual indexing, respectively. The differences between the terms used in the scientific papers and the DeCS terms were the main factor hindering higher indexing consistency indexes. These differences influenced the exhaustiveness and precision indexes in information retrieval, showing that the documentary language used by the SISA software needs to be improved and linguistic methods incorporated.
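The abstract does not spell out the consistency formula used; a common choice in inter-indexer consistency studies is Hooper's measure, sketched below with hypothetical descriptor sets (the sets are illustrative, not from the SISA/BIREME study).

```python
def hooper_consistency(terms_a, terms_b):
    """Hooper's inter-indexer consistency: T / (A + B - T), where T is the
    number of terms common to both indexings and A, B the totals assigned
    by each indexer (or indexing system)."""
    a, b = set(terms_a), set(terms_b)
    common = len(a & b)
    return common / (len(a) + len(b) - common)

# Illustrative descriptor sets (hypothetical, not from the actual study).
sisa = {"indexing", "information retrieval", "health sciences"}
bireme = {"indexing", "abstracting", "health sciences", "databases"}
print(f"{hooper_consistency(sisa, bireme):.0%}")  # 40%
```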
Abstract:
The aim of this paper is to evaluate, through a specific mathematical formula, the consistency indexes among 30 Brazilian university libraries from the south and southeast regions. A sample of 30 university libraries was selected which, according to the information on their official sites, have collections of more than 100,000 volumes and allow searching of their online catalogs. Searches were carried out on every university's site, requesting books that contained a certain word in the title and were printed in a certain year. The response was a list of the titles available in the library, from which we chose a title at random and requested the complete record to verify the subjects assigned. This procedure was repeated until the same title was found in five libraries with the chosen subjects. The result is 10 trials, each consisting of one figure and one table showing the selected libraries, the subjects, the documentary languages (tools), and the relaxed and rigid consistency indexes. These trials show great discrepancy between the consistency index values, ranging from 34.4% to 73.3% for the relaxed index and from 9.6% to 60% for the rigid one. It was revealed that the agreement in determining the subjects is not high, remaining below 39%. It is concluded that the differences between the consistency indexes may be due to factors such as: incompatibility among documentary languages; failure to update these languages to follow the evolution of knowledge; and the absence of a well-defined indexing policy with clearly established guidelines. Indexing procedures followed by indexers could raise the consistency indexes, since there would be parameters for the indexing process.
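Assuming the usual reading of "rigid" (exact term matches only) and one plausible reading of "relaxed" (partial matches also credited), here is a minimal sketch; both the relaxed criterion and the sample descriptor sets are assumptions, not taken from the paper.

```python
def consistency(terms_a, terms_b, relaxed=False):
    """Consistency between two libraries' subject assignments: the 'rigid'
    index counts exact matches only; the 'relaxed' variant (an assumed
    interpretation) also credits pairs where one term contains the other."""
    a, b = {t.lower() for t in terms_a}, {t.lower() for t in terms_b}
    if relaxed:
        matches = sum(1 for t in a if any(t in u or u in t for u in b))
    else:
        matches = len(a & b)
    return matches / (len(a) + len(b) - matches)

# Hypothetical subject assignments from two library catalogs.
lib1 = {"Automatic indexing", "Cataloging"}
lib2 = {"Indexing", "Cataloging", "Libraries"}
print(consistency(lib1, lib2))                 # rigid: 1/4 = 0.25
print(consistency(lib1, lib2, relaxed=True))   # relaxed: 2/3 ~ 0.67
```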
Abstract:
Biological and biochemical indexing procedures were studied in an experiment evaluating the use of viroids as dwarfing agents for 'Marsh Seedless' grapefruit. In some plots there was viroid segregation: in two of them, CEVd (citrus exocortis viroid) segregated, and in two others the presence of CVd-II (citrus viroid II) was not detected. As a consequence of this segregation, the typical exocortis symptoms did not appear in citron in the plots where CEVd segregated, whereas they did appear in the plots where CVd-II segregated, indicating that the presence of CEVd is necessary for the expression of typical exocortis symptoms in citron. In a fifth plot, no viroids were detected, probably because they were absent from the inoculation material of the biological test (an escape), since the field plant showed normal vegetative growth and no trunk symptoms. Finally, in a sixth plot, a drastic segregation not reported in the literature occurred, since the field plant showed reduced size and trunk symptoms. Given the occurrence of segregation, the collection of budwood sticks for indexing must be more rigorous.
Abstract:
One of the main problems in Computer Vision and close-range digital photogrammetry is 3D reconstruction. 3D reconstruction with structured light is one of the existing techniques, and it still has several problems, one of them being the identification or classification of the projected targets. Approaching this problem is the goal of this paper. An area-based method called template matching was used for target classification. This method detects area similarity by correlation, which measures the similarity between the reference and search windows using a suitable correlation function. In this paper, the modified cross-covariance function was used, as it presented the best results. A strategy was developed for adaptive resampling of the patterns, which solved the problem of target deformation due to object surface inclination. Experiments with simulated and real data were performed in order to assess the efficiency of the proposed methodology for target detection. The results showed that the proposed classification strategy works properly, identifying 98% of the targets on plane surfaces and 93% on oblique surfaces.
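The paper's modified cross-covariance function is not reproduced here; as a stand-in, the sketch below uses standard zero-mean normalized cross-correlation for the window comparison, with a toy image and template (all names and values are illustrative).

```python
import numpy as np

def ncc(template, window):
    """Zero-mean normalized cross-correlation between a reference template
    and a search window of the same size (a standard stand-in; the paper's
    'modified cross covariance' function is not reproduced here)."""
    t = template - template.mean()
    w = window - window.mean()
    denom = np.sqrt((t ** 2).sum() * (w ** 2).sum())
    return (t * w).sum() / denom if denom else 0.0

def match(image, template):
    """Slide the template over the image and return the best match position."""
    th, tw = template.shape
    best, best_pos = -1.0, (0, 0)
    for y in range(image.shape[0] - th + 1):
        for x in range(image.shape[1] - tw + 1):
            score = ncc(template, image[y:y + th, x:x + tw])
            if score > best:
                best, best_pos = score, (y, x)
    return best_pos, best

# Toy example: locate a bright 3x3 target embedded in a noisy image.
rng = np.random.default_rng(1)
image = rng.random((20, 20))
image[8:11, 5:8] += 2.0                 # embed a bright 3x3 target
template = image[8:11, 5:8].copy()      # reference window (perfect-match case)
print(match(image, template))           # -> position (8, 5), score ~1.0
```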