960 results for document categorization
Abstract:
The aim of this paper is to give an overview of the issues and actions concerning Brazilian cultural heritage and then to discuss the contributions and relationships that may be established from the principles of Information Science. The first item concerns the relationship between heritage and the concept of document; the second relates documentary processes to the information scientist; finally, an approach to cultural heritage mediation and appropriation is presented.
Abstract:
Document engineering is the computer science discipline that investigates systems for documents in any form and in all media. As with the relationship between software engineering and software, document engineering is concerned with principles, tools and processes that improve our ability to create, manage, and maintain documents (http://www.documentengineering.org). The ACM Symposium on Document Engineering is an annual meeting of researchers active in document engineering; it is sponsored by ACM through the ACM SIGWEB Special Interest Group. In this editorial, we first point to work carried out in the context of document engineering that is directly related to multimedia tools and applications. We conclude with a summary of the papers presented in this special issue.
Abstract:
Point placement strategies aim at mapping data points represented in higher dimensions to bi-dimensional spaces and are frequently used to visualize relationships amongst data instances. They have been valuable tools for analysis and exploration of data sets of various kinds. Many conventional techniques, however, do not behave well when the number of dimensions is high, as in the case of document collections. Later approaches handle that shortcoming, but may cause too much clutter to allow flexible exploration to take place. In this work we present a novel hierarchical point placement technique that is capable of dealing with these problems. While good grouping and separation of data with high similarity is maintained without increasing computation cost, its hierarchical structure lends itself both to exploration in various levels of detail and to handling data in subsets, improving analysis capability and also allowing manipulation of larger data sets.
Abstract:
The problem of projecting multidimensional data into lower dimensions has been pursued by many researchers due to its potential application to data analyses of various kinds. This paper presents a novel multidimensional projection technique based on least square approximations. The approximations compute the coordinates of a set of projected points based on the coordinates of a reduced number of control points with defined geometry. We name the technique Least Square Projections (LSP). From an initial projection of the control points, LSP defines the positioning of their neighboring points through a numerical solution that aims at preserving a similarity relationship between the points given by a metric in mD. In order to perform the projection, a small number of distance calculations are necessary, and no repositioning of the points is required to obtain a final solution with satisfactory precision. The results show the capability of the technique to form groups of points by degree of similarity in 2D. We illustrate that capability through its application to mapping collections of textual documents from varied sources, a strategic yet difficult application. LSP is faster and more accurate than other existing high-quality methods, particularly where it was mostly tested, that is, for mapping text sets.
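The idea of placing points from a few control points through a least-squares solve can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `lsp_sketch`, the parameter `k`, and the choice to make each point the average of its `k` nearest neighbors are all hypothetical, and the sketch computes every pairwise distance for simplicity, whereas the abstract states that only a small number of distance calculations are needed. The 2D positions of the control points are assumed to be given by some earlier projection.

```python
import numpy as np

def lsp_sketch(X, control_idx, control_xy, k=3):
    """Hypothetical least-squares placement of n points in 2D.

    X           : (n, m) array of points in the original mD space
    control_idx : indices of the control points
    control_xy  : (c, 2) array with the assumed 2D positions of the controls
    k           : number of nearest neighbors per point (assumption)
    """
    n = X.shape[0]
    # Simplification: brute-force pairwise Euclidean distances in mD.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a point is never its own neighbor

    # Neighborhood rows: p_i - (1/k) * sum of its neighbors = 0.
    L = np.eye(n)
    for i in range(n):
        nbrs = np.argsort(d[i])[:k]
        L[i, nbrs] = -1.0 / k

    # Control rows: p_c = known 2D position.
    C = np.zeros((len(control_idx), n))
    C[np.arange(len(control_idx)), control_idx] = 1.0

    A = np.vstack([L, C])
    b = np.vstack([np.zeros((n, 2)), np.asarray(control_xy, dtype=float)])

    # One least-squares solve yields all 2D coordinates at once.
    Y, *_ = np.linalg.lstsq(A, b, rcond=None)
    return Y
```

The system is solved once, which mirrors the abstract's remark that no iterative repositioning of the points is required.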
Abstract:
Being able to ask questions about the provenance of some data requires documentation on each influence on that data's existence and content. Much software exists, and is being developed, for which there is no provenance-awareness, i.e. at best, the data it outputs can be connected to its inputs, but with no record of intermediate processing. Further, where some record of processing does exist, e.g. as logs, it is not in a form easily connected with that of other processes. We would like to enable compiled software to record useful documentation without requiring prior manual adaptation. In this paper, we present an approach to adapting source code from its original form without manual manipulation, to record information on data provenance during execution.
Abstract:
Component document of the game “Musikinésia (http://www.loa.sead.ufscar.br/musikinesia.php)”, developed by the team of the Laboratório de Objetos de Aprendizagem (Learning Objects Laboratory) of the Universidade Federal de São Carlos (LOA/UFSCar).
Abstract:
Component document of the game “Digestower (http://www.loa.sead.ufscar.br/digestower.php)”, developed by the team of the Laboratório de Objetos de Aprendizagem (Learning Objects Laboratory) of the Universidade Federal de São Carlos (LOA/UFSCar).
Abstract:
This study aimed to analyze the oral health interventions recorded in the meeting minutes of 15 Municipal Health Councils in municipalities belonging to the 17th Health Region (17ª Regional de Saúde) of the State of Paraná. The documentary analysis was based on identifying health themes, with emphasis on categorizing the oral health interventions by subject. The results showed that records concerning the programming and organization of service delivery, followed by the health budget, were the most frequent among the themes analyzed. In 90 of the 591 minutes studied, a total of 134 records of oral health interventions were identified. Analysis of these records showed that the oral health interventions were reports of actions already carried out, lacking propositional characteristics when analyzed from the perspective of health planning. The findings point to the need for the dental profession to achieve greater representation in these spaces, so as to enable important links in the planning process and in the strengthening of oral health as a right of citizenship.
Abstract:
Background. Ductal carcinoma in situ (DCIS) of the breast has been diagnosed increasingly since the advent of mammography. However, the natural history of these lesions remains uncertain. Ductal carcinoma in situ of the breast does not represent a single entity but a heterogeneous group with histologic and clinical differences. The histologic subtype of DCIS seems to have an influence on its biologic behavior, but there are few studies correlating subtype with biologic markers. Methods. The authors studied a consecutive series of 40 cases of DCIS and, after histologic categorization, verified its relationship with ploidy using image analysis and analyzed estrogen receptor (ER), progesterone receptor (PR), p53 and c-erbB-2 expression using immunohistochemistry. Results. The three groups proposed according to the grade of malignancy were correlated significantly with some of the additional parameters studied, including aneuploidy and c-erbB-2 expression. Aneuploidy was detected in 77.5% of cases of DCIS, mainly in high and intermediate grade subtypes (100% and 80% vs. 35.7% in low grade), whereas immunoreactivity for c-erbB-2 was detected in 45% of cases of DCIS, mainly in the high grade group. Expression of ER and PR was observed frequently in this study (63.9% and 65.7% respectively), but without correlation with the histologic subtype of DCIS, although we found a somewhat significant association between high grade DCIS and lack of ER. p53 protein expression was detected in 36.8% of these cases, but no relationship between this expression and histologic subtype or grading of DCIS was found. Conclusions. These results provide further evidence for the morphologic and biologic heterogeneity of DCIS. Besides histologic classification and nuclear grading, some biologic markers such as aneuploidy and c-erbB-2 expression constitute additional criteria of high grade of malignancy.
Abstract:
Includes bibliography
Abstract:
One way to organize knowledge and make its search and retrieval easier is to create a structural representation divided by hierarchically related topics. Once this structure is built, it is necessary to find labels for each of the obtained clusters. In many cases the labels have to be built using only the terms in the documents of the collection. This paper presents the SeCLAR (Selecting Candidate Labels using Association Rules) method, which explores the use of association rules for the selection of good candidates for labels of hierarchical document clusters. The candidates are processed by a classical method to generate the labels. The idea of the proposed method is to process each parent-child relationship of the nodes as an antecedent-consequent relationship of association rules. The experimental results show that the proposed method can improve the precision and recall of labels obtained by classical methods. © 2010 Springer-Verlag.
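The antecedent-consequent view of association rules mentioned above rests on two standard measures, support and confidence. The sketch below shows only that generic calculation over documents represented as sets of terms; it is not the SeCLAR method itself, and the function names, thresholds, and the rule of keeping every term that takes part in a strong rule are assumptions made for illustration.

```python
from itertools import permutations

def rule_stats(docs, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent.

    docs is a list of sets of terms (one set per document).
    """
    n = len(docs)
    both = sum(1 for d in docs if antecedent in d and consequent in d)
    ante = sum(1 for d in docs if antecedent in d)
    support = both / n if n else 0.0
    confidence = both / ante if ante else 0.0
    return support, confidence

def candidate_terms(docs, min_support=0.3, min_confidence=0.6):
    """Keep terms that appear in at least one sufficiently strong rule.

    The thresholds are illustrative values, not taken from the paper.
    """
    vocab = sorted(set().union(*docs)) if docs else []
    selected = set()
    for a, b in permutations(vocab, 2):  # every ordered pair of terms
        support, confidence = rule_stats(docs, a, b)
        if support >= min_support and confidence >= min_confidence:
            selected.update((a, b))
    return selected
```

In a hierarchical setting, one hypothetical use would be to run this over the documents of a parent cluster and its child and pass the selected terms to a conventional labeling method.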
Abstract:
Unlike the first attempts to solve the image categorization problem (often based on global features), several researchers have recently been tackling this problem from a new vantage point: using features around locally invariant interest points and visual dictionaries. Although several advances have been made in the visual dictionary literature in the past few years, a problem we still need to cope with is calculating the number of representative words in the dictionary. Therefore, in this paper we introduce a new solution for automatically finding the number of visual words in an N-Way image categorization problem by means of supervised pattern classification based on optimum-path forest. © 2011 IEEE.
Abstract:
One way to organize knowledge and make its search and retrieval easier is to create a structural representation divided by hierarchically related topics. Once this structure is built, it is necessary to find labels for each of the obtained clusters. In many cases the labels must be built using all the terms in the documents of the collection. This paper presents the SeCLAR method, which explores the use of association rules in the selection of good candidates for labels of hierarchical document clusters. The purpose of this method is to select a subset of terms by exploring the relationship among the terms of each document. Thus, these candidates can be processed by a classical method to generate the labels. An experimental study demonstrates the potential of the proposed approach to improve the precision and recall of labels obtained by classical methods by considering only the terms that are potentially more discriminative. © 2012 - IOS Press and the authors. All rights reserved.