891 results for 080704 Information Retrieval and Web Search
Abstract:
Information retrieval has been widely discussed within Information Science lately. The search for quality information compatible with users' needs has become the object of constant research. Using the Internet as a source for disseminating knowledge has suggested new models of information storage, such as digital repositories, which have been used in academic research as the main form of self-archiving and disseminating information, but whose information structure calls for better descriptions of resources and hence better retrieval. The objective here is thus to improve the information retrieval process by presenting a proposal for a structural model in the context of the Semantic Web, addressing the use of Web 2.0 and Web 3.0 in digital repositories and enabling semantic information retrieval through the construction of a data layer called Iterative Representation. The present study is descriptive and analytical, based on document analysis, and divided into two parts: the first, characterized by non-participatory direct observation of tools that implement digital repositories, as well as of repositories already instantiated; the second, prospective in character, which suggests an innovative model for repositories using structures of knowledge representation and user participation in building a domain vocabulary. The proposed model, Iterative Representation, allows digital repositories to be tailored using folksonomy together with a controlled vocabulary of the field, generating an iterative data layer that enables information feedback and semantic retrieval through the structural model designed for repositories. The suggested model resulted in the formulation of the thesis that, through Iterative Representation, it is possible to establish a process of semantic information retrieval in digital repositories.
Abstract:
In this paper we study the intersection of Knowledge Organization with Information Technologies, and the challenges and opportunities that, in our view, Knowledge Organization experts should study and be aware of. We start by giving some definitions necessary to provide context for our work. Then we review the history of the Web, beginning with the Internet and continuing with the World Wide Web, the Semantic Web, problems of Artificial Intelligence, Web 2.0, and Linked Data. Finally, we conclude with IT applications for Knowledge Organization in libraries, such as FRBR, BIBFRAME, and several OCLC initiatives, as well as some of the challenges and opportunities in which Knowledge Organization experts and researchers might play a key role in relation to the Semantic Web.
Abstract:
This paper reports research evaluating the potential and effects of using paraconsistent annotated logic in automatic indexing. Paraconsistent logic attempts to deal with contradictions and is concerned with studying and developing inconsistency-tolerant systems of logic. Because it is flexible and contains logical states that go beyond the yes/no dichotomy, it permits the hypothesis that indexing results could be better than those obtained by traditional methods. Interactions between different disciplines, such as information retrieval, automatic indexing, information visualization, and non-classical logics, were considered in this research. From the methodological point of view, an algorithm for the treatment of uncertainty and imprecision, developed under paraconsistent logic, was used to modify the weights assigned to the indexing terms of the text collections. The tests were performed on an information visualization system named Projection Explorer (PEx), created at the Institute of Mathematics and Computer Science (ICMC - USP Sao Carlos), whose source code is available. PEx uses the traditional vector space model to represent the documents of a collection. The results were evaluated by criteria built into the information visualization system itself and demonstrated measurable gains in the quality of the displays, confirming the hypothesis that the para-analyser, under the conditions of the experiment, can generate more effective clusters of similar documents. This is noteworthy, since the constitution of more significant clusters can be used to enhance information indexing and retrieval. It can be argued that the adoption of non-dichotomous (non-exclusive) parameters provides new possibilities for relating similar information.
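The abstract does not reproduce the para-analyser algorithm itself, but the general idea of paraconsistent annotated logic can be sketched: each term carries a degree of favorable evidence (mu) and contrary evidence (lambda), and the certainty degree Gc = mu - lambda can modulate a vector-space term weight. The decay mapping below is an illustrative assumption, not the study's actual formula:

```python
# Illustrative sketch: scaling a tf-idf weight by a paraconsistent
# certainty degree. mu/lam are assumed evidence degrees in [0, 1];
# the real para-analyser used in the study is not reproduced here.

def certainty_degree(mu: float, lam: float) -> float:
    """Certainty degree Gc = mu - lam, where mu is favorable evidence
    and lam is contrary evidence (both in [0, 1])."""
    return mu - lam

def adjust_weight(tfidf: float, mu: float, lam: float) -> float:
    """Scale a term's tf-idf weight by the normalized certainty degree."""
    gc = certainty_degree(mu, lam)   # in [-1, 1]
    return tfidf * (gc + 1) / 2     # map Gc to a [0, 1] multiplier

# A term with strong favorable and weak contrary evidence keeps most
# of its weight; a fully contradictory term is damped to half.
print(adjust_weight(0.8, mu=0.9, lam=0.1))
print(adjust_weight(0.8, mu=0.5, lam=0.5))
```

Terms whose evidence is contradictory (mu close to lam) are pushed toward the middle rather than forced to a yes/no decision, which is the non-dichotomous behavior the abstract credits for the better clusters.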
Abstract:
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)
Abstract:
Pós-graduação em Ciência da Informação - FFC
Abstract:
XML similarity evaluation has become a central issue in the database and information communities, its applications ranging over document clustering, version control, data integration and ranked retrieval. Various algorithms for comparing hierarchically structured data, XML documents in particular, have been proposed in the literature. Most of them make use of techniques for finding the edit distance between tree structures, XML documents being commonly modeled as Ordered Labeled Trees. Yet, a thorough investigation of current approaches led us to identify several similarity aspects, i.e., sub-tree related structural and semantic similarities, which are not sufficiently addressed while comparing XML documents. In this paper, we provide an integrated and fine-grained comparison framework to deal with both structural and semantic similarities in XML documents (detecting the occurrences and repetitions of structurally and semantically similar sub-trees), and to allow the end-user to adjust the comparison process according to her requirements. Our framework consists of four main modules for (i) discovering the structural commonalities between sub-trees, (ii) identifying sub-tree semantic resemblances, (iii) computing tree-based edit operations costs, and (iv) computing tree edit distance. Experimental results demonstrate higher comparison accuracy with respect to alternative methods, while timing experiments reflect the impact of semantic similarity on overall system performance.
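The tree-edit-distance core of such frameworks can be sketched compactly. The version below is a simple constrained (top-down) variant that aligns sibling sequences with dynamic programming and uses unit costs; it is an assumption-level illustration, not the paper's full four-module framework (which also handles semantic resemblance and tuned operation costs):

```python
# Minimal sketch of an ordered labeled tree edit distance (unit costs),
# the kind of computation underlying XML structural comparison.
# Constrained top-down variant; production systems often use
# Zhang-Shasha or similar unrestricted algorithms.

class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = tuple(children)

def size(t):
    """Number of nodes in the subtree rooted at t."""
    return 1 + sum(size(c) for c in t.children)

def dist(a, b):
    """Edit distance between two ordered labeled trees."""
    relabel = 0 if a.label == b.label else 1
    return relabel + forest_dist(a.children, b.children)

def forest_dist(xs, ys):
    """Sequence DP over the child lists, where substitution cost is the
    recursive tree distance and insert/delete cost a whole subtree."""
    m, n = len(xs), len(ys)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = d[i - 1][0] + size(xs[i - 1])
    for j in range(1, n + 1):
        d[0][j] = d[0][j - 1] + size(ys[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + size(xs[i - 1]),                  # delete subtree
                d[i][j - 1] + size(ys[j - 1]),                  # insert subtree
                d[i - 1][j - 1] + dist(xs[i - 1], ys[j - 1]),   # match/edit
            )
    return d[m][n]

# Two small XML-like trees differing in one child label:
t1 = Node("article", [Node("title"), Node("author")])
t2 = Node("article", [Node("title"), Node("editor")])
print(dist(t1, t2))  # 1: relabel author -> editor
```

The semantic module described in the abstract would replace the binary relabel cost with a graded similarity between labels (e.g., thesaurus-based), which is exactly where structural and semantic comparison meet.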
Abstract:
Nowadays, competitiveness introduces new behaviors and leads companies into a discomforting situation, and often into non-adaptation to environmental requirements. A growing number of challenges associated with the control of information can be seen in organizations with engineering activities, particularly the growing amount of information subject to continuous change. The innovative performance of an organization is directly proportional to its ability to manage information. Thus, the importance of information management is recognized in the search for more competent ways to face current demands. The purpose of this article was to analyze information-dependent processes in technology-based companies through the four major stages of information management. The comparative case method and qualitative research were used. The research was conducted in nine technology-based companies that were incubated, or had recently completed incubation, at the Technological Park of Sao Carlos, in the state of Sao Paulo. Among the main results, it was found that information management and its procedures were more conscious and structured in the graduated companies than in the incubated ones.
Abstract:
Models are becoming increasingly important in the software development process. As a consequence, the number of models being used is increasing, and so is the need for efficient mechanisms to search them. Various existing search engines could be used for this purpose, but they lack features to properly search models, mainly because they are strongly focused on text-based search. This paper presents Moogle, a model search engine that uses metamodeling information to create richer search indexes and to allow more complex queries to be performed. The paper also presents the results of an evaluation of Moogle, which showed that the metamodel information improves the accuracy of the search.
Abstract:
Our reality is characterized by constant progress, and to keep up, people need to stay up to date on events. In a world with so much news, searching for the right items can be difficult, because the obstacles that make it arduous keep expanding over time as data grows richer. Information Retrieval, an interdisciplinary branch of computer science that deals with the management and retrieval of information, offers great help here. An IR system is developed to search for content, contained in a reference dataset, that is considered relevant with respect to the need expressed by a query. Most IR systems rely solely on textual similarity to identify relevant information, treating content as relevant when it includes one or more keywords expressed in the query. The idea studied here is that this is not always sufficient, especially when managing large databases such as the Web. Existing solutions may generate low-quality responses that do not allow users to navigate them effectively. The intuition for overcoming these limitations was to define a new concept of relevance and rank the results differently. This gave rise to Temporal PageRank, a new proposal for Web Information Retrieval that relies on a combination of several factors to increase the quality of search on the Web. Temporal PageRank combines the advantages of a ranking algorithm, preferring information reported by web pages considered important within the context in which they reside, with the potential of techniques from Temporal Information Retrieval, exploiting the temporal aspects of data and describing their chronological contexts. In this thesis, the new proposal is discussed, comparing its results with those achieved by the best-known solutions and analyzing its strengths and weaknesses.
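The thesis's exact Temporal PageRank formulation is not given in the abstract, but the general idea of folding temporal evidence into a link-based ranking can be sketched. Below, standard PageRank is biased by a recency-weighted teleport vector; the exponential decay and 30-day half-life are assumptions for illustration only:

```python
# Illustrative sketch only: PageRank with a recency-biased teleport
# vector, showing one way temporal evidence can enter a link-based
# ranking. Not the thesis's actual Temporal PageRank formulation.
import math

def temporal_pagerank(links, ages, d=0.85, iters=50):
    """links: {page: [outlinks]}; ages: {page: age in days}.
    Fresher pages receive a larger share of the teleport probability."""
    pages = list(links)
    # Recency weight: exponential decay with an assumed 30-day half-life.
    w = {p: math.exp(-ages[p] * math.log(2) / 30) for p in pages}
    total = sum(w.values())
    teleport = {p: w[p] / total for p in pages}
    rank = {p: 1 / len(pages) for p in pages}
    for _ in range(iters):
        new = {p: (1 - d) * teleport[p] for p in pages}
        for p in pages:
            outs = links[p] or pages      # dangling node: spread everywhere
            share = d * rank[p] / len(outs)
            for q in outs:
                new[q] += share
        rank = new
    return rank

links = {"a": ["b"], "b": ["a"], "c": ["a"]}
ages = {"a": 1, "b": 365, "c": 30}
r = temporal_pagerank(links, ages)
print(sorted(r, key=r.get, reverse=True))
```

Page "a" wins both on link structure (two inlinks) and freshness; swapping the ages shows how the temporal component shifts mass among pages with equal link evidence.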
Abstract:
We describe the use of log file analysis to investigate whether the use of CSCL applications corresponds to their didactic purposes. As an example, we examine the use of the web-based system CommSy as software support for project-oriented university courses. We present two findings: (1) we suggest measures to shape the context of CSCL applications and support their initial and continued use; (2) we show how log files can be used to analyze how, when, and by whom a CSCL system is used, and thus help validate further empirical findings. However, log file analyses can only be interpreted reasonably when additional data concerning the context of use is available.
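The kind of "how, when, and by whom" analysis described above reduces to aggregating log records. A minimal sketch follows; the tab-separated (timestamp, user, action) format is an assumption for illustration, not CommSy's actual log format:

```python
# Minimal sketch of log-file usage analysis: who used the system,
# which actions occurred, and at what hour of day. The log format
# below is assumed, not CommSy's real one.
from collections import Counter

def analyze(lines):
    users, actions, hours = Counter(), Counter(), Counter()
    for line in lines:
        ts, user, action = line.strip().split("\t")
        users[user] += 1
        actions[action] += 1
        hours[ts.split("T")[1][:2]] += 1   # hour of day from ISO timestamp
    return users, actions, hours

log = [
    "2024-05-01T09:15:00\talice\tupload",
    "2024-05-01T09:20:00\tbob\tread",
    "2024-05-01T14:05:00\talice\tread",
]
users, actions, hours = analyze(log)
print(users.most_common(1))  # [('alice', 2)]
```

As the abstract cautions, such counts only show that and when the system was used; interpreting whether that use matches the didactic purpose still requires contextual data.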
Quality evaluation of the available Internet information regarding pain during orthodontic treatment
Abstract:
OBJECTIVE To investigate the quality of the data disseminated via the Internet regarding pain experienced by orthodontic patients. MATERIALS AND METHODS A systematic online search was performed for 'orthodontic pain' and 'braces pain' separately using five search engines. The first 25 results from each search term-engine combination were pooled for analysis. After excluding advertising sites, discussion groups, video feeds, and links to scientific articles, 25 Web pages were evaluated in terms of accuracy, readability, accessibility, usability, and reliability using recommended research methodology; reference textbook material, the Flesch Reading Ease Score; and the LIDA instrument. Author and information details were also recorded. RESULTS Overall, the results indicated a variable quality of the available informational material. Although the readability of the Web sites was generally acceptable, the individual LIDA categories were rated of medium or low quality, with average scores ranging from 16.9% to 86.2%. The orthodontic relevance of the Web sites was not accompanied by the highest assessment results, and vice versa. CONCLUSIONS The quality of the orthodontic pain information cited by Web sources appears to be highly variable. Further structural development of health information technology along with public referral to reliable sources by specialists are recommended.
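The Flesch Reading Ease Score used in the readability assessment above has a fixed, well-known formula. The sketch below computes it from pre-counted totals; syllable counting itself is non-trivial and omitted here:

```python
# Flesch Reading Ease Score from pre-counted totals.
# Higher scores indicate easier text (60-70 is roughly plain English).

def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

# Example: 100 words, 8 sentences, 140 syllables
print(round(flesch_reading_ease(100, 8, 140), 1))  # 75.7
```

In practice, syllable counts come from a dictionary or a heuristic library rather than manual counting, and the LIDA instrument assesses accessibility, usability, and reliability separately from readability.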
Abstract:
OBJECTIVE: To characterize PubMed usage over a typical day and compare it to previous studies of user behavior on Web search engines. DESIGN: We performed a lexical and semantic analysis of 2,689,166 queries issued on PubMed over 24 consecutive hours on a typical day. MEASUREMENTS: We measured the number of queries, number of distinct users, queries per user, terms per query, common terms, Boolean operator use, common phrases, result set size, MeSH categories, used semantic measurements to group queries into sessions, and studied the addition and removal of terms from consecutive queries to gauge search strategies. RESULTS: The size of the result sets from a sample of queries showed a bimodal distribution, with peaks at approximately 3 and 100 results, suggesting that a large group of queries was tightly focused and another was broad. Like Web search engine sessions, most PubMed sessions consisted of a single query. However, PubMed queries contained more terms. CONCLUSION: PubMed's usage profile should be considered when educating users, building user interfaces, and developing future biomedical information retrieval systems.
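Grouping a query stream into sessions, as in the study above, is often done with a simple inter-query time gap; the study additionally used semantic measurements between consecutive queries, which this sketch omits. The 30-minute cutoff is a common convention, assumed here for illustration:

```python
# Sketch of gap-based session segmentation of a user's query stream.
# The semantic-similarity component used in the study is omitted.

def sessionize(events, gap_seconds=1800):
    """events: list of (timestamp_seconds, query) sorted by time for one
    user. Starts a new session when the gap exceeds the cutoff."""
    sessions, current = [], []
    last_t = None
    for t, q in events:
        if last_t is not None and t - last_t > gap_seconds:
            sessions.append(current)
            current = []
        current.append(q)
        last_t = t
    if current:
        sessions.append(current)
    return sessions

events = [(0, "aspirin"), (60, "aspirin dosage"), (4000, "flu symptoms")]
print(sessionize(events))  # [['aspirin', 'aspirin dosage'], ['flu symptoms']]
```

Combining the gap cutoff with term overlap between consecutive queries (adding or removing terms, as measured in the study) gives a closer approximation of true search sessions than time alone.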