955 results for Open source information retrieval
Abstract:
There has been a dramatic change in U.K. government policy regarding the establishment of new towns. The emphasis is now on the redevelopment of existing cities rather than on building new ones. This has created an urgent need to carry out detailed surveys and inventories of many aspects of urban land use in metropolitan areas; this study concentrates on just one aspect, urban open space. In the first stage, 1:10,000 scale black and white and 1:10,000 scale colour infra-red aerial photographs were compared to establish the type and amount of open space information that could be obtained from each source. The advantages of using colour infra-red photography were clearly demonstrated in this comparison. The second stage was the use of colour infra-red photography as the sole source of data to survey and map the urban open space of a sample area in Merseyside Metropolitan County. This sample area comprised eleven 1/4 km² squares, on each of which a grid of 20 m x 20 m cells was placed to record, directly from the photography, 625 sets of data. Each set of data recorded the type and amount of open space, its surface cover, maintenance status and management. The data recorded were fed into a computer and a suite of programs was developed to provide output in both computer map and statistical form for each of the eleven 1/4 km² sample areas. The third stage involved a comparison of open space data with socio-economic status. Merseyside County Planning Authority had previously conducted a socio-economic survey of the county, and this information was used to identify the socio-economic status of the population in the eleven 1/4 km² areas of this project. This comparison revealed many interesting and useful relationships between the provision of urban open space and socio-economic status.
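The figure of 625 data sets per square follows from the grid geometry. A minimal sketch of the arithmetic, assuming each 1/4 km² sample area is a 500 m x 500 m square; the variable names and the overall total are illustrative and are not stated in the abstract:

```python
import math

# Assumption: each sample area is a true square of 0.25 km² (250,000 m²).
square_area_m2 = 250_000
side_m = math.isqrt(square_area_m2)      # side length of the square in metres

cell_m = 20                              # grid cell edge from the abstract
cells_per_side = side_m // cell_m        # cells along one edge of the square
cells_per_square = cells_per_side ** 2   # data sets recorded per square

# Derived here, not reported in the abstract: cells across all eleven squares.
total_cells = 11 * cells_per_square

print(side_m, cells_per_side, cells_per_square, total_cells)
```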
Abstract:
In the context of software reuse, techniques to support source code retrieval have been widely explored. However, much effort is still required to find out how to match classical Information Retrieval with the characteristics of source code and its implicit information. Introducing linguistic theories into the software development process, in the form of documentation standardization, may produce significant benefits when applying Information Retrieval techniques. The goal of our research is to provide a tool that improves source code search and retrieval. To achieve this goal, we apply some linguistic rules to the development process.
Abstract:
Information architecture supports information retrieval by users in the Web environment. Its design should be centred on the information user and should favour usability. The Faculty of Industrial Engineering and Tourism of the Universidad Central "Marta Abreu" de Las Villas lacks a site that promotes the dissemination of information to its members. The objectives of the study are: 1) to conduct a user study to identify the information needs of users, 2) to establish user-centred information architecture guidelines for the institution, 3) to design the information architecture for the institution, and 4) to evaluate the proposed design. Theoretical and empirical methods are used to obtain the results, along with techniques that support design and evaluation. The intranet of the Faculty of Industrial Engineering and Tourism is designed, and the proposed design is evaluated to validate the results.
Abstract:
Background and aims: Machine learning techniques for the text mining of cancer-related clinical documents have not been sufficiently explored. Here some techniques are presented for the pre-processing of free-text breast cancer pathology reports, with the aim of facilitating the extraction of information relevant to cancer staging.
Materials and methods: The first technique was implemented using the freely available software RapidMiner to classify the reports according to their general layout: ‘semi-structured’ and ‘unstructured’. The second technique was developed using the open source language engineering framework GATE and aimed at the prediction of chunks of the report text containing information pertaining to the cancer morphology, the tumour size, its hormone receptor status and the number of positive nodes. The classifiers were trained and tested respectively on sets of 635 and 163 manually classified or annotated reports, from the Northern Ireland Cancer Registry.
Results: The best result of 99.4% accuracy – which included only one semi-structured report predicted as unstructured – was produced by the layout classifier with the k-nearest-neighbour algorithm, using the binary term occurrence word vector type with stopword filter and pruning. For chunk recognition, the best results were found using the PAUM algorithm with the same parameters for all cases, except for the prediction of chunks containing cancer morphology. For semi-structured reports the performance ranged from 0.97 to 0.94 in precision and from 0.92 to 0.83 in recall, while for unstructured reports it ranged from 0.91 to 0.64 in precision and from 0.68 to 0.41 in recall. Poor results were found when the classifier was trained on semi-structured reports but tested on unstructured ones.
Conclusions: These results show that it is possible and beneficial to predict the layout of reports and that the accuracy of prediction of which segments of a report may contain certain information is sensitive to the report layout and the type of information sought.
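The layout classifier described above pairs binary term occurrence vectors with a nearest-neighbour vote. The sketch below illustrates that idea in plain Python; it is not the authors' RapidMiner process. The stopword list, the toy reports, the choice of k = 3 and the use of Hamming distance are all assumptions made for illustration.

```python
from collections import Counter

# Illustrative stopword list only; a real filter would be much longer.
STOPWORDS = {"the", "a", "of", "is", "and", "in", "with"}


def binary_terms(text):
    """Binary term occurrence: the set of distinct non-stopword tokens."""
    return {tok for tok in text.lower().split() if tok not in STOPWORDS}


def hamming(a, b):
    """Count the terms that occur in exactly one of the two documents."""
    return len(a ^ b)


def knn_predict(train, text, k=3):
    """Label `text` by majority vote among its k nearest training documents."""
    query = binary_terms(text)
    nearest = sorted(train, key=lambda item: hamming(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy reports, invented for this example.
train = [
    (binary_terms(text), label)
    for text, label in [
        ("tumour size: 22 mm grade: 2 nodes: 0/12", "semi-structured"),
        ("tumour size: 15 mm grade: 3 nodes: 2/10", "semi-structured"),
        ("tumour size: 30 mm grade: 1 nodes: 1/8", "semi-structured"),
        ("sections show an invasive carcinoma measuring roughly two centimetres", "unstructured"),
        ("the specimen shows invasive ductal carcinoma with two positive nodes", "unstructured"),
        ("sections of the specimen show a carcinoma with no nodes involved", "unstructured"),
    ]
]

layout_a = knn_predict(train, "tumour size: 18 mm grade: 2 nodes: 0/9")
layout_b = knn_predict(train, "sections show invasive carcinoma with positive nodes")
print(layout_a, layout_b)
```

Because each document is reduced to a set of terms, repeated words carry no extra weight, which is the property that distinguishes binary term occurrence from term-frequency vectors.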
Abstract:
Decision support systems have been widely used for years in companies to gain insights from internal data and thus make successful decisions. Lately, thanks to the increasing availability of open data, these systems are also integrating open data to enrich the decision-making process with external data. On the other hand, within an open-data scenario, decision support systems can also be useful to decide which data should be opened, considering not only technical or legal constraints but also other requirements, such as the "reusing potential" of data. In this talk, we focus on both issues: (i) open data for decision making, and (ii) decision making for opening data. We will first briefly comment on some research problems regarding the use of open data for decision making. Then, we will outline a novel decision-making approach (based on how open data is actually being used in open-source projects hosted on GitHub) for supporting open data publication. Bio of the speaker: Jose-Norberto Mazón holds a PhD from the University of Alicante (Spain). He is head of the "Cátedra Telefónica" on Big Data and coordinator of the Computing degree at the University of Alicante. He is also a member of the WaKe research group at the University of Alicante. His research work focuses on open data management, data integration and business intelligence within "big data" scenarios, and their application to the tourism domain (smart tourism destinations). He has published his research in international journals, such as Decision Support Systems, Information Sciences, Data & Knowledge Engineering and ACM Transactions on the Web. Finally, he is involved in the open data project at the University of Alicante, including its open data portal at http://datos.ua.es
Abstract:
Background: Digital forensics is a rapidly expanding field, due to the continuing advances in computer technology and increases in the data storage capabilities of devices. However, the tools supporting digital forensics investigations have not kept pace with this evolution, often leaving the investigator to analyse large volumes of textual data and rely heavily on their own intuition and experience. Aim: This research proposes that, given the ability of information visualisation to provide an end user with an intuitive way to rapidly analyse large volumes of complex data, such approaches could be applied to digital forensics datasets. Such methods will be investigated, supported by a review of literature regarding the use of such techniques in other fields. The hypothesis of this body of research is that by utilising exploratory information visualisation techniques in the form of a tool to support digital forensic investigations, gains in investigative effectiveness can be realised. Method: To test the hypothesis, this research examines three different case studies which look at different forms of information visualisation and their implementation with a digital forensic dataset. Two of these case studies take the form of prototype tools developed by the researcher, and one case study utilises a tool created by a third-party research group. A pilot study by the researcher is conducted on these cases, with the strengths and weaknesses of each being drawn into the next case study. The culmination of these case studies is a prototype tool which was developed to resemble a timeline visualisation of the user behaviour on a device. This tool was subjected to an experiment involving a class of university digital forensics students who were given a number of questions about a synthetic digital forensic dataset. Approximately half were given the prototype tool, named Insight, to use, and the others were given a common open-source tool.
The assessed metrics included: how long the participants took to complete all tasks, how accurate their answers to the tasks were, and how easy the participants found the tasks to complete. They were also asked for their feedback at multiple points throughout the task. Results: The results showed that there was a statistically significant increase in accuracy for one of the six tasks for the participants using the Insight prototype tool. Participants also found completing two of the six tasks significantly easier when using the prototype tool. There was no statistically significant difference between the completion times of the two participant groups. There were no statistically significant differences in the accuracy of participant answers for five of the six tasks. Conclusions: The results from this body of research show that there is evidence to suggest that there is the potential for gains in investigative effectiveness when information visualisation techniques are applied to a digital forensic dataset. Specifically, in some scenarios, the investigator can draw conclusions which are more accurate than those drawn when using primarily textual tools. There is also evidence to suggest that the investigators found these conclusions to be reached significantly more easily when using a tool with a visual format. None of the scenarios led to the investigators being at a significant disadvantage in terms of accuracy or usability when using the prototype visual tool over the textual tool. It is noted that this research did not show that the use of information visualisation techniques leads to any statistically significant difference in the time taken to complete a digital forensics investigation.
Abstract:
Things change. Words change, meaning changes and use changes both words and meaning. In information access systems this means concept schemes such as thesauri or classification schemes change. They always have. Concept schemes that have survived have evolved over time, moving from one version, often called an edition, to the next. If we want to manage how words and meanings, and as a consequence use, change in an effective manner, and if we want to be able to search across versions of concept schemes, we have to track these changes. This paper explores how we might expand SKOS, a World Wide Web Consortium (W3C) draft recommendation, in order to do that kind of tracking. The Simple Knowledge Organization System (SKOS) Core Guide is sponsored by the Semantic Web Best Practices and Deployment Working Group. The second draft, edited by Alistair Miles and Dan Brickley, was issued in November 2005. SKOS is a “model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, other types of controlled vocabulary and also concept schemes embedded in glossaries and terminologies” in RDF. How SKOS handles versioning in concept schemes is an open issue. The current draft guide suggests using OWL and DCTERMS as mechanisms for concept scheme revision. As it stands, an editor of a concept scheme can make notes or declare in OWL that more than one version exists. This paper adds to the SKOS Core by introducing a tracking system for changes in concept schemes. We call this tracking system vocabulary ontogeny. Ontogeny is a biological term for the development of an organism during its lifetime. Here we use the ontogeny metaphor to describe how vocabularies change over their lifetime.
Our purpose here is to create a conceptual mechanism that will track these changes and in so doing enhance information retrieval and prevent document loss through versioning, thereby enabling persistent retrieval.
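One way to picture the kind of change tracking discussed above is to compare two editions of a concept scheme and record what was added, removed or relabelled. The sketch below is a hypothetical illustration in plain Python; it is neither the paper's vocabulary ontogeny mechanism nor part of SKOS. The concept identifiers, the labels and the function name `diff_editions` are invented for the example.

```python
def diff_editions(old, new):
    """Compare two editions given as {concept_id: preferred_label} dicts."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    relabelled = {
        cid: (old[cid], new[cid])
        for cid in sorted(old.keys() & new.keys())
        if old[cid] != new[cid]
    }
    return {"added": added, "removed": removed, "relabelled": relabelled}


# Hypothetical editions of a tiny concept scheme.
edition_1 = {"c1": "Aeroplanes", "c2": "Wireless telegraphy", "c3": "Computers"}
edition_2 = {"c1": "Airplanes", "c3": "Computers", "c4": "World Wide Web"}

changes = diff_editions(edition_1, edition_2)
print(changes)
```

A search interface could use the `relabelled` and `removed` entries to map a query term from an older edition onto the concepts of the current one, which is the cross-version retrieval the abstract is concerned with.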
Abstract:
This paper outlines the purposes, predications, functions, and contexts of information organization frameworks; including: bibliographic control, information retrieval, resource discovery, resource description, open access scholarly indexing, personal information management protocols, and social tagging in order to compare and contrast those purposes, predications, functions, and contexts. Information organization frameworks, for the purpose of this paper, consist of information organization systems (classification schemes, taxonomies, ontologies, bibliographic descriptions, etc.), methods of conceiving of and creating the systems, and the work processes involved in maintaining these systems. The paper first outlines the theoretical literature of these information organization frameworks. In conclusion, this paper establishes the first part of an evaluation rubric for a function, predication, purpose, and context analysis.
Abstract:
This article discusses issues related to the organization and reception of information in the context of services and public information systems driven by technology. It stems from the assumption that in a "technologized" society, the distance between users and information is almost always of a cognitive and socio-cultural nature, a product of our effort to design communication. In this context, we favor the approach of the information sign, seeking to answer how a documentary message turns into information, i.e. a structure recognized as socially useful. Observing the structural, cognitive and communicative aspects of the documentary message, and drawing on Documentary Linguistics, Terminology and Textual Linguistics, the article analyzes the knowledge management and innovation policy of the Government of the State of Sao Paulo, which authorizes the use of Web 2.0, and also questions to what extent this initiative represents innovation in the library environment.
Abstract:
This article is published online with Open Access and distributed under the terms of the Creative Commons Attribution Non-Commercial License.
Abstract:
This paper summarizes a project that is contributing to a change in the way Mathematics is taught and learned. Mathematics is a subject of the Accounting and Administration course. In this subject we teach Functions and Algebra. The aim is that students understand the basic concepts and are able to apply them to other topics, where possible establishing a bridge between the topics they have studied and their application in Accounting. As of this year, the Accounting course falls under the Bologna Process. The roles of teacher and student have changed. The time for theoretical and practical classes has been reduced, so it was necessary to modify the way of teaching and learning. In the theoretical classes we use multimedia projection systems to present the concepts, and in the practical classes we solve exercises. We also use Excel and the open source mathematical software wxMaxima. To supplement our theoretical and practical classes we have developed a project called MatActiva, based on the Moodle platform offered by PAOL - Projecto de Apoio Online (Online Support Project). With the creation of this new project we wanted to build on the results already obtained in previous experiences, giving students opportunities to complement their study of Mathematics. One of the main objectives is to motivate students and encourage them to overcome their difficulties through self-study, giving them more confidence. In the MatActiva project the students have a large collection of information about how the subject works, which includes the objectives, the program, the recommended bibliography, the evaluation method and summaries. It serves as support material for the practical and theoretical classes: the slides of the theoretical classes are available, as are the sheets with exercises for the students to do in the classroom, complementary exercises, and the exams of previous years.
Students can also do diagnostic tests and evaluation tests online. Our approach is a reflexive one, based on the professional experience of the teachers that explore and incorporate new tools of Moodle with their students and coordinate the project MatActiva.
Abstract:
Many of the most common human functions, such as temporal and non-monotonic reasoning, have not yet been fully mapped in developed systems, even though some theoretical breakthroughs have already been accomplished. This is mainly due to the inherent computational complexity of the theoretical approaches. In the particular area of fault diagnosis in power systems, however, some systems that tried to solve the problem have been deployed using methodologies such as production rule based expert systems, neural networks, recognition of chronicles, fuzzy expert systems, etc. SPARSE (from the Portuguese acronym, which means expert system for incident analysis and restoration support) was one of the developed systems and, in the course of its development, came the need to cope with incomplete and/or incorrect information as well as the traditional problems of power systems fault diagnosis based on SCADA (supervisory control and data acquisition) information retrieval, namely real-time operation, huge amounts of information, etc. This paper presents an architecture for a decision support system that can solve the presented problems, using a symbiosis of the event calculus and the default reasoning rule based system paradigms, ensuring soft real-time operation with the ability to handle incomplete, incorrect or domain-incoherent information. A prototype implementation of this system is already at work in the control centre of the Portuguese Transmission Network.
Abstract:
Master's degree in Electrical and Computer Engineering
Abstract:
Because the information processed in a corporate computer network is nowadays increasingly confidential, it must be protected as well as possible. At the same time, this information must be available promptly to the right partners in an increasingly globalised world. This work aims to study and implement security in a small, generic test network that can easily be extrapolated to a network the size of a large company with potential branches in several locations. The aim is to implement and monitor security both externally (Internet service provider, ISP) and internally (network assets, workstations/users). This analysis is based on location (local, wireless or remote); whenever an anomaly is detected, its location is identified and protective actions are taken automatically. These anomalies can be managed using open source or commercial tools that collect all the necessary information and take corrective or alerting actions depending on the type of anomaly.
Abstract:
Master's degree in Electrical and Computer Engineering