924 resultados para XML, Information, Retrieval, Query, Language
Resumo:
A evolução tecnológica tem provocado uma evolução na medicina, através de sistemas computacionais voltados para o armazenamento, captura e disponibilização de informações médicas. Os relatórios médicos são, na maior parte das vezes, guardados num texto livre não estruturado e escritos com vocabulário proprietário, podendo ocasionar falhas de interpretação. Através das linguagens da Web Semântica, é possível utilizar antologias como modo de estruturar e padronizar a informação dos relatórios médicos, adicionando¬ lhe anotações semânticas. A informação contida nos relatórios pode desta forma ser publicada na Web, permitindo às máquinas o processamento automático da informação. No entanto, o processo de criação de antologias é bastante complexo, pois existe o problema de criar uma ontologia que não cubra todo o domínio pretendido. Este trabalho incide na criação de uma ontologia e respectiva povoação, através de técnicas de PLN e Aprendizagem Automática que permitem extrair a informação dos relatórios médicos. Foi desenvolvida uma aplicação, que permite ao utilizador converter relatórios do formato digital para o formato OWL. ABSTRACT: Technological evolution has caused a medicine evolution through computer systems which allow storage, gathering and availability of medical information. Medical reports are, most of the times, stored in a non-structured free text and written in a personal way so that misunderstandings may occur. Through Semantic Web languages, it’s possible to use ontology as a way to structure and standardize medical reports information by adding semantic notes. The information in those reports can, by these means, be displayed on the web, allowing machines automatic information processing. However, the process of creating ontology is very complex, as there is a risk creating of an ontology that not covering the whole desired domain. This work is about creation of an ontology and its population through NLP and Machine Learning techniques to extract information from medical reports. An application was developed which allows the user to convert reports from digital for¬ mat to OWL format.
Resumo:
This paper describes our semi-automatic keyword based approach for the four topics of Information Extraction from Microblogs Posted during Disasters task at Forum for Information Retrieval Evaluation (FIRE) 2016. The approach consists three phases.
Resumo:
Rappresentazione della conoscenza in banca di dati testuali non strutturati in lingua Italiana.
Resumo:
This thesis develops AI methods as a contribution to computational musicology, an interdisciplinary field that studies music with computers. In systematic musicology a composition is defined as the combination of harmony, melody and rhythm. According to de La Borde, harmony alone "merits the name of composition". This thesis focuses on analysing the harmony from a computational perspective. We concentrate on symbolic music representation and address the problem of formally representing chord progressions in western music compositions. Informally, chords are sets of pitches played simultaneously, and chord progressions constitute the harmony of a composition. Our approach combines ML techniques with knowledge-based techniques. We design and implement the Modal Harmony ontology (MHO), using OWL. It formalises one of the most important theories in western music: the Modal Harmony Theory. We propose and experiment with different types of embedding methods to encode chords, inspired by NLP and adapted to the music domain, using both statistical (extensional) knowledge by relying on a huge dataset of chord annotations (ChoCo), intensional knowledge by relying on MHO and a combination of the two. The methods are evaluated on two musicologically relevant tasks: chord classification and music structure segmentation. The former is verified by comparing the results of the Odd One Out algorithm to the classification obtained with MHO. Good performances (accuracy: 0.86) are achieved. We feed a RNN for the latter, using our embeddings. Results show that the best performance (F1: 0.6) is achieved with embeddings that combine both approaches. Our method outpeforms the state of the art (F1 = 0.42) for symbolic music structure segmentation. It is worth noticing that embeddings based only on MHO almost equal the best performance (F1 = 0.58). We remark that those embeddings only require the ontology as an input as opposed to other approaches that rely on large datasets.
Resumo:
Relevant past events can be remembered when visualizing related pictures. The main difficulty is how to find these photos in a large personal collection. Query definition and image annotation are key issues to overcome this problem. The former is relevant due to the diversity of the clues provided by our memory when recovering a past moment and the later because images need to be annotated with information regarding those clues to be retrieved. Consequently, tools to recover past memories should deal carefully with these two tasks. This paper describes a user interface designed to explore pictures from personal memories. Users can query the media collection in several ways and for this reason an iconic visual language to define queries is proposed. Automatic and semi-automatic annotation is also performed using the image content and the audio information obtained when users show their images to others. The paper also presents the user interface evaluation based on tests with 58 participants.
Resumo:
Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies.
Resumo:
The TCABR data analysis and acquisition system has been upgraded to support a joint research programme using remote participation technologies. The architecture of the new system uses Java language as programming environment. Since application parameters and hardware in a joint experiment are complex with a large variability of components, requirements and specification solutions need to be flexible and modular, independent from operating system and computer architecture. To describe and organize the information on all the components and the connections among them, systems are developed using the extensible Markup Language (XML) technology. The communication between clients and servers uses remote procedure call (RPC) based on the XML (RPC-XML technology). The integration among Java language, XML and RPC-XML technologies allows to develop easily a standard data and communication access layer between users and laboratories using common software libraries and Web application. The libraries allow data retrieval using the same methods for all user laboratories in the joint collaboration, and the Web application allows a simple graphical user interface (GUI) access. The TCABR tokamak team in collaboration with the IPFN (Instituto de Plasmas e Fusao Nuclear, Instituto Superior Tecnico, Universidade Tecnica de Lisboa) is implementing this remote participation technologies. The first version was tested at the Joint Experiment on TCABR (TCABRJE), a Host Laboratory Experiment, organized in cooperation with the IAEA (International Atomic Energy Agency) in the framework of the IAEA Coordinated Research Project (CRP) on ""Joint Research Using Small Tokamaks"". (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
Pode-se afirmar que a evolução tecnológica (desenvolvimento de novos instrumentos de medição como, softwares, satélites e computadores, bem como, o barateamento das mídias de armazenamento) permite às Organizações produzirem e adquirirem grande quantidade de dados em curto espaço de tempo. Devido ao volume de dados, Organizações de pesquisa se tornam potencialmente vulneráveis aos impactos da explosão de informações. Uma solução adotada por algumas Organizações é a utilização de ferramentas de sistemas de informação para auxiliar na documentação, recuperação e análise dos dados. No âmbito científico, essas ferramentas são desenvolvidas para armazenar diferentes padrões de metadados (dados sobre dados). Durante o processo de desenvolvimento destas ferramentas, destaca-se a adoção de padrões como a Linguagem Unificada de Modelagem (UML, do Inglês Unified Modeling Language), cujos diagramas auxiliam na modelagem de diferentes aspectos do software. O objetivo deste estudo é apresentar uma ferramenta de sistemas de informação para auxiliar na documentação dos dados das Organizações por meio de metadados e destacar o processo de modelagem de software, por meio da UML. Será abordado o Padrão de Metadados Digitais Geoespaciais, amplamente utilizado na catalogação de dados por Organizações científicas de todo mundo, e os diagramas dinâmicos e estáticos da UML como casos de uso, sequências e classes. O desenvolvimento das ferramentas de sistemas de informação pode ser uma forma de promover a organização e a divulgação de dados científicos. No entanto, o processo de modelagem requer especial atenção para o desenvolvimento de interfaces que estimularão o uso das ferramentas de sistemas de informação.
Resumo:
The cooperation and the sharing of information cataloguing and bibliographical in environment automated, this was only possible with the creation and adoption of interchange format MARC21. But due to the progresses of the technologies of information and communication, of the crescent use of Internet and of the databases and databanks, there were the need of the creation and development of tools that optimize the organization activities, retrieval and interchange of information. XML is one of those developments that have as purpose to facilitate the management, storage and transmission of data through Internet. Before that, it was proposed through a literature revision, to analyze Interchange Format MARC21 and Markup Language XML as tools for the consolidation of the Automated Cooperative Cataloguing, your differences of storage flexibilities, organization, retrieval and interchange of data through Internet. This research made possible the divulgation to the community librarian, through a literature revision, that has been discussed internationally on MARC21 and XML
Resumo:
Query expansion (QE) is a potentially useful technique to help searchers formulate improved query statements, and ultimately retrieve better search results. The objective of our query expansion technique is to find a suitable additional term. Two query expansion methods are applied in sequence to reformulate the query. Experiments on test collections show that the retrieval effectiveness is considerably higher when the query expansion technique is applied.
Resumo:
Over time, XML markup language has acquired a considerable importance in applications development, standards definition and in the representation of large volumes of data, such as databases. Today, processing XML documents in a short period of time is a critical activity in a large range of applications, which imposes choosing the most appropriate mechanism to parse XML documents quickly and efficiently. When using a programming language for XML processing, such as Java, it becomes necessary to use effective mechanisms, e.g. APIs, which allow reading and processing of large documents in appropriated manners. This paper presents a performance study of the main existing Java APIs that deal with XML documents, in order to identify the most suitable one for processing large XML files
Resumo:
MSc. Dissertation presented at Faculdade de Ciências e Tecnologia of Universidade Nova de Lisboa to obtain the Master degree in Electrical and Computer Engineering
Resumo:
A Work Project, presented as part of the requirements for the Award of a Masters Degree in Management from the NOVA – School of Business and Economics