939 results for Open Information Extraction


Abstract:

This dissertation proposes a system capable of bridging the gap between legislative documents in PDF format and legislative documents in open formats. The main goal is to map the knowledge present in these documents so as to represent the collection as linked information. The system comprises several components responsible for executing three proposed phases: data extraction, knowledge organization, and information access. The first phase proposes an approach to extracting structure, text and entities from PDF documents so as to obtain the desired information according to user-defined parameters. This approach uses two different extraction methods, corresponding to the two phases of document processing: document analysis and document understanding. The criterion used to group text objects is the font used in those objects, as defined in the PDF source (the Content Stream). The approach is divided into three parts: document analysis, document understanding and conjunction. The first part handles the extraction of text segments, adopting a geometric approach; its result is a list of the document's text lines. The second part groups the text objects according to the stipulated criterion, producing an XML document with the result of that extraction. The third and final part joins the results of the two previous parts and applies structural and logical rules to obtain the final XML document. The second phase proposes an ontology in the legal domain capable of organizing the information extracted in the first phase; it is also responsible for indexing the text of the documents. The proposed ontology has three characteristics: it is small, interoperable and shareable. The first characteristic reflects the fact that the ontology does not focus on a detailed description of the concepts involved, proposing instead a more abstract description of the entities present; the second stems from the need for interoperability with other ontologies in the legal domain, as well as with the standard ontologies in general use; the third is defined so that knowledge expressed according to the proposed ontology is independent of factors such as country, language or jurisdiction. The third phase answers the question of access to and reuse of the knowledge by users external to the system, through the development of a Web Service. This component provides access to the information through a set of resources made available to external actors, and follows the REST architecture. An Android mobile application was also developed to provide visualizations of the requested information. The end result is a system capable of transforming collections of PDF documents into collections in open formats, enabling access and reuse by other users. The system directly addresses the concerns of the open data community and of governments, which hold many collections of this kind over which there is currently no capacity to reason about the information they contain and to turn it into data that citizens and professionals can visualize and use.
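The font-based grouping criterion above can be pictured with a short sketch. This is not the dissertation's implementation (its toolkit is not named); it is a minimal approximation using the PyMuPDF library, assuming a hypothetical sample.pdf, that groups text spans by the font declared in the PDF content stream:

```python
# Minimal sketch: group PDF text spans by font name and size,
# approximating the font-based grouping criterion described above.
# Assumes PyMuPDF (fitz) and a hypothetical local file "sample.pdf".
from collections import defaultdict

import fitz  # PyMuPDF

doc = fitz.open("sample.pdf")
groups = defaultdict(list)  # (font, size) -> list of span texts

for page in doc:
    for block in page.get_text("dict")["blocks"]:
        for line in block.get("lines", []):  # image blocks have no "lines"
            for span in line["spans"]:
                groups[(span["font"], round(span["size"]))].append(span["text"])

for (font, size), texts in groups.items():
    print(f"{font} @ {size}pt: {len(texts)} spans")
```

Spans sharing a font and size typically play the same structural role (heading, body, note), which is what makes this criterion useful for reconstructing document structure.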

Abstract:

Driven by the growth of the internet and the semantic web, together with improvements in communication speed and the rapid growth of storage capacity, the volume of data and information rises considerably every day. As a result, recent years have seen growing interest in structures for formal representation with suitable characteristics, such as the ability to organize data and information and to reuse their contents for the generation of new knowledge. Controlled vocabularies, and specifically ontologies, stand out as representation structures with high potential: they allow not only the representation of data but also its reuse for knowledge extraction, coupled with subsequent storage through relatively simple formalisms. However, to ensure that the knowledge in an ontology is always up to date, ontologies require maintenance. Ontology Learning is the area that studies the update and maintenance of ontologies. The relevant literature already presents first results on automatic ontology maintenance, but these are still at a very early stage; updating and maintaining an ontology remains a largely human-driven, and therefore cumbersome, task. The generation of new knowledge for ontology growth can build on Data Mining, the area that studies techniques for data processing, pattern discovery and knowledge extraction in IT systems. This work proposes a novel semi-automatic method for knowledge extraction from unstructured data sources using Data Mining techniques, namely pattern discovery, focused on improving the precision of the concepts and semantic relations present in an ontology. To verify the applicability of the proposed method, a proof of concept was developed and its results are presented, applied in the building and construction sector.
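The abstract does not detail its pattern-discovery step. As a stand-in, the sketch below uses Hearst-style lexico-syntactic patterns, a classic pattern-based technique for proposing candidate concept relations from unstructured text; it illustrates the general idea, not the dissertation's actual method:

```python
# Minimal sketch: mine candidate is-a relations from raw text with a
# Hearst-style pattern ("X such as Y"). Illustrative only; the
# dissertation's pattern-discovery method may differ.
import re

PATTERN = re.compile(
    r"(\w[\w ]*?)\s*,?\s*such as\s+(\w[\w ]*?)(?:,|\.| and )", re.IGNORECASE
)

def candidate_relations(text: str):
    """Yield (hyponym, hypernym) pairs suggested by the pattern."""
    for match in PATTERN.finditer(text):
        hypernym, hyponym = match.group(1).strip(), match.group(2).strip()
        yield hyponym, hypernym

sample = "Building materials such as concrete, steel and timber were classified."
for hypo, hyper in candidate_relations(sample):
    print(f"{hypo} is-a {hyper}")  # e.g. "concrete is-a Building materials"
```

Candidate pairs found this way would then be filtered before being added to the ontology, which is where a precision-focused method such as the one proposed comes in.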

Abstract:

According to the Bethesda Statement on Open Access and the recommendations of BOAI10, libraries and librarians have an important role to fulfil in encouraging open access. Taking into account the Competencies for Information Professionals of the 21st Century, developed by the Special Libraries Association, and the Librarians' Competencies Profile for Scholarly Publishing and Open Access, we identify the competencies and new areas of knowledge and expertise involved in the development and upkeep of our institutional repository (Repositorio SSPA).

Abstract:

Monetary policy is conducted in an environment of uncertainty. This paper sets up a model where the central bank uses real-time data from the bond market together with standard macroeconomic indicators to estimate the current state of the economy more efficiently, while taking into account that its own actions influence what it observes. The timeliness of bond market data allows for quicker responses of monetary policy to disturbances compared to the case when the central bank has to rely solely on collected aggregate data. The information content of the term structure creates a link between the bond market and the macroeconomy that is novel to the literature. To quantify the importance of the bond market as a source of information, the model is estimated on data for the United States and Australia using Bayesian methods. The empirical exercise suggests that there is some information in the US term structure that helps the Federal Reserve to identify shocks to the economy on a timely basis. Australian bond prices seem to be less informative than their US counterparts, perhaps because Australia is a relatively small and open economy.
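The signal-extraction problem described above is commonly handled with a Kalman filter, in which a timely bond-yield observation updates the estimate of an unobserved state between releases of aggregate data. The following scalar sketch illustrates that mechanism; it is not the paper's model, and all parameter values are hypothetical:

```python
# Minimal sketch: scalar Kalman filter, illustrating how a timely
# bond-yield observation refines the estimate of an unobserved state.
# Parameters are hypothetical, not taken from the paper.
import numpy as np

a, q = 0.9, 0.04        # state transition and state-noise variance
h, r = 1.0, 0.02        # observation loading and observation-noise variance

x_hat, p = 0.0, 1.0     # prior mean and variance of the state
rng = np.random.default_rng(0)
x_true = 0.5

for t in range(5):
    # Predict: propagate the state estimate one period ahead.
    x_hat, p = a * x_hat, a * a * p + q
    x_true = a * x_true + rng.normal(0.0, np.sqrt(q))
    # Update: a bond-yield observation arrives in real time.
    y = h * x_true + rng.normal(0.0, np.sqrt(r))
    k = p * h / (h * h * p + r)          # Kalman gain
    x_hat, p = x_hat + k * (y - h * x_hat), (1 - k * h) * p
    print(f"t={t}: estimate={x_hat:.3f}, true={x_true:.3f}, variance={p:.3f}")
```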

Abstract:

This paper describes a project led by the Instituto Brasileiro de Informação em Ciência e Tecnologia (Ibict), a government institution, to build a national digital library for electronic theses and dissertations, the Biblioteca Digital de Teses e Dissertações (BDTD). The project has been a collaborative effort among Ibict, universities and other research centers in Brazil. The developers adopted a system architecture based on the Open Archives Initiative (OAI), in which universities and research centers act as data providers and Ibict as a service provider. A Brazilian metadata standard for electronic theses and dissertations was developed for the digital library, and Ibict also developed a toolkit, including an open-source package, for distribution to potential data providers. BDTD has been integrated with the international initiative in this area, the Networked Digital Library of Theses and Dissertations (NDLTD). The paper discusses various issues related to project design, development and management, as well as the role played by Ibict, and concludes by highlighting important lessons learned to date and the challenges ahead in expanding the BDTD project.
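In the OAI architecture, a service provider such as Ibict harvests metadata from data providers over the OAI-PMH protocol. A minimal harvesting sketch follows; the endpoint URL is hypothetical, while the verb and metadataPrefix parameters follow the OAI-PMH specification:

```python
# Minimal sketch: harvest Dublin Core records from an OAI-PMH data
# provider. The base URL is hypothetical; verbs and parameters follow
# the OAI-PMH specification.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BASE_URL = "https://repository.example.edu/oai"  # hypothetical endpoint
params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}

with urllib.request.urlopen(f"{BASE_URL}?{urllib.parse.urlencode(params)}") as resp:
    tree = ET.parse(resp)

ns = {"oai": "http://www.openarchives.org/OAI/2.0/",
      "dc": "http://purl.org/dc/elements/1.1/"}
for record in tree.iter(f"{{{ns['oai']}}}record"):
    for title in record.iter(f"{{{ns['dc']}}}title"):
        print(title.text)
```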

Abstract:

The article outlines free online legal resources for conducting research on Catalan and Spanish legislation and case law. Most of these resources are primary sources made public by government bodies. The list shows how the Spanish and Catalan governments, in their effort to promote equal access to legislation and case law, cover the different jurisdictions. The text also mentions resources for historical legal research on legislation and case law, as well as some free private legal websites.

Abstract:

Background: Information about the composition of regulatory regions is of great value for designing experiments to functionally characterize gene expression. The multiplicity of available applications for predicting transcription factor binding sites in a particular locus contrasts with the substantial computational expertise demanded to manipulate them, which may constitute a barrier for the experimental community. Results: CBS (Conserved regulatory Binding Sites, http://compfly.bio.ub.es/CBS) is a public platform of evolutionarily conserved binding sites and enhancers predicted in multiple Drosophila genomes, furnished with published chromatin signatures associated with transcriptionally active regions and other experimental sources of information. Rapid access to this body of knowledge through a user-friendly web interface enables non-expert users to identify the binding sequences available for any particular gene, transcription factor, or genome region. Conclusions: The CBS platform is a powerful resource providing tools for mining individual sequences and groups of co-expressed genes together with epigenomics information to conduct regulatory screenings in Drosophila.

Abstract:

In this study we use market settlement prices of European call options on stock index futures to extract the implied probability density function (PDF). The method produces a PDF of the returns of the underlying asset at the expiration date from the implied volatility smile, which allows the assumption of lognormally distributed returns (the Black-Scholes model) to be tested. The market's view of the asset price dynamics can then be used for various purposes (hedging, speculation). We use the smoothing approach to implied PDF extraction presented by Shimko (1993): implied volatility smiles are obtained from index futures markets (the S&P 500 and DAX indices) and standardized, and the method introduced by Breeden and Litzenberger (1978) is then applied to extract the PDF. The results show significant deviations from the assumption of lognormal returns for S&P 500 options, while DAX options mostly fit the lognormal distribution. A subjective view of the PDF that deviates from the market's can be used to form a trading strategy, as discussed in the last section.
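The Breeden and Litzenberger (1978) result recovers the risk-neutral density as the discounted second derivative of the call price with respect to the strike, f(K) = e^{rT} ∂²C/∂K². A minimal numerical sketch, using synthetic Black-Scholes call prices rather than the paper's market data:

```python
# Minimal sketch: Breeden-Litzenberger extraction of the risk-neutral
# density f(K) = exp(r*T) * d2C/dK2 via finite differences on a strike
# grid. Prices are synthetic Black-Scholes values, not market data.
import numpy as np
from scipy.stats import norm

S, r, sigma, T = 100.0, 0.02, 0.2, 0.5   # hypothetical market parameters

def bs_call(K):
    d1 = (np.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * np.sqrt(T))
    d2 = d1 - sigma * np.sqrt(T)
    return S * norm.cdf(d1) - K * np.exp(-r * T) * norm.cdf(d2)

K = np.linspace(60.0, 160.0, 201)
C = bs_call(K)
dK = K[1] - K[0]

# Central second difference approximates d2C/dK2 on the interior grid.
density = np.exp(r * T) * (C[2:] - 2 * C[1:-1] + C[:-2]) / dK**2

print("density integrates to ~", np.trapz(density, K[1:-1]).round(3))
```

With market prices instead of model prices, the smile is first smoothed (as in Shimko's approach) so that the second derivative is stable.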

Abstract:

Taking the realist view that law is one form of politics, this dissertation studies the roles of citizens and organizations in mobilizing the law to request that government agencies disclose environmental information in China; how, in this process, the socio-legal field interacts with the political-legal sphere; and what changes their interactions have brought about. The work takes a socio-legal approach and applies the methodologies of social science and legal analysis. It aims to understand the paradox of why and how citizens and entities invoke the law to access environmental information despite the various obstacles that exist and the still-low effectiveness of the new mechanism of environmental information disclosure. The study is largely based on 28 cases and eight surveys of environmental information disclosure requests collected by the author. The cases and surveys analysed all occurred between May 2008, when the OGI Regulations and the OEI Measures came into effect, and August 2012, when the case collection was completed. The findings show that by invoking rules of law made by the authorities to demand that government agencies disclose environmental information, the public, including citizens, organizations, law firms and the media, has strategically created repercussive pressure on the authorities to act according to the law. While the mechanism of open government information in China was established top-down, it is the bottom-up activism of the public that makes it work. Citizens' and organizations' use of legal tactics to push government agencies to disclose environmental information has served not only as an end in itself, accessing the information, but more as a means of holding government agencies accountable to their legal obligations. Law has thus played a pivotal role in enabling citizen participation in the political process. Given that political campaigns, or politicization, from general elections to collective and especially contentious actions, are still restrained or even repressed by the government in China, legal mobilization, or judicialization, whereby citizens and organizations use legal tactics to demand their rights and push government agencies to enforce the law, has become a de facto alternative form of political participation. In this process, legal actions have helped to strengthen civil society, make government agencies act according to law, push back the political boundaries, and induce changes in the relationship between the state and the public. In the field of environmental information disclosure, citizens and organizations have formed a bottom-up social activism, limited in scope but conducted in the language of law, creating progressive social, legal and political changes. The study emphasizes that it is partial and incomplete to understand China's transition only through top-down policy-making and government administration; it is also important to observe it from the bottom-up perspective, in which, on a realist view, law can be part of politics and legal mobilization, even when utterly apolitical, can help to achieve political aims as well.
This study of legal mobilization in the field of environmental information disclosure also helps us to better understand the function of law: law is not only a tool for the authorities to regulate and control, but inevitably also a weapon for the public to demand that government agencies fulfil the obligations stipulated by the laws the authorities themselves have issued.

Abstract:

Biomedical natural language processing (BioNLP) is a subfield of natural language processing, an area of computational linguistics concerned with developing programs that work with natural language: written texts and speech. Biomedical relation extraction concerns the detection of semantic relations such as protein-protein interactions (PPI) from scientific texts. The aim is to enhance information retrieval by detecting relations between concepts, not just individual concepts as with a keyword search. In recent years, events have been proposed as a more detailed alternative to simple pairwise PPI relations. Events provide a systematic, structural representation for annotating the content of natural language texts. They are characterized by annotated trigger words, directed and typed arguments, and the ability to nest other events. For example, the sentence "Protein A causes protein B to bind protein C" can be annotated with the nested event structure CAUSE(A, BIND(B, C)). Converted to such formal representations, the information in natural language texts can be used by computational applications. Biomedical event annotations were introduced by the BioInfer and GENIA corpora, and event extraction was popularized by the BioNLP'09 Shared Task on Event Extraction. In this thesis we present a method for automated event extraction, implemented as the Turku Event Extraction System (TEES). A unified graph format is defined for representing event annotations, and the problem of extracting complex event structures is decomposed into a number of independent classification tasks. These classification tasks are solved using SVM and RLS classifiers, utilizing rich feature representations built from full dependency parsing. Building on earlier work on pairwise relation extraction and using a generalized graph representation, the resulting TEES system is capable of detecting binary relations as well as complex event structures. We show that this event extraction system performs well, achieving first place in the BioNLP'09 Shared Task on Event Extraction. Subsequently, TEES has achieved several first ranks in the BioNLP'11 and BioNLP'13 Shared Tasks, and has shown competitive performance in the binary-relation Drug-Drug Interaction Extraction 2011 and 2013 shared tasks. The Turku Event Extraction System is published as a freely available open-source project, documenting the research in detail as well as making the method available for practical applications. In particular, this thesis describes the application of the event extraction method to PubMed-scale text mining, showing that the developed approach not only performs well but is generalizable and applicable to large-scale real-world text mining projects. Finally, we discuss related literature, summarize the contributions of the work and present some thoughts on future directions for biomedical event extraction. This thesis includes and builds on six original research publications. The first introduces the analysis of dependency parses that led to the development of TEES. The entries in the three BioNLP Shared Tasks, as well as in the DDIExtraction 2011 task, are covered in four publications, and the sixth demonstrates the application of the system to PubMed-scale text mining.
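The TEES graph format itself is not reproduced here, but the nested event from the example sentence can be pictured as a typed, directed graph. In the sketch below the node and edge labels are illustrative, not the actual TEES annotation scheme:

```python
# Minimal sketch: the nested event CAUSE(A, BIND(B, C)) as a typed,
# directed graph. Node/edge type names are illustrative, not the
# actual TEES or BioNLP Shared Task annotation scheme.
from dataclasses import dataclass, field

@dataclass
class Node:
    ident: str
    kind: str            # "Protein" or an event trigger type
    args: list = field(default_factory=list)  # (role, Node) pairs

a = Node("T1", "Protein")
b = Node("T2", "Protein")
c = Node("T3", "Protein")

bind = Node("E1", "Binding", args=[("Theme", b), ("Theme2", c)])
cause = Node("E2", "Cause", args=[("Cause", a), ("Theme", bind)])  # nested event

def render(node: Node) -> str:
    """Flatten a (possibly nested) event into functional notation."""
    if not node.args:
        return f"{node.kind}:{node.ident}"
    inner = ", ".join(f"{role}={render(n)}" for role, n in node.args)
    return f"{node.kind}({inner})"

print(render(cause))
# Cause(Cause=Protein:T1, Theme=Binding(Theme=Protein:T2, Theme2=Protein:T3))
```

Decomposing extraction into trigger detection, edge detection and event construction, as TEES does, corresponds to predicting the nodes and typed edges of such a graph independently.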

Abstract:

The goal of this work is to develop an Open Agent Architecture for multilingual information retrieval from relational databases. Queries can be given in plain Hindi or Malayalam, two prominent regional languages of India. The system supports distributed processing of user requests through collaborating agents. Natural language processing techniques are used to extract the meaning of the plain-language query, and the retrieved information is returned to the user in his/her native language. The system architecture is designed in a structured way so that it can be adapted to other regional languages of India.
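The abstract does not specify how meaning extraction maps a query to the database. One minimal way to picture it is a lexicon that links native-language tokens to tables and columns before a SQL query is assembled; every token, table and column below is hypothetical:

```python
# Minimal sketch: lexicon-based mapping from native-language query tokens
# to a SQL SELECT. Tokens, tables and columns are all hypothetical; a
# real system would use full NLP rather than keyword lookup.
LEXICON = {
    "വില": ("products", "price"),      # Malayalam "price" (hypothetical entry)
    "कीमत": ("products", "price"),     # Hindi "price" (hypothetical entry)
    "पुस्तक": ("products", None),       # Hindi "book" -> row filter
}

def to_sql(tokens: list[str]) -> str:
    tables, columns, filters = set(), [], []
    for tok in tokens:
        if tok not in LEXICON:
            continue
        table, column = LEXICON[tok]
        tables.add(table)
        if column:
            columns.append(column)
        else:
            filters.append(tok)
    cols = ", ".join(columns) or "*"
    where = f" WHERE name = '{filters[0]}'" if filters else ""
    return f"SELECT {cols} FROM {', '.join(sorted(tables))}{where}"

print(to_sql(["पुस्तक", "कीमत"]))  # SELECT price FROM products WHERE name = 'पुस्तक'
```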

Abstract:

Geography research centres typically produce a large volume of Geographic Information (GI), generated both by funded projects and by individual research initiatives. The Centro de Estudos de Geografia e Planeamento Regional (e-GEO) has been involved in several projects at local, regional, national and international scales. Recently, two issues were the subject of debate. The first was that the spatial information produced by these research projects has not had the expected visibility: most of the time, the GI from these projects was not in a format that researchers, or even the general public and interest groups, could easily search. The second was how to make these results accessible to everyone, everywhere, easily and at minimal cost to the Centre, given the current Portuguese economic context and the interests of e-GEO. Both issues have a single answer: the deployment of a WebGIS on an open-source platform. This paper describes the production of an instrument for disseminating geographic information on the World Wide Web using only free and open-source software. The tool allows all the Centre's researchers to publish their GI, which becomes fully accessible to any end user. Making this kind of information fully accessible should have a considerable impact, shortening the distance between the work of academics and the end user, and we believe it is an excellent way for the public to access and interpret spatial information. In conclusion, the platform should help to close the gap between producers and users of geographic information, allowing interaction among all parties as well as the upload of new data under a set of rules designed for quality control.
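A WebGIS built on free software typically publishes its layers through OGC standards such as WMS. As an illustration (the endpoint and layer name are hypothetical; the query parameters follow the WMS 1.1.1 specification), a rendered map can be fetched as follows:

```python
# Minimal sketch: fetch a rendered map layer from a WebGIS via an OGC
# WMS GetMap request. Endpoint and layer name are hypothetical; the
# parameters follow the WMS 1.1.1 specification.
import urllib.parse
import urllib.request

BASE_URL = "https://webgis.example.org/wms"  # hypothetical endpoint
params = {
    "service": "WMS",
    "version": "1.1.1",
    "request": "GetMap",
    "layers": "egeo:study_areas",            # hypothetical layer
    "bbox": "-9.5,36.9,-6.2,42.2",           # roughly mainland Portugal
    "srs": "EPSG:4326",
    "width": "600",
    "height": "800",
    "format": "image/png",
}

url = f"{BASE_URL}?{urllib.parse.urlencode(params)}"
with urllib.request.urlopen(url) as resp:
    with open("map.png", "wb") as out:
        out.write(resp.read())
```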

Abstract:

Aim: To describe the sequential healing of open extraction sockets at which no attempt was made to obtain primary closure of the coronal access to the alveolus. Material and methods: The third mandibular premolar was extracted bilaterally in 12 monkeys, and no sutures were applied to close the wound. Healing after 4, 10, 20, 30, 90 and 180 days was studied morphometrically. Results: After 4 days of healing, a blood clot, with an infiltrate of inflammatory cells, mainly occupied the extraction sockets. A void was confined to the central zones of the coronal and middle regions, in continuity with the entrance of the alveoli. At 10 days, the alveolus was occupied by a provisional matrix, with new bone formation lining the socket bony walls. At 20 days, the amount of woven bone had increased considerably. At 30 days, the alveolar socket was mainly occupied by mineralized immature bone at different stages of healing. At 90 and 180 days, the amount of mineralized bone had decreased, substituted by trabecular bone and bone marrow. Bundle bone decreased from 95.5% of the whole length of the inner alveolar surface at 4 days to 7.6% at 180 days. Conclusions: Modeling processes start from the lateral and apical walls of the alveolus, leading to closure of the socket with newly formed bone within a month of extraction. Remodeling processes follow the previous stages, resulting in trabecular bone and bone marrow formation and in corticalization of the socket access.

Abstract:

The strategic management of information plays a fundamental role in organizational management, since decision-making depends on it for survival in a highly competitive market. Companies are constantly concerned with information transparency and good practices of corporate governance (CG), which, in turn, govern the relations between the controlling power of the company and its investors. In this context, this article examines the relationship between the disclosure of information by joint-stock companies using XBRL and the open data model adopted by the Brazilian government, a model that boosted the publication of the Information Access Law (Lei de Acesso à Informação, No. 12,527 of 18 November 2011). Information access should be permeated by a mediation policy in order to support the knowledge construction and decision-making of investors. XBRL is the main model for publishing financial information. Using XBRL through the new semantic standards created for Linked Data strengthens information dissemination and creates mechanisms for analysis and cross-referencing of data against the different open databases available on the Internet, adding value to the data and information accessed by civil society.
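XBRL instances are XML documents in which each financial fact is an element tied to a reporting context and unit. A minimal sketch of reading one fact from an instance document (the file name, company taxonomy namespace and concept name are hypothetical; the xbrli namespace is the standard one):

```python
# Minimal sketch: read one financial fact from an XBRL instance document.
# File name, taxonomy namespace and concept name are hypothetical; the
# xbrli instance namespace is the standard one.
import xml.etree.ElementTree as ET

NS = {
    "xbrli": "http://www.xbrl.org/2003/instance",
    "acme": "http://example.com/taxonomy",  # hypothetical company taxonomy
}

tree = ET.parse("report.xbrl")  # hypothetical instance document
for fact in tree.getroot().findall("acme:NetIncome", NS):
    context = fact.get("contextRef")
    unit = fact.get("unitRef")
    print(f"NetIncome = {fact.text} (context={context}, unit={unit})")
```

Because facts carry machine-readable context and unit references, they can be linked to other open datasets, which is what the Linked Data publication discussed above builds on.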