25 resultados para semantic retrieval
Resumo:
The extraction of relevant terms from texts is an extensively researched task in Text- Mining. Relevant terms have been applied in areas such as Information Retrieval or document clustering and classification. However, relevance has a rather fuzzy nature since the classification of some terms as relevant or not relevant is not consensual. For instance, while words such as "president" and "republic" are generally considered relevant by human evaluators, and words like "the" and "or" are not, terms such as "read" and "finish" gather no consensus about their semantic and informativeness. Concepts, on the other hand, have a less fuzzy nature. Therefore, instead of deciding on the relevance of a term during the extraction phase, as most extractors do, I propose to first extract, from texts, what I have called generic concepts (all concepts) and postpone the decision about relevance for downstream applications, accordingly to their needs. For instance, a keyword extractor may assume that the most relevant keywords are the most frequent concepts on the documents. Moreover, most statistical extractors are incapable of extracting single-word and multi-word expressions using the same methodology. These factors led to the development of the ConceptExtractor, a statistical and language-independent methodology which is explained in Part I of this thesis. In Part II, I will show that the automatic extraction of concepts has great applicability. For instance, for the extraction of keywords from documents, using the Tf-Idf metric only on concepts yields better results than using Tf-Idf without concepts, specially for multi-words. In addition, since concepts can be semantically related to other concepts, this allows us to build implicit document descriptors. These applications led to published work. Finally, I will present some work that, although not published yet, is briefly discussed in this document.
Resumo:
Ontologies formalized by means of Description Logics (DLs) and rules in the form of Logic Programs (LPs) are two prominent formalisms in the field of Knowledge Representation and Reasoning. While DLs adhere to the OpenWorld Assumption and are suited for taxonomic reasoning, LPs implement reasoning under the Closed World Assumption, so that default knowledge can be expressed. However, for many applications it is useful to have a means that allows reasoning over an open domain and expressing rules with exceptions at the same time. Hybrid MKNF knowledge bases make such a means available by formalizing DLs and LPs in a common logic, the Logic of Minimal Knowledge and Negation as Failure (MKNF). Since rules and ontologies are used in open environments such as the Semantic Web, inconsistencies cannot always be avoided. This poses a problem due to the Principle of Explosion, which holds in classical logics. Paraconsistent Logics offer a solution to this issue by assigning meaningful models even to contradictory sets of formulas. Consequently, paraconsistent semantics for DLs and LPs have been investigated intensively. Our goal is to apply the paraconsistent approach to the combination of DLs and LPs in hybrid MKNF knowledge bases. In this thesis, a new six-valued semantics for hybrid MKNF knowledge bases is introduced, extending the three-valued approach by Knorr et al., which is based on the wellfounded semantics for logic programs. Additionally, a procedural way of computing paraconsistent well-founded models for hybrid MKNF knowledge bases by means of an alternating fixpoint construction is presented and it is proven that the algorithm is sound and complete w.r.t. the model-theoretic characterization of the semantics. Moreover, it is shown that the new semantics is faithful w.r.t. well-studied paraconsistent semantics for DLs and LPs, respectively, and maintains the efficiency of the approach it extends.
Resumo:
Instituto Politécnico de Lisboa (IPL) e Instituto Superior de Engenharia de Lisboa (ISEL)apoio concedido pela bolsa SPRH/PROTEC/67580/2010, que apoiou parcialmente este trabalho
Resumo:
Nowadays, the consumption of goods and services on the Internet are increasing in a constant motion. Small and Medium Enterprises (SMEs) mostly from the traditional industry sectors are usually make business in weak and fragile market sectors, where customized products and services prevail. To survive and compete in the actual markets they have to readjust their business strategies by creating new manufacturing processes and establishing new business networks through new technological approaches. In order to compete with big enterprises, these partnerships aim the sharing of resources, knowledge and strategies to boost the sector’s business consolidation through the creation of dynamic manufacturing networks. To facilitate such demand, it is proposed the development of a centralized information system, which allows enterprises to select and create dynamic manufacturing networks that would have the capability to monitor all the manufacturing process, including the assembly, packaging and distribution phases. Even the networking partners that come from the same area have multi and heterogeneous representations of the same knowledge, denoting their own view of the domain. Thus, different conceptual, semantic, and consequently, diverse lexically knowledge representations may occur in the network, causing non-transparent sharing of information and interoperability inconsistencies. The creation of a framework supported by a tool that in a flexible way would enable the identification, classification and resolution of such semantic heterogeneities is required. This tool will support the network in the semantic mapping establishments, to facilitate the various enterprises information systems integration.
Resumo:
Currently the world swiftly adapts to visual communication. Online services like YouTube and Vine show that video is no longer the domain of broadcast television only. Video is used for different purposes like entertainment, information, education or communication. The rapid growth of today’s video archives with sparsely available editorial data creates a big problem of its retrieval. The humans see a video like a complex interplay of cognitive concepts. As a result there is a need to build a bridge between numeric values and semantic concepts. This establishes a connection that will facilitate videos’ retrieval by humans. The critical aspect of this bridge is video annotation. The process could be done manually or automatically. Manual annotation is very tedious, subjective and expensive. Therefore automatic annotation is being actively studied. In this thesis we focus on the multimedia content automatic annotation. Namely the use of analysis techniques for information retrieval allowing to automatically extract metadata from video in a videomail system. Furthermore the identification of text, people, actions, spaces, objects, including animals and plants. Hence it will be possible to align multimedia content with the text presented in the email message and the creation of applications for semantic video database indexing and retrieving.
Resumo:
According to a recent Eurobarometer survey (2014), 68% of Europeans tend not to trust national governments. As the increasing alienation of citizens from politics endangers democracy and welfare, governments, practitioners and researchers look for innovative means to engage citizens in policy matters. One of the measures intended to overcome the so-called democratic deficit is the promotion of civic participation. Digital media proliferation offers a set of novel characteristics related to interactivity, ubiquitous connectivity, social networking and inclusiveness that enable new forms of societal-wide collaboration with a potential impact on leveraging participative democracy. Following this trend, e-Participation is an emerging research area that consists in the use of Information and Communication Technologies to mediate and transform the relations among citizens and governments towards increasing citizens’ participation in public decision-making. However, despite the widespread efforts to implement e-Participation through research programs, new technologies and projects, exhaustive studies on the achieved outcomes reveal that it has not yet been successfully incorporated in institutional politics. Given the problems underlying e-Participation implementation, the present research suggested that, rather than project-oriented efforts, the cornerstone for successfully implementing e-Participation in public institutions as a sustainable added-value activity is a systematic organisational planning, embodying the principles of open-governance and open-engagement. It further suggested that BPM, as a management discipline, can act as a catalyst to enable the desired transformations towards value creation throughout the policy-making cycle, including political, organisational and, ultimately, citizen value. Following these findings, the primary objective of this research was to provide an instrumental model to foster e-Participation sustainability across Government and Public Administration towards a participatory, inclusive, collaborative and deliberative democracy. The developed artefact, consisting in an e-Participation Organisational Semantic Model (ePOSM) underpinned by a BPM-steered approach, introduces this vision. This approach to e-Participation was modelled through a semi-formal lightweight ontology stack structured in four sub-ontologies, namely e-Participation Strategy, Organisational Units, Functions and Roles. The ePOSM facilitates e-Participation sustainability by: (1) Promoting a common and cross-functional understanding of the concepts underlying e-Participation implementation and of their articulation that bridges the gap between technical and non-technical users; (2) Providing an organisational model which allows a centralised and consistent roll-out of strategy-driven e-Participation initiatives, supported by operational units dedicated to the execution of transformation projects and participatory processes; (3) Providing a standardised organisational structure, goals, functions and roles related to e-Participation processes that enhances process-level interoperability among government agencies; (4) Providing a representation usable in software development for business processes’ automation, which allows advanced querying using a reasoner or inference engine to retrieve concrete and specific information about the e-Participation processes in place. An evaluation of the achieved outcomes, as well a comparative analysis with existent models, suggested that this innovative approach tackling the organisational planning dimension can constitute a stepping stone to harness e-Participation value.
Resumo:
Based in internet growth, through semantic web, together with communication speed improvement and fast development of storage device sizes, data and information volume rises considerably every day. Because of this, in the last few years there has been a growing interest in structures for formal representation with suitable characteristics, such as the possibility to organize data and information, as well as the reuse of its contents aimed for the generation of new knowledge. Controlled Vocabulary, specifically Ontologies, present themselves in the lead as one of such structures of representation with high potential. Not only allow for data representation, as well as the reuse of such data for knowledge extraction, coupled with its subsequent storage through not so complex formalisms. However, for the purpose of assuring that ontology knowledge is always up to date, they need maintenance. Ontology Learning is an area which studies the details of update and maintenance of ontologies. It is worth noting that relevant literature already presents first results on automatic maintenance of ontologies, but still in a very early stage. Human-based processes are still the current way to update and maintain an ontology, which turns this into a cumbersome task. The generation of new knowledge aimed for ontology growth can be done based in Data Mining techniques, which is an area that studies techniques for data processing, pattern discovery and knowledge extraction in IT systems. This work aims at proposing a novel semi-automatic method for knowledge extraction from unstructured data sources, using Data Mining techniques, namely through pattern discovery, focused in improving the precision of concept and its semantic relations present in an ontology. In order to verify the applicability of the proposed method, a proof of concept was developed, presenting its results, which were applied in building and construction sector.
Resumo:
In the current global and competitive business context, it is essential that enterprises adapt their knowledge resources in order to smoothly interact and collaborate with others. However, due to the existent multiculturalism of people and enterprises, there are different representation views of business processes or products, even inside a same domain. Consequently, one of the main problems found in the interoperability between enterprise systems and applications is related to semantics. The integration and sharing of enterprises knowledge to build a common lexicon, plays an important role to the semantic adaptability of the information systems. The author proposes a framework to support the development of systems to manage dynamic semantic adaptability resolution. It allows different organisations to participate in a common knowledge base building, letting at the same time maintain their own views of the domain, without compromising the integration between them. Thus, systems are able to be aware of new knowledge, and have the capacity to learn from it and to manage its semantic interoperability in a dynamic and adaptable way. The author endorses the vision that in the near future, the semantic adaptability skills of the enterprise systems will be the booster to enterprises collaboration and the appearance of new business opportunities.
Resumo:
Based on the report for the unit “Sociology of New Information Technologies” of the Master on Computer Sciences at FCT/University Nova Lisbon in 2015-16. The responsible of this curricular unit is Prof. António Moniz
Resumo:
Recaí sob a responsabilidade da Marinha Portuguesa a gestão da Zona Económica Exclusiva de Portugal, assegurando a sua segurança da mesma face a atividades criminosas. Para auxiliar a tarefa, é utilizado o sistema Oversee, utilizado para monitorizar a posição de todas as embarcações presentes na área afeta, permitindo a rápida intervenção da Marinha Portuguesa quando e onde necessário. No entanto, o sistema necessita de transmissões periódicas constantes originadas nas embarcações para operar corretamente – casos as transmissões sejam interrompidas, deliberada ou acidentalmente, o sistema deixa de conseguir localizar embarcações, dificultando a intervenção da Marinha. A fim de colmatar esta falha, é proposto adicionar ao sistema Oversee a capacidade de prever as posições futuras de uma embarcação com base no seu trajeto até à cessação das transmissões. Tendo em conta os grandes volumes de dados gerados pelo sistema (históricos de posições), a área de Inteligência Artificial apresenta uma possível solução para este problema. Atendendo às necessidades de resposta rápida do problema abordado, o algoritmo de Geometric Semantic Genetic Programming baseado em referências de Vanneschi et al. apresenta-se como uma possível solução, tendo já produzido bons resultados em problemas semelhantes. O presente trabalho de tese pretende integrar o algoritmo de Geometric Semantic Genetic Programming desenvolvido com o sistema Oversee, a fim de lhe conceder capacidades preditivas. Adicionalmente, será realizado um processo de análise de desempenho a fim de determinar qual a ideal parametrização do algoritmo. Pretende-se com esta tese fornecer à Marinha Portuguesa uma ferramenta capaz de auxiliar o controlo da Zona Económica Exclusiva Portuguesa, permitindo a correta intervenção da Marinha em casos onde o atual sistema não conseguiria determinar a correta posição da embarcação em questão.