943 resultados para SIB Semantic Information Broker OSGI Semantic Web
Resumo:
Humans have a high ability to extract visual data information acquired by sight. Trought a learning process, which starts at birth and continues throughout life, image interpretation becomes almost instinctively. At a glance, one can easily describe a scene with reasonable precision, naming its main components. Usually, this is done by extracting low-level features such as edges, shapes and textures, and associanting them to high level meanings. In this way, a semantic description of the scene is done. An example of this, is the human capacity to recognize and describe other people physical and behavioral characteristics, or biometrics. Soft-biometrics also represents inherent characteristics of human body and behaviour, but do not allow unique person identification. Computer vision area aims to develop methods capable of performing visual interpretation with performance similar to humans. This thesis aims to propose computer vison methods which allows high level information extraction from images in the form of soft biometrics. This problem is approached in two ways, unsupervised and supervised learning methods. The first seeks to group images via an automatic feature extraction learning , using both convolution techniques, evolutionary computing and clustering. In this approach employed images contains faces and people. Second approach employs convolutional neural networks, which have the ability to operate on raw images, learning both feature extraction and classification processes. Here, images are classified according to gender and clothes, divided into upper and lower parts of human body. First approach, when tested with different image datasets obtained an accuracy of approximately 80% for faces and non-faces and 70% for people and non-person. The second tested using images and videos, obtained an accuracy of about 70% for gender, 80% to the upper clothes and 90% to lower clothes. The results of these case studies, show that proposed methods are promising, allowing the realization of automatic high level information image annotation. This opens possibilities for development of applications in diverse areas such as content-based image and video search and automatica video survaillance, reducing human effort in the task of manual annotation and monitoring.
Resumo:
This dissertation research points out major challenging problems with current Knowledge Organization (KO) systems, such as subject gateways or web directories: (1) the current systems use traditional knowledge organization systems based on controlled vocabulary which is not very well suited to web resources, and (2) information is organized by professionals not by users, which means it does not reflect intuitively and instantaneously expressed users’ current needs. In order to explore users’ needs, I examined social tags which are user-generated uncontrolled vocabulary. As investment in professionally-developed subject gateways and web directories diminishes (support for both BUBL and Intute, examined in this study, is being discontinued), understanding characteristics of social tagging becomes even more critical. Several researchers have discussed social tagging behavior and its usefulness for classification or retrieval; however, further research is needed to qualitatively and quantitatively investigate social tagging in order to verify its quality and benefit. This research particularly examined the indexing consistency of social tagging in comparison to professional indexing to examine the quality and efficacy of tagging. The data analysis was divided into three phases: analysis of indexing consistency, analysis of tagging effectiveness, and analysis of tag attributes. Most indexing consistency studies have been conducted with a small number of professional indexers, and they tended to exclude users. Furthermore, the studies mainly have focused on physical library collections. This dissertation research bridged these gaps by (1) extending the scope of resources to various web documents indexed by users and (2) employing the Information Retrieval (IR) Vector Space Model (VSM) - based indexing consistency method since it is suitable for dealing with a large number of indexers. As a second phase, an analysis of tagging effectiveness with tagging exhaustivity and tag specificity was conducted to ameliorate the drawbacks of consistency analysis based on only the quantitative measures of vocabulary matching. Finally, to investigate tagging pattern and behaviors, a content analysis on tag attributes was conducted based on the FRBR model. The findings revealed that there was greater consistency over all subjects among taggers compared to that for two groups of professionals. The analysis of tagging exhaustivity and tag specificity in relation to tagging effectiveness was conducted to ameliorate difficulties associated with limitations in the analysis of indexing consistency based on only the quantitative measures of vocabulary matching. Examination of exhaustivity and specificity of social tags provided insights into particular characteristics of tagging behavior and its variation across subjects. To further investigate the quality of tags, a Latent Semantic Analysis (LSA) was conducted to determine to what extent tags are conceptually related to professionals’ keywords and it was found that tags of higher specificity tended to have a higher semantic relatedness to professionals’ keywords. This leads to the conclusion that the term’s power as a differentiator is related to its semantic relatedness to documents. The findings on tag attributes identified the important bibliographic attributes of tags beyond describing subjects or topics of a document. The findings also showed that tags have essential attributes matching those defined in FRBR. Furthermore, in terms of specific subject areas, the findings originally identified that taggers exhibited different tagging behaviors representing distinctive features and tendencies on web documents characterizing digital heterogeneous media resources. These results have led to the conclusion that there should be an increased awareness of diverse user needs by subject in order to improve metadata in practical applications. This dissertation research is the first necessary step to utilize social tagging in digital information organization by verifying the quality and efficacy of social tagging. This dissertation research combined both quantitative (statistics) and qualitative (content analysis using FRBR) approaches to vocabulary analysis of tags which provided a more complete examination of the quality of tags. Through the detailed analysis of tag properties undertaken in this dissertation, we have a clearer understanding of the extent to which social tagging can be used to replace (and in some cases to improve upon) professional indexing.
Resumo:
La visibilidad de una página Web involucra el proceso de mejora de la posición del sitio en los resultados devueltos por motores de búsqueda como Google. Hay muchas empresas que compiten agresivamente para conseguir la primera posición en los motores de búsqueda más populares. Como regla general, los sitios que aparecen más arriba en los resultados suelen obtener más tráfico a sus páginas, y de esta forma, potencialmente más negocios. En este artículo se describe los principales modelos para enriquecer los resultados de las búsquedas con información tales como fechas o localidades; información de tipo clave-valor que permite al usuario interactuar con el contenido de una página Web directamente desde el sitio de resultados de la búsqueda. El aporte fundamental del artículo es mostrar la utilidad de diferentes formatos de marcado para enriquecer fragmentos de una página Web con el fin de ayudar a las empresas que están planeando implementar métodos de enriquecimiento semánticos en la estructuración de sus sitios Web.
Resumo:
Part 19: Knowledge Management in Networks
Resumo:
The ontology engineering research community has focused for many years on supporting the creation, development and evolution of ontologies. Ontology forecasting, which aims at predicting semantic changes in an ontology, represents instead a new challenge. In this paper, we want to give a contribution to this novel endeavour by focusing on the task of forecasting semantic concepts in the research domain. Indeed, ontologies representing scientific disciplines contain only research topics that are already popular enough to be selected by human experts or automatic algorithms. They are thus unfit to support tasks which require the ability of describing and exploring the forefront of research, such as trend detection and horizon scanning. We address this issue by introducing the Semantic Innovation Forecast (SIF) model, which predicts new concepts of an ontology at time t + 1, using only data available at time t. Our approach relies on lexical innovation and adoption information extracted from historical data. We evaluated the SIF model on a very large dataset consisting of over one million scientific papers belonging to the Computer Science domain: the outcomes show that the proposed approach offers a competitive boost in mean average precision-at-ten compared to the baselines when forecasting over 5 years.
Resumo:
Les applications Web en général ont connu d’importantes évolutions technologiques au cours des deux dernières décennies et avec elles les habitudes et les attentes de la génération de femmes et d’hommes dite numérique. Paradoxalement à ces bouleversements technologiques et comportementaux, les logiciels d’enseignement et d’apprentissage (LEA) n’ont pas tout à fait suivi la même courbe d’évolution technologique. En effet, leur modèle de conception est demeuré si statique que leur utilité pédagogique est remise en cause par les experts en pédagogie selon lesquels les LEA actuels ne tiennent pas suffisamment compte des aspects théoriques pédagogiques. Mais comment améliorer la prise en compte de ces aspects dans le processus de conception des LEA? Plusieurs approches permettent de concevoir des LEA robustes. Cependant, un intérêt particulier existe pour l’utilisation du concept patron dans ce processus de conception tant par les experts en pédagogie que par les experts en génie logiciel. En effet, ce concept permet de capitaliser l’expérience des experts et permet aussi de simplifier de belle manière le processus de conception et de ce fait son coût. Une comparaison des travaux utilisant des patrons pour concevoir des LEA a montré qu’il n’existe pas de cadre de synergie entre les différents acteurs de l’équipe de conception, les experts en pédagogie d’un côté et les experts en génie logiciel de l’autre. De plus, les cycles de vie proposés dans ces travaux ne sont pas complets, ni rigoureusement décrits afin de permettre de développer des LEA efficients. Enfin, les travaux comparés ne montrent pas comment faire coexister les exigences pédagogiques avec les exigences logicielles. Le concept patron peut-il aider à construire des LEA robustes satisfaisant aux exigences pédagogiques ? Comme solution, cette thèse propose une approche de conception basée sur des patrons pour concevoir des LEA adaptés aux technologies du Web. Plus spécifiquement, l’approche méthodique proposée montre quelles doivent être les étapes séquentielles à prévoir pour concevoir un LEA répondant aux exigences pédagogiques. De plus, un répertoire est présenté et contient 110 patrons recensés et organisés en paquetages. Ces patrons peuvent être facilement retrouvés à l’aide du guide de recherche décrit pour être utilisés dans le processus de conception. L’approche de conception a été validée avec deux exemples d’application, permettant de conclure d’une part que l’approche de conception des LEA est réaliste et d’autre part que les patrons sont bien valides et fonctionnels. L’approche de conception de LEA proposée est originale et se démarque de celles que l’on trouve dans la littérature car elle est entièrement basée sur le concept patron. L’approche permet également de prendre en compte les exigences pédagogiques. Elle est générique car indépendante de toute plateforme logicielle ou matérielle. Toutefois, le processus de traduction des exigences pédagogiques n’est pas encore très intuitif, ni très linéaire. D’autres travaux doivent être réalisés pour compléter les résultats obtenus afin de pouvoir traduire en artéfacts exploitables par les ingénieurs logiciels les exigences pédagogiques les plus complexes et les plus abstraites. Pour la suite de cette thèse, une instanciation des patrons proposés serait intéressante ainsi que la définition d’un métamodèle basé sur des patrons qui pourrait permettre la spécification d’un langage de modélisation typique des LEA. L’ajout de patrons permettant d’ajouter une couche sémantique au niveau des LEA pourrait être envisagée. Cette couche sémantique permettra non seulement d’adapter les scénarios pédagogiques, mais aussi d’automatiser le processus d’adaptation au besoin d’un apprenant en particulier. Il peut être aussi envisagé la transformation des patrons proposés en ontologies pouvant permettre de faciliter l’évaluation des connaissances de l’apprenant, de lui communiquer des informations structurées et utiles pour son apprentissage et correspondant à son besoin d’apprentissage.
Resumo:
Conventional web search engines are centralised in that a single entity crawls and indexes the documents selected for future retrieval, and the relevance models used to determine which documents are relevant to a given user query. As a result, these search engines suffer from several technical drawbacks such as handling scale, timeliness and reliability, in addition to ethical concerns such as commercial manipulation and information censorship. Alleviating the need to rely entirely on a single entity, Peer-to-Peer (P2P) Information Retrieval (IR) has been proposed as a solution, as it distributes the functional components of a web search engine – from crawling and indexing documents, to query processing – across the network of users (or, peers) who use the search engine. This strategy for constructing an IR system poses several efficiency and effectiveness challenges which have been identified in past work. Accordingly, this thesis makes several contributions towards advancing the state of the art in P2P-IR effectiveness by improving the query processing and relevance scoring aspects of a P2P web search. Federated search systems are a form of distributed information retrieval model that route the user’s information need, formulated as a query, to distributed resources and merge the retrieved result lists into a final list. P2P-IR networks are one form of federated search in routing queries and merging result among participating peers. The query is propagated through disseminated nodes to hit the peers that are most likely to contain relevant documents, then the retrieved result lists are merged at different points along the path from the relevant peers to the query initializer (or namely, customer). However, query routing in P2P-IR networks is considered as one of the major challenges and critical part in P2P-IR networks; as the relevant peers might be lost in low-quality peer selection while executing the query routing, and inevitably lead to less effective retrieval results. This motivates this thesis to study and propose query routing techniques to improve retrieval quality in such networks. Cluster-based semi-structured P2P-IR networks exploit the cluster hypothesis to organise the peers into similar semantic clusters where each such semantic cluster is managed by super-peers. In this thesis, I construct three semi-structured P2P-IR models and examine their retrieval effectiveness. I also leverage the cluster centroids at the super-peer level as content representations gathered from cooperative peers to propose a query routing approach called Inverted PeerCluster Index (IPI) that simulates the conventional inverted index of the centralised corpus to organise the statistics of peers’ terms. The results show a competitive retrieval quality in comparison to baseline approaches. Furthermore, I study the applicability of using the conventional Information Retrieval models as peer selection approaches where each peer can be considered as a big document of documents. The experimental evaluation shows comparative and significant results and explains that document retrieval methods are very effective for peer selection that brings back the analogy between documents and peers. Additionally, Learning to Rank (LtR) algorithms are exploited to build a learned classifier for peer ranking at the super-peer level. The experiments show significant results with state-of-the-art resource selection methods and competitive results to corresponding classification-based approaches. Finally, I propose reputation-based query routing approaches that exploit the idea of providing feedback on a specific item in the social community networks and manage it for future decision-making. The system monitors users’ behaviours when they click or download documents from the final ranked list as implicit feedback and mines the given information to build a reputation-based data structure. The data structure is used to score peers and then rank them for query routing. I conduct a set of experiments to cover various scenarios including noisy feedback information (i.e, providing positive feedback on non-relevant documents) to examine the robustness of reputation-based approaches. The empirical evaluation shows significant results in almost all measurement metrics with approximate improvement more than 56% compared to baseline approaches. Thus, based on the results, if one were to choose one technique, reputation-based approaches are clearly the natural choices which also can be deployed on any P2P network.
Resumo:
This paper presents a study made in a field poorly explored in the Portuguese language – modality and its automatic tagging. Our main goal was to find a set of attributes for the creation of automatic tag- gers with improved performance over the bag-of-words (bow) approach. The performance was measured using precision, recall and F1. Because it is a relatively unexplored field, the study covers the creation of the corpus (composed by eleven verbs), the use of a parser to extract syntac- tic and semantic information from the sentences and a machine learning approach to identify modality values. Based on three different sets of attributes – from trigger itself and the trigger’s path (from the parse tree) and context – the system creates a tagger for each verb achiev- ing (in almost every verb) an improvement in F1 when compared to the traditional bow approach.
Resumo:
Building Information Modelling is changing the design and construction field ever since it entered the market. It took just some time to show its capabilities, it takes some time to be mastered before it could be used expressing all its best features. Since it was conceived to be adopted from the earliest stage of design to get the maximum from the decisional project, it still struggles to adapt to existing buildings. In fact, there is a branch of this methodology that is dedicated to what has been already made that is called Historic BIM or HBIM. This study aims to make clear what are BIM and HBIM, both from a theoretical point of view and in practice, applying from scratch the state of the art to a case study. It had been chosen the fortress of San Felice sul Panaro, a marvellous building with a thousand years of history in its bricks, that suffered violent earthquakes, but it is still standing. By means of this example, it will be shown which are the limits that could be encountered when applying BIM methodology to existing heritage, moreover will be pointed out all the new features that a simple 2D design could not achieve.
Resumo:
Most of the existing open-source search engines, utilize keyword or tf-idf based techniques to find relevant documents and web pages relative to an input query. Although these methods, with the help of a page rank or knowledge graphs, proved to be effective in some cases, they often fail to retrieve relevant instances for more complicated queries that would require a semantic understanding to be exploited. In this Thesis, a self-supervised information retrieval system based on transformers is employed to build a semantic search engine over the library of Gruppo Maggioli company. Semantic search or search with meaning can refer to an understanding of the query, instead of simply finding words matches and, in general, it represents knowledge in a way suitable for retrieval. We chose to investigate a new self-supervised strategy to handle the training of unlabeled data based on the creation of pairs of ’artificial’ queries and the respective positive passages. We claim that by removing the reliance on labeled data, we may use the large volume of unlabeled material on the web without being limited to languages or domains where labeled data is abundant.
Resumo:
This paper explains and explores the concept of "semantic molecules" in the NSM methodology of semantic analysis. A semantic molecule is a complex lexical meaning which functions as an intermediate unit in the structure of other, more complex concepts. The paper undertakes an overview of different kinds of semantic molecule, showing how they enter into more complex meanings and how they themselves can be explicated. It shows that four levels of "nesting" of molecules within molecules are attested, and it argues that while some molecules such as 'hands' and 'make', may well be language-universal, many others are language-specific.
Resumo:
Ecological interface design (EID) is proving to be a promising approach to the design of interfaces for complex dynamic systems. Although the principles of EID and examples of its effective use are widely available, few readily available examples exist of how the individual displays that constitute an ecological interface are developed. This paper presents the semantic mapping process within EID in the context of prior theoretical work in this area. The semantic mapping process that was used in developing an ecological interface for the Pasteurizer II microworld is outlined, and the results of an evaluation of the ecological interface against a more conventional interface are briefly presented. Subjective reports indicate features of the ecological interface that made it particularly valuable for participants. Finally, we outline the steps of an analytic process for using EID. The findings presented here can be applied in the design of ecological interfaces or of configural displays for dynamic processes.
Resumo:
Episodic memory impairment is a well-recognized feature of mesial temporal lobe epilepsy. Semantic memory has received much less attention in this patient population. In this study, semantic memory aspects (word-picture matching, word definition, confrontation and responsive naming, and word list generation) in 19 patients with left and right temporal lobe epilepsy secondary to mesial temporal sclerosis (MTS) were compared with those of normal controls. Patients with LMTS showed impaired performance in word definition (compared to controls and RMTS) and in responsive naming (compared to controls). RMTS and LMTS patients performed worse than controls in word-picture matching. Both patients with left and right mesial temporal lobe epilepsy performed worse than controls in word list generation and in confrontation naming tests. Attentional-executive dysfunction may have contributed to these deficits. We conclude that patients with left and right NITS display impaired aspects of semantic knowledge. A better understanding of semantic processing difficulties in these patients will provide better insight into the difficulties with activities of daily living in this patient population. (C) 2007 Elsevier Inc. All rights reserved.