846 results for Semantic Publishing, Linked Data, Bibliometrics, Informetrics, Data Retrieval, Citations


Relevance:

100.00% 100.00%

Publisher:

Abstract:

The title of the study is "Toxicology Literature: An Informetric Analysis". In the field of Toxicology, interdisciplinary research has resulted in the 'information fragmentation' of the basic subject into environmental, medical and economic toxicology. The interest in collaborative research led to the transdisciplinary growth of Toxicology, which ultimately resulted in the scatter of its literature. For the purpose of the present study, Toxicology is defined as the physical and chemical aspects of all poisons affecting the environmental, economic and medical aspects of human life. Informetrics is "the use and development of a variety of measures to study and analyse several properties of information in general and documents in particular." The present study sheds light on the main fields of Toxicology research as well as the important primary journals through which the results are being published. The authorship pattern, subject-wise scatter, country-wise, language-wise and growth pattern, self-citation, and bibliographic coupling of the journals were studied. The study will be of great use in formulating the acquisition policy for documents in a library. The present study is also useful in identifying obsolete journals so that they can be discarded from the collection.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Work carried out to create a citation network from scientific articles encoded in JATS XML. It gives an introduction to semantic publishing, the reference ontologies, and the main datasets on scientific publications. Finally, it presents the CiNeX prototype, which extracts an RDF graph from a JATS XML dataset using the SPAR ontology.
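The extraction step described above can be sketched as follows. This is a minimal illustration, not the CiNeX implementation: it assumes that both the citing article and its references carry DOIs in standard JATS `article-id` and `pub-id` elements, and it emits N-Triples using the `cito:cites` property from the SPAR ontologies. The function names and the choice of `https://doi.org/` IRIs are hypothetical.

```python
import xml.etree.ElementTree as ET

# CiTO "cites" property, part of the SPAR ontologies.
CITO_CITES = "http://purl.org/spar/cito/cites"


def doi_iri(doi):
    """Turn a bare DOI into an IRI (an assumed identifier scheme)."""
    return "https://doi.org/" + doi.strip()


def jats_to_ntriples(jats_xml):
    """Return one N-Triples line per DOI-identified reference in a JATS article."""
    root = ET.fromstring(jats_xml)
    # The citing article's DOI lives in the front matter.
    citing = root.find(".//article-meta/article-id[@pub-id-type='doi']")
    if citing is None or not citing.text:
        return []
    subject = doi_iri(citing.text)
    triples = []
    # Each <ref> in the reference list may carry a DOI in a <pub-id> element.
    for pub_id in root.findall(".//ref-list/ref//pub-id[@pub-id-type='doi']"):
        if pub_id.text:
            triples.append(f"<{subject}> <{CITO_CITES}> <{doi_iri(pub_id.text)}> .")
    return triples
```

References without a DOI are simply skipped in this sketch; a real pipeline would need a fallback such as matching on title, authors and year.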

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A user-friendly web application that supports researchers in efficiently carrying out specific search and analysis tasks on scientific articles.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

A workflow-centric research object bundles a workflow, the provenance of the results obtained by its enactment, other digital objects that are relevant for the experiment (papers, datasets, etc.), and annotations that semantically describe all these objects. In this paper, we propose a model to specify workflow-centric research objects, and show how the model can be grounded using semantic technologies and existing vocabularies, in particular the Object Reuse and Exchange (ORE) model and the Annotation Ontology (AO). We describe the life-cycle of a research object, which resembles the life-cycle of a scientific experiment.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

With the rise of smartphones, lifelogging devices (e.g. Google Glass) and the popularity of image sharing websites (e.g. Flickr), users are capturing and sharing every aspect of their lives online, producing a wealth of visual content. Of these uploaded images, the majority are poorly annotated or exist in complete semantic isolation, making the process of building retrieval systems difficult, as one must first understand the meaning of an image in order to retrieve it. To alleviate this problem, many image sharing websites offer manual annotation tools which allow the user to “tag” their photos; however, these techniques are laborious and as a result have been poorly adopted: Sigurbjörnsson and van Zwol (2008) showed that 64% of images uploaded to Flickr are annotated with fewer than four tags. Due to this, an entire body of research has focused on the automatic annotation of images (Hanbury, 2008; Smeulders et al., 2000; Zhang et al., 2012a), where one attempts to bridge the semantic gap between an image’s appearance and its meaning, e.g. the objects present. Despite two decades of research, the semantic gap still largely exists, and as a result automatic annotation models often offer unsatisfactory performance for industrial implementation. Further, these techniques can only annotate what they see, thus ignoring the “bigger picture” surrounding an image (e.g. its location, the event, the people present, etc.). Much work has therefore focused on building photo tag recommendation (PTR) methods which aid the user in the annotation process by suggesting tags related to those already present. These works have mainly focused on computing relationships between tags based on historical images, e.g. that NY and timessquare co-occur in many images and are therefore highly correlated. However, tags are inherently noisy, sparse and ill-defined, often resulting in poor PTR accuracy, e.g. does NY refer to New York or New Year?
This thesis proposes the exploitation of an image’s context which, unlike textual evidence, is always present, in order to alleviate this ambiguity in the tag recommendation process. Specifically, we exploit the “what, who, where, when and how” of the image capture process in order to complement textual evidence in various photo tag recommendation and retrieval scenarios. In part II, we combine text, content-based (e.g. # of faces present) and contextual (e.g. day-of-the-week taken) signals for tag recommendation purposes, achieving up to a 75% improvement to precision@5 in comparison to a text-only TF-IDF baseline. We then consider external knowledge sources (i.e. Wikipedia & Twitter) as an alternative to (slower moving) Flickr on which to build recommendation models, showing that similar accuracy could be achieved on these faster moving, yet entirely textual, datasets. In part II, we also highlight the merits of diversifying tag recommendation lists before discussing at length various problems with existing automatic image annotation and photo tag recommendation evaluation collections. In part III, we propose three new image retrieval scenarios, namely “visual event summarisation”, “image popularity prediction” and “lifelog summarisation”. In the first scenario, we attempt to produce a ranking of relevant and diverse images for various news events by (i) removing irrelevant images such as memes and visual duplicates, before (ii) semantically clustering images based on the tweets in which they were originally posted. Using this approach, we were able to achieve over 50% precision for images in the top 5 ranks. In the second retrieval scenario, we show that by combining contextual and content-based features from images, we are able to predict whether an image will become “popular” (or not) with 74% accuracy, using an SVM classifier.
Finally, in chapter 9 we employ blur detection and perceptual-hash clustering in order to remove noisy images from lifelogs, before combining visual and geo-temporal signals in order to capture a user’s “key moments” within their day. We believe that the results of this thesis show an important step towards building effective image retrieval models when sufficient textual content is lacking (i.e. a cold start).
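The co-occurrence style of tag recommendation that the abstract describes as prior work, and the precision@5 measure it reports, can be illustrated with a minimal sketch. The function names, the scoring rule (summed co-occurrence counts) and the toy data are assumptions for illustration only; they are not the thesis's models or its evaluation collection.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(photos):
    """Count how often each ordered pair of tags appears on the same photo."""
    cooc = defaultdict(Counter)
    for tags in photos:
        for a, b in permutations(set(tags), 2):
            cooc[a][b] += 1
    return cooc


def recommend(cooc, given_tags, k=5):
    """Rank candidate tags by summed co-occurrence with the tags already present."""
    scores = Counter()
    for tag in given_tags:
        scores.update(cooc.get(tag, {}))
    # Never recommend a tag the user has already supplied.
    for tag in given_tags:
        scores.pop(tag, None)
    # Sort by score (descending), then alphabetically so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:k]]


def precision_at_k(recommended, relevant, k=5):
    """Fraction of the top-k recommended tags that appear in the relevant set."""
    hits = sum(1 for tag in recommended[:k] if tag in relevant)
    return hits / k
```

Because this scoring sees only tag strings, it cannot tell whether NY means New York or New Year, which is the ambiguity the thesis addresses with contextual signals.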

Relevance:

90.00% 90.00%

Publisher:

Abstract:

POSTDATA is a five-year European Research Council (ERC) Starting Grant project that started in May 2016 and is hosted by the Universidad Nacional de Educación a Distancia (UNED), Madrid, Spain. The context of the project is the corpora of European Poetry (EP), with a special focus on poetic materials from different languages and literary traditions. POSTDATA aims to offer a standardized model in the philological field and a metadata application profile (MAP) for EP in order to build a common classification of all these poetic materials. The information from Spanish, Italian and French repertoires will be published in the Linked Open Data (LOD) ecosystem. Later we expect to extend the model to include additional corpora. There are a number of Web-based information systems (WIS) in Europe with repertoires of poems available for human consumption but not in an appropriate condition to be accessible and reusable by the Semantic Web. These systems are not interoperable; they are in fact locked in their databases and proprietary software, not suitable to be linked in the Semantic Web. A way to make this data interoperable is to develop a MAP in order to be able to publish this data in the LOD ecosystem, and also to publish new data that will be created and modeled based on this MAP. Creating a common data model for EP is not simple, since the existing data models are based on conceptualizations and terminology belonging to their own poetical traditions, and each tradition has developed an idiosyncratic analytical terminology in a different and independent way for years. The result of this uncoordinated evolution is a set of varied terminologies to explain analogous metrical phenomena through the different poetic systems, whose correspondences have hardly been studied (see examples in González-Blanco & Rodríguez (2014a, 2014b)). This work has to be done by domain experts before the modeling actually starts.
On the other hand, the development of a MAP is a complex task, so it is imperative to follow a method for this development. In recent years, Curado Malta & Baptista (2012, 2013a, 2013b) have been studying the development of MAPs within a Design Science Research (DSR) methodological process in order to define a method for the development of MAPs (see Curado Malta (2014)). The output of this DSR process was a first version of a method for the development of Metadata Application Profiles (Me4MAP) (paper to be published). The DSR process is now in the validation phase of the Relevance Cycle to validate Me4MAP. The development of this MAP for poetry will follow the guidelines of Me4MAP, and this development will in turn be used to validate Me4MAP. The final goal of the POSTDATA project is: i) to be able to publish all the data locked in the WIS as LOD, where any interested agent will be able to build applications over the data in order to serve final users; ii) to build a Web platform where: a) researchers, students and other final users interested in EP will be able to access poems (and their analyses) from all databases; b) researchers, students and other final users will be able to upload poems and the digitized images of manuscripts, and fill in the information concerning the analysis of the poem, collaboratively contributing to a LOD dataset of poetry.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

In this paper we report the results of our experiments on the construction of an opinion ontology. Our aim is to show the benefits of publishing the results of the opinion mining process openly on the Web in a structured form. On the road to achieving this, we attempt to answer the research question of to what extent opinion information can be formalized in a unified way. Furthermore, as part of the evaluation, we experiment with the usage of Semantic Web technologies and show particular use cases that support our claims.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

Electronic publishing exploits numerous possibilities to present or exchange information and to communicate via the most current media, such as the Internet. We describe the integration of an intelligent business news presentation and distribution network that utilizes modern Web technologies such as Web Services, loosely coupled services, and peer-to-peer networks. Employing semantic technologies enables the coupling of multinational and multilingual business news data on a scalable international level and thus introduces a service quality that has not so far been achieved by alternative technologies in the news distribution area. Architecturally, we identified the loose coupling of existing services as the most feasible way to address multinational and multilingual news presentation and distribution networks. Furthermore, we semantically enrich multinational news content by relating items to one another using AI techniques like the Vector Space Model. Summarizing our experiences, we describe the technical integration of semantic and communication technologies in order to create a modern international news network.
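As an illustration of how the Vector Space Model can relate news items, the sketch below represents each item as a raw term-frequency vector and compares items by cosine similarity. The tokenisation (lowercasing and whitespace splitting) and the absence of term weighting are simplifying assumptions; the abstract does not describe the system's actual weighting scheme or its handling of multiple languages.

```python
import math
from collections import Counter


def term_vector(text):
    """Represent a text as a bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Two items that share many terms score close to 1, and items with no terms in common score 0, so a threshold on this value gives a simple rule for linking related stories.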

Relevance:

70.00% 70.00%

Publisher:

Abstract:

With the advent of Service Oriented Architecture, Web services have gained tremendous popularity. Due to the availability of a large number of Web services, finding an appropriate Web service according to the requirement of the user is a challenge. This warrants the need to establish an effective and reliable process of Web service discovery. A considerable body of research has emerged to develop methods to improve the accuracy of Web service discovery to match the best service. The process of Web service discovery results in suggesting many individual services that partially fulfil the user’s interest. Considering the semantic relationships of the words used in describing the services, as well as their input and output parameters, can lead to accurate Web service discovery. Appropriate linking of individual matched services should fully satisfy the requirements which the user is looking for. This research proposes to integrate a semantic model and a data mining technique to enhance the accuracy of Web service discovery. A novel three-phase Web service discovery methodology has been proposed. The first phase performs match-making to find semantically similar Web services for a user query. In order to perform semantic analysis on the content present in the Web service description language document, the support-based latent semantic kernel is constructed using an innovative concept of binning and merging on a large quantity of text documents covering diverse knowledge domains. The use of a generic latent semantic kernel constructed with a large number of terms helps to find the hidden meaning of the query terms which otherwise could not be found. Sometimes a single Web service is unable to fully satisfy the requirement of the user. In such cases, a composition of multiple inter-related Web services is presented to the user. The task of checking the possibility of linking multiple Web services is done in the second phase.
Once the feasibility of linking Web services is checked, the objective is to provide the user with the best composition of Web services. In the link analysis phase, the Web services are modelled as nodes of a graph and an all-pairs shortest-path algorithm is applied to find the optimum path at the minimum cost for traversal. The third phase, system integration, combines the results from the preceding two phases by using an original fusion algorithm in the fusion engine. Finally, the recommendation engine, which is an integral part of the system integration phase, makes the final recommendations, including individual and composite Web services, to the user. In order to evaluate the performance of the proposed method, extensive experimentation has been performed. Results of the proposed support-based semantic kernel method of Web service discovery are compared with the results of the standard keyword-based information-retrieval method and a clustering-based machine-learning method of Web service discovery. The proposed method outperforms both information-retrieval and machine-learning based methods. Experimental results and statistical analysis also show that the best Web service compositions are obtained by considering 10 to 15 Web services that are found in phase-I for linking. Empirical results also confirm that the fusion engine boosts the accuracy of Web service discovery by combining the inputs from both the semantic analysis (phase-I) and the link analysis (phase-II) in a systematic fashion. Overall, the accuracy of Web service discovery with the proposed method shows a significant improvement over traditional discovery methods.
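The link analysis phase can be illustrated with a minimal sketch. The abstract does not say which all-pairs shortest-path algorithm is used; Floyd-Warshall is assumed here as one standard choice. The service names and edge costs in the usage example are hypothetical, with an edge from one service to another meaning that the first service's output can feed the second's input.

```python
import math


def floyd_warshall(nodes, edges):
    """All-pairs minimum composition cost over a directed graph of Web services.

    `edges` maps (source, target) pairs to a non-negative linking cost.
    """
    dist = {u: {v: (0 if u == v else math.inf) for v in nodes} for u in nodes}
    for (u, v), cost in edges.items():
        dist[u][v] = min(dist[u][v], cost)
    # Allow each node in turn to act as an intermediate hop.
    for k in nodes:
        for i in nodes:
            for j in nodes:
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist
```

For example, with services `geocode`, `weather` and `forecast_pdf` and costs `{("geocode", "weather"): 1, ("weather", "forecast_pdf"): 2, ("geocode", "forecast_pdf"): 5}`, the cheapest route from `geocode` to `forecast_pdf` goes through `weather` at a total cost of 3. An entry that remains infinite means the two services cannot be composed in that direction.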

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Several techniques are known for searching an ordered collection of data. The techniques and analyses of retrieval methods based on primary attributes are straightforward. Retrieval using secondary attributes depends on several factors. For secondary attribute retrieval, the linear structures (inverted lists, multilists, doubly linked lists) and the recently proposed nonlinear tree structures (multiple attribute tree (MAT), K-d tree (kdT)) have their individual merits. It is shown in this paper that, of the two tree structures, MAT possesses several features of a systematic data structure for external file organisation which make it superior to kdT. Analytic estimates for the complexity of node searches in MAT and kdT, for several types of queries, are developed and compared.
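To make secondary-attribute retrieval concrete, the sketch below uses inverted lists, the simplest of the linear structures named above. It keeps one inverted list per attribute value and answers a conjunctive query by intersecting them. It does not implement MAT or the K-d tree, and the record layout and function names are hypothetical.

```python
from collections import defaultdict


def build_inverted_lists(records, attributes):
    """Map each (attribute, value) pair to the set of record ids that carry it."""
    index = {attr: defaultdict(set) for attr in attributes}
    for rec_id, record in records.items():
        for attr in attributes:
            index[attr][record[attr]].add(rec_id)
    return index


def query(index, **conditions):
    """Return the ids of records matching every attribute=value condition."""
    result = None
    for attr, value in conditions.items():
        matches = index[attr].get(value, set())
        # Copy on first use so the stored inverted list is never mutated.
        result = set(matches) if result is None else result & matches
    return result if result is not None else set()
```

Each condition costs one lookup plus a set intersection, which is the kind of per-query work that the paper's complexity estimates compare for the tree structures.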

Relevance:

70.00% 70.00%

Publisher:

Abstract:

Through this workshop, database experts from the various ministries and the Central Statistical Office (CSO) were introduced to the CREATE and PROCESS modules of the REDATAM software, which could be used for database creation and analysis of data. This workshop was the second in a series of workshops aimed at promoting human-resource and capacity-building at the national and regional levels in the use of the REDATAM software. It also served as a qualifier for a follow-up workshop on the use of the web-publishing application of the software to be held in 2010.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

This thesis presents Semantic City Guide, a mobile tourist-guide application based on Linked Open Data. It sets out the main advantages and disadvantages arising from the interaction between native mobile application development and Semantic Web technologies. All of this is put into context by examining several projects by companies and state bodies operating in the tourism and IT sectors.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

The aim of this thesis is to analyse and design a system that supports the definition of data in the format used to formally specify data semantics and, above all, supports the complex and innovative activity of link discovery. This is a very powerful activity which, using the tools and rules of the Semantic Web (also called the Web of Data), makes it possible, given a source knowledge base and other external knowledge bases distributed across the Web, to interconnect the data of the source knowledge base with the external data by means of complex interlinking algorithms. These algorithms interconnect the concepts expressed in the source and external knowledge bases, expressing the semantics of each link according to complex comparison criteria defined in the algorithm itself. This activity can therefore considerably increase the knowledge held in the source knowledge base; if all the knowledge bases in the Web of Data followed this procedure, the knowledge defined would grow to levels limited only by the immense vastness of the Web, yielding unparalleled data-processing power. The system has the ambitious goal of providing a tool that significantly increases the presence of Linked Open Data, mainly at the national level but also internationally, in support of public and private bodies, which through this system can open up new business scenarios and new uses of data, giving data a power that can currently only be imagined.
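A very reduced form of link discovery can be sketched as follows. The thesis's own comparison criteria are not given in the abstract, so this example assumes the simplest possible one: two resources are linked with `owl:sameAs` when their labels are nearly identical strings. The function names, the 0.9 threshold and the example IRIs are hypothetical.

```python
from difflib import SequenceMatcher

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def label_similarity(a, b):
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def discover_links(source, target, threshold=0.9):
    """Link each source resource to its best-matching target resource.

    `source` and `target` map resource IRIs to human-readable labels.
    Returns (subject, predicate, object) triples for matches at or above the threshold.
    """
    links = []
    for s_iri, s_label in source.items():
        best_iri, best_score = None, 0.0
        for t_iri, t_label in target.items():
            score = label_similarity(s_label, t_label)
            if score > best_score:
                best_iri, best_score = t_iri, score
        if best_iri is not None and best_score >= threshold:
            links.append((s_iri, OWL_SAME_AS, best_iri))
    return links
```

Comparing every source resource with every target resource is quadratic, and label equality is a weak signal for identity, so a practical system would add blocking and richer comparison criteria.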

Relevance:

70.00% 70.00%

Publisher:

Abstract:

OBJECTIVE: To determine whether algorithms developed for the World Wide Web can be applied to the biomedical literature in order to identify articles that are important as well as relevant. DESIGN AND MEASUREMENTS: A direct comparison of eight algorithms: simple PubMed queries, clinical queries (sensitive and specific versions), vector cosine comparison, citation count, journal impact factor, PageRank, and machine learning based on polynomial support vector machines. The objective was to prioritize important articles, defined as being included in a pre-existing bibliography of important literature in surgical oncology. RESULTS: Citation-based algorithms were more effective than noncitation-based algorithms at identifying important articles. The most effective strategies were simple citation count and PageRank, which on average identified over six important articles in the first 100 results compared to 0.85 for the best noncitation-based algorithm (p < 0.001). The authors saw similar differences between citation-based and noncitation-based algorithms at 10, 20, 50, 200, 500, and 1,000 results (p < 0.001). Citation lag affects performance of PageRank more than simple citation count. However, in spite of citation lag, citation-based algorithms remain more effective than noncitation-based algorithms. CONCLUSION: Algorithms that have proved successful on the World Wide Web can be applied to biomedical information retrieval. Citation-based algorithms can help identify important articles within large sets of relevant results. Further studies are needed to determine whether citation-based algorithms can effectively meet actual user information needs.
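The two most effective strategies named above, simple citation count and PageRank, can be sketched on a toy citation graph. This is an illustration under assumptions rather than the study's implementation: the damping factor of 0.85 and the fixed number of power iterations are conventional defaults that the abstract does not mention, and the handling of articles with no outgoing citations (their rank is spread evenly) is one common choice among several.

```python
def citation_counts(citations):
    """Count incoming citations; `citations` maps an article to the articles it cites."""
    counts = {article: 0 for article in citations}
    for cited_list in citations.values():
        for cited in cited_list:
            counts[cited] = counts.get(cited, 0) + 1
    return counts


def pagerank(citations, damping=0.85, iterations=50):
    """Power-iteration PageRank over a citation graph."""
    nodes = set(citations) | {c for cited in citations.values() for c in cited}
    if not nodes:
        return {}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1 - damping) / n for node in nodes}
        for node in nodes:
            out = citations.get(node, [])
            if out:
                # Pass this article's rank on to the articles it cites.
                share = damping * rank[node] / len(out)
                for cited in out:
                    new_rank[cited] += share
            else:
                # Article cites nothing: spread its rank evenly over all articles.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank
```

Sorting a set of relevant search results by either score, highest first, gives the kind of prioritised list the study evaluates.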