Digital Library

921 results for Web data

Mining the World Wide Web - Methods, Applications, and Perspectives

Relevance: 70.00%

Semantic web mining and the representation, analysis, and evolution of web space

Relevance: 70.00%

Abstract:

Semantic Web Mining aims at combining the two fast-developing research areas Semantic Web and Web Mining. This survey analyzes the convergence of trends from both areas: Growing numbers of researchers work on improving the results of Web Mining by exploiting semantic structures in the Web, and they use Web Mining techniques for building the Semantic Web. Last but not least, these techniques can be used for mining the Semantic Web itself. The second aim of this paper is to use these concepts to circumscribe what Web space is, what it represents and how it can be represented and analyzed. This is used to sketch the role that Semantic Web Mining and the software agents and human agents involved in it can play in the evolution of Web space.

Semantic web mining

Relevance: 70.00%

Abstract:

Semantic Web Mining aims at combining the two fast-developing research areas Semantic Web and Web Mining. This survey analyzes the convergence of trends from both areas: an increasing number of researchers is working on improving the results of Web Mining by exploiting semantic structures in the Web, and they make use of Web Mining techniques for building the Semantic Web. Last but not least, these techniques can be used for mining the Semantic Web itself. The Semantic Web is the second-generation WWW, enriched by machine-processable information which supports the user in his tasks. Given the enormous size even of today’s Web, it is impossible to manually enrich all of these resources. Therefore, automated schemes for learning the relevant information are increasingly being used. Web Mining aims at discovering insights about the meaning of Web resources and their usage. Given the primarily syntactical nature of the data being mined, the discovery of meaning is impossible based on these data only. Therefore, formalizations of the semantics of Web sites and navigation behavior are becoming more and more common. Furthermore, mining the Semantic Web itself is another upcoming application. We argue that the two areas Web Mining and Semantic Web need each other to fulfill their goals, but that the full potential of this convergence is not yet realized. This paper gives an overview of where the two areas meet today, and sketches ways of how a closer integration could be profitable.

Bias in the Social Web

Relevance: 70.00%

Abstract:

A frequent assumption in Social Media is that its open nature leads to a representative view of the world. In this talk we want to consider bias occurring in the Social Web. We will consider a case study of LiquidFeedback, a direct democracy platform of the German Pirate Party, as well as models of (non-)discriminating systems. As a conclusion of this talk we stipulate that Social Media systems need to bias their operation according to social norms and to publish the bias they introduce.

Speaker Biography: Prof Steffen Staab. Steffen studied computer science and computational linguistics in Erlangen (Germany), Philadelphia (USA) and Freiburg (Germany). Afterwards he worked as a researcher at Univ. Stuttgart/Fraunhofer and Univ. Karlsruhe, before he became a professor in Koblenz (Germany). Since March 2015 he also holds a chair for Web and Computer Science at Univ. of Southampton, sharing his time between here and Koblenz. In his research career he has managed to avoid almost all good advice that he now gives to his team members. Such advice includes focusing on research (vs. company) or concentrating on only one or two research areas (vs. considering ontologies, semantic web, social web, data engineering, text mining, peer-to-peer, multimedia, HCI, services, software modelling and programming and some more). Though, actually, improving how we understand and use text and data is a good common denominator for a lot of Steffen's professional activities.

Inteligência cibernética e uso de recursos semânticos na detecção de perfis falsos no contexto do Big Data [Cyber intelligence and the use of semantic resources in detecting fake profiles in the context of Big Data]

Relevance: 70.00%

Abstract:

Graduate Program in Information Science - FFC

A fuzzy grassroots ontology for improving social semantic web search

Relevance: 70.00%

Abstract:

The web is continuously evolving into a large collection of data, which creates interest in collecting and merging these data in a meaningful way. Based on that web data, this paper describes the building of an ontology resting on fuzzy clustering techniques. Through continual harvesting of folksonomies by web agents, a fully automatic fuzzy grassroots ontology is built. This self-updating ontology can then be used for several practical applications in fields such as web structuring, web searching and web knowledge visualization. A potential application for online reputation analysis, added value and possible future studies are discussed in the conclusion.

A Focused Crawler in order to Get Semantic Web Resources (CSR)

Relevance: 70.00%

Abstract:

This paper presents a Focused Crawler in order to Get Semantic Web Resources (CSR). Structured web data are available in formats such as Extensible Markup Language (XML), Resource Description Framework (RDF) and Web Ontology Language (OWL) that can be used for processing. One of the main challenges of manually searching for and downloading semantic web resources is that the task is very time-consuming. Our research work proposes a focused crawler that downloads these resources automatically and stores them on disk, in order to build a collection that will be used for data processing. CSR consists of three layers: (a) the User Interface Layer, (b) the Focus Crawler Layer and (c) the Base Crawler Layer. CSR uses the Shark-Search method as its selection policy. CSR was evaluated in two experiments. The first ran from 7:11 am on December 15, 2012 until 4:01 on December 16, 2012 and obtained 448,123,537 bytes of data. CSR ended by itself after analyzing 80,4375 seeds with unlimited depth. It retrieved 16,576 semantic resource files, of which 89% were RDF, 10% were XML and 1% were OWL. The second experiment was based on the Web Data Commons work of the Research Group Data and Web Science at the University of Mannheim and the Institute AIFB at the Karlsruhe Institute of Technology. It ran from 4:46 am on June 2, 2013 until 1:37 am on June 9, 2013. After 162.51 hours of execution the result was 285,279 semantic resources, predominantly XML resources with 99%, and OWL and RDF with 1% each.
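The crawl loop summarized in this abstract can be illustrated with a small best-first traversal. This is only a sketch under stated assumptions, not the Shark-Search policy or the CSR implementation: the scoring is a plain file-extension check, the link graph `PAGES` is a hypothetical in-memory stand-in for the web (so the example runs without network access), and the names `score` and `crawl` are invented for illustration.

```python
import heapq

# Hypothetical in-memory "web": URL -> outgoing links (assumption for illustration).
PAGES = {
    "http://example.org/": ["http://example.org/a.html", "http://example.org/data.rdf"],
    "http://example.org/a.html": ["http://example.org/onto.owl", "http://example.org/b.html"],
    "http://example.org/b.html": ["http://example.org/feed.xml", "http://example.org/"],
}

SEMANTIC_EXTENSIONS = (".rdf", ".owl", ".xml")


def score(url):
    """Illustrative relevance score: semantic resources rank above ordinary pages."""
    return 1.0 if url.endswith(SEMANTIC_EXTENSIONS) else 0.5


def crawl(seeds, max_visits=100):
    """Best-first crawl: always expand the highest-scoring unvisited URL."""
    frontier = [(-score(u), u) for u in seeds]  # negated because heapq is a min-heap
    heapq.heapify(frontier)
    seen = set(seeds)
    resources = []
    visits = 0
    while frontier and visits < max_visits:
        _, url = heapq.heappop(frontier)
        visits += 1
        if url.endswith(SEMANTIC_EXTENSIONS):
            resources.append(url)  # a real crawler would store the file on disk here
            continue
        for link in PAGES.get(url, []):
            if link not in seen:
                seen.add(link)
                heapq.heappush(frontier, (-score(link), link))
    return resources


found = crawl(["http://example.org/"])
```

The priority queue is what makes the crawler "focused": links that look like semantic resources are fetched before ordinary pages. A policy such as Shark-Search would replace `score` with a relevance estimate derived from page content and anchor text.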

Handling and communicating uncertainty in chained geospatial web services

Relevance: 70.00%

Abstract:

Recent developments in service-oriented and distributed computing have created exciting opportunities for the integration of models in service chains to create the Model Web. This offers the potential for orchestrating web data and processing services in complex chains, a flexible approach which exploits the increased access to products and tools, and the scalability offered by the Web. However, the uncertainty inherent in data and models must be quantified and communicated in an interoperable way, in order for its effects to be effectively assessed as errors propagate through complex automated model chains. We describe a proposed set of tools for handling, characterizing and communicating uncertainty in this context, and show how they can be used to 'uncertainty-enable' Web Services in a model chain. An example implementation is presented, which combines environmental and publicly-contributed data to produce estimates of sea-level air pressure, with estimates of uncertainty which incorporate the effects of model approximation as well as the uncertainty inherent in the observational and derived data.

An Empirical Investigation of Learning From the Semantic Web

Relevance: 70.00%

Abstract:

Peer reviewed

MHCWeb: Converting a WWW database into a knowledge-based collaborative environment

Relevance: 60.00%

Abstract:

The World Wide Web (WWW) is useful for distributing scientific data. Most existing web data resources organize their information either in structured flat files or relational databases with basic retrieval capabilities. For databases with one or a few simple relations, these approaches are successful, but they can be cumbersome when there is a data model involving multiple relations between complex data. We believe that knowledge-based resources offer a solution in these cases. Knowledge bases have explicit declarations of the concepts in the domain, along with the relations between them. They are usually organized hierarchically, and provide a global data model with a controlled vocabulary. We have created the OWEB architecture for building online scientific data resources using knowledge bases. OWEB provides a shell for structuring data, providing secure and shared access, and creating computational modules for processing and displaying data. In this paper, we describe the translation of the online immunological database MHCPEP into an OWEB system called MHCWeb. This effort involved building a conceptual model for the data, creating a controlled terminology for the legal values for different types of data, and then translating the original data into the new structure. The OWEB environment allows for flexible access to the data by both users and computer programs.

EULAR/PRINTO/PRES criteria for Henoch-Schonlein purpura, childhood polyarteritis nodosa, childhood Wegener granulomatosis and childhood Takayasu arteritis: Ankara 2008. Part I: Overall methodology and clinical characterisation

Relevance: 60.00%

Abstract:

Objectives: To report methodology and overall clinical, laboratory and radiographic characteristics for Henoch-Schonlein purpura (HSP), childhood polyarteritis nodosa (c-PAN), c-Wegener granulomatosis (c-WG) and c-Takayasu arteritis (c-TA) classification criteria.

Methods: The preliminary Vienna 2005 consensus conference, which proposed preliminary criteria for paediatric vasculitides, was followed by a EULAR/PRINTO/PRES-supported validation project divided into three main steps. Step 1: retrospective/prospective web-data collection for HSP, c-PAN, c-WG and c-TA, with age at diagnosis <= 18 years. Step 2: blinded classification by consensus panel of a subgroup of 280 cases (128 difficult cases, 152 randomly selected) enabling expert diagnostic verification. Step 3: Ankara 2008 Consensus Conference and statistical evaluation (sensitivity, specificity, area under the curve, kappa-agreement) using as 'gold standard' the final consensus classification or original treating physician diagnosis.

Results: A total of 1183/1398 (85%) samples collected were available for analysis: 827 HSP, 150 c-PAN, 60 c-WG, 87 c-TA and 59 c-other. Prevalence, signs/symptoms, laboratory, biopsy and imaging reports were consistent with the clinical picture of the four c-vasculitides. A representative subgroup of 280 patients was blinded to the treating physician diagnosis and classified by a consensus panel, with kappa-agreement of 0.96 for HSP (95% CI 0.84 to 1), 0.88 for c-WG (95% CI 0.76 to 0.99), 0.84 for c-TA (95% CI 0.73 to 0.96) and 0.73 for c-PAN (95% CI 0.62 to 0.84), with an overall kappa of 0.79 (95% CI 0.73 to 0.84).

Conclusion: EULAR/PRINTO/PRES propose validated classification criteria for HSP, c-PAN, c-WG and c-TA, with substantial/almost perfect agreement with the final consensus classification or original treating physician diagnosis.
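To make the kappa-agreement figures above easier to read, the following sketch computes Cohen's kappa from an agreement table. It assumes the two-rater Cohen form of the statistic, which the abstract does not state explicitly; the function name `cohen_kappa` is hypothetical and the counts are invented for illustration, not taken from the study.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table (rows: rater A, columns: rater B)."""
    total = sum(sum(row) for row in table)
    # Observed agreement: share of cases on the diagonal.
    observed = sum(table[i][i] for i in range(len(table))) / total
    # Chance agreement: product of the marginal proportions, summed over categories.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return (observed - expected) / (1 - expected)


# Invented counts: two raters agree on 90 of 100 cases across two categories.
example_table = [[45, 5],
                 [5, 45]]
kappa = cohen_kappa(example_table)  # observed 0.9, chance 0.5, so kappa is about 0.8
```

Kappa corrects raw agreement for the agreement expected by chance, which is why 90% raw agreement in this example yields a value of about 0.8 rather than 0.9.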

Extração de dados de produtos em páginas de comércio eletrônico [Product data extraction from e-commerce pages]

Relevance: 60.00%

Abstract:

Graduate Program in Computer Science - IBILCE

The FORA Framework - A Fuzzy Grassroots Ontology for Online Reputation Management

Relevance: 60.00%

Abstract:

Online reputation management deals with monitoring and influencing the online record of a person, an organization or a product. The Social Web offers increasingly simple ways to publish and disseminate personal or opinionated information, which can rapidly have a disastrous influence on the online reputation of these entities. This dissertation can be split into three parts: In the first part, possible fuzzy clustering applications for the Social Semantic Web are investigated. The second part explores promising Social Semantic Web elements for organizational applications, while in the third part the former two parts are brought together and a fuzzy online reputation analysis framework is introduced and evaluated. The entire PhD thesis is based on literature reviews as well as on argumentative-deductive analyses. The possible applications of Social Semantic Web elements within organizations have been researched using a scenario and an additional case study together with two ancillary case studies, based on qualitative interviews. For the conception and implementation of the online reputation analysis application, a conceptual framework was developed. Employing test installations and prototyping, the essential parts of the framework have been implemented. By following a design science research approach, this PhD has created two artifacts: a framework and a prototype as proof of concept. Both artifacts hinge on two core elements: a (cluster analysis-based) translation of tags used in the Social Web to a computer-understandable fuzzy grassroots ontology for the Semantic Web, and a (Topic Maps-based) knowledge representation system, which facilitates a natural interaction with the fuzzy grassroots ontology. This is beneficial to the identification of unknown but essential Web data that could not be realized through conventional online reputation analysis.

The inherent structure of natural language supports humans not only in communication but also in the perception of the world. Fuzziness is a promising tool for transforming those human perceptions into computer artifacts. Through fuzzy grassroots ontologies, the Social Semantic Web becomes more natural and thus can streamline online reputation management.

FORA - A Fuzzy Set Based Framework for Online Reputation Management

Relevance: 60.00%

Abstract:

The Social Web offers increasingly simple ways to publish and disseminate personal or opinionated information, which can rapidly exhibit a disastrous influence on the online reputation of organizations. Based on social Web data, this study describes the building of an ontology based on fuzzy sets. At the end of a recurring harvesting of folksonomies by Web agents, the aggregated tags are purified, linked, and transformed to a so-called fuzzy grassroots ontology by means of a fuzzy clustering algorithm. This self-updating ontology is used for online reputation analysis, a crucial task of reputation management, with the goal of following the online conversation going on around an organization to discover and monitor its reputation. In addition, an application of the Fuzzy Online Reputation Analysis (FORA) framework, lessons learned, and potential extensions are discussed in this article.
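As an illustration of what a fuzzy clustering step produces, the sketch below computes graded cluster memberships for a single tag. The abstract does not name the algorithm used, so the standard fuzzy c-means membership formula is assumed here; the function name `memberships`, the fuzzifier default `m=2.0`, and the example distances are all hypothetical.

```python
def memberships(distances, m=2.0):
    """Fuzzy c-means membership degrees of one item, given its distance to each cluster centre.

    Uses u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)), where m > 1 is the fuzzifier.
    """
    if any(d == 0 for d in distances):
        # The item coincides with a centre: full membership there (shared if several).
        hits = [1.0 if d == 0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


# Hypothetical tag lying at distance 1.0 from one cluster centre and 3.0 from another.
degrees = memberships([1.0, 3.0])  # roughly 0.9 and 0.1
```

Unlike hard clustering, each tag belongs to every cluster to some degree and the degrees sum to 1, which is what lets an ambiguous tag stay linked to several concepts at once.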
