957 resultados para Web Science
Resumo:
A seminar for BNU students who have participated in the FutureLearn Web Science MOOC.
Resumo:
This talk will present an overview of the ongoing ERCIM project SMARTDOCS (SeMAntically-cReaTed DOCuments) which aims at automatically generating webpages from RDF data. It will particularly focus on the current issues and the investigated solutions in the different modules of the project, which are related to document planning, natural language generation and multimedia perspectives. The second part of the talk will be dedicated to the KODA annotation system, which is a knowledge-base-agnostic annotator designed to provide the RDF annotations required in the document generation process.
Resumo:
Nothing lasts forever. The World Wide Web was an essential part of life for much of humantiy in the early 21st century, but these days few people even remember that it existed. Members of the Web Science research group will present several possible scenarios for how the Web, as we know it, could cease to be. This will be followed by an open discussion about the future we want for the Web and what Web Science should be doing today to help make that future happen, or at least avoid some of the bad ones.
Resumo:
Abstract: There is a lot of hype around the Internet of Things along with talk about 100 billion devices within 10 years time. The promise of innovative new services and efficiency savings is fueling interest in a wide range of potential applications across many sectors including smart homes, healthcare, smart grids, smart cities, retail, and smart industry. However, the current reality is one of fragmentation and data silos. W3C is seeking to fix that by exposing IoT platforms through the Web with shared semantics and data formats as the basis for interoperability. This talk will address the abstractions needed to move from a Web of pages to a Web of things, and introduce the work that is being done on standards and on open source projects for a new breed of Web servers on microcontrollers to cloud based server farms. Speaker Biography -Dave Raggett : Dave has been involved at the heart of web standards since 1992, and part of the W3C Team since 1995. As well as working on standards, he likes to dabble with software, and more recently with IoT hardware. He has participated in a wide range of European research projects on behalf of W3C/ERCIM. He currently focuses on Web payments, and realising the potential for the Web of Things as an evolution from the Web of pages. Dave has a doctorate from the University of Oxford. He is a visiting professor at the University of the West of England, and lives in the UK in a small town near to Bath.
Resumo:
Slides to introduce political economy, and its relevance to the study of the Web. Brief review of methods and issues.
Resumo:
Abstract A frequent assumption in Social Media is that its open nature leads to a representative view of the world. In this talk we want to consider bias occurring in the Social Web. We will consider a case study of liquid feedback, a direct democracy platform of the German pirate party as well as models of (non-)discriminating systems. As a conclusion of this talk we stipulate the need of Social Media systems to bias their working according to social norms and to publish the bias they introduce. Speaker Biography: Prof Steffen Staab Steffen studied in Erlangen (Germany), Philadelphia (USA) and Freiburg (Germany) computer science and computational linguistics. Afterwards he worked as researcher at Uni. Stuttgart/Fraunhofer and Univ. Karlsruhe, before he became professor in Koblenz (Germany). Since March 2015 he also holds a chair for Web and Computer Science at Univ. of Southampton sharing his time between here and Koblenz. In his research career he has managed to avoid almost all good advice that he now gives to his team members. Such advise includes focusing on research (vs. company) or concentrating on only one or two research areas (vs. considering ontologies, semantic web, social web, data engineering, text mining, peer-to-peer, multimedia, HCI, services, software modelling and programming and some more). Though, actually, improving how we understand and use text and data is a good common denominator for a lot of Steffen's professional activities.
Resumo:
In this session we'll explore how Microsoft uses data science and machine learning across it's entire business, from Windows and Office, to Skype and XBox. We'll look at how companies across the world use Microsoft technology for empowering their businesses in many different industries. And we'll look at data science technologies you can use yourselves, such as Azure Machine Learning and Power BI. Finally we'll discuss job opportunities for data scientists and tips on how you can be successful!
Resumo:
This paper presents a Focused Crawler in order to Get Semantic Web Resources (CSR). Structured data web are available in formats such as Extensible Markup Language (XML), Resource Description Framework (RDF) and Ontology Web Language (OWL) that can be used for processing. One of the main challenges for performing a manual search and download semantic web resources is that this task consumes a lot of time. Our research work propose a focused crawler which allow to download these resources automatically and store them on disk in order to have a collection that will be used for data processing. CRS consists of three layers: (a) The User Interface Layer, (b) The Focus Crawler Layer and (c) The Base Crawler Layer. CSR uses as a selection policie the Shark-Search method. CSR was conducted with two experiments. The first one starts on December 15 2012 at 7:11 am and ends on December 16 2012 at 4:01 were obtained 448,123,537 bytes of data. The CSR ends by itself after to analyze 80,4375 seeds with an unlimited depth. CSR got 16,576 semantic resources files where the 89 % was RDF, the 10 % was XML and the 1% was OWL. The second one was based on the Web Data Commons work of the Research Group Data and Web Science at the University of Mannheim and the Institute AIFB at the Karlsruhe Institute of Technology. This began at 4:46 am of June 2 2013 and 1:37 am June 9 2013. After 162.51 hours of execution the result was 285,279 semantic resources where predominated the XML resources with 99 % and OWL and RDF with 1 % each one.
Resumo:
The main argument of this paper is that Natural Language Processing (NLP) does, and will continue to, underlie the Semantic Web (SW), including its initial construction from unstructured sources like the World Wide Web (WWW), whether its advocates realise this or not. Chiefly, we argue, such NLP activity is the only way up to a defensible notion of meaning at conceptual levels (in the original SW diagram) based on lower level empirical computations over usage. Our aim is definitely not to claim logic-bad, NLP-good in any simple-minded way, but to argue that the SW will be a fascinating interaction of these two methodologies, again like the WWW (which has been basically a field for statistical NLP research) but with deeper content. Only NLP technologies (and chiefly information extraction) will be able to provide the requisite RDF knowledge stores for the SW from existing unstructured text databases in the WWW, and in the vast quantities needed. There is no alternative at this point, since a wholly or mostly hand-crafted SW is also unthinkable, as is a SW built from scratch and without reference to the WWW. We also assume that, whatever the limitations on current SW representational power we have drawn attention to here, the SW will continue to grow in a distributed manner so as to serve the needs of scientists, even if it is not perfect. The WWW has already shown how an imperfect artefact can become indispensable.
Resumo:
The generation of heterogeneous big data sources with ever increasing volumes, velocities and veracities over the he last few years has inspired the data science and research community to address the challenge of extracting knowledge form big data. Such a wealth of generated data across the board can be intelligently exploited to advance our knowledge about our environment, public health, critical infrastructure and security. In recent years we have developed generic approaches to process such big data at multiple levels for advancing decision-support. It specifically concerns data processing with semantic harmonisation, low level fusion, analytics, knowledge modelling with high level fusion and reasoning. Such approaches will be introduced and presented in context of the TRIDEC project results on critical oil and gas industry drilling operations and also the ongoing large eVacuate project on critical crowd behaviour detection in confined spaces.
Resumo:
Provenance is a record that describes the people, institutions, entities, and activities, involved in producing, influencing, or delivering a piece of data or a thing in the world. Some 10 years after beginning research on the topic of provenance, I co-chaired the provenance working group at the World Wide Web Consortium. The working group published the PROV standard for provenance in 2013. In this talk, I will present some use cases for provenance, the PROV standard and some flagship examples of adoption. I will then move on to our current research area aiming to exploit provenance, in the context of the Sociam, SmartSociety, ORCHID projects. Doing so, I will present techniques to deal with large scale provenance, to build predictive models based on provenance, and to analyse provenance.
Resumo:
Abstract Heading into the 2020s, Physics and Astronomy are undergoing experimental revolutions that will reshape our picture of the fabric of the Universe. The Large Hadron Collider (LHC), the largest particle physics project in the world, produces 30 petabytes of data annually that need to be sifted through, analysed, and modelled. In astrophysics, the Large Synoptic Survey Telescope (LSST) will be taking a high-resolution image of the full sky every 3 days, leading to data rates of 30 terabytes per night over ten years. These experiments endeavour to answer the question why 96% of the content of the universe currently elude our physical understanding. Both the LHC and LSST share the 5-dimensional nature of their data, with position, energy and time being the fundamental axes. This talk will present an overview of the experiments and data that is gathered, and outlines the challenges in extracting information. Common strategies employed are very similar to industrial data! Science problems (e.g., data filtering, machine learning, statistical interpretation) and provide a seed for exchange of knowledge between academia and industry. Speaker Biography Professor Mark Sullivan Mark Sullivan is a Professor of Astrophysics in the Department of Physics and Astronomy. Mark completed his PhD at Cambridge, and following postdoctoral study in Durham, Toronto and Oxford, now leads a research group at Southampton studying dark energy using exploding stars called "type Ia supernovae". Mark has many years' experience of research that involves repeatedly imaging the night sky to track the arrival of transient objects, involving significant challenges in data handling, processing, classification and analysis.
Resumo:
Abstract This seminar consists of two very different research reports by PhD students in WAIS. Hypertext Engineering, Fettling or Tinkering (Mark Anderson): Contributors to a public hypertext such as Wikipedia do not necessarily record their maintenance activities, but some specific hypertext features - such transclusion - could indicate deliberate editing with a mind to the hypertext’s long-term use. The MediaWiki software used to create Wikipedia supports transclusion, a deliberately hypertextual form of content creation which aids long terms consistency. This discusses the evidence of the use of hypertext transclusion in Wikipedia, and its implications for the coherence and stability of Wikipedia. Designing a Public Intervention - Towards a Sociotechnical Approach to Web Governance (Faranak Hardcastle): In this talk I introduce a critical and speculative design for a socio-technical intervention -called TATE (Transparency and Accountability Tracking Extension)- that aims to enhance transparency and accountability in Online Behavioural Tracking and Advertising mechanisms and practices.
Resumo:
Instructor and student beliefs, attitudes and intentions toward contributing to local open courseware (OCW) sites have been investigated through campus-wide surveys at Universidad Politecnica de Valencia and the University of Michigan. In addition, at the University of Michigan, faculty have been queried about their participation in open access (OA) publishing. We compare the instructor and student data concerning OCW between the two institutions, and introduce the investigation of open access publishing in relation to open courseware publishing.
Resumo:
Social Networking tools like Facebook yield recognisable small world phenomena, that is particular kinds of social graphs that facilitate particular kinds of interaction and information exchange.