14 resultados para Data Mining, Big Data, Consumi energetici, Weka Data Cleaning

em University of Southampton, United Kingdom


Relevância:

80.00% 80.00%

Publicador:

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Relates to the following software for analysing Blackboard stats http://www.edshare.soton.ac.uk/11134/ Is supporting material for the following podcast: http://youtu.be/yHxCzjiYBoU

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Abstract: Big Data has been characterised as a great economic opportunity and a massive threat to privacy. Both may be correct: the same technology can indeed be used in ways that are highly beneficial and those that are ethically intolerable, maybe even simultaneously. Using examples of how Big Data might be used in education - normally referred to as "learning analytics" - the seminar will discuss possible ethical and legal frameworks for Big Data, and how these might guide the development of technologies, processes and policies that can deliver the benefits of Big Data without the nightmares. Speaker Biography: Andrew Cormack is Chief Regulatory Adviser, Jisc Technologies. He joined the company in 1999 as head of the JANET-CERT and EuroCERT incident response teams. In his current role he concentrates on the security, policy and regulatory issues around the network and services that Janet provides to its customer universities and colleges. Previously he worked for Cardiff University running web and email services, and for NERC's Shipboard Computer Group. He has degrees in Mathematics, Humanities and Law.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Abstract Big data nowadays is a fashionable topic, independently of what people mean when they use this term. But being big is just a matter of volume, although there is no clear agreement in the size threshold. On the other hand, it is easy to capture large amounts of data using a brute force approach. So the real goal should not be big data but to ask ourselves, for a given problem, what is the right data and how much of it is needed. For some problems this would imply big data, but for the majority of the problems much less data will and is needed. In this talk we explore the trade-offs involved and the main problems that come with big data using the Web as case study: scalability, redundancy, bias, noise, spam, and privacy. Speaker Biography Ricardo Baeza-Yates Ricardo Baeza-Yates is VP of Research for Yahoo Labs leading teams in United States, Europe and Latin America since 2006 and based in Sunnyvale, California, since August 2014. During this time he has lead the labs in Barcelona and Santiago de Chile. Between 2008 and 2012 he also oversaw the Haifa lab. He is also part time Professor at the Dept. of Information and Communication Technologies of the Universitat Pompeu Fabra, in Barcelona, Spain. During 2005 he was an ICREA research professor at the same university. Until 2004 he was Professor and before founder and Director of the Center for Web Research at the Dept. of Computing Science of the University of Chile (in leave of absence until today). He obtained a Ph.D. in CS from the University of Waterloo, Canada, in 1989. Before he obtained two masters (M.Sc. CS & M.Eng. EE) and the electronics engineer degree from the University of Chile in Santiago. He is co-author of the best-seller Modern Information Retrieval textbook, published in 1999 by Addison-Wesley with a second enlarged edition in 2011, that won the ASIST 2012 Book of the Year award. He is also co-author of the 2nd edition of the Handbook of Algorithms and Data Structures, Addison-Wesley, 1991; and co-editor of Information Retrieval: Algorithms and Data Structures, Prentice-Hall, 1992, among more than 500 other publications. From 2002 to 2004 he was elected to the board of governors of the IEEE Computer Society and in 2012 he was elected for the ACM Council. He has received the Organization of American States award for young researchers in exact sciences (1993), the Graham Medal for innovation in computing given by the University of Waterloo to distinguished ex-alumni (2007), the CLEI Latin American distinction for contributions to CS in the region (2009), and the National Award of the Chilean Association of Engineers (2010), among other distinctions. In 2003 he was the first computer scientist to be elected to the Chilean Academy of Sciences and since 2010 is a founding member of the Chilean Academy of Engineering. In 2009 he was named ACM Fellow and in 2011 IEEE Fellow.

Relevância:

80.00% 80.00%

Publicador:

Relevância:

80.00% 80.00%

Publicador:

Resumo:

An emerging consensus in cognitive science views the biological brain as a hierarchically-organized predictive processing system. This is a system in which higher-order regions are continuously attempting to predict the activity of lower-order regions at a variety of (increasingly abstract) spatial and temporal scales. The brain is thus revealed as a hierarchical prediction machine that is constantly engaged in the effort to predict the flow of information originating from the sensory surfaces. Such a view seems to afford a great deal of explanatory leverage when it comes to a broad swathe of seemingly disparate psychological phenomena (e.g., learning, memory, perception, action, emotion, planning, reason, imagination, and conscious experience). In the most positive case, the predictive processing story seems to provide our first glimpse at what a unified (computationally-tractable and neurobiological plausible) account of human psychology might look like. This obviously marks out one reason why such models should be the focus of current empirical and theoretical attention. Another reason, however, is rooted in the potential of such models to advance the current state-of-the-art in machine intelligence and machine learning. Interestingly, the vision of the brain as a hierarchical prediction machine is one that establishes contact with work that goes under the heading of 'deep learning'. Deep learning systems thus often attempt to make use of predictive processing schemes and (increasingly abstract) generative models as a means of supporting the analysis of large data sets. But are such computational systems sufficient (by themselves) to provide a route to general human-level analytic capabilities? I will argue that they are not and that closer attention to a broader range of forces and factors (many of which are not confined to the neural realm) may be required to understand what it is that gives human cognition its distinctive (and largely unique) flavour. The vision that emerges is one of 'homomimetic deep learning systems', systems that situate a hierarchically-organized predictive processing core within a larger nexus of developmental, behavioural, symbolic, technological and social influences. Relative to that vision, I suggest that we should see the Web as a form of 'cognitive ecology', one that is as much involved with the transformation of machine intelligence as it is with the progressive reshaping of our own cognitive capabilities.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The generation of heterogeneous big data sources with ever increasing volumes, velocities and veracities over the he last few years has inspired the data science and research community to address the challenge of extracting knowledge form big data. Such a wealth of generated data across the board can be intelligently exploited to advance our knowledge about our environment, public health, critical infrastructure and security. In recent years we have developed generic approaches to process such big data at multiple levels for advancing decision-support. It specifically concerns data processing with semantic harmonisation, low level fusion, analytics, knowledge modelling with high level fusion and reasoning. Such approaches will be introduced and presented in context of the TRIDEC project results on critical oil and gas industry drilling operations and also the ongoing large eVacuate project on critical crowd behaviour detection in confined spaces.

Relevância:

80.00% 80.00%

Publicador:

Relevância:

70.00% 70.00%

Publicador:

Resumo:

This is a research discussion about the Hampshire Hub - see http://protohub.net/. The aim is to find out more about the project, and discuss future collaboration and sharing of ideas. Mark Braggins (Hampshire Hub Partnership) will introduce the Hampshire Hub programme, setting out its main objectives, work done to-date, next steps including the Hampshire data store (which will use the PublishMyData linked data platform), and opportunities for University of Southampton to engage with the programme , including the forthcoming Hampshire Hackathons Bill Roberts (Swirrl) will give an overview of the PublishMyData platform, and how it will help deliver the objectives of the Hampshire Hub. He will detail some of the new functionality being added to the platform Steve Peters (DCLG Open Data Communities) will focus on developing a web of data that blends and combines local and national data sources around localities, and common topics/themes. This will include observations on the potential employing emerging new, big data sources to help deliver more effective, better targeted public services. Steve will illustrate this with practical examples of DCLG’s work to publish its own data in a SPARQL end-point, so that it can be used over the web alongside related 3rd party sources. He will share examples of some of the practical challenges, particularly around querying and re-using geographic LinkedData in a federated world of SPARQL end-point.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Abstract This seminar is a research discussion around a very interesting problem, which may be a good basis for a WAISfest theme. A little over a year ago Professor Alan Dix came to tell us of his plans for a magnificent adventure:to walk all of the way round Wales - 1000 miles 'Alan Walks Wales'. The walk was a personal journey, but also a technological and community one, exploring the needs of the walker and the people along the way. Whilst walking he recorded his thoughts in an audio diary, took lots of photos, wrote a blog and collected data from the tech instruments he was wearing. As a result Alan has extensive quantitative data (bio-sensing and location) and qualitative data (text, images and some audio). There are challenges in analysing individual kinds of data, including merging similar data streams, entity identification, time-series and textual data mining, dealing with provenance, ontologies for paths, and journeys. There are also challenges for author and third-party annotation, linking the data-sets and visualising the merged narrative or facets of it.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Speaker: Dr Kieron O'Hara Organiser: Time: 04/02/2015 11:00-11:45 Location: B32/3077 Abstract In order to reap the potential societal benefits of big and broad data, it is essential to share and link personal data. However, privacy and data protection considerations mean that, to be shared, personal data must be anonymised, so that the data subject cannot be identified from the data. Anonymisation is therefore a vital tool for data sharing, but deanonymisation, or reidentification, is always possible given sufficient auxiliary information (and as the amount of data grows, both in terms of creation, and in terms of availability in the public domain, the probability of finding such auxiliary information grows). This creates issues for the management of anonymisation, which are exacerbated not only by uncertainties about the future, but also by misunderstandings about the process(es) of anonymisation. This talk discusses these issues in relation to privacy, risk management and security, reports on recent theoretical tools created by the UKAN network of statistics professionals (on which the author is one of the leads), and asks how long anonymisation can remain a useful tool, and what might replace it.

Relevância:

70.00% 70.00%

Publicador:

Resumo:

Abstract: Decision support systems have been widely used for years in companies to gain insights from internal data, thus making successful decisions. Lately, thanks to the increasing availability of open data, these systems are also integrating open data to enrich decision making process with external data. On the other hand, within an open-data scenario, decision support systems can be also useful to decide which data should be opened, not only by considering technical or legal constraints, but other requirements, such as "reusing potential" of data. In this talk, we focus on both issues: (i) open data for decision making, and (ii) decision making for opening data. We will first briefly comment some research problems regarding using open data for decision making. Then, we will give an outline of a novel decision-making approach (based on how open data is being actually used in open-source projects hosted in Github) for supporting open data publication. Bio of the speaker: Jose-Norberto Mazón holds a PhD from the University of Alicante (Spain). He is head of the "Cátedra Telefónica" on Big Data and coordinator of the Computing degree at the University of Alicante. He is also member of the WaKe research group at the University of Alicante. His research work focuses on open data management, data integration and business intelligence within "big data" scenarios, and their application to the tourism domain (smart tourism destinations). He has published his research in international journals, such as Decision Support Systems, Information Sciences, Data & Knowledge Engineering or ACM Transaction on the Web. Finally, he is involved in the open data project in the University of Alicante, including its open data portal at http://datos.ua.es