847 results for: Data Mining, Big Data, Consumi energetici, Weka Data Cleaning


Relevance: 80.00%

Abstract:

This thesis investigates the legal, ethical, technical, and psychological issues of general data processing and artificial intelligence practices, and the explainability of AI systems. It consists of two main parts. In the first part, we provide a comprehensive overview of the big data processing ecosystem and the main challenges we face today. We then evaluate the GDPR's data privacy framework in the European Union. The Trustworthy AI Framework proposed by the EU's High-Level Expert Group on AI (AI HLEG) is examined in detail. The ethical principles for the foundation and realization of Trustworthy AI are analyzed along with the assessment list prepared by the AI HLEG. Then, we list the main big data challenges identified by European researchers and institutions and provide a literature review on the technical and organizational measures to address them. A quantitative analysis is conducted on the identified big data challenges and the corresponding measures, which leads to practical recommendations for better data processing and AI practices in the EU. In the second part, we concentrate on the explainability of AI systems. We clarify the terminology and list the goals pursued through the explainability of AI systems. We identify the reasons for the explainability-accuracy trade-off and ways to address it. We conduct a comparative cognitive analysis between human reasoning and machine-generated explanations, with the aim of understanding how explainable AI can contribute to human reasoning. We then focus on the technical and legal responses to remedy the explainability problem. Here, the GDPR's right-to-explanation framework and safeguards are analyzed in depth, along with their contribution to the realization of Trustworthy AI. Finally, we analyze the explanation techniques applicable at different stages of machine learning and propose several recommendations, in chronological order, for developing GDPR-compliant and Trustworthy XAI systems.
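
As one concrete instance of the post-hoc explanation techniques the thesis surveys, the sketch below computes permutation feature importance for a black-box classifier; the model, synthetic data, and library choice are our illustrative assumptions, not techniques singled out by the thesis.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Hypothetical tabular task standing in for a real data processing pipeline
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Post-hoc, model-agnostic explanation: how much does shuffling each
# feature degrade held-out accuracy?
result = permutation_importance(model, X_test, y_test, n_repeats=10,
                                random_state=0)
for i, score in enumerate(result.importances_mean):
    print(f"feature {i}: {score:.3f}")
```

Because the explanation only queries the trained model, the same procedure applies after training regardless of the model family, which is one reason post-hoc methods are attractive when accuracy would suffer from an inherently interpretable model.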

Relevance: 80.00%

Abstract:

Big data and AI are paving the way to promising scenarios in clinical practice and research. However, the use of such technologies might clash with GDPR requirements. Today, two forces drive EU policy in this domain. The first is the necessity to protect individuals' safety and fundamental rights; the second is to incentivize the deployment of innovative technologies. The first objective is pursued by legislative acts such as the GDPR or the AIA; the second is supported by the new data strategy recently launched by the European Commission. Against this background, the thesis analyses the issue of GDPR compliance when big data and AI systems are implemented in the health domain, focusing on the use of co-regulatory tools for compliance with the GDPR. This work argues that there are two levels of co-regulation in the EU legal system. The first, more general, is the approach pursued by the EU legislator when shaping legislative measures that deal with fast-evolving technologies. The GDPR can be deemed a co-regulatory solution since it mainly introduces general requirements, whose implementation must then be interpreted by the addressees of the law following a risk-based approach. This approach, although useful, is costly and sometimes burdensome for organisations. The second co-regulatory level is represented by specific co-regulatory tools, such as codes of conduct and certification mechanisms, which are meant to guide and support the interpretive effort of the addressees of the law. The thesis argues that the lack of co-regulatory tools implementing data protection law in specific situations could be an obstacle to the deployment of innovative solutions in complex scenarios such as the health ecosystem, and it advances theoretical hypotheses about the reasons for this lack of co-regulatory solutions.

Relevance: 80.00%

Abstract:

The topic of this thesis arises from the idea of combining two themes that are becoming increasingly important today, namely the circular economy and big data, and its objective is to provide points of connection between the two. In today's technological world, which is turning everything we hold in our hands into digital form, more and more studies are being carried out to understand how sustainability can be supported by emerging technologies. The circular economy constitutes a new economic paradigm capable of replacing growth models centered on a linear vision, aiming at a reduction of waste and a radical rethinking of how products are conceived and used over time. In this transition towards a circular economy, it may be useful to consider adopting emerging technologies to simplify production processes and implement more sustainable policies, which are becoming increasingly appreciated by consumers as well. All of this will be supported by the ever more significant use of big data, that is, large volumes of information-rich data which, through careful analysis, make it possible to develop production plans that follow the circular paradigm: this is achieved thanks to increasingly innovative digital systems and to specialized professionals who are acquiring ever more knowledge in this field.

Relevance: 70.00%

Abstract:

Background: The inherent complexity of statistical methods and clinical phenomena compels researchers with diverse domains of expertise to work in interdisciplinary teams, where none has complete knowledge of the counterpart's field. As a result, knowledge exchange may often be characterized by miscommunication leading to misinterpretation, ultimately resulting in errors in research and even in clinical practice. Although communication has a central role in interdisciplinary collaboration and miscommunication can have a negative impact on research processes, to the best of our knowledge no study has yet explored how data analysis specialists and clinical researchers communicate over time. Methods/Principal Findings: We conducted a qualitative analysis of encounters between clinical researchers and data analysis specialists (an epidemiologist, a clinical epidemiologist, and a data mining specialist). These encounters were recorded and systematically analyzed using a grounded theory methodology to extract emerging themes, followed by data triangulation and analysis of negative cases for validation. A policy analysis was then performed using a system dynamics methodology, looking for potential interventions to improve this process. Four major emerging themes were found. Definitions using lay language were frequently employed as a way to bridge the language gap between the specialties. Thought experiments presented a series of "what if" situations that helped clarify how the method or information from the other field would behave if exposed to alternative situations, ultimately aiding in explaining its main objective. Metaphors and analogies were used to translate concepts across fields, from the unfamiliar to the familiar. Prolepsis was used to anticipate study outcomes, thus helping specialists understand the current context based on an understanding of their final goal. Conclusion/Significance: The communication between clinical researchers and data analysis specialists presents multiple challenges that can lead to errors.

Relevance: 70.00%

Abstract:

Hot tensile and creep tests were carried out on Kanthal A1 alloy in the temperature range from 600 to 800 °C. Each of these data sets was analyzed separately according to its own methodology, but an attempt was made to find a correlation between them. A new criterion proposed for converting hot tensile data to creep data makes it possible to analyze the two kinds of results according to the usual creep relations, such as Norton, Monkman-Grant, Larson-Miller, and others. The remarkable compatibility verified between both data sets by this procedure strongly suggests that hot tensile data can be converted to creep data and vice versa for Kanthal A1 alloy, as previously verified for other metallic materials.
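
For reference, the creep relations named above have the following standard textbook forms (general expressions, not the specific fits reported for Kanthal A1):

```latex
% Norton power law: minimum creep rate as a function of applied stress
\dot{\varepsilon}_{\min} = A\,\sigma^{n}

% Monkman-Grant relation: minimum creep rate versus time to rupture
\dot{\varepsilon}_{\min}\, t_r = C_{\mathrm{MG}}

% Larson-Miller parameter: temperature-time equivalence for rupture life
P_{\mathrm{LM}} = T\,(C + \log_{10} t_r)
```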

Relevance: 70.00%

Abstract:

The central issue for pillar design in underground coal mining is the in situ uniaxial compressive strength (σ_cm). The paper proposes a new method for estimating the in situ uniaxial compressive strength of coal seams based on laboratory strength and P-wave propagation velocity. It describes the collection of samples in the Bonito coal seam, Fontanella Mine, southern Brazil; the techniques used for the structural mapping of the coal seam and the determination of seismic wave propagation velocity; and the laboratory procedures used to determine strength and ultrasonic wave velocity. The results obtained using the new methodology are compared with those from seven other techniques for estimating the in situ uniaxial compressive strength of rock masses.
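
The abstract does not give the paper's actual formula, but one common way to relate laboratory strength to in situ rock mass strength is to scale the laboratory value by the ratio of field to laboratory P-wave velocities. The sketch below is an illustrative assumption of that general approach only; the function name, exponent k, and all example values are hypothetical.

```python
def estimate_in_situ_ucs(sigma_lab_mpa: float,
                         vp_field_ms: float,
                         vp_lab_ms: float,
                         k: float = 2.0) -> float:
    """Illustrative velocity-ratio scaling of laboratory UCS to an
    in situ estimate: sigma_cm = sigma_c * (Vp_field / Vp_lab)**k.
    The exponent k is an assumed calibration parameter, not a value
    taken from the paper."""
    return sigma_lab_mpa * (vp_field_ms / vp_lab_ms) ** k

# Hypothetical values: 25 MPa lab strength, slower field velocity
print(estimate_in_situ_ucs(25.0, 1800.0, 2200.0))  # ~16.7 MPa
```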

Relevance: 70.00%

Abstract:

Geospatial clustering must be designed so that it takes into account the special features of geoinformation and the peculiar nature of geographical environments in order to successfully derive geospatially interesting global concentrations and localized excesses. This paper examines families of geospatial clustering methods recently proposed in the data mining community and identifies several features and issues especially important to geospatial clustering in data-rich environments.
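
As an illustration of clustering that respects one of those special features (great-circle rather than Euclidean distance), the sketch below runs density-based clustering with the haversine metric; DBSCAN, its parameters, and the sample coordinates are our illustrative choices, not methods taken from the paper.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical (lat, lon) points in degrees
points_deg = np.array([
    [41.15, -8.61], [41.16, -8.62], [41.14, -8.60],  # dense group
    [38.72, -9.14], [38.73, -9.15],                  # second group
    [40.00, -3.70],                                  # isolated point
])

# The haversine metric expects radians; eps is an angular distance, so
# divide a ground distance (here ~50 km) by the Earth's radius (~6371 km).
eps_rad = 50.0 / 6371.0
labels = DBSCAN(eps=eps_rad, min_samples=2, metric="haversine").fit_predict(
    np.radians(points_deg)
)
print(labels)  # e.g. [0 0 0 1 1 -1]; -1 marks noise
```

Density-based methods of this kind are a natural fit for finding "localized excesses" because they report dense concentrations directly and leave isolated points unassigned instead of forcing them into a cluster.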

Relevance: 70.00%

Abstract:

The Eysenck Personality Questionnaire-Revised (EPQ-R), the Eysenck Personality Profiler Short Version (EPP-S), and the Big Five Inventory (BFI-V4a) were administered to 135 postgraduate students of business in Pakistan. While the Extraversion and Neuroticism scales from the three questionnaires were highly correlated, it was found that Agreeableness was most highly correlated with Psychoticism in the EPQ-R and Conscientiousness was most highly correlated with Psychoticism in the EPP-S. Principal component analyses with varimax rotation were carried out. The analyses generally suggested that the five-factor model, rather than the three-factor model, was more robust and better for the interpretation of all the higher-order scales of the EPQ-R, EPP-S, and BFI-V4a in the Pakistani data. The results show that the superiority of the five-factor solution stems from the inclusion of a broader variety of personality scales in the input data, whereas Eysenck's three-factor solution seems best when a less complete but possibly more important set of variables is input.
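
For readers unfamiliar with the rotation step, the sketch below shows a minimal principal component analysis followed by a standard varimax rotation of the loadings in NumPy; the respondent count matches the study, but the items and data are entirely hypothetical.

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Rotate a p x k loading matrix to maximize the varimax criterion."""
    p, k = loadings.shape
    rotation = np.eye(k)
    var = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated ** 3
                          - (gamma / p) * rotated
                          @ np.diag((rotated ** 2).sum(axis=0)))
        )
        rotation = u @ vt
        new_var = s.sum()
        if new_var < var * (1.0 + tol):  # stop when the criterion plateaus
            break
        var = new_var
    return loadings @ rotation

# Hypothetical questionnaire data: 135 respondents, 10 standardized items
rng = np.random.default_rng(0)
items = rng.standard_normal((135, 10))
items = (items - items.mean(axis=0)) / items.std(axis=0)

# PCA via eigendecomposition of the correlation matrix; keep 5 components
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(items, rowvar=False))
order = np.argsort(eigvals)[::-1][:5]
loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
rotated_loadings = varimax(loadings)
print(rotated_loadings.shape)  # (10, 5)
```

Varimax keeps the components orthogonal while pushing each item's loading toward a single component, which is what makes the rotated factors easier to label and interpret.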

Relevance: 70.00%

Abstract:

3rd SMTDA Conference Proceedings, 11-14 June 2014, Lisbon, Portugal.

Relevance: 70.00%

Abstract:

Data analytic applications are characterized by large data sets that are subject to a series of processing phases. Some of these phases are executed sequentially, but others can be executed concurrently or in parallel on clusters, grids, or clouds. The MapReduce programming model has been applied to process large data sets in cluster and cloud environments. To develop an application using MapReduce, one needs to install, configure, or access specific frameworks such as Apache Hadoop or Elastic MapReduce in the Amazon cloud, and it would be desirable to have more flexibility in adjusting such configurations according to the application's characteristics. Furthermore, composing the multiple phases of a data analytic application requires the specification of all the phases and their orchestration; the original MapReduce model and environment lack flexible support for such configuration and composition. Recognizing that scientific workflows have been successfully applied to modeling complex applications, this paper describes our experiments on implementing MapReduce as subworkflows in the AWARD framework (Autonomic Workflow Activities Reconfigurable and Dynamic). A text mining data analytic application is modeled as a complex workflow with multiple phases, where individual workflow nodes support MapReduce computations. As in typical MapReduce environments, the end user only needs to define the application algorithms for input data processing and for the map and reduce functions. We present experimental results from using the AWARD framework to execute MapReduce workflows deployed over multiple Amazon EC2 (Elastic Compute Cloud) instances.
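
To make the division of labor concrete, the sketch below shows the kind of map and reduce functions an end user would define for a text mining task such as word counting; the function names and the plain-Python driver are illustrative, not the AWARD framework's actual API.

```python
from collections import defaultdict

def map_fn(document: str):
    """Emit (word, 1) pairs for every token in one input document."""
    for word in document.lower().split():
        yield word, 1

def reduce_fn(word: str, counts):
    """Combine all partial counts for one word."""
    return word, sum(counts)

def run_mapreduce(documents):
    """Minimal sequential driver standing in for the execution engine."""
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)          # shuffle/group by key
    return dict(reduce_fn(k, v) for k, v in groups.items())

print(run_mapreduce(["big data big clusters", "data mining"]))
# {'big': 2, 'data': 2, 'clusters': 1, 'mining': 1}
```

In a real deployment the driver's role is played by the execution environment (workflow nodes in AWARD, or Hadoop's runtime), which distributes the map and reduce calls across machines; only the two user functions stay the same.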

Relevance: 70.00%

Abstract:

Dissertation presented as a partial requirement for the Master's degree in Statistics and Information Management (Estatística e Gestão da Informação)

Relevance: 70.00%

Abstract:

This document presents a tool that automatically gathers data from real energy markets and generates scenarios, capturing and improving market players' profiles and strategies through knowledge discovery in databases supported by artificial intelligence techniques, data mining algorithms, and machine learning methods. It provides the means for generating scenarios with different dimensions and characteristics, ensuring the representation of real and adapted markets and their participating entities. The scenario generator module enhances the MASCEM (Multi-Agent Simulator of Competitive Electricity Markets) simulator, providing a more effective tool for decision support. The implementation of the proposed module enables researchers and electricity market entities to analyze data, create realistic scenarios, and experiment with them. Moreover, applying knowledge discovery techniques to real data also allows the improvement of MASCEM agents' profiles and strategies, resulting in a better representation of real market players' behavior. This work aims to improve the understanding of electricity markets and the interactions among the involved entities through adequate multi-agent simulation.
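
As a hedged illustration of the kind of data mining step described (deriving player profiles from historical market data), the sketch below clusters hypothetical hourly bid prices with k-means; the synthetic data, feature choice, and cluster count are assumptions, not MASCEM's actual procedure.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical history: 200 days x 24 hourly bid prices for one player
rng = np.random.default_rng(42)
base = 50 + 10 * np.sin(np.linspace(0, 2 * np.pi, 24))   # daily price shape
bids = base + rng.normal(0, 3, size=(200, 24))

# Group similar daily bidding patterns into 3 candidate profiles
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(bids)
profiles = kmeans.cluster_centers_                         # 3 x 24 profiles
print(profiles.shape, np.bincount(kmeans.labels_))
```

Cluster centers of this kind could then seed simulated agents' bidding strategies, so that the simulated market reproduces the behavioral variety observed in the real one.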