896 resultados para computation- and data-intensive applications
Resumo:
Ensemble Stream Modeling and Data-cleaning are sensor information processing systems have different training and testing methods by which their goals are cross-validated. This research examines a mechanism, which seeks to extract novel patterns by generating ensembles from data. The main goal of label-less stream processing is to process the sensed events to eliminate the noises that are uncorrelated, and choose the most likely model without over fitting thus obtaining higher model confidence. Higher quality streams can be realized by combining many short streams into an ensemble which has the desired quality. The framework for the investigation is an existing data mining tool. First, to accommodate feature extraction such as a bush or natural forest-fire event we make an assumption of the burnt area (BA*), sensed ground truth as our target variable obtained from logs. Even though this is an obvious model choice the results are disappointing. The reasons for this are two: One, the histogram of fire activity is highly skewed. Two, the measured sensor parameters are highly correlated. Since using non descriptive features does not yield good results, we resort to temporal features. By doing so we carefully eliminate the averaging effects; the resulting histogram is more satisfactory and conceptual knowledge is learned from sensor streams. Second is the process of feature induction by cross-validating attributes with single or multi-target variables to minimize training error. We use F-measure score, which combines precision and accuracy to determine the false alarm rate of fire events. The multi-target data-cleaning trees use information purity of the target leaf-nodes to learn higher order features. A sensitive variance measure such as f-test is performed during each node’s split to select the best attribute. Ensemble stream model approach proved to improve when using complicated features with a simpler tree classifier. The ensemble framework for data-cleaning and the enhancements to quantify quality of fitness (30% spatial, 10% temporal, and 90% mobility reduction) of sensor led to the formation of streams for sensor-enabled applications. Which further motivates the novelty of stream quality labeling and its importance in solving vast amounts of real-time mobile streams generated today.
Resumo:
The literature clearly links the quality and capacity of a country’s infrastructure to its economic growth and competitiveness. This thesis analyses the historic national and spatial distribution of investment by the Irish state in its physical networks (water, wastewater and roads) across the 34 local authorities and examines how Ireland is perceived internationally relative to its economic counterparts. An appraisal of the current status and shortcomings of Ireland’s infrastructure is undertaken using key stakeholders from foreign direct investment companies and national policymakers to identify Ireland's infrastructural gaps, along with current challenges in how the country is delivering infrastructure. The output of these interviews identified many issues with how infrastructure decision-making is currently undertaken. This led to an evaluation of how other countries are informing decision-making, and thus this thesis presents a framework of how and why Ireland should embrace a Systems of Systems (SoS) methodology approach to infrastructure decision-making going forward. In undertaking this study a number of other infrastructure challenges were identified: significant political interference in infrastructure decision-making and delivery the need for a national agency to remove the existing ‘silo’ type of mentality to infrastructure delivery how tax incentives can interfere with the market; and their significance. The two key infrastructure gaps identified during the interview process were: the need for government intervention in the rollout of sufficient communication capacity and at a competitive cost outside of Dublin; and the urgent need to address water quality and capacity with approximately 25% of the population currently being served by water of unacceptable quality. Despite considerable investment in its national infrastructure, Ireland’s infrastructure performance continues to trail behind its economic partners in the Eurozone and OECD. Ireland is projected to have the highest growth rate in the euro zone region in 2015 and 2016, albeit that it required a bailout in 2010, and, at the time of writing, is beginning to invest in its infrastructure networks again. This thesis proposes the development and implementation of a SoS approach for infrastructure decision-making which would be based on: existing spatial and capacity data of each of the constituent infrastructure networks; and scenario computation and analysis of alternative drivers eg. Demographic change, economic variability and demand/capacity constraints. The output from such an analysis would provide valuable evidence upon which policy makers and decision makers alike could rely, which has been lacking in historic investment decisions.
Resumo:
Healthcare systems have assimilated information and communication technologies in order to improve the quality of healthcare and patient's experience at reduced costs. The increasing digitalization of people's health information raises however new threats regarding information security and privacy. Accidental or deliberate data breaches of health data may lead to societal pressures, embarrassment and discrimination. Information security and privacy are paramount to achieve high quality healthcare services, and further, to not harm individuals when providing care. With that in mind, we give special attention to the category of Mobile Health (mHealth) systems. That is, the use of mobile devices (e.g., mobile phones, sensors, PDAs) to support medical and public health. Such systems, have been particularly successful in developing countries, taking advantage of the flourishing mobile market and the need to expand the coverage of primary healthcare programs. Many mHealth initiatives, however, fail to address security and privacy issues. This, coupled with the lack of specific legislation for privacy and data protection in these countries, increases the risk of harm to individuals. The overall objective of this thesis is to enhance knowledge regarding the design of security and privacy technologies for mHealth systems. In particular, we deal with mHealth Data Collection Systems (MDCSs), which consists of mobile devices for collecting and reporting health-related data, replacing paper-based approaches for health surveys and surveillance. This thesis consists of publications contributing to mHealth security and privacy in various ways: with a comprehensive literature review about mHealth in Brazil; with the design of a security framework for MDCSs (SecourHealth); with the design of a MDCS (GeoHealth); with the design of Privacy Impact Assessment template for MDCSs; and with the study of ontology-based obfuscation and anonymisation functions for health data.
Resumo:
The International Seabed Authority (ISA) regulates the activities related with the exploration and exploitation of seabed mineral resources in the Area, which are considered as the "common heritage of mankind" under the United Nations Convention on the Law of the Sea.The ISA has also the mandate to ensure the protection of the marine environment.The development of good practices for the annual reporting and data submission by Contractors is crucial for the ISA to comply with the sustainable development of the mineral marine resources. In 2015,the ISA issued a new template for reporting on exploration activities, which includes the definition of the format for all geophysical, geological and environmental data to be collected and analysed during exploration. The availability of reliable data contributes to improve the assessment of the ISA on the activities in the Area while promoting transparency, which is considered as a major principle of industry bestpractices.
Resumo:
The semiarid region of northeastern Brazil, the Caatinga, is extremely important due to its biodiversity and endemism. Measurements of plant physiology are crucial to the calibration of Dynamic Global Vegetation Models (DGVMs) that are currently used to simulate the responses of vegetation in face of global changes. In a field work realized in an area of preserved Caatinga forest located in Petrolina, Pernambuco, measurements of carbon assimilation (in response to light and CO2) were performed on 11 individuals of Poincianella microphylla, a native species that is abundant in this region. These data were used to calibrate the maximum carboxylation velocity (Vcmax) used in the INLAND model. The calibration techniques used were Multiple Linear Regression (MLR), and data mining techniques as the Classification And Regression Tree (CART) and K-MEANS. The results were compared to the UNCALIBRATED model. It was found that simulated Gross Primary Productivity (GPP) reached 72% of observed GPP when using the calibrated Vcmax values, whereas the UNCALIBRATED approach accounted for 42% of observed GPP. Thus, this work shows the benefits of calibrating DGVMs using field ecophysiological measurements, especially in areas where field data is scarce or non-existent, such as in the Caatinga
Resumo:
This thesis aims to present the ORC technology, its advantages and related problems. In particular, it provides an analysis of ORC waste heat recovery system in different and innovative scenarios, focusing on cases from the biggest to the lowest scale. Both industrial and residential ORC applications are considered. In both applications, the installation of a subcritical and recuperated ORC system is examined. Moreover, heat recovery is considered in absence of an intermediate heat transfer circuit. This solution allow to improve the recovery efficiency, but requiring safety precautions. Possible integrations of ORC systems with renewable sources are also presented and investigated to improve the non-programmable source exploitation. In particular, the offshore oil and gas sector has been selected as a promising industrial large-scale ORC application. From the design of ORC systems coupled with Gas Turbines (GTs) as topper systems, the dynamic behavior of the GT+ORC innovative combined cycles has been analyzed by developing a dynamic model of all the considered components. The dynamic behavior is caused by integration with a wind farm. The electric and thermal aspects have been examined to identify the advantages related to the waste heat recovery system installation. Moreover, an experimental test rig has been realized to test the performance of a micro-scale ORC prototype. The prototype recovers heat from a low temperature water stream, available for instance in industrial or residential waste heat. In the test bench, various sensors have been installed, an acquisitions system developed in Labview environment to completely analyze the ORC behavior. Data collected in real time and corresponding to the system dynamic behavior have been used to evaluate the system performance based on selected indexes. Moreover, various operational steady-state conditions are identified and operation maps are realized for a completely characterization of the system and to detect the optimal operating conditions.
Resumo:
Inverse problems are at the core of many challenging applications. Variational and learning models provide estimated solutions of inverse problems as the outcome of specific reconstruction maps. In the variational approach, the result of the reconstruction map is the solution of a regularized minimization problem encoding information on the acquisition process and prior knowledge on the solution. In the learning approach, the reconstruction map is a parametric function whose parameters are identified by solving a minimization problem depending on a large set of data. In this thesis, we go beyond this apparent dichotomy between variational and learning models and we show they can be harmoniously merged in unified hybrid frameworks preserving their main advantages. We develop several highly efficient methods based on both these model-driven and data-driven strategies, for which we provide a detailed convergence analysis. The arising algorithms are applied to solve inverse problems involving images and time series. For each task, we show the proposed schemes improve the performances of many other existing methods in terms of both computational burden and quality of the solution. In the first part, we focus on gradient-based regularized variational models which are shown to be effective for segmentation purposes and thermal and medical image enhancement. We consider gradient sparsity-promoting regularized models for which we develop different strategies to estimate the regularization strength. Furthermore, we introduce a novel gradient-based Plug-and-Play convergent scheme considering a deep learning based denoiser trained on the gradient domain. In the second part, we address the tasks of natural image deblurring, image and video super resolution microscopy and positioning time series prediction, through deep learning based methods. We boost the performances of supervised, such as trained convolutional and recurrent networks, and unsupervised deep learning strategies, such as Deep Image Prior, by penalizing the losses with handcrafted regularization terms.
Resumo:
The internet and digital technologies revolutionized the economy. Regulating the digital market has become a priority for the European Union. While promoting innovation and development, EU institutions must assure that the digital market maintains a competitive structure. Among the numerous elements characterizing the digital sector, users’ data are particularly important. Digital services are centered around personal data, the accumulation of which contributed to the centralization of market power in the hands of a few large providers. As a result, data-driven mergers and data-related abuses gained a central role for the purposes of EU antitrust enforcement. In light of these considerations, this work aims at assessing whether EU competition law is well-suited to address data-driven mergers and data-related abuses of dominance. These conducts are of crucial importance to the maintenance of competition in the digital sector, insofar as the accumulation of users’ data constitutes a fundamental competitive advantage. To begin with, part 1 addresses the specific features of the digital market and their impact on the definition of the relevant market and the assessment of dominance by antitrust authorities. Secondly, part 2 analyzes the EU’s case law on data-driven mergers to verify if merger control is well-suited to address these concentrations. Thirdly, part 3 discusses abuses of dominance in the phase of data collection and the legal frameworks applicable to these conducts. Fourthly, part 4 focuses on access to “essential” datasets and the indirect effects of anticompetitive conducts on rivals’ ability to access users’ information. Finally, Part 5 discusses differential pricing practices implemented online and based on personal data. As it will be assessed, the combination of an efficient competition law enforcement and the auspicial adoption of a specific regulation seems to be the best solution to face the challenges raised by “data-related dominance”.
Resumo:
Model misspecification affects the classical test statistics used to assess the fit of the Item Response Theory (IRT) models. Robust tests have been derived under model misspecification, as the Generalized Lagrange Multiplier and Hausman tests, but their use has not been largely explored in the IRT framework. In the first part of the thesis, we introduce the Generalized Lagrange Multiplier test to detect differential item response functioning in IRT models for binary data under model misspecification. By means of a simulation study and a real data analysis, we compare its performance with the classical Lagrange Multiplier test, computed using the Hessian and the cross-product matrix, and the Generalized Jackknife Score test. The power of these tests is computed empirically and asymptotically. The misspecifications considered are local dependence among items and non-normal distribution of the latent variable. The results highlight that, under mild model misspecification, all tests have good performance while, under strong model misspecification, the performance of the tests deteriorates. None of the tests considered show an overall superior performance than the others. In the second part of the thesis, we extend the Generalized Hausman test to detect non-normality of the latent variable distribution. To build the test, we consider a seminonparametric-IRT model, that assumes a more flexible latent variable distribution. By means of a simulation study and two real applications, we compare the performance of the Generalized Hausman test with the M2 limited information goodness-of-fit test and the Likelihood-Ratio test. Additionally, the information criteria are computed. The Generalized Hausman test has a better performance than the Likelihood-Ratio test in terms of Type I error rates and the M2 test in terms of power. The performance of the Generalized Hausman test and the information criteria deteriorates when the sample size is small and with a few items.
Resumo:
The chapters of the thesis focus on a limited variety of selected themes in EU privacy and data protection law. Chapter 1 sets out the general introduction on the research topic. Chapter 2 touches upon the methodology used in the research. Chapter 3 conceptualises the basic notions from a legal standpoint. Chapter 4 examines the current regulatory regime applicable to digital health technologies, healthcare emergencies, privacy, and data protection. Chapter 5 provides case studies on the application deployed in the Covid-19 scenario, from the perspective of privacy and data protection. Chapter 6 addresses the post-Covid European regulatory initiatives on the subject matter, and its potential effects on privacy and data protection. Chapter 7 is the outcome of a six-month internship with a company in Italy and focuses on the protection of fundamental rights through common standardisation and certification, demonstrating that such standards can serve as supporting tools to guarantee the right to privacy and data protection in digital health technologies. The thesis concludes with the observation that finding and transposing European privacy and data protection standards into scenarios, such as public healthcare emergencies where digital health technologies are deployed, requires rapid coordination between the European Data Protection Authorities and the Member States guarantee that individual privacy and data protection rights are ensured.
Resumo:
In recent years, IoT technology has radically transformed many crucial industrial and service sectors such as healthcare. The multi-facets heterogeneity of the devices and the collected information provides important opportunities to develop innovative systems and services. However, the ubiquitous presence of data silos and the poor semantic interoperability in the IoT landscape constitute a significant obstacle in the pursuit of this goal. Moreover, achieving actionable knowledge from the collected data requires IoT information sources to be analysed using appropriate artificial intelligence techniques such as automated reasoning. In this thesis work, Semantic Web technologies have been investigated as an approach to address both the data integration and reasoning aspect in modern IoT systems. In particular, the contributions presented in this thesis are the following: (1) the IoT Fitness Ontology, an OWL ontology that has been developed in order to overcome the issue of data silos and enable semantic interoperability in the IoT fitness domain; (2) a Linked Open Data web portal for collecting and sharing IoT health datasets with the research community; (3) a novel methodology for embedding knowledge in rule-defined IoT smart home scenarios; and (4) a knowledge-based IoT home automation system that supports a seamless integration of heterogeneous devices and data sources.
Resumo:
Following the approval of the 2030 Agenda for Sustainable Development in 2015, sustainability became a hotly debated topic. In order to build a better and more sustainable future by 2030, this agenda addressed several global issues, including inequality, climate change, peace, and justice, in the form of 17 Sustainable Development Goals (SDGs), that should be understood and pursued by nations, corporations, institutions, and individuals. In this thesis, we researched how to exploit and integrate Human-Computer Interaction (HCI) and Data Visualization to promote knowledge and awareness about SDG 8, which wants to encourage lasting, inclusive, and sustainable economic growth, full and productive employment, and decent work for all. In particular, we focused on three targets: green economy, sustainable tourism, employment, decent work for all, and social protection. The primary goal of this research is to determine whether HCI approaches may be used to create and validate interactive data visualization that can serve as helpful decision-making aids for specific groups and raise their knowledge of public-interest issues. To accomplish this goal, we analyzed four case studies. In the first two, we wanted to promote knowledge and awareness about green economy issues: we investigated the Human-Building Interaction inside a Smart Campus and the dematerialization process inside a University. In the third, we focused on smart tourism, investigating the relationship between locals and tourists to create meaningful connections and promote more sustainable tourism. In the fourth, we explored the industry context to highlight sustainability policies inside well-known companies. This research focuses on the hypothesis that interactive data visualization tools can make communities aware of sustainability aspects related to SDG8 and its targets. The research questions addressed are two: "how to promote awareness about SDG8 and its targets through interactive data visualizations?" and "to what extent are these interactive data visualizations effective?".
Resumo:
The General Data Protection Regulation (GDPR) has been designed to help promote a view in favor of the interests of individuals instead of large corporations. However, there is the need of more dedicated technologies that can help companies comply with GDPR while enabling people to exercise their rights. We argue that such a dedicated solution must address two main issues: the need for more transparency towards individuals regarding the management of their personal information and their often hindered ability to access and make interoperable personal data in a way that the exercise of one's rights would result in straightforward. We aim to provide a system that helps to push personal data management towards the individual's control, i.e., a personal information management system (PIMS). By using distributed storage and decentralized computing networks to control online services, users' personal information could be shifted towards those directly concerned, i.e., the data subjects. The use of Distributed Ledger Technologies (DLTs) and Decentralized File Storage (DFS) as an implementation of decentralized systems is of paramount importance in this case. The structure of this dissertation follows an incremental approach to describing a set of decentralized systems and models that revolves around personal data and their subjects. Each chapter of this dissertation builds up the previous one and discusses the technical implementation of a system and its relation with the corresponding regulations. We refer to the EU regulatory framework, including GDPR, eIDAS, and Data Governance Act, to build our final system architecture's functional and non-functional drivers. In our PIMS design, personal data is kept in a Personal Data Space (PDS) consisting of encrypted personal data referring to the subject stored in a DFS. On top of that, a network of authorization servers acts as a data intermediary to provide access to potential data recipients through smart contracts.
Resumo:
The recent widespread use of social media platforms and web services has led to a vast amount of behavioral data that can be used to model socio-technical systems. A significant part of this data can be represented as graphs or networks, which have become the prevalent mathematical framework for studying the structure and the dynamics of complex interacting systems. However, analyzing and understanding these data presents new challenges due to their increasing complexity and diversity. For instance, the characterization of real-world networks includes the need of accounting for their temporal dimension, together with incorporating higher-order interactions beyond the traditional pairwise formalism. The ongoing growth of AI has led to the integration of traditional graph mining techniques with representation learning and low-dimensional embeddings of networks to address current challenges. These methods capture the underlying similarities and geometry of graph-shaped data, generating latent representations that enable the resolution of various tasks, such as link prediction, node classification, and graph clustering. As these techniques gain popularity, there is even a growing concern about their responsible use. In particular, there has been an increased emphasis on addressing the limitations of interpretability in graph representation learning. This thesis contributes to the advancement of knowledge in the field of graph representation learning and has potential applications in a wide range of complex systems domains. We initially focus on forecasting problems related to face-to-face contact networks with time-varying graph embeddings. Then, we study hyperedge prediction and reconstruction with simplicial complex embeddings. Finally, we analyze the problem of interpreting latent dimensions in node embeddings for graphs. The proposed models are extensively evaluated in multiple experimental settings and the results demonstrate their effectiveness and reliability, achieving state-of-the-art performances and providing valuable insights into the properties of the learned representations.
Resumo:
The notion of commodification is a fascinating one. It entails many facets, ranging from subjective debates on desirability of commodification to in depth economic analyses of objects of value and their corresponding markets. Commodity theory is therefore not just defined by a single debate, but spans a plethora of different discussions. This thesis maps and situates those theories and debates and selects one specific strain to investigate further. This thesis argues that commodity theory in its optima forma deals with the investigation into what sets commodities apart from non-commodities. It proceeds to examine the many given answers to this question by scholars ranging from the mid 1800’s to the late 2000’s. Ultimately, commodification is defined as a process in which an object becomes an element of the total wealth of societies in which the capitalist mode of production prevails. In doing so, objects must meet observables, or indicia, of commodification provided by commodity theories. Problems arise when objects are clearly part of the total wealth in societies without meeting established commodity indicia. In such cases, objects are part of the total wealth of a society without counting as a commodity. This thesis examines this phenomenon in relation to the novel commodities of audiences and data. It explains how these non-commodities (according to classical theories) are still essential elements of industry. The thesis then takes a deep dive into commodity theory using the theory on the construction of social reality by John Searle.