940 resultados para data-driven modelling


Relevância:

90.00% 90.00%

Publicador:

Resumo:

Computers employing some degree of data flow organisation are now well established as providing a possible vehicle for concurrent computation. Although data-driven computation frees the architecture from the constraints of the single program counter, processor and global memory, inherent in the classic von Neumann computer, there can still be problems with the unconstrained generation of fresh result tokens if a pure data flow approach is adopted. The advantages of allowing serial processing for those parts of a program which are inherently serial, and of permitting a demand-driven, as well as data-driven, mode of operation are identified and described. The MUSE machine described here is a structured architecture supporting both serial and parallel processing which allows the abstract structure of a program to be mapped onto the machine in a logical way.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Collecting and analyzing consumer data is essential in today’s data-driven business environment. However, consumers are becoming more aware of the value of the information they can provide to companies, thereby being more reluctant to share it for free. Therefore, companies need to find ways to motivate consumers to disclose personal information. The main research question of the study was formed as “How can companies motivate consumers to disclose personal information?” and it was further divided into two subquestions: 1) What types of benefits motivate consumers to disclose personal information? 2) How does the disclosure context affect the consumers’ information disclosure behavior? The conceptual framework consisted of a classification of extrinsic and intrinsic benefits, and moderating factors, which were recognized on the basis of prior research in the field. The study was conducted by using qualitative research methods. The primary data was collected by interviewing ten representatives from eight companies. The data was analyzed and reported according to predetermined themes. The findings of the study confirm that consumers can be motivated to disclose personal information by offering different types of extrinsic (monetary saving, time saving, self-enhancement, and social adjustment) and intrinsic (novelty, pleasure, and altruism) benefits. However, not all the benefits are equally useful ways to convince the customer to disclose information. Moreover, different factors in the disclosure context can either alleviate or increase the effectiveness of the benefits and the consumers’ motivation to disclose personal information. Such factors include the consumer’s privacy concerns, perceived trust towards the company, the relevancy of the requested information, personalization, website elements (especially security, usability, and aesthetics of a website), and the consumer’s shopping motivation. This study has several contributions. It is essential that companies recognize the most attractive benefits regarding their business and their customers, and that they understand how the disclosure context affects the consumer’s information disclosure behavior. The likelihood of information disclosure can be increased, for example, by offering benefits that meet the consumers’ needs and preferences, improving the relevancy of the asked information, stating the reasons for data collection, creating and maintaining a trustworthy image of the company, and enhancing the quality of the company’s website.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This thesis investigates how web search evaluation can be improved using historical interaction data. Modern search engines combine offline and online evaluation approaches in a sequence of steps that a tested change needs to pass through to be accepted as an improvement and subsequently deployed. We refer to such a sequence of steps as an evaluation pipeline. In this thesis, we consider the evaluation pipeline to contain three sequential steps: an offline evaluation step, an online evaluation scheduling step, and an online evaluation step. In this thesis we show that historical user interaction data can aid in improving the accuracy or efficiency of each of the steps of the web search evaluation pipeline. As a result of these improvements, the overall efficiency of the entire evaluation pipeline is increased. Firstly, we investigate how user interaction data can be used to build accurate offline evaluation methods for query auto-completion mechanisms. We propose a family of offline evaluation metrics for query auto-completion that represents the effort the user has to spend in order to submit their query. The parameters of our proposed metrics are trained against a set of user interactions recorded in the search engine’s query logs. From our experimental study, we observe that our proposed metrics are significantly more correlated with an online user satisfaction indicator than the metrics proposed in the existing literature. Hence, fewer changes will pass the offline evaluation step to be rejected after the online evaluation step. As a result, this would allow us to achieve a higher efficiency of the entire evaluation pipeline. Secondly, we state the problem of the optimised scheduling of online experiments. We tackle this problem by considering a greedy scheduler that prioritises the evaluation queue according to the predicted likelihood of success of a particular experiment. This predictor is trained on a set of online experiments, and uses a diverse set of features to represent an online experiment. Our study demonstrates that a higher number of successful experiments per unit of time can be achieved by deploying such a scheduler on the second step of the evaluation pipeline. Consequently, we argue that the efficiency of the evaluation pipeline can be increased. Next, to improve the efficiency of the online evaluation step, we propose the Generalised Team Draft interleaving framework. Generalised Team Draft considers both the interleaving policy (how often a particular combination of results is shown) and click scoring (how important each click is) as parameters in a data-driven optimisation of the interleaving sensitivity. Further, Generalised Team Draft is applicable beyond domains with a list-based representation of results, i.e. in domains with a grid-based representation, such as image search. Our study using datasets of interleaving experiments performed both in document and image search domains demonstrates that Generalised Team Draft achieves the highest sensitivity. A higher sensitivity indicates that the interleaving experiments can be deployed for a shorter period of time or use a smaller sample of users. Importantly, Generalised Team Draft optimises the interleaving parameters w.r.t. historical interaction data recorded in the interleaving experiments. Finally, we propose to apply the sequential testing methods to reduce the mean deployment time for the interleaving experiments. We adapt two sequential tests for the interleaving experimentation. We demonstrate that one can achieve a significant decrease in experiment duration by using such sequential testing methods. The highest efficiency is achieved by the sequential tests that adjust their stopping thresholds using historical interaction data recorded in diagnostic experiments. Our further experimental study demonstrates that cumulative gains in the online experimentation efficiency can be achieved by combining the interleaving sensitivity optimisation approaches, including Generalised Team Draft, and the sequential testing approaches. Overall, the central contributions of this thesis are the proposed approaches to improve the accuracy or efficiency of the steps of the evaluation pipeline: the offline evaluation frameworks for the query auto-completion, an approach for the optimised scheduling of online experiments, a general framework for the efficient online interleaving evaluation, and a sequential testing approach for the online search evaluation. The experiments in this thesis are based on massive real-life datasets obtained from Yandex, a leading commercial search engine. These experiments demonstrate the potential of the proposed approaches to improve the efficiency of the evaluation pipeline.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Part 14: Interoperability and Integration

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The Exhibitium Project , awarded by the BBVA Foundation, is a data-driven project developed by an international consortium of research groups . One of its main objectives is to build a prototype that will serve as a base to produce a platform for the recording and exploitation of data about art-exhibitions available on the Internet . Therefore, our proposal aims to expose the methods, procedures and decision-making processes that have governed the technological implementation of this prototype, especially with regard to the reuse of WordPress (WP) as development framework.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Our proposal aims to display the analysis techniques, methodologies as well as the most relevant results expected within the Exhibitium project framework (http://www.exhibitium.com). Awarded by the BBVA Foundation, the Exhibitium project is being developed by an international consortium of several research groups . Its main purpose is to build a comprehensive and structured data repository about temporary art exhibitions, captured from the web, to make them useful and reusable in various domains through open and interoperable data systems.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Il presente elaborato esplora l’attitudine delle organizzazioni nei confronti dei processi di business che le sostengono: dalla semi-assenza di struttura, all’organizzazione funzionale, fino all’avvento del Business Process Reengineering e del Business Process Management, nato come superamento dei limiti e delle problematiche del modello precedente. All’interno del ciclo di vita del BPM, trova spazio la metodologia del process mining, che permette un livello di analisi dei processi a partire dagli event data log, ossia dai dati di registrazione degli eventi, che fanno riferimento a tutte quelle attività supportate da un sistema informativo aziendale. Il process mining può essere visto come naturale ponte che collega le discipline del management basate sui processi (ma non data-driven) e i nuovi sviluppi della business intelligence, capaci di gestire e manipolare l’enorme mole di dati a disposizione delle aziende (ma che non sono process-driven). Nella tesi, i requisiti e le tecnologie che abilitano l’utilizzo della disciplina sono descritti, cosi come le tre tecniche che questa abilita: process discovery, conformance checking e process enhancement. Il process mining è stato utilizzato come strumento principale in un progetto di consulenza da HSPI S.p.A. per conto di un importante cliente italiano, fornitore di piattaforme e di soluzioni IT. Il progetto a cui ho preso parte, descritto all’interno dell’elaborato, ha come scopo quello di sostenere l’organizzazione nel suo piano di improvement delle prestazioni interne e ha permesso di verificare l’applicabilità e i limiti delle tecniche di process mining. Infine, nell’appendice finale, è presente un paper da me realizzato, che raccoglie tutte le applicazioni della disciplina in un contesto di business reale, traendo dati e informazioni da working papers, casi aziendali e da canali diretti. Per la sua validità e completezza, questo documento è stata pubblicato nel sito dell'IEEE Task Force on Process Mining.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The isoprene degradation mechanism included in version 3 of the Master Chemical Mechanism (MCM v3) has been evaluated and refined, using the Statewide Air Pollution Research Center (SAPRC) environmental chamber datasets on the photo-oxidation of isoprene and its degradation products, methacrolein (MACR) and methylvinyl ketone (MVK). Prior to this, the MCM v3 butane degradation chemistry was also evaluated using chamber data on the photo-oxidation of butane, and its degradation products, methylethyl ketone (MEK), acetaldehyde (CH3CHO) and formaldehyde (HCHO), in conjunction with an initial evaluation of the chamber-dependent auxiliary mechanisms for the series of relevant chambers. The MCM v3 mechanisms for both isoprene and butane generally performed well and were found to provide an acceptable reaction framework for describing the NOx-photo-oxidation experiments on the above systems, although a number of parameter modifications and refinements were identified which resulted in an improved performance. All these relate to the magnitude of sources of free radicals from organic chemical process, such as carbonyl photolysis rates and the yields of radicals from the reactions of O3 with unsaturated oxygenates, and specific recommendations are made for refinements. In addition to this, it was necessary to include a representation of the reactions of O(3P) with isoprene, MACR and MVK (which were not previously treated in MCM v3), and conclusions are drawn concerning the required extent of free radical formation from these reactions. Throughout the study, the performance of MCM v3 was also compared with that of the SAPRC-99 mechanism, which was developed and optimized in conjunction with the chamber datasets.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The representation of alkene degradation in version 3 of the Master Chemical Mechanism (MCM v3) has been evaluated, using environmental chamber data on the photo-oxidation of ethene, propene, 1-butene and 1-hexene in the presence of NOx, from up to five chambers at the Statewide Air Pollution Research Center (SAPRC) at the University of California. As part of this evaluation, it was necessary to include a representation of the reactions of the alkenes with O(3P), which are significant under chamber conditions but generally insignificant under atmospheric conditions. The simulations for the ethene and propene systems, in particular, were found to be sensitive to the branching ratios assigned to molecular and free radical forming pathways of the O(3P) reactions, with the extent of radical formation required for proper fitting of the model to the chamber data being substantially lower than the reported consensus. With this constraint, the MCM v3 mechanisms for ethene and propene generally performed well. The sensitivity of the simulations to the parameters applied to a series of other radical sources and sink reactions (radical formation from the alkene ozonolysis reactions and product carbonyl photolysis; radical removal from the reaction of OH with NO2 and β-hydroxynitrate formation) were also considered, and the implications of these results are discussed. Evaluation of the MCM v3 1-butene and 1-hexene degradation mechanisms, using a more limited dataset from only one chamber, was found to be inconclusive. The results of sensitivity studies demonstrate that it is impossible to reconcile the simulated and observed formation of ozone in these systems for ranges of parameter values which can currently be justified on the basis of the literature. As a result of this work, gaps and uncertainties in the kinetic, mechanistic and chamber database are identified and discussed, in relation to both tropospheric chemistry and chemistry important under chamber conditions which may compromise the evaluation procedure, and recommendations are made for future experimental studies. Throughout the study, the performance of the MCM v3 chemistry was also simultaneously compared with that of the corresponding chemistry in the SAPRC-99 mechanism, which was developed and optimized in conjunction with the chamber datasets.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

This paper proposes a process for the classifi cation of new residential electricity customers. The current state of the art is extended by using a combination of smart metering and survey data and by using model-based feature selection for the classifi cation task. Firstly, the normalized representative consumption profi les of the population are derived through the clustering of data from households. Secondly, new customers are classifi ed using survey data and a limited amount of smart metering data. Thirdly, regression analysis and model-based feature selection results explain the importance of the variables and which are the drivers of diff erent consumption profi les, enabling the extraction of appropriate models. The results of a case study show that the use of survey data signi ficantly increases accuracy of the classifi cation task (up to 20%). Considering four consumption groups, more than half of the customers are correctly classifi ed with only one week of metering data, with more weeks the accuracy is signifi cantly improved. The use of model-based feature selection resulted in the use of a signifi cantly lower number of features allowing an easy interpretation of the derived models.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

The internet and digital technologies revolutionized the economy. Regulating the digital market has become a priority for the European Union. While promoting innovation and development, EU institutions must assure that the digital market maintains a competitive structure. Among the numerous elements characterizing the digital sector, users’ data are particularly important. Digital services are centered around personal data, the accumulation of which contributed to the centralization of market power in the hands of a few large providers. As a result, data-driven mergers and data-related abuses gained a central role for the purposes of EU antitrust enforcement. In light of these considerations, this work aims at assessing whether EU competition law is well-suited to address data-driven mergers and data-related abuses of dominance. These conducts are of crucial importance to the maintenance of competition in the digital sector, insofar as the accumulation of users’ data constitutes a fundamental competitive advantage. To begin with, part 1 addresses the specific features of the digital market and their impact on the definition of the relevant market and the assessment of dominance by antitrust authorities. Secondly, part 2 analyzes the EU’s case law on data-driven mergers to verify if merger control is well-suited to address these concentrations. Thirdly, part 3 discusses abuses of dominance in the phase of data collection and the legal frameworks applicable to these conducts. Fourthly, part 4 focuses on access to “essential” datasets and the indirect effects of anticompetitive conducts on rivals’ ability to access users’ information. Finally, Part 5 discusses differential pricing practices implemented online and based on personal data. As it will be assessed, the combination of an efficient competition law enforcement and the auspicial adoption of a specific regulation seems to be the best solution to face the challenges raised by “data-related dominance”.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Machine learning is widely adopted to decode multi-variate neural time series, including electroencephalographic (EEG) and single-cell recordings. Recent solutions based on deep learning (DL) outperformed traditional decoders by automatically extracting relevant discriminative features from raw or minimally pre-processed signals. Convolutional Neural Networks (CNNs) have been successfully applied to EEG and are the most common DL-based EEG decoders in the state-of-the-art (SOA). However, the current research is affected by some limitations. SOA CNNs for EEG decoding usually exploit deep and heavy structures with the risk of overfitting small datasets, and architectures are often defined empirically. Furthermore, CNNs are mainly validated by designing within-subject decoders. Crucially, the automatically learned features mainly remain unexplored; conversely, interpreting these features may be of great value to use decoders also as analysis tools, highlighting neural signatures underlying the different decoded brain or behavioral states in a data-driven way. Lastly, SOA DL-based algorithms used to decode single-cell recordings rely on more complex, slower to train and less interpretable networks than CNNs, and the use of CNNs with these signals has not been investigated. This PhD research addresses the previous limitations, with reference to P300 and motor decoding from EEG, and motor decoding from single-neuron activity. CNNs were designed light, compact, and interpretable. Moreover, multiple training strategies were adopted, including transfer learning, which could reduce training times promoting the application of CNNs in practice. Furthermore, CNN-based EEG analyses were proposed to study neural features in the spatial, temporal and frequency domains, and proved to better highlight and enhance relevant neural features related to P300 and motor states than canonical EEG analyses. Remarkably, these analyses could be used, in perspective, to design novel EEG biomarkers for neurological or neurodevelopmental disorders. Lastly, CNNs were developed to decode single-neuron activity, providing a better compromise between performance and model complexity.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Currently making digital 3D models and replicas of the cultural heritage assets play an important role in the preservation and having a high detail source for future research and intervention. In this dissertation, it is tried to assess different methods for digital surveying and making 3D replicas of cultural heritage assets in different scales of size. The methodologies vary in devices, software, workflow, and the amount of skill that is required. The three phases of the 3D modelling process are data acquisition, modelling, and model presentation. Each of these sections is divided into sub-sections and there are several approaches, methods, devices, and software that may be employed, furthermore, the selection process should be based on the operation's goal, available facilities, the scale and properties of the object or structure to be modeled, as well as the operators' expertise and experience. The most key point to remember is that the 3D modelling operation should be properly accurate, precise, and reliable; therefore, there are so many instructions and pieces of advice on how to perform 3D modelling effectively. It is an attempt to compare and evaluate the various ways of each phase in order to explain and demonstrate their differences, benefits, and drawbacks in order to serve as a simple guide for new and/or inexperienced users.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

In recent times, a significant research effort has been focused on how deformable linear objects (DLOs) can be manipulated for real world applications such as assembly of wiring harnesses for the automotive and aerospace sector. This represents an open topic because of the difficulties in modelling accurately the behaviour of these objects and simulate a task involving their manipulation, considering a variety of different scenarios. These problems have led to the development of data-driven techniques in which machine learning techniques are exploited to obtain reliable solutions. However, this approach makes the solution difficult to be extended, since the learning must be replicated almost from scratch as the scenario changes. It follows that some model-based methodology must be introduced to generalize the results and reduce the training effort accordingly. The objective of this thesis is to develop a solution for the DLOs manipulation to assemble a wiring harness for the automotive sector based on adaptation of a base trajectory set by means of reinforcement learning methods. The idea is to create a trajectory planning software capable of solving the proposed task, reducing where possible the learning time, which is done in real time, but at the same time presenting suitable performance and reliability. The solution has been implemented on a collaborative 7-DOFs Panda robot at the Laboratory of Automation and Robotics of the University of Bologna. Experimental results are reported showing how the robot is capable of optimizing the manipulation of the DLOs gaining experience along the task repetition, but showing at the same time a high success rate from the very beginning of the learning phase.

Relevância:

90.00% 90.00%

Publicador:

Resumo:

Nel panorama aziendale odierno, risulta essere di fondamentale importanza la capacità, da parte di un’azienda o di una società di servizi, di orientare in modo programmatico la propria innovazione in modo tale da poter essere competitivi sul mercato. In molti casi, questo e significa investire una cospicua somma di denaro in progetti che andranno a migliorare aspetti essenziali del prodotto o del servizio e che avranno un importante impatto sulla trasformazione digitale dell’azienda. Lo studio che viene proposto riguarda in particolar modo due approcci che sono tipicamente in antitesi tra loro proprio per il fatto che si basano su due tipologie di dati differenti, i Big Data e i Thick Data. I due approcci sono rispettivamente il Data Science e il Design Thinking. Nel corso dei seguenti capitoli, dopo aver definito gli approcci di Design Thinking e Data Science, verrà definito il concetto di blending e la problematica che ruota attorno all’intersezione dei due metodi di innovazione. Per mettere in evidenza i diversi aspetti che riguardano la tematica, verranno riportati anche casi di aziende che hanno integrato i due approcci nei loro processi di innovazione, ottenendo importanti risultati. In particolar modo verrà riportato il lavoro di ricerca svolto dall’autore riguardo l'esame, la classificazione e l'analisi della letteratura esistente all'intersezione dell'innovazione guidata dai dati e dal pensiero progettuale. Infine viene riportato un caso aziendale che è stato condotto presso la realtà ospedaliero-sanitaria di Parma in cui, a fronte di una problematica relativa al rapporto tra clinici dell’ospedale e clinici del territorio, si è progettato un sistema innovativo attraverso l’utilizzo del Design Thinking. Inoltre, si cercherà di sviluppare un’analisi critica di tipo “what-if” al fine di elaborare un possibile scenario di integrazione di metodi o tecniche provenienti anche dal mondo del Data Science e applicarlo al caso studio in oggetto.