940 resultados para data-driven modelling


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Nykypäivän monimutkaisessa ja epävakaassa liiketoimintaympäristössä yritykset, jotka kykenevät muuttamaan tuottamansa operatiivisen datan tietovarastoiksi, voivat saavuttaa merkittävää kilpailuetua. Ennustavan analytiikan hyödyntäminen tulevien trendien ennakointiin mahdollistaa yritysten tunnistavan avaintekijöitä, joiden avulla he pystyvät erottumaan kilpailijoistaan. Ennustavan analytiikan hyödyntäminen osana päätöksentekoprosessia mahdollistaa ketterämmän, reaaliaikaisen päätöksenteon. Tämän diplomityön tarkoituksena on koota teoreettinen viitekehys analytiikan mallintamisesta liike-elämän loppukäyttäjän näkökulmasta ja hyödyntää tätä mallinnusprosessia diplomityön tapaustutkimuksen yritykseen. Teoreettista mallia hyödynnettiin asiakkuuksien mallintamisessa sekä tunnistamalla ennakoivia tekijöitä myynnin ennustamiseen. Työ suoritettiin suomalaiseen teollisten suodattimien tukkukauppaan, jolla on liiketoimintaa Suomessa, Venäjällä ja Balteissa. Tämä tutkimus on määrällinen tapaustutkimus, jossa tärkeimpänä tiedonkeruumenetelmänä käytettiin tapausyrityksen transaktiodataa. Data työhön saatiin yrityksen toiminnanohjausjärjestelmästä.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

This study occurred in 2009 and questioned how Ontario secondary school principals perceived their role had changed, over a 7 year period, in response to the increased demands of data-driven school environments. Specifically, it sought to identify principals' perceptions on how high-stakes testing and data-driven environments had affected their role, tasks, and accountability responsibilities. This study contextualized the emergence of the Education Quality and Accountability Offices (EQAO) as a central influence in the creation of data-driven school environments, and conceptualized the role of the principal as using data to inform and persuade a shift in thinking about the use of data to improve instruction and student achievement. The findings of the study suggest that data-driven environments had helped principals reclaim their positional power as instructional leaders, using data as an avenue back into the classroom. The use of data shifted the responsibilities of the principal to persuade teachers to work collaboratively to improve classroom instruction in order to demonstrate accountability.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

As our world becomes increasingly interconnected, diseases can spread at a faster and faster rate. Recent years have seen large-scale influenza, cholera and ebola outbreaks and failing to react in a timely manner to outbreaks leads to a larger spread and longer persistence of the outbreak. Furthermore, diseases like malaria, polio and dengue fever have been eliminated in some parts of the world but continue to put a substantial burden on countries in which these diseases are still endemic. To reduce the disease burden and eventually move towards countrywide elimination of diseases such as malaria, understanding human mobility is crucial for both planning interventions as well as estimation of the prevalence of the disease. In this talk, I will discuss how various data sources can be used to estimate human movements, population distributions and disease prevalence as well as the relevance of this information for intervention planning. Particularly anonymised mobile phone data has been shown to be a valuable source of information for countries with unreliable population density and migration data and I will present several studies where mobile phone data has been used to derive these measures.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Title: Data-Driven Text Generation using Neural Networks Speaker: Pavlos Vougiouklis, University of Southampton Abstract: Recent work on neural networks shows their great potential at tackling a wide variety of Natural Language Processing (NLP) tasks. This talk will focus on the Natural Language Generation (NLG) problem and, more specifically, on the extend to which neural network language models could be employed for context-sensitive and data-driven text generation. In addition, a neural network architecture for response generation in social media along with the training methods that enable it to capture contextual information and effectively participate in public conversations will be discussed. Speaker Bio: Pavlos Vougiouklis obtained his 5-year Diploma in Electrical and Computer Engineering from the Aristotle University of Thessaloniki in 2013. He was awarded an MSc degree in Software Engineering from the University of Southampton in 2014. In 2015, he joined the Web and Internet Science (WAIS) research group of the University of Southampton and he is currently working towards the acquisition of his PhD degree in the field of Neural Network Approaches for Natural Language Processing. Title: Provenance is Complicated and Boring — Is there a solution? Speaker: Darren Richardson, University of Southampton Abstract: Paper trails, auditing, and accountability — arguably not the sexiest terms in computer science. But then you discover that you've possibly been eating horse-meat, and the importance of provenance becomes almost palpable. Having accepted that we should be creating provenance-enabled systems, the challenge of then communicating that provenance to casual users is not trivial: users should not have to have a detailed working knowledge of your system, and they certainly shouldn't be expected to understand the data model. So how, then, do you give users an insight into the provenance, without having to build a bespoke system for each and every different provenance installation? Speaker Bio: Darren is a final year Computer Science PhD student. He completed his undergraduate degree in Electronic Engineering at Southampton in 2012.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The complexity of current and emerging high performance architectures provides users with options about how best to use the available resources, but makes predicting performance challenging. In this work a benchmark-driven performance modelling approach is outlined that is appro- priate for modern multicore architectures. The approach is demonstrated by constructing a model of a simple shallow water code on a Cray XE6 system, from application-specific benchmarks that illustrate precisely how architectural char- acteristics impact performance. The model is found to recre- ate observed scaling behaviour up to 16K cores, and used to predict optimal rank-core affinity strategies, exemplifying the type of problem such a model can be used for.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Seamless phase II/III clinical trials are conducted in two stages with treatment selection at the first stage. In the first stage, patients are randomized to a control or one of k > 1 experimental treatments. At the end of this stage, interim data are analysed, and a decision is made concerning which experimental treatment should continue to the second stage. If the primary endpoint is observable only after some period of follow-up, at the interim analysis data may be available on some early outcome on a larger number of patients than those for whom the primary endpoint is available. These early endpoint data can thus be used for treatment selection. For two previously proposed approaches, the power has been shown to be greater for one or other method depending on the true treatment effects and correlations. We propose a new approach that builds on the previously proposed approaches and uses data available at the interim analysis to estimate these parameters and then, on the basis of these estimates, chooses the treatment selection method with the highest probability of correctly selecting the most effective treatment. This method is shown to perform well compared with the two previously described methods for a wide range of true parameter values. In most cases, the performance of the new method is either similar to or, in some cases, better than either of the two previously proposed methods.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

The complexity of current and emerging architectures provides users with options about how best to use the available resources, but makes predicting performance challenging. In this work a benchmark-driven model is developed for a simple shallow water code on a Cray XE6 system, to explore how deployment choices such as domain decomposition and core affinity affect performance. The resource sharing present in modern multi-core architectures adds various levels of heterogeneity to the system. Shared resources often includes cache, memory, network controllers and in some cases floating point units (as in the AMD Bulldozer), which mean that the access time depends on the mapping of application tasks, and the core's location within the system. Heterogeneity further increases with the use of hardware-accelerators such as GPUs and the Intel Xeon Phi, where many specialist cores are attached to general-purpose cores. This trend for shared resources and non-uniform cores is expected to continue into the exascale era. The complexity of these systems means that various runtime scenarios are possible, and it has been found that under-populating nodes, altering the domain decomposition and non-standard task to core mappings can dramatically alter performance. To find this out, however, is often a process of trial and error. To better inform this process, a performance model was developed for a simple regular grid-based kernel code, shallow. The code comprises two distinct types of work, loop-based array updates and nearest-neighbour halo-exchanges. Separate performance models were developed for each part, both based on a similar methodology. Application specific benchmarks were run to measure performance for different problem sizes under different execution scenarios. These results were then fed into a performance model that derives resource usage for a given deployment scenario, with interpolation between results as necessary.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Parkinson’s disease (PD) is an increasing neurological disorder in an aging society. The motor and non-motor symptoms of PD advance with the disease progression and occur in varying frequency and duration. In order to affirm the full extent of a patient’s condition, repeated assessments are necessary to adjust medical prescription. In clinical studies, symptoms are assessed using the unified Parkinson’s disease rating scale (UPDRS). On one hand, the subjective rating using UPDRS relies on clinical expertise. On the other hand, it requires the physical presence of patients in clinics which implies high logistical costs. Another limitation of clinical assessment is that the observation in hospital may not accurately represent a patient’s situation at home. For such reasons, the practical frequency of tracking PD symptoms may under-represent the true time scale of PD fluctuations and may result in an overall inaccurate assessment. Current technologies for at-home PD treatment are based on data-driven approaches for which the interpretation and reproduction of results are problematic.  The overall objective of this thesis is to develop and evaluate unobtrusive computer methods for enabling remote monitoring of patients with PD. It investigates first-principle data-driven model based novel signal and image processing techniques for extraction of clinically useful information from audio recordings of speech (in texts read aloud) and video recordings of gait and finger-tapping motor examinations. The aim is to map between PD symptoms severities estimated using novel computer methods and the clinical ratings based on UPDRS part-III (motor examination). A web-based test battery system consisting of self-assessment of symptoms and motor function tests was previously constructed for a touch screen mobile device. A comprehensive speech framework has been developed for this device to analyze text-dependent running speech by: (1) extracting novel signal features that are able to represent PD deficits in each individual component of the speech system, (2) mapping between clinical ratings and feature estimates of speech symptom severity, and (3) classifying between UPDRS part-III severity levels using speech features and statistical machine learning tools. A novel speech processing method called cepstral separation difference showed stronger ability to classify between speech symptom severities as compared to existing features of PD speech. In the case of finger tapping, the recorded videos of rapid finger tapping examination were processed using a novel computer-vision (CV) algorithm that extracts symptom information from video-based tapping signals using motion analysis of the index-finger which incorporates a face detection module for signal calibration. This algorithm was able to discriminate between UPDRS part III severity levels of finger tapping with high classification rates. Further analysis was performed on novel CV based gait features constructed using a standard human model to discriminate between a healthy gait and a Parkinsonian gait. The findings of this study suggest that the symptom severity levels in PD can be discriminated with high accuracies by involving a combination of first-principle (features) and data-driven (classification) approaches. The processing of audio and video recordings on one hand allows remote monitoring of speech, gait and finger-tapping examinations by the clinical staff. On the other hand, the first-principles approach eases the understanding of symptom estimates for clinicians. We have demonstrated that the selected features of speech, gait and finger tapping were able to discriminate between symptom severity levels, as well as, between healthy controls and PD patients with high classification rates. The findings support suitability of these methods to be used as decision support tools in the context of PD assessment.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Vehicle activated signs (VAS) display a warning message when drivers exceed a particular threshold. VAS are often installed on local roads to display a warning message depending on the speed of the approaching vehicles. VAS are usually powered by electricity; however, battery and solar powered VAS are also commonplace. This thesis investigated devel-opment of an automatic trigger speed of vehicle activated signs in order to influence driver behaviour, the effect of which has been measured in terms of reduced mean speed and low standard deviation. A comprehen-sive understanding of the effectiveness of the trigger speed of the VAS on driver behaviour was established by systematically collecting data. Specif-ically, data on time of day, speed, length and direction of the vehicle have been collected for the purpose, using Doppler radar installed at the road. A data driven calibration method for the radar used in the experiment has also been developed and evaluated. Results indicate that trigger speed of the VAS had variable effect on driv-ers’ speed at different sites and at different times of the day. It is evident that the optimal trigger speed should be set near the 85th percentile speed, to be able to lower the standard deviation. In the case of battery and solar powered VAS, trigger speeds between the 50th and 85th per-centile offered the best compromise between safety and power consump-tion. Results also indicate that different classes of vehicles report differ-ences in mean speed and standard deviation; on a highway, the mean speed of cars differs slightly from the mean speed of trucks, whereas a significant difference was observed between the classes of vehicles on lo-cal roads. A differential trigger speed was therefore investigated for the sake of completion. A data driven approach using Random forest was found to be appropriate in predicting trigger speeds respective to types of vehicles and traffic conditions. The fact that the predicted trigger speed was found to be consistently around the 85th percentile speed justifies the choice of the automatic model.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this research the 3DVAR data assimilation scheme is implemented in the numerical model DIVAST in order to optimize the performance of the numerical model by selecting an appropriate turbulence scheme and tuning its parameters. Two turbulence closure schemes: the Prandtl mixing length model and the two-equation k-ε model were incorporated into DIVAST and examined with respect to their universality of application, complexity of solutions, computational efficiency and numerical stability. A square harbour with one symmetrical entrance subject to tide-induced flows was selected to investigate the structure of turbulent flows. The experimental part of the research was conducted in a tidal basin. A significant advantage of such laboratory experiment is a fully controlled environment where domain setup and forcing are user-defined. The research shows that the Prandtl mixing length model and the two-equation k-ε model, with default parameterization predefined according to literature recommendations, overestimate eddy viscosity which in turn results in a significant underestimation of velocity magnitudes in the harbour. The data assimilation of the model-predicted velocity and laboratory observations significantly improves model predictions for both turbulence models by adjusting modelled flows in the harbour to match de-errored observations. 3DVAR allows also to identify and quantify shortcomings of the numerical model. Such comprehensive analysis gives an optimal solution based on which numerical model parameters can be estimated. The process of turbulence model optimization by reparameterization and tuning towards optimal state led to new constants that may be potentially applied to complex turbulent flows, such as rapidly developing flows or recirculating flows.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

When an accurate hydraulic network model is available, direct modeling techniques are very straightforward and reliable for on-line leakage detection and localization applied to large class of water distribution networks. In general, this type of techniques based on analytical models can be seen as an application of the well-known fault detection and isolation theory for complex industrial systems. Nonetheless, the assumption of single leak scenarios is usually made considering a certain leak size pattern which may not hold in real applications. Upgrading a leak detection and localization method based on a direct modeling approach to handle multiple-leak scenarios can be, on one hand, quite straightforward but, on the other hand, highly computational demanding for large class of water distribution networks given the huge number of potential water loss hotspots. This paper presents a leakage detection and localization method suitable for multiple-leak scenarios and large class of water distribution networks. This method can be seen as an upgrade of the above mentioned method based on a direct modeling approach in which a global search method based on genetic algorithms has been integrated in order to estimate those network water loss hotspots and the size of the leaks. This is an inverse / direct modeling method which tries to take benefit from both approaches: on one hand, the exploration capability of genetic algorithms to estimate network water loss hotspots and the size of the leaks and on the other hand, the straightforwardness and reliability offered by the availability of an accurate hydraulic model to assess those close network areas around the estimated hotspots. The application of the resulting method in a DMA of the Barcelona water distribution network is provided and discussed. The obtained results show that leakage detection and localization under multiple-leak scenarios may be performed efficiently following an easy procedure.

Relevância:

100.00% 100.00%

Publicador: