913 results for Data-driven analysis


Relevance: 100.00%

Abstract:

Automatic environmental monitoring networks, supported by wireless communication technologies, now provide large and ever-increasing volumes of data. The use of this information in natural hazard research is an important issue. Spatial maps of hazard-related parameters, produced from point observations and available auxiliary information, are particularly useful for risk assessment and decision making. The purpose of this article is to present and explore appropriate tools to process large amounts of available data and produce predictions at fine spatial scales: the algorithms of machine learning, which are aimed at non-parametric, robust modelling of non-linear dependencies from empirical data. The computational efficiency of these data-driven methods allows prediction maps to be produced in real time, which makes them superior to physical models for operational use in risk assessment and mitigation. This situation arises in particular in the spatial prediction of climatic variables (topo-climatic mapping). In the complex topographies of mountainous regions, meteorological processes are strongly influenced by the relief. The article shows how these relations, possibly regionalized and non-linear, can be modelled from data using information from digital elevation models. The methodology is illustrated by the mapping of temperatures (including situations of Föhn and temperature inversion) from measurements taken by the Swiss meteorological monitoring network. The methods used in the study include data-driven feature selection, support vector algorithms and artificial neural networks.
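
As a minimal illustration of the kind of data-driven topo-climatic model the abstract describes, the sketch below fits a support vector regression to temperature observations using station coordinates and DEM-derived elevation as features. The synthetic data, feature choice and parameter values are assumptions for illustration, not the authors' actual pipeline.

```python
# Minimal sketch: non-parametric temperature mapping from station data
# plus DEM-derived features (here just x, y, elevation). Synthetic data
# stands in for the Swiss monitoring network measurements.
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_stations = 200
X = np.column_stack([
    rng.uniform(0, 100, n_stations),      # easting (km)
    rng.uniform(0, 100, n_stations),      # northing (km)
    rng.uniform(200, 3500, n_stations),   # elevation from a DEM (m)
])
# Synthetic temperatures: ~6.5 K/km lapse rate plus noise; an inversion
# would locally flip the sign of the elevation effect.
y = 15.0 - 6.5e-3 * X[:, 2] + rng.normal(0, 0.5, n_stations)

model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
model.fit(X, y)

# Predict on a fine grid of DEM cells to produce the temperature map.
grid = np.column_stack([
    rng.uniform(0, 100, 1000),
    rng.uniform(0, 100, 1000),
    rng.uniform(200, 3500, 1000),
])
t_map = model.predict(grid)
print(t_map[:5])
```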

Relevance: 100.00%

Abstract:

Within Data Envelopment Analysis, several alternative models allow for an environmental adjustment, and the majority of them deliver divergent results. Decision makers therefore face the difficult task of selecting the most suitable model. This study addresses that difficulty and thereby fills a research gap. First, a two-step web-based survey is conducted. It aims (1) to identify the selection criteria, (2) to prioritize and weight the selection criteria with respect to the goal of selecting the most suitable model and (3) to collect preferences about which model best fulfils each selection criterion. Second, the Analytic Hierarchy Process is used to quantify the preferences expressed in the survey. Results show that the understandability, the applicability and the acceptability of the alternative models are valid selection criteria. The selection of the most suitable model depends on the preferences of the decision makers with regard to these criteria.
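
To make the Analytic Hierarchy Process step concrete, the sketch below derives criterion weights from a pairwise comparison matrix via the principal eigenvector. The matrix values for understandability, applicability and acceptability are invented for illustration, not the survey's actual data.

```python
# Minimal AHP sketch: weights for three selection criteria from a
# reciprocal pairwise comparison matrix (principal eigenvector method).
import numpy as np

# Hypothetical matrix: understandability vs applicability vs
# acceptability (illustrative values only).
A = np.array([
    [1.0, 3.0, 2.0],
    [1/3, 1.0, 1/2],
    [1/2, 2.0, 1.0],
])

eigvals, eigvecs = np.linalg.eig(A)
principal = eigvecs[:, np.argmax(eigvals.real)].real
weights = principal / principal.sum()

# Consistency ratio check (random index RI = 0.58 for a 3x3 matrix).
lambda_max = eigvals.real.max()
ci = (lambda_max - len(A)) / (len(A) - 1)
cr = ci / 0.58
print(weights, cr)  # weights sum to 1; CR < 0.1 indicates acceptable consistency
```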

Relevance: 100.00%

Abstract:

This diploma thesis considers the advantages of coherent anti-Stokes Raman scattering (CARS) spectrometry and various methods it offers for the quantitative analysis of substance composition. The basic methods and concepts of adaptive analysis are presented. On the basis of these methods, an algorithm is developed for the automatic measurement of the size of the scattering band of a target component in a CARS spectrum. The algorithm uses the known full spectrum of the target substance and compares it with the CARS spectrum; the form of the differential spectrum is used as feedback to control the accuracy of the match. To exclude the influence of background in CARS spectra, the differential spectrum is analysed by means of its second derivative. The algorithm is validated on simple simulated spectra and on experimentally acquired spectra of organic compounds.
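
A minimal numerical sketch of the background-suppression idea: a slowly varying background is nearly linear, so a second derivative removes it while preserving the curvature of a narrow band. The Gaussian band and polynomial background below are illustrative assumptions, not the thesis's spectra.

```python
# Sketch: the second derivative of a spectrum suppresses a slowly varying
# background while retaining a narrow scattering band.
import numpy as np

x = np.linspace(0, 100, 1001)
band = np.exp(-0.5 * ((x - 50) / 2.0) ** 2)      # narrow target band
background = 0.01 * x + 1e-4 * x ** 2            # smooth background stand-in
spectrum = band + background

d2 = np.gradient(np.gradient(spectrum, x), x)    # second derivative
# The background contributes only a small constant (2e-4) to d2, while
# the band produces a strong localized signature around x = 50.
print(d2[np.abs(x - 50) < 1].min(), d2[x < 20].mean())
```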

Relevance: 100.00%

Abstract:

Membrane bioreactors (MBRs) combine activated sludge bioreactors with membrane filtration, enabling high-quality effluent with a small footprint. However, they can be beset by fouling, which causes an increase in transmembrane pressure (TMP). Modelling and simulation of changes in TMP could be useful to describe fouling through the identification of the most relevant operating conditions. Using experimental data from an MBR pilot plant operated for 462 days, two models were developed: a deterministic model using activated sludge model No. 2d (ASM2d) for the biological component and a resistance-in-series model for the filtration component, and a data-driven model based on multivariable regressions. Once validated, these models were used to describe membrane fouling (as changes in TMP over time) under different operating conditions. The deterministic model performed better at higher temperatures (>20 °C), constant operating conditions (DO set-point, membrane air-flow, pH and ORP), high mixed liquor suspended solids (>6.9 g L-1) and flux changes. At low pH (<7) or in periods with larger pH changes, the data-driven model was more accurate. Changes in the DO set-point of the aerobic reactor that affected the TMP were also better described by the data-driven model. By combining the use of both models, a better description of fouling can be achieved under different operating conditions.
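
The abstract names a resistance-in-series filtration model; a common form of it (an assumption here, since the paper's exact formulation is not given) relates TMP to flux J, permeate viscosity mu and the sum of membrane, cake and fouling resistances: TMP = mu * J * (Rm + Rc + Rf). A minimal sketch with typical orders of magnitude, not the pilot plant's values:

```python
# Resistance-in-series sketch: TMP = mu * J * (Rm + Rc + Rf).
# Parameter values are typical orders of magnitude only.

def tmp_pa(flux_lmh: float, mu: float, r_m: float, r_c: float, r_f: float) -> float:
    """Transmembrane pressure (Pa) from flux (L m^-2 h^-1) and resistances (m^-1)."""
    j = flux_lmh / 1000.0 / 3600.0        # convert L m^-2 h^-1 to m s^-1
    return mu * j * (r_m + r_c + r_f)

# Clean vs fouled membrane at 20 L m^-2 h^-1, water at 20 C.
clean = tmp_pa(20.0, mu=1.0e-3, r_m=1e12, r_c=0.0, r_f=0.0)
fouled = tmp_pa(20.0, mu=1.0e-3, r_m=1e12, r_c=5e11, r_f=2e12)
print(clean / 1000, fouled / 1000)  # TMP in kPa; fouling raises TMP ~3.5x
```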

Relevance: 100.00%

Abstract:

This thesis examines the application of data envelopment analysis as an equity portfolio selection criterion in the Finnish stock market during the period 2001-2011. The sample, drawn from publicly traded firms on the Helsinki Stock Exchange, covers the majority of the firms listed there. Data envelopment analysis is used to determine the efficiency of firms from a set of input and output financial parameters consisting of asset utilization, liquidity, capital structure, growth, valuation and profitability measures. Because of the industry-specific nature of these parameters, the firms are divided into artificial industry categories. Comparable portfolios are formed within each industry category according to the efficiency scores given by the DEA, and the performance of the portfolios is evaluated with several measures. The empirical evidence of this thesis suggests that, with certain limitations, data envelopment analysis can successfully be used as a portfolio selection criterion in the Finnish stock market when the portfolios are rebalanced annually according to the DEA efficiency scores. When the portfolios are rebalanced every two or three years, however, the results are mixed and inconclusive.
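
For readers unfamiliar with the mechanics, a standard input-oriented CCR DEA efficiency score can be computed as a linear program. The sketch below uses scipy and invented firm data (two inputs, one output), not the thesis's financial parameters.

```python
# Minimal input-oriented CCR DEA via linear programming (scipy).
# Firm data is invented: columns are firms, with 2 inputs and 1 output.
import numpy as np
from scipy.optimize import linprog

X = np.array([[4.0, 7.0, 8.0, 4.0, 2.0],    # input 1 per firm
              [3.0, 3.0, 1.0, 2.0, 4.0]])   # input 2 per firm
Y = np.array([[1.0, 1.0, 1.0, 1.0, 1.0]])   # single output per firm

def ccr_efficiency(k: int) -> float:
    """Efficiency of firm k: min theta s.t. X@lam <= theta*x_k, Y@lam >= y_k."""
    n = X.shape[1]
    c = np.zeros(1 + n)
    c[0] = 1.0                                # minimize theta
    # Input constraints:  X @ lam - theta * x_k <= 0
    A_in = np.hstack([-X[:, [k]], X])
    # Output constraints: -Y @ lam <= -y_k
    A_out = np.hstack([np.zeros((Y.shape[0], 1)), -Y])
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.concatenate([np.zeros(X.shape[0]), -Y[:, k]])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] + [(0, None)] * n)
    return res.fun

print([round(ccr_efficiency(k), 3) for k in range(5)])  # 1.0 = efficient firm
```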

Relevance: 100.00%

Abstract:

Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014

Relevance: 100.00%

Abstract:

Appropriate supplier selection and its profound effect on the competitive advantage of companies have been widely discussed in the supply chain management (SCM) literature. With rising environmental awareness, companies and industries attach more importance to sustainable and green activities when selecting raw material providers. This thesis uses the data envelopment analysis (DEA) technique to evaluate the relative efficiency of suppliers in the presence of carbon dioxide (CO2) emissions for green supplier selection, incorporating the pollution of suppliers as an undesirable output into DEA. However, two problems of conventional DEA models then arise: the lack of discrimination power among decision making units (DMUs) and the flexibility of the input and output weights. To overcome these limitations, we use multiple criteria DEA (MCDEA) as one alternative. Applying MCDEA decreases the number of suppliers identified as efficient and leads to a better ranking and selection of the suppliers. In addition, to compare the performance of the suppliers with an ideal supplier, a "virtual" best-practice supplier is introduced; its presence also increases the discrimination power of the model and yields a better ranking of the suppliers. A new MCDEA model is therefore proposed that simultaneously handles undesirable outputs and the virtual DMU. The developed model is applied to the green supplier selection problem, and a numerical example illustrates its applicability.
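
A minimal sketch of two ingredients the abstract combines, under assumptions of mine rather than the thesis's exact model: the CO2 emission is handled as an input to be minimized (one common treatment of an undesirable output), and a "virtual" ideal supplier with the best observed value of every input and output is appended before scoring.

```python
# Sketch: green supplier data with CO2 treated as an input to minimize,
# plus a virtual ideal DMU built from the best observed values to
# sharpen the discrimination power of the subsequent DEA/MCDEA run.
import numpy as np

# Rows: suppliers; inputs = [cost, CO2 emission], outputs = [quality].
inputs = np.array([[100.0, 50.0],
                   [120.0, 30.0],
                   [ 90.0, 70.0]])
outputs = np.array([[8.0], [9.0], [7.0]])

# Virtual best-practice supplier: minimum of every input, maximum of
# every output across the observed suppliers.
virtual_in = inputs.min(axis=0)     # [90.0, 30.0]
virtual_out = outputs.max(axis=0)   # [9.0]

inputs_aug = np.vstack([inputs, virtual_in])
outputs_aug = np.vstack([outputs, virtual_out])
print(inputs_aug, outputs_aug)
# The augmented matrices would then be scored with an (MC)DEA model such
# as the linear program sketched for the portfolio-selection abstract above.
```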

Relevance: 100.00%

Abstract:

This study, conducted in 2009, examined how Ontario secondary school principals perceived their role had changed over a 7-year period in response to the increased demands of data-driven school environments. Specifically, it sought to identify principals' perceptions of how high-stakes testing and data-driven environments had affected their role, tasks and accountability responsibilities. The study contextualized the emergence of the Education Quality and Accountability Office (EQAO) as a central influence in the creation of data-driven school environments, and conceptualized the role of the principal as using data to inform and to persuade a shift in thinking about the use of data to improve instruction and student achievement. The findings suggest that data-driven environments had helped principals reclaim their positional power as instructional leaders, using data as an avenue back into the classroom. The use of data shifted the responsibilities of the principal towards persuading teachers to work collaboratively to improve classroom instruction in order to demonstrate accountability.

Relevance: 100.00%

Abstract:

As our world becomes increasingly interconnected, diseases can spread at an ever faster rate. Recent years have seen large-scale influenza, cholera and Ebola outbreaks, and failing to react to an outbreak in a timely manner leads to a larger spread and longer persistence. Furthermore, diseases like malaria, polio and dengue fever have been eliminated in some parts of the world but continue to place a substantial burden on countries in which they are still endemic. To reduce the disease burden and eventually move towards countrywide elimination of diseases such as malaria, understanding human mobility is crucial both for planning interventions and for estimating the prevalence of the disease. In this talk, I will discuss how various data sources can be used to estimate human movements, population distributions and disease prevalence, as well as the relevance of this information for intervention planning. Anonymised mobile phone data in particular has been shown to be a valuable source of information for countries with unreliable population density and migration data, and I will present several studies where mobile phone data has been used to derive these measures.

Relevance: 100.00%

Abstract:

Title: Data-Driven Text Generation using Neural Networks
Speaker: Pavlos Vougiouklis, University of Southampton
Abstract: Recent work on neural networks shows their great potential at tackling a wide variety of Natural Language Processing (NLP) tasks. This talk will focus on the Natural Language Generation (NLG) problem and, more specifically, on the extent to which neural network language models can be employed for context-sensitive and data-driven text generation. In addition, a neural network architecture for response generation in social media will be discussed, along with the training methods that enable it to capture contextual information and effectively participate in public conversations.
Speaker Bio: Pavlos Vougiouklis obtained his 5-year Diploma in Electrical and Computer Engineering from the Aristotle University of Thessaloniki in 2013. He was awarded an MSc degree in Software Engineering from the University of Southampton in 2014. In 2015, he joined the Web and Internet Science (WAIS) research group of the University of Southampton, and he is currently working towards a PhD in the field of Neural Network Approaches for Natural Language Processing.

Title: Provenance is Complicated and Boring — Is there a solution?
Speaker: Darren Richardson, University of Southampton
Abstract: Paper trails, auditing, and accountability — arguably not the sexiest terms in computer science. But then you discover that you've possibly been eating horse-meat, and the importance of provenance becomes almost palpable. Having accepted that we should be creating provenance-enabled systems, the challenge of communicating that provenance to casual users is not trivial: users should not need a detailed working knowledge of your system, and they certainly shouldn't be expected to understand the data model. So how, then, do you give users an insight into the provenance without having to build a bespoke system for each and every provenance installation?
Speaker Bio: Darren is a final-year Computer Science PhD student. He completed his undergraduate degree in Electronic Engineering at Southampton in 2012.

Relevance: 100.00%

Abstract:

We use the third perihelion pass by the Ulysses spacecraft to illustrate and investigate the “flux excess” effect, whereby open solar flux estimates from spacecraft increase with increasing heliocentric distance. We analyze the potential effects of small-scale structure in the heliospheric field (giving fluctuations in the radial component on timescales smaller than 1 h) and kinematic time-of-flight effects of longitudinal structure in the solar wind flow. We show that the flux excess is explained neither by very small-scale structure (timescales < 1 h) nor by the kinematic “bunching effect” on spacecraft sampling. The observed flux excess is, however, well explained by the kinematic effect of larger-scale (>1 day) solar wind speed variations on the frozen-in heliospheric field. We show that averaging over an interval T (that is long enough to eliminate structure originating in the heliosphere yet small enough to avoid cancelling opposite-polarity radial field that originates from genuine sector structure in the coronal source field) is only an approximately valid way of allowing for these effects and does not adequately explain or account for differences between the streamer belt and the polar coronal holes.
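
A worked miniature of the averaging argument, with a synthetic radial-field series standing in for Ulysses data: the unsigned (open-flux) estimate uses the modulus of the radial field averaged over an interval T, and the result depends on T because averaging cancels opposite-polarity field. The field amplitudes and noise level below are illustrative assumptions.

```python
# Sketch: why an unsigned radial-flux proxy depends on the averaging
# interval T. Synthetic Br with two-sector structure plus fluctuations.
import numpy as np

rng = np.random.default_rng(1)
hours = np.arange(0, 27 * 24)                            # one ~27-day rotation, hourly
sector = np.sign(np.sin(2 * np.pi * hours / (27 * 24)))  # two-sector polarity
br = 3.0 * sector + 1.5 * rng.normal(size=hours.size)    # nT, illustrative

def unsigned_flux_proxy(br: np.ndarray, t_hours: int) -> float:
    """Mean of |<Br>_T|: average Br over blocks of length T, then take the modulus."""
    n = br.size // t_hours * t_hours
    blocks = br[:n].reshape(-1, t_hours).mean(axis=1)
    return np.abs(blocks).mean()

for t in (1, 24, 72):
    print(t, round(unsigned_flux_proxy(br, t), 2))
# Short T retains fluctuation-driven "flux excess"; long T starts cancelling
# genuine sector structure -- hence averaging over T is only approximately valid.
```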

Relevance: 100.00%

Abstract:

Performance modelling is a useful tool in the lifecycle of high performance scientific software, such as weather and climate models, especially as a means of ensuring efficient use of available computing resources. In particular, sufficiently accurate performance prediction could reduce the effort and experimental computer time required when porting and optimising a climate model to a new machine. In this paper, traditional techniques are used to predict the computation time of a simple shallow water model which is illustrative of the computation (and communication) involved in climate models. These models are compared with real execution data gathered on AMD Opteron-based systems, including several phases of the U.K. academic community HPC resource, HECToR. Some success is achieved in relating source code to achieved performance for the K10 series of Opterons, but the method is found to be inadequate for the next-generation Interlagos processor. This experience leads to the investigation of a data-driven application benchmarking approach to performance modelling. Results for an early version of the approach are presented using the shallow water model as an example.
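
As a toy version of the data-driven benchmarking idea, the sketch below times a simple stencil kernel at a few grid sizes, fits a linear cost model t(n) = a*n + b in grid points, and extrapolates to a larger size. The kernel and sizes are stand-ins of mine, not the paper's shallow water model or HECToR data.

```python
# Toy data-driven performance model: measure runtimes at a few grid
# sizes, fit a linear model in grid points, extrapolate.
import time
import numpy as np

def stencil_step(u: np.ndarray) -> np.ndarray:
    """One Jacobi-style 5-point stencil sweep (stand-in for a model timestep)."""
    return 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0)
                   + np.roll(u, 1, 1) + np.roll(u, -1, 1))

sizes, times = [], []
for n in (128, 256, 512):
    u = np.random.rand(n, n)
    start = time.perf_counter()
    for _ in range(50):
        u = stencil_step(u)
    times.append((time.perf_counter() - start) / 50)   # seconds per step
    sizes.append(n * n)

a, b = np.polyfit(sizes, times, 1)          # t(n) = a*n + b, n = grid points
predicted = a * (1024 * 1024) + b           # extrapolate to a 1024^2 grid
print(predicted)
```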

Relevance: 100.00%

Abstract:

Vehicle activated signs (VAS) display a warning message when an approaching vehicle exceeds a particular speed threshold. They are often installed on local roads and are usually powered by electricity, although battery and solar powered VAS are also commonplace. This thesis investigated the development of an automatic trigger speed for vehicle activated signs in order to influence driver behaviour, the effect of which was measured in terms of reduced mean speed and lower standard deviation. A comprehensive understanding of the effectiveness of the VAS trigger speed on driver behaviour was established by systematically collecting data. Specifically, data on time of day, speed, length and direction of the vehicles were collected using Doppler radar installed at the roadside. A data-driven calibration method for the radar used in the experiment was also developed and evaluated. Results indicate that the trigger speed of the VAS had a variable effect on drivers' speed at different sites and at different times of the day. It is evident that the optimal trigger speed should be set near the 85th percentile speed in order to lower the standard deviation. In the case of battery and solar powered VAS, trigger speeds between the 50th and 85th percentile offered the best compromise between safety and power consumption. Results also indicate that different classes of vehicles differ in mean speed and standard deviation: on a highway, the mean speed of cars differs only slightly from the mean speed of trucks, whereas a significant difference was observed between vehicle classes on local roads. A differential trigger speed was therefore investigated for the sake of completeness. A data-driven approach using random forests was found to be appropriate for predicting trigger speeds for different types of vehicles and traffic conditions. The fact that the predicted trigger speed was found to be consistently around the 85th percentile speed justifies the choice of the automatic model.
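
A minimal sketch of the final modelling step, assuming (consistent with the abstract) features such as time of day, vehicle class and measured speed statistics; the synthetic data, feature encoding and target construction are mine, not the thesis's dataset.

```python
# Sketch: predicting a per-condition trigger speed with a random forest.
# Features (hour of day, vehicle class, mean approach speed) and the
# synthetic target are illustrative only.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
n = 500
hour = rng.integers(0, 24, n)
vclass = rng.integers(0, 2, n)                 # 0 = car, 1 = truck
mean_speed = rng.normal(60, 8, n)              # km/h
X = np.column_stack([hour, vclass, mean_speed])

# Target: roughly the 85th percentile of the local speed distribution
# (mu + 1.04 * sigma for a normal), trucks slightly lower -- consistent
# with the thesis's finding that trigger speeds sit near the 85th percentile.
y = mean_speed + 1.04 * 8 - 3.0 * vclass + rng.normal(0, 1, n)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
print(model.predict([[17, 0, 62.0]]))          # trigger speed at 17:00 for cars
```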

Relevance: 100.00%

Abstract:

Using an input-oriented Data Envelopment Analysis model, this study compares the efficiency of commercial banks operating in the developed economies of the G10 countries with that of commercial banks operating in the Brazilian market. First, the banks are compared using a ‘simple’ model that considers only each bank's operating results and ignores the economic and regulatory characteristics of each market. Next, a ‘complete’ model is introduced, which incorporates the characteristics of each country's business environment in addition to each bank's results. The results show that environmental variables exert a strong influence on the efficiency of the banking industry. When the impact of environmental variables on institutional efficiency is taken into account, the banks operating in Brazil generally proved more efficient than those operating in the more developed economies.

Relevance: 100.00%

Abstract:

This paper uses an output-oriented Data Envelopment Analysis (DEA) measure of technical efficiency to assess the technical efficiencies of the Brazilian banking system. Four approaches to estimation are compared in order to assess the significance of factors affecting inefficiency: nonparametric Analysis of Covariance, maximum likelihood using a family of exponential distributions, maximum likelihood using a family of truncated normal distributions, and the normal Tobit model. The sole focus of the paper is a combined measure of output, and the data analyzed refer to the year 2001. The factors of interest, likely to affect efficiency, are bank nature (multiple and commercial), bank type (credit, business, bursary and retail), bank size (large, medium, small and micro), bank control (private and public), bank origin (domestic and foreign), and non-performing loans; the latter is a measure of bank risk. All quantitative variables, including non-performing loans, are measured on a per-employee basis. The best fits to the data are provided by the exponential family and the nonparametric Analysis of Covariance. The significance of a factor, however, varies according to the model fitted, although there is some agreement between the best models. A highly significant association in all models fitted is observed only for non-performing loans. The nonparametric Analysis of Covariance is more consistent with the median inefficiency responses observed for the qualitative factors. The findings reinforce the significant association of the level of bank inefficiency, measured by DEA residuals, with the risk of bank failure.
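
To illustrate the second-stage estimation step, the sketch below regresses DEA inefficiency scores on qualitative bank factors and non-performing loans with an ANCOVA-style OLS model in statsmodels. The data frame is invented, and the paper's exponential and truncated-normal maximum-likelihood fits and Tobit model are not reproduced here.

```python
# Sketch: ANCOVA-style regression of DEA inefficiency on bank factors.
# Invented data; OLS stands in for the paper's four estimation approaches.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 120
df = pd.DataFrame({
    "inefficiency": rng.gamma(2.0, 0.1, n),           # DEA residual-style score
    "control": rng.choice(["private", "public"], n),
    "origin": rng.choice(["domestic", "foreign"], n),
    "npl": rng.gamma(1.5, 0.05, n),                   # non-performing loans per employee
})
# Build in a genuine association between non-performing loans and inefficiency.
df["inefficiency"] += 0.8 * df["npl"]

fit = smf.ols("inefficiency ~ C(control) + C(origin) + npl", data=df).fit()
print(fit.summary().tables[1])   # npl should show a significant coefficient
```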