31 results for data-driven modelling


Relevance: 80.00%

Abstract:

A major challenge in biomedical text mining is automatically extracting protein-protein interactions from the vast biomedical literature. We have constructed an information extraction system for protein-protein interactions based on the Hidden Vector State (HVS) model. The HVS model can be trained using only lightly annotated data while retaining sufficient ability to capture hierarchical structure. When applied to extracting protein-protein interactions, it performed better than other established statistical methods and achieved an F-score of 61.5% with balanced recall and precision. Moreover, the statistical nature of the purely data-driven HVS model makes it intrinsically robust and easy to adapt to other domains.
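For readers unfamiliar with the metric quoted above, the F-score is conventionally the weighted harmonic mean of precision and recall. The sketch below is a generic illustration, not the authors' evaluation code; the function name `f_score` and the input values are assumptions, since the abstract reports only the combined score.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when nothing was retrieved correctly
    b2 = beta ** 2
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical inputs: when precision and recall are equal, F1 equals both.
balanced = f_score(0.615, 0.615)
# A lopsided system is penalised: high precision cannot offset low recall.
lopsided = f_score(0.9, 0.1)
```

When precision and recall are balanced, as the abstract states, the F-score sits close to both values, which is why a single figure summarises the system reasonably well.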

Relevance: 80.00%

Abstract:

Many tests of financial contagion require a definition of the dates separating calm from crisis periods. We propose to use a battery of break search procedures for individual time series to objectively identify potential break dates in relationships between countries. Applied to the biggest European stock markets and combined with two well established tests for financial contagion, this approach results in break dates which correctly identify the timing of changes in cross-country transmission mechanisms. Application of break search procedures breathes new life into the established contagion tests, allowing for an objective, data-driven timing of crisis periods.

Relevance: 80.00%

Abstract:

This paper demonstrates that the conventional approach of treating official liberalisation dates as the only breakdates could lead to inaccurate conclusions about the effect of the underlying liberalisation policies. It also proposes an alternative paradigm for obtaining more robust estimates of volatility changes around official liberalisation dates and/or other important market events. By focusing on five East Asian emerging markets, all of which liberalised their financial markets in the late, and by using recent advances in the econometrics of structural change, it shows that (i) the detected breakdates in the volatility of stock market returns can be dramatically different from official liberalisation dates and (ii) the use of official liberalisation dates as breakdates can readily entail inaccurate inference. In contrast, using data-driven techniques to detect multiple structural changes yields a richer and more accurate pattern of volatility evolution than focusing on official liberalisation dates alone.

Relevance: 80.00%

Abstract:

This paper investigates whether the non-normality typically observed in daily stock-market returns could arise from the joint existence of breaks and GARCH effects. It proposes a data-driven procedure to credibly identify the number and timing of breaks and applies it to the benchmark stock-market indices of 27 OECD countries. The findings suggest that a substantial part of the observed deviations from normality might indeed be due to the co-existence of breaks and GARCH effects. However, structural changes, rather than GARCH effects, are found to be the primary reason for the non-normality. Some excess kurtosis also remains that is unlikely to be linked to the specification of the conditional volatility or the presence of breaks. Finally, an interesting sideline result implies that GARCH models have limited capacity to forecast stock-market volatility.
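The link between breaks and non-normality can be illustrated with a toy simulation. The sketch below is not the paper's procedure: it assumes a single known variance break, synthetic Gaussian returns, and hypothetical volatilities of 1 and 3. It shows only that pooling two normal regimes with different variances produces excess kurtosis even though each regime is normal on its own.

```python
import random
import statistics


def excess_kurtosis(xs):
    """Sample excess kurtosis; close to 0 for normally distributed data."""
    mean = statistics.fmean(xs)
    m2 = statistics.fmean([(x - mean) ** 2 for x in xs])
    m4 = statistics.fmean([(x - mean) ** 4 for x in xs])
    return m4 / m2 ** 2 - 3.0


rng = random.Random(0)  # fixed seed so the illustration is reproducible

# Hypothetical regimes: calm returns (sigma = 1), then a break to sigma = 3.
calm = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
crisis = [rng.gauss(0.0, 3.0) for _ in range(20_000)]
pooled = calm + crisis  # what an analyst sees when the break is ignored

kurt_calm = excess_kurtosis(calm)      # near 0: normal within the regime
kurt_crisis = excess_kurtosis(crisis)  # near 0: normal within the regime
kurt_pooled = excess_kurtosis(pooled)  # clearly positive: fat tails appear
```

For an equal-weight mixture of these two regimes the theoretical excess kurtosis is about 1.92, so the fat tails in the pooled series come entirely from the ignored break.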

Relevance: 80.00%

Abstract:

This article focuses on the deviations from normality of stock returns before and after a financial liberalisation reform, and shows the extent to which inference based on statistical measures of stock market efficiency can be affected by not controlling for breaks. Drawing from recent advances in the econometrics of structural change, it compares the distribution of the returns of five East Asian emerging markets when breaks in the mean and variance are either (i) imposed using certain official liberalisation dates or (ii) detected non-parametrically using a data-driven procedure. The results suggest that measuring deviations from normality of stock returns with no provision for potentially existing breaks incorporates substantial bias. This is likely to severely affect any inference based on the corresponding descriptive or test statistics.

Relevance: 80.00%

Abstract:

As torrents of new data now emerge from microbial genomics, bioinformatic prediction of immunogenic epitopes remains challenging but vital. In silico methods often produce paradoxically inconsistent results: good prediction rates on certain test sets but not others. The inherent complexity of immune presentation and recognition processes complicates epitope prediction. Two encouraging developments – data-driven, sequence-based artificial intelligence methods for epitope prediction and molecular modeling methods based on three-dimensional protein structures – offer hope for the future.

Relevance: 80.00%

Abstract:

This paper explores a data-driven approach called Sales Resource Management (SRM) that can provide real insight into sales management. The DSMT (Diagnosis, Strategy, Metrics and Tools) framework can be used to solve field sales management challenges. The paper focuses on the 6P's strategy of SRM and illustrates how to use it to solve the CAPS (Concentration, Attrition, Performance and Spend) challenges. © 2010 IEEE.

Relevance: 40.00%

Abstract:

Most of the common techniques for estimating conditional probability densities are inappropriate for applications involving periodic variables. In this paper we apply two novel techniques to the problem of extracting the distribution of wind vector directions from radar scatterometer data gathered by a remote-sensing satellite.

Relevance: 40.00%

Abstract:

Common approaches to IP-traffic modelling have featured stochastic models based on the Markov property, which can be classified into black box and white box models according to the approach used for modelling traffic. White box models are simple to understand, transparent, and have a physical meaning attributed to each of the associated parameters. To exploit this key advantage, this thesis explores the use of simple classic continuous-time Markov models, based on a white box approach, to model not only the network traffic statistics but also the source behaviour with respect to the network and application. The thesis is divided into two parts. The first part focuses on the use of simple Markov and Semi-Markov traffic models, starting from the simplest two-state model and moving upwards to n-state models with Poisson and non-Poisson statistics. The thesis then introduces the convenient-to-use, mathematically derived Gaussian Markov models, which are used to model the measured network IP traffic statistics. As one of its most significant contributions, the thesis establishes the significance of second-order density statistics, revealing that, in contrast to first-order density, they carry much more unique information on traffic sources and behaviour. The thesis then uses Gaussian Markov models to model these unique features and finally shows how simple classic Markov models, coupled with second-order density statistics, provide an excellent tool for capturing maximum traffic detail, which in itself is the essence of good traffic modelling. The second part of the thesis studies the ON-OFF characteristics of VoIP traffic with reference to accurate measurements of the ON and OFF periods, made from a large multi-lingual database of over 100 hours' worth of VoIP call recordings. The impact of the language, prosodic structure and speech rate of the speaker on the statistics of the ON-OFF periods is analysed and relevant conclusions are presented. Finally, an ON-OFF VoIP source model with log-normal transitions is contributed as an ideal candidate to model VoIP traffic, and the results of this model are compared with those of previously published work.
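As a rough illustration of what an ON-OFF source with log-normally distributed period lengths looks like, the sketch below alternates talk-spurt (ON) and silence (OFF) periods drawn from log-normal distributions. It is not the thesis's model: the function names and the `(mu, sigma)` parameters are hypothetical placeholders, and no values are taken from the measured VoIP data.

```python
import math
import random


def simulate_on_off(n_cycles, on_params, off_params, seed=0):
    """Return a list of (state, duration) pairs alternating ON and OFF.

    on_params and off_params are (mu, sigma) of the underlying normal
    distribution, as expected by random.lognormvariate.
    """
    rng = random.Random(seed)
    periods = []
    for _ in range(n_cycles):
        periods.append(("ON", rng.lognormvariate(*on_params)))
        periods.append(("OFF", rng.lognormvariate(*off_params)))
    return periods


def activity_factor(periods):
    """Fraction of total time the source spends in the ON state."""
    on_time = sum(d for state, d in periods if state == "ON")
    total_time = sum(d for _, d in periods)
    return on_time / total_time


# Hypothetical parameters (seconds, on a log scale); not fitted to real calls.
ON_PARAMS = (0.0, 0.5)
OFF_PARAMS = (0.5, 0.5)

periods = simulate_on_off(20_000, ON_PARAMS, OFF_PARAMS)
measured = activity_factor(periods)

# Theoretical check: the mean of a log-normal is exp(mu + sigma^2 / 2).
mean_on = math.exp(ON_PARAMS[0] + ON_PARAMS[1] ** 2 / 2)
mean_off = math.exp(OFF_PARAMS[0] + OFF_PARAMS[1] ** 2 / 2)
expected = mean_on / (mean_on + mean_off)
```

Comparing `measured` with `expected` is a quick sanity check that the simulated source spends the intended share of time talking; fitting `mu` and `sigma` to recorded ON and OFF durations would be the step that ties such a model to real traffic.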

Relevance: 40.00%

Abstract:

It is generally believed that the structural reforms that were introduced in India following the macro-economic crisis of 1991 ushered in competition and forced companies to become more efficient. However, whether the post-1991 growth is an outcome of more efficient use of resources or greater use of factor inputs remains an open empirical question. In this paper, we use plant-level data from 1989–1990 and 2000–2001 to address this question. Our results indicate that while there was an increase in the productivity of factor inputs during the 1990s, most of the growth in value added is explained by growth in the use of factor inputs. We also find that median technical efficiency declined in all but one of the industries between 1989–1990 and 2000–2001, and that change in technical efficiency explains a very small proportion of the change in gross value added.

Relevance: 40.00%

Abstract:

DUE TO COPYRIGHT RESTRICTIONS ONLY AVAILABLE FOR CONSULTATION AT ASTON UNIVERSITY LIBRARY AND INFORMATION SERVICES WITH PRIOR ARRANGEMENT

Relevance: 40.00%

Abstract:

We argue that, for certain constrained domains, elaborate model transformation technologies (implemented from scratch in general-purpose programming languages) are unnecessary for model-driven engineering; instead, lightweight configuration of commercial off-the-shelf productivity tools suffices. In particular, in the CancerGrid project, we have been developing model-driven techniques for the generation of software tools to support clinical trials. A domain metamodel captures the community's best practice in trial design. A scientist authors a trial protocol, modelling their trial by instantiating the metamodel; customized software artifacts to support trial execution are generated automatically from the scientist's model. The metamodel is expressed as an XML Schema, in such a way that it can be instantiated by completing a form to generate a conformant XML document. The same process works at a second level for trial execution: among the artifacts generated from the protocol are models of the data to be collected, and the clinician conducting the trial instantiates such models in reporting observations, again by completing a form to create a conformant XML document representing the data gathered during that observation. Simple standard form management tools are all that is needed. Our approach is applicable to a wide variety of information-modelling domains: not just clinical trials, but also electronic public sector computing, customer relationship management, document workflow, and so on. © 2012 Springer-Verlag.

Relevance: 40.00%

Abstract:

Using the analogy between lateral convection of heat and the two-phase flow in bubble columns, alternative turbulence modelling methods were analysed. The k-ε turbulence and Reynolds stress models were used to predict the buoyant motion of fluids where a density difference arises due to the introduction of heat or a discrete phase. A large height-to-width aspect ratio cavity was employed in the transport of heat, and it was shown that the Reynolds stress model with the use of velocity profiles including the laminar flow solution resulted in turbulent vortices developing. The turbulence models were then applied to the simulation of gas-liquid flow for a 5:1 height-to-width aspect ratio bubble column. In the case of a gas superficial velocity of 0.02 m s⁻¹, it was determined that employing the Reynolds stress model yielded the most realistic simulation results. © 2003 Elsevier B.V. All rights reserved.

Relevance: 40.00%

Abstract:

Developers of interactive software are confronted by an increasing variety of software tools to help engineer the interactive aspects of software applications. Not only do these tools fall into different categories in terms of functionality, but within each category there is a growing number of competing tools with similar, although not identical, features. Choice of user interface development tool (UIDT) is therefore becoming increasingly complex.

Relevance: 40.00%

Abstract:

The breadth and depth of available clinico-genomic information present an enormous opportunity for improving our ability to study disease mechanisms and meet the needs of individualised medicine. A difficulty occurs when the results are to be transferred 'from bench to bedside'. Diversity of methods is one of the causes, but the most critical one relates to our inability to share and jointly exploit data and tools. This paper presents a perspective on the current state of the art in the analysis of clinico-genomic data and its relevance to medical decision support. It is an attempt to investigate the issues related to data and knowledge integration. Copyright © 2010 Inderscience Enterprises Ltd.