528 results for PROBABILITY
Abstract:
Early models of bankruptcy prediction employed financial ratios drawn from pre-bankruptcy financial statements and performed well both in-sample and out-of-sample. Since then there has been an ongoing effort in the literature to develop models with even greater predictive performance. A significant innovation in the literature was the introduction into bankruptcy prediction models of capital market data such as excess stock returns and stock return volatility, along with the application of the Black–Scholes–Merton option-pricing model. In this note, we test five key bankruptcy models from the literature using an up-to-date data set and find that they each contain unique information regarding the probability of bankruptcy but that their performance varies over time. We build a new model comprising key variables from each of the five models and add a new variable that proxies for the degree of diversification within the firm. The degree of diversification is shown to be negatively associated with the risk of bankruptcy. This more general model outperforms the existing models in a variety of in-sample and out-of-sample tests.
Abstract:
Gaussian mixture models (GMMs) have become an established means of modeling feature distributions in speaker recognition systems. It is useful for experimentation and practical implementation purposes to develop and test these models in an efficient manner, particularly when computational resources are limited. A method of combining vector quantization (VQ) with single multi-dimensional Gaussians is proposed to rapidly generate a robust model approximation to the Gaussian mixture model. A fast method of testing these systems is also proposed and implemented. Results on the NIST 1996 Speaker Recognition Database suggest verification performance comparable, and in some cases superior, to the traditional GMM-based analysis scheme. In addition, previous research on the task of speaker identification indicated similar system performance between the VQ Gaussian-based technique and GMMs.
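A minimal sketch of the idea described above, using synthetic two-dimensional data in place of real speech features and scikit-learn in place of the authors' implementation (the cluster counts, data, and library choices are illustrative assumptions, not details from the paper): a k-means (VQ) codebook partitions the feature space, a single diagonal Gaussian is fitted to each cell, and the result is compared against a conventionally EM-trained GMM.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic stand-in for speaker feature vectors: two well-separated clusters.
data = np.vstack([rng.normal(0.0, 1.0, (200, 2)),
                  rng.normal(5.0, 1.0, (200, 2))])

# VQ step: partition the feature space with a k-means codebook.
vq = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)

# Fit one diagonal Gaussian per VQ cell: a fast approximation to a GMM.
weights, means, variances = [], [], []
for k in range(2):
    cell = data[vq.labels_ == k]
    weights.append(len(cell) / len(data))
    means.append(cell.mean(axis=0))
    variances.append(cell.var(axis=0))

# Reference model: a full EM-trained diagonal-covariance GMM.
gmm = GaussianMixture(n_components=2, covariance_type="diag",
                      random_state=0).fit(data)
```

On data this cleanly separated, the two sets of component means nearly coincide; the appeal of the VQ route is that it avoids iterating EM over all components.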
Abstract:
Maternal and infant mortality is a global health issue with a significant social and economic impact. Each year, over half a million women worldwide die due to complications related to pregnancy or childbirth, four million infants die in the first 28 days of life, and eight million infants die in the first year. Ninety-nine percent of maternal and infant deaths occur in developing countries. Reducing maternal and infant mortality is among the key international development goals. In China, the national maternal mortality ratio and infant mortality rate were reduced greatly in the past two decades, yet a large discrepancy remains between urban and rural areas. To address this problem, a large-scale Safe Motherhood Programme was initiated in 2000 and implemented in Guangxi in 2003. The programme included both demand-side and supply-side interventions focusing on increasing health service use and improving birth outcomes. Little is known about the effects and economic outcomes of the Safe Motherhood Programme in Guangxi, although it has been implemented for seven years. The aim of this research is to estimate the effectiveness and cost-effectiveness of the interventions in the Safe Motherhood Programme in Guangxi, China. The objectives of this research are: 1. To evaluate whether the changes in health service use and birth outcomes are associated with the interventions in the Safe Motherhood Programme. 2. To estimate the cost-effectiveness of the interventions in the Safe Motherhood Programme and quantify the uncertainty surrounding the decision. 3. To assess the expected value of perfect information associated with both the whole decision and individual parameters, and interpret the findings to inform priority setting in further research and policy making in this area. A quasi-experimental study design was used in this research to assess the effectiveness of the programme in increasing health service use and improving birth outcomes.
The study subjects were 51 intervention counties and 30 control counties. Data on health service use, birth outcomes and socio-economic factors from 2001 to 2007 were collected from the programme database and statistical yearbooks. Based on the profile plots of the data, general linear mixed models were used to evaluate the effectiveness of the programme while controlling for the effects of baseline levels of the response variables, changes in socio-economic factors over time, and correlations among repeated measurements from the same county. Redundant multicollinear variables were deleted from the mixed model using the results of multicollinearity diagnoses. For each response variable, the best covariance structure was selected from 15 alternatives according to fit statistics including the Akaike information criterion, the finite-population corrected Akaike information criterion, and Schwarz's Bayesian information criterion. Residual diagnostics were used to validate the model assumptions. Statistical inferences were made to show the effect of the programme on health service use and birth outcomes. A decision analytic model was developed to evaluate the cost-effectiveness of the programme, quantify the decision uncertainty, and estimate the expected value of perfect information associated with the decision. The model was used to describe the transitions between health states for women and infants and to reflect the changes in both costs and health benefits associated with implementing the programme. Results gained from the mixed models and other relevant evidence identified were synthesised appropriately to inform the input parameters of the model. Incremental cost-effectiveness ratios of the programme were calculated for the two groups of intervention counties over time. Uncertainty surrounding the parameters was dealt with using probabilistic sensitivity analysis, and uncertainty relating to model assumptions was handled using scenario analysis.
Finally, the expected value of perfect information for both the whole model and individual parameters in the model was estimated to inform priority setting in further research in this area. The annual change rates of the antenatal care rate and the institutionalised delivery rate improved significantly in the intervention counties after the programme was implemented. Significant improvements were also found in the annual change rates of the maternal mortality ratio, the infant mortality rate, the incidence rate of neonatal tetanus and the mortality rate of neonatal tetanus in the intervention counties after the implementation of the programme. The annual change rate of the neonatal mortality rate also improved, although the improvement was only close to statistical significance. The influences of the socio-economic factors on the health service use indicators and birth outcomes were identified. Rural income per capita had a significant positive impact on the health service use indicators, and a significant negative impact on the birth outcomes. The number of beds in healthcare institutions per 1,000 population and the number of rural telephone subscribers per 1,000 were found to be significantly positively related to the institutionalised delivery rate. The length of highway per square kilometre negatively influenced the maternal mortality ratio. The percentage of employed persons in the primary industry had a significant negative impact on the institutionalised delivery rate, and a significant positive impact on the infant mortality rate and neonatal mortality rate. The incremental costs of implementing the programme over the existing practice were US $11.1 million from the societal perspective, and US $13.8 million from the perspective of the Ministry of Health.
Overall, 28,711 life years were generated by the programme, producing an overall incremental cost-effectiveness ratio of US $386 from the societal perspective, and US $480 from the perspective of the Ministry of Health, both of which were below the threshold willingness-to-pay ratio of US $675. The expected net monetary benefit generated by the programme was US $8.3 million from the societal perspective, and US $5.5 million from the perspective of the Ministry of Health. The overall probability that the programme was cost-effective was 0.93 and 0.89 from the two perspectives, respectively. The incremental cost-effectiveness ratio of the programme was insensitive to the different estimates of the three parameters relating to the model assumptions. Further research could be conducted to reduce the uncertainty surrounding the decision, in which the upper limit of investment was US $0.6 million from the societal perspective, and US $1.3 million from the perspective of the Ministry of Health. It is also worthwhile to get a more precise estimate of the improvement of infant mortality rate. The population expected value of perfect information for individual parameters associated with this parameter was US $0.99 million from the societal perspective, and US $1.14 million from the perspective of the Ministry of Health. The findings from this study have shown that the interventions in the Safe Motherhood Programme were both effective and cost-effective in increasing health service use and improving birth outcomes in rural areas of Guangxi, China. Therefore, the programme represents a good public health investment and should be adopted and further expanded to an even broader area if possible. This research provides economic evidence to inform efficient decision making in improving maternal and infant health in developing countries.
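The headline figures above follow from two standard health-economics formulas, the incremental cost-effectiveness ratio (ICER = incremental cost / incremental effect) and the net monetary benefit (NMB = threshold × effect − cost). A small sketch reproducing the abstract's arithmetic from the quoted numbers (all values are taken directly from the text; the function names are illustrative):

```python
# Reproducing the cost-effectiveness arithmetic from the figures quoted
# in the abstract (all values in US$; life years as reported by the study).
life_years = 28_711          # incremental life years generated
wtp_threshold = 675          # willingness-to-pay per life year

def icer(incremental_cost, incremental_effect):
    """Incremental cost-effectiveness ratio: extra cost per life year gained."""
    return incremental_cost / incremental_effect

def net_monetary_benefit(incremental_cost, incremental_effect, threshold):
    """NMB = threshold * effect - cost; positive means cost-effective."""
    return threshold * incremental_effect - incremental_cost

societal_cost = 11.1e6       # societal perspective
moh_cost = 13.8e6            # Ministry of Health perspective

icer_societal = icer(societal_cost, life_years)    # approx. US $386 per life year
icer_moh = icer(moh_cost, life_years)              # approx. US $480 per life year
nmb_societal = net_monetary_benefit(societal_cost, life_years, wtp_threshold)
nmb_moh = net_monetary_benefit(moh_cost, life_years, wtp_threshold)
```

Both ICERs fall below the US $675 threshold, and both NMBs come out positive (around US $8.3 million and US $5.6 million), matching the abstract's conclusion that the programme is cost-effective from either perspective.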
Abstract:
Stochastic models for competing clonotypes of T cells by multivariate, continuous-time, discrete-state Markov processes have been proposed in the literature by Stirk, Molina-París and van den Berg (2008). A stochastic modelling framework is important because of rare events associated with small populations of some critical cell types. Usually, computational methods for these problems employ a trajectory-based approach based on Monte Carlo simulation. This is partly because the complementary probability density function (PDF) approaches can be expensive, but here we describe some efficient PDF approaches that directly solve the governing equations, known as the Master Equation. These computations are made very efficient through an approximation of the state space by the Finite State Projection and through the use of Krylov subspace methods when evaluating the matrix exponential. These computational methods allow us to explore the evolution of the PDFs associated with these stochastic models, and bimodal distributions arise in some parameter regimes. Time-dependent propensities naturally arise in immunological processes due to, for example, age-dependent effects. Incorporating time-dependent propensities into the framework of the Master Equation significantly complicates the corresponding computational methods, but here we describe an efficient approach via Magnus formulas. Although this contribution focuses on the example of competing clonotypes, the general principles are relevant to multivariate Markov processes and provide fundamental techniques for computational immunology.
Abstract:
The stochastic simulation algorithm was introduced by Gillespie and, in a different form, by Kurtz. There have been many attempts at accelerating the algorithm without deviating from the behavior of the simulated system. The crux of the explicit τ-leaping procedure is the use of Poisson random variables to approximate the number of occurrences of each type of reaction event during a carefully selected time period, τ. This method is acceptable provided the leap condition, namely that no propensity function changes "significantly" during any time-step, is met. Using this method there is a possibility that species numbers can, artificially, become negative. Several recent papers have demonstrated methods that avoid this situation. One such method classifies as critical those reactions in danger of sending species populations negative; at most one of these critical reactions is allowed to occur in the next time-step. We argue that the criticality of a reactant species and its dependent reaction channels should be related to the probability of the species number becoming negative. This way, only reactions that, if fired, produce a high probability of driving a reactant population negative are labeled critical. The number of firings of more reaction channels can then be approximated using Poisson random variables, thus speeding up the simulation while maintaining accuracy. In implementing this revised method of criticality selection we make use of the probability distribution from which the random variable describing the change in species number is drawn. We give several numerical examples to demonstrate the effectiveness of our new method.
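For readers unfamiliar with τ-leaping, a hedged sketch of the basic step the abstract refers to, applied to the single decay reaction A → ∅ (this shows only the Poisson-approximation step and the negative-population hazard; the probability-based criticality selection proposed in the paper is not reproduced here, and the rates and clipping rule are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)

def tau_leap_decay(x0, c, tau, t_end):
    """Plain tau-leaping for the decay reaction A -> 0, propensity a(x) = c*x.

    Each step approximates the number of reaction firings in a window of
    length tau by a Poisson draw. A draw exceeding the current population
    is exactly the event that would drive the count negative; here it is
    simply clipped, which is the crude guard that criticality-based
    schemes refine.
    """
    x, t = x0, 0.0
    while t < t_end and x > 0:
        firings = rng.poisson(c * x * tau)
        x -= min(firings, x)   # clip: never let the population go negative
        t += tau
    return x

# Mean terminal population over many runs should track x0 * exp(-c * t_end).
terminal = [tau_leap_decay(1000, 1.0, 0.01, 1.0) for _ in range(200)]
```

With x0 = 1000, c = 1 and t_end = 1, the sample mean of the terminal counts should sit near 1000·e⁻¹ ≈ 368.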
Abstract:
This thesis examines the ways in which citizens find out about socio-political issues. The project set out to discover how audience characteristics such as scepticism towards the media, gratifications sought, need for cognition and political interest influence information selection. While most previous information choice studies have focused on how individuals select from a narrow range of media types, this thesis considered a much wider sweep of the information landscape. This approach was taken to obtain an understanding of information choices in a more authentic context: in everyday life, people are not simply restricted to one or two news sources. Rather, they may obtain political information from a vast range of information sources, including media sources (e.g., radio, television, newspapers) and sources from beyond the media (e.g., interpersonal sources, public speaking events, social networking websites). Thus, the study included both news media and non-news-media information sources. Data collection for the project consisted of a written, postal survey. The survey was administered to a probability sample in the greater Brisbane region; Brisbane is the third-largest city in Australia. Data were collected during March and April 2008, approximately four months after the 2007 Australian Federal Election; hence, the study was conducted in a non-election context. A total of 585 usable surveys were obtained. In addition to measuring the attitudinal characteristics listed above, respondents were surveyed as to which information sources (e.g., television shows, radio stations, websites and festivals) they usually use to find out about socio-political issues. Multiple linear regression analysis was conducted to explore patterns of influence between the audience characteristics and information consumption patterns. The results of this analysis indicated an apparent difference between the way citizens use news media sources and the way they use information sources from beyond the news media.
In essence, it appears that non-news media information sources are used very deliberately to seek socio-political information, while media sources are used in a less purposeful way. If media use in a non-election context, such as that of the present study, is not primarily concerned with deliberate information seeking, media use must instead have other primary purposes, with political information acquisition as either a secondary driver, or a by-product of that primary purpose. It appears, then, that political information consumption in a media-saturated society is more about routine ‘practices’ than it is about ‘information seeking’. The suggestion that media use is no longer primarily concerned with information seeking, but rather, is simply a behaviour which occurs within the broader set of everyday practices reflects Couldry’s (2004) media as practice paradigm. These findings highlight the need for more authentic and holistic contexts for media research. It is insufficient to consider information choices in isolation, or even from a wider range of information sources, such as that incorporated in the present study. Future media research must take greater account of the broader social contexts and practices in which media-oriented behaviours occur. The findings also call into question the previously assumed centrality of trust to information selection decisions. Citizens regularly use media they do not trust to find out about politics. If people are willing to use information sources they do not trust for democratically important topics such as politics, it is important that citizens possess the media literacy skills to effectively understand and evaluate the information they are presented with. Without the application of such media literacy skills, a steady diet of ‘fast food’ media may result in uninformed or misinformed voting decisions, which have implications for the effectiveness of democratic processes. 
This research has emphasized the need for further holistic and authentically contextualised media use research, to better understand how citizens use information sources to find out about important topics such as politics.
Abstract:
Modelling how a word is activated in human memory is an important requirement for determining the probability of recall of a word in an extra-list cueing experiment. The spreading activation, spooky-action-at-a-distance and entanglement models have all been used to model the activation of a word. Recently a hypothesis was put forward that the mean activation levels of the respective models are ordered as follows: spreading activation ≤ entanglement ≤ spooky-action-at-a-distance. This article investigates this hypothesis by means of a substantial empirical analysis of each model using the University of South Florida word association, rhyme and word norms.
Abstract:
This thesis investigates profiling and differentiating customers through the use of statistical data mining techniques. The business application of our work centres on examining individuals' seldom-studied yet critical consumption behaviour over an extensive time period within the context of the wireless telecommunication industry; consumption behaviour (as opposed to purchasing behaviour) is behaviour that has been performed so frequently that it becomes habitual and involves minimal intention or decision making. Key variables investigated are the activity-initialisation timestamp and cell tower location, as well as the activity type and usage quantity (e.g., voice call with duration in seconds); the research focuses on customers' spatial and temporal usage behaviour. The main methodological emphasis is on the development of clustering models based on Gaussian mixture models (GMMs), which are fitted using the recently developed variational Bayesian (VB) method. VB is an efficient deterministic alternative to the popular but computationally demanding Markov chain Monte Carlo (MCMC) methods. The standard VB-GMM algorithm is extended by allowing component splitting such that it is robust to initial parameter choices and can automatically and efficiently determine the number of components. The new algorithm we propose allows more effective modelling of individuals' highly heterogeneous and spiky spatial usage behaviour, or more generally human mobility patterns; the term spiky describes data patterns with large areas of low probability mixed with small areas of high probability. Customers are then characterised and segmented based on the fitted GMM, which corresponds to how each of them uses the products/services spatially in their daily lives; this essentially captures their likely lifestyle and occupational traits.
Other significant research contributions include fitting GMMs using VB to circular data, i.e., the temporal usage behaviour, and developing clustering algorithms suitable for high-dimensional data based on the use of VB-GMM.
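The automatic component-count determination described above can be illustrated with scikit-learn's off-the-shelf variational GMM. Note this is only an analogue: sklearn's BayesianGaussianMixture prunes superfluous components by shrinking their weights rather than using the component-splitting extension developed in the thesis, and the data, priors and thresholds below are illustrative assumptions.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)
# Synthetic stand-in for spatial usage data: two genuine clusters.
data = np.vstack([rng.normal(-4.0, 1.0, (300, 2)),
                  rng.normal(4.0, 1.0, (300, 2))])

# Deliberately over-specify the number of components; the variational
# posterior pushes the weights of superfluous components towards zero.
vb = BayesianGaussianMixture(
    n_components=6,
    weight_concentration_prior_type="dirichlet_process",
    weight_concentration_prior=0.01,
    max_iter=500,
    random_state=0,
).fit(data)

# Effective number of components: those carrying non-negligible weight.
active = int((vb.weights_ > 0.05).sum())
```

Despite being given six components, the fitted model concentrates essentially all of its weight on two of them, mirroring the automatic model-order selection that the VB framework provides.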
Abstract:
Biochemical reactions underlying genetic regulation are often modelled as a continuous-time, discrete-state Markov process, and the evolution of the associated probability density is described by the so-called chemical master equation (CME). However, the CME is typically difficult to solve, since the state space involved can be very large or even countably infinite. Recently a finite state projection (FSP) method that truncates the state space was suggested and shown to be effective on a model of the Pap-pili epigenetic switch. In that example, however, both the model and the final time at which the solution was computed were relatively small. Presented here is a Krylov FSP algorithm based on a combination of state-space truncation and inexact matrix-vector product routines. This allows larger-scale models to be studied and solutions for larger final times to be computed in a realistic execution time. Additionally, the new method computes the solution at intermediate times at virtually no extra cost, since it is derived from Krylov-type methods for computing matrix exponentials. For the purpose of comparison, the new algorithm is applied to the model of the Pap-pili epigenetic switch, where the original FSP was first demonstrated. The method is also applied to a more sophisticated model of regulated transcription. Numerical results indicate that the new approach is significantly faster and extendable to larger biological models.
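A minimal sketch of the truncate-then-exponentiate idea on a toy birth-death model, using SciPy's Krylov-style `expm_multiply` in place of the paper's algorithm (the model, rates and truncation size are illustrative; the boundary handling below conserves probability exactly, whereas a pure FSP truncation lets a small amount of mass leak out and uses that leak as an error bound):

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import expm_multiply

# Truncated CME generator for a birth-death process: production at rate k,
# degradation at rate g per molecule, states 0..N (the truncated state space).
k, g, N = 5.0, 1.0, 30
A = lil_matrix((N + 1, N + 1))
for i in range(N + 1):
    if i < N:
        A[i + 1, i] += k        # birth: i -> i+1
        A[i, i] -= k
    if i > 0:
        A[i - 1, i] += g * i    # death: i -> i-1
        A[i, i] -= g * i
A = A.tocsr()

p0 = np.zeros(N + 1)
p0[0] = 1.0                      # start with zero molecules

# Evaluate p(t) = exp(tA) p0 without ever forming the dense matrix exponential.
p = expm_multiply(A * 10.0, p0)
mean = (np.arange(N + 1) * p).sum()
```

By t = 10 the distribution has essentially reached its stationary Poisson form with mean k/g = 5, and the whole computation touches only a sparse (N+1)×(N+1) matrix.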
Abstract:
We consider a continuous-time model for election timing in a Majoritarian Parliamentary System where the government maintains a constitutional right to call an early election. Our model is based on the two-party-preferred data that measure the popularity of the government and the opposition over time. We describe the poll process by a Stochastic Differential Equation (SDE) and use a martingale approach to derive a Partial Differential Equation (PDE) for the government's expected remaining life in office. A comparison is made between a three-year and a four-year maximum term, and we also provide the exercise boundary for calling an election. The impacts of changes in the SDE parameters, the probability of winning the election, and the maximum term on the call exercise boundaries are discussed and analysed. An application of our model to the Australian Federal Election for the House of Representatives is also given.
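The abstract does not state the SDE's functional form, so the following is only a hedged sketch of the kind of poll process involved: an assumed mean-reverting (Ornstein–Uhlenbeck) specification for the two-party-preferred share, simulated by Euler–Maruyama, with the probability of the government leading at the election date estimated by Monte Carlo (all parameter values are invented for illustration).

```python
import numpy as np

rng = np.random.default_rng(2)

def poll_paths(x0, theta, mu, sigma, T, n_steps, n_paths):
    """Euler-Maruyama simulation of an assumed mean-reverting SDE
    dX = theta*(mu - X) dt + sigma dW for the two-party-preferred share."""
    dt = T / n_steps
    x = np.full(n_paths, x0)
    for _ in range(n_steps):
        x += theta * (mu - x) * dt \
             + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
    return x

# Probability the government leads (share > 0.5) at the maximum term T.
final = poll_paths(x0=0.52, theta=1.0, mu=0.50, sigma=0.03,
                   T=3.0, n_steps=300, n_paths=20_000)
p_win = (final > 0.5).mean()
```

In the paper this kind of win probability feeds into the PDE for the government's expected remaining life in office; here it is simply read off the simulated terminal distribution.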
Abstract:
In this paper we construct a mathematical model for the genetic regulatory network of the lactose operon. This mathematical model contains transcription and translation of the lactose permease (LacY) and a reporter gene, GFP. The probability of transcription of LacY is determined by 14 binding states out of all 50 possible binding states of the lactose operon, based on the quasi-steady-state assumption for the binding reactions, while we calculate the probability of transcription for the reporter gene GFP based on 5 binding states out of 19 possible binding states, because the binding site O2 is missing for this reporter gene. We have tested different mechanisms for the transport of thio-methylgalactoside (TMG) and the effect of different Hill coefficients on the simulated LacY expression levels. Using this mathematical model we have reproduced one of the experimental results with different LacY concentrations, which are induced by different concentrations of TMG.
Abstract:
Statistics of the estimates of tricoherence are obtained analytically for nonlinear harmonic random processes with known true tricoherence. Expressions are presented for the bias, variance, and probability distributions of estimates of tricoherence as functions of the true tricoherence and the number of realizations averaged in the estimates. The expressions are applicable to arbitrary higher-order coherence and arbitrary degree of interaction between modes. Theoretical results are compared with those obtained from numerical simulations of nonlinear harmonic random processes. Estimation of true values of tricoherence given observed values is also discussed.
Abstract:
We consider a robust filtering problem for uncertain discrete-time, homogeneous, first-order, finite-state hidden Markov models (HMMs). The class of uncertain HMMs considered is described by a conditional relative entropy constraint on measures perturbed from a nominal regular conditional probability distribution given the previous posterior state distribution and the latest measurement. Under this class of perturbations, a robust infinite horizon filtering problem is first formulated as a constrained optimization problem before being transformed via variational results into an unconstrained optimization problem; the latter can be elegantly solved using a risk-sensitive information-state based filtering.
Abstract:
Determination of the placement and rating of transformers and feeders is the main objective of basic distribution network planning. The bus voltage and the feeder current are two constraints which should be maintained within their standard ranges. Distribution network planning becomes harder when the planning area is located far from the sources of power generation and the infrastructure, mainly as a consequence of voltage drop, line loss and system reliability. A long distance to the supplied loads causes a significant voltage drop across the distribution lines. Capacitors and Voltage Regulators (VRs) can be installed to decrease the voltage drop. This long distance also increases the probability of a failure occurring, which lowers network reliability. Cross-Connections (CCs) and Distributed Generators (DGs) are devices which can be employed to improve system reliability. Another main factor which should be considered in planning distribution networks (in both rural and urban areas) is load growth. To accommodate this factor, transformers and feeders are conventionally upgraded, which incurs a large cost. Installing DGs and capacitors in a distribution network can alleviate this issue while providing other benefits. In this research, a comprehensive planning procedure is presented for distribution networks. Since a distribution network is composed of low- and medium-voltage networks, both are included in this procedure; however, the main focus of this research is on medium-voltage network planning. The main objective is to minimize the investment cost, the line loss, and the reliability indices over a study timeframe while supporting load growth. The investment cost relates to distribution network elements such as the transformers, feeders, capacitors, VRs, CCs, and DGs. The voltage drop and the feeder current, as the constraints, are maintained within their standard ranges.
In addition to minimizing the reliability and line-loss costs, the planned network should support a continual growth of loads, which is an essential concern in planning distribution networks. In this thesis, a novel segmentation-based strategy is proposed for including this factor. Using this strategy, the computation time is significantly reduced compared with the exhaustive search method while the accuracy remains acceptable. Besides accommodating load growth, this strategy is also appropriate for including practical (dynamic) load characteristics, as demonstrated in this thesis. The allocation and sizing problem has a discrete nature with several local minima, which highlights the importance of selecting a proper optimization method. A modified discrete particle swarm optimization, a heuristic method, is introduced in this research to solve this complex planning problem. Discrete nonlinear programming and a genetic algorithm, an analytical and a heuristic method respectively, are also applied to this problem to evaluate the proposed optimization method.
Abstract:
Approximately 20 years have passed since the NTSB issued its original recommendation to expedite development, certification and production of low-cost proximity warning and conflict detection systems for general aviation [1]. While some systems are in place (TCAS [2]), "see-and-avoid" remains the primary means of separation between light aircraft sharing the national airspace. The requirement for a collision avoidance or sense-and-avoid capability onboard unmanned aircraft has been identified by leading government, industry and regulatory bodies as one of the most significant challenges facing the routine operation of unmanned aerial systems (UAS) in the national airspace system (NAS) [3, 4]. In this thesis, we propose and develop a novel image-based collision avoidance system to detect and avoid an upcoming conflict scenario (with an intruder) without first estimating or filtering range. The proposed collision avoidance system (CAS) uses the relative bearing and the angular area subtended by the intruder, estimated from an image, to form a test statistic. This test statistic is used in a thresholding technique to decide if a conflict scenario is imminent. If deemed necessary, the system will command the aircraft to perform a manoeuvre based on the relative bearing and constrained by the CAS sensor field-of-view. Through the use of a simulation environment in which the UAS is mathematically modelled and a flight controller developed, we show that Monte Carlo simulations can be used to estimate the risk ratio of a Mid Air Collision (MAC) or a Near Mid Air Collision (NMAC). We also show the performance gain this system has over a simplified, bearings-only version. This performance gain is demonstrated in the form of a standard operating characteristic curve. Finally, it is shown that the proposed CAS performs at a level comparable to current manned aviation's equivalent level of safety (ELOS) expectations for Class E airspace.
In some cases, the CAS may be oversensitive in manoeuvring the aircraft when not necessary, but this constitutes a more conservative, and therefore safer, flying procedure in most instances.