914 results for "Dissociation probability"


Relevance: 10.00%

Abstract:

This paper discusses the statistical analyses used to derive bridge live load models for Hong Kong from 10 years of weigh-in-motion (WIM) data. The statistical concepts required and the terminology adopted in the development of bridge live load models are introduced. The paper includes studies of representative vehicles drawn from the large volume of WIM data collected in Hong Kong. Load-affecting parameters such as gross vehicle weight, axle weight, axle spacing and average daily number of trucks are first analyzed by various stochastic processes in order to obtain the mathematical distributions of these parameters. As a prerequisite to determining accurate bridge design loadings in Hong Kong, this study not only takes advantage of code formulation methods used internationally but also presents a new method for modelling the collected WIM data using a statistical approach.
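As a loose illustration of the parameter-fitting step described above, the sketch below fits a lognormal distribution to a synthetic sample of gross vehicle weights and reads off an upper quantile. The choice of distribution, the data and the quantile are assumptions for illustration, not the paper's actual models.

```python
# Illustrative sketch: fit a candidate distribution to gross vehicle
# weights (GVW) from WIM records. The lognormal choice and the synthetic
# data below are assumptions, not the paper's models.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
gvw_tonnes = rng.lognormal(mean=3.0, sigma=0.4, size=10_000)  # stand-in GVW sample

# Fit a lognormal by maximum likelihood (location pinned at zero).
shape, loc, scale = stats.lognorm.fit(gvw_tonnes, floc=0)

# Rough goodness-of-fit check: Kolmogorov-Smirnov test against the fitted CDF.
ks_stat, p_value = stats.kstest(gvw_tonnes, "lognorm", args=(shape, loc, scale))
print(f"fitted sigma={shape:.3f}, median={scale:.1f} t, KS p-value={p_value:.3f}")

# A design load could then be read off as an upper quantile of the fit,
# e.g. the 99.9th percentile.
print(f"99.9th percentile GVW: {stats.lognorm.ppf(0.999, shape, loc, scale):.1f} t")
```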

Relevance: 10.00%

Abstract:

Overweight and obesity are a significant cause of poor health worldwide, particularly in conjunction with low levels of physical activity (PA). PA is health-protective and essential for the physical growth and development of children, promoting physical and psychological health while simultaneously increasing the probability of remaining active as an adult. However, many obese children and adolescents face a unique set of physiological, biomechanical, and neuromuscular barriers to PA that they must overcome. It is essential to understand the influence of these barriers on an obese child's motivation to exercise, and to tailor exercise programs to the special needs of this population.

Chapter Outline
• Introduction
• Defining Physical Activity, Exercise, and Physical Fitness
• Physical Activity, Physical Fitness, and Motor Competence in Obese Children
• Physical Activity and Obesity in Children
• Physical Fitness in Obese Children
• Balance and Gait in Obese Children
• Motor Competence in Obese Children
• Physical Activity Guidelines for Obese Children
• Clinical Assessment of the Obese Child
• Physical Activity Characteristics: Mode
• Physical Activity Characteristics: Intensity
• Physical Activity Characteristics: Frequency
• Physical Activity Characteristics: Duration
• Conclusion

Relevance: 10.00%

Abstract:

Sample complexity results from computational learning theory, when applied to neural network learning for pattern classification problems, suggest that for good generalization performance the number of training examples should grow at least linearly with the number of adjustable parameters in the network. Results in this paper show that if a large neural network is used for a pattern classification problem and the learning algorithm finds a network with small weights that has small squared error on the training patterns, then the generalization performance depends on the size of the weights rather than the number of weights. For example, consider a two-layer feedforward network of sigmoid units, in which the sum of the magnitudes of the weights associated with each unit is bounded by A and the input dimension is n. We show that the misclassification probability is no more than a certain error estimate (that is related to squared error on the training set) plus A³√((log n)/m) (ignoring log A and log m factors), where m is the number of training patterns. This may explain the generalization performance of neural networks, particularly when the number of training examples is considerably smaller than the number of weights. It also supports heuristics (such as weight decay and early stopping) that attempt to keep the weights small during training. The proof techniques appear to be useful for the analysis of other pattern classifiers: when the input domain is a totally bounded metric space, we use the same approach to give upper bounds on misclassification probability for classifiers with decision boundaries that are far from the training examples.
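For a concrete feel for the result, the sketch below evaluates the complexity term A³√((log n)/m) from the bound for a few sample sizes. It is a plain numerical illustration of the scaling, not a practical generalization-error calculator.

```python
# Illustrative only: the complexity term A^3 * sqrt(log(n)/m) from the
# bound above (ignoring log A and log m factors), evaluated for a few
# sample sizes m. Note the number of weights never enters the term.
import math

def complexity_term(A: float, n: int, m: int) -> float:
    """A^3 * sqrt(log(n) / m) for weight bound A, input dimension n, m samples."""
    return A**3 * math.sqrt(math.log(n) / m)

for m in (1_000, 10_000, 100_000):
    print(m, complexity_term(A=2.0, n=100, m=m))
# Doubling A multiplies the term by 8; growing m shrinks it like 1/sqrt(m).
```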

Relevance: 10.00%

Abstract:

We investigate the behavior of the empirical minimization algorithm using various methods. We first analyze it by comparing the empirical (that is, random) structure on the class with the original one, either in an additive sense, via the uniform law of large numbers, or in a multiplicative sense, using isomorphic coordinate projections. We then show that a direct analysis of the empirical minimization algorithm yields a significantly better bound, and that the estimates we obtain are essentially sharp. The method of proof we use is based on Talagrand's concentration inequality for empirical processes.

Relevance: 10.00%

Abstract:

We study sample-based estimates of the expectation of the function produced by the empirical minimization algorithm. We investigate the extent to which one can estimate the rate of convergence of the empirical minimizer in a data dependent manner. We establish three main results. First, we provide an algorithm that upper bounds the expectation of the empirical minimizer in a completely data-dependent manner. This bound is based on a structural result due to Bartlett and Mendelson, which relates expectations to sample averages. Second, we show that these structural upper bounds can be loose, compared to previous bounds. In particular, we demonstrate a class for which the expectation of the empirical minimizer decreases as O(1/n) for sample size n, although the upper bound based on structural properties is Ω(1). Third, we show that this looseness of the bound is inevitable: we present an example that shows that a sharp bound cannot be universally recovered from empirical data.
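A small simulation can illustrate the fast O(1/n) rate mentioned above. The sketch below, an assumption-laden toy rather than the paper's construction, runs empirical minimization over threshold classifiers on noiseless data, where the empirical minimizer's expected risk is known to decay at roughly a 1/n rate.

```python
# Toy simulation (not from the paper): empirical minimization over
# threshold classifiers 1{x > t} on noiseless uniform data. The expected
# risk of the empirical minimizer decays at the fast O(1/n) rate here.
import numpy as np

rng = np.random.default_rng(1)

def empirical_minimizer_risk(n: int, trials: int = 2000) -> float:
    risks = []
    for _ in range(trials):
        x = rng.uniform(size=n)
        y = (x > 0.5).astype(int)
        # Any threshold between the largest x labelled 0 and the smallest x
        # labelled 1 has zero empirical risk; take the midpoint of that gap.
        lo = x[y == 0].max(initial=0.0)
        hi = x[y == 1].min(initial=1.0)
        t = 0.5 * (lo + hi)
        risks.append(abs(t - 0.5))  # true risk = P(x lands between t and 0.5)
    return float(np.mean(risks))

for n in (50, 100, 200, 400):
    print(n, empirical_minimizer_risk(n))  # roughly halves each time n doubles
```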

Relevance: 10.00%

Abstract:

We consider the problem of binary classification where the classifier can, for a particular cost, choose not to classify an observation. Just as in the conventional classification problem, minimization of the sample average of the cost is a difficult optimization problem. As an alternative, we propose the optimization of a certain convex loss function φ, analogous to the hinge loss used in support vector machines (SVMs). Its convexity ensures that the sample average of this surrogate loss can be efficiently minimized. We study its statistical properties. We show that minimizing the expected surrogate loss—the φ-risk—also minimizes the risk. We also study the rate at which the φ-risk approaches its minimum value. We show that fast rates are possible when the conditional probability P(Y=1|X) is unlikely to be close to certain critical values.
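The sketch below illustrates the general idea under stand-in assumptions: it trains a linear scorer with the ordinary hinge loss and abstains whenever the score falls in a low-confidence band. The paper's surrogate φ is a specific convex loss tied to the rejection cost; the plain hinge and the fixed threshold here are illustrative substitutes, not the paper's method.

```python
# Illustrative sketch only: ordinary hinge loss plus a confidence-band
# reject rule. The paper's surrogate phi and its reject rule differ; the
# cost d and the threshold below are assumptions for illustration.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(2)
X = rng.normal(size=(1000, 2))
y = (X[:, 0] + 0.5 * rng.normal(size=1000) > 0).astype(int)

clf = SGDClassifier(loss="hinge", alpha=1e-3).fit(X, y)
scores = clf.decision_function(X)

d = 0.2                      # cost of abstaining (assumed)
threshold = 0.5              # half-width of the reject band (assumed)
reject = np.abs(scores) < threshold
errors = (clf.predict(X) != y) & ~reject
avg_cost = errors.mean() + d * reject.mean()  # 0-1 cost on decisions + d per rejection
print(f"reject rate={reject.mean():.2f}, average cost={avg_cost:.3f}")
```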

Relevance: 10.00%

Abstract:

One of the nice properties of kernel classifiers such as SVMs is that they often produce sparse solutions. However, the decision functions of these classifiers cannot always be used to estimate the conditional probability of the class label. We investigate the relationship between these two properties and show that these are intimately related: sparseness does not occur when the conditional probabilities can be unambiguously estimated. We consider a family of convex loss functions and derive sharp asymptotic results for the fraction of data that becomes support vectors. This enables us to characterize the exact trade-off between sparseness and the ability to estimate conditional probabilities for these loss functions.
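A quick experiment along these lines, under assumed data and kernel choices, is to train a hinge-loss SVM on overlapping classes and measure the fraction of training points that become support vectors:

```python
# Illustrative experiment: fraction of training points that become
# support vectors for a hinge-loss SVM. The data and kernel are assumed
# for illustration, not taken from the paper's asymptotic analysis.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)
n = 2000
X = rng.normal(size=(n, 2))
# Overlapping classes, so the conditional probabilities are nontrivial.
y = (X[:, 0] + rng.normal(scale=1.0, size=n) > 0).astype(int)

svm = SVC(kernel="rbf", C=1.0).fit(X, y)
print(f"support vector fraction: {len(svm.support_) / n:.2f}")
# Losses that permit conditional-probability estimation (e.g. logistic)
# do not yield sparse solutions in this sense.
```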

Relevance: 10.00%

Abstract:

The risk, or probability of error, of the classifier produced by the AdaBoost algorithm is investigated. In particular, we consider the stopping strategy to be used in AdaBoost to achieve universal consistency. We show that, provided AdaBoost is stopped after n^(1-ε) iterations, for sample size n and ε ∈ (0,1), the sequence of risks of the classifiers it produces approaches the Bayes risk.
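A minimal sketch of this stopping rule, with an assumed dataset and ε, is to cap the number of boosting rounds at ⌊n^(1-ε)⌋:

```python
# Illustrative sketch of the stopping rule described above: run AdaBoost
# for floor(n^(1-eps)) rounds for sample size n. The dataset and eps are
# assumptions for illustration.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(4)
n, eps = 5000, 0.5
X = rng.normal(size=(n, 5))
y = (X[:, 0] * X[:, 1] > 0).astype(int)

n_rounds = int(n ** (1 - eps))  # n^(1-eps) iterations; here sqrt(5000) ~ 70
model = AdaBoostClassifier(n_estimators=n_rounds).fit(X, y)  # default base learner: decision stumps
print(f"rounds={n_rounds}, training accuracy={model.score(X, y):.3f}")
```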

Relevance: 10.00%

Abstract:

Early models of bankruptcy prediction employed financial ratios drawn from pre-bankruptcy financial statements and performed well both in-sample and out-of-sample. Since then there has been an ongoing effort in the literature to develop models with even greater predictive performance. A significant innovation in the literature was the introduction into bankruptcy prediction models of capital market data such as excess stock returns and stock return volatility, along with the application of the Black–Scholes–Merton option-pricing model. In this note, we test five key bankruptcy models from the literature using an up-to-date data set and find that they each contain unique information regarding the probability of bankruptcy but that their performance varies over time. We build a new model comprising key variables from each of the five models and add a new variable that proxies for the degree of diversification within the firm. The degree of diversification is shown to be negatively associated with the risk of bankruptcy. This more general model outperforms the existing models in a variety of in-sample and out-of-sample tests.
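The kind of combined model described here could be sketched as a logistic regression over predictors from the ratio-based and market-based models plus a diversification proxy. Everything in the snippet below, the variable meanings, coefficients and data, is a hypothetical placeholder rather than the paper's specification.

```python
# Hypothetical sketch: logistic regression combining ratio-based and
# market-based predictors with a diversification proxy. All variables and
# the simulated data are illustrative placeholders, not the paper's model.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 3000
X = np.column_stack([
    rng.normal(size=n),   # e.g. a leverage ratio (placeholder)
    rng.normal(size=n),   # e.g. excess stock return (placeholder)
    rng.normal(size=n),   # e.g. stock return volatility (placeholder)
    rng.normal(size=n),   # diversification proxy (expected sign: negative)
])
# Simulate bankruptcy labels from an assumed true model.
logit = -1.5 + 0.8 * X[:, 0] - 0.6 * X[:, 1] + 0.7 * X[:, 2] - 0.5 * X[:, 3]
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

model = sm.Logit(y, sm.add_constant(X)).fit(disp=0)
print(model.params.round(2))  # the diversification coefficient comes out negative
```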

Relevance: 10.00%

Abstract:

Gaussian mixture models (GMMs) have become an established means of modeling feature distributions in speaker recognition systems. It is useful for experimentation and practical implementation purposes to develop and test these models in an efficient manner, particularly when computational resources are limited. A method of combining vector quantization (VQ) with single multi-dimensional Gaussians is proposed to rapidly generate a robust model approximation to the Gaussian mixture model. A fast method of testing these systems is also proposed and implemented. Results on the NIST 1996 Speaker Recognition Database suggest comparable, and in some cases improved, verification performance relative to the traditional GMM-based analysis scheme. In addition, previous research on the task of speaker identification indicated similar system performance between the VQ-Gaussian-based technique and GMMs.
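A minimal sketch of the VQ-Gaussian idea, under assumed details such as the cluster count and full covariances, is to partition the feature vectors with k-means and fit one multivariate Gaussian per cell, weighting each by its occupancy:

```python
# Illustrative sketch: vector quantization (k-means) followed by one
# multivariate Gaussian per cell, as a fast approximation to an
# EM-trained GMM. Cluster count, covariance form and the stand-in
# features are assumptions, not the paper's configuration.
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.cluster import KMeans

rng = np.random.default_rng(6)
dim = 12
features = rng.normal(size=(5000, dim))   # stand-in acoustic feature vectors

k = 32
km = KMeans(n_clusters=k, n_init=3).fit(features)

weights, gaussians = [], []
for j in range(k):
    cell = features[km.labels_ == j]
    weights.append(len(cell) / len(features))
    cov = np.cov(cell.T) + 1e-6 * np.eye(dim)   # small ridge for stability
    gaussians.append(multivariate_normal(cell.mean(axis=0), cov))

def log_likelihood(x):
    """Mixture log-likelihood of one feature vector under the VQ-Gaussian model."""
    return np.log(sum(w * g.pdf(x) for w, g in zip(weights, gaussians)))

print(log_likelihood(features[0]))
```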

Relevance: 10.00%

Abstract:

Maternal and infant mortality is a global health issue with a significant social and economic impact. Each year, over half a million women worldwide die due to complications related to pregnancy or childbirth, four million infants die in the first 28 days of life, and eight million infants die in the first year. Ninety-nine percent of maternal and infant deaths are in developing countries. Reducing maternal and infant mortality is among the key international development goals. In China, the national maternal mortality ratio and infant mortality rate were reduced greatly in the past two decades, yet a large discrepancy remains between urban and rural areas. To address this problem, a large-scale Safe Motherhood Programme was initiated in 2000; it was implemented in Guangxi in 2003. Interventions in the programme included both demand-side and supply-side interventions focusing on increasing health service use and improving birth outcomes. Little is known about the effects and economic outcomes of the Safe Motherhood Programme in Guangxi, although it has been implemented for seven years.

The aim of this research is to estimate the effectiveness and cost-effectiveness of the interventions in the Safe Motherhood Programme in Guangxi, China. The objectives of this research are:
1. To evaluate whether the changes in health service use and birth outcomes are associated with the interventions in the Safe Motherhood Programme.
2. To estimate the cost-effectiveness of the interventions in the Safe Motherhood Programme and quantify the uncertainty surrounding the decision.
3. To assess the expected value of perfect information associated with both the whole decision and individual parameters, and interpret the findings to inform priority setting in further research and policy making in this area.

A quasi-experimental study design was used to assess the effectiveness of the programme in increasing health service use and improving birth outcomes. The study subjects were 51 intervention counties and 30 control counties. Data on health service use, birth outcomes and socio-economic factors from 2001 to 2007 were collected from the programme database and statistical yearbooks. Based on profile plots of the data, general linear mixed models were used to evaluate the effectiveness of the programme while controlling for the effects of baseline levels of the response variables, changes in socio-economic factors over time, and correlations among repeated measurements from the same county. Redundant multicollinear variables were deleted from the mixed models using the results of the multicollinearity diagnoses. For each response variable, the best covariance structure was selected from 15 alternatives according to fit statistics including the Akaike information criterion, the finite-population corrected Akaike information criterion, and Schwarz's Bayesian information criterion. Residual diagnostics were used to validate the model assumptions. Statistical inferences were made to show the effect of the programme on health service use and birth outcomes.

A decision analytic model was developed to evaluate the cost-effectiveness of the programme, quantify the decision uncertainty, and estimate the expected value of perfect information associated with the decision. The model was used to describe the transitions between health states for women and infants and to reflect the changes in both costs and health benefits associated with implementing the programme. Results gained from the mixed models and other relevant evidence identified were synthesised appropriately to inform the input parameters of the model. Incremental cost-effectiveness ratios of the programme were calculated for the two groups of intervention counties over time. Uncertainty surrounding the parameters was addressed using probabilistic sensitivity analysis, and uncertainty relating to model assumptions was handled using scenario analysis. Finally, the expected value of perfect information for both the whole model and individual parameters in the model was estimated to inform priority setting for further research in this area.

The annual change rates of the antenatal care rate and the institutionalised delivery rate improved significantly in the intervention counties after the programme was implemented. Significant improvements were also found in the annual change rates of the maternal mortality ratio, the infant mortality rate, the incidence rate of neonatal tetanus and the mortality rate of neonatal tetanus in the intervention counties after the implementation of the programme. The annual change rate of the neonatal mortality rate also improved, although the improvement was only close to statistical significance. The influences of the socio-economic factors on the health service use indicators and birth outcomes were identified. Rural income per capita had a significant positive impact on the health service use indicators, and a significant negative impact on the birth outcomes. The number of beds in healthcare institutions per 1,000 population and the number of rural telephone subscribers per 1,000 population were found to be positively and significantly related to the institutionalised delivery rate. The length of highway per square kilometre negatively influenced the maternal mortality ratio. The percentage of employed persons in the primary industry had a significant negative impact on the institutionalised delivery rate, and a significant positive impact on the infant mortality rate and neonatal mortality rate.

The incremental costs of implementing the programme over the existing practice were US $11.1 million from the societal perspective, and US $13.8 million from the perspective of the Ministry of Health. Overall, 28,711 life years were generated by the programme, producing an overall incremental cost-effectiveness ratio of US $386 from the societal perspective, and US $480 from the perspective of the Ministry of Health, both of which were below the threshold willingness-to-pay ratio of US $675. The expected net monetary benefit generated by the programme was US $8.3 million from the societal perspective, and US $5.5 million from the perspective of the Ministry of Health. The overall probability that the programme was cost-effective was 0.93 and 0.89 from the two perspectives, respectively. The incremental cost-effectiveness ratio of the programme was insensitive to the different estimates of the three parameters relating to the model assumptions. Further research could be conducted to reduce the uncertainty surrounding the decision; the upper limit of investment in such research is US $0.6 million from the societal perspective, and US $1.3 million from the perspective of the Ministry of Health. It would also be worthwhile to obtain a more precise estimate of the improvement in the infant mortality rate: the population expected value of perfect information associated with this parameter was US $0.99 million from the societal perspective, and US $1.14 million from the perspective of the Ministry of Health.

The findings from this study show that the interventions in the Safe Motherhood Programme were both effective and cost-effective in increasing health service use and improving birth outcomes in rural areas of Guangxi, China. The programme therefore represents a good public health investment and should be adopted and further expanded to an even broader area if possible. This research provides economic evidence to inform efficient decision making in improving maternal and infant health in developing countries.
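The headline cost-effectiveness figures from the societal perspective can be checked with a short worked computation using the numbers reported above:

```python
# Worked check of the headline figures reported above, societal
# perspective: incremental cost US $11.1 million, 28,711 life years
# gained, willingness-to-pay threshold US $675 per life year.
incremental_cost = 11.1e6      # US $, societal perspective
life_years_gained = 28_711
wtp_threshold = 675            # US $ per life year

icer = incremental_cost / life_years_gained
net_monetary_benefit = wtp_threshold * life_years_gained - incremental_cost

print(f"ICER: US ${icer:,.0f} per life year")  # ~387, matching the reported US $386
print(f"Net monetary benefit: US ${net_monetary_benefit / 1e6:.1f} million")  # ~8.3
```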

Relevance: 10.00%

Abstract:

Stochastic models for competing clonotypes of T cells by multivariate, continuous-time, discrete-state Markov processes have been proposed in the literature by Stirk, Molina-París and van den Berg (2008). A stochastic modelling framework is important because of rare events associated with small populations of some critical cell types. Usually, computational methods for these problems employ a trajectory-based approach, based on Monte Carlo simulation. This is partly because the complementary probability density function (PDF) approaches can be expensive, but here we describe some efficient PDF approaches that directly solve the governing equations, known as the Master Equation. These computations are made very efficient through an approximation of the state space by the Finite State Projection and through the use of Krylov subspace methods when evaluating the matrix exponential. These computational methods allow us to explore the evolution of the PDFs associated with these stochastic models, and bimodal distributions arise in some parameter regimes. Time-dependent propensities naturally arise in immunological processes due to, for example, age-dependent effects. Incorporating time-dependent propensities into the framework of the Master Equation significantly complicates the corresponding computational methods, but here we describe an efficient approach via Magnus formulas. Although this contribution focuses on the example of competing clonotypes, the general principles are relevant to multivariate Markov processes and provide fundamental techniques for computational immunology.
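A minimal sketch of the PDF approach described above, using a simple birth-death process as a stand-in for the clonotype models: truncate the state space (Finite State Projection) and evolve the Master Equation with SciPy's Krylov-based matrix-exponential routine. The rates and truncation size are assumptions for illustration.

```python
# Minimal FSP sketch on a birth-death process (a stand-in for the
# clonotype models): build the truncated generator A of the Master
# Equation dp/dt = A p and evolve it with a Krylov-based expm routine.
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import expm_multiply

N = 200                  # FSP truncation: keep states 0..N-1
birth, death = 5.0, 0.1  # constant birth rate, per-capita death rate (assumed)
n = np.arange(N)

A = diags(
    [birth * np.ones(N - 1),    # births n -> n+1 (subdiagonal)
     -(birth + death * n),      # total outflow (diagonal)
     death * n[1:]],            # deaths n -> n-1 (superdiagonal)
    offsets=[-1, 0, 1], format="csc",
)

p0 = np.zeros(N); p0[0] = 1.0   # start with zero cells
p_t = expm_multiply(A, p0, start=0.0, stop=10.0, num=5)  # PDFs at 5 time points
print(p_t[-1].sum())   # probability retained inside the FSP truncation (close to 1)
```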

Relevance: 10.00%

Abstract:

The stochastic simulation algorithm was introduced by Gillespie and, in a different form, by Kurtz. There have been many attempts at accelerating the algorithm without deviating from the behavior of the simulated system. The crux of the explicit τ-leaping procedure is the use of Poisson random variables to approximate the number of occurrences of each type of reaction event during a carefully selected time period, τ. This method is acceptable provided the leap condition is met: that no propensity function changes "significantly" during any time-step. Using this method, there is a possibility that species numbers can artificially become negative. Several recent papers have demonstrated methods that avoid this situation. One such method classifies as critical those reactions in danger of sending species populations negative. At most one of these critical reactions is allowed to occur in the next time-step. We argue that the criticality of a reactant species and its dependent reaction channels should be related to the probability of the species number becoming negative. This way, only reactions that, if fired, produce a high probability of driving a reactant population negative are labeled critical. The number of firings of more reaction channels can then be approximated using Poisson random variables, speeding up the simulation while maintaining accuracy. In implementing this revised method of criticality selection, we make use of the probability distribution from which the random variable describing the change in species number is drawn. We give several numerical examples to demonstrate the effectiveness of our new method.
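The sketch below illustrates the probability-based criticality test for a single reaction channel under assumed rates: it estimates the probability that a Poisson number of firings would exceed the available molecules and treats the channel as critical only when that risk is high. The tolerance and setup are illustrative, not the paper's exact procedure.

```python
# Illustrative tau-leap step for one reaction channel with a
# probability-based criticality test: if the Poisson draw is too likely
# to exceed the available molecules, fall back to an exact (SSA) step.
# Rates, tolerance and the single-channel setup are assumptions.
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(7)

def tau_leap_step(count, propensity, consumed_per_firing, tau, risk_tol=1e-3):
    """Advance one species by one reaction channel over a leap of length tau."""
    mean_firings = propensity * tau
    max_safe = count // consumed_per_firing
    # Probability that the Poisson number of firings exceeds what is available.
    p_negative = poisson.sf(max_safe, mean_firings)
    if p_negative > risk_tol:
        return None  # channel is critical: handle with an exact SSA step instead
    firings = rng.poisson(mean_firings)
    return count - consumed_per_firing * min(firings, max_safe)

print(tau_leap_step(count=1000, propensity=50.0, consumed_per_firing=1, tau=0.1))
print(tau_leap_step(count=5, propensity=50.0, consumed_per_firing=1, tau=0.1))  # None: critical
```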

Relevance: 10.00%

Abstract:

This thesis examines the ways in which citizens find out about socio-political issues. The project set out to discover how audience characteristics such as scepticism towards the media, gratifications sought, need for cognition and political interest influence information selection. While most previous information choice studies have focused on how individuals select from a narrow range of media types, this thesis considered a much wider sweep of the information landscape. This approach was taken to obtain an understanding of information choices in a more authentic context: in everyday life, people are not simply restricted to one or two news sources. Rather, they may obtain political information from a vast range of information sources, including media sources (e.g. radio, television, newspapers) and sources from beyond the media (e.g. interpersonal sources, public speaking events, social networking websites). Thus, the study included both media and non-news media information sources.

Data collection for the project consisted of a written, postal survey. The survey was administered to a probability sample in the greater Brisbane region, the third largest city in Australia. Data was collected during March and April 2008, approximately four months after the 2007 Australian Federal Election; hence, the study was conducted in a non-election context. A total of 585 usable surveys were obtained. In addition to measuring the attitudinal characteristics listed above, respondents were surveyed as to which information sources (e.g. television shows, radio stations, websites and festivals) they usually use to find out about socio-political issues. Multiple linear regression analysis was conducted to explore patterns of influence between the audience characteristics and information consumption patterns.

The results of this analysis indicated an apparent difference between the way citizens use news media sources and the way they use information sources from beyond the news media. In essence, it appears that non-news media information sources are used very deliberately to seek socio-political information, while media sources are used in a less purposeful way. If media use in a non-election context, such as that of the present study, is not primarily concerned with deliberate information seeking, media use must instead have other primary purposes, with political information acquisition as either a secondary driver or a by-product of that primary purpose. It appears, then, that political information consumption in a media-saturated society is more about routine ‘practices’ than it is about ‘information seeking’. The suggestion that media use is no longer primarily concerned with information seeking, but rather is simply a behaviour which occurs within the broader set of everyday practices, reflects Couldry’s (2004) media-as-practice paradigm.

These findings highlight the need for more authentic and holistic contexts for media research. It is insufficient to consider information choices in isolation, or even from a wider range of information sources such as that incorporated in the present study. Future media research must take greater account of the broader social contexts and practices in which media-oriented behaviours occur. The findings also call into question the previously assumed centrality of trust to information selection decisions: citizens regularly use media they do not trust to find out about politics. If people are willing to use information sources they do not trust for democratically important topics such as politics, it is important that citizens possess the media literacy skills to effectively understand and evaluate the information they are presented with. Without the application of such media literacy skills, a steady diet of ‘fast food’ media may result in uninformed or misinformed voting decisions, which have implications for the effectiveness of democratic processes. This research has emphasized the need for further holistic and authentically contextualised media use research, to better understand how citizens use information sources to find out about important topics such as politics.
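The regression step described above might look like the following sketch, in which the predictors and data are hypothetical placeholders rather than the survey's actual variables:

```python
# Hypothetical sketch of the multiple linear regression described above:
# audience characteristics predicting use of one information source.
# The variables and simulated data are illustrative placeholders.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
n = 585  # matches the usable sample size reported above

predictors = np.column_stack([
    rng.normal(size=n),   # media scepticism (placeholder)
    rng.normal(size=n),   # need for cognition (placeholder)
    rng.normal(size=n),   # political interest (placeholder)
])
source_use = 0.1 * predictors[:, 0] + 0.4 * predictors[:, 2] + rng.normal(size=n)

model = sm.OLS(source_use, sm.add_constant(predictors)).fit()
print(model.params.round(2))  # fitted influence of each characteristic
```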

Relevance: 10.00%

Abstract:

Modelling how a word is activated in human memory is an important requirement for determining the probability of recall of a word in an extra-list cueing experiment. The spreading activation, spooky-action-at-a-distance and entanglement models have all been used to model the activation of a word. Recently a hypothesis was put forward that the mean activation levels of the respective models are ordered as follows: spreading ≤ entanglement ≤ spooky-action-at-a-distance. This article investigates this hypothesis by means of a substantial empirical analysis of each model using the University of South Florida word association, rhyme and word fragment norms.
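As a heavily simplified illustration of what a spreading-activation computation over free-association norms can look like, the toy sketch below scores a target word by the direct and one-mediator strength reaching it from each cue. The miniature norms and the exact formula are assumptions for illustration, not the article's models.

```python
# Toy spreading-activation sketch over a miniature free-association
# network. The norms and the direct-plus-mediated scoring rule below are
# illustrative assumptions, not the article's formulation.
import numpy as np

words = ["planet", "earth", "moon", "star"]
# strengths[i, j] ~ probability that word j is produced as an associate of word i
strengths = np.array([
    [0.0, 0.5, 0.3, 0.2],
    [0.4, 0.0, 0.4, 0.2],
    [0.3, 0.5, 0.0, 0.2],
    [0.2, 0.2, 0.3, 0.0],
])

two_step = strengths @ strengths      # strength via one mediating associate
activation = strengths + two_step     # direct plus one-mediator paths
target = words.index("moon")
print(f"activation of 'moon' from each cue: {activation[:, target].round(3)}")
```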