990 results for conditional relative entropy
Abstract:
Illegal pedestrian behaviour is common and is reported as a factor in many pedestrian crashes. Since walking is being promoted for its health and environmental benefits, minimisation of its associated risks is of interest. The risk associated with illegal road crossing is unclear, and better information would assist in setting a rationale for enforcement and priorities for public education. An observation survey of pedestrian behaviour was conducted at signalised intersections in the Brisbane CBD (Queensland, Australia) on typical workdays, using behavioural categories that were identifiable in police crash reports. The survey confirmed high levels of crossing against the lights, or close enough to the lights that they should legally have been used. Measures of exposure for crossing legally, against the lights, and close to the lights were generated by weighting the observation data. Relative risk ratios were calculated for these categories using crash data from the observation sites and adjacent midblocks. Crossing against the lights and crossing close to the lights both exhibited a crash risk per crossing event approximately eight times that of legal crossing at signalised intersections. The implications of these results for enforcement and education are discussed, along with the limitations of the study.
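The relative risk ratio described above can be sketched as a crash rate per crossing event for one category divided by the rate for legal crossing. This is only an illustration: the function names and the numbers in the usage note are hypothetical and are not taken from the study.

```python
def crash_rate(crashes, exposure):
    """Crashes per crossing event, given a weighted exposure estimate."""
    return crashes / exposure


def relative_risk(crashes_cat, exposure_cat, crashes_ref, exposure_ref):
    """Ratio of the category's crash rate to the reference (legal crossing) rate."""
    return crash_rate(crashes_cat, exposure_cat) / crash_rate(crashes_ref, exposure_ref)
```

For instance, with made-up inputs of 8 crashes over 1,000 weighted crossings in one category and 10 crashes over 10,000 legal crossings, `relative_risk(8, 1000, 10, 10000)` gives a ratio of about 8.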
Abstract:
This article explores two matrix methods to induce the "shades of meaning" (SoM) of a word. A matrix representation of a word is computed from a corpus of traces based on the given word. Non-negative Matrix Factorisation (NMF) and Singular Value Decomposition (SVD) compute a set of vectors, each corresponding to a potential shade of meaning. The two methods were evaluated based on loss of conditional entropy with respect to two sets of manually tagged data. One set reflects concepts generally appearing in text, and the second set comprises words used for investigations into word sense disambiguation. Results show that NMF consistently outperforms SVD for inducing both SoM of general concepts and word senses. The problem of inducing the shades of meaning of a word is more subtle than that of word sense induction and hence relevant to thematic analysis of opinion, where nuances of opinion can arise.
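One plausible reading of "loss of conditional entropy" is the reduction in uncertainty about the manual tag once the induced shade is known, H(tag) - H(tag | shade). The sketch below assumes that reading; the function names and the toy labels in the usage note are illustrative and not from the article.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(labels) in bits."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())


def conditional_entropy(shades, tags):
    """H(tag | shade) in bits, estimated from paired labels."""
    n = len(shades)
    joint = Counter(zip(shades, tags))
    shade_counts = Counter(shades)
    h = 0.0
    for (shade, _tag), count in joint.items():
        p_joint = count / n
        p_tag_given_shade = count / shade_counts[shade]
        h -= p_joint * math.log2(p_tag_given_shade)
    return h


def entropy_loss(shades, tags):
    """Assumed evaluation score: H(tag) - H(tag | shade); larger is better."""
    return entropy(tags) - conditional_entropy(shades, tags)
```

With tags `["a", "a", "b", "b"]`, an induced assignment `[0, 0, 1, 1]` removes all uncertainty about the tag, whereas `[0, 1, 0, 1]` removes none.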
Abstract:
Youth sports teams are usually grouped into yearly age groups based on a fixed cut-off date (September 1st in the UK and January 1st in Australia). Children born just after this cut-off will be the oldest and most mature in their age group. This gives them an advantage in competitive sport, an advantage which has persisted into adulthood as shown by seasonal patterns in the dates of birth of professional ice hockey, football and basketball players. We were interested in whether a similar seasonal pattern exists in professional Australian Football League (AFL) players. We examined all AFL players in the 2009 season excluding foreign-born players. We compared the observed number of players born in each month with the expected number based on national statistics. There was a marked and statistically significant seasonality in players' dates of birth. There were 33% more players than expected with dates of birth in January, and 25% fewer in December. Players who are relatively older in youth AFL teams have a better chance of turning professional.
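The comparison of observed and expected monthly counts can be sketched as below. This assumes expected counts are the player total scaled by each month's share of national births, and it uses a Pearson chi-square statistic as one possible test; the abstract does not say which test the authors used, and all names here are hypothetical.

```python
def expected_counts(total_players, national_births_by_month):
    """Expected players per month if birth month had no effect on selection."""
    total_births = sum(national_births_by_month)
    return [total_players * b / total_births for b in national_births_by_month]


def percent_excess(observed, expected):
    """Percentage by which each observed count exceeds its expected count."""
    return [100.0 * (o - e) / e for o, e in zip(observed, expected)]


def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic for observed versus expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))
```

A figure such as "33% more players than expected in January" corresponds to a `percent_excess` value of 33 for the first month.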
Relative income, happiness, and utility: an explanation for the Easterlin paradox and other puzzles
Abstract:
The well-known Easterlin paradox points out that average happiness has remained constant over time despite sharp rises in GNP per head. At the same time, a micro literature has typically found positive correlations between individual income and individual measures of subjective well-being. This paper suggests that these two findings are consistent with the presence of relative income terms in the utility function. Income may be evaluated relative to others (social comparison) or to oneself in the past (habituation). We review the evidence on relative income from the subjective well-being literature. We also discuss the relation (or not) between happiness and utility, and discuss some nonhappiness research (behavioral, experimental, neurological) related to income comparisons. We last consider how relative income in the utility function can affect economic models of behavior in the domains of consumption, investment, economic growth, savings, taxation, labor supply, wages, and migration.
Abstract:
The main objective of this PhD was to further develop Bayesian spatio-temporal models (specifically the Conditional Autoregressive (CAR) class of models), for the analysis of sparse disease outcomes such as birth defects. The motivation for the thesis arose from problems encountered when analyzing a large birth defect registry in New South Wales. The specific components and related research objectives of the thesis were developed from gaps in the literature on current formulations of the CAR model, and health service planning requirements. Data from a large probabilistically-linked database from 1990 to 2004, consisting of fields from two separate registries: the Birth Defect Registry (BDR) and Midwives Data Collection (MDC) were used in the analyses in this thesis. The main objective was split into smaller goals. The first goal was to determine how the specification of the neighbourhood weight matrix will affect the smoothing properties of the CAR model, and this is the focus of chapter 6. Secondly, I hoped to evaluate the usefulness of incorporating a zero-inflated Poisson (ZIP) component as well as a shared-component model in terms of modeling a sparse outcome, and this is carried out in chapter 7. The third goal was to identify optimal sampling and sample size schemes designed to select individual level data for a hybrid ecological spatial model, and this is done in chapter 8. Finally, I wanted to put together the earlier improvements to the CAR model, and along with demographic projections, provide forecasts for birth defects at the SLA level. Chapter 9 describes how this is done. For the first objective, I examined a series of neighbourhood weight matrices, and showed how smoothing the relative risk estimates according to similarity by an important covariate (i.e. maternal age) helped improve the model’s ability to recover the underlying risk, as compared to the traditional adjacency (specifically the Queen) method of applying weights. 
Next, to address the sparseness and excess zeros commonly encountered in the analysis of rare outcomes such as birth defects, I compared a few models, including an extension of the usual Poisson model to encompass excess zeros in the data. This was achieved via a mixture model, which also encompassed the shared component model to improve on the estimation of sparse counts through borrowing strength across a shared component (e.g. latent risk factor/s) with the referent outcome (caesarean section was used in this example). Using the Deviance Information Criterion (DIC), I showed how the proposed model performed better than the usual models, but only when both outcomes shared a strong spatial correlation. The next objective involved identifying the optimal sampling and sample size strategy for incorporating individual-level data with areal covariates in a hybrid study design. I performed extensive simulation studies, evaluating thirteen different sampling schemes along with variations in sample size. This was done in the context of an ecological regression model that incorporated spatial correlation in the outcomes, as well as accommodating both individual and areal measures of covariates. Using the Average Mean Squared Error (AMSE), I showed how a simple random sample of 20% of the SLAs, followed by selecting all cases in the SLAs chosen, along with an equal number of controls, provided the lowest AMSE. The final objective involved combining the improved spatio-temporal CAR model with population (i.e. women) forecasts, to provide 30-year annual estimates of birth defects at the Statistical Local Area (SLA) level in New South Wales, Australia. The projections were illustrated using sixteen different SLAs, representing the various areal measures of socio-economic status and remoteness. A sensitivity analysis of the assumptions used in the projection was also undertaken.
By the end of the thesis, I will show how challenges in the spatial analysis of rare diseases such as birth defects can be addressed, by specifically formulating the neighbourhood weight matrix to smooth according to a key covariate (i.e. maternal age), incorporating a ZIP component to model excess zeros in outcomes and borrowing strength from a referent outcome (i.e. caesarean counts). An efficient strategy to sample individual-level data and sample size considerations for rare disease will also be presented. Finally, projections in birth defect categories at the SLA level will be made.
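As a rough illustration of the zero-inflated Poisson (ZIP) component mentioned in this abstract, the sketch below gives the standard ZIP probability mass function: a mixture of a point mass at zero (weight `pi`) and a Poisson count with mean `lam`. It shows only the textbook mixture and omits the spatial CAR and shared-component structure of the thesis; the parameter names are assumptions.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson with mean lam and zero-inflation pi."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # Zeros arise either structurally (pi) or from the Poisson part.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson
```

Setting `pi = 0` recovers the ordinary Poisson model, while any `pi > 0` puts more mass on zero counts than the Poisson alone would.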
Abstract:
In this paper, we propose a multivariate GARCH model with a time-varying conditional correlation structure. The new double smooth transition conditional correlation (DSTCC) GARCH model extends the smooth transition conditional correlation (STCC) GARCH model of Silvennoinen and Teräsvirta (2005) by including another variable according to which the correlations change smoothly between states of constant correlations. A Lagrange multiplier test is derived to test the constancy of correlations against the DSTCC-GARCH model, and another one to test for another transition in the STCC-GARCH framework. In addition, other specification tests, with the aim of aiding the model building procedure, are considered. Analytical expressions for the test statistics and the required derivatives are provided. Applying the model to the stock and bond futures data, we discover that the correlation pattern between them has dramatically changed around the turn of the century. The model is also applied to a selection of world stock indices, and we find evidence for an increasing degree of integration in the capital markets.
Abstract:
We explore the empirical usefulness of conditional coskewness to explain the cross-section of equity returns. We find that coskewness is an important determinant of the returns to equity, and that the pricing relationship varies through time. In particular we find that when the conditional market skewness is positive investors are willing to sacrifice 7.87% annually per unit of gamma (a standardized measure of coskewness risk) while they only demand a premium of 1.80% when the market is negatively skewed. A similar picture emerges from the coskewness factor of Harvey and Siddique (Harvey, C., Siddique, A., 2000a. Conditional skewness in asset pricing tests. Journal of Finance 55, 1263–1295.) (a portfolio that is long stocks with small coskewness with the market and short high coskewness stocks) which earns 5.00% annually when the market is positively skewed but only 2.81% when the market is negatively skewed. The conditional two-moment CAPM and a conditional Fama and French (Fama, E., French, K., 1992. The cross-section of expected returns. Journal of Finance 47, 427–465.) three-factor model are rejected, but a model which includes coskewness is not rejected by the data. The model also passes a structural break test which many existing asset pricing models fail.
Abstract:
Why so many people pay their taxes, even though fines and audit probability are low, is a central question in the tax compliance literature. Positing a homo oeconomicus having a refined motivation structure sheds light on this puzzle. This paper provides empirical evidence for the relevance of conditional cooperation, using survey data from 30 West and East European countries. We find a high correlation between perceived tax evasion and tax morale. The results remain robust after exploiting endogeneity and conducting several robustness tests. We also observe a strong positive correlation between institutional quality and tax morale. Keywords: Tax morale; Tax compliance; Tax evasion; Pro-social behavior; Institutions
Abstract:
Habitat models are widely used in ecology; however, there are relatively few studies of rare species, primarily because of a paucity of survey records and a lack of robust means of assessing the accuracy of modelled spatial predictions. We investigated the potential of compiled ecological data in developing habitat models for Macadamia integrifolia, a vulnerable mid-stratum tree endemic to lowland subtropical rainforests of southeast Queensland, Australia. We compared performance of two binomial models, Classification and Regression Trees (CART) and Generalised Additive Models (GAM), with Maximum Entropy (MAXENT) models developed from (i) presence records and available absence data and (ii) presence records and background data. The GAM model was the best performer across the range of evaluation measures employed; however, all models were assessed as potentially useful for informing in situ conservation of M. integrifolia. A significant loss in the amount of M. integrifolia habitat has occurred (p < 0.05), with only 37% of former habitat (pre-clearing) remaining in 2003. Remnant patches are significantly smaller, have larger edge-to-area ratios and are more isolated from each other compared to pre-clearing configurations (p < 0.05). Whilst the network of suitable habitat patches is still largely intact, there are numerous smaller patches that are more isolated in the contemporary landscape compared with their connectedness before clearing. These results suggest that in situ conservation of M. integrifolia may be best achieved through a landscape approach that considers the relative contribution of small remnant habitat fragments to the species as a whole, as well as facilitating connectivity among the entire network of habitat patches.
Abstract:
The traditional searching method for model-order selection in linear regression is a nested full-parameters-set searching procedure over the desired orders, which we call full-model order selection. On the other hand, a method for model-selection searches for the best sub-model within each order. In this paper, we propose using the model-selection searching method for model-order selection, which we call partial-model order selection. We show by simulations that the proposed searching method gives better accuracies than the traditional one, especially for low signal-to-noise ratios over a wide range of model-order selection criteria (both information theoretic based and bootstrap-based). Also, we show that for some models the performance of the bootstrap-based criterion improves significantly by using the proposed partial-model selection searching method.
Index Terms: Model order estimation, model selection, information theoretic criteria, bootstrap
1. INTRODUCTION
Several model-order selection criteria can be applied to find the optimal order. Some of the more commonly used information theoretic-based procedures include Akaike’s information criterion (AIC) [1], corrected Akaike (AICc) [2], minimum description length (MDL) [3], normalized maximum likelihood (NML) [4], Hannan-Quinn criterion (HQC) [5], conditional model-order estimation (CME) [6], and the efficient detection criterion (EDC) [7]. From a practical point of view, it is difficult to decide which model order selection criterion to use. Many of them perform reasonably well when the signal-to-noise ratio (SNR) is high. The discrepancies in their performance, however, become more evident when the SNR is low. In those situations, the performance of the given technique is not only determined by the model structure (say a polynomial trend versus a Fourier series) but, more importantly, by the relative values of the parameters within the model. This makes the comparison between the model-order selection algorithms difficult as within the same model with a given order one could find an example for which one of the methods performs favourably well or fails [6, 8].
Our aim is to improve the performance of the model order selection criteria in cases where the SNR is low by considering a model-selection searching procedure that takes into account not only the full-model order search but also a partial model order search within the given model order. Understandably, the improvement in the performance of the model order estimation is at the expense of additional computational complexity.
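The difference between the two search strategies can be sketched for a polynomial trend scored with AIC. This is a minimal illustration under stated assumptions, not the authors' procedure: it assumes the Gaussian form AIC = n ln(RSS/n) + 2k, treats a "partial model" of order k as any subset of the lower-order terms plus the order-k term, and uses hypothetical function names and synthetic data.

```python
import math
import random
from itertools import combinations


def polynomial_columns(x, max_order):
    """Regressor columns x**0, x**1, ..., x**max_order."""
    return [[xi ** d for xi in x] for d in range(max_order + 1)]


def least_squares(columns, y):
    """Solve the normal equations by Gauss-Jordan elimination with pivoting."""
    p = len(columns)
    a = [[sum(u * v for u, v in zip(columns[i], columns[j])) for j in range(p)]
         + [sum(u * v for u, v in zip(columns[i], y))]
         for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(p):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * w for v, w in zip(a[r], a[col])]
    return [a[i][p] / a[i][i] for i in range(p)]


def aic_of_subset(all_columns, y, indices):
    """AIC of the least-squares fit that uses only the listed columns."""
    columns = [all_columns[i] for i in indices]
    coef = least_squares(columns, y)
    n = len(y)
    residual_ss = sum(
        (y[t] - sum(b * c[t] for b, c in zip(coef, columns))) ** 2
        for t in range(n))
    return n * math.log(residual_ss / n) + 2 * len(indices)


def full_model_aic(all_columns, y, order):
    """Full-model search: every term from 0 up to the given order is fitted."""
    return aic_of_subset(all_columns, y, tuple(range(order + 1)))


def partial_model_aic(all_columns, y, order):
    """Partial-model search: best subset that still contains the order term."""
    best = math.inf
    for size in range(order + 1):
        for lower in combinations(range(order), size):
            best = min(best, aic_of_subset(all_columns, y, lower + (order,)))
    return best


def select_order(criterion, all_columns, y, max_order):
    """Order with the smallest criterion value."""
    return min(range(max_order + 1), key=lambda k: criterion(all_columns, y, k))


# Synthetic example (assumed data): a cubic trend with no x or x**2 term.
rng = random.Random(0)
x = [-1 + 2 * i / 39 for i in range(40)]
y = [1 + 2 * xi ** 3 + rng.gauss(0, 0.1) for xi in x]
cols = polynomial_columns(x, 4)
print(select_order(full_model_aic, cols, y, 4),
      select_order(partial_model_aic, cols, y, 4))
```

Because the full model is one of the subsets examined at each order, the partial-model score is never worse than the full-model score for that order. The price is the extra computation noted in the text: the number of subsets grows exponentially with the order.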