882 results for Bayesian model selection


Relevance:

80.00%

Publisher:

Abstract:

Background: The controversy surrounding the non-uniqueness of predictive gene lists (PGLs), small selected subsets of genes drawn from the very large pool of candidates available in DNA microarray experiments, is now widely acknowledged [1]. Many such studies have focused on constructing discriminative semi-parametric models and are therefore also subject to the random correlations that arise in sparse model selection in high-dimensional spaces. In this work we outline a different approach based on an unsupervised, patient-specific nonlinear topographic projection of predictive gene lists. Methods: We construct nonlinear topographic projection maps based on inter-patient gene-list relative dissimilarities. Neuroscale, Stochastic Neighbor Embedding (SNE), and Locally Linear Embedding (LLE) are used to construct two-dimensional projective visualisation plots of 70-dimensional PGLs per patient. Classifiers are then constructed to identify the prognosis indicator of each patient from the resulting projections, and we investigate whether the two prognosis groups are separable a posteriori on the evidence of the gene lists. A literature-proposed predictive gene list for breast cancer is benchmarked against a separate gene list using the above methods. Generalisation ability is investigated by using the mapping capability of Neuroscale to visualise the follow-up study, based on the projections derived from the original dataset. Results: The results indicate that small subsets of patient-specific PGLs have insufficient prognostic dissimilarity to permit a distinction between the two prognosis groups. Uncertainty and diversity across multiple gene expressions prevent unambiguous, or even confident, patient grouping. Comparative projections across different PGLs give similar results. Conclusion: The random correlations with an arbitrary outcome induced by selecting small subsets from very high-dimensional, interrelated gene expression profiles lead to outcomes with associated uncertainty. This continuum of uncertainty precludes attempts at constructing discriminative classifiers. However, a patient's gene expression profile could possibly be used in treatment planning, based on knowledge of other patients' responses. We conclude that many of the patients involved in such medical studies are intrinsically unclassifiable on the basis of the provided PGL evidence. This additional category of 'unclassifiable' should be accommodated within medical decision support systems if serious errors and unnecessary adjuvant therapy are to be avoided.
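
As a rough sketch of the projection step described above, the following uses scikit-learn's LLE and t-SNE (a modern descendant of SNE) on synthetic data standing in for the study's 70-dimensional per-patient PGLs; the data and parameters are illustrative only, and Neuroscale has no standard scikit-learn implementation:

```python
import numpy as np
from sklearn.manifold import LocallyLinearEmbedding, TSNE

rng = np.random.default_rng(0)
# Synthetic stand-in: 97 patients x 70 genes (the study's PGL dimensionality).
X = rng.normal(size=(97, 70))

# 2-D LLE projection based on local inter-patient similarity structure.
lle = LocallyLinearEmbedding(n_components=2, n_neighbors=10)
Z_lle = lle.fit_transform(X)

# 2-D t-SNE projection of inter-patient dissimilarities.
tsne = TSNE(n_components=2, perplexity=15, init="pca", random_state=0)
Z_tsne = tsne.fit_transform(X)
```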

Relevance:

80.00%

Publisher:

Abstract:

Emotional lability and mood dysregulation characterize bipolar disorder (BD), yet no study has examined effective connectivity between the parahippocampal gyrus and prefrontal cortical regions in the ventromedial and dorsal/lateral neural systems subserving mood regulation in BD. Participants comprised 46 individuals (age range: 18-56 years): 21 with a DSM-IV diagnosis of BD, type I, currently remitted, and 25 age- and gender-matched healthy controls (HC). Participants performed an event-related functional magnetic resonance imaging paradigm, viewing mild and intense happy and neutral faces. We employed dynamic causal modeling (DCM) to identify significant alterations in effective connectivity between BD and HC, and Bayesian model selection to determine the best model. The right parahippocampal gyrus (PHG) and right subgenual cingulate gyrus (sgCG) were included as representative regions of the ventromedial neural system; the right dorsolateral prefrontal cortex (DLPFC) represented the dorsal/lateral neural system. Right PHG-sgCG effective connectivity was significantly greater in BD than HC, reflecting more rapid, forward PHG-sgCG signaling in BD. There was no between-group difference in sgCG-DLPFC effective connectivity. In BD, abnormally increased right PHG-sgCG effective connectivity and reduced right PHG activity to emotional stimuli suggest a dysfunctional ventromedial neural system implicated in early stimulus appraisal, encoding, and automatic regulation of emotion, which may represent a pathophysiological functional neural mechanism for mood dysregulation in BD.
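
The Bayesian model selection step in studies like this amounts, in its fixed-effects form, to comparing model evidences across candidate connectivity models. A minimal sketch with made-up log-evidence values (not the study's):

```python
import numpy as np

# Hypothetical log model evidences (e.g., free-energy approximations from
# DCM) for three candidate effective-connectivity models. Illustrative only.
log_evidence = np.array([-1234.2, -1229.8, -1231.5])

# Fixed-effects Bayesian model selection: under a uniform model prior,
# posterior model probabilities are a softmax of the log evidences.
shifted = log_evidence - log_evidence.max()   # for numerical stability
post = np.exp(shifted) / np.exp(shifted).sum()

# Bayes factor comparing model 2 against model 1.
bf_21 = np.exp(log_evidence[1] - log_evidence[0])
print(post, bf_21)
```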

Relevance:

80.00%

Publisher:

Abstract:

2010 Mathematics Subject Classification: 94A17, 62B10, 62F03.

Relevance:

80.00%

Publisher:

Abstract:

Successful conservation of migratory birds demands that we understand how habitat factors on the breeding grounds influence breeding success. Multiple factors are known to directly influence breeding success in territorial songbirds; for example, greater food availability and fewer predators can have direct positive effects. However, many of the same habitat factors can also lead to higher conspecific density, which may ultimately reduce breeding success through density dependence. In that case, habitat has a negative indirect effect on breeding success through its effects on conspecific density and territory size. A key uncertainty facing land managers is therefore whether important habitat attributes influence breeding success directly or indirectly through territory size. We used radio-telemetry, point counts, vegetation sampling, predator observations, and insect sampling over two years to provide data on habitat selection of a steeply declining songbird species, the Canada Warbler (Cardellina canadensis). These data were then applied in a hierarchical path modeling framework, with an AIC model selection approach, to determine the habitat attributes that best predict breeding success. Canada Warblers had smaller territories in areas with high shrub cover, in the presence of red squirrels (Tamiasciurus hudsonicus), at shoreline sites relative to forest-interior sites, and as conspecific density increased. Breeding success was lower for birds with smaller territories, which suggests competition for limited food resources, but there was no direct evidence that food availability influenced territory size or breeding success. The negative relationship between shrub cover and territory size may arise because these habitat conditions are spatially heterogeneous: individuals pack into patches of preferred breeding habitat scattered throughout the landscape, reducing territory size and the resource availability per territory. Our results therefore highlight the importance of considering both direct and indirect effects for Canada Warblers; efforts to increase the amount of breeding habitat may ultimately result in lower breeding success if habitat availability is limited and negative density-dependent effects occur.
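
The AIC model selection approach described above can be illustrated with a small sketch; the data, variable names, and candidate models below are hypothetical stand-ins, not the study's:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
# Hypothetical territory-level data: all names and values are illustrative.
n = 80
df = pd.DataFrame({
    "success": rng.binomial(1, 0.5, n),          # fledged young (1/0)
    "territory_size": rng.lognormal(0, 0.3, n),  # ha
    "shrub_cover": rng.uniform(0, 1, n),
    "density": rng.poisson(3, n),
})

candidates = {
    "direct":   "success ~ shrub_cover + density",
    "indirect": "success ~ territory_size",
    "full":     "success ~ shrub_cover + density + territory_size",
}

rows = []
for name, formula in candidates.items():
    fit = smf.logit(formula, data=df).fit(disp=0)
    k = fit.df_model + 1                               # parameters incl. intercept
    aicc = fit.aic + 2 * k * (k + 1) / (n - k - 1)     # small-sample correction
    rows.append((name, aicc))

rows.sort(key=lambda r: r[1])
delta = np.array([a - rows[0][1] for _, a in rows])
weights = np.exp(-delta / 2) / np.exp(-delta / 2).sum()  # Akaike weights
for (name, aicc), w in zip(rows, weights):
    print(f"{name:9s} AICc={aicc:7.2f} w={w:.2f}")
```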

Relevance:

80.00%

Publisher:

Abstract:

At least since the seminal works of Jacob Mincer, labor economists have sought to understand how students make higher education investment decisions. Mincer’s original work seeks to understand how students decide how much education to accrue; subsequent work by various authors seeks to understand how students choose where to attend college, what field to major in, and whether to drop out of college.

Broadly speaking, this rich sub-field of literature contributes to society in two ways: First, it provides a better understanding of important social behaviors. Second, it helps policymakers anticipate the responses of students when evaluating various policy reforms.

While research on the higher education investment decisions of students has had an enormous impact on our understanding of society and has shaped countless education policies, students are only one interested party in the higher education landscape. In the jargon of economists, students represent only the 'demand side' of higher education: customers choosing among a set of available alternatives. Opposite students are instructors and administrators, who represent the 'supply side' of higher education: those who decide which options are available to students.

For similar reasons, it is also important to understand how individuals on the supply side of education make decisions. First, this provides a deeper understanding of the behaviors of important social institutions. Second, it helps policymakers anticipate the responses of instructors and administrators when evaluating various reforms. However, while there is a substantial literature on decisions made on the demand side of education, far less attention has been paid to decisions on the supply side.

This dissertation uses empirical evidence to better understand how instructors and administrators make decisions and the implications of these decisions for students.

In the first chapter, I use data from Duke University and a Bayesian model of correlated learning to measure the signal quality of grades across academic fields. The correlated feature of the model allows grades in one academic field to signal ability in all other fields, allowing me to measure both 'own category' signal quality and 'spillover' signal quality. Estimates reveal a clear division between information-rich Science, Engineering, and Economics grades and less informative Humanities and Social Science grades. In many specifications, information spillovers are so powerful that precise Science, Engineering, and Economics grades are more informative about Humanities and Social Science abilities than Humanities and Social Science grades themselves. This suggests that students who take engineering courses during their freshman year make more informed specialization decisions later in college.
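
A minimal sketch of the correlated-learning idea, not the chapter's actual model: under a Gaussian prior over abilities that is correlated across fields, a single precise grade updates beliefs about every field. All numbers are illustrative:

```python
import numpy as np

# Prior belief about a student's two-field ability vector
# [engineering, humanities]; numbers are illustrative only.
mu = np.zeros(2)
Sigma = np.array([[1.0, 0.6],
                  [0.6, 1.0]])   # cross-field ability correlation

# A grade is a noisy signal of ability in one field: g = H a + e.
H = np.array([[1.0, 0.0]])       # an engineering grade
sigma_g = 0.3                    # precise signal (information-rich field)
g = np.array([1.2])              # observed standardized grade

# Conjugate Gaussian update: the correlation propagates the signal,
# so an engineering grade also moves the humanities belief.
S = H @ Sigma @ H.T + sigma_g**2     # 1x1 innovation variance
K = Sigma @ H.T @ np.linalg.inv(S)   # 2x1 gain
mu_post = mu + K @ (g - H @ mu)
Sigma_post = Sigma - K @ H @ Sigma

print(mu_post)   # both entries shift upward via the 0.6 correlation
```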

In the second chapter, I use data from the University of Central Arkansas to understand how universities decide which courses to offer and how much to spend on instructors for these courses. Course offerings and instructor characteristics directly affect the courses students choose and the value they receive from these choices. This chapter recovers the university's preferences over these student outcomes that best explain observed course offerings and instructors. This allows me to assess whether the university's incentives are aligned with students', to determine which alternative university choices would be preferred by students, and to illustrate how a revenue-neutral tax/subsidy policy can induce a university to make these student-best decisions.

In the third chapter, co-authored with Thomas Ahn, Peter Arcidiacono, and Amy Hopson, we use data from the University of Kentucky to understand how instructors choose grading policies. We estimate an equilibrium model in which instructors choose grading policies and students choose courses and study effort given those policies. In this model, instructors set both a grading intercept and a return on ability and effort, creating a rich link between instructors' grading policy decisions and students' course choices. We use estimates of this model to infer which preference parameters best explain the estimated grading policies. To illustrate the importance of these supply-side decisions, we show that changing grading policies can substantially reduce the gender gap in STEM enrollment.

Relevance:

80.00%

Publisher:

Abstract:

The work presented in this dissertation is focused on applying engineering methods to develop and explore probabilistic survival models for the prediction of decompression sickness in US NAVY divers. Mathematical modeling, computational model development, and numerical optimization techniques were employed to formulate and evaluate the predictive quality of models fitted to empirical data. In Chapters 1 and 2 we present general background information relevant to the development of probabilistic models applied to predicting the incidence of decompression sickness. The remainder of the dissertation introduces techniques developed in an effort to improve the predictive quality of probabilistic decompression models and to reduce the difficulty of model parameter optimization.

The first project explored seventeen variations of the hazard function using a well-perfused parallel compartment model. Models were parametrically optimized using the maximum likelihood technique. Model performance was evaluated using both classical statistical methods and model selection techniques based on information theory. Optimized model parameters were overall similar to those previously published. Results favoured a novel hazard function definition that included both an ambient pressure scaling term and individually fitted compartment exponent scaling terms.
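
A minimal sketch of the maximum likelihood step, under a deliberately simplified one-parameter hazard rather than any of the seventeen variants studied; the data are synthetic:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)

# Illustrative stand-in data: exposure "dose" x_i and DCS outcome y_i (0/1).
# A probabilistic survival model gives P(DCS) = 1 - exp(-integral of hazard),
# here collapsed to 1 - exp(-lam * x) for a single fitted parameter.
x = rng.uniform(0.1, 3.0, 500)
y = rng.binomial(1, 1 - np.exp(-0.4 * x))

def neg_log_lik(theta):
    lam = np.exp(theta[0])                 # keep the hazard scale positive
    p = np.clip(1 - np.exp(-lam * x), 1e-12, 1 - 1e-12)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p)).sum()

fit = minimize(neg_log_lik, x0=[0.0], method="Nelder-Mead")
lam_hat = np.exp(fit.x[0])
aic = 2 * 1 + 2 * fit.fun                  # AIC = 2k + 2 * negative log-likelihood
print(lam_hat, aic)
```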

We developed ten pharmacokinetic compartmental models that included explicit delay mechanics to determine whether predictive quality could be improved through the inclusion of material transfer lags. A fitted discrete delay parameter augmented the inflow to the compartment systems from the environment. Based on the observation that, for many of our models, symptoms are often reported after risk accumulation begins, we hypothesized that the inclusion of delays might improve correlation between model predictions and observed data. Model selection techniques identified two models as having the best overall performance, but comparison with the best-performing model without delay, and model selection against our best previously identified no-delay pharmacokinetic model, both indicated that the delay mechanism was not statistically justified and did not substantially improve model predictions.

Our final investigation explored parameter bounding techniques to identify parameter regions in which statistical model failure will not occur. Statistical model failure occurs when a model predicts zero probability of a diver experiencing decompression sickness for an exposure that is known to produce symptoms. Using a metric related to the instantaneous risk, we successfully identify regions where model failure will not occur and locate the boundaries of each region using a root bounding technique. Several models are used to demonstrate the techniques, which may be employed to reduce the difficulty of model optimization in future investigations.
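
A toy sketch of the root bounding idea, assuming a contrived one-parameter risk function rather than any of the dissertation's models:

```python
from scipy.optimize import brentq

# Statistical model failure: the model assigns zero instantaneous risk to an
# exposure known to produce DCS. As a toy stand-in, let end-of-exposure risk be
#   r(theta) = theta * p_amb - threshold,
# which is positive (no failure) only above some boundary value of theta.
p_amb, threshold = 1.3, 0.5
risk = lambda theta: theta * p_amb - threshold

# Root bounding: bracket the sign change, then locate the boundary of the
# failure-free parameter region with a root finder.
lo, hi = 1e-6, 10.0
assert risk(lo) < 0 < risk(hi)
theta_boundary = brentq(risk, lo, hi)
print(theta_boundary)   # parameters above this value avoid model failure
```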

Relevance:

80.00%

Publisher:

Abstract:

Radiocarbon dating and Bayesian chronological modelling, undertaken as part of the investigation by the Times of Their Lives project into the development of Late Neolithic settlement and pottery in Orkney, has provided precise new dating for the Grooved Ware settlement of Barnhouse, excavated in 1985–91. Previous understandings of the site and its pottery are presented. A Bayesian model based on 70 measurements on 62 samples (of which 50 samples are thought to date accurately the deposits from which they were recovered) suggests that the settlement probably began in the later 32nd century cal bc (with Houses 2, 9, 3 and perhaps 5a), possibly as a planned foundation. Structure 8 – a large, monumental structure that differs in character from the houses – was probably built just after the turn of the millennium. Varied house durations and replacements are estimated. House 2 went out of use before the end of the settlement, and Structure 8 was probably the last element to be abandoned, probably during the earlier 29th century cal bc. The Grooved Ware pottery from the site is characterised by small, medium-sized, and large vessels with incised and impressed decoration, including a distinctive, false-relief, wavy-line cordon motif. A considerable degree of consistency is apparent in many aspects of ceramic design and manufacture over the use-life of the settlement, the principal change being the appearance, from c. 3025–2975 cal bc, of large coarse ware vessels with uneven surfaces and thick applied cordons, and of the use of applied dimpled circular pellets. The circumstances of new foundation of settlement in the western part of Mainland are discussed, as well as the maintenance and character of the site. The pottery from the site is among the earliest Grooved Ware so far dated. Its wider connections are noted, as well as the significant implications for our understanding of the timing and circumstances of the emergence of Grooved Ware, and the role of material culture in social strategies.

Relevance:

80.00%

Publisher:

Abstract:

This thesis develops bootstrap methods for factor models, which have been widely used to generate forecasts since the pioneering work of Stock and Watson (2002) on diffusion indices. These models accommodate a large number of macroeconomic and financial variables as predictors, a useful feature for incorporating the diverse information available to economic agents. The thesis therefore proposes econometric tools that improve inference in factor models using latent factors extracted from a large panel of observed predictors. It is divided into three complementary chapters, the first two co-authored with Sílvia Gonçalves and Benoit Perron.

In the first chapter, we study how bootstrap methods can be used for inference in forecasting models at an h-period horizon. To this end, it examines bootstrap inference in a factor-augmented regression setting where the errors may be autocorrelated. It generalizes the results of Gonçalves and Perron (2014) and proposes and justifies two residual-based approaches: the block wild bootstrap and the dependent wild bootstrap. Our simulations show improved coverage rates for confidence intervals of the estimated coefficients with these approaches, compared with asymptotic theory and the wild bootstrap, when the regression errors are serially correlated.

The second chapter proposes bootstrap methods for constructing prediction intervals that relax the assumption of normally distributed innovations. We propose bootstrap prediction intervals for an observation h periods ahead and for its conditional mean. We assume these forecasts are made using a set of factors extracted from a large panel of variables. Because we treat the factors as latent, our forecasts depend on both the estimated factors and the estimated regression coefficients. Under regularity conditions, Bai and Ng (2006) proposed constructing asymptotic intervals under Gaussian innovations. The bootstrap allows us to relax this assumption and to construct valid prediction intervals under more general conditions. Moreover, even under Gaussianity, the bootstrap yields more accurate intervals when the cross-sectional dimension is relatively small, because it accounts for the bias of the ordinary least squares estimator, as shown in a recent study by Gonçalves and Perron (2014).

In the third chapter, we propose consistent selection procedures for factor-augmented regressions in finite samples. We first show that the usual cross-validation method is inconsistent, but that its generalization, leave-d-out cross-validation, selects the smallest set of estimated factors spanning the space generated by the true factors. The second criterion whose validity we establish generalizes the bootstrap approximation of Shao (1996) to factor-augmented regressions. Simulations show an improved probability of parsimoniously selecting the estimated factors compared with available selection methods.

The empirical application revisits the relationship between macroeconomic and financial factors and excess returns on the US stock market. Among the factors estimated from a large panel of US macroeconomic and financial data, factors strongly correlated with interest rate spreads and the Fama-French factors have good predictive power for excess returns.
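
A minimal sketch of the basic objects involved (PCA-extracted factors, a factor-augmented regression, and a plain wild bootstrap of the residuals); the thesis's block and dependent wild bootstrap variants would instead draw weights that preserve serial correlation. All data here are simulated:

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated stand-in: T periods, N observed predictors driven by r latent
# factors; y is forecast from factors estimated by principal components.
T, N, r = 200, 50, 2
F = rng.normal(size=(T, r))                      # true latent factors
Lam = rng.normal(size=(N, r))
X = F @ Lam.T + rng.normal(size=(T, N))          # observed panel
y = F @ np.array([1.0, -0.5]) + rng.normal(size=T)

# Step 1: extract factors by principal components.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
F_hat = U[:, :r] * np.sqrt(T)                    # standard PCA normalization

# Step 2: factor-augmented regression y_t = F_hat_t' beta + e_t.
beta_hat, *_ = np.linalg.lstsq(F_hat, y, rcond=None)
resid = y - F_hat @ beta_hat

# Step 3: wild bootstrap of the residuals with Rademacher weights.
B = 999
boot = np.empty((B, r))
for b in range(B):
    w = rng.choice([-1.0, 1.0], size=T)
    y_star = F_hat @ beta_hat + w * resid
    boot[b], *_ = np.linalg.lstsq(F_hat, y_star, rcond=None)

ci = np.percentile(boot, [2.5, 97.5], axis=0)    # bootstrap 95% CIs
print(beta_hat, ci)
```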

Relevance:

80.00%

Publisher:

Abstract:

Thesis (Master's)--University of Washington, 2016-08

Relevance:

80.00%

Publisher:

Abstract:

Thesis (Ph.D.)--University of Washington, 2016-08

Relevance:

80.00%

Publisher:

Abstract:

This master's thesis studies the cross-validation criterion for model selection in small area estimation. The study is limited to unit-level small area models. The basic small area model was introduced by Battese, Harter and Fuller in 1988. It is a linear mixed regression model with a random intercept, comprising several parameters: the parameter β of the fixed part, the random component, and the variances of the residual error. The Battese et al. model is used in a survey setting to predict the mean of a variable of interest y in each small area, using an administrative auxiliary variable x known for the entire population. The standard estimation method uses a normal distribution to model the residual component. Allowing a general residual dependence, that is, one other than the normal distribution, gives a more flexible methodology. This generalization leads to a new class of exchangeable models: the residual dependence may be either normal (as in the Battese et al. model) or non-normal. The objective is to estimate the small-area parameters as precisely as possible, which hinges on choosing the right residual dependence for the model. The cross-validation criterion is studied for this purpose.
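
A minimal sketch of cross-validation for a unit-level model in the spirit of Battese-Harter-Fuller, using statsmodels' linear mixed model with a random area intercept; the simulated data and the leave-one-area-out scheme are illustrative choices, not the thesis's exact criterion:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)

# Illustrative unit-level data: areas d with random intercepts u_d,
# auxiliary variable x, and response y.
D, n_d = 15, 20
area = np.repeat(np.arange(D), n_d)
u = rng.normal(0, 0.5, D)[area]
x = rng.uniform(0, 10, D * n_d)
y = 1.0 + 0.8 * x + u + rng.normal(0, 1.0, D * n_d)
df = pd.DataFrame({"y": y, "x": x, "area": area})

# Leave-one-area-out cross-validation: refit the random-intercept model
# without area d, then score its fixed-effects predictions on area d.
cv_sse = 0.0
for d in range(D):
    train, test = df[df.area != d], df[df.area == d]
    fit = smf.mixedlm("y ~ x", train, groups=train["area"]).fit()
    pred = fit.predict(test)       # fixed-effects prediction for a new area
    cv_sse += ((test["y"] - pred) ** 2).sum()
print(cv_sse / len(df))            # CV score, comparable across models
```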

Relevance:

80.00%

Publisher:

Abstract:

Mechanistic models used for prediction should be parsimonious, as models which are over-parameterised may have poor predictive performance. Determining whether a model is parsimonious requires comparisons with alternative model formulations with differing levels of complexity. However, creating alternative formulations for large mechanistic models is often problematic, and usually time-consuming; consequently, few are ever investigated. In this paper, we present an approach which rapidly generates reduced model formulations by replacing a model's variables with constants. These reduced alternatives can be compared to the original model, using data-based model selection criteria, to assist in the identification of potentially unnecessary model complexity, and thereby inform reformulation of the model. To illustrate the approach, we present its application to a published radiocaesium plant-uptake model, which predicts uptake on the basis of soil characteristics (e.g. pH, organic matter content, clay content). A total of 1024 reduced model formulations were generated and ranked according to five model selection criteria: Residual Sum of Squares (RSS), AICc, BIC, MDL and ICOMP. The lowest scores for RSS and AICc occurred for the same reduced model, in which the pH-dependent model components were replaced. The lowest scores for BIC, MDL and ICOMP occurred for a further reduced model, in which components related to the distinction between adsorption on clay and organic surfaces were replaced. Both reduced models had a lower RSS on the parameterisation dataset than the original model. As a test of predictive performance, the original model and the two reduced models were used to predict an independent dataset. The reduced models had lower prediction sums of squares than the original model, suggesting that the latter may be overfitted. The approach presented has the potential to inform model development by rapidly creating a class of alternative model formulations which can then be compared.
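
A toy sketch of the enumeration idea on a linear surrogate, where fixing an input at its mean amounts to dropping its column; the paper's 1024 formulations correspond to 2^10 replaceable components, here 2^5:

```python
import itertools
import numpy as np

rng = np.random.default_rng(5)

# Toy stand-in for a mechanistic model with k inputs: a reduced formulation
# fixes a subset of inputs at their mean (a constant).
k, n = 5, 60
X = rng.uniform(size=(n, k))
y = 2 * X[:, 0] + X[:, 1] + rng.normal(0, 0.1, n)   # only 2 inputs matter

def score(cols):
    """Fit the formulation keeping columns `cols`; return RSS, AICc, BIC."""
    A = np.column_stack([X[:, cols], np.ones(n)]) if cols else np.ones((n, 1))
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = ((y - A @ beta) ** 2).sum()
    p = A.shape[1] + 1                       # coefficients + error variance
    aicc = n * np.log(rss / n) + 2 * p + 2 * p * (p + 1) / (n - p - 1)
    bic = n * np.log(rss / n) + p * np.log(n)
    return rss, aicc, bic

# Enumerate all 2^k reduced formulations and rank by AICc.
ranked = sorted(
    ((cols, *score(list(cols)))
     for r in range(k + 1)
     for cols in itertools.combinations(range(k), r)),
    key=lambda t: t[2])
for cols, rss, aicc, bic in ranked[:3]:
    print(cols, round(rss, 3), round(aicc, 1), round(bic, 1))
```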

Relevance:

80.00%

Publisher:

Abstract:

The markets for mobile-to-mobile voice services provided by Advanced Mobile System operators in Latin America have been subject to regulatory processes, motivated by one operator's market dominance, that seek optimal conditions of competition. Specifically in Ecuador, the Superintendencia de Telecomunicaciones (the technical telecommunications control body) developed a model to identify regulatory actions that could give the market sustainable competitive effects in the long run. This article deals with the application of control engineering to develop an integrated model of the market, using neural networks to predict each operator's tariffs and a fuzzy logic model to predict demand. Additionally, a fuzzy inference model is presented that reproduces the operators' marketing strategies and their influence on tariffs. These models would support sound decision-making and were validated with real data.
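
A hand-rolled sketch of Mamdani-style fuzzy inference of demand from a (normalized) tariff; the membership functions and rules are invented for illustration and are not the article's model:

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function on [a, c] peaking at b."""
    return np.maximum(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def infer(tariff):
    # Fuzzify the input tariff (normalized to [0, 1]).
    low = tri(tariff, -0.5, 0.0, 0.5)
    med = tri(tariff, 0.0, 0.5, 1.0)
    high = tri(tariff, 0.5, 1.0, 1.5)

    # Rules: low tariff -> high demand; medium -> medium; high -> low.
    z = np.linspace(0, 1, 101)                  # demand universe
    out = np.maximum.reduce([
        np.minimum(low, tri(z, 0.5, 1.0, 1.5)),  # consequents clipped at
        np.minimum(med, tri(z, 0.0, 0.5, 1.0)),  # each rule's firing strength
        np.minimum(high, tri(z, -0.5, 0.0, 0.5)),
    ])
    return (z * out).sum() / out.sum()          # centroid defuzzification

print(infer(0.2))   # a low tariff yields demand near the high end
```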

Relevance:

80.00%

Publisher:

Abstract:

Salmonella is distributed worldwide and is a pathogen of economic and public health importance. As a multi-host pathogen with long environmental persistence, it is a suitable model for the study of wildlife-livestock interactions. In this work, we explore the spill-over of Salmonella between free-ranging wild boar and livestock in a protected natural area in NE Spain, and the presence of antimicrobial resistance. Salmonella prevalence, serotypes and diversity were compared between wild boars, sympatric cattle and wild boars from cattle-free areas. The effects of age, sex, cattle presence and cattle herd size on the probability of Salmonella infection in wild boars were explored by means of Generalized Linear Models and model selection based on Akaike's Information Criterion. Prevalence was higher in wild boars co-habiting with cattle (35.67%, 95% CI 28.19-43.70) than in wild boars from cattle-free areas (17.54%, 95% CI 8.74-29.91). The probability of a wild boar being a Salmonella carrier increased with cattle herd size but decreased with host age. Serotypes Meleagridis, Anatum and Othmarschen were isolated concurrently from cattle and sympatric wild boars. Apart from serotypes shared with cattle, wild boars appear to have their own serotypes, which are also found in wild boars from cattle-free areas (Enteritidis, Mikawasima, 4:b:- and 35:r:z35). Serotype richness (diversity) was higher in wild boars co-habiting with cattle, but evenness was not altered by the introduction of serotypes from cattle. The finding of an S. Mbandaka strain resistant to sulfamethoxazole, streptomycin and chloramphenicol and an S. Enteritidis strain resistant to ciprofloxacin and nalidixic acid in wild boars is cause for public health concern.
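
A minimal sketch of the GLM-plus-AIC workflow described above, on synthetic stand-in data with hypothetical covariate names:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)

# Synthetic stand-in for the wild boar data: infection status modelled
# against host and exposure covariates (all values are illustrative).
n = 300
df = pd.DataFrame({
    "age": rng.uniform(0.5, 8, n),                             # years
    "sex": rng.integers(0, 2, n),
    "herd_size": rng.poisson(60, n) * rng.integers(0, 2, n),   # 0 = cattle-free
})
logit_p = -1.0 + 0.02 * df.herd_size - 0.2 * df.age
df["infected"] = rng.binomial(1, 1 / (1 + np.exp(-logit_p)))

# Candidate GLMs (binomial/logit), compared by AIC as in the abstract.
formulas = [
    "infected ~ herd_size + age",
    "infected ~ herd_size",
    "infected ~ age + sex",
    "infected ~ 1",
]
fits = [(f, smf.logit(f, data=df).fit(disp=0)) for f in formulas]
for f, fit in sorted(fits, key=lambda t: t[1].aic):
    print(f"AIC={fit.aic:7.2f}  {f}")
```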