956 results for Models for count data


Relevance:

90.00%

Publisher:

Abstract:

A combinatorial protocol (CP) is introduced here and interfaced with multiple linear regression (MLR) for variable selection. The efficiency of CP-MLR rests primarily on restricting the entry of correlated variables into the model development stage. It has been used to analyze the Selwood et al. data set [16], and the resulting models are compared with those reported from the GFA [8] and MUSEUM [9] approaches. For this data set, CP-MLR identified three highly independent models (27, 28 and 31) with Q2 values in the range 0.518-0.632; these models are divergent and unique. Although the present study does not share any models with the GFA [8] and MUSEUM [9] results, several descriptors are common to all of these studies, including the present one. A simulation is also carried out on the same data set to explain model formation in CP-MLR. The results demonstrate that the proposed method should be able to offer solutions for data sets with 50 to 60 descriptors in a reasonable time frame. By carefully selecting the inter-parameter correlation cutoff values in CP-MLR, one can identify divergent models and handle data sets larger than the present one without excessive computer time.
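
The abstract does not reproduce the CP-MLR algorithm itself; as a rough illustration of the core idea described above (blocking entry of descriptors that are correlated above a cutoff before fitting MLR models), here is a minimal Python sketch. The data set, cutoff value, and helper names are hypothetical, not the authors' code.

```python
# Hypothetical sketch of correlation-filtered variable entry before MLR,
# loosely inspired by the CP-MLR idea described above (not the authors' code).
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
X = rng.normal(size=(31, 10))      # e.g. 31 compounds x 10 descriptors (made up)
y = X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.2, size=31)

R_CUTOFF = 0.5                     # inter-parameter correlation cutoff (assumed)

def admissible(subset, X, r_cutoff=R_CUTOFF):
    """Reject descriptor subsets containing any pair correlated above the cutoff."""
    for i, j in combinations(subset, 2):
        if abs(np.corrcoef(X[:, i], X[:, j])[0, 1]) > r_cutoff:
            return False
    return True

def fit_r2(subset, X, y):
    """Ordinary least squares fit; returns the coefficient of determination."""
    A = np.column_stack([X[:, list(subset)], np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1 - resid.var() / y.var()

# Enumerate 3-descriptor models whose members pass the correlation filter.
models = [(s, fit_r2(s, X, y))
          for s in combinations(range(X.shape[1]), 3) if admissible(s, X)]
models.sort(key=lambda m: m[1], reverse=True)
print(models[:3])   # top candidate models (descriptor indices, R^2)
```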

Relevance:

90.00%

Publisher:

Abstract:

Application of biogeochemical models to the study of marine ecosystems is pervasive, yet objective quantification of these models' performance is rare. Here, 12 lower trophic level models of varying complexity are objectively assessed in two distinct regions (equatorial Pacific and Arabian Sea). Each model was run within an identical one-dimensional physical framework. A consistent variational adjoint implementation assimilating chlorophyll-a, nitrate, export, and primary productivity was applied and the same metrics were used to assess model skill. Experiments were performed in which data were assimilated from each site individually and from both sites simultaneously. A cross-validation experiment was also conducted whereby data were assimilated from one site and the resulting optimal parameters were used to generate a simulation for the second site. When a single pelagic regime is considered, the simplest models fit the data as well as those with multiple phytoplankton functional groups. However, those with multiple phytoplankton functional groups produced lower misfits when the models are required to simulate both regimes using identical parameter values. The cross-validation experiments revealed that as long as only a few key biogeochemical parameters were optimized, the models with greater phytoplankton complexity were generally more portable. Furthermore, models with multiple zooplankton compartments did not necessarily outperform models with single zooplankton compartments, even when zooplankton biomass data are assimilated. Finally, even when different models produced similar least squares model-data misfits, they often did so via very different element flow pathways, highlighting the need for more comprehensive data sets that uniquely constrain these pathways.
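
The study's own skill metric is not reproduced in the abstract; as a generic illustration, the sketch below computes a normalized least-squares model-data misfit of the kind commonly used in variational assimilation. The variable names and uncertainty weights are assumptions, not the paper's configuration.

```python
# Minimal sketch of a weighted least-squares model-data misfit (cost function),
# of the general kind used in variational data assimilation; values are made up.
import numpy as np

def misfit(model, obs, sigma):
    """Sum over data types of mean squared, error-normalized model-data differences."""
    total = 0.0
    for key in obs:
        total += np.mean(((model[key] - obs[key]) / sigma[key]) ** 2)
    return total

obs   = {"chl": np.array([0.2, 0.3, 0.25]), "no3": np.array([5.0, 4.0, 3.5])}
model = {"chl": np.array([0.22, 0.28, 0.3]), "no3": np.array([5.5, 3.8, 3.2])}
sigma = {"chl": 0.05, "no3": 0.5}            # assumed observational uncertainties

print(misfit(model, obs, sigma))
```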

Relevance:

90.00%

Publisher:

Abstract:

Ordinal logistic regression models are used to analyze dependent variables with multiple ordered (rankable) outcomes, but they have been underutilized. In this methodological study, we describe four logistic regression models for analyzing an ordinal response variable: the first is the multinomial logistic model, the second is the adjacent-category logit model, the third is the proportional odds model, and the fourth is the continuation-ratio model. We illustrate and compare the fit of these models using data from the survey designed by the University of Texas School of Public Health research project PCCaSO (Promoting Colon Cancer Screening in people 50 and Over), which studied patients' confidence in completing colorectal cancer screening (CRCS). The purpose of this study is twofold: first, to provide a synthesized review of models for analyzing data with an ordinal response, and second, to evaluate their usefulness in epidemiological research, with particular emphasis on model formulation, interpretation of model coefficients, and their implications. The four ordinal logistic models used in this study are (1) the multinomial logistic model, (2) the adjacent-category logistic model [9], (3) the continuation-ratio logistic model [10], and (4) the proportional odds logistic model [11]. We recommend that the analyst perform (1) goodness-of-fit tests and (2) a sensitivity analysis by fitting and comparing different models.
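
As an illustration (not the authors' analysis code or data), the sketch below fits two of the four model types named above, a multinomial logit and a proportional odds (cumulative logit) model, with statsmodels on a simulated ordinal outcome; the variable names are hypothetical.

```python
# Hedged sketch: multinomial logit vs. proportional odds fits to a simulated
# 3-level ordinal outcome (e.g. patient confidence); not the study's data.
import numpy as np
import statsmodels.api as sm
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(1)
n = 500
age = rng.normal(60, 8, n)
latent = 0.05 * (age - 60) + rng.logistic(size=n)
confidence = np.digitize(latent, [-0.5, 0.5])        # 0 = low, 1 = medium, 2 = high

X = sm.add_constant(age)

# (1) Multinomial logistic model: no ordering assumed among outcome levels.
mnl = sm.MNLogit(confidence, X).fit(disp=False)

# Proportional odds (cumulative logit) model: one slope, ordered thresholds.
po = OrderedModel(confidence, age.reshape(-1, 1), distr="logit").fit(
    method="bfgs", disp=False)

print(mnl.params)
print(po.params)
```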

Relevance:

90.00%

Publisher:

Abstract:

Maximizing data quality may be especially difficult in trauma-related clinical research. Strategies are needed to improve data quality and assess the impact of data quality on clinical predictive models. This study had two objectives. The first was to compare missing data between two multi-center trauma transfusion studies: a retrospective study (RS) using medical chart data with minimal data quality review and the PRospective Observational Multi-center Major Trauma Transfusion (PROMMTT) study with standardized quality assurance. The second objective was to assess the impact of missing data on clinical prediction algorithms by evaluating blood transfusion prediction models using PROMMTT data. RS (2005-06) and PROMMTT (2009-10) investigated trauma patients receiving ≥ 1 unit of red blood cells (RBC) from ten Level I trauma centers. Missing data were compared for 33 variables collected in both studies using mixed effects logistic regression (including random intercepts for study site). Massive transfusion (MT) patients received ≥ 10 RBC units within 24 h of admission. Correct classification percentages for three MT prediction models were evaluated using complete case analysis and multiple imputation based on the multivariate normal distribution. A sensitivity analysis for missing data was conducted to estimate the upper and lower bounds of correct classification using assumptions about missing data under best- and worst-case scenarios. Most variables (17/33 = 52%) had <1% missing data in RS and PROMMTT. Of the remaining variables, 50% demonstrated less missingness in PROMMTT, 25% had less missingness in RS, and 25% were similar between studies. Missing percentages for MT prediction variables in PROMMTT ranged from 2.2% (heart rate) to 45% (respiratory rate). For variables missing >1%, study site was associated with missingness (all p ≤ 0.021). Survival time predicted missingness for 50% of RS and 60% of PROMMTT variables. Complete case proportions for the MT models ranged from 41% to 88%. Complete case analysis and multiple imputation produced similar correct classification results. Sensitivity analysis upper-lower bound ranges for the three MT models were 59-63%, 36-46%, and 46-58%. Prospective collection of ten-fold more variables with data quality assurance reduced overall missing data. Study site and patient survival were associated with missingness, suggesting that data were not missing completely at random and that complete case analysis may lead to biased results. Evaluating clinical prediction model accuracy may be misleading in the presence of missing data, especially with many predictor variables. The proposed sensitivity analysis, which estimates correct classification under upper (best case) and lower (worst case) bounds, may be more informative than multiple imputation, which provided results similar to complete case analysis.
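
As a small illustration of the best-case/worst-case sensitivity analysis described above (not the study's code or data), the sketch below bounds the correct-classification percentage of a predictor by counting cases with missing inputs as all correct (upper bound) or all incorrect (lower bound).

```python
# Hedged sketch of best case / worst case classification bounds for a prediction
# model when some cases cannot be scored because of missing predictors.
import numpy as np

rng = np.random.default_rng(2)
n = 1000
truth = rng.integers(0, 2, n)                             # 1 = massive transfusion (simulated)
pred = np.where(rng.random(n) < 0.8, truth, 1 - truth)    # imperfect predictions
scoreable = rng.random(n) > 0.25                          # ~25% of cases have missing inputs

correct = (pred == truth) & scoreable
complete_case = correct[scoreable].mean()                 # accuracy among complete cases
lower = correct.sum() / n                                 # missing cases counted as wrong
upper = (correct.sum() + (~scoreable).sum()) / n          # missing cases counted as right

print(f"complete case: {complete_case:.2%}, bounds: {lower:.2%} - {upper:.2%}")
```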

Relevance:

90.00%

Publisher:

Abstract:

Detrital modes for 524 deep-marine sand and sandstone samples recovered on circum-Pacific, Caribbean, and Mediterranean legs of the Deep Sea Drilling Project and the Ocean Drilling Program form the basis for an actualistic model for arc-related provenance. This model refines the Dickinson and Suczek (1979) and Dickinson and others (1983) models and can be used to interpret the provenance/tectonic history of ancient arc-related sedimentary sequences. Four provenance groups are defined using QFL, QmKP, LmLvLs, and LvfLvmiLvl ternary plots of site means: (1) intraoceanic arc and remnant arc, (2) continental arc, (3) triple junction, and (4) strike-slip-continental arc. Intraoceanic- and remnant-arc sands are poor in quartz (mean QFL%Q < 5) and rich in lithics (QFL%L > 75); they are predominantly composed of plagioclase feldspar and volcanic lithic fragments. Continental-arc sand can be more quartzofeldspathic than the intraoceanic- and remnant-arc sand (mean QFL%Q values as much as 10, mean QFL%F values as much as 65, and mean QmKP%Qm as much as 20) and has more variable lithic populations, with minor metamorphic and sedimentary components. The triple-junction and strike-slip-continental groups compositionally overlap; both are more quartzofeldspathic than the other groups and show highly variable lithic proportions, but the strike-slip-continental group is more quartzose. Modal compositions of the triple junction group roughly correlate with the QFL transitional-arc field of Dickinson and others (1983), whereas the strike-slip-continental group approximately correlates with their dissected-arc field.
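
As a toy illustration of how ternary detrital modes are recalculated (not the paper's classification scheme or data), the sketch below normalizes Q-F-L point counts to percentages and flags quartz-poor, lithic-rich compositions of the kind attributed above to intraoceanic- and remnant-arc sands. The sample counts are invented.

```python
# Minimal sketch: normalize Q-F-L point counts to ternary percentages and apply
# the quartz/lithic thresholds quoted above; the sample counts are invented.
def qfl_percent(q, f, l):
    total = q + f + l
    return 100 * q / total, 100 * f / total, 100 * l / total

samples = {"site_A": (3, 20, 77), "site_B": (12, 60, 28)}   # hypothetical point counts

for name, counts in samples.items():
    pq, pf, pl = qfl_percent(*counts)
    arc_like = pq < 5 and pl > 75   # intraoceanic/remnant-arc signature (site-mean criteria)
    print(f"{name}: Q={pq:.1f}% F={pf:.1f}% L={pl:.1f}% "
          f"intraoceanic/remnant-arc-like={arc_like}")
```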

Relevance:

90.00%

Publisher:

Abstract:

BACKGROUND Zebrafish is a clinically-relevant model of heart regeneration. Unlike mammals, it has a remarkable heart repair capacity after injury, and promises novel translational applications. Amputation and cryoinjury models are key research tools for understanding injury response and regeneration in vivo. An understanding of the transcriptional responses following injury is needed to identify key players of heart tissue repair, as well as potential targets for boosting this property in humans. RESULTS We investigated amputation and cryoinjury in vivo models of heart damage in the zebrafish through unbiased, integrative analyses of independent molecular datasets. To detect genes with potential biological roles, we derived computational prediction models with microarray data from heart amputation experiments. We focused on a top-ranked set of genes highly activated in the early post-injury stage, whose activity was further verified in independent microarray datasets. Next, we performed independent validations of expression responses with qPCR in a cryoinjury model. Across in vivo models, the top candidates showed highly concordant responses at 1 and 3 days post-injury, which highlights the predictive power of our analysis strategies and the possible biological relevance of these genes. Top candidates are significantly involved in cell fate specification and differentiation, and include heart failure markers such as periostin, as well as potential new targets for heart regeneration. For example, ptgis and ca2 were overexpressed, while usp2a, a regulator of the p53 pathway, was down-regulated in our in vivo models. Interestingly, a high activity of ptgis and ca2 has been previously observed in failing hearts from rats and humans. CONCLUSIONS We identified genes with potential critical roles in the response to cardiac damage in the zebrafish. Their transcriptional activities are reproducible in different in vivo models of cardiac injury.
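
The analysis pipeline itself is not given in the abstract; as a loose illustration of one step (ranking genes by early post-injury activation in an expression table), here is a small pandas sketch. The gene symbols, column names, and values are placeholders, not the study's data.

```python
# Hedged sketch: rank genes by fold change between early post-injury and control
# samples in a toy expression table; names and values are illustrative only.
import numpy as np
import pandas as pd

expr = pd.DataFrame(
    {"control": [5.1, 7.3, 6.0, 4.2], "day1_postinjury": [8.9, 7.5, 9.2, 4.0]},
    index=["gene_a", "ptgis", "ca2", "usp2a"],   # placeholder gene symbols
)
expr["log2_fc"] = np.log2(expr["day1_postinjury"] / expr["control"])
print(expr.sort_values("log2_fc", ascending=False))   # most activated genes first
```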

Relevance:

90.00%

Publisher:

Abstract:

Understanding the determinants of tourism demand is crucial for the tourism sector. This paper develops a dynamic panel model to examine the determinants of inbound tourist arrivals at Siem Reap airport, Phnom Penh airport, and land and waterway borders in Cambodia. Consistent with the consumer theory of tourism consumption, a 10% increase in origin-country GDP per capita is predicted to increase the number of tourist visits to Siem Reap airport by 5.8%, while a 10% increase in the real exchange rate between the origin country and Cambodia is predicted to decrease the number of tourist visits by 0.89%. In contrast, the number of foreign tourists in a previous period has little effect on the number of foreign tourists in the current period. Additionally, the determinants differ by mode of entry to Cambodia.
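
The panel estimator itself is not shown in the abstract; the snippet below only illustrates how elasticities of the kind quoted above are read off a log-log dynamic specification, reusing the quoted elasticity values purely for illustration.

```python
# Hedged sketch: interpreting elasticities in a log-log dynamic panel specification,
#   log(visits_it) = a + rho*log(visits_i,t-1) + b1*log(gdp_it) + b2*log(rer_it) + u_it
# The coefficients below are the elasticities quoted in the abstract, reused for
# illustration only; this is not the study's estimation code.
b_gdp, b_rer = 0.58, -0.089

for pct_change, beta, label in [(0.10, b_gdp, "origin GDP per capita"),
                                (0.10, b_rer, "real exchange rate")]:
    effect = beta * pct_change          # log-log model: %change in visits ~ elasticity * %change in x
    print(f"+10% {label} -> {effect:+.2%} predicted change in tourist visits")
```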

Relevance:

90.00%

Publisher:

Abstract:

In recent years, VAR models have become the main econometric tool for testing whether a relationship between variables can exist and for evaluating the effects of economic policies. This thesis studies three different identification approaches starting from reduced-form VAR models (including the sampling period, the set of endogenous variables, and the deterministic terms). In the VAR case we use the Granger causality test to verify the ability of one variable to predict another; in the case of cointegration we use VECM models to jointly estimate the long-run and short-run coefficients; and in the case of small data sets and overfitting problems we use Bayesian VAR models with impulse response functions and variance decomposition to analyse the effect of shocks on macroeconomic variables. To this end, the empirical studies are carried out using specific time-series data and formulating different hypotheses. Three VAR models are used: first, to study monetary policy decisions and discriminate among the various post-Keynesian theories of monetary policy, in particular the so-called "solvency rule" (Brancaccio and Fontana 2013, 2015) and the nominal GDP rule in the Euro Area (paper 1); second, to extend the evidence on the money endogeneity hypothesis by evaluating the effects of bank securitization on the monetary policy transmission mechanism in the United States (paper 2); and third, to evaluate the effects of ageing on health expenditure in Italy in terms of economic policy implications (paper 3). The thesis is introduced by Chapter 1, which outlines the context, motivation, and purpose of this research, while the structure and summary, as well as the main results, are described in the remaining chapters. Chapter 2 examines, using a first-difference VAR model with quarterly Euro Area data, whether monetary policy decisions can be interpreted in terms of a "monetary policy rule", with specific reference to the so-called "nominal GDP targeting rule" (McCallum 1988; Hall and Mankiw 1994; Woodford 2012). The results highlight a causal relationship running from the gap between the growth rates of nominal GDP and target GDP to changes in three-month market interest rates. The same analysis does not appear to confirm the existence of a significant causal relationship in the opposite direction, from changes in the market interest rate to the gap between the growth rates of nominal GDP and target GDP. Similar results were obtained by replacing the market interest rate with the ECB refinancing rate. This confirmation of only one of the two directions of causality does not support an interpretation of monetary policy based on the nominal GDP targeting rule and raises more general doubts about the applicability of the Taylor rule and of all conventional monetary policy rules to the case in question. The results instead appear more in line with other possible approaches, such as those based on certain post-Keynesian and Marxist analyses of monetary theory, and more specifically with the so-called "solvency rule" (Brancaccio and Fontana 2013, 2015). These lines of research challenge the simplistic view that the scope of monetary policy is the stabilization of inflation, real GDP, or nominal income around a "natural" equilibrium level. Rather, they suggest that central banks actually pursue a more complex objective, namely the regulation of the financial system, with particular reference to the relationships between creditors and debtors and the relative solvency of economic units. Chapter 3 analyses the supply of loans, considering the endogeneity of money arising from banks' securitization activity over the period 1999-2012. Although much of the literature investigates the endogeneity of the money supply, this approach has rarely been adopted to investigate short- and long-run money endogeneity with a study of the United States during the two main crises: the bursting of the dot-com bubble (1998-1999) and the sub-prime mortgage crisis (2008-2009). In particular, the effects of financial innovation on the lending channel are considered using the securitization-adjusted loan series, in order to test whether the US banking system is encouraged to seek cheaper sources of funding, such as securitization, under restrictive monetary policy (Altunbas et al., 2009). The analysis is based on the monetary aggregates M1 and M2. Using VECM models, we examine a long-run relationship among the variables in levels and evaluate the effects of the money supply by analysing how much monetary policy affects short-run deviations from the long-run relationship. The results show that securitization influences the impact of loans on M1 and M2. This implies that the money supply is endogenous, confirming the structuralist approach and highlighting that economic agents are motivated to increase securitization as a precautionary hedge against monetary policy shocks. Chapter 4 investigates the relationship between per capita health expenditure, per capita GDP, the ageing index, and life expectancy in Italy over the period 1990-2013, using Bayesian VAR models and annual data extracted from the OECD and Eurostat databases. The impulse response functions and the variance decomposition highlight positive relationships: from per capita GDP to per capita health expenditure, from life expectancy to health expenditure, and from the ageing index to per capita health expenditure. The impact of ageing on health expenditure is more significant than that of the other variables. Overall, our results suggest that disabilities closely linked to ageing may be the main driver of health expenditure in the short-to-medium run. Good healthcare management helps improve patient well-being without increasing total health expenditure. However, policies that improve the health status of older people may be needed to lower per capita demand for health and social services.
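
As a generic illustration of the Granger-causality step described above (not the thesis code or data), the sketch below runs statsmodels' test on simulated first-differenced series; the variable names are placeholders for the nominal GDP gap and the three-month market rate.

```python
# Hedged sketch: Granger causality test between two simulated quarterly series,
# standing in for the nominal GDP gap and changes in the 3-month market rate.
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(3)
n = 80
gdp_gap = rng.normal(size=n)
rate_change = np.empty(n)
rate_change[0] = rng.normal()
for t in range(1, n):                        # rate changes respond to the lagged gap
    rate_change[t] = 0.5 * gdp_gap[t - 1] + rng.normal(scale=0.5)

# Null hypothesis: the second column does NOT Granger-cause the first column.
data = np.column_stack([rate_change, gdp_gap])
res = grangercausalitytests(data, maxlag=4)
print("p-value (lag 1, ssr F-test):", res[1][0]["ssr_ftest"][1])
```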

Relevance:

90.00%

Publisher:

Abstract:

The objective of this research was to evaluate genetic aspects related to in vitro embryo production in the Guzerá breed. The first study focused on estimating genetic and phenotypic (co)variances for traits related to embryo production and on detecting a possible association with age at first calving (AFC). Low to moderate heritabilities were detected for traits related to oocyte and embryo production. There was a weak genetic association between traits linked to artificial reproduction and age at first calving. The second study evaluated genetic and inbreeding trends in a Guzerá population in Brazil. Donors and in vitro produced embryos were treated as two subpopulations in order to compare differences in annual genetic change and in the inbreeding coefficient. The annual trend of the inbreeding coefficient (F) was higher for the general population, with a quadratic effect being detected. However, the mean F for the embryo subpopulation was higher than in the general population and among donors. Higher annual genetic gain was observed for age at first calving and for milk yield (305 days) among in vitro produced embryos than among donors or in the general population. The third study examined the effects of the inbreeding coefficients of the donor, of the sire (used in in vitro fertilization), and of the embryos on in vitro embryo production outcomes in the Guzerá breed. An effect of donor and embryo inbreeding on the studied traits was detected. The fourth (and final) study was designed to compare the suitability of linear and generalized mixed models under the Restricted Maximum Likelihood (REML) method and their adequacy for discrete variables. Four hierarchical models assuming different distributions were fitted to the count data found in the database. Inference was based on residual diagnostics and on comparison of ratios between variance components for the models fitted to each variable. Poisson models outperformed both the linear model (with and without transformation of the variable) and the negative binomial model in goodness of fit and predictive ability, despite clear differences observed in the distributions of the variables. Among the models tested, the worst goodness of fit was obtained for the linear model with a logarithmic transformation (log10(X + 1)) of the response variable.
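
Since the fourth study compares Poisson, negative binomial, and log-transformed linear models for count data, a generic illustration of that kind of comparison is sketched below with statsmodels on simulated counts. It is not the study's hierarchical REML analysis, which also included random effects; only the distributional comparison is shown, on made-up data.

```python
# Hedged sketch: fixed-effects-only comparison of a Poisson GLM, a negative
# binomial GLM, and a log10(X+1) linear model on simulated oocyte/embryo counts.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 300
x = rng.normal(size=n)                          # e.g. a donor-level covariate (invented)
mu = np.exp(1.0 + 0.4 * x)
counts = rng.poisson(mu)

X = sm.add_constant(x)
poisson = sm.GLM(counts, X, family=sm.families.Poisson()).fit()
negbin  = sm.GLM(counts, X, family=sm.families.NegativeBinomial()).fit()
loglin  = sm.OLS(np.log10(counts + 1), X).fit()

print("Poisson deviance:", poisson.deviance)
print("NegBin  deviance:", negbin.deviance)
print("Log-linear R^2  :", loglin.rsquared)
```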

Relevance:

90.00%

Publisher:

Abstract:

Observations of magnetars and some of the high magnetic field pulsars have shown that their thermal luminosity is systematically higher than that of classical radio-pulsars, thus confirming the idea that magnetic fields are involved in their X-ray emission. Here we present the results of 2D simulations of the fully coupled evolution of temperature and magnetic field in neutron stars, including the state-of-the-art kinetic coefficients and, for the first time, the important effect of the Hall term. After gathering and thoroughly re-analysing in a consistent way all the best available data on isolated, thermally emitting neutron stars, we compare our theoretical models to a data sample of 40 sources. We find that our evolutionary models can explain the phenomenological diversity of magnetars, high-B radio-pulsars, and isolated nearby neutron stars by only varying their initial magnetic field, mass and envelope composition. Nearly all sources appear to follow the expectations of the standard theoretical models. Finally, we discuss the expected outburst rates and the evolutionary links between different classes. Our results constitute a major step towards the grand unification of the isolated neutron star zoo.

Relevance:

90.00%

Publisher:

Abstract:

Many studies on birds focus on the collection of data through an experimental design suitable for investigation in a classical analysis of variance (ANOVA) framework. Although many findings are confirmed by one or more experts, expert information is rarely used in conjunction with the survey data to enhance the explanatory and predictive power of the model. We explore this neglected aspect of ecological modelling through a study on Australian woodland birds, focusing on the potential impact of different intensities of commercial cattle grazing on bird density in woodland habitat. We examine a number of Bayesian hierarchical random effects models, which cater for overdispersion and a high frequency of zeros in the data, using WinBUGS, and explore the variation between and within different grazing regimes and species. The impact and value of expert information is investigated through the inclusion of priors that reflect the experience of 20 experts in the field of bird responses to disturbance. Results indicate that expert information moderates the survey data, especially in situations where there are little or no data. When experts agreed, credible intervals for predictions were tightened considerably. When experts failed to agree, results were similar to those evaluated in the absence of expert information. Overall, we found that without expert opinion our knowledge was quite weak. The fact that the survey data are, in general, quite consistent with expert opinion shows that we do know something about birds and grazing, and that we could learn much faster if this approach were used more widely in ecology, where data are scarce. Copyright (c) 2005 John Wiley & Sons, Ltd.
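
The WinBUGS hierarchical models are not reproduced here; the toy sketch below only illustrates the qualitative point that an informative expert prior moderates sparse count data, using a conjugate Gamma-Poisson update for a bird density rate. All numbers are invented.

```python
# Hedged toy example: a Gamma prior (standing in for elicited expert opinion on
# bird density per site visit) updated with sparse Poisson count data, showing
# how the prior dominates when data are few and washes out as data accumulate.
import numpy as np
from scipy import stats

prior_a, prior_b = 8.0, 4.0          # expert prior: mean 2 birds/visit (invented)
counts = np.array([0, 1])            # very little survey data at a heavily grazed site

post_a = prior_a + counts.sum()
post_b = prior_b + len(counts)
posterior = stats.gamma(post_a, scale=1.0 / post_b)

print("prior mean    :", prior_a / prior_b)
print("data mean     :", counts.mean())
print("posterior mean:", posterior.mean(), "95% CI:", posterior.interval(0.95))
```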

Relevance:

90.00%

Publisher:

Abstract:

Fuzzy data has grown to be an important factor in data mining. Whenever uncertainty exists, simulation can be used as a model. Simulation is very flexible, although it can involve significant levels of computation. This article discusses fuzzy decision-making using the grey related analysis method. Fuzzy models are expected to better reflect decision-making uncertainty, at some cost in accuracy relative to crisp models. Monte Carlo simulation is used to incorporate experimental levels of uncertainty into the data and to measure the impact of fuzzy decision tree models using categorical data. Results are compared with decision tree models based on crisp continuous data.
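
The article's grey related analysis is not reproduced here; the sketch below only illustrates the Monte Carlo element described above (perturbing inputs to mimic fuzziness and comparing decision-tree accuracy against the crisp data) with scikit-learn on simulated data.

```python
# Hedged sketch: decision trees fitted to crisp data vs. Monte Carlo perturbed
# ("fuzzified") data, comparing held-out accuracy; simulated data, not the article's.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=600, n_features=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

crisp_acc = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr).score(X_te, y_te)

rng = np.random.default_rng(0)
noisy_accs = []
for _ in range(100):                          # Monte Carlo replications of uncertainty
    X_noisy = X_tr + rng.normal(scale=0.3, size=X_tr.shape)
    model = DecisionTreeClassifier(random_state=0).fit(X_noisy, y_tr)
    noisy_accs.append(model.score(X_te, y_te))

print("crisp accuracy:", crisp_acc)
print("mean accuracy under simulated uncertainty:", np.mean(noisy_accs))
```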

Relevance:

90.00%

Publisher:

Abstract:

Gaussian Processes provide good prior models for spatial data, but can be too smooth. In many physical situations there are discontinuities along bounding surfaces, for example fronts in near-surface wind fields. We describe a modelling method for such a constrained discontinuity and demonstrate how to infer the model parameters in wind fields with MCMC sampling.
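
The authors' constrained-discontinuity model and MCMC scheme are not given in the abstract; the sketch below only illustrates the underlying contrast, a smooth GP prior versus a covariance that is cut across an assumed front location, by sampling from both priors with NumPy. The kernel, length scale, and front position are all assumptions.

```python
# Hedged sketch: samples from a smooth squared-exponential GP prior versus a
# prior whose covariance is zeroed across an assumed front at x = 0, so the
# two sides are independent and a discontinuity can appear there.
import numpy as np

def sq_exp(x1, x2, ell=0.3):
    return np.exp(-0.5 * (x1[:, None] - x2[None, :]) ** 2 / ell**2)

x = np.linspace(-1, 1, 200)
K_smooth = sq_exp(x, x)

side = np.sign(x)                                        # side of the front for each point
K_front = K_smooth * (side[:, None] == side[None, :])    # no correlation across the front

rng = np.random.default_rng(5)
jitter = 1e-6 * np.eye(len(x))
f_smooth = rng.multivariate_normal(np.zeros(len(x)), K_smooth + jitter)
f_front = rng.multivariate_normal(np.zeros(len(x)), K_front + jitter)

print("largest jump between neighbours (smooth prior):", abs(np.diff(f_smooth)).max())
print("largest jump between neighbours (front prior): ", abs(np.diff(f_front)).max())
```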

Relevance:

90.00%

Publisher:

Abstract:

The book aims to introduce the reader to DEA in the most accessible manner possible. It is specifically aimed at those who have had no prior exposure to DEA and wish to learn its essentials, how it works, its key uses, and the mechanics of using it. The latter includes using DEA software. Students on degree or training courses will find the book especially helpful. The same is true of practitioners engaging in comparative efficiency assessments and performance management within their organisation. Examples are used throughout the book to help the reader consolidate the concepts covered. Table of contents: List of Tables. List of Figures. Preface. Abbreviations. 1. Introduction to Performance Measurement. 2. Definitions of Efficiency and Related Measures. 3. Data Envelopment Analysis Under Constant Returns to Scale: Basic Principles. 4. Data Envelopment Analysis under Constant Returns to Scale: General Models. 5. Using Data Envelopment Analysis in Practice. 6. Data Envelopment Analysis under Variable Returns to Scale. 7. Assessing Policy Effectiveness and Productivity Change Using DEA. 8. Incorporating Value Judgements in DEA Assessments. 9. Extensions to Basic DEA Models. 10. A Limited User Guide for Warwick DEA Software. Author Index. Topic Index. References.
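
The book's software is not shown here; as a loose companion to the constant-returns-to-scale chapters listed above, the sketch below solves the basic input-oriented CCR DEA efficiency score for one unit as a linear programme with SciPy, on invented input/output data. It is not the Warwick DEA Software mentioned in the book.

```python
# Hedged sketch: input-oriented CCR (constant returns to scale) DEA efficiency of
# one decision-making unit, solved as an LP with scipy.optimize.linprog.
import numpy as np
from scipy.optimize import linprog

X = np.array([[2.0, 3.0, 4.0, 5.0]])        # 1 input  x 4 units (invented)
Y = np.array([[1.0, 2.0, 3.0, 3.0]])        # 1 output x 4 units (invented)
j0 = 0                                      # unit whose efficiency we evaluate

n = X.shape[1]
# Decision variables: z = [theta, lambda_1, ..., lambda_n]; minimize theta.
c = np.r_[1.0, np.zeros(n)]
# Inputs:  X @ lam - theta * x_j0 <= 0      Outputs: -Y @ lam <= -y_j0
A_ub = np.vstack([np.hstack([-X[:, [j0]], X]),
                  np.hstack([np.zeros((Y.shape[0], 1)), -Y])])
b_ub = np.r_[np.zeros(X.shape[0]), -Y[:, j0]]
bounds = [(None, None)] + [(0, None)] * n

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print("CCR efficiency of unit", j0, "=", res.x[0])
```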
