963 results for Count data models
Abstract:
Integrated master's dissertation in Information Systems Engineering and Management
Abstract:
This study aimed at identifying clinical factors for predicting hematologic toxicity after radioimmunotherapy with (90)Y-ibritumomab tiuxetan or (131)I-tositumomab in clinical practice. Hematologic data were available from 14 non-Hodgkin lymphoma patients treated with (90)Y-ibritumomab tiuxetan and 18 who received (131)I-tositumomab. The percentage baseline at nadir and 4 wk post nadir and the time to nadir were selected as the toxicity indicators for both platelets and neutrophils. Multiple linear regression analysis was performed to identify significant predictors (P < 0.05) of each indicator. For both platelets and neutrophils, pooled and separate analyses of (90)Y-ibritumomab tiuxetan and (131)I-tositumomab data yielded the time elapsed since the last chemotherapy as the only significant predictor of the percentage baseline at nadir. The extent of bone marrow involvement was not a significant factor in this study, possibly because of the short time elapsed since the last chemotherapy of the 7 patients with bone marrow involvement. Because both treatments were designed to deliver a comparable bone marrow dose, this factor also was not significant. None of the 14 factors considered was predictive of the time to nadir. The R(2) value for the model predicting percentage baseline at nadir was 0.60 for platelets and 0.40 for neutrophils. This model predicted the platelet and neutrophil toxicity grade to within ±1 for 28 and 30 of the 32 patients, respectively. For the 7 patients predicted with grade I thrombocytopenia, 6 of whom had actual grade I-II, dosing might be increased to improve treatment efficacy. The elapsed time since the last chemotherapy can be used to predict hematologic toxicity and customize the current dosing method in radioimmunotherapy.
Abstract:
It has been argued that by truncating the sample space of the negative binomial and of the inverse Gaussian-Poisson mixture models at zero, one is allowed to extend the parameter space of the model. Here that is proved to be the case for the more general three-parameter Tweedie-Poisson mixture model. It is also proved that the distributions in the extended part of the parameter space are not the zero truncation of mixed Poisson distributions and that, other than for the negative binomial, they are not mixtures of zero-truncated Poisson distributions either. By extending the parameter space one can improve the fit when the frequency of one is larger and the right tail is heavier than is allowed by the unextended model. Considering the extended model also allows one to use the basic maximum-likelihood-based inference tools when parameter estimates fall in the extended part of the parameter space, and hence when the m.l.e. does not exist under the unextended model. This extended truncated Tweedie-Poisson model is proved to be useful in the analysis of word and species frequency count data.
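For the negative binomial special case mentioned above, the idea can be sketched in a few lines. This is only an illustration under assumptions that are not in the abstract: the (size `r`, probability `p`) parameterization, the function name `truncated_nb_pmf`, and the extended range `-1 < r < 0` commonly used for the extended truncated negative binomial. It does not cover the three-parameter Tweedie-Poisson model itself.

```python
def truncated_nb_pmf(k, r, p):
    """P(X = k), k >= 1, for a zero-truncated negative binomial.

    Assumed parameterization (not from the abstract):
      r: size parameter; r > 0 is the ordinary case,
         -1 < r < 0 is the extended part of the parameter space
      p: probability parameter, 0 < p < 1
    """
    if not (0.0 < p < 1.0) or r <= -1.0 or r == 0.0:
        raise ValueError("need 0 < p < 1 and r in (-1, 0) or r > 0")
    if k < 1:
        return 0.0
    # coef = Gamma(k + r) / (k! * Gamma(r)), built by recurrence
    # so that no large gamma values are ever evaluated.
    coef = 1.0
    for j in range(1, k + 1):
        coef *= (j - 1 + r) / j
    # For -1 < r < 0 both coef and (1 - p**r) are negative,
    # so the ratio is still a valid (positive) probability.
    return coef * (1.0 - p) ** k * p ** r / (1.0 - p ** r)


if __name__ == "__main__":
    for r in (2.0, -0.5):
        total = sum(truncated_nb_pmf(k, r, 0.5) for k in range(1, 200))
        print(r, total)
```

With a negative `r` the untruncated formula would not define a distribution, yet the truncated probabilities remain positive and sum to one, which is the sense in which truncation extends the parameter space.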
Abstract:
Failure to detect a species in an area where it is present is a major source of error in biological surveys. We assessed whether it is possible to optimize single-visit biological monitoring surveys of highly dynamic freshwater ecosystems by framing them a priori within a particular period of time. Alternatively, we also searched for the optimal number of visits and when they should be conducted. We developed single-species occupancy models to estimate the monthly probability of detection of pond-breeding amphibians during a four-year monitoring program. Our results revealed that detection probability was species-specific and changed among sampling visits within a breeding season and also among breeding seasons. Therefore, the optimization of biological surveys with minimal survey effort (a single visit) is not feasible, as it proves impossible to select a priori an adequate sampling period that remains robust across years. Alternatively, a two-survey combination at the beginning of the sampling season yielded optimal results and constituted an acceptable compromise between sampling efficacy and survey effort. Our study provides evidence of the variability and uncertainty that likely affect the efficacy of monitoring surveys, highlighting the need for repeated sampling in both ecological studies and conservation management.
Abstract:
Background: Most mortality atlases show static maps from count data aggregated over time. This procedure has several methodological problems and serious limitations for decision making in Public Health. The evaluation of health outcomes, including mortality, should be approached from a dynamic time perspective that is specific for each gender and age group. At the moment, research in Spain does not provide a dynamic image of the population’s mortality status from a spatio-temporal point of view. The aim of this paper is to describe the spatial distribution of mortality from all causes in small areas of Andalusia (Southern Spain) and its evolution over time from 1981 to 2006. Methods: A small-area ecological study was devised using the municipality as the unit of analysis. Two spatiotemporal hierarchical Bayesian models were estimated for each age group and gender. One of these was used to estimate the specific mortality rate, together with its time trends, and the other to estimate the specific rate ratio for each municipality compared with Spain as a whole. Results: More than 97% of the municipalities showed a diminishing or flat mortality trend in all gender and age groups. In 2006, over 95% of municipalities showed male and female specific mortality rates similar to or significantly lower than Spanish rates for all age groups below 65. Systematically, municipalities in Western Andalusia showed significant male and female mortality excess from 1981 to 2006 only in age groups over 65. Conclusions: The study shows a dynamic geographical distribution of mortality, with a different pattern for each year, gender and age group. This information will contribute towards a reflection on the past, present and future of mortality in Andalusia.
Abstract:
Until now, mortality atlases have been static. Most of them describe the geographical distribution of mortality using count data aggregated over time and standardized mortality rates. However, this methodology has several limitations. Count data aggregated over time produce a bias in the estimation of death rates. Moreover, this practice makes it difficult to study temporal changes in the geographical distribution of mortality. In addition, using standardized mortality hampers the checking of differences in mortality among groups. The Interactive Mortality Atlas in Andalusia (AIMA) is an alternative to conventional static atlases. It is a dynamic Geographical Information System that allows users to visualize on a website more than 12.000 maps and 338.00 graphics related to the spatio-temporal distribution of the main causes of death in Andalusia by age and sex groups from 1981. The objective of this paper is to describe the methods used for AIMA development, to show its technical specifications and to present its interactivity. The system is available from the link products in www.demap.es. AIMA is the first interactive GIS with these characteristics that has been developed in Spain. Spatio-temporal Hierarchical Bayesian Models were used for statistical data analysis. The results were integrated into the website using a PHP environment and dynamic cartography in Flash. Thematic maps in AIMA demonstrate that the geographical distribution of mortality is dynamic, with differences among year, age and sex groups. The information currently provided by AIMA, together with future updates, will contribute to reflection on the past, the present and the future of population health in Andalusia.
Abstract:
Background Estimated cancer mortality statistics were published for the years 2011 and 2012 for the European Union (EU) and its six most populous countries. Patients and methods Using logarithmic Poisson count data joinpoint models and the World Health Organization mortality and population database, we estimated numbers of deaths and age-standardized (world) mortality rates (ASRs) in 2013 from all cancers and selected cancers. Results The 2013 predicted number of cancer deaths in the EU is 1 314 296 (737 747 men and 576 489 women). Between 2009 and 2013, all cancer ASRs are predicted to fall by 6% to 140.1/100 000 in men, and by 4% to 85.3/100 000 in women. The ASRs per 100 000 are 6.6 men and 2.9 women for stomach, 16.7 men and 9.5 women for intestines, 8.0 men and 5.5 women for pancreas, 37.1 men and 13.9 women for lung, 10.5 men for prostate, 14.6 women for breast, and 4.7 for uterine cancer, and 4.2 and 2.6 for leukaemia. Recent trends are favourable except for pancreatic cancer and lung cancer in women. Conclusions Favourable trends will continue in 2013. Pancreatic cancer has become the fourth cause of cancer death in both sexes, while in a few years lung cancer will likely become the first cause of cancer mortality in women as well, overtaking breast cancer.
Abstract:
The log-ratio methodology makes available powerful tools for analyzing compositional data. Nevertheless, the use of this methodology is only possible for those data sets without null values. Consequently, in those data sets where zeros are present, a previous treatment becomes necessary. Recent advances in the treatment of compositional zeros have centered especially on zeros of a structural nature and on rounded zeros. These tools do not contemplate the particular case of count compositional data sets with null values. In this work we deal with "count zeros" and we introduce a treatment based on a mixed Bayesian-multiplicative estimation. We use the Dirichlet probability distribution as a prior and we estimate the posterior probabilities. Then we apply a multiplicative modification for the non-zero values. We present a case study where this new methodology is applied.
Key words: count data, multiplicative replacement, composition, log-ratio analysis
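The two steps described above (a Dirichlet posterior estimate for the zero parts, then a multiplicative adjustment of the non-zero parts) can be sketched roughly as follows. The function name `bayes_mult_replace`, the default uniform prior and the default prior strength equal to the number of parts are assumptions made for illustration; the abstract does not specify which prior the authors use.

```python
def bayes_mult_replace(counts, prior=None, strength=None):
    """Replace count zeros in one vector of counts and return proportions.

    counts:   non-negative integer counts for the D parts
    prior:    assumed prior expected proportions (default: uniform, 1/D)
    strength: assumed Dirichlet prior strength (default: D)
    """
    d = len(counts)
    n = sum(counts)
    if n <= 0:
        raise ValueError("at least one count must be positive")
    if prior is None:
        prior = [1.0 / d] * d
    if strength is None:
        strength = float(d)

    # Step 1: posterior estimate of the proportion for each zero part.
    zero_est = [
        strength * t / (n + strength) if c == 0 else 0.0
        for c, t in zip(counts, prior)
    ]
    zero_mass = sum(zero_est)

    # Step 2: shrink the observed non-zero proportions multiplicatively
    # so that the whole vector still sums to one.
    return [
        z if c == 0 else (c / n) * (1.0 - zero_mass)
        for c, z in zip(counts, zero_est)
    ]


if __name__ == "__main__":
    print(bayes_mult_replace([0, 3, 7]))
```

Because the non-zero parts are all scaled by the same factor, their ratios are unchanged, so log-ratios among the originally observed parts are preserved.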
Abstract:
In recent years there has been an explosive growth in the development of adaptive and data-driven methods. One of the efficient and data-driven approaches is based on statistical learning theory (Vapnik 1998). The theory is based on the Structural Risk Minimisation (SRM) principle and has a solid statistical background. When applying SRM we are trying not only to reduce training error (to fit the available data with a model), but also to reduce the complexity of the model and to reduce generalisation error. Many nonlinear learning procedures recently developed in neural networks and statistics can be understood and interpreted in terms of the structural risk minimisation inductive principle. A recent methodology based on SRM is called Support Vector Machines (SVM). At present SLT is still under intensive development and SVM find new areas of application (www.kernel-machines.org). SVM develop robust and nonlinear data models with excellent generalisation abilities, which is very important both for monitoring and forecasting. SVM are extremely good when the input space is high dimensional and the training data set is not big enough to develop a corresponding nonlinear model. Moreover, SVM use only support vectors to derive decision boundaries. This opens a way to sampling optimization, estimation of noise in data, quantification of data redundancy, etc. A presentation of SVM for spatially distributed data is given in (Kanevski and Maignan 2004).
Abstract:
BACKGROUND: Estimating current cancer mortality figures is important for defining priorities for prevention and treatment. MATERIALS AND METHODS: Using logarithmic Poisson count data joinpoint models on mortality and population data from the World Health Organization database, we estimated numbers of deaths and age-standardized rates in 2012 from all cancers and selected cancer sites for the whole European Union (EU) and its six most populous countries. RESULTS: Cancer deaths in the EU in 2012 are estimated to be 1 283 101 (717 398 men and 565 703 women) corresponding to standardized overall cancer death rates of 139/100 000 men and 85/100 000 women. The fall from 2007 was 10% in men and 7% in women. In men, declines are predicted for stomach (-20%), leukemias (-11%), lung and prostate (-10%) and colorectal (-7%) cancers, and for stomach (-23%), leukemias (-12%), uterus and colorectum (-11%) and breast (-9%) in women. Almost stable rates are expected for pancreatic cancer (+2-3%) and increases for female lung cancer (+7%). Younger women show the greatest falls in breast cancer mortality rates in the EU (-17%), and declines are expected in all individual countries, except Poland. CONCLUSION: Apart from lung cancer in women and pancreatic cancer, continuing falls are expected in mortality from major cancers in the EU.
Abstract:
OBJECTIVES: Patients with inflammatory bowel disease (IBD) have a high resource consumption, with considerable costs for the healthcare system. In a system with sparse resources, treatment is influenced not only by clinical judgement but also by resource consumption. We aimed to determine the resource consumption of IBD patients and to identify its significant predictors. MATERIALS AND METHODS: Data from the prospective Swiss Inflammatory Bowel Disease Cohort Study were analysed for the resource consumption endpoints hospitalization and outpatient consultations at enrolment [1187 patients; 41.1% ulcerative colitis (UC), 58.9% Crohn's disease (CD)] and at 1-year follow-up (794 patients). Predictors of interest were chosen through an expert panel and a review of the relevant literature. Logistic regressions were used for binary endpoints, and negative binomial regressions and zero-inflated Poisson regressions were used for count data. RESULTS: For CD, fistula, use of biologics and disease activity were significant predictors for hospitalization days (all P-values <0.001); age, sex, steroid therapy and biologics were significant predictors for the number of outpatient visits (P=0.0368, 0.023, 0.0002, 0.0003, respectively). For UC, biologics, C-reactive protein, smoke quitters, age and sex were significantly predictive for hospitalization days (P=0.0167, 0.0003, 0.0003, 0.0076 and 0.0175 respectively); disease activity and immunosuppressive therapy predicted the number of outpatient visits (P=0.0009 and 0.0017, respectively). The results of multivariate regressions are shown in detail. CONCLUSION: Several highly significant clinical predictors for resource consumption in IBD were identified that might be considered in medical decision-making. In terms of resource consumption and its predictors, CD and UC show a different behaviour.
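As a reminder of what distinguishes a zero-inflated Poisson model from a plain Poisson model for counts such as hospitalization days, the following sketch evaluates the zero-inflated Poisson probability mass function for fixed parameters. It is not the regression used in the study: there are no covariates here, and the names `zip_pmf`, `lam` and `pi` are assumptions chosen for illustration.

```python
import math


def zip_pmf(k, lam, pi):
    """P(X = k) under a zero-inflated Poisson distribution.

    lam: mean of the Poisson component (lam > 0)
    pi:  probability of a structural (extra) zero, 0 <= pi < 1
    """
    if k < 0:
        return 0.0
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        # A zero arises either structurally or from the Poisson part.
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


if __name__ == "__main__":
    lam, pi = 2.5, 0.3
    mean = sum(k * zip_pmf(k, lam, pi) for k in range(100))
    print(zip_pmf(0, lam, pi), mean)
```

The mean of this distribution is `(1 - pi) * lam` and its variance is `(1 - pi) * lam * (1 + pi * lam)`, so any `pi > 0` yields more zeros and more dispersion than a Poisson distribution with the same mean.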
Abstract:
While general equilibrium theories of trade stress the role of third-country effects, little work has been done in the empirical foreign direct investment (FDI) literature to test such spatial linkages. This paper aims to provide further insights into long-run determinants of Spanish FDI by considering not only bilateral but also spatially weighted third-country determinants. The few studies carried out so far have focused on FDI flows in a limited number of countries. However, Spanish FDI outflows have risen dramatically since 1995 and today account for a substantial part of global FDI. Therefore, we estimate recently developed Spatial Panel Data models by Maximum Likelihood (ML) procedures for Spanish outflows (1993-2004) to top-50 host countries. After controlling for unobservable effects, we find that spatial interdependence matters and provide evidence consistent with New Economic Geography (NEG) theories of agglomeration, mainly due to complex (vertical) FDI motivations. Spatial Error Models estimations also provide illuminating results regarding the transmission mechanism of shocks.
Abstract:
MicroEconometria is a statistical and econometric package for estimating single-equation models: 1- Simple and multiple regression: residual analysis, influence and outlier analysis, multicollinearity diagnostics, robust estimation, prediction, stability diagnostics, bootstrap. 2- Panel regression: fixed effects, random effects and combined effects. 3- Logit and probit regression. 4- Censored regression: tobit and the Heckman selection model. 5- Multinomial regression. 6- Poisson regression: 'count data' model. 7- Indices with income and wealth variables and taxes and transfers. It generates a report for each of the options covered, presenting the estimation results and including the relevant graphical output. The program's input is any database, contained in a Microsoft EXCEL workbook, in which the endogenous variable and the exogenous variables of the model used can be identified.
Abstract:
Especially in global enterprises, key data is fragmented across multiple Enterprise Resource Planning (ERP) systems. The data is therefore inconsistent, fragmented and redundant across the various systems. Master Data Management (MDM) is a concept that creates cross-references between customers, suppliers and business units, and enables corporate hierarchies and structures. The overall goal of MDM is an enterprise-wide consistent data model that enables analysis and reporting of customer and supplier data. The goal of the study was to define the properties and success factors of a master data system. The theoretical background was based on the literature, and the case consisted of enterprise-specific needs and demands. The theoretical part presents the concept, background and principles of MDM, followed by the phases of the system planning and implementation project. The case part covers the background, the definition of the as-is situation, the definition of the project and the evaluation criteria, and concludes with the key results of the thesis. The final chapter, Conclusions, combines common principles with the results of the case. The case part divides the important factors of the system into success factors, technical requirements and business benefits. To clarify the project and secure funding for it, the business benefits have to be defined and their realization monitored. The thesis identified six success factors for the MDM system: a well-defined business case; data management and monitoring; defined and maintained data models and structures; customer and supplier data governance, delivery and quality; commitment; and continuous communication with the business. Technical requirements emerged several times during the thesis and therefore cannot be ignored in the project. The Conclusions chapter goes through these factors on a general level. The success factors and technical requirements are related to the essentials of MDM: Governance, Action and Quality.
This chapter could be used as guidance in a master data management project.
Approximation of the posterior distribution of a hierarchical mixed-effects Gamma-Poisson model
Abstract:
The method we present for modelling so-called count data, or Poisson data, is based on the procedure named Poisson Regression Interactive Multilevel Modeling (PRIMM), developed by Christiansen and Morris (1997). In the PRIMM method, the Poisson regression includes only fixed effects, whereas our model additionally incorporates random effects. As in Christiansen and Morris (1997), the model studied performs inference based on analytical approximations of the posterior distributions of the parameters, thereby avoiding computational methods such as Markov chain Monte Carlo (MCMC). The approximations are based on Laplace's method and on the asymptotic theory underlying the normal approximation to posterior distributions. The Poisson regression parameters are estimated by maximizing their posterior density via the Newton-Raphson algorithm. This study also determines the first two posterior moments of the Poisson parameters, each of which has a posterior distribution that is approximately gamma. Applications to two example data sets verified that this model can be regarded, to some extent, as a generalization of the PRIMM method. Indeed, the model applies to unstratified as well as stratified Poisson data; in the latter case, it includes not only fixed effects but also random effects associated with the strata. Finally, the model is applied to data on several types of adverse events observed among participants in a clinical trial of a quadrivalent vaccine against measles, mumps, rubella and varicella.
The Poisson regression includes the fixed effect corresponding to the treatment/control variable, as well as random effects associated with the biological systems of the human body to which the adverse events considered are attributed.
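To make the role of the gamma posterior and its first two moments concrete, here is a minimal sketch of the simplest conjugate Gamma-Poisson update. It is deliberately not the hierarchical mixed-effects model of the thesis: it assumes a single Poisson rate, a known gamma prior given by shape and rate, and known exposures, and the function name `gamma_poisson_posterior` is hypothetical.

```python
def gamma_poisson_posterior(counts, exposures, prior_shape, prior_rate):
    """Conjugate update for one Poisson rate with a Gamma(shape, rate) prior.

    counts[i] is assumed Poisson with mean rate * exposures[i].
    Returns the posterior shape and rate together with the first two
    posterior moments (mean and variance) of the Poisson rate.
    """
    if len(counts) != len(exposures):
        raise ValueError("counts and exposures must have the same length")
    shape = prior_shape + sum(counts)   # shape grows with the observed events
    rate = prior_rate + sum(exposures)  # rate grows with the total exposure
    mean = shape / rate
    variance = shape / rate ** 2
    return shape, rate, mean, variance


if __name__ == "__main__":
    print(gamma_poisson_posterior([3, 5, 2], [1.0, 1.0, 1.0], 2.0, 1.0))
```

In this simple case the gamma posterior is exact; in the hierarchical setting described in the abstract it is only an approximation, which is why the first two posterior moments are the quantities of interest.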