960 results for Discrete Data Models
Abstract:
Time series regression models are especially suitable in epidemiology for evaluating short-term effects of time-varying exposures on health. The problem is that the potential for confounding in time series regression is very high. Thus, it is important that trend and seasonality are properly accounted for. Our paper reviews the statistical models commonly used in time-series regression, especially those allowing for serial correlation, which makes them potentially useful for selected epidemiological purposes. In particular, we discuss the use of time-series regression for counts using a wide range of Generalised Linear Models as well as Generalised Additive Models. In addition, critical points in using statistical software for GAMs were recently stressed, and reanalyses of time series data on air pollution and health were performed in order to update already published results. Applications are offered through an example on the relationship between asthma emergency admissions and photochemical air pollutants.
Abstract:
Planners in public and private institutions would like coherent forecasts of the components of age-specific mortality, such as causes of death. This has been difficult to achieve because the relative values of the forecast components often fail to behave in a way that is coherent with historical experience. In addition, when the group forecasts are combined the result is often incompatible with an all-groups forecast. It has been shown that cause-specific mortality forecasts are pessimistic when compared with all-cause forecasts (Wilmoth, 1995). This paper abandons the conventional approach of using log mortality rates and forecasts the density of deaths in the life table. Since these values obey a unit sum constraint for both conventional single-decrement life tables (only one absorbing state) and multiple-decrement tables (more than one absorbing state), they are intrinsically relative rather than absolute values across decrements as well as ages. Using the methods of Compositional Data Analysis pioneered by Aitchison (1986), death densities are transformed into the real space so that the full range of multivariate statistics can be applied, then back-transformed to positive values so that the unit sum constraint is honoured. The structure of the best-known single-decrement mortality-rate forecasting model, devised by Lee and Carter (1992), is expressed in compositional form and the results from the two models are compared. The compositional model is extended to a multiple-decrement form and used to forecast mortality by cause of death for Japan.
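The transform and back-transform step described in this abstract can be illustrated with a minimal sketch. It assumes the centred log-ratio (clr) transform, one of the log-ratio transforms from compositional data analysis; the abstract does not say which transform the paper uses, and the function names and death densities below are hypothetical.

```python
import math

def closure(parts):
    """Rescale positive parts so that they sum to 1 (unit sum constraint)."""
    total = sum(parts)
    return [p / total for p in parts]

def clr(composition):
    """Centred log-ratio transform: log of each part minus the mean log."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def clr_inverse(coords):
    """Map clr coordinates back to positive values that sum to 1."""
    return closure([math.exp(c) for c in coords])

# Hypothetical death densities over four age groups (illustrative only).
deaths = closure([0.05, 0.10, 0.25, 0.60])
coords = clr(deaths)            # unconstrained real values, summing to zero
restored = clr_inverse(coords)  # back on the simplex
```

Any forecasting model would operate on `coords`; applying `clr_inverse` to its output guarantees that the forecast densities are positive and sum to one.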
Abstract:
Self-organizing maps (Kohonen 1997) are a type of artificial neural network developed to explore patterns in high-dimensional multivariate data. The conventional version of the algorithm uses the Euclidean metric when adapting the model vectors, which in theory renders the whole methodology incompatible with non-Euclidean geometries. In this contribution we explore the two main aspects of the problem:
1. Whether the conventional approach using the Euclidean metric can yield valid results with compositional data.
2. Whether a modification of the conventional approach, replacing vector sum and scalar multiplication by the canonical operators in the simplex (i.e. perturbation and powering), can converge to an adequate solution.
Preliminary tests showed that both methodologies can be used on compositional data. However, the modified version of the algorithm performs worse than the conventional version, in particular when the data are pathological. Moreover, the conventional approach converges faster to a solution when the data are "well-behaved".
Key words: Self Organizing Map; Artificial Neural networks; Compositional data
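The modified update mentioned in point 2 can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: it takes the usual Euclidean SOM update m + rate * (x - m), with any neighbourhood weighting folded into `rate`, and rewrites it with perturbation and powering. All function names are hypothetical.

```python
def closure(parts):
    """Rescale positive parts so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]

def perturb(x, y):
    """Perturbation: the simplex analogue of vector addition."""
    return closure([a * b for a, b in zip(x, y)])

def power(x, alpha):
    """Powering: the simplex analogue of scalar multiplication."""
    return closure([a ** alpha for a in x])

def som_update(model, sample, rate):
    """Simplex analogue of the Euclidean update m + rate * (x - m)."""
    difference = perturb(sample, power(model, -1.0))  # x "minus" m
    return perturb(model, power(difference, rate))

model = closure([1.0, 1.0, 1.0])
sample = closure([0.2, 0.3, 0.5])
moved = som_update(model, sample, 0.5)  # halfway between model and sample
```

With `rate = 0` the model vector is unchanged and with `rate = 1` it jumps onto the sample, mirroring the behaviour of the Euclidean rule.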
Abstract:
The integration of geophysical data into the subsurface characterization problem has been shown in many cases to significantly improve hydrological knowledge by providing information at spatial scales and locations that is unattainable using conventional hydrological measurement techniques. The investigation of exactly how much benefit can be brought by geophysical data in terms of its effect on hydrological predictions, however, has received considerably less attention in the literature. Here, we examine the potential hydrological benefits brought by a recently introduced simulated annealing (SA) conditional stochastic simulation method designed for the assimilation of diverse hydrogeophysical data sets. We consider the specific case of integrating crosshole ground-penetrating radar (GPR) and borehole porosity log data to characterize the porosity distribution in saturated heterogeneous aquifers. In many cases, porosity is linked to hydraulic conductivity and thus to flow and transport behavior. To perform our evaluation, we first generate a number of synthetic porosity fields exhibiting varying degrees of spatial continuity and structural complexity. Next, we simulate the collection of crosshole GPR data between several boreholes in these fields, and the collection of porosity log data at the borehole locations. The inverted GPR data, together with the porosity logs, are then used to reconstruct the porosity field using the SA-based method, along with a number of other more elementary approaches. Assuming that the grid-cell-scale relationship between porosity and hydraulic conductivity is unique and known, the porosity realizations are then used in groundwater flow and contaminant transport simulations to assess the benefits and limitations of the different approaches.
Abstract:
This analysis was stimulated by a real data analysis problem involving household expenditure data. The full dataset contains expenditure data for a sample of 1224 households. The expenditure is broken down at 2 hierarchical levels: 9 major levels (e.g. housing, food, utilities etc.) and 92 minor levels. There are also 5 factors and 5 covariates at the household level. Not surprisingly, there are a small number of zeros at the major level, but many zeros at the minor level. The question is how best to model the zeros. Clearly, models that try to add a small amount to the zero terms are not appropriate in general, as at least some of the zeros are clearly structural, e.g. alcohol/tobacco for households that are teetotal. The key question then is how to build suitable conditional models. For example, is the sub-composition of spending excluding alcohol/tobacco similar for teetotal and non-teetotal households? In other words, we are looking for sub-compositional independence. Also, what determines whether a household is teetotal? Can we assume that it is independent of the composition? In general, whether a household is teetotal will clearly depend on the household-level variables, so we need to be able to model this dependence. The other tricky question is that, with zeros on more than one component, we need to be able to model dependence and independence of zeros on the different components. Lastly, while some zeros are structural, others may not be; for example, for expenditure on durables, it may be chance as to whether a particular household spends money on durables within the sample period. This would clearly be distinguishable if we had longitudinal data, but may still be distinguishable by looking at the distribution, on the assumption that random zeros will usually be for situations where any non-zero expenditure is not small. While this analysis is based on economic data, the ideas carry over to many other situations, including geological data, where minerals may be missing for structural reasons (similar to alcohol), or missing because they occur only in random regions which may be missed in a sample (similar to the durables).
Abstract:
In this paper we present a novel structure from motion (SfM) approach able to infer 3D deformable models from uncalibrated stereo images. Using a stereo setup dramatically improves the 3D model estimation when the observed 3D shape is mostly deforming without undergoing strong rigid motion. Our approach first calibrates the stereo system automatically and then computes a single metric rigid structure for each frame. Afterwards, these 3D shapes are aligned to a reference view using a RANSAC method in order to compute the mean shape of the object and to select the subset of points on the object which have remained rigid throughout the sequence without deforming. The selected rigid points are then used to compute frame-wise shape registration and to extract the motion parameters robustly from frame to frame. Finally, all this information is used in a global optimization stage with bundle adjustment, which allows us to refine the frame-wise initial solution and also to recover the non-rigid 3D model. We show results on synthetic and real data that demonstrate the performance of the proposed method even when there is no rigid motion in the original sequence.
Using 3D surface datasets to understand landslide evolution: From analogue models to real case study
Abstract:
Early detection of landslide surface deformation with 3D remote sensing techniques, such as TLS, has become a great challenge during the last decade. To improve our understanding of landslide deformation, a series of analogue simulations has been carried out on non-rigid bodies coupled with a 3D digitizer. All these experiments have been carried out under controlled conditions, such as water level and slope angle inclination. We were able to follow the 3D surface deformation undergone by complex landslide bodies from precursory deformation until larger failures. These experiments were the basis for the development of a new algorithm for the quantification of surface deformation using an automatic tracking method on discrete points of the slope surface. To validate the algorithm, comparisons were made between manually obtained results and the surface displacements computed by the algorithm. The outputs will help in understanding 3D deformation during pre-failure stages and failure mechanisms, which are fundamental aspects for the future implementation of 3D remote sensing techniques in early warning systems.
Abstract:
The main objective of this project was to implement 3D visualization of fused models and to apply all possible techniques to perform this fusion. These techniques will be integrated into STARVIEWER, a platform for the visualization and processing of medical data. To achieve the main objective, the following specific objectives were defined: 1- study the visualization algorithms for simple models and analyse the different parameters to be taken into account. 2- extend the selected basic visualization technique so that it supports fused models. 3- evaluate and compare all the implemented methods in order to determine which offers the best visualizations.
Abstract:
We analyze interviewer-related nonresponse differences in face-to-face surveys, distinguishing three types of interviewers: those who have previous experience with the same high-standard cross-sectional survey ("experienced"), those who were chosen by the survey agency to complete refusal conversions ("seniors"), and usual interviewers. The nonresponse components are obtaining household contact, target person contact, and target person cooperation. In addition, we examine whether interviewer homogeneity with respect to these components differs across the three interviewer groups. Data come from the European Social Survey (ESS) contact forms from four countries which participated during the three rounds 2002/04/06 and used the same survey agency, which in turn used, to some extent, the same interviewers. To analyze interviewer effects, we use discrete two-level models. We find some evidence of better performance by both senior and experienced interviewers and indications of greater homogeneity for nonresponse components, especially for those that contain room for improvement. Surprisingly, the senior interviewers do not outperform the experienced ones. We conclude that survey agencies should make more efforts to decrease the comparatively high interviewer turnover.
Abstract:
This is one of the few studies that have explored the value of baseline symptoms and health-related quality of life (HRQOL) in predicting survival in brain cancer patients. Baseline HRQOL scores (from the EORTC QLQ-C30 and the Brain Cancer Module (BN 20)) were examined in 490 newly diagnosed glioblastoma cancer patients for their relationship with overall survival by using Cox proportional hazards regression models. Refined techniques such as the bootstrap re-sampling procedure and the computation of C-indexes and R²-coefficients were used to try to validate the model. Classical analysis, controlled for major clinical prognostic factors, selected cognitive functioning (P=0.0001), global health status (P=0.0055) and social functioning (P<0.0001) as statistically significant prognostic factors of survival. However, several issues call the validity of these findings into question. C-indexes and R²-coefficients, which are measures of the predictive ability of the models, did not exhibit major improvements when adding selected or all HRQOL scores to clinical factors. While classical techniques lead to positive results, more refined analyses suggest that baseline HRQOL scores add relatively little to clinical factors to predict survival. These results may have implications for the future use of HRQOL as a prognostic factor in cancer patients.
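As an illustration of the C-index mentioned in this abstract, here is a minimal sketch of Harrell's concordance index for right-censored survival data. It is not the authors' code: the function name and the toy data are hypothetical, and ties in risk score are counted as half-concordant, which is one common convention.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs ordered correctly by risk.

    A pair (i, j) is comparable when subject i had an observed event
    strictly before subject j's event or censoring time. The pair is
    concordant when i, who failed first, has the higher risk score.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored subject cannot be the earlier failure
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Toy data: survival times, event flags (1 = death, 0 = censored), risks.
times = [1.0, 2.0, 3.0, 4.0]
events = [1, 1, 1, 0]
perfect = concordance_index(times, events, [4.0, 3.0, 2.0, 1.0])
reversed_order = concordance_index(times, events, [1.0, 2.0, 3.0, 4.0])
uninformative = concordance_index(times, events, [1.0, 1.0, 1.0, 1.0])
```

A value of 0.5 corresponds to an uninformative score, so the finding that HRQOL scores add little would show up as a C-index that barely moves when those scores are added to the clinical factors.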
Abstract:
Spatial data on species distributions are available in two main forms, point locations and distribution maps (polygon ranges and grids). The first are often temporally and spatially biased, and too discontinuous, to be useful (untransformed) in spatial analyses. A variety of modelling approaches are used to transform point locations into maps. We discuss the attributes that point location data and distribution maps must satisfy in order to be useful in conservation planning. We recommend that before point location data are used to produce and/or evaluate distribution models, the dataset should be assessed under a set of criteria, including sample size, age of data, environmental/geographical coverage, independence, accuracy, time relevance and (often forgotten) representation of areas of permanent and natural presence of the species. Distribution maps must satisfy additional attributes if used for conservation analyses and strategies, including minimizing commission and omission errors, credibility of the source/assessors and availability for public screening. We review currently available databases for mammals globally and show that they are highly variable in complying with these attributes. The heterogeneity and weakness of spatial data seriously constrain their utility to global and also sub-global scale conservation analyses.
Abstract:
Quantitative or algorithmic trading is the automation of investment decisions following a fixed or dynamic set of rules to determine trading orders. It has grown to account for up to 70% of the trading volume in one of the biggest financial markets, the New York Stock Exchange (NYSE). However, there is not a significant amount of academic literature devoted to it, owing to the private nature of investment banks and hedge funds. This project aims to review the literature and discuss the models available in a subject in which publications are scarce and infrequent. We review the basic and fundamental mathematical concepts needed for modeling financial markets, such as stochastic processes, stochastic integration and basic models for price and spread dynamics, which are necessary for building quantitative strategies. We also contrast these models with real market data sampled at one-minute frequency from the Dow Jones Industrial Average (DJIA). Quantitative strategies try to exploit two types of behavior: trend following or mean reversion. The former is grouped in the so-called technical models and the latter in the so-called pairs trading. Technical models have been discarded by financial theoreticians, but we show that they can be properly cast into a well-defined scientific predictor if the signal generated by them passes the test of being a Markov time. That is, we can tell whether the signal has occurred or not by examining the information up to the current time; or, more technically, if the event is F_t-measurable. On the other hand, the concept of pairs trading or market-neutral strategy is fairly simple. However, it can be cast in a variety of mathematical models, ranging from a method based on a simple Euclidean distance, through a co-integration framework, to stochastic differential equations such as the well-known Ornstein-Uhlenbeck mean-reverting SDE and its variations.
A model for forecasting any economic or financial magnitude could be properly defined with scientific rigor and still lack any economic value and be considered useless from a practical point of view. This is why this project could not be complete without a backtesting of the mentioned strategies. Conducting a useful and realistic backtest is by no means a trivial exercise, since the "laws" that govern financial markets are constantly evolving in time. This is why we emphasize the calibration of the strategies' parameters to adapt to the given market conditions. We find that the parameters of technical models are more volatile than their counterparts from market-neutral strategies, and that calibration must be done at a high sampling frequency to constantly track the current market situation. As a whole, the goal of this project is to provide an overview of a quantitative approach to investment, reviewing basic strategies and illustrating them by means of a backtest with real financial market data. The sources of the data used in this project are Bloomberg for intraday time series and Yahoo! for daily prices. All numeric computations and graphics used and shown in this project were implemented in MATLAB® from scratch as part of this thesis. No other mathematical or statistical software was used.
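The Ornstein-Uhlenbeck process named in this abstract as a model for mean-reverting spreads can be simulated with a short sketch. It assumes the standard form dX = theta * (mu - X) dt + sigma dW and a simple Euler-Maruyama discretisation; the function name and all parameter values are hypothetical, and the thesis itself reports using MATLAB rather than Python.

```python
import math
import random

def simulate_ou(x0, theta, mu, sigma, dt, steps, seed=0):
    """Euler-Maruyama path of dX = theta * (mu - X) dt + sigma dW."""
    rng = random.Random(seed)  # seeded so that the path is reproducible
    path = [x0]
    for _ in range(steps):
        x = path[-1]
        drift = theta * (mu - x) * dt  # pulls the process back towards mu
        shock = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(x + drift + shock)
    return path

# Hypothetical spread starting far from its long-run mean of 1.0.
spread = simulate_ou(x0=0.0, theta=1.0, mu=1.0, sigma=0.2, dt=0.1, steps=100)

# With sigma = 0 only the mean-reverting drift remains.
noiseless = simulate_ou(x0=0.0, theta=1.0, mu=1.0, sigma=0.0, dt=0.1, steps=100)
```

In a pairs-trading setting the simulated variable would stand for the spread between two related assets, and `theta`, `mu` and `sigma` are the parameters that the calibration step would re-estimate as market conditions change.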
Abstract:
Background: The increasing availability of different monoclonal antibodies (mAbs) opens the way to more specific biologic therapy of cancer patients. However, despite the significant success of therapy in breast and ovarian carcinomas with anti-HER2 mAbs as well as in non-Hodgkin B cell lymphomas with anti-CD20 mAbs, certain B cell malignancies such as B chronic lymphocytic leukaemia (B-CLL) respond poorly to anti-CD20 mAb, due to the low surface expression of this molecule. Thus, new mAbs adapted to each type of tumour will help to develop personalised mAb treatment. To this aim, we analyse the biological and therapeutic properties of three mAbs directed against the CD5, CD71 or HLA-DR molecules, which are highly expressed on B-CLL cells. Results: The three mAbs, after purification and radiolabelling, demonstrated high and specific binding capacity to various human leukaemia target cells. Further in vitro analysis showed that mAb anti-CD5 induced neither growth inhibition nor apoptosis, mAb anti-CD71 induced proliferation inhibition with no early sign of cell death, and mAb anti-HLA-DR induced specific cell aggregation, but without evidence of apoptosis. All three mAbs induced various degrees of ADCC by NK cells, as well as phagocytosis by macrophages. Only the anti-HLA-DR mAb induced complement-mediated lysis. Coincubation of different pairs of mAbs did not significantly modify the in vitro results. In contrast with these discrete and heterogeneous in vitro effects, in vivo the three mAbs demonstrated marked anti-tumour efficacy and prolongation of mouse survival in two models of SCID mice, grafted either intraperitoneally or intravenously with the CD5-transfected JOK1-5.3 cells. This cell line was derived from a human hairy cell leukaemia, a type of malignancy known to have biological properties very similar to those of B-CLL, whose cells constitutively express CD5. Interestingly, the combined injection of anti-CD5 with anti-HLA-DR or with anti-CD71 led to longer mouse survival, as compared to single mAb injection, up to complete inhibition of tumour growth in 100% of mice treated with both anti-HLA-DR and anti-CD5. Conclusions: Altogether, these data suggest that the combined use of two mAbs, such as anti-HLA-DR and anti-CD5, may significantly enhance their therapeutic potential.
Abstract:
In this note, we consider claims problems with indivisible goods. Specifically, by applying recursively the P-rights lower bound (Jiménez-Gómez and Marco-Gil (2008)), we ensure the fulfillment of Weak Order Preservation, considered by many authors as a minimal requirement of fairness. Moreover, we retrieve the Discrete Constrained Equal Losses and the Discrete Constrained Equal Awards rules (Herrero and Martínez (2008)). Finally, by the recursive double imposition of a lower and an upper bound, we obtain the average between them. Keywords: Claims problems, Indivisibilities, Order Preservation, Constrained Egalitarian rules, Midpoint. JEL classification: C71, D63, D71.
Abstract:
This paper studies the limits of discrete time repeated games with public monitoring. We solve and characterize the Abreu, Milgrom and Pearce (1991) problem. We find that for the "bad" ("good") news model the lower (higher) magnitude events suggest cooperation, i.e., zero punishment probability, while the higher (lower) magnitude events suggest defection, i.e., punishment with probability one. Public correlation is used to connect these two sets of signals and to make the enforceability constraint bind. The dynamic and limit behavior of the punishment probabilities for variations in ... (the discount rate) and ... (the time interval) are characterized, as well as the limit payoffs for all these scenarios (we also introduce uncertainty in the time domain). The obtained ... limits are, to the best of my knowledge, new. The obtained ... limits coincide with Fudenberg and Levine (2007) and Fudenberg and Olszewski (2011), with the exception that we clearly state the precise informational conditions that cause the limit to converge from above, to converge from below or to degenerate. JEL: C73, D82, D86. KEYWORDS: Repeated Games, Frequent Monitoring, Random Public Monitoring, Moral Hazard, Stochastic Processes.