898 results for Discrete Regression and Qualitative Choice Models
Abstract:
Spectral sensors are a broad class of devices that detect essential information about the environment and materials with a high degree of selectivity. Recently, they have achieved a degree of integration high enough, and an implementation cost low enough, to suit fast, small, and non-invasive monitoring systems. However, the useful information is hidden in the spectra and is difficult to decode, so mathematical algorithms are needed to infer the values of the variables of interest from the acquired data. Among the different families of predictive modeling, Principal Component Analysis and the techniques derived from it can provide very good performance with small computational and memory requirements. For these reasons, they allow the prediction to be implemented even in embedded and autonomous devices. In this thesis, I present four practical applications of these algorithms to the prediction of different variables: moisture of soil, moisture of concrete, freshness of anchovies/sardines, and concentration of gases. In all of these cases, the workflow is the same. First, an acquisition campaign is performed to acquire both spectra and the variables of interest from samples. These data are then used as input for the creation of the prediction models, which solve both classification and regression problems. From these models, an array of calibration coefficients is derived and used to implement the prediction in an embedded system. The presented results show that this workflow was successfully applied to very different scientific fields, yielding autonomous and non-invasive devices able to predict the value of physical parameters of choice from new spectral acquisitions.
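The step from a fitted model to "an array of calibration coefficients" can be illustrated with a minimal sketch of one-component principal component regression. It is not the thesis's implementation: the function names, the power-iteration routine, and the noise-free synthetic spectra are all assumptions made for illustration. The point is that prediction on the device reduces to one subtraction and one dot product.

```python
def center(rows):
    """Return the column means and the mean-centered rows."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return means, [[x - m for x, m in zip(row, means)] for row in rows]

def leading_direction(centered, iterations=100):
    """Leading principal direction of centered data, by power iteration on X^T X."""
    width = len(centered[0])
    v = [1.0 / width ** 0.5] * width
    for _ in range(iterations):
        scores = [sum(x * c for x, c in zip(row, v)) for row in centered]
        w = [sum(s * row[j] for s, row in zip(scores, centered)) for j in range(width)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def fit_one_component_pcr(spectra, targets):
    """Return (mean_spectrum, intercept, coefficients) for a one-component PCR."""
    mean_spectrum, centered = center(spectra)
    direction = leading_direction(centered)
    scores = [sum(x * c for x, c in zip(row, direction)) for row in centered]
    intercept = sum(targets) / len(targets)
    # Least-squares slope of the centered target on the component scores.
    slope = (sum(t * (y - intercept) for t, y in zip(scores, targets))
             / sum(t * t for t in scores))
    # Fold the slope into the direction: one calibration coefficient per wavelength.
    coefficients = [slope * c for c in direction]
    return mean_spectrum, intercept, coefficients

def predict(spectrum, mean_spectrum, intercept, coefficients):
    """Embedded-side step: subtract the stored mean, then take one dot product."""
    return intercept + sum(c * (x - m)
                           for c, x, m in zip(coefficients, spectrum, mean_spectrum))

# Hypothetical, noise-free data: each spectrum is a fixed signature scaled by the target.
signature = [0.2, 0.5, 1.0, 0.5, 0.2]
levels = [1.0, 2.0, 3.0, 4.0, 5.0]
spectra = [[level * s for s in signature] for level in levels]

model = fit_one_component_pcr(spectra, levels)
estimate = predict([2.5 * s for s in signature], *model)
```

Only `mean_spectrum`, `intercept`, and `coefficients` need to be stored on the device, which is consistent with the small memory footprint the abstract describes.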
Abstract:
Universidade Estadual de Campinas. Faculdade de Educação Física
Abstract:
This paper presents an agent-based approach to modelling individual driver behaviour under the influence of real-time traffic information. The driver behaviour models developed in this study are based on a behavioural survey of drivers which was conducted on a congested commuting corridor in Brisbane, Australia. Commuters' responses to travel information were analysed and a number of discrete choice models were developed to determine the factors influencing drivers' behaviour and their propensity to change route and adjust travel patterns. Based on the results obtained from the behavioural survey, the agent behaviour parameters which define driver characteristics, knowledge and preferences were identified and their values determined. A case study implementing a simple agent-based route choice decision model within a microscopic traffic simulation tool is also presented. Driver-vehicle units (DVUs) were modelled as autonomous software components that can each be assigned a set of goals to achieve and a database of knowledge comprising certain beliefs, intentions and preferences concerning the driving task. Each DVU provided route choice decision-making capabilities, based on perception of its environment, that were similar to the described intentions of the driver it represented. The case study clearly demonstrated the feasibility of the approach and the potential to develop more complex driver behavioural dynamics based on the belief-desire-intention agent architecture. (C) 2002 Elsevier Science Ltd. All rights reserved.
Abstract:
This study compared an enzyme-linked immunosorbent assay (ELISA) to a liquid chromatography-tandem mass spectrometry (LC/MS/MS) technique for measurement of tacrolimus concentrations in adult kidney and liver transplant recipients, and investigated how assay choice influenced pharmacokinetic parameter estimates and drug dosage decisions. Tacrolimus concentrations measured by both ELISA and LC/MS/MS from 29 kidney (n = 98 samples) and 27 liver (n = 97 samples) transplant recipients were used to evaluate the performance of these methods in the clinical setting. Tacrolimus concentrations measured by the two techniques were compared via regression analysis. Population pharmacokinetic models were developed independently using ELISA and LC/MS/MS data from 76 kidney recipients. Derived kinetic parameters were used to formulate typical dosing regimens for concentration targeting. Dosage recommendations for the two assays were compared. The relation between LC/MS/MS and ELISA measurements was best described by the regression equation ELISA = 1.02 × (LC/MS/MS) + 0.14 in kidney recipients, and ELISA = 1.12 × (LC/MS/MS) - 0.87 in liver recipients. ELISA displayed less accuracy than LC/MS/MS at lower tacrolimus concentrations. Population pharmacokinetic models based on ELISA and LC/MS/MS data were similar, with residual random errors of 4.1 ng/mL and 3.7 ng/mL, respectively. Assay choice gave rise to dosage prediction differences ranging from 0% to 30%. ELISA measurements of tacrolimus are not automatically interchangeable with LC/MS/MS values. Assay differences were greatest in adult liver recipients, probably reflecting periods of liver dysfunction and impaired biliary secretion of metabolites. While the majority of data collected in this study suggested assay differences in adult kidney recipients were minimal, findings of ELISA dosage underpredictions of up to 25% in the long term must be investigated further.
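The two reported regression equations can be applied directly, as in the short sketch below. The slopes and intercepts come from the abstract; the function names are hypothetical, and treating the intercepts as being in ng/mL is an assumption based on the units the abstract uses for residual error.

```python
# (slope, intercept) pairs as reported in the abstract; ng/mL is assumed for the intercept.
ASSAY_REGRESSION = {"kidney": (1.02, 0.14), "liver": (1.12, -0.87)}

def expected_elisa(lcmsms_ng_per_ml, organ):
    """ELISA value implied by the fitted line for a given LC/MS/MS concentration."""
    slope, intercept = ASSAY_REGRESSION[organ]
    return slope * lcmsms_ng_per_ml + intercept

def relative_difference(lcmsms_ng_per_ml, organ):
    """Fractional difference of the implied ELISA value from the LC/MS/MS value."""
    elisa = expected_elisa(lcmsms_ng_per_ml, organ)
    return (elisa - lcmsms_ng_per_ml) / lcmsms_ng_per_ml

kidney_at_10 = expected_elisa(10.0, "kidney")  # 1.02 * 10 + 0.14
liver_at_10 = expected_elisa(10.0, "liver")    # 1.12 * 10 - 0.87
```

By simple arithmetic on the liver equation, the fitted line crosses equality at 0.87 / 0.12 = 7.25 ng/mL, so the implied ELISA value lies below the LC/MS/MS value under that concentration and above it beyond.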
Abstract:
Background: 30-40% of cardiac resynchronization therapy cases do not achieve favorable outcomes. Objective: This study aimed to develop predictive models for the combined endpoint of cardiac death and transplantation (Tx) at different stages of cardiac resynchronization therapy (CRT). Methods: Prospective observational study of 116 patients aged 64.8 ± 11.1 years, 68.1% of whom had functional class (FC) III and 31.9% had ambulatory class IV. Clinical, electrocardiographic and echocardiographic variables were assessed by using Cox regression and Kaplan-Meier curves. Results: The cardiac mortality/Tx rate was 16.3% during the follow-up period of 34.0 ± 17.9 months. Prior to implantation, right ventricular dysfunction (RVD), ejection fraction < 25% and use of high doses of diuretics (HDD) increased the risk of cardiac death and Tx by 3.9-, 4.8-, and 5.9-fold, respectively. In the first year after CRT, RVD, HDD and hospitalization due to congestive heart failure increased the risk of death at hazard ratios of 3.5, 5.3, and 12.5, respectively. In the second year after CRT, RVD and FC III/IV were significant risk factors of mortality in the multivariate Cox model. The accuracy rates of the models were 84.6% at preimplantation, 93% in the first year after CRT, and 90.5% in the second year after CRT. The models were validated by bootstrapping. Conclusion: We developed predictive models of cardiac death and Tx at different stages of CRT based on the analysis of simple and easily obtainable clinical and echocardiographic variables. The models showed good accuracy and adjustment, were validated internally, and are useful in the selection, monitoring and counseling of patients indicated for CRT.
Abstract:
The role of land cover change as a significant component of global change has become increasingly recognized in recent decades. Large databases measuring land cover change, and the data which can potentially be used to explain the observed changes, are also becoming more commonly available. When developing statistical models to investigate observed changes, it is important to be aware that the chosen sampling strategy and modelling techniques can influence results. We present a comparison of three sampling strategies and two forms of grouped logistic regression models (multinomial and ordinal) in the investigation of patterns of successional change after agricultural land abandonment in Switzerland. Results indicated that both ordinal and nominal transitional change occurs in the landscape and that the use of different sampling regimes and modelling techniques as investigative tools yields different results. Synthesis and applications. Our multimodel inference successfully identified a set of consistently selected indicators of land cover change, which can be used to predict further change, including annual average temperature, the number of already overgrown neighbouring areas of land and distance to historically destructive avalanche sites. This allows for more reliable decision making and planning with respect to landscape management. Although both model approaches gave similar results, ordinal regression yielded more parsimonious models that identified the important predictors of land cover change more efficiently. Thus, this approach is preferable where the land cover change pattern can be interpreted as an ordinal process. Otherwise, multinomial logistic regression is a viable alternative.
Abstract:
When actuaries face the problem of pricing an insurance contract that contains different types of coverage, such as a motor insurance or homeowner's insurance policy, they usually assume that types of claim are independent. However, this assumption may not be realistic: several studies have shown that there is a positive correlation between types of claim. Here we introduce different regression models in order to relax the independence assumption, including zero-inflated models to account for excess of zeros and overdispersion. These models have been largely ignored for multivariate Poisson data, mainly because of their computational difficulties. Bayesian inference based on MCMC helps to solve this problem (and also lets us derive, for several quantities of interest, posterior summaries to account for uncertainty). Finally, these models are applied to an automobile insurance claims database with three different types of claims. We analyse the consequences for pure and loaded premiums when the independence assumption is relaxed by using different multivariate Poisson regression models and their zero-inflated versions.
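How zero inflation produces both excess zeros and overdispersion can be seen in a minimal univariate sketch. This is the standard zero-inflated Poisson distribution, not the multivariate models of the paper; the function and parameter names are chosen here for illustration.

```python
import math

def poisson_pmf(k, rate):
    """P(N = k) for an ordinary Poisson count with mean `rate`."""
    return math.exp(-rate) * rate ** k / math.factorial(k)

def zip_pmf(k, rate, zero_prob):
    """P(N = k) for a zero-inflated Poisson: a structural zero with probability
    `zero_prob`, otherwise an ordinary Poisson(rate) count."""
    base = (1.0 - zero_prob) * poisson_pmf(k, rate)
    return zero_prob + base if k == 0 else base

def zip_mean(rate, zero_prob):
    return (1.0 - zero_prob) * rate

def zip_variance(rate, zero_prob):
    # Equals the mean times (1 + zero_prob * rate), so it exceeds the mean
    # whenever zero_prob > 0: the mixture is overdispersed.
    return (1.0 - zero_prob) * rate * (1.0 + zero_prob * rate)

# Hypothetical claim-frequency parameters.
rate, zero_prob = 0.5, 0.3
p_zero_plain = poisson_pmf(0, rate)
p_zero_inflated = zip_pmf(0, rate, zero_prob)
total = sum(zip_pmf(k, rate, zero_prob) for k in range(60))
```

With these illustrative parameters, the zero-inflated model assigns more mass to zero claims than a plain Poisson with the same rate, and its variance-to-mean ratio is 1 + zero_prob × rate.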
Abstract:
BACKGROUND: We sought to improve upon previously published statistical modeling strategies for binary classification of dyslipidemia for general population screening purposes based on the waist-to-hip circumference ratio and body mass index anthropometric measurements. METHODS: Study subjects were participants in WHO-MONICA population-based surveys conducted in two Swiss regions. Outcome variables were based on the total serum cholesterol to high density lipoprotein cholesterol ratio. The other potential predictor variables were gender, age, current cigarette smoking, and hypertension. The models investigated were: (i) linear regression; (ii) logistic classification; (iii) regression trees; (iv) classification trees (iii and iv are collectively known as "CART"). Binary classification performance of the region-specific models was externally validated by classifying the subjects from the other region. RESULTS: Waist-to-hip circumference ratio and body mass index remained modest predictors of dyslipidemia. Correct classification rates for all models were 60-80%, with marked gender differences. Gender-specific models provided only small gains in classification. The external validations provided assurance about the stability of the models. CONCLUSIONS: There were no striking differences between either the algebraic (i, ii) vs. non-algebraic (iii, iv), or the regression (i, iii) vs. classification (ii, iv) modeling approaches. Anticipated advantages of the CART vs. simple additive linear and logistic models were less than expected in this particular application with a relatively small set of predictor variables. CART models may be more useful when considering main effects and interactions between larger sets of predictor variables.
Abstract:
Interaction effects are usually modeled by means of moderated regression analysis. Structural equation models with non-linear constraints make it possible to estimate interaction effects while correcting for measurement error. From the various specifications, Jöreskog and Yang's (1996, 1998), likely the most parsimonious, has been chosen and further simplified. Up to now, only direct effects have been specified, thus wasting much of the capability of the structural equation approach. This paper presents and discusses an extension of Jöreskog and Yang's specification that can handle direct, indirect and interaction effects simultaneously. The model is illustrated by a study of the effects of an interactive style of use of budgets on both company innovation and performance.
Abstract:
Several studies have reported high performance of simple decision heuristics in multi-attribute decision making. In this paper, we focus on situations where attributes are binary and analyze the performance of Deterministic-Elimination-By-Aspects (DEBA) and similar decision heuristics. We consider non-increasing weights and two probabilistic models for the attribute values: one where attribute values are independent Bernoulli random variables; the other one where they are binary random variables with inter-attribute positive correlations. Using these models, we show that good performance of DEBA is explained by the presence of cumulative as opposed to simple dominance. We therefore introduce the concepts of cumulative dominance compliance and fully cumulative dominance compliance and show that DEBA satisfies those properties. We derive a lower bound on the probability with which cumulative dominance compliant heuristics will choose a best alternative and show that, even with many attributes, this is not small. We also derive an upper bound for the expected loss of fully cumulative dominance compliant heuristics and show that this is moderate even when the number of attributes is large. Both bounds are independent of the values of the weights.
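For readers unfamiliar with DEBA, the sketch below follows the heuristic as it is commonly described for binary attributes: attributes are examined in order of non-increasing weight, and at each attribute the alternatives lacking it are eliminated whenever at least one remaining alternative has it. Breaking final ties at random is an assumption here, as are the function names and the example weights.

```python
import random

def deba(alternatives, rng=random):
    """Return the index chosen by deterministic elimination by aspects.

    `alternatives` is a list of equal-length tuples of 0/1 attribute values,
    with attributes already ordered by non-increasing weight.
    """
    remaining = list(range(len(alternatives)))
    for j in range(len(alternatives[0])):
        if len(remaining) == 1:
            break
        having = [i for i in remaining if alternatives[i][j] == 1]
        # If no remaining alternative has the aspect, nothing is eliminated.
        if having:
            remaining = having
    # Alternatives still tied after the last attribute: pick one at random (assumed rule).
    return rng.choice(remaining)

def weighted_value(alternative, weights):
    """Linear benchmark: the weighted sum that DEBA never computes."""
    return sum(w * x for w, x in zip(weights, alternative))

# Hypothetical example with non-increasing weights.
weights = (4, 2, 1)
options = [(1, 0, 1), (1, 1, 0), (0, 1, 1)]
choice = deba(options)
best = max(range(len(options)), key=lambda i: weighted_value(options[i], weights))
```

In this example the heuristic keeps the first two options after the first attribute, then keeps only the second option after the second attribute, which is also the option with the highest weighted sum.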
Abstract:
The network choice revenue management problem models customers as choosing from an offer-set, and the firm decides the best subset to offer at any given moment to maximize expected revenue. The resulting dynamic program for the firm is intractable and approximated by a deterministic linear program called the CDLP, which has an exponential number of columns. However, under the choice-set paradigm, when the segment consideration sets overlap, the CDLP is difficult to solve. Column generation has been proposed, but finding an entering column has been shown to be NP-hard. In this paper, starting with a concave program formulation based on segment-level consideration sets called SDCP, we add a class of constraints called product constraints, that project onto subsets of intersections. In addition we propose a natural direct tightening of the SDCP called ?SDCP, and compare the performance of both methods on the benchmark data sets in the literature. Both the product constraints and the ?SDCP method are very simple and easy to implement and are applicable to the case of overlapping segment consideration sets. In our computational testing on the benchmark data sets in the literature, SDCP with product constraints achieves the CDLP value at a fraction of the CPU time taken by column generation, and we believe it is a very promising approach for quickly approximating CDLP when segment consideration sets overlap and the consideration sets themselves are relatively small.
Abstract:
Many dynamic revenue management models divide the sale period into a finite number of periods T and assume, invoking a fine-enough grid of time, that each period sees at most one booking request. These Poisson-type assumptions restrict the variability of the demand in the model, but researchers and practitioners were willing to overlook this for the benefit of tractability of the models. In this paper, we criticize this model from another angle. Estimating the discrete finite-period model poses problems of indeterminacy and non-robustness: Arbitrarily fixing T leads to arbitrary control values and on the other hand estimating T from data adds an additional layer of indeterminacy. To counter this, we first propose an alternate finite-population model that avoids this problem of fixing T and allows a wider range of demand distributions, while retaining the useful marginal-value properties of the finite-period model. The finite-population model still requires jointly estimating market size and the parameters of the customer purchase model without observing no-purchases. Estimation of market-size when no-purchases are unobservable has rarely been attempted in the marketing or revenue management literature. Indeed, we point out that it is akin to the classical statistical problem of estimating the parameters of a binomial distribution with unknown population size and success probability, and hence likely to be challenging. However, when the purchase probabilities are given by a functional form such as a multinomial-logit model, we propose an estimation heuristic that exploits the specification of the functional form, the variety of the offer sets in a typical RM setting, and qualitative knowledge of arrival rates. Finally we perform simulations to show that the estimator is very promising in obtaining unbiased estimates of population size and the model parameters.
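The role of the offer set in a multinomial-logit purchase model can be made concrete with a small sketch. It uses the standard MNL form with the no-purchase utility normalized to zero, which is an assumption here since the abstract does not give the paper's parameterization; the product names and utilities are hypothetical.

```python
import math

def mnl_probabilities(utilities, offer_set):
    """Purchase probabilities under a multinomial-logit model.

    `utilities` maps each product to its mean utility; the no-purchase
    option has utility 0 (an assumed normalization) and is keyed by None.
    """
    weights = {product: math.exp(utilities[product]) for product in offer_set}
    denominator = 1.0 + sum(weights.values())
    probabilities = {product: w / denominator for product, w in weights.items()}
    probabilities[None] = 1.0 / denominator
    return probabilities

# Hypothetical utilities for three fare products.
utilities = {"full": 0.0, "flex": 0.0, "saver": 1.0}
two_products = mnl_probabilities(utilities, ["full", "flex"])
all_products = mnl_probabilities(utilities, ["full", "flex", "saver"])
```

Changing the offer set changes the no-purchase probability (here from 1/3 with two products to a smaller value with three), and that variation across offer sets is the kind of structure the abstract says its estimation heuristic exploits when no-purchases are unobserved.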
Abstract:
Several methods and approaches for measuring parameters to determine fecal sources of pollution in water have been developed in recent years. No single microbial or chemical parameter has proved sufficient to determine the source of fecal pollution. Combinations of parameters involving at least one discriminating indicator and one universal fecal indicator offer the most promising solutions for qualitative and quantitative analyses. The universal (nondiscriminating) fecal indicator provides quantitative information regarding the fecal load. The discriminating indicator contributes to the identification of a specific source. The relative values of the parameters derived from both kinds of indicators could provide information regarding the contribution to the total fecal load from each origin. It is also essential that both parameters characteristically persist in the environment for similar periods. Numerical analysis, such as inductive learning methods, could be used to select the most suitable and the lowest number of parameters to develop predictive models. These combinations of parameters provide information on factors affecting the models, such as dilution, specific types of animal source, persistence of microbial tracers, and complex mixtures from different sources. The combined use of the enumeration of somatic coliphages and the enumeration of Bacteroides-phages using different host specific strains (one from humans and another from pigs), both selected using the suggested approach, provides a feasible model for quantitative and qualitative analyses of fecal source identification.
Abstract:
Background: Transplantation is the treatment of choice when compared to dialysis. The long-term evolution of patients is rarely comprehensively described. The illness experience of thirty end-stage renal disease patients was explored from registration for transplantation until twenty-four months after transplantation. Methods: Longitudinal semi-structured interviews were conducted, and qualitative discourse analysis was performed. Findings: Before transplantation, loss of quality of life (QOL) and emotional fragility related to dialysis constraints were reported, and these increased with waiting time. Six months after transplantation, recovered freedom was described, but acute rejection and lifelong dependency on immunosuppressants generated concerns. After twelve months, long-term survival of the graft and a possible return to dialysis were mentioned. After twenty months, graft dysfunction, co-morbidities, and immunosuppressant side effects raised concerns even though QOL persisted. Most patients report positive transformations after transplantation, which are related to graft survival and limited co-morbidities. Discussion: As time passes, patients deal with changing illness constraints and contemplate with anxiety a possible return to dialysis and/or a new transplantation.
Abstract:
Abstract The main objective of this work is to show how the choice of the temporal dimension and of the spatial structure of the population influences an artificial evolutionary process. In the field of Artificial Evolution we can observe a common trend in synchronously evolving panmictic populations, i.e., populations in which any individual can be recombined with any other individual. Already in the '90s, the works of Spiessens and Manderick, Sarma and De Jong, and Gorges-Schleuter pointed out that, if a population is structured according to a mono- or bi-dimensional regular lattice, the evolutionary process shows a different dynamic with respect to the panmictic case. In particular, Sarma and De Jong studied the selection pressure (i.e., the diffusion of a best individual when only the selection operator is active) induced by a regular bi-dimensional structure of the population, proposing a logistic modeling of the selection pressure curves. This model supposes that the diffusion of a best individual in a population follows an exponential law. We show that such a model is inadequate to describe the process, since the growth speed must be quadratic or sub-quadratic in the case of a bi-dimensional regular lattice. New linear and sub-quadratic models are proposed for modeling the selection pressure curves in, respectively, mono- and bi-dimensional regular structures. These models are extended to describe the process when asynchronous evolutions are employed. Different dynamics of the populations imply different search strategies of the resulting algorithm, when the evolutionary process is used to solve optimisation problems. A benchmark of both discrete and continuous test problems is used to study the search characteristics of the different topologies and updates of the populations.
In the last decade, the pioneering studies of Watts and Strogatz have shown that most real networks, both in the biological and sociological worlds as well as in man-made structures, have mathematical properties that set them apart from regular and random structures. In particular, they introduced the concept of small-world graphs, and they showed that this new family of structures has interesting computing capabilities. Populations structured according to these new topologies are proposed, and their evolutionary dynamics are studied and modeled. We also propose asynchronous evolutions for these structures, and the resulting evolutionary behaviors are investigated. Many man-made networks have grown, and are still growing, incrementally, and explanations have been proposed for their actual shape, such as Albert and Barabási's preferential attachment growth rule. However, many actual networks seem to have undergone some kind of Darwinian variation and selection. Thus, how these networks might have come to be selected is an interesting yet unanswered question. In the last part of this work, we show how a simple evolutionary algorithm can enable the emergence of these kinds of structures for two prototypical problems of the automata networks world, the majority classification and the synchronisation problems.
Synopsis The main objective of this work is to show how the choice of the temporal dimension and of the spatial structure of a population influences an artificial evolutionary process. In the field of Artificial Evolution, there is a tendency to evolve panmictic populations synchronously, where each individual can be recombined with any other individual in the population.
As early as the 1990s, Spiessens and Manderick, Sarma and De Jong, and Gorges-Schleuter observed that, if a population has a regular mono- or bi-dimensional structure, the evolutionary process shows a dynamic different from that of a panmictic population. In particular, Sarma and De Jong studied the selection pressure (i.e., the diffusion of an optimal individual when only the selection operator is active) induced by a regular bi-dimensional structure of the population, proposing a logistic model of the selection pressure curves. This model assumes that the diffusion of an optimal individual follows an exponential law. We show that this model is inadequate to describe the phenomenon, since the growth speed must follow a quadratic or sub-quadratic law in the case of a regular bi-dimensional structure. New linear and sub-quadratic models are proposed for mono- and bi-dimensional structures. These models are extended to describe asynchronous evolutionary processes. Different population dynamics imply different search strategies of the resulting algorithm when the evolutionary process is used to solve optimisation problems. A set of discrete and continuous problems is used to study the search characteristics of the different topologies and population update schemes. In recent years, the studies of Watts and Strogatz have shown that many networks, in the biological and sociological worlds as well as in man-made structures, have mathematical properties that set them apart from both regular and random structures. In particular, they introduced the notion of the small-world graph and showed that this new family of structures has interesting dynamic properties.
Populations with these new topologies are proposed, and their evolutionary dynamics are studied and modeled. For populations with these structures, asynchronous evolution methods are proposed, and the resulting dynamics are studied. Many man-made networks have formed incrementally, and explanations for their current shape have been proposed, such as the preferential attachment of Albert and Barabási. However, many existing networks must be the product of a process of Darwinian variation and selection. Thus, how these structures might have been selected is an interesting question that remains unanswered. In the last part of this work, we show how a simple artificial evolutionary process allows this type of topology to emerge for two prototypical problems of automata networks, the density and synchronisation tasks.
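The claim that growth on a lattice is at most linear in one dimension and quadratic in two can be checked with a minimal sketch. It assumes the fastest possible spread, which is a simplification and not the thesis's model: updates are synchronous, and every cell adjacent to a copy of the best individual adopts it with certainty. The radius-1 ring and von Neumann neighbourhoods and the function names are likewise assumptions.

```python
def takeover_ring(size, steps):
    """Copies of the best individual per step on a 1-D ring (radius-1 neighbourhood)."""
    best = {0}
    counts = [1]
    for _ in range(steps):
        # Every cell with a best neighbour (or already best) holds the best next step.
        best = {(i + d) % size for i in best for d in (-1, 0, 1)}
        counts.append(len(best))
    return counts

def takeover_torus(side, steps):
    """Copies of the best individual per step on a 2-D torus (von Neumann neighbourhood)."""
    moves = ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1))
    best = {(0, 0)}
    counts = [1]
    for _ in range(steps):
        best = {((x + dx) % side, (y + dy) % side)
                for x, y in best for dx, dy in moves}
        counts.append(len(best))
    return counts

ring_counts = takeover_ring(101, 5)
torus_counts = takeover_torus(101, 4)
```

Before the front wraps around, the ring count after t steps is 2t + 1 and the torus count is 2t² + 2t + 1, so even this upper-bound dynamic grows polynomially rather than exponentially, in line with the abstract's argument against a logistic model.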