972 resultados para linear weighting methods
Resumo:
This paper introduces local distance-based generalized linear models. These models extend (weighted) distance-based linear models firstly with the generalized linear model concept, then by localizing. Distances between individuals are the only predictor information needed to fit these models. Therefore they are applicable to mixed (qualitative and quantitative) explanatory variables or when the regressor is of functional type. Models can be fitted and analysed with the R package dbstats, which implements several distancebased prediction methods.
Resumo:
Background: The imatinib trough plasma concentration (C(min)) correlates with clinical response in cancer patients. Therapeutic drug monitoring (TDM) of plasma C(min) is therefore suggested. In practice, however, blood sampling for TDM is often not performed at trough. The corresponding measurement is thus only remotely informative about C(min) exposure. Objectives: The objectives of this study were to improve the interpretation of randomly measured concentrations by using a Bayesian approach for the prediction of C(min), incorporating correlation between pharmacokinetic parameters, and to compare the predictive performance of this method with alternative approaches, by comparing predictions with actual measured trough levels, and with predictions obtained by a reference method, respectively. Methods: A Bayesian maximum a posteriori (MAP) estimation method accounting for correlation (MAP-ρ) between pharmacokinetic parameters was developed on the basis of a population pharmacokinetic model, which was validated on external data. Thirty-one paired random and trough levels, observed in gastrointestinal stromal tumour patients, were then used for the evaluation of the Bayesian MAP-ρ method: individual C(min) predictions, derived from single random observations, were compared with actual measured trough levels for assessment of predictive performance (accuracy and precision). The method was also compared with alternative approaches: classical Bayesian MAP estimation assuming uncorrelated pharmacokinetic parameters, linear extrapolation along the typical elimination constant of imatinib, and non-linear mixed-effects modelling (NONMEM) first-order conditional estimation (FOCE) with interaction. Predictions of all methods were finally compared with 'best-possible' predictions obtained by a reference method (NONMEM FOCE, using both random and trough observations for individual C(min) prediction). Results: The developed Bayesian MAP-ρ method accounting for correlation between pharmacokinetic parameters allowed non-biased prediction of imatinib C(min) with a precision of ±30.7%. This predictive performance was similar for the alternative methods that were applied. The range of relative prediction errors was, however, smallest for the Bayesian MAP-ρ method and largest for the linear extrapolation method. When compared with the reference method, predictive performance was comparable for all methods. The time interval between random and trough sampling did not influence the precision of Bayesian MAP-ρ predictions. Conclusion: Clinical interpretation of randomly measured imatinib plasma concentrations can be assisted by Bayesian TDM. Classical Bayesian MAP estimation can be applied even without consideration of the correlation between pharmacokinetic parameters. Individual C(min) predictions are expected to vary less through Bayesian TDM than linear extrapolation. Bayesian TDM could be developed in the future for other targeted anticancer drugs and for the prediction of other pharmacokinetic parameters that have been correlated with clinical outcomes.
Resumo:
BACKGROUND: We sought to improve upon previously published statistical modeling strategies for binary classification of dyslipidemia for general population screening purposes based on the waist-to-hip circumference ratio and body mass index anthropometric measurements. METHODS: Study subjects were participants in WHO-MONICA population-based surveys conducted in two Swiss regions. Outcome variables were based on the total serum cholesterol to high density lipoprotein cholesterol ratio. The other potential predictor variables were gender, age, current cigarette smoking, and hypertension. The models investigated were: (i) linear regression; (ii) logistic classification; (iii) regression trees; (iv) classification trees (iii and iv are collectively known as "CART"). Binary classification performance of the region-specific models was externally validated by classifying the subjects from the other region. RESULTS: Waist-to-hip circumference ratio and body mass index remained modest predictors of dyslipidemia. Correct classification rates for all models were 60-80%, with marked gender differences. Gender-specific models provided only small gains in classification. The external validations provided assurance about the stability of the models. CONCLUSIONS: There were no striking differences between either the algebraic (i, ii) vs. non-algebraic (iii, iv), or the regression (i, iii) vs. classification (ii, iv) modeling approaches. Anticipated advantages of the CART vs. simple additive linear and logistic models were less than expected in this particular application with a relatively small set of predictor variables. CART models may be more useful when considering main effects and interactions between larger sets of predictor variables.
Resumo:
Many multivariate methods that are apparently distinct can be linked by introducing oneor more parameters in their definition. Methods that can be linked in this way arecorrespondence analysis, unweighted or weighted logratio analysis (the latter alsoknown as "spectral mapping"), nonsymmetric correspondence analysis, principalcomponent analysis (with and without logarithmic transformation of the data) andmultidimensional scaling. In this presentation I will show how several of thesemethods, which are frequently used in compositional data analysis, may be linkedthrough parametrizations such as power transformations, linear transformations andconvex linear combinations. Since the methods of interest here all lead to visual mapsof data, a "movie" can be made where where the linking parameter is allowed to vary insmall steps: the results are recalculated "frame by frame" and one can see the smoothchange from one method to another. Several of these "movies" will be shown, giving adeeper insight into the similarities and differences between these methods
Resumo:
A variational approach for reliably calculating vibrational linear and nonlinear optical properties of molecules with large electrical and/or mechanical anharmonicity is introduced. This approach utilizes a self-consistent solution of the vibrational Schrödinger equation for the complete field-dependent potential-energy surface and, then, adds higher-level vibrational correlation corrections as desired. An initial application is made to static properties for three molecules of widely varying anharmonicity using the lowest-level vibrational correlation treatment (i.e., vibrational Møller-Plesset perturbation theory). Our results indicate when the conventional Bishop-Kirtman perturbation method can be expected to break down and when high-level vibrational correlation methods are likely to be required. Future improvements and extensions are discussed
Resumo:
An important statistical development of the last 30 years has been the advance in regression analysis provided by generalized linear models (GLMs) and generalized additive models (GAMs). Here we introduce a series of papers prepared within the framework of an international workshop entitled: Advances in GLMs/GAMs modeling: from species distribution to environmental management, held in Riederalp, Switzerland, 6-11 August 2001.We first discuss some general uses of statistical models in ecology, as well as provide a short review of several key examples of the use of GLMs and GAMs in ecological modeling efforts. We next present an overview of GLMs and GAMs, and discuss some of their related statistics used for predictor selection, model diagnostics, and evaluation. Included is a discussion of several new approaches applicable to GLMs and GAMs, such as ridge regression, an alternative to stepwise selection of predictors, and methods for the identification of interactions by a combined use of regression trees and several other approaches. We close with an overview of the papers and how we feel they advance our understanding of their application to ecological modeling.
Resumo:
We consider the application of normal theory methods to the estimation and testing of a general type of multivariate regressionmodels with errors--in--variables, in the case where various data setsare merged into a single analysis and the observable variables deviatepossibly from normality. The various samples to be merged can differ on the set of observable variables available. We show that there is a convenient way to parameterize the model so that, despite the possiblenon--normality of the data, normal--theory methods yield correct inferencesfor the parameters of interest and for the goodness--of--fit test. Thetheory described encompasses both the functional and structural modelcases, and can be implemented using standard software for structuralequations models, such as LISREL, EQS, LISCOMP, among others. An illustration with Monte Carlo data is presented.
Resumo:
The network revenue management (RM) problem arises in airline, hotel, media,and other industries where the sale products use multiple resources. It can be formulatedas a stochastic dynamic program but the dynamic program is computationallyintractable because of an exponentially large state space, and a number of heuristicshave been proposed to approximate it. Notable amongst these -both for their revenueperformance, as well as their theoretically sound basis- are approximate dynamic programmingmethods that approximate the value function by basis functions (both affinefunctions as well as piecewise-linear functions have been proposed for network RM)and decomposition methods that relax the constraints of the dynamic program to solvesimpler dynamic programs (such as the Lagrangian relaxation methods). In this paperwe show that these two seemingly distinct approaches coincide for the network RMdynamic program, i.e., the piecewise-linear approximation method and the Lagrangianrelaxation method are one and the same.
Resumo:
Standard methods for the analysis of linear latent variable models oftenrely on the assumption that the vector of observed variables is normallydistributed. This normality assumption (NA) plays a crucial role inassessingoptimality of estimates, in computing standard errors, and in designinganasymptotic chi-square goodness-of-fit test. The asymptotic validity of NAinferences when the data deviates from normality has been calledasymptoticrobustness. In the present paper we extend previous work on asymptoticrobustnessto a general context of multi-sample analysis of linear latent variablemodels,with a latent component of the model allowed to be fixed across(hypothetical)sample replications, and with the asymptotic covariance matrix of thesamplemoments not necessarily finite. We will show that, under certainconditions,the matrix $\Gamma$ of asymptotic variances of the analyzed samplemomentscan be substituted by a matrix $\Omega$ that is a function only of thecross-product moments of the observed variables. The main advantage of thisis thatinferences based on $\Omega$ are readily available in standard softwareforcovariance structure analysis, and do not require to compute samplefourth-order moments. An illustration with simulated data in the context ofregressionwith errors in variables will be presented.
Resumo:
It is well accepted that people resist evidence that contradicts their beliefs.Moreover, despite their training, many scientists reject results that are inconsistent withtheir theories. This phenomenon is discussed in relation to the field of judgment anddecision making by describing four case studies. These concern findings that clinical judgment is less predictive than actuarial models; simple methods have proven superiorto more theoretically correct methods in times series forecasting; equal weighting ofvariables is often more accurate than using differential weights; and decisions cansometimes be improved by discarding relevant information. All findings relate to theapparently difficult-to-accept idea that simple models can predict complex phenomenabetter than complex ones. It is true that there is a scientific market place for ideas.However, like its economic counterpart, it is subject to inefficiencies (e.g., thinness,asymmetric information, and speculative bubbles). Unfortunately, the market is only correct in the long-run. The road to enlightenment is bumpy.
Resumo:
Members of the Chlamydiales order all share a biphasic lifecycle alternating between small infectious particles, the elementary bodies (EBs) and larger intracellular forms able to replicate, the reticulate bodies. Whereas the classical Chlamydia usually harbours round-shaped EBs, some members of the Chlamydia-related families display crescent and star-shaped morphologies by electron microscopy. To determine the impact of fixative methods on the shape of the bacterial cells, different buffer and fixative combinations were tested on purified EBs of Criblamydia sequanensis, Estrella lausannensis, Parachlamydia acanthamoebae, and Waddlia chondrophila. A linear discriminant analysis was performed on particle metrics extracted from electron microscopy images to recognize crescent, round, star and intermediary forms. Depending on the buffer and fixatives used, a mixture of alternative shapes were observed in varying proportions with stars and crescents being more frequent in C. sequanensis and P. acanthamoebae, respectively. No tested buffer and chemical fixative preserved ideally the round shape of a majority of bacteria and other methods such as deep-freezing and cryofixation should be applied. Although crescent and star shapes could represent a fixation artifact, they certainly point towards a diverse composition and organization of membrane proteins or intracellular structures rather than being a distinct developmental stage.
Resumo:
Detailed knowledge on water percolation into the soil in irrigated areas is fundamental for solving problems of drainage, pollution and the recharge of underground aquifers. The aim of this study was to evaluate the percolation estimated by time-domain-reflectometry (TDR) in a drainage lysimeter. We used Darcy's law with K(θ) functions determined by field and laboratory methods and by the change in water storage in the soil profile at 16 points of moisture measurement at different time intervals. A sandy clay soil was saturated and covered with plastic sheet to prevent evaporation and an internal drainage trial in a drainage lysimeter was installed. The relationship between the observed and estimated percolation values was evaluated by linear regression analysis. The results suggest that percolation in the field or laboratory can be estimated based on continuous monitoring with TDR, and at short time intervals, of the variations in soil water storage. The precision and accuracy of this approach are similar to those of the lysimeter and it has advantages over the other evaluated methods, of which the most relevant are the possibility of estimating percolation in short time intervals and exemption from the predetermination of soil hydraulic properties such as water retention and hydraulic conductivity. The estimates obtained by the Darcy-Buckingham equation for percolation levels using function K(θ) predicted by the method of Hillel et al. (1972) provided compatible water percolation estimates with those obtained in the lysimeter at time intervals greater than 1 h. The methods of Libardi et al. (1980), Sisson et al. (1980) and van Genuchten (1980) underestimated water percolation.
Resumo:
We conduct a large-scale comparative study on linearly combining superparent-one-dependence estimators (SPODEs), a popular family of seminaive Bayesian classifiers. Altogether, 16 model selection and weighing schemes, 58 benchmark data sets, and various statistical tests are employed. This paper's main contributions are threefold. First, it formally presents each scheme's definition, rationale, and time complexity and hence can serve as a comprehensive reference for researchers interested in ensemble learning. Second, it offers bias-variance analysis for each scheme's classification error performance. Third, it identifies effective schemes that meet various needs in practice. This leads to accurate and fast classification algorithms which have an immediate and significant impact on real-world applications. Another important feature of our study is using a variety of statistical tests to evaluate multiple learning methods across multiple data sets.