861 results for linear rank regression model
Abstract:
Consider a nonparametric regression model Y=mu*(X) + e, where the explanatory variables X are endogenous and e satisfies the conditional moment restriction E[e|W]=0 w.p.1 for instrumental variables W. It is well known that in these models the structural parameter mu* is 'ill-posed' in the sense that the function mapping the data to mu* is not continuous. In this paper, we derive the efficiency bounds for estimating linear functionals E[p(X)mu*(X)] and int_{supp(X)}p(x)mu*(x)dx, where p is a known weight function and supp(X) the support of X, without assuming mu* to be well-posed or even identified.
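For reference, the model and the two linear functionals described above can be written compactly as follows (the labels theta_1 and theta_2 are introduced here only for convenience):

```latex
Y = \mu^*(X) + \varepsilon, \qquad \mathbb{E}[\varepsilon \mid W] = 0 \ \text{w.p.1},
\qquad
\theta_1 = \mathbb{E}\!\big[p(X)\,\mu^*(X)\big], \qquad
\theta_2 = \int_{\operatorname{supp}(X)} p(x)\,\mu^*(x)\,dx ,
```

with p a known weight function and supp(X) the support of X; the paper derives efficiency bounds for estimating theta_1 and theta_2 without assuming mu* is well-posed or identified.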
Abstract:
Classical regression analysis can be used to model time series. However, the assumption that model parameters are constant over time is not necessarily adapted to the data. In phytoplankton ecology, the relevance of time-varying parameter values has been shown using a dynamic linear regression model (DLRM). DLRMs, belonging to the class of Bayesian dynamic models, assume the existence of a non-observable time series of model parameters, which are estimated on-line, i.e. after each observation. The aim of this paper was to show how DLRM results could be used to explain variation of a time series of phytoplankton abundance. We applied DLRM to daily concentrations of Dinophysis cf. acuminata, determined in Antifer harbour (French coast of the English Channel), along with physical and chemical covariates (e.g. wind velocity, nutrient concentrations). A single model was built using 1989 and 1990 data, and then applied separately to each year. Equivalent static regression models were investigated for the purpose of comparison. Results showed that most of the Dinophysis cf. acuminata concentration variability was explained by the configuration of the sampling site, the wind regime and tide residual flow. Moreover, the relationships of these factors with the concentration of the microalga varied with time, a fact that could not be detected with static regression. Application of dynamic models to phytoplankton time series, especially in a monitoring context, is discussed.
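As a rough illustration of the on-line estimation idea behind a DLRM (not the authors' implementation), the Python sketch below runs a Kalman filter over a random-walk model for the regression coefficients; the noise variances, the diffuse prior, and the synthetic covariates are assumptions introduced here.

```python
import numpy as np

def dlrm_filter(X, y, sigma_v=1.0, sigma_w=0.1):
    """On-line (Kalman-filter) estimation of time-varying coefficients in a
    dynamic linear regression model:
        y_t = x_t' beta_t + v_t,   beta_t = beta_{t-1} + w_t.
    The noise scales sigma_v and sigma_w are illustrative assumptions."""
    n, p = X.shape
    beta = np.zeros(p)            # prior mean of the coefficients
    P = np.eye(p) * 10.0          # diffuse prior covariance
    betas = np.zeros((n, p))
    for t in range(n):
        x = X[t]
        # time update: random-walk evolution of the coefficients
        P = P + sigma_w ** 2 * np.eye(p)
        # measurement update after observing y_t (i.e. "after each observation")
        S = x @ P @ x + sigma_v ** 2      # innovation variance
        K = P @ x / S                     # Kalman gain
        beta = beta + K * (y[t] - x @ beta)
        P = P - np.outer(K, x @ P)
        betas[t] = beta
    return betas

# Illustrative call on synthetic data standing in for the covariates
# (e.g. wind velocity, nutrient concentrations) and the Dinophysis series:
X_demo = np.random.randn(200, 3)
y_demo = (X_demo * np.linspace(0, 1, 200)[:, None]).sum(axis=1) + 0.5 * np.random.randn(200)
print(dlrm_filter(X_demo, y_demo)[-1])    # final filtered coefficient estimates
```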
Abstract:
This study focuses on multiple linear regression models relating six climate indices (temperature humidity index THI, environmental stress index ESI, equivalent temperature index ETI, heat load index HLI, modified HLI (HLI new), and respiratory rate predictor RRP) with three main components of cow’s milk (yield, fat, and protein) for cows in Iran. The least absolute shrinkage and selection operator (LASSO) and the Akaike information criterion (AIC) techniques are applied to select the best model for milk predictands with the smallest number of climate predictors. Uncertainty is estimated by applying bootstrapping through resampling. Cross validation is used to avoid over-fitting. Climatic parameters are calculated from the NASA-MERRA global atmospheric reanalysis. Milk data for the months from April to September, 2002 to 2010 are used. The best linear regression models are found in spring between milk yield as the predictand and THI, ESI, ETI, HLI, and RRP as predictors, with p-values < 0.001 and R2 values of 0.50 and 0.49, respectively. In summer, milk yield with THI, ETI, and ESI as independent variables shows the strongest relationship (p-value < 0.001), with R2 = 0.69. For fat and protein the results are only marginal. This method is suggested for impact studies of climate variability/change in agriculture and food science when short time series or data with large uncertainty are available.
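A minimal Python sketch of the selection-plus-uncertainty workflow described above (LASSO with cross-validation, plus a bootstrap for coefficient uncertainty); the data, the predictor ordering, and all settings are placeholders, not the study's.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.utils import resample

# Hypothetical design: rows are months, columns are climate indices
# standing in for THI, ESI, ETI, HLI, RRP.
rng = np.random.default_rng(0)
X = rng.normal(size=(54, 5))
y = X @ np.array([0.6, 0.3, 0.2, 0.1, 0.05]) + rng.normal(scale=0.5, size=54)

# LASSO with cross-validation to guard against over-fitting.
lasso = LassoCV(cv=5).fit(X, y)
print("selected coefficients:", lasso.coef_)

# Bootstrap (resampling) for a rough uncertainty estimate of the coefficients.
boot = np.array([LassoCV(cv=5).fit(*resample(X, y, random_state=b)).coef_
                 for b in range(200)])
print("bootstrap coefficient SDs:", boot.std(axis=0))
```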
Abstract:
Objectives To investigate whether a sudden temperature change between neighboring days has a significant impact on mortality. Methods A Poisson generalized linear regression model combined with a distributed lag non-linear model was used to estimate the association of temperature change between neighboring days with mortality in a subtropical Chinese city during 2008–2012. Temperature change was calculated as the current day’s temperature minus the previous day’s temperature. Results A significant effect of temperature change between neighboring days on mortality was observed. Temperature increase was significantly associated with elevated mortality from non-accidental and cardiovascular diseases, while temperature decrease had a protective effect on non-accidental and cardiovascular mortality. Males and people aged 65 years or older appeared to be more vulnerable to the impact of temperature change. Conclusions Temperature increase between neighboring days has a significant adverse impact on mortality. Further health mitigation strategies in response to climate change should take into account temperature variation between neighboring days.
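A simplified Python sketch of the regression set-up: a Poisson GLM with the neighboring-day temperature change and a few plain lags standing in for the full distributed lag non-linear basis; the daily series and the omitted confounders (season, humidity, day of week, ...) are synthetic placeholders.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical daily data: death counts and mean temperature.
df = pd.DataFrame({
    "deaths": np.random.poisson(30, 365),
    "temp": 20 + 8 * np.sin(np.linspace(0, 2 * np.pi, 365)),
})
# Temperature change = current day's temperature minus the previous day's.
df["temp_change"] = df["temp"].diff()
# Simple lags stand in for the distributed lag non-linear basis.
for lag in (1, 2, 3):
    df[f"temp_change_lag{lag}"] = df["temp_change"].shift(lag)
df = df.dropna()

X = sm.add_constant(df[["temp_change", "temp_change_lag1",
                        "temp_change_lag2", "temp_change_lag3"]])
model = sm.GLM(df["deaths"], X, family=sm.families.Poisson()).fit()
print(model.summary())
```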
Abstract:
Change point estimation is recognized as an essential tool of root cause analysis within quality control programs, as it enables clinical experts to search for potential causes of change in hospital outcomes more effectively. In this paper, we consider estimation of the time when a linear trend disturbance has occurred in survival time following an in-control clinical intervention in the presence of variable patient mix. To model the process and change point, a linear trend in the survival time of patients who underwent cardiac surgery is formulated using hierarchical models in a Bayesian framework. The data are right censored since the monitoring is conducted over a limited follow-up period. We capture the effect of risk factors prior to the surgery using a Weibull accelerated failure time regression model. We use Markov chain Monte Carlo to obtain posterior distributions of the change point parameters, including the location and the slope of the trend, along with corresponding probabilistic intervals and inferences. The performance of the Bayesian estimator is investigated through simulations, and the results show that precise estimates can be obtained when it is used in conjunction with risk-adjusted survival time cumulative sum (CUSUM) control charts for different trend scenarios. In comparison with the alternatives, a step change point model and the built-in CUSUM estimator, the proposed Bayesian estimator provides more accurate and precise estimates over linear trends. These advantages are enhanced when the probability quantification, flexibility, and generalizability of the Bayesian change point detection model are also considered.
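One schematic way to write the change point component (a sketch under the assumptions stated in the abstract, not the authors' exact hierarchical parameterization; tau and delta are labels introduced here):

```latex
\log t_i = \mathbf{x}_i^{\top}\boldsymbol{\beta}
           + \delta\,(i-\tau)\,\mathbb{1}\{i > \tau\}
           + \sigma\,\epsilon_i ,
\qquad \epsilon_i \sim \text{standard extreme value},
```

where t_i is the (right-censored) post-surgery survival time of the i-th patient, x_i collects the pre-surgery risk factors entering the Weibull accelerated failure time part, tau is the change point location, and delta is the slope of the linear trend disturbance.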
Abstract:
Background The number of citations received by an article is considered an objective marker of the importance and quality of the research work. The present study aims to identify the determinants of citations for research articles published by Sri Lankan authors. Methods Papers were selectively retrieved from the SciVerse Scopus® (Elsevier Properties S.A, USA) database for 10 years from 1st January 1997 to 31st December 2006, of which 50% were selected for inclusion by simple random sampling. The primary outcome measure was the citation rate (defined as the number of citations during the 2 subsequent years after publication). Citation data were collected using the SciVerse Scopus® Citation Analyzer, and self-citations were excluded. A linear regression analysis was performed with ‘number of citations’ as the continuous dependent variable and the article characteristics as independent variables. Results The number of publications has steadily increased during the period of study. Over three quarters of the papers were published in international journals. More than half of the publications were research studies (55.3%), and most of the research studies were descriptive cross-sectional studies (27.1%). The mean number of citations within 2 years of publication was 1.7, and 52.1% of papers were not cited within the first two years of publication. The mean number of citations for collaborative studies (2.74) was significantly higher than that of non-collaborative studies (0.66). The mean number of citations did not significantly change depending on whether the publication had a positive result (2.08) or not (2.92), and was also not influenced by the presence (2.30) or absence (1.99) of the main study conclusion in the title of the article. In the linear regression model, the journal rank, number of authors, conducting the study abroad, being a research study or systematic review/meta-analysis, and having regional and/or international collaboration all significantly increased the number of citations. Conclusion The journal rank, number of authors, conducting the study abroad, being a research study or systematic review/meta-analysis, and having regional and/or international collaboration all significantly increased the number of citations. However, the presence of a positive result in the study did not influence the citation rate.
Abstract:
A model of root water extraction is proposed, in which a linear variation of extraction rate with depth is assumed. Five crops are chosen for simulation studies of the model, and soil moisture depletion under optimal conditions from different layers for each crop is calculated. Similar calculations are also made using the constant extraction rate model. Rooting depth is assumed to vary linearly with potential evapotranspiration for each crop during the vegetative phase. The calculated depletion patterns are compared with measured mean depletion patterns for each crop. It is shown that the constant extraction rate model results in large errors in the prediction of soil moisture depletion, while the proposed linear extraction rate model gives satisfactory results. Hypothetical depletion patterns predicted by the model in combination with a moisture tension-dependent sink term developed elsewhere are indicated.
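A generic linear extraction-rate profile consistent with the description above (not necessarily the paper's exact parameterization) lets the extraction rate S decrease linearly with depth z while integrating to the transpiration rate T over the rooting depth z_r:

```latex
S(z) = \frac{2T}{z_r}\left(1 - \frac{z}{z_r}\right), \qquad 0 \le z \le z_r,
\qquad \int_0^{z_r} S(z)\,dz = T ,
```

which can be contrasted with the constant extraction rate profile S(z) = T / z_r used in the comparison model.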
Abstract:
This paper presents a detailed analysis of a model for military conflicts in which the defending forces have to determine an optimal partitioning of available resources to counter attacks from an adversary on two different fronts in an area fire situation. A Lanchester linear law attrition model is used to develop the dynamical equations governing the variation in force strength. Here we address a static resource allocation problem, namely Time-Zero-Allocation (TZA), in which the resource allocation is done only at the initial time. Numerical examples are given to support the analytical results.
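A small numerical sketch of the two-front Time-Zero-Allocation problem under Lanchester linear-law (area fire) attrition; the force sizes, attrition coefficients, time horizon, and allocation grid below are illustrative assumptions, not the paper's data.

```python
import numpy as np
from scipy.integrate import solve_ivp

def two_front_tza(defender_total=100.0, split=0.5,
                  attackers=(60.0, 50.0), a=0.01, b=0.012, t_end=10.0):
    """Lanchester linear-law (area fire) attrition on two fronts with a
    Time-Zero-Allocation: the defender commits a fixed fraction `split`
    of its force to front 1 at t = 0 and the rest to front 2."""
    d1, d2 = split * defender_total, (1 - split) * defender_total

    def rhs(t, s):
        x1, y1, x2, y2 = s          # (defender, attacker) on each front
        # Linear law: each side's losses are proportional to the product
        # of the opposing strengths on that front.
        return [-a * x1 * y1, -b * x1 * y1,
                -a * x2 * y2, -b * x2 * y2]

    sol = solve_ivp(rhs, (0.0, t_end), [d1, attackers[0], d2, attackers[1]])
    return sol.y[:, -1]             # surviving strengths at t_end

# Compare a few static (time-zero) allocations of the defending force.
for split in (0.3, 0.5, 0.7):
    print(split, two_front_tza(split=split))
```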
Abstract:
Sub-pixel classification is essential for the successful description of many land cover (LC) features whose spatial extent is smaller than the size of the image pixels. A commonly used approach for sub-pixel classification is the linear mixture model (LMM). Even though LMM have shown acceptable results, in practice purely linear mixtures do not exist. A non-linear mixture model, therefore, may better describe the resultant mixture spectra for the endmember (pure pixel) distribution. In this paper, we propose a new methodology for inferring LC fractions by a process called the automatic linear-nonlinear mixture model (AL-NLMM). AL-NLMM is a three-step process in which the endmembers are first derived from an automated algorithm. These endmembers are used by the LMM in the second step, which provides abundance estimation in a linear fashion. Finally, the abundance values, along with training samples representing the actual proportions, are fed as input to a multi-layer perceptron (MLP) architecture, which further refines the abundance estimates to account for the non-linear nature of the mixing of the classes of interest. AL-NLMM is validated on computer-simulated hyperspectral data of 200 bands. Validation of the output showed an overall RMSE of 0.0089±0.0022 with the LMM and 0.0030±0.0001 with the MLP-based AL-NLMM when compared to actual class proportions, indicating that individual class abundances obtained from AL-NLMM are very close to the real observations.
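A compact Python sketch of the second and third stages (linear unmixing followed by MLP refinement); the endmember spectra, the mild non-linearity, and the network settings are placeholders rather than the paper's simulated 200-band data or its automated endmember extraction step.

```python
import numpy as np
from scipy.optimize import nnls
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
n_bands, n_end, n_pixels = 200, 4, 500
E = rng.random((n_bands, n_end))                 # placeholder endmember spectra

# True abundances (sum to one) and slightly non-linear mixed spectra.
A = rng.dirichlet(np.ones(n_end), size=n_pixels)
linear_part = A @ E.T
spectra = linear_part + 0.05 * linear_part ** 2 \
          + rng.normal(scale=0.001, size=(n_pixels, n_bands))

# Stage 2: linear mixture model -- non-negative least squares per pixel.
A_lmm = np.array([nnls(E, s)[0] for s in spectra])
A_lmm /= A_lmm.sum(axis=1, keepdims=True)        # enforce sum-to-one

# Stage 3: an MLP refines the linear abundance estimates using training
# samples with known proportions, to absorb the non-linear mixing.
mlp = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
mlp.fit(A_lmm[:300], A[:300])
A_refined = mlp.predict(A_lmm[300:])
print("RMSE after refinement:", np.sqrt(np.mean((A_refined - A[300:]) ** 2)))
```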
Abstract:
Semisupervised dimensionality reduction has been attracting much attention, as it not only utilizes both labeled and unlabeled data simultaneously, but also works well in the out-of-sample situation. This paper proposes an effective approach to semisupervised dimensionality reduction through label propagation and label regression. Different from previous efforts, the new approach propagates the label information from labeled to unlabeled data with a well-designed mechanism of random walks, in which outliers are effectively detected and the obtained virtual labels of unlabeled data can be well encoded in a weighted regression model. These virtual labels are thereafter regressed with a linear model to calculate the projection matrix for dimensionality reduction. By this means, when the manifold or the clustering assumption of the data is satisfied, the labels of labeled data can be correctly propagated to the unlabeled data; thus, the proposed approach utilizes the labeled and the unlabeled data more effectively than previous work. Experiments are carried out on several databases, and the advantage of the new approach is well demonstrated.
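A small Python sketch of the two stages (graph-based label propagation to obtain virtual labels, then a linear regression whose coefficients serve as the projection matrix); scikit-learn's LabelSpreading is only a stand-in for the paper's outlier-aware random-walk propagation, and the toy data and ridge penalty are assumptions.

```python
import numpy as np
from sklearn.semi_supervised import LabelSpreading
from sklearn.linear_model import Ridge
from sklearn.datasets import make_moons

# Labeled + unlabeled data; -1 marks the unlabeled points.
X, y = make_moons(n_samples=300, noise=0.1, random_state=0)
y_semi = y.copy()
y_semi[30:] = -1

# Stage 1: propagate label information from labeled to unlabeled points
# via a graph-based scheme (stand-in for the random-walk mechanism).
prop = LabelSpreading(kernel="rbf", gamma=20).fit(X, y_semi)
virtual = prop.label_distributions_          # soft "virtual labels"

# Stage 2: regress the virtual labels with a linear model; the
# coefficients act as the projection matrix for dimensionality reduction.
reg = Ridge(alpha=1.0).fit(X, virtual)
W = reg.coef_.T                              # (n_features, n_classes)
X_reduced = X @ W
print(X_reduced.shape)
```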
Abstract:
The standard linear-quadratic survival model for radiotherapy is used to investigate how different schedules of radiation treatment planning may be affected by different tumour repopulation kinetics between treatments.
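For context, the standard linear-quadratic surviving fraction with a simple exponential repopulation correction for fractionated schedules can be written as follows (a common textbook form, not necessarily the exact schedule model of the paper):

```latex
S = \exp\!\Big[-n\,(\alpha d + \beta d^{2}) + \frac{\ln 2}{T_p}\,(T - T_k)_{+}\Big],
```

where n fractions of dose d are delivered over a total treatment time T, repopulation starts after a kick-off time T_k, and T_p is the effective doubling time of the tumour cells.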
Abstract:
Discrete Conditional Phase-type (DC-Ph) models are a family of models which represent skewed survival data conditioned on specific inter-related discrete variables. The survival data are modeled using a Coxian phase-type distribution, which is associated with the inter-related variables using a range of possible data mining approaches such as Bayesian networks (BNs), the Naïve Bayes classification method, and classification and regression trees. This paper utilizes the DC-Ph model to explore the modeling of patient waiting times in an Accident and Emergency Department of a UK hospital. The resulting DC-Ph model takes on the form of the Coxian phase-type distribution conditioned on the outcome of a logistic regression model.
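For reference, a Coxian phase-type survival function has the standard matrix-exponential form below (a generic definition; in the DC-Ph model the rates are conditioned on the inter-related discrete variables, here the outcome of the logistic regression):

```latex
S(t) = \boldsymbol{\alpha}\,e^{\mathbf{T}t}\,\mathbf{1}, \qquad
\boldsymbol{\alpha} = (1,0,\dots,0), \qquad
\mathbf{T} =
\begin{pmatrix}
-(\lambda_1+\mu_1) & \lambda_1 &        &         \\
                   & -(\lambda_2+\mu_2) & \lambda_2 &  \\
                   &        & \ddots & \ddots  \\
                   &        &        & -\mu_n
\end{pmatrix},
```

where the lambda_i are the rates of moving through the ordered transient phases and the mu_i are the absorption (exit) rates.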
Abstract:
A parametric regression model for right-censored data, with a log-linear median regression function and a transformation in both response and regression parts, named the parametric Transform-Both-Sides (TBS) model, is presented. The TBS model has a parameter that handles data asymmetry while allowing various distributions for the error, as long as they are unimodal symmetric distributions centered at zero. The discussion is focused on the estimation procedure with five important error distributions (normal, double-exponential, Student's t, Cauchy and logistic) and presents properties, associated functions (that is, survival and hazard functions) and estimation methods based on maximum likelihood and on the Bayesian paradigm. These procedures are implemented in TBSSurvival, an open-source, fully documented R package. The use of the package is illustrated and the performance of the model is analyzed using both simulated and real data sets.
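Schematically, the TBS model applies the same transformation g_lambda to both sides of a log-linear median regression (a restatement of the abstract; the exact form of g_lambda is left to the paper and the TBSSurvival documentation):

```latex
g_{\lambda}\!\big(\log T_i\big) = g_{\lambda}\!\big(\mathbf{x}_i^{\top}\boldsymbol{\beta}\big) + \varepsilon_i ,
\qquad \varepsilon_i \ \text{unimodal, symmetric, centered at } 0,
```

so that, by the symmetry of the error and monotonicity of g_lambda, exp(x_i'beta) is the conditional median of the survival time T_i.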
An integrated approach for real-time model-based state-of-charge estimation of lithium-ion batteries
Abstract:
Lithium-ion batteries have been widely adopted in electric vehicles (EVs), and accurate state of charge (SOC) estimation is of paramount importance for the EV battery management system. Though a number of methods have been proposed, SOC estimation for lithium-ion batteries such as the LiFePO4 battery faces two key challenges: the flat open circuit voltage (OCV) vs SOC relationship over some SOC ranges, and the hysteresis effect. To address these problems, an integrated approach for real-time model-based SOC estimation of lithium-ion batteries is proposed in this paper. Firstly, an auto-regression model is adopted to reproduce the battery terminal behaviour, combined with a non-linear complementary model to capture the hysteresis effect. The model parameters, including linear parameters and non-linear parameters, are optimized off-line using a hybrid optimization method that combines a meta-heuristic method (i.e., the teaching-learning-based optimization method) and the least squares method. Secondly, using the trained model, two real-time model-based SOC estimation methods are presented: one based on the real-time battery OCV regression model obtained through the weighted recursive least squares method, and the other based on state estimation using the extended Kalman filter (EKF). To tackle the problem caused by the flat OCV-vs-SOC segments when the OCV-based SOC estimation method is adopted, a method combining coulomb counting and the OCV-based method is proposed. Finally, modelling results and SOC estimation results are presented and analysed using data collected from a LiFePO4 battery cell. The results confirm the effectiveness of the proposed approach, in particular the joint-EKF method.
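A minimal Python sketch of the EKF branch only, using a one-state SOC model with coulomb counting as the process model and a placeholder OCV curve as the measurement model; the hysteresis model, the auto-regressive terminal-voltage model, the off-line parameter identification, and the real cell data are all omitted, and every numeric value below is an assumption.

```python
import numpy as np

def ocv(soc):
    """Placeholder OCV(SOC) curve; a real LiFePO4 curve is much flatter
    over the mid-SOC range, which is exactly what makes estimation hard."""
    return 3.2 + 0.4 * soc + 0.1 * np.tanh(8 * (soc - 0.5))

def ekf_soc(currents, voltages, dt=1.0, capacity_ah=2.3,
            soc0=0.9, q=1e-7, r=1e-3):
    """One-state extended Kalman filter for SOC.
    Process model: coulomb counting  soc_k = soc_{k-1} - i*dt/(3600*C).
    Measurement model: v_k = ocv(soc_k) + noise (internal resistance,
    hysteresis and RC dynamics omitted for brevity)."""
    soc, P = soc0, 1e-2
    out = []
    for i, v in zip(currents, voltages):
        # predict (coulomb counting step)
        soc = soc - i * dt / (3600.0 * capacity_ah)
        P = P + q
        # linearize the OCV curve around the predicted SOC
        eps = 1e-4
        H = (ocv(soc + eps) - ocv(soc - eps)) / (2 * eps)
        # update with the measured terminal voltage
        K = P * H / (H * P * H + r)
        soc = soc + K * (v - ocv(soc))
        P = (1 - K * H) * P
        out.append(np.clip(soc, 0.0, 1.0))
    return np.array(out)

# Illustrative use with synthetic constant-current discharge data:
i_meas = np.full(600, 1.0)                       # 1 A discharge for 10 minutes
v_meas = ocv(0.9 - np.cumsum(i_meas) / (3600 * 2.3)) + 0.01 * np.random.randn(600)
print(ekf_soc(i_meas, v_meas)[-1])               # final SOC estimate
```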
Abstract:
Most studies involving statistical time series analysis rely on assumptions of linearity, whose simplicity facilitates parameter interpretation and estimation. However, the linearity assumption may be too restrictive for many practical applications. The implementation of nonlinear models in time series analysis involves the estimation of a large set of parameters, frequently leading to overfitting problems. In this article, a predictability coefficient is estimated using a combination of nonlinear autoregressive models, and the use of support vector regression in this framework is explored. We illustrate the usefulness and interpretability of the results using electroencephalographic records of an epileptic patient.
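A compact Python sketch of the idea: a nonlinear autoregressive model fitted with support vector regression on lagged values and scored out of sample. The synthetic series stands in for the EEG records, and the out-of-sample R² below is only a stand-in for the article's predictability coefficient.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.metrics import r2_score

def lagged_matrix(x, p):
    """Nonlinear-AR design: predict x[t] from x[t-1], ..., x[t-p]."""
    X = np.column_stack([x[p - k - 1:len(x) - k - 1] for k in range(p)])
    return X, x[p:]

rng = np.random.default_rng(0)
x = np.sin(np.linspace(0, 60, 1500)) + 0.2 * rng.normal(size=1500)  # EEG stand-in

X, y = lagged_matrix(x, p=10)
split = int(0.8 * len(y))
svr = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X[:split], y[:split])

# Out-of-sample R^2 as a simple predictability score.
print("predictability (R^2):", r2_score(y[split:], svr.predict(X[split:])))
```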