137 resultados para semiparametric regression
Resumo:
Focuses on a study which introduced an iterative modeling method that combines properties of ordinary least squares (OLS) with hierarchical tree-based regression (HTBR) in transportation engineering. Information on OLS and HTBR; Comparison and contrasts of OLS and HTBR; Conclusions.
Resumo:
There has been considerable research conducted over the last 20 years focused on predicting motor vehicle crashes on transportation facilities. The range of statistical models commonly applied includes binomial, Poisson, Poisson-gamma (or negative binomial), zero-inflated Poisson and negative binomial models (ZIP and ZINB), and multinomial probability models. Given the range of possible modeling approaches and the host of assumptions with each modeling approach, making an intelligent choice for modeling motor vehicle crash data is difficult. There is little discussion in the literature comparing different statistical modeling approaches, identifying which statistical models are most appropriate for modeling crash data, and providing a strong justification from basic crash principles. In the recent literature, it has been suggested that the motor vehicle crash process can successfully be modeled by assuming a dual-state data-generating process, which implies that entities (e.g., intersections, road segments, pedestrian crossings, etc.) exist in one of two states—perfectly safe and unsafe. As a result, the ZIP and ZINB are two models that have been applied to account for the preponderance of “excess” zeros frequently observed in crash count data. The objective of this study is to provide defensible guidance on how to appropriate model crash data. We first examine the motor vehicle crash process using theoretical principles and a basic understanding of the crash process. It is shown that the fundamental crash process follows a Bernoulli trial with unequal probability of independent events, also known as Poisson trials. We examine the evolution of statistical models as they apply to the motor vehicle crash process, and indicate how well they statistically approximate the crash process. We also present the theory behind dual-state process count models, and note why they have become popular for modeling crash data. A simulation experiment is then conducted to demonstrate how crash data give rise to “excess” zeros frequently observed in crash data. It is shown that the Poisson and other mixed probabilistic structures are approximations assumed for modeling the motor vehicle crash process. Furthermore, it is demonstrated that under certain (fairly common) circumstances excess zeros are observed—and that these circumstances arise from low exposure and/or inappropriate selection of time/space scales and not an underlying dual state process. In conclusion, carefully selecting the time/space scales for analysis, including an improved set of explanatory variables and/or unobserved heterogeneity effects in count regression models, or applying small-area statistical methods (observations with low exposure) represent the most defensible modeling approaches for datasets with a preponderance of zeros
Resumo:
This paper presents an approach to predict the operating conditions of machine based on classification and regression trees (CART) and adaptive neuro-fuzzy inference system (ANFIS) in association with direct prediction strategy for multi-step ahead prediction of time series techniques. In this study, the number of available observations and the number of predicted steps are initially determined by using false nearest neighbor method and auto mutual information technique, respectively. These values are subsequently utilized as inputs for prediction models to forecast the future values of the machines’ operating conditions. The performance of the proposed approach is then evaluated by using real trending data of low methane compressor. A comparative study of the predicted results obtained from CART and ANFIS models is also carried out to appraise the prediction capability of these models. The results show that the ANFIS prediction model can track the change in machine conditions and has the potential for using as a tool to machine fault prognosis.
Resumo:
There have been notable advances in learning to control complex robotic systems using methods such as Locally Weighted Regression (LWR). In this paper we explore some potential limits of LWR for robotic applications, particularly investigating its application to systems with a long horizon of temporal dependence. We define the horizon of temporal dependence as the delay from a control input to a desired change in output. LWR alone cannot be used in a temporally dependent system to find meaningful control values from only the current state variables and output, as the relationship between the input and the current state is under-constrained. By introducing a receding horizon of the future output states of the system, we show that sufficient constraint is applied to learn good solutions through LWR. The new method, Receding Horizon Locally Weighted Regression (RH-LWR), is demonstrated through one-shot learning on a real Series Elastic Actuator controlling a pendulum.
Resumo:
We consider the problem of how to construct robust designs for Poisson regression models. An analytical expression is derived for robust designs for first-order Poisson regression models where uncertainty exists in the prior parameter estimates. Given certain constraints in the methodology, it may be necessary to extend the robust designs for implementation in practical experiments. With these extensions, our methodology constructs designs which perform similarly, in terms of estimation, to current techniques, and offers the solution in a more timely manner. We further apply this analytic result to cases where uncertainty exists in the linear predictor. The application of this methodology to practical design problems such as screening experiments is explored. Given the minimal prior knowledge that is usually available when conducting such experiments, it is recommended to derive designs robust across a variety of systems. However, incorporating such uncertainty into the design process can be a computationally intense exercise. Hence, our analytic approach is explored as an alternative.
Resumo:
Background The residue-wise contact order (RWCO) describes the sequence separations between the residues of interest and its contacting residues in a protein sequence. It is a new kind of one-dimensional protein structure that represents the extent of long-range contacts and is considered as a generalization of contact order. Together with secondary structure, accessible surface area, the B factor, and contact number, RWCO provides comprehensive and indispensable important information to reconstructing the protein three-dimensional structure from a set of one-dimensional structural properties. Accurately predicting RWCO values could have many important applications in protein three-dimensional structure prediction and protein folding rate prediction, and give deep insights into protein sequence-structure relationships. Results We developed a novel approach to predict residue-wise contact order values in proteins based on support vector regression (SVR), starting from primary amino acid sequences. We explored seven different sequence encoding schemes to examine their effects on the prediction performance, including local sequence in the form of PSI-BLAST profiles, local sequence plus amino acid composition, local sequence plus molecular weight, local sequence plus secondary structure predicted by PSIPRED, local sequence plus molecular weight and amino acid composition, local sequence plus molecular weight and predicted secondary structure, and local sequence plus molecular weight, amino acid composition and predicted secondary structure. When using local sequences with multiple sequence alignments in the form of PSI-BLAST profiles, we could predict the RWCO distribution with a Pearson correlation coefficient (CC) between the predicted and observed RWCO values of 0.55, and root mean square error (RMSE) of 0.82, based on a well-defined dataset with 680 protein sequences. Moreover, by incorporating global features such as molecular weight and amino acid composition we could further improve the prediction performance with the CC to 0.57 and an RMSE of 0.79. In addition, combining the predicted secondary structure by PSIPRED was found to significantly improve the prediction performance and could yield the best prediction accuracy with a CC of 0.60 and RMSE of 0.78, which provided at least comparable performance compared with the other existing methods. Conclusion The SVR method shows a prediction performance competitive with or at least comparable to the previously developed linear regression-based methods for predicting RWCO values. In contrast to support vector classification (SVC), SVR is very good at estimating the raw value profiles of the samples. The successful application of the SVR approach in this study reinforces the fact that support vector regression is a powerful tool in extracting the protein sequence-structure relationship and in estimating the protein structural profiles from amino acid sequences.
Resumo:
We consider the problem of how to construct robust designs for Poisson regression models. An analytical expression is derived for robust designs for first-order Poisson regression models where uncertainty exists in the prior parameter estimates. Given certain constraints in the methodology, it may be necessary to extend the robust designs for implementation in practical experiments. With these extensions, our methodology constructs designs which perform similarly, in terms of estimation, to current techniques, and offers the solution in a more timely manner. We further apply this analytic result to cases where uncertainty exists in the linear predictor. The application of this methodology to practical design problems such as screening experiments is explored. Given the minimal prior knowledge that is usually available when conducting such experiments, it is recommended to derive designs robust across a variety of systems. However, incorporating such uncertainty into the design process can be a computationally intense exercise. Hence, our analytic approach is explored as an alternative.
Resumo:
Between 2001 and 2005, the US airline industry faced financial turmoil. At the same time, the European airline industry entered a period of substantive deregulation. This period witnessed opportunities for low-cost carriers to become more competitive in the market as a result of these combined events. To help assess airline performance in the aftermath of these events, this paper provides new evidence of technical efficiency for 42 national and international airlines in 2006 using the data envelopment analysis (DEA) bootstrap approach first proposed by Simar and Wilson (J Econ, 136:31-64, 2007). In the first stage, technical efficiency scores are estimated using a bootstrap DEA model. In the second stage, a truncated regression is employed to quantify the economic drivers underlying measured technical efficiency. The results highlight the key role played by non-discretionary inputs in measures of airline technical efficiency.
Resumo:
In this paper we explore the ability of a recent model-based learning technique Receding Horizon Locally Weighted Regression (RH-LWR) useful for learning temporally dependent systems. In particular this paper investigates the application of RH-LWR to learn control of Multiple-input Multiple-output robot systems. RH-LWR is demonstrated through learning joint velocity and position control of a three Degree of Freedom (DoF) rigid body robot.
Resumo:
The motivation of the study stems from the results reported in the Excellence in Research for Australia (ERA) 2010 report. The report showed that only 12 universities performed research at or above international standards, of which, the Group of Eight (G8) universities filled the top eight spots. While performance of universities was based on number of research outputs, total amount of research income and other quantitative indicators, the measure of efficiency or productivity was not considered. The objectives of this paper are twofold. First, to provide a review of the research performance of 37 Australian universities using the data envelopment analysis (DEA) bootstrap approach of Simar and Wilson (2007). Second, to determine sources of productivity drivers by regressing the efficiency scores against a set of environmental variables.