362 resultados para Software defect prediction


Relevância:

20.00% 20.00%

Publicador:

Resumo:

The Dynamic Data eXchange (DDX) is our third generation platform for building distributed robot controllers. DDX allows a coalition of programs to share data at run-time through an efficient shared memory mechanism managed by a store. Further, stores on multiple machines can be linked by means of a global catalog and data is moved between the stores on an as needed basis by multi-casting. Heterogeneous computer systems are handled. We describe the architecture of DDX and the standard clients we have developed that let us rapidly build complex control systems with minimal coding.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The high morbidity and mortality associated with atherosclerotic coronary vascular disease (CVD) and its complications are being lessened by the increased knowledge of risk factors, effective preventative measures and proven therapeutic interventions. However, significant CVD morbidity remains and sudden cardiac death continues to be a presenting feature for some subsequently diagnosed with CVD. Coronary vascular disease is also the leading cause of anaesthesia related complications. Stress electrocardiography/exercise testing is predictive of 10 year risk of CVD events and the cardiovascular variables used to score this test are monitored peri-operatively. Similar physiological time-series datasets are being subjected to data mining methods for the prediction of medical diagnoses and outcomes. This study aims to find predictors of CVD using anaesthesia time-series data and patient risk factor data. Several pre-processing and predictive data mining methods are applied to this data. Physiological time-series data related to anaesthetic procedures are subjected to pre-processing methods for removal of outliers, calculation of moving averages as well as data summarisation and data abstraction methods. Feature selection methods of both wrapper and filter types are applied to derived physiological time-series variable sets alone and to the same variables combined with risk factor variables. The ability of these methods to identify subsets of highly correlated but non-redundant variables is assessed. The major dataset is derived from the entire anaesthesia population and subsets of this population are considered to be at increased anaesthesia risk based on their need for more intensive monitoring (invasive haemodynamic monitoring and additional ECG leads). Because of the unbalanced class distribution in the data, majority class under-sampling and Kappa statistic together with misclassification rate and area under the ROC curve (AUC) are used for evaluation of models generated using different prediction algorithms. The performance based on models derived from feature reduced datasets reveal the filter method, Cfs subset evaluation, to be most consistently effective although Consistency derived subsets tended to slightly increased accuracy but markedly increased complexity. The use of misclassification rate (MR) for model performance evaluation is influenced by class distribution. This could be eliminated by consideration of the AUC or Kappa statistic as well by evaluation of subsets with under-sampled majority class. The noise and outlier removal pre-processing methods produced models with MR ranging from 10.69 to 12.62 with the lowest value being for data from which both outliers and noise were removed (MR 10.69). For the raw time-series dataset, MR is 12.34. Feature selection results in reduction in MR to 9.8 to 10.16 with time segmented summary data (dataset F) MR being 9.8 and raw time-series summary data (dataset A) being 9.92. However, for all time-series only based datasets, the complexity is high. For most pre-processing methods, Cfs could identify a subset of correlated and non-redundant variables from the time-series alone datasets but models derived from these subsets are of one leaf only. MR values are consistent with class distribution in the subset folds evaluated in the n-cross validation method. For models based on Cfs selected time-series derived and risk factor (RF) variables, the MR ranges from 8.83 to 10.36 with dataset RF_A (raw time-series data and RF) being 8.85 and dataset RF_F (time segmented time-series variables and RF) being 9.09. The models based on counts of outliers and counts of data points outside normal range (Dataset RF_E) and derived variables based on time series transformed using Symbolic Aggregate Approximation (SAX) with associated time-series pattern cluster membership (Dataset RF_ G) perform the least well with MR of 10.25 and 10.36 respectively. For coronary vascular disease prediction, nearest neighbour (NNge) and the support vector machine based method, SMO, have the highest MR of 10.1 and 10.28 while logistic regression (LR) and the decision tree (DT) method, J48, have MR of 8.85 and 9.0 respectively. DT rules are most comprehensible and clinically relevant. The predictive accuracy increase achieved by addition of risk factor variables to time-series variable based models is significant. The addition of time-series derived variables to models based on risk factor variables alone is associated with a trend to improved performance. Data mining of feature reduced, anaesthesia time-series variables together with risk factor variables can produce compact and moderately accurate models able to predict coronary vascular disease. Decision tree analysis of time-series data combined with risk factor variables yields rules which are more accurate than models based on time-series data alone. The limited additional value provided by electrocardiographic variables when compared to use of risk factors alone is similar to recent suggestions that exercise electrocardiography (exECG) under standardised conditions has limited additional diagnostic value over risk factor analysis and symptom pattern. The effect of the pre-processing used in this study had limited effect when time-series variables and risk factor variables are used as model input. In the absence of risk factor input, the use of time-series variables after outlier removal and time series variables based on physiological variable values’ being outside the accepted normal range is associated with some improvement in model performance.

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a travel time prediction model and evaluates its performance and transferability. Advanced Travelers Information Systems (ATIS) are gaining more and more importance, increasing the need for accurate, timely and useful information to the travelers. Travel time information quantifies the traffic condition in an easy to understand way for the users. The proposed travel time prediction model is based on an efficient use of nearest neighbor search. The model is calibrated for optimal performance using Genetic Algorithms. Results indicate better performance by using the proposed model than the presently used naïve model.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A national-level safety analysis tool is needed to complement existing analytical tools for assessment of the safety impacts of roadway design alternatives. FHWA has sponsored the development of the Interactive Highway Safety Design Model (IHSDM), which is roadway design and redesign software that estimates the safety effects of alternative designs. Considering the importance of IHSDM in shaping the future of safety-related transportation investment decisions, FHWA justifiably sponsored research with the sole intent of independently validating some of the statistical models and algorithms in IHSDM. Statistical model validation aims to accomplish many important tasks, including (a) assessment of the logical defensibility of proposed models, (b) assessment of the transferability of models over future time periods and across different geographic locations, and (c) identification of areas in which future model improvements should be made. These three activities are reported for five proposed types of rural intersection crash prediction models. The internal validation of the model revealed that the crash models potentially suffer from omitted variables that affect safety, site selection and countermeasure selection bias, poorly measured and surrogate variables, and misspecification of model functional forms. The external validation indicated the inability of models to perform on par with model estimation performance. Recommendations for improving the state of the practice from this research include the systematic conduct of carefully designed before-and-after studies, improvements in data standardization and collection practices, and the development of analytical methods to combine the results of before-and-after studies with cross-sectional studies in a meaningful and useful way.