711 results for Reduced models
Abstract:
The high morbidity and mortality associated with atherosclerotic coronary vascular disease (CVD) and its complications are being lessened by the increased knowledge of risk factors, effective preventative measures and proven therapeutic interventions. However, significant CVD morbidity remains and sudden cardiac death continues to be a presenting feature for some subsequently diagnosed with CVD. Coronary vascular disease is also the leading cause of anaesthesia related complications. Stress electrocardiography/exercise testing is predictive of 10 year risk of CVD events and the cardiovascular variables used to score this test are monitored peri-operatively. Similar physiological time-series datasets are being subjected to data mining methods for the prediction of medical diagnoses and outcomes. This study aims to find predictors of CVD using anaesthesia time-series data and patient risk factor data. Several pre-processing and predictive data mining methods are applied to this data. Physiological time-series data related to anaesthetic procedures are subjected to pre-processing methods for removal of outliers, calculation of moving averages as well as data summarisation and data abstraction methods. Feature selection methods of both wrapper and filter types are applied to derived physiological time-series variable sets alone and to the same variables combined with risk factor variables. The ability of these methods to identify subsets of highly correlated but non-redundant variables is assessed. The major dataset is derived from the entire anaesthesia population and subsets of this population are considered to be at increased anaesthesia risk based on their need for more intensive monitoring (invasive haemodynamic monitoring and additional ECG leads). 
Because of the unbalanced class distribution in the data, majority class under-sampling and the Kappa statistic, together with misclassification rate and area under the ROC curve (AUC), are used for evaluation of models generated using different prediction algorithms. The performance of models derived from feature reduced datasets reveals the filter method, Cfs subset evaluation, to be most consistently effective, although Consistency derived subsets tended to give slightly increased accuracy but markedly increased complexity. The use of misclassification rate (MR) for model performance evaluation is influenced by class distribution. This influence could be eliminated by consideration of the AUC or Kappa statistic, as well as by evaluation of subsets with an under-sampled majority class. The noise and outlier removal pre-processing methods produced models with MR ranging from 10.69 to 12.62, with the lowest value being for data from which both outliers and noise were removed (MR 10.69). For the raw time-series dataset, MR is 12.34. Feature selection results in a reduction in MR to between 9.8 and 10.16, with time segmented summary data (dataset F) MR being 9.8 and raw time-series summary data (dataset A) being 9.92. However, for all time-series only based datasets, the complexity is high. For most pre-processing methods, Cfs could identify a subset of correlated and non-redundant variables from the time-series alone datasets, but models derived from these subsets are of one leaf only. MR values are consistent with class distribution in the subset folds evaluated in the n-fold cross validation method. For models based on Cfs selected time-series derived and risk factor (RF) variables, the MR ranges from 8.83 to 10.36, with dataset RF_A (raw time-series data and RF) being 8.85 and dataset RF_F (time segmented time-series variables and RF) being 9.09.
The models based on counts of outliers and counts of data points outside normal range (Dataset RF_E) and derived variables based on time series transformed using Symbolic Aggregate Approximation (SAX) with associated time-series pattern cluster membership (Dataset RF_G) perform the least well, with MR of 10.25 and 10.36 respectively. For coronary vascular disease prediction, nearest neighbour (NNge) and the support vector machine based method, SMO, have the highest MR of 10.1 and 10.28, while logistic regression (LR) and the decision tree (DT) method, J48, have MR of 8.85 and 9.0 respectively. DT rules are most comprehensible and clinically relevant. The predictive accuracy increase achieved by addition of risk factor variables to time-series variable based models is significant. The addition of time-series derived variables to models based on risk factor variables alone is associated with a trend to improved performance. Data mining of feature reduced anaesthesia time-series variables together with risk factor variables can produce compact and moderately accurate models able to predict coronary vascular disease. Decision tree analysis of time-series data combined with risk factor variables yields rules which are more accurate than models based on time-series data alone. The limited additional value provided by electrocardiographic variables when compared to use of risk factors alone is similar to recent suggestions that exercise electrocardiography (exECG) under standardised conditions has limited additional diagnostic value over risk factor analysis and symptom pattern. The pre-processing used in this study had limited effect when time-series variables and risk factor variables are used together as model input.
In the absence of risk factor input, the use of time-series variables after outlier removal, and of time-series variables based on physiological values lying outside the accepted normal range, is associated with some improvement in model performance.
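The abstract's point that misclassification rate follows the class distribution, while the Kappa statistic and majority class under-sampling do not, can be illustrated with a minimal sketch. The function names and the 90/10 class split below are assumptions made for illustration; they are not taken from the study, and the "one leaf" model is simulated simply by predicting the majority class for every case.

```python
import random


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def misclassification_rate(y_true, y_pred):
    """Fraction of cases whose predicted label differs from the true label."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (fp + fn) / (tp + fp + fn + tn)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    if p_chance == 1.0:
        return 0.0  # degenerate case: only one class present and predicted
    return (p_observed - p_chance) / (1.0 - p_chance)


def undersample_majority(y_true, y_pred, seed=0):
    """Randomly drop majority-class cases until both classes are the same size."""
    rng = random.Random(seed)
    pos = [i for i, t in enumerate(y_true) if t == 1]
    neg = [i for i, t in enumerate(y_true) if t == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    keep = sorted(minority + rng.sample(majority, len(minority)))
    return [y_true[i] for i in keep], [y_pred[i] for i in keep]


if __name__ == "__main__":
    # Hypothetical data: 90 cases without CVD, 10 with CVD.
    y_true = [0] * 90 + [1] * 10
    # A "one leaf" model that always predicts the majority class.
    y_pred = [0] * 100
    print("MR:", misclassification_rate(y_true, y_pred))
    print("Kappa:", cohen_kappa(y_true, y_pred))
    yt_bal, yp_bal = undersample_majority(y_true, y_pred)
    print("MR after under-sampling:", misclassification_rate(yt_bal, yp_bal))
```

In this sketch the majority-class model looks acceptable on misclassification rate alone, because that rate simply mirrors the proportion of minority cases. Kappa and the under-sampled evaluation both show that the model carries no information.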
Abstract:
Longitudinal data, in which observations are repeatedly measured over time or age, provide the foundation for the analysis of processes which evolve over time, and models of such processes can be referred to as growth or trajectory models. One of the traditional ways of looking at growth models is to employ either linear or polynomial functional forms to model trajectory shape, and to account for variation around an overall mean trend with the inclusion of random effects or individual variation on the functional shape parameters. The identification of distinct subgroups or sub-classes (latent classes) within these trajectory models which are not based on some pre-existing individual classification provides an important methodology with substantive implications. The identification of subgroups or classes has a wide application in the medical arena, where responder/non-responder identification based on distinctly differing trajectories delivers further information for clinical processes. This thesis develops Bayesian statistical models and techniques for the identification of subgroups in the analysis of longitudinal data where the number of time intervals is limited. These models are then applied to a single case study which investigates the neuropsychological cognition of early stage breast cancer patients undergoing adjuvant chemotherapy treatment, from the Cognition in Breast Cancer Study undertaken by the Wesley Research Institute of Brisbane, Queensland. Alternative formulations to the linear or polynomial approach are taken which use piecewise linear models with a single turning point, change-point or knot at a known time point, and latent basis models for the non-linear trajectories found for the verbal memory domain of cognitive function before and after chemotherapy treatment.
Hierarchical Bayesian random effects models are used as a starting point for the latent class modelling process and are extended with the incorporation of covariates in the trajectory profiles and as predictors of class membership. The Bayesian latent basis models enable the degree of recovery post-chemotherapy to be estimated for short and long-term follow-up occasions, and the distinct class trajectories assist in the identification of breast cancer patients who may be at risk of long-term verbal memory impairment.
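The piecewise linear formulation with a single knot at a known time point, mentioned in the abstract, can be sketched as a mean trajectory function. This is only the deterministic mean curve, not the thesis's Bayesian latent class model; the parameter names, assessment times and class-specific values are hypothetical and chosen purely for illustration.

```python
def piecewise_linear_mean(t, intercept, slope_before, slope_change, knot):
    """Mean trajectory with one knot at a known time.

    The slope is `slope_before` up to the knot and
    `slope_before + slope_change` afterwards; the curve is continuous at the knot.
    """
    return intercept + slope_before * t + slope_change * max(0.0, t - knot)


if __name__ == "__main__":
    # Hypothetical assessment occasions (months) and a knot at the
    # post-treatment assessment; none of these values come from the study.
    times = [0.0, 1.0, 6.0, 18.0]
    knot = 1.0
    # Two illustrative latent classes: one that recovers after the knot
    # and one that keeps declining.
    classes = {
        "recovering": dict(intercept=50.0, slope_before=-2.0, slope_change=2.5),
        "declining": dict(intercept=50.0, slope_before=-2.0, slope_change=1.5),
    }
    for name, params in classes.items():
        curve = [piecewise_linear_mean(t, knot=knot, **params) for t in times]
        print(name, curve)
```

In a latent class setting, each class would have its own set of these shape parameters, with random effects describing individual variation around the class mean.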
Abstract:
LiteSteel Beam (LSB) is a new cold-formed steel beam produced by OneSteel Australian Tube Mills. The new beam is effectively a channel section with two rectangular hollow flanges and a slender web, and is manufactured using a combined cold-forming and electric resistance welding process. OneSteel Australian Tube Mills is promoting the use of LSBs as flexural members in a range of applications, such as floor bearers. When LSBs are used as back to back built-up sections, they are likely to improve their moment capacity and thus extend their applications further. However, the structural behaviour of built-up beams is not well understood. Many steel design codes include guidelines for connecting two channels to form a built-up I-section, including the required longitudinal spacing of connections, but these rules were found to be inadequate in some applications. Currently the safe spans of built-up beams are determined based on twice the moment capacity of a single section. Research has shown that these guidelines are conservative. Therefore large scale lateral buckling tests and advanced numerical analyses were undertaken to investigate the flexural behaviour of back to back LSBs connected by fasteners (bolts) at various longitudinal spacings under uniform moment conditions. In this research an experimental investigation was first undertaken to study the flexural behaviour of back to back LSBs, including their buckling characteristics. This experimental study included tensile coupon tests, initial geometric imperfection measurements and lateral buckling tests. The initial geometric imperfection measurements taken on several back to back LSB specimens showed that the back to back bolting process is not likely to alter the imperfections, and the measured imperfections are well below the fabrication tolerance limits.
Twelve large scale lateral buckling tests were conducted to investigate the behaviour of back to back built-up LSBs with various longitudinal fastener spacings under uniform moment conditions. Tests also included two single LSB specimens. Test results showed that the back to back LSBs gave higher moment capacities in comparison with single LSBs, and the fastener spacing influenced the ultimate moment capacities. As the fastener spacing was reduced the ultimate moment capacities of back to back LSBs increased. Finite element models of back to back LSBs with varying fastener spacings were then developed to conduct a detailed parametric study on the flexural behaviour of back to back built-up LSBs. Two finite element models were developed, namely experimental and ideal finite element models. The models included the complex contact behaviour between LSB web elements and intermittently fastened bolted connections along the web elements. They were validated by comparing their results with experimental results and numerical results obtained from an established buckling analysis program called THIN-WALL. These comparisons showed that the developed models could accurately predict both the elastic lateral distortional buckling moments and the non-linear ultimate moment capacities of back to back LSBs. Therefore the ideal finite element models incorporating ideal simply supported boundary conditions and uniform moment conditions were used in a detailed parametric study on the flexural behaviour of back to back LSB members. In the detailed parametric study, both elastic buckling and nonlinear analyses of back to back LSBs were conducted for 13 LSB sections with varying spans and fastener spacings. Finite element analysis results confirmed that the current design rules in AS/NZS 4600 (SA, 2005) are very conservative while the new design rules developed by Anapayan and Mahendran (2009a) for single LSB members were also found to be conservative. 
Thus new member capacity design rules were developed for back to back LSB members as a function of non-dimensional member slenderness. New empirical equations were also developed to aid in the calculation of elastic lateral distortional buckling moments of intermittently fastened back to back LSBs. Design guidelines were developed for the maximum fastener spacing of back to back LSBs in order to optimise the use of fasteners. A closer fastener spacing of span/6 was recommended for intermediate spans and some long spans where the influence of fastener spacing was found to be high. In the last phase of this research, a detailed investigation was conducted into the potential use of different types of connections and stiffeners in improving the flexural strength of back to back LSB members. It was found that using transverse web stiffeners was the most cost-effective and simple strengthening method. It is recommended that web stiffeners be used at the supports and at the third points within the span, with a thickness in the range of 3 to 5 mm depending on the size of the LSB section. The use of web stiffeners eliminated most of the lateral distortional buckling effects and hence improved the ultimate moment capacities. A suitable design equation was developed to calculate the elastic lateral buckling moments of back to back LSBs with the above recommended web stiffener configuration, while the same design rules developed for unstiffened back to back LSBs were recommended to calculate the ultimate moment capacities.
Abstract:
This dissertation is primarily an applied statistical modelling investigation, motivated by a case study comprising real data and real questions. Theoretical questions on modelling and computation of normalization constants arose from pursuit of these data analytic questions. The essence of the thesis can be described as follows. Consider binary data observed on a two-dimensional lattice. A common problem with such data is the ambiguity of zeroes recorded. These may represent zero response given some threshold (presence) or that the threshold has not been triggered (absence). Suppose that the researcher wishes to estimate the effects of covariates on the binary responses, whilst taking into account underlying spatial variation, which is itself of some interest. This situation arises in many contexts, and the dingo, cypress and toad case studies described in the motivation chapter are examples of this. Two main approaches to modelling and inference are investigated in this thesis. The first is frequentist and based on generalized linear models, with spatial variation modelled by using a block structure or by smoothing the residuals spatially. The EM algorithm can be used to obtain point estimates, coupled with bootstrapping or asymptotic MLE estimates for standard errors. The second approach is Bayesian and based on a three- or four-tier hierarchical model, comprising a logistic regression with covariates for the data layer, a binary Markov random field (MRF) for the underlying spatial process, and suitable priors for parameters in these main models. The three-parameter autologistic model is a particular MRF of interest. Markov chain Monte Carlo (MCMC) methods comprising hybrid Metropolis/Gibbs samplers are suitable for computation in this situation. Model performance can be gauged by MCMC diagnostics. Model choice can be assessed by incorporating another tier in the modelling hierarchy.
This requires evaluation of a normalization constant, a notoriously difficult problem. Difficulty with estimating the normalization constant for the MRF can be overcome by using a path integral approach, although this is a highly computationally intensive method. Different methods of estimating ratios of normalization constants (NCs) are investigated, including importance sampling Monte Carlo (ISMC), dependent Monte Carlo based on MCMC simulations (MCMC), and reverse logistic regression (RLR). I develop an idea present though not fully developed in the literature, and propose the integrated mean canonical statistic (IMCS) method for estimating log NC ratios for binary MRFs. The IMCS method falls within the framework of the newly identified path sampling methods of Gelman & Meng (1998) and outperforms ISMC, MCMC and RLR. It also does not rely on simplifying assumptions, such as ignoring spatio-temporal dependence in the process. A thorough investigation is made of the application of IMCS to the three-parameter autologistic model. This work introduces background computations required for the full implementation of the four-tier model in Chapter 7. Two different extensions of the three-tier model to a four-tier version are investigated. The first extension incorporates temporal dependence in the underlying spatio-temporal process. The second extension allows the successes and failures in the data layer to depend on time. The MCMC computational method is extended to incorporate the extra layer. A major contribution of the thesis is the development of a fully Bayesian approach to inference for these hierarchical models for the first time. Note: The author of this thesis has agreed to make it open access but invites people downloading the thesis to send her an email via the 'Contact Author' function.
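The path sampling idea behind estimating a log NC ratio can be sketched on a toy problem. The sketch below is not the thesis's IMCS implementation. It assumes a one-parameter binary MRF with density proportional to exp(θ·T(x)), where T(x) counts neighbouring pairs with equal values, on a hypothetical 3×3 lattice. For such an exponential family, the derivative of log Z(θ) with respect to θ equals the mean of T(x), so integrating that mean along a path in θ gives the log NC ratio. Here the mean is computed by exact enumeration so that the result can be checked; in realistic lattices it would have to be estimated by MCMC. All function names are illustrative.

```python
import itertools
import math


def neighbour_pairs(rows, cols):
    """First-order (horizontal and vertical) neighbour pairs on a lattice."""
    pairs = []
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                pairs.append((i, i + 1))
            if r + 1 < rows:
                pairs.append((i, i + cols))
    return pairs


def all_statistics(rows, cols):
    """T(x) = number of equal-valued neighbour pairs, for every binary x."""
    pairs = neighbour_pairs(rows, cols)
    return [
        sum(1 for i, j in pairs if x[i] == x[j])
        for x in itertools.product((0, 1), repeat=rows * cols)
    ]


def log_z_and_mean_stat(theta, stats):
    """Exact log Z(theta) and E_theta[T] by enumeration (small lattices only)."""
    shift = max(theta * t for t in stats)  # avoids overflow in exp
    weights = [math.exp(theta * t - shift) for t in stats]
    z = sum(weights)
    mean_t = sum(w * t for w, t in zip(weights, stats)) / z
    return shift + math.log(z), mean_t


def path_sampling_log_ratio(theta0, theta1, stats, n_grid=101):
    """Trapezoid-rule integral of E_theta[T] from theta0 to theta1."""
    step = (theta1 - theta0) / (n_grid - 1)
    means = [
        log_z_and_mean_stat(theta0 + k * step, stats)[1] for k in range(n_grid)
    ]
    return step * (sum(means) - 0.5 * (means[0] + means[-1]))


if __name__ == "__main__":
    stats = all_statistics(3, 3)
    estimate = path_sampling_log_ratio(0.0, 0.5, stats)
    exact = log_z_and_mean_stat(0.5, stats)[0] - log_z_and_mean_stat(0.0, stats)[0]
    print("path sampling estimate:", estimate)
    print("exact log ratio:", exact)
```

Exact enumeration is feasible only because the lattice has 512 configurations. The cost of replacing it with simulation at every grid point is what makes the approach computationally intensive.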
Abstract:
Monotony has been identified as a contributing factor to road crashes. Drivers’ ability to react to unpredictable events deteriorates when exposed to highly predictable and uneventful driving tasks, such as driving on Australian rural roads, many of which are monotonous by nature. Highway design in particular attempts to reduce the driver’s task to a merely lane-keeping one. Such a task provides little stimulation and is monotonous, thus affecting the driver’s attention which is no longer directed towards the road. Inattention contributes to crashes, especially for professional drivers. Monotony has been studied mainly from the endogenous perspective (for instance through sleep deprivation) without taking into account the influence of the task itself (repetitiveness) or the surrounding environment. The aim and novelty of this thesis is to develop a methodology (mathematical framework) able to predict driver lapses of vigilance under monotonous environments in real time, using endogenous and exogenous data collected from the driver, the vehicle and the environment. Existing approaches have tended to neglect the specificity of task monotony, leaving the question of the existence of a “monotonous state” unanswered. Furthermore the issue of detecting vigilance decrement before it occurs (predictions) has not been investigated in the literature, let alone in real time. A multidisciplinary approach is necessary to explain how vigilance evolves in monotonous conditions. Such an approach needs to draw on psychology, physiology, road safety, computer science and mathematics. The systemic approach proposed in this study is unique with its predictive dimension and allows us to define, in real time, the impacts of monotony on the driver’s ability to drive. Such methodology is based on mathematical models integrating data available in vehicles to the vigilance state of the driver during a monotonous driving task in various environments. 
The model integrates different data measuring the driver’s endogenous and exogenous factors (related to the driver, the vehicle and the surrounding environment). Electroencephalography (EEG) is used to measure driver vigilance since it has been shown to be the most reliable and real time methodology to assess vigilance level. A variety of mathematical models are suitable to provide a framework for predictions. To find the most accurate one, a collection of mathematical models was trained in this thesis and the most reliable was identified. The methodology developed in this research is first applied to a theoretically sound measure of sustained attention called the Sustained Attention to Response Task (SART), as adapted by Michael (2010) and Michael and Meuter (2006, 2007). This experiment induced impairments due to monotony during a vigilance task. Analyses performed in this thesis confirm and extend findings from Michael (2010) that monotony leads to an important vigilance impairment independent of fatigue. This thesis is also the first to show that monotony changes the dynamics of vigilance evolution and tends to create a “monotonous state” characterised by reduced vigilance. Personality traits such as being a low sensation seeker can mitigate this vigilance decrement. It is also evident that lapses in vigilance can be predicted accurately with Bayesian modelling and Neural Networks. This framework was then applied to the driving task by designing a simulated monotonous driving task. The design of such a task requires multidisciplinary knowledge and involved psychologist Rebecca Michael. Monotony was varied through both the road design and the road environment variables. This experiment demonstrated that road monotony can lead to driving impairment. In particular, monotonous road scenery was shown to have more impact than monotonous road design.
Next, this study identified a variety of surrogate measures that are correlated with vigilance levels obtained from the EEG. Such vigilance states can be predicted with these surrogate measures. This means that vigilance decrement can be detected in a car without the use of an EEG device. Amongst the different mathematical models tested in this thesis, only Neural Networks predicted the vigilance levels accurately. The results of both these experiments provide valuable information about the methodology to predict vigilance decrement. Such an issue is quite complex and requires modelling that can adapt to high inter-individual differences. Only Neural Networks proved accurate in both studies, suggesting that these models are the most likely to be accurate when used on real roads or for further research on vigilance modelling. This research provides a better understanding of the driving task under monotonous conditions. Results demonstrate that mathematical modelling can be used to determine the driver’s vigilance state when driving, using surrogate measures identified during this study. This research has opened up avenues for future research and could result in the development of an in-vehicle device predicting driver vigilance decrement. Such a device could contribute to a reduction in crashes and therefore improve road safety.