305 results for Statistical measures

in Queensland University of Technology - ePrints Archive


Relevance:

60.00%

Abstract:

Objective: To synthesise recent research on the use of machine learning approaches to mining textual injury surveillance data.

Design: Systematic review.

Data sources: The electronic databases searched included PubMed, Cinahl, Medline, Google Scholar, and Proquest. The bibliography of each relevant article was examined and associated articles were identified using a snowballing technique.

Selection criteria: For inclusion, articles were required to meet the following criteria: (a) used a health-related database, (b) focused on injury-related cases, and (c) used machine learning approaches to analyse textual data.

Methods: The papers identified through the search were screened, resulting in 16 papers selected for review. Articles were reviewed to describe the databases and methodology used, the strengths and limitations of different techniques, and the quality assurance approaches used. Due to heterogeneity between studies, meta-analysis was not performed.

Results: Occupational injuries were the focus of half of the machine learning studies, and the most common methods described were Bayesian probability or Bayesian network based methods, used either to predict injury categories or to extract common injury scenarios. Models were evaluated through comparison with gold standard data, content expert evaluation, or statistical measures of quality. Machine learning was found to provide high precision and accuracy when predicting a small number of categories, and was valuable for visualisation of injury patterns and prediction of future outcomes. However, difficulties related to generalisability, source data quality, complexity of models, and integration of content and technical knowledge were discussed.

Conclusions: The use of narrative text for injury surveillance has grown in popularity, complexity and quality over recent years. With advances in data mining techniques, increased capacity for analysis of large databases, involvement of computer scientists in the injury prevention field, and more comprehensive use and description of quality assurance methods in text mining approaches, it is likely that we will see continued growth and advancement in knowledge of text mining in the injury field.
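
As a concrete illustration of the kind of approach the review found most common, the sketch below trains a Bayesian (multinomial naive Bayes) text classifier to predict injury categories from short narratives and scores it against gold-standard labels. The narratives, category names and pipeline choices are hypothetical and do not reproduce any of the reviewed studies.

```python
# Minimal sketch (hypothetical narratives and categories, not data from the
# reviewed studies): a Bayesian probability text classifier predicting injury
# categories, evaluated against gold-standard labels.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.metrics import classification_report

# Hypothetical injury narratives with gold-standard category labels.
narratives = [
    "worker fell from ladder while repairing roof",
    "finger caught in conveyor belt at packing plant",
    "slipped on wet floor in kitchen and fractured wrist",
    "struck by falling carton while stacking shelves",
    "burned hand on hot press in laundry",
    "fell from scaffolding during bricklaying",
]
gold = ["fall", "machinery", "fall", "struck_by", "burn", "fall"]

# TF-IDF features feeding a multinomial naive Bayes model.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), MultinomialNB())
model.fit(narratives, gold)

# Precision/recall per category against the gold standard (training data is
# reused here purely to keep the illustration self-contained).
print(classification_report(gold, model.predict(narratives), zero_division=0))
```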

Relevance:

30.00%

Abstract:

Survey-based health research is in a boom phase, following increased health spending in OECD countries and growing interest in ageing. A general characteristic of survey-based health research is its diversity: different studies are based on different health questions in different datasets; they use different statistical techniques; they differ in whether they approach health from an ordinal or a cardinal perspective; and they differ in whether they measure short-term or long-term effects. The question in this paper is simple: do these differences matter for the findings? We investigate the effects of lifestyle choices (drinking, smoking, exercise) and income on six measures of health in the US Health and Retirement Study (HRS) between 1992 and 2002: (1) self-assessed general health status, (2) problems with undertaking daily tasks and chores, (3) mental health indicators, (4) BMI, (5) the presence of serious long-term health conditions, and (6) mortality. We compare ordinal models with cardinal models; we compare models with fixed effects to models without fixed effects; and we compare short-term effects to long-term effects. We find considerable variation in the impact of different determinants on our chosen health outcome measures; we find that it matters whether ordinality or cardinality is assumed; we find substantial differences between estimates that account for fixed effects and those that do not; and we find that short-run and long-run effects differ greatly. All this implies that health is an even more complicated notion than hitherto thought, defying generalisation from one measure to another or from one methodology to another.
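
The sketch below illustrates the type of modelling comparison the abstract describes, on synthetic panel data rather than the HRS and without reproducing the authors' specification: a cardinal (linear) model, the same model with individual fixed effects, and an ordinal (ordered-logit) model of a 1-5 self-assessed health score.

```python
# Illustrative sketch (synthetic data only): cardinal vs ordinal treatment of a
# 1-5 health score, with and without individual fixed effects.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(0)
n_people, n_waves = 200, 6
df = pd.DataFrame({
    "pid": np.repeat(np.arange(n_people), n_waves),
    "income": rng.lognormal(10, 0.5, n_people * n_waves),
    "smoker": rng.integers(0, 2, n_people * n_waves),
})
df["log_inc"] = np.log(df["income"])
latent = 0.3 * df["log_inc"] - 0.5 * df["smoker"] + rng.normal(0, 1, len(df))
df["health"] = pd.cut(latent, bins=5, labels=False) + 1   # hypothetical 1-5 score

# Cardinal: pooled OLS treating the score as interval-scaled.
ols = smf.ols("health ~ log_inc + smoker", data=df).fit()

# Cardinal with individual fixed effects (entity dummies).
fe = smf.ols("health ~ log_inc + smoker + C(pid)", data=df).fit()

# Ordinal: ordered logit on the same outcome.
health_ord = df["health"].astype(pd.CategoricalDtype(ordered=True))
ologit = OrderedModel(health_ord, df[["log_inc", "smoker"]],
                      distr="logit").fit(method="bfgs", disp=False)

print("OLS:          ", ols.params[["log_inc", "smoker"]].round(3).to_dict())
print("Fixed effects:", fe.params[["log_inc", "smoker"]].round(3).to_dict())
print("Ordered logit:", ologit.params[["log_inc", "smoker"]].round(3).to_dict())
```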

Relevance:

30.00%

Abstract:

Forecasts of volatility and correlation are important inputs into many practical financial problems. Broadly speaking, there are two ways of generating forecasts of these variables. Firstly, time-series models apply a statistical weighting scheme to historical measurements of the variable of interest. The alternative methodology extracts forecasts from the market-traded value of option contracts. An efficient options market should be able to produce superior forecasts, as it utilises a larger information set comprising not only historical information but also the equilibrium expectations of options market participants. While much research has been conducted into the relative merits of these approaches, this thesis extends the literature along several lines through three empirical studies. Firstly, it is demonstrated that there are statistically significant benefits to adjusting implied volatility for the volatility risk premium for the purposes of univariate volatility forecasting. Secondly, high-frequency option-implied measures are shown to lead to superior forecasts of the stochastic component of intraday volatility, which in turn lead to superior forecasts of total intraday volatility. Finally, realised and option-implied measures of equicorrelation are shown to dominate measures based on daily returns.
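
The sketch below contrasts the two forecasting routes described above on synthetic data: a time-series forecast that applies a statistical weighting scheme to past returns (a RiskMetrics-style EWMA, assumed here as a simple stand-in) and a hypothetical option-implied variance, each evaluated with a Mincer-Zarnowitz-style regression. None of the numbers relate to the thesis's empirical studies.

```python
# Sketch on synthetic data: EWMA (historical weighting) vs a hypothetical
# option-implied variance forecast, compared by regression on realised variance.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 1000
true_vol = 0.01 * np.exp(0.3 * np.sin(np.linspace(0.0, 20.0, n)))  # hypothetical volatility path
returns = rng.normal(0.0, true_vol)

# EWMA forecast of variance: sigma2_t = lam * sigma2_{t-1} + (1 - lam) * r_{t-1}^2.
lam = 0.94
ewma = np.empty(n)
ewma[0] = returns[:50].var()
for t in range(1, n):
    ewma[t] = lam * ewma[t - 1] + (1 - lam) * returns[t - 1] ** 2

# Hypothetical option-implied variance: true variance observed with noise and a
# premium, standing in for a measure backed out of option prices.
implied = true_vol ** 2 * rng.lognormal(0.1, 0.2, n)

# Regress realised (squared-return) variance on each forecast and compare fit.
realised = returns ** 2
for name, forecast in (("EWMA", ewma), ("option-implied", implied)):
    fit = sm.OLS(realised, sm.add_constant(forecast)).fit()
    print(f"{name:>15}: R^2 = {fit.rsquared:.3f}")
```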

Relevance:

30.00%

Abstract:

The Clarence-Moreton Basin (CMB) covers approximately 26,000 km² and is the only sub-basin of the Great Artesian Basin (GAB) in which there is flow to both the south-west and the east, although flow to the south-west is predominant. In many parts of the basin, including the catchments of the Bremer, Logan and upper Condamine Rivers in southeast Queensland, the Walloon Coal Measures are under exploration for Coal Seam Gas (CSG). In order to assess spatial variations in groundwater flow and hydrochemistry at a basin-wide scale, a 3D hydrogeological model of the Queensland section of the CMB has been developed using GoCAD modelling software. Prior to any large-scale CSG extraction, it is essential to understand the existing hydrochemical character of the different aquifers and to establish any potential linkages between them. To make effective use of the large amount of existing water chemistry data for assessing hydrochemical evolution within the different lithostratigraphic units, multivariate statistical techniques were employed.
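
The abstract does not name the specific multivariate techniques, so the sketch below assumes two typical choices, principal component analysis followed by Ward hierarchical clustering, applied to hypothetical major-ion chemistry samples to group water types before comparing units. The ion list and concentrations are illustrative only.

```python
# Sketch on hypothetical samples: PCA plus hierarchical clustering of
# groundwater major-ion chemistry (assumed techniques, not taken from the study).
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(2)
ions = ["Na", "Ca", "Mg", "Cl", "HCO3", "SO4"]                      # concentrations in mg/L
samples = pd.DataFrame(rng.lognormal(3.0, 0.6, (120, len(ions))), columns=ions)

# Standardise the ion concentrations, then reduce to two principal components.
scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(samples))

# Ward hierarchical clustering on the component scores to define water-type groups.
samples["cluster"] = fcluster(linkage(scores, method="ward"), t=3, criterion="maxclust")
print(samples.groupby("cluster")[ions].mean().round(1))
```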

Relevance:

30.00%

Abstract:

This thesis explored the development of statistical methods to support the monitoring and improvement of the quality of treatment delivered to patients undergoing coronary angioplasty procedures. To achieve this goal, a suite of outcome measures was identified to characterise the performance of the service, statistical tools were developed to monitor the various indicators, and measures to strengthen governance processes were implemented and validated. Although this work focused on pursuit of these aims in the context of an angioplasty service located at a single clinical site, development of the tools and techniques was undertaken mindful of their potential application to other clinical specialties and a wider, potentially national, scope.
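
One commonly used monitoring tool for this kind of outcome indicator is a Bernoulli CUSUM chart; the sketch below shows one on simulated procedure outcomes. The indicator, baseline and unacceptable rates, and decision threshold are all illustrative and are not taken from the thesis.

```python
# Sketch with simulated outcomes: a Bernoulli CUSUM monitoring a complication
# indicator over consecutive angioplasty procedures (illustrative parameters).
import numpy as np

rng = np.random.default_rng(3)
p0, p1 = 0.02, 0.04                          # acceptable vs unacceptable complication rate
outcomes = rng.binomial(1, 0.02, size=300)   # 1 = complication after a procedure

# Log-likelihood-ratio weights for the Bernoulli CUSUM.
w_fail = np.log(p1 / p0)
w_ok = np.log((1.0 - p1) / (1.0 - p0))

cusum, h = 0.0, 3.5                          # h = decision threshold (illustrative)
for i, y in enumerate(outcomes, start=1):
    cusum = max(0.0, cusum + (w_fail if y else w_ok))
    if cusum > h:
        print(f"Signal after procedure {i}: performance warrants review")
        cusum = 0.0                          # reset after a signal
```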

Relevance:

30.00%

Abstract:

This article presents field applications and validations of the controlled Monte Carlo data generation scheme. This scheme was previously derived to assist the Mahalanobis squared distance-based damage identification method to cope with data-shortage problems, which often cause inadequate data multinormality and unreliable identification outcomes. To do so, real vibration datasets from two actual civil engineering structures with such data (and identification) problems are selected as the test objects, which are then shown to be in need of enhancement to consolidate their condition. By utilizing the robust probability measures of the data condition indices in controlled Monte Carlo data generation and statistical sensitivity analysis of the Mahalanobis squared distance computational system, well-conditioned synthetic data generated by an optimal controlled Monte Carlo data generation configuration can be evaluated without bias against those generated by other set-ups and against the original data. The analysis results reconfirm that controlled Monte Carlo data generation is able to overcome the shortage of observations, improve data multinormality and enhance the reliability of the Mahalanobis squared distance-based damage identification method, particularly with respect to false-positive errors. The results also highlight the dynamic structure of controlled Monte Carlo data generation, which makes the scheme well adapted to any type of input data with any (original) distributional condition.
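
For readers unfamiliar with the underlying detector, the sketch below shows generic Mahalanobis-squared-distance novelty detection on vibration features, with a small baseline sample augmented by synthetic observations drawn from its fitted Gaussian. This is a plain (uncontrolled) Monte Carlo step for illustration only, not the article's controlled data generation scheme, and all dimensions and thresholds are assumptions.

```python
# Generic sketch: Mahalanobis squared distance (MSD) damage detection with a
# simple Monte Carlo augmentation of scarce baseline data (illustrative only).
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(4)
d = 6                                                                 # number of vibration features
baseline = rng.multivariate_normal(np.zeros(d), np.eye(d), size=30)   # scarce baseline data

# Augment the baseline with synthetic samples from its fitted mean and covariance.
mu, cov = baseline.mean(axis=0), np.cov(baseline, rowvar=False)
augmented = np.vstack([baseline, rng.multivariate_normal(mu, cov, size=500)])
mu_a = augmented.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(augmented, rowvar=False))

def msd(x):
    """Mahalanobis squared distance of a feature vector from the augmented baseline."""
    diff = x - mu_a
    return float(diff @ cov_inv @ diff)

# Flag a test observation as potentially damaged when its MSD exceeds a
# chi-square threshold (false-positive rate set to 1% here).
threshold = chi2.ppf(0.99, df=d)
test = rng.multivariate_normal(np.full(d, 2.0), np.eye(d))   # shifted features, i.e. "damaged"
print(f"MSD = {msd(test):.2f}, threshold = {threshold:.2f}, damaged: {msd(test) > threshold}")
```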

Relevance:

30.00%

Abstract:

Alcohol is a major factor in road deaths and serious injuries. In Victoria, between 2008 and 2013, 30% of drivers killed were involved in alcohol-related crashes. From the early 1980s, Victoria progressively introduced a series of measures, such as driver licence cancellation and alcohol interlocks, to reduce the level of drink-driving on Victoria's roads. This project tracked drink-driving offenders to measure and understand their re-offence and road trauma involvement levels during and after periods of licensing and driving interventions. The methodology controlled for exposure by aggregating crashes and traffic violations within relevant categories (e.g. licence cancelled/relicensed/relicensing not sought) and expressing them as rates per thousand person-years. Inferential statistical techniques were used to compare crash and offence rates between control and treatment groups across three distinct time periods, which coincided with the introduction of new interventions. This paper focuses on the extent to which the Victorian drink-driving measures have been successful in reducing re-offending and road trauma involvement during and after periods of licence interventions. It was found that licence cancellation/banning is an effective drink-driving countermeasure, as it reduced drink-driving offending and drink-driving crashes. Interlocks also had a positive effect: drink-driving offences were reduced during the interlock period as well as over the entire intervention period. Possible drink-driving policy implications are briefly discussed.
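
The exposure-controlled comparison described above can be illustrated with made-up counts: crashes per thousand person-years for a treatment group and a comparison group, with a Poisson rate-ratio test as one simple example of the inferential comparison of rates the study describes. The group names and numbers below are hypothetical.

```python
# Sketch with made-up counts: crash rates per 1,000 person-years and a simple
# Poisson rate-ratio comparison between two groups (illustrative only).
import numpy as np
from scipy.stats import norm

groups = {
    # group name: (crashes, person-years of exposure) - illustrative numbers only
    "licence cancelled": (45, 12000.0),
    "comparison":        (90, 15000.0),
}
for name, (events, pyears) in groups.items():
    print(f"{name}: {1000.0 * events / pyears:.2f} crashes per 1,000 person-years")

# Rate ratio with a normal approximation on the log scale.
(e1, t1), (e2, t2) = groups["licence cancelled"], groups["comparison"]
log_rr = np.log((e1 / t1) / (e2 / t2))
se = np.sqrt(1.0 / e1 + 1.0 / e2)
p_value = 2.0 * norm.sf(abs(log_rr / se))
print(f"rate ratio = {np.exp(log_rr):.2f}, two-sided p = {p_value:.4f}")
```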

Relevance:

30.00%

Abstract:

The method of generalized estimating equations (GEEs) has been criticized recently for a failure to protect against misspecification of working correlation models, which in some cases leads to loss of efficiency or infeasibility of solutions. However, the feasibility and efficiency of GEE methods can be enhanced considerably by using flexible families of working correlation models. We propose two ways of constructing unbiased estimating equations from general correlation models for irregularly timed repeated measures to supplement and enhance GEE. The supplementary estimating equations are obtained by differentiation of the Cholesky decomposition of the working correlation, or as score equations for decoupled Gaussian pseudolikelihood. The estimating equations are solved with computational effort equivalent to that required for a first-order GEE. Full details and analytic expressions are developed for a generalized Markovian model that was evaluated through simulation. Large-sample "sandwich" standard errors for working correlation parameter estimates are derived and shown to have good performance. The proposed estimating functions are further illustrated in an analysis of repeated measures of pulmonary function in children.
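
For context, the sketch below fits a standard first-order GEE to synthetic repeated measures with two working correlation structures in statsmodels, reporting the default robust ("sandwich") standard errors. This is the baseline that the article's supplementary estimating equations are designed to enhance; the proposed Cholesky and Gaussian-pseudolikelihood equations are not implemented here, and the data are invented.

```python
# Sketch on synthetic data: first-order GEE with two working correlation
# structures, using statsmodels' default robust (sandwich) covariance.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n_child, n_visits = 80, 5
df = pd.DataFrame({
    "child": np.repeat(np.arange(n_child), n_visits),
    "age": np.tile(np.arange(6, 6 + n_visits), n_child),
})
# Hypothetical pulmonary-function outcome with a child-specific random shift.
df["fev"] = (1.0 + 0.25 * df["age"]
             + np.repeat(rng.normal(0.0, 0.3, n_child), n_visits)
             + rng.normal(0.0, 0.2, len(df)))

for cov in (sm.cov_struct.Independence(), sm.cov_struct.Exchangeable()):
    res = smf.gee("fev ~ age", groups="child", data=df,
                  family=sm.families.Gaussian(), cov_struct=cov).fit()
    print(f"{type(cov).__name__:>12}: beta_age = {res.params['age']:.3f}, "
          f"robust SE = {res.bse['age']:.4f}")
```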