13 results for statistical application

in Deakin Research Online - Australia


Relevance:

30.00%

Abstract:

Wildlife managers are often faced with the difficult task of determining the distribution of species, and their preferred habitats, at large spatial scales. This task is even more challenging when the species of concern is in low abundance and/or the terrain is largely inaccessible. Spatially explicit distribution models, derived from multivariate statistical analyses and implemented in a geographic information system (GIS), can be used to predict the distributions of species and their habitats, thus making them a useful conservation tool. We present two such models: one for a dasyurid, the Swamp Antechinus (Antechinus minimus), and the other for a ground-dwelling bird, the Rufous Bristlebird (Dasyornis broadbenti), both of which are rare species occurring in the coastal heathlands of south-western Victoria. Models were generated using generalized linear modelling (GLM) techniques with species presence or absence as the dependent variable and a series of landscape variables derived from GIS layers and high-resolution imagery as the predictors. The most parsimonious model for each species, based on the Akaike Information Criterion, was then extrapolated spatially in a GIS. Probability of species presence was used as an index of habitat suitability. Because habitat fragmentation is thought to be one of the major threats to these species, an assessment of the spatial distribution of suitable habitat across the landscape is vital in prescribing management actions to prevent further habitat fragmentation.
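As a rough illustration of the modelling steps described in this abstract, the sketch below fits binomial GLMs (logit link) to presence/absence data, ranks candidate models by AIC, and uses the predicted probability of presence as a habitat-suitability index. The data are synthetic and the predictor names `veg_cover` and `elevation` are hypothetical; this is not the authors' code, and the hand-written Newton-Raphson fit only stands in for whatever GLM software was actually used.

```python
import math
import random
from itertools import combinations

def sigmoid(eta):
    return 1.0 / (1.0 + math.exp(-eta))

def solve(a, b):
    """Solve the small linear system a x = b (Gauss-Jordan, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_logistic(rows, y, n_iter=25):
    """Fit a logistic GLM by Newton-Raphson; return (coefficients, log-likelihood)."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(n_iter):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]  # X^T W X
        for x, yi in zip(rows, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, x)))
            for i in range(k):
                grad[i] += (yi - p) * x[i]
                for j in range(k):
                    info[i][j] += p * (1.0 - p) * x[i] * x[j]
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    loglik = 0.0
    for x, yi in zip(rows, y):
        p = sigmoid(sum(b * v for b, v in zip(beta, x)))
        loglik += math.log(p) if yi == 1 else math.log(1.0 - p)
    return beta, loglik

def aic(loglik, n_params):
    return 2 * n_params - 2 * loglik

# Synthetic survey sites (assumption): presence depends on vegetation cover only.
rng = random.Random(42)
names = ["veg_cover", "elevation"]
sites = [[rng.random(), rng.random()] for _ in range(300)]
presence = [1 if rng.random() < sigmoid(-2 + 4 * veg) else 0 for veg, _ in sites]

# Fit every subset of predictors (always with an intercept) and record its AIC.
results = {}
for size in range(len(names) + 1):
    for subset in combinations(range(len(names)), size):
        rows = [[1.0] + [site[j] for j in subset] for site in sites]
        beta, loglik = fit_logistic(rows, presence)
        results[tuple(names[j] for j in subset)] = (aic(loglik, len(beta)), beta)

best_model = min(results, key=lambda m: results[m][0])
best_beta = results[best_model][1]

def habitat_suitability(values):
    """Predicted probability of presence for one grid cell under the best model.

    `values` must list the predictors in the same order as `best_model`.
    """
    eta = best_beta[0] + sum(b * v for b, v in zip(best_beta[1:], values))
    return sigmoid(eta)
```

In a GIS workflow, `habitat_suitability` would be evaluated for every raster cell to produce the suitability surface; here it is only a per-cell function.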

Relevance:

30.00%

Abstract:

Objective
The use of then-test (retrospective pre-test) scores has frequently been proposed as a solution to potential confounding of change scores because of response shift, as it is assumed that then-test and post-test responses are provided from the same perspective. However, this assumption has not been formally tested using robust quantitative methods. The aim of this study was to compare the psychometric performance of then-test/post-test data with that of traditional pre-test/post-test data and to assess whether the resulting data structures support the application of the then-test for evaluations of chronic disease self-management interventions.

Study Design and Setting
Pre-test, post-test, and then-test data were collected from 314 participants of self-management courses using the Health Education Impact Questionnaire (heiQ). The derived change scores (pre-test/post-test; then-test/post-test) were examined for their psychometric performance using tests of measurement invariance.

Results
Few questionnaire items were noninvariant across pre-test/post-test, with four items identified and requiring removal to enable an unbiased comparison of factor means. In contrast, 12 items were identified and required removal in then-test/post-test data to avoid biased change score estimates.

Conclusion
Traditional pre-test/post-test data appear to be robust with little indication of response shift. In contrast, the weaker psychometric performance of then-test/post-test data suggests psychometric flaws that may be the result of implicit theory of change, social desirability, and recall bias.

Relevance:

30.00%

Abstract:

This thesis describes the research undertaken for a degree of Master of Science in a retrospective study of airborne remotely sensed data registered in 1990 and 1993, and field captured data of aquatic humus concentrations for approximately 45 lakes in Tasmania. The aim was to investigate and describe the relationship between the remotely sensed data and the field data and to test the hypothesis that the remotely sensed data would establish further evidence of a limnological corridor of change running north-west to south-east. The airborne remotely sensed data consisted of data captured by the CSIRO Ocean Colour Scanner (OCS) and a newly developed Canadian scanner, a compact airborne spectrographic imager (CASI). The thesis investigates the relationship between the two kinds of data sources. The remotely sensed data were collected with the OCS scanner in 1990 (during one day) and with both the OCS and the CASI in 1993 (during three days). The OCS scanner registers data in 9 wavelength bands between 380 nm and 960 nm with a 10-20 nm bandwidth, and the CASI in 288 wavelength bands between 379.57 nm and 893.5 nm (i.e., spectral mode) with a spectral resolution of 2.5 nm. The remotely sensed data were extracted from the original tapes with the help of the CSIRO and supplied software, and digital sample areas (band value means) for each lake were subsequently extracted for data manipulation and statistical analysis. Field data were captured concurrently with the remotely sensed data in 1993 by lake hopping using a light aircraft with floats. The field data used for analysis with the remotely sensed data were the laboratory determined g440 values from the 1993 water samples collated with g440 values determined from earlier years. No spectro-radiometric data of the lakes, data of incoming irradiance or ancillary climatic data were captured during the remote sensing missions.
The sections of the background chapter in the thesis provide a background to the research with regard to remote sensing of water quality and the relationship between remotely sensed spectral data and water quality parameters, as well as a description of the Tasmanian lakes flown. The lakes were divided into four groups based on results from previous studies and optical parameters, especially aquatic humus concentrations as measured from field captured data. The four groups consist of the ‘green’ clear water lakes mostly situated on the Central Plateau, the ‘brown’ highly dystrophic lakes in western Tasmania, the ‘corridor’ lakes situated along a corridor of change lying approximately between the two lines denoting the Jurassic edge and the 1200 mm isohyet, and the ‘eastern, turbid’ lakes. The analytical part of the research work was mostly concerned with manipulating and analysing the CASI data because of its higher spectral resolution. The research explores methods to apply corrections to this data to reduce the disturbing effects of varying illumination and atmospheric conditions. Three different methods were attempted. In the first method, two different standardisation formulas are applied to the data as well as ‘day correction’ factors calculated from data from one of the lakes, Lake Rolleston, which had data captured for all three days of the remote sensing operations. The standardisation formulas were also applied to the OCS data. In the second method, an attempt to reduce the effects of the atmosphere was performed using spectro-radiometric data captured in 1988 for one of the lakes flown, Great Lake. All the lake sample data were time normalised using general irradiance data obtained from the University of Tasmania, and the sky portion as calculated from Great Lake upwelling irradiance data was then subtracted. The last method involved using two different band ratios to eliminate atmospheric effects.
Statistical analysis was applied to the data resulting from the three methods to try to describe the relationship between the remotely sensed data and the field captured data. Discriminant analysis, cluster analysis and factor analysis using principal component analysis (PCA) were applied to the remotely sensed data and the field data. The factor scores resulting from the PCA were regressed against the field collated data of g440, as were the values resulting from the last method. The results from the statistical analysis of the data from the first method show that the lakes group well (100%) against the predetermined groups using discriminant analysis applied to the remotely sensed CASI data. Most variance in the data is contained in the first factor resulting from PCA regardless of data manipulation method. Regression of the factor scores against g440 field data shows a strong non-linear relationship, and a one-sided linear regression test is therefore considered an inappropriate analysis method to describe the dataset relationships. The research has shown that with the available data, correction and analysis methods, and within the scope of the Masters study, it was not possible to establish the relationships between the remotely sensed data and the field measured parameters as hoped. The main reason for this was the failure to retrieve remotely sensed lake signatures adequately corrected for atmospheric noise for comparison with the field data. This in turn is a result of the lack of detailed ancillary information needed to apply available established methods for noise reduction. Applying these methods requires field spectroradiometric measurements and environmental information on the varying conditions both within the study area and within the time frame of capture of the remotely sensed data.

Relevance:

30.00%

Abstract:

Various statistical methods have been proposed to evaluate associations between measured genetic variants and disease, including some using family designs. For breast cancer and rare variants, we applied a modified segregation analysis method that uses the population cancer incidence and population-based case families in which a mutation is known to be segregating. Here we extend the method to a common polymorphism, and use a regressive logistic approach to model familial aggregation by conditioning each individual on their mother's breast cancer history. We considered three models: 1) class A regressive logistic model; 2) age-of-onset regressive logistic model; and 3) proportional hazards familial model. Maximum likelihood estimates were calculated using the software MENDEL. We applied these methods to data from the Australian Breast Cancer Family Study on the CYP17 5′UTR T→C MspA1 polymorphism measured for 1,447 case probands, 787 controls, and 213 relatives of case probands found to have the CC genotype. Breast cancer data for first- and second-degree relatives of case probands were used. The three methods gave consistent estimates. The best-fitting model involved a recessive inheritance, with homozygotes being at an increased risk of 47% (95% CI, 28-68%). The cumulative risk of the disease up to age 70 years was estimated to be 10% or 22% for a CYP17 homozygote whose mother was unaffected or affected, respectively. This analytical approach is well-suited to the data that arise from population-based case-control-family studies, in which cases, controls and relatives are studied, and genotype is measured for some but not all subjects.

Relevance:

30.00%

Abstract:

This article reviews how current social network analysis might be used to investigate individual and group behavior in sporting teams. Social network analysis methods permit researchers to explore social relations between team members and their individual-level qualities simultaneously. As such, social network analysis can be seen as augmenting existing approaches for the examination of intra-group relations among teams and as providing detail of team members' informal connections to others within the team. Social network analysis is useful in addressing the issue of interdependencies in the data inherent in team structures. Social network terms are introduced and explained by way of an example team, software and resources are discussed, and a statistical approach to social network analysis is introduced.
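To make two basic social network terms concrete, the sketch below computes network density and degree centrality for a small, entirely hypothetical team. The player names and ties are invented for illustration and are not the example team used in the article.

```python
# Undirected "informal connection" ties within a hypothetical five-player team.
ties = {
    ("Ana", "Ben"),
    ("Ana", "Cy"),
    ("Ben", "Cy"),
    ("Cy", "Dee"),
    ("Dee", "Eli"),
}

players = sorted({p for tie in ties for p in tie})
n = len(players)

# Degree: number of ties each player is part of.
degree = {p: sum(p in tie for tie in ties) for p in players}

# Degree centrality: degree as a share of the n - 1 possible teammates.
degree_centrality = {p: degree[p] / (n - 1) for p in players}

# Density: observed ties as a share of all possible ties, n(n - 1)/2.
density = len(ties) / (n * (n - 1) / 2)

most_central = max(players, key=lambda p: degree[p])
```

Individual attributes (for example playing position) could be stored alongside `degree` per player, which is the kind of joint view of relations and individual qualities the abstract refers to.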

Relevance:

30.00%

Abstract:

Stormwater pipe systems in Australia are designed to convey water from rainfall and surface runoff only and do not transport sewage. Any blockage can cause flooding events with the probability of subsequent property damage. Proactive maintenance plans that can enhance their serviceability need to be developed based on a sound deterioration model. This paper uses a neural network (NN) approach to model deterioration in serviceability of concrete stormwater pipes, which make up the bulk of the stormwater network in Australia. System condition data were collected using CCTV images. The outcomes of the model are the identification of the significant factors influencing serviceability deterioration and the forecasting of the change in serviceability condition over time for individual pipes based on the pipe attributes. The proposed method is validated and compared with multiple discriminant analysis, a traditional statistical method. The results show that the NN model can be applied to forecasting serviceability deterioration. However, further improvements in data collection and condition grading schemes should be carried out to increase the prediction accuracy of the NN model.

Relevance:

30.00%

Abstract:

Identifying applications and classifying network traffic flows according to their source applications are critical for a broad range of network activities. Such a decision can be based on packet header fields, packet payload content, statistical characteristics of traffic and communication patterns of network hosts. However, most present techniques rely on some sort of a priori knowledge, which means they require labor-intensive preprocessing before running and cannot deal with previously unknown applications. In this paper, we propose a traffic classification system based on application signatures, with a novel approach to fully automate the process of deriving signatures from unidentified traffic. The key idea is to integrate statistics-based flow clustering with a payload-based signature-matching method, so as to eliminate the requirement for pre-labeled training data sets. We evaluate the efficiency of our approach using a real-world traffic trace, and the results indicate that signature classifiers built from clustered data and from pre-labeled data achieve similarly high accuracy, better than 99%.

Relevance:

30.00%

Abstract:

This paper reviews the application of statistical models to planning and evaluating cancer screening programmes. Models used to analyse screening strategies can be classified as either surface models, which consider only those events which can be directly observed such as disease incidence, prevalence or mortality, or deep models, which incorporate hypotheses about the disease process that generates the observed events. This paper focuses on the latter type. These can be further classified as analytic models, which use a model of the disease to derive direct estimates of characteristics of the screening procedure and its consequent benefits, and simulation models, which use the disease model to simulate the course of the disease in a hypothetical population with and without screening and derive measures of the benefit of screening from the simulation outcomes. The main approaches to each type of model are described and an overview given of their historical development and strengths and weaknesses. A brief review of fitting and validating such models is given and finally a discussion of the current state of, and likely future trends in, cancer screening models is presented.

Relevance:

30.00%

Abstract:

Randomised, placebo-controlled trials of treatments for depression typically collect outcomes data but traditionally only analyse data to demonstrate efficacy and safety. Additional post-hoc statistical techniques may reveal important insights about treatment variables useful when considering inter-individual differences amongst depressed patients. This paper aims to examine the Gradient Boosted Model (GBM), a statistical technique that uses regression tree analyses and can be applied to clinical trial data to identify and measure variables that may influence treatment outcomes.
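The idea behind gradient boosting with regression trees can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the paper's analysis: it uses depth-1 trees (stumps), squared-error loss, and synthetic "trial" data in which the hypothetical predictors `baseline_severity` and `age` are invented. The summed squared-error reduction per predictor serves as a simple measure of variable influence.

```python
import random

def best_stump(X, residuals):
    """Find the one-split tree (feature, threshold) that most reduces squared error."""
    n = len(residuals)
    total = sum(residuals)
    best = None  # (gain, feature, threshold, left_mean, right_mean)
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for pos in range(1, n):
            left_sum += residuals[order[pos - 1]]
            lo, hi = X[order[pos - 1]][j], X[order[pos]][j]
            if lo == hi:
                continue  # cannot split between identical values
            right_sum = total - left_sum
            # Reduction in the sum of squared errors achieved by this split.
            gain = left_sum ** 2 / pos + right_sum ** 2 / (n - pos) - total ** 2 / n
            if best is None or gain > best[0]:
                best = (gain, j, (lo + hi) / 2, left_sum / pos, right_sum / (n - pos))
    return best

def fit_gbm(X, y, n_trees=50, learning_rate=0.1):
    """Boost stumps on residuals (the negative gradient of squared-error loss)."""
    f0 = sum(y) / len(y)
    pred = [f0] * len(y)
    trees = []
    importance = [0.0] * len(X[0])
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        gain, j, thr, left, right = best_stump(X, residuals)
        trees.append((j, thr, learning_rate * left, learning_rate * right))
        importance[j] += gain
        pred = [p + (trees[-1][2] if row[j] <= thr else trees[-1][3])
                for p, row in zip(pred, X)]
    return f0, trees, importance

def predict(f0, trees, row):
    return f0 + sum(left if row[j] <= thr else right for j, thr, left, right in trees)

# Synthetic data (assumption): improvement depends on baseline severity, not age.
rng = random.Random(1)
feature_names = ["baseline_severity", "age"]
X = [[rng.uniform(10, 40), rng.uniform(18, 70)] for _ in range(200)]
y = [0.5 * severity + rng.gauss(0, 2) for severity, _ in X]

f0, trees, importance = fit_gbm(X, y)
mse_before = sum((yi - f0) ** 2 for yi in y) / len(y)
mse_after = sum((yi - predict(f0, trees, row)) ** 2 for yi, row in zip(y, X)) / len(y)
```

Real analyses would typically use deeper trees, a held-out sample and an established library; the relative sizes of the entries in `importance` are the part that corresponds to "identifying and measuring" influential variables.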

Relevance:

30.00%

Abstract:

In this paper, a hybrid model consisting of the fuzzy ARTMAP (FAM) neural network and the classification and regression tree (CART) is formulated. FAM is useful for tackling the stability–plasticity dilemma pertaining to data-based learning systems, while CART is useful for depicting its learned knowledge explicitly in a tree structure. By combining the benefits of both models, FAM–CART is capable of learning data samples stably and, at the same time, explaining its predictions with a set of decision rules. In other words, FAM–CART possesses two important properties of an intelligent system, i.e., learning in a stable manner (by overcoming the stability–plasticity dilemma) and extracting useful explanatory rules (by overcoming the opaqueness issue). To evaluate the usefulness of FAM–CART, six benchmark medical data sets from the UCI repository of machine learning and a real-world medical data classification problem are used. For performance comparison, a number of performance metrics, including accuracy, specificity, sensitivity, and the area under the receiver operating characteristic curve, are computed. The results are quantified with statistical indicators and compared with those reported in the literature. The outcomes positively indicate that FAM–CART is effective for undertaking data classification tasks. In addition to producing good results, it provides justifications of the predictions in the form of a decision tree so that domain users can easily understand the predictions, therefore making it a useful decision support tool.
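The performance metrics named in this abstract can be computed as in the sketch below. The labels and scores are made-up illustration data, the 0.5 decision threshold is an assumption, and the AUC is computed by direct pairwise comparison (the Mann-Whitney formulation) rather than by whatever procedure the authors used.

```python
def confusion_metrics(y_true, y_pred):
    """Return accuracy, sensitivity and specificity for binary labels (1 = positive)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }

def auc(y_true, scores):
    """Area under the ROC curve: chance a positive outscores a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Made-up example: 4 positive and 6 negative cases with classifier scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = confusion_metrics(y_true, y_pred)
area = auc(y_true, scores)
```

Unlike accuracy, sensitivity and specificity, the AUC does not depend on the chosen threshold, which is one reason it is often reported alongside them.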

Relevance:

30.00%

Abstract:

The bulk of existing work on the statistical forecasting of air quality is based on either neural networks or linear regressions, which are both subject to important drawbacks. In particular, while neural networks are complicated and prone to in-sample overfitting, linear regressions are highly dependent on the specification of the regression function. The present paper shows how combining linear regression forecasts can be used to circumvent all of these problems. The usefulness of the proposed combination approach is verified using both Monte Carlo simulation and an extensive application to air quality in Bogota, one of the largest and most polluted cities in Latin America. © 2014 Elsevier Ltd.
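A minimal sketch of forecast combination follows, assuming equal weights and two deliberately simple single-predictor regressions. The abstract does not state the weighting scheme or the regressors used, so both are assumptions, and the variables `temp`, `wind` and `pm10` are synthetic stand-ins rather than Bogota data.

```python
import random

def ols_1d(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope

def mse(actual, forecast):
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

# Synthetic data (assumption): pollution depends on both temperature and wind.
rng = random.Random(7)
temp = [rng.uniform(5, 25) for _ in range(120)]
wind = [rng.uniform(0, 10) for _ in range(120)]
pm10 = [40 + 1.5 * t - 3.0 * w + rng.gauss(0, 5) for t, w in zip(temp, wind)]

train, test = slice(0, 90), slice(90, 120)

# Two differently specified regressions, each using a single predictor.
a_t, b_t = ols_1d(temp[train], pm10[train])
a_w, b_w = ols_1d(wind[train], pm10[train])
forecast_temp = [a_t + b_t * t for t in temp[test]]
forecast_wind = [a_w + b_w * w for w in wind[test]]

# Equal-weight combination of the individual forecasts.
combined = [(f1 + f2) / 2 for f1, f2 in zip(forecast_temp, forecast_wind)]

mse_temp = mse(pm10[test], forecast_temp)
mse_wind = mse(pm10[test], forecast_wind)
mse_combined = mse(pm10[test], combined)
```

With equal weights, the squared error of the combined forecast can never exceed the average of the individual squared errors, so the combination is at least as good as the worse specification and reduces the cost of choosing a poor regression function.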