103 results for Reliability measure

in University of Queensland eSpace - Australia


Relevance:

70.00% 70.00%

Publisher:

Abstract:

Most of the modern developments with classification trees are aimed at improving their predictive capacity. This article considers a curiously neglected aspect of classification trees, namely the reliability of predictions that come from a given classification tree. In the sense that a node of a tree represents a point in the predictor space in the limit, the aim of this article is the development of localized assessment of the reliability of prediction rules. A classification tree may be used either to provide a probability forecast, where for each node the membership probabilities for each class constitute the prediction, or a true classification, where each new observation is predictively assigned to a unique class. Correspondingly, two types of reliability measure will be derived, namely prediction reliability and classification reliability. We use bootstrapping methods as the main tool to construct these measures. We also provide a suite of graphical displays by which they may be easily appreciated. In addition to providing some estimate of the reliability of specific forecasts of each type, these measures can also be used to guide future data collection to improve the effectiveness of the tree model. The motivating example we give has a binary response, namely the presence or absence of a species of Eucalypt, Eucalyptus cloeziana, at a given sampling location in response to a suite of environmental covariates (although the methods are not restricted to binary response data).
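
The abstract does not spell out its bootstrap procedure, so the following is only a minimal sketch of the general idea, under a strong simplifying assumption: the tree is held fixed and only the observations that fall in one node are resampled, whereas the article may well refit the tree on each resample. The function name `bootstrap_proportion_interval` and its parameters are hypothetical, chosen for illustration.

```python
import random


def bootstrap_proportion_interval(labels, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the proportion of 1s in a node.

    labels: 0/1 responses (e.g. absence/presence) of the training
            observations that fall in a single tree node.
    Assumption: the node itself is fixed; the tree is not refitted.
    """
    if not labels:
        raise ValueError("the node must contain at least one observation")
    rng = random.Random(seed)
    n = len(labels)
    proportions = []
    for _ in range(n_boot):
        # Resample the node's observations with replacement.
        resample = [labels[rng.randrange(n)] for _ in range(n)]
        proportions.append(sum(resample) / n)
    proportions.sort()
    # Percentile cut-offs, clamped to valid list indices.
    lower_index = max(0, round((1 - level) / 2 * n_boot))
    upper_index = min(n_boot - 1, round((1 + level) / 2 * n_boot) - 1)
    return proportions[lower_index], proportions[upper_index]


if __name__ == "__main__":
    node_labels = [1] * 30 + [0] * 20  # 30 presences, 20 absences
    print(bootstrap_proportion_interval(node_labels))
```

A wide interval for a node would suggest that its probability forecast is unreliable, which is the kind of localized information the abstract says can guide further data collection.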

Relevance:

40.00% 40.00%

Publisher:

Abstract:

The present investigation assessed the reliability and validity of the scores of a subjective measure of desired aspirations and a behavioral measure of enacted aspirations. A sample of 5,655 employees was randomly split into two halves. Principal components analysis on Sample 1, followed by confirmatory factor analysis on Sample 2, confirmed the desired and enacted scales as distinct but related measures of managerial aspirations. The desired and enacted scales had satisfactory levels of internal consistency and temporal stability over a 1-year period. Relationships between the measures of desired and enacted managerial aspirations and both attitudinal and behavioral criteria, measured concurrently and 1 year later, provided preliminary support for convergent and discriminant validity for our sample. Desired aspirations demonstrated stronger validity than enacted aspirations. Although further examination of the psychometric properties of the scales is warranted, the present findings provide promising support for their validity and reliability for our sample.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Purpose: This study was conducted to examine the test-retest reliability of a measure of prediagnosis physical activity participation administered to colorectal cancer survivors recruited from a population-based state cancer registry. Methods: A total of 112 participants completed two telephone interviews, 1 month apart, reporting usual weekly physical activity in the year before their cancer diagnosis. Intraclass correlation coefficients (ICC) and standard error of measurement (SEM) were used to describe the test-retest reliability of the measure across the sample; the Bland-Altman approach was used to describe reliability at the individual level. The test-retest reliability for categorized total physical activity (active, insufficiently active, sedentary) was assessed using the kappa statistic. Results: When the complete sample was considered, the ICC ranged from 0.40 (95% CI: 0.24, 0.55) for vigorous gardening to 0.77 (95% CI: 0.68, 0.84) for moderate physical activity. The SEM, however, were large, indicating high measurement error. The Bland-Altman plots indicated that the reproducibility of data decreases as the amount of physical activity reported each week increases. The kappa coefficient for the categorized data was 0.62 (95% CI: 0.48, 0.76). Conclusion: Overall, the results indicated low levels of repeatability for this measure of historical physical activity. Categorizing participants as active, insufficiently active, or sedentary provides a higher level of test-retest reliability.
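
As a rough illustration of two of the statistics named in this abstract, the sketch below computes Bland-Altman limits of agreement and a standard error of measurement. It assumes the common textbook forms (limits = mean difference ± 1.96 × SD of the differences; SEM = SD × sqrt(1 − ICC)); the abstract does not say which exact formulas the authors used, and the function names and sample values are hypothetical.

```python
import math
from statistics import mean, stdev


def bland_altman_limits(test, retest):
    """Return (bias, lower limit, upper limit) for paired test-retest scores."""
    if len(test) != len(retest) or len(test) < 2:
        raise ValueError("need at least two paired observations")
    differences = [a - b for a, b in zip(test, retest)]
    bias = mean(differences)          # systematic difference between occasions
    spread = stdev(differences)       # sample SD of the differences
    return bias, bias - 1.96 * spread, bias + 1.96 * spread


def standard_error_of_measurement(scores, icc):
    """SEM from the sample SD of the scores and a reliability coefficient."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc is expected to lie between 0 and 1")
    return stdev(scores) * math.sqrt(1.0 - icc)


if __name__ == "__main__":
    first_interview = [10, 20, 30, 40]    # made-up weekly activity scores
    second_interview = [12, 18, 33, 37]
    print(bland_altman_limits(first_interview, second_interview))
    print(standard_error_of_measurement(first_interview, icc=0.75))
```

The ICC summarizes reliability across the whole sample, while the limits of agreement show how far an individual's two reports can be expected to differ, which is why a moderate ICC and wide limits can coexist as reported here.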

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Background: The physical environment plays an important role in influencing participation in physical activity, although which factors of the physical environment have the greatest effect on patterns of activity remain to be determined. We describe the development of a comprehensive instrument to measure the physical environmental factors that may influence walking and cycling in local neighborhoods and report on its reliability. Methods: Following consultation with experts from a variety of fields and a literature search, we developed a Systematic Pedestrian and Cycling Environmental Scan (SPACES) instrument and used it to collect data over a total of 1987 kilometers of roads in metropolitan Perth, Western Australia. The audit instrument is available from the first author on request. Additional environmental information was collected using desktop methods and geographic information systems (GIS) technology. We assessed inter- and intra-rater reliability of the instrument among the 16 observers who collected the data. Results: The observers reported that the audit instrument was easy to use. Both inter- and intra-rater reliability of the environmental scan instrument were generally high. Conclusions: Our instrument provides a reliable, practical, and easy-to-use method for collecting detailed street-level data on physical environmental factors that are potential influences on walking in local neighborhoods.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Objective: To assess the intrarater and interrater reliability among rheumatologists of a standardised protocol for measurement of shoulder movements using a gravity inclinometer. Methods: After instruction, six rheumatologists independently assessed eight movements of the shoulder, including total and glenohumeral flexion, total and glenohumeral abduction, external rotation in neutral and in abduction, internal rotation in abduction and hand behind back, in random order in six patients with shoulder pain and stiffness according to a 6x6 Latin square design using a standardised protocol. These assessments were then repeated. Analysis of variance was used to partition total variability into components of variance in order to calculate intraclass correlation coefficients (ICCs). Results: The intrarater and interrater reliability of different shoulder movements varied widely. The movement of hand behind back and total shoulder flexion yielded the highest ICC scores for both intrarater reliability (0.91 and 0.83, respectively) and interrater reliability (0.80 and 0.72, respectively). Low ICC scores were found for the movements of glenohumeral abduction, external rotation in abduction, and internal rotation in abduction (intrarater ICCs 0.35, 0.43, and 0.32, respectively), and external rotation in neutral, external rotation in abduction, and internal rotation in abduction (interrater ICCs 0.29, 0.11, and 0.06, respectively). Conclusions: The measurement of shoulder movements using a standardised protocol by rheumatologists produced variable intrarater and interrater reliability. Reasonable reliability was obtained only for the movement of hand behind back and total shoulder flexion.
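
To illustrate how an ICC can be derived from ANOVA variance components, here is a minimal sketch of the simplest case, the one-way random-effects ICC, often written ICC(1,1). This is a swapped-in simplification: the study used a 6x6 Latin square design with additional variance components, which this sketch does not model. The function name `icc_oneway` and the sample angles are hypothetical.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC(1,1).

    ratings: one row per subject, each row holding the same number k of
             measurements (e.g. one per rater).
    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB and MSW are the
    between-subject and within-subject mean squares.
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >= 2 subjects with the same k >= 2 ratings each")
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("all ratings are identical; the ICC is undefined")
    return (ms_between - ms_within) / denominator


if __name__ == "__main__":
    # Made-up angles in degrees: 3 patients, 2 raters.
    print(icc_oneway([[100, 110], [120, 130], [140, 150]]))
```

Values near 1 mean that most of the variability comes from real differences between patients; values near 0, such as the interrater ICCs of 0.11 and 0.06 reported above, mean that measurement disagreement dominates.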

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Developed, piloted, and examined the psychometric properties of the Child and Adolescent Social and Adaptive Functioning Scale (CASAFS), a self-report measure designed to examine the social functioning of young people in the areas of school performance, peer relationships, family relationships, and home duties/self-care. The findings of confirmatory and exploratory factor analysis support a 4-factor solution consistent with the hypothesized domains. Fit indexes suggested that the 4-correlated factor model represented a satisfactory solution for the data, with the covariation between factors being satisfactorily explained by a single, higher order factor reflecting social and adaptive functioning in general. The internal consistency and 12-month test-retest reliability of the total scale were acceptable. A significant, negative correlation was found between the CASAFS and a measure of depressive symptoms, showing that high levels of social functioning are associated with low levels of depression. Significant differences in CASAFS total and subscale scores were found between clinically depressed adolescents and a matched sample of nonclinical controls. Adolescents who reported elevated but subclinical levels of depression also reported lower levels of social functioning in comparison to nonclinical controls.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

OBJECTIVE Because there is discordance between different immunoassay values for serum hGH, and because clinical state may not correlate with immunoreactive hGH, we have developed an assay to accurately measure serum hGH somatogenic bioactivity. The results of this assay were compared with the Elegance two-site ELISA assay across 135 patient samples in a variety of clinical states. DESIGN The somatogenic assay was based on stable expression of hGH receptor in the murine BaF line, allowing these cells to proliferate in response to hGH. To eliminate interference by other growth factors in serum, we created a specific antagonist of the hGH receptor (similar to Trovert or Pegvisomant) which allowed us to obtain a true measure of hGH somatogenic activity by subtraction of the activity in the presence of the antagonist. The assay was carried out in microtiter plates over 24 h, with oxidation of a chromogenic tetrazolium salt (MTT) as the endpoint. PATIENTS These encompassed a number of different clinical conditions related to short stature, including idiopathic short stature, neurosecretory dysfunction and renal failure, as well as obese patients on dietary restriction and normal volunteers. MEASUREMENTS In addition to the colourimetric (MTT) response to hGH, we measured free hGH by stripping out GHBP-bound hGH using beads coupled to a monoclonal antibody to the GHBP (GH binding protein). All samples were measured in both bioassay and ELISA assay. RESULTS This bioassay was sensitive (5 mU/l or 2 µg/l) and precise, and not subject to interference by the GHBP. There was a good correlation (r = 0.95) between bioactivity and immunoactivity across clinical states. There was, however, an increased bioactivity during secretory peaks (over 25 mU/l), which has been reported previously for the Nb2 bioassay. Free hGH did not correlate with clinical state.
CONCLUSIONS Because the results of the Elegance ELISA and the bioassay correlate well, even though there is greater bioactivity at higher hormone concentrations, it is evident that an appropriate immunoassay is able to act as a reliable indicator for clinical assessment. In those rare cases where bio-inactive GH exists, our bioassay should provide an appropriate means to demonstrate this.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Background. The purpose of this study was to examine the reliability of stage of change (SOC) measures for moderate-intensity and vigorous physical activity in two separate samples of young adults. Staging measures have focused on vigorous exercise, but current public health guidelines emphasize moderate-intensity activity. Method. For college students in the USA (n = 105) and in Australia (n = 123), SOC was assessed separately on two occasions for moderate-intensity activity and for vigorous activity. Test-retest repeatability was determined, using Cohen's kappa coefficient. Results. In both samples, the reliability scores for the moderate-intensity physical activity staging measure were lower than the scores for the vigorous exercise staging measure. Weighted kappa values for the moderate-intensity staging measure were in the fair to good range for both studies (0.50 and 0.45); for the vigorous staging measure kappa values were excellent and fair to good (0.76 and 0.72). Conclusions. There is a need to standardize and improve methods for staging moderate-intensity activity, given that such measures are used in public health interventions targeting HEPA (health-enhancing physical activity). (C) 2003 American Health Foundation and Elsevier Science (USA). All rights reserved.
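
For readers unfamiliar with the statistic, the sketch below computes Cohen's kappa for two ratings of the same people, with optional linear disagreement weights. The abstract reports weighted kappa values but does not say which weighting scheme was used, so the linear weights here are an assumption, and the function name and sample data are hypothetical.

```python
from collections import Counter


def cohen_kappa(first, second, categories, weighted=False):
    """Cohen's kappa for two ratings of the same subjects.

    categories: the possible categories, listed in their natural order.
    weighted:   if True, use linear disagreement weights |i - j| / (k - 1),
                so that near-misses between adjacent stages count less.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two equally long, non-empty rating lists")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two categories")
    position = {category: i for i, category in enumerate(categories)}
    n = len(first)

    # Joint and marginal counts of the two occasions.
    joint = Counter((position[a], position[b]) for a, b in zip(first, second))
    row = Counter(position[a] for a in first)
    col = Counter(position[b] for b in second)

    def weight(i, j):
        if weighted:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(weight(i, j) * joint[(i, j)] / n for i, j in cells)
    expected = sum(weight(i, j) * row[i] * col[j] / (n * n) for i, j in cells)
    if expected == 0:
        raise ValueError("no expected disagreement; kappa is undefined")
    # Kappa is one minus the ratio of observed to chance-expected disagreement.
    return 1.0 - observed / expected


if __name__ == "__main__":
    stages = [0, 1, 2]  # made-up ordered stages of change
    occasion_1 = [0, 1, 2, 0, 1, 2]
    occasion_2 = [0, 1, 2, 1, 2, 2]
    print(cohen_kappa(occasion_1, occasion_2, stages))
    print(cohen_kappa(occasion_1, occasion_2, stages, weighted=True))
```

Because stages of change are ordered, a weighted kappa treats a shift to an adjacent stage as a smaller disagreement than a jump across several stages, which is a plausible reason for reporting it in a staging study.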

Relevance:

30.00% 30.00%

Publisher:

Abstract:

We investigated the accuracy and reliability of observational kinematic gait assessments performed via a low-bandwidth Internet link (118 kbit/s) and a higher-speed Internet link (128 kbit/s). Twenty-four subjects were randomized to either bandwidth group. Gait was assessed with the Gait Assessment Rating Scale (GARS) in the traditional manner, which is from video-recordings, and with repeated measurements via the online method. Online assessment was found to provide as accurate a measure of gait performance as the traditional assessment (limits of agreement

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Objective: To assess the reliability and validity of a brief measure of quality of life recently developed by the World Health Organization, the WHOQOL-BREF, and to examine its association with a variety of clinical and sociodemographic factors in older depressed patients. Design: Cross-sectional study. Methods: Older depressed patients (N=41) underwent diagnostic assessment using the Composite International Diagnostic Interview (CIDI) and were independently assessed on a variety of measures including the WHOQOL-BREF (a 26-item self-report questionnaire generating four domain scores), Hamilton Depression Rating Scale (HAM-D); Geriatric Depression Scale (GDS); Mini-mental State Examination (MMSE); Modified Barthel Index (MBI); Instrumental activities of daily living (IADL), and measures of physical health status and social relationships. Estimates of inter-rater and test-retest reliability, and concurrent validity were made. Results: 39 subjects completed the study. The majority of subjects (94.9%) received a diagnosis of DSM-IV Major Depressive Disorder. Levels of comorbidity were high. Three of the four domains of the WHOQOL-BREF (Physical, Psychological and Environment domains) demonstrated satisfactory reliability and validity. However, the Social Relationships domain exhibited poor validity. Quality of life scores were strongly correlated with severity of depression, number of self-reported physical symptoms and self-assessed general health status. There was no relationship between diagnostic comorbidity and quality of life scores. Conclusions: The WHOQOL-BREF was successfully administered to older depressed patients although the concurrent validity of one of its four domains was poor. Quality of life scores were strongly correlated with severity of depression, raising the issue of measurement redundancy.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The reliability of measurement refers to unsystematic error in observed responses. Investigations of the prevalence of random error in stated estimates of willingness to pay (WTP) are important to an understanding of why tests of validity in CV can fail. However, published reliability studies have tended to adopt empirical methods that have practical and conceptual limitations when applied to WTP responses. This contention is supported in a review of contingent valuation reliability studies that demonstrate important limitations of existing approaches to WTP reliability. It is argued that empirical assessments of the reliability of contingent values may be better dealt with by using multiple indicators to measure the latent WTP distribution. This latent variable approach is demonstrated with data obtained from a WTP study for stormwater pollution abatement. Attitude variables were employed as a way of assessing the reliability of open-ended WTP (with benchmarked payment cards) for stormwater pollution abatement. The results indicated that participants' decisions to pay were reliably measured, but not the magnitude of the WTP bids. This finding highlights the need to better discern what is actually being measured in WTP studies. (C) 2003 Elsevier B.V. All rights reserved.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

To determine whether the visuospatial n-back working memory task is a reliable and valid measure of cognitive processes believed to underlie intelligence, this study compared the reaction times and accuracy of performance of 70 participants with performance on the Multidimensional Aptitude Battery (MAB). Testing was conducted over two sessions separated by 1 week. Participants completed the MAB during the second test session. Moderate test-retest reliability for percentage accuracy scores was found across the four levels of the n-back task, whilst reaction times were highly reliable. Furthermore, participants' performance on the MAB was negatively correlated with accuracy of performance at the easier levels of the n-back task and positively correlated with accuracy of performance at the harder task levels. These findings confirm previous research examining the cognitive basis of intelligence, and suggest that intelligence is the product of faster speed of information processing, as well as superior working memory capacity. (C) 2004 Elsevier Inc. All rights reserved.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This study examined the psychometric properties of the parent version of the Spence Children's Anxiety Scale (SCAS-P); 484 parents of anxiety disordered children and 261 parents in a normal control group participated in the study. Results of confirmatory factor analysis provided support for six intercorrelated factors, that corresponded with the child self-report as well as with the classification of anxiety disorders by DSM-IV (namely separation anxiety, generalized anxiety, social phobia, panic/agoraphobia, obsessive-compulsive disorder, and fear of physical injuries). A post-hoc model in which generalized anxiety functioned as the higher order factor for the other five factors described the data equally well. The reliability of the subscales was satisfactory to excellent. Evidence was found for both convergent and divergent validity: the measure correlated well with the parent report for internalizing symptoms, and lower with externalizing symptoms. Parent-child agreement ranged from 0.41 to 0.66 in the anxiety-disordered group, and from 0.23 to 0.60 in the control group. The measure differentiated significantly between anxiety-disordered children versus controls, and also between the different anxiety disorders except GAD. The SCAS-P is recommended as a screening instrument for normal children and as a diagnostic instrument in clinical settings. (C) 2003 Elsevier Ltd. All rights reserved.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Objectives: To report the research and development of a new approach to Functional Capacity Evaluation, the Gibson Approach to Functional Capacity Evaluation (GAPP FCE) for chronic back pain clients. Methods: Four studies, including pilot and feasibility testing, expert review, and preliminary interrater reliability examination, are described here. Participants included 7 healthy young adults and 19 rehabilitation clients with back pain who underwent assessment using the GAPP FCE. Thirteen therapists were trained in the approach and were silently observed administering the Functional Capacity Evaluations by at least 1 other trained therapist or the first investigator, or both. An expert review using 5 expert occupational therapists was also conducted. Results: Study 1, the pilot with healthy individuals, indicated that the GAPP FCE was a feasible approach with good utility. Study 2, a pilot using 2 trained therapists assessing 5 back pain clients, supported the clinical feasibility of the approach. The expert review in Study 3 found support for GAPP FCE. Study 4, a trial of the approach with 14 rehabilitation clients, found support for the interrater reliability of recommendations for return to work based on performance in the GAPP FCE. Discussion: The evidence thus far available supports the GAPP FCE as an approach that provides a sound method for evaluating the performance of the physical demands of work with clients with chronic back pain. The tool has been shown to have good face and content validity, to meet acceptable test standards, and to have reasonable interrater reliability. Further research is occurring to look at a larger interrater reliability study, to further examine content validity, and to examine predictive validity.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This paper describes the development of an instrument to assess coping strategies for auditory hallucinations. An inventory of coping strategies was obtained by conducting semi-structured interviews with 17 male participants. This inventory was then used to develop a 27-item questionnaire, the Responses to Auditory Hallucinations Questionnaire (RAHQ). The RAHQ was administered to 125 respondents. Measures of symptom severity, appraisal, anxiety, depression and coping dissatisfaction were also administered. Factor analysis of the RAHQ yielded three coping subscales: Active coping, Passive coping and Suppression coping. The subscales were shown to be empirically distinct and to possess satisfactory internal reliability. For a small subgroup of participants, two of the three subscales demonstrated satisfactory test-retest reliability. Construct validity was assessed within a stress and coping framework. The RAHQ will facilitate the investigation of the efficacy of coping strategies for the management of auditory hallucinations.