954 results for Test reliability





Occupational therapists often assess the visual motor integration (VMI) skills of children and young people. It is important that therapists use tools with strong psychometric properties. This study aims to examine the reliability of 2 VMI tests. Ninety-two children between the ages of 5 and 17 years (response rate of 31%) completed 2 VMI tests: the Developmental Test of Visual Motor Integration (DTVMI) and the Full Range Test of Visual Motor Integration (FRTVMI). Cronbach's alpha coefficient was used to examine the internal consistency of the 2 VMI tests, whereas Spearman's rho correlation was used to evaluate their test–retest, intrarater, and interrater reliability. Cronbach's alpha was .82 for the DTVMI and .72 for the FRTVMI. The test–retest reliability coefficient was .73 (p < .001) for the DTVMI and .49 (p = .05) for the FRTVMI. The interrater correlation was significant for both the DTVMI at .94 (p < .001) and the FRTVMI at .68 (p = .001). The intrarater reliability correlation was .90 (p < .001) for the DTVMI and .85 (p < .001) for the FRTVMI. Overall, the DTVMI exhibited a higher level of reliability than the FRTVMI. Both VMI tests appear to exhibit reasonable levels of reliability and are recommended for use with children and young people.
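As an illustration of the internal-consistency statistic named in this abstract, the following is a minimal Python sketch of Cronbach's alpha using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the score matrix are hypothetical and are not taken from the study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # transpose: one tuple per item
    item_variances = [variance(col) for col in items]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 5 respondents scored on 3 items (not from the study).
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [3, 3, 4],
    [5, 4, 5],
]
alpha = cronbach_alpha(scores)
print(f"Cronbach's alpha: {alpha:.2f}")
```

The sketch uses sample variances throughout; the result is undefined when every respondent has the same total score.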





Background : Insufficient participation in physical activity and excessive screen time have been observed among Chinese children. The role of social and environmental factors in shaping physical activity and sedentary behaviors among Chinese children is under-investigated. The purpose of the present study was to assess the reliability and validity of a questionnaire to measure child- and parent-reported psychosocial and environmental correlates of physical activity and screen-based behaviors among Chinese children in Hong Kong.

Methods :
A total of 303 schoolchildren aged 9-14 years and their parents volunteered to participate in this study, and 160 of them completed the questionnaire twice within an interval of 10 days. Test-retest reliability was evaluated with intraclass correlation coefficients (ICCs) for the continuous variables and with kappa statistics and percent agreement for the categorical variables. Exploratory factor analyses (EFAs) were conducted to assess convergent validity of the emergent scales. Cronbach's alpha and ICCs were used to assess internal consistency and test-retest reliability of the emergent scales. Criterion validity was assessed by correlating psychosocial and environmental measures with self-reported physical activity and screen-based behaviors, measured by a validated questionnaire.

Results :
Reliability statistics for both child- and parent-reported continuous variables showed acceptable consistency, with all ICC values greater than 0.70. Kappa statistics showed fair to perfect test-retest reliability for the categorical items. Adequate internal consistency and test-retest reliability were observed in most of the emergent scales. In the criterion validity analysis, child-reported physical activity was associated with the self-efficacy scale (r = 0.25, P < 0.05), the peer support for physical activity scale (r = 0.25, P < 0.05) and the home physical activity environment (r = 0.14, P < 0.05). Children's screen-based behaviors were associated with the family support for physical activity scale (r = -0.22, P < 0.05) and parental role modeling of TV (r = 0.12, P = 0.053).

Conclusions :
The findings provide psychometric support for using this questionnaire for examining psychosocial and environmental correlates of physical activity and screen-based behaviors among Chinese children in Hong Kong. Further research is needed to develop more robust measures based on the current questionnaire, especially for peer influence on physical activity and parental rules on screen-based behaviors.
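The kappa statistic and percent agreement used above for the categorical test-retest items can be sketched as follows. This is a minimal illustration of Cohen's kappa for two administrations of one item; the abstract does not say which kappa variant was computed, and the function names and responses are hypothetical.

```python
from collections import Counter


def percent_agreement(first, second):
    """Percentage of respondents giving the same answer on both occasions."""
    same = sum(a == b for a, b in zip(first, second))
    return 100.0 * same / len(first)


def cohen_kappa(first, second):
    """Cohen's kappa: agreement between two administrations, corrected for chance."""
    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n
    counts_first, counts_second = Counter(first), Counter(second)
    categories = set(first) | set(second)
    # Chance agreement from the marginal proportions of each category.
    expected = sum(counts_first[c] * counts_second[c] for c in categories) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical answers of 8 respondents to one yes/no item (not study data).
test = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
retest = ["yes", "yes", "no", "yes", "yes", "no", "no", "no"]

agreement = percent_agreement(test, retest)
kappa = cohen_kappa(test, retest)
print(f"percent agreement: {agreement:.1f}%, kappa: {kappa:.2f}")
```

Kappa is undefined when chance agreement equals 1 (every answer in the same category on both occasions), which the sketch does not guard against.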





This study aimed to examine the reliability and validity of the modified Children’s Leisure Activities Study Survey (CLASS) Chinese-version questionnaire in assessing physical activity among Hong Kong Chinese children. Test-retest reliability was examined in 84 boys and 136 girls aged 9–12 years by comparing data from two administrations of the survey conducted one week apart. Validity was determined by comparing data from the second administration with accelerometer estimates. The results suggested that the questionnaire provided reliable and valid estimates of overall physical activity patterns in Hong Kong Chinese children. However, substantial overestimation was observed in vigorous activity.





Background The diagnosis of displacement in scaphoid fractures is notorious for poor interobserver reliability.

Questions/purposes We tested whether training can improve interobserver reliability and sensitivity, specificity, and accuracy for the diagnosis of scaphoid fracture displacement on radiographs and CT scans.

Methods Sixty-four orthopaedic surgeons rated a set of radiographs and CT scans of 10 displaced and 10 nondisplaced scaphoid fractures for the presence of displacement, using a web-based rating application. Before rating, observers were randomized to a training group (34 observers) and a nontraining group (30 observers). The training group received an online training module before the rating session, and the nontraining group did not. Interobserver reliability for training and nontraining was assessed by Siegel’s multirater kappa and the Z-test was used to test for significance.

Results There was a small, but significant difference in the interobserver reliability for displacement ratings in favor of the training group compared with the nontraining group. Ratings of radiographs and CT scans combined resulted in moderate agreement for both groups. The average sensitivity, specificity, and accuracy of diagnosing displacement of scaphoid fractures were, respectively, 83%, 85%, and 84% for the nontraining group and 87%, 86%, and 87% for the training group. Assuming a 5% prevalence of fracture displacement, the positive predictive value was 0.23 in the nontraining group and 0.25 in the training group. The negative predictive value was 0.99 in both groups.

Conclusions Our results suggest training can improve interobserver reliability and sensitivity, specificity and accuracy for the diagnosis of scaphoid fracture displacement, but the improvements are slight. These findings are encouraging for future research regarding interobserver variation and how to reduce it further.
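The predictive values reported in the Results above follow from Bayes' rule applied to sensitivity, specificity, and the assumed 5% prevalence. The short Python sketch below redoes that arithmetic from the rounded percentages given in the abstract; the function name `predictive_values` is illustrative only.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at the given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as reported in the abstract, prevalence 5%.
nontraining = predictive_values(0.83, 0.85, 0.05)
training = predictive_values(0.87, 0.86, 0.05)

for label, (ppv, npv) in (("nontraining", nontraining), ("training", training)):
    print(f"{label}: PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

At low prevalence the positive predictive value stays low even with sensitivity and specificity above 80%, because false positives from the large nondisplaced group outnumber true positives.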





PURPOSE. To compare the reliability, validity, and responsiveness of the Mars Letter Contrast Sensitivity (CS) Test to the Pelli-Robson CS Chart.

METHODS. One eye of 47 normal control subjects, 27 patients with open-angle glaucoma, and 17 with age-related macular degeneration (AMD) was tested twice with the Mars test and twice with the Pelli-Robson test, in random order on separate days. In addition, 17 patients undergoing cataract surgery were tested, once before and once after surgery.

RESULTS. The mean Mars CS was 1.62 log CS (0.06 SD) for normal subjects aged 22 to 77 years, with significantly lower values in patients with glaucoma or AMD (P < 0.001). Mars test-retest 95% limits of agreement (LOA) were ±0.13, ±0.19, and ±0.24 log CS for normal, glaucoma, and AMD, respectively. In comparison, Pelli-Robson test-retest 95% LOA were ±0.18, ±0.19, and ±0.33 log CS. The Spearman correlation between the Mars and Pelli-Robson tests was 0.83 (P < 0.001). However, systematic differences were observed, particularly at the upper-normal end of the range, where Mars CS was lower than Pelli-Robson CS. After cataract surgery, Mars and Pelli-Robson effect size statistics were 0.92 and 0.88, respectively.

CONCLUSIONS. The results indicate the Mars test has test-retest reliability equal to or better than the Pelli-Robson test and comparable responsiveness. The strong correlation between the tests provides evidence the Mars test is valid. However, systematic differences indicate normative values are likely to be different for each test. The Mars Letter CS Test is a useful and practical alternative to the Pelli-Robson CS Chart.
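The test-retest 95% limits of agreement quoted in the Results are conventionally computed, following Bland and Altman, as the mean test-retest difference plus or minus 1.96 standard deviations of the differences. The sketch below assumes that convention; the abstract does not spell out the computation, and the log CS values are hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(test, retest):
    """Return (bias, lower limit, upper limit) for paired test-retest scores."""
    differences = [second - first for first, second in zip(test, retest)]
    bias = mean(differences)                    # mean test-retest difference
    half_width = 1.96 * stdev(differences)      # the "±" value usually reported
    return bias, bias - half_width, bias + half_width


# Hypothetical log CS scores for six subjects on two days (not study data).
day1 = [1.60, 1.64, 1.56, 1.68, 1.60, 1.72]
day2 = [1.64, 1.60, 1.60, 1.68, 1.56, 1.76]

bias, lower, upper = limits_of_agreement(day1, day2)
print(f"bias = {bias:+.3f}, 95% LOA = [{lower:+.3f}, {upper:+.3f}] log CS")
```

Narrower limits mean that a smaller change between visits can be distinguished from measurement noise, which is why the narrower Mars limits support the reliability conclusion.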





Background: The Broberg and Morrey modification of the Mason classification of radial head fractures has substantial interobserver variation. This study used a large web-based collaborative of experienced orthopaedic surgeons to test the hypothesis that three-dimensional reconstructions of computed tomography (CT) scans improve the interobserver reliability of the classification of radial head fractures according to the Broberg and Morrey modification of the Mason classification.

Methods: Eighty-five orthopaedic surgeons evaluated twelve radial head fractures. They were randomly assigned to review either radiographs and two-dimensional CT scans or radiographs and three-dimensional CT images to determine the fracture classification, fracture characteristics, and treatment recommendations. The kappa multirater measure (κ) was calculated to estimate agreement between observers.

Results: Three-dimensional CT had moderate agreement and two-dimensional CT had fair agreement among observers for the Broberg and Morrey modification of the Mason classification, a difference that was significant. Observers assessed seven fracture characteristics, including fracture line, comminution, articular surface involvement, articular step or gap of ≥2 mm, central impaction, recognition of more than three fracture fragments, and fracture fragments too small to repair. There was a significant difference in kappa values between three-dimensional CT and two-dimensional CT for fracture fragments too small to repair, recognition of three fracture fragments, and central impaction. The difference between the other four fracture characteristics was not significant. Among treatment recommendations, there was fair agreement for both three-dimensional CT and two-dimensional CT.

Conclusions: Although three-dimensional CT led to some small but significant decreases in interobserver variation, there is still considerable disagreement regarding classification and characterization of radial head fractures. Three-dimensional CT may be insufficient to optimize interobserver agreement.
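To make the multirater kappa used in this and the preceding fracture studies more concrete, the sketch below computes Fleiss' kappa, a widely used multirater agreement statistic. It is swapped in as an illustration; the studies' exact variant may differ in detail. The rating counts are hypothetical. Labels such as "fair" and "moderate" are commonly read off the Landis and Koch scale (0.21-0.40 fair, 0.41-0.60 moderate).

```python
def fleiss_kappa(counts):
    """Fleiss' kappa.

    counts[i][j] is the number of raters who assigned case i to category j;
    every case must be rated by the same number of raters.
    """
    n_cases = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])

    # Overall proportion of all ratings that fall in each category.
    p_category = [
        sum(row[j] for row in counts) / (n_cases * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement among rater pairs, case by case.
    p_case = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_case) / n_cases
    p_expected = sum(p * p for p in p_category)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical data: 5 fractures, 4 raters each, 3 classification types.
ratings = [
    [4, 0, 0],
    [2, 2, 0],
    [0, 3, 1],
    [1, 1, 2],
    [0, 0, 4],
]
kappa = fleiss_kappa(ratings)
print(f"Fleiss' kappa: {kappa:.2f}")
```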





Urban sustainability expresses how well a city is conserved while people live in it and consume its urban resources. Its measurement depends on which indicators of conservation are considered important and on the levels of consumption permitted under the adopted criteria. These criteria should include common factors shared by all the cities being evaluated (in this case, Abu Dhabi) as well as specific factors related to geographic place, community and culture. This paper considers measures of urban sustainability specific to a Middle East climate, community and culture in which GIS vector and raster analysis has a role or adds value to urban sustainability measurement or grading. Scenarios were tested using various GIS data types to replicate the urban history (a ten-year period), current status and expected future of Abu Dhabi City, with factors set for climate, community needs and culture. The vector or raster GIS data sets relevant to each scenario were selected and analysed to establish how, and how much, they can benefit the urban sustainability ranking in quantity and quality tests. The analysis also assessed the suitable nature, type and format of the data, the important topology rules to be considered, the useful attributes to be added, the relationships that should be maintained between data types in a geodatabase, and the usage of each data type in a specific scenario test. Weights were then assigned to every data type representing elements of a phenomenon related to an urban suitability factor. The assessment of the role of GIS analysis yielded data collection specifications, such as the measures of accuracy that are reliable for a given type of GIS functional analysis used in urban sustainability ranking scenario tests.
This paper reports the early results of research conducted to test a multidisciplinary evaluation of urban sustainability using different indicator metrics. Vector and raster GIS analysis serve as basic tools to assist the evaluation, increase its reliability, and assess and decompose it. A hypothetical implementation of the chosen evaluation model, represented by various scenarios, was then applied to the planned urban sustainability factors for a set period of time in order to appraise the expected future grade of urban sustainability and to produce advice, associated with the scenarios, for filling gaps and securing relatively high future urban sustainability. The results reported here concentrate on the elements of vector and raster GIS analysis that assist proper urban sustainability grading within the chosen model, and on the reliability of the spatial data collected, the analysis selected and the resulting spatial information. The model comprises selected important indicators that cover regional culture, climate and community needs; one example is Energy Demand & Consumption (Cooling systems). This factor is related to the climate and is region-specific, as the temperature varies around 30-45 degrees centigrade in city areas. GIS 3D polygons of building data were used to analyse the volume of buildings, and the 'building heights' attribute was used to estimate the number of floors from the equation. Energy demand and consumption were then calculated per unit volume and compared, in a scenario, with a possible sustainable energy supply or with different environmentally friendly cooling systems. Next, the effects of the cooling system were calculated for a unit area of 1 sq. km, combined with the level of greenery and open space as represented by park polygons, tree polygons, empty areas, pedestrian polygons and road surface area polygons.
Initial measures showed that cooling system consumption can be reduced by around 15-20% with well-planned building distribution, proper spacing, and the use of environmentally friendly products and building materials. Temperature levels were also included in the scenario; they were extracted from satellite images, interpreted from thermal bands three times during the assessment period. Other examples of assessing GIS analysis for urban sustainability included Waste Productivity and some effects of greenhouse gases, measured by the intensity of road polygons and their closeness to dwelling areas and industry areas. These areas were defined from land use/land cover thematic maps produced from classified satellite images, and vectors were then created to define their role within the scenarios. City noise and light intensity were also investigated, as the region is experiencing rapid development and noise is magnified by construction activities and the closeness of airports and highways. The assessment investigated the measures taken by urban planners to reduce degradation or manage it properly. Finally, by way of conclusion, tables were presented showing the scenario results in combination with GIS data types, analysis types, and the level of GIS data reliability, to measure the sustainability level of a city in relation to cultural and regional demands.





Background : Walking is a preferred, prevalent and recommended activity for aging populations and is influenced by the neighborhood built environment. To study this influence it is necessary to differentiate whether walking occurs within or outside of the neighborhood. The Neighborhood Physical Activity Questionnaire (NPAQ) collects information on setting-specific physical activity, including walking, inside and outside one's neighborhood. While the NPAQ has been shown to be a reliable measure in adults, its reliability in older adults is unknown. Additionally, its validity and the influence of type of neighborhood on reliability and validity have yet to be explored.

Methods : The NPAQ walking component was adapted for Chinese-speaking elders (NWQ-CS). Ninety-six Chinese elders, stratified by socioeconomic status and neighborhood walkability, wore an accelerometer and completed a log of walks for 7 days. Following the collection of valid data, the NWQ-CS was interviewer-administered. Fourteen to 20 days (average of 17 days) later the NWQ-CS was re-administered. Test-retest reliability and validity of the NWQ-CS were assessed.

Results : Reliability and validity estimates did not differ with type of neighborhood. NWQ-CS measures of walking showed moderate to excellent reliability. Reliability was generally higher for estimates of weekly frequency than minutes of walking. Total weekly minutes of walking were moderately related to all accelerometry measures. Moderate-to-strong associations were found between the NWQ-CS and log-of-walks variables. The NWQ-CS yielded statistically significantly lower mean values of total walking, weekly minutes of walking for transportation and weekly frequency of walking for transportation outside the neighborhood than the log-of-walks.

Conclusions : The NWQ-CS showed measurement invariance across types of neighborhoods. It is a valid measure of walking for recreation and frequency of walking for transport. However, it may systematically underestimate the duration of walking for transport in samples that engage in high levels of this type of walking.





Objective : To investigate the reliability and the validity of the long format, Chinese version of the International Physical Activity Questionnaire (IPAQ-LC).

Design : Cross-sectional study, examining the reliability and validity of the IPAQ-LC compared with a physical activity log (PA-log) and objective accelerometry.

Setting : Self-reported physical activity (PA) in Hong Kong adults.

Subjects : A total of eighty-three Chinese adults (forty-seven males, thirty-six females) were asked to wear an ActiTrainer accelerometer (MTI-ActiGraph, Fort Walton Beach, FL, USA) for >10 h over 7 d, to complete a PA-log at the end of each day and to complete the IPAQ-LC on day 8. On a sub-sample of twenty-eight adults the IPAQ-LC was also administered on day 11 to assess its reliability.

Results : The IPAQ-LC had good test–retest reliability for grouped activities, with intra-class correlation coefficients ranging from 0·74 to 0·97 for vigorous, moderate, walking and total PA, with between-test effect sizes that were small (<0·49). The Spearman correlation coefficients were statistically significant for vigorous PA (r = 0·28), moderate + walking PA (r = 0·27), as well as overall PA (r = 0·35), when compared with the accelerometry-based criterion measures, but none of the IPAQ activity categories correlated significantly with the PA-log. In absolute units, only the IPAQ light and overall PA did not differ significantly from the accelerometry measures, yet overall PA was able to faithfully discriminate between quartiles of PA (P = 0·019) when compared to accelerometry.

Conclusions : The IPAQ-LC demonstrated adequate reliability and showed sufficient evidence of validity in assessing overall levels of habitual PA to be used on Hong Kong adults.





This study aimed to develop and then test the reliability and validity of a new self-report questionnaire, the building environmental quality questionnaire (BEQQ), designed to assess perceived environmental quality in residential apartments in Hong Kong. A total of 108 Chinese-speaking residents (46 men and 62 women) between 16 and 81 years of age completed the questionnaire. The subjects were recruited from 12 different buildings of three distinct quality ratings (low, medium and high) assigned by a building assessment tool, the building health and hygiene index (BHHI). Test-retest reliability was assessed with 20 of the participants (18% of the total sample). The BEQQ was found to have good test-retest reliability, with intra-class correlation coefficient (ICC) values typically around 0.70. The validity testing, also using ICCs, generated moderate to high values for all BEQQ sub-categories (the mean value was around 0.80), indicating good consistency among residents living within the same building. Finally, the summary BEQQ scores were significantly correlated (-0.68) with the BHHI ratings as the criterion standard. It is concluded that this eight-dimension instrument provides a short and efficient questionnaire method for obtaining self-reported information on perceived residential building quality. The method was shown to yield adequate reliability and has been validated for use in empirical research.





Swiftlets are small insectivorous birds, many of which nest in caves and are known to echolocate. Due to a lack of distinguishing morphological characters, the taxonomy of swiftlets is primarily based on the presence or absence of echolocating ability, together with nest characters. To test the reliability of these behavioral characters, we constructed an independent phylogeny using cytochrome b mitochondrial DNA sequences from swiftlets and their relatives. This phylogeny is broadly consistent with the higher classification of swifts but does not support the monophyly of swiftlets. Echolocating swiftlets (Aerodramus) and the nonecholocating "giant swiftlet" (Hydrochous gigas) group together, but the remaining nonecholocating swiftlets belonging to Collocalia are not sister taxa to these swiftlets. While echolocation may be a synapomorphy of Aerodramus (perhaps secondarily lost in Hydrochous), no character of Aerodramus nests showed a statistically significant fit to the molecular phylogeny, indicating that nest characters are not phylogenetically reliable in this group.





Interobserver reliability for the classification of proximal humeral fractures is limited. The aim of this study was to test the null hypothesis that interobserver reliability of the AO classification of proximal humeral fractures, the preferred treatment, and fracture characteristics is the same for two-dimensional (2-D) and three-dimensional (3-D) computed tomography (CT). Members of the Science of Variation Group (fully trained practicing orthopaedic and trauma surgeons from around the world) were randomized to evaluate radiographs and either 2-D CT or 3-D CT images of fifteen proximal humeral fractures via a web-based survey and respond to the following four questions: (1) Is the greater tuberosity displaced? (2) Is the humeral head split? (3) Is the arterial supply compromised? (4) Is the glenohumeral joint dislocated? They also classified the fracture according to the AO system and indicated their preferred treatment of the fracture (operative or nonoperative). Agreement among observers was assessed with use of the multirater kappa (κ) measure. Interobserver reliability of the AO classification, fracture characteristics, and preferred treatment generally ranged from "slight" to "fair." A few small but statistically significant differences were found. Observers randomized to the 2-D CT group had slightly but significantly better agreement on displacement of the greater tuberosity (κ = 0.35 compared with 0.30, p < 0.001) and on the AO classification (κ = 0.18 compared with 0.17, p = 0.018). A subgroup analysis of the AO classification results revealed that shoulder and elbow surgeons, orthopaedic trauma surgeons, and surgeons in the United States had slightly greater reliability on 2-D CT, whereas surgeons in practice for ten years or less and surgeons from other subspecialties had slightly greater reliability on 3-D CT.
Proximal humeral fracture classifications may be helpful conceptually, but they have poor interobserver reliability even when 3-D rather than 2-D CT is utilized. This may contribute to the similarly poor interobserver reliability that was observed for selection of the treatment for proximal humeral fractures. The lack of a reliable classification confounds efforts to compare the outcomes of treatment methods among different clinical trials and reports.





The primary aim of this study was to develop and validate a golf-specific approach-iron test for use with elite and high-level amateur golfers. Elite (n=26) and high-level amateur (n=23) golfers were recruited for this study. The ‘Approach-Iron Skill Test’ requires players to hit a total of 27 shots. Specifically, three shots are hit at each of nine targets on a specially constructed driving range in a randomised order. A real-time launch monitor positioned behind the player measured the carry distance for each of these shots. A scoring system was developed based on the percentage error index of each shot, meaning that 81 points was the maximum score possible (with a maximum of three points per shot). Two rounds of the test were performed. For both rounds of the test, elite-level golfers scored significantly higher than their high-level amateur counterparts (56.3±5.6 and 58.5±4.6 points versus 46.0±6.3 and 46.1±6.7 points, respectively) (P<0.05). For both elite and high-level players, 95% limits of agreement statistics also indicated that the test showed good test–retest reliability (2.1±7.9 and 0.2±10.8, respectively). Due to the clinimetric properties of the test, we conclude that the Approach-Iron Skill Test is suitable for further examination with the players examined in this study.





Despite a recent increase in the amount of research investigating performance in golf, a comprehensive putting skill test has not been reported in the peer-reviewed literature. In this study, the Golf Australia Putting Test (GAPT) was developed and a series of measurement properties were assessed. Elite (n = 18) and high-level amateur (HLA; n = 22) participants completed six single putts from various areas on six concentric circles (circle radii = 0.9, 1.5, 3.0, 4.6, 6.1 and 7.6 m). Using a scoring system that rewarded participants for holing putts from longer distances, the maximum score from a single round of the test (i.e. 36 putts) was 27 points. After two rounds of the test were completed by all players, a subsample of participants (elite, n = 15; HLA, n = 7) had their putting performance recorded during tournament play for a period of 90 days to assess criterion (predictive) validity of the test. The reliability, sensitivity and discriminative validity of the GAPT were also assessed. Better agreement between Rounds 1 and 2 scores was noted in the elite group, whilst reliability values were similar for both groups. Further, the GAPT scores were shown to predict players from the elite and high-ability groups with a low classification error. An equation for predicting on-course performance from GAPT scores was also developed. Findings from this study indicate that the GAPT is a valid and reliable tool for high-level players and the GAPT may be used for player evaluation in the field.





No current validated survey instrument allows a comprehensive assessment of both physical activity and travel behaviours for use in interdisciplinary research on walking and cycling. This study reports on the test-retest reliability and validity of physical activity measures in the transport and physical activity questionnaire (TPAQ).