270 results for Validation par connaissance expert
Abstract:
The popularity of Bayesian Network modelling of complex domains using expert elicitation has raised questions of how one might validate such a model given that no objective dataset exists for the model. Past attempts at delineating a set of tests for establishing confidence in an entirely expert-elicited model have focused on single types of validity stemming from individual sources of uncertainty within the model. This paper seeks to extend the frameworks proposed by earlier researchers by drawing upon other disciplines where measuring latent variables is also an issue. We demonstrate that even in cases where no data exist at all there is a broad range of validity tests that can be used to establish confidence in the validity of a Bayesian Belief Network.
Abstract:
We developed and validated a new method to create automated 3D parametric surface models of the lateral ventricles, designed for monitoring degenerative disease effects in clinical neuroscience studies and drug trials. First we used a set of parameterized surfaces to represent the ventricles in a manually labeled set of 9 subjects' MRIs (atlases). We fluidly registered each of these atlases and mesh models to a set of MRIs from 12 Alzheimer's disease (AD) patients and 14 matched healthy elderly subjects, and we averaged the resulting meshes for each of these images. Validation experiments on expert segmentations showed that (1) the Hausdorff labeling error rapidly decreased, and (2) the power to detect disease-related alterations monotonically improved as the number of atlases, N, was increased from 1 to 9. We then combined the segmentations with a radial mapping approach to localize ventricular shape differences in patients. In surface-based statistical maps, we detected more widespread and intense anatomical deficits as we increased the number of atlases, and we formulated a statistical stopping criterion to determine the optimal value of N. Anterior horn anomalies in Alzheimer's patients were only detected with the multi-atlas segmentation, which clearly outperformed the standard single-atlas approach.
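The Hausdorff labeling error mentioned in this abstract can be illustrated with a minimal sketch. It assumes each surface is reduced to a small list of 3D points; the point sets and function names below are hypothetical and are not taken from the study.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical surface points (illustrative only, not data from the paper).
automated = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
expert = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 3.0, 0.0)]

print(hausdorff(automated, expert))
```

The measure is driven by the single worst-matched point, so one outlying vertex in either surface dominates the score.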
Abstract:
A national-level safety analysis tool is needed to complement existing analytical tools for assessment of the safety impacts of roadway design alternatives. FHWA has sponsored the development of the Interactive Highway Safety Design Model (IHSDM), which is roadway design and redesign software that estimates the safety effects of alternative designs. Considering the importance of IHSDM in shaping the future of safety-related transportation investment decisions, FHWA justifiably sponsored research with the sole intent of independently validating some of the statistical models and algorithms in IHSDM. Statistical model validation aims to accomplish many important tasks, including (a) assessment of the logical defensibility of proposed models, (b) assessment of the transferability of models over future time periods and across different geographic locations, and (c) identification of areas in which future model improvements should be made. These three activities are reported for five proposed types of rural intersection crash prediction models. The internal validation of the model revealed that the crash models potentially suffer from omitted variables that affect safety, site selection and countermeasure selection bias, poorly measured and surrogate variables, and misspecification of model functional forms. The external validation indicated the inability of models to perform on par with model estimation performance. Recommendations for improving the state of the practice from this research include the systematic conduct of carefully designed before-and-after studies, improvements in data standardization and collection practices, and the development of analytical methods to combine the results of before-and-after studies with cross-sectional studies in a meaningful and useful way.
Abstract:
Design teams are confronted with the quandary of choosing apposite building control systems to suit the needs of particular intelligent building projects, due to the availability of innumerable ‘intelligent’ building products and a dearth of inclusive evaluation tools. This paper develops a model to facilitate selection evaluation of intelligent HVAC control systems for commercial intelligent buildings. To achieve this objective, systematic research activities were conducted to first develop, test and refine the general conceptual model using consecutive surveys; then, to convert the developed conceptual framework into a practical model; and, finally, to evaluate the effectiveness of the model by means of expert validation. The survey results show that ‘total energy use’ is perceived as the top selection criterion, followed by ‘system reliability and stability’, ‘operating and maintenance costs’, and ‘control of indoor humidity and temperature’. This research not only presents a systematic and structured approach to evaluating candidate intelligent HVAC control systems against the critical selection criteria (CSC), but also suggests a benchmark for the selection of one control system candidate against another.
Abstract:
BACKGROUND: Physical symptoms are common in pregnancy and are predominantly associated with normal physiological changes. These symptoms have a social and economic cost, leading to absenteeism from work and additional medical interventions. There is currently no simple method for identifying common pregnancy-related problems in the antenatal period. A validated tool for use by pregnancy care providers would be useful. AIM: The aim of the project was to develop and validate a Pregnancy Symptoms Inventory for use by healthcare professionals (HCPs). METHODS: A list of symptoms was generated via expert consultation with midwives and obstetrician gynaecologists. Focus groups were conducted with pregnant women in their first, second or third trimester. The inventory was then tested for face validity and piloted for readability and comprehension. For test-retest reliability, it was administered to the same women 2 to 3 days apart. Finally, outpatient midwives trialled the inventory for 1 month and rated its usefulness on a 10 cm visual analogue scale (VAS). The number of referrals to other health care professionals was recorded during this month. RESULTS: Expert consultation and focus group discussions led to the generation of a 41-item inventory. Following face validity and readability testing, several items were modified. Individual item test-retest reliability ranged from 0.51 to 1, with the majority (34 items) scoring ≥0.70. During the testing phase, 211 surveys were collected in the 1-month trial. Tiredness (45.5%), poor sleep (27.5%), back pain (19.5%) and nausea (12.6%) were experienced often. Among the women surveyed, 16.2% claimed to sometimes or often be incontinent. Referrals to the incontinence nurse increased more than 8-fold during the study period. The median rating by midwives of the ‘usefulness’ of the inventory was 8.4 (range 0.9 to 10).
CONCLUSIONS: The Pregnancy Symptoms Inventory (PSI) was well accepted by women in the 1-month trial and may be a useful tool for pregnancy care providers, aiding clinicians in the early detection and subsequent treatment of symptoms. It shows promise for use in the research community for assessing the impact of lifestyle intervention in pregnancy.
Abstract:
1. Expert knowledge continues to gain recognition as a valuable source of information in a wide range of research applications. Despite recent advances in defining expert knowledge, comparatively little attention has been given to how to view expertise as a system of interacting contributory factors, and thereby, to quantify an individual’s expertise. 2. We present a systems approach to describing expertise that accounts for many contributing factors and their interrelationships, and allows quantification of an individual’s expertise. A Bayesian network (BN) was chosen for this purpose. For the purpose of illustration, we focused on taxonomic expertise. The model structure was developed in consultation with professional taxonomists. The relative importance of the factors within the network was determined by a second set of senior taxonomists. This second set of experts (i.e. supra-experts) also provided validation of the model structure. Model performance was then assessed by applying the model to hypothetical career states in the discipline of taxonomy. Hypothetical career states were used to incorporate the greatest possible differences in career states and provide an opportunity to test the model against known inputs. 3. The resulting BN model consisted of 18 primary nodes feeding through one to three higher-order nodes before converging on the target node (Taxonomic Expert). There was strong consistency among node weights provided by the supra-experts for some nodes, but not others. The higher-order nodes, “Quality of work” and “Total productivity”, had the greatest weights. Sensitivity analysis indicated that although some factors had stronger influence in the outer nodes of the network, there was relatively equal influence of the factors leading directly into the target node.
Despite differences in the node weights provided by our supra-experts, there was remarkably good agreement among assessments of our hypothetical experts that accurately reflected differences we had built into them. 4. This systems approach provides a novel way of assessing the overall level of expertise of individuals, accounting for multiple contributory factors, and their interactions. Our approach is adaptable to other situations where it is desirable to understand components of expertise.
Abstract:
Background: Physical symptoms are common in pregnancy and are predominantly associated with normal physiological changes. These symptoms have a social and economic cost, leading to absenteeism from work and additional medical interventions. There is currently no simple method for identifying common pregnancy-related problems in the antenatal period. A validated tool for use by pregnancy care providers would be useful. The aim of this study was to develop and validate a Pregnancy Symptoms Inventory for use by health professionals. Methods: A list of symptoms was generated via expert consultation with health professionals. Focus groups were conducted with pregnant women. The inventory was tested for face validity and piloted for readability and comprehension. For test-retest reliability, the tool was administered to the same women 2 to 3 days apart. Finally, midwives trialled the inventory for 1 month and rated its usefulness on a 10 cm visual analogue scale (VAS). Results: A 41-item Likert inventory, assessing how often symptoms occurred and what effect they had, was developed. Individual item test-retest reliability ranged from 0.51 to 1, with the majority (34 items) scoring ≥0.70. The top four “often” reported symptoms were urinary frequency (52.2%), tiredness (45.5%), poor sleep (27.5%) and back pain (19.5%). Among the women surveyed, 16.2% claimed to sometimes or often be incontinent. Referrals to the incontinence nurse increased more than 8-fold during the study period. Conclusions: The PSI provides a comprehensive inventory of pregnancy-related symptoms, with a mechanism for assessing their effect on function. It was robustly developed, with good test-retest reliability, face validity, comprehension and readability. This provides a validated tool for assessing the impact of interventions in pregnancy.
Abstract:
Validation is an important issue in the development and application of Bayesian Belief Network (BBN) models, especially when the outcome of the model cannot be directly observed. Despite this, few frameworks for validating BBNs have been proposed and fewer have been applied to substantive real-world problems. In this paper we adopt the approach by Pitchforth and Mengersen (2013), which includes nine validation tests that each focus on the structure, discretisation, parameterisation and behaviour of the BBNs included in the case study. We describe the process and result of implementing a validation framework on a model of a real airport terminal system with particular reference to its effectiveness in producing a valid model that can be used and understood by operational decision makers. In applying the proposed validation framework we demonstrate the overall validity of the Inbound Passenger Facilitation Model as well as the effectiveness of the validity framework itself.
Abstract:
Background: Early feeding practices lay the foundation for children’s eating habits and weight gain. Questionnaires are available to assess parental feeding, but overlapping and inconsistent items, subscales and terminology limit conceptual clarity and between-study comparisons. Our aim was to consolidate a range of existing items into a parsimonious and conceptually robust questionnaire for assessing feeding practices with very young children (<3 years). Methods: Data were from 462 mothers and children (age 21–27 months) from the NOURISH trial. Items from five questionnaires and two study-specific items were submitted to a priori item selection, allocation and verification, before theoretically derived factors were tested using Confirmatory Factor Analysis. Construct validity of the new factors was examined by correlating these with child eating behaviours and weight. Results: Following expert review, 10 factors were specified. Of these, 9 factors (40 items) showed acceptable model fit and internal reliability (Cronbach’s α: 0.61-0.89). Four factors reflected non-responsive feeding practices: ‘Distrust in Appetite’, ‘Reward for Behaviour’, ‘Reward for Eating’, and ‘Persuasive Feeding’. Five factors reflected structure of the meal environment and limits: ‘Structured Meal Setting’, ‘Structured Meal Timing’, ‘Family Meal Setting’, ‘Overt Restriction’ and ‘Covert Restriction’. Feeding practices generally showed the expected pattern of associations with child eating behaviours but none with weight. Conclusion: The Feeding Practices and Structure Questionnaire (FPSQ) provides a new reliable and valid measure of parental feeding practices, specifically maternal responsiveness to children’s hunger/satiety signals facilitated by routine and structure in feeding. Further validation in more diverse samples is required.
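Internal reliability figures such as the Cronbach's α range reported here follow from the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores). A minimal sketch, assuming a small table of respondent-by-item scores; the function name and the data are hypothetical and are not drawn from the NOURISH trial.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for `scores`: one row per respondent, one column per item."""
    k = len(scores[0])                          # number of items in the scale
    items = list(zip(*scores))                  # transpose to per-item columns
    item_var = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var / total_var)


# Hypothetical 5-point responses from four respondents to three items.
responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
]

print(round(cronbach_alpha(responses), 3))
```

The sketch uses sample variances (n − 1 denominator) throughout; alpha is unchanged if population variances are used instead, provided the same choice is applied to items and totals.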
Abstract:
Background: The purpose of this study was the development of a valid and reliable “Mechanical and Inflammatory Low Back Pain Index” (MIL) for assessment of non-specific low back pain (NSLBP). This 7-item tool assists practitioners in determining whether symptoms are predominantly mechanical or inflammatory. Methods: Participants (n = 170, 96 females, age = 38 ± 14 years) with NSLBP were referred to two Spanish physiotherapy clinics and completed the MIL and the following measures: the Roland Morris Questionnaire (RMQ), SF-12 and “Backache Index” (BAI) physical assessment test. For test-retest reliability, 37 consecutive patients were assessed at baseline and three days later during a non-treatment period. Face and content validity, practical characteristics, factor analysis, internal consistency, discriminant validity and convergent validity were assessed from the full sample. Results: A total of 27 potential items that had been identified for inclusion were subsequently reduced to 11 by an expert panel. Four items were then removed due to cross-loading under confirmatory factor analysis, where a two-factor model yielded a good fit to the data (χ2 = 14.80, df = 13, p = 0.37, CFI = 0.98, and RMSEA = 0.029). The internal consistency was moderate (α = 0.68 for MLBP; 0.72 for ILBP), test-retest reliability high (ICC = 0.91; 95% CI = 0.88-0.93) and discriminant validity good for both MLBP (AUC = 0.74) and ILBP (AUC = 0.92). Convergent validity was demonstrated through similar but weak correlations between the ILBP and both the RMQ and BAI (r = 0.34, p < 0.001) and the MLBP and BAI (r = 0.38, p < 0.001). Conclusions: The MIL is a valid and reliable clinical tool for patients with NSLBP that discriminates between mechanical and inflammatory LBP. Keywords: Low back pain; Psychometrics properties; Pain measurement; Screening tool; Inflammatory; Mechanical
Abstract:
Background: Domestic violence against women is a major public health problem and a violation of women’s human rights. Health professionals could play an important role in screening for victims. From the evidence to date, it is unclear whether health professionals do play an active role in identifying victims. Objectives: To develop a reliable and valid instrument to measure health professionals’ attitudes to identifying female victims of domestic violence (DV). Methods: A primary questionnaire was constructed in accordance with established guidelines using the Theory of Planned Behaviour (Ajzen, 1975) to measure health professionals’ attitudes to identifying female victims of DV. An expert panel was used to establish content validity. Focus groups with health professionals (N = 5) from the target population were conducted to confirm face validity. A pilot study (N = 30 nurses and doctors) was undertaken to assess the feasibility and reliability of the questionnaire. The questionnaire was also administered a second time after one week to check the stability of the tests. Results: Feedback from the expert panel and the group discussion confirmed that the questionnaire had content and face validity. Cronbach’s alpha values for all the items were greater than 0.7. Strong correlations between the direct and indirect measures confirmed that the indirect measures were well constructed. High test-retest correlations confirmed that the measures were reliable in the sense of temporal stability. Significance: This tool has the potential to be used by researchers in expanding the knowledge base in this important area.
Abstract:
Background: Expressed emotion (EE) captures the affective quality of the relationship between family caregivers and their care recipients and is known to increase the risk of poor health outcomes for caregiving dyads. Little is known about expressed emotion in the context of caregiving for persons with dementia, especially in non-Western cultures. The Family Attitude Scale (FAS) is a psychometrically sound self-report measure of EE. Its use in the examination of caregiving for patients with dementia has not yet been explored. Objectives: This study was performed to examine the psychometric properties of the Chinese version of the FAS (FAS-C) in Chinese caregivers of relatives with dementia, and its validity in predicting severe depressive symptoms among the caregivers. Methods: The FAS was translated into Chinese using Brislin's model. Two expert panels evaluated the semantic equivalence and content validity of this Chinese version (FAS-C), respectively. A total of 123 Chinese primary caregivers of relatives with dementia were recruited from three elderly community care centers in Hong Kong. The FAS-C was administered with the Chinese versions of the 5-item Mental Health Inventory (MHI-5), the Zarit Burden Interview (ZBI) and the Revised Memory and Behavioral Problem Checklist (RMBPC). Results: The FAS-C had excellent semantic equivalence with the original version and a content validity index of 0.92. Exploratory factor analysis identified a three-factor structure for the FAS-C (hostile acts, criticism and distancing). Cronbach's alpha of the FAS-C was 0.92. Pearson's correlation indicated significant associations between a higher score on the FAS-C and greater caregiver burden (r = 0.66, p < 0.001), poorer mental health of the caregivers (r = −0.65, p < 0.001) and a higher level of dementia-related symptoms (frequency of symptoms: r = 0.45, p < 0.001; symptom disturbance: r = 0.51, p < 0.001), supporting its construct validity.
For detecting severe depressive symptoms of the family caregivers, the receiver operating characteristic (ROC) curve had an area under the curve of 0.78 (95% confidence interval (CI) = 0.69–0.87, p < 0.0001). The optimal cut-off score was >47, with a sensitivity of 0.720 (95% CI = 0.506–0.879) and specificity of 0.742 (95% CI = 0.643–0.826). Conclusions: The FAS-C is a reliable and valid measure to assess the affective quality of the relationship between Chinese caregivers and their relatives with dementia. It also has acceptable predictive ability in identifying family caregivers with severe depressive symptoms.
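How sensitivity and specificity are read off at a cut-off such as >47 can be shown with a short sketch. The scores and labels below are invented for illustration and are not the study's data; the function name is likewise hypothetical.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Classify a score as positive when it is strictly greater than `cutoff`."""
    tp = fn = fp = tn = 0
    for score, condition in zip(scores, has_condition):
        predicted_positive = score > cutoff
        if condition and predicted_positive:
            tp += 1          # true positive
        elif condition:
            fn += 1          # false negative
        elif predicted_positive:
            fp += 1          # false positive
        else:
            tn += 1          # true negative
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scale scores and depression status (illustrative only).
scores = [50, 60, 40, 30, 48, 45]
severe_depression = [True, True, True, False, False, False]

sens, spec = sensitivity_specificity(scores, severe_depression, cutoff=47)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Sweeping the cut-off across all observed scores and plotting sensitivity against 1 − specificity traces out the ROC curve whose area the abstract reports.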