947 results for evaluation methods


Relevance: 70.00%

Abstract:

In the article “Menu Analysis: Review and Evaluation,” Lendal H. Kotschevar, Distinguished Professor in the School of Hospitality Management at Florida International University, opens with this statement: “Various methods are used to evaluate menus. Some have quite different approaches and give different information. Even those using quite similar methods vary in the information they give. The author attempts to describe the most frequently used methods and to indicate their value. A correlation calculation is made to see how well certain of these methods agree in the information they give.” There is more than one way to understand the word menu. One is the set of culinary selections decided upon by the head chef or owner of a restaurant, which ultimately defines the type of restaurant. Another is the physical list of the food that a patron actually holds in his or her hand. These are the two most common senses of the word. The author concentrates primarily on the latter sense and counts the number of each item sold from a menu to measure the popularity of that item. This count, together with a formula, allows Kotschevar to arrive at a specific value per item. Menu analysis can appear a difficult subject to broach: how does one qualify and quantify a menu when the exercise seems so subjective? The author offers methods and outlines for approaching menu analysis from an empirical perspective. “Menus are often examined visually through the evaluation of various factors. It is a subjective method but has the advantage of allowing scrutiny of a wide range of factors which other methods do not,” says Kotschevar. “The method is also highly flexible. Factors can be given a score value and scores summed to give a total for a menu. This allows comparison between menus. If the one making the evaluations knows menu values, it is a good method of judgment,” he adds.
The author stresses that assigning values is fundamental to a pragmatic menu analysis; it is how the reviewer keeps score, so to speak. Value merit provides reliable criteria against which to gauge a particular menu item. Ultimately, menu evaluation provides the mechanism for either keeping or rejecting selected items on a menu. Kotschevar presents at least three matrix evaluation methods: the Miller method, the Smith and Kasavana method, and the Pavesic method. He illustrates each with an example in table format. These are helpful, since the theories behind the tables would be difficult to explain in prose alone. Kotschevar also references analysis methods that are not matrix based; the Hayes and Huffman Goal Value Analysis is one such method. The author sees no one method as better than another and suggests that combining two or more of the methods is beneficial.
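As a rough illustration of the matrix style of evaluation mentioned in this abstract, the sketch below classifies menu items by popularity and contribution margin, in the manner commonly attributed to the Smith and Kasavana approach. The abstract does not reproduce Kotschevar's formulas, so the 70% popularity rule, the field names, and the sample figures are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class MenuItem:
    name: str
    units_sold: int
    price: float
    food_cost: float

    @property
    def margin(self) -> float:
        # Contribution margin per unit sold.
        return self.price - self.food_cost


def classify_menu(items, popularity_factor=0.70):
    """Label each item by popularity (share of units sold) and margin.

    Assumption: an item counts as popular when its sales share reaches
    popularity_factor times an equal share (1 / number of items).
    """
    total_units = sum(item.units_sold for item in items)
    if not items or total_units == 0:
        raise ValueError("need at least one item with sales")
    popularity_cutoff = popularity_factor / len(items)
    # Average margin weighted by units sold across the whole menu.
    avg_margin = sum(i.margin * i.units_sold for i in items) / total_units
    labels = {
        (True, True): "star",
        (True, False): "plowhorse",
        (False, True): "puzzle",
        (False, False): "dog",
    }
    return {
        item.name: labels[
            (item.units_sold / total_units >= popularity_cutoff,
             item.margin >= avg_margin)
        ]
        for item in items
    }


# Hypothetical sales data, invented for this example.
SAMPLE_MENU = [
    MenuItem("Steak", 40, 30.0, 12.0),
    MenuItem("Pasta", 50, 15.0, 5.0),
    MenuItem("Lobster", 5, 45.0, 20.0),
    MenuItem("Soup", 5, 8.0, 4.0),
]
```

Calling `classify_menu(SAMPLE_MENU)` labels each item with one of the four quadrants, which is the kind of keep-or-reject signal the abstract describes.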

Relevance: 70.00%

Abstract:

Since the establishment of the evaluation system in 1975, the junior colleges in the Republic of China (Taiwan) have gone through six formal evaluations. Evaluation in schooling, like quality control in business, should be a systematic, formal, and continual process, and it can serve as a strategy to refine the quality of education. The purpose of this research is to explore the current practice of junior college evaluation in Taiwan, which provides insight into the development and quality of the current evaluation system. The study also identifies the sources of problems with the current evaluation system and offers suggestions for improvement. To attain these purposes, the research was undertaken in both theoretical and practical ways. First, theoretically, a literature review was used to examine the theories of educational evaluation and, following the course and principles of their development, to form a view of the current practice in Taiwan. Second, in practice, questionnaires were used to obtain the views of evaluation committee members, junior college presidents, and administrators on evaluation models, methods, contents, organization, functions, criteria, grade reports, and other matters, together with suggestions for improvement. The findings indicate that most evaluators and evaluatees think evaluation can help the colleges explore their difficulties and problems. In addition, a significant difference was found between the two groups regarding the evaluation methods, contents, organization, functions, criteria, grade reports, and other matters. Analysis of these data forms the basis for an improved method of evaluation for junior colleges in Taiwan.

Relevance: 70.00%

Abstract:

Background: An evaluation was completed on the One-Day Meditech Magic Training Program for Registered Nurses (RNs) and Licensed Practical Nurses (LPNs) developed for the Long Term Care (LTC) Program. Methods: Both a literature review and consultation with stakeholders were completed to determine possible evaluation methods, expected outcomes, and ways to measure the effectiveness of the education program. A pretest/posttest design and questionnaire were chosen as the evaluation tools for this project. Results: No significant difference was found between the pretest and posttest total scores, indicating that learners retained information from the orientation session (Z = -1.820, p = 0.069). Additional Wilcoxon matched-pairs signed rank tests were performed on the individual sections of the tests and revealed a significant decrease in the posttest scores for entering a Diagnostic Imaging requisition (Z = -1.975, p = 0.048). No other significant findings were present. Questionnaires were also analyzed, revealing that most participants were pleased with the Meditech documentation education they received and did not indicate barriers that would affect electronic documentation. Conclusions: Further testing is required to ensure reliability and validity of the evaluation tools, and caution is needed because of the small sample size. However, problematic documentation tasks were identified during the evaluation, and both the training session and support materials will be improved as a result of this project.
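For readers unfamiliar with the Wilcoxon matched-pairs signed rank test reported in this abstract, the following is a minimal sketch of how a Z value and two-sided p value can be obtained with the normal approximation. The abstract does not say which software or variant was used, so this version (tie-corrected variance, no continuity correction, Z based on the positive rank sum) is an assumption, and the function name is hypothetical. The sign of Z depends on which rank sum a given package reports.

```python
import math


def wilcoxon_signed_rank(before, after):
    """Return (z, two_sided_p) for paired samples via the normal approximation."""
    # Zero differences carry no sign information and are dropped.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda k: abs(diffs[k]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    # Sum of ranks belonging to positive differences (after > before).
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(variance)
    # Two-sided p value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p
```

With small samples such as the one the abstract cautions about, an exact test is often preferred over this approximation.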

Relevance: 70.00%

Abstract:

Aim This paper will report findings from the first phase of an evaluation of a new e-health intervention designed to allow mothers to ‘see’ their baby in neonatal care (NNU) when they are not able to be with them. The intervention, MyLittleOne, involves a web-camera being placed over the incubator in NNU, which transmits a real-time video wirelessly to a coupled tablet device at the mother’s bedside. Guided by the MRC Framework for the Development and Evaluation of Healthcare Interventions (MRC, 2008), the aim was to explore parent and professional views of the technology and make recommendations for its future development, use and evaluation. Methods A qualitative approach was adopted, guided by a critical realist perspective (McEvoy and Richards, 2003). The study took place in a Level 3 NNU in Scotland. Participants were recruited purposively and included parents (n = 33) and a range of health professionals working in neonatal and postnatal care (n = 21). The data were collected during semi-structured individual, paired and small group interviews and were analysed thematically using NVivo v10. Results The majority of parents and professionals spoke positively about MyLittleOne. Perceptions were that: use of the technology assisted bonding and responsiveness; it promoted the recovery process following birth; and, for mothers who wished to breast-feed, being able to see their baby on the tablet device encouraged the ‘let-down’ reflex. An additional benefit was that siblings and others who may not be able to visit the NNU were able to see the baby. In contrast, for a small number of mothers, viewing their baby remotely appeared to increase their levels of anxiety. Switching off the camera during a medical procedure and back on after the procedure was completed was found to be problematic, at times and in different ways, for both parents and professionals. 
Conclusions Findings from this preliminary evaluation will guide future developments of the technology, including its use in family homes following the mother’s discharge. The findings will also inform the design of a feasibility study and subsequent RCT to assess the impact of MyLittleOne on a range of psychological indicators of postnatal adjustment.

Relevance: 70.00%

Abstract:

This thesis investigates how web search evaluation can be improved using historical interaction data. Modern search engines combine offline and online evaluation approaches in a sequence of steps that a tested change needs to pass through to be accepted as an improvement and subsequently deployed. We refer to such a sequence of steps as an evaluation pipeline. In this thesis, we consider the evaluation pipeline to contain three sequential steps: an offline evaluation step, an online evaluation scheduling step, and an online evaluation step. In this thesis we show that historical user interaction data can aid in improving the accuracy or efficiency of each of the steps of the web search evaluation pipeline. As a result of these improvements, the overall efficiency of the entire evaluation pipeline is increased. Firstly, we investigate how user interaction data can be used to build accurate offline evaluation methods for query auto-completion mechanisms. We propose a family of offline evaluation metrics for query auto-completion that represents the effort the user has to spend in order to submit their query. The parameters of our proposed metrics are trained against a set of user interactions recorded in the search engine’s query logs. From our experimental study, we observe that our proposed metrics are significantly more correlated with an online user satisfaction indicator than the metrics proposed in the existing literature. Hence, fewer changes will pass the offline evaluation step to be rejected after the online evaluation step. As a result, this would allow us to achieve a higher efficiency of the entire evaluation pipeline. Secondly, we state the problem of the optimised scheduling of online experiments. We tackle this problem by considering a greedy scheduler that prioritises the evaluation queue according to the predicted likelihood of success of a particular experiment. 
This predictor is trained on a set of online experiments, and uses a diverse set of features to represent an online experiment. Our study demonstrates that a higher number of successful experiments per unit of time can be achieved by deploying such a scheduler on the second step of the evaluation pipeline. Consequently, we argue that the efficiency of the evaluation pipeline can be increased. Next, to improve the efficiency of the online evaluation step, we propose the Generalised Team Draft interleaving framework. Generalised Team Draft considers both the interleaving policy (how often a particular combination of results is shown) and click scoring (how important each click is) as parameters in a data-driven optimisation of the interleaving sensitivity. Further, Generalised Team Draft is applicable beyond domains with a list-based representation of results, i.e. in domains with a grid-based representation, such as image search. Our study using datasets of interleaving experiments performed both in document and image search domains demonstrates that Generalised Team Draft achieves the highest sensitivity. A higher sensitivity indicates that the interleaving experiments can be deployed for a shorter period of time or use a smaller sample of users. Importantly, Generalised Team Draft optimises the interleaving parameters w.r.t. historical interaction data recorded in the interleaving experiments. Finally, we propose to apply the sequential testing methods to reduce the mean deployment time for the interleaving experiments. We adapt two sequential tests for the interleaving experimentation. We demonstrate that one can achieve a significant decrease in experiment duration by using such sequential testing methods. The highest efficiency is achieved by the sequential tests that adjust their stopping thresholds using historical interaction data recorded in diagnostic experiments. 
Our further experimental study demonstrates that cumulative gains in the online experimentation efficiency can be achieved by combining the interleaving sensitivity optimisation approaches, including Generalised Team Draft, and the sequential testing approaches. Overall, the central contributions of this thesis are the proposed approaches to improve the accuracy or efficiency of the steps of the evaluation pipeline: the offline evaluation frameworks for the query auto-completion, an approach for the optimised scheduling of online experiments, a general framework for the efficient online interleaving evaluation, and a sequential testing approach for the online search evaluation. The experiments in this thesis are based on massive real-life datasets obtained from Yandex, a leading commercial search engine. These experiments demonstrate the potential of the proposed approaches to improve the efficiency of the evaluation pipeline.
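To make the interleaving idea in this abstract more concrete, here is a minimal sketch of classic Team Draft interleaving, the baseline that Generalised Team Draft extends. It is not the thesis's generalised framework: the uniform coin flip and one-credit-per-click scoring below are precisely the policy and scoring choices that the thesis treats as tunable parameters. Function names and document identifiers are hypothetical.

```python
import random


def team_draft_interleave(ranking_a, ranking_b, length, rng=None):
    """Merge two rankings; return the merged list and the team of each slot."""
    rng = rng or random.Random()
    rankings = {"A": ranking_a, "B": ranking_b}
    counts = {"A": 0, "B": 0}
    interleaved, teams, seen = [], [], set()

    def next_unseen(ranking):
        # Highest-ranked document not yet placed in the merged list.
        return next((doc for doc in ranking if doc not in seen), None)

    while len(interleaved) < length:
        # The team with fewer picks goes next; a coin flip breaks ties.
        a_first = counts["A"] < counts["B"] or (
            counts["A"] == counts["B"] and rng.random() < 0.5
        )
        for team in (("A", "B") if a_first else ("B", "A")):
            doc = next_unseen(rankings[team])
            if doc is not None:
                break
        else:
            break  # both rankings are exhausted
        interleaved.append(doc)
        teams.append(team)
        seen.add(doc)
        counts[team] += 1
    return interleaved, teams


def credit_clicks(teams, clicked_positions):
    """Give one credit per click to the team that contributed the clicked slot."""
    credit = {"A": 0, "B": 0}
    for pos in clicked_positions:
        credit[teams[pos]] += 1
    return credit
```

Over many impressions, the ranker whose team collects more click credit is preferred; sensitivity is then a matter of how few impressions are needed to reach a reliable preference.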

Relevance: 70.00%

Abstract:

Research in human computer interaction (HCI) covers both technological and human behavioural concerns. As a consequence, the contributions made in HCI research tend to be oriented toward either engineering or the social sciences. In HCI the purpose of practical research contributions is to reveal unknown insights about human behaviour and its relationship to technology. Practical research methods normally used in HCI include formal experiments, field experiments, field studies, interviews, focus groups, surveys, usability tests, case studies, diary studies, ethnography, contextual inquiry, experience sampling, and automated data collection. In this paper, we report on our experience using three of these evaluation methods (focus groups, surveys and interviews) and on how we adopted them to develop artefacts, either interface designs or information and technological systems. Four projects serve as examples of how the different methods were applied to gather information about users’ wants, habits, practices, concerns and preferences. The goal was to build an understanding of the attitudes and satisfaction of the people who might interact with a technological artefact or information system. In turn, we intended to design information systems and technological applications that promote resilience in organisations (a set of routines that allow recovery from obstacles) and support users’ experiences. Organisations can also be viewed here within a systems approach, which means that system perturbations, and even failures, can be characterised and improved upon. The term resilience has been applied to everything from real estate to the economy, sports, events, business, psychology, and more.
In this study, we highlight that resilience is also made up of a number of different skills and abilities (self-awareness, creating meaning from other experiences, self-efficacy, optimism, and building strong relationships). These are a few of the foundational ingredients that people should draw on in the process of enhancing an organisation’s resilience. Resilience enhances knowledge of the resources available to people confronting existing problems.

Relevance: 70.00%

Abstract:

Issue addressed: Our Watch led a complex 12-month evaluation of a whole school approach to Respectful Relationships Education (RRE) implemented in 19 schools. RRE is an emerging field aimed at preventing gender-based violence. This paper will illustrate how from an implementation science perspective, the evaluation was a critical element in the change process at both a school and policy level. Methods: Using several conceptual approaches from systems science, the evaluation sought to examine how the multiple systems layers – student, teacher, school, community and government – interacted and influenced each other. A distinguishing feature of the evaluation included ‘feedback loops’; that is, evaluation data was provided to participants as it became available. Evaluation tools included a combination of standardised surveys (with pre- and post-intervention data provided to schools via individualised reports), reflection tools, regular reflection interviews and summative focus groups. Results: Data was shared during implementation with project staff, department staff and schools to support continuous improvement at these multiple systems levels. In complex settings, implementation can vary according to context; and the impact of evaluation processes, tools and findings differed across the schools. Interviews and focus groups conducted at the end of the project illustrated which of these methods were instrumental in motivating change and engaging stakeholders at both a school and departmental level and why. Conclusion: The evaluation methods were a critical component of the pilot’s approach, helping to shape implementation through data feedback loops and reflective practice for ongoing, responsive and continuous improvement. Future health promotion research on complex interventions needs to examine how the evaluation itself is influencing implementation. So what? 
The pilot has demonstrated that the evaluation, including feedback loops to inform project activity, was an asset to implementation. This has implications for other health promotion activities, where evaluation tools could be utilised to enhance, rather than simply measure, an intervention. The findings are relevant to a range of health promotion research activities because they demonstrate the importance of meta-evaluation techniques that seek to understand how the evaluation itself was influencing implementation and outcomes.

Relevance: 60.00%

Abstract:

The Lane Change Test (LCT) is one of the growing number of methods developed to quantify driving performance degradation brought about by the use of in-vehicle devices. Beyond its validity and reliability, for such a test to be of practical use, it must also be sensitive to the varied demands of individual tasks. The current study evaluated the ability of several recent LCT lateral control and event detection parameters to discriminate between visual-manual and cognitive surrogate In-Vehicle Information System tasks with different levels of demand. Twenty-seven participants (mean age 24.4 years) completed a PC version of the LCT while performing visual search and math problem solving tasks. A number of the lateral control metrics were found to be sensitive to task differences, but the event detection metrics were less able to discriminate between tasks. The mean deviation and lane excursion measures were able to distinguish between the visual and cognitive tasks, but were less sensitive to the different levels of task demand. The other LCT metrics examined were less sensitive to task differences. A major factor influencing the sensitivity of at least some of the LCT metrics could be the type of lane change instructions given to participants. The provision of clear and explicit lane change instructions and further refinement of its metrics will be essential for increasing the utility of the LCT as an evaluation tool.
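As a simple illustration of a lateral control measure such as the mean deviation mentioned in this abstract, the sketch below averages the absolute lateral distance between the driven path and a reference path sampled at the same points along the track. The abstract does not define the parameters used in the study, so the sampling scheme, units, and function name here are assumptions.

```python
def mean_deviation(driven_positions, reference_positions):
    """Mean absolute lateral deviation (same units as the inputs, e.g. metres).

    Both sequences are assumed to hold lateral positions sampled at the
    same longitudinal points along the track.
    """
    if not driven_positions or len(driven_positions) != len(reference_positions):
        raise ValueError("paths must be non-empty and of equal length")
    total = sum(abs(d - r) for d, r in zip(driven_positions, reference_positions))
    return total / len(driven_positions)
```

Comparing this value between a baseline drive and drives with a secondary task gives one number per condition, which is the kind of quantity whose sensitivity to task type and demand level the study examines.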

Relevance: 60.00%

Abstract:

Diabetic peripheral neuropathy (DPN) is one of the most debilitating complications of diabetes. DPN is a major cause of foot ulceration and lower limb amputation. Early diagnosis and management is a key factor in reducing morbidity and mortality. Current techniques for clinical assessment of DPN are relatively insensitive for detecting early disease or involve invasive procedures such as skin biopsies. There is a need for less painful, non-invasive and safe evaluation methods. Eye care professionals already play an important role in the management of diabetic retinopathy; however recent studies have indicated that the eye may also be an important site for the diagnosis and monitoring of neuropathy. Corneal nerve morphology has been shown to be a promising marker of diabetic neuropathy occurring elsewhere in the body, and emerging evidence tentatively suggests that retinal anatomical markers and a range of functional visual indicators could similarly provide useful information regarding neural damage in diabetes – although this line of research is, as yet, less well established. This review outlines the growing body of evidence supporting a potential diagnostic role for retinal structure and visual functional markers in the diagnosis and monitoring of peripheral neuropathy in diabetes.

Relevance: 60.00%

Abstract:

This paper describes an evaluation approach for benchmarking the effectiveness of cross-lingual link discovery (CLLD). Cross-lingual link discovery is a way of automatically finding prospective links between documents in different languages, which is particularly helpful for knowledge discovery across different language domains. A CLLD evaluation framework is proposed for system performance benchmarking. The framework includes standard document collections, evaluation metrics, and link assessment and evaluation tools. The evaluation methods described in this paper were used to quantify system performance at the NTCIR-9 Crosslink task. It is shown that using manual assessment to generate the gold standard can deliver a more reliable evaluation result.
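The abstract does not name the metrics in the framework, but benchmarking against a gold standard typically involves set-based measures such as precision and recall. The sketch below shows that general idea for predicted links; representing a link as an (anchor, target) pair and the function name are assumptions for illustration, not details taken from the paper.

```python
def link_precision_recall(predicted_links, gold_links):
    """Return (precision, recall, f1) for predicted links against a gold standard."""
    predicted, gold = set(predicted_links), set(gold_links)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

The choice of gold standard matters here: the same predictions score differently against automatically derived links and manually assessed ones, which is the comparison the abstract's final sentence refers to.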

Relevance: 60.00%

Abstract:

PURPOSE: The aim of this pilot project was to trial a tool and process for developing students’ ability to engage in self-assessment using reflection on their clinical experiences, including feedback from workplace learning, in order to help them link theory to practice and develop strategies to improve performance. BACKGROUND: In nursing education, students can experience a mismatch between their performance and their theoretical learning; this is referred to as the ‘theory practice gap’ (Scully 2011, Chan Chan & Liu 2011). One specific contributing factor seems to be students’ inability to engage in meaningful reflection and self-correcting behaviours. A self-assessment strategy was implemented within a third year clinical unit to ameliorate this mismatch, with encouraging results, as students developed self-direction in addressing learning needs. In this pilot project the above strategy was adapted for implementation between different clinical units, to create a whole of course approach to integrating workplace learning. METHOD: The methodology underpinning this project is a scaffolded, supported reflective practice process. Improved self-assessment skills are achieved by students reflecting on and engaging with feedback, then mapping this to learning outcomes to identify where performance can be improved. Evaluation of this project includes: collation of student feedback identifying successful strategies along with barriers encountered in implementation; feedback from students and teachers via the above processes and tools; and comparison of the number of learning contracts issued in clinical nursing units with similar cohorts. RESULTS: Results will be complete by May 2012 and include analysis of the data collected via the above evaluation methods. Other outcomes will include the refined process and tool, plus resources that should improve cost effectiveness without reducing student support.
CONCLUSION: Implementing these tools and processes across the student’s entire learning package will assist students to demonstrate progressive development through the course. Students will have learnt to understand feedback and integrate these skills for life-long learning.

Relevance: 60.00%

Abstract:

The INEX 2011 Relevance Feedback track offered a refined approach to the evaluation of Focused Relevance Feedback algorithms through simulated exhaustive user feedback. The track was run in largely identical fashion to the Relevance Feedback track in INEX 2010[2]: we simulated a user-in-the-loop by re-using the assessments of ad-hoc retrieval obtained from real users who assessed focused ad-hoc retrieval submissions. We present the evaluation methodology, its implementation, and experimental results obtained for four submissions from two participating organisations. As the task and evaluation methods did not change between INEX 2010 and now, explanations of these details from the INEX 2010 version of the track have been repeated verbatim where appropriate.

Relevance: 60.00%

Abstract:

Nowadays people heavily rely on the Internet for information and knowledge. Wikipedia is an online multilingual encyclopaedia that contains a very large number of detailed articles covering most written languages. It is often considered to be a treasury of human knowledge. It includes extensive hypertext links between documents of the same language for easy navigation. However, the pages in different languages are rarely cross-linked except for direct equivalent pages on the same subject in different languages. This could pose serious difficulties to users seeking information or knowledge from different lingual sources, or where there is no equivalent page in one language or another. In this thesis, a new information retrieval task—cross-lingual link discovery (CLLD) is proposed to tackle the problem of the lack of cross-lingual anchored links in a knowledge base such as Wikipedia. In contrast to traditional information retrieval tasks, cross language link discovery algorithms actively recommend a set of meaningful anchors in a source document and establish links to documents in an alternative language. In other words, cross-lingual link discovery is a way of automatically finding hypertext links between documents in different languages, which is particularly helpful for knowledge discovery in different language domains. This study is specifically focused on Chinese / English link discovery (C/ELD). Chinese / English link discovery is a special case of cross-lingual link discovery task. It involves tasks including natural language processing (NLP), cross-lingual information retrieval (CLIR) and cross-lingual link discovery. To justify the effectiveness of CLLD, a standard evaluation framework is also proposed. The evaluation framework includes topics, document collections, a gold standard dataset, evaluation metrics, and toolkits for run pooling, link assessment and system evaluation. 
With the evaluation framework, the performance of CLLD approaches and systems can be quantified. This thesis contributes to the research on natural language processing and cross-lingual information retrieval in CLLD: 1) a new, simple but effective Chinese segmentation method, n-gram mutual information, is presented for determining the boundaries of Chinese text; 2) a voting mechanism for named entity translation is demonstrated for achieving a high precision of English / Chinese machine translation; 3) a link mining approach that mines the existing link structure for anchor probabilities achieves encouraging results in suggesting cross-lingual Chinese / English links in Wikipedia. This approach was examined in the experiments for better, automatic generation of cross-lingual links that were carried out as part of the study. The overall major contribution of this thesis is the provision of a standard evaluation framework for cross-lingual link discovery research. It is important in CLLD evaluation to have this framework, which helps in benchmarking the performance of various CLLD systems and in identifying good CLLD realisation approaches. The evaluation methods and the evaluation framework described in this thesis have been utilised to quantify the system performance in the NTCIR-9 Crosslink task, which is the first information retrieval track of this kind.
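To illustrate what mining the existing link structure for anchor probabilities can look like, the sketch below estimates, for each candidate anchor text, how often it is linked at all and which target it most often points to. This is a generic formulation rather than the thesis's exact method; the input layout and all names are assumptions.

```python
from collections import Counter, defaultdict


def anchor_statistics(link_occurrences, text_occurrences):
    """Estimate link probability and target probabilities per anchor text.

    link_occurrences: iterable of (anchor_text, target) pairs from existing links.
    text_occurrences: dict of anchor_text -> total occurrences, linked or not.
    """
    targets_by_anchor = defaultdict(Counter)
    for anchor, target in link_occurrences:
        targets_by_anchor[anchor][target] += 1

    stats = {}
    for anchor, target_counts in targets_by_anchor.items():
        linked = sum(target_counts.values())
        total = text_occurrences.get(anchor, 0)
        if total < linked:
            continue  # inconsistent counts; skip rather than guess
        stats[anchor] = {
            # How often this text is a link when it appears at all.
            "link_probability": linked / total,
            # Distribution over targets, given that the text is linked.
            "target_probability": {
                target: count / linked for target, count in target_counts.items()
            },
        }
    return stats
```

A link recommender can then rank candidate anchors in a source document by link probability and choose the most probable target for each, before mapping that target to its counterpart in the other language.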

Relevance: 60.00%

Abstract:

Organisations are engaging in e-learning as a mechanism for delivering flexible learning to meet the needs of individuals and organisations. In light of the increasing use of and organisational investment in e-learning, the need for methods to evaluate the success of its design and implementation seems more important than ever. To date, developing a standard for the evaluation of e-learning appears to have eluded both academics and practitioners. The currently accepted evaluation methods for e-learning are traditional learning and development models, such as Kirkpatrick’s model (1976). Due to the technical nature of e-learning it is important to broaden the scope and consider other evaluation models or techniques, such as the DeLone and McLean Information Systems Success Model, that may be applicable to the e-learning domain. Research into the use of e-learning courses has largely avoided considering the applicability of information systems research. Given this observation, it is reasonable to conclude that e-learning implementation decisions and practice could be overlooking useful or additional viewpoints. This research investigated how existing evaluation models apply in the context of organisational e-learning, and resulted in an Organisational E-learning Success Framework, which identifies the critical elements for success in an e-learning environment. In particular this thesis highlights the critical importance of three e-learning system creation elements: system quality, information quality, and support quality. These elements were explored in depth and the nature of each element is described in detail. In addition, two further elements were identified as factors integral to the success of an e-learning system: learner preferences and change management. Overall, this research has demonstrated the need for a holistic approach to e-learning evaluation.
Furthermore, it has shown that both traditional training evaluation approaches and the D&M IS Success Model are appropriate to the organisational e-learning context, and when combined can provide this holistic approach. Practically, this thesis has reported the need for organisations to consider evaluation at all stages of e-learning from design through to implementation.

Relevance: 60.00%

Abstract:

Achieving sustainable urban development is identified as one ultimate goal of many contemporary planning endeavours and has become central to the formulation of urban planning policies. Within this concept, land-use and transport integration is highlighted as one of the most important and attainable policy objectives. In many cities, integration is embraced as an integral part of local development plans, and a number of key integration principles are identified. However, the lack of available evaluation methods to measure the extent of urban sustainability prevents successful implementation of these principles. This paper introduces a new indicator-based spatial composite indexing model developed to measure the sustainability performance of urban settings by taking into account land-use and transport integration principles. Model indicators are chosen via a thorough selection process in line with key principles of land-use and transport integration. These indicators are grouped into categories and themes according to their topical relevance and are then aggregated to form a spatial composite index that portrays an overview of the sustainability performance of the pilot study area used for model demonstration. The study results revealed that the model is a practical instrument for evaluating the success of local integration policies and visualizing the sustainability performance of built environments, and that it is useful both in identifying problematic areas and in formulating policy interventions.
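The aggregation step described in this abstract can be pictured with a small sketch: indicators are rescaled to a common range and combined into one score per spatial unit. The abstract does not state the normalisation or weighting scheme used in the model, so min-max scaling, the weighted average, the indicator names, and the function name below are all assumptions.

```python
def composite_index(values, weights, lower_is_better=()):
    """Combine indicators into one score in [0, 1] per area.

    values: {area: {indicator: raw value}} for spatial units such as grid cells.
    weights: {indicator: weight}; weights are rescaled to sum to 1.
    lower_is_better: indicators where a smaller raw value is more sustainable.
    """
    total_weight = sum(weights.values())
    scores = {area: 0.0 for area in values}
    for indicator, weight in weights.items():
        raw = {area: values[area][indicator] for area in values}
        lo, hi = min(raw.values()), max(raw.values())
        for area, value in raw.items():
            # Min-max scaling; an indicator with no variation is treated as neutral.
            norm = 0.5 if hi == lo else (value - lo) / (hi - lo)
            if indicator in lower_is_better:
                norm = 1.0 - norm
            scores[area] += weight * norm / total_weight
    return scores
```

Mapping the resulting scores back onto the spatial units gives the kind of overview the abstract describes, with low-scoring areas standing out as candidates for policy intervention.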