825 results for performance assessment
Abstract:
This study focused on the relationship between students’ Advanced Placement (AP) English language performance and their subsequent college success. Targeted students were divided into three groups according to their AP English Language performance. Subsequent college success was measured by students’ first-year college GPA, retention to the second year, and institutional selectivity. The demographic characteristics of the three AP performance groups with regard to gender, ethnicity, and best language spoken are provided. Results indicated that, after controlling for students’ SAT scores as a measure of prior academic performance, AP English Language performance was positively related to all three measures of college success.
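The "controlling for SAT scores" step can be illustrated with a small residualization sketch. All data and coefficients below are synthetic and hypothetical; the study's actual models are not given in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical synthetic data: SAT influences both AP performance and
# first-year GPA; the question is whether AP adds signal beyond SAT.
n = 500
sat = rng.normal(1100.0, 150.0, n)
ap = 0.004 * sat + rng.normal(0.0, 0.8, n)              # AP performance (synthetic)
gpa = 0.001 * sat + 0.30 * ap + rng.normal(0.0, 0.4, n)  # first-year GPA (synthetic)

def residualize(y, covariate):
    """Remove the linear effect of a covariate via least squares."""
    X = np.column_stack([np.ones_like(covariate), covariate])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

# Raw correlation vs. partial correlation controlling for SAT
r_raw = np.corrcoef(ap, gpa)[0, 1]
r_partial = np.corrcoef(residualize(ap, sat), residualize(gpa, sat))[0, 1]
print(round(r_raw, 3), round(r_partial, 3))
```

A positive partial correlation after residualizing on SAT is the analogue of the abstract's finding that AP performance relates to college success beyond prior academic performance.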
Abstract:
In this academic library case study, the concept of a multidimensional approach to organizational assessment focuses on one of Kaplan and Norton’s four Balanced Scorecard dimensions, “Learning and Growth”, as its measurement target.
Abstract:
Background. Previous studies show emergency rooms to be overcrowded nationwide. With growing attention to this problem, the Houston-Galveston Area Council (H-GAC) initiated a study in 2005 to assess the region's emergency health care system, and continued this effort in 2007. The purpose of this study was to examine recent changes in volume, capacity and performance in the Houston-Galveston region's emergency health care system and determine whether the system has been able to respond effectively to residents' demands. Methods. Data were collected by the Houston-Galveston Area Council and The Abaris Group using a self-administered 2002-2006 survey completed by administrators of the region's hospitals, EMS providers, and select fire departments that provide EMS services. Data from both studies were combined and matched to examine trends. Results. Volume increased among the reporting hospitals within the Houston-Galveston region from 2002 to 2006; however, capacity remained relatively unchanged. EMS providers reported higher average offload times in 2007 compared with 2005, but the increases were not statistically significant. Hospitals reported transferring a statistically significantly greater percentage of patients in 2006 than in 2004. There was no statistically significant change in any of the other measures. Conclusion. These findings indicate an increase in demand for the Houston-Galveston region's emergency healthcare services with no change in supply. Additional studies within the area are needed to fully assess and evaluate the impact of these changes on system performance.
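The comparison of transfer percentages between years is the kind of question a two-proportion z-test answers. A minimal sketch, with made-up counts rather than the study's data:

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Two-sided z-test for a difference in proportions (pooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal CDF (via the error function)
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical counts: patients transferred out of total visits in two years
z, p = two_proportion_z(480, 4000, 400, 4000)
print(round(z, 2), round(p, 4))
```

Here a 12% vs 10% transfer rate on 4,000 visits per year yields a small p-value, i.e. a statistically significant difference of the sort the abstract reports for 2006 vs 2004.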
Abstract:
The United States Air Force School of Aerospace Medicine (USAFSAM) and Aeromedical Consult Service (ACS) have developed waiver criteria for pilots with subtly substandard depth perception. This allows United States Air Force (USAF) pilots with mild depth perception deficiency to continue flying duties while limiting the risk to flight safety and ensuring the availability of costly human resources. From 1999 to 2005, 166 aviators were given waivers for intermittent monofixation syndrome (IMFS). Of these, 96 were student pilots who performed slightly worse at stereoptic-dependent flight maneuvers than the 8,907 student pilots with normal depth perception (Lowry, 2006). This study's purpose is to evaluate the performance of the extended-trail maneuver, a non-stereoptic-dependent flying maneuver, as executed by a cohort of 12 United States Air Force student pilots with intermittent monofixation syndrome versus a cohort of 100 student pilots with normal depth perception. These subjects are extracted from the cohorts examined by Lowry (2006), and the null hypothesis predicts no statistical difference in performance of the non-stereoptic-dependent extended-trail flight maneuver between student pilots with intermittent monofixation syndrome and those without the condition.
Abstract:
Introduction. Selectively manned units have a long international history, both military and civilian. Examples include SWAT teams, firefighters, the FBI, the DEA, the CIA, and military Special Operations. These special duty operators are individuals who perform a highly skilled and dangerous job in a unique environment. A significant amount of money is spent by the Department of Defense (DoD) and other federal agencies to recruit, select, train, equip and support these operators. When a critical incident or significant life event occurs that jeopardizes an operator's performance, there can be heavy losses in terms of training, time, money, and potentially lives. In order to limit the number of critical incidents, selection processes have been developed over time to “select out” those individuals most likely to perform below desired standards under pressure or stress and to “select in” those with the “right stuff”. This study is part of a larger program evaluation to assess markers that identify whether a person will fail under the stresses of a selectively manned unit. The primary question of the study is whether there are indicators in the selection process that signify potential negative performance at a later date. Methods. The population studied included applicants to a selectively manned DoD organization between 1993 and 2001 as part of a unit assessment and selection process (A&S). Approximately 1900 A&S records were included in the analysis. Over this nine-year period, seventy-two individuals were determined to have had a critical incident. A critical incident can come in the form of problems with the law; personal, behavioral or family problems; integrity issues; or skills deficits. Of the seventy-two individuals, fifty-four had full assessment data and subsequent supervisor performance ratings assessing how the individual performed on the job.
This group was compared across a variety of variables, including demographics and psychometric testing, with a group of 178 individuals who did not have a critical incident and had been determined to be good performers with positive ratings by their supervisors. Results. In approximately 2004, an online pre-screen survey was developed in the hope of screening out individuals with items that would potentially make them ineligible for selection to this organization. This survey has helped the organization increase its selection rates and save resources in the process (Patterson, Howard Smith, & Fisher, Unit Assessment and Selection Project, 2008). When the same pre-screen was applied to the critical incident individuals, over 60% of them would have been flagged as unacceptable. This would have saved the organization valuable resources and heartache. There were some subtle demographic differences between the two groups (e.g. those with critical incidents were almost twice as likely to be divorced compared with the positive performers). On comparison of the psychometric testing, several differences were noted. The two groups were similar when their IQ levels were compared using the Multidimensional Aptitude Battery (MAB). On the Minnesota Multiphasic Personality Inventory (MMPI), there appeared to be a difference on the Social Introversion scale, with the critical incident group scoring somewhat higher. On analysis, the number of MMPI Critical Items was similar between the two groups. When scores on the NEO Personality Inventory (NEO) were compared, the critical incident individuals tended to score higher on Openness and its subscales (Ideas, Actions, and Feelings). There was a positive correlation between the Total Neuroticism T score and the number of MMPI critical items. Conclusions.
This study shows that the current pre-screening process is working and would have saved the organization significant resources. A profile of a candidate who could potentially suffer a critical incident, and thereby jeopardize the unit, the mission and the safety of the public, would look like the following: either divorced or never married; scores high on MMPI Social Introversion; scores low on the MMPI while having an "excessive" number of MMPI critical items; and scores high on NEO Openness and its Ideas, Feelings, and Actions subscales. Based on the results gleaned from the analysis in this study, there seem to be several factors within psychometric testing that, taken together, will aid evaluators in selecting only the highest quality operators, saving resources and helping protect the public from critical incidents that may adversely affect health and safety.
Abstract:
This doctoral thesis, entitled Contribution to the analysis, design and assessment of compact antenna test ranges at millimeter wavelengths, aims to deepen the knowledge of a particular antenna measurement system: the compact range operating in the millimeter wavelength frequency bands. The thesis was developed at the Radiation Group (GR), an antenna laboratory belonging to the Signals, Systems and Radiocommunications department (SSR) of the Technical University of Madrid (UPM). The Radiation Group has extensive experience in antenna measurements and currently runs four facilities in different configurations: a Gregorian compact antenna test range, a spherical near field system, a planar near field system and a semi-anechoic arch system. The research work carried out for this thesis contributes to the knowledge of the first of these measurement configurations at higher frequencies, beyond the microwave region where the Radiation Group already offers customer-level performance. To reach this goal, a set of scientific tasks was carried out sequentially; these are succinctly described in the following paragraphs. The first step was a review of the state of the art: the scientific literature on measurement practices in compact antenna test ranges was studied together with the particularities of millimeter wavelength technologies. Where these measurement facilities are concerned, the joint study of both fields of knowledge converged on a series of technological challenges that become serious bottlenecks at different stages: analysis, design and assessment. After this overview, focus was set on electromagnetic analysis algorithms. These formulations make it possible to study electromagnetic features of interest, such as the phase of the field distribution or the stray signals of particular structures, when these interact with sources of electromagnetic waves.
When properly operated, a CATR facility features collimating optics for electromagnetic waves that are large in terms of wavelengths. Accordingly, the electromagnetic analysis tasks introduce a large number of mathematical unknowns that grows with frequency, following polynomial laws of different order depending on the algorithm used. The optics configuration of interest here is the reflection-type serrated-edge collimator. The analysis of these devices requires flexible handling of almost arbitrary scattering geometries, and this flexibility becomes the core of an algorithm's ability to support the subsequent design tasks. This thesis' contribution to this field consisted of reaching a formulation that is powerful both in dealing with various analysis geometries and in computational terms. Two algorithms were developed; while based on the same hybridization principle, they reach different orders of physical accuracy at different computational costs. Their CATR design capabilities were inter-compared, reaching both qualitative and quantitative conclusions on their scope. In third place, interest shifted from the analysis and design tasks towards range assessment. Millimeter wavelengths imply strict mechanical tolerances and fine setup adjustment. In addition, the large number of unknowns already faced in the analysis stage appears again in the in-chamber field probing stage. The naturally lower dynamic range available from semiconductor millimeter wave sources also requires longer integration times at each probing point. These peculiarities sharply increase the difficulty of performing assessment processes in CATR facilities beyond microwaves. The bottleneck becomes so tight that it compromises range characterization beyond a certain limit frequency, which typically lies in the lowest segment of the millimeter wavelength frequencies.
The value of range assessment, however, moves in the opposite direction, towards the highest segment. This thesis contributes to this technological scenario by developing quiet zone probing techniques that achieve substantial data reduction ratios. Collaterally, they increase the robustness of the results to noise, which amounts to a virtual increase of the setup's available dynamic range. In fourth place, the environmental sensitivity of millimeter wavelengths was addressed. The drift of electromagnetic experiments caused by the dependence of the results on the surrounding environment is well known. At millimeter wavelengths, this feature relegates many industrial practices that are routine at microwave frequencies to the experimental stage. In particular, the evolution of the atmosphere, even within acceptable conditioning bounds, results in drift phenomena that completely mask the experimental results. The contribution of this thesis on this aspect consists of electrically modelling the indoor atmosphere of a CATR as a function of the environmental variables that affect the range's performance. A simple model was developed that is able to handle high-level phenomena, such as feed-probe phase drift, as a function of low-level magnitudes that are easy to sample: relative humidity and temperature. With this model, environmental compensation can be performed and chamber conditioning is automatically extended towards higher frequencies. In summary, the purpose of this thesis is to advance the knowledge of compact antenna test ranges at millimeter wavelengths. This knowledge is delivered through the sequential stages of a CATR's conception, from early low-level electromagnetic analysis to the assessment of an operative facility, stages at each of which bottleneck phenomena currently exist that seriously compromise antenna measurement practices at millimeter wavelengths.
Abstract:
Ozone (O3) phytotoxicity has been reported on a wide range of crops and wild Central European plant species; however, no information has been provided regarding the sensitivity of plant species from dehesa Mediterranean therophytic grasslands, in spite of their great plant species richness and the high O3 levels recorded in this area. A study was carried out in open-top chambers (OTCs) to assess the effects of O3 and competition on the reproductive ability of three clover species: Trifolium cherleri, Trifolium subterraneum and Trifolium striatum. A phytometer approach was followed: plants of these species were grown in mesocosms composed of monocultures of four plants of each species, of three plants of each species competing against a Briza maxima individual, or of a single plant of each clover species competing with three B. maxima plants. Three O3 treatments were adopted: charcoal-filtered air (CFA), non-filtered air (NFA) and non-filtered air supplemented with 40 nl l−1 of O3 (NFA+). The different mesocosms were exposed to the O3 treatments for 45 days and then remained in the open. Ozone exposure caused reductions in the flower biomass of the three clover species assessed. In the case of T. cherleri and T. subterraneum this effect was found following their exposure to the different O3 treatments during their vegetative period; an attenuation of these effects was found once the plants remained in the open. Ozone-induced detrimental effects on the seed output of T. striatum were also observed. The flower biomass of the clover plants grown in monocultures was greater than when competing with one or three B. maxima individuals. An increased flower biomass was found in the CFA monoculture mesocosms of T. cherleri when compared with the remaining mesocosms, once the plants had been exposed in the open for 60 days. The implications of these effects for the performance of dehesa acid grasslands and for the definition of O3 critical levels are discussed.
Abstract:
Systems of Systems (SoS) present challenging features, and existing tools often prove inadequate for their analysis, especially for heterogeneous networked infrastructures. Most accident scenarios in networked systems cannot be addressed by a simplistic black-or-white (i.e. functioning or failed) approach. Slow deviations from nominal operating conditions may cause degraded behaviours that suddenly end up in unexpected malfunctioning, with large portions of the network affected. In this paper, we present a language for modelling networked SoS. The language makes it possible to represent interdependencies of various natures, e.g. technical, organizational and human. The representation of interdependencies is based on control relationships that exchange physical quantities and related information. The language also makes it possible to identify accident scenarios by representing the propagation of failure events throughout the network. The results can be used to assess the effectiveness of the mechanisms and measures that contribute to the overall resilience, in both qualitative and quantitative terms. The presented modelling methodology is general enough to be applied in combination with existing system analysis techniques, such as risk assessment, dependability and performance evaluation.
Abstract:
This work explores the automatic recognition of physical activity intensity patterns from multi-axial accelerometry and heart rate signals. Data collection was carried out in free-living conditions and in three controlled gymnasium circuits, for a total of 179.80 h of data divided into sedentary situations (65.5%), light-to-moderate activity (17.6%) and vigorous exercise (16.9%). The proposed machine learning algorithms comprise the following steps: time-domain feature definition, standardization and PCA projection, unsupervised clustering (by k-means and GMM) and an HMM to account for long-term temporal trends. Performance was evaluated by 30 runs of a 10-fold cross-validation. Both the k-means and GMM-based approaches yielded high overall accuracy (86.97% and 85.03%, respectively) and, given the imbalance of the dataset, meritorious F-measures (up to 77.88%) for non-sedentary cases. Classification errors tended to be concentrated around transients, which constrains their practical impact. Hence, we consider our proposal to be suitable for 24 h monitoring of physical activity in ambulatory scenarios and a first step towards intensity-specific energy expenditure estimators.
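The unsupervised part of the pipeline described above (standardization, PCA projection, k-means clustering) can be sketched in a few lines. This sketch uses synthetic features in place of the actual accelerometry data, and a plain Lloyd-iteration k-means rather than the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for time-domain features of accelerometry windows:
# three intensity regimes (sedentary / light-to-moderate / vigorous).
X = np.vstack([
    rng.normal(0.0, 0.3, (300, 4)),   # sedentary
    rng.normal(2.0, 0.4, (120, 4)),   # light-to-moderate
    rng.normal(5.0, 0.6, (80, 4)),    # vigorous
])

# 1) Standardization: zero mean, unit variance per feature
Xs = (X - X.mean(axis=0)) / X.std(axis=0)

# 2) PCA projection onto the top two principal components (via SVD)
_, _, Vt = np.linalg.svd(Xs, full_matrices=False)
Z = Xs @ Vt[:2].T

# 3) Unsupervised clustering with plain Lloyd-iteration k-means
def kmeans(points, k, iters=50):
    centers = points[rng.choice(len(points), k, replace=False)].copy()
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        labels = np.argmin(((points[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):            # keep old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels, centers

labels, centers = kmeans(Z, k=3)
print(np.bincount(labels, minlength=3))
```

The HMM smoothing and 10-fold cross-validation stages of the paper are omitted here; the cluster labels would feed into those steps.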
Abstract:
A high-definition video quality metric built from full-reference ratios. Video quality measurement, or Visual Quality Assessment (VQA), is one of the greatest challenges still to be solved in the multimedia environment. Video quality has a very strong impact on the end user's (consumer's) perception of services based on the delivery of multimedia content, and is therefore a key factor in the evaluation of the new paradigm known as Quality of Experience (QoE). Video quality measurement models can be grouped into several branches according to the technical basis of the measurement system. The most important are those that employ psychovisual models intended to reproduce the characteristics of the Human Visual System (HVS), and those that instead take an engineering approach in which the quality computation is based on extracting intrinsic image parameters and comparing them. Despite the advances achieved in this field in recent years, research on video quality metrics, whether the reference is fully available (full-reference models), partially available (reduced-reference models) or absent altogether (no-reference models), still has a long way to go and many goals to reach. Among these, the measurement of high-definition signals, especially the very high quality signals used in the early stages of the value chain, is of particular interest because of its influence on the final quality of the service, and no reliable measurement models currently exist.
This doctoral thesis presents a full-reference quality measurement model that we have called PARMENIA (PArallel Ratios MEtric from iNtrInsic features Analysis), based on the weighting of four quality ratios computed from intrinsic image features: the Fidelity Ratio, computed by means of the morphological (Beucher) gradient; the Visual Similarity Ratio, computed from the visually significant points of the image obtained through local contrast filtering; the Sharpness Ratio, derived from the extraction of the Haralick contrast texture statistic; and the Complexity Ratio, obtained from the homogeneity definition of the Haralick set of texture statistics. PARMENIA is novel in its use of mathematical morphology and Haralick statistics as the basis of a quality metric, since these techniques have traditionally been linked to remote sensing and object segmentation. The formulation of the metric as a weighted set of ratios is likewise novel, since it draws both on structural similarity models and on more classical ones based on the perceptibility of the error produced by the signal degradation associated with compression. PARMENIA shows a very high correlation with the MOS ratings obtained from the subjective user tests carried out for its validation. The working corpus was selected from internationally validated sets of sequences, so that the reported results are of the highest possible quality and rigor.
The methodology consisted of generating a set of test sequences of different qualities by encoding with different quantization steps, obtaining subjective ratings for them through subjective quality tests (based on ITU-R Recommendation BT.500), and validating the metric by computing the correlation of PARMENIA with these subjective values, quantified through the Pearson correlation coefficient. Once the ratios had been validated, their influence on the final measure optimized and their high correlation with perception established, a second review was carried out on sequences from the hdtv test dataset 1 of the Video Quality Experts Group (VQEG), with the results obtained showing clear advantages. Abstract: Visual Quality Assessment has so far been one of the most intriguing challenges in the media environment. The progressive evolution towards higher resolutions together with higher expected quality (e.g. high definition and better image quality) calls for redefined quality measurement models. Given the growing interest in multimedia services delivery, perceptual quality measurement has become a very active area of research. First, in this work, a classification of objective video quality metrics based on their underlying methodologies and approaches for measuring video quality is introduced to sum up the state of the art. Then, this doctoral thesis describes an enhanced solution for full-reference objective quality measurement based on mathematical morphology, texture features and visual similarity information that provides a normalized metric, which we have called PARMENIA (PArallel Ratios MEtric from iNtrInsic features Analysis), with a highly correlated MOS score.
The PARMENIA metric is based on the pooling of different quality ratios obtained from three different approaches: Beucher's gradient, local contrast filtering, and the contrast and homogeneity Haralick texture features. The metric's performance is excellent and improves on the current state of the art by providing a wide dynamic range that makes it easier to discriminate between sequences coded at very similar quality, especially at the very high bit rates whose quality is currently transparent to quality metrics. PARMENIA introduces a degree of novelty with respect to other working metrics: on the one hand, it exploits structural information variation to build the metric's kernel, but complements the measure with texture information and a ratio of visually meaningful points that is closer to typical error-sensitivity-based approaches. We would like to point out that the PARMENIA approach is the only metric built upon full-reference ratios that uses mathematical morphology and texture features (typically used in segmentation) for quality assessment. On the other hand, it produces results with a wide dynamic range that allows measuring the quality of high-definition sequences from bit rates of hundreds of megabits per second (Mbps) down to typical distribution rates (5-6 Mbps) and even streaming rates (1-2 Mbps). Thus, a direct correlation between PARMENIA and MOS scores is easily constructed. PARMENIA may further enhance the number of available choices in objective quality measurement, especially for very high quality HD materials. All these results come from a validation carried out on internationally validated datasets, on which subjective tests based on the ITU-R BT.500 methodology were performed. The Pearson correlation coefficient was calculated to verify the accuracy of PARMENIA and its reliability.
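The Beucher gradient underlying the Fidelity Ratio is simply a morphological dilation minus an erosion. A toy numpy sketch follows; the ratio definition in `fidelity_ratio` is illustrative only, since PARMENIA's exact formula is not given in the abstract, and the images are random stand-ins:

```python
import numpy as np

def beucher_gradient(img):
    """Morphological (Beucher) gradient: 3x3 grey dilation minus 3x3 erosion."""
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    # Stack the nine shifted copies of the image covered by a 3x3 window
    shifts = np.stack([p[i:i + h, j:j + w] for i in range(3) for j in range(3)])
    return shifts.max(axis=0) - shifts.min(axis=0)

def fidelity_ratio(ref, deg):
    """Illustrative full-reference ratio comparing gradient images;
    NOT PARMENIA's actual formula (not disclosed in the abstract)."""
    g_ref = beucher_gradient(ref)
    g_deg = beucher_gradient(deg)
    return 1.0 - np.abs(g_ref - g_deg).sum() / (g_ref.sum() + 1e-9)

rng = np.random.default_rng(1)
ref = rng.random((64, 64))                                        # reference frame
deg = np.clip(ref + rng.normal(0.0, 0.05, ref.shape), 0.0, 1.0)   # degraded frame
print(fidelity_ratio(ref, ref), fidelity_ratio(ref, deg))
```

An undegraded image scores exactly 1.0, and the ratio drops as degradation perturbs the gradient structure; PARMENIA pools several such ratios with learned weights.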
Abstract:
Minimally invasive surgery (MIS) techniques are today consolidating as an alternative to traditional surgery, owing to their many benefits for patients. This paradigm shift means that surgeons must learn a set of skills different from those required in open surgery. The training and assessment of these skills has become one of the main concerns in surgical training programmes, largely due to the pressure of a society that demands well-prepared surgeons and a reduction in the number of medical errors. Special attention is therefore being paid to defining new programmes that allow psychomotor skills to be trained and assessed in safe environments before new surgeons operate on real patients. To this end, hospitals and training centres are gradually incorporating training facilities where residents can practise and learn without risk. It is increasingly common for these laboratories to have virtual simulators or physical simulators capable of recording the movements of each resident's instruments. These simulators offer a wide variety of training and assessment tasks, as well as the possibility of obtaining objective information about the exercises. The validation studies carried out attest to their usefulness; even so, the levels of evidence reported are often insufficient. More importantly, there is no clear consensus on which metrics are most useful for characterizing surgical skill.
The goal of this doctoral thesis is to design and validate a conceptual framework for the definition and validation of environments for skills assessment in MIS, based on a three-stage model: pedagogical (tasks and metrics to employ), technological (metric acquisition technologies) and analytical (interpretation of competence based on the metrics). To this end, the practical implementation of an environment is described, based on (1) an instrument tracking system founded on the analysis of the laparoscopic video; and (2) the determination of skill from instrument motion metrics. For the pedagogical stage, a set of tasks for assessing basic psychomotor skills was designed and implemented, along with a series of motion metrics. Construct validation on these showed good results for time, path length, depth, average speed, average acceleration, economy of area and economy of volume. Additionally, face validation results were generally positive in all the groups considered (novices, residents, experts). For the technological stage, the EVA Tracking System was introduced, a solution for tracking surgical instruments based on the analysis of the endoscopic video. The accuracy of the system was evaluated at 16.33 pp RMS for 2D tracking of the tool in the image, and at 13 mm RMS for its spatial tracking. Construct validation with one of the assessment tasks showed good results for time, path length, depth, average speed, average acceleration, economy of area and economy of volume. Concurrent validation with the TrEndo® Tracking System, in turn, showed high correlation values for 8 of the 9 metrics analysed.
Finally, for the analytical stage, the behaviour of three supervised classifiers in automatically determining surgical skill from instrument motion information was compared, based on linear (linear discriminant analysis, LDA), non-linear (support vector machines, SVM) and fuzzy (adaptive neuro-fuzzy inference systems, ANFIS) approaches. The results show that, on average, SVM performs slightly better: 78.2% against the 71% and 71.7% obtained by ANFIS and LDA respectively. However, the statistical differences measured between the three were not shown to be significant. Overall, this doctoral thesis corroborates the postulated research hypotheses regarding the definition of skills assessment systems for minimally invasive surgery, the usefulness of video analysis as a source of information, and the importance of instrument motion information in characterizing surgical skill. On these foundations, new fields of research should be opened that contribute to the definition of structured and objective training programmes capable of guaranteeing the accreditation of well-prepared surgeons and promoting patient safety in the operating room. Abstract: Minimally invasive surgery (MIS) techniques have become a standard in many surgical sub-specialties, due to their many benefits for patients. However, this paradigm shift implies that surgeons must acquire a completely different set of skills from those normally attributed to open surgery. Training and assessment of these skills has become a major concern in surgical learning programmes, especially considering the social demand for better-prepared professionals and for a decrease in medical errors.
Therefore, much effort is being put into the definition of structured MIS learning programmes, where practice with real patients in the operating room (OR) can be delayed until the resident can attest to a minimum level of psychomotor competence. To this end, skills laboratory settings are being introduced in hospitals and training centres where residents may practice and be assessed on their psychomotor skills. Technological advances in the field of tracking technologies and virtual reality (VR) have enabled the creation of new learning systems such as VR simulators or enhanced box trainers. These systems offer a wide range of tasks, as well as the capability of registering objective data on the trainees' performance. Validation studies give proof of their usefulness; however, the levels of evidence reported are in many cases low. More importantly, there is still no clear consensus on topics such as the optimal metrics that must be used to assess competence, the validity of VR simulation, the portability of tracking technologies into real surgeries (for advanced assessment) or the degree to which the skills measured and obtained in laboratory environments transfer to the OR. The purpose of this PhD is to design and validate a conceptual framework for the definition and validation of MIS assessment environments based on a three-pillared model defining three main stages: pedagogical (tasks and metrics to employ), technological (metric acquisition technologies) and analytical (interpretation of competence based on metrics). To this end, a practical implementation of the framework is presented, focused on (1) a video-based tracking system and (2) the determination of surgical competence based on the laparoscopic instruments' motion-related data.
The pedagogical stage's results led to the design and implementation of a set of basic tasks for MIS psychomotor skills assessment, as well as the definition of motion analysis parameters (MAPs) to measure performance on said tasks. Validation yielded good construct results for parameters such as time, path length, depth, average speed, average acceleration, economy of area and economy of volume. Additionally, face validation results showed positive acceptance on behalf of the experts, residents and novices. For the technological stage, the EVA Tracking System is introduced. EVA provides a solution for tracking laparoscopic instruments based on the analysis of the monoscopic video image. Accuracy tests for the system are presented, which yielded an average RMSE of 16.33 px for 2D tracking of the instrument on the image and of 13 mm for 3D spatial tracking. A validation experiment was conducted using one of the tasks and the most relevant MAPs. Construct validation showed significant differences for time, path length, depth, average speed, average acceleration, economy of area and economy of volume, especially between novices and residents/experts. More importantly, concurrent validation against the TrEndo® Tracking System presented high correlation values (>0.7) for 8 of the 9 MAPs proposed. Finally, the analytical stage allowed comparing the performance of three different supervised classification strategies in the determination of surgical competence based on motion-related information. The three classifiers were based on linear (linear discriminant analysis, LDA), non-linear (support vector machines, SVM) and fuzzy (adaptive neuro-fuzzy inference systems, ANFIS) approaches. On average, SVM performed slightly better than the other two classifiers: accuracy for LDA, SVM and ANFIS was 71.7%, 78.2% and 71%, respectively. However, the differences between the three classifiers were not statistically significant.
Overall, this PhD corroborates the investigated research hypotheses regarding the definition of MIS assessment systems, the use of endoscopic video analysis as the main source of information and the relevance of motion analysis in the determination of surgical competence. New research fields in the training and assessment of MIS surgeons can be proposed based on these foundations, in order to contribute to the definition of structured and objective learning programmes that guarantee the accreditation of well-prepared professionals and the promotion of patient safety in the OR.
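The analytical-stage comparison described above can be sketched in a few lines. The snippet below is a minimal illustration using scikit-learn, with synthetic motion-metric features and only the LDA and SVM classifiers (scikit-learn ships no standard ANFIS implementation, so it is omitted here); the feature names, group sizes and values are assumptions for illustration, not the thesis's actual data or code.

```python
# Sketch: comparing supervised classifiers on motion analysis parameters
# (MAPs). Features and data are synthetic placeholders.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical MAPs (time [s], path length [m], average speed [m/s]) for
# two skill levels: experts tend to be faster with shorter paths.
novices = rng.normal(loc=[120.0, 3.0, 0.02], scale=[20.0, 0.5, 0.005], size=(30, 3))
experts = rng.normal(loc=[80.0, 2.0, 0.03], scale=[15.0, 0.4, 0.005], size=(30, 3))
X = np.vstack([novices, experts])
y = np.array([0] * 30 + [1] * 30)  # 0 = novice, 1 = expert

for name, clf in [("LDA", LinearDiscriminantAnalysis()),
                  ("SVM", make_pipeline(StandardScaler(), SVC(kernel="rbf")))]:
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: mean CV accuracy = {acc:.2f}")
```

Cross-validated accuracy, as printed here, is the kind of figure the abstract reports (71.7% for LDA, 78.2% for SVM on the real data).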
Resumo:
INTRODUCTION: EVA (Endoscopic Video Analysis), a new tracking system that extracts the motions of laparoscopic instruments by means of non-obtrusive video tracking, was developed. The feasibility of using EVA in laparoscopic settings has been tested in a box trainer setup. METHODS: EVA makes use of an algorithm that employs information on the laparoscopic instrument's shaft edges in the image, the instrument's insertion point, and the camera's optical centre to track the 3D position of the instrument tip. A validation study of EVA comprised a comparison of the measurements achieved with EVA and with the TrEndo tracking system. To this end, 42 participants (16 novices, 22 residents, and 4 experts) were asked to perform a peg transfer task in a box trainer. Ten motion-based metrics were used to assess their performance. RESULTS: Construct validity of EVA was obtained for seven motion-based metrics. Concurrent validation revealed a strong correlation between the results obtained by EVA and the TrEndo for metrics such as path length (ρ = 0.97), average speed (ρ = 0.94) or economy of volume (ρ = 0.85), proving the viability of EVA. CONCLUSIONS: EVA has been successfully used in the training setup, showing the potential of endoscopic video analysis to assess laparoscopic psychomotor skills. The results encourage further implementation of video tracking in training setups and in image-guided surgery.
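As a toy illustration of the concurrent-validity check, the snippet below correlates one metric (path length) as measured by two tracking systems across a handful of participants. All values are made up for illustration; only the idea of reporting a correlation coefficient per metric comes from the abstract, and Pearson's r is used here although the study may have computed Spearman's ρ.

```python
# Sketch: concurrent validity of two tracking systems via correlation of a
# shared metric. The per-participant values below are hypothetical.
import numpy as np

def pearson_r(a, b):
    """Pearson correlation coefficient between two 1-D sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    return float((a @ b) / np.sqrt((a @ a) * (b @ b)))

eva_path_length    = [3.1, 2.4, 2.8, 1.9, 2.2, 1.7]  # metres, hypothetical
trendo_path_length = [3.0, 2.5, 2.9, 2.0, 2.1, 1.8]  # metres, hypothetical

r = pearson_r(eva_path_length, trendo_path_length)
print(f"concurrent validity r = {r:.2f}")
```

A value above the 0.7 threshold the thesis uses would count the metric as concurrently valid.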
Resumo:
INTRODUCTION: Motion metrics have become an important source of information when addressing the assessment of surgical expertise. However, their direct relationship with the different surgical skills has not been fully explored. The purpose of this study is to investigate the relevance of motion-related metrics in the evaluation of basic psychomotor laparoscopic skills, as well as their correlation with the different abilities they are intended to measure. METHODS: A framework for task definition and metric analysis is proposed. An explorative survey was first conducted with a board of experts to identify metrics for assessing basic psychomotor skills. Based on the output of that survey, three novel tasks for surgical assessment were designed. A face and construct validation study was performed, with a focus on motion-related metrics. Tasks were performed by 42 participants (16 novices, 22 residents and 4 experts). Movements of the laparoscopic instruments were registered with the TrEndo tracking system and analyzed. RESULTS: Time, path length and depth showed construct validity for all three tasks. Motion smoothness and idle time also showed validity for tasks involving bi-manual coordination and for tasks requiring a more tactical approach, respectively. Additionally, motion smoothness and average speed showed high internal consistency, proving them to be the most task-independent of all the metrics analyzed. CONCLUSION: Motion metrics are complementary and valid for assessing basic psychomotor skills, and their relevance depends on the skill being evaluated. A larger clinical implementation, combined with quality performance information, will give more insight into the relevance of the results shown in this study.
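Several of the metrics named in these abstracts (time, path length, depth, average speed) can be computed directly from a sampled tip trajectory. The sketch below shows one plausible formulation; the sampling period, units and the choice of the z-axis as the insertion depth are assumptions, not the studies' actual definitions.

```python
# Sketch: computing basic motion analysis parameters from a sampled 3-D
# instrument-tip trajectory. Units and axis conventions are illustrative.
import numpy as np

def motion_metrics(positions, dt):
    """positions: (N, 3) array of tip positions [mm]; dt: sample period [s]."""
    positions = np.asarray(positions, dtype=float)
    steps = np.diff(positions, axis=0)                    # per-sample displacement
    path_length = float(np.linalg.norm(steps, axis=1).sum())
    total_time = dt * (len(positions) - 1)
    depth = float(positions[:, 2].max() - positions[:, 2].min())  # z-axis range
    return {
        "time_s": total_time,
        "path_length_mm": path_length,
        "depth_mm": depth,
        "avg_speed_mm_s": path_length / total_time,
    }

traj = [(0, 0, 0), (3, 4, 0), (3, 4, 5)]  # toy trajectory, 1 s per sample
m = motion_metrics(traj, dt=1.0)
print(m)  # path length = 5 + 5 = 10 mm, average speed = 5 mm/s
```

Metrics such as motion smoothness (typically a jerk-based measure) and idle time would follow the same pattern, derived from higher-order differences of the trajectory.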
Resumo:
The influence of atmospheric gases and tropospheric phenomena becomes more relevant at frequencies within the THz band (100 GHz to 10 THz), severely affecting propagation conditions. The use of radiosoundings in propagation studies is a well-established measurement technique for collecting information about the vertical structure of the atmosphere, from which gaseous and cloud attenuation can be estimated with the use of propagation models. However, some of these prediction models are not suitable for use under rainy conditions. In the present study, a method to identify the presence of rainy conditions during radiosoundings is introduced, with the aim of filtering out these events from yearly statistics of predicted atmospheric attenuation. The detection procedure is based on the analysis of a set of parameters, some of them extracted from synoptical observations of weather (SYNOP reports) and others derived from radiosonde observations (RAOBs). The performance of the method has been evaluated under different climatic conditions, corresponding to three locations in Spain where co-located rain gauge data were available. Rain events detected by the method have been compared with the precipitation events identified by the rain gauge. The pertinence of the method is discussed on the basis of an analysis of cumulative distributions of total attenuation at 100 and 300 GHz. This study demonstrates that the proposed method can be useful to identify events probably associated with rainy conditions. Hence, it can be considered a suitable algorithm for filtering out this kind of event from annual attenuation statistics.
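The filtering step described above reduces, in essence, to partitioning a year of soundings by a rain flag before computing attenuation statistics. The sketch below shows that idea only; the record fields, the boolean flag and the sample values are hypothetical placeholders, not the paper's actual SYNOP/RAOB detection criteria.

```python
# Sketch: removing rain-flagged radiosoundings before building yearly
# attenuation statistics. Field names and values are hypothetical.

def filter_rainy_soundings(soundings):
    """Split soundings into (dry, rainy) based on a boolean rain flag."""
    dry = [s for s in soundings if not s["rain_flag"]]
    rainy = [s for s in soundings if s["rain_flag"]]
    return dry, rainy

soundings = [
    {"id": "2008-01-01-00Z", "atten_100ghz_db": 0.9, "rain_flag": False},
    {"id": "2008-01-01-12Z", "atten_100ghz_db": 6.5, "rain_flag": True},
    {"id": "2008-01-02-00Z", "atten_100ghz_db": 1.1, "rain_flag": False},
]

dry, rainy = filter_rainy_soundings(soundings)
mean_dry = sum(s["atten_100ghz_db"] for s in dry) / len(dry)
print(f"{len(rainy)} rainy event(s) removed; mean dry attenuation = {mean_dry:.1f} dB")
```

In the paper the comparison is done on cumulative distributions of total attenuation at 100 and 300 GHz rather than on a simple mean; the partitioning step is the same.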
Resumo:
A Kuhnian approach to research assessment requires us to consider that the important scientific breakthroughs that drive scientific progress are infrequent and that the progress of science does not depend on normal research. Consequently, indicators of research performance based on the total number of papers do not accurately measure scientific progress. Similarly, those universities with the best reputations in terms of scientific progress differ widely from other universities in terms of the scale of investments made in research and in the higher concentrations of outstanding scientists present, but less so in terms of the total number of papers or citations. This study argues that indicators for the 1% high-citation tail of the citation distribution reveal the contribution of universities to the progress of science and provide quantifiable justification for the large investments in research made by elite research universities. In this tail, which follows a power law, the number of the less frequent, highly cited important breakthroughs can be predicted from the frequencies of papers in the upper part of the tail. This study quantifies the false impression of excellence produced by multinational papers, and by other types of papers that do not contribute to the progress of science. Many of these papers are concentrated in and dominate lists of highly cited papers, especially in lower-ranked universities. The h-index obscures the differences between higher- and lower-ranked universities because the proportion of h-core papers in the 1% high-citation tail is not proportional to the value of the h-index.
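The contrast the study draws between the h-index and high-citation-tail indicators can be made concrete with two small functions. The citation counts below are synthetic, chosen only to illustrate how a few very highly cited papers barely move the h-index while dominating a tail count.

```python
# Sketch: h-index vs. a high-citation-tail count on a citation distribution.
# The citation counts are synthetic, for illustration only.

def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    return sum(1 for i, c in enumerate(ranked, start=1) if c >= i)

def top_tail_count(citations, threshold):
    """Number of papers at or above a high-citation threshold."""
    return sum(1 for c in citations if c >= threshold)

citations = [250, 120, 40, 40, 12, 9, 5, 3, 1, 0]
print("h-index:", h_index(citations))                       # 6
print("papers with >= 100 citations:",
      top_tail_count(citations, 100))                       # 2
```

Replacing the two papers with 250 and 120 citations by papers with 40 each would leave the h-index at 6 but empty the high-citation tail, which is the distinction the abstract argues matters for measuring breakthroughs.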