987 results for Video genre classification


Relevance:

30.00% 30.00%

Publisher:

Abstract:

This study presents an analysis of the application of underwater video data collected for training and validating benthic habitat distribution models. Specifically, we quantify the two major sources of error pertaining to collection of this type of reference data. First, a theoretical spatial error budget is developed for a positioning system used to co-register video frames to their corresponding locations at the seafloor. Second, we compare interpretation variability among trained operators assessing the same video frames between times over three hierarchical levels of a benthic classification scheme. Propagated error in the positioning system described was found to be highly correlated with depth of operation and varies from 1.5 m near the surface to 5.7 m in 100 m of water. In order of decreasing classification hierarchy, mean overall observer agreement was found to be 98% (range 6%), 82% (range 12%) and 75% (range 17%) for the 2, 4, and 6 class levels of the scheme, respectively. Patterns in between-observer variation are related to the level of detail imposed by each hierarchical level of the classification scheme, the feature of interest, and the amount of observer experience. © 2014 Taylor & Francis Group, LLC.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

An intelligent system that emulates human decision behaviour based on visual data acquisition is proposed. The approach is useful in applications where images are used to supply information to specialists who will choose suitable actions. An artificial neural classifier aids a fuzzy decision support system to deal with uncertainty and imprecision present in available information. Advantages of both techniques are exploited complementarily. As an example, this method was applied in automatic focus checking and adjustment in video monitor manufacturing. Copyright © 2005 IFAC.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A wide range of telecommunications services transmit voice, video and data through complex transmission networks, and in some cases the service does not reach an acceptable quality level for the end user. The study of methods for assessing video and voice quality therefore plays a very important role. This paper presents a classification scheme, based on different criteria, of the methods and metrics studied in recent years. It also shows how video quality is affected by degradation in the transmission channel in two kinds of services: Digital TV (ISDB-TB), due to fading in the air interface, and video streaming over an IP network, due to packet loss. For the Digital TV tests, a scenario was set up in which the digital TV transmitter is connected to an RF channel emulator, where different fading models are inserted, and the resulting videos are saved on a mobile device. The video streaming tests were performed in an isolated IP network scenario in which several network conditions were scheduled, resulting in different qualities of video reception. Video quality is assessed using objective methods: PSNR, SSIM and VQM. The results show how losses in the transmission channel affect the end user's quality of experience in both services studied.
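Of the three objective metrics named above, PSNR has the simplest closed form. The sketch below is only an illustration of that standard definition, not the paper's test pipeline: the function name `psnr`, the flat-list frame representation, and the 2x2 sample frames are all hypothetical, and 8-bit pixels (peak value 255) are assumed.

```python
import math

def psnr(reference, degraded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Frames are flat lists of pixel values; `peak` is the maximum possible
    pixel value (255 for 8-bit video, an assumption made here).
    """
    if len(reference) != len(degraded) or not reference:
        raise ValueError("frames must be non-empty and of equal size")
    # Mean squared error between the reference and the degraded frame
    mse = sum((r - d) ** 2 for r, d in zip(reference, degraded)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames: no measurable degradation
    return 10.0 * math.log10(peak ** 2 / mse)

# Hypothetical 2x2 8-bit frames, flattened row by row
reference_frame = [52, 55, 61, 59]
degraded_frame = [54, 55, 60, 57]
print(f"PSNR: {psnr(reference_frame, degraded_frame):.2f} dB")
```

Higher values indicate less distortion. For a whole sequence, a common convention is to average the per-frame values, although the abstract does not say which pooling the authors used.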

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Audio-visual documents obtained from German TV news are classified according to the IPTC topic categorization scheme. To this end, the usual text classification techniques are adapted to speech, video, and non-speech audio. For each of the three modalities word analogues are generated: sequences of syllables for speech, “video words” based on low-level color features (color moments, color correlogram and color wavelet), and “audio words” based on low-level spectral features (spectral envelope and spectral flatness) for non-speech audio. Such audio and video words provide a means to represent the different modalities in a uniform way. The frequencies of the word analogues represent audio-visual documents: the standard bag-of-words approach. Support vector machines are used for supervised classification in a 1 vs. n setting. Classification based on speech outperforms all other single modalities. Combining speech with non-speech audio improves classification. Classification is further improved by supplementing speech and non-speech audio with video words. Optimal F-scores range between 62% and 94%, corresponding to 50% to 84% above chance. The optimal combination of modalities depends on the category to be recognized. The construction of audio and video words from low-level features provides a good basis for the integration of speech, non-speech audio and video.
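The bag-of-words representation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the token names, the tiny vocabularies, and the choice to combine modalities by concatenating their frequency vectors are hypothetical and are not taken from the paper.

```python
from collections import Counter

def bag_of_words(tokens, vocabulary):
    """Relative frequency of each vocabulary entry among the given tokens."""
    counts = Counter(tokens)
    total = sum(counts[word] for word in vocabulary)
    return [counts[word] / total if total else 0.0 for word in vocabulary]

def document_vector(document, vocabularies):
    """Concatenate per-modality bag-of-words vectors (an assumed fusion scheme)."""
    vector = []
    for modality, vocabulary in vocabularies.items():
        vector.extend(bag_of_words(document.get(modality, []), vocabulary))
    return vector

# Hypothetical vocabularies of word analogues for the three modalities
vocabularies = {
    "speech": ["ta", "ges", "schau", "nach"],  # syllables
    "video": ["v7", "v12"],                    # quantized color-feature ids
    "audio": ["a3", "a9"],                     # quantized spectral-feature ids
}
# Hypothetical news clip, already converted to word analogues
clip = {
    "speech": ["ta", "ges", "schau", "ta"],
    "video": ["v12", "v7", "v12"],
    "audio": ["a3"],
}
print(document_vector(clip, vocabularies))
```

A vector of this kind is what a supervised classifier such as a support vector machine would take as input, with one binary classifier trained per topic category in a one-versus-rest setup.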

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This article describes a classification scheme for computer-mediated discourse that classifies samples in terms of clusters of features, or “facets”. The goal of the scheme is to synthesize and articulate aspects of technical and social context that influence discourse usage in CMC environments. The classification scheme is motivated, presented in detail with support from existing literature, and illustrated through a comparison of two types of weblog (blog) data. In concluding, the advantages and limitations of the scheme are weighed.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

OBJECTIVE Vestibular neuritis is often mimicked by stroke (pseudoneuritis). Vestibular eye movements help discriminate the two conditions. We report vestibulo-ocular reflex (VOR) gain measures in neuritis and stroke presenting with acute vestibular syndrome (AVS). METHODS Prospective cross-sectional study of AVS (acute continuous vertigo/dizziness lasting >24 h) at two academic centers. We measured horizontal head impulse test (HIT) VOR gains in 26 AVS patients using a video HIT device (ICS Impulse). All patients were assessed within 1 week of symptom onset. Diagnoses were confirmed by clinical examinations, brain magnetic resonance imaging with diffusion-weighted images, and follow-up. Brainstem and cerebellar strokes were classified by vascular territory: posterior inferior cerebellar artery (PICA) or anterior inferior cerebellar artery (AICA). RESULTS Diagnoses were vestibular neuritis (n = 16) and posterior fossa stroke (PICA, n = 7; AICA, n = 3). Mean HIT VOR gains (ipsilesional [standard error of the mean], contralesional [standard error of the mean]) were as follows: vestibular neuritis (0.52 [0.04], 0.87 [0.04]); PICA stroke (0.94 [0.04], 0.93 [0.04]); AICA stroke (0.84 [0.10], 0.74 [0.10]). VOR gains were asymmetric in neuritis (unilateral vestibulopathy) and symmetric in PICA stroke (bilaterally normal VOR), whereas gains in AICA stroke were heterogeneous (asymmetric, bilaterally low, or normal). In vestibular neuritis, borderline gains ranged from 0.62 to 0.73. Twenty patients (12 neuritis, six PICA strokes, two AICA strokes) had at least five interpretable HIT trials (for both ears), allowing an appropriate classification based on mean VOR gains per ear. Classifying AVS patients with bilateral VOR mean gains of 0.70 or more as suspected strokes yielded a total diagnostic accuracy of 90%, with stroke sensitivity of 88% and specificity of 92%. CONCLUSION Video HIT VOR gains differ between peripheral and central causes of AVS.
PICA strokes were readily separated from neuritis using gain measures, but AICA strokes were at risk of being misclassified based on VOR gain alone.
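The decision rule reported above (both ears at a mean gain of 0.70 or more flags a suspected stroke) is simple enough to write out. In the sketch below the function names are hypothetical, and the confusion counts (7 of 8 strokes and 11 of 12 neuritis cases classified correctly) are not stated in the abstract: they are inferred here as the only whole-patient counts consistent with the reported 88%, 92% and 90% for the 20 classifiable patients.

```python
def is_suspected_stroke(gain_ear_a, gain_ear_b, threshold=0.70):
    """Flag a suspected stroke when the mean VOR gain of BOTH ears meets the threshold."""
    return gain_ear_a >= threshold and gain_ear_b >= threshold

def diagnostic_metrics(true_pos, false_neg, true_neg, false_pos):
    """Sensitivity, specificity and overall accuracy from confusion counts."""
    total = true_pos + false_neg + true_neg + false_pos
    return {
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
        "accuracy": (true_pos + true_neg) / total,
    }

# Group mean gains (ipsilesional, contralesional) as reported in the abstract
print(is_suspected_stroke(0.52, 0.87))  # vestibular neuritis group means
print(is_suspected_stroke(0.94, 0.93))  # PICA stroke group means

# Inferred counts (assumption): 8 strokes and 12 neuritis cases, one error each
metrics = diagnostic_metrics(true_pos=7, false_neg=1, true_neg=11, false_pos=1)
print({name: round(value, 3) for name, value in metrics.items()})
```

The rule is deliberately one-sided: a single low gain is read as a unilateral peripheral deficit, which is why AICA strokes with asymmetric or bilaterally low gains can slip through.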

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Kelp forests represent a major habitat type in coastal waters worldwide, and their structure and distribution are predicted to change due to global warming. Despite their ecological and economic importance, there is still a lack of reliable spatial information on their abundance and distribution. In recent years, various hydroacoustic mapping techniques for sublittoral environments have evolved. However, in turbid coastal waters, such as off the island of Helgoland (Germany, North Sea), the kelp vegetation is present in shallow water depths normally excluded from hydroacoustic surveys. In this study, single beam survey data consisting of the two seafloor parameters roughness and hardness were obtained with RoxAnn from water depths between 2 and 18 m. Our primary aim was to reliably detect the kelp forest habitat at different densities and distinguish it from other vegetated zones. Five habitat classes were identified using underwater video and were applied for classification of acoustic signatures. Subsequently, spatial prediction maps were produced via two classification approaches: linear discriminant analysis (LDA) and a manual classification routine (MC). LDA was able to distinguish dense kelp forest from other habitats (i.e. mixed seaweed vegetation, sand, and barren bedrock), but could not resolve differences in kelp density. In contrast, MC also provided information on the distribution of medium-dense kelp, which is characterized by intermediate roughness and hardness values caused by reduced kelp abundance. The prediction maps reach accordance levels of 62% (LDA) and 68% (MC). The presence of vegetation (kelp and mixed seaweed vegetation) was determined with higher prediction abilities of 75% (LDA) and 76% (MC). Since the different habitat classes reveal acoustic signatures that strongly overlap, the manual classification method was more appropriate for separating different kelp forest densities and low-lying vegetation.
It became evident that the occurrence of kelp in this area is not simply linked to water depth. Moreover, this study shows that the two seafloor parameters collected with RoxAnn are suitable indicators for the discrimination of different densely vegetated seafloor habitats in shallow environments.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This article presents a probabilistic method for vehicle detection and tracking through the analysis of monocular images obtained from a vehicle-mounted camera. The method is designed to address the main shortcomings of traditional particle filtering approaches, namely Bayesian methods based on importance sampling, for use in traffic environments. These methods do not scale well when the dimensionality of the feature space grows, which creates significant limitations when tracking multiple objects. Alternatively, the proposed method is based on a Markov chain Monte Carlo (MCMC) approach, which allows efficient sampling of the feature space. The method involves important contributions in both the motion and the observation models of the tracker. Indeed, as opposed to particle filter-based tracking methods in the literature, which typically resort to observation models based on appearance or template matching, in this study a likelihood model that combines appearance analysis with information from motion parallax is introduced. Regarding the motion model, a new interaction treatment is defined based on Markov random fields (MRF) that allows for the handling of possible inter-dependencies in vehicle trajectories. As for vehicle detection, the method relies on a supervised classification stage using support vector machines (SVM). The contribution in this field is twofold. First, a new descriptor based on the analysis of gradient orientations in concentric rectangles is defined. This descriptor involves a much smaller feature space compared to traditional descriptors, which are too costly for real-time applications. Second, a new vehicle image database is generated to train the SVM and made public. The proposed vehicle detection and tracking method is proven to outperform existing methods and to successfully handle challenging situations in the test sequences.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

High-definition video quality metric built from full-reference ratios. Video quality measurement, known as Visual Quality Assessment (VQA), is one of the greatest unsolved challenges in the multimedia field. Video quality has a very high impact on the perception of the end user (consumer) of services built on the delivery of multimedia content, and is therefore a key factor in assessing the new paradigm known as Quality of Experience (QoE). Video quality measurement models can be grouped into several branches according to the technical basis underlying the measurement system. The most important are those that use psychovisual models aimed at reproducing the characteristics of the Human Visual System (HVS) and those that instead take an engineering approach, in which the quality computation is based on extracting intrinsic image parameters and comparing them. Despite the advances made in this field in recent years, research on video quality metrics, whether with the reference available (so-called full-reference models), with part of it available (reduced-reference models), or with none at all (no-reference models), still has considerable room for improvement and goals to reach. Among these, the measurement of high-definition signals, especially the very high quality signals used in the first stages of the value chain, is of particular interest because of its influence on the final quality of the service, and no reliable measurement models currently exist.
This doctoral thesis presents a full-reference quality measurement model that we have called PARMENIA (PArallel Ratios MEtric from iNtrInsic features Analysis), based on the weighting of four quality ratios computed from intrinsic image features. They are: the Fidelity Ratio, computed using the morphological gradient, or Beucher gradient; the Visual Similarity Ratio, computed from the visually significant points of the image through local contrast filtering; the Sharpness Ratio, derived from the Haralick contrast texture statistic; and the Complexity Ratio, obtained from the definition of homogeneity in the set of Haralick texture statistics. The novelty of PARMENIA is its use of mathematical morphology and Haralick statistics as the basis of a quality metric, since these techniques have traditionally been more closely tied to remote sensing and object segmentation. In addition, the formulation of the metric as a weighted set of ratios is equally novel, because it draws on structural similarity models and on other, more classical models based on the perceptibility of the error generated by the signal degradation associated with compression. PARMENIA yields results with a very high correlation with the MOS ratings from the subjective user tests carried out to validate it. The working corpus was selected from internationally validated sets of sequences, so that the results reported are of the highest possible quality and rigor.
The working methodology consisted of generating a set of test sequences of different qualities by encoding with different quantization steps, obtaining subjective ratings for them through subjective quality tests (based on International Telecommunication Union Recommendation BT.500), and validating by computing the correlation of PARMENIA with these subjective values, quantified through the Pearson correlation coefficient. Once the ratios had been validated and their influence on the final measure and its high correlation with perception had been optimized, a second review was carried out on sequences from the hdtv test dataset 1 of the Video Quality Experts Group (VQEG), and the results obtained show its clear advantages. Abstract Visual Quality Assessment has so far been one of the most intriguing challenges in the media environment. The progressive evolution towards higher resolutions, together with the increasing quality required (e.g. high definition and better image quality), calls for models for quality measurement to be redefined. Given the growing interest in multimedia services delivery, perceptual quality measurement has become a very active area of research. First, in this work, a classification of objective video quality metrics based on their underlying methodologies and approaches for measuring video quality is introduced to sum up the state of the art. Then, this doctoral thesis describes an enhanced solution for full-reference objective quality measurement based on mathematical morphology, texture features and visual similarity information that provides a normalized metric, which we have called PARMENIA (PArallel Ratios MEtric from iNtrInsic features Analysis), that correlates highly with MOS scores.
The PARMENIA metric is based on the pooling of different quality ratios that are obtained from three different approaches: Beucher’s gradient, local contrast filtering, and Haralick’s contrast and homogeneity texture features. The metric's performance is excellent, and it improves on the current state of the art by providing a wide dynamic range that makes it easier to discriminate between coded sequences of very similar quality, especially at very high bit rates, where quality differences are currently transparent to quality metrics. PARMENIA introduces a degree of novelty relative to other working metrics. On the one hand, it exploits the variation in structural information to build the metric’s kernel, but complements the measure with texture information and a ratio of visually meaningful points that is closer to typical error-sensitivity-based approaches. We would like to point out that PARMENIA is the only metric built upon full-reference ratios and using mathematical morphology and texture features (typically used in segmentation) for quality assessment. On the other hand, it produces results with a wide dynamic range that allows the quality of high-definition sequences to be measured from bit rates of hundreds of megabits per second (Mbps) down to typical distribution rates (5-6 Mbps) and even streaming rates (1-2 Mbps). Thus, a direct correlation between PARMENIA and MOS scores is easily constructed. PARMENIA may further extend the available choices in objective quality measurement, especially for very high quality HD materials. All these results come from validation on internationally validated datasets, on which subjective tests based on the ITU-R BT.500 methodology have been carried out. The Pearson correlation coefficient has been calculated to verify the accuracy of PARMENIA and its reliability.
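The validation step described above, correlating an objective metric with subjective MOS ratings through the Pearson coefficient, can be sketched as follows. The scores are hypothetical placeholders and do not come from the thesis; only the formula is standard.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)

# Hypothetical objective scores and MOS ratings for five coded sequences
metric_scores = [0.35, 0.52, 0.68, 0.81, 0.93]
mos_ratings = [1.8, 2.6, 3.4, 4.1, 4.7]
print(f"Pearson r = {pearson(metric_scores, mos_ratings):.3f}")
```

A coefficient near 1 means the metric ranks and spaces the sequences much as viewers do. It measures linear association only, so a metric with a monotonic but nonlinear relation to MOS would score lower unless a fitting function is applied first.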

Relevance:

30.00% 30.00%

Publisher:

Abstract:

INTRODUCTION: Objective assessment of motor skills has become an important challenge in minimally invasive surgery (MIS) training. Currently, there is no gold standard defining and determining the residents' surgical competence. To aid in the decision process, we analyze the validity of a supervised classifier to determine the degree of MIS competence based on assessment of psychomotor skills. METHODOLOGY: An adaptive neuro-fuzzy inference system (ANFIS) is trained to classify performance in a box trainer peg transfer task performed by two groups (expert/non-expert). There were 42 participants included in the study: the non-expert group consisted of 16 medical students and 8 residents (< 10 MIS procedures performed), whereas the expert group consisted of 14 residents (> 10 MIS procedures performed) and 4 experienced surgeons. Instrument movements were captured by means of the Endoscopic Video Analysis (EVA) tracking system. Nine motion analysis parameters (MAPs) were analyzed, including time, path length, depth, average speed, average acceleration, economy of area, economy of volume, idle time and motion smoothness. Data reduction was performed by means of principal component analysis, and the reduced data were then used to train the ANFIS net. Performance was measured by leave-one-out cross-validation. RESULTS: The ANFIS presented an accuracy of 80.95%, where 13 experts and 21 non-experts were correctly classified. Total root mean square error was 0.88, while the area under the classifier's ROC curve (AUC) was measured at 0.81. DISCUSSION: We have shown the usefulness of ANFIS for classification of MIS competence in a simple box trainer exercise. The main advantage of using ANFIS resides in its continuous output, which allows fine discrimination of surgical competence. There are, however, challenges that must be taken into account when considering use of ANFIS (e.g. training time, architecture modeling).
Despite this, we have shown the discriminative power of ANFIS for a low-difficulty box trainer task, regardless of the individual significances between MAPs. Future studies are required to confirm the findings, including new tasks, conditions and sample populations.
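Leave-one-out cross-validation, the evaluation protocol named above, holds out each participant in turn, trains on the rest, and counts correct predictions. The sketch below illustrates the protocol only: a nearest-centroid classifier is swapped in as a simple stand-in for ANFIS, and the one-dimensional feature values are hypothetical.

```python
def nearest_centroid_predict(train, sample):
    """Predict the label whose class centroid is closest to the sample.

    A deliberately simple stand-in classifier, NOT the ANFIS used in the study.
    `train` is a list of (feature_list, label) pairs.
    """
    sums = {}
    for features, label in train:
        acc, count = sums.get(label, ([0.0] * len(features), 0))
        sums[label] = ([a + f for a, f in zip(acc, features)], count + 1)
    best_label, best_distance = None, float("inf")
    for label, (acc, count) in sums.items():
        centroid = [a / count for a in acc]
        distance = sum((c - s) ** 2 for c, s in zip(centroid, sample))
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label

def leave_one_out_accuracy(data, predict):
    """Hold out each sample once, train on the others, return the hit rate."""
    correct = 0
    for i, (features, label) in enumerate(data):
        training_set = data[:i] + data[i + 1:]
        if predict(training_set, features) == label:
            correct += 1
    return correct / len(data)

# Hypothetical single-feature data (e.g. a principal component score)
participants = [
    ([1.0], "expert"), ([1.2], "expert"), ([0.9], "expert"),
    ([3.0], "non-expert"), ([3.3], "non-expert"), ([2.8], "non-expert"),
]
print(leave_one_out_accuracy(participants, nearest_centroid_predict))

# The reported accuracy follows from the counts in the abstract:
# 13 experts + 21 non-experts correct out of 42 participants
print(round(100 * (13 + 21) / 42, 2))
```

With only 42 participants, leave-one-out makes the most of the data at the cost of one model fit per participant, which is consistent with the training-time concern raised in the discussion.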

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The present work covers the first validation efforts of the EVA Tracking System for the assessment of minimally invasive surgery (MIS) psychomotor skills. Instrument movements were recorded for 42 surgeons (4 expert, 22 residents, 16 novice medical students) and analyzed for a box trainer peg transfer task. Construct validation was established for 7/9 motion analysis parameters (MAPs). Concurrent validation was determined for 8/9 MAPs against the TrEndo Tracking System. Finally, automatic determination of surgical proficiency based on the MAPs was sought by 3 different approaches to supervised classification (LDA, SVM, ANFIS), with accuracy results of 61.9%, 83.3% and 80.9% respectively. Results not only reflect on the validation of EVA for skills assessment, but also on the relevance of motion analysis of instruments in the determination of surgical competence.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Video-based vehicle detection is the focus of increasing interest due to its potential for collision avoidance. In particular, vehicle verification is especially challenging due to the enormous variability of vehicles in size, color, pose, etc. In this paper, a new approach based on supervised learning using Principal Component Analysis (PCA) is proposed that addresses the main limitations of existing methods. Namely, in contrast to classical approaches which train a single classifier regardless of the relative position of the candidate (thus ignoring valuable pose information), a region-dependent analysis is performed by considering four different areas. In addition, a study on the evolution of the classification performance according to the dimensionality of the principal subspace is carried out using PCA features within an SVM-based classification scheme. Indeed, the experiments performed on a publicly available database prove that PCA dimensionality requirements are region-dependent. Hence, in this work, the optimal configuration is adapted to each of them, rendering very good vehicle verification results.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Vision-based object detection from a moving platform becomes particularly challenging in the field of advanced driver assistance systems (ADAS). In this context, onboard vision-based vehicle verification strategies become critical, facing challenges derived from the variability of vehicle appearance, illumination, and vehicle speed. In this paper, an optimized HOG configuration for onboard vehicle verification is proposed which considers not only its spatial and orientation resolution, but also descriptor processing strategies and classification. An in-depth analysis of the optimal settings for HOG for onboard vehicle verification is presented, in the context of SVM classification with different kernels. In contrast to many existing approaches, the evaluation is carried out on a public and heterogeneous database of vehicle and non-vehicle images in different areas of the road, rendering excellent verification rates that outperform other similar approaches in the literature.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

2000 Mathematics Subject Classification: 14D20, 14J60.