926 resultados para Video data
Resumo:
In this paper, we present an integrated system for real-time automatic detection of human actions from video. The proposed approach uses the boundary of humans as the main feature for recognizing actions. Background subtraction is performed using Gaussian mixture model. Then, features are extracted from silhouettes and Vector Quantization is used to map features into symbols (bag of words approach). Finally, actions are detected using the Hidden Markov Model. The proposed system was validated using a newly collected real- world dataset. The obtained results show that the system is capable of achieving robust human detection, in both indoor and outdoor environments. Moreover, promising classification results were achieved when detecting two basic human actions: walking and sitting.
Resumo:
With the rapid increase in both centralized video archives and distributed WWW video resources, content-based video retrieval is gaining its importance. To support such applications efficiently, content-based video indexing must be addressed. Typically, each video is represented by a sequence of frames. Due to the high dimensionality of frame representation and the large number of frames, video indexing introduces an additional degree of complexity. In this paper, we address the problem of content-based video indexing and propose an efficient solution, called the Ordered VA-File (OVA-File) based on the VA-file. OVA-File is a hierarchical structure and has two novel features: 1) partitioning the whole file into slices such that only a small number of slices are accessed and checked during k Nearest Neighbor (kNN) search and 2) efficient handling of insertions of new vectors into the OVA-File, such that the average distance between the new vectors and those approximations near that position is minimized. To facilitate a search, we present an efficient approximate kNN algorithm named Ordered VA-LOW (OVA-LOW) based on the proposed OVA-File. OVA-LOW first chooses possible OVA-Slices by ranking the distances between their corresponding centers and the query vector, and then visits all approximations in the selected OVA-Slices to work out approximate kNN. The number of possible OVA-Slices is controlled by a user-defined parameter delta. By adjusting delta, OVA-LOW provides a trade-off between the query cost and the result quality. Query by video clip consisting of multiple frames is also discussed. Extensive experimental studies using real video data sets were conducted and the results showed that our methods can yield a significant speed-up over an existing VA-file-based method and iDistance with high query result quality. Furthermore, by incorporating temporal correlation of video content, our methods achieved much more efficient performance.
Resumo:
The contributions of this dissertation are in the development of two new interrelated approaches to video data compression: (1) A level-refined motion estimation and subband compensation method for the effective motion estimation and motion compensation. (2) A shift-invariant sub-decimation decomposition method in order to overcome the deficiency of the decimation process in estimating motion due to its shift-invariant property of wavelet transform. ^ The enormous data generated by digital videos call for an intense need of efficient video compression techniques to conserve storage space and minimize bandwidth utilization. The main idea of video compression is to reduce the interpixel redundancies inside and between the video frames by applying motion estimation and motion compensation (MEMO) in combination with spatial transform coding. To locate the global minimum of the matching criterion function reasonably, hierarchical motion estimation by coarse to fine resolution refinements using discrete wavelet transform is applied due to its intrinsic multiresolution and scalability natures. ^ Due to the fact that most of the energies are concentrated in the low resolution subbands while decreased in the high resolution subbands, a new approach called level-refined motion estimation and subband compensation (LRSC) method is proposed. It realizes the possible intrablocks in the subbands for lower entropy coding while keeping the low computational loads of motion estimation as the level-refined method, thus to achieve both temporal compression quality and computational simplicity. ^ Since circular convolution is applied in wavelet transform to obtain the decomposed subframes without coefficient expansion, symmetric-extended wavelet transform is designed on the finite length frame signals for more accurate motion estimation without discontinuous boundary distortions. ^ Although wavelet transformed coefficients still contain spatial domain information, motion estimation in wavelet domain is not as straightforward as in spatial domain due to the shift variance property of the decimation process of the wavelet transform. A new approach called sub-decimation decomposition method is proposed, which maintains the motion consistency between the original frame and the decomposed subframes, improving as a consequence the wavelet domain video compressions by shift invariant motion estimation and compensation. ^
Resumo:
Acknowledgements This work received funding from the Marine Alliance for Science and Technology for Scotland (MASTS) pooling initiative and their support is gratefully acknowledged. MASTS is funded by the Scottish Funding Council (grant reference HR09011) and contributing institutions. We thank Joshua Lawrence and Niall Fallon for their assistance in collecting some of the video data.
Resumo:
In this thesis, we propose to infer pixel-level labelling in video by utilising only object category information, exploiting the intrinsic structure of video data. Our motivation is the observation that image-level labels are much more easily to be acquired than pixel-level labels, and it is natural to find a link between the image level recognition and pixel level classification in video data, which would transfer learned recognition models from one domain to the other one. To this end, this thesis proposes two domain adaptation approaches to adapt the deep convolutional neural network (CNN) image recognition model trained from labelled image data to the target domain exploiting both semantic evidence learned from CNN, and the intrinsic structures of unlabelled video data. Our proposed approaches explicitly model and compensate for the domain adaptation from the source domain to the target domain which in turn underpins a robust semantic object segmentation method for natural videos. We demonstrate the superior performance of our methods by presenting extensive evaluations on challenging datasets comparing with the state-of-the-art methods.
Resumo:
Our objective for this thesis work was the deployment of a Neural Network based approach for video object detection on board a nano-drone. Furthermore, we have studied some possible extensions to exploit the temporal nature of videos to improve the detection capabilities of our algorithm. For our project, we have utilized the Mobilenetv2/v3SSDLite due to their limited computational and memory requirements. We have trained our networks on the IMAGENET VID 2015 dataset and to deploy it onto the nano-drone we have used the NNtool and Autotiler tools by GreenWaves. To exploit the temporal nature of video data we have tried different approaches: the introduction of an LSTM based convolutional layer in our architecture, the introduction of a Kalman filter based tracker as a postprocessing step to augment the results of our base architecture. We have obtain a total improvement in our performances of about 2.5 mAP with the Kalman filter based method(BYTE). Our detector run on a microcontroller class processor on board the nano-drone at 1.63 fps.
Resumo:
Background. Although digital and videotaped images are known to be comparable for the evaluation of left ventricular function, their relative accuracy for assessment of more complex anatomy is unclear. We sought to compare reading time, storage costs, and concordance of video and digital interpretations across multiple observers and sites. Methods. One hundred one patients with valvular (90 mitral, 48 aortic, 80 tricuspid) disease were selected prospectively, and studies were stored according to video and standardized digital protocols. The same reviewer interpreted video and digital images independently and at different times with the use of a standard report form to evaluate 40 items (eg, severity of stenosis or regurgitation, leaflet thickening, and calcification) as normal or mildly, moderately, or severely abnormal Concordance between modalities was expressed at kappa Major discordance (difference of >1 level of severity) was ascribed to the modality that gave the lesser severity. CD-ROM was used to store digital data (20:1 lossy compression), and super-VHS video-tape was used to store video data The reading time and storage costs for each modality were compared Results. Measured parameters were highly concordant (ejection fraction was 52% +/- 13% by both). Major discordance was rare, and lesser values were reported with digital rather than video interpretation in the categories of aortic and mitral valve thicken ing (1% to 2%) and severity of mitral regurgitation (2%). Digital reading time was 6.8 +/- 2.4 minutes, 38% shorter than with video (11.0 +/- 3.0, range 8 to 22 minutes, P < .001). Compressed digital studies had an average size of 60 <plus/minus> 14 megabytes (range 26 to 96 megabytes). Storage cost for video was A$0.62 per patient (18 studies per tape, total cost A$11.20), compared with A$0.31 per patient for digital storage (8 studies per CD-ROM, total cost A$2.50). Conclusion. Digital and video interpretation were highly concordant; in the few cases of major discordance, the digital scores were lower, perhaps reflecting undersampling. Use of additional views and longer clips may be indicated to minimize discordance with video in patients with complex problems. Digital interpretation offers a significant reduction in reading times and the cost of archiving.
Resumo:
Mestrado em Engenharia Electrotécnica e de Computadores
Resumo:
Introdução e objetivos – A contração muscular do quadricípete, isquiotibiais e rotadores externos apresenta um papel crucial no controlo da estabilidade articular dinâmica, nomeadamente no valgo de joelho no plano frontal. O objetivo deste estudo foi descrever a existência de relação entre a ativação muscular (medida pela recolha eletromiográfica do reto anterior, dos isquiotibiais e do grande glúteo) e a variação da angulação do joelho durante a fase de apoio do salto vertical no plano frontal (medida pela análise cinemática de vídeo) através de um estudo‑piloto. Metodologia – Trata‑se de um estudo descritivo correlacional com uma amostra de 220 saltos verticais (110 saltos correspondentes a cada um dos membros inferiores) realizados por 4 executantes (dois do género feminino e dois do género masculino com idade média de 28 anos ± 6,4). Resultados – Verificou‑se a existência de correlações significativas entre a ativação muscular do reto anterior na fase descendente do salto à direita e à esquerda e a tendência para um menor ou maior ângulo de valgo, respetivamente. O género influencia a dinâmica do joelho, verificando‑se que as mulheres apresentam estratégias de ativação diferentes dos homens. Conclusão – A diminuição da ativação muscular da anca parece influenciar os movimentos dinâmicos do joelho no plano frontal, enquadrando‑se nos resultados obtidos por outros autores. O material utilizado na recolha de dados, o sincronismo entre o trigger e o vídeo e o número reduzido de executantes que realizaram a amostra poderão constituir limitações ao estudo que se deverão considerar na realização de estudos futuros.
Resumo:
Photo-mosaicing techniques have become popular for seafloor mapping in various marine science applications. However, the common methods cannot accurately map regions with high relief and topographical variations. Ortho-mosaicing borrowed from photogrammetry is an alternative technique that enables taking into account the 3-D shape of the terrain. A serious bottleneck is the volume of elevation information that needs to be estimated from the video data, fused, and processed for the generation of a composite ortho-photo that covers a relatively large seafloor area. We present a framework that combines the advantages of dense depth-map and 3-D feature estimation techniques based on visual motion cues. The main goal is to identify and reconstruct certain key terrain feature points that adequately represent the surface with minimal complexity in the form of piecewise planar patches. The proposed implementation utilizes local depth maps for feature selection, while tracking over several views enables 3-D reconstruction by bundle adjustment. Experimental results with synthetic and real data validate the effectiveness of the proposed approach
Resumo:
The project aims at advancing the state of the art in the use of context information for classification of image and video data. The use of context in the classification of images has been showed of great importance to improve the performance of actual object recognition systems. In our project we proposed the concept of Multi-scale Feature Labels as a general and compact method to exploit the local and global context. The feature extraction from the discriminative probability or classification confidence label field is of great novelty. Moreover the use of a multi-scale representation of the feature labels lead to a compact and efficient description of the context. The goal of the project has been also to provide a general-purpose method and prove its suitability in different image/video analysis problem. The two-year project generated 5 journal publications (plus 2 under submission), 10 conference publications (plus 2 under submission) and one patent (plus 1 pending). Of these publications, a relevant number make use of the main result of this project to improve the results in detection and/or segmentation of objects.
Resumo:
A new multimodal biometric database designed and acquired within the framework of the European BioSecure Network of Excellence is presented. It is comprised of more than 600 individuals acquired simultaneously in three scenarios: 1) over the Internet, 2) in an office environment with desktop PC, and 3) in indoor/outdoor environments with mobile portable hardware. The three scenarios include a common part of audio/video data. Also, signature and fingerprint data have been acquired both with desktop PC and mobile portable hardware. Additionally, hand and iris data were acquired in the second scenario using desktop PC. Acquisition has been conducted by 11 European institutions. Additional features of the BioSecure Multimodal Database (BMDB) are: two acquisitionsessions, several sensors in certain modalities, balanced gender and age distributions, multimodal realistic scenarios with simple and quick tasks per modality, cross-European diversity, availability of demographic data, and compatibility with other multimodal databases. The novel acquisition conditions of the BMDB allow us to perform new challenging research and evaluation of eithermonomodal or multimodal biometric systems, as in the recent BioSecure Multimodal Evaluation campaign. A description of this campaign including baseline results of individual modalities from the new database is also given. The database is expected to beavailable for research purposes through the BioSecure Association during 2008.
Resumo:
The differentiation between benign and malignant focal liver lesions plays an important role in diagnosis of liver disease and therapeutic planning of local or general disease. This differentiation, based on characterization, relies on the observation of the dynamic vascular patterns (DVP) of lesions with respect to adjacent parenchyma, and may be assessed during contrast-enhanced ultrasound imaging after a bolus injection. For instance, hemangiomas (i.e., benign lesions) exhibit hyper-enhanced signatures over time, whereas metastases (i.e., malignant lesions) frequently present hyperenhanced foci during the arterial phase and always become hypo-enhanced afterwards. The objective of this work was to develop a new parametric imaging technique, aimed at mapping the DVP signatures into a single image called a DVP parametric image, conceived as a diagnostic aid tool for characterizing lesion types. The methodology consisted in processing a time sequence of images (DICOM video data) using four consecutive steps: (1) pre-processing combining image motion correction and linearization to derive an echo-power signal, in each pixel, proportional to local contrast agent concentration over time; (2) signal modeling, by means of a curve-fitting optimization, to compute a difference signal in each pixel, as the subtraction of adjacent parenchyma kinetic from the echopower signal; (3) classification of difference signals; and (4) parametric image rendering to represent classified pixels as a support for diagnosis. DVP parametric imaging was the object of a clinical assessment on a total of 146 lesions, imaged using different medical ultrasound systems. The resulting sensitivity and specificity were 97% and 91%, respectively, which compare favorably with scores of 81 to 95% and 80 to 95% reported in medical literature for sensitivity and specificity, respectively.
Resumo:
OBJECTIVE: The aim of this paper was to examine sexual knowledge, concerns and needs of youth with spina bifida (SB) to inform the medical community on ways to better support their sexual health. METHODS: As part of the Video Intervention/Prevention Assessment (VIA) - transitions, a prospective cohort study, 309 h of video data were collected from 14 participants (13-28 years old) with SB. Participants were loaned a video camcorder for 8-12 weeks to shoot visual narratives about any aspects of their lives. V/A visual narratives were analysed with grounded theory using NVivo. RESULTS: Out of 14 participants, 11 (six women) addressed issues surrounding romantic relationships and sexuality in their video clips. Analysis revealed shared concerns, questions and challenges regarding sexuality gathered under four main themes: romantic relationships, sexuality, fertility and parenthood, and need for more talk on sexuality. CONCLUSIONS: Youth with SB reported difficulties in finding answers to questions regarding their sexuality, romantic relationships and fertility. This study revealed a need for help from the medical community to inform and empower youth with SB in the area of sexual health. Through sexual and reproductive health education with patients and parents starting at an early age, medical providers can further encourage healthy emotional and physical development in adolescents transitioning into adulthood.
Resumo:
The focus of the present work was on 10- to 12-year-old elementary school students’ conceptual learning outcomes in science in two specific inquiry-learning environments, laboratory and simulation. The main aim was to examine if it would be more beneficial to combine than contrast simulation and laboratory activities in science teaching. It was argued that the status quo where laboratories and simulations are seen as alternative or competing methods in science teaching is hardly an optimal solution to promote students’ learning and understanding in various science domains. It was hypothesized that it would make more sense and be more productive to combine laboratories and simulations. Several explanations and examples were provided to back up the hypothesis. In order to test whether learning with the combination of laboratory and simulation activities can result in better conceptual understanding in science than learning with laboratory or simulation activities alone, two experiments were conducted in the domain of electricity. In these experiments students constructed and studied electrical circuits in three different learning environments: laboratory (real circuits), simulation (virtual circuits), and simulation-laboratory combination (real and virtual circuits were used simultaneously). In order to measure and compare how these environments affected students’ conceptual understanding of circuits, a subject knowledge assessment questionnaire was administered before and after the experimentation. The results of the experiments were presented in four empirical studies. Three of the studies focused on learning outcomes between the conditions and one on learning processes. Study I analyzed learning outcomes from experiment I. The aim of the study was to investigate if it would be more beneficial to combine simulation and laboratory activities than to use them separately in teaching the concepts of simple electricity. Matched-trios were created based on the pre-test results of 66 elementary school students and divided randomly into a laboratory (real circuits), simulation (virtual circuits) and simulation-laboratory combination (real and virtual circuits simultaneously) conditions. In each condition students had 90 minutes to construct and study various circuits. The results showed that studying electrical circuits in the simulation–laboratory combination environment improved students’ conceptual understanding more than studying circuits in simulation and laboratory environments alone. Although there were no statistical differences between simulation and laboratory environments, the learning effect was more pronounced in the simulation condition where the students made clear progress during the intervention, whereas in the laboratory condition students’ conceptual understanding remained at an elementary level after the intervention. Study II analyzed learning outcomes from experiment II. The aim of the study was to investigate if and how learning outcomes in simulation and simulation-laboratory combination environments are mediated by implicit (only procedural guidance) and explicit (more structure and guidance for the discovery process) instruction in the context of simple DC circuits. Matched-quartets were created based on the pre-test results of 50 elementary school students and divided randomly into a simulation implicit (SI), simulation explicit (SE), combination implicit (CI) and combination explicit (CE) conditions. The results showed that when the students were working with the simulation alone, they were able to gain significantly greater amount of subject knowledge when they received metacognitive support (explicit instruction; SE) for the discovery process than when they received only procedural guidance (implicit instruction: SI). However, this additional scaffolding was not enough to reach the level of the students in the combination environment (CI and CE). A surprising finding in Study II was that instructional support had a different effect in the combination environment than in the simulation environment. In the combination environment explicit instruction (CE) did not seem to elicit much additional gain for students’ understanding of electric circuits compared to implicit instruction (CI). Instead, explicit instruction slowed down the inquiry process substantially in the combination environment. Study III analyzed from video data learning processes of those 50 students that participated in experiment II (cf. Study II above). The focus was on three specific learning processes: cognitive conflicts, self-explanations, and analogical encodings. The aim of the study was to find out possible explanations for the success of the combination condition in Experiments I and II. The video data provided clear evidence about the benefits of studying with the real and virtual circuits simultaneously (the combination conditions). Mostly the representations complemented each other, that is, one representation helped students to interpret and understand the outcomes they received from the other representation. However, there were also instances in which analogical encoding took place, that is, situations in which the slightly discrepant results between the representations ‘forced’ students to focus on those features that could be generalised across the two representations. No statistical differences were found in the amount of experienced cognitive conflicts and self-explanations between simulation and combination conditions, though in self-explanations there was a nascent trend in favour of the combination. There was also a clear tendency suggesting that explicit guidance increased the amount of self-explanations. Overall, the amount of cognitive conflicts and self-explanations was very low. The aim of the Study IV was twofold: the main aim was to provide an aggregated overview of the learning outcomes of experiments I and II; the secondary aim was to explore the relationship between the learning environments and students’ prior domain knowledge (low and high) in the experiments. Aggregated results of experiments I & II showed that on average, 91% of the students in the combination environment scored above the average of the laboratory environment, and 76% of them scored also above the average of the simulation environment. Seventy percent of the students in the simulation environment scored above the average of the laboratory environment. The results further showed that overall students seemed to benefit from combining simulations and laboratories regardless of their level of prior knowledge, that is, students with either low or high prior knowledge who studied circuits in the combination environment outperformed their counterparts who studied in the laboratory or simulation environment alone. The effect seemed to be slightly bigger among the students with low prior knowledge. However, more detailed inspection of the results showed that there were considerable differences between the experiments regarding how students with low and high prior knowledge benefitted from the combination: in Experiment I, especially students with low prior knowledge benefitted from the combination as compared to those students that used only the simulation, whereas in Experiment II, only students with high prior knowledge seemed to benefit from the combination relative to the simulation group. Regarding the differences between simulation and laboratory groups, the benefits of using a simulation seemed to be slightly higher among students with high prior knowledge. The results of the four empirical studies support the hypothesis concerning the benefits of using simulation along with laboratory activities to promote students’ conceptual understanding of electricity. It can be concluded that when teaching students about electricity, the students can gain better understanding when they have an opportunity to use the simulation and the real circuits in parallel than if they have only the real circuits or only a computer simulation available, even when the use of the simulation is supported with the explicit instruction. The outcomes of the empirical studies can be considered as the first unambiguous evidence on the (additional) benefits of combining laboratory and simulation activities in science education as compared to learning with laboratories and simulations alone.