19 resultados para Audio-visual content classification

em Repositório Científico do Instituto Politécnico de Lisboa - Portugal


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Personal memories composed of digital pictures are very popular at the moment. To retrieve these media items annotation is required. During the last years, several approaches have been proposed in order to overcome the image annotation problem. This paper presents our proposals to address this problem. Automatic and semi-automatic learning methods for semantic concepts are presented. The automatic method is based on semantic concepts estimated using visual content, context metadata and audio information. The semi-automatic method is based on results provided by a computer game. The paper describes our proposals and presents their evaluations.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Dissertação apresentada à Escola Superior de Comunicação Social como parte dos requisitos para obtenção de grau de mestre em Audiovisual e Multimédia.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Dissertação apresentada para obtenção do grau de Mestre em Educação Matemática na Educação Pré-Escolar e nos 1º e 2º Ciclos do Ensino Básico na especialidade de Didática da Matemática

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Les méthodes modernes d’enseignement exigent de recréer le milieu de la langue étudiée, de faire parler les élèves dans des situations différentes. En Géorgie, l’enseignement de la langue étrangère s’effectue à partir de 6 ans, en même temps que celui de la langue maternelle. Les élèves apprennent à écrire en français après l’apprentissage de l’écriture en géorgien. A l’âge de 7-10 ans, ils connaissent déjà 3 alphabets différents : le géorgien, le latin et le cyrillique. L’objectif de cet article est de proposer une méthode qui pourra faciliter l’apprentissage du français aux non francophones grâce aux moyens audiovisuels qui sont très efficaces surtout au moment quand l’enfant ne sait ni lire, ni écrire en langue étrangère. Cependant, les moyens audiovisuels doivent être utilisés à des doses normales sans empêcher l’activité de l’élève.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Relatório Final de Estágio apresentado à Escola Superior de Dança, com vista à obtenção do grau de Mestre em Ensino de Dança.

Relevância:

40.00% 40.00%

Publicador:

Resumo:

Relevant past events can be remembered when visualizing related pictures. The main difficulty is how to find these photos in a large personal collection. Query definition and image annotation are key issues to overcome this problem. The former is relevant due to the diversity of the clues provided by our memory when recovering a past moment and the later because images need to be annotated with information regarding those clues to be retrieved. Consequently, tools to recover past memories should deal carefully with these two tasks. This paper describes a user interface designed to explore pictures from personal memories. Users can query the media collection in several ways and for this reason an iconic visual language to define queries is proposed. Automatic and semi-automatic annotation is also performed using the image content and the audio information obtained when users show their images to others. The paper also presents the user interface evaluation based on tests with 58 participants.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

This paper presents an integrated system for vehicle classification. This system aims to classify vehicles using different approaches: 1) based on the height of the first axle and_the number of axles; 2) based on volumetric measurements and; 3) based on features extracted from the captured image of the vehicle. The system uses a laser sensor for measurements and a set of image analysis algorithms to compute some visual features. By combining different classification methods, it is shown that the system improves its accuracy and robustness, enabling its usage in more difficult environments satisfying the proposed requirements established by the Portuguese motorway contractor BRISA.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

In music genre classification, most approaches rely on statistical characteristics of low-level features computed on short audio frames. In these methods, it is implicitly considered that frames carry equally relevant information loads and that either individual frames, or distributions thereof, somehow capture the specificities of each genre. In this paper we study the representation space defined by short-term audio features with respect to class boundaries, and compare different processing techniques to partition this space. These partitions are evaluated in terms of accuracy on two genre classification tasks, with several types of classifiers. Experiments show that a randomized and unsupervised partition of the space, used in conjunction with a Markov Model classifier lead to accuracies comparable to the state of the art. We also show that unsupervised partitions of the space tend to create less hubs.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Video coding technologies have played a major role in the explosion of large market digital video applications and services. In this context, the very popular MPEG-x and H-26x video coding standards adopted a predictive coding paradigm, where complex encoders exploit the data redundancy and irrelevancy to 'control' much simpler decoders. This codec paradigm fits well applications and services such as digital television and video storage where the decoder complexity is critical, but does not match well the requirements of emerging applications such as visual sensor networks where the encoder complexity is more critical. The Slepian Wolf and Wyner-Ziv theorems brought the possibility to develop the so-called Wyner-Ziv video codecs, following a different coding paradigm where it is the task of the decoder, and not anymore of the encoder, to (fully or partly) exploit the video redundancy. Theoretically, Wyner-Ziv video coding does not incur in any compression performance penalty regarding the more traditional predictive coding paradigm (at least for certain conditions). In the context of Wyner-Ziv video codecs, the so-called side information, which is a decoder estimate of the original frame to code, plays a critical role in the overall compression performance. For this reason, much research effort has been invested in the past decade to develop increasingly more efficient side information creation methods. This paper has the main objective to review and evaluate the available side information methods after proposing a classification taxonomy to guide this review, allowing to achieve more solid conclusions and better identify the next relevant research challenges. After classifying the side information creation methods into four classes, notably guess, try, hint and learn, the review of the most important techniques in each class and the evaluation of some of them leads to the important conclusion that the side information creation methods provide better rate-distortion (RD) performance depending on the amount of temporal correlation in each video sequence. It became also clear that the best available Wyner-Ziv video coding solutions are almost systematically based on the learn approach. The best solutions are already able to systematically outperform the H.264/AVC Intra, and also the H.264/AVC zero-motion standard solutions for specific types of content. (C) 2013 Elsevier B.V. All rights reserved.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

PURPOSE: Screening programs to detect visual abnormalities in children vary among countries. The aim of this study is to describe experts' perception of best practice guidelines and competency framework for visual screening in children. METHODS: A qualitative focus group technique was applied during the Portuguese national orthoptic congress to obtain the perception of an expert panel of 5 orthoptists and 2 ophthalmologists with experience in visual screening for children (mean age 53.43 years, SD ± 9.40). The panel received in advance a script with the description of three tuning competencies dimensions (instrumental, systemic, and interpersonal) for visual screening. The session was recorded in video and audio. Qualitative data were analyzed using a categorical technique. RESULTS: According to experts' views, six tests (35.29%) have to be included in a visual screening: distance visual acuity test, cover test, bi-prism or 4/6(Δ) prism, fusion, ocular movements, and refraction. Screening should be performed according to the child age before and after 3 years of age (17.65%). The expert panel highlighted the influence of the professional experience in the application of a screening protocol (23.53%). They also showed concern about the false negatives control (23.53%). Instrumental competencies were the most cited (54.09%), followed by interpersonal (29.51%) and systemic (16.4%). CONCLUSIONS: Orthoptists should have professional experience before starting to apply a screening protocol. False negative results are a concern that has to be more thoroughly investigated. The proposed framework focuses on core competencies highlighted by the expert panel. Competencies programs could be important do develop better screening programs.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

PURPOSE: Fatty liver disease (FLD) is an increasing prevalent disease that can be reversed if detected early. Ultrasound is the safest and ubiquitous method for identifying FLD. Since expert sonographers are required to accurately interpret the liver ultrasound images, lack of the same will result in interobserver variability. For more objective interpretation, high accuracy, and quick second opinions, computer aided diagnostic (CAD) techniques may be exploited. The purpose of this work is to develop one such CAD technique for accurate classification of normal livers and abnormal livers affected by FLD. METHODS: In this paper, the authors present a CAD technique (called Symtosis) that uses a novel combination of significant features based on the texture, wavelet transform, and higher order spectra of the liver ultrasound images in various supervised learning-based classifiers in order to determine parameters that classify normal and FLD-affected abnormal livers. RESULTS: On evaluating the proposed technique on a database of 58 abnormal and 42 normal liver ultrasound images, the authors were able to achieve a high classification accuracy of 93.3% using the decision tree classifier. CONCLUSIONS: This high accuracy added to the completely automated classification procedure makes the authors' proposed technique highly suitable for clinical deployment and usage.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Steatosis, also known as fatty liver, corresponds to an abnormal retention of lipids within the hepatic cells and reflects an impairment of the normal processes of synthesis and elimination of fat. Several causes may lead to this condition, namely obesity, diabetes, or alcoholism. In this paper an automatic classification algorithm is proposed for the diagnosis of the liver steatosis from ultrasound images. The features are selected in order to catch the same characteristics used by the physicians in the diagnosis of the disease based on visual inspection of the ultrasound images. The algorithm, designed in a Bayesian framework, computes two images: i) a despeckled one, containing the anatomic and echogenic information of the liver, and ii) an image containing only the speckle used to compute the textural features. These images are computed from the estimated RF signal generated by the ultrasound probe where the dynamic range compression performed by the equipment is taken into account. A Bayes classifier, trained with data manually classified by expert clinicians and used as ground truth, reaches an overall accuracy of 95% and a 100% of sensitivity. The main novelties of the method are the estimations of the RF and speckle images which make it possible to accurately compute textural features of the liver parenchyma relevant for the diagnosis.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Mestrado em Radiações Aplicadas às Tecnologias da Saúde - Ramo de especialização: Imagem por Ressonância Magnética

Relevância:

30.00% 30.00%

Publicador:

Resumo:

Background - Medical image perception research relies on visual data to study the diagnostic relationship between observers and medical images. A consistent method to assess visual function for participants in medical imaging research has not been developed and represents a significant gap in existing research. Methods - Three visual assessment factors appropriate to observer studies were identified: visual acuity, contrast sensitivity, and stereopsis. A test was designed for each, and 30 radiography observers (mean age 31.6 years) participated in each test. Results - Mean binocular visual acuity for distance was 20/14 for all observers. The difference between observers who did and did not use corrective lenses was not statistically significant (P = .12). All subjects had a normal value for near visual acuity and stereoacuity. Contrast sensitivity was better than population norms. Conclusion - All observers had normal visual function and could participate in medical imaging visual analysis studies. Protocols of evaluation and populations norms are provided. Further studies are necessary to understand fully the relationship between visual performance on tests and diagnostic accuracy in practice.

Relevância:

30.00% 30.00%

Publicador:

Resumo:

O processo de envelhecimento fisiológico é marcado por um decréscimo das capacidades motoras, redução da força, flexibilidade, função visual, entre outros. Adicionalmente, alterações patológicas do sistema visual com impacto na função visual podem alterar o equilíbrio e aumentar o risco de quedas. Os indivíduos com idade ≥ 50 anos representam entre 65% a 82% dos casos de baixa visão e cegueira. O risco de quedas é superior em mulheres com acuidade visual ≤0,5. Objectivo do estudo: avaliar a relação entre a função visual e o risco de quedas.