795 results for Annotated Video Clips
Abstract:
[EN]The use of new technologies to step up the interaction between humans and machines is the main evidence that faces are important in videos. We therefore propose a novel Face Video Database for the development, testing and verification of algorithms related to face-based applications and to facial recognition applications. In addition to facial expression videos, the database includes body videos. The videos are taken by three different cameras, working in real time, without varying illumination conditions.
Abstract:
[EN]This paper summarizes the proposal made by the SIANI team for the LifeCLEF 2015 Fish task. The approach makes use of standard detection techniques, applying a multiclass SVM-based classifier to sufficiently large Regions Of Interest (ROIs) automatically extracted from the provided video frames. The detection and classification modules were selected according to the best performance achieved on the validation dataset, which consists of 20 annotated videos. For that dataset, the best classification achieved with an ideal detection module reaches an accuracy of around 40%.
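As a rough illustration of this kind of pipeline, the following sketch trains a multiclass SVM on fixed-size, flattened grayscale ROIs; the feature choice, ROI size and function names are assumptions for the example, not details taken from the SIANI submission.

```python
import cv2
import numpy as np
from sklearn.svm import SVC

ROI_SIZE = (64, 64)  # assumed fixed size; the actual system may differ

def roi_features(roi_bgr):
    """Resize a detected ROI and flatten it into a feature vector."""
    gray = cv2.cvtColor(roi_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.resize(gray, ROI_SIZE)
    return gray.astype(np.float32).ravel() / 255.0

def train_fish_classifier(rois, labels):
    """Fit a multiclass SVM (natively multiclass in scikit-learn)."""
    X = np.stack([roi_features(r) for r in rois])
    clf = SVC(kernel="rbf", C=10.0, gamma="scale")
    clf.fit(X, labels)
    return clf

def classify_roi(clf, roi_bgr):
    """Predict the species label of a single ROI."""
    return clf.predict(roi_features(roi_bgr)[None, :])[0]
```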
Abstract:
Facial expression recognition is one of the most challenging research areas in the image recognition field and has been actively studied since the 1970s. For instance, smile recognition has been studied because the smile is considered an important facial expression in human communication and is therefore likely to be useful for human–machine interaction. Moreover, if a smile can be detected and its intensity estimated, it will open up the possibility of new applications in the future.
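A minimal sketch of the idea, using the stock Haar cascades shipped with opencv-python: detect the face, look for a smile inside its lower half, and use the raw number of cascade hits as a crude intensity proxy. The "intensity" heuristic is only an assumption for illustration, not a validated estimator.

```python
import cv2

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
smile_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_smile.xml")

def detect_smiles(frame_bgr):
    """Return (face_box, smile_count) pairs; smile_count is a rough intensity proxy."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    results = []
    for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
        lower_face = gray[y + h // 2:y + h, x:x + w]   # smiles live in the lower half
        smiles = smile_cascade.detectMultiScale(lower_face, 1.7, 20)
        results.append(((x, y, w, h), len(smiles)))
    return results
```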
Abstract:
[EN]Parliamentary websites have become one of the most important windows for citizens and the media to follow the activities of their legislatures and to hold parliaments to account. Therefore, most parliamentary institutions aim to provide new multimedia solutions capable of displaying on-demand video fragments of plenary activities. This paper presents a multimedia system for parliamentary institutions that produces on-demand video fragments through a website with linked information and public feedback that helps to explain the content shown in these fragments. A prototype implementation has been developed for the Canary Islands Parliament (Spain) and shows how traditional parliamentary streaming systems can be enhanced by the use of semantics and computer vision for video analytics...
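One small computer-vision building block of such a system that can be sketched generically is cutting a plenary recording into candidate fragments at visual shot boundaries. The histogram-correlation threshold below is an assumed value, and the actual system described in the paper combines this kind of analysis with semantic (linked-data) information.

```python
import cv2

def shot_boundaries(video_path, threshold=0.7):
    """Return frame indices where the grayscale histogram correlation
    with the previous frame drops below `threshold` (candidate cuts)."""
    cap = cv2.VideoCapture(video_path)
    cuts, prev_hist, idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        hist = cv2.calcHist([gray], [0], None, [64], [0, 256])
        cv2.normalize(hist, hist)
        if prev_hist is not None and \
                cv2.compareHist(prev_hist, hist, cv2.HISTCMP_CORREL) < threshold:
            cuts.append(idx)
        prev_hist, idx = hist, idx + 1
    cap.release()
    return cuts
```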
Abstract:
This thesis addresses the topic of video tracking, analyzing the main techniques, methodologies and tools for video analytics. The entire work was carried out at the company BitBang, from the gathering of information and useful material to the writing of the dissertation. At the same company I also completed my internship, during which I explored the practical aspects of web and video analytics, observing the field work of specialists in the sector and becoming familiar with data analysis tools through the use of the main web analytics platforms. To fully understand this topic, it was first necessary to cover the basics of web analytics. The classic web analytics methodologies are therefore illustrated, i.e. how to analyze visitor behaviour on web pages with the metrics best suited to the different types of business, up to the newer technique of event tracking. The latter emerged right after multimedia content spread across web pages, which changed the way users navigate and, consequently, created the need to track the new actions performed on that content in order to obtain a complete picture of the visitors' experience on the site. The data obtained with traditional web analytics methods are no longer sufficient; they must be integrated with new techniques, which are indispensable for a 360-degree view of everything that happens on the site. From here, video tracking, known as video analytics, is introduced. The main metrics for the analysis are illustrated, together with how best to exploit them depending on the type of website and the business purpose for which the video is used. To understand how to exploit video as a marketing tool and analyze visitor behaviour on it, it is first necessary to take a step back and give an overview of the main aspects related to it: from its production, to its embedding in web pages, the players used to do so, and its distribution through social network sites and across all the new devices and platforms connected to the network. In this regard, a more technical overview is provided, showing the differences between file formats and video formats, web delivery techniques, how to optimize the embedding of content in pages, a description of the most popular players for upload, and finally a brief look at the current situation in the war between open-source and proprietary video formats on the web. The final section covers the more practical and experimental part of the work. Chapter 7 describes the main functionalities of two of the most widely used web analytics platforms, one free, Google Analytics, and one commercial, Omniture SiteCatalyst, with particular attention to video tracking metrics and the differences between the two products. In addition, it seemed interesting to illustrate the characteristics of some platforms dedicated specifically to video analytics, analyzing their most interesting functionalities, even though I did not have the opportunity to test them in practice. The last chapter illustrates some practical applications of video analytics that I was able to observe during the internship and thesis period at the company.
In particular, it describes the problems encountered with the products used for tracking, the proposed solutions, and the questions that still remain open in this field.
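To make the notion of video event tracking concrete, here is a minimal sketch that reports a "play" event server-side through the (now legacy) Universal Analytics Measurement Protocol; the tracking ID is a placeholder and the event category/action names are assumptions for illustration, not the configuration actually used in the thesis.

```python
import uuid
import requests

GA_ENDPOINT = "https://www.google-analytics.com/collect"
TRACKING_ID = "UA-XXXXXXX-1"  # placeholder property ID

def track_video_event(action, video_title, client_id=None):
    """Send a video interaction event (e.g. play, pause, complete) to GA."""
    payload = {
        "v": "1",                      # Measurement Protocol version
        "tid": TRACKING_ID,
        "cid": client_id or str(uuid.uuid4()),
        "t": "event",
        "ec": "Video",                 # event category
        "ea": action,                  # event action, e.g. "play"
        "el": video_title,             # event label
    }
    requests.post(GA_ENDPOINT, data=payload, timeout=5)

# Example: track_video_event("play", "product-demo.mp4")
```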
Abstract:
The construction and use of multimedia corpora have been advocated for a while in the literature as one of the expected future application fields of Corpus Linguistics. This research project represents a pioneering experience aimed at applying a data-driven methodology to the study of the field of AVT, similarly to what has been done in the last few decades in the macro-field of Translation Studies. This research was based on the experience of Forlixt 1, the Forlì Corpus of Screen Translation, developed at the University of Bologna’s Department of Interdisciplinary Studies in Translation, Languages and Culture. As a matter of fact, in order to quantify strategies of linguistic transfer of an AV product, we need to take into consideration not only the linguistic aspect of such a product but all the meaning-making resources deployed in the filmic text. Given that one major benefit of Forlixt 1 is the combination of audiovisual and textual data, this corpus allows the user to access primary data for scientific investigation, without having to rely on pre-processed material such as traditional annotated transcriptions. Based on this rationale, the first chapter of the thesis sets out to illustrate the state of the art of research in the disciplinary fields involved. The primary objective was to underline the main repercussions on multimedia texts resulting from the interaction of a double support, audio and video, and, accordingly, on the procedures, means, and methods adopted in their translation. By drawing on previous research in semiotics and film studies, the relevant codes at work in the visual and acoustic channels were outlined. Subsequently, we concentrated on the analysis of the verbal component and on the peculiar characteristics of filmic orality as opposed to spontaneous dialogic production. In the second part, an overview of the main AVT modalities was presented (dubbing, voice-over, interlinguistic and intra-linguistic subtitling, audio-description, etc.) in order to define the different technologies, processes and professional qualifications that this umbrella term presently includes. The second chapter focuses diachronically on various theories’ contribution to the application of Corpus Linguistics’ methods and tools to the field of Translation Studies (i.e. Descriptive Translation Studies, Polysystem Theory). In particular, we discussed how the use of corpora can favourably help reduce the gap existing between qualitative and quantitative approaches. Subsequently, we reviewed the tools traditionally employed by Corpus Linguistics for the construction of traditional “written language” corpora, to assess whether and how they can be adapted to meet the needs of multimedia corpora. In particular, we reviewed existing speech and spoken corpora, as well as multimedia corpora specifically designed to investigate Translation. The third chapter reviews Forlixt 1's main development steps, from a technical (IT design principles, data query functions) and methodological point of view, laying down extensive scientific foundations for the annotation methods adopted, which presently encompass categories of a pragmatic, sociolinguistic, linguacultural and semiotic nature. Finally, we described the main query tools (free search, guided search, advanced search and combined search) and the main intended uses of the database from a pedagogical perspective.
The fourth chapter lists the specific compilation criteria retained, as well as statistics for the two sub-corpora, presenting data broken down by language pair (French-Italian and German-Italian) and genre (cinema comedies, television soap operas and crime series). Next, we concentrated on the discussion of the results obtained from the analysis of summary tables reporting the frequency of categories applied to the French-Italian sub-corpus. The detailed observation of the distribution of categories identified in the original and dubbed corpus allowed us to empirically confirm some of the theories put forward in the literature, notably concerning the nature of the filmic text, the dubbing process and the features of dubbed Italian. This was possible by looking into some of the most problematic aspects, such as the rendering of socio-linguistic variation. The corpus equally allowed us to consider hitherto neglected aspects, such as pragmatic, prosodic, kinetic, facial, and semiotic elements, and their combination. At the end of this first exploration, some specific observations concerning possible macro-translation trends were made for each type of sub-genre considered (cinematic and TV genre). On the grounds of this first quantitative investigation, the fifth chapter set out to examine the data further by applying ad hoc models of analysis. Given the virtually infinite number of combinations of the categories adopted, and of the latter with searchable textual units, three possible qualitative and quantitative methods were designed, each of which was to concentrate on a particular translation dimension of the filmic text. The first one was the cultural dimension, which specifically focused on the rendering of selected cultural references and on the investigation of recurrent translation choices and strategies justified on the basis of the occurrence of specific clusters of categories. The second analysis was conducted on the linguistic dimension by exploring the occurrence of phrasal verbs in the Italian dubbed corpus and by ascertaining the influence of possible semiotic traits, such as gestures and facial expressions, on the adoption of related translation strategies. Finally, the main aim of the third study was to verify whether, under which circumstances, and through which modality, graphic and iconic elements were translated into Italian from an original corpus of both German and French films. After having reviewed the main translation techniques at work, an exhaustive account of possible causes for their non-translation was equally provided. By way of conclusion, the discussion of the results obtained from the distribution of annotation categories on the French-Italian corpus, as well as the application of specific models of analysis, allowed us to underline the possible advantages and drawbacks of adopting a corpus-based approach to AVT studies. Even though possible updates and improvements were proposed to help solve some of the problems identified, it is argued that the added value of Forlixt 1 lies ultimately in having created a valuable instrument that makes it possible to carry out empirically sound contrastive studies which may usefully be replicated on different language pairs and several types of multimedia texts. Furthermore, multimedia corpora can also play a crucial role in L2 and translation teaching, two disciplines in which their use still lacks systematic investigation.
Abstract:
In this work, after a description of the procedures adopted by the DVB forum for the creation of digital broadcasting standards, the trends of the digital broadcasting market are considered, and the standard used for digital terrestrial television broadcasting, DVB-T, and its evolution, DVB-T2, are analyzed in detail.
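As an example of the kind of system parameters such an analysis deals with, the net bitrate of a DVB-T configuration follows directly from the number of data carriers, the constellation, the convolutional code rate, the Reed-Solomon(204,188) overhead and the OFDM symbol duration including the guard interval. The sketch below reproduces the textbook figure of about 24.13 Mbit/s for the 8 MHz, 2K mode with 64-QAM, code rate 2/3 and guard interval 1/32.

```python
def dvbt_net_bitrate(bits_per_carrier, code_rate, guard_fraction,
                     data_carriers=1512, t_useful=224e-6):
    """Approximate DVB-T net (MPEG-2 TS) bitrate in bit/s for 8 MHz, 2K mode."""
    rs_efficiency = 188.0 / 204.0                 # Reed-Solomon(204,188) overhead
    symbol_time = t_useful * (1 + guard_fraction)  # useful part plus guard interval
    bits_per_ofdm_symbol = data_carriers * bits_per_carrier * code_rate * rs_efficiency
    return bits_per_ofdm_symbol / symbol_time

# 64-QAM (6 bit/carrier), code rate 2/3, guard interval 1/32 -> ~24.13 Mbit/s
print(dvbt_net_bitrate(6, 2 / 3, 1 / 32) / 1e6)
```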
Abstract:
The thesis addresses the topic of live streaming in P2P systems, with particular reference to Sopcast, a P2PTV application. A historical overview of the birth and development of streaming is given, and the characteristics, the communication protocol and the most widespread models for P2P live streaming are described. It also discusses how quality of service is guaranteed and evaluates the performance of a P2PTV service.
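A common building block in pull-based mesh P2P live streaming (the family of models discussed in the thesis) is chunk scheduling from neighbour buffer maps. The sketch below implements a simple rarest-first request policy over assumed data structures; it is a generic illustration, not Sopcast's proprietary protocol.

```python
from collections import Counter

def schedule_requests(missing_chunks, neighbour_buffer_maps, max_requests=10):
    """Pick which missing chunks to request, rarest (least replicated) first.

    missing_chunks: set of chunk ids this peer still needs inside its window.
    neighbour_buffer_maps: dict peer_id -> set of chunk ids that peer holds.
    Returns a list of (chunk_id, peer_id) request pairs.
    """
    availability = Counter()
    for held in neighbour_buffer_maps.values():
        availability.update(held & missing_chunks)

    requests = []
    for chunk in sorted(availability, key=availability.get):        # rarest first
        holders = [p for p, held in neighbour_buffer_maps.items() if chunk in held]
        load = Counter(peer for _, peer in requests)
        requests.append((chunk, min(holders, key=lambda p: load[p])))  # least-loaded holder
        if len(requests) >= max_requests:
            break
    return requests
```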
Abstract:
This work was carried out by the author during his PhD course in Electrical, Computer Science and Telecommunication at the University of Bologna, Faculty of Engineering, Italy. All the documentation reported here is a summary of years of work under the supervision of Prof. Oreste Andrisano, coordinator of the Wireless Communication Laboratory - WiLab, in Bologna. The subject of this thesis is the transmission of video in the context of heterogeneous networks and, in particular, over a wireless channel. All the instrumentation used for the characterization of the telecommunication systems belongs to CNR (National Research Council), CNIT (Italian Inter-University Center), and DEIS (Dept. of Electrical, Computer Science, and Systems). From November 2009 to July 2010, the author worked abroad in collaboration with DLR - German Aerospace Center in Munich, Germany, in the channel coding area, developing a general-purpose decoder machine able to decode a large family of iterative codes. A patent concerning Doubly Generalized Low-Density Parity-Check codes has been produced by the author, as well as several important scientific papers published in IEEE journals and conference proceedings.
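To give a flavour of the iterative decoding family mentioned above, here is a minimal hard-decision bit-flipping decoder for a binary code given its parity-check matrix. It is only an illustrative sketch under assumed inputs, far simpler than the general-purpose Doubly Generalized LDPC decoder developed in the thesis.

```python
import numpy as np

def bit_flip_decode(H, y, max_iters=50):
    """Hard-decision bit-flipping decoding.

    H: (m, n) binary parity-check matrix (NumPy array).
    y: length-n vector of received hard decisions.
    Returns (codeword_estimate, success_flag).
    """
    x = np.array(y, dtype=np.uint8) % 2
    for _ in range(max_iters):
        syndrome = H.dot(x) % 2
        if not syndrome.any():
            return x, True                        # all parity checks satisfied
        # number of unsatisfied checks each bit participates in
        unsatisfied = H[syndrome == 1].sum(axis=0)
        x[unsatisfied == unsatisfied.max()] ^= 1  # flip the most suspicious bits
    return x, False
```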
Abstract:
Skype is one of the well-known applications that has guided the evolution of real-time video streaming and has become one of the most widely used programs in everyday life. It provides VoIP audio/video calls as well as messaging chat and file transfer. Many versions are available, covering all the principal operating systems, such as Windows, Macintosh and Linux, as well as mobile systems. Voice quality has driven Skype's success since its birth in 2003, and its peer-to-peer architecture has allowed worldwide diffusion. After the introduction of video calls in 2006, Skype became a complete solution for communication between two or more people. As a primarily video conferencing application, Skype assumes certain characteristics of the delivered video in order to optimize its perceived quality. However, in recent years, and with the release of SkypeKit, many new Skype video-enabled devices have come out, especially in the mobile world. This forced a change to the traditional recording, streaming and receiving settings, allowing for a wide range of network and content dynamics. Video calls are no longer based on static "chatting": mobile devices have opened new possibilities and can be used in several scenarios. For instance, lecture streaming or one-to-one mobile video conferences exhibit more dynamics, as both caller and callee might be on the move. Most of these cases differ from "head & shoulders"-only content. Therefore, Skype needs to optimize its video streaming engine to cover more video types. Heterogeneous connections require different behaviors and solutions, and Skype must cope with this variety to maintain a certain quality independently of the connection used. Part of the present work focuses on analyzing Skype's behavior depending on the video content. Since the Skype protocol is proprietary, most of the studies so far have tried to characterize its traffic and to reverse engineer its protocol. However, questions related to the behavior of Skype, especially regarding quality as perceived by users, remain unanswered. We will study Skype's video codec capabilities and video quality assessment. Another motivation for our work is the design of a mechanism that estimates the perceived cost of network conditions on Skype video delivery. To this extent, we will try to assess in an objective way the impact of network impairments on the perceived quality of a Skype video call. Traditional video streaming schemes lack the flexibility and adaptivity that Skype tries to achieve at the edge of the network. Our contribution will rely on a testbed and the consequent objective video quality analysis that we will carry out on input videos. We will stream raw video files with Skype over an impaired channel and then record them at the receiver side in order to analyze them with objective quality-of-experience metrics.
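For the objective analysis of the sender/receiver frame pairs, a standard full-reference metric such as PSNR can be computed frame by frame. The sketch below assumes the two recordings have already been decoded to temporally aligned frames of equal resolution, which in practice requires careful synchronization.

```python
import cv2
import numpy as np

def psnr(frame_ref, frame_deg):
    """Peak signal-to-noise ratio (dB) between a reference and a degraded frame."""
    mse = np.mean((frame_ref.astype(np.float64) - frame_deg.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(255.0 ** 2 / mse)

def mean_psnr(ref_path, deg_path):
    """Average PSNR over temporally aligned frames of two video files."""
    ref, deg = cv2.VideoCapture(ref_path), cv2.VideoCapture(deg_path)
    scores = []
    while True:
        ok1, f1 = ref.read()
        ok2, f2 = deg.read()
        if not (ok1 and ok2):
            break
        scores.append(psnr(f1, f2))
    ref.release()
    deg.release()
    return sum(scores) / len(scores) if scores else None
```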
Abstract:
A main objective of human movement analysis is the quantitative description of joint kinematics and kinetics. This information can be of great value in addressing clinical problems in both orthopaedics and motor rehabilitation. Previous studies have shown that the assessment of kinematics and kinetics from stereophotogrammetric data requires a setup phase, special equipment and expertise to operate. Moreover, this procedure may cause a feeling of uneasiness in the subjects and may hinder their walking. The general aim of this thesis is the implementation and evaluation of new 2D markerless techniques, in order to contribute to the development of an alternative to traditional stereophotogrammetric techniques. At first, the focus of the study was the estimation of the ankle-foot complex kinematics during the stance phase of gait. Two particular cases were considered: subjects barefoot and subjects wearing ankle socks. The use of socks was investigated in view of the development of the hybrid method proposed in this work. Different algorithms were analyzed, evaluated and implemented in order to obtain a 2D markerless solution for estimating the kinematics in both cases. The proposed technique was validated against a traditional stereophotogrammetric system. Its implementation leads towards an easy-to-configure (and more comfortable for the subject) alternative to the traditional stereophotogrammetric system. The abovementioned technique was then improved so that the measurement of knee flexion/extension could also be performed with a 2D markerless technique. The main changes in the implementation concerned occlusion handling and background segmentation. With the additional constraints, the proposed technique was applied to the estimation of knee flexion/extension and compared with a traditional stereophotogrammetric system. Results showed that the knee flexion/extension estimates from the traditional stereophotogrammetric system and the proposed markerless system were highly comparable, making the latter a potential alternative for clinical use. A contribution was also given to the estimation of lower limb kinematics in children with cerebral palsy (CP). For this purpose, a hybrid technique was proposed, which uses high-cut underwear and ankle socks as "segmental markers" in combination with a markerless methodology. The proposed hybrid technique differs from the abovementioned markerless technique in the algorithm chosen. Results showed that the proposed hybrid technique can become a simple and low-cost alternative to traditional stereophotogrammetric systems.
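Once a markerless pipeline has located the hip, knee and ankle in the image plane, the knee flexion/extension angle reduces to the planar angle at the knee between the thigh and shank segments. The landmark names below are generic assumptions for illustration, not the exact outputs of the proposed algorithms.

```python
import numpy as np

def knee_flexion_angle(hip_xy, knee_xy, ankle_xy):
    """2D knee flexion angle in degrees: 0 deg = fully extended (straight leg)."""
    thigh = np.asarray(hip_xy, dtype=float) - np.asarray(knee_xy, dtype=float)
    shank = np.asarray(ankle_xy, dtype=float) - np.asarray(knee_xy, dtype=float)
    cos_angle = np.dot(thigh, shank) / (np.linalg.norm(thigh) * np.linalg.norm(shank))
    included = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    return 180.0 - included      # flexion measured from the fully extended position

# Example (image coordinates in pixels): a nearly straight leg gives a few degrees
print(knee_flexion_angle((120, 80), (130, 200), (125, 320)))
```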