745 results for video-database


Relevance:

100.00% 100.00%

Publisher:

Abstract:

[EN] The use of new technologies to improve interaction between humans and machines is the main evidence that faces are important in videos. We therefore propose a novel Face Video Database for the development, testing and verification of algorithms for face-based and facial recognition applications. In addition to facial expression videos, the database includes body videos. The videos are taken by three different cameras, working in real time, without varying illumination conditions.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This paper introduces a database of freely available stereo-3D content designed to facilitate research in stereo post-production. It describes the structure and content of the database and provides some details about how the material was gathered. The database includes examples of many of the scenarios characteristic of broadcast footage. Material was gathered at different locations, including a studio with controlled lighting and both indoor and outdoor on-location sites with more restricted lighting control. The database also includes video sequences with accompanying 3D audio data recorded in an Ambisonics format. An intended consequence of gathering the material is that the database contains examples of degradations that would commonly be present in real-world scenarios. This paper describes one such artefact, caused by uneven exposure in the stereo views, which leads to saturation in the over-exposed view. An algorithm for the restoration of this artefact is proposed in order to highlight the usefulness of the database.

Relevance:

80.00% 80.00%

Publisher:

Abstract:

With rapid advances in video processing technologies and ever-faster growth in network bandwidth, the popularity of video content publishing and sharing has made similarity search an indispensable operation for retrieving videos of interest to users. Video similarity is usually measured by the percentage of similar frames shared by two video sequences, and each frame is typically represented as a high-dimensional feature vector. Unfortunately, the high complexity of video content poses the following major challenges for fast retrieval: (a) effective and compact video representations, (b) efficient similarity measurements, and (c) efficient indexing on the compact representations. In this paper, we propose a number of methods to achieve fast similarity search for very large video databases. First, each video sequence is summarized into a small number of clusters, each of which contains similar frames and is represented by a novel compact model called Video Triplet (ViTri). ViTri models a cluster as a tightly bounded hypersphere described by its position, radius, and density. The ViTri similarity is measured by the volume of intersection between two hyperspheres multiplied by the minimal density, i.e., the estimated number of similar frames shared by two clusters. The total number of similar frames is then estimated to derive the overall similarity between two video sequences. Hence the time complexity of the video similarity measure can be reduced greatly. To further reduce the number of similarity computations on ViTris, we introduce a new one-dimensional transformation technique which rotates and shifts the original axis system using PCA in such a way that the original inter-distance between two high-dimensional vectors is maximally retained after mapping. An efficient B+-tree is then built on the transformed one-dimensional values of ViTris' positions. Such a transformation enables the B+-tree to achieve its optimal performance by quickly filtering out a large portion of non-similar ViTris. Our extensive experiments on large real video datasets demonstrate the effectiveness of our proposals, which significantly outperform existing methods.
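
The filter-then-verify idea behind the one-dimensional mapping can be illustrated with a small sketch. It is not the paper's method: it assumes the principal direction has already been computed (the PCA step is omitted), it indexes plain points rather than ViTri hyperspheres, and it uses a sorted list with binary search as a stand-in for the B+-tree. All function names are hypothetical. The property it relies on is that the distance between two projections onto a unit-length direction never exceeds the distance between the original vectors, so a range query on the 1-D keys cannot miss a true match.

```python
import bisect
import math


def project(vec, direction):
    """Scalar projection of vec onto a unit-length direction."""
    return sum(v * d for v, d in zip(vec, direction))


def build_index(points, direction):
    """Sorted (key, point_id) list; stands in for a B+-tree on the 1-D keys."""
    return sorted((project(p, direction), i) for i, p in enumerate(points))


def range_candidates(index, query_key, radius):
    """Ids whose 1-D key lies within `radius` of the query key."""
    lo = bisect.bisect_left(index, (query_key - radius, -1))
    hi = bisect.bisect_right(index, (query_key + radius, math.inf))
    return [i for _, i in index[lo:hi]]


def search(points, index, direction, query, radius):
    """Filter on the 1-D keys, then verify survivors with the full distance."""
    cands = range_candidates(index, project(query, direction), radius)
    return sorted(i for i in cands if math.dist(points[i], query) <= radius)


# Hypothetical 2-D data; the direction is assumed to come from PCA.
points = [(0, 0), (1, 0), (5, 5), (0.5, 0.5), (10, 0), (0.5, 3.0)]
direction = (1.0, 0.0)
index = build_index(points, direction)
matches = search(points, index, direction, (0.6, 0.1), 1.0)
```

Points 2 and 4 are discarded by the key range alone; point 5 passes the 1-D filter but is rejected by the full distance check, which is why the verification step is still needed.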

Relevance:

70.00% 70.00%

Publisher:

Abstract:

The world is rapidly adapting to visual communication. Online services like YouTube and Vine show that video is no longer the domain of broadcast television alone. Video is used for different purposes such as entertainment, information, education and communication. The rapid growth of today's video archives, with only sparse editorial data available, makes their retrieval a major problem. Humans perceive a video as a complex interplay of cognitive concepts. As a result, there is a need to build a bridge between numeric values and semantic concepts, establishing a connection that will facilitate video retrieval by humans. The critical aspect of this bridge is video annotation, which can be done manually or automatically. Manual annotation is very tedious, subjective and expensive, so automatic annotation is being actively studied. In this thesis we focus on the automatic annotation of multimedia content, namely the use of analysis techniques for information retrieval that make it possible to automatically extract metadata from video in a videomail system. This includes the identification of text, people, actions, spaces and objects, including animals and plants. It will thus be possible to align multimedia content with the text of the email message and to create applications for semantic video database indexing and retrieval.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

A real-time, large-scale part-to-part video matching algorithm, based on the cross-correlation of the intensity of motion curves, is proposed with a view to originality recognition, video database cleansing, copyright enforcement, video tagging or video result re-ranking. Moreover, it is suggested how the most representative hashes and distance functions (strada, discrete cosine transform (DCT), Marr-Hildreth and radial) should be integrated so that the matching algorithm is invariant to blur, compression and rotation distortions, with the two blur parameters (R and a second one) in [1, 20] × [1, 8], image sizes from 512×512 to 32×32 pixels², and rotations from 10° to 180°. The DCT hash is invariant against blur and compression up to 64×64 pixels². Nevertheless, although its performance against rotation is the best, with a success rate of up to 70%, it should be combined with the Marr-Hildreth distance function. With the latter, the image selected by the DCT hash should be at a distance lower than 1.15 times the Marr-Hildreth minimum distance.
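
As a rough illustration of matching by cross-correlation of motion-intensity curves, the sketch below slides a short query curve over a longer reference curve and reports the offset with the highest normalized cross-correlation. It is a minimal sketch under stated assumptions, not the algorithm of the abstract: motion intensity is assumed here to be the mean absolute difference between consecutive frames, frames are flat lists of pixel values, and all names and data are hypothetical.

```python
import math


def motion_intensity(frames):
    """Mean absolute pixel difference between consecutive frames."""
    return [
        sum(abs(a - b) for a, b in zip(f0, f1)) / len(f0)
        for f0, f1 in zip(frames, frames[1:])
    ]


def ncc(x, y):
    """Normalized cross-correlation of two equal-length sequences, in [-1, 1]."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    dx = [v - mx for v in x]
    dy = [v - my for v in y]
    denom = math.sqrt(sum(v * v for v in dx) * sum(v * v for v in dy))
    if denom == 0:
        return 0.0  # a flat curve carries no shape information
    return sum(a * b for a, b in zip(dx, dy)) / denom


def best_offset(query, reference):
    """Slide query over reference (len(query) <= len(reference)).

    Returns (offset, score) for the best-matching window.
    """
    n = len(query)
    scores = [
        (ncc(query, reference[k:k + n]), k)
        for k in range(len(reference) - n + 1)
    ]
    score, offset = max(scores)
    return offset, score


# Hypothetical curves: the query is the reference segment [3, 7, 2],
# rescaled and shifted (2 * v + 1) to mimic a re-encoded copy.
reference = [0, 1, 0, 3, 7, 2, 0, 1, 0, 0]
query = [7, 15, 5]
offset, score = best_offset(query, reference)
```

Because the correlation is normalized, a copy whose motion curve is scaled or shifted in amplitude still matches its source segment with a score close to 1.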

Relevance:

70.00% 70.00%

Publisher:

Abstract:

With the proliferation of multimedia data and ever-growing demand for multimedia applications, there is an increasing need for efficient and effective indexing, storage and retrieval of multimedia data, such as graphics, images, animation, video, audio and text. Due to the special characteristics of multimedia data, Multimedia Database Management Systems (MMDBMSs) have emerged and attracted great research attention in recent years. Though much research effort has been devoted to this area, it is still far from mature and many open issues remain. This dissertation addresses three of the essential challenges in developing an MMDBMS, namely the semantic gap, perception subjectivity and data organization, by proposing a systematic and integrated framework with a video database and an image database serving as the testbed. In particular, the framework addresses these challenges separately yet coherently from three main aspects of an MMDBMS: multimedia data representation, indexing and retrieval. In terms of multimedia data representation, the key to addressing the semantic gap is to intelligently and automatically model the mid-level representation and/or semi-semantic descriptors, in addition to extracting the low-level media features. The data organization challenge is mainly addressed through media indexing, where various levels of indexing are required to support diverse query requirements. In particular, the focus of this study is to facilitate high-level video indexing by proposing a multimodal event mining framework associated with temporal knowledge discovery approaches. With respect to perception subjectivity, advanced techniques are proposed to support users' interaction and to effectively model users' perception from their feedback at both the image level and the object level.

Relevance:

70.00% 70.00%

Publisher:

Abstract:

The recent explosion in the complexity and amount of digital multimedia data has had a huge impact on the operations of organizations in many areas, such as government services, education, medical care, business and entertainment. To satisfy the growing demand for multimedia data management systems, an integrated framework called DIMUSE is proposed and deployed for distributed multimedia applications to offer a full range of multimedia-related tools and provide appealing experiences for users. This research mainly focuses on video database modeling and retrieval by addressing a set of core challenges. First, a comprehensive multimedia database modeling mechanism called Hierarchical Markov Model Mediator (HMMM) is proposed to model high-dimensional media data including video objects, low-level visual/audio features, as well as historical access patterns and frequencies. The associated retrieval and ranking algorithms are designed to support not only general queries but also complicated temporal event pattern queries. Second, system training and learning methodologies are incorporated so that user interests are mined efficiently to improve retrieval performance. Third, video clustering techniques are proposed to continuously increase search speed and accuracy by architecting a more efficient multimedia database structure. A distributed video management and retrieval system is designed and implemented to demonstrate the overall performance. The proposed approach is further customized for a mobile-based video retrieval system to address the perception subjectivity issue by considering each individual user's profile. Moreover, to deal with security and privacy concerns in distributed multimedia applications, DIMUSE also incorporates a practical framework called SMARXO, which supports multilevel multimedia security control. SMARXO efficiently combines role-based access control (RBAC), XML and an object-relational database management system (ORDBMS) to achieve proficient security control. A distributed multimedia management system named DMMManager (Distributed MultiMedia Manager) is developed with the proposed framework DEMUR to support multimedia capturing, analysis, retrieval, authoring and presentation in one single framework.

Relevance:

60.00% 60.00%

Publisher:

Abstract:

One of the critical challenges in the automatic recognition of TV commercials is generating a unique, robust and compact signature. Uniqueness is the ability to identify the similarity among commercial video clips that may have slight content variation. Robustness is the ability to match commercial video clips that contain the same content but may differ in digitization/encoding, noise, and/or transmission and recording distortion. Efficiency is the capability of effectively matching commercial video sequences with low computation cost and storage overhead. In this paper, we present a binary-signature-based method that meets all three criteria by combining ordinal and color measurements. Experimental results on a large real commercial video database show that our novel approach delivers significantly better performance than existing methods.
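
To make the notion of a binary ordinal signature concrete, here is a minimal sketch. It is an assumption-laden illustration rather than the paper's method: it covers only an ordinal measurement (the color measurement is omitted), encodes one bit per pair of grid blocks, and compares signatures by Hamming distance. The grid size, function names and frame data are all hypothetical.

```python
from itertools import combinations


def block_means(frame, rows, cols):
    """Mean intensity of each cell in a rows x cols grid over a 2-D frame."""
    h, w = len(frame), len(frame[0])
    means = []
    for r in range(rows):
        for c in range(cols):
            cell = [
                frame[y][x]
                for y in range(r * h // rows, (r + 1) * h // rows)
                for x in range(c * w // cols, (c + 1) * w // cols)
            ]
            means.append(sum(cell) / len(cell))
    return means


def ordinal_bits(means):
    """One bit per block pair: 1 if the first block is brighter than the second."""
    return [
        1 if means[i] > means[j] else 0
        for i, j in combinations(range(len(means)), 2)
    ]


def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 4x4 grayscale frame and a dimmed, offset copy of it.
frame = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [90, 90, 40, 40],
    [90, 90, 40, 40],
]
dimmed = [[v * 0.5 + 10 for v in row] for row in frame]
other = [
    [200, 200, 10, 10],
    [200, 200, 10, 10],
    [40, 40, 90, 90],
    [40, 40, 90, 90],
]

sig = ordinal_bits(block_means(frame, 2, 2))
sig_dimmed = ordinal_bits(block_means(dimmed, 2, 2))
sig_other = ordinal_bits(block_means(other, 2, 2))
```

Since only the brightness ordering of blocks is kept, a monotonic change such as dimming leaves the signature unchanged, while a frame with a different layout yields a large Hamming distance.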

Relevance:

60.00% 60.00%

Publisher:

Abstract:

This paper presents a comparison among different consumer 3D display technologies by means of a subjective assessment test. To this end, four 55-inch displays have been considered: one autostereoscopic display, one stereoscopic display with polarized passive glasses, and two with active shutter glasses. In addition, a high-quality 3D video database has been used to show diverse material with both views in high definition. To carry out the test, standard recommendations have been followed, with some modifications aimed at a test environment closer to real home viewing conditions, with the objective of obtaining more representative conclusions. Moreover, several perceptual factors have been considered to study the performance of the displays, such as picture quality, depth perception, and visual discomfort. The results reveal several notable findings, such as the performance improvement of active shutter glasses technology, the high performance of polarized glasses technology in terms of quality and comfort, and the need for autostereoscopic displays to improve visual comfort in order to reach a globally high-quality visual experience.

Relevance:

40.00% 40.00%

Publisher:

Abstract:

Research in stereoscopic 3D coding, transmission and subjective assessment methodology depends largely on the availability of source content that can be used in cross-lab evaluations. While several studies have already been presented using proprietary content, comparisons between the studies are difficult because different content is used. Therefore, in this paper, a freely available dataset of high-quality Full-HD stereoscopic sequences shot with a semiprofessional 3D camera is introduced in detail. The content was designed to be suitable for a wide variety of applications, including high-quality studies. A set of depth maps was calculated from the stereoscopic pair. As an application example, a subjective assessment has been performed using coding and spatial degradations. The Absolute Category Rating with Hidden Reference method was used. The observers were instructed to vote on video quality only. Results of this experiment are also freely available and will be presented in this paper as a first step towards objective video quality measurement for 3DTV.
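
For readers unfamiliar with Absolute Category Rating with Hidden Reference (ACR-HR), the sketch below shows how differential scores are commonly derived from the raw votes, following the per-observer formula DV = V(PVS) - V(REF) + 5 given in ITU-T Recommendation P.910, where PVS is the processed sequence and REF its hidden reference. The votes are invented for illustration and are not data from the abstract; the function names are hypothetical.

```python
from statistics import mean


def differential_scores(pvs_votes, ref_votes):
    """Per-observer differential score on the 5-point ACR scale.

    pvs_votes[i] and ref_votes[i] are the same observer's votes for the
    processed sequence and for its hidden reference.
    """
    return [p - r + 5 for p, r in zip(pvs_votes, ref_votes)]


def dmos(pvs_votes, ref_votes):
    """Differential mean opinion score: mean of the differential scores."""
    return mean(differential_scores(pvs_votes, ref_votes))


# Hypothetical votes from four observers.
pvs_votes = [3, 2, 4, 3]
ref_votes = [5, 4, 5, 4]
scores = differential_scores(pvs_votes, ref_votes)
result = dmos(pvs_votes, ref_votes)
```

A differential score of 5 means the processed sequence was rated as highly as its reference; lower values indicate a visible quality loss relative to that reference.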

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A new multimodal biometric database designed and acquired within the framework of the European BioSecure Network of Excellence is presented. It comprises more than 600 individuals acquired simultaneously in three scenarios: 1) over the Internet, 2) in an office environment with desktop PC, and 3) in indoor/outdoor environments with mobile portable hardware. The three scenarios include a common part of audio/video data. Also, signature and fingerprint data have been acquired both with desktop PC and mobile portable hardware. Additionally, hand and iris data were acquired in the second scenario using desktop PC. Acquisition has been conducted by 11 European institutions. Additional features of the BioSecure Multimodal Database (BMDB) are: two acquisition sessions, several sensors in certain modalities, balanced gender and age distributions, multimodal realistic scenarios with simple and quick tasks per modality, cross-European diversity, availability of demographic data, and compatibility with other multimodal databases. The novel acquisition conditions of the BMDB allow us to perform new challenging research and evaluation of either monomodal or multimodal biometric systems, as in the recent BioSecure Multimodal Evaluation campaign. A description of this campaign including baseline results of individual modalities from the new database is also given. The database is expected to be available for research purposes through the BioSecure Association during 2008.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Objective: To evaluate perioperative outcomes, safety and feasibility of video-assisted resection for primary and secondary liver lesions. Methods: From a prospective database, we analyzed the perioperative results (up to 90 days) of 25 consecutive patients undergoing video-assisted resections between June 2007 and June 2013. Results: The mean age was 53.4 years (23-73) and 16 (64%) patients were female. Of the total, 84% had malignant disease. We performed 33 resections (1 to 4 nodules per patient). The procedures performed were non-anatomical resections (n = 26), segmentectomy (n = 1), 2/3 bisegmentectomy (n = 1), 6/7 bisegmentectomy (n = 1), left hepatectomy (n = 2) and right hepatectomy (n = 2). The procedures involved posterosuperior segments in 66.7% of cases, requiring multiple or larger resections. The average operating time was 226 minutes (80-420), and the average anesthesia time was 360 minutes (200-630). The average size of resected nodules was 3.2 cm (0.8 to 10) and the surgical margins were free in all the analyzed specimens. Eight percent of patients needed blood transfusion and no case was converted to open surgery. The mean length of stay was 6.5 days (3-16). Postoperative complications occurred in 20% of patients, with no perioperative mortality. Conclusion: Video-assisted liver resection is feasible and safe and should be part of the liver surgeon's armamentarium for resection of primary and secondary liver lesions.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Traffic control signs and destination boards on roadways offer significant information for drivers. Regulatory signs indicate things such as speed limits and permitted turns; warning signs alert drivers to conditions ahead to help them avoid accidents; destination signs show distances and directions to various locations; service signs display the location of hospitals, gas stations, rest areas, etc. Because the signs are so important and are always at some distance from drivers, drivers need to be able to read them clearly and easily, even in bad weather or other adverse situations. The idea is to develop software that collects useful information from a special camera mounted at the front of a moving car, extracts the important information and finally shows it to the driver. For example, when a frame contains a destination sign, it will carry text such as "Linkoping 50", so the software should extract every character of "Linkoping 50" and compare it with the known character data in the database. If an extracted character, such as "k", matches one in the database, the software outputs the destination name and shows it to the driver. In this project, C++ will be used to write the code for this software.
