996 results for Multimedia documents


Relevance:

100.00%

Publisher:

Abstract:

The utilization of massive multimedia document collections, such as the multimedia documents on the global Internet, requires search engines that can rank using both text and image evidence. The massive size and dynamic nature of such collections can make manual indexing prohibitively expensive. Traditional search engines utilize only the text components of multimedia documents, but some information needs require the utilization of image evidence. In this paper, we investigate image-feature-based ranking for large and heterogeneous collections. Both the nature and the complexity of information needs are key elements for effective retrieval. Retrieval needs that depend on perceptual similarities (as found in art galleries or building architecture) require the utilization of visual cues. In such situations, retrieving multimedia documents based on image ranking can provide higher effectiveness. Experimental results show that, where perceptual similarities are key elements for retrieval, ranking based on image features can be more effective than ranking based on text.
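
One simple way to rank with both text and image evidence is to normalize each score and take a weighted sum. The sketch below is only an illustration under that assumption; the abstract does not say how the two kinds of evidence are combined, and the function names, the `image_weight` parameter, and the scores are all hypothetical.

```python
def min_max(scores):
    """Rescale a dict of raw scores to [0, 1]; constant scores map to 0."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {doc: (s - lo) / span if span else 0.0 for doc, s in scores.items()}


def fuse_rank(text_scores, image_scores, image_weight=0.5):
    """Rank documents by a weighted sum of normalized text and image scores."""
    text_n, image_n = min_max(text_scores), min_max(image_scores)
    fused = {
        doc: (1 - image_weight) * text_n[doc] + image_weight * image_n[doc]
        for doc in text_scores
    }
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical scores for three documents (not data from the paper).
text_scores = {"a": 2.0, "b": 1.0, "c": 0.0}
image_scores = {"a": 0.1, "b": 0.2, "c": 0.9}

text_only = fuse_rank(text_scores, image_scores, image_weight=0.0)
image_only = fuse_rank(text_scores, image_scores, image_weight=1.0)
```

Setting `image_weight` to 0 reproduces a text-only ranking and setting it to 1 an image-only ranking, so the same function can be used to compare the two extremes.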

Relevance:

100.00%

Publisher:

Abstract:

Audio-visual documents obtained from German TV news are classified according to the IPTC topic categorization scheme. To this end, the usual text classification techniques are adapted to speech, video, and non-speech audio. For each of the three modalities, word analogues are generated: sequences of syllables for speech, “video words” based on low-level color features (color moments, color correlogram and color wavelet), and “audio words” based on low-level spectral features (spectral envelope and spectral flatness) for non-speech audio. Such audio and video words provide a means to represent the different modalities in a uniform way. The frequencies of the word analogues represent audio-visual documents, following the standard bag-of-words approach. Support vector machines are used for supervised classification in a 1 vs. n setting. Classification based on speech outperforms all other single modalities. Combining speech with non-speech audio improves classification. Classification is further improved by supplementing speech and non-speech audio with video words. Optimal F-scores range between 62% and 94%, corresponding to 50% to 84% above chance. The optimal combination of modalities depends on the category to be recognized. The construction of audio and video words from low-level features provides a good basis for the integration of speech, non-speech audio and video.
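
The bag-of-words representation over word analogues can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the token values, the modality prefixes, and the choice of relative (rather than raw) frequencies are assumptions.

```python
from collections import Counter


def bag_of_words(tokens, vocabulary):
    """Return relative token frequencies in the fixed order of `vocabulary`."""
    counts = Counter(tokens)
    total = sum(counts[term] for term in vocabulary) or 1
    return [counts[term] / total for term in vocabulary]


# Hypothetical word analogues for one news clip, one stream per modality.
syllables = ["ta", "ges", "schau", "ta"]
video_words = ["v17", "v03", "v17"]
audio_words = ["a42"]

# Prefixing keeps tokens from different modalities apart in one vocabulary.
tokens = (
    [f"speech:{t}" for t in syllables]
    + [f"video:{t}" for t in video_words]
    + [f"audio:{t}" for t in audio_words]
)
vocabulary = sorted(set(tokens))
vector = bag_of_words(tokens, vocabulary)
```

Each clip becomes one fixed-length vector, which is the form a supervised classifier such as a support vector machine expects as input.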

Relevance:

80.00%

Publisher:

Abstract:

Document engineering is the computer science discipline that investigates systems for documents in any form and in all media. As with the relationship between software engineering and software, document engineering is concerned with principles, tools and processes that improve our ability to create, manage, and maintain documents (http://www.documentengineering.org). The ACM Symposium on Document Engineering is an annual meeting of researchers active in document engineering; it is sponsored by ACM through the ACM SIGWEB Special Interest Group. In this editorial, we first point to work carried out in the context of document engineering that is directly related to multimedia tools and applications. We conclude with a summary of the papers presented in this special issue.

Relevance:

70.00%

Publisher:

Abstract:

The Internet and its widespread use for multimedia document distribution put the copyright issue in a completely new setting. Multimedia documents, specifically those installed on a web page, are no longer passive, as they typically include active applets. Copyright protection safeguards the intellectual property (IP) of multimedia documents, which are either sold or distributed free of charge. In this chapter, the basic tools for copyright protection are discussed. First, general concepts and the vocabulary used in copyright protection of multimedia documents are introduced. Then, a taxonomy of watermarking and fingerprinting techniques is studied. This part concludes with a review of the literature dealing with IP security. The main part of the chapter discusses the generic watermarking scheme and illustrates it with three specific examples: collusion-free watermarking, spread spectrum watermarking, and software fingerprinting. Future trends and conclusions close the chapter.
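
The general idea behind spread spectrum watermarking can be sketched as adding a key-dependent pseudo-random sequence to the host signal and detecting it by correlation. This is a toy illustration under simplifying assumptions, not the scheme from the chapter: the function names, the `strength` parameter, and the noise host signal are all hypothetical.

```python
import random


def chip_sequence(key, length):
    """Pseudo-random +1/-1 sequence derived from an integer key."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed(host, key, strength):
    """Add the key's chip sequence, scaled by `strength`, to the host signal."""
    chips = chip_sequence(key, len(host))
    return [h + strength * c for h, c in zip(host, chips)]


def correlate(signal, key):
    """Mean product of the signal with the key's chip sequence."""
    chips = chip_sequence(key, len(signal))
    return sum(s * c for s, c in zip(signal, chips)) / len(signal)


# Hypothetical host signal: 4096 samples of unit-variance noise.
noise = random.Random(0)
host = [noise.gauss(0.0, 1.0) for _ in range(4096)]

marked = embed(host, key=1234, strength=0.5)
right_key = correlate(marked, key=1234)  # expected to be near the strength
wrong_key = correlate(marked, key=9999)  # expected to be near zero
```

A detector would compare the correlation against a threshold between zero and the embedding strength. Practical schemes usually embed in a transform domain and scale the mark perceptually, which this sketch omits.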

Relevance:

60.00%

Publisher:

Abstract:

This research has made contributions to the area of spoken term detection (STD), defined as the process of finding all occurrences of a specified search term in a large collection of speech segments. It proposes using visual information, in the form of the speaker's lip movements, in addition to audio, together with the topic of the speech segments and the expected frequency of words in the target speech domain. Using this complementary information improves STD performance, which enables efficient keyword search in large collections of multimedia documents.

Relevance:

60.00%

Publisher:

Abstract:

The literature reports research efforts allowing the editing of interactive TV multimedia documents by end-users. In this article we propose complementary contributions relative to end-user generated interactive video, video tagging, and collaboration. In earlier work we proposed the watch-and-comment (WaC) paradigm as the seamless capture of an individual's comments so that corresponding annotated interactive videos can be automatically generated. As a proof of concept, we implemented a prototype application, the WACTOOL, that supports the capture of digital ink and voice comments over individual frames and segments of the video, producing a declarative document that specifies both the structure of the different media streams and their synchronization. In this article, we extend the WaC paradigm in two ways. First, user-video interactions are associated with edit commands and digital ink operations. Second, focusing on collaboration and distribution issues, we employ annotations as simple containers for context information by using them as tags in order to organize, store and distribute information in a P2P-based multimedia capture platform. We highlight the design principles of the watch-and-comment paradigm, and demonstrate related results including the current version of the WACTOOL and its architecture. We also illustrate how an interactive video produced by the WACTOOL can be rendered in an interactive video environment, the Ginga-NCL player, and include results from a preliminary evaluation.

Relevance:

60.00%

Publisher:

Abstract:

Generalized hypercompetitiveness in world markets has created the need to offer better products to potential and actual clients in order to gain an advantage over competitors. To ensure the production of an adequate product, enterprises need to work on the efficiency and efficacy of their business processes (BPs) by constructing Interactive Information Systems (IISs, including Interactive Multimedia Documents) so that these processes run more fluidly and correctly. The construction of the correct IIS is a major task that can only be successful if the needs of every party involved are taken into account. Their requirements must be defined with precision and extensively analyzed, and the system must then be accurately designed in order to minimize implementation problems, so that the IIS is produced on schedule and with as few mistakes as possible. The main contribution of this thesis is the proposal of Goals, a software (engineering) construction process which aims at defining the tasks to be carried out in order to develop software. This process defines the stakeholders, the artifacts, and the techniques that should be applied to achieve correctness of the IIS. In addition, this process suggests two methodologies to be applied in the initial phases of the lifecycle of the Software Engineering process: Process Use Cases for the requirements phase, and MultiGoals for the analysis and design phases. Process Use Cases is a UML-based (Unified Modeling Language), goal-driven and use-case-oriented methodology for the definition of functional requirements. It uses an information-oriented strategy to identify BPs while constructing the enterprise’s information structure, and ends with the identification of use cases within the design of these BPs. This approach provides a useful tool for both Business Process Management and Software Engineering activities.
MultiGoals is a UML-based, use-case-driven and architecture-centric methodology for the analysis and design of IISs with support for Multimedia. It proposes the analysis of user tasks as the basis for the design of: (i) the user interface; (ii) the system behaviour, which is modeled by means of patterns that can combine Multimedia and standard information; and (iii) the database and media contents. This thesis presents these approaches theoretically, accompanied by examples from a real project that support the understanding of the techniques used.

Relevance:

60.00%

Publisher:

Abstract:

[EN] This work is part of a methodological renewal project of the Ingeniería de Fabricación Innovative Education Group at the University of Las Palmas de Gran Canaria. The project has developed learning materials for courses in Manufacturing Engineering that can be used in several degrees. For the first learning material, a plastic injection mould was chosen as the teaching resource. The abundant information generated has been used to develop an interactive electronic publication. This learning material has been chosen by the University's Publishing and Scientific Diffusion Service as a new line of work in publications on educational innovation. The group is developing more training materials on other manufacturing processes, as well as cross-cutting content on dimensional tolerances in the ISO GPS system. All this work has generated many educational resources, both for laboratory practice and as interactive multimedia documents.

Relevance:

30.00%

Publisher:

Abstract:

Knowledge-based urban development (KBUD) is a new paradigm in urban planning tailored to the era of the knowledge economy. It aims mainly to help a contemporary city promote a more sustainable socio-spatial order. The paper reports on an investigation of the KBUD initiative in Malaysia, which is manifested through the establishment of a project called the Multimedia Super Corridor (MSC). MSC Malaysia aims to attract knowledge workers and industries to invest and operate within the area by creating a world-class urban corridor with state-of-the-art multimedia infrastructure, an efficient transportation system and an attractive living environment. Based on document analysis and interviews, this paper analyses the strategies, implementation, and achievements of the KBUD initiative in Cyberjaya, the leading intelligent city of Malaysia’s unique KBUD project, MSC Malaysia. A critical evaluation is made to assess the achievements of MSC by looking at the physical changes about ten years after its official launch. The findings offer valuable lessons for other cities that strive to develop KBUD strategies, strengthen their sustainable socio-spatial policies, and seek global recognition.

Relevance:

30.00%

Publisher:

Abstract:

This paper describes an approach based on Zernike moments and Delaunay triangulation for the localization of handwritten text in machine-printed text documents. The Zernike moments of the image are first evaluated, and the text is classified as handwritten using the nearest neighbor classifier. These features are independent of size, slant, orientation, translation and other variations in handwritten text. We then use Delaunay triangulation to reclassify the misclassified text regions: imposing Delaunay triangulation on the centroid points of the connected components, we extract features based on the triangles and reclassify the text. We remove the noise components in the document as part of the preprocessing step, so this method works well on noisy documents. The success rate of the method is found to be 86%. For specific handwritten elements such as signatures or similar text, the accuracy is found to be even higher, at 93%.
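
The nearest neighbor classification step can be sketched as below. This is a minimal illustration only: the three-dimensional feature vectors are hypothetical stand-ins for Zernike moment features, and the Euclidean distance and the label names are assumptions rather than details from the paper.

```python
import math


def nearest_neighbor(sample, training_set):
    """Return the label of the training vector closest to `sample`."""
    _, label = min(training_set, key=lambda item: math.dist(item[0], sample))
    return label


# Hypothetical 3-dimensional feature vectors standing in for Zernike moments.
training_set = [
    ((0.9, 0.1, 0.3), "handwritten"),
    ((0.8, 0.2, 0.4), "handwritten"),
    ((0.1, 0.9, 0.7), "printed"),
    ((0.2, 0.8, 0.6), "printed"),
]

label = nearest_neighbor((0.85, 0.15, 0.35), training_set)
```

In a real system each connected component of the page would contribute one feature vector, and the predicted labels would then be passed to the reclassification stage.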

Relevance:

30.00%

Publisher:

Abstract:

The following topics were dealt with: document analysis and recognition; multimedia document processing; character recognition; document image processing; cheque processing; form processing; music processing; document segmentation; electronic documents; character classification; handwritten character recognition; information retrieval; postal automation; font recognition; Indian language OCR; handwriting recognition; performance evaluation; graphics recognition; oriental character recognition; and word recognition.

Relevance:

30.00%

Publisher:

Abstract:

Children in our society have access to many information resources and communication options. As we witness the convergence of art, literacy and publishing, individuals need to learn how to make sense of information presented in many different forms, and how to construct their own communications in multiple media.
Thinking Multimedia is a program that has developed out of many projects that I have run in several schools and some tertiary institutions over the past 12 years. It is an attempt to integrate skills and knowledge from different academic disciplines and to encourage students to understand learning processes and their own learning preferences. The course, currently offered at Year 10 level at St Catherine’s School in Melbourne, aims to provide background and basic skills in how to construct and deconstruct information in multiple media and to give students the opportunity to explore a ‘real need’ project of their own in a project-based team environment. The course is supported by an online resource and discussion component.
In this presentation I will explain the background to the Thinking Multimedia program and explore some of the work by the students involved.

Relevance:

30.00%

Publisher:

Abstract:

Electronic information is becoming increasingly rich in content and varied in format and style, while at the same time client devices are becoming increasingly varied in their capabilities. This mismatch between rich content and the capabilities of end devices makes it challenging to provide interested users with seamless and ubiquitous access to electronic documents. Service-oriented content adaptation has emerged as a potential solution to the content-device mismatch problem. Since an adaptation task can potentially be performed by multiple content adaptation services (CAS), an approach for CAS discovery is a fundamental component of a service-oriented content adaptation environment. In this paper, we propose a service discovery approach that considers the client device's capabilities and the service’s attributes to discover appropriate CAS while optimizing performance and functionality. The efficiency of the proposed CAS discovery protocol is studied experimentally. The results show that the proposed approach is effective at discovering appropriate content adaptation services.