94 results for Text summarization

in Deakin Research Online - Australia


Relevance:

30.00%

Abstract:

Summarization is an essential requirement for achieving a more compact and interesting representation of sports video content. We propose a framework that integrates highlights into play segments and reveal why we should still retain breaks. Experimental results show that fast detection of whistle sounds, crowd excitement, and text boxes can complement existing techniques for play-break and highlight localization.

Relevance:

30.00%

Abstract:

Summarization of cricket videos is important for three reasons: (1) their long duration makes manual highlight generation tedious; (2) cricket is less explored than other sports such as soccer; and (3) cricket has a huge viewership. We propose a novel summarization scheme for cricket that exploits its contextual semantics. First, we detect the bowling frames, based on which the video is temporally segmented into individual deliveries. Then each temporal segment representing a delivery is classified as interesting or non-interesting based on the detection of events, namely boundaries and wickets. Because ads and replays are frequent in cricket broadcasts, we propose robust algorithms for their removal. Finally, we propose a finite state automaton based model of the temporal segments to extract key-frames. We have also extended the framework to include text cues and expert choices, and developed a hierarchical summary. We tested our algorithm on several broadcast cricket videos and obtained good results.
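The segment-filtering step described in this abstract can be illustrated with a minimal sketch. It assumes the detection stages have already run and produced labelled segments; the `Segment` type, the `kind` labels, and the event names other than boundary and wicket are hypothetical and are not taken from the paper. The finite state automaton for key-frame extraction is not modelled here.

```python
from dataclasses import dataclass, field

# Events treated as interesting; the abstract names boundaries and wickets.
INTERESTING_EVENTS = {"boundary", "wicket"}
# Segment kinds discarded before summarization (assumed labels).
DISCARDED_KINDS = {"ad", "replay"}


@dataclass
class Segment:
    """A temporal segment of the broadcast (hypothetical structure)."""
    start_frame: int
    end_frame: int
    kind: str = "delivery"  # "delivery", "ad", or "replay"
    events: set = field(default_factory=set)


def summarize(segments):
    """Keep deliveries that contain at least one interesting event."""
    summary = []
    for seg in segments:
        if seg.kind in DISCARDED_KINDS:
            continue  # ads and replays are removed first
        if seg.events & INTERESTING_EVENTS:
            summary.append(seg)
    return summary


segments = [
    Segment(0, 300, events={"dot_ball"}),
    Segment(301, 650, events={"boundary"}),
    Segment(651, 900, kind="ad"),
    Segment(901, 1300, kind="replay", events={"boundary"}),
    Segment(1301, 1700, events={"wicket"}),
]
highlights = summarize(segments)
print([(s.start_frame, s.end_frame) for s in highlights])
# prints [(301, 650), (1301, 1700)]
```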

Relevance:

20.00%

Abstract:

Transcription of interview data is a common practice in qualitative health research. However, there has been little discussion of the techniques of transcription and the issues inherent in the use of transcription as a strategy for managing qualitative data in nursing publications. The process of transcription may disclose or obscure certain information. Researchers need to question practices of transcription that have been taken for granted and make transparent the processes used to preserve the integrity of data. This paper first examines research reported in nursing and allied health journals employing interviews for data collection and the attention given to the transcription phase. It then deals with issues of concern regarding the transcription of interviews, and offers suggestions for promoting validity.

Relevance:

20.00%

Abstract:

A version of this article was first presented at the Drama Australia Conference, Fremantle, July 2002. It draws upon Freebody and Luke's four resources literacy framework, where they describe four kinds of literacy practices. It shows how this model is used within the literacy community and argues that this model is useful to describe the contribution that drama can make to literacy development. Freebody and Luke's model is used and promoted throughout Australia and the author argues that it is politically astute for drama teachers to reclaim and promote their links to the English/Literacy curriculum.

Relevance:

20.00%

Abstract:

The Fish-net algorithm is a novel field learning algorithm that derives classification rules by looking at the range of values of each attribute instead of the individual point values. In this paper, we present a Feature Selection Fish-net learning algorithm to solve the dual imbalance problem in text classification. Dual imbalance comprises instance imbalance and feature imbalance. Instance imbalance is caused by unevenly distributed classes, and feature imbalance is due to differing document lengths. The proposed approach consists of two phases: (1) select a feature subset consisting of the features that are more supportive of the difficult minority class; (2) construct classification rules based on the original Fish-net algorithm. Our experimental results on Reuters-21578 show that the proposed approach achieves a better balanced accuracy rate on both the majority and minority classes than multinomial Naive Bayes and SVM.

Relevance:

20.00%

Abstract:

Classification methods such as the Rocchio method, Naive Bayes based methods, and SVM based methods are commonly used to categorize text documents. These methods learn from labeled text documents and then construct classifiers. The generated classifiers can predict the category of a new text document. The keywords in a document are often used to form rules for categorizing text documents; for example, “kw = computer” can be a rule for the IT documents category. However, the number of keywords is very large, and selecting keywords from such a large set is challenging. Recently, a rule generation method based on enumeration of all possible keyword combinations has been proposed [2]. In this method, a crucial problem remains: how to prune irrelevant combinations at the early stages of the rule generation procedure. In this paper, we propose a method that can effectively prune irrelevant keywords at an early stage.
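As a rough illustration of how a keyword rule such as “kw = computer” might be applied, the sketch below treats each rule as a conjunction of keywords mapped to a category. The rule list, the "Health" rule, the tokenizer, and the function names are assumptions made for this example and do not come from the paper.

```python
import re

# Each rule is (required keywords, category). Only the "computer" -> IT rule
# mirrors the abstract; the second rule is a hypothetical addition.
RULES = [
    ({"computer"}, "IT"),
    ({"nurse", "hospital"}, "Health"),
]


def tokenize(text):
    """Lower-case the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def classify(text, rules=RULES):
    """Return the category of the first rule whose keywords all occur."""
    tokens = tokenize(text)
    for keywords, category in rules:
        if keywords <= tokens:  # every keyword of the rule is present
            return category
    return None  # no rule fired


print(classify("The new computer arrived today."))   # prints IT
print(classify("A nurse works in the hospital."))    # prints Health
print(classify("Nothing relevant here."))            # prints None
```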

Relevance:

20.00%

Abstract:

Compared with conventional two-class learning schemes, one-class classification uses only a single class for training. Applying one-class classification to the minority class in imbalanced data has been shown to achieve better performance than two-class classification. In this paper, in order to make the best use of all the available information during the learning procedure, we propose a general framework that first uses the minority class for training in the one-class classification stage, and then uses both the minority and majority classes to estimate the generalization performance of the constructed classifier. Based on this generalization performance measurement, a parameter search algorithm selects the best parameter settings for the classifier. Experiments on UCI and Reuters text data show that a one-class SVM embedded in this framework achieves much better performance than the standard one-class SVM alone and other learning schemes, such as one-class Naive Bayes, one-class nearest neighbour, and neural networks.
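The two-stage idea (train on the minority class only, then use both classes to choose parameters) can be sketched as follows. This is not the paper's method: a simple centroid-distance one-class model stands in for the one-class SVM, balanced accuracy is an assumed performance measure, and the grid of candidate thresholds is an assumed parameter search.

```python
import math


def centroid(points):
    """Mean point of the minority-class training data."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def predict(center, threshold, point):
    """Accept the point as minority class if it lies close to the centroid."""
    return math.dist(center, point) <= threshold


def balanced_accuracy(center, threshold, minority, majority):
    """Mean of the minority acceptance rate and the majority rejection rate."""
    tpr = sum(predict(center, threshold, p) for p in minority) / len(minority)
    tnr = sum(not predict(center, threshold, p) for p in majority) / len(majority)
    return (tpr + tnr) / 2


def fit_with_search(train_minority, val_minority, val_majority, thresholds):
    """Stage 1: fit on minority data only. Stage 2: pick the threshold that
    scores best on validation data drawn from both classes."""
    center = centroid(train_minority)
    best = max(
        thresholds,
        key=lambda t: balanced_accuracy(center, t, val_minority, val_majority),
    )
    return center, best


# Toy data, invented for illustration.
train_minority = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2)]
val_minority = [(1.1, 1.1), (0.9, 0.7)]
val_majority = [(3.0, 3.0), (2.0, 2.2), (0.0, 3.0)]

center, best_threshold = fit_with_search(
    train_minority, val_minority, val_majority, thresholds=[0.2, 0.5, 2.0]
)
print(best_threshold)  # prints 0.5
```

The majority class never influences the fitted model itself; it is used only to score candidate parameter settings, which is the division of roles the abstract describes.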

Relevance:

20.00%

Abstract:

People with motor impairments who use a switch device to interface with computers have poor access to affordable software for email communication. The MultiMail email package was developed with government support to provide email access solutions for these users and for others with a range of disabilities. In this paper, the development of accessible on-screen keyboards and a word prediction program which facilitates email text production is discussed. Technology solutions were informed by people with disabilities through focus group and survey data. The resulting cross-disability design of MultiMail provides innovative and cost-free solutions to email text production.

Relevance:

20.00%

Abstract:

Concept learning of text documents can be viewed as the problem of acquiring the definition of a general category of documents. To define the category of a text document, a conjunction of keywords is usually used. These keywords should be few and comprehensible. A naive method is to enumerate all combinations of keywords and extract suitable ones. However, because of the enormous number of keyword combinations, it is impossible to extract the most relevant keywords for describing document categories by enumerating every possible combination. Many heuristic methods have been proposed, such as GA-based and immune-based algorithms. In this work, we introduce a pruning power technique and propose a robust enumeration-based concept learning algorithm. Experimental results show that the rules produced by our approach are more comprehensible and simpler than those produced by other methods.
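The abstract does not specify the pruning power technique, so the sketch below uses a generic level-wise enumeration with two standard prunings as a stand-in: a conjunction that covers no positive document is dropped (no extension of it can cover one), and a conjunction that already excludes all negatives is kept and not extended further. The function name, the `max_len` parameter, and the toy documents are assumptions.

```python
def learn_rules(positives, negatives, max_len=3):
    """Enumerate keyword conjunctions by increasing size and return the
    minimal ones that cover some positive and no negative documents."""
    vocabulary = sorted(set().union(*positives))
    frontier = [frozenset([kw]) for kw in vocabulary]
    rules = []  # list of (conjunction, number of positives covered)
    for _ in range(max_len):
        next_frontier = set()
        for conj in frontier:
            # Skip conjunctions that merely lengthen an accepted rule.
            if any(rule <= conj for rule, _ in rules):
                continue
            pos_cover = sum(conj <= doc for doc in positives)
            if pos_cover == 0:
                continue  # prune: supersets cannot cover any positive either
            neg_cover = sum(conj <= doc for doc in negatives)
            if neg_cover == 0:
                rules.append((conj, pos_cover))  # pure rule, stop extending
                continue
            for kw in vocabulary:
                if kw not in conj:
                    next_frontier.add(conj | {kw})
        frontier = sorted(next_frontier, key=sorted)  # deterministic order
    return rules


# Toy documents as keyword sets, invented for illustration.
positives = [
    {"computer", "software", "network"},
    {"computer", "hardware"},
    {"software", "network", "server"},
]
negatives = [
    {"network", "hospital"},
    {"software", "nurse"},
    {"hospital", "nurse"},
]

rules = learn_rules(positives, negatives)
for conj, support in rules:
    print(sorted(conj), support)
```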

Relevance:

20.00%

Abstract:

Text categorization (TC) is one of the main applications of machine learning. Many methods have been proposed, such as the Rocchio method, Naive Bayes based methods, and SVM based text classification methods. These methods learn from labeled text documents and then construct a classifier. The category of a new text document can then be predicted. However, these methods do not give a description of each category. In the machine learning field, there are many concept learning algorithms, such as ID3 and CN2. This paper proposes a more robust algorithm to induce concepts from training examples, which is based on enumeration of all possible keyword combinations. Experimental results show that the rules produced by our approach are more precise and simpler than those of other methods.

Relevance:

20.00%

Abstract:

Many classification methods have been proposed to find patterns in text documents. However, according to Occam's razor principle, "the explanation of any phenomenon should make as few assumptions as possible", short patterns are usually more explainable and meaningful for classifying text documents. In this paper, we propose a depth-first pattern generation algorithm, which can find short patterns in text documents more effectively than a breadth-first algorithm.
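To make the depth-first idea concrete, here is a generic depth-first enumeration of short term patterns. It is a sketch under assumptions and not the algorithm from the paper: the minimum support threshold, the length limit `max_len`, and the toy documents are all invented for illustration.

```python
def dfs_patterns(docs, min_support=2, max_len=2):
    """Depth-first enumeration of term sets that occur in at least
    `min_support` documents and contain at most `max_len` terms."""
    terms = sorted(set().union(*docs))
    found = []  # list of (pattern tuple, support)

    def extend(pattern, start, matching):
        # Only terms after `start` are tried, so each set is built once.
        for i in range(start, len(terms)):
            still_matching = [d for d in matching if terms[i] in d]
            if len(still_matching) < min_support:
                continue  # prune this branch: longer patterns match fewer docs
            candidate = pattern + (terms[i],)
            found.append((candidate, len(still_matching)))
            if len(candidate) < max_len:
                extend(candidate, i + 1, still_matching)  # go deeper first

    extend((), 0, docs)
    return found


# Toy documents as term sets, invented for illustration.
docs = [
    {"text", "mining", "pattern"},
    {"text", "pattern"},
    {"text", "mining"},
    {"image", "pattern"},
]

patterns = dfs_patterns(docs)
for pattern, support in patterns:
    print(pattern, support)
```

Because each branch carries only the documents that still match, the depth limit and the support check bound both memory use and the number of candidates generated.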