877 resultados para natural language


Relevância:

60.00% 60.00%

Publicador:

Resumo:

In this paper, we propose an unsupervised segmentation approach, named "n-gram mutual information", or NGMI, which is used to segment Chinese documents into n-character words or phrases, using language statistics drawn from the Chinese Wikipedia corpus. The approach alleviates the tremendous effort that is required in preparing and maintaining the manually segmented Chinese text for training purposes, and manually maintaining ever expanding lexicons. Previously, mutual information was used to achieve automated segmentation into 2-character words. The NGMI approach extends the approach to handle longer n-character words. Experiments with heterogeneous documents from the Chinese Wikipedia collection show good results.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This work presents an extended Joint Factor Analysis model including explicit modelling of unwanted within-session variability. The goals of the proposed extended JFA model are to improve verification performance with short utterances by compensating for the effects of limited or imbalanced phonetic coverage, and to produce a flexible JFA model that is effective over a wide range of utterance lengths without adjusting model parameters such as retraining session subspaces. Experimental results on the 2006 NIST SRE corpus demonstrate the flexibility of the proposed model by providing competitive results over a wide range of utterance lengths without retraining and also yielding modest improvements in a number of conditions over current state-of-the-art.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper presents a novel approach of estimating the confidence interval of speaker verification scores. This approach is utilised to minimise the utterance lengths required in order to produce a confident verification decision. The confidence estimation method is also extended to address both the problem of high correlation in consecutive frame scores, and robustness with very limited training samples. The proposed technique achieves a drastic reduction in the typical data requirements for producing confident decisions in an automatic speaker verification system. When evaluated on the NIST 2005 SRE, the early verification decision method demonstrates that an average of 5–10 seconds of speech is sufficient to produce verification rates approaching those achieved previously using an average in excess of 100 seconds of speech.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

We argue that web service discovery technology should help the user navigate a complex problem space by providing suggestions for services which they may not be able to formulate themselves as (s)he lacks the epistemic resources to do so. Free text documents in service environments provide an untapped source of information for augmenting the epistemic state of the user and hence their ability to search effectively for services. A quantitative approach to semantic knowledge representation is adopted in the form of semantic space models computed from these free text documents. Knowledge of the user’s agenda is promoted by associational inferences computed from the semantic space. The inferences are suggestive and aim to promote human abductive reasoning to guide the user from fuzzy search goals into a better understanding of the problem space surrounding the given agenda. Experimental results are discussed based on a complex and realistic planning activity.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

While spoken term detection (STD) systems based on word indices provide good accuracy, there are several practical applications where it is infeasible or too costly to employ an LVCSR engine. An STD system is presented, which is designed to incorporate a fast phonetic decoding front-end and be robust to decoding errors whilst still allowing for rapid search speeds. This goal is achieved through mono-phone open-loop decoding coupled with fast hierarchical phone lattice search. Results demonstrate that an STD system that is designed with the constraint of a fast and simple phonetic decoding front-end requires a compromise to be made between search speed and search accuracy.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The use of the PC and Internet for placing telephone calls will present new opportunities to capture vast amounts of un-transcribed speech for a particular speaker. This paper investigates how to best exploit this data for speaker-dependent speech recognition. Supervised and unsupervised experiments in acoustic model and language model adaptation are presented. Using one hour of automatically transcribed speech per speaker with a word error rate of 36.0%, unsupervised adaptation resulted in an absolute gain of 6.3%, equivalent to 70% of the gain from the supervised case, with additional adaptation data likely to yield further improvements. LM adaptation experiments suggested that although there seems to be a small degree of speaker idiolect, adaptation to the speaker alone, without considering the topic of the conversation, is in itself unlikely to improve transcription accuracy.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Searching for multimedia is an important activity for users of Web search engines. Studying user's interactions with Web search engine multimedia buttons, including image, audio, and video, is important for the development of multimedia Web search systems. This article provides results from a Weblog analysis study of multimedia Web searching by Dogpile users in 2006. The study analyzes the (a) duration, size, and structure of Web search queries and sessions; (b) user demographics; (c) most popular multimedia Web searching terms; and (d) use of advanced Web search techniques including Boolean and natural language. The current study findings are compared with results from previous multimedia Web searching studies. The key findings are: (a) Since 1997, image search consistently is the dominant media type searched followed by audio and video; (b) multimedia search duration is still short (>50% of searching episodes are <1 min), using few search terms; (c) many multimedia searches are for information about people, especially in audio search; and (d) multimedia search has begun to shift from entertainment to other categories such as medical, sports, and technology (based on the most repeated terms). Implications for design of Web multimedia search engines are discussed.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The World Wide Web has become a medium for people to share information. People use Web-based collaborative tools such as question answering (QA) portals, blogs/forums, email and instant messaging to acquire information and to form online-based communities. In an online QA portal, a user asks a question and other users can provide answers based on their knowledge, with the question usually being answered by many users. It can become overwhelming and/or time/resource consuming for a user to read all of the answers provided for a given question. Thus, there exists a need for a mechanism to rank the provided answers so users can focus on only reading good quality answers. The majority of online QA systems use user feedback to rank users’ answers and the user who asked the question can decide on the best answer. Other users who didn’t participate in answering the question can also vote to determine the best answer. However, ranking the best answer via this collaborative method is time consuming and requires an ongoing continuous involvement of users to provide the needed feedback. The objective of this research is to discover a way to recommend the best answer as part of a ranked list of answers for a posted question automatically, without the need for user feedback. The proposed approach combines both a non-content-based reputation method and a content-based method to solve the problem of recommending the best answer to the user who posted the question. The non-content method assigns a score to each user which reflects the users’ reputation level in using the QA portal system. Each user is assigned two types of non-content-based reputations cores: a local reputation score and a global reputation score. The local reputation score plays an important role in deciding the reputation level of a user for the category in which the question is asked. The global reputation score indicates the prestige of a user across all of the categories in the QA system. Due to the possibility of user cheating, such as awarding the best answer to a friend regardless of the answer quality, a content-based method for determining the quality of a given answer is proposed, alongside the non-content-based reputation method. Answers for a question from different users are compared with an ideal (or expert) answer using traditional Information Retrieval and Natural Language Processing techniques. Each answer provided for a question is assigned a content score according to how well it matched the ideal answer. To evaluate the performance of the proposed methods, each recommended best answer is compared with the best answer determined by one of the most popular link analysis methods, Hyperlink-Induced Topic Search (HITS). The proposed methods are able to yield high accuracy, as shown by correlation scores: Kendall correlation and Spearman correlation. The reputation method outperforms the HITS method in terms of recommending the best answer. The inclusion of the reputation score with the content score improves the overall performance, which is measured through the use of Top-n match scores.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This correspondence presents a microphone array shape calibration procedure for diffuse noise environments. The procedure estimates intermicrophone distances by fitting the measured noise coherence with its theoretical model and then estimates the array geometry using classical multidimensional scaling. The technique is validated on noise recordings from two office environments.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The recently proposed data-driven background dataset refinement technique provides a means of selecting an informative background for support vector machine (SVM)-based speaker verification systems. This paper investigates the characteristics of the impostor examples in such highly-informative background datasets. Data-driven dataset refinement individually evaluates the suitability of candidate impostor examples for the SVM background prior to selecting the highest-ranking examples as a refined background dataset. Further, the characteristics of the refined dataset were analysed to investigate the desired traits of an informative SVM background. The most informative examples of the refined dataset were found to consist of large amounts of active speech and distinctive language characteristics. The data-driven refinement technique was shown to filter the set of candidate impostor examples to produce a more disperse representation of the impostor population in the SVM kernel space, thereby reducing the number of redundant and less-informative examples in the background dataset. Furthermore, data-driven refinement was shown to provide performance gains when applied to the difficult task of refining a small candidate dataset that was mis-matched to the evaluation conditions.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

This study assesses the recently proposed data-driven background dataset refinement technique for speaker verification using alternate SVM feature sets to the GMM supervector features for which it was originally designed. The performance improvements brought about in each trialled SVM configuration demonstrate the versatility of background dataset refinement. This work also extends on the originally proposed technique to exploit support vector coefficients as an impostor suitability metric in the data-driven selection process. Using support vector coefficients improved the performance of the refined datasets in the evaluation of unseen data. Further, attempts are made to exploit the differences in impostor example suitability measures from varying features spaces to provide added robustness.