908 resultados para query verification


Relevância:

20.00% 20.00%

Publicador:

Resumo:

Query reformulation is a key user behavior during Web search. Our research goal is to develop predictive models of query reformulation during Web searching. This article reports results from a study in which we automatically classified the query-reformulation patterns for 964,780 Web searching sessions, composed of 1,523,072 queries, to predict the next query reformulation. We employed an n-gram modeling approach to describe the probability of users transitioning from one query-reformulation state to another to predict their next state. We developed first-, second-, third-, and fourth-order models and evaluated each model for accuracy of prediction, coverage of the dataset, and complexity of the possible pattern set. The results show that Reformulation and Assistance account for approximately 45% of all query reformulations; furthermore, the results demonstrate that the first- and second-order models provide the best predictability, between 28 and 40% overall and higher than 70% for some patterns. Implications are that the n-gram approach can be used for improving searching systems and searching assistance.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper reports results from a study in which we automatically classified the query reformulation patterns for 964,780 Web searching sessions (composed of 1,523,072 queries) in order to predict what the next query reformulation would be. We employed an n-gram modeling approach to describe the probability of searchers transitioning from one query reformulation state to another and predict their next state. We developed first, second, third, and fourth order models and evaluated each model for accuracy of prediction. Findings show that Reformulation and Assistance account for approximately 45 percent of all query reformulations. Searchers seem to seek system searching assistant early in the session or after a content change. The results of our evaluations show that the first and second order models provided the best predictability, between 28 and 40 percent overall, and higher than 70 percent for some patterns. Implications are that the n-gram approach can be used for improving searching systems and searching assistance in real time.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The problem of impostor dataset selection for GMM-based speaker verification is addressed through the recently proposed data-driven background dataset refinement technique. The SVM-based refinement technique selects from a candidate impostor dataset those examples that are most frequently selected as support vectors when training a set of SVMs on a development corpus. This study demonstrates the versatility of dataset refinement in the task of selecting suitable impostor datasets for use in GMM-based speaker verification. The use of refined Z- and T-norm datasets provided performance gains of 15% in EER in the NIST 2006 SRE over the use of heuristically selected datasets. The refined datasets were shown to generalise well to the unseen data of the NIST 2008 SRE.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

A data-driven background dataset refinement technique was recently proposed for SVM based speaker verification. This method selects a refined SVM background dataset from a set of candidate impostor examples after individually ranking examples by their relevance. This paper extends this technique to the refinement of the T-norm dataset for SVM-based speaker verification. The independent refinement of the background and T-norm datasets provides a means of investigating the sensitivity of SVM-based speaker verification performance to the selection of each of these datasets. Using refined datasets provided improvements of 13% in min. DCF and 9% in EER over the full set of impostor examples on the 2006 SRE corpus with the majority of these gains due to refinement of the T-norm dataset. Similar trends were observed for the unseen data of the NIST 2008 SRE.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This work presents an extended Joint Factor Analysis model including explicit modelling of unwanted within-session variability. The goals of the proposed extended JFA model are to improve verification performance with short utterances by compensating for the effects of limited or imbalanced phonetic coverage, and to produce a flexible JFA model that is effective over a wide range of utterance lengths without adjusting model parameters such as retraining session subspaces. Experimental results on the 2006 NIST SRE corpus demonstrate the flexibility of the proposed model by providing competitive results over a wide range of utterance lengths without retraining and also yielding modest improvements in a number of conditions over current state-of-the-art.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

This paper presents a novel approach of estimating the confidence interval of speaker verification scores. This approach is utilised to minimise the utterance lengths required in order to produce a confident verification decision. The confidence estimation method is also extended to address both the problem of high correlation in consecutive frame scores, and robustness with very limited training samples. The proposed technique achieves a drastic reduction in the typical data requirements for producing confident decisions in an automatic speaker verification system. When evaluated on the NIST 2005 SRE, the early verification decision method demonstrates that an average of 5–10 seconds of speech is sufficient to produce verification rates approaching those achieved previously using an average in excess of 100 seconds of speech.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Tzeng et al. proposed a new threshold multi-proxy multi-signature scheme with threshold verification. In their scheme, a subset of original signers authenticates a designated proxy group to sign on behalf of the original group. A message m has to be signed by a subset of proxy signers who can represent the proxy group. Then, the proxy signature is sent to the verifier group. A subset of verifiers in the verifier group can also represent the group to authenticate the proxy signature. Subsequently, there are two improved schemes to eliminate the security leak of Tzeng et al.’s scheme. In this paper, we have pointed out the security leakage of the three schemes and further proposed a novel threshold multi-proxy multi-signature scheme with threshold verification.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The term self-selected (i.e., individual or comfortable walking pace or speed) is commonly used in the literature (Frost, Dowling, Bar-Or, & Dyson, 1997; Jeng, Liao, Lai, & Hou, 1997; Wergel-Kolmert & Wohlfart, 1999; Maltais, Bar-Or, Pienynowski, & Galea, 2003; Browning & Kram, 2005; Browning, Baker, Herron, & Kram, 2006; Hills, Byrne, Wearing, & Armstrong, 2006) and is identified as the most efficient walking speed, with increased efficiency defined by lower oxygen uptake (VO^sub 2^) per unit mechanical work (Hoyt & Taylor, 1981; Taylor, Heglund, & Maloiy, 1982; Hreljac, 1993). [...] assessing individual and group differences in metabolic energy expenditure using oxygen uptake requires individuals to be comfortable with, and able to accommodate to, the equipment.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Privacy enhancing protocols (PEPs) are a family of protocols that allow secure exchange and management of sensitive user information. They are important in preserving users’ privacy in today’s open environment. Proof of the correctness of PEPs is necessary before they can be deployed. However, the traditional provable security approach, though well established for verifying cryptographic primitives, is not applicable to PEPs. We apply the formal method of Coloured Petri Nets (CPNs) to construct an executable specification of a representative PEP, namely the Private Information Escrow Bound to Multiple Conditions Protocol (PIEMCP). Formal semantics of the CPN specification allow us to reason about various security properties of PIEMCP using state space analysis techniques. This investigation provides us with preliminary insights for modeling and verification of PEPs in general, demonstrating the benefit of applying the CPN-based formal approach to proving the correctness of PEPs.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

Current multimedia Web search engines still use keywords as the primary means to search. Due to the richness in multimedia contents, general users constantly experience some difficulties in formulating textual queries that are representative enough for their needs. As a result, query reformulation becomes part of an inevitable process in most multimedia searches. Previous Web query formulation studies did not investigate the modification sequences and thus can only report limited findings on the reformulation behavior. In this study, we propose an automatic approach to examine multimedia query reformulation using large-scale transaction logs. The key findings show that search term replacement is the most dominant type of modifications in visual searches but less important in audio searches. Image search users prefer the specified search strategy more than video and audio users. There is also a clear tendency to replace terms with synonyms or associated terms in visual queries. The analysis of the search strategies in different types of multimedia searching provides some insights into user’s searching behavior, which can contribute to the design of future query formulation assistance for keyword-based Web multimedia retrieval systems.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The increasing diversity of the Internet has created a vast number of multilingual resources on the Web. A huge number of these documents are written in various languages other than English. Consequently, the demand for searching in non-English languages is growing exponentially. It is desirable that a search engine can search for information over collections of documents in other languages. This research investigates the techniques for developing high-quality Chinese information retrieval systems. A distinctive feature of Chinese text is that a Chinese document is a sequence of Chinese characters with no space or boundary between Chinese words. This feature makes Chinese information retrieval more difficult since a retrieved document which contains the query term as a sequence of Chinese characters may not be really relevant to the query since the query term (as a sequence Chinese characters) may not be a valid Chinese word in that documents. On the other hand, a document that is actually relevant may not be retrieved because it does not contain the query sequence but contains other relevant words. In this research, we propose two approaches to deal with the problems. In the first approach, we propose a hybrid Chinese information retrieval model by incorporating word-based techniques with the traditional character-based techniques. The aim of this approach is to investigate the influence of Chinese segmentation on the performance of Chinese information retrieval. Two ranking methods are proposed to rank retrieved documents based on the relevancy to the query calculated by combining character-based ranking and word-based ranking. Our experimental results show that Chinese segmentation can improve the performance of Chinese information retrieval, but the improvement is not significant if it incorporates only Chinese segmentation with the traditional character-based approach. In the second approach, we propose a novel query expansion method which applies text mining techniques in order to find the most relevant words to extend the query. Unlike most existing query expansion methods, which generally select the highly frequent indexing terms from the retrieved documents to expand the query. In our approach, we utilize text mining techniques to find patterns from the retrieved documents that highly correlate with the query term and then use the relevant words in the patterns to expand the original query. This research project develops and implements a Chinese information retrieval system for evaluating the proposed approaches. There are two stages in the experiments. The first stage is to investigate if high accuracy segmentation can make an improvement to Chinese information retrieval. In the second stage, a text mining based query expansion approach is implemented and a further experiment has been done to compare its performance with the standard Rocchio approach with the proposed text mining based query expansion method. The NTCIR5 Chinese collections are used in the experiments. The experiment results show that by incorporating the text mining based query expansion with the hybrid model, significant improvement has been achieved in both precision and recall assessments.

Relevância:

20.00% 20.00%

Publicador:

Resumo:

The cascading appearance-based (CAB) feature extraction technique has established itself as the state of the art in extracting dynamic visual speech features for speech recognition. In this paper, we will focus on investigating the effectiveness of this technique for the related speaker verification application. By investigating the speaker verification ability of each stage of the cascade we will demonstrate that the same steps taken to reduce static speaker and environmental information for the speech recognition application also provide similar improvements for speaker recognition. These results suggest that visual speaker recognition can improve considerable when conducted solely through a consideration of the dynamic speech information rather than the static appearance of the speaker's mouth region.