319 resultados para Information retrieval


Relevância:

60.00% 60.00%

Publicador:

Resumo:

This paper introduces a novel technique to directly optimise the Figure of Merit (FOM) for phonetic spoken term detection. The FOM is a popular measure of sTD accuracy, making it an ideal candiate for use as an objective function. A simple linear model is introduced to transform the phone log-posterior probabilities output by a phe classifier to produce enhanced log-posterior features that are more suitable for the STD task. Direct optimisation of the FOM is then performed by training the parameters of this model using a non-linear gradient descent algorithm. Substantial FOM improvements of 11% relative are achieved on held-out evaluation data, demonstrating the generalisability of the approach.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

As Web searching becomes more prolific for information access worldwide, we need to better understand users’ Web searching behaviour and develop better models of their interaction with Web search systems. Web search modelling is a significant and important area of Web research. Searching on the Web is an integral element of information behaviour and human–computer interaction. Web searching includes multitasking processes, the allocation of cognitive resources among several tasks, and shifts in cognitive, problem and knowledge states. In addition to multitasking, cognitive coordination and cognitive shifts are also important, but are under-explored aspects of Web searching. During the Web searching process, beyond physical actions, users experience various cognitive activities. Interactive Web searching involves many users’ cognitive shifts at different information behaviour levels. Cognitive coordination allows users to trade off the dependences among multiple information tasks and the resources available. Much research has been conducted into Web searching. However, few studies have modelled the nature of and relationship between multitasking, cognitive coordination and cognitive shifts in the Web search context. Modelling how Web users interact with Web search systems is vital for the development of more effective Web IR systems. This study aims to model the relationship between multitasking, cognitive coordination and cognitive shifts during Web searching. A preliminary theoretical model is presented based on previous studies. The research is designed to validate the preliminary model. Forty-two study participants were involved in the empirical study. A combination of data collection instruments, including pre- and post-questionnaires, think-aloud protocols, search logs, observations and interviews were employed to obtain users’ comprehensive data during Web search interactions. Based on the grounded theory approach, qualitative analysis methods including content analysis and verbal protocol analysis were used to analyse the data. The findings were inferred through an analysis of questionnaires, a transcription of think-aloud protocols, the Web search logs, and notes on observations and interviews. Five key findings emerged. (1) Multitasking during Web searching was demonstrated as a two-dimensional behaviour. The first dimension was represented as multiple information problems searching by task switching. Users’ Web searching behaviour was a process of multiple tasks switching, that is, from searching on one information problem to searching another. The second dimension of multitasking behaviour was represented as an information problem searching within multiple Web search sessions. Users usually conducted Web searching on a complex information problem by submitting multiple queries, using several Web search systems and opening multiple windows/tabs. (2) Cognitive shifts were the brain’s internal response to external stimuli. Cognitive shifts were found as an essential element of searching interactions and users’ Web searching behaviour. The study revealed two kinds of cognitive shifts. The first kind, the holistic shift, included users’ perception on the information problem and overall information evaluation before and after Web searching. The second kind, the state shift, reflected users’ changes in focus between the different cognitive states during the course of Web searching. Cognitive states included users’ focus on the states of topic, strategy, evaluation, view and overview. (3) Three levels of cognitive coordination behaviour were identified: the information task coordination level, the coordination mechanism level, and the strategy coordination level. The three levels of cognitive coordination behaviour interplayed to support multiple information tasks switching. (4) An important relationship existed between multitasking, cognitive coordination and cognitive shifts during Web searching. Cognitive coordination as a management mechanism bound together other cognitive processes, including multitasking and cognitive shifts, in order to move through users’ Web searching process. (5) Web search interaction was shown to be a multitasking process which included information problems ordering, task switching and task and mental coordinating; also, at a deeper level, cognitive shifts took place. Cognitive coordination was the hinge behaviour linking multitasking and cognitive shifts. Without cognitive coordination, neither multitasking Web searching behaviour nor the complicated mental process of cognitive shifting could occur. The preliminary model was revisited with these empirical findings. A revised theoretical model (MCC Model) was built to illustrate the relationship between multitasking, cognitive coordination and cognitive shifts during Web searching. Implications and limitations of the study are also discussed, along with future research work.

Relevância:

60.00% 60.00%

Publicador:

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Web is a powerful hypermedia-based information retrieval mechanism that provides a user-friendly access across all major computer platforms connected over Internet. This paper demonstrates the application of Web technology when used as an educational delivery tool. It also reports on the development of a prototype electronic publishing project where Web technology was used to deliver power engineering educational resources. The resulting hyperbook will contain diverse teaching resources such as hypermedia-based modular educational units and computer simulation programs that are linked in a meaningful and structured way. The use of Web for disseminating information of this nature has many advantages that cannot possibly be achieved otherwise. PREAMBLE The continual increase of low-cost functionality available in desktop computing has opened up a new possibility in learning within a wider educational framework. This technology also is supported by enhanced features offered by new and ...

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Digital collections are growing exponentially in size as the information age takes a firm grip on all aspects of society. As a result Information Retrieval (IR) has become an increasingly important area of research. It promises to provide new and more effective ways for users to find information relevant to their search intentions. Document clustering is one of the many tools in the IR toolbox and is far from being perfected. It groups documents that share common features. This grouping allows a user to quickly identify relevant information. If these groups are misleading then valuable information can accidentally be ignored. There- fore, the study and analysis of the quality of document clustering is important. With more and more digital information available, the performance of these algorithms is also of interest. An algorithm with a time complexity of O(n2) can quickly become impractical when clustering a corpus containing millions of documents. Therefore, the investigation of algorithms and data structures to perform clustering in an efficient manner is vital to its success as an IR tool. Document classification is another tool frequently used in the IR field. It predicts categories of new documents based on an existing database of (doc- ument, category) pairs. Support Vector Machines (SVM) have been found to be effective when classifying text documents. As the algorithms for classifica- tion are both efficient and of high quality, the largest gains can be made from improvements to representation. Document representations are vital for both clustering and classification. Representations exploit the content and structure of documents. Dimensionality reduction can improve the effectiveness of existing representations in terms of quality and run-time performance. Research into these areas is another way to improve the efficiency and quality of clustering and classification results. Evaluating document clustering is a difficult task. Intrinsic measures of quality such as distortion only indicate how well an algorithm minimised a sim- ilarity function in a particular vector space. Intrinsic comparisons are inherently limited by the given representation and are not comparable between different representations. Extrinsic measures of quality compare a clustering solution to a “ground truth” solution. This allows comparison between different approaches. As the “ground truth” is created by humans it can suffer from the fact that not every human interprets a topic in the same manner. Whether a document belongs to a particular topic or not can be subjective.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Consider a person searching electronic health records, a search for the term ‘cracked skull’ should return documents that contain the term ‘cranium fracture’. A information retrieval systems is required that matches concepts, not just keywords. Further more, determining relevance of a query to a document requires inference – its not simply matching concepts. For example a document containing ‘dialysis machine’ should align with a query for ‘kidney disease’. Collectively we describe this problem as the ‘semantic gap’ – the difference between the raw medical data and the way a human interprets it. This paper presents an approach to semantic search of health records by combining two previous approaches: an ontological approach using the SNOMED CT medical ontology; and a distributional approach using semantic space vector space models. Our approach will be applied to a specific problem in health informatics: the matching of electronic patient records to clinical trials.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

User-Web interactions have emerged as an important area of research in the field of information science. In this study, we investigate the effects of users’ cognitive styles on their Web navigational styles and information processing strategies. We report results from the analyses of 594 minutes recorded Web search sessions of 18 participants engaged in 54 scenario-based search tasks. We use questionnaires, cognitive style test, Web session logs and think-aloud as the data collection instruments. We classify users’ cognitive styles as verbalisers and imagers based on Riding’s (1991) Cognitive Style Analysis test. Two classifications of navigational styles and three categories of information processing strategies are identified. Our study findings show that there exist relationships between users’ cognitive style, and their navigational styles and information processing strategies. Verbal users seem to display sporadic navigational styles, and adopt a scanning strategy to understand the content of the search result page, while imagery users follow a structured navigational style and reading approach. We develop a matrix and a model that depicts the relationships between users’ cognitive styles, and their navigational style and information processing strategies. We discuss how the findings from this study could help search engine designers to provide an adaptive navigation support to users.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

As more and more information is available on the Web finding quality and reliable information is becoming harder. To help solve this problem, Web search models need to incorporate users’ cognitive styles. This paper reports the preliminary results from a user study exploring the relationships between Web users’ searching behavior and their cognitive style. The data was collected using a questionnaire, Web search logs and think-aloud strategy. The preliminary findings reveal a number of cognitive factors, such as information searching processes, results evaluations and cognitive style, having an influence on users’ Web searching behavior. Among these factors, the cognitive style of the user was observed to have a greater impact. Based on the key findings, a conceptual model of Web searching and cognitive styles is presented.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

User-Web interactions have emerged as an important research in the field of information science. In this study, we examine extensively the Web searching performed by general users. Our goal is to investigate the effects of users’ cognitive styles on their Web search behavior in relation to two broad components: Information Searching and Information Processing Approaches. We use questionnaires, a measure of cognitive style, Web session logs and think-aloud as the data collection instruments. Our study findings show wholistic Web users tend to adopt a top-down approach to Web searching, where the users searched for a generic topic, and then reformulate their queries to search for specific information. They tend to prefer reading to process information. Analytic users tend to prefer a bottom-up approach to information searching and they process information by scanning search result pages.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

In computational linguistics, information retrieval and applied cognition, words and concepts are often represented as vectors in high dimensional spaces computed from a corpus of text. These high dimensional spaces are often referred to as Semantic Spaces. We describe a novel and efficient approach to computing these semantic spaces via the use of complex valued vector representations. We report on the practical implementation of the proposed method and some associated experiments. We also briefly discuss how the proposed system relates to previous theoretical work in Information Retrieval and Quantum Mechanics and how the notions of probability, logic and geometry are integrated within a single Hilbert space representation. In this sense the proposed system has more general application and gives rise to a variety of opportunities for future research.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

The Australian National Data Service (ANDS) was established in 2008 and aims to: influence national policy in the area of data management in the Australian research community; inform best practice for the curation of data, and, transform the disparate collections of research data around Australia into a cohesive collection of research resources One high profile ANDS activity is to establish the population of Research Data Australia, a set of web pages describing data collections produced by or relevant to Australian researchers. It is designed to promote visibility of research data collections in search engines, in order to encourage their re-use. As part of activities associated with the Australian National Data Service, an increasing number of Australian Universities are choosing to implement VIVO, not as a platform to profile information about researchers, but as a 'metadata store' platform to profile information about institutional research data sets, both locally and as part of a national data commons. To date, the University of Melbourne, Griffith University, the Queensland University of Technology, and the University of Western Australia have all chosen to implement VIVO, with interest from other Universities growing.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

INTRODUCTION: Since the introduction of its QUT ePrints institutional repository of published research outputs, together with the world’s first mandate for author contributions to an institutional repository, Queensland University of Technology (QUT) has been a leader in support of green road open access. With QUT ePrints providing our mechanism for supporting the green road to open access, QUT has since then also continued to expand its secondary open access strategy supporting gold road open access, which is also designed to assist QUT researchers to maximise the accessibility and so impact of their research. ---------- METHODS: QUT Library has adopted the position of selectively supporting true gold road open access publishing by using the Library Resource Allocation budget to pay the author publication fees for QUT authors wishing to publish in the open access journals of a range of publishers including BioMed Central, Public Library of Science and Hindawi. QUT Library has been careful to support only true open access publishers and not those open access publishers with hybrid models which “double dip” by charging authors publication fees and libraries subscription fees for the same journal content. QUT Library has maintained a watch on the growing number of open access journals available from gold road open access publishers and their increased rate of success as measured by publication impact. ---------- RESULTS: This paper reports on the successes and challenges of QUT’s efforts to support true gold road open access publishers and promote these publishing strategy options to researchers at QUT. The number and spread of QUT papers submitted and published in the journals of each publisher is provided. Citation counts for papers and authors are also presented and analysed, with the intention of identifying the benefits to accessibility and research impact for early career and established researchers.---------- CONCLUSIONS: QUT Library is eager to continue and further develop support for this publishing strategy, and makes a number of recommendations to other research institutions, on how they can best achieve success with this strategy.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

For the first time in human history, large volumes of spoken audio are being broadcast, made available on the internet, archived, and monitored for surveillance every day. New technologies are urgently required to unlock these vast and powerful stores of information. Spoken Term Detection (STD) systems provide access to speech collections by detecting individual occurrences of specified search terms. The aim of this work is to develop improved STD solutions based on phonetic indexing. In particular, this work aims to develop phonetic STD systems for applications that require open-vocabulary search, fast indexing and search speeds, and accurate term detection. Within this scope, novel contributions are made within two research themes, that is, accommodating phone recognition errors and, secondly, modelling uncertainty with probabilistic scores. A state-of-the-art Dynamic Match Lattice Spotting (DMLS) system is used to address the problem of accommodating phone recognition errors with approximate phone sequence matching. Extensive experimentation on the use of DMLS is carried out and a number of novel enhancements are developed that provide for faster indexing, faster search, and improved accuracy. Firstly, a novel comparison of methods for deriving a phone error cost model is presented to improve STD accuracy, resulting in up to a 33% improvement in the Figure of Merit. A method is also presented for drastically increasing the speed of DMLS search by at least an order of magnitude with no loss in search accuracy. An investigation is then presented of the effects of increasing indexing speed for DMLS, by using simpler modelling during phone decoding, with results highlighting the trade-off between indexing speed, search speed and search accuracy. The Figure of Merit is further improved by up to 25% using a novel proposal to utilise word-level language modelling during DMLS indexing. Analysis shows that this use of language modelling can, however, be unhelpful or even disadvantageous for terms with a very low language model probability. The DMLS approach to STD involves generating an index of phone sequences using phone recognition. An alternative approach to phonetic STD is also investigated that instead indexes probabilistic acoustic scores in the form of a posterior-feature matrix. A state-of-the-art system is described and its use for STD is explored through several experiments on spontaneous conversational telephone speech. A novel technique and framework is proposed for discriminatively training such a system to directly maximise the Figure of Merit. This results in a 13% improvement in the Figure of Merit on held-out data. The framework is also found to be particularly useful for index compression in conjunction with the proposed optimisation technique, providing for a substantial index compression factor in addition to an overall gain in the Figure of Merit. These contributions significantly advance the state-of-the-art in phonetic STD, by improving the utility of such systems in a wide range of applications.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Many researchers have investigated and modelled aspects of Web searching. A number of studies have explored the relationships between individual differences and Web searching. However, limited studies have explored the role of users’ cognitive styles in determining Web searching behaviour. Current models of Web searching have limited consideration of users’ cognitive styles. The impact of users’ cognitive style on Web searching and their relationships are little understood or represented. Individuals differ in their information processing approaches and the way they represent information, thus affecting their performance. To create better models of Web searching we need to understand more about user’s cognitive style and their Web search behaviour, and the relationship between them. More rigorous research is needed in using more complex and meaningful measures of relevance; across a range of different types of search tasks and different populations of Internet users. The project further explores the relationships between the users’ cognitive style and their Web searching. The project will develop a model depicting the relationships between a user’s cognitive style and their Web searching. The related literature, aims and objectives and research design are discussed.

Relevância:

60.00% 60.00%

Publicador:

Resumo:

Special collections, because of the issues associated with conservation and use, a feature they share with archives, tend to be the most digitized areas in libraries. The Nineteenth Century Schoolbooks collection is a collection of 9000 rarely held nineteenth-century schoolbooks that were painstakingly collected over a lifetime of work by Prof. John A. Nietz, and donated to the Hillman Library at the University of Pittsburgh in 1958, which has since grown to 15,000. About 140 of these texts are completely digitized and showcased in a publicly accessible website through the University of Pittsburgh’s Library, along with a searchable bibliography of the entire collection, which expanded the awareness of this collection and its user base to beyond the academic community. The URL for the website is http://digital.library.pitt.edu/nietz/. The collection is a rich resource for researchers studying the intellectual, educational, and textbook publishing history of the United States. In this study, we examined several existing records collected by the Digital Research Library at the University of Pittsburgh in order to determine the identity and searching behaviors of the users of this collection. Some of the records examined include: 1) The results of a 3-month long user survey, 2) User access statistics including search queries for a period of one year, a year after the digitized collection became publicly available in 2001, and 3) E-mail input received by the website over 4 years from 2000-2004. The results of the study demonstrate the differences in online retrieval strategies used by academic researchers and historians, archivists, avocationists, and the general public, and the importance of facilitating the discovery of digitized special collections through the use of electronic finding aids and an interactive interface with detailed metadata.