43 results for keyword
Abstract:
For the first time in human history, large volumes of spoken audio are being broadcast, made available on the internet, archived, and monitored for surveillance every day. New technologies are urgently required to unlock these vast and powerful stores of information. Spoken Term Detection (STD) systems provide access to speech collections by detecting individual occurrences of specified search terms. The aim of this work is to develop improved STD solutions based on phonetic indexing. In particular, this work aims to develop phonetic STD systems for applications that require open-vocabulary search, fast indexing and search speeds, and accurate term detection. Within this scope, novel contributions are made within two research themes, that is, accommodating phone recognition errors and, secondly, modelling uncertainty with probabilistic scores. A state-of-the-art Dynamic Match Lattice Spotting (DMLS) system is used to address the problem of accommodating phone recognition errors with approximate phone sequence matching. Extensive experimentation on the use of DMLS is carried out and a number of novel enhancements are developed that provide for faster indexing, faster search, and improved accuracy. Firstly, a novel comparison of methods for deriving a phone error cost model is presented to improve STD accuracy, resulting in up to a 33% improvement in the Figure of Merit. A method is also presented for drastically increasing the speed of DMLS search by at least an order of magnitude with no loss in search accuracy. An investigation is then presented of the effects of increasing indexing speed for DMLS, by using simpler modelling during phone decoding, with results highlighting the trade-off between indexing speed, search speed and search accuracy. The Figure of Merit is further improved by up to 25% using a novel proposal to utilise word-level language modelling during DMLS indexing. 
Analysis shows that this use of language modelling can, however, be unhelpful or even disadvantageous for terms with a very low language model probability. The DMLS approach to STD involves generating an index of phone sequences using phone recognition. An alternative approach to phonetic STD is also investigated that instead indexes probabilistic acoustic scores in the form of a posterior-feature matrix. A state-of-the-art system is described and its use for STD is explored through several experiments on spontaneous conversational telephone speech. A novel technique and framework is proposed for discriminatively training such a system to directly maximise the Figure of Merit. This results in a 13% improvement in the Figure of Merit on held-out data. The framework is also found to be particularly useful for index compression in conjunction with the proposed optimisation technique, providing for a substantial index compression factor in addition to an overall gain in the Figure of Merit. These contributions significantly advance the state-of-the-art in phonetic STD, by improving the utility of such systems in a wide range of applications.
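The idea of accommodating phone recognition errors through approximate phone sequence matching with a phone error cost model can be illustrated with a generic weighted edit distance. This is only a minimal sketch under stated assumptions, not the DMLS algorithm itself: the cost table `SUB_COSTS`, the function names, the insertion/deletion cost, and the detection threshold are all hypothetical values chosen for illustration.

```python
from typing import Dict, Sequence, Tuple

# Hypothetical substitution costs for confusable phone pairs.
# Illustrative values only; a real cost model would be derived from data.
SUB_COSTS: Dict[Tuple[str, str], float] = {
    ("p", "b"): 0.3,
    ("t", "d"): 0.3,
    ("s", "z"): 0.4,
}


def sub_cost(a: str, b: str) -> float:
    """Cost of observing phone `b` where phone `a` was expected."""
    if a == b:
        return 0.0
    # Look the pair up in either order; unlisted confusions cost 1.0.
    return SUB_COSTS.get((a, b), SUB_COSTS.get((b, a), 1.0))


def match_cost(target: Sequence[str], observed: Sequence[str],
               ins_del_cost: float = 1.0) -> float:
    """Minimum weighted edit distance between two phone sequences."""
    rows, cols = len(target) + 1, len(observed) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        dist[i][0] = i * ins_del_cost
    for j in range(1, cols):
        dist[0][j] = j * ins_del_cost
    for i in range(1, rows):
        for j in range(1, cols):
            dist[i][j] = min(
                dist[i - 1][j] + ins_del_cost,  # target phone missing
                dist[i][j - 1] + ins_del_cost,  # extra observed phone
                dist[i - 1][j - 1] + sub_cost(target[i - 1], observed[j - 1]),
            )
    return dist[-1][-1]


def is_hit(target: Sequence[str], observed: Sequence[str],
           threshold: float = 0.5) -> bool:
    """Report a putative term occurrence when the match cost is low enough."""
    return match_cost(target, observed) <= threshold
```

With these assumed costs, a search for the phone sequence `k ae t` would still accept an indexed sequence `k ae d`, because the t/d confusion is cheap, while rejecting `g ae t`, whose substitution is not in the table.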
Abstract:
The purpose of this review is to update expected values for pedometer-determined physical activity in free-living healthy older populations. A search of the literature published since 2001 began with a keyword (pedometer, "step counter," "step activity monitor" or "accelerometer AND steps/day") search of PubMed, Cumulative Index to Nursing & Allied Health Literature (CINAHL), SportDiscus, and PsychInfo. An iterative process was then undertaken to abstract and verify studies of pedometer-determined physical activity (captured in terms of steps taken; distance only was not accepted) in free-living adult populations described as ≥ 50 years of age (studies that included samples which spanned this threshold were not included unless they provided at least some appropriately age-stratified data) and not specifically recruited based on any chronic disease or disability. We identified 28 studies representing at least 1,343 males and 3,098 females ranging in age from 50–94 years. Eighteen (or 64%) of the studies clearly identified using a Yamax pedometer model. Monitoring frames ranged from 3 days to 1 year; the modal length of time was 7 days (17 studies, or 61%). Mean pedometer-determined physical activity ranged from 2,015 steps/day to 8,938 steps/day. In those studies reporting such data, consistent patterns emerged: males generally took more steps/day than similarly aged females, steps/day decreased across study-specific age groupings, and BMI-defined normal weight individuals took more steps/day than overweight/obese older adults. The range of 2,000–9,000 steps/day likely reflects the true variability of physical activity behaviors in older populations. More explicit patterns, for example sex- and age-specific relationships, remain to be informed by future research endeavors.
Abstract:
Most web service discovery systems use keyword-based search algorithms and, although partially successful, sometimes fail to satisfy some users' information needs. This has given rise to several semantics-based approaches that look to go beyond simple attribute matching and try to capture the semantics of services. However, the results reported in the literature vary and in many cases are worse than the results obtained by keyword-based systems. We believe the accuracy of the mechanisms used to extract tokens from the non-natural language sections of WSDL files directly affects the performance of these techniques, because some of them can be more sensitive to noise. In this paper three existing tokenization algorithms are evaluated and a new algorithm that outperforms all the algorithms found in the literature is introduced.
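To make the token-extraction problem concrete, the following is a minimal sketch of a naive baseline tokenizer for the kind of compound identifiers found in WSDL files. It is not one of the algorithms evaluated in the paper; the function name `tokenize_identifier`, the regular expression, and the example identifiers are assumptions made for illustration.

```python
import re
from typing import List

# Three alternatives, tried in order at each position:
#   1. a run of capitals not followed by a lowercase letter (acronyms),
#   2. an optional capital followed by lowercase letters (ordinary words),
#   3. a run of digits.
# Underscores, dots and other separators match nothing and are skipped.
_TOKEN_RE = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def tokenize_identifier(name: str) -> List[str]:
    """Split a camelCase or snake_case identifier into lowercase tokens."""
    return [token.lower() for token in _TOKEN_RE.findall(name)]
```

For example, `tokenize_identifier("getWeatherByZipCode")` yields `["get", "weather", "by", "zip", "code"]`. A rule of this kind breaks down on identifiers without case or separator cues (for instance all-lowercase concatenations), which is one source of the noise the abstract refers to.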
Abstract:
As the use of Twitter has become more commonplace throughout many nations, its role in political discussion has also increased. This has been evident in contexts ranging from general political discussion through local, state, and national elections (such as in the 2010 Australian elections) to protests and other activist mobilisation (for example in the current uprisings in Tunisia, Egypt, and Yemen, as well as in the controversy around Wikileaks). Research into the use of Twitter in such political contexts has also developed rapidly, aided by substantial advancements in quantitative and qualitative methodologies for capturing, processing, analysing, and visualising Twitter updates by large groups of users. Recent work has especially highlighted the role of the Twitter hashtag – a short keyword, prefixed with the hash symbol ‘#’ – as a means of coordinating a distributed discussion between more or less large groups of users, who do not need to be connected through existing ‘follower’ networks. Twitter hashtags – such as ‘#ausvotes’ for the 2010 Australian elections, ‘#londonriots’ for the coordination of information and political debates around the recent unrest in London, or ‘#wikileaks’ for the controversy around Wikileaks – thus aid the formation of ad hoc publics around specific themes and topics. They emerge from within the Twitter community – sometimes as a result of pre-planning or quickly reached consensus, sometimes through protracted debate about what the appropriate hashtag for an event or topic should be (which may also lead to the formation of competing publics using different hashtags). Drawing on innovative methodologies for the study of Twitter content, this paper examines the use of hashtags in political debate in the context of a number of major case studies.
Abstract:
This paper presents an experimental study that examines the accuracy of various information retrieval techniques for Web service discovery. The main goal of this research is to evaluate algorithms for semantic web service discovery. The evaluation is comprehensively benchmarked using more than 1,700 real-world WSDL documents from the INEX 2010 Web Service Discovery Track dataset. For automatic search, we successfully use Latent Semantic Analysis and BM25 to perform Web service discovery. Moreover, we provide linking analysis, which automatically links possible atomic Web services to meet the complex requirements of users. Our fusion engine recommends a final result to users. Our experiments show that linking analysis can improve the overall performance of Web service discovery. We also find that keyword-based search can return results quickly but is limited in understanding users’ goals.
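As an illustration of the BM25 ranking mentioned above, the sketch below scores tokenised documents against a query using the standard Okapi BM25 formula. It is not the paper's implementation: the function name `bm25_scores`, the parameter defaults `k1=1.5` and `b=0.75` (common choices), the "plus one" form of the IDF term, and the toy service descriptions are all assumptions.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: List[str], docs: List[List[str]],
                k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Return one BM25 score per document for the given query terms."""
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: the number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue  # absent terms contribute nothing
            # Assumed IDF variant: the +1 inside the log keeps it non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalisation.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

For three toy service descriptions, `["weather", "forecast", "service"]`, `["stock", "quote", "service"]` and `["currency", "converter"]`, the query `["weather", "service"]` ranks the first highest, the second lower, and gives the third a score of zero. Because a document scores only on literal term overlap, a sketch like this also shows why keyword-based ranking struggles to capture a user's underlying goal.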
Abstract:
The construction phase of building projects is often a crucial influencing factor in the success or failure of projects. Project managers are believed to play a significant role in firms’ success and competitiveness. Therefore, it is important for firms to better understand the demands of managing projects and the competencies that project managers require for more effective project delivery. In a survey of building project managers in the state of Queensland, Australia, it was found that management and information management system are the top-ranking competencies required by effective project managers. Furthermore, a significant number of respondents identified the site manager, construction manager and client’s representative as the three individuals whose close and regular contact with project managers has the greatest influence on the project managers’ performance. Based on these findings, an intra-project workgroups model is proposed to help project managers facilitate more effective management of people and information on building projects.
Abstract:
Background: This paper presents a novel approach to searching electronic medical records that is based on concept matching rather than keyword matching. Aim: The concept-based approach is intended to overcome specific challenges we identified in searching medical records. Method: Queries and documents were transformed from their term-based originals into medical concepts as defined by the SNOMED-CT ontology. Results: Evaluation on a real-world collection of medical records showed our concept-based approach outperformed a keyword baseline by 25% in Mean Average Precision. Conclusion: The concept-based approach provides a framework for further development of inference-based search systems for dealing with medical data.
Abstract:
Children’s Literature Digital Resources incorporates primary texts published from white settlement to 1945, including children’s and young adult fiction, poetry, short stories, and picture books. This collection is supported by selected secondary material. The objective is to provide a centralised access point for information about Australian children's literature and writers and a growing body of full-text primary resources. Four key aims are:
* To establish an important digital facility for research, teaching, and information provision around Australian children’s literature;
* To provide access to a wide range of high-quality full-text data, both primary and secondary resources;
* To provide access to essential library and research information infrastructure and facilities for established and emerging researchers in the fields of Humanities and Education;
* To enable research while preserving important heritage material.
The collection contains texts digitised for AustLit through cooperation with various Australian libraries. The collection includes children’s and young adult fiction, poetry, picture books, short stories, and critical articles relating to relevant primary texts. Authors of primary sources include Irene Cheyne, E. W. Cole, Richard Rowe, Lillian M. Pyke, and Dorothy Wall. Secondary sources include critical works by Clare Bradford, Heather Scutter, Kerry White, Sharyn Pearce, and Marcie Muir. These full-text materials are keyword searchable (both within individual texts and across the CLDR corpus) and can be downloaded for research purposes. As well as digitising primary and secondary material, the project locates and provides pathways to existing online resources or internet publications to enhance AustLit's Children's Literature subset. These resources include both primary and secondary texts.
Abstract:
This article considers recent cases on guarantees of business loans to identify the lending practices that led the court to set aside the guarantee as against the creditor on the basis that the creditor had engaged in unconscionable conduct. It also explores the role of industry codes of practice in preventing unconscionable conduct, including whether there is a correlation between commitment to an industry code and higher standards of lending practices; whether compliance with an industry code would have produced different outcomes in the cases considered; and whether lenders need to do more than comply with an industry code to ensure their practices are fair and reasonable.
Abstract:
Over the last decade, the majority of existing search techniques have been either keyword-based or category-based, resulting in unsatisfactory effectiveness. Meanwhile, studies have illustrated that more than 80% of users prefer personalized search results. As a result, many studies have devoted a great deal of effort (referred to as collaborative filtering) to investigating personalized notions for enhancing retrieval performance. One of the fundamental yet most challenging steps is to capture precise user information needs. Most Web users are inexperienced or lack the capability to express their needs properly, whereas existing retrieval systems are highly sensitive to vocabulary. Researchers have increasingly proposed the use of ontology-based techniques to improve current mining approaches. The related techniques are not only able to refine search intentions among specific generic domains, but also to access new knowledge by tracking semantic relations. In recent years, some researchers have attempted to build ontological user profiles according to discovered user background knowledge. The knowledge is considered through both global and local analyses, which aim to produce tailored ontologies by a group of concepts. However, a key problem here that has not been addressed is how to accurately match diverse local information to universal global knowledge. This research conducts a theoretical study on the use of personalized ontologies to enhance text mining performance. The objective is to understand user information needs by a "bag-of-concepts" rather than "words". The concepts are gathered from a general world knowledge base named the Library of Congress Subject Headings. To return desirable search results, a novel ontology-based mining approach is introduced to discover accurate search intentions and learn personalized ontologies as user profiles.
The approach can not only pinpoint users' individual intentions in a rough hierarchical structure, but can also interpret their needs by a set of acknowledged concepts. Along with global and local analyses, another solid concept matching approach is carried out to address the mismatch between local information and world knowledge. Relevance features produced by the Relevance Feature Discovery model are taken as representatives of local information. These features have been proven to be the best alternative to user queries for avoiding ambiguity, and they consistently outperform the features extracted by other filtering models. The two proposed approaches are both evaluated in a scientific evaluation with the standard Reuters Corpus Volume 1 testing set. A comprehensive comparison is made with a number of state-of-the-art baseline models, including TF-IDF, Rocchio, Okapi BM25, the deploying Pattern Taxonomy Model, and an ontology-based model. The gathered results indicate that the top precision can be improved remarkably with the proposed ontology mining approach, and that the matching approach is successful and achieves significant improvements in most information filtering measurements. This research contributes to the fields of ontological filtering, user profiling, and knowledge representation. The related outputs are critical when systems are expected to return proper mining results and provide personalized services. The scientific findings have the potential to facilitate the design of advanced preference mining models that impact people's daily lives.
Abstract:
Equity and Trusts : in Principle, 3rd edition is updated and revised throughout. It addresses the principles of equity and trusts and provides a clear analysis of this area.
Abstract:
BACKGROUND: Ankle joint equinus, or restricted dorsiflexion range of motion (ROM), has been linked to a range of pathologies of relevance to clinical practitioners. This systematic review and meta-analysis investigated the effects of conservative interventions on ankle joint ROM in healthy individuals and athletic populations. METHODS: Keyword searches of the Embase, Medline, Cochrane and CINAHL databases were performed, with the final search being run in August 2013. Studies were eligible for inclusion if they assessed the effect of a non-surgical intervention on ankle joint dorsiflexion in healthy populations. Studies were quality rated using a standard quality assessment scale. Standardised mean differences (SMDs) and 95% confidence intervals (CIs) were calculated and results were pooled where study methods were homogeneous. RESULTS: Twenty-three studies met eligibility criteria, with a total of 734 study participants. Results suggest that there is some evidence to support the efficacy of static stretching alone (SMDs: range 0.70 to 1.69) and static stretching in combination with ultrasound (SMDs: range 0.91 to 0.95), diathermy (SMD 1.12), diathermy and ice (SMD 1.16), heel raise exercises (SMDs: range 0.70 to 0.77), superficial moist heat (SMDs: range 0.65 to 0.84) and warm up (SMD 0.87) in improving ankle joint dorsiflexion ROM. CONCLUSIONS: Some evidence exists to support the efficacy of stretching alone and stretching in combination with other therapies in increasing ankle joint ROM in healthy individuals. There is a paucity of quality evidence to support the efficacy of other non-surgical interventions; thus, further research in this area is warranted.
Abstract:
Big Data presents many challenges related to volume, whether one is interested in studying past datasets or, even more problematically, attempting to work with live streams of data. The most obvious challenge, in a ‘noisy’ environment such as contemporary social media, is to collect the pertinent information; be that information for a specific study, tweets which can inform emergency services or other responders to an ongoing crisis, or give an advantage to those involved in prediction markets. Often, such a process is iterative, with keywords and hashtags changing with the passage of time, and both collection and analytic methodologies need to be continually adapted to respond to this changing information. While many of the data sets collected and analyzed are preformed, that is they are built around a particular keyword, hashtag, or set of authors, they still contain a large volume of information, much of which is unnecessary for the current purpose and/or potentially useful for future projects. Accordingly, this panel considers methods for separating and combining data to optimize big data research and report findings to stakeholders. The first paper considers possible coding mechanisms for incoming tweets during a crisis, taking a large stream of incoming tweets and selecting which of those need to be immediately placed in front of responders, for manual filtering and possible action. The paper suggests two solutions for this, content analysis and user profiling. In the former case, aspects of the tweet are assigned a score to assess its likely relationship to the topic at hand, and the urgency of the information, whilst the latter attempts to identify those users who are either serving as amplifiers of information or are known as an authoritative source. 
Through these techniques, the information contained in a large dataset could be filtered down to match the expected capacity of emergency responders, and knowledge as to the core keywords or hashtags relating to the current event is constantly refined for future data collection. The second paper is also concerned with identifying significant tweets, but in this case tweets relevant to a particular prediction market: tennis betting. As increasing numbers of professional sports men and women create Twitter accounts to communicate with their fans, information is being shared regarding injuries, form and emotions which have the potential to impact on future results. As has already been demonstrated with leading US sports, such information is extremely valuable. Tennis, as with American Football (NFL) and Baseball (MLB), has paid subscription services which manually filter incoming news sources, including tweets, for information valuable to gamblers, gambling operators, and fantasy sports players. However, whilst such services are still niche operations, much of the value of information is lost by the time it reaches one of these services. The paper thus considers how information could be filtered from Twitter user lists and hashtag or keyword monitoring, assessing the value of the source, information, and the prediction markets to which it may relate. The third paper examines methods for collecting Twitter data and following changes in an ongoing, dynamic social movement, such as the Occupy Wall Street movement. It involves the development of technical infrastructure to collect and make the tweets available for exploration and analysis. A strategy to respond to changes in the social movement is also required, or the resulting tweets will only reflect the discussions and strategies the movement used at the time the keyword list was created; in a way, keyword creation is part strategy and part art.
In this paper we describe strategies for the creation of a social media archive, specifically tweets related to the Occupy Wall Street movement, and methods for continuing to adapt data collection strategies as the movement’s presence in Twitter changes over time. We also discuss the opportunities and methods to extract smaller slices of data from an archive of social media data to support a multitude of research projects in multiple fields of study. The common theme amongst these papers is that of constructing a data set, filtering it for a specific purpose, and then using the resulting information to aid in future data collection. The intention is that through the papers presented, and subsequent discussion, the panel will inform the wider research community not only on the objectives and limitations of data collection, live analytics, and filtering, but also on current and in-development methodologies that could be adopted by those working with such datasets, and how such approaches could be customized depending on the project stakeholders.
Abstract:
Interdisciplinary research is often funded by national government initiatives or large corporate sponsorship, and as such, demands periodic reporting on the use of those funds. For reasons of accountability, governance and communication to the taxpayer, the outcomes of the research need to be measured and understood. The interdisciplinary approach to research raises many challenges for impact reporting. This presentation will consider best-practice workflow models and methodologies. Novel methodologies that can be added to the usual metrics of academic publications include analysis of percentage share of total publications in a subject or keyword field, calculating the most cited publication in a key phrase category, analysis of who has cited or reviewed the work, and benchmarking of this data against others in that same category. At QUT, interest in how collaborative networking is trending in a research theme has led to the creation of some useful co-authorship graphs that demonstrate the network positions of authors and the strength of their scientific collaborations within a group. The scale of international collaborations is also worth including in the assessment. However, despite all of the tools and techniques available, the most useful way a researcher can help themselves and the process is to set up and maintain their researcher identifier and profile.
Abstract:
The support for typically out-of-vocabulary query terms such as names, acronyms, and foreign words is an important requirement of many speech indexing applications. However, to date many unrestricted vocabulary indexing systems have struggled to provide a balance between good detection rate and fast query speeds. This paper presents a fast and accurate unrestricted vocabulary speech indexing technique named Dynamic Match Lattice Spotting (DMLS). The proposed method augments the conventional lattice spotting technique with dynamic sequence matching, together with a number of other novel algorithmic enhancements, to obtain a system that is capable of searching hours of speech in seconds while maintaining excellent detection performance.