223 results for keyword
Abstract:
Traditional information retrieval (IR) systems respond to user queries with ranked lists of relevant documents. The separation of content and structure in XML documents allows individual XML elements to be selected in isolation. Thus, users expect XML-IR systems to return highly relevant results that are more precise than entire documents. In this paper we describe the implementation of a search engine for XML document collections. The system is keyword based and is built upon an XML inverted file system. We describe the approach that was adopted to meet the requirements of Content Only (CO) and Vague Content and Structure (VCAS) queries in INEX 2004.
Abstract:
The historical challenge of environmental impact assessment (EIA) has been to predict project-based impacts accurately. Both EIA legislation and the practice of EIA have evolved over the last three decades in Canada, and the development of the discipline and science of environmental assessment has improved how we apply environmental assessment to complex projects. The practice of environmental assessment integrates the social and natural sciences and relies on an eclectic knowledge base from a wide range of sources. EIA methods and tools provide a means to structure and integrate knowledge in order to evaluate and predict environmental impacts.
This chapter will provide a brief overview of how impacts are identified and predicted. How do we determine what aspect of the natural and social environment will be affected when a mine is excavated? How does the practitioner determine the range of potential impacts, assess whether they are significant, and predict the consequences? There are no standard answers to these questions, but there are established methods to provide a foundation for scoping and predicting the potential impacts of a project.
Of course, the community and publics play an important role in this process, and this will be discussed in subsequent chapters. In the first part of this chapter, we will deal with impact identification, which involves applying scoping to critical issues and determining impact significance, baseline ecosystem evaluation techniques, and how to communicate environmental impacts. In the second part of the chapter, we discuss the prediction of impacts in relation to the complexity of the environment, ecological risk assessment, and modelling.
Abstract:
With the advent of Service Oriented Architecture, Web services have gained tremendous popularity. Due to the availability of a large number of Web services, finding an appropriate Web service according to the requirement of the user is a challenge. This warrants the need to establish an effective and reliable process of Web service discovery. A considerable body of research has emerged to develop methods to improve the accuracy of Web service discovery to match the best service. The process of Web service discovery results in suggesting many individual services that partially fulfil the user’s interest. Considering the semantic relationships of the words used in describing the services, as well as the input and output parameters, can lead to accurate Web service discovery. Appropriate linking of individual matched services should fully satisfy the requirements which the user is looking for. This research proposes to integrate a semantic model and a data mining technique to enhance the accuracy of Web service discovery. A novel three-phase Web service discovery methodology has been proposed. The first phase performs match-making to find semantically similar Web services for a user query. In order to perform semantic analysis on the content present in the Web service description language document, the support-based latent semantic kernel is constructed using an innovative concept of binning and merging on a large quantity of text documents covering diverse domains of knowledge. The use of a generic latent semantic kernel constructed with a large number of terms helps to find the hidden meaning of the query terms which otherwise could not be found. Sometimes a single Web service is unable to fully satisfy the requirement of the user. In such cases, a composition of multiple inter-related Web services is presented to the user. The task of checking the possibility of linking multiple Web services is done in the second phase.
Once the feasibility of linking Web services is checked, the objective is to provide the user with the best composition of Web services. In the link analysis phase, the Web services are modelled as nodes of a graph and an all-pairs shortest-path algorithm is applied to find the optimum path at the minimum cost for traversal. The third phase, system integration, integrates the results from the preceding two phases by using an original fusion algorithm in the fusion engine. Finally, the recommendation engine, which is an integral part of the system integration phase, makes the final recommendations, including individual and composite Web services, to the user. In order to evaluate the performance of the proposed method, extensive experimentation has been performed. Results of the proposed support-based semantic kernel method of Web service discovery are compared with the results of the standard keyword-based information-retrieval method and a clustering-based machine-learning method of Web service discovery. The proposed method outperforms both information-retrieval and machine-learning based methods. Experimental results and statistical analysis also show that the best Web service compositions are obtained by considering 10 to 15 Web services that are found in phase-I for linking. Empirical results also ascertain that the fusion engine boosts the accuracy of Web service discovery by combining the inputs from both the semantic analysis (phase-I) and the link analysis (phase-II) in a systematic fashion. Overall, the accuracy of Web service discovery with the proposed method shows a significant improvement over traditional discovery methods.
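The link analysis step in this abstract can be illustrated with a small sketch. The abstract does not say which all-pairs shortest-path algorithm the authors used; Floyd-Warshall is assumed here as one common choice, and the service names and link costs are hypothetical.

```python
import math


def all_pairs_shortest_paths(nodes, edges):
    """Floyd-Warshall over a directed, weighted graph.

    nodes: list of service names (hypothetical examples below).
    edges: dict mapping (source, target) -> non-negative link cost.
    Returns dist[u][v], the minimum total cost of chaining u to v.
    """
    # Start with zero cost to self and infinite cost everywhere else.
    dist = {u: {v: (0.0 if u == v else math.inf) for v in nodes} for u in nodes}
    for (u, v), cost in edges.items():
        dist[u][v] = min(dist[u][v], cost)
    # Allow each node k in turn to act as an intermediate hop.
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


# Hypothetical services: the output of one can feed the input of the next.
services = ["GeoCode", "Weather", "Alert"]
links = {
    ("GeoCode", "Weather"): 1.0,
    ("Weather", "Alert"): 2.0,
    ("GeoCode", "Alert"): 5.0,
}
dist = all_pairs_shortest_paths(services, links)
print(dist["GeoCode"]["Alert"])
```

In this toy graph the two-hop composition via Weather costs less than the direct link, so it is the one a cost-minimising composition step would prefer.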
Abstract:
Current multimedia Web search engines still use keywords as the primary means to search. Due to the richness of multimedia content, general users constantly experience difficulties in formulating textual queries that are representative enough for their needs. As a result, query reformulation becomes an inevitable part of most multimedia searches. Previous Web query formulation studies did not investigate the modification sequences and thus can only report limited findings on reformulation behavior. In this study, we propose an automatic approach to examine multimedia query reformulation using large-scale transaction logs. The key findings show that search term replacement is the dominant type of modification in visual searches but less important in audio searches. Image search users prefer the specified search strategy more than video and audio users. There is also a clear tendency to replace terms with synonyms or associated terms in visual queries. The analysis of the search strategies in different types of multimedia searching provides some insights into users’ searching behavior, which can contribute to the design of future query formulation assistance for keyword-based Web multimedia retrieval systems.
Abstract:
Purpose – This paper aims to report findings from an exploratory study investigating the web interactions and technoliteracy of children in the early childhood years. Previous research has studied aspects of older children’s technoliteracy and web searching; however, few studies have analyzed web search data from children younger than six years of age. Design/methodology/approach – The study explored the Google web searching and technoliteracy of young children who are enrolled in a “preparatory classroom” or kindergarten (the year before young children begin compulsory schooling in Queensland, Australia). Young children were video- and audio-taped while conducting Google web searches in the classroom. The data were qualitatively analysed to understand the young children’s web search behaviour. Findings – The findings show that young children engage in complex web searches, including keyword searching and browsing, query formulation and reformulation, relevance judgments, successive searches, information multitasking and collaborative behaviours. The study results provide significant initial insights into young children’s web searching and technoliteracy. Practical implications – The use of web search engines by young children is an important research area with implications for educators and web technologies developers. Originality/value – This is the first study of young children’s interaction with a web search engine.
Abstract:
Background: Work-related injuries in Australia are estimated to cost around $57.5 billion annually; however, there are currently insufficient surveillance data available to support an evidence-based public health response. Emergency departments (EDs) in Australia are a potential source of information on work-related injuries, though most EDs do not have an ‘Activity Code’ to identify work-related cases, with information about the presenting problem recorded in a short free text field. This study compared methods for interrogating text fields for identifying work-related injuries presenting at emergency departments to inform approaches to surveillance of work-related injury.
Methods: Three approaches were used to interrogate an injury description text field to classify cases as work-related: keyword search, index search, and content analytic text mining. Sensitivity and specificity were examined by comparing cases flagged by each approach to cases coded with an Activity code during triage. Methods to improve the sensitivity and/or specificity of each approach were explored by adjusting the classification techniques within each broad approach.
Results: The basic keyword search detected 58% of cases (Specificity 0.99), an index search detected 62% of cases (Specificity 0.87), and the content analytic text mining (using adjusted probabilities) approach detected 77% of cases (Specificity 0.95).
Conclusions: The findings of this study provide strong support for continued development of text searching methods to obtain information from routine emergency department data, to improve the capacity for comprehensive injury surveillance.
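As a rough illustration of how a keyword search can be scored against a coded reference field, the sketch below flags free-text descriptions and computes sensitivity and specificity. The records, keyword list, and function names are all hypothetical and are not taken from the study.

```python
def keyword_flag(text, keywords):
    """Flag a free-text description if it contains any keyword (case-insensitive)."""
    lowered = text.lower()
    return any(k in lowered for k in keywords)


def sensitivity_specificity(predicted, actual):
    """Compare predicted flags with reference labels (for example, an activity code)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical records: (injury description, coded as work-related at triage).
records = [
    ("fell from ladder at work", True),
    ("cut hand on factory machine", True),
    ("injured back lifting boxes", True),  # work-related, but no keyword present
    ("fell off bike in park", False),
    ("burn from cooking at home", False),
]
keywords = ("work", "factory")  # assumed keyword list, for illustration only

predicted = [keyword_flag(text, keywords) for text, _ in records]
actual = [label for _, label in records]
sens, spec = sensitivity_specificity(predicted, actual)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

The third record shows the typical weakness of a plain keyword search: a work-related case with no matching term is missed, which lowers sensitivity while specificity stays high.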
Abstract:
Citizenship is a term of association among strangers. Access to it involves contested identities and symbolic meanings, differing power relations and strategies of inclusion, exclusion and action, and unequal room for maneuver or productivity in the uses of citizenship for any given group or individual. In the context of "rethinking communication," strenuous action is needed to associate such different life chances in a common enterprise at a national level or, more modestly, simply to claim equivalence for all such groups under the rule of one law.
Abstract:
For the first time in human history, large volumes of spoken audio are being broadcast, made available on the internet, archived, and monitored for surveillance every day. New technologies are urgently required to unlock these vast and powerful stores of information. Spoken Term Detection (STD) systems provide access to speech collections by detecting individual occurrences of specified search terms. The aim of this work is to develop improved STD solutions based on phonetic indexing. In particular, this work aims to develop phonetic STD systems for applications that require open-vocabulary search, fast indexing and search speeds, and accurate term detection. Within this scope, novel contributions are made within two research themes: firstly, accommodating phone recognition errors and, secondly, modelling uncertainty with probabilistic scores. A state-of-the-art Dynamic Match Lattice Spotting (DMLS) system is used to address the problem of accommodating phone recognition errors with approximate phone sequence matching. Extensive experimentation on the use of DMLS is carried out and a number of novel enhancements are developed that provide for faster indexing, faster search, and improved accuracy. Firstly, a novel comparison of methods for deriving a phone error cost model is presented to improve STD accuracy, resulting in up to a 33% improvement in the Figure of Merit. A method is also presented for increasing the speed of DMLS search by at least an order of magnitude with no loss in search accuracy. An investigation is then presented of the effects of increasing indexing speed for DMLS, by using simpler modelling during phone decoding, with results highlighting the trade-off between indexing speed, search speed and search accuracy. The Figure of Merit is further improved by up to 25% using a novel proposal to utilise word-level language modelling during DMLS indexing.
Analysis shows that this use of language modelling can, however, be unhelpful or even disadvantageous for terms with a very low language model probability. The DMLS approach to STD involves generating an index of phone sequences using phone recognition. An alternative approach to phonetic STD is also investigated that instead indexes probabilistic acoustic scores in the form of a posterior-feature matrix. A state-of-the-art system is described and its use for STD is explored through several experiments on spontaneous conversational telephone speech. A novel technique and framework is proposed for discriminatively training such a system to directly maximise the Figure of Merit. This results in a 13% improvement in the Figure of Merit on held-out data. The framework is also found to be particularly useful for index compression in conjunction with the proposed optimisation technique, providing for a substantial index compression factor in addition to an overall gain in the Figure of Merit. These contributions significantly advance the state-of-the-art in phonetic STD, by improving the utility of such systems in a wide range of applications.
Abstract:
The purpose of this review is to update expected values for pedometer-determined physical activity in free-living healthy older populations. A search of the literature published since 2001 began with a keyword (pedometer, "step counter," "step activity monitor" or "accelerometer AND steps/day") search of PubMed, Cumulative Index to Nursing & Allied Health Literature (CINAHL), SportDiscus, and PsychInfo. An iterative process was then undertaken to abstract and verify studies of pedometer-determined physical activity (captured in terms of steps taken; distance only was not accepted) in free-living adult populations described as ≥ 50 years of age (studies that included samples which spanned this threshold were not included unless they provided at least some appropriately age-stratified data) and not specifically recruited based on any chronic disease or disability. We identified 28 studies representing at least 1,343 males and 3,098 females ranging in age from 50–94 years. Eighteen (or 64%) of the studies clearly identified using a Yamax pedometer model. Monitoring frames ranged from 3 days to 1 year; the modal length of time was 7 days (17 studies, or 61%). Mean pedometer-determined physical activity ranged from 2,015 steps/day to 8,938 steps/day. In those studies reporting such data, consistent patterns emerged: males generally took more steps/day than similarly aged females, steps/day decreased across study-specific age groupings, and BMI-defined normal weight individuals took more steps/day than overweight/obese older adults. The range of 2,000–9,000 steps/day likely reflects the true variability of physical activity behaviors in older populations. More explicit patterns, for example sex- and age-specific relationships, remain to be informed by future research endeavors.
Abstract:
Most web service discovery systems use keyword-based search algorithms and, although partially successful, sometimes fail to satisfy some users’ information needs. This has given rise to several semantics-based approaches that look to go beyond simple attribute matching and try to capture the semantics of services. However, the results reported in the literature vary and in many cases are worse than the results obtained by keyword-based systems. We believe the accuracy of the mechanisms used to extract tokens from the non-natural language sections of WSDL files directly affects the performance of these techniques, because some of them can be more sensitive to noise. In this paper three existing tokenization algorithms are evaluated and a new algorithm that outperforms all the algorithms found in the literature is introduced.
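To make the tokenization problem concrete, here is a minimal sketch of one generic way to split WSDL-style identifiers into word tokens. It is a simple regular-expression heuristic assumed for illustration; it is not one of the three algorithms evaluated in the paper, nor the new algorithm it introduces, and the example identifiers are hypothetical.

```python
import re

# Three alternatives, tried in order at each position:
#   1. a run of capitals not followed by a lowercase letter (acronyms such as "ZIP")
#   2. an optional capital followed by lowercase letters ("get", "Weather")
#   3. a run of digits
_TOKEN = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def tokenize_identifier(name):
    """Split a camelCase or snake_case identifier into lowercase tokens."""
    return [token.lower() for token in _TOKEN.findall(name)]


print(tokenize_identifier("getWeatherByZIPCode"))
print(tokenize_identifier("get_user_ID2"))
```

A heuristic like this depends on case and separator cues, so identifiers written without them (for example "getweatherbyzipcode") would come out as a single noisy token, which is the kind of noise the abstract refers to.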
Abstract:
As the use of Twitter has become more commonplace throughout many nations, its role in political discussion has also increased. This has been evident in contexts ranging from general political discussion through local, state, and national elections (such as in the 2010 Australian elections) to protests and other activist mobilisation (for example in the current uprisings in Tunisia, Egypt, and Yemen, as well as in the controversy around Wikileaks). Research into the use of Twitter in such political contexts has also developed rapidly, aided by substantial advancements in quantitative and qualitative methodologies for capturing, processing, analysing, and visualising Twitter updates by large groups of users. Recent work has especially highlighted the role of the Twitter hashtag – a short keyword, prefixed with the hash symbol ‘#’ – as a means of coordinating a distributed discussion between more or less large groups of users, who do not need to be connected through existing ‘follower’ networks. Twitter hashtags – such as ‘#ausvotes’ for the 2010 Australian elections, ‘#londonriots’ for the coordination of information and political debates around the recent unrest in London, or ‘#wikileaks’ for the controversy around Wikileaks – thus aid the formation of ad hoc publics around specific themes and topics. They emerge from within the Twitter community – sometimes as a result of pre-planning or quickly reached consensus, sometimes through protracted debate about what the appropriate hashtag for an event or topic should be (which may also lead to the formation of competing publics using different hashtags). Drawing on innovative methodologies for the study of Twitter content, this paper examines the use of hashtags in political debate in the context of a number of major case studies.
Abstract:
This paper presents an experimental study that examines the accuracy of various information retrieval techniques for Web service discovery. The main goal of this research is to evaluate algorithms for semantic Web service discovery. The evaluation is comprehensively benchmarked using more than 1,700 real-world WSDL documents from the INEX 2010 Web Service Discovery Track dataset. For automatic search, we successfully use Latent Semantic Analysis and BM25 to perform Web service discovery. Moreover, we provide linking analysis which automatically links possible atomic Web services to meet the complex requirements of users. Our fusion engine recommends a final result to users. Our experiments show that linking analysis can improve the overall performance of Web service discovery. We also find that keyword-based search can return results quickly but is limited in its ability to understand users’ goals.
Abstract:
The construction phase of building projects is often a crucial influencing factor in the success or failure of projects. Project managers are believed to play a significant role in firms’ success and competitiveness. Therefore, it is important for firms to better understand the demands of managing projects and the competencies that project managers require for more effective project delivery. In a survey of building project managers in the state of Queensland, Australia, it was found that management and information management system are the top-ranking competencies required by effective project managers. Furthermore, a significant number of respondents identified the site manager, construction manager and client’s representative as the three individuals whose close and regular contact with project managers has the greatest influence on the project managers’ performance. Based on these findings, an intra-project workgroups model is proposed to help project managers facilitate more effective management of people and information on building projects.
Abstract:
Background: This paper presents a novel approach to searching electronic medical records that is based on concept matching rather than keyword matching. Aim: The concept-based approach is intended to overcome specific challenges we identified in searching medical records. Method: Queries and documents were transformed from their term-based originals into medical concepts as defined by the SNOMED-CT ontology. Results: Evaluation on a real-world collection of medical records showed our concept-based approach outperformed a keyword baseline by 25% in Mean Average Precision. Conclusion: The concept-based approach provides a framework for further development of inference-based search systems for dealing with medical data.
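The evaluation measure named in this abstract, Mean Average Precision, can be sketched as follows. This is the standard textbook definition, not the authors' evaluation code, and the document identifiers and relevance judgments are hypothetical.

```python
def average_precision(ranked, relevant):
    """Average of precision@k taken at each rank k where a relevant document appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    # Dividing by all relevant documents penalises those never retrieved.
    return total / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked list, relevant set) pairs, one per query."""
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)


# Two hypothetical queries with their ranked results and relevance judgments.
runs = [
    (["d1", "d2", "d3"], {"d1", "d3"}),
    (["d4", "d5"], {"d5"}),
]
map_score = mean_average_precision(runs)
print(round(map_score, 4))
```

Comparing a concept-based run against a keyword baseline then amounts to computing this score for each system over the same queries and judgments.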