976 results for missing information
Abstract:
This item provides supplementary materials for the paper mentioned in the title, specifically a range of organisms used in the study. The full abstract for the main paper is as follows: Next Generation Sequencing (NGS) technologies have revolutionised molecular biology, allowing clinical sequencing to become a matter of routine. NGS data sets consist of short sequence reads obtained from the machine, given context and meaning through downstream assembly and annotation. For these techniques to operate successfully, the collected reads must be consistent with the assumed species or species group, and not corrupted in some way. The common bacterium Staphylococcus aureus may cause severe and life-threatening infections in humans, with some strains exhibiting antibiotic resistance. In this paper, we apply an SVM classifier to the important problem of distinguishing S. aureus sequencing projects from alternative pathogens, including closely related Staphylococci. Using a sequence k-mer representation, we achieve precision and recall above 95%, implicating features with important functional associations.
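As a rough illustration of the k-mer plus SVM pipeline described above, the sketch below builds fixed-length k-mer frequency vectors from reads and trains a linear SVM with scikit-learn; the k value, classifier settings and data handling are illustrative assumptions, not the authors' exact setup.

```python
# Minimal sketch of a k-mer + SVM read classifier (illustrative; not the
# authors' exact pipeline). Assumes scikit-learn and NumPy are available.
from itertools import product
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

K = 4  # k-mer length (hypothetical choice)
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}

def kmer_vector(read: str) -> np.ndarray:
    """Count k-mer occurrences in a read and normalise to frequencies."""
    vec = np.zeros(len(KMERS))
    for i in range(len(read) - K + 1):
        idx = INDEX.get(read[i:i + K])
        if idx is not None:
            vec[idx] += 1
    total = vec.sum()
    return vec / total if total else vec

def train_and_evaluate(reads, labels):
    """reads: DNA strings; labels: 1 = S. aureus project, 0 = other pathogen."""
    X = np.array([kmer_vector(r) for r in reads])
    y = np.array(labels)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    clf = LinearSVC(C=1.0).fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    return precision_score(y_te, pred), recall_score(y_te, pred)
```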
Abstract:
The Therapeutic Advice and Information Service was funded by the National Prescribing Service to provide a national drug information service for health professionals working in the community. For ten years the service achieved high levels of client satisfaction and reached its contracted target of 6000 enquiries about medicines per year; however, the service ceased on 30 June 2010.
Abstract:
A building information model (BIM) provides a rich representation of a building's design. However, there are many challenges in getting construction-specific information from a BIM, limiting the usability of BIM for construction and other downstream processes. This paper describes a novel approach that utilizes ontology-based feature modeling, automatic feature extraction based on ifcXML, and query processing to extract information relevant to construction practitioners from a given BIM. The feature ontology generically represents construction-specific information that is useful for a broad range of construction management functions. The software prototype uses the ontology to transform the designer-focused BIM into a construction-specific feature-based model (FBM). The formal query methods operate on the FBM to further help construction users to quickly extract the necessary information from a BIM. Our tests demonstrate that this approach provides a richer representation of construction-specific information compared to existing BIM tools.
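The sketch below illustrates the kind of automatic feature extraction from ifcXML that such an approach relies on, using only the Python standard library; the element and attribute names (e.g. IfcWallStandardCase, GlobalId) are illustrative, and the paper's actual feature ontology and extraction rules are not reproduced here.

```python
# Hedged sketch: pulling candidate construction features out of an ifcXML
# export. Tag and attribute names are illustrative; real ifcXML files may
# nest identifiers differently depending on the schema version.
import xml.etree.ElementTree as ET

FEATURE_TAGS = {"IfcWallStandardCase", "IfcSlab", "IfcColumn", "IfcOpeningElement"}

def extract_features(ifcxml_path: str):
    """Return (feature type, global id, name) tuples found in an ifcXML file."""
    tree = ET.parse(ifcxml_path)
    features = []
    for elem in tree.iter():
        tag = elem.tag.split("}")[-1]  # strip XML namespace if present
        if tag in FEATURE_TAGS:
            features.append((tag, elem.get("GlobalId", ""), elem.get("Name", "")))
    return features

def feature_counts(ifcxml_path: str):
    """Group extracted features by type for a construction-specific summary view."""
    counts = {}
    for ftype, _gid, _name in extract_features(ifcxml_path):
        counts[ftype] = counts.get(ftype, 0) + 1
    return counts
```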
Abstract:
The design and construction community has shown increasing interest in adopting building information models (BIMs). The richness of information provided by BIMs has the potential to streamline the design and construction processes by enabling enhanced communication, coordination, automation and analysis. However, there are many challenges in extracting construction-specific information out of BIMs. In most cases, construction practitioners have to manually identify the required information, which is inefficient and prone to error, particularly for complex, large-scale projects. This paper describes the process and methods we have formalized to partially automate the extraction and querying of construction-specific information from a BIM. We describe methods for analyzing a BIM to query for spatial information that is relevant for construction practitioners, and that is typically represented implicitly in a BIM. Our approach integrates ifcXML data and other spatial data to develop a richer model for construction users. We employ custom 2D topological XQuery predicates to answer a variety of spatial queries. The validation results demonstrate that this approach provides a richer representation of construction-specific information compared to existing BIM tools.
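As an illustration of the 2D topological tests such spatial queries rest on, the sketch below expresses two predicates (intersects, contains) over axis-aligned footprints in Python rather than XQuery; the data model is a deliberate simplification of the polygonal footprints a real BIM would provide.

```python
# Hedged sketch of 2D topological predicates of the kind the custom XQuery
# predicates encode, simplified to axis-aligned bounding boxes.
from dataclasses import dataclass

@dataclass
class Box:
    """Axis-aligned 2D footprint: (xmin, ymin) to (xmax, ymax)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

def intersects(a: Box, b: Box) -> bool:
    """True if the two footprints overlap in plan view."""
    return a.xmin < b.xmax and b.xmin < a.xmax and a.ymin < b.ymax and b.ymin < a.ymax

def contains(outer: Box, inner: Box) -> bool:
    """True if 'inner' lies entirely within 'outer' (e.g., a column inside a room)."""
    return (outer.xmin <= inner.xmin and inner.xmax <= outer.xmax and
            outer.ymin <= inner.ymin and inner.ymax <= outer.ymax)

def intruding_areas(scaffold: Box, work_areas: dict) -> list:
    """Typical spatial query: which work areas does a scaffold footprint overlap?"""
    return [name for name, area in work_areas.items() if intersects(scaffold, area)]
```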
Abstract:
Modern mobile computing devices are versatile, but bring the burden of constantly adjusting settings to the current conditions of the environment. Until now this task has had to be accomplished by the human user, yet the variety of sensors usually deployed in such a handset provides enough data for autonomous self-configuration by a learning, adaptive system. However, this data is not always fully available at a given point in time, or can contain false values. Handling potentially incomplete sensor data to detect context changes without a semantic layer represents a scientific challenge, which we address with our approach. A novel machine learning technique is presented - the Missing-Values-SOM - which solves this problem by predicting setting adjustments based on context information. Our method is centered on a self-organizing map, extended to provide a means of handling missing values. We demonstrate the performance of our approach on mobile context snapshots, as well as on classical machine learning datasets.
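A minimal sketch of the underlying idea, assuming NumPy: best-matching-unit search and weight updates use only the observed dimensions, and missing entries are then predicted from the winning node. This illustrates the mechanism rather than the paper's exact Missing-Values-SOM formulation; a full SOM would also update the BMU's neighbourhood, omitted here for brevity.

```python
# Sketch of a SOM that tolerates missing inputs: distances are computed over
# observed (non-NaN) dimensions only, and missing entries are filled from the
# winning node's weights.
import numpy as np

class MissingValueSOM:
    def __init__(self, grid=(10, 10), dim=8, seed=0):
        rng = np.random.default_rng(seed)
        self.weights = rng.random((grid[0] * grid[1], dim))

    def winner(self, x):
        """Best-matching unit using only the observed (non-NaN) dimensions."""
        mask = ~np.isnan(x)
        diff = self.weights[:, mask] - x[mask]
        return int(np.argmin((diff ** 2).sum(axis=1)))

    def train(self, data, epochs=20, lr=0.1):
        for _ in range(epochs):
            for x in data:
                bmu = self.winner(x)
                mask = ~np.isnan(x)
                # Move only the observed dimensions of the winning node.
                self.weights[bmu, mask] += lr * (x[mask] - self.weights[bmu, mask])

    def predict_missing(self, x):
        """Fill NaN entries (e.g., unavailable sensors or settings) from the BMU."""
        filled = x.copy()
        nan = np.isnan(x)
        filled[nan] = self.weights[self.winner(x), nan]
        return filled
```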
Abstract:
Crowdsourcing has become a popular approach for capitalizing on the potential of large and open crowds of people external to the organization. While crowdsourcing as a phenomenon is studied in a variety of fields, research mostly focuses on isolated aspects and little is known about the integrated design of crowdsourcing efforts. We introduce a socio-technical systems perspective on crowdsourcing, which provides a deeper understanding of the components and relationships in crowdsourcing systems. By considering the function of crowdsourcing systems within their organizational context, we develop a typology of four distinct system archetypes. We analyze the characteristics of each type and derive a number of design requirements for the respective system components. The paper lays a foundation for IS-based crowdsourcing research, channels related academic work, and helps guide the study and design of crowdsourcing information systems.
Abstract:
On August 16, 2012 the SIGIR 2012 Workshop on Open Source Information Retrieval was held as part of the SIGIR 2012 conference in Portland, Oregon, USA. There were two invited talks, one from industry and one from academia. Six full papers and six short papers were presented, as well as demonstrations of four open source tools. Finally, there was a lively discussion on future directions for the open source Information Retrieval community. This contribution discusses the events of the workshop and outlines future directions for the community.
Abstract:
We consider a Cooperative Intrusion Detection System (CIDS), a distributed AIS-based (Artificial Immune System) IDS in which nodes collaborate over a peer-to-peer overlay network. The AIS uses the negative selection algorithm for the selection of detectors (e.g., vectors of features such as CPU utilization, memory usage and network activity). For better detection performance, selection of all possible detectors for a node is desirable, but it may not be feasible due to storage and computational overheads. Limiting the number of detectors, on the other hand, comes with the danger of missing attacks. We present a scheme for the controlled and decentralized division of detector sets in which each IDS is assigned to a region of the feature space. We investigate the trade-off between scalability and robustness of detector sets. We address the problem of self-organization in CIDS so that each node generates a distinct set of detectors to maximize the coverage of the feature space, while pairs of nodes exchange their detector sets to provide a controlled level of redundancy. Our contribution is twofold. First, we use deterministic techniques from combinatorial design theory and graph theory - Symmetric Balanced Incomplete Block Designs, Generalized Quadrangles and Ramanujan Expander Graphs - to decide how many and which detectors are exchanged between which pairs of IDS nodes. Second, we use a classical epidemic model (the SIR model) to show how properties of these deterministic techniques can help us reduce the attack spread rate.
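A brief sketch of the classical SIR reasoning behind the second contribution, assuming NumPy; the parameter values are illustrative, and the point is simply that a lower effective infection rate (as achieved by a better distribution of detector sets across IDS nodes) slows the attack spread.

```python
# Hedged sketch of the classical SIR epidemic model used to reason about
# attack spread across nodes; parameters are illustrative, not from the paper.
import numpy as np

def simulate_sir(n_nodes=1000, beta=0.3, gamma=0.1, i0=1, steps=200, dt=1.0):
    """Discrete-time SIR: dS = -beta*S*I/N, dI = beta*S*I/N - gamma*I, dR = gamma*I."""
    s, i, r = float(n_nodes - i0), float(i0), 0.0
    history = []
    for _ in range(steps):
        new_infections = beta * s * i / n_nodes * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return np.array(history)

# A lower effective beta flattens the infected curve:
baseline = simulate_sir(beta=0.3)
mitigated = simulate_sir(beta=0.15)
```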
Abstract:
This paper describes an innovative platform that facilitates the collection of objective safety data around occurrences at railway level crossings, using data sources including forward-facing video, telemetry from trains, and geo-referenced asset and survey data. The platform is being developed with support from the Australian rail industry and the Cooperative Research Centre for Rail Innovation. The paper provides a description of the underlying accident causation model, the development methodology and refinement process, as well as a description of the data collection platform. It concludes with a brief discussion of the benefits this project is expected to provide to the Australian rail industry.
Abstract:
Nowadays people heavily rely on the Internet for information and knowledge. Wikipedia is an online multilingual encyclopaedia that contains a very large number of detailed articles covering most written languages, and is often considered to be a treasury of human knowledge. It includes extensive hypertext links between documents of the same language for easy navigation. However, pages in different languages are rarely cross-linked except for direct equivalent pages on the same subject. This can pose serious difficulties for users seeking information or knowledge from different lingual sources, or where there is no equivalent page in one language or another. In this thesis, a new information retrieval task, cross-lingual link discovery (CLLD), is proposed to tackle the problem of the lack of cross-lingual anchored links in a knowledge base such as Wikipedia. In contrast to traditional information retrieval tasks, cross-lingual link discovery algorithms actively recommend a set of meaningful anchors in a source document and establish links to documents in an alternative language. In other words, cross-lingual link discovery is a way of automatically finding hypertext links between documents in different languages, which is particularly helpful for knowledge discovery across language domains. This study is specifically focused on Chinese / English link discovery (C/ELD), a special case of the cross-lingual link discovery task that involves natural language processing (NLP), cross-lingual information retrieval (CLIR) and cross-lingual link discovery. To justify the effectiveness of CLLD, a standard evaluation framework is also proposed. The evaluation framework includes topics, document collections, a gold standard dataset, evaluation metrics, and toolkits for run pooling, link assessment and system evaluation. With the evaluation framework, the performance of CLLD approaches and systems can be quantified. This thesis contributes to research on natural language processing and cross-lingual information retrieval in CLLD: 1) a new, simple but effective Chinese segmentation method, n-gram mutual information, is presented for determining the boundaries of Chinese text; 2) a voting mechanism for named entity translation is demonstrated, achieving high precision in English / Chinese machine translation; 3) a link mining approach that mines the existing link structure for anchor probabilities achieves encouraging results in suggesting cross-lingual Chinese / English links in Wikipedia. This approach was examined in the experiments on better, automatic generation of cross-lingual links that were carried out as part of the study. The overall major contribution of this thesis is the provision of a standard evaluation framework for cross-lingual link discovery research, which helps in benchmarking the performance of various CLLD systems and in identifying good CLLD realisation approaches. The evaluation methods and the evaluation framework described in this thesis have been utilised to quantify system performance in the NTCIR-9 Crosslink task, the first information retrieval track of its kind.
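As a rough illustration of the anchor-probability idea behind the link mining approach, the sketch below estimates how often a phrase appears as link anchor text relative to how often it appears at all in an existing wiki dump, and uses that ratio to rank candidate anchors in a new source document; the data structures and threshold are illustrative assumptions, not the thesis's implementation.

```python
# Hedged sketch of anchor-probability mining from an existing link structure.
from collections import Counter

def anchor_probabilities(pages):
    """pages: list of (article_text, anchor_texts) pairs; article_text should
    include the anchor occurrences so the ratio is well defined."""
    as_anchor = Counter()
    for _text, anchors in pages:
        as_anchor.update(anchors)
    as_text = Counter()
    for text, _anchors in pages:
        for phrase in as_anchor:
            if phrase in text:
                as_text[phrase] += text.count(phrase)
    return {p: as_anchor[p] / as_text[p] for p in as_anchor if as_text[p] > 0}

def recommend_anchors(document_text, probabilities, threshold=0.5):
    """Suggest phrases in a source document worth cross-linking, ranked by probability."""
    candidates = [(p, prob) for p, prob in probabilities.items()
                  if prob >= threshold and p in document_text]
    return sorted(candidates, key=lambda x: x[1], reverse=True)
```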
Abstract:
Purpose – The aim of the paper is to describe and explain, using a combination of interviews and content analysis, the social and environmental reporting practices of a major garment export organisation within a developing country. Design/methodology/approach – Senior executives from a major organisation in Bangladesh are interviewed to determine the pressures being exerted on them in terms of their social and environmental performance. The perceptions of pressures are then used to explain – via content analysis – changing social and environmental disclosure practices. Findings – The results show that particular stakeholder groups have, since the early 1990s, placed pressure on the Bangladeshi clothing industry in terms of its social performance. This pressure, which is also directly related to the expectations of the global community, in turn drives the industry's social policies and related disclosure practices. Research limitations/implications – The findings show that, within the context of a developing country, unless we consider the managers' perceptions of the social and environmental expectations being imposed upon them by powerful stakeholder groups, we will be unable to understand organisational disclosure practices. Originality/value – This paper is the first known paper to interview managers from a large organisation in a developing country about changing stakeholder expectations and then link these changing expectations to annual report disclosures across an extended period of analysis.
Abstract:
This paper investigates the social and environmental disclosure practices of two large multinational companies, specifically Nike and Hennes & Mauritz. Utilising a joint consideration of legitimacy theory and media agenda setting theory, we investigate the linkage between negative media attention and positive corporate social and environmental disclosures. Our results generally support the view that, for those industry-related social and environmental issues attracting the greatest amount of negative media attention, these corporations react by providing positive social and environmental disclosures. The results were particularly significant in relation to labour practices in developing countries – the issue attracting the greatest amount of negative media attention for the companies in question.
Abstract:
Information privacy is a crucial aspect of eHealth, and appropriate privacy management measures are therefore essential for its success. However, traditional measures for privacy preservation, such as rigid access controls (i.e., preventive measures), are not well suited to eHealth because of the specialised and information-intensive nature of healthcare itself, and the nature of the information involved. Healthcare professionals (HCPs) require easy, unrestricted access to as much information as possible in order to make well-informed decisions. At the other end of the scale, however, consumers (i.e., patients) demand control over their health information and raise privacy concerns arising from internal activities (i.e., information use by HCPs). A proper balance of these competing concerns is vital for the implementation of successful eHealth systems. Towards reaching this balance, we propose an information accountability framework (IAF) for eHealth systems.
Abstract:
A global, online quantitative study among 300 consumers of digital technology products found that the most reliable information sources were friends, family or word of mouth (WOM) from someone they knew, followed by expert product reviews and product reviews written by other consumers. The most unreliable information sources were advertising or infomercials, automated recommendations based on purchasing patterns, and retailers. While only a very small number of consumers evaluated products online, rating products and taking part in online discussions were more frequent activities. The most popular social media websites for reviews were Facebook, Twitter, Amazon and eBay, indicating the importance of WOM in social networks and in online media spaces that feature product reviews, as WOM is the most persuasive piece of information in both online and offline social networks. These results suggest that ‘social customers’ must be considered an integral part of a marketing strategy.