2 results for Unstructured
in CORA - Cork Open Research Archive - University College Cork - Ireland
Abstract:
A substantial amount of information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing data production, however, pushes data analytic platforms to their limits. This thesis proposes techniques for more efficient textual big data analysis suitable for the Hadoop analytic platform. This research explores the direct processing of compressed textual data. The focus is on developing novel compression methods with a number of desirable properties to support text-based big data analysis in distributed environments. The novel contributions of this work include the following. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC makes a distinction between informational and functional content, and only the informational content is compressed. Thus, the compressed data is made transparent to existing software libraries, which often rely on functional content to work. Secondly, a context-free bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. This uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two-layer compression architecture is used, in which each compression layer is appropriate for the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and also make the use of the compressed data transparent to developers. The compression schemes have been evaluated for a number of standard MapReduce analysis tasks using a collection of real-world datasets. In comparison with existing solutions, they have shown substantial improvement in performance and significant reduction in system resource requirements.
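The idea of compressing only informational content while leaving functional content intact can be illustrated with a small sketch. This is not the CaPC scheme from the thesis, whose details the abstract does not give; it is a toy dictionary substitution under stated assumptions: "informational" content is taken to be runs of three or more letters, "functional" content is everything else (commas, newlines, digits, spaces), and the input is assumed never to contain the NUL character used as a code marker. The names `build_codebook`, `compress` and `decompress` are hypothetical.

```python
import re
import string
from collections import Counter

# Assumption: runs of 3+ letters are "informational"; all other characters
# (field delimiters, record separators, digits) are "functional" and kept as-is.
WORD = re.compile(r"[A-Za-z]{3,}")
MARKER = "\x00"                      # assumed never to occur in the input
CODE_CHARS = string.ascii_letters    # codes never contain ',' or '\n'
CODE = re.compile(r"\x00([A-Za-z])")


def build_codebook(text):
    """Map the most frequent words to two-character codes (marker + letter)."""
    counts = Counter(WORD.findall(text))
    frequent = [word for word, _ in counts.most_common(len(CODE_CHARS))]
    return {word: MARKER + CODE_CHARS[i] for i, word in enumerate(frequent)}


def compress(text, codebook):
    """Replace coded words; delimiters and unknown words pass through unchanged."""
    if MARKER in text:
        raise ValueError("input must not contain the marker character")
    return WORD.sub(lambda m: codebook.get(m.group(0), m.group(0)), text)


def decompress(packed, codebook):
    """Invert compress() using the same codebook."""
    reverse = {code: word for word, code in codebook.items()}
    return CODE.sub(lambda m: reverse[m.group(0)], packed)


if __name__ == "__main__":
    log = "error,disk,full\nerror,network,timeout\nwarning,disk,slow\n"
    book = build_codebook(log)
    packed = compress(log, book)
    # Record and field boundaries survive, so ordinary parsing still works
    # directly on the compressed text.
    for line in packed.split("\n"):
        if line:
            print(len(line.split(",")), "fields")
    print(decompress(packed, book) == log)
```

Because the codes contain neither commas nor newlines, code that splits records on `"\n"` and fields on `","` sees the same structure in the compressed text as in the original, which is the kind of transparency to existing libraries that the abstract describes.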
Abstract:
This research aims to explore the challenges nurses face when caring for stroke patients on a general medical/surgical ward in the acute care setting, and to identify how nurses resolve or process these challenges. Healthcare environments continue to face the pressures of constraints such as reduced staffing levels, budgets and resources, and less time, which influence care provision. Patient safety is central to care provision, and nurses face the challenge of delivering best-quality care while working within constraints. The incidence of stroke is increasing worldwide, and internationally stroke units are the recognised minimum standard of care. In Ireland, with few designated stroke units in operation, many stroke patients are cared for in the acute general care setting. A classic grounded theory methodology was utilised for this study. Data was collected and analysed simultaneously through coding, constant comparison, theoretical sampling and memoing. Individual unstructured interviews with thirty-two nurses were carried out. Twenty hours of non-participant observation in the acute general care setting were undertaken. The main concern that emerged was working within constraints. This concern is processed by nurses through resigning, which consists of three phases: idealistic striving, resourcing and care accommodation. Through the process of resigning, nurses engage in an energy maintenance process enabling them to continue working within constraints. The generation of the theory of resigning explains how nurses resolve or process working within constraints. This theory adds to the body of knowledge on stroke care provision. This theory has the potential to enhance nursing care, minimise burnout and make better use of resources while advocating for best care of stroke patients.