3 results for Extensible Pluggable Architecture Hydra Data
in CORA - Cork Open Research Archive - University College Cork - Ireland
Abstract:
A substantial amount of information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing data production, however, pushes data analytic platforms to their limit. This thesis proposes techniques for more efficient textual big data analysis suitable for the Hadoop analytic platform. This research explores the direct processing of compressed textual data. The focus is on developing novel compression methods with a number of desirable properties to support text-based big data analysis in distributed environments. The novel contributions of this work include the following. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC distinguishes between informational and functional content and compresses only the informational content. Thus, the compressed data is made transparent to existing software libraries, which often rely on functional content to work. Secondly, a context-free bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. This uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two-layer compression architecture is used, in which each compression layer is appropriate for the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and also make the use of the compressed data transparent to developers. The compression schemes have been evaluated for a number of standard MapReduce analysis tasks using a collection of real-world datasets. In comparison with existing solutions, they have shown substantial improvement in performance and significant reduction in system resource requirements.
Abstract:
The use of Cyber Physical Systems (CPS) to optimise industrial energy systems is an approach which has the potential to improve energy efficiency in the manufacturing sector. Obtaining the data needed to implement a CPS in an industrial energy system is, however, a complex task which is often carried out in a non-standardised way. The use of the 5C CPS architecture has the potential to standardise this approach. This paper describes a case study where data from a Combined Heat and Power (CHP) system located in a large manufacturing company was fused with grid electricity and gas models as well as a maintenance cost model using the 5C architecture, with a view to making effective decisions on its cost-efficient operation. A control change implemented based on the cognitive analysis enabled via the 5C architecture implementation has resulted in energy cost savings of over €7400 over a four-month period, with energy cost savings of over €150,000 projected once the 5C architecture is extended into the production environment.
Abstract:
We measure quality of service (QoS) in a wireless network architecture of transoceanic aircraft. A distinguishing characteristic of the network scheme we analyze is that it combines the concept of Delay Tolerant Networking (DTN), through the exploitation of opportunistic contacts, with direct satellite access in a limited number of the nodes. We provide a graph sparsification technique for deriving a network model that satisfies the key properties of a real aeronautical opportunistic network while enabling scalable simulation. This reduced model allows us to analyze the impact on QoS of introducing Internet-like traffic in the form of outgoing data from passengers. Promoting QoS in DTNs is usually challenging due to their long delays and scarce resources. The availability of satellite communication links offers a chance to provide an improved degree of service compared with a purely opportunistic approach, and therefore it needs to be properly measured and quantified. Our analysis focuses on several QoS indicators such as delivery time, delivery ratio, and bandwidth allocation fairness. The results show significant improvements across all QoS indicators, which are not usually achievable in the field of DTNs.