3 results for data analysis: algorithms and implementation

in CORA - Cork Open Research Archive - University College Cork - Ireland


Relevance:

100.00% 100.00%

Publisher:

Abstract:

A substantial amount of information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. Ever-increasing data production, however, pushes data analytics platforms to their limits. This thesis proposes techniques for more efficient analysis of textual big data on the Hadoop analytics platform. The research explores the direct processing of compressed textual data, focusing on novel compression methods with properties that support text-based big data analysis in distributed environments. The novel contributions of this work are as follows. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC distinguishes between informational and functional content and compresses only the informational content. The compressed data therefore remains transparent to existing software libraries, which often rely on functional content to work. Secondly, a context-free, bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. It uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two-layer compression architecture is used, in which each compression layer is appropriate for the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytics platforms and computational frameworks, and that make the use of the compressed data transparent to developers. The compression schemes have been evaluated on a number of standard MapReduce analysis tasks using a collection of real-world datasets. In comparison with existing solutions, they show substantial improvement in performance and a significant reduction in system resource requirements.
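
For readers unfamiliar with the Huffman algorithm on which the second contribution builds, the following is a minimal Python sketch of classic Huffman coding. It is not the thesis's Approximated Huffman Compression and does not include its hybrid search structure; the function names are illustrative, and bit strings are kept as Python `str` values purely for readability.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each symbol in `text` to a Huffman bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees prepends one bit to every code inside them.
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text, codes):
    """Concatenate the code of every symbol in `text`."""
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    """Decode a bit string; works because Huffman codes are prefix-free."""
    inverse = {c: s for s, c in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)
```

Calling `decode(encode(text, codes), codes)` with `codes = huffman_codes(text)` returns the original text. Because frequent symbols receive shorter codes, the encoded form of repetitive text is usually much shorter than a fixed-width encoding.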

Relevance:

100.00% 100.00%

Publisher:

Abstract:

In this paper, an embedded capacitance material (ECM) is fabricated between the power and ground layers of wireless sensor nodes, forming an integrated capacitance that replaces the large number of decoupling capacitors on the board. The ECM, whose dielectric constant is 16, has the same 3 cm × 3 cm area as the wireless sensor node and a thickness of only 14 μm. Although the capacitance of a single ECM layer is only around 8 nF, there are two reasons why the ECM layers can still replace the high-frequency decoupling capacitors (100 nF in our case) on the board. First, the parasitic inductance of the ECM layer is much lower than that of the surface-mount capacitors, so a smaller ECM capacitance can achieve the same resonant frequency as the surface-mount decoupling capacitors. Simulation and measurement agree well with this assumption. Second, more than one ECM layer is used in the design so that several ECM capacitance layers are connected in parallel, giving a larger total capacitance and a smaller parasitic inductance. The ECM is characterized with an LCR meter. To evaluate the behavior of the ECM layer, time-domain and frequency-domain measurements are performed on the power-bus decoupling of the wireless sensor nodes. Comparisons with measurements of the bare PCB and of the decoupling-capacitor solution are provided to show the improvement offered by the ECM layer. The measurements show that the ECM layer not only saves the board space taken by surface-mount decoupling capacitors but also provides better power-bus decoupling for the nodes.
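
The quoted capacitance can be sanity-checked with the ideal parallel-plate formula C = ε0·εr·A/d. The Python sketch below uses the dimensions stated in the abstract and assumes the ECM covers the full 3 cm × 3 cm area with no fringing, which is a simplification. The series-resonance helper uses the standard f = 1/(2π√(LC)); any inductance value passed to it is a placeholder, since the abstract gives none. Function names are illustrative.

```python
import math

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m


def plate_capacitance(rel_permittivity, area_m2, thickness_m):
    """Ideal parallel-plate capacitance in farads (fringing ignored)."""
    return EPSILON_0 * rel_permittivity * area_m2 / thickness_m


def parallel_capacitance(single_layer_f, n_layers):
    """Total capacitance of n identical layers connected in parallel."""
    return single_layer_f * n_layers


def resonant_frequency(inductance_h, capacitance_f):
    """Series self-resonant frequency in hertz: 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


# Values from the abstract: relative permittivity 16, 3 cm x 3 cm, 14 um thick.
single_layer = plate_capacitance(16, 0.03 * 0.03, 14e-6)
two_layers = parallel_capacitance(single_layer, 2)

print(f"single ECM layer: {single_layer * 1e9:.1f} nF")
print(f"two layers in parallel: {two_layers * 1e9:.1f} nF")
```

The ideal estimate comes out at roughly 9 nF per layer, the same order as the "around 8 nF" reported. The real planes need not cover the whole board, so a somewhat lower measured value is plausible.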

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Quantitative analysis of penetrative deformation in sedimentary rocks of fold and thrust belts has largely been carried out using clast-based strain analysis techniques. These methods analyse the geometric deviations from an original state that populations of clasts, or strain markers, have undergone. Characterising these geometric changes, or strain, in the early stages of rock deformation is not entirely straightforward. This is partly due to the paucity of information on the original state of the strain markers, but also to uncertainty about the relative rheological properties of the strain markers and their matrix during deformation, and to the interaction of two competing fabrics, such as bedding and cleavage. Furthermore, one of the largest setbacks for accurate strain analysis lies in the methods themselves: they are traditionally time-consuming and labour-intensive, and results can vary between users. A suite of semi-automated techniques has been tested and found to work very well, but in low-strain environments the problems discussed above persist. Additionally, these techniques have been compared with Anisotropy of Magnetic Susceptibility (AMS) analysis, which is a particularly sensitive tool for characterising low strain in sedimentary lithologies.
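
As background on how AMS results are commonly summarised, the Python sketch below computes the usual scalar anisotropy parameters from the three principal susceptibilities k1 ≥ k2 ≥ k3: the degree of anisotropy P = k1/k3, the lineation L = k1/k2, the foliation F = k2/k3, and Jelinek's shape parameter T. The abstract does not say which parameters the study used, so this is an assumption. The function name and the sample values are hypothetical.

```python
import math


def ams_parameters(k1, k2, k3):
    """Return (P, L, F, T) for principal susceptibilities k1 >= k2 >= k3 > 0."""
    if not (k1 >= k2 >= k3 > 0):
        raise ValueError("expected k1 >= k2 >= k3 > 0")
    degree = k1 / k3      # P: overall degree of anisotropy
    lineation = k1 / k2   # L: magnetic lineation
    foliation = k2 / k3   # F: magnetic foliation
    n1, n2, n3 = math.log(k1), math.log(k2), math.log(k3)
    if n1 == n3:
        shape = 0.0       # isotropic: shape is undefined, report 0 by convention
    else:
        # Jelinek shape parameter: +1 is oblate, -1 is prolate.
        shape = (2 * n2 - n1 - n3) / (n1 - n3)
    return degree, lineation, foliation, shape


# Hypothetical normalised susceptibilities for a weakly deformed sample.
P, L, F, T = ams_parameters(1.02, 1.00, 0.98)
print(f"P={P:.4f} L={L:.4f} F={F:.4f} T={T:.3f}")
```

Values of P only slightly above 1, as in this made-up sample, are the kind of weak fabric that clast-based methods struggle to resolve.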