3 results for Content processing

in CORA - Cork Open Research Archive - University College Cork - Ireland


Relevance:

30.00%

Publisher:

Abstract:

A substantial amount of information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing production of data, however, pushes data analytic platforms to their limits. This thesis proposes techniques for more efficient textual big data analysis suitable for the Hadoop analytic platform. The research explores the direct processing of compressed textual data. The focus is on developing novel compression methods with a number of desirable properties to support text-based big data analysis in distributed environments. The novel contributions of this work include the following. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC distinguishes between informational and functional content and compresses only the informational content. The compressed data is thus transparent to existing software libraries, which often rely on functional content to work. Secondly, a context-free bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. This uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two-layer compression architecture is used, in which each compression layer is appropriate for the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and also make the use of the compressed data transparent to developers. The compression schemes have been evaluated for a number of standard MapReduce analysis tasks using a collection of real-world datasets. In comparison with existing solutions, they have shown substantial improvement in performance and significant reduction in system resource requirements.
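
The idea of compressing only informational content while leaving functional content intact can be illustrated with a toy sketch. This is not the thesis's CaPC implementation: the choice of "words of three or more letters" as informational content, the `\x01` marker (assumed never to occur in the input), and the function names `build_codebook`, `compress` and `decompress` are all hypothetical and chosen only for illustration.

```python
import re
import string
from collections import Counter

MARKER = "\x01"  # assumption: this control character never appears in the input
CODE_ALPHABET = string.ascii_letters + string.digits  # 62 one-character code symbols
WORD = re.compile(r"[A-Za-z]{3,}")  # toy definition of "informational" content


def build_codebook(text):
    """Map the most frequent words to two-character codes (marker + symbol)."""
    counts = Counter(WORD.findall(text))
    frequent = [word for word, _ in counts.most_common(len(CODE_ALPHABET))]
    return {word: MARKER + CODE_ALPHABET[i] for i, word in enumerate(frequent)}


def compress(text, codebook):
    """Replace words found in the codebook; delimiters and newlines pass through."""
    return WORD.sub(lambda m: codebook.get(m.group(0), m.group(0)), text)


def decompress(data, codebook):
    """Invert compress() by expanding every marker + symbol pair."""
    reverse = {code: word for word, code in codebook.items()}
    pattern = re.compile(re.escape(MARKER) + ".")
    return pattern.sub(lambda m: reverse[m.group(0)], data)


if __name__ == "__main__":
    records = "alice,error,disk full\nbob,error,disk full\n"
    book = build_codebook(records)
    packed = compress(records, book)
    # Commas and newlines are untouched, so record and field splitting still works.
    print([line.split(",") for line in packed.splitlines()])
    print(decompress(packed, book) == records)
```

Because the code symbols are drawn only from letters and digits, a compressed word can never introduce a comma or newline, which is what keeps line- and field-based tools usable on the compressed text in this sketch.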

Relevance:

30.00%

Publisher:

Abstract:

The objectives of this thesis were to (i) study the effect of increasing protein concentration in milk protein concentrate (MPC) powders on surface composition and sorption properties; (ii) examine the effect of increasing protein content on the rehydration properties of MPC; (iii) study the physicochemical properties of spray-dried emulsion-containing powders having different water and oil contents; (iv) analyse the effect of protein type on water sorption and diffusivity properties in a protein/lactose dispersion; and (v) characterise lactose crystallisation and emulsion stability of model infant formula containing intact or hydrolysed whey proteins. Surface composition of MPC powders (protein contents 35 to 86 g / 100 g) indicated that fat and protein were preferentially located on the surface of powders. Low protein powder (35 g / 100 g) exhibited lactose crystallisation, whereas powders with higher protein contents did not, due to their high protein:lactose ratio. Insolubility was evident in high protein MPCs and was primarily related to insolubility of the casein fraction. High temperature (50 °C) was required for dissolution of high protein MPCs (protein content > 60 g / 100 g). The effect of different oil types and spray-drying outlet temperature on the physicochemical properties of the resultant fat-filled powders was investigated. Increasing outlet temperature reduced water content, water activity and tapped bulk density, irrespective of oil type, and increased both solvent-extractable free fat for all oil types and the onset temperatures of glass transition (Tg) and crystallisation (Tcr). Powder dispersions of protein/lactose (0.21:1), containing either intact or hydrolysed whey protein (12 % degree of hydrolysis; DH), were spray-dried at pilot scale. Moisture sorption analysis at 25 °C showed that dispersions containing intact whey protein exhibited lactose crystallisation at a lower relative humidity (RH). Dispersions containing hydrolysed whey protein had significantly higher (P < 0.05) water diffusivity. Finally, a spray-dried model infant formula was produced containing hydrolysed or intact whey as the protein with sunflower oil as the fat source. Reconstituted, hydrolysed formula had a significantly (P < 0.05) higher fat globule size and lower emulsion stability than intact formula. Lactose crystallisation in powders occurred at higher RH for hydrolysed formula. In conclusion, this research has shown the effect of altering the protein type, protein composition, and oil type on the surface composition and physical properties of different dairy powders, and how these variations greatly affect their rehydration characteristics and storage stability.

Relevance:

30.00%

Publisher:

Abstract:

Honey is rich in sugars, dominated by fructose and glucose, which makes it prone to crystallization during storage. Because of its composition, honey has a very low anhydrous glass transition temperature, which makes it difficult to dry on its own, so a drying aid or filler is needed. Maltodextrin is a common drying aid used in the drying of sugar-rich foods. The present study examines the production of honey powder by vacuum drying and the impact of the drying process and formulation on the stability of the powder. To achieve these objectives, a series of experiments was carried out: investigating the properties of maltodextrin DE 10, and studying the effects of drying temperature, total solids concentration, DE value, maltodextrin concentration and anti-caking agent on honey powder processing and stability. Maltodextrin provides a more stable glass than lower molecular weight sugars. Dynamic Dew Point Isotherm (DDI) data could be used to determine the amorphous content of a system. The area under the first derivative of the DDI curve is equal to the amount of water needed by the amorphous material to crystallize. The drying temperature affected the amorphous content of vacuum-dried honey powder: higher temperatures seemed to result in honey powder with a larger amorphous component. The ratio of maltodextrin affected the stability of honey powder more significantly than total solids concentration, DE value or drying temperature. The critical water activity of honey powder was lower than the water activity at the equilibrium water content corresponding to the BET monolayer water content. Addition of an anti-caking agent increased the stability and flowability of honey powder. Addition of calcium stearate could inhibit collapse of the honey powder during storage.