974 results for Content processing


Relevance:

30.00% 30.00%

Publisher:

Abstract:

After many years of scholarly study, manuscript collections continue to be an important source of novel information for scholars, concerning both the history of earlier times and the development of cultural documentation over the centuries. The D-SCRIBE project aims to support and facilitate current and future efforts in manuscript digitization and processing. It strives toward the creation of a comprehensive software product that can assist content holders in turning an archive of manuscripts into a digital collection using automated methods. In this paper, we focus on the problem of recognizing early Christian Greek manuscripts. We propose a novel digital image binarization scheme for low-quality historical documents that allows efficient further exploitation of their content. Based on the existence of closed cavity regions in the majority of characters and character ligatures in these scripts, we propose a novel, segmentation-free, fast and efficient technique that assists the recognition procedure by tracing and recognizing the most frequently appearing characters or character ligatures.
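The two ideas in this abstract, binarization followed by a search for closed cavities, can be illustrated with a minimal sketch. The abstract does not describe the authors' binarization scheme, so the sketch substitutes a plain global Otsu threshold, and it counts cavities as background regions that cannot be reached from the image border. The function names and the tiny test image are hypothetical.

```python
from collections import deque


def otsu_threshold(pixels):
    """Return the grey level (0-255) that maximises between-class variance.

    Stand-in for the paper's binarization scheme, which is not specified.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of grey levels of those pixels
    for t in range(256):
        w0 += hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        sum0 += t * hist[t]
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(image):
    """Map a grey-level image (list of rows) to 1 = ink, 0 = background."""
    t = otsu_threshold([p for row in image for p in row])
    return [[1 if p <= t else 0 for p in row] for row in image]


def count_closed_cavities(ink):
    """Count 4-connected background regions that never touch the border."""
    h, w = len(ink), len(ink[0])
    seen = [[False] * w for _ in range(h)]

    def region_touches_border(sr, sc):
        touches = False
        queue = deque([(sr, sc)])
        seen[sr][sc] = True
        while queue:
            r, c = queue.popleft()
            if r in (0, h - 1) or c in (0, w - 1):
                touches = True
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if (0 <= nr < h and 0 <= nc < w
                        and not seen[nr][nc] and ink[nr][nc] == 0):
                    seen[nr][nc] = True
                    queue.append((nr, nc))
        return touches

    cavities = 0
    for r in range(h):
        for c in range(w):
            if ink[r][c] == 0 and not seen[r][c]:
                if not region_touches_border(r, c):
                    cavities += 1
    return cavities


# Hypothetical 7x7 image: a dark square outline (like the loop of an "o")
# on a light page.
page = [[200] * 7 for _ in range(7)]
for r in range(1, 6):
    for c in range(1, 6):
        if r in (1, 5) or c in (1, 5):
            page[r][c] = 30
print(count_closed_cavities(binarize(page)))  # 1
```

A real pipeline for degraded manuscripts would likely need a locally adaptive threshold rather than a single global one. The sketch only shows why closed cavities can be found without first segmenting characters.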

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Mixed-content miscellanies (very frequent in the Byzantine and mediaeval Slavic written heritage) are usually defined as collections of works with non-occupational, non-liturgical application, in which texts are selected and arranged according to no identifiable principle. They are a “readable” type of miscellany, compiled mainly on the basis of the cognitive interests of compilers and readers. Just like the occupational ones, they appeared in order to satisfy public needs but were intended for individual use. My textological comparison showed that mixed-content miscellanies often give evidence of a stable content: some of them include the same constituent works in the same order, even though the manuscripts have no obvious genetic relationship. These correspondences were sufficiently numerous and distinctive that they could not be merely fortuitous, and the only sensible interpretation was that even when the operative organizational principle was not based on independently identifiable criteria, such as the church calendar, liturgical function, or thematic considerations, mixed-content miscellanies (or, at least, portions of their contents) nonetheless fell into types. In this respect, the apparent free selection and arrangement of texts in mixed-content miscellanies turns out to be illusory. The problem was that, as the corpus of manuscripts that I and my colleagues needed to examine grew, our ability to keep track of the structure of each one, and to identify structural correspondences among manuscripts within the corpus, diminished. So, at the end of 1993, I wrote to Prof. David Birnbaum (University of Pittsburgh, PA) asking for help in solving the problem. He and my colleague Andrey Boyadzhiev (Sofia University) pointed out to me that computers are well suited to recording, processing, and analyzing large amounts of data, and to identifying patterns within the data, and they proposed that we try to develop a computer system for describing manuscripts, analyzing them and, of course, searching the data. Our collaboration in this project is now ten years old, and our talk today presents an overview of that collaboration.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Solar power satellites are attracting attention as a clean, inexhaustible, large-scale base-load power supply. The following beam-control technology is used: a pilot signal is sent from the power-receiving site and, after direction-of-arrival estimation, the beam is directed back to the earth along the same direction. A novel direction-finding algorithm based on a linear prediction technique for exploiting cyclostationary statistical information (spatial and temporal) is explored. Many modulated communication signals exhibit a cyclostationarity (or periodic correlation) property, corresponding to the underlying periodicity arising from carrier frequencies or baud rates. The problem is solved by using both cyclic second-order statistics and cyclic higher-order statistics. By evaluating the corresponding cyclic statistics of the received data at certain cycle frequencies, we can extract the cyclic correlations of only those signals with the same cycle frequency and null out the cyclic correlations of stationary additive noise and of all other co-channel interferences with different cycle frequencies. Thus, the signal detection capability can be significantly improved. The proposed algorithms employ cyclic higher-order statistics of the array output and suppress additive Gaussian noise of unknown spectral content, even when the noise shares common cycle frequencies with the non-Gaussian signals of interest. The proposed method fully exploits temporal information (multiple lags) and can correctly estimate the direction of arrival of desired signals by suppressing undesired signals. Our approach is generalized to direction-of-arrival estimation of coherent cyclostationary signals. In this paper, we propose a new approach for exploiting cyclostationarity that seems to be more advanced than the other existing direction-finding algorithms.
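The claim that cyclic statistics keep signals at a chosen cycle frequency and suppress stationary noise can be checked with a small single-sensor sketch. It estimates the cyclic autocorrelation R(α, τ) = ⟨x[n+τ] · x*[n] · exp(−j2παn)⟩ of a sampled carrier buried in white Gaussian noise. It is not the paper's array algorithm: the linear-prediction step, the higher-order statistics and the direction-of-arrival estimation are all omitted. The carrier frequency, noise level and sample count are assumptions chosen for illustration.

```python
import cmath
import math
import random


def cyclic_autocorrelation(x, alpha, lag):
    """Estimate R(alpha, lag) = mean of x[n+lag] * conj(x[n]) * exp(-j*2*pi*alpha*n).

    `alpha` is the cycle frequency in cycles per sample.
    """
    n_terms = len(x) - lag
    acc = 0j
    for n in range(n_terms):
        acc += x[n + lag] * x[n].conjugate() * cmath.exp(-2j * math.pi * alpha * n)
    return acc / n_terms


# Assumed test signal: a real carrier at f0 = 0.1 cycles/sample plus
# stationary white Gaussian noise. A real carrier has a cycle frequency
# at 2*f0; stationary noise has none other than alpha = 0.
rng = random.Random(0)
f0 = 0.1
samples = 4000
signal = [math.cos(2 * math.pi * f0 * n) + rng.gauss(0.0, 0.5)
          for n in range(samples)]

at_cycle_freq = abs(cyclic_autocorrelation(signal, 2 * f0, 0))
off_cycle_freq = abs(cyclic_autocorrelation(signal, 0.13, 0))
print(at_cycle_freq, off_cycle_freq)
```

For the noise-free carrier the estimate at α = 2·f0 has magnitude 1/4, while at an unrelated cycle frequency it averages toward zero. Adding stationary noise perturbs both values only by a term that shrinks as the number of samples grows.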

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Lutein and zeaxanthin are carotenoids that are selectively taken up into the macula of the eye, where they are thought to protect against the development of age-related macular degeneration. They are obtained from dietary sources, with the highest concentrations found in dark green leafy vegetables, such as kale and spinach. In this Review, compositional variations due to variety/cultivar, stage of maturity, climate or season, farming practice, storage, and processing effects are highlighted. Only data from studies that report lutein and zeaxanthin content in foods are included. The main focus is kale; however, other predominantly xanthophyll-containing vegetables such as spinach and broccoli are included. A small amount of data on exotic fruits is also referenced for comparison. The qualitative and quantitative composition of carotenoids in fruits and vegetables is known to vary with multiple factors. In kale, lutein and zeaxanthin levels are affected by pre-harvest effects such as maturity, climate, and farming practice. Further research is needed to determine the effects of post-harvest processing and storage on lutein and zeaxanthin in kale; this will enable precise suggestions for increasing retinal levels of these nutrients.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This dissertation develops a new mathematical approach that overcomes the effect of a data processing phenomenon known as “histogram binning” inherent to flow cytometry data. A real-time procedure is introduced to prove the effectiveness and fast implementation of such an approach on real-world data. The histogram binning effect is a dilemma posed by two seemingly antagonistic developments: (1) flow cytometry data in its histogram form is extended in its dynamic range to improve its analysis and interpretation, and (2) the inevitable dynamic range extension introduces an unwelcome side effect, the binning effect, which skews the statistics of the data, undermining as a consequence the accuracy of the analysis and the eventual interpretation of the data. Researchers in the field have contended with this dilemma for many years, either resorting to hardware approaches that are rather costly and have inherent calibration and noise effects, or developing software techniques that filter the binning effect but do not successfully preserve the statistical content of the original data. The mathematical approach introduced in this dissertation is so appealing that a patent application has been filed. The contribution of this dissertation is an incremental scientific innovation based on a mathematical framework that will allow researchers in the field of flow cytometry to improve the interpretation of data knowing that its statistical meaning has been faithfully preserved for its optimized analysis. Furthermore, with the same mathematical foundation, proof of the origin of such an inherent artifact is provided. These results are unique in that new mathematical derivations are established to define and solve the critical problem of the binning effect faced at the experimental assessment level, providing a data platform that preserves its statistical content. In addition, a novel method for accumulating the log-transformed data was developed. This new method uses the properties of the transformation of statistical distributions to accumulate the output histogram in a non-integer and multi-channel fashion. Although the mathematics of this new mapping technique seem intricate, the concise nature of the derivations allows for an implementation procedure that lends itself to a real-time implementation using lookup tables, a task that is also introduced in this dissertation.
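One way to picture the binning effect and a "non-integer, multi-channel" accumulation is the sketch below. It is an interpretation for illustration only, not the dissertation's patented method. Each linear input channel k is treated as covering the interval [k, k+1), that interval is mapped through a logarithm onto the output axis, and the channel's count is split across every output bin it overlaps, in proportion to the overlap. The weights are precomputed once as a lookup table. The channel counts (1024 in, 256 out) and all function names are assumptions.

```python
import math


def build_lookup(n_in, n_out):
    """For each linear channel k = 1..n_in-1, list (output_bin, weight) pairs.

    Channel 0 is skipped because log(0) is undefined. Weights for one
    channel sum to 1, so total counts are preserved.
    """
    scale = n_out / math.log(n_in)      # maps log(1)..log(n_in) onto 0..n_out
    table = []
    for k in range(1, n_in):
        lo = math.log(k) * scale
        hi = math.log(k + 1) * scale
        entries = []
        b = int(lo)
        while b < hi and b < n_out:
            overlap = min(hi, b + 1) - max(lo, b)
            if overlap > 0:
                entries.append((b, overlap / (hi - lo)))
            b += 1
        table.append(entries)
    return table


def accumulate(counts, table, n_out):
    """Fractional accumulation: spread each channel's count over its bins."""
    out = [0.0] * n_out
    for k, entries in enumerate(table, start=1):
        for b, weight in entries:
            out[b] += counts[k] * weight
    return out


def accumulate_naive(counts, n_in, n_out):
    """Integer mapping: every count of channel k lands in a single bin."""
    scale = n_out / math.log(n_in)
    out = [0] * n_out
    for k in range(1, n_in):
        out[min(int(math.log(k) * scale), n_out - 1)] += counts[k]
    return out


n_in, n_out = 1024, 256
flat = [100] * n_in                      # hypothetical flat input histogram
table = build_lookup(n_in, n_out)
naive = accumulate_naive(flat, n_in, n_out)
smooth = accumulate(flat, table, n_out)
print(naive.count(0), "empty bins with integer mapping")
print(sum(1 for v in smooth if v == 0), "empty bins with fractional mapping")
```

With integer mapping the low channels are spread far apart on the log axis and leave empty output bins between them, which is the comb-like artifact the abstract calls the binning effect. The fractional version fills those gaps while keeping the total count unchanged, and at run time it needs only table lookups and additions.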

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Elemental and isotopic composition of leaves of the seagrass Thalassia testudinum was highly variable across the 10,000 km² and 8 years of this study. The data reported herein expand the range of carbon:nitrogen (C:N) and carbon:phosphorus (C:P) ratios and of δ13C and δ15N values reported for this species worldwide: 13.2–38.6 for C:N and 411–2,041 for C:P. The 981 determinations in this study generated a range of −13.5‰ to −5.2‰ for δ13C and −4.3‰ to 9.4‰ for δ15N. The elemental and isotope ratios displayed marked seasonality, and the seasonal patterns could be described with a simple sine wave model. C:N, C:P, δ13C, and δ15N values all had maxima in the summer and minima in the winter. Spatial patterns in the summer maxima of these quantities suggest there are large differences in the relative availability of N and P across the study area and that there are differences in the processing and the isotopic composition of C and N. This work calls into question the interpretation of studies about nutrient cycling and food webs in estuaries based on few samples collected at one time, since we document natural variability greater than the signal often used to imply changes in the structure or function of ecosystems. The data and patterns presented in this paper make it clear that there is no threshold δ15N value for marine plants that can be used as an unambiguous indicator of human sewage pollution without a thorough understanding of local temporal and spatial variability.
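The "simple sine wave model" mentioned in this abstract can be sketched as y(t) = mean + A·sin(2πt/T + φ). Rewriting it as mean + a·sin(ωt) + b·cos(ωt) makes it linear in its three coefficients, so it can be fitted by ordinary least squares. The abstract gives neither the exact model form nor the fitting procedure, so the parameterisation, the fixed one-year period and the synthetic C:N values below are all assumptions.

```python
import math


def fit_seasonal_sine(days, values, period=365.25):
    """Least-squares fit of values ~ mean + A*sin(2*pi*t/period + phase).

    Returns (mean, amplitude, phase). The period is held fixed.
    """
    w = 2 * math.pi / period
    rows = [(1.0, math.sin(w * t), math.cos(w * t)) for t in days]
    # Normal equations (X^T X) beta = X^T y for the 3 coefficients.
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * y for r, y in zip(rows, values))]
         for i in range(3)]
    # Gaussian elimination with partial pivoting on the 3x4 system.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    beta = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(m[i][j] * beta[j] for j in range(i + 1, 3))
        beta[i] = (m[i][3] - tail) / m[i][i]
    mean, a, b = beta
    # a*sin(wt) + b*cos(wt) = A*sin(wt + phase), with a = A*cos(phase)
    # and b = A*sin(phase).
    return mean, math.hypot(a, b), math.atan2(b, a)


# Synthetic, hypothetical C:N ratios sampled every 37 days over 8 years.
days = list(range(0, 8 * 365, 37))
true_mean, true_amp, true_phase = 20.0, 4.0, -1.0
cn = [true_mean + true_amp * math.sin(2 * math.pi * t / 365.25 + true_phase)
      for t in days]
print(fit_seasonal_sine(days, cn))
```

On real measurements the residuals around the fitted curve would give a direct estimate of the natural variability that the abstract warns single-date sampling cannot capture.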

Relevance:

30.00% 30.00%

Publisher:

Abstract:

OBJECTIVES: The aim of this study was to investigate the influence of process parameters during dry coating on particle and dosage form properties upon varying the surface adsorbed moisture of microcrystalline cellulose (MCC), a model filler/binder for orally disintegrating tablets (ODTs). METHODS: The moisture content of MCC was optimised using the spray water method and analysed using thermogravimetric analysis. Microproperties and macroproperties were assessed using atomic force microscopy, nano-indentation, scanning electron microscopy, tablet hardness and disintegration testing. KEY FINDINGS: The results showed that MCC demonstrated its best flowability at a moisture content of 11.2% w/w when compared with the control, which contained 3.9% w/w moisture. The use of the composite powder coating process (without air) resulted in an increase of up to 80% in tablet hardness when compared with the control. The study also demonstrated that surface adsorbed moisture can be displaced upon addition of excipients during dry processing, circumventing the need for particle drying before tabletting. CONCLUSIONS: It was concluded that MCC with a moisture content of 11% w/w provides a good balance between powder flowability and favourable ODT characteristics.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

A substantial amount of information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing data production, however, pushes data analytic platforms to their limit. This thesis proposes techniques for more efficient textual big data analysis suitable for the Hadoop analytic platform. This research explores the direct processing of compressed textual data. The focus is on developing novel compression methods with a number of desirable properties to support text-based big data analysis in distributed environments. The novel contributions of this work include the following. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC makes a distinction between informational and functional content, and only the informational content is compressed. Thus, the compressed data is made transparent to existing software libraries, which often rely on functional content to work. Secondly, a context-free bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed. This uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes have been extended so that the compressed data can be safely split with respect to logical data records in distributed file systems. Furthermore, an innovative two-layer compression architecture is used, in which each compression layer is appropriate for the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and also make the use of the compressed data transparent to developers. The compression schemes have been evaluated for a number of standard MapReduce analysis tasks using a collection of real-world datasets. In comparison with existing solutions, they have shown substantial improvement in performance and significant reduction in system resource requirements.
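The second contribution builds on the Huffman algorithm, so a sketch of classic Huffman coding may help as background. This is the textbook algorithm only. It is not the thesis's Approximated Huffman Compression, and it does not include the hybrid search structure or the record-aware splitting described above. The function names and sample text are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(text):
    """Build a prefix-free code {symbol: bitstring} from symbol frequencies."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:                       # a single symbol still needs one bit
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, partial code table). The unique
    # tie_breaker means the dicts themselves are never compared.
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)    # the two least frequent subtrees
        n2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(text, code):
    """Concatenate the codeword of every symbol in the text."""
    return "".join(code[ch] for ch in text)


def decode(bits, code):
    """Decode greedily; this works because no codeword prefixes another."""
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)


sample = "content processing of compressed textual data"
code = huffman_code(sample)
bits = encode(sample, code)
print(len(bits), "bits versus", 8 * len(sample), "bits uncompressed")
print(decode(bits, code) == sample)  # True
```

Because Huffman codewords have variable length, symbol boundaries in the bit stream do not line up with byte or record boundaries. That is one reason plain Huffman output is awkward to search or split, which is the gap the thesis's extensions aim to address.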

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The objectives of this thesis were to (i) study the effect of increasing protein concentration in milk protein concentrate (MPC) powders on surface composition and sorption properties; (ii) examine the effect of increasing protein content on the rehydration properties of MPC; (iii) study the physicochemical properties of spray-dried emulsion-containing powders having different water and oil contents; (iv) analyse the effect of protein type on water sorption and diffusivity properties in a protein/lactose dispersion; and (v) characterise lactose crystallisation and emulsion stability of model infant formula containing intact or hydrolysed whey proteins. Surface composition of MPC powders (protein contents 35–86 g/100 g) indicated that fat and protein were preferentially located on the surface of powders. Low-protein powder (35 g/100 g) exhibited lactose crystallisation, whereas powders with higher protein contents did not, due to their high protein:lactose ratio. Insolubility was evident in high-protein MPCs and was primarily related to insolubility of the casein fraction. High temperature (50 °C) was required for dissolution of high-protein MPCs (protein content > 60 g/100 g). The effect of different oil types and spray-drying outlet temperature on the physicochemical properties of the resultant fat-filled powders was investigated. Increasing outlet temperature reduced water content, water activity and tapped bulk density, irrespective of oil type, and increased solvent-extractable free fat for all oil types, as well as the onset temperatures of glass transition (Tg) and crystallisation (Tcr). Powder dispersions of protein/lactose (0.21:1), containing either intact or hydrolysed whey protein (12% degree of hydrolysis; DH), were spray-dried at pilot scale. Moisture sorption analysis at 25 °C showed that dispersions containing intact whey protein exhibited lactose crystallisation at a lower relative humidity (RH). Dispersions containing hydrolysed whey protein had significantly higher (P < 0.05) water diffusivity. Finally, a spray-dried model infant formula was produced containing hydrolysed or intact whey as the protein with sunflower oil as the fat source. Reconstituted, hydrolysed formula had a significantly (P < 0.05) higher fat globule size and lower emulsion stability than intact formula. Lactose crystallisation in powders occurred at higher RH for hydrolysed formula. In conclusion, this research has shown the effect of altering the protein type, protein composition, and oil type on the surface composition and physical properties of different dairy powders, and how these variations greatly affect their rehydration characteristics and storage stability.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The Semantic Annotation component is a software application that provides support for automated text classification, a process grounded in a cohesion-centered representation of discourse that facilitates topic extraction. The component enables the semantic meta-annotation of text resources, including automated classification, thus facilitating information retrieval within the RAGE ecosystem. It is available in the ReaderBench framework (http://readerbench.com/) which integrates advanced Natural Language Processing (NLP) techniques. The component makes use of Cohesion Network Analysis (CNA) in order to ensure an in-depth representation of discourse, useful for mining keywords and performing automated text categorization. Our component automatically classifies documents into the categories provided by the ACM Computing Classification System (http://dl.acm.org/ccs_flat.cfm), but also into the categories from a high level serious games categorization provisionally developed by RAGE. English and French languages are already covered by the provided web service, whereas the entire framework can be extended in order to support additional languages.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Argon infiltration is a well-known problem of hot isostatic pressed components. Thus, the argon content is one quality attribute that is measured after a hot isostatic pressing (HIP) process. Since the Selective Laser Melting (SLM) process takes place under an inert argon atmosphere, it is conceivable that argon is entrapped in the component after SLM processing. Despite the use of optimized process parameters, defects such as pores and shrink holes cannot be completely avoided. Pores in particular could be filled with process gas during the building process. Argon-filled pores would clearly affect the mechanical properties. The present paper takes a closer look at the porosity in Inconel 718 samples generated by means of SLM. Furthermore, the argon content of the powder feedstock, of samples made by means of SLM, of samples that were hot isostatic pressed after the SLM process, and of conventionally manufactured samples was measured and compared. The results showed an increased argon content in the Inconel 718 samples after SLM processing compared with conventionally manufactured samples.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Coupled map lattices (CML) can describe many relaxation and optimization algorithms currently used in image processing. We recently introduced the “plastic-CML” as a paradigm to extract (segment) objects in an image. Here, the image is applied as a set of forces to a metal sheet, which is allowed to undergo plastic deformation parallel to the applied forces. In this paper we present an analysis of our “plastic-CML” in one and two dimensions, deriving the nature and stability of its stationary solutions. We also detail how to use the CML in image processing and how to set the system parameters, and we present examples of it at work. We conclude that the plastic-CML is able to segment images with large amounts of noise and a large dynamic range of pixel values, and is suitable for a very large scale integration (VLSI) implementation.
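For readers unfamiliar with coupled map lattices, the sketch below shows one update of a generic, diffusively coupled one-dimensional lattice: x_i ← (1 − ε)·f(x_i) + (ε/2)·(f(x_{i−1}) + f(x_{i+1})). It is not the plastic-CML of this paper, whose local map and force terms the abstract does not give. With the identity as local map, the update reduces to a simple relaxation (smoothing) step of the kind the first sentence alludes to. The periodic boundary and all names are assumptions.

```python
def cml_step(state, local_map, coupling):
    """One synchronous update of a 1-D coupled map lattice.

    Each site applies `local_map`, then mixes with its two neighbours using
    weight `coupling` (0 = uncoupled sites). Boundaries are periodic.
    """
    mapped = [local_map(x) for x in state]
    n = len(mapped)
    return [
        (1 - coupling) * mapped[i]
        + 0.5 * coupling * (mapped[i - 1] + mapped[(i + 1) % n])
        for i in range(n)
    ]


def identity(x):
    """Trivial local map, which turns the lattice into a diffusion step."""
    return x


# A single bright "pixel" on a dark row relaxes toward its neighbours.
row = [0.0, 0.0, 10.0, 0.0, 0.0, 0.0]
print(cml_step(row, identity, coupling=0.5))
# [0.0, 2.5, 5.0, 2.5, 0.0, 0.0]
```

Swapping in a nonlinear `local_map` changes the dynamics without changing the coupling code. A segmentation scheme such as the one described would need a local rule that lets sites settle into distinct stable values rather than diffusing toward a common mean.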