947 results for Data aggregation
Abstract:
Integrated Master's dissertation in Information Systems Engineering and Management
Abstract:
Integrated Master's dissertation in Information Systems Engineering and Management
Abstract:
PhD thesis in Sciences (Specialization in Mathematics)
Abstract:
Inspired by the relational algebra of data processing, this paper addresses the foundations of data analytical processing from a linear algebra perspective. The paper investigates, in particular, how aggregation operations such as cross tabulations and data cubes essential to quantitative analysis of data can be expressed solely in terms of matrix multiplication, transposition and the Khatri–Rao variant of the Kronecker product. The approach offers a basis for deriving an algebraic theory of data consolidation, handling the quantitative as well as qualitative sides of data science in a natural, elegant and typed way. It also shows potential for parallel analytical processing, as the parallelization theory of such matrix operations is well acknowledged.
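To make the idea concrete, here is a minimal NumPy sketch (not code from the paper; the encoding and names are illustrative assumptions): each attribute column is encoded as a Boolean projection matrix, so a cross tabulation becomes a single matrix multiplication, and the Khatri–Rao product pairs two attributes record by record.

```python
import numpy as np

def projection(column, values):
    # k-by-n Boolean matrix: entry (i, j) = 1 iff record j takes value i
    return np.array([[int(v == val) for v in column] for val in values])

def khatri_rao(A, B):
    # Column-wise Kronecker product: column k is kron(A[:, k], B[:, k]),
    # i.e. the two attributes combined record by record.
    return np.einsum('ik,jk->ijk', A, B).reshape(-1, A.shape[1])

city    = ["Braga", "Porto", "Braga", "Braga"]
product = ["pen",   "pen",   "book",  "pen"]
C = projection(city,    ["Braga", "Porto"])
P = projection(product, ["pen", "book"])

crosstab = C @ P.T            # counts of (city, product) co-occurrences
combined = khatri_rao(C, P)   # combined attribute, ready for further aggregation
print(crosstab)               # [[2, 1], [1, 0]]
print(crosstab.sum())         # 4 = number of records (sanity check)
```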
Abstract:
Large scale distributed data stores rely on optimistic replication to scale and remain highly available in the face of network partitions. Managing data without coordination results in eventually consistent data stores that allow for concurrent data updates. These systems often use anti-entropy mechanisms (like Merkle Trees) to detect and repair divergent data versions across nodes. However, in practice hash-based data structures are too expensive for large amounts of data and create too many false conflicts. Another aspect of eventual consistency is detecting write conflicts. Logical clocks are often used to track data causality, necessary to detect causally concurrent writes on the same key. However, there is a non-negligible metadata overhead per key, which also keeps growing with time, proportionally to the node churn rate. Another challenge is deleting keys while respecting causality: while the values can be deleted, per-key metadata cannot be permanently removed without coordination. We introduce a new causality management framework for eventually consistent data stores that leverages node logical clocks (Bitmapped Version Vectors) and a new key logical clock (Dotted Causal Container) to provide advantages on multiple fronts: 1) a new efficient and lightweight anti-entropy mechanism; 2) greatly reduced per-key causality metadata size; 3) accurate key deletes without permanent metadata.
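For context, here is a minimal sketch of the classic version-vector causality check that such stores build on; the paper's Bitmapped Version Vectors and Dotted Causal Containers are more compact refinements, and this is not their implementation.

```python
# Version vectors as {node_id: event_count} dicts (a common textbook encoding).

def descends(a: dict, b: dict) -> bool:
    """True if clock `a` dominates (has seen every event of) clock `b`."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def compare(a: dict, b: dict) -> str:
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a dominates b"   # b's write is obsolete, safe to discard
    if descends(b, a):
        return "b dominates a"
    return "concurrent"          # true write conflict: keep both versions

# Two replicas update the same key independently after a common ancestor:
print(compare({"n1": 2, "n2": 1}, {"n1": 1, "n2": 2}))  # -> concurrent
print(compare({"n1": 2, "n2": 2}, {"n1": 1, "n2": 2}))  # -> a dominates b
```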
Abstract:
We study the problem of privacy-preserving proofs on authenticated data, where a party receives data from a trusted source and is requested to prove computations over the data to third parties in a correct and private way, i.e., the third party learns no information on the data but is still assured that the claimed proof is valid. Our work particularly focuses on the challenging requirement that the third party should be able to verify the validity with respect to the specific data authenticated by the source — even without having access to that source. This problem is motivated by various scenarios emerging from several application areas such as wearable computing, smart metering, or general business-to-business interactions. Furthermore, these applications also demand any meaningful solution to satisfy additional properties related to usability and scalability. In this paper, we formalize the above three-party model, discuss concrete application scenarios, and then we design, build, and evaluate ADSNARK, a nearly practical system for proving arbitrary computations over authenticated data in a privacy-preserving manner. ADSNARK improves significantly over state-of-the-art solutions for this model. For instance, compared to corresponding solutions based on Pinocchio (Oakland’13), ADSNARK achieves up to 25× improvement in proof-computation time and a 20× reduction in prover storage space.
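To fix intuitions about the setting, here is a schematic of the three-party model as Python interfaces; the names and signatures are illustrative assumptions, not the ADSNARK API, and the proof object is a placeholder rather than an actual zk-SNARK.

```python
# Schematic of the three-party model only (illustrative, not ADSNARK code).
from dataclasses import dataclass

@dataclass
class AuthenticatedData:
    payload: bytes     # the source's private data, e.g. smart-meter readings
    signature: bytes   # the source's signature binding the payload

class Source:
    def publish(self, payload: bytes) -> AuthenticatedData:
        ...  # sign the payload with the source's secret key

class Prover:
    def prove(self, ad: AuthenticatedData, f):
        # Compute y = f(payload) and a proof that y is correct with respect
        # to data authenticated by the source, without revealing the payload.
        ...

class Verifier:
    def verify(self, source_public_key, f, y, proof) -> bool:
        # Accept iff the proof shows y = f(x) for some x signed by the source;
        # crucially, the verifier never sees x and never contacts the source.
        ...
```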
Abstract:
Integrated Master's dissertation in Civil Engineering
Abstract:
A chitosan coating was applied to lactoferrin (Lf)-glycomacropeptide (GMP) nanohydrogels by a layer-by-layer coating process. A volume ratio of 0.1 of Lf-GMP nanohydrogels (0.2 mg·mL⁻¹, at pH 5.0) to chitosan (1 mg·mL⁻¹, at pH 3) proved to be the optimal condition to obtain stable nanohydrogels with a size of 230 ± 12 nm, a PdI of 0.22 ± 0.02 and a ζ-potential of 30.0 ± 0.15 mV. Transmission electron microscopy (TEM) images showed that the application of the chitosan coating to Lf-GMP did not affect the spherical shape of the nanohydrogels and confirmed their low aggregation in solution. The analysis of chemical interactions between chitosan and Lf-GMP nanohydrogels was performed by Fourier transform infrared spectroscopy (FTIR) and circular dichroism (CD), which revealed that specific chemical interactions were established between functional groups of the protein-based nanohydrogels and active groups of the chitosan. The effect of the chitosan coating on the release mechanisms of Lf-GMP nanohydrogels under acidic conditions (pH 2, 37 ºC) was evaluated by encapsulating a model compound (caffeine) in these systems. The Linear Superposition Model was used to fit the experimental data and revealed that both Fickian and relaxation mechanisms are involved in caffeine release; it was also observed that the Fickian contribution increases with the application of the chitosan coating. In vitro gastric digestion was performed with Lf-GMP nanohydrogels with and without chitosan coating, and the presence of chitosan improved the stability of Lf and GMP (the proteins were hydrolysed at a slower rate and remained in solution for a longer time). Native electrophoresis revealed that, during gastric digestion, the uncoated nanohydrogels remained intact in solution for up to 15 min, whereas the chitosan-coated nanohydrogels remained intact for up to 60 min.
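For readers unfamiliar with the model, here is a sketch of fitting a simplified Linear Superposition Model (a truncated Fickian series plus a first-order relaxation term) to fractional-release data; the functional form, parameter names and data below are illustrative assumptions, not the authors' script or measurements.

```python
import numpy as np
from scipy.optimize import curve_fit

def lsm(t, m_fick, k_fick, m_relax, k_relax, n_terms=10):
    # Fickian diffusion term (series truncated at n_terms) plus relaxation term
    n = np.arange(1, n_terms + 1)[:, None]
    fick = 1 - (6 / np.pi**2) * np.sum(
        np.exp(-n**2 * np.pi**2 * k_fick * t) / n**2, axis=0)
    relax = 1 - np.exp(-k_relax * t)
    return m_fick * fick + m_relax * relax

t = np.array([0.0, 5, 10, 20, 40, 60, 90, 120])                 # min (made-up)
frac = np.array([0.0, 0.18, 0.30, 0.45, 0.62, 0.72, 0.80, 0.84])  # released

popt, _ = curve_fit(lsm, t, frac, p0=[0.5, 0.01, 0.5, 0.02],
                    bounds=(0, [1, 1, 1, 1]))
m_f, k_f, m_r, k_r = popt
print(f"Fickian share of total release: {m_f / (m_f + m_r):.0%}")
```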
Abstract:
The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fmicb.2016.00275
Abstract:
Genome-scale metabolic models are valuable tools in the metabolic engineering process, based on the ability of these models to integrate diverse sources of data to produce global predictions of organism behavior. At the most basic level, these models require only a genome sequence to construct, and once built, they may be used to predict essential genes, culture conditions, pathway utilization, and the modifications required to enhance a desired organism behavior. In this chapter, we address two key challenges associated with the reconstruction of metabolic models: (a) leveraging existing knowledge of microbiology, biochemistry, and available omics data to produce the best possible model; and (b) applying available tools and data to automate the reconstruction process. We consider these challenges as we progress through the model reconstruction process, beginning with genome assembly, and culminating in the integration of constraints to capture the impact of transcriptional regulation. We divide the reconstruction process into ten distinct steps: (1) genome assembly from sequenced reads; (2) automated structural and functional annotation; (3) phylogenetic tree-based curation of genome annotations; (4) assembly and standardization of biochemistry database; (5) genome-scale metabolic reconstruction; (6) generation of core metabolic model; (7) generation of biomass composition reaction; (8) completion of draft metabolic model; (9) curation of metabolic model; and (10) integration of regulatory constraints. Each of these ten steps is documented in detail.
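Once reconstructed, such models are typically interrogated with constraint-based methods such as flux balance analysis. The toy sketch below (a three-reaction network; illustrative only, not tied to any specific reconstruction pipeline or tool) shows the steady-state linear program involved.

```python
import numpy as np
from scipy.optimize import linprog

# Stoichiometric matrix S (rows: metabolites A, B; columns: reactions)
#   R1: uptake -> A      R2: A -> B      R3: B -> biomass
S = np.array([[ 1, -1,  0],
              [ 0,  1, -1]])

c = [0, 0, -1]                        # maximize flux through R3 (linprog minimizes)
bounds = [(0, 10), (0, 10), (0, 10)]  # flux capacity bounds per reaction
res = linprog(c, A_eq=S, b_eq=np.zeros(2), bounds=bounds)  # S v = 0 (steady state)
print("predicted growth flux:", res.x[2])  # -> 10.0, limited by the uptake bound
```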
Abstract:
Published in "Information control in manufacturing 1998 : (INCOM'98) : advances in industrial engineering : a proceedings volume from the 9th IFAC Symposium, Nancy-Metz, France, 24-26 June 1998. Vol. 2"
Abstract:
Recently, there has been a growing interest in the field of metabolomics, materialized by a remarkable growth in experimental techniques, available data and related biological applications. Indeed, techniques such as Nuclear Magnetic Resonance, Gas or Liquid Chromatography, Mass Spectrometry, Infrared and UV-visible spectroscopies have provided extensive datasets that can help in tasks such as biological and biomedical discovery, biotechnology and drug development. However, as with other omics data, the analysis of metabolomics datasets poses multiple challenges, both in terms of methodologies and in the development of appropriate computational tools. Indeed, none of the available software tools addresses the multiplicity of existing techniques and data analysis tasks. In this work, we make available a novel R package, named specmine, which provides a set of methods for metabolomics data analysis, including data loading in different formats, pre-processing, metabolite identification, univariate and multivariate data analysis, machine learning, and feature selection. Importantly, the implemented methods provide adequate support for the analysis of data from diverse experimental techniques, integrating a large set of functions from several R packages in a powerful, yet simple-to-use environment. The package, already available in CRAN, is accompanied by a web site where users can deposit datasets, scripts and analysis reports to be shared with the community, promoting the efficient sharing of metabolomics data analysis pipelines.
Abstract:
Master's dissertation in Statistics
Abstract:
Article first published online: 13 NOV 2013
Abstract:
Football is nowadays considered one of the most popular sports. In the betting world, it has acquired an outstanding position, moving millions of euros during the period of a single football match. The lack of profitability of football betting users has been stressed as a problem, and it gave origin to this research proposal, which analyses whether there is a way to help users increase the profits from their bets. Data mining models were induced with the purpose of supporting gamblers in increasing their profits in the medium/long term. Being conscious that the models can fail, the results achieved by four of the seven targets in the models are encouraging and suggest that the system can help to increase profits. All defined targets have two possible classes to predict, for example, whether there are more or fewer than 7.5 corners in a single game. The data mining models for the targets more or fewer than 7.5 corners, 8.5 corners, 1.5 goals and 3.5 goals achieved the pre-defined thresholds. The models were implemented in a prototype, which is a pervasive decision support system. This system was developed to serve as an interface for any user, from an expert to a user with no knowledge of football games.
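As a rough illustration of one such two-class target (over/under 7.5 corners), here is a scikit-learn sketch on synthetic data; the thesis's actual features, algorithms and thresholds are not reproduced here, and the feature names below are made up.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 1000
X = np.column_stack([
    rng.normal(5, 2, n),   # e.g. home team's average corners per game
    rng.normal(5, 2, n),   # e.g. away team's average corners per game
    rng.normal(0, 1, n),   # e.g. attacking-strength difference
])
# Binary target: did the match produce more than 7.5 corners?
y = (X[:, 0] + X[:, 1] + rng.normal(0, 1, n) > 7.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, model.predict(X_te)))
```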