1 resultado para Scientific Data
em Universidade de Madeira
Resumo:
This thesis presents a cloud-based software platform for sharing publicly available scientific datasets. The proposed platform leverages the potential of NoSQL databases and asynchronous IO technologies, such as Node.JS, in order to achieve high performances and flexible solutions. This solution will serve two main groups of users. The dataset providers, which are the researchers responsible for sharing and maintaining datasets, and the dataset users, that are those who desire to access the public data. To the former are given tools to easily publish and maintain large volumes of data, whereas the later are given tools to enable the preview and creation of subsets of the original data through the introduction of filter and aggregation operations. The choice of NoSQL over more traditional RDDMS emerged from and extended benchmark between relational databases (MySQL) and NoSQL (MongoDB) that is also presented in this thesis. The obtained results come to confirm the theoretical guarantees that NoSQL databases are more suitable for the kind of data that our system users will be handling, i. e., non-homogeneous data structures that can grow really fast. It is envisioned that a platform like this can lead the way to a new era of scientific data sharing where researchers are able to easily share and access all kinds of datasets, and even in more advanced scenarios be presented with recommended datasets and already existing research results on top of those recommendations.