937 resultados para MongoDB, database, NoSQL
Resumo:
Com o advento da invenção do modelo relacional em 1970 por E.F.Codd, a forma como a informação era gerida numa base de dados foi totalmente revolucionada. Migrou‐se de sistemas hierárquicos baseados em ficheiros para uma base de dados relacional com tabelas relações e registos que simplificou em muito a gestão da informação e levou muitas empresas a adotarem este modelo. O que E.F.Codd não previu foi o facto de que cada vez mais a informação que uma base de dados teria de armazenar fosse de proporções gigantescas, nem que as solicitações às bases de dados fossem da mesma ordem. Tudo isto veio a acontecer com a difusão da internet que veio ligar todas as pessoas de qualquer parte do mundo que tivessem um computador. Com o número de adesões à internet a crescer, o número de sites que nela eram criados também cresceu (e ainda cresce exponencialmente). Os motores de busca que antigamente indexavam alguns sites por dia, atualmente indexam uns milhões de sites por segundo e, mais recentemente as redes sociais também estão a lidar com quantidades gigantescas de informação. Tanto os motores de busca como as redes sociais chegaram à conclusão que uma base de dados relacional não chega para gerir a enorme quantidade de informação que ambos produzem e como tal, foi necessário encontrar uma solução. Essa solução é NoSQL e é o assunto que esta tese vai tratar. O presente documento visa definir e apresentar o problema que as bases de dados relacionais têm quando lidam com grandes volumes de dados, introduzir os limites do modelo relacional que só até há bem pouco tempo começaram a ser evidenciados com o surgimento de movimentos, como o BigData, com o crescente número de sites que surgem por dia e com o elevado número de utilizadores das redes sociais. Será também ilustrada a solução adotada até ao momento pelos grandes consumidores de dados de elevado volume, como o Google e o Facebook, enunciando as suas características vantagens, desvantagens e os demais conceitos ligados ao modelo NoSQL. A presente tese tenciona ainda demonstrar que o modelo NoSQL é uma realidade usada em algumas empresas e quais as principias mudanças a nível programático e as boas práticas delas resultantes que o modelo NoSQL traz. Por fim esta tese termina com a explicação de que NoSQL é uma forma de implementar a persistência de uma aplicação que se inclui no novo modelo de persistência da informação.
Resumo:
We are living in the era of Big Data. A time which is characterized by the continuous creation of vast amounts of data, originated from different sources, and with different formats. First, with the rise of the social networks and, more recently, with the advent of the Internet of Things (IoT), in which everyone and (eventually) everything is linked to the Internet, data with enormous potential for organizations is being continuously generated. In order to be more competitive, organizations want to access and explore all the richness that is present in those data. Indeed, Big Data is only as valuable as the insights organizations gather from it to make better decisions, which is the main goal of Business Intelligence. In this paper we describe an experiment in which data obtained from a NoSQL data source (database technology explicitly developed to deal with the specificities of Big Data) is used to feed a Business Intelligence solution.
Resumo:
Dissertação de mestrado integrado em Engenharia e Gestão de Sistemas de Informação
Resumo:
In questa tesi sono stati introdotti e studiati i Big Data, dando particolare importanza al mondo NoSQL, approfondendo MongoDB, e al mondo del Machine Learning, approfondendo PredictionIO. Successivamente è stata sviluppata un'applicazione attraverso l'utilizzo di tecnologie web, nodejs, node-webkit e le tecnologie approfondite prima. L'applicazione utilizza l'interpolazione polinomiale per predirre il prezzo di un bene salvato nello storico presente su MongoDB. Attraverso PredictionIO, essa analizza il comportamento degli altri utenti consigliando dei prodotti per l'acquisto. Infine è stata effetuata un'analisi dei risultati dell'errore prodotto dall'interpolazione.
Resumo:
En este proyecto se ha estudiado el abanico de posibilidades que las plataformas web y móviles ofrecen para aprender lenguajes de programación compilados. A continuación, se ha realizado el diseño y la implementación de una plataforma para el aprendizaje de lenguajes de programación desde dispositivos móviles, con posibilidad de compilación remota desde la aplicación desarrollada, analizando el proceso y las elecciones de desarrollo tomadas. Así, se ha desarrollado una app mediante la plataforma de desarrollo Cordova, que puede ser distribuida para todas las plataformas móviles que esta soporta, incluyendo las más populares: iOS y Android. Para la parte servidora se ha utilizado un servidor Apache (PHP) y el sistema NoSQL MongoDB para la base de datos. Para mayor facilidad en la gestión del contenido de la app, se ha desarrollado en paralelo un gestor web de la base de datos, el cual permite añadir, editar y eliminar contenido de la misma a través de una interfaz agradable y funcional. ABSTRACT. In this project I have studied the range of possibilities that web and mobile platforms offer to learn compiled programming languages. Next, I have designed and implemented a platform for learning programming languages from mobile devices, giving the possibility of remote compilation within the developed application. In this terms, I have developed an app with the Cordova development platform, which can be distributed for all the mobile platforms Cordova supports, including the most popular ones: iOS and Android. For the server part, I have used an Apache (PHP) server and the NoSQL database system MongoDB. In order to offer a more usable system and a better database management, I have also developed a web manager for the database, from which database content can be added, edited and removed, through a clear and functional interface.
Resumo:
Component-based Software Engineering (CBSE) and Service-Oriented Architecture (SOA) became popular ways to develop software over the last years. During the life-cycle of a software system, several components and services can be developed, evolved and replaced. In production environments, the replacement of core components, such as databases, is often a risky and delicate operation, where several factors and stakeholders should be considered. Service Level Agreement (SLA), according to ITILv3’s official glossary, is “an agreement between an IT service provider and a customer. The agreement consists on a set of measurable constraints that a service provider must guarantee to its customers.”. In practical terms, SLA is a document that a service provider delivers to its consumers with minimum quality of service (QoS) metrics.This work is intended to assesses and improve the use of SLAs to guide the transitioning process of databases on production environments. In particular, in this work we propose SLA-Based Guidelines/Process to support migrations from a relational database management system (RDBMS) to a NoSQL one. Our study is validated by case studies.
Resumo:
To store, update and retrieve data from database management systems (DBMS), software architects use tools, like call-level interfaces (CLI), which provide standard functionalities to interact with DBMS. However, the emerging of NoSQL paradigm, and particularly new NoSQL DBMS providers, lead to situations where some of the standard functionalities provided by CLI are not supported, very often due to their distance from the relational model or due to design constraints. As such, when a system architect needs to evolve, namely from a relational DBMS to a NoSQL DBMS, he must overcome the difficulties conveyed by the features not provided by NoSQL DBMS. Choosing the wrong NoSQL DBMS risks major issues with components requesting non-supported features. This paper focuses on how to deploy features that are not so commonly supported by NoSQL DBMS (like Stored Procedures, Transactions, Save Points and interactions with local memory structures) by implementing them in standard CLI.
Resumo:
Different types of water bodies, including lakes, streams, and coastal marine waters, are often susceptible to fecal contamination from a range of point and nonpoint sources, and have been evaluated using fecal indicator microorganisms. The most commonly used fecal indicator is Escherichia coli, but traditional cultivation methods do not allow discrimination of the source of pollution. The use of triplex PCR offers an approach that is fast and inexpensive, and here enabled the identification of phylogroups. The phylogenetic distribution of E. coli subgroups isolated from water samples revealed higher frequencies of subgroups A1 and B23 in rivers impacted by human pollution sources, while subgroups D1 and D2 were associated with pristine sites, and subgroup B1 with domesticated animal sources, suggesting their use as a first screening for pollution source identification. A simple classification is also proposed based on phylogenetic subgroup distribution using the w-clique metric, enabling differentiation of polluted and unpolluted sites.
Resumo:
Despite a strong increase in research on seamounts and oceanic islands ecology and biogeography, many basic aspects of their biodiversity are still unknown. In the southwestern Atlantic, the Vitória-Trindade Seamount Chain (VTC) extends ca. 1,200 km offshore the Brazilian continental shelf, from the Vitória seamount to the oceanic islands of Trindade and Martin Vaz. For a long time, most of the biological information available regarded its islands. Our study presents and analyzes an extensive database on the VTC fish biodiversity, built on data compiled from literature and recent scientific expeditions that assessed both shallow to mesophotic environments. A total of 273 species were recorded, 211 of which occur on seamounts and 173 at the islands. New records for seamounts or islands include 191 reef fish species and 64 depth range extensions. The structure of fish assemblages was similar between islands and seamounts, not differing in species geographic distribution, trophic composition, or spawning strategies. Main differences were related to endemism, higher at the islands, and to the number of endangered species, higher at the seamounts. Since unregulated fishing activities are common in the region, and mining activities are expected to drastically increase in the near future (carbonates on seamount summits and metals on slopes), this unique biodiversity needs urgent attention and management.
Resumo:
Considering the difficulties in finding good-quality images for the development and test of computer-aided diagnosis (CAD), this paper presents a public online mammographic images database free for all interested viewers and aimed to help develop and evaluate CAD schemes. The digitalization of the mammographic images is made with suitable contrast and spatial resolution for processing purposes. The broad recuperation system allows the user to search for different images, exams, or patient characteristics. Comparison with other databases currently available has shown that the presented database has a sufficient number of images, is of high quality, and is the only one to include a functional search system.
Resumo:
This article documents the addition of 229 microsatellite marker loci to the Molecular Ecology Resources Database. Loci were developed for the following species: Acacia auriculiformis x Acacia mangium hybrid, Alabama argillacea, Anoplopoma fimbria, Aplochiton zebra, Brevicoryne brassicae, Bruguiera gymnorhiza, Bucorvus leadbeateri, Delphacodes detecta, Tumidagena minuta, Dictyostelium giganteum, Echinogammarus berilloni, Epimedium sagittatum, Fraxinus excelsior, Labeo chrysophekadion, Oncorhynchus clarki lewisi, Paratrechina longicornis, Phaeocystis antarctica, Pinus roxburghii and Potamilus capax. These loci were cross-tested on the following species: Acacia peregrinalis, Acacia crassicarpa, Bruguiera cylindrica, Delphacodes detecta, Tumidagena minuta, Dictyostelium macrocephalum, Dictyostelium discoideum, Dictyostelium purpureum, Dictyostelium mucoroides, Dictyostelium rosarium, Polysphondylium pallidum, Epimedium brevicornum, Epimedium koreanum, Epimedium pubescens, Epimedium wushanese and Fraxinus angustifolia.
Resumo:
Much information on flavonoid content of Brazilian foods has already been obtained; however, this information is spread in scientific publications and non-published data. The objectives of this work were to compile and evaluate the quality of national flavonoid data according to the United States Department of Agriculture`s Data Quality Evaluation System (USDA-DQES) with few modifications, for future dissemination in the TBCA-USP (Brazilian Food Composition Database). For the compilation, the most abundant compounds in the flavonoid subclasses were considered (flavonols, flavones, isoflavones, flavanones, flavan-3-ols, and anthocyanidins) and the analysis of the compounds by HPLC was adopted as criteria for data inclusion. The evaluation system considers five categories, and the maximum score assigned to each category is 20. For each data, a confidence code (CC) was attributed (A, B, C and D), indicating the quality and reliability of the information. Flavonoid data (773) present in 197 Brazilian foods were evaluated. The CC ""C"" (as average) was attributed to 99% of the data and ""B"" (above average) to 1%. The main categories assigned low average scores were: number of samples; sampling plan and analytical quality control (average scores 2, 5 and 4, respectively). The analytical method category received an average score of 9. The category assigned the highest score was the sample handling (20 average). These results show that researchers need to be conscious about the importance of the number and plan of evaluated samples and the complete description and documentation of all the processes of methodology execution and analytical quality control. (C) 2010 Elsevier Inc. All rights reserved.
Resumo:
Foods that contain unavailable carbohydrates may lower the risks for some non-transmissible chronic diseases because of the potential benefits provided by the products of colonic fermentation. On the other hand, foods that are sources of available carbohydrates may have higher energy value and increase the post-prandial glycemic response. The biomarker glycemic index and the resulting glycemic load may be used to classify foods according to their potential to increase blood glucose. Information about glycemic index and glycemic load may be useful in diet therapy. Currently, food composition tables in Brazil do not provide data for individually analyzed carbohydrates even though some quality data are available in scientific publications. The objectives of this work were to produce and compile information about the concentration of individual carbohydrates in foods and their glycemic responses and to disseminate this information through the Brazilian Food Composition Database (TBCA-USP). The glycemic index and glycemic load of foods were evaluated in healthy individuals. Concentrations of available carbohydrates (soluble sugars and available starch) and unavailable carbohydrates (dietary fiber, resistant starch, beta-glucans, fructans) were quantified by official methods, and other national data were compiled. TBCA-USP (http://www.fcf.usp.br/tabela), which is used by professionals and the population in general, now offers both chemical and biological information for carbohydrates. (C) 2009 Elsevier Inc. All rights reserved.