917 results for Big Data


Relevance:

70.00%

Publisher:

Abstract:

The amount and quality of available biomass is a key factor for a sustainable livestock industry and for agricultural management decision making. Globally, 31.5% of land cover is grassland, while 80% of Ireland’s agricultural land is grassland. In Ireland, grasslands are intensively managed and provide the cheapest feed source for animals. This dissertation presents a detailed state-of-the-art review of satellite remote sensing of grasslands, and the potential application of optical (Moderate-Resolution Imaging Spectroradiometer (MODIS)) and radar (TerraSAR-X) time series imagery to estimate grassland biomass at two study sites (Moorepark and Grange) in the Republic of Ireland using both statistical and state-of-the-art machine learning algorithms. High quality weather data available from the on-site weather station were also used to calculate the Growing Degree Days (GDD) for Grange, to determine the impact of ancillary data on biomass estimation. In situ and satellite data covering 12 years for the Moorepark and 6 years for the Grange study sites were used to predict grassland biomass using multiple linear regression and Adaptive Neuro-Fuzzy Inference System (ANFIS) models. The results demonstrate that a dense (8-day composite) MODIS image time series, along with high quality in situ data, can be used to retrieve grassland biomass with high performance (R² = 0.86, p < 0.05, RMSE = 11.07 for Moorepark). The model for Grange was modified to evaluate the synergistic use of vegetation indices derived from remote sensing time series and accumulated GDD information. As GDD is strongly linked to plant development, or phenological stage, an improvement in biomass estimation would be expected. It was observed that with the ANFIS model the biomass estimation accuracy increased from R² = 0.76 (p < 0.05) to R² = 0.81 (p < 0.05) and the root mean square error was reduced by 2.72%.
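The Growing Degree Days (GDD) accumulation used as ancillary input can be sketched in a few lines of Python; the base temperature and the sample week of temperatures below are illustrative assumptions, not values from the dissertation:

```python
def growing_degree_days(daily_min_max, t_base=0.0):
    """Accumulate Growing Degree Days (GDD) from daily (min, max) air
    temperatures in Celsius: each day contributes the amount by which
    the daily mean exceeds the base temperature, never less than zero."""
    gdd = 0.0
    for t_min, t_max in daily_min_max:
        daily_mean = (t_min + t_max) / 2.0
        gdd += max(0.0, daily_mean - t_base)
    return gdd

# Hypothetical week of spring weather (min, max in Celsius).
week = [(2, 9), (3, 11), (1, 8), (4, 12), (5, 13), (3, 10), (2, 9)]
print(growing_degree_days(week))  # → 46.0
```

Accumulated over a season, this running total is what the Grange model combined with the MODIS vegetation indices.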
The work on the application of optical remote sensing was further developed using a TerraSAR-X Staring Spotlight mode time series over the Moorepark study site, to explore the extent to which very high resolution Synthetic Aperture Radar (SAR) data of interferometrically coherent paddocks can be exploited to retrieve grassland biophysical parameters. After filtering out the non-coherent plots, it is demonstrated that interferometric coherence can be used to retrieve grassland biophysical parameters (i.e., height, biomass), and that it is possible to detect changes due to grass growth, grazing, and mowing events when the temporal baseline is short (11 days). However, it is not possible to automatically and uniquely identify the cause of these changes based only on the SAR backscatter and coherence, due to the ambiguity caused by tall grass laid flat by the wind. Overall, the work presented in this dissertation has demonstrated the potential of dense remote sensing and weather data time series to predict grassland biomass using machine learning algorithms, where high quality ground data were used for training. At present, a major limitation for national-scale biomass retrieval is the lack of spatial and temporal ground samples, which can be partially resolved by minor modifications to the existing PastureBaseIreland database, namely adding the location and extent of each grassland paddock to the database. As far as remote sensing data requirements are concerned, MODIS is useful for large-scale evaluation, but due to its coarse resolution it is not possible to detect variations within and between fields at the farm scale. However, this issue will be resolved in terms of spatial resolution by the Sentinel-2 mission, and when both satellites (Sentinel-2A and Sentinel-2B) are operational the revisit time will reduce to 5 days, which, together with Landsat-8, should provide sufficient cloud-free data for operational biomass estimation at a national scale.
The Synthetic Aperture Radar Interferometry (InSAR) approach is feasible if there are enough coherent interferometric pairs available; however, this is difficult to achieve due to the temporal decorrelation of the signal. For repeat-pass InSAR over a vegetated area, even an 11-day temporal baseline is too large. In order to achieve better coherence, a very high resolution is required at the cost of spatial coverage, which limits its scope for use in an operational context at a national scale. Future InSAR missions with pair acquisition in tandem mode will minimize temporal decorrelation over vegetated areas for more focused studies. The proposed approach complements the current paradigm of Big Data in Earth Observation and illustrates the feasibility of integrating data from multiple sources. In future, this framework can be used to build an operational decision support system for the retrieval of grassland biophysical parameters based on data from long-term planned optical missions (e.g., Landsat, Sentinel) that will ensure the continuity of data acquisition. Similarly, the Spanish X-band PAZ and TerraSAR-X2 missions will ensure the continuity of TerraSAR-X and COSMO-SkyMed.
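The interferometric coherence on which the change detection relies is the normalized complex cross-correlation of two SAR acquisitions over a window; a minimal sketch over synthetic complex pixel values (not TerraSAR-X data) might look like:

```python
import cmath
import math

def coherence(s1, s2):
    """Interferometric coherence: |sum(s1 * conj(s2))| normalized by
    the signal powers. Returns a value in [0, 1]; values near 1 mean
    stable scatterers, values near 0 mean temporal decorrelation."""
    cross = sum(a * b.conjugate() for a, b in zip(s1, s2))
    power = math.sqrt(sum(abs(a) ** 2 for a in s1) *
                      sum(abs(b) ** 2 for b in s2))
    return abs(cross) / power if power else 0.0

# Identical windows are perfectly coherent...
win = [complex(1, 2), complex(-0.5, 1), complex(2, -1)]
print(round(coherence(win, win), 3))  # → 1.0
# ...while random phase changes (e.g. grass growth) lower coherence.
shifted = [s * cmath.exp(1j * p) for s, p in zip(win, [0.1, 2.5, -1.8])]
print(round(coherence(win, shifted), 3))
```

A drop in this statistic between an 11-day pair is what signals growth, grazing, or mowing on a paddock.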

Relevance:

70.00%

Publisher:

Abstract:

In order to become better prepared to support Research Data Management (RDM) practices in sciences and engineering, Queen’s University Library, together with the University Research Services, conducted a research study of all ranks of faculty members, as well as postdoctoral fellows and graduate students at the Faculty of Engineering & Applied Science, Departments of Chemistry, Computer Science, Geological Sciences and Geological Engineering, Mathematics and Statistics, Physics, Engineering Physics & Astronomy, School of Environmental Studies, and Geography & Planning in the Faculty of Arts and Science.

Relevance:

70.00%

Publisher:

Abstract:

This paper discusses a series of artworks named CODEX produced by the authors as part of a collaborative research project between the Centre for Research in Education, Art and Media (CREAM), University of Westminster, and the Oxford Internet Institute. Taking the form of experimental maps, large-scale installations and prints, we show how big data can be employed to reflect upon social phenomena through the formulation of critical, aesthetic and speculative geographies.

Relevance:

70.00%

Publisher:

Abstract:

Twitter is the biggest social network in the world, and every day millions of tweets are posted and discussed, expressing various views and opinions. A large variety of research activities have been conducted to study how these opinions can be clustered and analyzed, so that underlying tendencies can be uncovered. Due to the inherent weaknesses of tweets (very short texts and very informal writing styles), it is rather hard for tweet data analysis to yield results with good performance and accuracy. In this paper, we attack the problem from another angle, using a two-layer structure to analyze the Twitter data: LDA combined with topic map modelling. The experimental results demonstrate that this approach shows progress in Twitter data analysis. However, more experiments with this method are needed to ensure that accurate analytic results can be maintained.
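A minimal sketch of the LDA layer, assuming a generic collapsed Gibbs sampler rather than the paper's exact pipeline (the topic-map layer, which would link the inferred topics, is not shown):

```python
import random
from collections import defaultdict

def lda_gibbs(docs, n_topics=2, n_iter=200, alpha=0.1, beta=0.1, seed=0):
    """Minimal collapsed Gibbs sampler for LDA over tokenized docs.
    Returns per-document topic proportions (the first layer of the
    two-layer analysis)."""
    rng = random.Random(seed)
    V = len({w for d in docs for w in d})
    z = [[rng.randrange(n_topics) for _ in d] for d in docs]
    ndk = [[0] * n_topics for _ in docs]          # doc-topic counts
    nkw = [defaultdict(int) for _ in range(n_topics)]  # topic-word counts
    nk = [0] * n_topics                           # topic totals
    for di, d in enumerate(docs):
        for wi, w in enumerate(d):
            k = z[di][wi]
            ndk[di][k] += 1; nkw[k][w] += 1; nk[k] += 1
    for _ in range(n_iter):
        for di, d in enumerate(docs):
            for wi, w in enumerate(d):
                k = z[di][wi]                     # remove current assignment
                ndk[di][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
                weights = [(ndk[di][t] + alpha) * (nkw[t][w] + beta)
                           / (nk[t] + V * beta) for t in range(n_topics)]
                k = rng.choices(range(n_topics), weights)[0]
                z[di][wi] = k                     # resample from conditional
                ndk[di][k] += 1; nkw[k][w] += 1; nk[k] += 1
    return [[(c + alpha) / (len(d) + n_topics * alpha) for c in ndk[di]]
            for di, d in enumerate(docs)]

# Toy "tweets", already tokenized and stop-word filtered.
tweets = [["goal", "match", "win"], ["match", "goal", "score"],
          ["rain", "cold", "storm"], ["storm", "rain", "wind"]]
theta = lda_gibbs(tweets)
print(theta[0])  # topic mixture of the first tweet
```

On real tweets, the short-text weakness the paper notes shows up here directly: with only a handful of tokens per document, these topic mixtures are noisy, which is what the second (topic map) layer is meant to mitigate.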

Relevance:

70.00%

Publisher:

Abstract:

Abstract: Decision support systems have been used for years in companies to gain insights from internal data and thus make successful decisions. Lately, thanks to the increasing availability of open data, these systems are also integrating open data to enrich the decision-making process with external data. On the other hand, within an open-data scenario, decision support systems can also be useful for deciding which data should be opened, considering not only technical or legal constraints but also other requirements, such as the "reuse potential" of the data. In this talk, we focus on both issues: (i) open data for decision making, and (ii) decision making for opening data. We will first briefly comment on some research problems regarding the use of open data for decision making. Then, we will give an outline of a novel decision-making approach (based on how open data is actually being used in open-source projects hosted on GitHub) for supporting open data publication. Bio of the speaker: Jose-Norberto Mazón holds a PhD from the University of Alicante (Spain). He is head of the "Cátedra Telefónica" on Big Data and coordinator of the Computing degree at the University of Alicante. He is also a member of the WaKe research group at the University of Alicante. His research work focuses on open data management, data integration and business intelligence within "big data" scenarios, and their application to the tourism domain (smart tourism destinations). He has published his research in international journals such as Decision Support Systems, Information Sciences, Data & Knowledge Engineering and ACM Transactions on the Web. Finally, he is involved in the open data project at the University of Alicante, including its open data portal at http://datos.ua.es

Relevance:

70.00%

Publisher:

Abstract:

The use of data plays an increasingly significant role in carrying out the basic tasks of marketing. Data makes it possible to identify customer needs better and to respond to them more efficiently. In addition, technological development in recent years in particular has significantly improved the possibilities for exploiting data in marketing. This thesis examines a new kind of data use in marketing communications. It studies the new era of data development, big data, which particularly affects the development of the targeting of marketing communications. The targeting channel examined in the thesis is the mobile online bank, for which reason the study of marketing communications is limited mainly to customer service and advertising. The purpose of the thesis is to answer the question: What opportunities and challenges does big data bring to the targeting of marketing communications in a mobile online bank? The answer to this research question is formed with the help of two sub-questions: What opportunities and challenges does big data bring to the targeting of marketing communications? What kind of targeting channel for marketing communications is a mobile online bank? The research question was answered both through a theoretical-conceptual review of prior research, carried out in the theoretical part of the thesis, and through a separate empirical case study. The qualitative case study was conducted as thematic interviews concerning S-Pankki's mobile online bank, S-mobiili. Seven experts and two S-mobiili users were interviewed. The theoretical part of the thesis showed that big data makes it possible to know the consumer better as a whole, which improves traditional means of targeting marketing communications and also offers completely new ones. Through these, it is possible to influence a company's competitive advantage.
The theoretical part found that the challenges brought by big data relate to the novelty and unfamiliarity of the phenomenon, to data management, and to challenges coming from outside companies concerning privacy protection, consumer opinions, and various imposed restrictions. The results of the case study differed from these findings only in how strongly the challenges were emphasized: the greatest data-management challenge that emerged in the empirical study was the need for technologies, rather than the need for expertise identified in the theoretical part. In addition, external challenges were not perceived as significant in the empirical study. The results of the case study supported the picture formed in the theoretical part of the mobile online bank as a channel for targeting marketing communications: it is a personal banking application through which a precisely known customer can be reached efficiently in a secure environment, regardless of time and place. The mobile online bank is presumably used in a goal-oriented way, but possibly also partly for entertainment and for passing time. In addition, within the application it is possible to gain the customer's undivided attention, although in practice the various situations in which a smartphone is used may disturb it. The acceptability of targeted marketing communications delivered through the mobile online bank can presumably be influenced by asking for permission, by pull-type advertising, by the usefulness of the communication, and by the trustworthiness of the communicating company's brand. The topic of the thesis is still very new and changing, for which reason the related phenomena and terms are partly unestablished and difficult to grasp. This points to useful opportunities for further research, for example a concept-analytical study of the term big data. Further research would also be useful on the targeting implemented through big data and the mobile online bank.

Relevance:

70.00%

Publisher:

Abstract:

This bachelor's thesis was carried out as a literature review, with the aim of identifying uses of data analytics and the impact of data exploitation on business. The thesis examines the use of data analytics and the challenges of exploiting data effectively. The scope is limited to corporate financial control, where analytics is used in management accounting and financial accounting. The exponential growth rate of data volume creates new challenges and opportunities for the use of data analytics. Data in itself, however, has little value to a company; value arises through processing. Although data analytics is already widely studied and used, it offers opportunities far greater than its current applications. One of the key findings of the thesis is that data analytics can make management accounting more efficient and ease the tasks of financial accounting. However, the amount of available data is growing so fast that the available technology and level of expertise cannot keep up. In particular, the broader adoption and effective exploitation of big data will increasingly influence financial control practices and applications in the future.

Relevance:

70.00%

Publisher:

Abstract:

Americans are accustomed to a wide range of data collection in their lives: census, polls, surveys, user registrations, and disclosure forms. When logging onto the Internet, users’ actions are tracked everywhere: clicking, typing, tapping, swiping, searching, and placing orders. All of this data is stored to create data-driven profiles of each user. Social network sites, furthermore, set the voluntary sharing of personal data as the default mode of engagement. But people’s time and energy devoted to creating this massive amount of data, on paper and online, are taken for granted. Few people would consider their time and energy spent on data production as labor. Even if some people do acknowledge their labor for data, they believe it is ancillary to the activities at hand. In the face of pervasive data collection and the rising time spent on screens, why do people keep ignoring their labor for data? How has labor for data become invisible, something disregarded by many users? What does invisible labor for data imply for everyday cultural practices in the United States? Invisible Labor for Data addresses these questions. I argue that three intertwined forces contribute to framing data production as being void of labor: data production institutions throughout history, the Internet’s technological infrastructure (especially the implementation of algorithms), and the multiplication of virtual spaces. There is a common tendency in frameworks of human interaction with computers to deprive data and bodies of their materiality. My Introduction and Chapter 1 offer theoretical interventions by reinstating embodied materiality and redefining labor for data as an ongoing process. The middle chapters present case studies explaining how labor for data is pushed to the margins of narratives about data production. I focus on a nationwide debate in the 1960s on whether the U.S. should build a databank, contemporary Big Data practices in the data broker and Internet industries, and the people hired to produce data for other people’s avatars in virtual games. I conclude with a discussion of how the new development of crowdsourcing projects may usher in a new chapter in the exploitation of invisible and discounted labor for data.

Relevance:

70.00%

Publisher:

Abstract:

Intelligent systems are now inherent to society, supporting synergistic human-machine collaboration. Beyond economic and climate factors, energy consumption is strongly affected by the performance of computing systems. The quality of software functioning may invalidate any improvement attempt. In addition, data-driven machine learning algorithms are the basis for human-centered applications, with interpretability being one of the most important features of computational systems. Software maintenance is a critical discipline for supporting automatic and life-long system operation. As most software registers its inner events by means of logs, log analysis is an approach to keeping systems operational. Logs are characterized as Big Data assembled in large-flow streams: unstructured, heterogeneous, imprecise, and uncertain. This thesis addresses fuzzy and neuro-granular methods to provide maintenance solutions applied to anomaly detection (AD) and log parsing (LP), dealing with data uncertainty and identifying ideal time periods for detailed software analyses. LP provides a deeper semantic interpretation of the anomalous occurrences. The solutions evolve over time and are general-purpose, being highly applicable, scalable, and maintainable. Granular classification models, namely the Fuzzy set-Based evolving Model (FBeM), the evolving Granular Neural Network (eGNN), and the evolving Gaussian Fuzzy Classifier (eGFC), are compared on the AD problem. The evolving Log Parsing (eLP) method is proposed to approach the automatic parsing of system logs. All the methods perform recursive mechanisms to create, update, merge, and delete information granules according to the data behavior. For the first time in the evolving intelligent systems literature, the proposed method, eLP, is able to process streams of words and sentences.
Regarding AD accuracy, FBeM achieved (85.64 ± 3.69)%; eGNN reached (96.17 ± 0.78)%; eGFC obtained (92.48 ± 1.21)%; and eLP reached (96.05 ± 1.04)%. Besides being competitive, eLP additionally generates a log grammar and presents a higher level of model interpretability.
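The create/expand mechanics of information granules can be illustrated schematically; this hyperbox sketch is a deliberate simplification and is not the actual FBeM, eGNN, eGFC, or eLP algorithm:

```python
class GranularAnomalyDetector:
    """Schematic evolving granular model: granules are hyperboxes that
    expand to absorb nearby samples; a sample that no granule can absorb
    without exceeding the maximum width is flagged anomalous and seeds
    a new granule (so the very first sample is always flagged)."""
    def __init__(self, max_width=1.0):
        self.max_width = max_width
        self.granules = []  # list of (lower, upper) bound vectors

    def _fits(self, lo, hi, x):
        return all(max(h, v) - min(l, v) <= self.max_width
                   for l, h, v in zip(lo, hi, x))

    def observe(self, x):
        """Process one sample; return True if it is anomalous."""
        for i, (lo, hi) in enumerate(self.granules):
            if self._fits(lo, hi, x):
                # Expand the granule just enough to cover the sample.
                self.granules[i] = ([min(l, v) for l, v in zip(lo, x)],
                                    [max(h, v) for h, v in zip(hi, x)])
                return False
        self.granules.append((list(x), list(x)))  # new granule
        return True

det = GranularAnomalyDetector(max_width=1.0)
stream = [(0.1, 0.2), (0.3, 0.4), (0.2, 0.1),   # normal cluster
          (5.0, 5.0),                            # anomaly: far away
          (0.4, 0.3)]                            # back to normal
flags = [det.observe(x) for x in stream]
print(flags)  # → [True, False, False, True, False]
```

The methods in the thesis additionally merge and delete granules and attach fuzzy membership functions; the sketch only shows the single-pass, stream-driven structure that all of them share.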

Relevance:

70.00%

Publisher:

Abstract:

The thesis is the result of work conducted during a period of six months at the Strategy department of Automobili Lamborghini S.p.A. in Sant'Agata Bolognese (BO) and concerns the study and analysis of Big Data relating to Lamborghini's connected cars. The Big Data work is part of the Connected Car Project House, an inter-departmental team that works toward the definition of Lamborghini's corporate connectivity strategy and its implementation in the product portfolio. Connected-car data is one of the hottest topics in the automotive industry right now; all the largest automotive companies are investing heavily in this direction, both from an economic point of view, because these data reveal a great deal about the behaviors and habits of each driver, and from a technological point of view, because they will increasingly promote the development of 5G, an important enabler for the future of connectivity. From Lamborghini's perspective, the main purpose of the work is to analyze the data of the connected cars, in particular a data set of connected Huracáns already placed on the market, and from that starting point derive valuable Key Performance Indicators (KPIs) on which the company could partly base its decisions in the near future. The key result obtained at the end of this period was the creation of a dashboard in which it is possible to visualize many parameters and indicators related both to driving habits and to the use of the vehicle itself, which has provided great insight into the huge potential and value behind the study of these data. The final demo of the project received great interest, not only from the whole Strategy department but also from all the other business areas of Lamborghini, creating widespread awareness that this will be the road to follow in the coming years.
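The KPI-extraction step behind such a dashboard can be sketched over a toy trip log; the field names, values, and indicators below are hypothetical illustrations, not Lamborghini's actual schema or data:

```python
from statistics import mean

# Hypothetical connected-car trip records (all fields are illustrative).
trips = [
    {"vin": "HUR001", "km": 42.0, "max_speed": 180, "mode": "strada"},
    {"vin": "HUR001", "km": 12.5, "max_speed": 120, "mode": "strada"},
    {"vin": "HUR002", "km": 8.0,  "max_speed": 250, "mode": "corsa"},
    {"vin": "HUR002", "km": 30.0, "max_speed": 210, "mode": "sport"},
]

def kpis(trips):
    """Aggregate a few dashboard-style KPIs from raw trip records."""
    return {
        "avg_trip_km": mean(t["km"] for t in trips),
        "top_speed": max(t["max_speed"] for t in trips),
        "corsa_share": sum(t["mode"] == "corsa" for t in trips) / len(trips),
        "active_cars": len({t["vin"] for t in trips}),
    }

print(kpis(trips))
```

A dashboard layer then only has to render these aggregates; the modeling decision is which raw signals to roll up and at which granularity (per car, per trip, or fleet-wide).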

Relevance:

70.00%

Publisher:

Abstract:

Efficient management, analysis, and interpretation of big data can change working models, modify outcomes, increase production, and open new avenues for modern healthcare. The objective of this study centers on building an interactive dashboard for a new model and new services in the field of community healthcare. The aim is to provide the client with a Data Visualization platform that shows useful results on healthcare data, giving users both descriptive and statistical information on the current management of care and of the therapies administered. We propose a tool that allows navigation of the data, analyzing the trend of a set of end-of-life indicators computed from oncological patients of the Emilia-Romagna Region over a time span from 2010 to the present.

Relevance:

70.00%

Publisher:

Abstract:

In today's business landscape, the ability of a company or service firm to steer its innovation programmatically so as to remain competitive in the market is of fundamental importance. In many cases, this means investing a considerable sum of money in projects that will improve essential aspects of the product or service and that will have an important impact on the company's digital transformation. The study proposed here concerns two approaches that are typically in antithesis precisely because they are based on two different types of data, Big Data and Thick Data. The two approaches are, respectively, Data Science and Design Thinking. In the following chapters, after defining the Design Thinking and Data Science approaches, the concept of blending is defined, along with the issues surrounding the intersection of the two innovation methods. To highlight the different aspects of the topic, cases of companies that have integrated the two approaches into their innovation processes, obtaining important results, are also reported. In particular, the author's research work on the examination, classification, and analysis of the existing literature at the intersection of data-driven and design-thinking-driven innovation is presented. Finally, a business case is reported, conducted at the hospital and healthcare organization of Parma, in which, faced with a problem concerning the relationship between hospital clinicians and community clinicians, an innovative system was designed using Design Thinking. In addition, a critical "what-if" analysis is developed in order to elaborate a possible scenario in which methods or techniques from the world of Data Science are integrated and applied to the case study in question.

Relevance:

60.00%

Publisher:

Abstract:

A classical application of biosignal analysis has been the psychophysiological detection of deception, also known as the polygraph test, which is currently a part of standard practices of law enforcement agencies and several other institutions worldwide. Although its validity is far from gathering consensus, the underlying psychophysiological principles are still an interesting add-on for more informal applications. In this paper we present an experimental off-the-person hardware setup, propose a set of feature extraction criteria and provide a comparison of two classification approaches, targeting the detection of deception in the context of a role-playing interactive multimedia environment. Our work is primarily targeted at recreational use in the context of a science exhibition, where the main goal is to present basic concepts related with knowledge discovery, biosignal analysis and psychophysiology in an educational way, using techniques that are simple enough to be understood by children of different ages. Nonetheless, this setting will also allow us to build a significant data corpus, annotated with ground-truth information, and collected with non-intrusive sensors, enabling more advanced research on the topic. Experimental results have shown interesting findings and provided useful guidelines for future work. Pattern Recognition
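A feature-extraction-plus-classification step of the kind compared in the paper can be sketched as follows; the window features and the threshold rule are generic illustrations, not the authors' actual criteria:

```python
from statistics import mean, stdev

def window_features(signal):
    """Summarize one window of a biosignal (e.g. electrodermal
    activity) with simple statistical features."""
    return {"mean": mean(signal), "std": stdev(signal),
            "range": max(signal) - min(signal)}

def classify(features, baseline, k=3.0):
    """Toy rule-based classifier: flag arousal when the window mean
    departs from the resting baseline by more than k baseline stds."""
    return abs(features["mean"] - baseline["mean"]) > k * baseline["std"]

# Synthetic windows: a resting baseline, a calm window, an aroused one.
rest = window_features([1.0, 1.1, 0.9, 1.0, 1.05, 0.95])
calm = window_features([1.0, 0.95, 1.1, 1.0, 0.9, 1.05])
aroused = window_features([2.1, 2.4, 2.2, 2.6, 2.3, 2.5])
print(classify(calm, rest), classify(aroused, rest))  # → False True
```

In the exhibition setting described above, keeping the features and decision rule this simple is itself a design goal, since children should be able to follow the chain from raw signal to verdict.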

Relevance:

60.00%

Publisher:

Abstract:

One of the main problems of hyperspectral data analysis is the presence of mixed pixels due to the low spatial resolution of such images. Linear spectral unmixing aims at inferring pure spectral signatures and their fractions at each pixel of the scene. The huge data volumes acquired by hyperspectral sensors put stringent requirements on processing and unmixing methods. This letter proposes an efficient implementation of the method called simplex identification via split augmented Lagrangian (SISAL), which exploits the graphics processing unit (GPU) architecture at a low level using the Compute Unified Device Architecture (CUDA). SISAL aims to identify the endmembers of a scene, i.e., it is able to unmix hyperspectral data sets in which the pure pixel assumption is violated. The proposed implementation is performed in a pixel-by-pixel fashion using coalesced memory accesses and exploiting shared memory to store temporary data. Furthermore, the kernels have been optimized to minimize thread divergence, thereby achieving high GPU occupancy. The experimental results obtained for simulated and real hyperspectral data sets reveal speedups of up to 49 times, which demonstrates that the GPU implementation can significantly accelerate the method's execution over big data sets while maintaining the method's accuracy.
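The linear mixing model that SISAL inverts can be illustrated with a closed-form least-squares solution for two endmembers; this shows only the mixing model itself, not the SISAL algorithm or its GPU kernels, and the spectral signatures are synthetic:

```python
def unmix_two_endmembers(pixel, e1, e2):
    """Estimate fractional abundances (a1, a2) for the linear mixing
    model pixel ≈ a1*e1 + a2*e2 by solving the 2x2 normal equations."""
    g11 = sum(a * a for a in e1)
    g12 = sum(a * b for a, b in zip(e1, e2))
    g22 = sum(b * b for b in e2)
    r1 = sum(a * p for a, p in zip(e1, pixel))
    r2 = sum(b * p for b, p in zip(e2, pixel))
    det = g11 * g22 - g12 * g12
    return ((r1 * g22 - r2 * g12) / det, (r2 * g11 - r1 * g12) / det)

# Two synthetic spectral signatures and an exact 30/70 mixture of them.
grass = [0.1, 0.5, 0.8, 0.4]
soil = [0.3, 0.3, 0.2, 0.6]
mixed = [0.3 * g + 0.7 * s for g, s in zip(grass, soil)]
a1, a2 = unmix_two_endmembers(mixed, grass, soil)
print(round(a1, 3), round(a2, 3))  # → 0.3 0.7
```

SISAL's contribution is upstream of this step: it estimates the endmember signatures themselves (here simply given) even when no pure pixel exists, and the GPU work parallelizes the per-pixel arithmetic over millions of pixels.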

Relevance:

60.00%

Publisher:

Abstract:

Nowadays, the amount of data created daily far exceeds the most optimistic expectations set in the previous decade. These data come from very diverse sources and take many forms. This new concept, which goes by the name of Big Data, poses new and intricate challenges to data storage, processing, and manipulation. Traditional storage systems do not present themselves as the right solution to this problem. These challenges are among the most analyzed and discussed computing topics of the moment. Several technologies have emerged with this new era, among which a new storage paradigm stands out: the NoSQL movement. This new storage philosophy aims to meet the storage and processing needs of these voluminous and heterogeneous data. Data warehouses are one of the most important components of Business Intelligence and are mostly used as a tool to support the decision-making processes carried out in the day-to-day operation of an organization. Their historical component implies that large volumes of data are stored, processed, and analyzed in their repositories. Some organizations are beginning to have problems managing and storing these large volumes of information. This is largely due to the storage structure underlying them. Relational database management systems have for decades been considered the primary method of storing information in a data warehouse. In fact, these systems are beginning to prove unable to store and manage organizations' operational data, and their use in data warehouses is consequently less and less recommended.
It is intrinsically interesting to think that relational databases are beginning to lose the fight against data volume at the very moment when a new storage paradigm emerges precisely with the aim of mastering the large volume inherent in Big Data. It is even more interesting to think that these new NoSQL systems may bring advantages to the world of data warehouses. Thus, in this master's thesis, the viability and implications of adopting NoSQL databases in the context of data warehouses are studied, in comparison with the traditional approach implemented on relational systems. To accomplish this task, several studies were carried out based on the relational system SQL Server 2014 and the NoSQL systems MongoDB and Cassandra. Several stages of the process of designing and implementing a data warehouse were compared across the three systems, and three distinct data warehouses were created, one on each system. All the research carried out in this work culminates in a comparison of query performance across the three systems.
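The modeling difference under study can be sketched in plain Python: the same revenue measure computed from a normalized, relational-style star schema via a join, and from a denormalized, document-style collection of the kind MongoDB stores; all table, field, and product names are illustrative:

```python
from collections import defaultdict

# Relational-style star schema: fact rows reference a dimension by key.
dim_product = {1: "bike", 2: "helmet"}
fact_sales = [
    {"product_id": 1, "amount": 900.0},
    {"product_id": 2, "amount": 60.0},
    {"product_id": 1, "amount": 750.0},
]

def revenue_by_product_join(facts, dim):
    """Join fact rows to the product dimension, then aggregate."""
    totals = defaultdict(float)
    for row in facts:
        totals[dim[row["product_id"]]] += row["amount"]
    return dict(totals)

# Document-style collection: the dimension attribute is embedded in
# each document, trading redundancy for join-free reads.
sales_docs = [
    {"product": "bike", "amount": 900.0},
    {"product": "helmet", "amount": 60.0},
    {"product": "bike", "amount": 750.0},
]

def revenue_by_product_docs(docs):
    totals = defaultdict(float)
    for doc in docs:
        totals[doc["product"]] += doc["amount"]
    return dict(totals)

print(revenue_by_product_join(fact_sales, dim_product))
print(revenue_by_product_docs(sales_docs))  # same result, no join
```

The query-performance comparison in the thesis essentially measures this trade-off at scale: the relational design pays for joins at read time, while the document design pays in storage redundancy and in update cost when a dimension attribute changes.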