6 resultados para Impala, Hadoop, Big Data, HDFS, Social Business Intelligence, SBI, cloudera
em Repositório Institucional da Universidade de Aveiro - Portugal
Resumo:
During the last decades, we assisted to what is called “information explosion”. With the advent of the new technologies and new contexts, the volume, velocity and variety of data has increased exponentially, becoming what is known today as big data. Among them, we emphasize telecommunications operators, which gather, using network monitoring equipment, millions of network event records, the Call Detail Records (CDRs) and the Event Detail Records (EDRs), commonly known as xDRs. These records are stored and later processed to compute network performance and quality of service metrics. With the ever increasing number of collected xDRs, its generated volume needing to be stored has increased exponentially, making the current solutions based on relational databases not suited anymore. To tackle this problem, the relational data store can be replaced by Hadoop File System (HDFS). However, HDFS is simply a distributed file system, this way not supporting any aspect of the relational paradigm. To overcome this difficulty, this paper presents a framework that enables the current systems inserting data into relational databases, to keep doing it transparently when migrating to Hadoop. As proof of concept, the developed platform was integrated with the Altaia - a performance and QoS management of telecommunications networks and services.
Resumo:
Internet users consume online targeted advertising based on information collected about them and voluntarily share personal information in social networks. Sensor information and data from smart-phones is collected and used by applications, sometimes in unclear ways. As it happens today with smartphones, in the near future sensors will be shipped in all types of connected devices, enabling ubiquitous information gathering from the physical environment, enabling the vision of Ambient Intelligence. The value of gathered data, if not obvious, can be harnessed through data mining techniques and put to use by enabling personalized and tailored services as well as business intelligence practices, fueling the digital economy. However, the ever-expanding information gathering and use undermines the privacy conceptions of the past. Natural social practices of managing privacy in daily relations are overridden by socially-awkward communication tools, service providers struggle with security issues resulting in harmful data leaks, governments use mass surveillance techniques, the incentives of the digital economy threaten consumer privacy, and the advancement of consumergrade data-gathering technology enables new inter-personal abuses. A wide range of fields attempts to address technology-related privacy problems, however they vary immensely in terms of assumptions, scope and approach. Privacy of future use cases is typically handled vertically, instead of building upon previous work that can be re-contextualized, while current privacy problems are typically addressed per type in a more focused way. Because significant effort was required to make sense of the relations and structure of privacy-related work, this thesis attempts to transmit a structured view of it. It is multi-disciplinary - from cryptography to economics, including distributed systems and information theory - and addresses privacy issues of different natures. As existing work is framed and discussed, the contributions to the state-of-theart done in the scope of this thesis are presented. The contributions add to five distinct areas: 1) identity in distributed systems; 2) future context-aware services; 3) event-based context management; 4) low-latency information flow control; 5) high-dimensional dataset anonymity. Finally, having laid out such landscape of the privacy-preserving work, the current and future privacy challenges are discussed, considering not only technical but also socio-economic perspectives.
Resumo:
A presente investigação propõe-se a atuar no sector turístico, uma vez que este é bombardeado diariamente por uma quantidade considerável de dados e informações. Atualmente, usufrui-se significativamente mais da tecnologia com a finalidade de promover e vender os produtos/serviços disponíveis no mercado. A par da evolução tecnológica, os utilizadores/clientes conseguem comprar, cada vez mais, à distancia de um clique os produtos turísticos que desejam. No entanto, há um variado leque de aplicações sobre o turismo que permitem entender os gostos e as necessidades dos turistas assim como a sua atitude para com o mesmo. Porém, nem as entidades nem os gestores turísticos usufruem inteligentemente dos dados que lhes são facultados. Estes tendem normalmente a prender-se pelo turismo em Portugal e de que forma é que a sua entidade é apresentada acabando por esquecer que os dados podem e devem ser utilizados para expandir o mercado assim como entender/conhecer potenciais mercados. Deste modo, o fundamento principal desta investigação remete para a criação de uma plataforma infocomunicacional que analise na totalidade os dados obtidos, assim como fornecer as ferramentas pertinentes para que se consiga fazer esta análise, nomeadamente através de uma representação infográfica adequada e estratégias de a comunicar aos stakeholders.. Para tal foi aplicada no âmbito desta dissertação a metodologia investigação/ação, vista como um processo cíclico que para além de incluir simultaneamente estas duas vertentes, vai alternando entre a ação e a reflexão critica sendo sustentada por bases teóricas. A criação do protótipo da plataforma Smart Tourism, resultou num sistema inovador que tenta responder aos indicadores escolhidos no Dashbord e ao problema infocomunicacional, tentando criar as bases necessárias para que as entidades consigam analisar de forma mais integrada/sistematizada e racional a atividade turística. Foi por isso, desenvolvido e avaliado qualitativamente um protótipo de base infocomunicacional visual (dashboard visual) que para além do que para além do que já foi referido, consegue proporcionar a gestão dos produtos, clientes, staff e parceiros, aumentando assim o valor deste sector.
Resumo:
The effective supplier evaluation and purchasing processes are of vital importance to business organizations, making the suppliers selection problem a fundamental key issue to their success. We consider a complex supplier selection problem with multiple products where minimum package quantities, minimum order values related to delivery costs, and discounted pricing schemes are taken into account. Our main contribution is to present a mixed integer linear programming (MILP) model for this supplier selection problem. The model is used to solve several examples including three real case studies from an electronic equipment assembly company.
Resumo:
In recent years the technological world has grown by incorporating billions of small sensing devices, collecting and sharing real-world information. As the number of such devices grows, it becomes increasingly difficult to manage all these new information sources. There is no uniform way to share, process and understand context information. In previous publications we discussed efficient ways to organize context information that is independent of structure and representation. However, our previous solution suffers from semantic sensitivity. In this paper we review semantic methods that can be used to minimize this issue, and propose an unsupervised semantic similarity solution that combines distributional profiles with public web services. Our solution was evaluated against Miller-Charles dataset, achieving a correlation of 0.6.
Resumo:
Worldwide air traffic tends to increase and for many airports it is no longer an op-tion to expand terminals and runways, so airports are trying to maximize their op-erational efficiency. Many airports already operate near their maximal capacity. Peak hours imply operational bottlenecks and cause chained delays across flights impacting passengers, airlines and airports. Therefore there is a need for the opti-mization of the ground movements at the airports. The ground movement prob-lem consists of routing the departing planes from the gate to the runway for take-off, and the arriving planes from the runway to the gate, and to schedule their movements. The main goal is to minimize the time spent by the planes during their ground movements while respecting all the rules established by the Ad-vanced Surface Movement, Guidance and Control Systems of the International Civil Aviation. Each aircraft event (arrival or departing authorization) generates a new environment and therefore a new instance of the Ground Movement Prob-lem. The optimization approach proposed is based on an Iterated Local Search and provides a fast heuristic solution for each real-time event generated instance granting all safety regulations. Preliminary computational results are reported for real data comparing the heuristic solutions with the solutions obtained using a mixed-integer programming approach.