94 results for big data storage


Relevance:

80.00%

Publisher:

Abstract:

Wireless sensor networks (WSNs) are attractive for information gathering in large-scale, data-rich environments. To fully exploit the data gathering and dissemination capabilities of these networks, energy-efficient and scalable solutions for data storage and information discovery are essential. In this paper, we formulate the information discovery problem as a load-balancing problem, with the combined aim of maximizing network lifetime and minimizing query processing delay, resulting in QoS improvements. We propose a novel information storage and distribution mechanism that takes into account the residual energy levels in individual sensors. Further, we propose a hybrid push-pull strategy that enables fast response to information discovery queries.

Simulation results show that the proposed methods of information discovery offer significant QoS benefits for global as well as individual queries in comparison to previous approaches.
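
As a rough illustration of energy-aware storage placement, the sketch below picks a storage node by combining residual energy with proximity to the sensing node. The `Sensor` fields, the scoring formula, and the `energy_weight` parameter are assumptions made for this example; the abstract does not specify the authors' actual mechanism.

```python
from dataclasses import dataclass


@dataclass
class Sensor:
    node_id: str
    residual_energy: float  # remaining energy (hypothetical unit: joules)
    hops_to_source: int     # hop distance from the node that sensed the event


def pick_storage_node(candidates, energy_weight=0.7):
    """Return the candidate with the best combined energy/proximity score.

    The weighting is an assumption for illustration, not the paper's rule.
    """
    if not candidates:
        raise ValueError("no candidate sensors")
    # Normalise both terms to [0, 1] so the weight is meaningful.
    max_energy = max(s.residual_energy for s in candidates) or 1.0
    max_hops = max(s.hops_to_source for s in candidates) or 1

    def score(s):
        energy_term = s.residual_energy / max_energy
        proximity_term = 1.0 - s.hops_to_source / max_hops
        return energy_weight * energy_term + (1.0 - energy_weight) * proximity_term

    return max(candidates, key=score)


sensors = [Sensor("a", 0.9, 1), Sensor("b", 4.2, 3), Sensor("c", 3.8, 2)]
chosen = pick_storage_node(sensors)
print(chosen.node_id)
```

Favouring nodes with more remaining energy spreads the storage load away from nearly depleted sensors, which is one simple way to extend network lifetime.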

Relevance:

80.00%

Publisher:

Abstract:

The Recursive Auto-Associative Memory (RAAM) has come to dominate connectionist investigations into representing compositional structure. Although it is an adequate model when dealing with limited data, the capacity of RAAM to scale up to real-world tasks has been frequently questioned. RAAM networks are difficult to train (due to the moving target effect), and as such training times can be lengthy. Investigations into RAAM have produced many variants in an attempt to overcome such limitations. We outline how one such model ((S)RAAM) is able to quickly produce context-sensitive representations that may be used to aid a deterministic parsing process. By substituting a symbolic stack in an existing hybrid parser, we show that (S)RAAM is more than capable of encoding the real-world data sets employed. We conclude by suggesting that models such as (S)RAAM offer valuable insights into the features of connectionist compositional representations.

Relevance:

80.00%

Publisher:

Abstract:

We consider a cloud data storage system involving three entities: the cloud customer, the cloud business centre, which provides services, and the cloud data storage centre. Data stored in the data storage centre comes from a variety of customers, some of whom may compete with each other in the marketplace or may own data that comprises confidential information about their own clients. Cloud staff have access to data in the data storage centre, which could be used to steal identities or to compromise cloud customers. In this paper, we provide an efficient method of data storage that prevents staff from accessing data that can be abused as described above. We also suggest a method of securing access to data that requires more than one staff member to access it at any given time. This ensures that, in case of a dispute, a staff member always has a witness to the fact that she accessed data.
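
One generic way to enforce a "more than one staff member" rule is to split the data encryption key into two shares so that neither staff member can recover it alone. The sketch below uses simple XOR secret splitting; this is a swapped-in textbook technique chosen for illustration, not the method proposed in the paper, and the names `split_key` and `combine_shares` are hypothetical.

```python
import secrets


def split_key(key: bytes):
    """Split a key into two shares; both are required to rebuild it."""
    share_a = secrets.token_bytes(len(key))               # uniformly random share
    share_b = bytes(k ^ a for k, a in zip(key, share_a))  # key XOR share_a
    return share_a, share_b


def combine_shares(share_a: bytes, share_b: bytes) -> bytes:
    """Rebuild the key from both shares (XOR is its own inverse)."""
    if len(share_a) != len(share_b):
        raise ValueError("shares must have equal length")
    return bytes(a ^ b for a, b in zip(share_a, share_b))


data_key = secrets.token_bytes(32)                 # key protecting a customer's data
alice_share, bob_share = split_key(data_key)       # one share per staff member
recovered = combine_shares(alice_share, bob_share)
print(recovered == data_key)
```

Each share on its own is statistically independent of the key, so a single staff member learns nothing, and any access necessarily involves a second person who can act as a witness.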

Relevance:

80.00%

Publisher:

Abstract:

Using film grammar as the underpinning, we study the extraction of structures in video based on color using a wide configuration of clustering methods combined with existing and new similarity measures. We study the visualisation of these structures, which we call Scene-Cluster Temporal Charts, and show how they can bring out the interweaving of different themes and settings in a film. We also extract color events that filmmakers use to draw or force a viewer's attention to a shot or scene. This is done by first extracting a set of colors used rarely in film, and then building a probabilistic model for color event detection. We demonstrate with experimental results from ten movies that our algorithms are effective in the extraction of both scene-cluster temporal charts and color events.

Relevance:

80.00%

Publisher:

Abstract:

Phenomenological research into the online experience offers real value to Internet Studies and Digital Humanities scholars for three key reasons. Firstly, as an explicitly qualitative approach, it offers a way to gain insights into the experience of going online that are not identified by those who study behaviour alone. Secondly, as phenomenological studies focus on the individual rather than the collective, the resulting small sample size means that the investment required in terms of time spent with participants is minimised. Finally, the interpretation that emerges through the phenomenological research process produces categorisations that could form the basis on which larger-scale, Big Data, quantitative research projects could be built.

This paper will explore the above ideas through the lens of my doctoral research, which uses hermeneutic phenomenology to investigate the experience of persona construction by artists on the fringes of the traditional art world, specifically craftivists, tattoo artists, street artists, and performance poets. By incorporating the interpretive categorisations that have come from my early discussions, I will demonstrate the strength of a phenomenological approach to investigating the experience of using the web and social media to present the self to the world.

Relevance:

80.00%

Publisher:

Abstract:

Industries in developed countries are moving quickly to ensure the rapid adoption of cloud computing. At this stage, several outstanding issues exist, particularly related to Service Level Agreements (SLAs), security and privacy. Consumers and businesses are willing to use cloud computing only if they can trust that their data will remain private and secure. Our review of the research literature indicates that the level of control a user has over their data is directly correlated with the level of data privacy provided by the cloud service. We considered several privacy factors from the industry perspective, namely data loss, data storage location being unknown to the client, vendor lock-in, unauthorized secondary use of user's data for advertising, targeting, secured backup and easy restoration. The level of user control in database models was identified according to the level of existence of these privacy factors. Finally, we focused on a novel logical model that might help to bring the level of user control of privacy in cloud databases to a higher level.

Relevance:

80.00%

Publisher:

Abstract:

Biopolymers can be produced through a variety of mechanisms. They can be derived from microbial systems, extracted from higher organisms such as plants, or synthesized chemically from basic biological building blocks. A wide range of emerging applications rely on all three of these production techniques. In recent years, considerable attention has been given to biopolymers produced by microbes. It is on the microbial level where the tools of genetic engineering can be most readily applied. A number of novel materials are now being developed or introduced into the market. Biopolymers are being developed for use as medical materials, packaging, cosmetics, food additives, clothing fabrics, water treatment chemicals, industrial plastics, absorbents, biosensors, and even data storage elements. This review identifies the possible commercial applications and describes the various methods of production of microbial biopolymers.

Relevance:

80.00%

Publisher:

Abstract:

Various air-breathing marine vertebrates such as seals, turtles and seabirds show distinct patterns of diving behaviour. For fish, the distinction between different vertical behaviours is often less clear-cut, as there are no surface intervals to differentiate between dives. Using data from acoustic tags (n = 23) and archival depth recorders attached to cod Gadus morhua (n = 92) in the southern North Sea, we developed a quantitative method of classifying vertical movements in order to facilitate an objective comparison of the behaviour of different individuals. This method expands the utilisation of data from data storage tags, with the potential for a better understanding of fish behaviour and enhanced individual based behaviour for improved ecosystem modelling. We found that cod were closely associated with the seabed for 90% of the time, although they showed distinct seasonal and spatial patterns in behaviour. For example, cod tagged in the southern North Sea exhibited high rates of vertical movement in spring and autumn that were probably associated with migration, while the vertical movements of resident cod in other areas were much less extensive and were probably related to foraging or spawning behaviours. The full reasons underlying spatial and temporal behavioural plasticity by cod in the North Sea warrant further investigation.

Relevance:

80.00%

Publisher:

Abstract:

Multidimensional WSNs are deployed in complex environments to sense and collect data relating to multiple attributes (multidimensional data). Such networks present unique challenges to data dissemination, data storage and in-network query processing (information discovery). In this paper, we investigate efficient strategies for information discovery in large-scale multidimensional WSNs and propose the Adaptive MultiDimensional Multi-Resolution Architecture (A-MDMRA), which efficiently combines “push” and “pull” strategies for information discovery and adapts to variations in the frequencies of events and queries in the network to construct optimal routing structures. We present simulation results showing that the optimal routing structure depends on the frequency of events and queries in the network. The architecture also balances push and pull operations in large-scale networks, enabling significant QoS improvements and energy savings.

Relevance:

80.00%

Publisher:

Abstract:

In this paper, we propose the concept of the BI Sweet Spot. The BI Sweet Spot ecosystem includes mobile computing, cloud computing and Big Data. We provide an overview of each of the key components and explain how these three components support the BI Sweet Spot. We also discuss best practices for managing these essential components. This study is the first work of its kind in BI research to consider the inter-relationships and the combined effect of mobile, cloud and Big Data.

Relevance:

80.00%

Publisher:

Abstract:

Due to the serious information overload problem on the Internet, recommender systems have emerged as an important tool for recommending more useful information by providing personalized services for individual users. However, in the “big data” era, recommender systems face significant challenges, such as how to process massive data efficiently and accurately. In this paper we propose an incremental algorithm based on singular value decomposition (SVD) with good scalability, called the Incremental ApproSVD, which combines the Incremental SVD algorithm with the Approximating the Singular Value Decomposition (ApproSVD) algorithm. Furthermore, a strict error analysis demonstrates the effectiveness of our Incremental ApproSVD algorithm. We then present an empirical study comparing the prediction accuracy and running time of our Incremental ApproSVD algorithm and the Incremental SVD algorithm on the MovieLens and Flixster datasets. The experimental results demonstrate that our proposed method outperforms its counterparts.

Relevance:

80.00%

Publisher:

Abstract:

In big data analysis, frequent itemsets mining plays a key role in mining associations, correlations and causality. Since some traditional frequent itemsets mining algorithms are unable to handle massive small files datasets effectively, suffering from high memory cost, high I/O overhead, and low computing performance, we propose a novel parallel frequent itemsets mining algorithm based on the FP-Growth algorithm and discuss its applications in this paper. First, we introduce a small files processing strategy for massive small files datasets to compensate for the low read-write speed and low processing efficiency of Hadoop. Moreover, we use MapReduce to redesign the FP-Growth algorithm for parallel computing, thereby improving the overall performance of frequent itemsets mining. Finally, we apply the proposed algorithm to the association analysis of data from the national college entrance examination and admission of China. The experimental results show that the proposed algorithm is feasible and valid, achieving a good speedup and a higher mining efficiency, and can meet the actual requirements of frequent itemsets mining for massive small files datasets. © 2014 ISSN 2185-2766.
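
To make the MapReduce idea concrete, the sketch below shows only the first pass that parallel FP-Growth variants typically need: counting item support per partition (map) and merging the counts (reduce). It runs in a single process using plain Python rather than Hadoop, and the function names and toy transactions are assumptions for illustration; it is not the authors' algorithm and does not build an FP-tree.

```python
from collections import Counter
from functools import reduce


def map_count(partition):
    """Map step: count, per partition, how many transactions contain each item."""
    counts = Counter()
    for transaction in partition:
        counts.update(set(transaction))  # set() so an item counts once per transaction
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial count tables."""
    return left + right


def frequent_items(partitions, min_support):
    """Return items whose global support reaches min_support."""
    total = reduce(reduce_counts, map(map_count, partitions), Counter())
    return {item: n for item, n in total.items() if n >= min_support}


# Two partitions stand in for input splits processed by separate mappers.
partitions = [
    [["math", "physics"], ["math", "chemistry"]],
    [["math", "physics", "biology"], ["physics"]],
]
result = frequent_items(partitions, min_support=3)
print(result)
```

Because each partition is counted independently, the map step can be distributed across workers, and only the small count tables need to be shuffled to the reducer.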

Relevance:

80.00%

Publisher:

Abstract:

With the arrival of the big data era, Internet traffic is growing exponentially. A wide variety of applications have arisen on the Internet, and traffic classification has been introduced to help people manage these applications for security monitoring and quality of service purposes. A large number of Machine Learning (ML) algorithms have been introduced to deal with traffic classification. A significant challenge to classification performance comes from the imbalanced distribution of data in traffic classification systems. In this paper, we propose an Optimised Distance-based Nearest Neighbor (ODNN) approach, which is capable of improving the classification performance on imbalanced traffic data. We analyze the proposed ODNN approach and its performance benefit from both theoretical and empirical perspectives. A large number of experiments were carried out on a real-world traffic dataset. The results show that the performance of “small classes” can be improved significantly even with only a small amount of training data, while the performance of “large classes” remains stable.
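
The abstract does not describe how ODNN weights neighbours, so the sketch below shows a generic distance-weighted k-nearest-neighbour classifier instead. It illustrates why distance-based voting can help a small class: one very close minority example can outvote several distant majority examples. The labels, feature values, and the `eps` smoothing term are assumptions for this example.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train, query, k=3, eps=1e-9):
    """Classify `query` by inverse-distance-weighted votes of its k nearest neighbours."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = defaultdict(float)
    for features, label in nearest:
        # Closer neighbours get larger votes; eps avoids division by zero.
        votes[label] += 1.0 / (math.dist(features, query) + eps)
    return max(votes, key=votes.get)


# Imbalanced toy data: four "web" flows (large class), one "voip" flow (small class).
train = [
    ((0.0, 0.0), "web"), ((0.2, 0.1), "web"), ((0.1, 0.3), "web"),
    ((0.3, 0.2), "web"), ((1.0, 1.0), "voip"),
]
prediction = weighted_knn_predict(train, (0.9, 0.9), k=3)
print(prediction)
```

With an unweighted majority vote and k=3, the same query would be labelled "web", because two of its three nearest neighbours belong to the large class.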

Relevance:

80.00%

Publisher:

Abstract:

The Hadoop framework provides a powerful way to handle Big Data. Since Hadoop suffers from high memory overhead and low computing performance when processing massive numbers of small files, we implement three methods and propose two strategies for solving the small files problem in this paper. First, we implement three methods, i.e., Hadoop Archives (HAR), Sequence Files (SF) and CombineFileInputFormat (CFIF), to compensate for the existing defects of Hadoop. Moreover, we propose two strategies for meeting the actual needs of different users. Finally, we evaluate the efficiency of the implemented methods and the validity of the proposed strategies. The experimental results show that our methods and strategies can improve the efficiency of processing massive numbers of small files, thereby enhancing the overall performance of Hadoop. © 2014 ISSN 1881-803X.
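
The common idea behind these methods is to group many small files into fewer, larger units of work. The sketch below illustrates that idea with a simple greedy packer in plain Python; it does not call any Hadoop API, and `pack_small_files`, the file names, and the size limit are hypothetical.

```python
def pack_small_files(file_sizes, max_split_bytes):
    """Greedily group files (visited in name order) into splits of at most
    max_split_bytes. A file larger than the limit gets a split of its own."""
    splits, current, current_size = [], [], 0
    for name, size in sorted(file_sizes.items()):
        # Close the current split if adding this file would exceed the limit.
        if current and current_size + size > max_split_bytes:
            splits.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        splits.append(current)
    return splits


# Five small files (sizes in bytes) packed into splits of at most 100 bytes.
sizes = {"a.log": 40, "b.log": 30, "c.log": 50, "d.log": 20, "e.log": 90}
splits = pack_small_files(sizes, max_split_bytes=100)
print(splits)
```

Here five files become three splits, so a job would launch three tasks instead of five; with millions of small files, the reduction in per-file overhead is what the combined-input approach aims for.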

Relevance:

80.00%

Publisher:

Abstract:

Multidimensional WSNs are deployed in complex environments to sense and collect data relating to multiple attributes (multidimensional data). Such networks present unique challenges to data dissemination, data storage and in-network query processing (information discovery). Recent algorithms proposed for such WSNs are aimed at achieving better energy efficiency and minimizing latency. This creates a partitioned network area due to the overuse of certain nodes that lie on the shortest or closest path to the base station or data aggregation points, which results in hotspot nodes. In this paper, we propose a time-based multidimensional, multi-resolution storage approach for range queries that balances energy consumption by distributing the traffic load as uniformly as possible, thus ensuring a maximum network lifetime. We present simulation results showing that the proposed approach to information discovery offers significant improvements in information discovery latency compared with current approaches. In addition, the results show that the Quality of Service (QoS) improvements reduce hotspots, resulting in significant network-wide energy savings and an increased network lifetime.