923 resultados para Dynamic storage allocation (Computer science)
Resumo:
Big Data Analytics is an emerging field since massive storage and computing capabilities have been made available by advanced e-infrastructures. Earth and Environmental sciences are likely to benefit from Big Data Analytics techniques supporting the processing of the large number of Earth Observation datasets currently acquired and generated through observations and simulations. However, Earth Science data and applications present specificities in terms of relevance of the geospatial information, wide heterogeneity of data models and formats, and complexity of processing. Therefore, Big Earth Data Analytics requires specifically tailored techniques and tools. The EarthServer Big Earth Data Analytics engine offers a solution for coverage-type datasets, built around a high performance array database technology, and the adoption and enhancement of standards for service interaction (OGC WCS and WCPS). The EarthServer solution, led by the collection of requirements from scientific communities and international initiatives, provides a holistic approach that ranges from query languages and scalability up to mobile access and visualization. The result is demonstrated and validated through the development of lighthouse applications in the Marine, Geology, Atmospheric, Planetary and Cryospheric science domains.
Resumo:
Ecosystem engineers that increase habitat complexity are keystone species in marine systems, increasing shelter and niche availability, and therefore biodiversity. For example, kelp holdfasts form intricate structures and host the largest number of organisms in kelp ecosystems. However, methods that quantify 3D habitat complexity have only seldom been used in marine habitats, and never in kelp holdfast communities. This study investigated the role of kelp holdfasts (Laminaria hyperborea) in supporting benthic faunal biodiversity. Computer-aided tomography (CT-) scanning was used to quantify the three-dimensional geometrical complexity of holdfasts, including volume, surface area and surface fractal dimension (FD). Additionally, the number of haptera, number of haptera per unit of volume, and age of kelps were estimated. These measurements were compared to faunal biodiversity and community structure, using partial least-squares regression and multivariate ordination. Holdfast volume explained most of the variance observed in biodiversity indices, however all other complexity measures also strongly contributed to the variance observed. Multivariate ordinations further revealed that surface area and haptera per unit of volume accounted for the patterns observed in faunal community structure. Using 3D image analysis, this study makes a strong contribution to elucidate quantitative mechanisms underlying the observed relationship between biodiversity and habitat complexity. Furthermore, the potential of CT-scanning as an ecological tool is demonstrated, and a methodology for its use in future similar studies is established. Such spatially resolved imager analysis could help identify structurally complex areas as biodiversity hotspots, and may support the prioritization of areas for conservation.
Resumo:
Ecosystem engineers that increase habitat complexity are keystone species in marine systems, increasing shelter and niche availability, and therefore biodiversity. For example, kelp holdfasts form intricate structures and host the largest number of organisms in kelp ecosystems. However, methods that quantify 3D habitat complexity have only seldom been used in marine habitats, and never in kelp holdfast communities. This study investigated the role of kelp holdfasts (Laminaria hyperborea) in supporting benthic faunal biodiversity. Computer-aided tomography (CT-) scanning was used to quantify the three-dimensional geometrical complexity of holdfasts, including volume, surface area and surface fractal dimension (FD). Additionally, the number of haptera, number of haptera per unit of volume, and age of kelps were estimated. These measurements were compared to faunal biodiversity and community structure, using partial least-squares regression and multivariate ordination. Holdfast volume explained most of the variance observed in biodiversity indices, however all other complexity measures also strongly contributed to the variance observed. Multivariate ordinations further revealed that surface area and haptera per unit of volume accounted for the patterns observed in faunal community structure. Using 3D image analysis, this study makes a strong contribution to elucidate quantitative mechanisms underlying the observed relationship between biodiversity and habitat complexity. Furthermore, the potential of CT-scanning as an ecological tool is demonstrated, and a methodology for its use in future similar studies is established. Such spatially resolved imager analysis could help identify structurally complex areas as biodiversity hotspots, and may support the prioritization of areas for conservation.
Resumo:
Several studies in the past have revealed that network end user devices are left powered up 24/7 even when idle just for the sake of maintaining Internet connectivity. Network devices normally support low power states but are kept inactive due to their inability to maintain network connectivity. The Network Connectivity Proxy (NCP) has recently been proposed as an effective mechanism to impersonate network connectivity on behalf of high power devices and enable them to sleep when idle without losing network presence. The NCP can efficiently proxy basic networking protocol, however, proxying of Internet based applications have no absolute solution due to dynamic and non-predictable nature of the packets they are sending and receiving periodically. This paper proposes an approach for proxying Internet based applications and presents the basic software architectures and capabilities. Further, this paper also practically evaluates the proposed framework and analyzes expected energy savings achievable under-different realistic conditions.
Resumo:
Stealthy attackers move patiently through computer networks - taking days, weeks or months to accomplish their objectives in order to avoid detection. As networks scale up in size and speed, monitoring for such attack attempts is increasingly a challenge. This paper presents an efficient monitoring technique for stealthy attacks. It investigates the feasibility of proposed method under number of different test cases and examines how design of the network affects the detection. A methodological way for tracing anonymous stealthy activities to their approximate sources is also presented. The Bayesian fusion along with traffic sampling is employed as a data reduction method. The proposed method has the ability to monitor stealthy activities using 10-20% size sampling rates without degrading the quality of detection.
Resumo:
In this talk, I will describe various computational modelling and data mining solutions that form the basis of how the office of Deputy Head of Department (Resources) works to serve you. These include lessons I learn about, and from, optimisation issues in resource allocation, uncertainty analysis on league tables, modelling the process of winning external grants, and lessons we learn from student satisfaction surveys, some of which I have attempted to inject into our planning processes.
Resumo:
The generation of heterogeneous big data sources with ever increasing volumes, velocities and veracities over the he last few years has inspired the data science and research community to address the challenge of extracting knowledge form big data. Such a wealth of generated data across the board can be intelligently exploited to advance our knowledge about our environment, public health, critical infrastructure and security. In recent years we have developed generic approaches to process such big data at multiple levels for advancing decision-support. It specifically concerns data processing with semantic harmonisation, low level fusion, analytics, knowledge modelling with high level fusion and reasoning. Such approaches will be introduced and presented in context of the TRIDEC project results on critical oil and gas industry drilling operations and also the ongoing large eVacuate project on critical crowd behaviour detection in confined spaces.
Resumo:
Abstract Heading into the 2020s, Physics and Astronomy are undergoing experimental revolutions that will reshape our picture of the fabric of the Universe. The Large Hadron Collider (LHC), the largest particle physics project in the world, produces 30 petabytes of data annually that need to be sifted through, analysed, and modelled. In astrophysics, the Large Synoptic Survey Telescope (LSST) will be taking a high-resolution image of the full sky every 3 days, leading to data rates of 30 terabytes per night over ten years. These experiments endeavour to answer the question why 96% of the content of the universe currently elude our physical understanding. Both the LHC and LSST share the 5-dimensional nature of their data, with position, energy and time being the fundamental axes. This talk will present an overview of the experiments and data that is gathered, and outlines the challenges in extracting information. Common strategies employed are very similar to industrial data! Science problems (e.g., data filtering, machine learning, statistical interpretation) and provide a seed for exchange of knowledge between academia and industry. Speaker Biography Professor Mark Sullivan Mark Sullivan is a Professor of Astrophysics in the Department of Physics and Astronomy. Mark completed his PhD at Cambridge, and following postdoctoral study in Durham, Toronto and Oxford, now leads a research group at Southampton studying dark energy using exploding stars called "type Ia supernovae". Mark has many years' experience of research that involves repeatedly imaging the night sky to track the arrival of transient objects, involving significant challenges in data handling, processing, classification and analysis.
Resumo:
We present an Integrated Environment suitable for learning and teaching computer programming which is designed for both students of specialised Computer Science courses, and also non-specialist students such as those following Liberal Arts. The environment is rich enough to allow exploration of concepts from robotics, artificial intelligence, social science, and philosophy as well as the specialist areas of operating systems and the various computer programming paradigms.
Resumo:
Coupled map lattices (CML) can describe many relaxation and optimization algorithms currently used in image processing. We recently introduced the ‘‘plastic‐CML’’ as a paradigm to extract (segment) objects in an image. Here, the image is applied by a set of forces to a metal sheet which is allowed to undergo plastic deformation parallel to the applied forces. In this paper we present an analysis of our ‘‘plastic‐CML’’ in one and two dimensions, deriving the nature and stability of its stationary solutions. We also detail how to use the CML in image processing, how to set the system parameters and present examples of it at work. We conclude that the plastic‐CML is able to segment images with large amounts of noise and large dynamic range of pixel values, and is suitable for a very large scale integration(VLSI) implementation.
Resumo:
Abstract not available
Resumo:
The central product of the DRAMA (Dynamic Re-Allocation of Meshes for parallel Finite Element Applications) project is a library comprising a variety of tools for dynamic re-partitioning of unstructured Finite Element (FE) applications. The input to the DRAMA library is the computational mesh, and corresponding costs, partitioned into sub-domains. The core library functions then perform a parallel computation of a mesh re-allocation that will re-balance the costs based on the DRAMA cost model. We discuss the basic features of this cost model, which allows a general approach to load identification, modelling and imbalance minimisation. Results from crash simulations are presented which show the necessity for multi-phase/multi-constraint partitioning components.
Resumo:
In today’s big data world, data is being produced in massive volumes, at great velocity and from a variety of different sources such as mobile devices, sensors, a plethora of small devices hooked to the internet (Internet of Things), social networks, communication networks and many others. Interactive querying and large-scale analytics are being increasingly used to derive value out of this big data. A large portion of this data is being stored and processed in the Cloud due the several advantages provided by the Cloud such as scalability, elasticity, availability, low cost of ownership and the overall economies of scale. There is thus, a growing need for large-scale cloud-based data management systems that can support real-time ingest, storage and processing of large volumes of heterogeneous data. However, in the pay-as-you-go Cloud environment, the cost of analytics can grow linearly with the time and resources required. Reducing the cost of data analytics in the Cloud thus remains a primary challenge. In my dissertation research, I have focused on building efficient and cost-effective cloud-based data management systems for different application domains that are predominant in cloud computing environments. In the first part of my dissertation, I address the problem of reducing the cost of transactional workloads on relational databases to support database-as-a-service in the Cloud. The primary challenges in supporting such workloads include choosing how to partition the data across a large number of machines, minimizing the number of distributed transactions, providing high data availability, and tolerating failures gracefully. I have designed, built and evaluated SWORD, an end-to-end scalable online transaction processing system, that utilizes workload-aware data placement and replication to minimize the number of distributed transactions that incorporates a suite of novel techniques to significantly reduce the overheads incurred both during the initial placement of data, and during query execution at runtime. In the second part of my dissertation, I focus on sampling-based progressive analytics as a means to reduce the cost of data analytics in the relational domain. Sampling has been traditionally used by data scientists to get progressive answers to complex analytical tasks over large volumes of data. Typically, this involves manually extracting samples of increasing data size (progressive samples) for exploratory querying. This provides the data scientists with user control, repeatable semantics, and result provenance. However, such solutions result in tedious workflows that preclude the reuse of work across samples. On the other hand, existing approximate query processing systems report early results, but do not offer the above benefits for complex ad-hoc queries. I propose a new progressive data-parallel computation framework, NOW!, that provides support for progressive analytics over big data. In particular, NOW! enables progressive relational (SQL) query support in the Cloud using unique progress semantics that allow efficient and deterministic query processing over samples providing meaningful early results and provenance to data scientists. NOW! enables the provision of early results using significantly fewer resources thereby enabling a substantial reduction in the cost incurred during such analytics. Finally, I propose NSCALE, a system for efficient and cost-effective complex analytics on large-scale graph-structured data in the Cloud. The system is based on the key observation that a wide range of complex analysis tasks over graph data require processing and reasoning about a large number of multi-hop neighborhoods or subgraphs in the graph; examples include ego network analysis, motif counting in biological networks, finding social circles in social networks, personalized recommendations, link prediction, etc. These tasks are not well served by existing vertex-centric graph processing frameworks whose computation and execution models limit the user program to directly access the state of a single vertex, resulting in high execution overheads. Further, the lack of support for extracting the relevant portions of the graph that are of interest to an analysis task and loading it onto distributed memory leads to poor scalability. NSCALE allows users to write programs at the level of neighborhoods or subgraphs rather than at the level of vertices, and to declaratively specify the subgraphs of interest. It enables the efficient distributed execution of these neighborhood-centric complex analysis tasks over largescale graphs, while minimizing resource consumption and communication cost, thereby substantially reducing the overall cost of graph data analytics in the Cloud. The results of our extensive experimental evaluation of these prototypes with several real-world data sets and applications validate the effectiveness of our techniques which provide orders-of-magnitude reductions in the overheads of distributed data querying and analysis in the Cloud.
Resumo:
This work presents a study about a the Baars-Franklin architecture, which defines a model of computational consciousness, and use it in a mobile robot navigation task. The insertion of mobile robots in dynamic environments carries a high complexity in navigation tasks, in order to deal with the constant environment changes, it is essential that the robot can adapt to this dynamism. The approach utilized in this work is to make the execution of these tasks closer to how human beings react to the same conditions by means of a model of computational consci-ousness. The LIDA architecture (Learning Intelligent Distribution Agent) is a cognitive system that seeks tomodel some of the human cognitive aspects, from low-level perceptions to decision making, as well as attention mechanism and episodic memory. In the present work, a computa-tional implementation of the LIDA architecture was evaluated by means of a case study, aiming to evaluate the capabilities of a cognitive approach to navigation of a mobile robot in dynamic and unknown environments, using experiments both with virtual environments (simulation) and a real robot in a realistic environment. This study concluded that it is possible to obtain benefits by using conscious cognitive models in mobile robot navigation tasks, presenting the positive and negative aspects of this approach.
Resumo:
SQL Injection Attack (SQLIA) remains a technique used by a computer network intruder to pilfer an organisation’s confidential data. This is done by an intruder re-crafting web form’s input and query strings used in web requests with malicious intent to compromise the security of an organisation’s confidential data stored at the back-end database. The database is the most valuable data source, and thus, intruders are unrelenting in constantly evolving new techniques to bypass the signature’s solutions currently provided in Web Application Firewalls (WAF) to mitigate SQLIA. There is therefore a need for an automated scalable methodology in the pre-processing of SQLIA features fit for a supervised learning model. However, obtaining a ready-made scalable dataset that is feature engineered with numerical attributes dataset items to train Artificial Neural Network (ANN) and Machine Leaning (ML) models is a known issue in applying artificial intelligence to effectively address ever evolving novel SQLIA signatures. This proposed approach applies numerical attributes encoding ontology to encode features (both legitimate web requests and SQLIA) to numerical data items as to extract scalable dataset for input to a supervised learning model in moving towards a ML SQLIA detection and prevention model. In numerical attributes encoding of features, the proposed model explores a hybrid of static and dynamic pattern matching by implementing a Non-Deterministic Finite Automaton (NFA). This combined with proxy and SQL parser Application Programming Interface (API) to intercept and parse web requests in transition to the back-end database. In developing a solution to address SQLIA, this model allows processed web requests at the proxy deemed to contain injected query string to be excluded from reaching the target back-end database. This paper is intended for evaluating the performance metrics of a dataset obtained by numerical encoding of features ontology in Microsoft Azure Machine Learning (MAML) studio using Two-Class Support Vector Machines (TCSVM) binary classifier. This methodology then forms the subject of the empirical evaluation.