950 results for Extensible Pluggable Architecture Hydra Data


Relevance: 30.00%

Abstract:

Thesis digitized by the Direction des bibliothèques de l'Université de Montréal.

Relevance: 30.00%

Abstract:

A substantial amount of the information on the Internet is present in the form of text. The value of this semi-structured and unstructured data has been widely acknowledged, with consequent scientific and commercial exploitation. The ever-increasing volume of data production, however, pushes data analytic platforms to their limits. This thesis proposes techniques for more efficient textual big data analysis suitable for the Hadoop analytic platform, exploring the direct processing of compressed textual data. The focus is on developing novel compression methods with a number of desirable properties to support text-based big data analysis in distributed environments. The novel contributions of this work include the following. Firstly, a Content-aware Partial Compression (CaPC) scheme is developed. CaPC distinguishes between informational and functional content and compresses only the informational content; the compressed data thus remains transparent to existing software libraries, which often rely on functional content to work. Secondly, a context-free, bit-oriented compression scheme (Approximated Huffman Compression) based on the Huffman algorithm is developed; it uses a hybrid data structure that allows pattern searching in compressed data in linear time. Thirdly, several modern compression schemes are extended so that the compressed data can be safely split along logical data record boundaries in distributed file systems. Furthermore, an innovative two-layer compression architecture is used, in which each compression layer is appropriate to the corresponding stage of data processing. Peripheral libraries are developed that seamlessly link the proposed compression schemes to existing analytic platforms and computational frameworks, and that make the use of the compressed data transparent to developers. The compression schemes have been evaluated on a number of standard MapReduce analysis tasks using a collection of real-world datasets; in comparison with existing solutions, they show substantial improvement in performance and significant reduction in system resource requirements.
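
The CaPC idea can be illustrated with a toy analogue (this sketch is mine, not the thesis's codec): field delimiters, standing in for functional content, pass through verbatim, while field values, the informational content, are compressed, so an ordinary CSV splitter still sees the record structure.

```python
# Toy content-aware partial compression: delimiters survive, values are packed.
import base64
import zlib

def capc_encode(record: str, field_sep: str = ",") -> str:
    """Compress field values only; the separator (functional content) is kept."""
    fields = record.split(field_sep)
    return field_sep.join(
        base64.b85encode(zlib.compress(f.encode())).decode() for f in fields
    )

def capc_decode(record: str, field_sep: str = ",") -> str:
    fields = record.split(field_sep)
    return field_sep.join(
        zlib.decompress(base64.b85decode(f)).decode() for f in fields
    )

line = "2014-07-01,sensor-42,temperature rose sharply in the afternoon"
enc = capc_encode(line)
assert enc.count(",") == line.count(",")  # CSV structure is still visible
assert capc_decode(enc) == line
```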

Relevance: 30.00%

Abstract:

This dissertation studies context-aware applications and proposes algorithms for the client side. The required context-aware infrastructure is discussed in depth to illustrate that such an infrastructure collects the mobile user's context information, registers service providers, derives the user's current context, distributes the user context among context-aware applications, and provides tailored services. The proposed approach strikes a balance between the context server and the mobile devices: context acquisition is centralized at the server to ensure the usability of context information across mobile devices, while context reasoning remains at the application level. Hence, centralized context acquisition combined with distributed context reasoning is viewed as the better overall solution. A context-aware search application is designed and implemented on the server side, with a new algorithm that takes user context profiles into consideration. By promoting feedback on the dynamics of the system, any prior user selection is saved for further analysis so that it can contribute to the results of a subsequent search. On the basis of these server-side developments, various solutions are provided on the client side. A software-based proxy component is set up for data collection. This research endorses the belief that the client-side proxy should contain the context reasoning component; implementing such a component lends credence to this belief in that the context-aware applications are able to derive the user context profiles. Furthermore, a context cache scheme is implemented to manage the cache on the client device in order to minimize processing requirements and other resources (bandwidth, CPU cycles, power). Java and MySQL are used to implement the proposed architecture and to test scenarios derived from users' daily activities. To meet the practical demands of a testing environment without the heavy cost of establishing a comprehensive infrastructure, a software simulation using the free Yahoo search API is provided as a means to evaluate the effectiveness of the design approach realistically. The integration of the Yahoo search engine into the context-aware architecture demonstrates how a context-aware application can meet user demands for tailored services and products in and around the user's environment. The test results show that the overall design is highly effective, providing new features and enriching the mobile user's experience through a broad scope of potential applications.
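
As one concrete reading of the client-side context cache, here is a minimal sketch assuming a bounded store with expiry and least-recently-used eviction (the dissertation's actual policy may differ):

```python
# Bounded, expiring cache for derived context on the client device.
import time
from collections import OrderedDict

class ContextCache:
    def __init__(self, capacity=64, ttl_s=300.0):
        self.capacity, self.ttl_s = capacity, ttl_s
        self._items = OrderedDict()          # key -> (timestamp, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None                      # miss: caller asks the context server
        ts, value = entry
        if time.time() - ts > self.ttl_s:    # stale context is discarded
            del self._items[key]
            return None
        self._items.move_to_end(key)         # mark as recently used
        return value

    def put(self, key, value):
        self._items[key] = (time.time(), value)
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used

cache = ContextCache(capacity=2, ttl_s=60)
cache.put("current_location", "office")
print(cache.get("current_location"))         # 'office', no server round-trip
```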

Relevance: 30.00%

Abstract:

Corel Geological Drafting Kit (CGDK), a program written in VBA, has been designed to assist geologists and geochemists with their drafting work. It obtains geological data directly from a running Excel application and uses the data to plot geochemical diagrams and to construct stratigraphic columns. The software also contains functions for creating stereographic projections and rose diagrams, which can be used for spatial analysis on a calibrated geological map. The user-friendly program has been tested with CorelDRAW 13, 14 and 15 and with Excel 2003 and 2007.
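
As a small illustration of what a rose diagram computes (a Python sketch of the angular binning, not CGDK's VBA; the 30° sector width is an assumption):

```python
# Accumulate strike azimuths into fixed angular sectors for a rose diagram.
def rose_bins(azimuths_deg, sector_deg=30):
    counts = [0] * (360 // sector_deg)
    for az in azimuths_deg:
        counts[int(az % 360) // sector_deg] += 1
    return counts

print(rose_bins([10, 15, 95, 100, 350], sector_deg=30))
# [2, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 1]
```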

Relevance: 30.00%

Abstract:

Visual cluster analysis provides valuable tools that help analysts understand large data sets in terms of representative clusters and the relationships between them. Often, the clusters found are to be understood in the context of associated categorical, numerical or textual metadata given for the data elements. While often not part of the clustering process itself, such metadata play an important role and need to be considered during interactive cluster exploration. Traditionally, linked views allow analysts to relate (or, loosely speaking, correlate) clusters with metadata or other properties of the underlying cluster data. Manually inspecting the metadata distribution for each cluster in a linked-view approach is tedious, especially for large data sets, where a large search problem arises, and a fully interactive search for potentially useful or interesting cluster-to-metadata relationships can be a cumbersome and lengthy process. To remedy this problem, we propose a novel approach for guiding users in discovering interesting relationships between clusters and associated metadata; its goal is to guide the analyst through the potentially huge search space. We focus on metadata of categorical type, which can be summarized for a cluster in the form of a histogram. Starting from a given visual cluster representation, we compute measures of interestingness defined on the distribution of metadata categories for the clusters. These measures are used to automatically score and rank the clusters by their potential interestingness with regard to the distribution of categorical metadata. Identified interesting relationships are highlighted in the visual cluster representation for easy inspection by the user. We present a system implementing an encompassing, yet extensible, set of interestingness scores for categorical metadata, which can also be extended to numerical metadata. Appropriate visual representations are provided for showing the visual correlations as well as the calculated ranking scores. Focusing on clusters of time series data, we test our approach on a large real-world data set of time-oriented scientific research data, demonstrating how specific interesting views are automatically identified, supporting the analyst in discovering interesting and visually understandable relationships.
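
The scoring idea can be sketched with one assumed measure: clusters whose categorical histogram deviates strongly from uniform score high. The paper's actual set of interestingness scores is broader; this is only an illustration.

```python
# Rank clusters by one interestingness measure over categorical metadata.
from collections import Counter
from math import log2

def interestingness(categories):
    """1 - normalized entropy: 1.0 for a fully skewed histogram, 0.0 for uniform."""
    hist = Counter(categories)
    n = sum(hist.values())
    if len(hist) < 2:
        return 1.0
    entropy = -sum((c / n) * log2(c / n) for c in hist.values())
    return 1.0 - entropy / log2(len(hist))

clusters = {  # hypothetical per-cluster categorical metadata
    "cluster A": ["lab1"] * 9 + ["lab2"],           # skewed -> stands out
    "cluster B": ["lab1", "lab2", "lab3", "lab4"],  # uniform -> unremarkable
}
ranked = sorted(clusters, key=lambda c: interestingness(clusters[c]), reverse=True)
for name in ranked:
    print(name, round(interestingness(clusters[name]), 2))
```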

Relevance: 30.00%

Abstract:

A parallel pipelined array of cells suitable for the real-time computation of histograms is proposed. The cell architecture builds on previous work, now allowing operation on a stream of data at one pixel per clock cycle. The new cell is better suited to interfacing with camera sensors or with microprocessors with 8-bit data buses, which are common in consumer digital cameras. Arrays using the proposed cells are obtained via C-slow retiming techniques and can be clocked at a 65% higher frequency than previous arrays, achieving over 80% of the performance of two-pixel-per-clock-cycle parallel pipelined arrays.
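
The cell array can be modelled behaviourally in a few lines (a Python sketch with my own simplifications, not the paper's hardware design; the model collapses pipeline latency into a single pass per pixel):

```python
# Behavioural model: each cell owns one histogram bin and forwards
# non-matching pixels to its downstream neighbour.
class HistCell:
    def __init__(self, bin_value, downstream=None):
        self.bin_value, self.count, self.downstream = bin_value, 0, downstream

    def clock(self, pixel):
        if pixel == self.bin_value:
            self.count += 1               # this cell owns the bin: count it
        elif self.downstream is not None:
            self.downstream.clock(pixel)  # otherwise pass it down the pipeline

# Build a 4-bin array for 2-bit pixels and stream data through it.
cells = [HistCell(b) for b in range(4)]
for upstream, downstream in zip(cells, cells[1:]):
    upstream.downstream = downstream
for pixel in [0, 1, 1, 3, 2, 1, 0]:       # one pixel enters per "clock"
    cells[0].clock(pixel)
print([c.count for c in cells])            # [2, 3, 1, 1]
```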

Relevance: 30.00%

Abstract:

This paper presents a novel theory for performing multi-agent activity recognition without requiring large training corpora. The reduced need for data means that robust probabilistic recognition can be performed within domains where annotated datasets are traditionally unavailable. Complex human activities are composed of sequences of underlying primitive activities. We do not assume that the exact temporal ordering of the primitives is necessary, so a complex activity can be represented as an unordered bag. Our three-tier architecture comprises low-level video tracking, event analysis and high-level inference. High-level inference is performed using a new, cascading extension of the Rao–Blackwellised Particle Filter, and simulated annealing is used to identify pairs of agents involved in multi-agent activity. We validate our framework using the benchmark PETS 2006 video surveillance dataset and our own sequences, achieving a mean recognition F-score of 0.82, a mean improvement of 17% over a Hidden Markov Model baseline.
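
The unordered-bag representation can be sketched as follows (a toy example with invented primitive vocabularies; the paper's inference actually uses a cascading Rao–Blackwellised Particle Filter, not this naive matcher):

```python
# Match a multiset of detected primitive events against activity models,
# ignoring the temporal order of the primitives.
from collections import Counter

ACTIVITY_MODELS = {  # hypothetical primitive vocabularies
    "luggage_abandonment": Counter({"enter": 1, "drop_bag": 1, "walk_away": 1}),
    "meeting": Counter({"enter": 2, "stand_together": 1}),
}

def bag_match(observed, model):
    """Fraction of the model's primitives present in the observed bag."""
    overlap = sum((observed & model).values())  # multiset intersection
    return overlap / sum(model.values())

observed = Counter(["enter", "drop_bag", "walk_away", "loiter"])
for name, model in ACTIVITY_MODELS.items():
    print(name, round(bag_match(observed, model), 2))
# luggage_abandonment 1.0, meeting 0.33
```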

Relevance: 30.00%

Abstract:

This report evaluates the use of remotely sensed images in implementing the Iowa DOT LRS, which is currently in the system architecture stage. The Iowa Department of Transportation is investing a significant amount of time and resources in the creation of a linear referencing system (LRS). A significant portion of the implementation effort will be the creation of a datum, which involves geographically locating anchor points and then measuring the anchor section distances between them. Currently, system architecture work and the evaluation of different data collection methods for establishing the LRS datum are being performed for the DOT by an outside consulting team.
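
The datum concept can be sketched in a few lines (the coordinates and distances below are invented): anchor points are geographically located, anchor sections carry measured distances, and a position is resolved by interpolating along a section.

```python
# Minimal linear-referencing sketch: anchor points, anchor sections, lookup.
anchor_points = {                      # hypothetical surveyed locations
    "AP1": (41.5900, -93.6200),
    "AP2": (41.5900, -93.5800),
}
anchor_sections = {("AP1", "AP2"): 3.35}  # measured section length, miles

def locate(section, offset_mi):
    """Interpolate a lat/lon for an offset along an anchor section."""
    (a, b), length = section, anchor_sections[section]
    t = offset_mi / length              # fraction of the section traversed
    (lat1, lon1), (lat2, lon2) = anchor_points[a], anchor_points[b]
    return (lat1 + t * (lat2 - lat1), lon1 + t * (lon2 - lon1))

print(locate(("AP1", "AP2"), 1.0))      # position 1 mile from AP1 toward AP2
```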

Relevance: 30.00%

Abstract:

The automated transfer of flight logbook information from aircraft into aircraft maintenance systems reduces ground and maintenance time and is thus desirable from an economic point of view. Until recently, flight logbooks were not managed electronically on board, or at least the data transfer from aircraft to ground maintenance system was executed manually. The latest aircraft types, such as the Airbus A380 and the Boeing 787, do support an electronic logbook and thus make automated transfer possible. A generic flight logbook transfer system must deal with different data formats on the input side, due to different aircraft makes and models, as well as with different, distributed aircraft maintenance systems for different airlines as aircraft operators. This article contributes the concept and top-level distributed system architecture of such a generic system for automated flight log data transfer. It has been developed within a joint industry and applied-research project, and the architecture has already been successfully evaluated in a prototypical implementation.
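
A sketch of the generic-transfer concept under assumed names (the article describes distributed middleware, not this in-process registry): per-aircraft-type parsers normalise logbook records into one canonical form, so new makes and models plug in without touching the core.

```python
# Pluggable per-aircraft-type parsers feeding a common delivery hook.
from typing import Callable

PARSERS: dict[str, Callable[[str], dict]] = {}

def parser(aircraft_type: str):
    def register(fn):
        PARSERS[aircraft_type] = fn     # new types plug in via this decorator
        return fn
    return register

@parser("A380")
def parse_a380(raw: str) -> dict:       # hypothetical input format
    tail, fault = raw.split("|")
    return {"tail": tail, "fault": fault}

@parser("B787")
def parse_b787(raw: str) -> dict:       # hypothetical input format
    record = raw.split(";")
    return {"tail": record[0], "fault": record[1]}

def transfer(aircraft_type: str, raw: str, deliver: Callable[[dict], None]):
    deliver(PARSERS[aircraft_type](raw))  # parse once, deliver anywhere

transfer("A380", "D-AIMA|cabin light u/s", print)
transfer("B787", "N789BA;galley chiller fault", print)
```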

Relevance: 30.00%

Abstract:

Provenance plays a pivotal role in tracing the origin of something and in determining how and why it occurred. With the emergence of the cloud and the benefits it brings, there has been a rapid proliferation of services adopted by commercial and government sectors. However, trust and security concerns for such services are on an unprecedented scale. Currently, these services expose very little of their internal workings to their customers; this can cause accountability and compliance issues, especially in the event of a fault or error, where customers and providers are left pointing fingers at each other. Provenance-based traceability provides a means to address part of this problem by capturing and querying past events in order to understand how and why they took place. However, due to the complexity of the cloud infrastructure, current provenance models lack the expressiveness required to describe the inner workings of a cloud service. For a complete solution, a provenance-aware policy language is also required so that operators and users can define policies for compliance purposes; current policy standards do not cater for this requirement. To address these issues, in this paper we propose a provenance (traceability) model, cProv, and a provenance-aware policy language, cProvl, to capture traceability data and to express policies for validation against the model. For the implementation, we have extended the XACML 3.0 architecture to support provenance, and provided a translator that converts cProvl policies and requests into their XACML counterparts.
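
Since cProv and cProvl are the paper's own notations, the sketch below only illustrates the translation idea with an invented mini-format: a provenance-style rule is mapped onto an XACML-like rule element.

```python
# Toy translator: a {subject, action, resource, effect} rule to XACML-style XML.
import xml.etree.ElementTree as ET

def to_xacml_like(rule: dict) -> str:
    r = ET.Element("Rule", RuleId=rule["id"], Effect=rule["effect"])
    target = ET.SubElement(r, "Target")
    for field in ("subject", "action", "resource"):
        m = ET.SubElement(target, "Match", Attribute=field)
        m.text = rule[field]            # the value the attribute must match
    return ET.tostring(r, encoding="unicode")

rule = {"id": "audit-1", "effect": "Permit",
        "subject": "cloud-operator", "action": "query-provenance",
        "resource": "service-trace"}
print(to_xacml_like(rule))
```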

Relevance: 30.00%

Abstract:

Hyperspectral instruments have been incorporated into satellite missions, providing data of high spectral resolution about the Earth. These data can be used in remote sensing applications such as target detection, hazard prevention and the monitoring of oil spills, among others. In most of these applications, one requirement of paramount importance is the ability to give a real-time or near real-time response. Recently, onboard processing systems have emerged in order to cope with the huge amount of data to be transferred from the satellite to the ground station, thus avoiding delays between hyperspectral image acquisition and its interpretation. For this purpose, compact reconfigurable hardware modules such as field-programmable gate arrays (FPGAs) are widely used. This paper proposes a parallel FPGA-based architecture for endmember signature extraction. The method, based on Vertex Component Analysis (VCA), has several advantages: it is unsupervised, fully automatic, and works without a dimensionality reduction (DR) pre-processing step. The architecture has been designed for a low-cost Xilinx Zynq board with a Zynq-7020 SoC FPGA, based on Artix-7 programmable logic, and tested using real hyperspectral data sets collected by NASA's Airborne Visible Infra-Red Imaging Spectrometer (AVIRIS) over the Cuprite mining district in Nevada. Experimental results indicate that the proposed implementation can achieve real-time processing while maintaining the method's accuracy, which indicates the potential of the proposed platform for implementing high-performance, low-cost embedded systems, opening new perspectives for onboard hyperspectral image processing.
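
A compact software sketch of VCA-style extraction (NumPy, my simplification of the published algorithm and unrelated to the paper's FPGA design): each iteration projects the data orthogonally to the endmembers found so far and keeps the pixel with the largest residual projection.

```python
import numpy as np

def vca_like(pixels: np.ndarray, n_endmembers: int) -> np.ndarray:
    """pixels: (n_pixels, n_bands); returns indices of extracted endmembers."""
    rng = np.random.default_rng(0)
    E = np.zeros((pixels.shape[1], 0))      # endmember signatures found so far
    chosen = []
    for _ in range(n_endmembers):
        w = rng.standard_normal(pixels.shape[1])
        if E.shape[1] > 0:                  # make w orthogonal to span(E)
            w -= E @ np.linalg.lstsq(E, w, rcond=None)[0]
        scores = np.abs(pixels @ w)         # extreme pixel along direction w
        chosen.append(int(np.argmax(scores)))
        E = np.column_stack([E, pixels[chosen[-1]]])
    return np.array(chosen)

cube = np.random.default_rng(1).random((1000, 50))  # synthetic "hyperspectral" data
print(vca_like(cube, n_endmembers=4))
```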

Relevance: 30.00%

Abstract:

The stratigraphic architecture of deep-sea depositional systems is discussed in detail, and examples from the Ischia and Stromboli volcanic islands (Southern Tyrrhenian Sea, Italy) are shown and discussed. Submarine slope and base-of-slope depositional systems represent a major component of marine and lacustrine basin fills, constituting primary targets for hydrocarbon exploration and development. The slope systems are characterized by seven seismic-facies building blocks: turbiditic channel fills; turbidite lobes; sheet turbidites; slide, slump and debris-flow sheets, lobes and tongues; fine-grained turbidite fills and sheets; contourite drifts; and, finally, hemipelagic drapes and fills.

Sparker profiles offshore Ischia are presented. New seismo-stratigraphic evidence on buried volcanic structures and the overlying Quaternary deposits of the eastern offshore of Ischia Island is discussed to highlight the implications for marine geophysics and volcanology, and regional seismic sections across buried volcanic structures and debris-avalanche and debris-flow deposits are presented. Deep-sea depositional systems around Ischia are well developed in correspondence with the Southern Ischia canyon system. The canyon system incises a narrow continental shelf from Punta Imperatore to Punta San Pancrazio and is bounded to the southwest by the relict volcanic edifice of the Ischia bank. While the eastern boundary of the canyon system is controlled by extensional tectonics, being limited by a NE-SW-trending (counter-Apenninic) normal fault, its western boundary is controlled by volcanism, due to the growth of the Ischia volcanic bank. Submarine gravitational instabilities also acted in relation to the canyon system, allowing the identification of large-scale creep at the sea bottom and of hummocky deposits previously interpreted as debris-avalanche deposits.

High-resolution seismic data (Subbottom Chirp), coupled with high-resolution multibeam bathymetry collected in the frame of the Stromboli geophysical experiment, which aimed at recording active seismic data and tomography of Stromboli Island, are also presented. A new detailed swath bathymetry of Stromboli Island is shown and discussed to reconstruct an up-to-date morpho-bathymetry and marine geology of the area, compared with the volcanological setting of the Aeolian volcanic complex. The Stromboli DEM gives information about the submerged structure of the volcano, particularly about the volcano-tectonic and gravitational processes involving the submarine flanks of the edifice. Several seismic units have been identified around the volcanic edifice and interpreted as the volcanic acoustic basement of the volcano and overlying chaotic slide bodies emplaced during its complex volcano-tectonic evolution. They are related to the eruptive activity of Stromboli, mainly polyphasic, and to regional geological processes involving the geology of the Aeolian Arc.

Relevance: 30.00%

Abstract:

Public agencies are increasingly required to collaborate with each other in order to provide high-quality e-government services. This collaboration is usually based on the service-oriented approach and supported by interoperability platforms. Such platforms are specialized middleware-based infrastructures enabling the provision, discovery and invocation of interoperable software services. In turn, given that personal data handled by governments are often very sensitive, most governments have developed some sort of legislation focusing on data protection. This paper proposes solutions for monitoring and enforcing data protection laws within an E-government Interoperability Platform. In particular, the proposal addresses requirements posed by the Uruguayan Data Protection Law and the Uruguayan E-government Platform, although it can also be applied in similar scenarios. The solutions are based on well-known integration mechanisms (e.g. Enterprise Service Bus) as well as recognized security standards (e.g. eXtensible Access Control Markup Language) and were completely prototyped leveraging the SwitchYard ESB product.
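
One way to picture the enforcement point (a minimal sketch with invented names; a real deployment would delegate the decision to an XACML policy decision point on the ESB): an interceptor checks each personal-data category in a message before routing it onward.

```python
# ESB-style interceptor consulting a data-protection decision point.
def pdp_decide(consumer: str, data_category: str) -> bool:
    """Hypothetical policy decision point; stands in for a real XACML PDP."""
    permitted = {("tax-agency", "address"), ("health-agency", "medical")}
    return (consumer, data_category) in permitted

def intercept(message: dict, forward):
    for category in message["data_categories"]:
        if not pdp_decide(message["consumer"], category):
            raise PermissionError(f"{message['consumer']} may not receive {category}")
    return forward(message)             # all categories permitted: route onward

msg = {"consumer": "tax-agency", "data_categories": ["address"], "body": "..."}
print(intercept(msg, lambda m: "delivered"))
```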

Relevance: 30.00%

Abstract:

This thesis reports on an investigation of the feasibility and usefulness of incorporating dynamic management facilities for sensed context data into a distributed context-aware mobile application. The investigation focuses on reducing the work required to integrate new sensed-context streams into an existing context-aware architecture. Current architectures require integration work for each new stream and each new context that is encountered. This mode of operation is acceptable for today's fixed architectures; however, as systems become more mobile, the number of discoverable streams increases, and without the ability to discover and use these new streams the functionality of any given device is limited to the streams it knows how to decode.

The integration of a new stream requires that its sensed context data be understood by the current application. If the new source provides data of a type that an application currently requires, the new source should be connected to the application without any prior knowledge of that source; if the type is similar and can be converted, this stream too should be appropriated by the application. Such applications are based on portable devices (phones, PDAs) offering semi-autonomous services that use data from sensors connected to the devices, plus data exchanged with other such devices and with remote servers. They must handle input from a variety of sensors, refining the data locally and managing its communication from the device under volatile and unpredictable network conditions. The choice to focus on locally connected sensory input allows for the introduction of privacy and access controls, since this local control can determine how the information is communicated to others.

The investigation evaluates three approaches to sensor data management. The first system, which served as the reference, is characterised by static management based on prepended metadata: developed for a mobile system, it processed the data according to the attached metadata, and the code that performed the processing was static. The second system moved away from static processing to allow greater freedom in handling the data stream, resulting in a heavyweight approach: processing was pushed into a number of networked nodes rather than the monolithic design of the previous system, and a separate communication channel for the metadata made it possible to be more flexible about the amount and type of data transmitted. The final system pulled the benefits of the other two together: a small management class loads a separate handler based on the incoming data, maximising dynamism while keeping the code easy to understand.

The three systems were then compared to highlight their ability to dynamically manage newly sensed context. The evaluation took two approaches: the first was a quantitative analysis of the code to understand the relative complexity of the three systems, carried out by evaluating what changes each system required to support a new context; the second was a qualitative view of the work required by the software engineer to reconfigure each system to support a new data stream. The evaluation highlights the scenarios in which each of the three systems is best suited. There is always a trade-off in the development of a system, and the three approaches highlight this fact: a statically bound system can be quick to develop but may need to be completely rewritten if the requirements move too far, while a highly dynamic system may cope with new requirements, but the developer time needed to create it may exceed that of creating several simpler systems.
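
The third system's pattern, a small management class that loads a handler keyed on the incoming stream type, can be sketched as a registry (the names below are assumptions, not the thesis's code):

```python
# Dynamic handler loading: supporting a new stream means adding one class.
HANDLERS = {}

def handles(stream_type):
    def register(cls):
        HANDLERS[stream_type] = cls
        return cls
    return register

@handles("gps")
class GpsHandler:
    def decode(self, payload):          # hypothetical wire format
        lat, lon = map(float, payload.split(","))
        return {"kind": "location", "lat": lat, "lon": lon}

@handles("temp")
class TempHandler:
    def decode(self, payload):          # hypothetical wire format
        return {"kind": "temperature", "celsius": float(payload)}

def dispatch(stream_type, payload):
    handler = HANDLERS.get(stream_type)
    if handler is None:
        raise LookupError(f"no handler for stream type {stream_type!r}")
    return handler().decode(payload)

print(dispatch("gps", "55.95,-3.19"))   # a newly discovered stream needs only
print(dispatch("temp", "21.5"))         # a new @handles(...) class
```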