941 results for open source seismic data processing packages
Abstract:
This dissertation develops a new mathematical approach that overcomes the effect of a data processing phenomenon known as "histogram binning", which is inherent to flow cytometry data. A real-time procedure is introduced to demonstrate the effectiveness and fast implementation of this approach on real-world data. The histogram binning effect is a dilemma posed by two seemingly antagonistic developments: (1) flow cytometry data in its histogram form is extended in its dynamic range to improve its analysis and interpretation, and (2) the inevitable dynamic range extension introduces an unwelcome side effect, the binning effect, which skews the statistics of the data and consequently undermines the accuracy of the analysis and the eventual interpretation of the data. Researchers in the field have contended with this dilemma for many years, either resorting to hardware approaches that are rather costly and have inherent calibration and noise effects, or developing software techniques that filter the binning effect but do not preserve the statistical content of the original data. The mathematical approach introduced in this dissertation is so appealing that a patent application has been filed. The contribution of this dissertation is an incremental scientific innovation based on a mathematical framework that will allow researchers in the field of flow cytometry to improve the interpretation of data, knowing that its statistical meaning has been faithfully preserved for optimized analysis. Furthermore, with the same mathematical foundation, proof of the origin of this inherent artifact is provided. These results are unique in that new mathematical derivations are established to define and solve the critical problem of the binning effect faced at the experimental assessment level, providing a data platform that preserves its statistical content. In addition, a novel method for accumulating the log-transformed data was developed.
This new method uses the properties of the transformation of statistical distributions to accumulate the output histogram in a non-integer and multi-channel fashion. Although the mathematics of this new mapping technique seems intricate, the concise nature of the derivations allows for a procedure that lends itself to real-time implementation using lookup tables, a task that is also introduced in this dissertation.
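The idea of accumulating a log-transformed histogram in a non-integer, multi-channel fashion through a lookup table can be illustrated with a minimal sketch. It is not the dissertation's derivation: it assumes, purely for illustration, that events are uniformly distributed within each linear input channel, and the names `log_edges`, `build_lookup_table`, `accumulate` and `naive_rebin` are hypothetical.

```python
import numpy as np

def log_edges(n_in, n_out):
    # Log-spaced output channel edges spanning the input range [1, n_in + 1].
    edges = np.logspace(0.0, np.log10(n_in + 1.0), n_out + 1)
    edges[0], edges[-1] = 1.0, n_in + 1.0  # pin the end points exactly
    return edges

def build_lookup_table(n_in, n_out):
    # Assumption: linear input channel i covers the interval [i + 1, i + 2)
    # and its events are spread uniformly over that interval.
    lo = np.arange(1, n_in + 1, dtype=float)[:, None]
    hi = lo + 1.0
    edges = log_edges(n_in, n_out)
    # Overlap length between every input interval and every output channel.
    overlap = np.minimum(hi, edges[1:]) - np.maximum(lo, edges[:-1])
    # Row i holds the fraction of input channel i sent to each output channel.
    return np.clip(overlap, 0.0, None)

def accumulate(counts, table):
    # Fractional (non-integer) accumulation over several output channels.
    return np.asarray(counts, dtype=float) @ table

def naive_rebin(counts, n_in, n_out):
    # Integer mapping: each input channel goes entirely to a single output
    # channel, which leaves gaps at the low end of the log scale.
    lo = np.arange(1, n_in + 1, dtype=float)
    idx = np.searchsorted(log_edges(n_in, n_out), lo, side="right") - 1
    return np.bincount(idx, weights=counts, minlength=n_out)

counts = np.ones(256)
table = build_lookup_table(256, 256)   # computed once, reused per histogram
smooth = accumulate(counts, table)
gappy = naive_rebin(counts, 256, 256)
```

Because the table is computed once, each new histogram costs a single matrix-vector product, which is what makes a lookup-table formulation attractive for real-time use. Under the uniformity assumption the total event count is preserved and no output channel is left empty, whereas the integer mapping leaves empty channels.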
Abstract:
Cloud computing enables independent end users and applications to share data and pooled resources, possibly located in geographically distributed data centers, in a fully transparent way. This need is particularly felt by scientific applications, which must exploit distributed resources in an efficient and scalable way to process large amounts of data. This paper proposes an open solution for deploying a Platform as a Service (PaaS) over a set of multi-site data centers, applying open source virtualization tools to facilitate operation among virtual machines while optimizing the usage of distributed resources. An experimental testbed is set up in an OpenStack environment to evaluate different types of sample TCP connections, to demonstrate the functionality of the proposed solution, and to obtain throughput measurements in relation to relevant design parameters.
Abstract:
The full-scale base-isolated structure studied in this dissertation is the only base-isolated building in the South Island of New Zealand. It sustained hundreds of earthquake ground motions from September 2010 well into 2012. Several large earthquake responses were recorded in December 2011 by NEES@UCLA and by a GeoNet recording station near Christchurch Women's Hospital. The primary focus of this dissertation is to advance the state of the art in methods for evaluating the performance of seismic-isolated structures and the effects of soil-structure interaction, by developing new data processing methodologies to overcome current limitations and by implementing advanced numerical modeling in OpenSees for direct analysis of soil-structure interaction.
This dissertation presents a novel method for recovering force-displacement relations within the isolators of building structures with unknown nonlinearities from sparse seismic-response measurements of floor accelerations. The method requires only direct matrix calculations (factorizations and multiplications); no iterative trial-and-error methods are required. The method requires a mass matrix, or at least an estimate of the floor masses. A stiffness matrix may be used, but is not necessary. Essentially, the method operates on a matrix of incomplete measurements of floor accelerations. In the special case of complete floor measurements of systems with linear dynamics, real modes, and equal floor masses, the principal components of this matrix are the modal responses. In the more general case of partial measurements and nonlinear dynamics, the method extracts a number of linearly-dependent components from Hankel matrices of measured horizontal response accelerations, assembles these components row-wise, and extracts principal components from the singular value decomposition of this large matrix of linearly-dependent components. These principal components are then interpolated between floors in a way that minimizes the curvature energy of the interpolation. This interpolation step can make use of a reduced-order stiffness matrix, a backward difference matrix or a central difference matrix. The measured and interpolated floor acceleration components at all floors are then assembled and multiplied by a mass matrix. The recovered in-service force-displacement relations are then incorporated into the OpenSees soil-structure interaction model.
Numerical simulations of soil-structure interaction involving non-uniform soil behavior are conducted following the development of the complete soil-structure interaction model of Christchurch Women's Hospital in OpenSees. In these 2D OpenSees models, the superstructure is modeled as two-dimensional frames in the short-span and long-span directions, respectively. The lead rubber bearings are modeled as elastomeric bearing (Bouc-Wen) elements. The soil underlying the concrete raft foundation is modeled with linear elastic plane strain quadrilateral elements. The non-uniformity of the soil profile is incorporated by extracting and interpolating the shear wave velocity profile from the Canterbury Geotechnical Database. The validity of the complete two-dimensional soil-structure interaction OpenSees model of the hospital is checked by comparing the peak floor responses and the force-displacement relations within the isolation system obtained from OpenSees simulations with the recorded measurements. General explanations and implications, supported by displacement drifts, floor acceleration and displacement responses, and force-displacement relations, are described to address the effects of soil-structure interaction.
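The Hankel-matrix and singular value decomposition step described above can be sketched as follows. This is a simplified illustration rather than the dissertation's procedure: it stacks the Hankel matrices of the measured channels directly and omits the intermediate component extraction, the curvature-minimizing interpolation and the mass-matrix multiplication. The function names and the synthetic signals are assumptions.

```python
import numpy as np

def hankel_matrix(signal, n_rows):
    # Row r of the Hankel matrix is the signal delayed by r samples.
    signal = np.asarray(signal, dtype=float)
    n_cols = len(signal) - n_rows + 1
    idx = np.arange(n_rows)[:, None] + np.arange(n_cols)[None, :]
    return signal[idx]

def principal_components(accels, n_rows, n_components):
    # accels: array of shape (n_sensors, n_samples) of measured accelerations.
    # Assemble the per-sensor Hankel matrices row-wise, then take the SVD.
    stacked = np.vstack([hankel_matrix(a, n_rows) for a in accels])
    _, s, vt = np.linalg.svd(stacked, full_matrices=False)
    # Rows of vt are the principal time histories, ordered by singular value.
    return s, vt[:n_components]

# Synthetic check: two sensors see the same harmonic with different amplitudes.
t = np.arange(400) * 0.01
accels = np.vstack([1.0 * np.sin(2 * np.pi * 2.0 * t),
                    0.4 * np.sin(2 * np.pi * 2.0 * t + 0.3)])
s, components = principal_components(accels, n_rows=20, n_components=2)
```

A single sampled harmonic produces a Hankel matrix of rank two, so only the first two singular values in `s` are significant here; with real records the decay of `s` indicates how many components are worth keeping.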
Abstract:
Here, we describe gene expression compositional assignment (GECA), a powerful, yet simple method based on compositional statistics that can validate the transfer of prior knowledge, such as gene lists, into independent data sets, platforms and technologies. Transcriptional profiling has been used to derive gene lists that stratify patients into prognostic molecular subgroups and assess biomarker performance in the pre-clinical setting. Archived public data sets are an invaluable resource for subsequent in silico validation, though their use can lead to data integration issues. We show that GECA can be used without the need for normalising expression levels between data sets and can outperform rank-based correlation methods. To validate GECA, we demonstrate its success in the cross-platform transfer of gene lists in different domains including: bladder cancer staging, tumour site of origin and mislabelled cell lines. We also show its effectiveness in transferring an epithelial ovarian cancer prognostic gene signature across technologies, from a microarray to a next-generation sequencing setting. In a final case study, we predict the tumour site of origin and histopathology of epithelial ovarian cancer cell lines. In particular, we identify and validate the commonly-used cell line OVCAR-5 as non-ovarian, being gastrointestinal in origin. GECA is available as an open-source R package.
Abstract:
The advancement of GPS technology has made it possible to use GPS devices not only as orientation and navigation tools, but also as tools to track spatiotemporal information. GPS tracking data can be broadly applied in location-based services, such as spatial distribution of the economy, transportation routing and planning, traffic management and environmental control. Therefore, knowledge of how to process the data from a standard GPS device is crucial for further use. Previous studies have each addressed particular issues in the data processing. This paper, however, aims to outline a general procedure for processing GPS tracking data. The procedure is illustrated step by step by processing real-world GPS data of car movements in Borlänge, in central Sweden.
Abstract:
With the rapid development of Internet technologies, video and audio processing has become one of the most important areas, driven by constant demand for high-quality media content. As network environments and hardware improve, this demand grows ever stronger: people want high-quality video and audio as well as streaming media resources. FFmpeg is a set of open source programs for A/V decoding. Many commercial players use FFmpeg as their playback core. This paper designs a simple and easy-to-use video player based on FFmpeg. The first part covers the basic theory and background knowledge of video playback, including concepts such as data formats, streaming media data, and video coding and decoding. In short, the video player is realised through a sequence of video decoding steps. The general idea is to fetch the video packets from the Internet, parse and de-encapsulate the transport protocols, de-encapsulate the packaged data to obtain the encoded data, and decode that data into pixel data that can be displayed directly through the graphics card. During coding and decoding, data may be lost to varying degrees, which is known as lossy compression, but this usually does not affect the quality of the user experience. The second part covers the principle of the FFmpeg decoding process, which is one of the key points of the paper. In this project, FFmpeg performs the main decoding task: by calling the main functions and structures of the FFmpeg libraries, packaged video formats are converted to pixel data, and SDL then displays the pixel data. The third part covers the SDL display flow. Similarly, it invokes important display functions from the SDL libraries, although SDL can handle not only display but also many other tasks needed by games.
After that, an independent video player is complete, with all the key functions of a player. The fourth part builds a simple user interface for the player based on MFC, so that the player can be used by most people. Finally, considering the boom of the mobile Internet and the fact that people nowadays can hardly put down their mobile phones, there is a brief introduction to porting the video player to the Android platform, one of the most widely used mobile systems.
Abstract:
Internship report submitted to the Escola Superior de Educação de Paula Frassinetti for the degree of Master in Pre-School Education
Abstract:
Intrusion Detection Systems (IDSs) provide an important layer of security for computer systems and networks, and are becoming more and more necessary as reliance on Internet services increases and systems with sensitive data are more commonly open to Internet access. An IDS’s responsibility is to detect suspicious or unacceptable system and network activity and to alert a systems administrator to this activity. The majority of IDSs use a set of signatures that define what suspicious traffic is, and Snort is one popular and actively developed open-source IDS that uses such a set of signatures, known as Snort rules. Our aim is to identify a way in which Snort could be developed further by generalising rules to identify novel attacks. In particular, we attempted to relax and vary the conditions and parameters of current Snort rules, using an approach similar to classic rule learning operators such as generalisation and specialisation. We demonstrate the effectiveness of our approach through experiments with standard datasets and show that we are able to detect previously undetected variants of various attacks. We conclude by discussing the general effectiveness and appropriateness of generalisation in Snort-based IDS rule processing. Keywords: anomaly detection, intrusion detection, Snort, Snort rules
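One way to picture relaxing a condition of a Snort rule is to widen a field in the rule header. The sketch below is only an illustration of that idea, not the authors' generalisation operators: the function `generalise_dst_port` and the example rule are hypothetical, and the code relies only on the standard Snort header layout (action, protocol, source address, source port, direction, destination address, destination port) followed by the options in parentheses.

```python
def generalise_dst_port(rule):
    # Split the rule into its header and its parenthesised options.
    header, paren, options = rule.partition("(")
    fields = header.split()
    if len(fields) != 7 or not paren:
        raise ValueError("unexpected rule layout")
    # fields: action, protocol, src addr, src port, direction, dst addr, dst port
    fields[6] = "any"  # relax the destination port condition
    return " ".join(fields) + " (" + options

# Hypothetical rule written for this example only.
rule = ('alert tcp $EXTERNAL_NET any -> $HOME_NET 80 '
        '(msg:"example"; content:"/cmd.exe"; sid:1000001; rev:1;)')
relaxed = generalise_dst_port(rule)
```

The relaxed rule matches the same payload on any destination port, so it can catch a variant aimed at a non-standard port, at the price of more potential false positives. In practice a generalised rule would also need its own `sid`, which this sketch leaves unchanged.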
Abstract:
This thesis reports on an investigation of the feasibility and usefulness of incorporating dynamic management facilities for managing sensed context data in a distributed context-aware mobile application. The investigation focuses on reducing the work required to integrate new sensed context streams into an existing context-aware architecture. Current architectures require integration work for new streams and new contexts that are encountered. This means of operation is acceptable for current fixed architectures. However, as systems become more mobile, the number of discoverable streams increases. Without the ability to discover and use these new streams, the functionality of any given device will be limited to the streams that it knows how to decode. The integration of new streams requires that the sensed context data be understood by the current application. If the new source provides data of a type that an application currently requires, then the new source should be connected to the application without any prior knowledge of the new source. If the type is similar and can be converted, then this stream too should be appropriated by the application. Such applications are based on portable devices (phones, PDAs) for semi-autonomous services that use data from sensors connected to the devices, plus data exchanged with other such devices and remote servers. Such applications must handle input from a variety of sensors, refining the data locally and managing its communication from the device in volatile and unpredictable network conditions. The choice to focus on locally connected sensory input allows for the introduction of privacy and access controls. This local control can determine how the information is communicated to others. This investigation focuses on the evaluation of three approaches to sensor data management. The first system is characterised by its static management based on the pre-pended metadata. This was the reference system.
Developed for a mobile system, it processed the data based on the attached metadata. The code that performed the processing was static. The second system was developed to move away from static processing and introduce greater freedom in handling the data stream; this resulted in a heavyweight approach. The approach focused on pushing the processing of the data into a number of networked nodes rather than the monolithic design of the previous system. By creating a separate communication channel for the metadata, it is possible to be more flexible with the amount and type of data transmitted. The final system pulled the benefits of the other systems together. By providing a small management class that would load a separate handler based on the incoming data, dynamism was maximised whilst maintaining ease of code understanding. The three systems were then compared to highlight their ability to dynamically manage new sensed context. The evaluation took two approaches. The first is a quantitative analysis of the code to understand the relative complexity of the three systems. This was done by evaluating what changes to the system were involved for the new context. The second approach takes a qualitative view of the work required by the software engineer to reconfigure the systems to provide support for a new data stream. The evaluation highlights the various scenarios to which each of the three systems is most suited. There is always a trade-off in the development of a system. The three approaches highlight this fact. A statically bound system can be quick to develop but may need to be completely rewritten if the requirements move too far. Alternatively, a highly dynamic system may be able to cope with new requirements, but the developer time needed to create such a system may be greater than that needed to create several simpler systems.
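The third design, a small management class that selects a separate handler based on the incoming data, can be pictured with a minimal sketch. The thesis does not give its code here, so everything below is an assumption: the class name `StreamManager`, the use of an in-memory registry instead of dynamic loading, and the `temperature` stream type are all hypothetical.

```python
from typing import Callable, Dict

class StreamManager:
    """Dispatches each incoming payload to the handler registered for its type."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[bytes], object]] = {}

    def register(self, data_type: str, handler: Callable[[bytes], object]) -> None:
        # Supporting a new stream type means adding one handler; the manager
        # itself does not change.
        self._handlers[data_type] = handler

    def handle(self, data_type: str, payload: bytes) -> object:
        try:
            handler = self._handlers[data_type]
        except KeyError:
            raise LookupError(f"no handler for stream type {data_type!r}") from None
        return handler(payload)

manager = StreamManager()
manager.register("temperature", lambda payload: float(payload.decode("ascii")))
reading = manager.handle("temperature", b"21.5")
```

Keeping the manager small and pushing type-specific logic into handlers is what lets a new sensed context stream be supported without touching existing code, which is the property the comparison of the three systems is concerned with.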
Abstract:
Due to the growth of design size and complexity, design verification is an important aspect of the logic circuit development process. The purpose of verification is to validate that the design meets the system requirements and specification. This is done by either functional or formal verification. The most popular approach to functional verification is the use of simulation-based techniques. Using models to replicate the behaviour of an actual system is called simulation. In this thesis, a software/data structure architecture without explicit locks is proposed to accelerate logic gate circuit simulation. We call this system ZSIM. The ZSIM software architecture simulator targets low-cost SIMD multi-core machines. Its performance is evaluated on the Intel Xeon Phi and two other machines (Intel Xeon and AMD Opteron). The aim of these experiments is to:
• Verify that the data structure used allows SIMD acceleration, particularly on machines with gather instructions (section 5.3.1).
• Verify that, on sufficiently large circuits, substantial gains could be made from multicore parallelism (section 5.3.2).
• Show that a simulator using this approach out-performs an existing commercial simulator on a standard workstation (section 5.3.3).
• Show that the performance on a cheap Xeon Phi card is competitive with results reported elsewhere on much more expensive supercomputers (section 5.3.5).
To evaluate ZSIM, two types of test circuits were used:
1. Circuits from the IWLS benchmark suite [1], which allow direct comparison with other published studies of parallel simulators.
2. Circuits generated by a parametrised circuit synthesizer. The synthesizer used an algorithm that has been shown to generate circuits that are statistically representative of real logic circuits. The synthesizer allowed testing of a range of very large circuits, larger than the ones for which it was possible to obtain open source files.
The experimental results show that with SIMD acceleration and multicore, ZSIM gained a peak parallelisation factor of 300 on Intel Xeon Phi and 11 on Intel Xeon. With only SIMD enabled, ZSIM achieved a maximum parallelisation gain of 10 on Intel Xeon Phi and 4 on Intel Xeon. Furthermore, it was shown that this software architecture simulator running on a SIMD machine is much faster than, and can handle much bigger circuits than, a widely used commercial simulator (Xilinx) running on a workstation. The performance achieved by ZSIM was also compared with similar pre-existing work on logic simulation targeting GPUs and supercomputers. It was shown that the ZSIM simulator running on a Xeon Phi machine gives simulation performance comparable to the IBM Blue Gene supercomputer at very much lower cost. The experimental results have shown that the Xeon Phi is competitive with simulation on GPUs and allows the handling of much larger circuits than have been reported for GPU simulation. When targeting the Xeon Phi architecture, the automatic cache management of the Xeon Phi handles and manages the on-chip local store without any explicit mention of the local store being made in the architecture of the simulator itself. However, when targeting GPUs, explicit cache management in the program increases the complexity of the software architecture. Furthermore, one of the strongest points of the ZSIM simulator is its portability. Note that the same code was tested on both AMD and Xeon Phi machines. The same architecture that performs efficiently on Xeon Phi was ported to a 64-core NUMA AMD Opteron. To conclude, the two main achievements are restated as follows: the primary achievement of this work was proving that the ZSIM architecture was faster than previously published logic simulators on low-cost platforms.
The secondary achievement was the development of a synthetic testing suite that went beyond the scale range that was previously publicly available, based on prior work that showed the synthesis technique is valid.
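The role of gather operations in data-parallel gate simulation can be shown with a small NumPy sketch. It is not the ZSIM data structure, which the abstract does not describe; it assumes a levelised netlist stored as flat arrays of opcodes and input indices, and all names (`simulate`, `gate_op`, `gate_in0`, `gate_in1`, `levels`) are hypothetical.

```python
import numpy as np

AND, OR, XOR, NAND = 0, 1, 2, 3

def simulate(input_values, gate_op, gate_in0, gate_in1, levels):
    # Signal i < n_inputs is a primary input; signal n_inputs + g is gate g.
    n_inputs = len(input_values)
    values = np.zeros(n_inputs + len(gate_op), dtype=np.uint8)
    values[:n_inputs] = input_values
    # Gates within one level are independent, so each level is evaluated
    # as a single vector operation.
    for start, end in levels:
        a = values[gate_in0[start:end]]  # gather first operands
        b = values[gate_in1[start:end]]  # gather second operands
        candidates = [a & b, a | b, a ^ b, (a & b) ^ 1]
        # Pick, per gate, the result that matches its opcode.
        values[n_inputs + start:n_inputs + end] = np.choose(
            gate_op[start:end], candidates)
    return values

# Half adder (XOR, AND) feeding a NAND gate on a second level.
gate_op = np.array([XOR, AND, NAND])
gate_in0 = np.array([0, 0, 2])
gate_in1 = np.array([1, 1, 3])
levels = [(0, 2), (2, 3)]
result = simulate([1, 1], gate_op, gate_in0, gate_in1, levels)
```

The indexed reads `values[gate_in0[...]]` are the software analogue of hardware gather instructions: operands for many gates are fetched from scattered positions in one step, after which the gate functions apply lane by lane.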
Abstract:
Natural language processing has achieved great success in a wide range of applications, producing both commercial language services and open-source language tools. However, most methods take a static or batch approach, assuming that the model has all the information it needs and makes a one-time prediction. In this dissertation, we study dynamic problems where the input comes in a sequence instead of all at once, and the output must be produced while the input is arriving. In these problems, predictions are often made based only on partial information. We see this dynamic setting in many real-time, interactive applications. These problems usually involve a trade-off between the amount of input received (cost) and the quality of the output prediction (accuracy). Therefore, the evaluation considers both objectives (e.g., plotting a Pareto curve). Our goal is to develop a formal understanding of sequential prediction and decision-making problems in natural language processing and to propose efficient solutions. Toward this end, we present meta-algorithms that take an existing batch model and produce a dynamic model to handle sequential inputs and outputs. We build our framework upon the theory of Markov Decision Processes (MDPs), which allows learning to trade off competing objectives in a principled way. The main machine learning techniques we use are from imitation learning and reinforcement learning, and we advance current techniques to tackle problems arising in our settings. We evaluate our algorithm on a variety of applications, including dependency parsing, machine translation, and question answering. We show that our approach achieves a better cost-accuracy trade-off than the batch approach and heuristic-based decision-making approaches. We first propose a general framework for cost-sensitive prediction, where different parts of the input come at different costs.
We formulate a decision-making process that selects pieces of the input sequentially, and the selection is adaptive to each instance. Our approach is evaluated on both standard classification tasks and a structured prediction task (dependency parsing). We show that it achieves prediction quality similar to methods that use all the input, while incurring a much smaller cost. Next, we extend the framework to problems where the input is revealed incrementally in a fixed order. We study two applications: simultaneous machine translation and quiz bowl (incremental text classification). We discuss challenges in this setting and show that adding domain knowledge eases the decision-making problem. A central theme throughout the chapters is an MDP formulation of a challenging problem with sequential input/output and trade-off decisions, accompanied by a learning algorithm that solves the MDP.
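The two-objective evaluation mentioned above (cost against accuracy, summarised by a Pareto curve) can be made concrete with a short sketch. The function `pareto_front` and the numbers are illustrative assumptions, not results from the dissertation: each point is a (cost, accuracy) pair for one hypothetical policy, and a point is kept only if no other point is at least as cheap and more accurate.

```python
def pareto_front(points):
    """Return the non-dominated (cost, accuracy) pairs, sorted by cost."""
    front = []
    best_accuracy = float("-inf")
    # Visit points from cheapest to most expensive; among equal costs,
    # visit the most accurate first.
    for cost, accuracy in sorted(points, key=lambda p: (p[0], -p[1])):
        # A point survives only if it beats every cheaper point on accuracy.
        if accuracy > best_accuracy:
            front.append((cost, accuracy))
            best_accuracy = accuracy
    return front

# Made-up operating points for illustration only.
policies = [(1.0, 0.60), (2.0, 0.55), (2.0, 0.70), (3.0, 0.70), (4.0, 0.90)]
front = pareto_front(policies)
```

Plotting `front` gives the Pareto curve: one method has a better cost-accuracy trade-off than another when its curve lies above and to the left.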
Abstract:
The Arctic continental margin contains large amounts of methane in the form of methane hydrates. The west Svalbard continental slope is an area where active methane seeps have been reported near the landward limit of the hydrate stability zone. The presence of bottom simulating reflectors (BSR) on seismic reflection data in water depths greater than 600 m suggests the presence of free gas beneath gas hydrates in the area. Resistivity obtained from marine controlled source electromagnetic (CSEM) data provides a useful complement to seismic methods for detecting shallow hydrate and gas, because both are more resistive than the surrounding water-saturated sediments. We acquired two CSEM lines on the west Svalbard continental slope, extending from the edge of the continental shelf (250 m water depth) to water depths of around 800 m. High resistivities (5-12 Ωm) observed above the BSR support the presence of gas hydrate in water depths greater than 600 m. High resistivities (3-4 Ωm) at 390-600 m water depth also suggest possible hydrate occurrence within the gas hydrate stability zone (GHSZ) of the continental slope. In addition, high resistivities (4-8 Ωm) landward of the GHSZ coincide with high-amplitude reflectors and low velocities reported in seismic data that indicate the likely presence of free gas. Pore space saturation estimates using a connectivity equation suggest 20-50% hydrate within the lower slope sediments and less than 12% within the upper slope sediments. A free gas zone beneath the GHSZ (10-20% gas saturation) is connected to the area of high free gas saturation (10-45%) at the edge of the continental shelf, where most of the seeps are observed. This evidence supports the presence of lateral free gas migration beneath the GHSZ towards the continental shelf.
Abstract:
Master's dissertation, Electronic Engineering and Telecommunications, Faculdade de Ciências e Tecnologia, Universidade do Algarve, 2011
Abstract:
Master's dissertation, Language Sciences, Faculdade de Ciências Humanas e Sociais, Universidade do Algarve, 2016