97 results for Data Storage Solutions
Abstract:
Distributed systems are widely used for solving large-scale and data-intensive computing problems, including all-to-all comparison (ATAC) problems. However, when used for ATAC problems, existing computational frameworks such as Hadoop focus on load balancing for allocating comparison tasks, without careful consideration of data distribution and storage usage. While Hadoop-based solutions provide users with simplicity of implementation, their inherent MapReduce computing pattern does not match the ATAC pattern. This leads to load imbalances and poor data locality when Hadoop's data distribution strategy is used for ATAC problems. Here we present a data distribution strategy which considers data locality, load balancing and storage savings for ATAC computing problems in homogeneous distributed systems. A simulated annealing algorithm is developed for data distribution and task scheduling. Experimental results show a significant performance improvement for our approach over Hadoop-based solutions.
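The abstract gives no details of the annealing formulation, so the following is only a minimal sketch of how simulated annealing could place all-to-all comparison tasks on nodes. The cost function (busiest node's task count plus the largest per-node file count), the linear cooling schedule, and all names are assumptions made for illustration, not the authors' method.

```python
import itertools
import math
import random


def cost(assignment, n_nodes):
    """Hypothetical cost: max comparisons on a node + max files stored on a node."""
    load = [0] * n_nodes
    stored = [set() for _ in range(n_nodes)]
    for (a, b), node in assignment.items():
        load[node] += 1
        # Running a comparison locally requires both files on that node.
        stored[node].update((a, b))
    return max(load) + max(len(s) for s in stored)


def anneal(n_files, n_nodes, steps=5000, t0=2.0, seed=0):
    """Assign every file pair to a node, lowering the cost by simulated annealing."""
    rng = random.Random(seed)
    pairs = list(itertools.combinations(range(n_files), 2))
    current = {p: rng.randrange(n_nodes) for p in pairs}
    current_cost = cost(current, n_nodes)
    best, best_cost = dict(current), current_cost
    for step in range(steps):
        # Linear cooling; the small offset avoids division by zero.
        temp = t0 * (1 - step / steps) + 1e-9
        pair = rng.choice(pairs)
        old_node = current[pair]
        current[pair] = rng.randrange(n_nodes)
        new_cost = cost(current, n_nodes)
        delta = new_cost - current_cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current_cost = new_cost
            if new_cost < best_cost:
                best, best_cost = dict(current), new_cost
        else:
            current[pair] = old_node  # reject the move
    return best, best_cost


if __name__ == "__main__":
    plan, plan_cost = anneal(n_files=8, n_nodes=3)
    print(plan_cost)
```

Because each comparison forces both of its files onto the chosen node, a single cost term can penalise storage use and load imbalance together; a real system would likely weight these terms separately.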
Abstract:
The recent trend for journals to require open access to primary data included in publications has been embraced by many biologists, but has caused apprehension amongst researchers engaged in long-term ecological and evolutionary studies. A worldwide survey of 73 principal investigators (PIs) with long-term studies revealed positive attitudes towards sharing data with the agreement or involvement of the PI, and 93% of PIs have historically shared data. Only 8% were in favor of uncontrolled, open access to primary data while 63% expressed serious concern. We present here their viewpoint on an issue that can have non-trivial scientific consequences. We discuss potential costs of public data archiving and provide possible solutions to meet the needs of journals and researchers.
Abstract:
In our recent paper [1], we discussed some potential undesirable consequences of public data archiving (PDA) with specific reference to long-term studies and proposed solutions to manage these issues. We reaffirm our commitment to data sharing and collaboration, both of which have been common and fruitful practices supported for many decades by researchers involved in long-term studies. We acknowledge the potential benefits of PDA (e.g., [2]), but believe that several potential negative consequences for science have been underestimated [1] (see also [3] and [4]). The objective of our recent paper [1] was to define practices to simultaneously maximize the benefits and minimize the potential unwanted consequences of PDA.
Abstract:
“SOH see significant benefit in digitising its drawings and operation and maintenance manuals. Since SOH do not currently have digital models of the Opera House structure or other components, there is an opportunity for this national case study to promote the application of Digital Facility Modelling using standardized Building Information Models (BIM)”. The digital modelling element of this project examined the potential of building information models for Facility Management focusing on the following areas:
• The re-usability of building information for FM purposes
• BIM as an integrated information model for facility management
• Extendibility of the BIM to cope with business-specific requirements
• Commercial facility management software using standardised building information models
• The ability to add (organisation-specific) intelligence to the model
• A roadmap for SOH to adopt BIM for FM
The project has established that BIM (building information modelling) is an appropriate and potentially beneficial technology for the storage of integrated building, maintenance and management data for SOH. Based on the attributes of a BIM, several advantages can be envisioned: consistency in the data, intelligence in the model, multiple representations, source of information for intelligent programs and intelligent queries. The IFC (open building exchange standard) specification provides comprehensive support for asset and facility management functions, and offers new management, collaboration and procurement relationships based on sharing of intelligent building data.
The major advantages of using an open standard are: information can be read and manipulated by any compliant software, reduced user “lock in” to proprietary solutions, third party software can be the “best of breed” to suit the process and scope at hand, standardised BIM solutions consider the wider implications of information exchange outside the scope of any particular vendor, information can be stored as ASCII files for archival purposes, and data quality can be enhanced as the now single source of users’ information has improved accuracy, correctness, currency, completeness and relevance. SOH’s current building standards have been successfully drafted for a BIM environment and are confidently expected to be fully developed when BIM is adopted operationally by SOH. There have been remarkably few technical difficulties in converting the House’s existing conventions and standards to the new model-based environment. This demonstrates that the IFC model represents world practice for building data representation and management (see Sydney Opera House – FM Exemplar Project Report Number 2005-001-C-3, Open Specification for BIM: Sydney Opera House Case Study). Availability of FM applications based on BIM is in its infancy, but focussed systems are already in operation internationally and show excellent prospects for implementation at SOH. In addition to the generic benefits of standardised BIM described above, the following FM-specific advantages can be expected from this new integrated facilities management environment: faster and more effective processes, controlled whole life costs and environmental data, better customer service, common operational picture for current and strategic planning, visual decision-making and a total ownership cost model.
Tests with partial BIM data, provided by several of SOH’s current consultants, show that the creation of a complete SOH model is realistic, but subject to resolution of compliance and detailed functional support by participating software applications. The showcase has demonstrated successfully that IFC-based exchange is possible with several common BIM-based applications through the creation of a new partial model of the building. Data exchanged has been geometrically accurate (the SOH building structure represents some of the most complex building elements) and supports rich information describing the types of objects, with their properties and relationships.
Abstract:
There is evidence that many heating, ventilating & air conditioning (HVAC) systems, installed in larger buildings, have more capacity than is ever required to keep the occupants comfortable. This paper explores the reasons why this can occur, by examining a typical brief/design/documentation process. Over-sized HVAC systems cost more to install and operate and may not be able to control thermal comfort as well as a “right-sized” system. These impacts are evaluated, where data exists. Finally, some suggestions are developed to minimise both the extent of, and the negative impacts of, HVAC system over-sizing, for example:
• Challenge “rules of thumb” and/or brief requirements which may be out of date.
• Conduct an accurate load estimate, using AIRAH design data, specific to project location, and then resist the temptation to apply “safety factors”.
• Use a load estimation program that accounts for thermal storage and diversification of peak loads for each zone and air handling system.
• Select chiller sizes and staged or variable speed pumps and fans to ensure good part load performance.
• Allow for unknown future tenancies by designing flexibility into the system, not by over-sizing. For example, generous sizing of distribution pipework and ductwork will allow available capacity to be redistributed.
• Provide an auxiliary tenant condenser water loop to handle high load areas.
• Consider using an Integrated Design Process, build an integrated load and energy use simulation model and test different operational scenarios.
• Use comprehensive Life Cycle Cost analysis for selection of the optimal design solutions.
This paper is an interim report on the findings of CRC-CI project 2002-051-B, Right-Sizing HVAC Systems, which is due for completion in January 2006.
Abstract:
The digital modelling research stream of the Sydney Opera House FM Exemplar Project has demonstrated significant benefits in digitising design documentation and operational and maintenance manuals. Since Sydney Opera House did not have digital models of its structure, there was an opportunity to investigate the application of digital modelling using standardised Building Information Models (BIM) to support facilities management (FM). The focus of this investigation was on the following areas:
• the re-usability of standardised BIM for FM purposes
• the potential of BIM as an information framework acting as integrator for various FM data sources
• the extendibility and flexibility of the BIM to cope with business-specific data and requirements
• commercial FM software using standardised BIM
• the ability to add (organisation-specific) intelligence to the model
• a roadmap for Sydney Opera House to adopt BIM for FM.
Abstract:
High-speed videokeratoscopy is an emerging technique that enables study of the corneal surface and tear-film dynamics. Unlike its static predecessor, this new technique results in a very large amount of digital data for which storage needs become significant. We aimed to design a compression technique that would use mathematical functions to parsimoniously fit corneal surface data with a minimum number of coefficients. Since the Zernike polynomial functions that have been traditionally used for modeling corneal surfaces may not necessarily correctly represent given corneal surface data in terms of its optical performance, we introduced the concept of Zernike polynomial-based rational functions. Modeling optimality criteria were employed in terms of both the rms surface error as well as the point spread function cross-correlation. The parameters of approximations were estimated using a nonlinear least-squares procedure based on the Levenberg-Marquardt algorithm. A large number of retrospective videokeratoscopic measurements were used to evaluate the performance of the proposed rational-function-based modeling approach. The results indicate that the rational functions almost always outperform the traditional Zernike polynomial approximations with the same number of coefficients.
Abstract:
Integrity of Real Time Kinematic (RTK) positioning solutions relates to the confidence level that can be placed in the information provided by the RTK system. It includes the ability of the RTK system to provide timely valid warnings to users when the system must not be used for the intended operation. For instance, in a controlled traffic farming (CTF) system, which separates wheel beds from root beds, RTK positioning error causes overlap and increases the amount of soil compaction. The RTK system’s integrity capacity can inform users when the actual positional errors of the RTK solutions have exceeded Horizontal Protection Levels (HPL) within a certain Time-To-Alert (TTA) at a given Integrity Risk (IR). The latter is defined as the probability that the system claims its normal operational status while actually being in an abnormal status, e.g., the ambiguities being incorrectly fixed and positional errors having exceeded the HPL. The paper studies the required positioning performance (RPP) of GPS positioning systems for precision agriculture (PA) applications such as a CTF system, according to a literature review and a survey conducted among a number of farming companies. The HPL and IR are derived from these RPP parameters. An RTK-specific rover autonomous integrity monitoring (RAIM) algorithm is developed to determine the system integrity according to real-time outputs, such as the residual square sum (RSS) and HDOP values. A two-station baseline data set is analyzed to demonstrate the concept of RTK integrity and assess the RTK solution continuity, missed detection probability and false alarm probability.
Abstract:
This paper studies receiver autonomous integrity monitoring (RAIM) algorithms and the performance benefits of RTK solutions with multiple constellations. The proposed method is generally known as Multi-constellation RAIM (McRAIM). The McRAIM algorithms take advantage of the ambiguity invariant character to assist fast identification of multiple satellite faults in the context of multiple constellations, and then detect faulty satellites in the follow-up ambiguity search and position estimation processes. The concept of Virtual Galileo Constellation (VGC) is used to generate useful dual-constellation data sets for performance analysis. Experimental results from a 24-h data set demonstrate that with GPS&VGC constellations, McRAIM can significantly enhance the detection and exclusion probabilities of two simultaneous faulty satellites in RTK solutions.
Abstract:
In this paper, the problems of three carrier phase ambiguity resolution (TCAR) and position estimation (PE) are generalized as real-time GNSS data processing problems for a continuously observing network on a large scale. In order to describe these problems, a general linear equation system is presented to unify various geometry-free, geometry-based and geometry-constrained TCAR models, along with state transition equations between observation times. With this general formulation, generalized TCAR solutions are given to cover different real-time GNSS data processing scenarios, and various simplified integer solutions, such as geometry-free rounding and geometry-based LAMBDA solutions with single and multiple-epoch measurements. In fact, various ambiguity resolution (AR) solutions differ in the floating ambiguity estimation and integer ambiguity search processes, but they remain theoretically equivalent under the same observational system models and statistical assumptions. TCAR performance benefits outlined by data analyses in the recent literature are reviewed, showing profound implications for future GNSS development from both technology and application perspectives.
Abstract:
An investigation has been made of the interactions between silicone oil and various solid substrates immersed in aqueous solutions. Measurements were made with an atomic force microscope (AFM) using the colloid-probe method. The silicone oil drop is simulated by coating a small silica sphere with the oil, and measuring the force as this coated sphere is brought close to contact with a flat solid surface. It is found that the silicone oil surface is negatively charged, which causes a double-layer repulsion between the oil drop and another negatively charged surface such as mica. With hydrophilic solids, this repulsion is strong enough to prevent attachment of the drop to the solid. However, with hydrophobic surfaces there is an additional attractive force which overcomes the double-layer repulsion, and the silicone oil drop attaches to the solid. A "ramp" force appears in some, but not all, of the data sets. There is circumstantial evidence that this force results from compression of the silicone oil film coated on the glass sphere.
Abstract:
Real‐time kinematic (RTK) GPS techniques have been extensively developed for applications including surveying, structural monitoring, and machine automation. Limitations of the existing RTK techniques that hinder their application for geodynamics purposes are twofold: (1) the achievable RTK accuracy is on the level of a few centimeters and the uncertainty of the vertical component is 1.5–2 times worse than those of the horizontal components and (2) the RTK position uncertainty grows in proportion to the base‐to‐rover distance. The key limiting factor behind the problems is the significant effect of residual tropospheric errors on the positioning solutions, especially on the highly correlated height component. This paper develops the geometry‐specified troposphere decorrelation strategy to achieve subcentimeter kinematic positioning accuracy in all three components. The key is to set up a relative zenith tropospheric delay (RZTD) parameter to absorb the residual tropospheric effects and to solve the established model as an ill‐posed problem using the regularization method. In order to compute a reasonable regularization parameter to obtain an optimal regularized solution, the covariance matrix of positional parameters estimated without the RZTD parameter, which is characterized by observation geometry, is used to replace the quadratic matrix of their “true” values. As a result, the regularization parameter is adaptively computed with variation of observation geometry. The experimental results show that the new method can efficiently alleviate the model’s ill‐conditioning and stabilize the solution from a single data epoch. Compared to the results from the conventional least squares method, the new method can improve the long‐range RTK solution precision from several centimeters to the subcentimeter level in all components. More significantly, the precision of the height component is even higher.
Several geosciences applications that require subcentimeter real‐time solutions can largely benefit from the proposed approach, such as monitoring of earthquakes and large dams in real‐time, high‐precision GPS leveling and refinement of the vertical datum. In addition, the high‐resolution RZTD solutions can contribute to effective recovery of tropospheric slant path delays in order to establish a 4‐D troposphere tomography.
Abstract:
The aim of this paper is to demonstrate the validity of using Gaussian mixture models (GMM) for representing probabilistic distributions in a decentralised data fusion (DDF) framework. GMMs are a powerful and compact stochastic representation allowing efficient communication of feature properties in large scale decentralised sensor networks. It will be shown that GMMs provide a basis for analytical solutions to the update and prediction operations for general Bayesian filtering. Furthermore, a variant on the Covariance Intersect algorithm for Gaussian mixtures will be presented ensuring a conservative update for the fusion of correlated information between two nodes in the network. In addition, purely visual sensory data will be used to show that decentralised data fusion and tracking of non-Gaussian states observed by multiple autonomous vehicles is feasible.
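The abstract does not describe its Gaussian-mixture variant, so the sketch below shows only the standard single-Gaussian Covariance Intersection rule that such a variant would build on: the fused information matrix is a convex combination, P⁻¹ = ωA⁻¹ + (1 − ω)B⁻¹. The 2×2 restriction, the grid search over ω, and the trace criterion are assumptions chosen to keep the example dependency-free.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def covariance_intersection(xa, pa, xb, pb, omega):
    """Fuse two estimates with unknown cross-correlation for a fixed weight omega."""
    ia, ib = inv2(pa), inv2(pb)
    info = [[omega * ia[i][j] + (1 - omega) * ib[i][j] for j in range(2)]
            for i in range(2)]
    p = inv2(info)
    ya, yb = matvec(ia, xa), matvec(ib, xb)
    x = matvec(p, [omega * ya[i] + (1 - omega) * yb[i] for i in range(2)])
    return x, p


def best_omega(pa, pb, grid=101):
    """Pick omega on a uniform grid that minimises the trace of the fused covariance."""
    def trace_at(w):
        _, p = covariance_intersection([0.0, 0.0], pa, [0.0, 0.0], pb, w)
        return p[0][0] + p[1][1]
    return min((i / (grid - 1) for i in range(grid)), key=trace_at)


if __name__ == "__main__":
    pa = [[1.0, 0.0], [0.0, 4.0]]  # node A is confident along the first axis
    pb = [[4.0, 0.0], [0.0, 1.0]]  # node B is confident along the second axis
    w = best_omega(pa, pb)
    print(w, covariance_intersection([1.0, 0.0], pa, [0.0, 1.0], pb, w))
```

The fused covariance is never smaller than what the unknown correlation could justify, which is why the update is called conservative.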
Abstract:
The success rate of carrier phase ambiguity resolution (AR) is the probability that the ambiguities are successfully fixed to their correct integer values. In existing works, an exact success rate formula for the integer bootstrapping estimator has been used as a sharp lower bound for the integer least squares (ILS) success rate. Rigorous computation of the success rate for the more general ILS solutions has been considered difficult, because of the complexity of the ILS ambiguity pull-in region and the computational load of integrating the multivariate probability density function. Contributions of this work are twofold. First, the pull-in region, mathematically expressed as the vertices of a polyhedron, is represented by a multi-dimensional grid, at which the cumulative probability can be integrated with the multivariate normal cumulative distribution function (mvncdf) available in Matlab. The bivariate case is studied, where the pull-in region is usually defined as a hexagon and the probability is easily obtained using mvncdf at all the grid points within the convex polygon. Second, the paper compares the computed integer rounding and integer bootstrapping success rates, and the lower and upper bounds of the ILS success rates, to the actual ILS AR success rates obtained from a 24 h GPS data set for a 21 km baseline. The results demonstrate that the upper bound of the ILS AR probability given in the existing literature agrees well with the actual ILS success rate, although the success rate computed with the integer bootstrapping method is also a quite sharp approximation to the actual ILS success rate. The results also show that variations or uncertainty of the unit-weight variance estimates from epoch to epoch significantly affect the success rates computed by the different methods, and thus deserve more attention in order to obtain useful success probability predictions.
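For reference, the exact bootstrapping success rate mentioned above is commonly written as the product over ambiguities of 2Φ(1/(2σ_i)) − 1, where σ_i is the standard deviation of ambiguity i conditioned on the previously fixed ones and Φ is the standard normal CDF. The sketch below evaluates it for a small covariance matrix; the conditioning order (first to last), the function names, and the example matrices are assumptions for illustration and are not taken from the paper.

```python
import math


def phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def conditional_variances(q):
    """Variances of each ambiguity conditioned on all earlier ones.

    These are the pivots of Gaussian elimination on the covariance matrix,
    i.e. the diagonal D of the factorisation Q = L D L^T.
    """
    n = len(q)
    a = [row[:] for row in q]  # work on a copy
    d = []
    for k in range(n):
        d.append(a[k][k])
        for i in range(k + 1, n):
            f = a[i][k] / a[k][k]
            for j in range(k + 1, n):
                a[i][j] -= f * a[k][j]
    return d


def bootstrap_success_rate(q):
    """Product of 2*Phi(1/(2*sigma_i)) - 1 over the conditional std devs."""
    p = 1.0
    for var in conditional_variances(q):
        p *= 2.0 * phi(1.0 / (2.0 * math.sqrt(var))) - 1.0
    return p


if __name__ == "__main__":
    # Hypothetical float-ambiguity covariance matrices, in cycles squared.
    print(bootstrap_success_rate([[0.04, 0.0], [0.0, 0.04]]))
    print(bootstrap_success_rate([[0.09, 0.06], [0.06, 0.09]]))
```

Because the result depends on the conditioning order, bootstrapping is usually applied after decorrelating the ambiguities; that step is omitted here.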
Abstract:
A Wireless Sensor Network (WSN) is a set of sensors that are integrated with a physical environment. These sensors are small in size, and capable of sensing physical phenomena and processing them. They communicate in a multihop manner, due to a short radio range, to form an ad hoc network capable of reporting network activities to a data collection sink. Recent advances in WSNs have led to several new promising applications, including habitat monitoring, military target tracking, natural disaster relief, and health monitoring. The current version of sensor node, such as MICA2, uses a 16-bit, 8 MHz Texas Instruments MSP430 micro-controller with only 10 KB RAM, 128 KB program space, 512 KB external flash memory to store measurement data, and is powered by two AA batteries. Due to these unique specifications and a lack of tamper-resistant hardware, devising security protocols for WSNs is complex. Previous studies show that data transmission consumes much more energy than computation. Data aggregation can greatly help to reduce this consumption by eliminating redundant data. However, aggregators are under the threat of various types of attacks. Among them, node compromise is usually considered one of the most challenging for the security of WSNs. In a node compromise attack, an adversary physically tampers with a node in order to extract the cryptographic secrets. This attack can be very harmful depending on the security architecture of the network. For example, when an aggregator node is compromised, it is easy for the adversary to change the aggregation result and inject false data into the WSN. The contributions of this thesis to the area of secure data aggregation are manifold. We firstly define the security for data aggregation in WSNs. In contrast with existing secure data aggregation definitions, the proposed definition covers the unique characteristics that WSNs have.
Secondly, we analyze the relationship between security services and adversarial models considered in existing secure data aggregation in order to provide a general framework of required security services. Thirdly, we analyze existing cryptographic-based and reputation-based secure data aggregation schemes. This analysis covers security services provided by these schemes and their robustness against attacks. Fourthly, we propose a robust reputation-based secure data aggregation scheme for WSNs. This scheme minimizes the use of heavy cryptographic mechanisms. The security advantages provided by this scheme are realized by integrating aggregation functionalities with: (i) a reputation system, (ii) an estimation theory, and (iii) a change detection mechanism. We have shown that this addition helps defend against most of the security attacks discussed in this thesis, including the On-Off attack. Finally, we propose a secure key management scheme in order to distribute essential pairwise and group keys among the sensor nodes. The design idea of the proposed scheme is to combine Lamport's reverse hash chain with the usual hash chain to provide both past and future key secrecy. The proposal avoids delivering the whole value of a new group key for a group key update; instead, only half of the value is transmitted from the network manager to the sensor nodes. This way, the compromise of a pairwise key alone does not lead to the compromise of the group key. The new pairwise key in our scheme is determined by Diffie-Hellman based key agreement.
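The abstract does not spell out how the two hash chains are combined, so the following is only a rough sketch of the general idea. A forward chain lets a holder derive later values but not earlier ones, while a Lamport-style reverse chain is released from its far end, so a holder can derive earlier-released values but not later ones. Deriving each epoch key from one element of each chain is an assumed construction, as are the use of SHA-256 and all function names.

```python
import hashlib


def h(data: bytes) -> bytes:
    """One-way function used for both chains (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


def forward_chain(seed: bytes, n: int) -> list:
    """Usual hash chain: element i+1 is the hash of element i."""
    chain = [seed]
    for _ in range(n - 1):
        chain.append(h(chain[-1]))
    return chain


def reverse_chain(seed: bytes, n: int) -> list:
    """Lamport-style chain indexed by release order: element i is the hash of element i+1."""
    return forward_chain(seed, n)[::-1]


def epoch_key(fwd: list, rev: list, i: int) -> bytes:
    """Hypothetical group key for epoch i, mixing one element from each chain."""
    return h(fwd[i] + rev[i])


if __name__ == "__main__":
    epochs = 5
    fwd = forward_chain(b"forward-seed", epochs)
    rev = reverse_chain(b"reverse-seed", epochs)
    for i in range(epochs):
        print(i, epoch_key(fwd, rev, i).hex()[:16])
```

In this sketch a node that holds `fwd[i]` can compute `fwd[i+1]` locally, so a key update only needs to deliver `rev[i+1]`; this is one plausible reading of transmitting only half of the key material, not a statement of the thesis's actual protocol.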