175 results for Data compression (Computer science)
Abstract:
Due to the availability of a huge number of Web services, finding an appropriate Web service according to the requirements of a service consumer is still a challenge. Moreover, sometimes a single Web service is unable to fully satisfy the requirements of the service consumer. In such cases, combinations of multiple inter-related Web services can be utilised. This paper proposes a method that first utilises a semantic kernel model to find related services and then models these related Web services as nodes of a graph. An all-pairs shortest-path algorithm is applied to find the best compositions of Web services that are semantically related to the service consumer's requirement. Recommendations of individual Web services and composite Web service compositions are finally made for a service request. Empirical evaluation confirms that the proposed method significantly improves the accuracy of service discovery in comparison to traditional keyword-based discovery methods.
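A minimal sketch of the graph step described above, assuming services are nodes and edge weights encode semantic "distance" (lower means more related); the service names, weights and the Floyd-Warshall routine are illustrative, not the paper's implementation:

```python
# Illustrative sketch: model related Web services as a weighted graph and run an
# all-pairs shortest-path pass (Floyd-Warshall) to suggest service compositions.
from itertools import product

def floyd_warshall(nodes, edges):
    """Return distance and next-hop tables for all node pairs."""
    INF = float("inf")
    dist = {(u, v): (0 if u == v else INF) for u, v in product(nodes, repeat=2)}
    nxt = {(u, v): None for u, v in product(nodes, repeat=2)}
    for (u, v), w in edges.items():          # w: semantic "distance" (lower = more related)
        dist[(u, v)] = w
        nxt[(u, v)] = v
    for k, i, j in product(nodes, repeat=3):
        if dist[(i, k)] + dist[(k, j)] < dist[(i, j)]:
            dist[(i, j)] = dist[(i, k)] + dist[(k, j)]
            nxt[(i, j)] = nxt[(i, k)]
    return dist, nxt

def compose(nxt, src, dst):
    """Reconstruct the chain of services linking src to dst."""
    if nxt[(src, dst)] is None:
        return []
    path = [src]
    while src != dst:
        src = nxt[(src, dst)]
        path.append(src)
    return path

# Hypothetical services related to a request, weighted by (1 - similarity).
nodes = ["GeocodeSvc", "WeatherSvc", "AlertSvc"]
edges = {("GeocodeSvc", "WeatherSvc"): 0.2, ("WeatherSvc", "AlertSvc"): 0.3}
dist, nxt = floyd_warshall(nodes, edges)
print(compose(nxt, "GeocodeSvc", "AlertSvc"))  # ['GeocodeSvc', 'WeatherSvc', 'AlertSvc']
```

The next-hop table lets the same all-pairs pass return both the closest single service and the chain of services forming a composition.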
Abstract:
The security of permutation-based hash functions in the ideal permutation model has been studied when the input length of the compression function is larger than the input length of the permutation. In this paper, we consider permutation-based compression functions whose input lengths are shorter than that of the permutation. Under this assumption, we propose a permutation-based compression function and prove its security with respect to collision and (second) preimage attacks in the ideal permutation model. The proposed compression function can be seen as a generalization of the compression function of the MD6 hash function.
Abstract:
In this paper, we analyze the SHAvite-3-512 hash function, as proposed and tweaked for round 2 of the SHA-3 competition. We present cryptanalytic results on 10 out of 14 rounds of the hash function SHAvite-3-512, and on the full 14-round compression function of SHAvite-3-512. We show a second preimage attack on the hash function reduced to 10 rounds with a complexity of 2^497 compression function evaluations and 2^16 memory. For the full 14-round compression function, we give a chosen counter, chosen salt preimage attack with 2^384 compression function evaluations and 2^128 memory (or complexity 2^448 without memory), and a collision attack with 2^192 compression function evaluations and 2^128 memory.
Abstract:
Repeatable and accurate seagrass mapping is required for understanding seagrass ecology and supporting management decisions. For shallow (< 5 m) seagrass habitats, these maps can be created by integrating high spatial resolution imagery with field survey data. Field survey data for seagrass is often collected via snorkelling or diving. However, these methods are limited by environmental and safety considerations. Autonomous Underwater Vehicles (AUVs) are used increasingly to collect field data for habitat mapping, albeit mostly in deeper waters (> 20 m). Here we demonstrate and evaluate the use and potential advantages of AUV field data collection for calibration and validation of seagrass habitat mapping of shallow waters (< 5 m) from multispectral satellite imagery. The study was conducted in the seagrass habitats of the Eastern Banks (142 km²), Moreton Bay, Australia. In the field, georeferenced photos of the seagrass were collected along transects via snorkelling or an AUV. Photos from both collection methods were analysed manually for seagrass species composition and then used as calibration and validation data to map seagrass using an established semi-automated object-based mapping routine. A comparison of the relative advantages and disadvantages of the AUV and snorkeller collected field data sets and their influence on the mapping routine was conducted. AUV data collection was more consistent, repeatable and safer in comparison to snorkeller transects. Inclusion of deeper water AUV data resulted in mapping of a larger extent of seagrass (~7 km², 5% of the study area) in the deeper waters of the site. Although overall map accuracies did not differ considerably, inclusion of the AUV data from deeper water transects corrected errors in seagrass mapped at depths beyond 5 m where the bottom is still visible on satellite imagery. Our results demonstrate that further development of AUV technology is justified for the monitoring of seagrass habitats in ongoing management programs.
Abstract:
In this paper we present research adapting a state-of-the-art condition-invariant robotic place recognition algorithm to the role of automated inter- and intra-image alignment of sensor observations of environmental and skin change over time. The approach involves inverting the typical criteria placed upon navigation algorithms in robotics; we exploit, rather than attempt to fix, the limited camera viewpoint invariance of such algorithms, showing that approximate viewpoint repetition is realistic in a wide range of environments and medical applications. We demonstrate the algorithms automatically aligning challenging visual data from a range of real-world applications: ecological monitoring of environmental change; aerial observation of natural disasters including flooding, tsunamis and bushfires; and tracking of wound recovery and sun damage over time. We also present a prototype active guidance system for enforcing viewpoint repetition. We hope to provide an interesting case study of how traditional research criteria in robotics can be inverted to provide useful outcomes in applied situations.
Abstract:
Due to their unobtrusive nature, vision-based approaches to tracking sports players have been preferred over wearable sensors as they do not require the players to be instrumented for each match. However, due to heavy occlusion between players, variation in resolution and pose, and fluctuating illumination conditions, tracking players continuously is still an unsolved vision problem. For tasks like clustering and retrieval, having noisy data (i.e. missing and false player detections) is problematic as it generates discontinuities in the input data stream. One method of circumventing this issue is to use an occupancy map, where the field is discretised into a series of zones and a count of player detections in each zone is obtained. A series of frames can then be concatenated to represent a set-play or example of team behaviour. A problem with this approach, though, is that the compressibility is low (i.e. the variability in the feature space is incredibly high). In this paper, we propose the use of a bilinear spatiotemporal basis model using a role representation, operating in a low-dimensional space, to clean up the noisy detections. To evaluate our approach, we used a fully instrumented field-hockey pitch with 8 fixed high-definition (HD) cameras, evaluated our approach on approximately 200,000 frames of data from a state-of-the-art real-time player detector, and compared it to manually labeled data.
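A small sketch of the occupancy-map feature described above; the grid resolution and pitch dimensions are chosen purely for illustration:

```python
import numpy as np

def occupancy_map(detections, field_size=(91.4, 55.0), grid=(10, 6)):
    """Count player detections per zone for one frame.

    detections: iterable of (x, y) field positions in metres.
    field_size: (length, width) of the pitch (hockey-pitch dimensions assumed).
    grid: number of zones along each axis.
    """
    counts = np.zeros(grid, dtype=int)
    for x, y in detections:
        i = min(int(x / field_size[0] * grid[0]), grid[0] - 1)
        j = min(int(y / field_size[1] * grid[1]), grid[1] - 1)
        counts[i, j] += 1
    return counts

# Concatenate per-frame maps into one feature vector describing a set-play.
frames = [[(10.0, 20.0), (30.5, 12.2)], [(11.0, 21.0), (31.0, 13.0)]]
feature = np.concatenate([occupancy_map(f).ravel() for f in frames])
print(feature.shape)  # two frames x 60 zones each
```

Concatenating the flattened per-frame counts yields the high-dimensional, highly variable representation that motivates the low-dimensional bilinear model.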
Abstract:
In recommender systems based on multidimensional data, additional metadata provides algorithms with more information for better understanding the interaction between users and items. However, most profiling approaches in neighbourhood-based recommendation for multidimensional data merely split or project the dimensional data and do not consider the latent interaction between the dimensions of the data. In this paper, we propose a novel user/item profiling approach for Collaborative Filtering (CF) item recommendation on multidimensional data. We further present an incremental profiling method for updating the profiles. For item recommendation, we seek to delve into different types of relations in the data to understand the interaction between users and items more fully, and propose three multidimensional CF recommendation approaches for top-N item recommendation based on the proposed user/item profiles. The proposed multidimensional CF approaches are capable of incorporating not only localized relations of user-user and/or item-item neighbourhoods but also the latent interaction between all dimensions of the data. Experimental results show significant improvements in terms of recommendation accuracy.
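For context, a bare-bones item-based neighbourhood recommender over a plain user-item matrix; this is the conventional baseline that multidimensional profiling extends, not the proposed method, and the toy ratings are hypothetical:

```python
import numpy as np

def topn_item_cf(ratings, user, n=3):
    """Item-based CF: score unseen items by similarity to the user's rated items.

    ratings: users x items matrix, 0 = unrated.
    """
    norms = np.linalg.norm(ratings, axis=0) + 1e-9
    sim = (ratings.T @ ratings) / np.outer(norms, norms)   # cosine item-item similarity
    scores = sim @ ratings[user]                            # aggregate over the user's ratings
    scores[ratings[user] > 0] = -np.inf                     # exclude already-rated items
    return np.argsort(scores)[::-1][:n]

R = np.array([[5, 0, 3, 0],
              [4, 2, 0, 1],
              [0, 5, 4, 0]], dtype=float)
print(topn_item_cf(R, user=0, n=2))   # indices of the two recommended items
```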
Abstract:
The majority of sugar mill locomotives are equipped with GPS devices from which locomotive position data is stored. Locomotive run information (e.g. start times, run destinations and activities) is electronically stored in software called TOTools. The latest software development allows TOTools to interpret historical GPS information by combining this data with run information recorded in TOTools and geographic information from a GIS application called MapInfo. As a result, TOTools is capable of summarising run activity details such as run start and finish times and shunt activities with great accuracy. This paper presents 15 reports developed to summarise run activities and speed information. The reports will be of use pre-season to assist in developing the next year's schedule and for determining priorities for investment in the track infrastructure. They will also be of benefit during the season to closely monitor locomotive run performance against the existing schedule.
Abstract:
Available industrial energy meters offer high accuracy and reliability, but are typically expensive and low-bandwidth, making them poorly suited to multi-sensor data acquisition schemes and power quality analysis. An alternative measurement system is proposed in this paper that is highly modular, extensible and compact. To minimise cost, the device makes use of planar coreless PCB transformers to provide galvanic isolation for both power and data. Samples from multiple acquisition devices may be concentrated by a central processor before integration with existing host control systems. This paper focusses on the practical design and implementation of planar coreless PCB transformers to facilitate the module's isolated power, clock and data signal transfer. Calculations necessary to design coreless PCB transformers, and circuits designed for the transformer's practical application in the measurement module are presented. The designed transformer and each application circuit have been experimentally verified, with test data and conclusions made applicable to coreless PCB transformers in general.
Abstract:
One of the main challenges in data analytics is that discovering structures and patterns in complex datasets is a computer-intensive task. Recent advances in high-performance computing provide part of the solution. Multicore systems are now more affordable and more accessible. In this paper, we investigate how this can be used to develop more advanced methods for data analytics. We focus on two specific areas: model-driven analysis and data mining using optimisation techniques.
Abstract:
Most real-life data analysis problems are difficult to solve using exact methods, due to the size of the datasets and the nature of the underlying mechanisms of the system under investigation. As datasets grow even larger, finding the balance between the quality of the approximation and the computing time of the heuristic becomes non-trivial. One solution is to consider parallel methods, and to use the increased computational power to perform a deeper exploration of the solution space in a similar time. It is, however, difficult to estimate a priori whether parallelisation will provide the expected improvement. In this paper we consider a well-known method, genetic algorithms, and evaluate the behaviour of the classic and parallel implementations on two distinct problem types.
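A minimal sketch of the kind of parallelisation discussed above: a simple generational genetic algorithm whose fitness evaluations are farmed out to a process pool. The toy OneMax objective, the operators and the parameters are illustrative, not the problems or implementation studied in the paper:

```python
import random
from multiprocessing import Pool

def fitness(bits):
    """Toy objective (OneMax): count of 1-bits in the chromosome."""
    return sum(bits)

def evolve(pop_size=40, length=32, generations=20, workers=4):
    pop = [[random.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    with Pool(workers) as pool:
        for _ in range(generations):
            scores = pool.map(fitness, pop)            # fitness evaluated in parallel
            ranked = [p for _, p in sorted(zip(scores, pop), reverse=True)]
            parents = ranked[: pop_size // 2]          # truncation selection
            children = []
            while len(children) < pop_size:
                a, b = random.sample(parents, 2)
                cut = random.randrange(1, length)
                child = a[:cut] + b[cut:]              # one-point crossover
                i = random.randrange(length)
                child[i] ^= 1                          # point mutation
                children.append(child)
            pop = children
    return max(pop, key=fitness)

if __name__ == "__main__":
    print(fitness(evolve()))
```

Whether the parallel version pays off depends on how expensive the fitness function is relative to the synchronisation overhead, which is exactly the a priori estimation difficulty the abstract refers to.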
Abstract:
This paper addresses the development of trust in the use of Open Data through the incorporation of appropriate authentication and integrity parameters for use by end-user Open Data application developers in an architecture for trustworthy Open Data Services. The advantage of this architecture is that it is far more scalable and avoids another certificate-based hierarchy, with its attendant certificate revocation management problems. With the use of a Public File, if a key is compromised it is a simple matter of the single responsible entity replacing the key pair with a new one and re-performing the data file signing process. Under the proposed architecture, the Open Data environment does not interfere with the internal security schemes that might be employed by the entity. However, the architecture incorporates, when needed, parameters from the entity, e.g. the person who authorized publishing as Open Data, at the time that datasets are created or added.
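A short sketch of the dataset-signing idea: the responsible entity signs each data file and consumers verify it against a key published in the Public File. The Ed25519 scheme, the `cryptography` library and the payload are assumptions chosen for illustration, not choices prescribed by the paper:

```python
# Illustrative only: sign an Open Data file and verify it before use.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

signing_key = Ed25519PrivateKey.generate()       # held by the responsible entity
public_key = signing_key.public_key()            # published in the Public File

dataset = b'{"station": "A1", "pm2_5": 12.4}'    # hypothetical Open Data payload
signature = signing_key.sign(dataset)            # distributed alongside the dataset

# An application developer checks integrity and authenticity before use.
try:
    public_key.verify(signature, dataset)
    print("dataset verified")
except InvalidSignature:
    print("dataset rejected")
```

If the signing key is ever compromised, only this one key pair needs to be replaced in the Public File and the affected data files re-signed.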
Abstract:
Relative abundance data is common in the life sciences, but appreciation that it needs special analysis and interpretation is scarce. Correlation is popular as a statistical measure of pairwise association but should not be used on data that carry only relative information. Using timecourse yeast gene expression data, we show how correlation of relative abundances can lead to conclusions opposite to those drawn from absolute abundances, and that its value changes when different components are included in the analysis. Once all absolute information has been removed, only a subset of those associations will reliably endure in the remaining relative data, specifically, associations where pairs of values behave proportionally across observations. We propose a new statistic φ to describe the strength of proportionality between two variables and demonstrate how it can be straightforwardly used instead of correlation as the basis of familiar analyses and visualization methods.
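The proportionality idea can be illustrated with one common formulation of such a statistic, the variance of the log-ratio scaled by the variance of one log-transformed variable; the exact definition of φ in the paper may differ in detail:

```python
import numpy as np

def phi(x, y):
    """Proportionality statistic: var of the log-ratio, scaled by var(log x).

    Values near 0 indicate x and y vary proportionally across observations.
    This is one common formulation, not necessarily the paper's exact definition.
    """
    lx, ly = np.log(x), np.log(y)
    return np.var(lx - ly, ddof=1) / np.var(lx, ddof=1)

# Two hypothetical transcripts whose relative abundances stay proportional.
x = np.array([2.0, 4.0, 8.0, 16.0])
y = 3.0 * x
print(phi(x, y))   # ~0: strongly proportional
```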
Abstract:
Large volumes of heterogeneous health data silos pose a big challenge when exploring for information to allow for evidence-based decision making and ensuring quality outcomes. In this paper, we present a proof of concept for adopting data warehousing technology to aggregate and analyse disparate health data in order to understand the impact of various lifestyle factors on obesity. We present a practical model for data warehousing with a detailed explanation, which can be adopted similarly for studying various other health issues.