135 results for data gathering algorithm


Relevance:

30.00%

Publisher:

Abstract:

Accurate and detailed measurement of an individual's physical activity is a key requirement for helping researchers understand the relationship between physical activity and health. Accelerometers have become the method of choice for measuring physical activity due to their small size, low cost, convenience and their ability to provide objective information about physical activity. However, interpreting accelerometer data once it has been collected can be challenging. In this work, we applied machine learning algorithms to the task of physical activity recognition from triaxial accelerometer data. We employed a simple but effective approach of dividing the accelerometer data into short non-overlapping windows, converting each window into a feature vector, and treating each feature vector as an i.i.d. training instance for a supervised learning algorithm. In addition, we improved on this simple approach with a multi-scale ensemble method that did not need to commit to a single window size and was able to leverage two facts: physical activities produce time series with repetitive patterns, and discriminative features for physical activity occur at different temporal scales.
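The windowing step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `window_features`, the choice of per-axis mean and standard deviation as features, and the sample values are all assumptions made for the example.

```python
from statistics import mean, pstdev

def window_features(samples, window_size):
    """Split triaxial samples into non-overlapping windows and summarise each one.

    samples: list of (x, y, z) tuples; window_size: samples per window.
    Returns one feature vector per complete window: per-axis mean and std.
    An incomplete trailing window is dropped.
    """
    features = []
    for start in range(0, len(samples) - window_size + 1, window_size):
        window = samples[start:start + window_size]
        vector = []
        for axis in zip(*window):  # iterate over the x, y and z columns
            vector.extend([mean(axis), pstdev(axis)])
        features.append(vector)
    return features

# Made-up readings: two complete windows of size 2, one leftover sample.
signal = [(0, 0, 1), (2, 0, 1), (0, 0, 1), (2, 0, 1), (5, 5, 5)]
vectors = window_features(signal, window_size=2)
```

Each vector in `vectors` could then be paired with an activity label and passed to any supervised learner; a multi-scale variant would repeat the call with several values of `window_size`.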

Relevance:

30.00%

Publisher:

Abstract:

Background: Heatwaves can cause excess deaths ranging from tens to thousands within a couple of weeks in a local area. Excess mortality due to a special event (e.g., a heatwave or an epidemic outbreak) is estimated by subtracting the mortality figure under ‘normal’ conditions from the historical daily mortality records. Calculating the excess mortality is a scientific challenge because of the stochastic temporal pattern of daily mortality data, which is characterised by (a) long-term changing mean levels (i.e., non-stationarity) and (b) a non-linear temperature-mortality association. The Hilbert-Huang Transform (HHT) algorithm is a novel method originally developed for analysing non-linear and non-stationary time series data in the field of signal processing; however, it has not been applied in public health research. This paper aimed to demonstrate the applicability and strength of the HHT algorithm in analysing health data. Methods: Special R functions were developed to implement the HHT algorithm to decompose the daily mortality time series into trend and non-trend components in terms of the underlying physical mechanism. The excess mortality is calculated directly from the resulting non-trend component series. Results: The Brisbane (Queensland, Australia) and Chicago (United States) daily mortality time series data were used to calculate the excess mortality associated with heatwaves. The HHT algorithm estimated 62 excess deaths related to the February 2004 Brisbane heatwave. To calculate the excess mortality associated with the July 1995 Chicago heatwave, the HHT algorithm needed to handle the mode mixing issue. The HHT algorithm estimated 510 excess deaths for the 1995 Chicago heatwave event. To exemplify potential applications, the HHT decomposition results were used as the input data for a subsequent regression analysis, using the Brisbane data, to investigate the association between excess mortality and different risk factors. Conclusions: The HHT algorithm is a novel and powerful analytical tool in time series data analysis. It has real potential for a wide range of applications in public health research because of its ability to decompose a non-linear and non-stationary time series into trend and non-trend components consistently and efficiently.
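The subtraction that defines excess mortality can be illustrated with a short sketch. It does not implement the HHT decomposition; it assumes a baseline ('normal') series is already available, and the function name `excess_mortality` and all counts below are invented for the example.

```python
def excess_mortality(observed, baseline, start, end):
    """Sum of (observed - baseline) daily deaths over days start..end inclusive."""
    pairs = zip(observed[start:end + 1], baseline[start:end + 1])
    return sum(o - b for o, b in pairs)

observed = [30, 31, 45, 52, 40, 29]   # made-up daily death counts
baseline = [30, 30, 30, 30, 30, 30]   # made-up 'normal' level for the same days
event_excess = excess_mortality(observed, baseline, start=2, end=4)  # 15 + 22 + 10
```

In the approach the abstract describes, the baseline would instead come from the trend component of the decomposition, so the non-trend component plays the role of `observed - baseline`.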

Relevance:

30.00%

Publisher:

Abstract:

A dynamic accumulator is an algorithm that merges a large set of elements into a constant-size value such that, for each accumulated element, there is a witness confirming that the element was included in the value, with the property that elements can be dynamically added to and deleted from the original set. Recently, Wang et al. presented a dynamic accumulator for batch updates at ICICS 2007. However, their construction suffers from two serious problems. We analyze them and propose a way to repair their scheme. We use the accumulator to construct a new scheme for common secure indices with conjunctive keyword-based retrieval.
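To make the accumulator/witness idea concrete, here is a toy sketch of the classic RSA-style construction. It is not the scheme of Wang et al. or the repaired scheme from this abstract; the tiny modulus, the base and the helper names are assumptions chosen for readability, and the parameters are far too small to be secure.

```python
from math import prod

# Toy parameters: a real scheme needs a large modulus of unknown factorisation.
N = 3233   # 61 * 53
G = 2      # base value

def accumulate(elements):
    """Fold a set of (prime) elements into one constant-size value."""
    return pow(G, prod(elements), N)

def witness(elements, x):
    """Witness for x: the accumulation of every element except x."""
    others = [e for e in elements if e != x]
    return pow(G, prod(others), N)

def verify(acc, x, wit):
    """x is accepted as a member if raising its witness to x gives the accumulator."""
    return pow(wit, x, N) == acc

members = [3, 5, 7]
acc = accumulate(members)
w5 = witness(members, 5)

# Adding element 11: the accumulator and each existing witness are raised to 11.
acc_after_add = pow(acc, 11, N)
w5_after_add = pow(w5, 11, N)
```

Deletion is not shown: in this style of construction it generally requires trapdoor knowledge (the factorisation of the modulus) or recomputation from the remaining elements.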

Relevance:

30.00%

Publisher:

Abstract:

The Common Scrambling Algorithm Stream Cipher (CSA-SC) is a shift register based stream cipher designed to encrypt digital video broadcast. CSA-SC produces a pseudo-random binary sequence that is used to mask the contents of the transmission. In this paper, we analyse the initialisation process of the CSA-SC keystream generator and demonstrate weaknesses which lead to state convergence, slid pairs and shifted keystreams. As a result, the cipher may be vulnerable to distinguishing attacks, time-memory-data trade-off attacks or slide attacks.

Relevance:

30.00%

Publisher:

Abstract:

Realistic virtual models of leaf surfaces are important for a number of applications in the plant sciences, such as modelling agrichemical spray droplet movement and spreading on the surface. In this context, the virtual surfaces are required to be sufficiently smooth to facilitate the use of the mathematical equations that govern the motion of the droplet. While an effective approach is to apply discrete smoothing D2-spline algorithms to reconstruct the leaf surfaces from three-dimensional scanned data, difficulties arise when dealing with wheat leaves that tend to twist and bend. To overcome this topological difficulty, we develop a parameterisation technique that rotates and translates the original data, allowing the surface to be fitted using the discrete smoothing D2-spline methods in the new parameter space. Our algorithm uses finite element methods to represent the surface as a linear combination of compactly supported shape functions. Numerical results confirm that the parameterisation, along with the use of discrete smoothing D2-spline techniques, produces realistic virtual representations of wheat leaves.

Relevance:

30.00%

Publisher:

Abstract:

The work presented in this report aims to implement a cost-effective offline mission path planner for aerial inspection tasks of large linear infrastructures. Like most real-world optimisation problems, mission path planning involves a number of objectives which ideally should be minimised simultaneously. Typically, the objectives of a practical optimisation problem conflict with each other, so minimising one of them makes it impossible to minimise the others. This leads to the need to find a set of optimal solutions for the problem; once such a set of available options is produced, the mission planning problem is reduced to a decision-making problem for the mission specialists, who will choose the solution that best fits the requirements of the mission. The goal of this work is then to develop a Multi-Objective optimisation tool able to provide the mission specialists with a set of optimal solutions for the inspection task, from which the final trajectory will be chosen, given the environment data, the mission requirements and the definition of the objectives to minimise. All the possible optimal solutions of a Multi-Objective optimisation problem are said to form the Pareto-optimal front of the problem. For any of the Pareto-optimal solutions, it is impossible to improve one objective without worsening at least one other. Amongst a set of Pareto-optimal solutions, no solution is absolutely better than another and the final choice must be a trade-off between the objectives of the problem. Multi-Objective Evolutionary Algorithms (MOEAs) are recognised to be a convenient method for exploring the Pareto-optimal front of Multi-Objective optimisation problems. Their efficiency is due to their parallel architecture, which allows several optimal solutions to be found at each time
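The notion of Pareto optimality used in this abstract can be shown with a brute-force filter over candidate solutions. This is only a sketch of the dominance test, not an evolutionary algorithm or the report's planner; the objective pairs (for instance flight time and energy) are invented values.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one.

    All objectives are assumed to be minimised.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better

def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (objective 1, objective 2) values for five candidate paths.
candidates = [(10, 5), (8, 7), (12, 4), (9, 9), (8, 6)]
front = pareto_front(candidates)
```

Here `(8, 7)` and `(9, 9)` are dropped because `(8, 6)` and `(8, 7)` respectively dominate them; the three remaining points are mutually non-dominated, so choosing among them is the trade-off left to the mission specialist.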

Relevance:

30.00%

Publisher:

Abstract:

This chapter describes decentralized data fusion algorithms for a team of multiple autonomous platforms. Decentralized data fusion (DDF) provides a useful basis on which to build cooperative information gathering tasks for robotic teams operating in outdoor environments. Through the DDF algorithms, each platform can maintain a consistent global solution from which decisions may then be made. Comparisons will be made between implementations of DDF using two probabilistic representations. The first, Gaussian estimates, and the second, Gaussian mixtures, are compared using a common data set. The overall system design is detailed, providing insight into the overall complexity of implementing a robust DDF system for use in information gathering tasks in outdoor UAV applications.
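For the Gaussian-estimate representation, the core fusion step can be sketched in one dimension as a precision-weighted average. This is a simplified illustration that assumes the two estimates are independent; a full DDF system must also account for information the platforms already share, which is not shown. The function name `fuse` and the numbers are hypothetical.

```python
def fuse(mean_a, var_a, mean_b, var_b):
    """Fuse two independent 1-D Gaussian estimates of the same quantity.

    Works in information form: precisions (1 / variance) add, and the fused
    mean is the precision-weighted average of the input means.
    """
    info = 1.0 / var_a + 1.0 / var_b
    mean = (mean_a / var_a + mean_b / var_b) / info
    return mean, 1.0 / info

# Two platforms report the same target position with equal uncertainty.
fused_mean, fused_var = fuse(10.0, 4.0, 12.0, 4.0)
```

With equal variances the fused mean lies midway between the inputs and the fused variance is halved, which shows why combining observations across platforms tightens the shared estimate.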

Relevance:

30.00%

Publisher:

Abstract:

Analytically or computationally intractable likelihood functions can arise in complex statistical inferential problems making them inaccessible to standard Bayesian inferential methods. Approximate Bayesian computation (ABC) methods address such inferential problems by replacing direct likelihood evaluations with repeated sampling from the model. ABC methods have been predominantly applied to parameter estimation problems and less to model choice problems due to the added difficulty of handling multiple model spaces. The ABC algorithm proposed here addresses model choice problems by extending Fearnhead and Prangle (2012, Journal of the Royal Statistical Society, Series B 74, 1–28) where the posterior mean of the model parameters estimated through regression formed the summary statistics used in the discrepancy measure. An additional stepwise multinomial logistic regression is performed on the model indicator variable in the regression step and the estimated model probabilities are incorporated into the set of summary statistics for model choice purposes. A reversible jump Markov chain Monte Carlo step is also included in the algorithm to increase model diversity for thorough exploration of the model space. This algorithm was applied to a validating example to demonstrate the robustness of the algorithm across a wide range of true model probabilities. Its subsequent use in three pathogen transmission examples of varying complexity illustrates the utility of the algorithm in inferring preference of particular transmission models for the pathogens.
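The idea of replacing likelihood evaluations with repeated simulation can be sketched with a basic rejection ABC sampler for model choice. This is not the regression-based algorithm proposed in the abstract and it has no reversible jump step; the two candidate models, the priors, the summary statistics, the tolerance and the observed summary are all assumptions made for the example.

```python
import random
from statistics import mean, stdev

def simulate(model, mu, n, rng):
    """Draw n values from one of two hypothetical models that differ in spread."""
    sigma = 1.0 if model == 0 else 3.0
    return [rng.gauss(mu, sigma) for _ in range(n)]

def summary(data):
    """Summary statistics compared against the observed data."""
    return mean(data), stdev(data)

def abc_model_choice(observed_summary, n, trials, tol, rng):
    """Rejection ABC: keep simulations whose summaries fall within tol of the data.

    Returns the estimated posterior probability of each model, or None if no
    simulation was accepted.
    """
    accepted = [0, 0]
    for _ in range(trials):
        model = rng.randrange(2)          # uniform prior over the two models
        mu = rng.uniform(-2.0, 2.0)       # prior on the shared location parameter
        s = summary(simulate(model, mu, n, rng))
        if all(abs(a - b) < tol for a, b in zip(s, observed_summary)):
            accepted[model] += 1
    total = sum(accepted)
    return [count / total for count in accepted] if total else None

rng = random.Random(0)
probs = abc_model_choice(observed_summary=(0.0, 1.0), n=20, trials=5000,
                         tol=0.3, rng=rng)
```

Because the observed standard deviation is close to 1, almost all accepted simulations come from the first model, so `probs[0]` should be near 1. The acceptance proportions are only as good as the summaries, which is the motivation for constructing better summary statistics.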

Relevance:

30.00%

Publisher:

Abstract:

A computationally efficient sequential Monte Carlo algorithm is proposed for the sequential design of experiments for the collection of block data described by mixed effects models. The difficulty in applying a sequential Monte Carlo algorithm in such settings is the need to evaluate the observed data likelihood, which is typically intractable for all but linear Gaussian models. To overcome this difficulty, we propose to unbiasedly estimate the likelihood, and perform inference and make decisions based on an exact-approximate algorithm. Two estimates are proposed: using Quasi Monte Carlo methods and using the Laplace approximation with importance sampling. Both of these approaches can be computationally expensive, so we propose exploiting parallel computational architectures to ensure designs can be derived in a timely manner. We also extend our approach to allow for model uncertainty. This research is motivated by important pharmacological studies related to the treatment of critically ill patients.

Relevance:

30.00%

Publisher:

Abstract:

In the past few years, there has been a steady increase in the attention given to green initiatives related to data centers. While various energy-aware measures have been developed for data centers, the requirement of improving the performance efficiency of application assignment at the same time has yet to be fulfilled. For instance, many energy-aware measures applied to data centers maintain a trade-off between energy consumption and Quality of Service (QoS). To address this problem, this paper presents a novel concept of profiling to facilitate offline optimization for a deterministic application assignment to virtual machines. Then, a profile-based model is established for obtaining near-optimal allocations of applications to virtual machines with consideration of three major objectives: energy cost, CPU utilization efficiency and application completion time. A profile-based and scalable matching algorithm is then developed to solve this model. The assignment efficiency of our algorithm is compared with that of the Hungarian algorithm, which gives the optimal solution but does not scale well.

Relevance:

30.00%

Publisher:

Abstract:

This paper presents a single-pass algorithm for mining discriminative itemsets in data streams using a novel data structure and the tilted-time window model. Discriminative itemsets are defined as itemsets that are frequent in one data stream and whose frequency in that stream is much higher than in the rest of the streams in the dataset. In order to deal with the data structure size, we propose a pruning process that results in a compact tree structure containing the discriminative itemsets. Empirical analysis shows the sound time and space complexity of the proposed method.
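The definition of a discriminative itemset can be illustrated with a brute-force batch computation over two small streams. This is not the single-pass, tilted-time window algorithm the abstract describes; the thresholds `min_support` and `min_ratio`, the size cap and the transactions are assumptions for the example.

```python
from collections import Counter
from itertools import combinations

def itemset_counts(transactions, max_size=2):
    """Count every itemset of up to max_size items across the transactions."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    return counts

def discriminative_itemsets(target, others, min_support, min_ratio):
    """Itemsets frequent in target and at least min_ratio times more frequent
    there than in the other stream."""
    t_counts, o_counts = itemset_counts(target), itemset_counts(others)
    result = []
    for itemset, count in t_counts.items():
        t_freq = count / len(target)
        o_freq = o_counts[itemset] / len(others)   # 0 if never seen
        if t_freq >= min_support and t_freq >= min_ratio * o_freq:
            result.append(itemset)
    return sorted(result)

stream_1 = [["a", "b"], ["a", "b"], ["a", "c"], ["b"]]
stream_2 = [["a"], ["c"], ["a", "c"], ["c"]]
found = discriminative_itemsets(stream_1, stream_2, min_support=0.5, min_ratio=2.0)
```

Item `a` is frequent in `stream_1` but is not discriminative because it is also common in `stream_2`, whereas `b` and `{a, b}` never occur in `stream_2` and so pass both tests.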

Relevance:

30.00%

Publisher:

Abstract:

Extracting frequent subtrees from tree-structured data has important applications in Web mining. In this paper, we introduce a novel canonical form for rooted labelled unordered trees called the balanced-optimal-search canonical form (BOCF) that can handle the isomorphism problem efficiently. Using BOCF, we define an enumeration approach, guided by the tree structure, that systematically enumerates only the valid subtrees. Finally, we present the balanced optimal search tree miner (BOSTER) algorithm, based on BOCF and the proposed enumeration approach, for finding frequent induced subtrees in a database of labelled rooted unordered trees. Experiments on real datasets compare the efficiency of BOSTER with that of two state-of-the-art algorithms for mining induced unordered subtrees, HybridTreeMiner and UNI3. The results are encouraging.
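Why a canonical form settles the isomorphism problem for unordered trees can be seen with a simple sorted-encoding scheme. This is a well-known textbook approach used here for illustration only; it is not BOCF, and the tuple representation and example trees are assumptions.

```python
def canonical(tree):
    """Return a string that is identical for isomorphic rooted unordered trees.

    tree is a (label, children) pair, where children is a list of trees.
    Child encodings are sorted so that sibling order no longer matters.
    Labels are assumed not to contain parentheses.
    """
    label, children = tree
    return "(" + label + "".join(sorted(canonical(c) for c in children)) + ")"

# The same unordered tree written with its children in two different orders.
t1 = ("a", [("b", []), ("c", [("d", [])])])
t2 = ("a", [("c", [("d", [])]), ("b", [])])
# A different tree: d hangs under b instead of c.
t3 = ("a", [("b", [("d", [])]), ("c", [])])
```

Comparing `canonical(t1)` with `canonical(t2)` gives equal strings, while `canonical(t3)` differs, so a miner can count each candidate subtree once by keying on its canonical string.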

Relevance:

30.00%

Publisher:

Abstract:

This paper presents an algorithm for mining unordered embedded subtrees using the balanced-optimal-search canonical form (BOCF). An enumeration approach guided by the tree structure is defined using BOCF to systematically enumerate only the valid subtrees. Based on this canonical form and enumeration technique, the balanced optimal search embedded subtree mining algorithm (BEST) is introduced for mining embedded subtrees from a database of labelled rooted unordered trees. Extensive experiments on both synthetic and real datasets demonstrate the efficiency of BEST relative to two state-of-the-art algorithms for mining embedded unordered subtrees, SLEUTH and U3.

Relevance:

30.00%

Publisher:

Abstract:

Transit passenger market segmentation enables transit operators to target different classes of transit users with tailored surveys and various operational and strategic planning improvements. However, existing market segmentation studies in the literature have generally relied on passenger surveys, which have various limitations. The smart card (SC) data from an automated fare collection system facilitate the understanding of the multiday travel patterns of transit passengers and can be used to segment them into identifiable types with similar behaviors and needs. This paper proposes a comprehensive methodology for passenger segmentation using SC data alone. After reconstructing the travel itineraries from SC transactions, this paper adopts the density-based spatial clustering of applications with noise (DBSCAN) algorithm to mine the travel pattern of each SC user. An a priori market segmentation approach then segments transit passengers into four identifiable types. The methodology proposed in this paper helps transit operators understand their passengers and provide them with targeted information and services.
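How DBSCAN can pick out a user's habitual travel times is sketched below with a minimal one-dimensional implementation. It is an illustration of the clustering step only, not the paper's methodology; the boarding times, `eps` and `min_pts` are made-up values, and a real analysis would more likely use an existing library implementation.

```python
def dbscan(points, eps, min_pts):
    """Minimal DBSCAN on 1-D values; returns one label per point (-1 = noise)."""
    labels = [None] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if abs(points[i] - q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1                 # provisionally noise
            continue
        cluster += 1                       # i is a core point: start a cluster
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster        # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:       # j is also a core point: keep expanding
                queue.extend(nbrs)
    return labels

# Hypothetical boarding times (hours) for one card over several days.
boarding_hours = [7.9, 8.0, 8.1, 8.2, 13.0, 17.4, 17.5, 17.6]
labels = dbscan(boarding_hours, eps=0.25, min_pts=3)
```

The morning and evening boardings form two clusters, and the isolated midday trip is labelled as noise, which is the kind of regular-versus-irregular pattern a segmentation step could build on.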

Relevance:

30.00%

Publisher:

Abstract:

High-Order Co-Clustering (HOCC) methods have attracted considerable attention in recent years because of their ability to cluster multiple types of objects simultaneously using all available information. During the clustering process, HOCC methods exploit object co-occurrence information, i.e., inter-type relationships amongst different types of objects, as well as object affinity information, i.e., intra-type relationships amongst objects of the same type. However, it is difficult to learn accurate intra-type relationships in the presence of noise and outliers. Existing HOCC methods consider the p nearest neighbours based on Euclidean distance for the intra-type relationships, which leads to incomplete and inaccurate intra-type relationships. In this paper, we propose a novel HOCC method that incorporates multiple subspace learning with a heterogeneous manifold ensemble to learn complete and accurate intra-type relationships. Multiple subspace learning reconstructs the similarity between any pair of objects that belong to the same subspace. The heterogeneous manifold ensemble is created from two types of intra-type relationships, learnt using a p-nearest-neighbour graph and multiple subspace learning. Moreover, to ensure the robustness of the clustering process, we introduce a sparse error matrix into the matrix decomposition and develop a novel iterative algorithm. Empirical experiments show that the proposed method achieves improved results over state-of-the-art HOCC methods in terms of FScore and NMI.