135 results for data gathering algorithm
Abstract:
Accurate and detailed road models play an important role in a number of geospatial applications, such as infrastructure planning, traffic monitoring, and driver assistance systems. In this thesis, an integrated approach for the automatic extraction of precise road features from high-resolution aerial images and LiDAR point clouds is presented. A framework for road information modeling is proposed for rural and urban scenarios respectively, and an integrated system is developed to handle road feature extraction using image and LiDAR analysis. For road extraction in rural regions, a hierarchical image analysis is first performed to maximize the exploitation of road characteristics at different resolutions. The rough locations and directions of roads are provided by the road centerlines detected in low-resolution images, both of which can be further employed to facilitate road information generation in high-resolution images. A histogram thresholding method is then used to classify road details in high-resolution images, with color space transformation used for data preparation. After road surface detection, anisotropic Gaussian and Gabor filters are employed to enhance road pavement markings while suppressing other ground objects, such as vegetation and houses. Afterwards, pavement markings are obtained from the filtered image using Otsu's clustering method. The final road model is generated by superimposing the lane markings on the road surfaces, and the digital terrain model (DTM) produced from LiDAR data can also be combined to obtain a 3D road model. As the extraction of roads in urban areas is greatly affected by buildings, shadows, vehicles, and parking lots, we combine high-resolution aerial images and dense LiDAR data to fully exploit the precise spectral and horizontal spatial resolution of aerial images and the accurate vertical information provided by airborne LiDAR. Object-oriented image analysis methods are employed for feature classification and road detection in the aerial images. In this process, we first utilize an adaptive mean shift (MS) segmentation algorithm to segment the original images into meaningful object-oriented clusters. The support vector machine (SVM) algorithm is then applied to the MS-segmented image to extract road objects. The road surface detected in LiDAR intensity images is taken as a mask to remove the effects of shadows and trees. In addition, the normalized digital surface model (nDSM) obtained from LiDAR is employed to filter out other above-ground objects, such as buildings and vehicles. The proposed road extraction approaches are tested on rural and urban datasets respectively. The rural road extraction method is evaluated using pan-sharpened aerial images of the Bruce Highway, Gympie, Queensland. The road extraction algorithm for urban regions is tested using the Bundaberg datasets, which combine aerial imagery and LiDAR data. Quantitative evaluation of the extracted road information has been carried out for both datasets. The experiments and evaluation results on the Gympie datasets show that more than 96% of the road surfaces and over 90% of the lane markings are accurately reconstructed, and the false alarm rates for road surfaces and lane markings are below 3% and 2% respectively. For the urban test sites of Bundaberg, more than 93% of the road surface is correctly reconstructed, and the mis-detection rate is below 10%.
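As a rough illustration of the Otsu-style thresholding step described above for separating pavement markings from the filtered image, the following minimal NumPy sketch computes an Otsu threshold over a grayscale array; the `filtered` array and variable names are placeholders, not the thesis's actual data or code.

```python
import numpy as np

def otsu_threshold(gray, nbins=256):
    """Return the Otsu threshold of a grayscale array (maximises between-class variance)."""
    hist, edges = np.histogram(gray.ravel(), bins=nbins)
    prob = hist.astype(float) / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2.0

    w0 = np.cumsum(prob)                                  # weight of the "background" class
    w1 = 1.0 - w0                                         # weight of the "marking" class
    cum_mean = np.cumsum(prob * centers)
    mu0 = cum_mean / np.clip(w0, 1e-12, None)
    mu1 = (cum_mean[-1] - cum_mean) / np.clip(w1, 1e-12, None)

    between_var = w0 * w1 * (mu0 - mu1) ** 2
    return centers[np.argmax(between_var)]

# Hypothetical usage: `filtered` stands in for the Gabor/Gaussian-enhanced road image.
filtered = np.random.rand(200, 200)
markings = filtered > otsu_threshold(filtered)            # binary mask of candidate lane markings
```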
Abstract:
Here we present a sequential Monte Carlo (SMC) algorithm that can be used for any one-at-a-time Bayesian sequential design problem in the presence of model uncertainty where discrete data are encountered. Our focus is on adaptive design for model discrimination, but the methodology is applicable if one has a different design objective, such as parameter estimation or prediction. An SMC algorithm is run in parallel for each model, and the approach relies on a convenient estimator of the evidence of each model that is essentially a function of importance sampling weights. Other methods for this task, such as quadrature, which is often used in design, suffer from the curse of dimensionality. Approximating posterior model probabilities in this way allows us to use model discrimination utility functions derived from information theory that were previously difficult to compute except for conjugate models. A major benefit of the algorithm is that it requires very little problem-specific tuning. We demonstrate the methodology on three applications, including discriminating between models for the decline in motor neuron numbers in patients suffering from neurological diseases such as Motor Neuron disease.
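The following toy sketch (an assumption about the general shape of such an algorithm, not the authors' code) shows how an SMC-style evidence estimate can be accumulated from importance sampling weights for each rival model, and how posterior model probabilities then follow; resampling and move steps are omitted, and the Bernoulli models are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def smc_evidence(y, prior_draws, likelihood):
    """Toy SMC-style log-evidence estimate for a stream of discrete observations.

    At each step the incremental evidence p(y_t | y_1:t-1) is estimated by the weighted
    mean of the likelihood over the particle set, and the particles are reweighted.
    Resampling and move steps are omitted in this sketch.
    """
    theta = np.asarray(prior_draws, dtype=float)
    w = np.full(theta.shape[0], 1.0 / theta.shape[0])
    log_evidence = 0.0
    for y_t in y:
        lik = likelihood(y_t, theta)
        log_evidence += np.log(np.sum(w * lik))       # incremental evidence estimate
        w = w * lik
        w /= w.sum()
    return log_evidence

# Two rival Bernoulli models with different Beta priors (a hypothetical example).
y = rng.integers(0, 2, size=30)
bern = lambda y_t, p: p**y_t * (1.0 - p)**(1 - y_t)
log_z = np.array([smc_evidence(y, rng.beta(1, 1, 5000), bern),    # model 1: Beta(1, 1) prior
                  smc_evidence(y, rng.beta(5, 2, 5000), bern)])   # model 2: Beta(5, 2) prior
post_model_prob = np.exp(log_z - log_z.max())
post_model_prob /= post_model_prob.sum()              # posterior model probabilities (equal prior)
print(post_model_prob)
```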
Abstract:
Recently, Software as a Service (SaaS) in Cloud computing has become increasingly significant among software users and providers. To offer a SaaS with flexible functions at a low cost, SaaS providers have focused on decomposing SaaS functionality, an approach also known as composite SaaS. This approach has introduced new challenges in SaaS resource management in data centres. One of these challenges is managing the resources allocated to the composite SaaS. Due to the dynamic environment of a Cloud data centre, resources that were initially allocated to SaaS components may become overloaded or wasted. As such, a reconfiguration of the components' placement is triggered to maintain the performance of the composite SaaS. However, existing approaches often ignore the communication or dependencies between SaaS components in their implementation. In a composite SaaS, it is important to include these elements, as they directly affect the performance of the SaaS. This paper proposes a Grouping Genetic Algorithm (GGA) for multiple composite SaaS application component clustering in Cloud computing to address this gap. To the best of our knowledge, this is the first attempt to handle multiple composite SaaS reconfiguration placement in a dynamic Cloud environment. The experimental results demonstrate the feasibility and scalability of the GGA.
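As a hedged sketch of the grouping idea, the fragment below shows a group-based chromosome (a partition of components into clusters) and a fitness function that trades off the number of clusters used, cross-group communication, and capacity overload; the cost weights, demands and communication volumes are invented for illustration and are not from the paper.

```python
import random

random.seed(42)

# Hypothetical problem data: component CPU demands, pairwise communication volumes,
# and a uniform server capacity (all invented for illustration).
CPU = [4, 2, 3, 5, 1, 2]
COMM = {(0, 1): 10, (1, 2): 4, (3, 4): 7, (2, 5): 3}
CAPACITY = 8

def fitness(groups):
    """Lower is better: clusters used, plus weighted cross-group traffic and overload penalties."""
    overload = sum(max(0, sum(CPU[c] for c in g) - CAPACITY) for g in groups)
    member = {c: i for i, g in enumerate(groups) for c in g}
    cross = sum(v for (a, b), v in COMM.items() if member[a] != member[b])
    return len(groups) + 0.1 * cross + 100 * overload

def random_grouping(n_components, n_groups):
    """Group-based encoding: a chromosome is a partition of the components into groups."""
    groups = [[] for _ in range(n_groups)]
    for c in range(n_components):
        random.choice(groups).append(c)
    return [g for g in groups if g]

population = [random_grouping(len(CPU), 3) for _ in range(20)]
best = min(population, key=fitness)
print(best, fitness(best))
```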
Abstract:
A composite SaaS (Software as a Service) is software composed of several software components and data components. The composite SaaS placement problem is to determine where each of these components should be deployed in a cloud computing environment such that the performance of the composite SaaS is optimal. From a computational point of view, the composite SaaS placement problem is a large-scale combinatorial optimization problem, and an Iterative Cooperative Co-evolutionary Genetic Algorithm (ICCGA) was previously proposed for it. The ICCGA can find solutions of reasonable quality, but its computation time is noticeably long. Aiming at improving the computation time, we propose an unsynchronized Parallel Cooperative Co-evolutionary Genetic Algorithm (PCCGA) in this paper. Experimental results show that the PCCGA not only requires less computation time but also generates better-quality solutions than the ICCGA.
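A minimal sketch of the cooperative co-evolutionary decomposition is given below, under the assumption of a toy placement cost: the placement vector is split between two subpopulations, each evaluated jointly with the other subpopulation's current representative, and the subpopulations are evolved concurrently. The real PCCGA is unsynchronized and parallel across processes, which this simple generational, thread-based sketch does not capture.

```python
import random
from concurrent.futures import ThreadPoolExecutor

random.seed(0)

# Hypothetical placement problem: 12 components assigned to 4 servers.  The decision
# vector is split across two subpopulations (components 0-5 and 6-11) in the spirit of
# cooperative co-evolution; the cost model is an invented surrogate, not the paper's.
N_SERVERS, SPLIT = 4, 6
DEMAND = [random.randint(1, 5) for _ in range(2 * SPLIT)]

def cost(full_assignment):
    load = [0] * N_SERVERS
    for comp, srv in enumerate(full_assignment):
        load[srv] += DEMAND[comp]
    return max(load)                             # balance-style surrogate for placement quality

def evolve_subpop(subpop, partner_best, first_half):
    """One generation for one subpopulation, evaluated jointly with the partner's representative."""
    def joint(ind):
        return cost(ind + partner_best) if first_half else cost(partner_best + ind)
    subpop.sort(key=joint)                       # best individuals first
    survivors = subpop[: len(subpop) // 2]
    children = []
    for _ in range(len(subpop) - len(survivors)):
        a, b = random.sample(survivors, 2)
        child = [random.choice(genes) for genes in zip(a, b)]   # uniform crossover
        if random.random() < 0.3:                               # light mutation
            child[random.randrange(SPLIT)] = random.randrange(N_SERVERS)
        children.append(child)
    return survivors + children

pops = [[[random.randrange(N_SERVERS) for _ in range(SPLIT)] for _ in range(20)]
        for _ in range(2)]
for _ in range(30):
    reps = [p[0] for p in pops]                  # current representative of each subpopulation
    with ThreadPoolExecutor() as ex:             # subpopulations evolved concurrently
        pops = list(ex.map(evolve_subpop, pops, [reps[1], reps[0]], [True, False]))
print(cost(pops[0][0] + pops[1][0]))
```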
Abstract:
Software as a Service (SaaS) in the Cloud has recently become increasingly significant among software users and providers. A SaaS delivered as a composite application has many benefits, including reduced delivery costs, flexible offerings of SaaS functions, and decreased subscription costs for users. However, this approach has introduced a new problem in managing the resources allocated to the composite SaaS. The resource allocation made at the initial stage may become overloaded or wasted due to the dynamic environment of a Cloud. A typical data center resource management system usually triggers a placement reconfiguration for the SaaS in order to maintain its performance as well as to minimize the resources used. Existing approaches to this problem often ignore the underlying dependencies between SaaS components. In addition, the reconfiguration also has to comply with SaaS constraints in terms of its resource requirements, placement requirements, and its SLA. To tackle this problem, this paper proposes a penalty-based Grouping Genetic Algorithm for multiple composite SaaS component clustering in the Cloud. The main objective is to minimize the resources used by the SaaS by clustering its components without violating any constraints. Experimental results demonstrate the feasibility and scalability of the proposed algorithm.
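The sketch below illustrates the penalty idea only: an assumed fitness in which the resources used by a grouping are augmented with penalty terms for violated capacity, placement and SLA constraints, so that infeasible clusterings rank below feasible ones. The constraint checks, weights and data are placeholders rather than the paper's actual model.

```python
# Hypothetical penalty-based fitness for a grouping chromosome: the objective is the
# resources used by the clustering, and each violated constraint adds a penalty so that
# infeasible groupings rank below feasible ones.  Weights and checks are assumptions.
def penalised_fitness(groups, cpu, capacity, sla_latency, max_latency, colocate_forbidden):
    resource_used = len(groups)                       # e.g. number of servers/VMs occupied
    penalty = 0.0
    for g in groups:
        if sum(cpu[c] for c in g) > capacity:         # resource-requirement violation
            penalty += 1.0
        if any((a, b) in colocate_forbidden for a in g for b in g if a != b):
            penalty += 1.0                            # placement-requirement violation
        if sla_latency(g) > max_latency:              # SLA violation
            penalty += 1.0
    return resource_used + 10.0 * penalty             # lower is better

# Example usage with toy data.
groups = [[0, 1], [2, 3]]
cpu = [2, 3, 4, 1]
print(penalised_fitness(groups, cpu, capacity=6,
                        sla_latency=lambda g: 5 * len(g), max_latency=20,
                        colocate_forbidden={(0, 3), (3, 0)}))
```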
Abstract:
Improving energy efficiency has become increasingly important in data centers in recent years in order to curb the rapidly growing electricity consumption. The power dissipation of the physical servers is the root cause of the power usage of other systems, such as cooling systems. Many efforts have been made to make data centers more energy efficient. One of them is to minimize the total power consumption of the servers in a data center through virtual machine consolidation, which is implemented by virtual machine placement. The placement problem is often modeled as a bin packing problem. Due to the NP-hard nature of the problem, heuristic solutions such as the First Fit and Best Fit algorithms have often been used and generally give good results. However, their performance leaves room for further improvement. In this paper we propose a Simulated Annealing (SA) based algorithm, which aims at further improvement from any feasible placement. This is the first published attempt to use SA to solve the virtual machine (VM) placement problem to optimize power consumption. Experimental results show that the SA algorithm can generate better results, saving up to 25 percent more energy than First Fit Decreasing within an acceptable time frame.
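A minimal simulated annealing sketch in this spirit is shown below, assuming a simple linear host power model and single-VM moves as the neighbourhood; the power coefficients, capacities and cooling schedule are illustrative assumptions, not the paper's experimental setup.

```python
import math
import random

random.seed(1)

# Hypothetical data: CPU demand of each VM and a uniform host capacity; the linear
# host power model (idle + utilisation-proportional) is an assumption for illustration.
VM_CPU = [random.randint(5, 30) for _ in range(40)]
CAPACITY = 100
P_IDLE, P_MAX = 70.0, 250.0

def power(placement):
    """Total power: each non-empty host draws idle power plus a load-proportional part."""
    load = {}
    for vm, host in enumerate(placement):
        load[host] = load.get(host, 0) + VM_CPU[vm]
    if any(l > CAPACITY for l in load.values()):
        return float("inf")                       # infeasible: capacity violated
    return sum(P_IDLE + (P_MAX - P_IDLE) * l / CAPACITY for l in load.values())

def simulated_annealing(placement, t0=100.0, cooling=0.999, iters=5000):
    current, best = list(placement), list(placement)
    t = t0
    for _ in range(iters):
        cand = list(current)
        cand[random.randrange(len(cand))] = random.randrange(len(cand))  # move one VM
        delta = power(cand) - power(current)
        if delta < 0 or random.random() < math.exp(-delta / t):          # Metropolis criterion
            current = cand
            if power(current) < power(best):
                best = list(current)
        t *= cooling
    return best

start = list(range(len(VM_CPU)))      # feasible start: one VM per host
best = simulated_annealing(start)
print(power(start), power(best))
```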
Abstract:
Here we present a sequential Monte Carlo approach to Bayesian sequential design that incorporates model uncertainty. The methodology is demonstrated through the development and implementation of two model discrimination utilities, mutual information and total separation, but it can also be applied more generally if one has different experimental aims. A sequential Monte Carlo algorithm is run in parallel for each rival model and provides a convenient estimate of each model's marginal likelihood given the data, which can be used for model comparison and in the evaluation of utility functions. A major benefit of this approach is that it requires very little problem-specific tuning and is also computationally efficient when compared to full Markov chain Monte Carlo approaches. This research is motivated by applications in drug development and chemical engineering.
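As a hedged illustration of how the mutual information utility can be evaluated once posterior model probabilities and the within-model predictive probabilities of each discrete outcome are available (for example, from the SMC approximations), consider the sketch below; the probabilities are invented placeholders and the total separation utility is not shown.

```python
import numpy as np

def mutual_information_utility(model_probs, predictive):
    """U(d) = sum_m sum_y p(m) p(y|m,d) log[ p(y|m,d) / sum_m' p(m') p(y|m',d) ].

    model_probs: posterior model probabilities, shape (M,)
    predictive:  p(y | m, d) for each model and discrete outcome y, shape (M, Y)
    """
    model_probs = np.asarray(model_probs, dtype=float)
    predictive = np.asarray(predictive, dtype=float)
    marginal = model_probs @ predictive                       # p(y | d), shape (Y,)
    with np.errstate(divide="ignore", invalid="ignore"):
        log_ratio = np.where(predictive > 0,
                             np.log(predictive / marginal), 0.0)
    return float(np.sum(model_probs[:, None] * predictive * log_ratio))

# Hypothetical design comparison: two rival models, binary outcome.
p_model = [0.6, 0.4]
design_a = [[0.9, 0.1],   # p(y | model 1, design a)
            [0.2, 0.8]]   # p(y | model 2, design a)
design_b = [[0.55, 0.45],
            [0.45, 0.55]]
print(mutual_information_utility(p_model, design_a))   # the more discriminating design
print(mutual_information_utility(p_model, design_b))
```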
Abstract:
A simple and effective down-sampling algorithm, the Peak-Hold-Down-Sample (PHDS) algorithm, is developed in this paper to enable rapid and efficient data transfer in remote condition monitoring applications. The algorithm is particularly useful for high-frequency Condition Monitoring (CM) techniques and for low-speed machine applications, since the combination of a high sampling frequency and a low rotating speed generally leads to large, unwieldy data sizes. The effectiveness of the algorithm was evaluated and tested on four sets of data in this study. One set of data was extracted from the condition monitoring signal of a practical industrial application. Another set was acquired from a low-speed machine test rig in the laboratory. The other two sets were computer-simulated bearing defect signals containing either a single bearing defect or multiple defects. The results show that the PHDS algorithm can substantially reduce the size of the data while preserving the critical bearing defect information for all the data sets used in this work, even when a large down-sample ratio is used (i.e., 500-times down-sampling). In contrast, a conventional down-sampling technique from signal processing eliminates useful and critical information, such as the bearing defect frequencies in a signal, when the same down-sample ratio is employed. Noise and artificial frequency components are also induced by the conventional technique, thus limiting its usefulness for machine condition monitoring applications.
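A minimal interpretation of the peak-hold idea is sketched below: keep the largest-magnitude sample in each block of `ratio` samples, rather than simply taking every `ratio`-th sample. The signal, sampling rate and defect frequency are invented for illustration, and the published PHDS algorithm may differ in its details.

```python
import numpy as np

def peak_hold_downsample(x, ratio):
    """Keep the largest-magnitude sample (the peak) of each block of `ratio` samples."""
    n_blocks = len(x) // ratio
    blocks = np.asarray(x[: n_blocks * ratio]).reshape(n_blocks, ratio)
    peak_idx = np.argmax(np.abs(blocks), axis=1)
    return blocks[np.arange(n_blocks), peak_idx]

# Hypothetical bearing signal: a 120 Hz defect impulse train buried in noise, sampled at 50 kHz.
fs, defect_hz, ratio = 50_000, 120, 500
t = np.arange(0, 2.0, 1.0 / fs)
signal = 0.05 * np.random.randn(t.size)
signal[(t * defect_hz) % 1.0 < defect_hz / fs] += 3.0   # one sharp impulse per defect period

phds = peak_hold_downsample(signal, ratio)              # 500x fewer samples, impulses preserved
decimated = signal[::ratio]                             # plain decimation usually misses the impulses
print(signal.size, phds.size, np.abs(phds).max(), np.abs(decimated).max())
```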
Abstract:
Background: Cancer outlier profile analysis (COPA) has proven to be an effective approach to analyzing cancer expression data, leading to the discovery of fusion events between TMPRSS2 and ETS family genes in prostate cancer. However, the original COPA algorithm did not identify down-regulated outliers, and the currently available R package implementing the method is similarly restricted to the analysis of over-expressed outliers. Here we present a modified outlier detection method, mCOPA, which contains refinements to the outlier-detection algorithm, identifies both over- and under-expressed outliers, is freely available, and can be applied to any expression dataset.
Results: We compare our method to other feature-selection approaches and demonstrate that mCOPA frequently selects more informative features than differential-expression or variance-based feature selection approaches and is able to recover observed clinical subtypes more consistently. We demonstrate the application of mCOPA to prostate cancer expression data and explore the use of outliers in clustering, pathway analysis, and the identification of tumour suppressors. We analyse the under-expressed outliers to identify known and novel prostate cancer tumour suppressor genes, validating these against data in Oncomine and the Cancer Gene Index. We also demonstrate how a combination of outlier analysis and pathway analysis can identify molecular mechanisms disrupted in individual tumours.
Conclusions: We demonstrate that mCOPA offers advantages, compared to differential expression or variance, in selecting outlier features, and that the features so selected are better able to assign samples to clinically annotated subtypes. Further, we show that the biology explored by outlier analysis differs from that uncovered in differential expression or variance analysis. mCOPA is an important new tool for the exploration of cancer datasets and the discovery of new cancer subtypes, and can be combined with pathway and functional analysis approaches to discover mechanisms underpinning heterogeneity in cancers.
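The sketch below shows a COPA-style outlier statistic extended to both tails in the spirit described above: each gene is median-centred and MAD-scaled, and upper and lower percentiles of the transformed values serve as over- and under-expression outlier scores. The percentile choice and toy data are assumptions; mCOPA itself contains further refinements.

```python
import numpy as np

def copa_scores(expr, q=90):
    """COPA-style outlier scores per gene (rows = genes, columns = samples).

    Values are median-centred and scaled by the median absolute deviation (MAD);
    the q-th percentile of the transformed values scores over-expressed outliers
    and the (100 - q)-th percentile scores under-expressed outliers.
    """
    expr = np.asarray(expr, dtype=float)
    med = np.median(expr, axis=1, keepdims=True)
    mad = np.median(np.abs(expr - med), axis=1, keepdims=True)
    z = (expr - med) / np.where(mad == 0, 1.0, mad)
    return np.percentile(z, q, axis=1), np.percentile(z, 100 - q, axis=1)

# Toy matrix: gene 0 over-expressed in a small subset of samples, gene 1 under-expressed
# in a small subset, gene 2 unremarkable.
rng = np.random.default_rng(0)
expr = rng.normal(0, 1, size=(3, 60))
expr[0, :10] += 8.0
expr[1, :10] -= 8.0
up, down = copa_scores(expr)
print(np.round(up, 2), np.round(down, 2))
```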
Abstract:
In this paper, we present WebPut, a prototype system that adopts a novel web-based approach to the data imputation problem. Towards this, WebPut utilizes the available information in an incomplete database in conjunction with the data consistency principle. Moreover, WebPut extends effective Information Extraction (IE) methods for the purpose of formulating web search queries that are capable of retrieving missing values with high accuracy. WebPut employs a confidence-based scheme that efficiently leverages our suite of data imputation queries to automatically select the most effective imputation query for each missing value. A greedy iterative algorithm is also proposed to schedule the imputation order of the different missing values in a database, and in turn the issuing of their corresponding imputation queries, to improve the accuracy and efficiency of WebPut. Experiments based on several real-world data collections demonstrate that WebPut outperforms existing approaches.
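A highly simplified sketch of the confidence-based greedy scheduling idea follows: repeatedly impute the missing cell whose best candidate query is currently most confident, so that later queries can exploit newly filled values. The `candidate_queries` function is a hypothetical stand-in; WebPut's actual query formulation, confidence model and web retrieval are not reproduced here.

```python
# Hypothetical sketch of confidence-based greedy scheduling for web-based imputation.
# `candidate_queries` stands in for WebPut's suite of imputation queries: it would return
# (confidence, value) candidates for one missing cell.  The placeholder below simply grows
# more "confident" as more of the record is filled in; no real web retrieval is performed.
def candidate_queries(record, attribute):
    known = sum(v is not None for v in record.values())
    return [(known / len(record), f"imputed_{attribute}")]

def greedy_impute(records):
    missing = [(i, a) for i, r in enumerate(records) for a, v in r.items() if v is None]
    while missing:
        # Choose the missing cell whose best imputation query is currently most confident.
        best_conf, best_cell, best_value = -1.0, None, None
        for i, a in missing:
            conf, value = max(candidate_queries(records[i], a))
            if conf > best_conf:
                best_conf, best_cell, best_value = conf, (i, a), value
        i, a = best_cell
        records[i][a] = best_value          # later queries can now use this filled value
        missing.remove(best_cell)
    return records

rows = [{"title": "A", "year": None, "venue": None},
        {"title": "B", "year": 2010, "venue": None}]
print(greedy_impute(rows))
```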
Abstract:
Background: Accumulated biological research outcomes show that biological functions do not depend on individual genes but on complex gene networks. Microarray data are widely used to cluster genes according to their expression levels across experimental conditions. However, functionally related genes generally do not show coherent expression across all conditions, since any given cellular process is active only under a subset of conditions. Biclustering finds gene clusters that have similar expression levels across a subset of conditions. This paper proposes a seed-based algorithm that identifies coherent genes in an exhaustive but efficient manner.
Methods: To find the biclusters in a gene expression dataset, we exhaustively select combinations of genes and conditions as seeds to create candidate bicluster tables. The tables have two columns: (a) a gene set, and (b) the conditions on which the gene set has expression levels dissimilar to the seed. First, the genes with fewer than the maximum number of dissimilar conditions are identified and a table of these genes is created. Second, the rows that have the same dissimilar conditions are grouped together. Third, the table is sorted in ascending order based on the number of dissimilar conditions. Finally, beginning with the first row of the table, a test is run repeatedly to determine whether the cardinality of the gene set in the row is greater than the minimum threshold number of genes in a bicluster. If so, a bicluster is output and the corresponding row is removed from the table. Repeating this process, all biclusters in the table are systematically identified until the table becomes empty.
Conclusions: This paper presents a novel biclustering algorithm for the identification of additive biclusters. Since it exhaustively tests combinations of genes and conditions, the additive biclusters can be found more readily.
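A simplified, hedged sketch of this table-based procedure for a single seed gene is given below: conditions on which a gene's offset from the seed deviates from its median offset by more than a tolerance are treated as dissimilar, genes are grouped by their dissimilar-condition sets, the groups are sorted in ascending order of dissimilar conditions, and sufficiently large groups are emitted as additive biclusters. The dissimilarity rule, thresholds and toy data are assumptions rather than the paper's exact definitions.

```python
import numpy as np
from collections import defaultdict

def seed_biclusters(expr, seed_gene, delta=0.5, max_dissimilar=3, min_genes=3):
    """Simplified candidate-bicluster-table procedure for a single seed gene.

    For an additive bicluster, a member gene should differ from the seed by a roughly
    constant offset on the bicluster's conditions; conditions where the offset deviates
    from the gene's median offset by more than `delta` are treated as dissimilar.
    """
    expr = np.asarray(expr, dtype=float)
    n_genes, n_cond = expr.shape
    table = defaultdict(list)                         # dissimilar-condition set -> gene list
    for g in range(n_genes):
        offset = expr[g] - expr[seed_gene]
        dissim = tuple(np.flatnonzero(np.abs(offset - np.median(offset)) > delta))
        if len(dissim) <= max_dissimilar:             # keep genes with few dissimilar conditions
            table[dissim].append(g)                   # group rows sharing the same set
    biclusters = []
    for dissim, genes in sorted(table.items(), key=lambda kv: len(kv[0])):  # ascending order
        if len(genes) >= min_genes:                   # emit rows that pass the size threshold
            biclusters.append((genes, [c for c in range(n_cond) if c not in dissim]))
    return biclusters

# Toy data: genes 1-3 share an additive pattern with the seed (gene 0) on conditions 0-4;
# conditions 5-7 break the pattern, and gene 4 is unrelated.
base = np.array([1.0, 2.0, 0.5, 3.0, 1.5, 0.0, 0.0, 0.0])
expr = np.vstack([base, base + 1.0, base + 2.0, base - 1.0,
                  [5.0, 0.0, 4.0, 1.0, 2.0, 3.0, 0.5, 2.5]])
expr[1:4, 5:] = [[4.0, -2.0, 6.0], [-3.0, 5.0, 1.0], [2.0, 7.0, -4.0]]
print(seed_biclusters(expr, seed_gene=0))
```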
Abstract:
miRDeep and its variants are widely used to quantify known and novel microRNAs (miRNAs) from small RNA sequencing (RNAseq) data. This article describes miRDeep*, our integrated miRNA identification tool, which is modeled on miRDeep but improves the precision of detecting novel miRNAs by introducing new strategies to identify precursor miRNAs. miRDeep* has a user-friendly graphical interface and accepts raw data in FastQ and Sequence Alignment Map (SAM) or the binary equivalent (BAM) format. Known and novel miRNA expression levels, as measured by the number of reads, are displayed in an interface that shows each RNAseq read relative to the pre-miRNA hairpin. The secondary pre-miRNA structure and read locations for each predicted miRNA are shown and kept in a separate figure file. Moreover, the target genes of known and novel miRNAs are predicted using the TargetScan algorithm, and the targets are ranked according to their confidence scores. miRDeep* is an integrated standalone application in which sequence alignment, pre-miRNA secondary structure calculation and graphical display are implemented purely in Java. The application can be run on an ordinary personal computer with 1.5 GB of memory. Further, we show that miRDeep* outperformed existing miRNA prediction tools on our LNCaP and other small RNAseq datasets. miRDeep* is freely available online at http://www.australianprostatecentre.org/research/software/mirdeep-star
Abstract:
The aim of this paper is to implement a game-theory-based offline mission path planner for aerial inspection tasks of large linear infrastructures. Like most real-world optimisation problems, mission path planning involves a number of objectives that should ideally be minimised simultaneously. The goal of this work is therefore to develop a Multi-Objective (MO) optimisation tool able to provide a set of optimal solutions for the inspection task, given the environment data, the mission requirements and the definition of the objectives to minimise. Results indicate the robustness of the method and its capability to find the trade-offs among the Pareto-optimal solutions.
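As a small illustration of the kind of trade-off set such a Multi-Objective planner reports, the sketch below extracts the non-dominated (Pareto-optimal) candidate paths under several objectives to be minimised; the objectives and values are placeholders, and the paper's game-theoretic optimiser itself is not reproduced.

```python
# Minimal sketch of Pareto-front extraction for candidate inspection paths evaluated on
# several objectives to minimise (e.g. path length, energy, risk); values are placeholders.
def dominates(a, b):
    """True if objective vector `a` is at least as good as `b` everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates if other is not c)]

paths = [(12.0, 4.1, 0.3),   # (length km, energy kWh, risk score) for each candidate path
         (10.5, 5.0, 0.2),
         (13.0, 4.5, 0.4),   # dominated by the first candidate
         (11.0, 3.9, 0.5)]
print(pareto_front(paths))
```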
Abstract:
With the explosive growth of resources available through the Internet, information mismatching and overload have become a severe concern to users. Web users are commonly overwhelmed by huge volumes of information and are faced with the challenge of finding the most relevant and reliable information in a timely manner. Personalised information gathering and recommender systems represent state-of-the-art tools for efficient selection of the most relevant and reliable information resources, and the interest in such systems has increased dramatically over the last few years. However, web personalisation has not yet been well exploited; from both technological and social perspectives, difficulties arise in selecting resources through recommender systems. Aiming to promote high-quality research to overcome these challenges, this paper provides a comprehensive survey of recent work and achievements in the areas of personalised web information gathering and recommender systems. The survey covers concept-based techniques exploited in personalised information gathering and recommender systems.
Abstract:
Research has long documented the value that design brings to the innovation of products and services. The research landscape has transformed in the last decade and now reflects the value of design as a different way of thinking that can be applied to the innovation of business models and as a catalyst for strategic growth. This paper presents a case study of gathering deep customer insights through a design-led innovation approach and reveals industry perspectives and attitudes towards the value of deep customer insights within the context of a leading Australian airport corporation. The findings highlight that the process of gathering deep customer insights encourages a design-led approach to testing assumptions and developing stronger customer engagement. The richness of the deep customer insights also provided a bridge to future thinking by provoking possible product, service and business innovations that aligned with the airport corporation's vision. The implications of the study reveal how quantitative market data, which reveal broad sociocultural trends about 'how' and 'what' customers interact with within an airport, can be strongly validated and built upon through qualitative deep customer insights that explore 'why' those choices to interact are made. Future research is then presented which aims to widely disseminate a design-led approach to innovation among internal stakeholders of the airport corporation through the development of a digital strategy.