945 resultados para Density-based Scanning Algorithm


Relevância:

100.00% 100.00%

Publicador:

Resumo:

Smart Card Automated Fare Collection (AFC) data has been extensively exploited to understand passenger behavior, passenger segment, trip purpose and improve transit planning through spatial travel pattern analysis. The literature has been evolving from simple to more sophisticated methods such as from aggregated to individual travel pattern analysis, and from stop-to-stop to flexible stop aggregation. However, the issue of high computing complexity has limited these methods in practical applications. This paper proposes a new algorithm named Weighted Stop Density Based Scanning Algorithm with Noise (WS-DBSCAN) based on the classical Density Based Scanning Algorithm with Noise (DBSCAN) algorithm to detect and update the daily changes in travel pattern. WS-DBSCAN converts the classical quadratic computation complexity DBSCAN to a problem of sub-quadratic complexity. The numerical experiment using the real AFC data in South East Queensland, Australia shows that the algorithm costs only 0.45% in computation time compared to the classical DBSCAN, but provides the same clustering results.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Integrating evidence from multiple domains is useful in prioritizing disease candidate genes for subsequent testing. We ranked all known human genes (n = 3819) under linkage peaks in the Irish Study of High-Density Schizophrenia Families using three different evidence domains: 1) a meta-analysis of microarray gene expression results using the Stanley Brain collection, 2) a schizophrenia protein-protein interaction network, and 3) a systematic literature search. Each gene was assigned a domain-specific p-value and ranked after evaluating the evidence within each domain. For comparison to this
ranking process, a large-scale candidate gene hypothesis was also tested by including genes with Gene Ontology terms related to neurodevelopment. Subsequently, genotypes of 3725 SNPs in 167 genes from a custom Illumina iSelect array were used to evaluate the top ranked vs. hypothesis selected genes. Seventy-three genes were both highly ranked and involved in neurodevelopment (category 1) while 42 and 52 genes were exclusive to neurodevelopment (category 2) or highly ranked (category 3), respectively. The most significant associations were observed in genes PRKG1, PRKCE, and CNTN4 but no individual SNPs were significant after correction for multiple testing. Comparison of the approaches showed an excess of significant tests using the hypothesis-driven neurodevelopment category. Random selection of similar sized genes from two independent genome-wide association studies (GWAS) of schizophrenia showed the excess was unlikely by chance. In a further meta-analysis of three GWAS datasets, four candidate SNPs reached nominal significance. Although gene ranking using integrated sources of prior information did not enrich for significant results in the current experiment, gene selection using an a priori hypothesis (neurodevelopment) was superior to random selection. As such, further development of gene ranking strategies using more carefully selected sources of information is warranted.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

We propose a new technique to perform unsupervised data classification (clustering) based on density induced metric and non-smooth optimization. Our goal is to automatically recognize multidimensional clusters of non-convex shape. We present a modification of the fuzzy c-means algorithm, which uses the data induced metric, defined with the help of Delaunay triangulation. We detail computation of the distances in such a metric using graph algorithms. To find optimal positions of cluster prototypes we employ the discrete gradient method of non-smooth optimization. The new clustering method is capable to identify non-convex overlapped d-dimensional clusters.


Relevância:

100.00% 100.00%

Publicador:

Resumo:

The Information Bottleneck method aims to extract a compact representation which preserves the maximum relevant information. The sub-optimality in agglomerative Information Bottleneck (aIB) algorithm restricts the applications of Information Bottleneck method. In this paper, the concept of density-based chains is adopted to evaluate the information loss among the neighbors of an element, rather than the information loss between pairs of elements. The DaIB algorithm is then presented to alleviate the sub-optimality problem in aIB while simultaneously keeping the useful hierarchical clustering tree-structure. The experiment results on the benchmark data sets show that the DaIB algorithm can get more relevant information and higher precision than aIB algorithm, and the paired t-test indicates that these improvements are statistically significant.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Industrial applications of computer vision sometimes require detection of atypical objects that occur as small groups of pixels in digital images. These objects are difficult to single out because they are small and randomly distributed. In this work we propose an image segmentation method using the novel Ant System-based Clustering Algorithm (ASCA). ASCA models the foraging behaviour of ants, which move through the data space searching for high data-density regions, and leave pheromone trails on their path. The pheromone map is used to identify the exact number of clusters, and assign the pixels to these clusters using the pheromone gradient. We applied ASCA to detection of microcalcifications in digital mammograms and compared its performance with state-of-the-art clustering algorithms such as 1D Self-Organizing Map, k-Means, Fuzzy c-Means and Possibilistic Fuzzy c-Means. The main advantage of ASCA is that the number of clusters needs not to be known a priori. The experimental results show that ASCA is more efficient than the other algorithms in detecting small clusters of atypical data.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, a novel approach is proposed to automatically generate both watercolor painting and pencil sketch drawing, or binary image of contour, from realism-style photo by using DBSCAN color clustering based on HSV color space. While the color clusters produced by proposed methods help to create watercolor painting, the noise pixels are useful to generate the pencil sketch drawing. Moreover, noise pixels are reassigned to color clusters by a novel algorithm to refine the contour in the watercolor painting. The main goal of this paper is to inspire non-professional artists' imagination to produce traditional style painting easily by only adjusting a few parameters. Also, another contribution of this paper is to propose an easy method to produce the binary image of contour, which is a vice product when mining image data by DBSCAN clustering. Thus the binary image is useful in resource limited system to reduce data but keep enough information of images. © 2007 IEEE.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In Web service based systems, new value-added Web services can be constructed by integrating existing Web services. A Web service may have many implementations, which are functionally identical, but have different Quality of Service (QoS) attributes, such as response time, price, reputation, reliability, availability and so on. Thus, a significant research problem in Web service composition is how to select an implementation for each of the component Web services so that the overall QoS of the composite Web service is optimal. This is so called QoS-aware Web service composition problem. In some composite Web services there are some dependencies and conflicts between the Web service implementations. However, existing approaches cannot handle the constraints. This paper tackles the QoS-aware Web service composition problem with inter service dependencies and conflicts using a penalty-based genetic algorithm (GA). Experimental results demonstrate the effectiveness and the scalability of the penalty-based GA.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Cloud computing is a latest new computing paradigm where applications, data and IT services are provided over the Internet. Cloud computing has become a main medium for Software as a Service (SaaS) providers to host their SaaS as it can provide the scalability a SaaS requires. The challenges in the composite SaaS placement process rely on several factors including the large size of the Cloud network, SaaS competing resource requirements, SaaS interactions between its components and SaaS interactions with its data components. However, existing applications’ placement methods in data centres are not concerned with the placement of the component’s data. In addition, a Cloud network is much larger than data center networks that have been discussed in existing studies. This paper proposes a penalty-based genetic algorithm (GA) to the composite SaaS placement problem in the Cloud. We believe this is the first attempt to the SaaS placement with its data in Cloud provider’s servers. Experimental results demonstrate the feasibility and the scalability of the GA.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In the past few years, the virtual machine (VM) placement problem has been studied intensively and many algorithms for the VM placement problem have been proposed. However, those proposed VM placement algorithms have not been widely used in today's cloud data centers as they do not consider the migration cost from current VM placement to the new optimal VM placement. As a result, the gain from optimizing VM placement may be less than the loss of the migration cost from current VM placement to the new VM placement. To address this issue, this paper presents a penalty-based genetic algorithm (GA) for the VM placement problem that considers the migration cost in addition to the energy-consumption of the new VM placement and the total inter-VM traffic flow in the new VM placement. The GA has been implemented and evaluated by experiments, and the experimental results show that the GA outperforms two well known algorithms for the VM placement problem.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

During the past few decades, developing efficient methods to solve dynamic facility layout problems has been focused on significantly by practitioners and researchers. More specifically meta-heuristic algorithms, especially genetic algorithm, have been proven to be increasingly helpful to generate sub-optimal solutions for large-scale dynamic facility layout problems. Nevertheless, the uncertainty of the manufacturing factors in addition to the scale of the layout problem calls for a mixed genetic algorithm–robust approach that could provide a single unlimited layout design. The present research aims to devise a customized permutation-based robust genetic algorithm in dynamic manufacturing environments that is expected to be generating a unique robust layout for all the manufacturing periods. The numerical outcomes of the proposed robust genetic algorithm indicate significant cost improvements compared to the conventional genetic algorithm methods and a selective number of other heuristic and meta-heuristic techniques.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

Evaluation of intermolecular interactions in terms of both experimental and theoretical charge density analyses has produced a unified picture with which to classify strong and weak hydrogen bonds, along with van der Waals interactions, into three regions.

Relevância:

100.00% 100.00%

Publicador:

Resumo:

In this paper, we propose a new token-based distributed algorithm for total order atomic broadcast. We have shown that the proposed algorithm requires lesser number of messages compared to the algorithm where broadcast servers use unicasting to send messages to other broadcast servers. The traditional method of broadcasting requires 3(N - 1) messages to broadcast an application message, where N is the number of broadcast servers present in the system. In this algorithm, the maximum number of token messages required to broadcast an application message is 2N. For a heavily loaded system, the average number of token messages required to broadcast an application message reduces to 2, which is a substantial improvement over the traditional broadcasting approach.