22 results for Clustering search algorithm


Relevance:

30.00% 30.00%

Publisher:

Abstract:

Magnetoencephalography (MEG), a non-invasive technique for characterizing brain electrical activity, is gaining popularity as a tool for assessing group-level differences between experimental conditions. One method for assessing task-condition effects involves beamforming, where a weighted sum of field measurements is used to tune activity on a voxel-by-voxel basis. However, this method has been shown to produce inhomogeneous smoothness differences as a function of signal-to-noise across a volumetric image, which can then produce false positives at the group level. Here we describe a novel method for group-level analysis with MEG beamformer images that utilizes the peak locations within each participant's volumetric image to assess group-level effects. We compared our peak-clustering algorithm with SnPM using simulated data. We found that our method was immune to artefactual group effects that can arise as a result of inhomogeneous smoothness differences across a volumetric image. We also used our peak-clustering algorithm on experimental data and found that regions were identified that corresponded with task-related regions identified in the literature. These findings suggest that our technique is a robust method for group-level analysis with MEG beamformer images.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Objective: Recently, much research has applied nature-inspired algorithms to complex machine learning tasks. Ant colony optimization (ACO) is one such algorithm based on swarm intelligence and is derived from a model inspired by the collective foraging behavior of ants. Taking advantage of ACO traits such as self-organization and robustness, this paper investigates ant-based algorithms for gene expression data clustering and associative classification. Methods and material: An ant-based clustering algorithm (Ant-C) and an ant-based association rule mining algorithm (Ant-ARM) are proposed for gene expression data analysis. The proposed algorithms make use of the natural behavior of ants, such as cooperation and adaptation, to allow for a flexible, robust search for a good candidate solution. Results: Ant-C has been tested on three datasets selected from the Stanford Genomic Resource Database and achieved relatively high accuracy compared to other classical clustering methods. Ant-ARM has been tested on the acute lymphoblastic leukemia (ALL)/acute myeloid leukemia (AML) dataset and generated about 30 classification rules with high accuracy. Conclusions: Ant-C can generate an optimal number of clusters without incorporating any other algorithms such as K-means or agglomerative hierarchical clustering. For associative classification, while a few of the well-known algorithms such as Apriori, FP-growth and Magnum Opus are unable to mine any association rules from the ALL/AML dataset within a reasonable period of time, Ant-ARM is able to extract associative classification rules.
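
As a rough illustration of how an ant-based clustering step can work, the sketch below uses the pick-up and drop probabilities of the basic ant clustering model (Deneubourg et al.). The abstract does not state whether Ant-C uses these formulas, so this is an assumption; the function names and the constants `k1` and `k2` are illustrative only.

```python
def pick_probability(local_density: float, k1: float = 0.1) -> float:
    """Chance that an unladen ant picks up an item.

    local_density is the fraction of similar items around the item (0..1).
    Isolated items (low density) are very likely to be picked up.
    """
    return (k1 / (k1 + local_density)) ** 2


def drop_probability(local_density: float, k2: float = 0.3) -> float:
    """Chance that a laden ant drops its item at the current position.

    Items are likely to be dropped where many similar items already lie.
    """
    return (local_density / (k2 + local_density)) ** 2


# An isolated item is always picked up and never dropped in empty space.
print(pick_probability(0.0), drop_probability(0.0))
# In a dense neighbourhood the item tends to stay where it is.
print(pick_probability(0.9), drop_probability(0.9))
```

Repeating these two decisions over many ant moves gradually piles similar items together, which is why no target number of clusters has to be given in advance.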

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Web document cluster analysis plays an important role in information retrieval by organizing large amounts of documents into a small number of meaningful clusters. Traditional web document clustering is based on the Vector Space Model (VSM), which takes into account only two-level (document and term) knowledge granularity but ignores the bridging paragraph granularity. However, this two-level granularity may lead to unsatisfactory clustering results with “false correlation”. To deal with this problem, a Hierarchical Representation Model with Multi-granularity (HRMM), which consists of a five-layer representation of data and a two-phase clustering process, is proposed based on granular computing and article structure theory. To deal with the zero-valued similarity problem resulting from the sparse term-paragraph matrix, an ontology-based strategy and a tolerance-rough-set-based strategy are introduced into HRMM. By using granular computing, structural knowledge hidden in documents can be captured more efficiently and effectively in HRMM, and thus web document clusters of higher quality can be generated. Extensive experiments show that HRMM, HRMM with the tolerance-rough-set strategy, and HRMM with ontology all significantly outperform VSM and a representative non-VSM-based algorithm, WFP, in terms of the F-Score.
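
The zero-valued similarity problem can be seen in a minimal sketch, assuming plain term-count vectors and cosine similarity. The example texts and the function name `cosine_similarity` are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity of two texts under a bag-of-words term-count model."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    # Only terms present in both texts contribute to the dot product.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


related_a = "car engine repair"
related_b = "automobile motor maintenance"

# Same topic, no shared term: the similarity is exactly zero.
print(cosine_similarity(related_a, related_b))
# Shared surface terms give a non-zero score.
print(cosine_similarity(related_a, "car repair manual"))
```

Short units such as paragraphs share few literal terms, so many pairs score zero; mapping terms to shared concepts is one way around this, which is the kind of gap the ontology and tolerance-rough-set strategies are meant to close.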

Relevance:

30.00% 30.00%

Publisher:

Abstract:

Background: Despite initial concerns about the sensitivity of the proposed diagnostic criteria for DSM-5 Autism Spectrum Disorder (ASD; e.g. Gibbs et al., 2012; McPartland et al., 2012), evidence is growing that the DSM-5 criteria provide an inclusive description with both good sensitivity and specificity (e.g. Frazier et al., 2012; Kent, Carrington et al., 2013). The capacity of the criteria to provide high levels of sensitivity and specificity comparable with DSM-IV-TR, however, relies on careful measurement to ensure that appropriate items from diagnostic instruments map onto the new DSM-5 descriptions. Objectives: To use an existing DSM-5 diagnostic algorithm (Kent, Carrington et al., 2013) to identify a set of ‘essential’ behaviors sufficient to make a reliable and accurate diagnosis of DSM-5 Autism Spectrum Disorder (ASD) across age and ability level. Methods: Specific behaviors were identified and tested from the recently published DSM-5 algorithm for the Diagnostic Interview for Social and Communication Disorders (DISCO). Analyses were run on existing DISCO datasets, with a total participant sample size of 335. Three studies provided step-by-step development towards identification of a minimum set of items. Study 1 identified the most highly discriminating items (p<.0001). Study 2 used a lower selection threshold than in Study 1 (p<.05) to facilitate better representation of the full DSM-5 ASD profile. Study 3 included additional items previously reported as significantly more frequent in individuals with higher ability. The discriminant validity of all three item sets was tested using Receiver Operating Characteristic curves. Finally, sensitivity across age and ability was investigated in a subset of individuals with ASD (n=190). Results: Study 1 identified an item set (14 items) with good discriminant validity, but which predominantly measured social-communication behaviors (11/14). The Study 2 item set (48 items) better represented the DSM-5 ASD profile and had good discriminant validity, but the item set lacked sensitivity for individuals with higher ability. The final Study 3 adjusted item set (54 items) improved sensitivity for individuals with higher ability, and its performance was comparable to the published DISCO DSM-5 algorithm. Conclusions: This work represents a first attempt to derive a reduced set of behaviors for DSM-5 directly from an existing standardized ASD developmental history interview. Further work involving existing ASD diagnostic tools with community-based and well characterized research samples will be required to replicate these findings and exploit their potential to contribute to a more efficient and focused ASD diagnostic process.
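
For readers unfamiliar with the measures used here, the sketch below shows how sensitivity, specificity and the points of a Receiver Operating Characteristic curve are computed. It assumes a simple rule that flags a case when a score reaches a threshold; the scores and labels are made-up toy values, not DISCO data, and the function names are illustrative.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Treat score >= threshold as a positive call.

    Returns (sensitivity, specificity). Assumes both classes are present.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y and s >= threshold)
    fn = sum(1 for s, y in pairs if y and s < threshold)
    tn = sum(1 for s, y in pairs if not y and s < threshold)
    fp = sum(1 for s, y in pairs if not y and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_points(scores, labels):
    """One (false positive rate, true positive rate) point per distinct threshold."""
    points = []
    for t in sorted(set(scores), reverse=True):
        sens, spec = sensitivity_specificity(scores, labels, t)
        points.append((1 - spec, sens))
    return points


# Toy data (hypothetical): a score per person and whether the diagnosis applies.
scores = [9, 8, 7, 5, 4, 3, 2, 1]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

print(sensitivity_specificity(scores, labels, threshold=4))
print(roc_points(scores, labels))
```

Lowering the threshold raises sensitivity at the cost of specificity, which is the trade-off a ROC curve traces out across all thresholds.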

Relevance:

30.00% 30.00%

Publisher:

Abstract:

The objective of this study was to identify a set of 'essential' behaviours sufficient for diagnosis of DSM-5 Autism Spectrum Disorder (ASD). Highly discriminating, 'essential' behaviours were identified from the published DSM-5 algorithm developed for the Diagnostic Interview for Social and Communication Disorders (DISCO). Study 1 identified a reduced item set (48 items) with good predictive validity (as measured using receiver operating characteristic curves) that represented all symptom sub-domains described in the DSM-5 ASD criteria but lacked sensitivity for individuals with higher ability. An adjusted essential item set (54 items; Study 2) had good sensitivity when applied to individuals with higher ability and performance was comparable to the published full DISCO DSM-5 algorithm. Investigation at the item level revealed that the most highly discriminating items predominantly measured social-communication behaviours. This work represents a first attempt to derive a reduced set of behaviours for DSM-5 directly from an existing standardised ASD developmental history interview and has implications for the use of DSM-5 criteria for clinical and research practice. © 2014 The Authors.

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This work contributes to the development of search engines that self-adapt their size in response to fluctuations in workload. Deploying a search engine in an Infrastructure as a Service (IaaS) cloud facilitates allocating or deallocating computational resources to or from the engine. In this paper, we focus on the problem of regrouping the metric-space search index when the number of virtual machines used to run the search engine is modified to reflect changes in workload. We propose an algorithm for incrementally adjusting the index to fit the varying number of virtual machines. We tested its performance using a custom-built prototype search engine deployed in the Amazon EC2 cloud, while calibrating the results to compensate for the performance fluctuations of the platform. Our experiments show that, when compared with computing the index from scratch, the incremental algorithm speeds up the index computation 2–10 times while maintaining a similar search performance.
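
The abstract does not describe the incremental algorithm itself. The sketch below is therefore not the authors' method, only a generic illustration of why incremental regrouping can be cheaper than rebuilding: it changes the number of groups by splitting the largest group or merging the two smallest, and leaves every other group untouched. The function name `regroup` and the split-in-half rule are assumptions.

```python
def regroup(groups: list[list[int]], target: int) -> list[list[int]]:
    """Return a grouping with `target` groups, touching as few groups as possible.

    groups: current index groups, each a list of object ids.
    target: desired number of groups (for example, one per virtual machine).
    """
    if target < 1:
        raise ValueError("target must be at least 1")
    groups = [list(g) for g in groups]  # work on a copy

    # Growing: repeatedly split the largest group in two.
    while len(groups) < target:
        groups.sort(key=len)
        largest = groups.pop()
        if len(largest) < 2:
            raise ValueError("not enough objects to form that many groups")
        # A real metric-space index would split by distance, not by position.
        mid = len(largest) // 2
        groups += [largest[:mid], largest[mid:]]

    # Shrinking: repeatedly merge the two smallest groups.
    while len(groups) > target:
        groups.sort(key=len)
        first, second = groups.pop(0), groups.pop(0)
        groups.append(first + second)

    return groups


index = [[1, 2, 3, 4], [5, 6], [7]]
print(regroup(index, 4))  # one group split, two left untouched
print(regroup(index, 2))  # two smallest merged, largest left untouched
```

Only the groups that are split or merged need to be recomputed and moved, whereas a rebuild from scratch reassigns every object.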

Relevance:

30.00% 30.00%

Publisher:

Abstract:

This research focuses on automatically adapting the size of a search engine in response to fluctuations in query workload. Deploying a search engine in an Infrastructure as a Service (IaaS) cloud facilitates allocating or deallocating computer resources to or from the engine. Our solution is an adaptive search engine that repeatedly re-evaluates its load and, when appropriate, switches over to a different number of active processors. We break the problem into three sub-problems: Continually determining the Number of Processors (CNP), the New Grouping Problem (NGP) and the Regrouping Order Problem (ROP). CNP is the problem of determining, in light of changes in the query workload, the ideal number of processors p to keep active in the search engine at any given time. NGP arises once a change in the number of processors has been decided: it must then be determined which groups of search data will be distributed across the processors. ROP is the problem of redistributing this data onto processors while keeping the engine responsive and minimising both the switchover time and the incurred network load. We propose solutions for these sub-problems. For NGP we propose an algorithm for incrementally adjusting the index to fit the varying number of virtual machines. For ROP we present an efficient method for redistributing data among processors while keeping the search engine responsive. For CNP we propose an algorithm that determines the new size of the search engine by re-evaluating its load. We tested the performance of the solution using a custom-built prototype search engine deployed in the Amazon EC2 cloud. Our experiments show that, compared with computing the index from scratch, the incremental NGP algorithm speeds up the index computation 2–10 times while maintaining a similar search performance. The chosen redistribution method is 25% to 50% faster than other methods and reduces the network load by around 30%. For CNP we present a deterministic algorithm that shows a good ability to determine a new size of the search engine. When combined, these algorithms give an adaptive algorithm that is able to adjust the search engine size to a variable workload.
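
To make the CNP sub-problem concrete, here is a minimal sketch of a deterministic sizing rule. It is an assumption for illustration, not the algorithm from this work: the function name, the per-processor capacity model and the `slack` parameter are all hypothetical.

```python
import math


def choose_processor_count(
    queries_per_sec: float,
    capacity_per_proc: float,
    current: int,
    slack: float = 0.2,
    min_procs: int = 1,
) -> int:
    """Pick the number of active processors for the observed query load.

    Grows as soon as the load exceeds what `current` processors can serve.
    Shrinks only to a size that still leaves `slack` spare capacity, so the
    engine does not flip back and forth around a boundary.
    """
    needed = math.ceil(queries_per_sec / capacity_per_proc)
    if needed > current:
        return max(min_procs, needed)
    comfortable = math.ceil(queries_per_sec / (capacity_per_proc * (1 - slack)))
    return max(min_procs, min(current, comfortable))


# Under-provisioned: 450 queries/s on 4 processors of 100 queries/s each.
print(choose_processor_count(450, 100, current=4))
# Over-provisioned: shrink from 8, but keep some headroom.
print(choose_processor_count(450, 100, current=8))
```

The rule is re-evaluated periodically; any change in its result is what would trigger the NGP and ROP steps described above.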