62 results for discovery of a similarity


Relevance:

100.00%

Abstract:

Autism Spectrum Disorder (ASD) is growing at a staggering rate, but little is known about the cause of this condition. Inferring learning patterns from therapeutic performance data, and subsequently clustering ASD children into subgroups, is important for understanding this domain and, more importantly, for informing evidence-based intervention. However, this data-driven task was difficult in the past because there was too little data for reliable analysis. For the first time, using data from a recent application for early intervention in autism (TOBY Play pad), whose download count now exceeds 4500, we present in this paper the automatic discovery of learning patterns across 32 skills in sensory, imitation and language. We use unsupervised learning methods for this task, but a notorious problem with existing methods is that the number of patterns must be specified correctly in advance, which in our case is even more difficult due to the complexity of the data. To this end, we appeal to recent Bayesian nonparametric methods, in particular Bayesian Nonparametric Factor Analysis. This model uses the Indian Buffet Process (IBP) as a prior on a binary matrix with infinitely many columns to allocate groups of intervention skills to children. The optimal number of learning patterns and the subgroup assignments are inferred automatically from data. Our experiments follow an exploratory approach and present several newly discovered learning patterns. To provide quantitative results, we also report a clustering evaluation against K-means and nonnegative matrix factorization (NMF). In addition to the novelty of this new problem, we demonstrate the suitability of Bayesian nonparametric models over parametric rivals.
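The Indian Buffet Process prior mentioned in this abstract can be illustrated with its standard generative ("customers and dishes") description. The sketch below is not the authors' model or inference code; it only draws a binary child-by-pattern matrix from the prior. The function names, the concentration value `alpha`, and the seed are assumptions chosen for illustration.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw from Poisson(rate) using Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_children, alpha, seed=0):
    """Draw a binary matrix from the IBP prior (hypothetical helper, prior only).

    Row i marks which latent patterns child i uses. The number of columns
    is not fixed in advance; it grows as new patterns are introduced.
    """
    rng = random.Random(seed)
    rows = []            # rows[i] = set of active column indices for child i
    column_counts = []   # column_counts[k] = how many earlier children use pattern k
    for i in range(1, num_children + 1):
        active = set()
        # Reuse an existing pattern k with probability m_k / i.
        for k, m_k in enumerate(column_counts):
            if rng.random() < m_k / i:
                active.add(k)
        # Introduce Poisson(alpha / i) brand-new patterns.
        for _ in range(sample_poisson(alpha / i, rng)):
            column_counts.append(0)
            active.add(len(column_counts) - 1)
        for k in active:
            column_counts[k] += 1
        rows.append(active)
    num_patterns = len(column_counts)
    return [[1 if k in row else 0 for k in range(num_patterns)] for row in rows]


if __name__ == "__main__":
    for row in sample_ibp(num_children=6, alpha=2.0, seed=1):
        print(row)
```

Larger values of `alpha` tend to produce more columns, which is how the prior lets the number of patterns be inferred from the data instead of being fixed beforehand.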

Relevance:

100.00%

Abstract:

The discovery of contexts is important for context-aware applications in pervasive computing. This is a challenging problem because of the streaming nature of the data and the complexity and changing nature of contexts. We propose a Bayesian nonparametric model for the detection of co-location contexts from Bluetooth signals. By using an Indian buffet process as the prior distribution, the model can discover the number of contexts automatically. We introduce a novel fixed-lag particle filter that processes data incrementally. This sampling scheme is especially suitable for pervasive computing because its computational requirements remain constant as the data grows. We examine our model on a synthetic dataset and two real-world datasets. To verify the discovered contexts, we compare them to the communities detected by the Louvain method and find a strong correlation between the results of the two methods. We also compare the fixed-lag particle filter with Gibbs sampling in terms of the normalized factorization error, and the two inference methods perform similarly. Because the fixed-lag particle filter processes each small chunk of data as it arrives and does not need to be restarted, its execution time is significantly shorter than that of Gibbs sampling.

Relevance:

100.00%

Abstract:

People are increasingly using social media, especially online communities, to discuss mental health issues and seek support. Understanding the topics, interactions, sentiment and clustering structures of these communities informs important aspects of mental health. It can potentially add knowledge about the underlying cognitive dynamics, mood-swing patterns, shared interests, and interaction. There has been growing research interest in analyzing online mental health communities; however, sentiment analysis of these communities has been largely under-explored. This study presents an analysis of online Live Journal communities with and without mental health-related conditions, including depression and autism. Latent topics for mood tags, affective words, and generic words in the content of the posts made in these communities were learned using nonparametric topic modelling. These representations were then fed into a nonparametric clustering method to discover meta-groups among the communities. The best clustering performance was achieved with the latent mood-based representation of the communities. The study also found significant differences in the usage of latent topics for mood tags and affective features between online communities with and without affective disorders. The findings reveal useful insights into hyper-group detection of online mental health-related communities.

Relevance:

100.00%

Abstract:

Monitoring daily physical activity plays an important role in disease prevention and intervention. This paper proposes an approach to monitoring body movement intensity levels from accelerometer data. We collect the data using an accelerometer in a realistic setting without any supervision. The ground truth for the activities is provided by the participants themselves using an experience sampling application running on their mobile phones. We compute a novel feature that has a strong correlation with movement intensity. We use the hierarchical Dirichlet process (HDP) model to detect the activity levels from this feature. Because it places Bayesian nonparametric priors over its parameters, the model can infer the number of levels automatically. By demonstrating the approach on the publicly available USC-HAD dataset, which includes ground-truth activity labels, we show a strong correlation between the discovered activity levels and the movement intensity of the activities. This correlation is further confirmed using our newly collected dataset. We further use the extracted patterns as features for clustering and classifying the activity sequences to improve performance.

Relevance:

100.00%

Abstract:

GPS trajectory datasets with high sampling rates are usually large, which challenges processing efficiency. Most of the data points on a trajectory carry little information. This paper summarizes trajectories using stop points. We define a new concept, stay stability (time divided by distance, i.e., the reciprocal of speed), between any two GPS points to detect stop points on individual trajectories. We propose a novel method, Mining Repeat Travel Behaviors Using Stop Regions (MRTBUSR). In MRTBUSR, a stop region is a popular region containing a certain number of close stop points that can be grouped into a cluster. We then retrieve common sequences of stop regions to denote repeat route patterns and further analyze the stop durations in a stop region to find repeat travel behaviors. Experiments on 20 labeled trajectories selected from GeoLife demonstrate the semantic effect, accuracy and near-linear efficiency of the proposed method.
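The stay stability quantity defined in this abstract (elapsed time divided by distance travelled) can be sketched as follows. This is only an illustration of the definition, not the MRTBUSR method itself: the haversine distance, the `(timestamp, latitude, longitude)` point layout, the helper names, and the threshold parameter `min_stability` are all assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def stay_stability(p, q):
    """Seconds per metre between points p and q, each (timestamp_s, lat, lon).

    Large values mean the device barely moved during the interval.
    """
    elapsed = abs(q[0] - p[0])
    distance = haversine_m(p[1], p[2], q[1], q[2])
    if distance == 0.0:
        # No movement at all: treat as infinitely stable unless no time passed.
        return math.inf if elapsed > 0 else 0.0
    return elapsed / distance


def candidate_stop_indices(track, min_stability):
    """Indices i where the step from track[i] to track[i + 1] looks like a stop."""
    return [
        i
        for i in range(len(track) - 1)
        if stay_stability(track[i], track[i + 1]) >= min_stability
    ]


if __name__ == "__main__":
    track = [(0, 39.90, 116.40), (60, 39.90, 116.40), (120, 39.91, 116.40)]
    print(candidate_stop_indices(track, min_stability=1.0))
```

The abstract defines the quantity between any two GPS points; the sketch applies it only to consecutive points for simplicity.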

Relevance:

100.00%

Abstract:

Bioprospecting of microalgal resources from diverse, ecologically distinctive locations, together with a better understanding of the physiological conditions of diverse habitats, will enable us to better exploit these organisms for the production of lipids and carotenoids. The potential for co-production of lipids and carotenoids, which may be beneficial to human health, has gained interest in recent decades. Methods for co-producing and separating higher-value compounds such as carotenoids and lipids can offset the cost of algal biofuel production, making this source more commercially viable.

Relevance:

100.00%

Abstract:

DNA-based approaches to the discovery of genes contributing to the development of type 2 diabetes have not been very successful despite substantial investments of time and money. The multiple gene-gene and gene-environment interactions that influence the development of type 2 diabetes mean that DNA approaches are not the ideal tool for defining the etiology of this complex disease. Gene expression-based technologies may prove to be a more rewarding strategy to identify diabetes candidate genes. There are a number of RNA-based technologies available to identify genes that are differentially expressed in various tissues in type 2 diabetes. These include differential display polymerase chain reaction (ddPCR), suppression subtractive hybridization (SSH), and cDNA microarrays. The power of new technologies to detect differential gene expression is ideally suited to studies utilizing appropriate animal models of human disease. We have shown that the gene expression approach, in combination with an excellent animal model such as the Israeli sand rat (Psammomys obesus), can provide novel genes and pathways that may be important in the disease process and provide novel therapeutic approaches. This paper will describe a new gene discovery, beacon, a novel gene linked with energy intake. As the functional characterization of novel genes discovered in our laboratory using this approach continues, it is anticipated that we will soon be able to compile a definitive list of genes that are important in the development of obesity and type 2 diabetes.

Relevance:

100.00%

Abstract:

In this paper, the zero-order Sugeno Fuzzy Inference System (FIS) that preserves the monotonicity property is studied. The sufficient conditions for the zero-order Sugeno FIS model to satisfy the monotonicity property are exploited as a set of useful governing equations to facilitate the FIS modelling process. The sufficient conditions suggest a fuzzy partition (at the rule antecedent part) and a monotonically ordered rule base (at the rule consequent part) that can preserve the monotonicity property. The investigation focuses on the use of two Similarity Reasoning (SR)-based methods, i.e., Analogical Reasoning (AR) and Fuzzy Rule Interpolation (FRI), to deduce each conclusion separately. It is shown that AR and FRI may not be a direct solution for modelling a multi-input FIS that fulfils the monotonicity property, owing to the difficulty of obtaining a set of monotonically ordered conclusions. As such, a Non-Linear Programming (NLP)-based SR scheme for constructing a monotonicity-preserving multi-input FIS model is proposed. In the proposed scheme, AR or FRI is first used to predict the rule conclusion of each observation. Then, a search algorithm is adopted to look for a set of consequents that minimizes the root mean square error with respect to the predicted conclusions. A constraint imposed by the sufficient conditions is also included in the search process. The applicability of the proposed scheme to fuzzy Failure Mode and Effect Analysis (FMEA) tasks is demonstrated. The results indicate that the proposed NLP-based SR scheme is useful for preserving the monotonicity property when building a multi-input FIS model with an incomplete rule base.
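To make the monotonicity idea in this abstract concrete, the sketch below evaluates a single-input zero-order Sugeno FIS and checks numerically whether its output is non-decreasing. It is a simplified illustration, not the NLP-based SR scheme from the paper: the single-input setting, the triangular membership functions, the grid-based check, and all function names are assumptions.

```python
def memberships(x, centers):
    """Triangular membership degrees over sorted centers; they sum to 1 at every x."""
    x = min(max(x, centers[0]), centers[-1])  # clamp to the covered range
    mu = [0.0] * len(centers)
    for i in range(len(centers) - 1):
        lo, hi = centers[i], centers[i + 1]
        if lo <= x <= hi:
            t = (x - lo) / (hi - lo)
            mu[i] = 1.0 - t
            mu[i + 1] = t
            break
    return mu


def sugeno_output(x, centers, consequents):
    """Zero-order Sugeno output: firing-strength-weighted average of constant consequents."""
    mu = memberships(x, centers)
    return sum(m * b for m, b in zip(mu, consequents)) / sum(mu)


def is_monotone(centers, consequents, steps=200, tol=1e-12):
    """Check on a uniform grid that the output never decreases as x increases."""
    lo, hi = centers[0], centers[-1]
    ys = [
        sugeno_output(lo + (hi - lo) * i / steps, centers, consequents)
        for i in range(steps + 1)
    ]
    return all(b >= a - tol for a, b in zip(ys, ys[1:]))


if __name__ == "__main__":
    centers = [0.0, 1.0, 2.0, 3.0]
    print(is_monotone(centers, [1.0, 2.0, 4.0, 8.0]))  # ordered consequents
    print(is_monotone(centers, [1.0, 5.0, 4.0, 8.0]))  # one consequent out of order
```

With this kind of partition, ordered consequents give a non-decreasing output, while a single out-of-order consequent breaks monotonicity. That is the difficulty the abstract attributes to conclusions deduced separately by AR or FRI.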

Relevance:

90.00%

Abstract:

Two Permian brachiopod genera, Rhynchopora King and Blasispirifer Kulikov, are reported for the first time from the lower part of the Moribu Formation (Middle Permian) in the Hida Gaien Belt, central Japan. The Moribu species are closely compared with similar forms from the Middle Permian Barabash Formation (lower Chandalaz Series) in the Barabash area of South Primorye, Russian Far East. The discovery of these two genera, which exhibit close relationships with Middle Permian brachiopod faunas of South Primorye and the broad Boreal Realm, implies that the Hida Gaien Belt was paleobiogeographically and paleogeographically close to the western part (Voznesenka Belt) of South Primorye. During the Permian, both were situated in a middle-latitude setting in the Northern Hemisphere on the southeastern side of the Bureya Block and lay proximal to and slightly northeast of the Sino-Korea Block.

Relevance:

90.00%

Abstract:

The systematic relationships among Australian palaemonid shrimps have been the subject of speculation for some time. A preliminary phylogenetic study was undertaken to clarify the relationships of five species, Macrobrachium intermedium (Stimpson), M. australiense (Holthuis), M. atactum (Riek), M. rosenbergii (de Man) and Palaemon serenus (Heller), using 16S rRNA mitochondrial gene sequences. Phylogenetic analyses indicated inconsistencies with the current classification in two respects. First, M. intermedium formed a very well-supported clade with P. serenus distinct from M. australiense, M. atactum and M. rosenbergii. Second, the two species from inland Australia, M. australiense and M. atactum, showed a high level of genetic similarity over a substantial geographic range, suggesting that they may represent conspecific populations. The taxonomic and biogeographic implications of these findings for Macrobrachium in Australia are discussed.

Relevance:

90.00%

Abstract:

New treatments are currently required for the common metabolic diseases obesity and type 2 diabetes. The identification of physiological and biochemical factors that underlie the metabolic disturbances observed in obesity and type 2 diabetes is a key step in developing better therapeutic outcomes. The discovery of new genes and pathways involved in the pathogenesis of these diseases is critical to this process; however, identifying genes that contribute to the risk of developing these diseases is a significant challenge, as obesity and type 2 diabetes are complex diseases with many genetic and environmental causes. A number of diverse approaches have been used to discover and validate potential new targets for obesity and diabetes. To date, DNA-based approaches using candidate gene and genome-wide linkage analysis have had limited success in identifying genomic regions or genes involved in the development of these diseases. Recent advances in the ability to evaluate linkage analysis data from large family pedigrees using variance components-based linkage analysis show great promise in robustly identifying genomic regions associated with the development of obesity and diabetes. RNA-based technologies such as cDNA microarrays have identified many genes differentially expressed in tissues of healthy and diseased subjects. Using a combined approach, we are endeavouring to focus attention on differentially expressed genes located in chromosomal regions previously linked with obesity and/or diabetes. Using this strategy, we have identified Beacon as a potential new target for obesity and diabetes.

Relevance:

90.00%

Abstract:

This paper presents an ensemble MML approach for the discovery of causal models. The component learners are based on the MML causal induction methods. Six different ensemble causal induction algorithms are proposed. Our experimental results reveal that (1) the ensemble MML causal induction approach achieves better learning accuracy and correctness than any single learner; (2) among all the ensemble causal induction algorithms examined, the weighted voting without seeding algorithm outperforms the rest; and (3) the ensemble CI algorithms appear to alleviate the local minimum problem. The only drawback of this method is that the time complexity increases by a factor of δ, where δ is the ensemble size.

Relevance:

90.00%

Abstract:

A major problem for a grid user is the discovery of currently available services. With a large number of services, it is beneficial for users to be able to discover the services that most closely match their requirements. This report shows how to extend some concepts of UDDI so that they are suitable for dynamic, parameter-based discovery of grid services.