425 results for Discriminative Itemsets


Relevance:

20.00%

Abstract:

Protein kinases, a family of enzymes, are used by living organisms as important signaling intermediaries for regulating critical biological processes such as memory, hormone response and cell growth. Unbalanced kinases are known to cause cancer and other diseases. The increasing efforts to collect, store and disseminate information about the entire kinase family not only yield a valuable data set for understanding cell regulation but also pose a big challenge: extracting valuable knowledge about metabolic pathways from the data. Data mining techniques that have been widely used to find frequent patterns in large datasets can be extended and adapted to kinase data as well. This paper proposes a framework for mining frequent itemsets from the collected kinase dataset. An experiment using AMPK regulation data demonstrates that our approaches are useful and efficient in analyzing kinase regulation data.

Relevance:

20.00%

Abstract:

The computational approach for identifying promoters on increasingly large genomic sequences has led to many false positives. The biological significance of promoter identification lies in the ability to locate true promoters with and without prior sequence contextual knowledge. Prior approaches to promoter modelling have involved artificial neural networks (ANNs) or hidden Markov models (HMMs), each producing adequate results on small scale identification tasks, i.e. narrow upstream regions. In this work, we present an architecture to support prokaryote promoter identification on large scale genomic sequences, i.e. not limited to narrow upstream regions. The significant contribution is the hybrid formed by aggregating the profile HMM with the ANN via Viterbi scoring optimizations. The architecture combines the modelling ability of the profile HMM with the ability of the ANN to associate the elements composing the promoter. We show that the hybrid approach is highly effective in comparison to profile HMMs and ANNs used separately. The contribution of the Viterbi optimizations in supporting the hybrid architecture is also highlighted: gains in sensitivity (+0.3), specificity (+0.65) and precision (+0.54) are achieved over existing approaches.

Relevance:

20.00%

Abstract:

Most face recognition (FR) algorithms require the face images to satisfy certain restrictions on aspects such as view angle, illumination and occlusion. What is needed in general, however, are techniques that can recognize any face image recognizable by human beings. This paper provides one potential solution to this problem. A method named Individual Discriminative Subspace (IDS) is proposed for robust face recognition under uncontrolled conditions. An IDS is a subspace in which only the images of one particular person converge around the origin while those of others scatter. Each IDS can be used to distinguish one individual from others. There is no restriction on the face images fed into the algorithm, which makes it practical for real-life applications. In the experiments, IDS is tested on two large face databases with extensive variations and performs significantly better than 12 existing FR techniques.

Relevance:

20.00%

Abstract:

Objective: To investigate the reliability and validity of five squat-based loading tests that are clinically appropriate for jumper's knee. The loading tests were step up, double leg squat, double leg squat on a 25-degree decline (decline squat), single leg decline squat, and decline hop. Design: Cross-sectional controlled cohort. Subjects without knee pain comprised the controls; those with extensor tendon pain comprised the jumper's knee group. Setting: Institutional athlete study group in Australia. Participants: Fifty-six elite adolescent basketball players participated in this study; thirteen comprised the jumper's knee group and fifteen formed a control group. Intervention: Each subject performed each loading test for baseline and reliability data on the first testing day. Subjects then performed three days of intensive (6 h daily) basketball training, after which each loading test was reexamined. Main outcome measures: Eleven point interval scale for pain. Results: The tests that best detected a change in pain due to intensive workload were the single leg decline squat and single leg decline hop. This study found that decline tests have better discriminative ability than the standard squat to detect change in jumper's knee pain due to intensive training. The typical error for these tests ranged from 0.3 to 0.5; however, caution should be exercised in interpreting these reliability figures because of the relatively low scores. Conclusions: The single leg decline squat is recommended in the physical assessment of adolescent jumper's knee. The decline squat was selected as the best clinical test over the decline hop because its performance was easier to standardise.

Relevance:

20.00%

Abstract:

Data perturbation is a popular method to achieve privacy-preserving data mining. However, distorted databases impose enormous overheads on mining algorithms compared to the original databases. In this paper, we present the GrC-FIM algorithm to address the efficiency problem in mining frequent itemsets from distorted databases. Two measures are introduced to overcome the weaknesses of existing work. Firstly, the concept of the independent granule is introduced, and granule inference is used to distinguish between non-independent itemsets and independent itemsets. We further prove that the support counts of non-independent itemsets can be directly derived from subitemsets, so that the error-prone reconstruction process can be avoided. This can improve the efficiency of the algorithm and yield more accurate results. Secondly, through the granular-bitmap representation, the support counts can be calculated efficiently. The empirical results on representative synthetic and real-world databases indicate that the proposed GrC-FIM algorithm outperforms the popular EMASK algorithm in both efficiency and support count reconstruction accuracy.
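As a rough illustration of why a bitmap representation makes support counting cheap, the following minimal sketch stores one bitmap per item and intersects bitmaps with a bitwise AND. It is a generic bitmap technique on an undistorted toy database, not the GrC-FIM algorithm or its granular-bitmap structure; the function names and the sample transactions are assumptions made for the example.

```python
from typing import Dict, FrozenSet, Iterable, List


def build_item_bitmaps(transactions: List[FrozenSet[str]]) -> Dict[str, int]:
    """Map each item to an integer whose bit i is set when transaction i contains it."""
    bitmaps: Dict[str, int] = {}
    for index, transaction in enumerate(transactions):
        for item in transaction:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << index)
    return bitmaps


def support_count(itemset: Iterable[str], bitmaps: Dict[str, int], n_transactions: int) -> int:
    """Count the transactions that contain every item by AND-ing the item bitmaps."""
    combined = (1 << n_transactions) - 1  # start with every transaction selected
    for item in itemset:
        combined &= bitmaps.get(item, 0)  # unknown items match no transaction
    return bin(combined).count("1")


# Hypothetical toy database, for illustration only.
transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "c"}),
    frozenset({"b", "c"}),
    frozenset({"a", "b", "c", "d"}),
]
bitmaps = build_item_bitmaps(transactions)
print(support_count({"a", "c"}, bitmaps, len(transactions)))
```

Each support query costs one AND per item plus a bit count, instead of a scan over the transactions.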

Relevance:

20.00%

Abstract:

Learning a robust projection with a small number of training samples is still a challenging problem in face recognition, especially when the unseen faces have extreme variation in pose, illumination, and facial expression. To address this problem, we propose a framework formulated under statistical learning theory that facilitates robust learning of a discriminative projection. Dimensionality reduction using the projection matrix is combined with a linear classifier in the regularized framework of lasso regression. The projection matrix and the classifier parameters are then found jointly by solving an optimization problem over the Stiefel manifold. The experimental results on standard face databases suggest that the proposed method outperforms some recent regularized techniques when the number of training samples is small.

Relevance:

20.00%

Abstract:

Recognising daily activity patterns of people from low-level sensory data is an important problem. Traditional approaches typically rely on generative models such as the hidden Markov model and on training with fully labelled data. While activity data can be readily acquired from pervasive sensors, e.g. in smart environments, providing manual labels to support fully supervised learning is often expensive. In this paper, we propose a new approach based on partially-supervised training of discriminative sequence models such as the conditional random field (CRF) and the maximum entropy Markov model (MEMM). We show that the approach can reduce labelling effort while retaining the flexibility and accuracy of the discriminative framework. Our experimental results in the video surveillance domain illustrate that these models can perform better than their generative counterpart (i.e. the partially hidden Markov model), even when a substantial number of labels are unavailable.

Relevance:

20.00%

Abstract:

Learning and understanding the typical patterns in the daily activities and routines of people from low-level sensory data is an important problem in many application domains, such as building smart environments or providing intelligent assistance. Traditional approaches to this problem typically rely on supervised learning and generative models such as the hidden Markov model and its extensions. While activity data can be readily acquired from pervasive sensors, e.g. in smart environments, providing manual labels to support supervised training is often extremely expensive. In this paper, we propose a new approach based on semi-supervised training of partially hidden discriminative models such as the conditional random field (CRF) and the maximum entropy Markov model (MEMM). We show that these models allow us to incorporate both labeled and unlabeled data for learning and, at the same time, provide us with the flexibility and accuracy of the discriminative framework. Our experimental results in the video surveillance domain illustrate that these models can perform better than their generative counterpart, the partially hidden Markov model, even when a substantial number of labels are unavailable.

Relevance:

20.00%

Abstract:

The k-means algorithm is a partitional clustering method. Over 60 years old, it has been successfully used for a variety of problems. The popularity of k-means is in large part a consequence of its simplicity and efficiency. In this paper we are inspired by these appealing properties of k-means in the development of a clustering algorithm which accepts the notion of "positively" and "negatively" labelled data. The goal is to discover the cluster structure of both positive and negative data in a manner which allows for the discrimination between the two sets. The usefulness of this idea is demonstrated practically on the problem of face recognition, where the task of learning the scope of a person's appearance should be done in a manner which allows this face to be differentiated from others.

Relevance:

20.00%

Abstract:

We demonstrate a new metal-oxide-based chemiresistor (MOC) that exhibits fast response/recovery behavior, large sensitivity, and good selectivity to ethanol, enabled by Sr-doped SnO2 nanofibers prepared via simple electrospinning followed by calcination. Transmission electron microscopy (TEM), scanning electron microscopy (SEM), X-ray diffraction (XRD), and X-ray photoelectron spectroscopy (XPS) were used to characterize their morphology, structure, and composition. The ethanol sensing performance of the Sr-doped SnO2 nanofibers was investigated. Compared with pristine SnO2 nanofibers, enhanced ethanol sensing performance (more rapid response/recovery behavior and larger response values) was achieved owing to the basic SnO2 surface caused by Sr-doping, whereas the acetone sensing performance was weakened. Thus, good discrimination of ethanol from acetone has been realized. Additionally, Sr-doped SnO2 nanofibers also exhibit good selectivity.

Relevance:

20.00%

Abstract:

In big data analysis, frequent itemset mining plays a key role in mining associations, correlations and causality. Some traditional frequent itemset mining algorithms are unable to handle datasets made up of massive numbers of small files effectively, suffering from high memory cost, high I/O overhead, and low computing performance. In this paper, we therefore propose a novel parallel frequent itemset mining algorithm based on the FP-Growth algorithm and discuss its applications. First, we introduce a small-file processing strategy for such datasets to compensate for the low read-write speed and low processing efficiency of Hadoop. Moreover, we use MapReduce to redesign the FP-Growth algorithm for parallel computing, thereby improving the overall performance of frequent itemset mining. Finally, we apply the proposed algorithm to the association analysis of data from the national college entrance examination and admission of China. The experimental results show that the proposed algorithm is feasible and valid, achieves a good speedup and a higher mining efficiency, and can meet the actual requirements of frequent itemset mining for datasets of massive small files. © 2014 ISSN 2185-2766.
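To make the MapReduce idea concrete, here is a minimal single-process sketch of the item-counting pass that parallel FP-Growth variants commonly start with: each partition is counted independently (map) and the partial counts are merged (reduce). It is not the paper's algorithm, it does not build an FP-tree, and it does not use Hadoop; the function names, the partitioning and the sample data are all assumptions for illustration.

```python
from collections import Counter
from functools import reduce
from typing import Dict, List

Transaction = List[str]


def map_count(partition: List[Transaction]) -> Counter:
    """Map step: count item occurrences within one partition of transactions."""
    counts: Counter = Counter()
    for transaction in partition:
        counts.update(set(transaction))  # count each item once per transaction
    return counts


def reduce_counts(left: Counter, right: Counter) -> Counter:
    """Reduce step: merge the partial counts of two partitions."""
    return left + right


def frequent_items(partitions: List[List[Transaction]], min_support: int) -> Dict[str, int]:
    """Return the items whose global support count reaches min_support."""
    total = reduce(reduce_counts, map(map_count, partitions), Counter())
    return {item: n for item, n in total.items() if n >= min_support}


# Hypothetical data split into two partitions, for illustration only.
partitions = [
    [["a", "b"], ["a", "c"]],
    [["a", "b", "c"], ["b"]],
]
print(frequent_items(partitions, min_support=3))
```

Because the map step touches only its own partition, the partitions could be processed on separate workers; only the small count tables need to be combined.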

Relevance:

20.00%

Abstract:

Discriminative training of Gaussian Mixture Models (GMMs) for speech or speaker recognition purposes is usually based on the gradient descent method, in which the iteration step-size, ε, is typically set experimentally. In this letter, we derive an equation to adaptively determine ε, by showing that the second-order Newton-Raphson iterative method to find roots of equations is equivalent to the gradient descent algorithm. © 2010 IEEE.
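The link between the two methods can be illustrated in the scalar case with a textbook argument, which is not necessarily the derivation used in the letter: applying Newton-Raphson to find a root of the gradient gives the same update as gradient descent with ε equal to the reciprocal of the second derivative. The objective and all names below are hypothetical and chosen only for the example.

```python
from typing import Callable


def gradient_step(theta: float, grad: Callable[[float], float], epsilon: float) -> float:
    """One gradient descent update with step size epsilon."""
    return theta - epsilon * grad(theta)


def newton_root_step(theta: float, grad: Callable[[float], float],
                     hess: Callable[[float], float]) -> float:
    """One Newton-Raphson update for solving grad(theta) = 0."""
    return theta - grad(theta) / hess(theta)


# Hypothetical objective f(theta) = (theta - 3)**2 + 1, not a GMM criterion.
def grad(theta: float) -> float:
    return 2.0 * (theta - 3.0)


def hess(theta: float) -> float:
    return 2.0


theta0 = 10.0
epsilon = 1.0 / hess(theta0)  # adaptive step size taken from the curvature
print(gradient_step(theta0, grad, epsilon), newton_root_step(theta0, grad, hess))
```

For this quadratic both updates reach the minimizer in a single step; for a general objective the curvature, and hence ε, changes from iteration to iteration.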

Relevance:

20.00%

Abstract:

This paper introduces a new tool for pattern recognition. Called the Discriminative Paraconsistent Machine (DPM), it is based on supervised discriminative model training that incorporates paraconsistency criteria and allows an intelligent treatment of contradictions and uncertainties. The tests and discussions presented here demonstrate the efficacy and usefulness of DPMs, which can be applied to solve problems in many fields of science. The major difficulties and challenges that were overcome consisted basically in establishing the proper model with which to represent the concept of paraconsistency.

Relevance:

20.00%

Abstract:

Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq)