287 results for Discriminative Itemsets
Abstract:
PURPOSE. This study aimed to assess the discriminative validity of the Brazilian version of the Patient Health Questionnaire (PHQ-9) and of its reduced version (PHQ-2). DESIGN AND METHODS. The sample consisted of 177 women (60 cases of depression and 117 noncases). The SCID-IV was used as the gold standard. FINDINGS. For the PHQ-9, a cutoff score equal to or higher than 10 proved to be the most adequate for the screening of depression, whereas the best cutoff score for the PHQ-2 was found to lie between 3 and 4. PRACTICE IMPLICATIONS. The systematic use of these instruments in nursing and in the context of primary health care could favor the early detection of depression.
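As a rough illustration of the screening logic described above (a PHQ-9 total of 10 or more counts as screen-positive, checked against a gold-standard diagnosis), here is a minimal sketch; the scores and diagnoses are invented and this is not the study's analysis code.

```python
# Minimal sketch: apply a PHQ-9 cutoff and compute sensitivity/specificity
# against a gold-standard diagnosis. All data below are hypothetical.

def screen(phq9_scores, cutoff=10):
    """Flag participants whose PHQ-9 total is >= cutoff as screen-positive."""
    return [score >= cutoff for score in phq9_scores]

def sensitivity_specificity(screen_positive, gold_standard):
    """gold_standard[i] is True when the SCID-IV confirmed depression."""
    tp = sum(p and g for p, g in zip(screen_positive, gold_standard))
    fn = sum((not p) and g for p, g in zip(screen_positive, gold_standard))
    tn = sum((not p) and (not g) for p, g in zip(screen_positive, gold_standard))
    fp = sum(p and (not g) for p, g in zip(screen_positive, gold_standard))
    return tp / (tp + fn), tn / (tn + fp)

scores = [4, 12, 9, 15, 10, 2]                        # hypothetical PHQ-9 totals
diagnosis = [False, True, False, True, True, False]   # hypothetical SCID-IV results
se, sp = sensitivity_specificity(screen(scores), diagnosis)
print(f"sensitivity={se:.2f}, specificity={sp:.2f}")
```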
Abstract:
OBJECTIVE: To evaluate the discriminative and diagnostic value of neuropsychological tests for identifying schizophrenia patients. METHODS: A cross-sectional study with 36 male schizophrenia outpatients and 72 healthy matched volunteers was carried out. Participants underwent the following neuropsychological tests: Wisconsin Card Sorting Test, Verbal Fluency, Stroop Test, Mini Mental State Examination, and Spatial Recognition Span. The diagnostic value of the tests was estimated through sensitivity and specificity, with cutoffs obtained from Receiver Operating Characteristic (ROC) curves. The latent class model (diagnosis of schizophrenia) was used as the gold standard. RESULTS: Although patients presented lower scores on most tests, the highest canonical function in the discriminant analysis was 0.57 (Verbal Fluency M). The best sensitivity and specificity were obtained with the Verbal Fluency M test (75 and 65, respectively). CONCLUSIONS: The neuropsychological tests showed moderate diagnostic value for the identification of schizophrenia patients. These findings suggested that the cognitive impairment measured by these tests might not be homogeneous among schizophrenia patients.
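The ROC-based cutoff selection mentioned in the methods can be sketched as follows; the test scores here are synthetic, scikit-learn is an assumed dependency, and Youden's J is used as one common cutoff criterion (the paper may have used another).

```python
# Sketch of ROC-based cutoff selection on synthetic neuropsychological scores.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(0)
# Hypothetical Verbal Fluency scores: patients tend to score lower than controls.
scores_patients = rng.normal(30, 8, 36)
scores_controls = rng.normal(40, 8, 72)

y_true = np.r_[np.ones(36), np.zeros(72)]            # 1 = schizophrenia (latent class)
y_score = -np.r_[scores_patients, scores_controls]   # negate: lower score -> more "positive"

fpr, tpr, thresholds = roc_curve(y_true, y_score)
j = tpr - fpr                        # Youden's J = sensitivity + specificity - 1
best = np.argmax(j)
print("AUC:", roc_auc_score(y_true, y_score))
print("best cutoff (negated score):", thresholds[best],
      "sensitivity:", tpr[best], "specificity:", 1 - fpr[best])
```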
Abstract:
Speaker Recognition, Speaker Verification, Sparse Kernel Logistic Regression, Support Vector Machine
Abstract:
We investigate whether dimensionality reduction using a latent generative model is beneficial for the task of weakly supervised scene classification. In detail, we are given a set of labeled images of scenes (for example, coast, forest, city, river, etc.), and our objective is to classify a new image into one of these categories. Our approach consists of first discovering latent "topics" using probabilistic Latent Semantic Analysis (pLSA), a generative model from the statistical text literature here applied to a bag of visual words representation for each image, and subsequently training a multiway classifier on the topic distribution vector for each image. We compare this approach to that of representing each image by a bag of visual words vector directly and training a multiway classifier on these vectors. To this end, we introduce a novel vocabulary using dense color SIFT descriptors and then investigate the classification performance under changes in the size of the visual vocabulary, the number of latent topics learned, and the type of discriminative classifier used (k-nearest neighbor or SVM). We achieve superior classification performance to recent publications that have used a bag of visual words representation, in all cases, using the authors' own data sets and testing protocols. We also investigate the gain in adding spatial information. We show applications to image retrieval with relevance feedback and to scene classification in videos.
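The pipeline described above (bag-of-visual-words histograms, a topic model, then a discriminative classifier on the topic vectors) can be sketched roughly as below. pLSA itself is not available in scikit-learn, so Latent Dirichlet Allocation is used here as a stand-in, and the data are random placeholders rather than real SIFT visual words.

```python
# Illustrative topic-model + discriminative-classifier pipeline (not the paper's code).
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_images, vocab_size, n_topics, n_classes = 400, 300, 25, 4
X_bow = rng.poisson(1.0, size=(n_images, vocab_size))    # visual-word counts per image
y = rng.integers(0, n_classes, size=n_images)            # scene labels

X_train, X_test, y_train, y_test = train_test_split(X_bow, y, random_state=0)

topics = LatentDirichletAllocation(n_components=n_topics, random_state=0)
Z_train = topics.fit_transform(X_train)   # per-image topic distributions
Z_test = topics.transform(X_test)

clf = SVC(kernel="rbf").fit(Z_train, y_train)
print("accuracy on topic vectors:", clf.score(Z_test, y_test))

# Baseline from the abstract: classify the raw bag-of-words vectors directly.
baseline = SVC(kernel="rbf").fit(X_train, y_train)
print("accuracy on raw BoW vectors:", baseline.score(X_test, y_test))
```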
Abstract:
The problem of the relevance and usefulness of extracted association rules is of primary importance because, in the majority of cases, real-life databases lead to several thousand association rules with high confidence, among which are many redundancies. Using the closure of the Galois connection, we define two new bases for association rules whose union is a generating set for all valid association rules, together with their support and confidence. These bases are characterized using frequent closed itemsets and their generators; they consist of the non-redundant exact and approximate association rules having minimal antecedents and maximal consequents, i.e., the most relevant association rules. Algorithms for extracting these bases are presented, and the results of experiments carried out on real-life databases show that the proposed bases are useful and that their generation is not time consuming.
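The closed-itemset machinery the abstract relies on can be shown on a toy example: the Galois closure of an itemset is the intersection of the transactions containing it, a closed itemset equals its own closure, and a generator is a minimal itemset with a given closure. The brute-force sketch below (hand-made transactions, not the paper's algorithms) prints exact rules of the form generator => (closure \ generator), which have confidence 1 and minimal antecedents as in the described bases.

```python
# Toy brute-force illustration of closed itemsets, generators, and exact rules.
from itertools import combinations

transactions = [frozenset("ABC"), frozenset("ABC"), frozenset("ABD"),
                frozenset("B"), frozenset("BCD")]
items = sorted(set().union(*transactions))
min_support = 2

def support(itemset):
    return sum(itemset <= t for t in transactions)

def closure(itemset):
    """Intersection of all transactions containing the itemset (Galois closure)."""
    covering = [t for t in transactions if itemset <= t]
    return frozenset.intersection(*covering) if covering else frozenset(items)

# Enumerate frequent itemsets by brute force (fine for a toy dataset).
frequent = [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r) if support(frozenset(c)) >= min_support]

closed = {closure(f) for f in frequent}
for c in sorted(closed, key=sorted):
    gens = [f for f in frequent if closure(f) == c]
    minimal = [g for g in gens if not any(h < g for h in gens)]  # generators of c
    for g in minimal:
        if g != c:  # exact rule: generator => closure \ generator (confidence 1)
            print(set(g), "=>", set(c - g), "support", support(c))
```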
Abstract:
We present a method to enhance fault localization for software systems based on a frequent pattern mining algorithm. Our method relies on a large set of test cases for a given set of programs in which faults can be detected. The test executions are recorded as function call trees. Based on test oracles, the tests can be classified into successful and failing tests. A frequent pattern mining algorithm is used to identify frequent subtrees in successful and failing test executions. This information is used to rank functions according to their likelihood of containing a fault. The ranking suggests an order in which to examine the functions during fault analysis. We validate our approach experimentally using a subset of the Siemens benchmark programs.
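A heavily simplified stand-in for this ranking idea is sketched below: instead of mining frequent subtrees of call trees, it only looks at which functions appear in failing versus successful runs and ranks them with a Tarantula-style suspiciousness score. All data are invented for illustration.

```python
# Rank functions by how strongly their presence correlates with failing tests
# (a simplification of the call-tree pattern mining described in the abstract).

failing_runs = [{"parse", "eval", "emit"}, {"parse", "eval"}]
passing_runs = [{"parse", "emit"}, {"parse", "lex"}, {"parse", "emit", "lex"}]

def suspiciousness(fn):
    fail_ratio = sum(fn in run for run in failing_runs) / len(failing_runs)
    pass_ratio = sum(fn in run for run in passing_runs) / len(passing_runs)
    if fail_ratio + pass_ratio == 0:
        return 0.0
    return fail_ratio / (fail_ratio + pass_ratio)

functions = set().union(*failing_runs, *passing_runs)
ranking = sorted(functions, key=suspiciousness, reverse=True)
for fn in ranking:
    print(f"{fn}: {suspiciousness(fn):.2f}")   # examine top-ranked functions first
```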
Abstract:
Frequent pattern discovery in structured data is receiving increasing attention in many application areas of science. However, the computational complexity and the large amount of data to be explored often make sequential algorithms unsuitable. In this context, high-performance distributed computing becomes a very interesting and promising approach. In this paper we present a parallel formulation of the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The application is characterized by a highly irregular tree-structured computation. No estimate is available for task workloads, which follow a power-law distribution over a wide range. The proposed approach allows dynamic resource aggregation and provides fault and latency tolerance. These features make the distributed application suitable for multi-domain heterogeneous environments, such as computational Grids. The distributed application has been evaluated on the well-known National Cancer Institute's HIV-screening dataset.
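The scheduling problem the abstract describes (irregular, power-law task costs with no workload estimates) is what dynamic task distribution addresses. The local sketch below is not the paper's Grid framework; each "task" merely stands in for expanding one branch of the subgraph-mining search tree, and a worker pool hands out branches one at a time so that fast workers immediately pick up new work.

```python
# Dynamic load balancing for irregular, heavy-tailed task costs (illustration only).
import random
import time
from multiprocessing import Pool

def mine_branch(seed):
    """Placeholder for expanding one search-tree branch; cost is irregular."""
    random.seed(seed)
    cost = random.paretovariate(1.5) * 0.01   # heavy-tailed, power-law-like cost
    time.sleep(min(cost, 0.2))                # simulate the work
    return seed, round(cost, 4)

if __name__ == "__main__":
    tasks = range(32)
    with Pool(processes=4) as pool:
        # chunksize=1: workers fetch a new branch as soon as they finish the
        # previous one, which is the point of dynamic scheduling here.
        for seed, cost in pool.imap_unordered(mine_branch, tasks, chunksize=1):
            print(f"branch {seed:2d} finished (simulated cost {cost})")
```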
Abstract:
Objective: The aim of this study was to verify the discriminative power of the most widely used pain assessment instruments. Methods: The sample consisted of 279 subjects divided into a Fibromyalgia Group (FM; 205 patients with fibromyalgia) and a Control Group (CG; 74 healthy subjects), with a mean age of 49.29 +/- 10.76 years. Only 9 subjects were male, 6 in the FM group and 3 in the CG. FM patients were outpatients from the Rheumatology Clinic of the University of Sao Paulo - Hospital das Clinicas (HCFMUSP); the CG included people accompanying patients and hospital staff with similar socio-demographic characteristics. Three instruments were used to assess pain: the McGill Pain Questionnaire (MPQ), the Visual Analog Scale (VAS), and dolorimetry, which measures the pain threshold at tender points (generating the TP index). To assess the discriminative power of the instruments, the measurements obtained were submitted to descriptive analysis and to inferential analysis using ROC curves - sensitivity (Se), specificity (Sp), and area under the curve (AUC) - and contingency tables with the chi-square test and odds ratios. The significance level was 0.05. Results: The highest sensitivity, specificity, and area under the curve were obtained with the VAS (80%, 80%, and 0.864, respectively), followed by dolorimetry (Se 77%, Sp 77%, AUC 0.851), the McGill Sensory scale (Se 72%, Sp 67%, AUC 0.765), and the McGill Affective scale (Se 69%, Sp 67%, AUC 0.753). Conclusions: The VAS presented the highest sensitivity, specificity, and AUC, showing the greatest discriminative power among the instruments. However, these values are very similar to those of dolorimetry.
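The contingency-table part of the analysis can be illustrated as below: a 2x2 table of screen-positive/negative (e.g., VAS above some cutoff) versus FM/CG group, with a chi-square test and an odds ratio. The counts are illustrative only (they merely sum to the reported group sizes) and scipy is an assumed dependency.

```python
# Chi-square test and odds ratio on an illustrative 2x2 screening table.
from scipy.stats import chi2_contingency

#                  FM   CG
table = [[164,  15],    # screen-positive
         [ 41,  59]]    # screen-negative

chi2, p, dof, expected = chi2_contingency(table)
odds_ratio = (table[0][0] * table[1][1]) / (table[0][1] * table[1][0])
print(f"chi2={chi2:.2f}, p={p:.4g}, odds ratio={odds_ratio:.2f}")
```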
Abstract:
Discriminative training of Gaussian Mixture Models (GMMs) for speech or speaker recognition is usually based on the gradient descent method, in which the iteration step size, ε, is typically set experimentally. In this letter, we derive an equation to determine ε adaptively by showing that the second-order Newton-Raphson iterative method for finding roots of equations is equivalent to the gradient descent algorithm. © 2010 IEEE.
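The generic relationship the letter builds on can be written out in the scalar case; this is the textbook correspondence between the two updates, not necessarily the exact equation derived for GMM parameters in the paper.

```latex
% Scalar illustration (requires amsmath); the paper's derivation may differ in detail.
\begin{align*}
  \text{gradient descent:}\qquad
    & \theta_{k+1} = \theta_k - \varepsilon\, J'(\theta_k) \\
  \text{Newton--Raphson on } J'(\theta) = 0:\qquad
    & \theta_{k+1} = \theta_k - \frac{J'(\theta_k)}{J''(\theta_k)} \\
  \text{matching the two updates:}\qquad
    & \varepsilon_k = \frac{1}{J''(\theta_k)}
\end{align*}
```

In other words, the step size is recomputed at every iteration from the local curvature of the objective instead of being fixed by hand.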