939 results for Supervised classifier


Relevance:

10.00% 10.00%

Publisher:

Abstract:

Introduction: Internet users are increasingly using the worldwide web to search for information relating to their health. This situation makes it necessary to create specialized tools capable of supporting users in their searches. Objective: To apply and compare strategies that were developed to investigate the use of the Portuguese version of Medical Subject Headings (MeSH) for constructing an automated classifier for Brazilian Portuguese-language web-based content within or outside of the field of healthcare, focusing on the lay public. Methods: 3658 Brazilian web pages were used to train the classifier and 606 Brazilian web pages were used to validate it. The strategies proposed were constructed using content-based vector methods for text classification, such that Naive Bayes was used for the task of classifying vector patterns with characteristics obtained through the proposed strategies. Results: A strategy named InDeCS was developed specifically to adapt MeSH for the problem that was put forward. This approach achieved better accuracy for this pattern classification task (0.94 sensitivity, specificity and area under the ROC curve). Conclusions: Because of the significant results achieved by InDeCS, this tool has been successfully applied to the Brazilian healthcare search portal known as Busca Saude. Furthermore, it could be shown that MeSH presents important results when used for the task of classifying web-based content focusing on the lay public. It was also possible to show from this study that MeSH was able to map out mutable non-deterministic characteristics of the web. (c) 2010 Elsevier Inc. All rights reserved.
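The classification step described above can be illustrated with a minimal multinomial Naive Bayes sketch. It assumes pages have already been reduced to lists of tokens; the toy documents, the `health`/`other` labels, and the function names are hypothetical, and the sketch does not reproduce the InDeCS strategy or its MeSH-based features.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Count documents per class and token occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, tokens):
    """Return the class with the highest log posterior (Laplace smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore tokens never seen in training
            count = word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up toy data (hypothetical, not taken from the study).
docs = [
    ["fever", "treatment", "doctor"],
    ["vaccine", "symptom", "treatment"],
    ["football", "match", "score"],
    ["movie", "ticket", "score"],
]
labels = ["health", "health", "other", "other"]
model = train_naive_bayes(docs, labels)
```

Calling `predict(model, ["fever", "doctor"])` on this toy model selects the `health` class, because both tokens occur only in the health training documents.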

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Age-related changes in running kinematics have been reported in the literature using classical inferential statistics. However, this approach has been hampered by the increased number of biomechanical gait variables reported and subsequently the lack of differences presented in these studies. Data mining techniques have been applied in recent biomedical studies to solve this problem using a more general approach. In the present work, we re-analyzed lower extremity running kinematic data of 17 young and 17 elderly male runners using the Support Vector Machine (SVM) classification approach. In total, 31 kinematic variables were extracted to train the classification algorithm and test the generalized performance. The results revealed different accuracy rates across three different kernel methods adopted in the classifier, with the linear kernel performing the best. A subsequent forward feature selection algorithm demonstrated that with only six features, the linear kernel SVM achieved a 100% classification performance rate, showing that these features provided powerful combined information to distinguish age groups. The results of the present work demonstrate potential in applying this approach to improve knowledge about the age-related differences in running gait biomechanics and encourage the use of the SVM in other clinical contexts. (C) 2010 Elsevier Ltd. All rights reserved.
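The forward feature selection mentioned above can be sketched as a greedy loop that adds one feature at a time. To keep the example dependency-free, a nearest-centroid classifier scored by leave-one-out accuracy stands in for the linear-kernel SVM; this substitution, the function names, and the toy data are assumptions made for illustration only.

```python
def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier on the chosen features."""
    correct = 0
    for i in range(len(X)):
        # Per-class feature sums and counts, computed without sample i.
        sums, counts = {}, {}
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(features))
            for k, f in enumerate(features):
                acc[k] += row[f]
            counts[label] = counts.get(label, 0) + 1

        def dist(label):
            # Squared distance from sample i to the centroid of `label`.
            return sum((X[i][f] - sums[label][k] / counts[label]) ** 2
                       for k, f in enumerate(features))

        if min(sums, key=dist) == y[i]:
            correct += 1
    return correct / len(X)


def forward_selection(X, y, max_features):
    """Greedily add the feature that most improves leave-one-out accuracy."""
    selected, best_score = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        scored = [(loo_accuracy(X, y, selected + [f]), f) for f in remaining]
        score, feature = max(scored, key=lambda s: s[0])
        if score <= best_score:
            break  # no remaining feature improves the score
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


# Made-up data: feature 0 is uninformative, feature 1 separates the groups.
X = [[5.0, 1.0], [1.0, 1.2], [4.0, 0.9], [2.0, 1.1],
     [5.0, 3.0], [1.0, 3.2], [4.0, 2.9], [2.0, 3.1]]
y = ["young"] * 4 + ["elderly"] * 4
selected, score = forward_selection(X, y, max_features=2)
```

On this toy data the loop keeps only feature 1 and stops, since adding feature 0 cannot raise the accuracy any further.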

Relevance:

10.00% 10.00%

Publisher:

Abstract:

A hybrid system to automatically detect, locate and classify disturbances affecting power quality in an electrical power system is presented in this paper. The disturbances characterized are events from an actual power distribution system simulated by the ATP (Alternative Transients Program) software. The hybrid approach introduced consists of two stages. In the first stage, the wavelet transform (WT) is used to detect disturbances in the system and to locate the time of their occurrence. When such an event is flagged, the second stage is triggered and various artificial neural networks (ANNs) are applied to classify the data measured during the disturbance(s). A computational logic using WTs and ANNs together with a graphical user interface (GUI) between the algorithm and its end user is then implemented. The results obtained so far are promising and suggest that this approach could lead to a useful application in an actual distribution system. (C) 2009 Elsevier Ltd. All rights reserved.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In this paper, a framework for detection of human skin in digital images is proposed. This framework is composed of a training phase and a detection phase. A skin class model is learned during the training phase by processing several training images in a hybrid and incremental fuzzy learning scheme. This scheme combines unsupervised and supervised learning: unsupervised, by fuzzy clustering, to obtain clusters of color groups from training images; and supervised, to select groups that represent skin color. At the end of the training phase, aggregation operators are used to provide combinations of selected groups into a skin model. In the detection phase, the learned skin model is used to detect human skin in an efficient way. Experimental results show robust and accurate human skin detection performed by the proposed framework.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This paper presents the development of a prototype of a tubular linear induction motor applied to onshore oil exploitation, named MAT AE OS (which is the Portuguese acronym for Tubular Asynchronous Motor for Onshore Oil Exploitation). The function of this motor is to directly drive the sucker-rod pump installed in the down hole of the oil well. Considering the drawbacks and operational costs of the conventional oil extraction method, which is based on the walking beam and rod string system, the developed prototype is intended to become a feasible alternative from both technical and economic points of view. At the present time, the MAT AE OS prototype is installed in a test bench at the Applied Electromagnetism Laboratory at the Escola Politecnica da Universidade de Sao Paulo. The complete testing system is controlled and supervised by special software, enabling good flexibility in operation, data acquisition, and performance analysis. The test results indicate that the motor develops a constant lift force along the pumping cycle, as shown by the measured dynamometric charts. Also, the evaluated electromechanical performance seems to be superior to that obtained by the traditional method. The system utilizing the MAT AE OS prototype allows the complete elimination of the rod string sets required by the conventional equipment, indicating that the new system may advantageously replace the surface mechanical components presently utilized.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

We propose a robust and low complexity scheme to estimate and track carrier frequency from signals traveling under low signal-to-noise ratio (SNR) conditions in highly nonstationary channels. These scenarios arise in planetary exploration missions subject to high dynamics, such as the Mars exploration rover missions. The method comprises a bank of adaptive linear predictors (ALP) supervised by a convex combiner that dynamically aggregates the individual predictors. The adaptive combination is able to outperform the best individual estimator in the set, which leads to a universal scheme for frequency estimation and tracking. A simple technique for bias compensation considerably improves the ALP performance. It is also shown that retrieval of frequency content by a fast Fourier transform (FFT)-search method, instead of only inspecting the angle of a particular root of the error predictor filter, enhances performance, particularly at very low SNR levels. Simple techniques that enforce frequency continuity improve further the overall performance. In summary we illustrate by extensive simulations that adaptive linear prediction methods render a robust and competitive frequency tracking technique.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

As is well known, Hessian-based adaptive filters (such as the recursive-least squares algorithm (RLS) for supervised adaptive filtering, or the Shalvi-Weinstein algorithm (SWA) for blind equalization) converge much faster than gradient-based algorithms [such as the least-mean-squares algorithm (LMS) or the constant-modulus algorithm (CMA)]. However, when the problem is tracking a time-variant filter, the issue is not so clear-cut: there are environments for which each family presents better performance. Given this, we propose the use of a convex combination of algorithms of different families to obtain an algorithm with superior tracking capability. We show the potential of this combination and provide a unified theoretical model for the steady-state excess mean-square error for convex combinations of gradient- and Hessian-based algorithms, assuming a random-walk model for the parameter variations. The proposed model is valid for algorithms of the same or different families, and for supervised (LMS and RLS) or blind (CMA and SWA) algorithms.
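The idea of a convex combination of two adaptive filters can be sketched as follows. The abstract does not say how the mixing weight is adapted, so this sketch assumes a common choice: the weight is a sigmoid of an auxiliary parameter `a`, updated by a stochastic-gradient step on the squared error of the combined output. The step size `mu_a`, the clipping bound `a_max`, and the synthetic signals are illustrative assumptions.

```python
import math


def combine(y1, y2, a):
    """Convex combination of two filter outputs with mixing weight sigmoid(a)."""
    lam = 1.0 / (1.0 + math.exp(-a))
    return lam * y1 + (1.0 - lam) * y2, lam


def update_mixing(a, d, y1, y2, mu_a=1.0, a_max=4.0):
    """One gradient step on `a` that reduces the squared error of the combination."""
    y, lam = combine(y1, y2, a)
    e = d - y  # error of the combined output against the desired signal d
    a += mu_a * e * (y1 - y2) * lam * (1.0 - lam)
    return max(-a_max, min(a_max, a))  # clip so the sigmoid does not saturate


# Synthetic demonstration: component 1 tracks the desired signal exactly,
# component 2 carries a constant bias, so the weight should drift toward 1.
a = 0.0
for n in range(2000):
    d = math.sin(0.1 * n)
    y1 = d          # stand-in for the better-performing component filter
    y2 = d + 1.0    # stand-in for the worse-performing component filter
    a = update_mixing(a, d, y1, y2)
lam_final = combine(0.0, 0.0, a)[1]
```

Because the combined output is a weighted average with a weight in (0, 1), the scheme can lean on whichever component currently performs better without switching abruptly between them.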

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This work presents a method for predicting resource availability in opportunistic grids by means of use pattern analysis (UPA), a technique based on non-supervised learning methods. This prediction method is based on the assumption of the existence of several classes of computational resource use patterns, which can be used to predict the resource availability. Trace-driven simulations validate this basic assumption and also provide the parameter settings for the accurate learning of resource use patterns. Experiments made with an implementation of the UPA method show the feasibility of its use in the scheduling of grid tasks with very little overhead. The experiments also demonstrate the method's superiority over other predictive and non-predictive methods. An adaptive prediction method is suggested to deal with the lack of training data at initialization. Further adaptive behaviour is motivated by experiments which show that, in some special environments, reliable resource use patterns may not always be detected. Copyright (C) 2009 John Wiley & Sons, Ltd.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Host responses following exposure to Mycobacterium tuberculosis (TB) are complex and can significantly affect clinical outcome. These responses, which are largely mediated by complex immune mechanisms involving peripheral blood cells (PBCs) such as T-lymphocytes, NK cells and monocyte-derived macrophages, have not been fully characterized. We hypothesize that different clinical outcomes following TB exposure will be uniquely reflected in host gene expression profiles, and that expression profiling of PBCs can be used to discriminate between different TB infectious outcomes. In this study, microarray analysis was performed on PBCs from three TB groups (BCG-vaccinated, latent TB infection, and active TB infection) and a control healthy group. Supervised learning algorithms were used to identify signature genomic responses that differentiate among group samples. Gene Set Enrichment Analysis was used to determine sets of genes that were co-regulated. Multivariate permutation analysis (p < 0.01) gave 645 genes differentially expressed among the four groups, with both distinct and common patterns of gene expression observed for each group. A 127-probeset, representing 77 known genes, capable of accurately classifying samples into their respective groups was identified. In addition, 13 insulin-sensitive genes were found to be differentially regulated in all three TB infected groups, underscoring the functional association between the insulin signaling pathway and TB infection. Published by Elsevier Ltd.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Recently, we have built a classification model that is capable of assigning a given sesquiterpene lactone (STL) into exactly one tribe of the plant family Asteraceae from which the STL has been isolated. Although many plant species are able to biosynthesize a set of peculiar compounds, the occurrence of the same secondary metabolites in more than one tribe of Asteraceae is frequent. Building on our previous work, in this paper, we explore the possibility of assigning an STL to more than one tribe (class) simultaneously. When an object may belong to more than one class simultaneously, it is called multilabeled. In this work, we present a general overview of the techniques available to examine multilabeled data. The problem of evaluating the performance of a multilabeled classifier is discussed. Two particular multilabeled classification methods, cross-training with support vector machines (ct-SVM) and multilabeled k-nearest neighbors (M-L-kNN), were applied to the classification of the STLs into seven tribes from the plant family Asteraceae. The results are compared to a single-label classification and are analyzed from a chemotaxonomic point of view. The multilabeled approach allowed us to (1) model the reality as closely as possible, (2) improve our understanding of the relationship between the secondary metabolite profiles of different Asteraceae tribes, and (3) significantly decrease the number of plant sources to be considered for finding a certain STL. The presented classification models are useful for the targeted collection of plants with the objective of finding plant sources of natural compounds that are biologically active or possess other specific properties of interest.
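Evaluating a multilabeled classifier differs from the single-label case because a prediction can be partly right. The sketch below shows two widely used measures, Hamming loss and subset accuracy; whether the paper uses these particular measures is not stated in the abstract, and the label names and predictions are made-up placeholders.

```python
def hamming_loss(true_sets, pred_sets, all_labels):
    """Fraction of (sample, label) pairs where prediction and truth disagree."""
    # The symmetric difference counts labels that are missed or wrongly added.
    errors = sum(len(t ^ p) for t, p in zip(true_sets, pred_sets))
    return errors / (len(true_sets) * len(all_labels))


def subset_accuracy(true_sets, pred_sets):
    """Fraction of samples whose predicted label set matches the truth exactly."""
    matches = sum(t == p for t, p in zip(true_sets, pred_sets))
    return matches / len(true_sets)


# Hypothetical label sets for four compounds (placeholder tribe names).
all_labels = {"tribe_a", "tribe_b", "tribe_c"}
true_sets = [{"tribe_a", "tribe_b"}, {"tribe_c"}, {"tribe_a"}, {"tribe_b", "tribe_c"}]
pred_sets = [{"tribe_a"}, {"tribe_c"}, {"tribe_a", "tribe_b"}, {"tribe_b", "tribe_c"}]

loss = hamming_loss(true_sets, pred_sets, all_labels)
exact = subset_accuracy(true_sets, pred_sets)
```

Subset accuracy treats a partly correct label set as a full error, while Hamming loss gives credit for each label that is right, so the two are usually reported together.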

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The supervised pattern recognition methods K-Nearest Neighbors (KNN), stepwise discriminant analysis (SDA), and soft independent modelling of class analogy (SIMCA) were employed in this work to investigate the relationship between the molecular structure of 27 cannabinoid compounds and their analgesic activity. Previous analyses using two unsupervised pattern recognition methods (principal component analysis, PCA, and hierarchical cluster analysis, HCA) were performed, and five descriptors were selected as the most relevant to the analgesic activity of the compounds studied: R (3) (charge density on substituent at position C(3)), Q (1) (charge on atom C(1)), A (surface area), log P (logarithm of the partition coefficient) and MR (molecular refractivity). The supervised pattern recognition methods (SDA, KNN, and SIMCA) were employed in order to construct a reliable model that can predict the analgesic activity of new cannabinoid compounds and to validate our previous study. The results obtained using the SDA, KNN, and SIMCA methods agree perfectly with our previous model. Comparing the SDA, KNN, and SIMCA results with the PCA and HCA ones, we found that all multivariate statistical methods classified the cannabinoid compounds studied into three groups in exactly the same way: active, moderately active, and inactive.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This paper examines the article system in interlanguage grammar focusing on Japanese learners of English, whose native language lacks articles. It will be demonstrated that for the acquisition of the English article system, count/mass distinctions and definiteness are the crucial factors. Although Japanese does not employ the article system to encode these aspects, it will be argued that they are nevertheless syntactically encoded through its classifier system. Hence, the problem for these learners must be to map these features onto the appropriate surface forms as the Missing Surface Inflection Hypothesis predicts (Prévost & White 2000). This suggestion will further be supported empirically by a fill-in-the-article task. It will be concluded that these Japanese learners understand the English article system fairly well, possibly due to their native language, yet have problems with realizing the relevant features (i.e. count/mass distinctions and definiteness) in the target language.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

There are many techniques for electricity market price forecasting. However, most of them are designed for expected price analysis rather than price spike forecasting. An effective method of predicting the occurrence of spikes has not yet been reported in the literature. In this paper, a data mining based approach is presented to give a reliable forecast of the occurrence of price spikes. Combined with the spike value prediction techniques developed by the same authors, the proposed approach aims at providing a comprehensive tool for price spike forecasting. Feature selection techniques are first described to identify the attributes relevant to the occurrence of spikes. A simple introduction to the classification techniques is given for completeness. Two algorithms, support vector machine and probability classifier, are chosen to be the spike occurrence predictors and are discussed in detail. Realistic market data are used to test the proposed model with promising results.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Introduction: This paper reviews studies of physical activity interventions in health care settings to determine effects on physical activity and/or fitness and characteristics of successful interventions. Methods: Studies testing interventions to promote physical activity in health care settings for primary prevention (patients without disease) and secondary prevention (patients with cardiovascular disease [CVD]) were identified by computerized search methods and reference lists of reviews and articles. Inclusion criteria included assignment to intervention and control groups, physical activity or cardiorespiratory fitness outcome measures, and, for the secondary prevention studies, measurement 12 or more months after randomization. The number of studies with statistically significant effects was determined overall as well as for studies testing interventions with various characteristics. Results: Twelve studies of primary prevention were identified, seven of which were randomized. Three of four randomized studies with short-term measurement (4 weeks to 3 months after randomization), and two of five randomized studies with long-term measurement (6 months after randomization) achieved significant effects on physical activity. Twenty-four randomized studies of CVD secondary prevention were identified; 13 achieved significant effects on activity and/or fitness at twelve or more months. Studies with measurement at two time points showed decaying effects over time, particularly if the intervention was discontinued. Successful interventions contained multiple contacts, behavioral approaches, supervised exercise, provision of equipment, and/or continuing intervention. Many studies had methodologic problems such as low follow-up rates. Conclusion: Interventions in health care settings can increase physical activity for both primary and secondary prevention. Long-term effects are more likely with continuing intervention and multiple intervention components such as supervised exercise, provision of equipment, and behavioral approaches. Recommendations for additional research are given.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

SETTING: Hlabisa Tuberculosis Programme, Hlabisa, South Africa. OBJECTIVE: To determine trends in and risk factors for interruption of tuberculosis treatment. METHODS: Data were extracted from the control programme database starting in 1991. Temporal trends in treatment interruption are described; independent risk factors for treatment interruption were determined with a multiple logistic regression model, and Kaplan-Meier survival curves for treatment interruption were constructed for patients treated in 1994-1995. RESULTS: Overall 629 of 3610 surviving patients (17%) failed to complete treatment; this proportion increased from 11% (n = 79) in 1991/1992 to 22% (n = 201) in 1996. Independent risk factors for treatment interruption were diagnosis between 1994-1996 compared with 1991-1993 (odds ratio [OR] 1.9, 95% confidence interval [CI] 1.6-2.4); human immunodeficiency virus (HIV) positivity compared with HIV negativity (OR 1.8, 95% CI 1.4-2.4); supervised by village clinic compared with community health worker (OR 1.9, 95% CI 1.4-2.6); and male versus female sex (OR 1.3, 95% CI 1.1-1.6). Few patients interrupted treatment during the first 2 weeks, and the treatment interruption rate thereafter was constant at 1% per 14 days. CONCLUSIONS: Frequency of treatment interruption from this programme has increased recently. The strongest risk factor was year of diagnosis, perhaps reflecting the impact of an increased caseload on programme performance. Ensuring adherence to therapy in communities with a high level of migration remains a challenge even within community-based directly observed therapy programmes.