855 resultados para optimal feature selection


Relevância:

80.00% 80.00%

Publicador:

Resumo:

Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES)

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Background: A current challenge in gene annotation is to define the gene function in the context of the network of relationships instead of using single genes. The inference of gene networks (GNs) has emerged as an approach to better understand the biology of the system and to study how several components of this network interact with each other and keep their functions stable. However, in general there is no sufficient data to accurately recover the GNs from their expression levels leading to the curse of dimensionality, in which the number of variables is higher than samples. One way to mitigate this problem is to integrate biological data instead of using only the expression profiles in the inference process. Nowadays, the use of several biological information in inference methods had a significant increase in order to better recover the connections between genes and reduce the false positives. What makes this strategy so interesting is the possibility of confirming the known connections through the included biological data, and the possibility of discovering new relationships between genes when observed the expression data. Although several works in data integration have increased the performance of the network inference methods, the real contribution of adding each type of biological information in the obtained improvement is not clear. Methods: We propose a methodology to include biological information into an inference algorithm in order to assess its prediction gain by using biological information and expression profile together. We also evaluated and compared the gain of adding four types of biological information: (a) protein-protein interaction, (b) Rosetta stone fusion proteins, (c) KEGG and (d) KEGG+GO. Results and conclusions: This work presents a first comparison of the gain in the use of prior biological information in the inference of GNs by considering the eukaryote (P. falciparum) organism. Our results indicates that information based on direct interaction can produce a higher improvement in the gain than data about a less specific relationship as GO or KEGG. Also, as expected, the results show that the use of biological information is a very important approach for the improvement of the inference. We also compared the gain in the inference of the global network and only the hubs. The results indicates that the use of biological information can improve the identification of the most connected proteins.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Hematopoietic cell transplantation (HCT) is an emerging therapy for patients with severe autoimmune diseases (AID). We report data on 368 patients with AID who underwent HCT in 64 North and South American transplantation centers reported to the Center for International Blood and Marrow Transplant Research between 1996 and 2009. Most of the HCTs involved autologous grafts (n = 339); allogeneic HCT (n = 29) was done mostly in children. The most common indications for HCT were multiple sclerosis, systemic sclerosis, and systemic lupus erythematosus. The median age at transplantation was 38 years for autologous HCT and 25 years for allogeneic HCT. The corresponding times from diagnosis to HCT were 35 months and 24 months. Three-year overall survival after autologous HCT was 86% (95% confidence interval [CI], 81%-91%). Median follow-up of survivors was 31 months (range, 1-144 months). The most common causes of death were AID progression, infections, and organ failure. On multivariate analysis, the risk of death was higher in patients at centers that performed fewer than 5 autologous HCTs (relative risk, 3.5; 95% CI, 1.1-11.1; P = .03) and those that performed 5 to 15 autologous HCTs for AID during the study period (relative risk, 4.2; 95% CI, 1.5-11.7; P = .006) compared with patients at centers that performed more than 15 autologous HCTs for AID during the study period. AID is an emerging indication for HCT in the region. Collaboration of hematologists and other disease specialists with an outcomes database is important to promote optimal patient selection, analysis of the impact of prognostic variables and long-term outcomes, and development of clinical trials. Biol Blood Marrow Transplant 18: 1471-1478 (2012) (C) 2012 Published by Elsevier Inc. on behalf of American Society for Blood and Marrow Transplantation

Relevância:

80.00% 80.00%

Publicador:

Resumo:

A set of predictor variables is said to be intrinsically multivariate predictive (IMP) for a target variable if all properly contained subsets of the predictor set are poor predictors of the. target but the full set predicts the target with great accuracy. In a previous article, the main properties of IMP Boolean variables have been analytically described, including the introduction of the IMP score, a metric based on the coefficient of determination (CoD) as a measure of predictiveness with respect to the target variable. It was shown that the IMP score depends on four main properties: logic of connection, predictive power, covariance between predictors and marginal predictor probabilities (biases). This paper extends that work to a broader context, in an attempt to characterize properties of discrete Bayesian networks that contribute to the presence of variables (network nodes) with high IMP scores. We have found that there is a relationship between the IMP score of a node and its territory size, i.e., its position along a pathway with one source: nodes far from the source display larger IMP scores than those closer to the source, and longer pathways display larger maximum IMP scores. This appears to be a consequence of the fact that nodes with small territory have larger probability of having highly covariate predictors, which leads to smaller IMP scores. In addition, a larger number of XOR and NXOR predictive logic relationships has positive influence over the maximum IMP score found in the pathway. This work presents analytical results based on a simple structure network and an analysis involving random networks constructed by computational simulations. Finally, results from a real Bayesian network application are provided. (C) 2012 Elsevier Inc. All rights reserved.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Recent experimental evidence has suggested a neuromodulatory deficit in Alzheimer's disease (AD). In this paper, we present a new electroencephalogram (EEG) based metric to quantitatively characterize neuromodulatory activity. More specifically, the short-term EEG amplitude modulation rate-of-change (i.e., modulation frequency) is computed for five EEG subband signals. To test the performance of the proposed metric, a classification task was performed on a database of 32 participants partitioned into three groups of approximately equal size: healthy controls, patients diagnosed with mild AD, and those with moderate-to-severe AD. To gauge the benefits of the proposed metric, performance results were compared with those obtained using EEG spectral peak parameters which were recently shown to outperform other conventional EEG measures. Using a simple feature selection algorithm based on area-under-the-curve maximization and a support vector machine classifier, the proposed parameters resulted in accuracy gains, relative to spectral peak parameters, of 21.3% when discriminating between the three groups and by 50% when mild and moderate-to-severe groups were merged into one. The preliminary findings reported herein provide promising insights that automated tools may be developed to assist physicians in very early diagnosis of AD as well as provide researchers with a tool to automatically characterize cross-frequency interactions and their changes with disease.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Several recent studies in literature have identified brain morphological alterations associated to Borderline Personality Disorder (BPD) patients. These findings are reported by studies based on voxel-based-morphometry analysis of structural MRI data, comparing mean gray-matter concentration between groups of BPD patients and healthy controls. On the other hand, mean differences between groups are not informative about the discriminative value of neuroimaging data to predict the group of individual subjects. In this paper, we go beyond mean differences analyses, and explore to what extent individual BPD patients can be differentiated from controls (25 subjects in each group), using a combination of automated-morphometric tools for regional cortical thickness/volumetric estimation and Support Vector Machine classifier. The approach included a feature selection step in order to identify the regions containing most discriminative information. The accuracy of this classifier was evaluated using the leave-one-subject-out procedure. The brain regions indicated as containing relevant information to discriminate groups were the orbitofrontal, rostral anterior cingulate, posterior cingulate, middle temporal cortices, among others. These areas, which are distinctively involved in emotional and affect regulation of BPD patients, were the most informative regions to achieve both sensitivity and specificity values of 80% in SVM classification. The findings suggest that this new methodology can add clinical and potential diagnostic value to neuroimaging of psychiatric disorders. (C) 2012 Elsevier Ltd. All rights reserved.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Advances in biomedical signal acquisition systems for motion analysis have led to lowcost and ubiquitous wearable sensors which can be used to record movement data in different settings. This implies the potential availability of large amounts of quantitative data. It is then crucial to identify and to extract the information of clinical relevance from the large amount of available data. This quantitative and objective information can be an important aid for clinical decision making. Data mining is the process of discovering such information in databases through data processing, selection of informative data, and identification of relevant patterns. The databases considered in this thesis store motion data from wearable sensors (specifically accelerometers) and clinical information (clinical data, scores, tests). The main goal of this thesis is to develop data mining tools which can provide quantitative information to the clinician in the field of movement disorders. This thesis will focus on motor impairment in Parkinson's disease (PD). Different databases related to Parkinson subjects in different stages of the disease were considered for this thesis. Each database is characterized by the data recorded during a specific motor task performed by different groups of subjects. The data mining techniques that were used in this thesis are feature selection (a technique which was used to find relevant information and to discard useless or redundant data), classification, clustering, and regression. The aims were to identify high risk subjects for PD, characterize the differences between early PD subjects and healthy ones, characterize PD subtypes and automatically assess the severity of symptoms in the home setting.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

The present work belongs to the PRANA project, the first extensive field campaign of observation of atmospheric emission spectra covering the Far InfraRed spectral region, for more than two years. The principal deployed instrument is REFIR-PAD, a Fourier transform spectrometer used by us to study Antarctic cloud properties. A dataset covering the whole 2013 has been analyzed and, firstly, a selection of good quality spectra is performed, using, as thresholds, radiance values in few chosen spectral regions. These spectra are described in a synthetic way averaging radiances in selected intervals, converting them into BTs and finally considering the differences between each pair of them. A supervised feature selection algorithm is implemented with the purpose to select the features really informative about the presence, the phase and the type of cloud. Hence, training and test sets are collected, by means of Lidar quick-looks. The supervised classification step of the overall monthly datasets is performed using a SVM. On the base of this classification and with the help of Lidar observations, 29 non-precipitating ice cloud case studies are selected. A single spectrum, or at most an average over two or three spectra, is processed by means of the retrieval algorithm RT-RET, exploiting some main IR window channels, in order to extract cloud properties. Retrieved effective radii and optical depths are analyzed, to compare them with literature studies and to evaluate possible seasonal trends. Finally, retrieval output atmospheric profiles are used as inputs for simulations, assuming two different crystal habits, with the aim to examine our ability to reproduce radiances in the FIR. Substantial mis-estimations are found for FIR micro-windows: a high variability is observed in the spectral pattern of simulation deviations from measured spectra and an effort to link these deviations to cloud parameters has been performed.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Nowadays communication is switching from a centralized scenario, where communication media like newspapers, radio, TV programs produce information and people are just consumers, to a completely different decentralized scenario, where everyone is potentially an information producer through the use of social networks, blogs, forums that allow a real-time worldwide information exchange. These new instruments, as a result of their widespread diffusion, have started playing an important socio-economic role. They are the most used communication media and, as a consequence, they constitute the main source of information enterprises, political parties and other organizations can rely on. Analyzing data stored in servers all over the world is feasible by means of Text Mining techniques like Sentiment Analysis, which aims to extract opinions from huge amount of unstructured texts. This could lead to determine, for instance, the user satisfaction degree about products, services, politicians and so on. In this context, this dissertation presents new Document Sentiment Classification methods based on the mathematical theory of Markov Chains. All these approaches bank on a Markov Chain based model, which is language independent and whose killing features are simplicity and generality, which make it interesting with respect to previous sophisticated techniques. Every discussed technique has been tested in both Single-Domain and Cross-Domain Sentiment Classification areas, comparing performance with those of other two previous works. The performed analysis shows that some of the examined algorithms produce results comparable with the best methods in literature, with reference to both single-domain and cross-domain tasks, in $2$-classes (i.e. positive and negative) Document Sentiment Classification. However, there is still room for improvement, because this work also shows the way to walk in order to enhance performance, that is, a good novel feature selection process would be enough to outperform the state of the art. Furthermore, since some of the proposed approaches show promising results in $2$-classes Single-Domain Sentiment Classification, another future work will regard validating these results also in tasks with more than $2$ classes.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Osteoarticular allograft transplantation is a popular treatment method in wide surgical resections with large defects. For this reason hospitals are building bone data banks. Performing the optimal allograft selection on bone banks is crucial to the surgical outcome and patient recovery. However, current approaches are very time consuming hindering an efficient selection. We present an automatic method based on registration of femur bones to overcome this limitation. We introduce a new regularization term for the log-domain demons algorithm. This term replaces the standard Gaussian smoothing with a femur specific polyaffine model. The polyaffine femur model is constructed with two affine (femoral head and condyles) and one rigid (shaft) transformation. Our main contribution in this paper is to show that the demons algorithm can be improved in specific cases with an appropriate model. We are not trying to find the most optimal polyaffine model of the femur, but the simplest model with a minimal number of parameters. There is no need to optimize for different number of regions, boundaries and choice of weights, since this fine tuning will be done automatically by a final demons relaxation step with Gaussian smoothing. The newly developed synthesis approach provides a clear anatomically motivated modeling contribution through the specific three component transformation model, and clearly shows a performance improvement (in terms of anatomical meaningful correspondences) on 146 CT images of femurs compared to a standard multiresolution demons. In addition, this simple model improves the robustness of the demons while preserving its accuracy. The ground truth are manual measurements performed by medical experts.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Aim of this paper is to evaluate the diagnostic contribution of various types of texture features in discrimination of hepatic tissue in abdominal non-enhanced Computed Tomography (CT) images. Regions of Interest (ROIs) corresponding to the classes: normal liver, cyst, hemangioma, and hepatocellular carcinoma were drawn by an experienced radiologist. For each ROI, five distinct sets of texture features are extracted using First Order Statistics (FOS), Spatial Gray Level Dependence Matrix (SGLDM), Gray Level Difference Method (GLDM), Laws' Texture Energy Measures (TEM), and Fractal Dimension Measurements (FDM). In order to evaluate the ability of the texture features to discriminate the various types of hepatic tissue, each set of texture features, or its reduced version after genetic algorithm based feature selection, was fed to a feed-forward Neural Network (NN) classifier. For each NN, the area under Receiver Operating Characteristic (ROC) curves (Az) was calculated for all one-vs-all discriminations of hepatic tissue. Additionally, the total Az for the multi-class discrimination task was estimated. The results show that features derived from FOS perform better than other texture features (total Az: 0.802+/-0.083) in the discrimination of hepatic tissue.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

In this paper, a computer-aided diagnostic (CAD) system for the classification of hepatic lesions from computed tomography (CT) images is presented. Regions of interest (ROIs) taken from nonenhanced CT images of normal liver, hepatic cysts, hemangiomas, and hepatocellular carcinomas have been used as input to the system. The proposed system consists of two modules: the feature extraction and the classification modules. The feature extraction module calculates the average gray level and 48 texture characteristics, which are derived from the spatial gray-level co-occurrence matrices, obtained from the ROIs. The classifier module consists of three sequentially placed feed-forward neural networks (NNs). The first NN classifies into normal or pathological liver regions. The pathological liver regions are characterized by the second NN as cyst or "other disease." The third NN classifies "other disease" into hemangioma or hepatocellular carcinoma. Three feature selection techniques have been applied to each individual NN: the sequential forward selection, the sequential floating forward selection, and a genetic algorithm for feature selection. The comparative study of the above dimensionality reduction methods shows that genetic algorithms result in lower dimension feature vectors and improved classification performance.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Fossil pollen data from stratigraphic cores are irregularly spaced in time due to non-linear age-depth relations. Moreover, their marginal distributions may vary over time. We address these features in a nonparametric regression model with errors that are monotone transformations of a latent continuous-time Gaussian process Z(T). Although Z(T) is unobserved, due to monotonicity, under suitable regularity conditions, it can be recovered facilitating further computations such as estimation of the long-memory parameter and the Hermite coefficients. The estimation of Z(T) itself involves estimation of the marginal distribution function of the regression errors. These issues are considered in proposing a plug-in algorithm for optimal bandwidth selection and construction of confidence bands for the trend function. Some high-resolution time series of pollen records from Lago di Origlio in Switzerland, which go back ca. 20,000 years are used to illustrate the methods.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

BACKGROUND AND PURPOSE Stent retrievers have become an important tool for the treatment of acute ischemic stroke. The aim of this study was to analyze outcome and complications in a large cohort of patients with stroke treated with the Solitaire stent retriever. The study also included patients who did not meet standard inclusion criteria for endovascular treatment: low or high baseline National Institutes of Health Stroke Scale score, ≥80 years of age, extensive ischemic signs in middle cerebral artery territory, and time from symptom onset to endovascular intervention>8 hours. METHODS Consecutive patients with acute anterior circulation stroke treated with the Solitaire FR were analyzed. Data on characteristics of endovascular interventions, complications, and clinical outcome were collected prospectively. Patients who met standard inclusion criteria were compared with those who did not. RESULTS A total of 227 patients were included. Mean age was 68.2±14.7 years, and median National Institutes of Health Stroke Scale score on admission was 16 (range, 2-36). Reperfusion was successful (thrombolysis in cerebral infarction, 2b-3) in 70.9%. Outcome was favorable (modified Rankin Scale, 0-2) in 57.7% of patients who met standard inclusion criteria and 30.3% of those who did not. The rates for symptomatic intracranial hemorrhage were 3.7% and 13.1%, for death 11.4% and 33.8%, and for symptomatic intraprocedural complications 2.5% and 4.8%, respectively. CONCLUSIONS Patients<80 years of age, without extensive pretreatment ischemic signs, and baseline National Institutes of Health Stroke Scale score≤30 had high rates of favorable outcome and low periprocedural complication rates after Solitaire thrombectomy. Successful reperfusion was also common in patients not fulfilling standard inclusion criteria, but worse clinical outcomes warrant further research with a special focus on optimal patient selection.

Relevância:

80.00% 80.00%

Publicador:

Resumo:

Abstract: The third-generation bovine pericardium Freedom SOLO (FS) stentless valve emerged in 2004 as a modified version of the Pericarbon Freedom stentless valve and as a very attractive alternative to stented bioprostheses. The design, choice of tissue, and anticalcification treatment fulfill most, if not all, requirements for an ideal valve substitute. The FS combines the single-suture, subcoronary implantation technique with the latest-generation bovine pericardial tissue and novel anticalcification treatment. The design allows imitation of the native healthy valve through unrestricted adaption to the patient's anatomy, reproducing a normal valve/root complex. However, despite hemodynamic performance superior to stented valves, we are approaching a critical observation period as superior durability, freedom from structural valve deterioration, and nonstructural failure has not been proven as expected. However, optimal performance and freedom from structural valve deterioration depend on correct sizing and perfect symmetric implantation, to ensure low leaflet stress. Any malpositioning can lead to tissue fatigue over time. Furthermore, the potential for better outcomes depends on optimal patient selection and observance of the limitations for the use of stentless valves, particularly for the FS. Clearly, stentless valve implantation techniques are less reproducible and standardized, and require surgeon-dependent experience and skill. Regardless of whether or not stentless valve durability surpasses third-generation stented bioprostheses, they will continue to play a role in the surgical repertoire. This review intends to help practitioners avoid pitfalls, observe limitations, and improve patient selection for optimal long-term outcome with the attractive FS stentless valve.