938 results for Naive Bayes
Abstract:
Web APIs have gained increasing popularity in recent Web service technology development owing to the simplicity of their technology stack and the proliferation of mashups. However, efficiently discovering Web APIs and the relevant documentation on the Web is still a challenging task, even with the best resources available on the Web. In this paper we cast the problem of detecting Web API documentation as a text classification problem: classifying a given Web page as Web API associated or not. We propose a supervised generative topic model called feature latent Dirichlet allocation (feaLDA), which offers a generic probabilistic framework for automatic detection of Web APIs. feaLDA not only captures the correspondence between data and the associated class labels, but also provides a mechanism for incorporating side information, such as labelled features automatically learned from data, that can effectively help improve classification performance. Extensive experiments on our Web API documentation dataset show that the feaLDA model outperforms three strong supervised baselines (naive Bayes, support vector machines, and the maximum entropy model) by over 3% in classification accuracy. In addition, feaLDA gives superior performance when compared against other existing supervised topic models.
Abstract:
Identification of humans via ECG is being increasingly studied because it can have several advantages over traditional biometric identification techniques. However, difficulties arise because of heart rate variability. In this study we analysed the influence of QT interval correction on the performance of an identification system based on temporal and amplitude features of the ECG. In particular, we tested MLP, Naive Bayes and 3-NN classifiers on the Fantasia database. Results indicate that QT correction can significantly improve the overall system performance. © 2013 IEEE.
Abstract:
This demo reviews basic text mining technologies using RapidMining. It describes the basic characteristics of RapidMining and its text mining operators, and presents a text mining example that uses the Naive Bayes algorithm together with process modeling.
Abstract:
Hebb proposed that synapses between neurons that fire synchronously are strengthened, forming cell assemblies and phase sequences. The former, on a shorter scale, are ensembles of synchronized cells that function transiently as a closed processing system; the latter, on a larger scale, correspond to the sequential activation of cell assemblies able to represent percepts and behaviors. Nowadays, the recording of large neuronal populations allows for the detection of multiple cell assemblies. Within Hebb’s theory, the next logical step is the analysis of phase sequences. Here we detected phase sequences as consecutive assembly activation patterns, and then analyzed their graph attributes in relation to behavior. We investigated action potentials recorded from the adult rat hippocampus and neocortex before, during and after novel object exploration (experimental periods). Within assembly graphs, each assembly corresponded to a node, and each edge corresponded to the temporal sequence of consecutive node activations. The sum of all assembly activations was proportional to firing rates, but the activity of individual assemblies was not. Assembly repertoire was stable across experimental periods, suggesting that novel experience does not create new assemblies in the adult rat. Assembly graph attributes, on the other hand, varied significantly across behavioral states and experimental periods, and were separable enough to correctly classify experimental periods (Naïve Bayes classifier; maximum AUROCs ranging from 0.55 to 0.99) and behavioral states (waking, slow wave sleep, and rapid eye movement sleep; maximum AUROCs ranging from 0.64 to 0.98). Our findings agree with Hebb’s view that neuronal assemblies correspond to primitive building blocks of representation, nearly unchanged in the adult, while phase sequences are labile across behavioral states and change after novel experience. The results are compatible with a role for phase sequences in behavior and cognition.
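The graph construction described in this abstract (assemblies as nodes, consecutive activations as edges) can be illustrated with a minimal Python sketch. The function name `assembly_graph` and the activation sequence are hypothetical and are not taken from the paper; the sketch only counts how often one assembly directly follows another.

```python
from collections import Counter

def assembly_graph(activations):
    """Build a directed, weighted graph from an ordered list of assembly IDs.

    Nodes are assemblies; the weight of edge (a, b) counts how often
    assembly b was activated immediately after assembly a.
    """
    nodes = set(activations)
    edges = Counter(zip(activations, activations[1:]))
    return nodes, edges

# Hypothetical activation sequence (labels are made up for illustration).
sequence = ["A", "B", "A", "B", "C", "A"]
nodes, edges = assembly_graph(sequence)
print(sorted(nodes))
print(dict(edges))
```

Graph attributes (for example, the number of distinct edges or the edge-weight distribution) could then be computed per period and passed to a classifier; which attributes the authors actually used is not stated in the abstract.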
Abstract:
Data analysis today faces problems arising from the combination of data from diverse information sources. The value of information can be greatly enriched by facilitating the integration of new data sources, and industry is well aware of this today. However, not only the volume but also the great diversity of the data poses a problem prior to analysis. Good data integration ensures reliable results, and it is therefore worth dwelling on improving the processes of data specification, collection, cleaning and integration. This work is devoted to the data cleaning and integration phase, analysing existing procedures and proposing a solution that is applied to medical data, thus focusing on prediction projects (for prevention purposes) in the health sciences. In addition to implementing the cleaning processes, outlier detection algorithms are developed that improve the quality of the dataset once the outliers are removed. The work also includes the implementation of a prediction process to support decision-making. Specifically, this work carries out a predictive analysis of data on drug-dependent patients of the Clínica Nuestra Señora de la Paz, in order to support the decisions of the physician in charge of admitting patients to that clinic. In most cases, the data provided require adequate preprocessing for the results of traditional statistical analyses to be reliable. To that end, this work implements several ways of detecting outliers: an original algorithm (Detección de Outliers con Cadenas No Monótonas, i.e. outlier detection with non-monotonic chains), which exploits the advantages of the Knuth-Morris-Pratt pattern-matching algorithm, and the R libraries outliers and Rcmdr.
Applying data cleaning and integration procedures, together with the removal of atypical data, yields a clean and reliable database on which prediction procedures will be implemented with the Naive Bayes classification algorithm in R.
Abstract:
Given the recent advent of NGS technologies, capable of sequencing entire human genomes at reduced time and cost, the ability to extract information from the data plays a fundamental role in the advancement of research. The computational problems connected to such analyses currently fall under the topic of Big Data, with databases containing various types of experimental data of ever-increasing size. This thesis deals with the implementation and benchmarking of the QDANet PRO algorithm, developed by the Biophysics group of the University of Bologna: the method allows the processing of high-dimensional data to extract a low-dimensional Signature of features with high classification performance, through an analysis pipeline that includes dimensionality reduction algorithms. The method can also be generalized to the analysis of non-biological data that are nonetheless characterized by high volume and complexity, factors typical of Big Data. The QDANet PRO algorithm evaluates the performance of all possible pairs of features, estimating their discriminating power with a Naive Bayes Quadratic Classifier and then ranking them. Once a performance threshold has been selected, a network of the features is built, from which the connected components are determined. Each subgraph is analysed separately and reduced using methods based on network theory until the final Signature is extracted. The method, previously tested with positive results on some datasets available to the research group, was compared with the results obtained on omics databases available in the literature, which constitute a reference in the field, and with existing algorithms that perform similar tasks. To reduce computation time, the algorithm was implemented in C++ on HPC, with the most critical parts parallelized using OpenMP libraries.
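The thresholding and connected-components step described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the QDANet PRO implementation (which the abstract says is written in C++ with OpenMP): the pair scores are taken as given rather than computed with a quadratic classifier, and the feature names, scores, and the function `connected_components` are hypothetical.

```python
def connected_components(features, pair_scores, threshold):
    """Link features whose pair score reaches the threshold, then return
    the connected components of the resulting undirected network."""
    adjacency = {f: set() for f in features}
    for (a, b), score in pair_scores.items():
        if score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen, components = set(), []
    for start in features:
        if start in seen:
            continue
        # Depth-first traversal collecting everything reachable from start.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacency[node] - seen)
        components.append(component)
    return components

# Hypothetical pairwise classification scores for four features.
features = ["g1", "g2", "g3", "g4"]
pair_scores = {
    ("g1", "g2"): 0.90, ("g2", "g3"): 0.85, ("g1", "g3"): 0.60,
    ("g1", "g4"): 0.50, ("g2", "g4"): 0.55, ("g3", "g4"): 0.40,
}
components = connected_components(features, pair_scores, threshold=0.8)
print(components)
```

In this sketch a feature with no edge above the threshold forms its own single-node component; whether the actual method keeps or discards such features is not stated in the abstract.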
Abstract:
Security defects are common in large software systems because of their size and complexity. Although efficient development processes, testing, and maintenance policies are applied to software systems, a large number of vulnerabilities can remain despite these measures. Some vulnerabilities stay in a system from one release to the next because they cannot be easily reproduced through testing. These vulnerabilities endanger the security of the systems. We propose vulnerability classification and prediction frameworks based on vulnerability reproducibility. The frameworks are effective at identifying the types and locations of vulnerabilities at an early stage, and at improving the security of the software in subsequent versions (referred to as releases). We expand an existing concept of software bug classification to vulnerability classification (easily reproducible and hard to reproduce) to develop a classification framework for differentiating between these vulnerabilities based on code fixes and textual reports. We then investigate the potential correlations between the vulnerability categories, classical software metrics, and other runtime environmental factors of reproducibility to develop a vulnerability prediction framework. The classification and prediction frameworks help developers adopt corresponding mitigation or elimination actions and develop appropriate test cases. The vulnerability prediction framework also helps security experts focus their effort on the top-ranked vulnerability-prone files. As a result, the frameworks decrease the number of attacks that exploit security vulnerabilities in subsequent versions of the software. To build the classification and prediction frameworks, different machine learning techniques (C4.5 Decision Tree, Random Forest, Logistic Regression, and Naive Bayes) are employed.
The effectiveness of the proposed frameworks is assessed based on collected software security defects of Mozilla Firefox.
Abstract:
The aim of this thesis is to present a new approach to document classification using verb-object pairs. We explore one possible strategy that uses the presence of relevant verb-object pairs in documents as features and trains a Naive Bayes classifier on them. We then assess the results of a case study that uses software based on this strategy and draw conclusions.
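A minimal Python sketch of the idea of presence features with a Naive Bayes classifier follows. It assumes the verb-object pairs have already been extracted (the extraction step is not shown), uses a Bernoulli-style model with Laplace smoothing as one possible choice, and all documents, labels, and function names are hypothetical; the thesis may use a different variant.

```python
import math
from collections import defaultdict

def train(docs, labels):
    """docs: list of sets of (verb, object) pairs; labels: one class per doc."""
    vocab = set().union(*docs)
    by_class = defaultdict(list)
    for pairs, label in zip(docs, labels):
        by_class[label].append(pairs)

    model = {}
    for label, members in by_class.items():
        prior = len(members) / len(docs)
        # Laplace-smoothed probability that a pair is present in this class.
        present = {
            p: (sum(p in d for d in members) + 1) / (len(members) + 2)
            for p in vocab
        }
        model[label] = (prior, present)
    return model

def predict(model, pairs):
    """Return the class with the highest log-posterior for a set of pairs."""
    def log_score(prior, present):
        score = math.log(prior)
        for p, prob in present.items():
            score += math.log(prob if p in pairs else 1.0 - prob)
        return score
    return max(model, key=lambda label: log_score(*model[label]))

# Hypothetical training documents, already reduced to verb-object pairs.
docs = [
    {("install", "package"), ("run", "command")},
    {("run", "command"), ("configure", "server")},
    {("score", "goal"), ("win", "match")},
    {("win", "match"), ("sign", "player")},
]
labels = ["tech", "tech", "sport", "sport"]
model = train(docs, labels)
print(predict(model, {("run", "command")}))
```

Pairs that never occurred in training are simply ignored at prediction time in this sketch, because only vocabulary pairs contribute to the score.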
Abstract:
The presence of mutations associated with integrase inhibitor (INI) resistance among INI-naive patients may play an important clinical role in the use of those drugs. Samples from 76 HIV-1-infected subjects naive to INIs were submitted to direct sequencing. No differences were found between naive (25%) subjects and subjects on HAART (75%). No primary mutation associated with raltegravir or elvitegravir resistance was found. However, 78% of sequences showed at least one accessory mutation associated with resistance. The analysis of the 76 IN sequences showed a high level of polymorphism in this region among Brazilian HIV-1-infected subjects, including a high prevalence of amino acid (aa) substitutions related to INI resistance. The impact of these findings remains unclear and further studies are necessary to address these questions.
Abstract:
Objectives: Adults with major depressive disorder (MDD) are reported to have reduced orbitofrontal cortex (OFC) volumes, which could be related to decreased neuronal density. We conducted a study on medication-naive children with MDD to determine whether abnormalities of the OFC are present early in the illness course. Methods: Twenty-seven medication-naive pediatric Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV) MDD patients (mean age ± SD = 14.4 ± 2.2 years; 10 males) and 26 healthy controls (mean age ± SD = 14.4 ± 2.4 years; 12 males) underwent 1.5T magnetic resonance imaging (MRI) with 3D spoiled gradient recalled acquisition. The OFC volumes were compared using analysis of covariance with age, gender, and total brain volume as covariates. Results: There was no significant difference in either total OFC volume or total gray matter OFC volume between MDD patients and healthy controls. Exploratory analysis revealed that patients had unexpectedly larger total right lateral (F = 4.2, df = 1, 48, p = 0.05) and right lateral gray matter (F = 4.6, df = 1, 48, p = 0.04) OFC volumes compared to healthy controls, but this finding was not significant following statistical correction for multiple comparisons. No other OFC subregions showed a significant difference. Conclusions: The lack of OFC volume abnormalities in pediatric MDD patients suggests the abnormalities previously reported for adults may develop later in life as a result of neural cell loss.
Abstract:
Objective: The striatum, including the putamen and caudate, plays an important role in executive and emotional processing and may be involved in the pathophysiology of mood disorders. Few studies have examined structural abnormalities of the striatum in pediatric major depressive disorder (MDD) patients. We report striatal volume abnormalities in medication-naive pediatric MDD patients compared to healthy comparison subjects. Method: Twenty-seven medication-naive pediatric Diagnostic and Statistical Manual of Mental Disorders, 4th edition (DSM-IV) MDD patients and 26 healthy comparison subjects underwent volumetric magnetic resonance imaging (MRI). The putamen and caudate volumes were traced manually by a blinded rater, and the patient and control groups were compared using analysis of covariance adjusting for age, sex, intelligence quotient, and total brain volumes. Results: MDD patients had significantly smaller right striatum (6.0% smaller) and right caudate volumes (7.4% smaller) compared to the healthy subjects. Left caudate volumes were inversely correlated with severity of depression in MDD subjects. Age was inversely correlated with left and right putamen volumes in MDD patients but not in the healthy subjects. Conclusions: These findings provide fresh evidence for abnormalities in the striatum of medication-naive pediatric MDD patients and suggest the possible involvement of the striatum in the pathophysiology of MDD.
Abstract:
Epidural motor cortex stimulation (MCS) has been used for treating patients with neuropathic pain resistant to other therapeutic approaches. Experimental evidence suggests that the motor cortex is also involved in the modulation of the normal nociceptive response, but the underlying mechanisms of pain control have not yet been clarified. The aim of this study was to investigate the effects of epidural electrical MCS on the nociceptive threshold of naive rats. Electrodes were placed on the epidural motor cortex, over the hind paw area, according to the functional mapping accomplished in this study. Nociceptive threshold and general activity were evaluated during 15-min electrical stimulation sessions. When rats were evaluated by the paw pressure test, MCS induced selective antinociception in the paw contralateral to the stimulated cortex, but no changes were noticed in the ipsilateral paw. When the nociceptive test was repeated 15 min after cessation of electrical stimulation, the nociceptive threshold returned to basal levels. On the other hand, no changes in the nociceptive threshold were observed in rats evaluated by the tail-flick test. Additionally, no behavioral or motor impairment was noticed during the stimulation session in the open-field test. Stimulation of posterior parietal or somatosensory cortices did not elicit any changes in the general activity or nociceptive response. Opioid receptor blockade by naloxone abolished the increase in nociceptive threshold induced by MCS. The data shown herein demonstrate that epidural electrical MCS elicits a substantial and selective antinociceptive effect, which is mediated by opioids. (C) 2008 Elsevier B.V. All rights reserved.
Abstract:
Egr-1 and related proteins are inducible transcription factors within the brain recognizing the same consensus DNA sequence. Three Egr DNA-binding activities were observed in regions of the naive rat brain. Egr-1 was present in all brain regions examined. Bands composed, at least in part, of Egr-2 and Egr-3 were present in different relative amounts in the cerebral cortex, striatum, hippocampus, thalamus, and midbrain. All had similar affinity and specificity for the Egr consensus DNA recognition sequence. Administration of the convulsants NMDA, kainate, and pentylenetetrazole differentially induced Egr-1 and Egr-2/3 DNA-binding activities in the cerebral cortex, hippocampus, and cerebellum. All convulsants induced Egr-1 and Egr-2 immunoreactivity in the cerebral cortex and hippocampus. These data indicate that the members of the Egr family are regulated at different levels and may interact at promoters containing the Egr consensus sequence to fine tune a program of gene expression resulting from excitatory stimuli.
Abstract:
Generalized Social Anxiety Disorder (SAD) is one of the most common anxiety conditions with impairment in social life. Cannabidiol (CBD), one major non-psychotomimetic compound of the cannabis sativa plant, has shown anxiolytic effects both in humans and in animals. This preliminary study aimed to compare the effects of a simulation public speaking test (SPST) on healthy control (HC) patients and treatment-naive SAD patients who received a single dose of CBD or placebo. A total of 24 never-treated patients with SAD were allocated to receive either CBD (600 mg; n = 12) or placebo (n = 12) in a double-blind randomized design 1.5 h before the test. The same number of HC (n = 12) performed the SPST without receiving any medication. Each volunteer participated in only one experimental session in a double-blind procedure. Subjective ratings on the Visual Analogue Mood Scale (VAMS) and Negative Self-Statement scale (SSPS-N) and physiological measures (blood pressure, heart rate, and skin conductance) were measured at six different time points during the SPST. The results were submitted to a repeated-measures analysis of variance. Pretreatment with CBD significantly reduced anxiety, cognitive impairment and discomfort in their speech performance, and significantly decreased alert in their anticipatory speech. The placebo group presented higher anxiety, cognitive impairment, discomfort, and alert levels when compared with the control group as assessed with the VAMS. The SSPS-N scores showed significant increases during testing in the placebo group, an effect that was almost abolished in the CBD group. No significant differences were observed between CBD and HC in SSPS-N scores or in the cognitive impairment, discomfort, and alert factors of VAMS. The increase in anxiety induced by the SPST in subjects with SAD was reduced with the use of CBD, resulting in a response similar to that of the HC.
Neuropsychopharmacology (2011) 36, 1219-1226; doi: 10.1038/npp.2011.6; published online 9 February 2011