947 results for naive bayes classifier
Abstract:
We have analyzed purine (R) and pyrimidine (Y) codon patterns in variable and constant regions of HIV-1 gp120 in seven patients infected with different HIV-1 subtypes and naive to antiretroviral therapy. We have calculated the relative frequency of each in-frame codon RNY, YNR, RNR, and YNY (N=any nucleotide) in variable and constant regions of gp120, in the sequence within indels and at indels' flanking sites. Our data show that hypervariable regions V1, V2, V4, and V5 are characterized by the presence of long stretches of RNY codons constituting the majority of the sequence portion within insertions/deletions. In full-length gp120 and within inserted/deleted fragments the number of AVT (V=A, C, G) codons did not exceed 50% of the total RNY codons. RNY strings in variable regions spanned up to 21 codons and were always in frame. In contrast, RNY strings in constant regions were mostly out of frame and their length was limited to five codons. The frequency of the codon RNY was found to be significantly higher in variable regions (p<0.0001; t-test), within indels, and at indels' flanking sites (p<0.0001; χ² test). Analysis of the distribution of RNY strings equal to or longer than five codons in the full genome of HXB2 also shows that these sequences are mostly out of frame, unless they contain a potential N-glycosylation site or an asparagine. These data suggest that cryptic repeats of RNY may play a role in the genesis of multiple base insertions and deletions in hypervariable regions of gp120.
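As an illustration of the codon classes used in the abstract above, the following minimal Python sketch assigns each in-frame codon to RNY, YNR, RNR, or YNY and reports relative frequencies. It is not the authors' pipeline: the function names `codon_class` and `codon_class_frequencies` and the toy sequence are hypothetical, and codons with ambiguous first or third bases are simply skipped, which is an assumption.

```python
from collections import Counter

PURINES = set("AG")        # R
PYRIMIDINES = set("CTU")   # Y (U accepted so RNA sequences also work)


def codon_class(codon):
    """Return 'RNY', 'YNR', 'RNR' or 'YNY', or None if a base is ambiguous."""
    def kind(base):
        base = base.upper()
        if base in PURINES:
            return "R"
        if base in PYRIMIDINES:
            return "Y"
        return None

    first, third = kind(codon[0]), kind(codon[2])
    if first is None or third is None:
        return None
    return f"{first}N{third}"


def codon_class_frequencies(seq, frame=0):
    """Relative frequency of each codon class among complete codons in one frame."""
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    counts = Counter(codon_class(c) for c in codons)
    counts.pop(None, None)  # drop codons that could not be classified
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()} if total else {}


# Hypothetical toy sequence: AAT GGC CTA ACT
freqs = codon_class_frequencies("AATGGCCTAACT")
```

Changing `frame` to 1 or 2 gives the out-of-frame counts, which is the comparison the abstract draws between variable and constant regions.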
Abstract:
BACKGROUND: Therapy of chronic hepatitis C (CHC) with pegIFNα/ribavirin achieves a sustained virologic response (SVR) in ∼55%. Pre-activation of the endogenous interferon system in the liver is associated with non-response (NR). Recently, genome-wide association studies described associations of allelic variants near the IL28B (IFNλ3) gene with treatment response and with spontaneous clearance of the virus. We investigated if the IL28B genotype determines the constitutive expression of IFN stimulated genes (ISGs) in the liver of patients with CHC. METHODS: We genotyped 93 patients with CHC for 3 IL28B single nucleotide polymorphisms (SNPs, rs12979860, rs8099917, rs12980275), extracted RNA from their liver biopsies and quantified the expression of IL28B and of 8 previously identified classifier genes which discriminate between SVR and NR (IFI44L, RSAD2, ISG15, IFI22, LAMP3, OAS3, LGALS3BP and HTATIP2). Decision tree ensembles in the form of a random forest classifier were used to calculate the relative predictive power of these different variables in a multivariate analysis. RESULTS: The minor IL28B allele (bad risk for treatment response) was significantly associated with increased expression of ISGs, and, unexpectedly, with decreased expression of IL28B. Stratification of the patients into SVR and NR revealed that ISG expression was conditionally independent from the IL28B genotype, i.e. there was an increased expression of ISGs in NR compared to SVR irrespective of the IL28B genotype. The random forest feature score (RFFS) identified IFI27 (RFFS = 2.93), RSAD2 (1.88) and HTATIP2 (1.50) expression and the HCV genotype (1.62) as the strongest predictors of treatment response. ROC curves of the IL28B SNPs showed an AUC of 0.66 with an error rate (ERR) of 0.38. A classifier with the 3 best classifying genes showed an excellent test performance with an AUC of 0.94 and ERR of 0.15. 
The addition of IL28B genotype information did not improve the predictive power of the 3-gene classifier. CONCLUSIONS: IL28B genotype and hepatic ISG expression are conditionally independent predictors of treatment response in CHC. There is no direct link between altered IFNλ3 expression and pre-activation of the endogenous system in the liver. Hepatic ISG expression is by far the better predictor for treatment response than IL28B genotype.
Abstract:
Standard practice in wave-height hazard analysis often pays little attention to the uncertainty of assessed return periods and occurrence probabilities. This fact favors the opinion that, when large events happen, the hazard assessment should change accordingly. However, the uncertainty of the hazard estimates is normally large enough to hide the effect of those large events. This is illustrated using data from the Mediterranean coast of Spain, where recent years have been extremely disastrous. Thus, it is possible to compare the hazard assessment based on data prior to those years with the analysis including them. With our approach, no significant change is detected when the statistical uncertainty is taken into account. The hazard analysis is carried out with a standard model. The time occurrence of events is assumed to be Poisson distributed. The wave height of each event is modelled as a random variable whose upper tail follows a Generalized Pareto Distribution (GPD). Moreover, wave heights are assumed independent from event to event and also independent of their occurrence in time. A threshold for excesses is assessed empirically. The other three parameters (Poisson rate, and shape and scale parameters of the GPD) are jointly estimated using Bayes' theorem. The prior distribution accounts for physical features of ocean waves in the Mediterranean Sea and experience with these phenomena. The posterior distribution of the parameters allows one to obtain posterior distributions of derived parameters such as occurrence probabilities and return periods. Predictive distributions are also available. Computations are carried out using the program BGPE v2.0.
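To make the Poisson-GPD model in the abstract above concrete, the sketch below computes a return period and an occurrence probability for a given wave height from fixed parameter values. It is a plug-in calculation, not the Bayesian procedure or the BGPE program the abstract describes: the function names and the numeric values are hypothetical, and in the Bayesian setting these functions would be evaluated over posterior draws of the rate, shape, and scale rather than at single values.

```python
import math


def gpd_survival(h, threshold, shape, scale):
    """P(height > h | event exceeds threshold) under a GPD for the excesses."""
    if h <= threshold:
        return 1.0
    excess = h - threshold
    if abs(shape) < 1e-12:           # shape -> 0 gives the exponential tail
        return math.exp(-excess / scale)
    z = 1.0 + shape * excess / scale
    if z <= 0.0:                     # beyond the upper endpoint (shape < 0)
        return 0.0
    return z ** (-1.0 / shape)


def return_period(h, rate, threshold, shape, scale):
    """Mean time between events exceeding h, in the time unit of `rate`."""
    s = gpd_survival(h, threshold, shape, scale)
    return math.inf if s == 0.0 else 1.0 / (rate * s)


def occurrence_probability(h, years, rate, threshold, shape, scale):
    """Probability of at least one event exceeding h within `years`."""
    s = gpd_survival(h, threshold, shape, scale)
    return 1.0 - math.exp(-rate * s * years)


# Hypothetical values: 2 events/year above a 3 m threshold, exponential tail.
h = 3.0 + math.log(10.0)             # height exceeded by 10% of events
T = return_period(h, rate=2.0, threshold=3.0, shape=0.0, scale=1.0)
p = occurrence_probability(h, years=5.0, rate=2.0, threshold=3.0, shape=0.0, scale=1.0)
```

Under these assumed values the exceedance rate of `h` is 0.2 per year, so `T` is about 5 years and `p` is about 1 − e⁻¹.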
Abstract:
The objective of this work was to propose a Bayesian approach to the Eberhart & Russell method for evaluating the phenotypic adaptability and stability of alfalfa (Medicago sativa) genotypes, and to evaluate the efficiency of using informative and weakly informative prior distributions. Data were used from a randomized block experiment in which the dry matter yield of 92 genotypes was evaluated. The proposed Bayesian methodology was implemented in the free software R using the MCMCregress function of the MCMCpack package. To represent weakly informative prior distributions, probability distributions with large variance were used; to represent informative prior distributions, the concept of meta-analysis was adopted, which is characterized by the use of information from previous studies. The prior distributions were compared by means of the Bayes factor, using the BayesFactor function of the MCMCpack package, which indicated the informative prior as the most suitable under the conditions of this study.
Abstract:
We characterized lipid and lipoprotein changes associated with a lopinavir/ritonavir-containing regimen. We enrolled previously antiretroviral-naive patients participating in the Swiss HIV Cohort Study. Fasting blood samples (baseline) were retrieved retrospectively from stored frozen plasma and posttreatment (follow-up) samples were collected prospectively at two separate visits. Lipids and lipoproteins were analyzed at a single reference laboratory. Sixty-five patients had two posttreatment lipid profile measurements and nine had only one. Most of the measured lipids and lipoprotein plasma concentrations increased on lopinavir/ritonavir-based treatment. The percentage of patients with hypertriglyceridemia (TG >150 mg/dl) increased from 28/74 (38%) at baseline to 37/65 (57%) at the second follow-up. We did not find any correlation between lopinavir plasma levels and the concentration of triglycerides. There was weak evidence of an increase in small dense LDL-apoB during the first year of treatment but not beyond 1 year (odds ratio 4.5, 90% CI 0.7 to 29 and 0.9, 90% CI 0.5 to 1.5, respectively). However, 69% of our patients still had undetectable small dense LDL-apoB levels while on treatment. LDL-cholesterol increased by a mean of 17 mg/dl (90% CI -3 to 37) during the first year of treatment, but mean values remained below the cut-off for therapeutic intervention. Despite an increase in the majority of measured lipids and lipoproteins particularly in the first year after initiation, we could not detect an obvious increase of cardiovascular risk resulting from the observed lipid changes.
Abstract:
(from the journal abstract) Schizophrenia, a major psychiatric disease, affects individuals in the centre of their personality. Its aetiology is not clearly established. In this review, we will present evidence that patients suffering from schizophrenia present a brain deficit in glutathione, a major endogenous redox regulator and antioxidant. We will also show that, in experimental models, a decrease in glutathione, particularly during development, induces morphological, electrophysiological and behavioural anomalies consistent with those observed in the disease. In the cerebrospinal fluid of drug-naive schizophrenics, the glutathione level was decreased by 27% and that of its direct metabolite by 16%. The glutathione level in the prefrontal cortex of patients, measured by magnetic resonance spectroscopy, was 52% lower than in controls. Patients' fibroblasts reveal a decrease in mRNA levels of the two glutathione-synthesising enzymes, glutamate-cysteine ligase modulatory subunit (GCLM) and glutathione synthetase. GCLM expression level in fibroblasts correlates negatively with symptom severity. Glutathione is an important endogenous redox regulator and neuroactive substance. It protects cells from damage by reactive oxygen species generated, among others, by dopamine metabolism. A glutathione deficit-induced oxidative stress would lead to lipid peroxidation and micro-lesions at the level of dendritic spines, a synaptic damage responsible for abnormal nervous connections or structural disconnectivity. On the other hand, a glutathione deficit could also lead to a functional disconnectivity by depressing NMDA neurotransmission, in analogy to phencyclidine effects. Present experimental data are consistent with the proposed hypothesis: pharmacologically decreasing the glutathione level in experimental models, with or without blocking dopamine (DA) uptake (GBR12909), induces morphological, electrophysiological and behavioural changes similar to those observed in patients.
In summary, a deficit of glutathione and/or glutathione-related enzymes during early development would lead to both a functional and a structural disconnectivity, which could be at the basis of some perceptive, cognitive and behavioural troubles of the disease. It could constitute a major vulnerability factor for schizophrenia. Attempts to restore physiological glutathione functions could open new therapeutic avenues. This translational research, made possible by a close interaction between clinicians and neuroscientists, should also pave the way to the identification of biological markers for schizophrenia. In turn, they should allow early diagnosis and, hopefully, preventive intervention for this devastating disease. (PsycINFO Database Record (c) 2005 APA, all rights reserved)
Abstract:
Land use/cover classification is one of the most important applications in remote sensing. However, mapping accurate land use/cover spatial distribution is a challenge, particularly in moist tropical regions, due to the complex biophysical environment and limitations of remote sensing data per se. This paper reviews a decade of experiments related to land use/cover classification in the Brazilian Amazon. Through comprehensive analysis of the classification results, it is concluded that spatial information inherent in remote sensing data plays an essential role in improving land use/cover classification. Incorporation of suitable textural images into multispectral bands and use of segmentation-based methods are valuable ways to improve land use/cover classification, especially for high spatial resolution images. Data fusion of multi-resolution images within optical sensor data is vital for visual interpretation, but may not improve classification performance. In contrast, integration of optical and radar data did improve classification performance when the proper data fusion method was used. Among the classification algorithms available, the maximum likelihood classifier is still an important method for providing reasonably good accuracy, but nonparametric algorithms, such as classification tree analysis, have the potential to provide better results. However, they often require more time to achieve parametric optimization. Proper use of hierarchical-based methods is fundamental for developing accurate land use/cover classification, mainly from historical remotely sensed data.
Abstract:
Background. The time passed since the infection of a human immunodeficiency virus (HIV)-infected individual (the age of infection) is an important but often only poorly known quantity. We assessed whether the fraction of ambiguous nucleotides obtained from bulk sequencing as done for genotypic resistance testing can serve as a proxy of this parameter. Methods. We correlated the age of infection and the fraction of ambiguous nucleotides in partial pol sequences of HIV-1 sampled before initiation of antiretroviral therapy (ART). Three groups of Swiss HIV Cohort Study participants were analyzed, for whom the age of infection was estimated on the basis of Bayesian back calculation (n = 3,307), seroconversion (n = 366), or diagnoses of primary HIV infection (n = 130). In addition, we studied 124 patients for whom longitudinal genotypic resistance testing was performed while they were still ART-naive. Results. We found that the fraction of ambiguous nucleotides increased with the age of infection with a rate of .2% per year within the first 8 years but thereafter with a decreasing rate. We show that this pattern is consistent with population-genetic models for realistic parameters. Finally, we show that, in this highly representative population, a fraction of ambiguous nucleotides of >.5% provides strong evidence against a recent infection event < 1 year prior to sampling (negative predictive value, 98.7%). Conclusions. These findings show that the fraction of ambiguous nucleotides is a useful marker for the age of infection.
Abstract:
The ability to obtain gene expression profiles from human disease specimens provides an opportunity to identify relevant gene pathways, but is limited by the absence of data sets spanning a broad range of conditions. Here, we analyzed publicly available microarray data from 16 diverse skin conditions in order to gain insight into disease pathogenesis. Unsupervised hierarchical clustering separated samples by disease as well as common cellular and molecular pathways. Disease-specific signatures were leveraged to build a multi-disease classifier, which predicted the diagnosis of publicly and prospectively collected expression profiles with 93% accuracy. In one sample, the molecular classifier differed from the initial clinical diagnosis and correctly predicted the eventual diagnosis as the clinical presentation evolved. Finally, integration of IFN-regulated gene programs with the skin database revealed a significant inverse correlation between IFN-β and IFN-γ programs across all conditions. Our study provides an integrative approach to the study of gene signatures from multiple skin conditions, elucidating mechanisms of disease pathogenesis. In addition, these studies provide a framework for developing tools for personalized medicine toward the precise prediction, prevention, and treatment of disease on an individual level.
Abstract:
The objective of this work was to evaluate the use of multispectral remote sensing for site-specific nitrogen fertilizer management. Satellite imagery from the advanced spaceborne thermal emission and reflection radiometer (Aster) was acquired in a 23 ha corn-planted area in Iran. For the collection of field samples, a total of 53 pixels were selected by systematic randomized sampling. The total nitrogen content in corn leaf tissues in these pixels was evaluated. To predict corn canopy nitrogen content, different vegetation indices, such as normalized difference vegetation index (NDVI), soil-adjusted vegetation index (Savi), optimized soil-adjusted vegetation index (Osavi), modified chlorophyll absorption ratio index 2 (MCARI2), and modified triangle vegetation index 2 (MTVI2), were investigated. The supervised classification technique using the spectral angle mapper classifier (SAM) was performed to generate a nitrogen fertilization map. The MTVI2 presented the highest correlation (R²=0.87) and is a good predictor of corn canopy nitrogen content in the V13 stage, at 60 days after cultivating. Aster imagery can be used to predict nitrogen status in corn canopy. Classification results indicate three levels of required nitrogen per pixel: low (0-2.5 kg), medium (2.5-3 kg), and high (3-3.3 kg).
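Two of the vegetation indices named in the abstract above have simple closed forms, sketched here in Python from per-pixel reflectances. Only NDVI and SAVI are shown, in their commonly used definitions; the soil-adjustment factor `L = 0.5` is the conventional default and an assumption here, as are the function names and the sample reflectance values. The abstract does not state which band combinations or constants the authors used.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil-adjusted vegetation index; soil_factor is the usual L parameter."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


# Hypothetical reflectances for one vegetated pixel.
pixel_ndvi = ndvi(nir=0.5, red=0.1)
pixel_savi = savi(nir=0.5, red=0.1)
```

With `soil_factor=0` SAVI reduces to NDVI, which is a quick sanity check on any implementation.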
Abstract:
This paper presents a Bayesian approach to the design of transmit prefiltering matrices in closed-loop schemes robust to channel estimation errors. The algorithms are derived for a multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) system. Two different optimization criteria are analyzed: the minimization of the mean square error and the minimization of the bit error rate. In both cases, the transmitter design is based on the singular value decomposition (SVD) of the conditional mean of the channel response, given the channel estimate. The performance of the proposed algorithms is analyzed, and their relationship with existing algorithms is indicated. As with other previously proposed solutions, the minimum bit error rate algorithm converges to the open-loop transmission scheme for very poor CSI estimates.
Abstract:
Research on condition monitoring of electric motors has been extensive for several decades. Research and development at universities and in industry have provided means for predictive condition monitoring. Many different devices and systems have been developed and are widely used in industry, transportation and civil engineering. In addition, many methods have been developed and reported in scientific arenas in order to improve existing methods for the automatic analysis of faults. The methods, however, are not widely used as a part of condition monitoring systems. The main reasons are, firstly, that many methods are presented in scientific papers but their performance in different conditions is not evaluated, and secondly, that the methods include parameters that are so case specific that the implementation of a system using such methods would be far from straightforward. In this thesis, some of these methods are evaluated theoretically and tested with simulations and with a drive in a laboratory. A new automatic analysis method for bearing fault detection is introduced. In the first part of this work, the generation of the signal originating from a bearing fault is explained and its influence on the stator current is examined through qualitative and quantitative estimation. The feasibility of the stator current measurement as a bearing fault indicator is experimentally tested with a running 15 kW induction motor. The second part of this work concentrates on bearing fault analysis using the vibration measurement signal. The performance of a micromachined silicon accelerometer chip in conjunction with envelope spectrum analysis of the cyclic bearing fault is experimentally tested. Furthermore, different methods for the creation of feature extractors for bearing fault classification are researched, and an automatic fault classifier using multivariate statistical discrimination and fuzzy logic is introduced.
It is often important that the on-line condition monitoring system is integrated with the industrial communications infrastructure. Two types of sensor solutions are tested in the thesis: the first is a sensor with calculation capacity, for example for the production of envelope spectra; the other can collect the measurement data in memory, from which another device can read the data via a field bus. The data communications requirements depend highly on the type of sensor solution selected. If the data is already analysed in the sensor, data communications are needed only for the results; in the other case, all measurement data need to be transferred. The complexity of the classification method can be great if the data is analysed at the management-level computer, but if the analysis is made in the sensor itself, the analyses must be simple due to the restricted calculation and memory capacity.
Abstract:
Due to the large number of characteristics, there is a need to extract the most relevant characteristics from the input data, so that the amount of information lost in this way is minimal and the classification realized with the projected data set is relevant with respect to the original data. To achieve this feature extraction, different statistical techniques, including principal component analysis (PCA), may be used. This thesis describes an extension of PCA allowing the extraction of a finite number of relevant features from high-dimensional fuzzy data and noisy data. PCA finds linear combinations of the original measurement variables that describe the significant variation in the data. The two proposed methods were compared using postoperative patient data. Experimental results demonstrate that the two proposed methods can be applied to complex data. Fuzzy PCA was used in the classification problem. The classification was performed using the similarity classifier algorithm, in which the weights of the total similarity measures are optimized with a differential evolution algorithm. This thesis presents a comparison of the classification results based on the data obtained from the fuzzy PCA.
Abstract:
Phage display is a powerful method of isolating antibody fragments from highly diverse naive human antibody repertoires. However, the affinity of the selected antibodies is usually low and current methods of affinity maturation are complex and time-consuming. In this paper, we describe an easy way to increase the functional affinity (avidity) of single chain variable fragments (scFvs) by tetramerization on streptavidin, following their site-specific biotinylation by the enzyme BirA. Expression vectors have been constructed that enable addition of the 15 amino acid biotin acceptor domain (BAD) on selected scFvs. Different domains were cloned at the C-terminus of scFv in the following order: a semi-rigid hinge region (of 16 residues), the BAD, and a histidine tail. Two such recombinant scFvs directed against the carcinoembryonic antigen (CEA) were previously selected from human non-immune and murine immune phage display libraries. The scFvs were first synthesized in Escherichia coli carrying the plasmid encoding the BirA enzyme, and then purified from the cytoplasmic extracts by Ni-NTA affinity chromatography. Purified biotinylated scFvs were tetramerized on the streptavidin molecule to create a streptabody (StAb). The avidity of various forms of anti-CEA StAbs, tested on purified CEA by competitive assays and surface plasmon resonance, showed an increase of more than one log, as compared with the scFv monomer counterparts. Furthermore, the percentage of direct binding of 125I-labeled StAb or monomeric scFv on CEA-Sepharose beads and on CEA-expressing cells showed a dramatic increase for the tetramerized scFv (>80%), as compared with the monomeric scFv (<20%). Interestingly, the percentage binding of 125I-labeled anti-CEA StAbs to CEA-expressing colon carcinoma cells was definitely higher (>80%) than that obtained with a reference high affinity murine anti-CEA mAb (30%).
Another advantage of using scFvs in a StAb format was demonstrated by Western blot analysis, where tetramerized anti-CEA scFv could detect a small quantity of CEA at a concentration 100-fold lower than the monomeric scFv.
Abstract:
The objective of this work was to select, from a Bayesian perspective, cowpea (Vigna unguiculata) genotypes that combine high phenotypic adaptability and stability in the state of Mato Grosso do Sul. Data were used from four experiments, carried out in a randomized block design, in which the grain yield of 20 semi-prostrate cowpea genotypes was evaluated. To represent weakly informative prior distributions, probability distributions with large variance were used; to represent informative prior distributions, the concept of meta-analysis was adopted, using information from previous studies. The prior distributions were compared by means of the Bayes factor. The Bayesian approach provides greater accuracy in the selection of semi-prostrate cowpea genotypes with high phenotypic adaptability and stability, evaluated by means of the Eberhart & Russell methodology. Based on the informative priors, genotypes MNC99-507G-4, TE97-309G-24, MNC99-542F-7, and BR 17-Gurguéia are classified as highly adapted to favorable environments. Genotypes TE96-290-12G, MNC99-510F-16, MNC99-508G-1, MNC99-541F-21, MNC99-542F-5, and MNC99-547F-2 show high adaptability to unfavorable environments.