954 results for Presence-only data
Abstract:
This article introduces a new neural network architecture, called ARTMAP, that autonomously learns to classify arbitrarily many, arbitrarily ordered vectors into recognition categories based on predictive success. This supervised learning system is built up from a pair of Adaptive Resonance Theory modules (ARTa and ARTb) that are capable of self-organizing stable recognition categories in response to arbitrary sequences of input patterns. During training trials, the ARTa module receives a stream {a^(p)} of input patterns, and ARTb receives a stream {b^(p)} of input patterns, where b^(p) is the correct prediction given a^(p). These ART modules are linked by an associative learning network and an internal controller that ensures autonomous system operation in real time. During test trials, the remaining patterns a^(p) are presented without b^(p), and their predictions at ARTb are compared with b^(p). Tested on a benchmark machine learning database in both on-line and off-line simulations, the ARTMAP system learns orders of magnitude more quickly, efficiently, and accurately than alternative algorithms, and achieves 100% accuracy after training on less than half the input patterns in the database. It achieves these properties by using an internal controller that conjointly maximizes predictive generalization and minimizes predictive error by linking predictive success to category size on a trial-by-trial basis, using only local operations. This computation increases the vigilance parameter ρa of ARTa by the minimal amount needed to correct a predictive error at ARTb. Parameter ρa calibrates the minimum confidence that ARTa must have in a category, or hypothesis, activated by an input a^(p) in order for ARTa to accept that category, rather than search for a better one through an automatically controlled process of hypothesis testing.
Parameter ρa is compared with the degree of match between a^(p) and the top-down learned expectation, or prototype, that is read out subsequent to activation of an ARTa category. Search occurs if the degree of match is less than ρa. ARTMAP is hereby a type of self-organizing expert system that calibrates the selectivity of its hypotheses based upon predictive success. As a result, rare but important events can be quickly and sharply distinguished even if they are similar to frequent events with different consequences. Between input trials ρa relaxes to a baseline vigilance ρ̄a. When ρa is large, the system runs in a conservative mode, wherein predictions are made only if the system is confident of the outcome. Very few false-alarm errors then occur at any stage of learning, yet the system reaches asymptote with no loss of speed. Because ARTMAP learning is self-stabilizing, it can continue learning one or more databases, without degrading its corpus of memories, until its full memory capacity is utilized.
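The vigilance test and the minimal vigilance increase described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes binary inputs, the ART1-style match |a ∧ w| / |a|, a search order given by the choice function |a ∧ w| / (β + |w|), and hypothetical names (`select_category`, `weights`, `labels`, `eps`) chosen only for illustration.

```python
def overlap(a, w):
    """Number of active features shared by input a and prototype w (binary lists)."""
    return sum(x & y for x, y in zip(a, w))


def select_category(a, weights, labels, target, rho_baseline, beta=0.01, eps=1e-6):
    """Search categories until one passes vigilance AND predicts the target.

    Returns (category index or None, final vigilance). None means no existing
    category qualifies, so a new one would have to be committed.
    """
    rho = rho_baseline
    size_a = sum(a)
    # Assumed search order: decreasing choice value |a AND w| / (beta + |w|).
    order = sorted(
        range(len(weights)),
        key=lambda j: overlap(a, weights[j]) / (beta + sum(weights[j])),
        reverse=True,
    )
    for j in order:
        match = overlap(a, weights[j]) / size_a
        if match < rho:
            continue  # fails the vigilance test, so search continues
        if labels[j] == target:
            return j, rho  # confident enough and predictively correct
        # Predictive error: raise vigilance just above the current match,
        # the smallest increase that rejects this category.
        rho = match + eps
    return None, rho


a = [1, 1, 1, 0]
weights = [[1, 1, 0, 0], [1, 1, 1, 1]]
labels = ["x", "y"]
chosen, rho = select_category(a, weights, labels, target="y", rho_baseline=0.5)
```

In this toy run the first category is chosen first, passes the baseline vigilance, but predicts the wrong label; vigilance is then raised just above its match value of 2/3, and the second category, which matches the input fully, is accepted.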
Abstract:
Initial studies have demonstrated that intra-renal infusion of Ang (1-7) caused a diuresis and natriuresis that was proportional to the degree of activation of the Renin Angiotensin Aldosterone System (RAAS). This raised the question as to why the magnitude of this diuresis and natriuresis was compromised in rats receiving a high sodium diet (suppressed RAAS) and enhanced in low sodium fed rats (activated RAAS). Could the answer lie with changes in intra-renal AT1 or Mas receptor expression? Interestingly, the observed Ang (1-7) induced increases in sodium and water excretion in rats receiving either a low or normal sodium diet were blocked in the presence of the AT1 receptor antagonist (Losartan) and in the presence of the 'Mas' receptor antagonist (A-779). These data suggest that both AT1 and 'Mas' receptors need to be functional in order to fully mediate the renal responses to intra-renal Ang (1-7) infusion. Importantly, further experimentation also revealed that there is a proportional relationship between AT1 receptor expression in the rat renal cortex and the magnitude of the excretory actions of intra-renal Ang (1-7) infusion, which is only partially dependent on the level of 'Mas' receptor expression. These observations suggest that although Ang (1-7) induced increases in sodium and water excretion are mediated by the Mas receptor, the magnitude of these excretory responses appears to be dependent upon the level of AT1 receptor expression and more specifically Ang II/AT1 receptor signalling. Thus in rats receiving a low sodium diet, Ang (1-7) acts via the Mas receptor to inhibit Ang II/AT1 receptor signalling. In rats receiving a high sodium diet the down-regulated AT1 receptor expression implies a reduction in Ang II/AT1 receptor signalling which renders the counter-regulatory effects of intra-renal Ang (1-7) infusion redundant.
Abstract:
BACKGROUND: Interleukin-10 (IL-10) is currently being extensively studied in clinical trials for the treatment of Crohn's disease (CD). Only marginal effects have, however, been reported, and the dose-response curve was bell-shaped, contrasting with the reported data from in vitro experiments. AIM: To use another in vitro model to analyze the effect of rhIL-10 and rhIL-4 on the spontaneous mucosal TNF-alpha secretion in patients with CD, and to characterize the phenotype of the cells targeted by rhIL-10. METHODS: Non-inflamed colon biopsies from CD patients were cultured for 16 hours in the presence of different concentrations of rhIL-10 or rhIL-4. The numbers of TNF-alpha-secreting cells among isolated lamina propria mononuclear cells (LPMNC) were estimated by Elispot. RESULTS: Both rhIL-10 and rhIL-4 down-regulate TNF-alpha secretion by LPMNC from CD patients, with a more pronounced effect with rhIL-10. These effects were closely linked to the cytokine concentrations used, with a bell-shaped dose-response curve. Residual TNF-alpha secretion in the presence of the optimal rhIL-10 concentration was mainly attributable to CD3+ T cells. In contrast, at higher rhIL-10 concentrations, CD3- cells contributed significantly to the TNF-alpha secretion. CONCLUSIONS: The in vitro model we used demonstrates that IL-4, but mostly IL-10, efficiently suppresses TNF-alpha secretion in LPMNC from CD patients, with a dose-response curve similar to results obtained in vivo. Resistance at high rhIL-10 concentrations was associated with a change in the phenotype of TNF-alpha-secreting cells.
Abstract:
The potential value of baseline health-related quality-of-life (HRQOL) and clinical factors in predicting prognosis was examined using data from an international randomised phase III trial which compared doxorubicin and paclitaxel with doxorubicin and cyclophosphamide as first line chemotherapy in 275 women with metastatic breast cancer. The European Organisation for Research and Treatment of Cancer (EORTC) QLQ-C30 and the related breast module (QLQ-BR23) were used to assess baseline HRQOL data. The Cox proportional-hazards regression model was used for both univariate and multivariate analyses of survival. In the univariate analyses, performance status (P<0.001) and number of sites involved (P=0.001) were the most important clinical prognostic factors. The HRQOL variables at baseline most strongly associated with longer survival were better appetite, physical and role functioning, as well as less fatigue (P<0.001). The final multivariate model retained performance status (P<0.001) and appetite loss (P=0.005) as the variables best predicting survival. Substantial loss of appetite was the only independent HRQOL factor predicting poor survival and was strongly correlated (|r|>0.5) with fatigue, role and physical functioning. In addition to known clinical factors, appetite loss appears to be a significant prognostic factor for survival in women with metastatic breast cancer. However, the mechanism underlying this association remains to be precisely defined in future studies.
Abstract:
Novel bifunctional ruthenium(II) complexes, [Ru(TAP)2(POQ-Nmet)]2+ and [Ru(BPY)2(POQ-Nmet)]2+ (1a, 2a), containing a metallic and an organic moiety, have been prepared as photoprobes and photoreagents of DNA (TAP = 1,4,5,8-tetraazaphenanthrene, POQ-Nmet = 5-[6-(7-chloroquinolin-4-yl)-3-thia-6-azaheptanamido]-1,10-phenanthroline). The ES mass spectrometry and ¹H NMR data in organic solvents indicate that the quinoline moiety exists in both the protonated and non-protonated form. Moreover, the comparison of the NMR data with those of the corresponding monofunctional complexes (without quinoline) evidences that [Ru(TAP)2(POQ-Nmet)]2+ and [Ru(BPY)2(POQ-Nmet)]2+ are unfolded when the quinoline unit is protonated whereas deprotonation permits folding of the molecule. In the folded state the spatial proximity of the electron donor (the organic moiety) and electron acceptor (the metallic moiety) in [Ru(TAP)2(POQ-Nmet)]2+ favours intramolecular photo-induced electron transfer, which has been shown in a previous study to be responsible for the very low luminescence of 1a in non-protonating solutions. The restoration of the luminescence by protonation of the quinoline moiety as observed previously is in agreement with the unfolding of the molecule demonstrated in this work. The existence of such folding-unfolding processes related to protonation is crucial for studies of 1a with DNA. © The Royal Society of Chemistry 2000.
Abstract:
Objective - To evaluate the effect of in vitro culture on zona pellucida resistance in mouse oocytes and embryos. Method - Zona pellucida resistance was assessed by comparing duration of zona lysis in the presence of alpha-chymotrypsin. The effects of artificial or physiological conditions of development were evaluated by comparing embryos in vitro with those left to reach the same stage of development in vivo. Results - The time required for zona lysis of oocytes increased after 2, 9.4, and 48 hours in vitro (P < .001). The same observation holds true for oocytes left in vivo for 24 hours. Fertilization both in vivo and in vitro induced a major increase in zona resistance. At the two-cell stage, in vitro culture did not harden the zona pellucida. At the morula stage and beyond, enzymatic lysis took slightly longer in vitro than for similar stages recovered from the genital tract. Conclusions - Our data indicate that in vitro culture conditions do not modify zona hardening in oocytes and only slightly increase zona resistance from the morula stage on.
Abstract:
This paper describes a methodology for detecting anomalies from sequentially observed and potentially noisy data. The proposed approach consists of two main elements: 1) filtering, or assigning a belief or likelihood to each successive measurement based upon our ability to predict it from previous noisy observations and 2) hedging, or flagging potential anomalies by comparing the current belief against a time-varying and data-adaptive threshold. The threshold is adjusted based on the available feedback from an end user. Our algorithms, which combine universal prediction with recent work on online convex programming, do not require computing posterior distributions given all current observations and involve simple primal-dual parameter updates. At the heart of the proposed approach lie exponential-family models which can be used in a wide variety of contexts and applications, and which yield methods that achieve sublinear per-round regret against both static and slowly varying product distributions with marginals drawn from the same exponential family. Moreover, the regret against static distributions coincides with the minimax value of the corresponding online strongly convex game. We also prove bounds on the number of mistakes made during the hedging step relative to the best offline choice of the threshold with access to all estimated beliefs and feedback signals. We validate the theory on synthetic data drawn from a time-varying distribution over binary vectors of high dimensionality, as well as on the Enron email dataset. © 1963-2012 IEEE.
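The two-stage structure described in this abstract (filtering, then hedging) can be sketched as follows. This is only an illustration under loud assumptions: a single Bernoulli variable stands in for the exponential-family model, a smoothed running frequency replaces the paper's online convex programming updates, and the threshold moves by a fixed hypothetical step on each corrective feedback. The names `BernoulliFilter`, `Hedger`, and `run` are invented for this example.

```python
import math


class BernoulliFilter:
    """Filtering step: score each observation by how poorly it was predicted."""

    def __init__(self):
        # Pseudo-counts (add-one smoothing) so probabilities never hit 0 or 1.
        self.ones = 1.0
        self.total = 2.0

    def loss(self, x):
        """Negative log-likelihood of x under the current estimate."""
        p_one = self.ones / self.total
        return -math.log(p_one if x == 1 else 1.0 - p_one)

    def update(self, x):
        self.ones += x
        self.total += 1.0


class Hedger:
    """Hedging step: flag observations whose loss exceeds an adaptive threshold."""

    def __init__(self, threshold, step):
        self.threshold = threshold
        self.step = step

    def flag(self, loss):
        return loss > self.threshold

    def feedback(self, flagged, is_anomaly):
        # Assumed rule: move the threshold only when the user reports a mistake.
        if flagged and not is_anomaly:
            self.threshold += self.step  # false alarm: be less sensitive
        elif not flagged and is_anomaly:
            self.threshold -= self.step  # missed anomaly: be more sensitive


def run(stream, threshold=2.0, step=0.5):
    """Return one flag per observation, scoring each before learning from it."""
    filt, hedger, flags = BernoulliFilter(), Hedger(threshold, step), []
    for x in stream:
        flags.append(hedger.flag(filt.loss(x)))
        filt.update(x)
    return flags


flags = run([0] * 20 + [1])
```

With twenty zeros followed by a single one, only the final observation is flagged, because by then the running estimate assigns it a probability of 1/22.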
Abstract:
Therapeutic anticancer vaccines are designed to boost patients' immune responses to tumors. One approach is to use a viral vector to deliver antigen to in situ DCs, which then activate tumor-specific T cell and antibody responses. However, vector-specific neutralizing antibodies and suppressive cell populations such as Tregs remain great challenges to the efficacy of this approach. We report here that an alphavirus vector, packaged in virus-like replicon particles (VRP) and capable of efficiently infecting DCs, could be repeatedly administered to patients with metastatic cancer expressing the tumor antigen carcinoembryonic antigen (CEA) and that it overcame high titers of neutralizing antibodies and elevated Treg levels to induce clinically relevant CEA-specific T cell and antibody responses. The CEA-specific antibodies mediated antibody-dependent cellular cytotoxicity against tumor cells from human colorectal cancer metastases. In addition, patients with CEA-specific T cell responses exhibited longer overall survival. These data suggest that VRP-based vectors can overcome the presence of neutralizing antibodies to break tolerance to self antigen and may be clinically useful for immunotherapy in the setting of tumor-induced immunosuppression.
Abstract:
BACKGROUND: Biological processes occur on a vast range of time scales, and many of them occur concurrently. As a result, system-wide measurements of gene expression have the potential to capture many of these processes simultaneously. The challenge, however, is to separate these processes and time scales in the data. In many cases the number of processes and their time scales are unknown. This issue is particularly relevant to developmental biologists, who are interested in processes such as growth, segmentation and differentiation, which can all take place simultaneously, but on different time scales. RESULTS: We introduce a flexible and statistically rigorous method for detecting different time scales in time-series gene expression data, by identifying expression patterns that are temporally shifted between replicate datasets. We apply our approach to a Saccharomyces cerevisiae cell-cycle dataset and an Arabidopsis thaliana root developmental dataset. In both datasets our method successfully detects processes operating on several different time scales. Furthermore, we show that many of these time scales can be associated with particular biological functions. CONCLUSIONS: The spatiotemporal modules identified by our method suggest the presence of multiple biological processes, acting at distinct time scales in both the Arabidopsis root and yeast. Using similar large-scale expression datasets, the identification of biological processes acting at multiple time scales in many organisms is now possible.
Abstract:
The computational detection of regulatory elements in DNA is a difficult but important problem impacting our progress in understanding the complex nature of eukaryotic gene regulation. Attempts to utilize cross-species conservation for this task have been hampered both by evolutionary changes of functional sites and poor performance of general-purpose alignment programs when applied to non-coding sequence. We describe a new and flexible framework for modeling binding site evolution in multiple related genomes, based on phylogenetic pair hidden Markov models which explicitly model the gain and loss of binding sites along a phylogeny. We demonstrate the value of this framework for both the alignment of regulatory regions and the inference of precise binding-site locations within those regions. As the underlying formalism is a stochastic, generative model, it can also be used to simulate the evolution of regulatory elements. Our implementation is scalable in terms of numbers of species and sequence lengths and can produce alignments and binding-site predictions with accuracy rivaling or exceeding current systems that specialize in only alignment or only binding-site prediction. We demonstrate the validity and power of various model components on extensive simulations of realistic sequence data and apply a specific model to study Drosophila enhancers in as many as ten related genomes and in the presence of gain and loss of binding sites. Different models and modeling assumptions can be easily specified, thus providing an invaluable tool for the exploration of biological hypotheses that can drive improvements in our understanding of the mechanisms and evolution of gene regulation.
Abstract:
BACKGROUND: Historically, only partial assessments of data quality have been performed in clinical trials, for which the most common method of measuring database error rates has been to compare the case report form (CRF) to database entries and count discrepancies. Importantly, errors arising from medical record abstraction and transcription are rarely evaluated as part of such quality assessments. Electronic Data Capture (EDC) technology has had a further impact, as paper CRFs typically leveraged for quality measurement are not used in EDC processes. METHODS AND PRINCIPAL FINDINGS: The National Institute on Drug Abuse Treatment Clinical Trials Network has developed, implemented, and evaluated methodology for holistically assessing data quality on EDC trials. We characterize the average source-to-database error rate (14.3 errors per 10,000 fields) for the first year of use of the new evaluation method. This error rate was significantly lower than the average of published error rates for source-to-database audits, and was similar to CRF-to-database error rates reported in the published literature. We attribute this largely to an absence of medical record abstraction on the trials we examined, and to an outpatient setting characterized by less acute patient conditions. CONCLUSIONS: Historically, medical record abstraction is the most significant source of error by an order of magnitude, and should be measured and managed during the course of clinical trials. Source-to-database error rates are highly dependent on the amount of structured data collection in the clinical setting and on the complexity of the medical record, dependencies that should be considered when developing data quality benchmarks.
Abstract:
BACKGROUND: Genetic association studies are conducted to discover genetic loci that contribute to an inherited trait, identify the variants behind these associations and ascertain their functional role in determining the phenotype. To date, functional annotations of the genetic variants have rarely played more than an indirect role in assessing evidence for association. Here, we demonstrate how these data can be systematically integrated into an association study's analysis plan. RESULTS: We developed a Bayesian statistical model for the prior probability of phenotype-genotype association that incorporates data from past association studies and publicly available functional annotation data regarding the susceptibility variants under study. The model takes the form of a binary regression of association status on a set of annotation variables whose coefficients were estimated through an analysis of associated SNPs in the GWAS Catalog (GC). The functional predictors examined included measures that have been demonstrated to correlate with the association status of SNPs in the GC and some whose utility in this regard is speculative: summaries of the UCSC Human Genome Browser ENCODE super-track data, dbSNP function class, sequence conservation summaries, proximity to genomic variants in the Database of Genomic Variants and known regulatory elements in the Open Regulatory Annotation database, PolyPhen-2 probabilities and RegulomeDB categories. Because we expected that only a fraction of the annotations would contribute to predicting association, we employed a penalized likelihood method to reduce the impact of non-informative predictors and evaluated the model's ability to predict GC SNPs not used to construct the model. We show that the functional data alone are predictive of a SNP's presence in the GC. 
Further, using data from a genome-wide study of ovarian cancer, we demonstrate that their use as prior data when testing for association is practical at the genome-wide scale and improves power to detect associations. CONCLUSIONS: We show how diverse functional annotations can be efficiently combined to create 'functional signatures' that predict the a priori odds of a variant's association to a trait and how these signatures can be integrated into a standard genome-wide-scale association analysis, resulting in improved power to detect truly associated variants.
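A short sketch may help show how an annotation-based prior can enter an association test. It assumes a logistic link for the binary regression and combines the resulting prior with a Bayes factor through the standard identity posterior odds = prior odds × Bayes factor. The coefficient values, the annotation vector, and the function names are hypothetical and are not taken from the study.

```python
import math


def prior_probability(annotations, coefficients, intercept):
    """Prior probability of association from a logistic model of annotations."""
    score = intercept + sum(c * x for c, x in zip(coefficients, annotations))
    return 1.0 / (1.0 + math.exp(-score))


def posterior_probability(prior, bayes_factor):
    """Update the prior with the evidence from the association data."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical SNP with two binary annotations (e.g. conserved, regulatory).
prior = prior_probability([1, 0], coefficients=[1.2, 0.8], intercept=-4.0)
posterior = posterior_probability(prior, bayes_factor=50.0)
```

Under this scheme, two variants with the same Bayes factor receive different posterior probabilities when their annotations differ, which is how functional data can raise power for well-annotated variants.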
Abstract:
Association studies of quantitative traits have often relied on methods in which a normal distribution of the trait is assumed. However, quantitative phenotypes from complex human diseases are often censored, highly skewed, or contaminated with outlying values. We recently developed a rank-based association method that takes into account censoring and makes no distributional assumptions about the trait. In this study, we applied our new method to age-at-onset data on ALDX1 and ALDX2. Both traits are highly skewed (skewness > 1.9) and often censored. We performed a whole genome association study of age at onset of the ALDX1 trait using Illumina single-nucleotide polymorphisms. Only slightly more than 5% of markers were significant. However, we identified two regions on chromosomes 14 and 15, which each have at least four significant markers clustering together. These two regions may harbor genes that regulate age at onset of ALDX1 and ALDX2. Future fine mapping of these two regions with densely spaced markers is warranted.
Abstract:
Transcriptional regulation has been studied intensively in recent decades. One important aspect of this regulation is the interaction between regulatory proteins, such as transcription factors (TF) and nucleosomes, and the genome. Different high-throughput techniques have been invented to map these interactions genome-wide, including ChIP-based methods (ChIP-chip, ChIP-seq, etc.), nuclease digestion methods (DNase-seq, MNase-seq, etc.), and others. However, a single experimental technique often only provides partial and noisy information about the whole picture of protein-DNA interactions. Therefore, the overarching goal of this dissertation is to provide computational developments for jointly modeling different experimental datasets to achieve a holistic inference on the protein-DNA interaction landscape.
We first present a computational framework that can incorporate the protein binding information in MNase-seq data into a thermodynamic model of protein-DNA interaction. We use a correlation-based objective function to model the MNase-seq data and a Markov chain Monte Carlo method to maximize the function. Our results show that the inferred protein-DNA interaction landscape is concordant with the MNase-seq data and provides a mechanistic explanation for the experimentally collected MNase-seq fragments. Our framework is flexible and can easily incorporate other data sources. To demonstrate this flexibility, we use prior distributions to integrate experimentally measured protein concentrations.
We also study the ability of DNase-seq data to position nucleosomes. Traditionally, DNase-seq has only been widely used to identify DNase hypersensitive sites, which tend to be open chromatin regulatory regions devoid of nucleosomes. We reveal for the first time that DNase-seq datasets also contain substantial information about nucleosome translational positioning, and that existing DNase-seq data can be used to infer nucleosome positions with high accuracy. We develop a Bayes-factor-based nucleosome scoring method to position nucleosomes using DNase-seq data. Our approach utilizes several effective strategies to extract nucleosome positioning signals from the noisy DNase-seq data, including jointly modeling data points across the nucleosome body and explicitly modeling the quadratic and oscillatory DNase I digestion pattern on nucleosomes. We show that our DNase-seq-based nucleosome map is highly consistent with previous high-resolution maps. We also show that the oscillatory DNase I digestion pattern is useful in revealing the nucleosome rotational context around TF binding sites.
Finally, we present a state-space model (SSM) for jointly modeling different kinds of genomic data to provide an accurate view of the protein-DNA interaction landscape. We also provide an efficient expectation-maximization algorithm to learn model parameters from data. We first show in simulation studies that the SSM can effectively recover underlying true protein binding configurations. We then apply the SSM to model real genomic data (both DNase-seq and MNase-seq data). Through incrementally increasing the types of genomic data in the SSM, we show that different data types can contribute complementary information for the inference of protein binding landscape and that the most accurate inference comes from modeling all available datasets.
This dissertation provides a foundation for future research by taking a step toward the genome-wide inference of protein-DNA interaction landscape through data integration.
Abstract:
CONCLUSION: Radiation dose reduction that preserves image quality could be easily implemented with this approach. Furthermore, the availability of a dosimetric data archive provides immediate feedback on the implemented optimization strategies. Background: JCI Standards and European Legislation (EURATOM 59/2013) require the implementation of patient radiation protection programs in diagnostic radiology. The aim of this study is to demonstrate that patients' radiation exposure can be reduced without decreasing image quality, through a multidisciplinary team (MT) that analyzes dosimetric data of diagnostic examinations. Evaluation: Data from CT examinations performed with two different scanners (Siemens Definition™ and GE LightSpeed Ultra™) between November and December 2013 are considered. CT scanners are configured to automatically send images to DoseWatch© software, which is able to store output parameters (e.g. kVp, mAs, pitch) and exposure data (e.g. CTDIvol, DLP, SSDE). Data are analyzed and discussed by an MT composed of Medical Physicists and Radiologists, to identify protocols which show critical dosimetric values and then suggest possible improvement actions to be implemented. Furthermore, the large amount of data available makes it possible to monitor diagnostic protocols currently in use and to identify different statistical populations for each of them. Discussion: We identified critical values of average CTDIvol for head and facial bones examinations (respectively 61.8 mGy, 151 scans; 61.6 mGy, 72 scans), performed with the GE LightSpeed CT™. Statistical analysis allowed us to identify the presence of two different populations for head scans, one of which was only 10% of the total number of scans and corresponded to lower exposure values. The MT adopted this protocol as standard.
Moreover, constant monitoring of the output parameters allowed us to identify unusual values in facial bones exams, caused by changes made during maintenance service, which the team promptly recommended correcting. This resulted in a substantial dose saving, with average CTDIvol values reduced by approximately 15% and 50% for head and facial bones exams, respectively. Diagnostic image quality was deemed suitable for clinical use by radiologists.