6 resultados para Classification time
em Biblioteca Digital da Produção Intelectual da Universidade de São Paulo (BDPI/USP)
Resumo:
For many learning tasks the duration of the data collection can be greater than the time scale for changes of the underlying data distribution. The question we ask is how to include the information that data are aging. Ad hoc methods to achieve this include the use of validity windows that prevent the learning machine from making inferences based on old data. This introduces the problem of how to define the size of validity windows. In this brief, a new adaptive Bayesian inspired algorithm is presented for learning drifting concepts. It uses the analogy of validity windows in an adaptive Bayesian way to incorporate changes in the data distribution over time. We apply a theoretical approach based on information geometry to the classification problem and measure its performance in simulations. The uncertainty about the appropriate size of the memory windows is dealt with in a Bayesian manner by integrating over the distribution of the adaptive window size. Thus, the posterior distribution of the weights may develop algebraic tails. The learning algorithm results from tracking the mean and variance of the posterior distribution of the weights. It was found that the algebraic tails of this posterior distribution give the learning algorithm the ability to cope with an evolving environment by permitting the escape from local traps.
Resumo:
This work proposes and discusses an approach for inducing Bayesian classifiers aimed at balancing the tradeoff between the precise probability estimates produced by time consuming unrestricted Bayesian networks and the computational efficiency of Naive Bayes (NB) classifiers. The proposed approach is based on the fundamental principles of the Heuristic Search Bayesian network learning. The Markov Blanket concept, as well as a proposed ""approximate Markov Blanket"" are used to reduce the number of nodes that form the Bayesian network to be induced from data. Consequently, the usually high computational cost of the heuristic search learning algorithms can be lessened, while Bayesian network structures better than NB can be achieved. The resulting algorithms, called DMBC (Dynamic Markov Blanket Classifier) and A-DMBC (Approximate DMBC), are empirically assessed in twelve domains that illustrate scenarios of particular interest. The obtained results are compared with NB and Tree Augmented Network (TAN) classifiers, and confinn that both proposed algorithms can provide good classification accuracies and better probability estimates than NB and TAN, while being more computationally efficient than the widely used K2 Algorithm.
Resumo:
In this paper we present a novel approach for multispectral image contextual classification by combining iterative combinatorial optimization algorithms. The pixel-wise decision rule is defined using a Bayesian approach to combine two MRF models: a Gaussian Markov Random Field (GMRF) for the observations (likelihood) and a Potts model for the a priori knowledge, to regularize the solution in the presence of noisy data. Hence, the classification problem is stated according to a Maximum a Posteriori (MAP) framework. In order to approximate the MAP solution we apply several combinatorial optimization methods using multiple simultaneous initializations, making the solution less sensitive to the initial conditions and reducing both computational cost and time in comparison to Simulated Annealing, often unfeasible in many real image processing applications. Markov Random Field model parameters are estimated by Maximum Pseudo-Likelihood (MPL) approach, avoiding manual adjustments in the choice of the regularization parameters. Asymptotic evaluations assess the accuracy of the proposed parameter estimation procedure. To test and evaluate the proposed classification method, we adopt metrics for quantitative performance assessment (Cohen`s Kappa coefficient), allowing a robust and accurate statistical analysis. The obtained results clearly show that combining sub-optimal contextual algorithms significantly improves the classification performance, indicating the effectiveness of the proposed methodology. (C) 2010 Elsevier B.V. All rights reserved.
Resumo:
In this paper, a novel statistical test is introduced to compare two locally stationary time series. The proposed approach is a Wald test considering time-varying autoregressive modeling and function projections in adequate spaces. The covariance structure of the innovations may be also time- varying. In order to obtain function estimators for the time- varying autoregressive parameters, we consider function expansions in splines and wavelet bases. Simulation studies provide evidence that the proposed test has a good performance. We also assess its usefulness when applied to a financial time series.
Resumo:
BACKGROUND: A major problem in Chagas disease donor screening is the high frequency of samples with inconclusive results. The objective of this study was to describe patterns of serologic results among donors to the three Brazilian REDS-II blood centers and correlate with epidemiologic characteristics. STUDY DESIGN AND METHODS: The centers screened donor samples with one Trypanosoma cruzi lysate enzyme immunoassay (EIA). EIA-reactive samples were tested with a second lysate EIA, a recombinant-antigen based EIA, and an immunfluorescence assay. Based on the serologic results, samples were classified as confirmed positive (CP), probable positive (PP), possible other parasitic infection (POPI), and false positive (FP). RESULTS: In 2007 to 2008, a total of 877 of 615,433 donations were discarded due to Chagas assay reactivity. The prevalences (95% confidence intervals [CIs]) among first-time donors for CP, PP, POPI, and FP patterns were 114 (99-129), 26 (19-34), 10 (5-14), and 96 (82-110) per 100,000 donations, respectively. CP and PP had similar patterns of prevalence when analyzed by age, sex, education, and location, suggesting that PP cases represent true T. cruzi infections; in contrast the demographics of donors with POPI were distinct and likely unrelated to Chagas disease. No CP cases were detected among 218,514 repeat donors followed for a total of 718,187 person-years. CONCLUSION: We have proposed a classification algorithm that may have practical importance for donor counseling and epidemiologic analyses of T. cruzi-seroreactive donors. The absence of incident T. cruzi infections is reassuring with respect to risk of window phase infections within Brazil and travel-related infections in nonendemic countries such as the United States.
Resumo:
Coconut water is a natural isotonic, nutritive, and low-caloric drink. Preservation process is necessary to increase its shelf life outside the fruit and to improve commercialization. However, the influence of the conservation processes, antioxidant addition, maturation time, and soil where coconut is cultivated on the chemical composition of coconut water has had few arguments and studies. For these reasons, an evaluation of coconut waters (unprocessed and processed) was carried out using Ca, Cu, Fe, K, Mg, Mn, Na, Zn, chloride, sulfate, phosphate, malate, and ascorbate concentrations and chemometric tools. The quantitative determinations were performed by electrothermal atomic absorption spectrometry, inductively coupled plasma optical emission spectrometry, and capillary electrophoresis. The results showed that Ca, K, and Zn concentrations did not present significant alterations between the samples. The ranges of Cu, Fe, Mg, Mn, PO (4) (3-) , and SO (4) (2-) concentrations were as follows: Cu (3.1-120 A mu g L(-1)), Fe (60-330 A mu g L(-1)), Mg (48-123 mg L(-1)), Mn (0.4-4.0 mg L(-1)), PO (4) (3-) (55-212 mg L(-1)), and SO (4) (2-) (19-136 mg L(-1)). The principal component analysis (PCA) and hierarchical cluster analysis (HCA) were applied to differentiate unprocessed and processed samples. Multivariated analysis (PCA and HCA) were compared through one-way analysis of variance with Tukey-Kramer multiple comparisons test, and p values less than 0.05 were considered to be significant.