16 results for "Naïve Bayes Classifier"
in University of Queensland eSpace - Australia
Abstract:
The Tree Augmented Naïve Bayes (TAN) classifier relaxes the sweeping independence assumptions of the Naïve Bayes approach by taking account of conditional probabilities. It does this in a limited sense, by incorporating the conditional probability of each attribute given the class and (at most) one other attribute. The method of boosting has previously proven very effective in improving the performance of Naïve Bayes classifiers and in this paper, we investigate its effectiveness on application to the TAN classifier.
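The difference between the two classifiers can be written out as a factorization. This is the standard textbook form rather than notation taken from the paper; the symbols $C$ (class), $x_i$ (attributes) and $\pi(i)$ (index of the single attribute parent of $x_i$ in the tree) are assumptions introduced here for illustration.

```latex
% Naive Bayes: every attribute depends on the class only
P(C \mid x_1, \dots, x_n) \;\propto\; P(C) \prod_{i=1}^{n} P(x_i \mid C)

% TAN: each attribute depends on the class and at most one other attribute;
% the root attribute of the tree has no attribute parent and keeps P(x_i \mid C)
P(C \mid x_1, \dots, x_n) \;\propto\; P(C) \prod_{i=1}^{n} P\bigl(x_i \mid C,\, x_{\pi(i)}\bigr)
```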
Abstract:
The indefinite determiner 'yi "one" + classifier' is the closest Chinese counterpart to an indefinite article such as the English a. It serves all the functions characteristic of the representative stages of grammaticalization from a numeral to a generalized indefinite determiner, as elaborated in the literature. This paper establishes that the Chinese indefinite determiner has developed a special use with definite expressions, serving as a backgrounding device that marks entities as being of low thematic importance and unlikely to receive subsequent mention in the ensuing discourse. In this special use with definite expressions, 'yi + classifier' displays striking similarities, in terms of semantic bleaching and phonological reduction, to the same determiner at the advanced stage of grammaticalization characterized by uses with generics, nonspecifics and nonreferentials. An explanation is offered in terms of an implicational relation between nonreferentiality and low thematic importance, which characterize the two uses of the indefinite determiner. While providing further evidence for the claim that semantic nonreferentials and entities of low thematic importance tend to be encoded by the same linguistic devices, the findings in this paper show how an indefinite determiner can undergo a higher degree of grammaticalization than has been reported in the literature: it expands its scope to mark not only indefinite but also definite expressions as semantically nonreferential and/or thematically unimportant. (C) 2003 Elsevier B.V. All rights reserved.
Abstract:
Merkel cell carcinoma (MCC) is a rare, aggressive skin tumor that shares histopathological and genetic features with small-cell lung carcinoma (SCLC), both being of neuroendocrine origin. Comparable to SCLC, MCC cell lines are classified into two different biochemical subgroups designated as 'Classic' and 'Variant'. With the aim of identifying typical gene-expression signatures associated with these phenotypically different MCC cell line subgroups and of searching for differentially expressed genes between MCC and SCLC, we used cDNA arrays to profile 10 MCC cell lines and four SCLC cell lines. Using significance analysis of microarrays, we defined a set of 76 differentially expressed genes that allowed unequivocal identification of Classic and Variant MCC subgroups. We assume that the differential expression levels of some of these genes reflect, analogous to SCLC, the different biological and clinical properties of Classic and Variant MCC phenotypes. Therefore, they may serve as useful prognostic markers and potential targets for the development of new therapeutic interventions specific for each subgroup. Moreover, our analysis identified 17 powerful classifier genes capable of discriminating MCC from SCLC. Real-time quantitative RT-PCR analysis of these genes on 26 additional MCC and SCLC samples confirmed their diagnostic classification potential, opening opportunities for new investigations into these aggressive cancers.
Abstract:
Support vector machines (SVMs) have recently emerged as a powerful technique for solving problems in pattern classification and regression. The best performance is obtained from an SVM when its parameters are set to their optimal values. In practice, good parameter settings are usually obtained by a lengthy process of trial and error. This paper describes the use of a genetic algorithm to evolve these parameter settings for an application in mobile robotics.
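The idea of evolving parameter settings can be sketched as follows. This is a minimal illustration and not the paper's method: the abstract does not say which parameters are evolved or how fitness is measured, so the two genes (assumed here to be log10 of C and gamma), the bounds, and `toy_fitness` (a stand-in for a cross-validated accuracy score) are all hypothetical.

```python
import math
import random

# Hypothetical search bounds in log10 space for two SVM parameters
# (assumed here to be C and gamma; the abstract does not name them).
BOUNDS = [(-2.0, 3.0), (-4.0, 1.0)]

def toy_fitness(genome):
    """Stand-in for cross-validated accuracy; peaks at log10(C)=1, log10(gamma)=-2."""
    log_c, log_gamma = genome
    return math.exp(-((log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2))

def clip(value, low, high):
    """Keep a gene inside its allowed range."""
    return max(low, min(high, value))

def evolve(fitness, bounds, pop_size=30, generations=40, mutation_sd=0.3, seed=0):
    """Simple generational genetic algorithm with elitism."""
    rng = random.Random(seed)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        elite = ranked[:2]  # carry the two best genomes over unchanged
        children = []
        while len(children) < pop_size - len(elite):
            # Tournament selection: best of three random individuals, done twice.
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            # Uniform crossover, then Gaussian mutation clipped to the bounds.
            child = [rng.choice(pair) for pair in zip(p1, p2)]
            child = [clip(g + rng.gauss(0.0, mutation_sd), lo, hi)
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        population = elite + children
    return max(population, key=fitness)

best = evolve(toy_fitness, BOUNDS)
best_c, best_gamma = 10 ** best[0], 10 ** best[1]
```

In a real setting `toy_fitness` would be replaced by a function that trains an SVM with the decoded parameters and returns a validation score; searching in log space is a common choice because useful values of such parameters typically span several orders of magnitude.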
Abstract:
This paper examines the article system in interlanguage grammar, focusing on Japanese learners of English, whose native language lacks articles. It will be demonstrated that count/mass distinctions and definiteness are the crucial factors in the acquisition of the English article system. Although Japanese does not employ an article system to encode these aspects, it will be argued that they are nevertheless syntactically encoded through its classifier system. Hence, the problem for these learners must be to map these features onto the appropriate surface forms, as the Missing Surface Inflection Hypothesis predicts (Prévost & White 2000). This suggestion will be further supported empirically by a fill-in-the-article task. It will be concluded that these Japanese learners understand the English article system fairly well, possibly due to their native language, yet have problems with realizing the relevant features (i.e. count/mass distinctions and definiteness) in the target language.
Abstract:
Data mining is the process of identifying valid, implicit, previously unknown, potentially useful and understandable information in large databases. It is an important step in the process of knowledge discovery in databases (Olaru & Wehenkel, 1999). In a data mining process, input data can be structured, semi-structured, or unstructured, and can take the form of text, categorical values or numerical values. One of the important characteristics of data mining is its ability to deal with data that are large in volume, distributed, time-variant, noisy, and high-dimensional. A large number of data mining algorithms have been developed for different applications. For example, association rule mining can be useful for market basket problems, clustering algorithms can be used to discover trends in unsupervised learning problems, classification algorithms can be applied in decision-making problems, and sequential and time series mining algorithms can be used in predicting events, fault detection, and other supervised learning problems (Vapnik, 1999). Classification is among the most important tasks in data mining, particularly for data mining applications in engineering fields. Together with regression, classification is used mainly for predictive modelling. A number of classification algorithms are in practical use. According to Sebastiani (2002), the main classification algorithms can be categorized as: decision tree and rule-based approaches such as C4.5 (Quinlan, 1996); probability methods such as the Bayesian classifier (Lewis, 1998); on-line methods such as Winnow (Littlestone, 1988) and CVFDT (Hulten 2001); neural network methods (Rumelhart, Hinton & Williams, 1986); example-based methods such as k-nearest neighbors (Duda & Hart, 1973); and SVM (Cortes & Vapnik, 1995). Other important techniques for classification tasks include Associative Classification (Liu et al, 1998) and Ensemble Classification (Tumer, 1996).
Abstract:
There are many techniques for electricity market price forecasting. However, most of them are designed for expected price analysis rather than price spike forecasting. An effective method of predicting the occurrence of spikes has not yet been reported in the literature. In this paper, a data mining based approach is presented to give a reliable forecast of the occurrence of price spikes. Combined with the spike value prediction techniques developed by the same authors, the proposed approach aims at providing a comprehensive tool for price spike forecasting. Feature selection techniques are first described to identify the attributes relevant to the occurrence of spikes. A simple introduction to the classification techniques is given for completeness. Two algorithms, a support vector machine and a probability classifier, are chosen as the spike occurrence predictors and are discussed in detail. Realistic market data are used to test the proposed model, with promising results.
Abstract:
A magnesium-aluminium alloy of eutectic composition was solidified under two different cooling conditions, producing a low and a high growth rate of the eutectic solid-liquid interface. The high growth rate specimen contained smaller eutectic grains and cells, with a smaller interphase spacing compared with the low growth rate specimen. The high growth rate specimen also contained some primary Mg17Al12 dendrites, suggesting that the coupled zone is skewed towards the Mg phase with increased undercooling. A lamellar eutectic morphology was observed in the low growth rate specimen, while the morphology was fibrous in the high growth rate specimen.
Abstract:
A new model of halo formation in directional solidification is presented. The model describes halo formation in terms of competitive growth between the halo phase and coupled eutectic in liquid with a nominal composition that follows the primary phase liquidus extension with decreasing temperature. The model distinguishes between the effects of constitutional, capillarity and (where applicable) kinetic undercooling and avoids a number of theoretical inconsistencies associated with previous models. The critical growth rate for halo formation in directionally solidified hypereutectic Al-Si alloys is calculated using the model in conjunction with models of primary phase and coupled eutectic growth from the literature. The calculated result agrees reasonably well with the experimental result of Yilmaz and Elliott (Met. Sci. 18 (1984) 362), given the use of a relatively simple isolated dendrite tip model to calculate the growth undercooling of the halo tip. (C) 2002 Acta Materialia Inc. Published by Elsevier Science Ltd. All rights reserved.
Abstract:
Psychostimulants produce a broad range of effects. Adverse effects can exist on a spectrum of severity from minor symptoms to life threatening toxicity. Although regular use or use of high doses increases risk of adverse events, many adverse events requiring emergency intervention may occur even in the naïve user.
Abstract:
Modelling and simulation studies were carried out on 26 cement clinker grinding circuits, including tube mills, air separators and high pressure grinding rolls, in 8 plants. The results reported earlier have shown that tube mills can be modelled as several mills in series, and that the internal partition in tube mills can be modelled as a screen which must retain coarse particles in the first compartment but not impede the flow of drying air. In this work the modelling has been extended to show that the Tromp curve, which describes separator (classifier) performance, can be modelled in terms of d(50)(corr), by-pass, the fish hook, and the sharpness of the curve. In addition, the high pressure grinding rolls model developed at the Julius Kruttschnitt Mineral Research Centre gives satisfactory predictions using a breakage function derived from impact and compressed bed tests. Simulation studies of a full plant incorporating a tube mill, HPGR and separators showed that the models could successfully predict the performance of another mill working under different conditions. The simulation capability can therefore be used for process optimization and design. (C) 2001 Elsevier Science Ltd. All rights reserved.
Abstract:
In this study we present a novel automated strategy for predicting infarct evolution, based on MR diffusion and perfusion images acquired in the acute stage of stroke. The validity of this methodology was tested on novel patient data including data acquired from an independent stroke clinic. Regions-of-interest (ROIs) defining the initial diffusion lesion and tissue with abnormal hemodynamic function as defined by the mean transit time (MTT) abnormality were automatically extracted from DWI/PI maps. Quantitative measures of cerebral blood flow (CBF) and volume (CBV) along with ratio measures defined relative to the contralateral hemisphere (r(a)CBF and r(a)CBV) were calculated for the MTT ROIs. A parametric normal classifier algorithm incorporating these measures was used to predict infarct growth. The mean r(a)CBF and r(a)CBV values for eventually infarcted MTT tissue were 0.70 +/-0.19 and 1.20 +/-0.36. For recovered tissue the mean values were 0.99 +/-0.25 and 1.87 +/-0.71, respectively. There was a significant difference between these two regions for both measures (P