48 results for Naive Bayes Classifier
Abstract:
A joint distribution of two discrete random variables with finite support can be displayed as a two-way table of probabilities adding to one. Assume that this table has n rows and m columns and all probabilities are non-null. This kind of table can be seen as an element in the simplex of n · m parts. In this context, the marginals are identified as compositional amalgams, and conditionals (rows or columns) as subcompositions. Also, simplicial perturbation appears as Bayes' theorem. However, the Euclidean elements of the Aitchison geometry of the simplex can also be translated into the table of probabilities: subspaces, orthogonal projections, distances. Two important questions are addressed: (a) given a table of probabilities, which is the nearest independent table to the initial one? (b) which is the largest orthogonal projection of a row onto a column or, equivalently, which is the information in a row explained by a column, thus explaining the interaction? To answer these questions, three orthogonal decompositions are presented: (1) by columns and a row-wise geometric marginal, (2) by rows and a column-wise geometric marginal, (3) by independent two-way tables and fully dependent tables representing row-column interaction. An important result is that the nearest independent table is the product of the two (row- and column-wise) geometric marginal tables. A corollary is that, in an independent table, the geometric marginals conform with the traditional (arithmetic) marginals. These decompositions can be compared with standard log-linear models.
Key words: balance, compositional data, simplex, Aitchison geometry, composition, orthonormal basis, arithmetic and geometric marginals, amalgam, dependence measure, contingency table
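The headline result above reduces to a few lines of code. Below is a minimal numerical sketch, assuming a strictly positive probability table and using NumPy; the function name and the closure step are ours, not the paper's notation:

```python
import numpy as np

def nearest_independent_table(P):
    """Nearest independent table, in the Aitchison geometry, to a strictly
    positive probability table P: the (closed) product of the column-wise
    and row-wise geometric marginals, as stated in the abstract."""
    g_cols = np.exp(np.log(P).mean(axis=0))  # geometric mean of each column
    g_rows = np.exp(np.log(P).mean(axis=1))  # geometric mean of each row
    T = np.outer(g_rows, g_cols)             # rank-one, hence independent
    return T / T.sum()                       # closure back to the simplex

P = np.array([[0.10, 0.20],
              [0.30, 0.40]])
T = nearest_independent_table(P)
print(T, T.sum())
```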
Process and control technologies to reduce the salt content of dry-cured ham and dry sausages
Abstract:
In some European countries, processed meat products may account for nearly 20% of daily sodium intake. Meat industries are therefore trying to reduce the salt content of meat products, to meet consumer expectations on the one hand and the demands of health authorities on the other. The Quick-Dry-Slice process (QDS®), coupled with the use of salts substituting for sodium chloride (NaCl), has been used successfully to produce low-salt fermented sausages with a shorter production cycle and without adding extra NaCl. Non-destructive on-line measurement technologies, such as X-rays and electromagnetic induction, make it possible to classify fresh hams by fat content, a crucial parameter for adjusting the duration of the salting stage. X-ray technology can also be used to estimate the amount of salt taken up during salting. Information on salt and fat content is important for optimizing the dry-cured ham process, both to reduce the variability of salt content between and within batches and to lower the salt content of the final product. Other technologies, such as near-infrared spectroscopy (NIRS) or microwave spectroscopy, are also useful for monitoring the production process and for characterizing and classifying processed meat products according to their salt content. Most of these technologies can easily be applied on-line in industry to control the manufacturing process and thus obtain meat products with the desired characteristics.
Abstract:
We implement the GA-SEFS algorithm of Tsymbal and experimentally analyse its performance, depending on the classifier algorithm used in the fitness function (NB, MNge, SMO). We also study the effect of adding to the fitness function a measure that controls the complexity of the base classifiers.
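A hedged sketch of what such a fitness function might look like, with scikit-learn's GaussianNB standing in for the NB base classifier and a subset-size penalty as the complexity measure; the penalty weight `lam` and the 3-fold accuracy estimate are illustrative choices, not Tsymbal's formulation:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

def fitness(mask, X, y, clf=None, lam=0.1):
    """Fitness of one GA individual (a binary feature mask): cross-validated
    accuracy of the base classifier on the selected features, minus an
    illustrative complexity penalty proportional to the subset size."""
    if not mask.any():
        return 0.0
    clf = clf or GaussianNB()                  # stand-in for the NB base learner
    acc = cross_val_score(clf, X[:, mask], y, cv=3).mean()
    return acc - lam * mask.sum() / mask.size  # penalize larger feature subsets

# Example: evaluate one random mask on toy data.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(60, 8)), rng.integers(0, 2, 60)
print(fitness(rng.random(8) < 0.5, X, y))
```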
Abstract:
Wireless “MIMO” systems, employing multiple transmit and receive antennas, promise a significant increase in channel capacity, while orthogonal frequency-division multiplexing (OFDM) is attracting a good deal of attention due to its robustness to multipath fading. Thus, the combination of both techniques is an attractive proposition for radio transmission. The goal of this paper is the description and analysis of a novel pilot-aided estimator of multipath block-fading channels. Typical models leading to estimation algorithms assume the number of multipath components and their delays to be constant (and often known), while their amplitudes are allowed to vary with time. Our estimator is focused instead on the more realistic assumption that the number of channel taps is also unknown and varies with time following a known probabilistic model. The estimation problem arising from these assumptions is solved using Random-Set Theory (RST), whereby one regards the multipath-channel response as a single set-valued random entity. Within this framework, Bayesian recursive equations determine the evolution with time of the channel estimator. Due to the lack of a closed form for the solution of the Bayesian equations, a (Rao–Blackwellized) particle filter (RBPF) implementation of the channel estimator is advocated. Since the resulting estimator exhibits a complexity which grows exponentially with the number of multipath components, a simplified version is also introduced. Simulation results demonstrate the effectiveness of our channel estimator.
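To make the set-valued channel model concrete, here is a deliberately simplified sketch: a plain bootstrap particle filter (not the paper's Rao–Blackwellized estimator) tracking a scalar-pilot channel whose taps appear and disappear under an invented Markov birth/death model; all parameters and the observation model are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
L, N = 4, 500                          # maximum number of taps, particles
p_flip, q, r = 0.05, 0.01, 0.1         # tap birth/death prob., process/obs. noise

act = rng.random((N, L)) < 0.5         # each particle's set of active taps
amp = rng.normal(size=(N, L))          # each particle's tap amplitudes

for y in np.zeros(50):                 # placeholder scalar pilot observations
    act = act ^ (rng.random((N, L)) < p_flip)          # taps are born and die
    amp = amp + np.sqrt(q) * rng.normal(size=(N, L))   # amplitudes drift
    pred = (act * amp).sum(axis=1)                     # each particle's output
    w = np.exp(-0.5 * (y - pred) ** 2 / r)             # Gaussian likelihood
    w /= w.sum()
    idx = rng.choice(N, size=N, p=w)                   # multinomial resampling
    act, amp = act[idx], amp[idx]

print("estimated number of taps:", act.sum(axis=1).mean())
print("estimated channel:", (act * amp).mean(axis=0))
```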
Abstract:
This paper presents several algorithms for joint estimation of the target number and target states in a time-varying scenario. Building on the results presented in [1], which considers estimation of the target number only, we assume that not only the target number but also the targets' state evolution must be estimated. In this context, we extend the Rao-Blackwellization procedure of [1] to this new scenario to compute the Bayes recursions, thus defining reduced-complexity solutions for the multi-target set estimator. A performance assessment is finally given both in terms of Circular Position Error Probability, aimed at evaluating the accuracy of the estimated track, and in terms of Cardinality Error Probability, aimed at evaluating the reliability of the target number estimates.
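Under one plausible reading of the two figures of merit (the paper's exact definitions may differ), Monte Carlo estimates of both are straightforward:

```python
import numpy as np

def cardinality_error_prob(n_true, n_est):
    """Fraction of Monte Carlo runs in which the estimated target number
    differs from the true one (one plausible reading of the metric)."""
    return np.mean(np.asarray(n_true) != np.asarray(n_est))

def circular_pep(p_true, p_est, radius):
    """Fraction of runs in which the estimated position falls outside a
    circle of the given radius around the true position."""
    err = np.linalg.norm(np.asarray(p_est) - np.asarray(p_true), axis=1)
    return np.mean(err > radius)

print(cardinality_error_prob([2, 2, 3], [2, 3, 3]))   # one miss in three runs
print(circular_pep([[0, 0]] * 3, [[0.1, 0], [2, 0], [0, 0]], radius=1.0))
```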
Abstract:
Utilizing the well-known Ultimatum Game, this note presents the following phenomenon. If we start with simple stimulus-response agents, learning through naive reinforcement, and then grant them some introspective capabilities, we get outcomes that are not closer but farther away from the fully introspective game-theoretic approach. The cause of this is the following: there is an asymmetry in the information that agents can deduce from their experience, and this leads to a bias in their learning process.
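A minimal sketch of the stimulus-response starting point, in the spirit of Roth-Erev reinforcement on a discretized Ultimatum Game; the pie size, propensity updates and round count are illustrative assumptions, not the note's exact setup:

```python
import numpy as np

rng = np.random.default_rng(0)
PIE = 10                                    # pie size; offers are 1..9
prop = np.ones(PIE - 1)                     # proposer propensities, one per offer
resp = np.ones((PIE - 1, 2))                # responder propensities: [reject, accept]

for _ in range(20000):
    offer = rng.choice(PIE - 1, p=prop / prop.sum())            # choose an offer
    accept = rng.choice(2, p=resp[offer] / resp[offer].sum())   # responder's choice
    if accept:
        prop[offer] += PIE - (offer + 1)    # proposer reinforced by what she keeps
        resp[offer, 1] += offer + 1         # responder reinforced by the offer
    # rejection pays zero, so no propensity is reinforced

print("modal offer:", prop.argmax() + 1)    # typically well above the subgame-perfect minimum
```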
Abstract:
We present simple procedures for the prediction of a real-valued sequence. The algorithms are based on a combination of several simple predictors. We show that if the sequence is a realization of a bounded stationary and ergodic random process, then the average of squared errors converges, almost surely, to that of the optimum, given by the Bayes predictor. We offer an analogous result for the prediction of stationary Gaussian processes.
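One standard way to combine simple predictors under squared loss is exponential weighting; the sketch below illustrates the flavor of such a combination (it is not the paper's exact scheme, and the expert pool, learning rate and bound are assumptions):

```python
import numpy as np

def ewa_forecast(experts, outcomes, eta=0.5, B=1.0):
    """Exponentially weighted average of expert forecasts under squared loss.
    experts[t] holds each expert's prediction for step t; B bounds the range."""
    w = np.ones(experts.shape[1])
    preds = []
    for f, y in zip(experts, outcomes):
        preds.append(w @ f / w.sum())                 # weighted combination
        w *= np.exp(-eta * (f - y) ** 2 / B ** 2)     # downweight bad experts
    return np.array(preds)

T = 200
y = np.sin(0.1 * np.arange(T))                        # a toy bounded sequence
experts = np.column_stack([np.roll(y, 1), np.zeros(T)])  # "last value" and "always 0"
print(np.mean((ewa_forecast(experts, y) - y) ** 2))   # average squared error
```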
Abstract:
The spectacular failure of top-rated structured finance products has brought renewed attention to the conflicts of interest of Credit Rating Agencies (CRAs). We model both the CRA conflict of understating credit risk to attract more business and the issuer conflict of purchasing only the most favorable ratings (issuer shopping), and examine the effectiveness of a number of proposed regulatory solutions for CRAs. We find that CRAs are more prone to inflate ratings when there is a larger fraction of naive investors in the market who take ratings at face value, or when CRA expected reputation costs are lower. To the extent that in booms the fraction of naive investors is higher and the reputation risk for CRAs of getting caught understating credit risk is lower, our model predicts that CRAs are more likely to understate credit risk in booms than in recessions. We also show that, due to issuer shopping, competition among CRAs in a duopoly is less efficient (conditional on the same equilibrium CRA rating policy) than having a monopoly CRA, in terms of both total ex-ante surplus and investor surplus. Allowing tranching decreases total surplus further. We argue that regulatory intervention requiring upfront payments for rating services (before CRAs propose a rating to the issuer), combined with mandatory disclosure of any rating produced by CRAs, can substantially mitigate the conflicts of interest of both CRAs and issuers.
Abstract:
We present a simple randomized procedure for the prediction of a binary sequence. The algorithm uses ideas from recent developments of the theory of the prediction of individual sequences. We show that if the sequence is a realization of a stationary and ergodic random process, then the average number of mistakes converges, almost surely, to that of the optimum, given by the Bayes predictor.
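For flavor, a minimal randomized forecaster over three hand-picked experts; the expert pool, learning rate and randomization rule are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

def randomized_predict(bits, eta=0.3):
    """Randomized prediction of a binary sequence from three simple experts
    ("always 0", "always 1", "repeat last bit"); the forecaster draws its bit
    with probability given by the weighted expert vote."""
    w = np.ones(3)
    mistakes, last = 0, 0
    for y in bits:
        advice = np.array([0, 1, last])
        p1 = w @ advice / w.sum()            # weighted probability of guessing "1"
        guess = int(rng.random() < p1)       # randomized prediction
        mistakes += guess != y
        w *= np.exp(-eta * (advice != y))    # penalize wrong experts
        last = y
    return mistakes / len(bits)

bits = (rng.random(5000) < 0.7).astype(int)  # i.i.d. Bernoulli(0.7) source
print(randomized_predict(bits))              # should approach the Bayes rate, 0.3
```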
Abstract:
The objective of this paper is to compare the performance of two predictive radiological models, logistic regression (LR) and neural network (NN), with five different resampling methods. One hundred and sixty-seven patients with proven calvarial lesions as the only known disease were enrolled. Clinical and CT data were used for the LR and NN models. Both models were developed with cross-validation, leave-one-out and three different bootstrap algorithms. The final results of each model were compared in terms of error rate and the area under the receiver operating characteristic curve (Az). The neural network obtained a statistically higher Az than LR with cross-validation. The remaining resampling validation methods did not reveal statistically significant differences between the LR and NN rules. The neural network classifier performs better than the one based on logistic regression. This advantage is well detected by three-fold cross-validation, but remains unnoticed when leave-one-out or bootstrap algorithms are used.
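The three-fold cross-validation comparison can be reproduced in outline with scikit-learn; the synthetic data below merely stands in for the clinical and CT variables, which are not public, and the NN architecture is an assumption:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

# Illustrative stand-in data; the paper used clinical and CT variables
# from 167 patients.
X, y = make_classification(n_samples=167, n_features=8, random_state=0)

for name, clf in [("LR", LogisticRegression(max_iter=1000)),
                  ("NN", MLPClassifier(hidden_layer_sizes=(5,), max_iter=2000,
                                       random_state=0))]:
    az = cross_val_score(clf, X, y, cv=3, scoring="roc_auc")  # Az per fold
    print(f"{name}: Az = {az.mean():.3f} +/- {az.std():.3f}")
```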
Abstract:
This paper proposes a common and tractable framework for analyzing different definitions of fixed and random effects in a constant-slope variable-intercept model. It is shown that, regardless of whether effects (i) are treated as parameters or as an error term, (ii) are estimated in different stages of a hierarchical model, or whether (iii) correlation between effects and regressors is allowed, when the same information on effects is introduced into all estimation methods, the resulting slope estimator is the same across methods. If different methods produce different results, it is ultimately because different information is being used for each method.
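The equivalence claim can be checked numerically in the simplest case: treating the effects as parameters (least-squares dummy variables) and sweeping them out (the within estimator) use the same information and so return the same slope. A toy sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
G, n = 10, 30                                  # groups, observations per group
alpha = rng.normal(size=G)                     # group-specific intercepts
x = rng.normal(size=(G, n)) + alpha[:, None]   # regressor correlated with effects
y = 2.0 * x + alpha[:, None] + rng.normal(size=(G, n))

# (i) Effects as parameters: OLS with group dummies (LSDV).
X = np.column_stack([x.ravel(), np.kron(np.eye(G), np.ones((n, 1)))])
beta_lsdv = np.linalg.lstsq(X, y.ravel(), rcond=None)[0][0]

# (ii) Effects swept out: within (group-demeaned) estimator.
xd = x - x.mean(axis=1, keepdims=True)
yd = y - y.mean(axis=1, keepdims=True)
beta_within = (xd * yd).sum() / (xd ** 2).sum()

print(beta_lsdv, beta_within)                  # identical up to round-off
```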
Abstract:
This paper studies the equilibrating process of several implementation mechanisms using naive adaptive dynamics. We show that the dynamics converge and are stable for the canonical mechanism of implementation in Nash equilibrium. In this way we cast some doubt on the criticism of ``complexity'' commonly used against this mechanism. For mechanisms that use more refined equilibrium concepts, the dynamics converge but are not stable. Some papers in the literature on implementation with refined equilibrium concepts have claimed that the mechanisms they propose are ``simple'' and implement ``everything'' (in contrast with the canonical mechanism). The fact that some of these ``simple'' mechanisms have unstable equilibria suggests that these statements should be interpreted with some caution.
Abstract:
Objectives: to evaluate response rates and safety, 24 weeks after the start of treatment, in HCV-monoinfected and HIV/HCV-coinfected patients treated with Telaprevir. Methods: cross-sectional descriptive study of mono- and coinfected patients (treatment-naive, relapsers, non-responders and partial responders) treated with Telaprevir in an Infectious Diseases Unit. Demographic, laboratory, immunological and virological data were collected for each patient, together with the IL28B polymorphism. Data were recorded at baseline and at 4, 8, 12 and 24 weeks. Results: a total of 43 patients who started treatment with Telaprevir were analyzed. All 43 completed treatment up to week 12, and 35 patients continued up to week 24 with Peginterferon and Ribavirin. 48% were monoinfected and 51% HIV-HCV coinfected; 80% were men and 20% women, with a mean age of 52 ± 8.79 years. At week 12, 76% of monoinfected and 86% of coinfected patients had undetectable HCV, and at week 24, 86% of monoinfected and 90% of coinfected patients maintained on-treatment viral response; these differences were not statistically significant. Likewise, no statistically significant differences were found in the proportions of adverse effects. Conclusions: the effectiveness and safety of triple therapy including Telaprevir for chronic HCV infection are similar in monoinfected and coinfected patients.
Abstract:
Several features that can be extracted from digital images of the sky and that can be useful for cloud-type classification of such images are presented. Some features are statistical measurements of image texture, some are based on the Fourier transform of the image and, finally, others are computed from the image where cloudy pixels are distinguished from clear-sky pixels. The use of the most suitable features in an automatic classification algorithm is also shown and discussed. Both the features and the classifier are developed over images taken by two different camera devices, namely, a total sky imager (TSI) and a whole sky camera (WSC), which are placed in two different areas of the world (Toowoomba, Australia, and Girona, Spain, respectively). The performance of the classifier is assessed by comparing its image classification with an a priori classification carried out by visual inspection of more than 200 images from each camera. The index of agreement is 76% when five different sky conditions are considered: clear, low cumuliform clouds, stratiform clouds (overcast), cirriform clouds, and mottled clouds (altocumulus, cirrocumulus). Discussion of the future directions of this research is also presented, regarding both the use of other features and of other classification techniques.
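A sketch of the kinds of features listed above (texture statistics, a Fourier-based measure, and a cloud fraction from cloudy/clear pixel discrimination), computed with NumPy; the threshold and the exact feature definitions are illustrative, not the paper's:

```python
import numpy as np

def sky_features(img_rgb, rb_threshold=0.8):
    """A few illustrative features of an RGB sky image (H x W x 3 floats
    in [0, 1]); the feature list and threshold are examples only."""
    gray = img_rgb.mean(axis=2)
    # Statistical texture measures on the intensity image.
    mean, std = gray.mean(), gray.std()
    smoothness = 1 - 1 / (1 + gray.var())
    # Spectral feature: share of energy away from the DC component.
    F = np.abs(np.fft.fft2(gray)) ** 2
    high_freq = 1 - F[0, 0] / F.sum()
    # Cloud fraction from a red/blue ratio threshold (cloudy pixels are whiter).
    rb = img_rgb[..., 0] / np.clip(img_rgb[..., 2], 1e-6, None)
    cloud_frac = (rb > rb_threshold).mean()
    return np.array([mean, std, smoothness, high_freq, cloud_frac])

print(sky_features(np.random.default_rng(0).random((64, 64, 3))))
```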