947 results for naive bayes classifier


Relevance:

10.00% 10.00%

Publisher:

Abstract:

We consider the problem of conducting inference on nonparametric high-frequency estimators without knowing their asymptotic variances. We prove that a multivariate subsampling method achieves this goal under general conditions that were not previously available in the literature. We suggest a procedure for a data-driven choice of the bandwidth parameters. Our simulation study indicates that the subsampling method is much more robust than the plug-in method based on the asymptotic expression for the variance. Importantly, the subsampling method reliably estimates the variability of the Two Scale estimator even when its parameters are chosen to minimize the finite-sample Mean Squared Error; in contrast, the plug-in estimator substantially underestimates the sampling uncertainty. By construction, the subsampling method delivers estimates of the variance-covariance matrices that are always positive semi-definite. We use the subsampling method to study the dynamics of financial betas of six stocks on the NYSE. We document significant variation in betas within the year 2006, and find that tick data capture more variation in betas than data sampled at moderate frequencies such as every five or twenty minutes. To capture this variation we estimate a simple dynamic model for betas. Variance estimation is also important for correcting the errors-in-variables bias in such models. We find that the bias corrections are substantial, and that betas are more persistent than the naive estimators would lead one to believe.
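As a rough illustration of the subsampling idea, the sketch below estimates the variance of an estimator by recomputing it on overlapping blocks of the data. It is a generic textbook form, not the multivariate high-frequency procedure of the paper: it assumes a scalar estimator that converges at rate sqrt(n), and the function name, the block length b and the simulated data are hypothetical choices.

```python
import random
from statistics import fmean


def subsample_variance(data, b, estimator):
    """Generic subsampling estimate of Var(estimator(data)).

    Assumes the estimator converges at rate sqrt(n); high-frequency
    estimators such as the Two Scale estimator converge at other rates,
    so the (b / n) scaling below would have to change for them.
    """
    n = len(data)
    theta_n = estimator(data)
    # Recompute the estimator on every overlapping block of length b.
    blocks = [estimator(data[i:i + b]) for i in range(n - b + 1)]
    # A mean of squares can never be negative, which is the scalar
    # analogue of the positive semi-definiteness noted in the abstract.
    spread = fmean((t - theta_n) ** 2 for t in blocks)
    return (b / n) * spread


# Simulated data (assumption): 2000 independent standard normal draws.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(2000)]
var_hat = subsample_variance(sample, b=100, estimator=fmean)
print(var_hat)  # expected to be of the same order as 1 / 2000
```

No asymptotic variance formula is needed here, only the estimator itself and its convergence rate, which is the practical appeal of the approach.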

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The environment shapes the physiology, morphology and behaviour of organisms through complex, multidimensional ecological and evolutionary processes. The reproductive success of animals is determined by the fitness of a phenotype in an environment that changes constantly on a time scale of one to several generations. In addition, phenotypes are shaped by the environment, which leads to adaptive modifications of reproductive strategies while also imposing constraints. In this thesis, using stink bugs and their parasitoids as model organisms, I investigated how several types of plasticity can interact to influence fitness, and how the plasticity of reproductive strategies responds to several components of environmental change (host quality, ultraviolet radiation, temperature, biological invasion). First, I compared the responses of behavioural and life-history traits to variation in body size in the parasitoid Telenomus podisi Ashmead (Hymenoptera: Platygastridae), demonstrating that the reaction norms of behaviours were more often positive than those of life-history traits. Next, I demonstrated that the predatory stink bug Podisus maculiventris Say (Hemiptera: Pentatomidae) can control the colour of its eggs, and that egg pigmentation protects embryos from ultraviolet radiation; this is one component of a complex oviposition strategy that evolved in response to a multitude of environmental factors. Then, I tested how thermal stress affected memory dynamics in the parasitoid Trissolcus basalis (Wollaston) (Hymenoptera: Platygastridae) as it learned the reliability of the chemical traces left by its host. These experiments revealed that both high and low temperatures prevented forgetting, thereby affecting how the parasitoids allocated their time in host patches containing chemical traces. I also developed a general theoretical framework for classifying the effects of temperature on all behavioural aspects of ectotherms, distinguishing constraints from adaptations. Finally, I tested the ability of a native parasitoid (T. podisi) to exploit the eggs of a new invasive agricultural pest, Halyomorpha halys Stål (Hemiptera: Pentatomidae). The results showed that T. podisi attacks the eggs of H. halys but cannot develop in them, indicating that the invasive pest acts as an "evolutionary trap" for this parasitoid. This could indirectly benefit native stink bug species by acting as an ecological sink of resources (eggs) and time for the parasitoid. These results have important implications for how insects, including those involved in biological control programmes, respond to environmental change.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Adolescent idiopathic scoliosis (AIS) is a deformity of the spine manifested by asymmetry and deformities of the external surface of the trunk. Classification of scoliosis deformities according to curve type is used to plan the management of scoliosis patients. Currently, the scoliosis curve type is determined from an X-ray exam. However, cumulative exposure to X-ray radiation significantly increases the risk of certain cancers. In this paper, we propose a robust system that can classify the scoliosis curve type from a non-invasive acquisition of the 3D trunk surface of the patient. The 3D image of the trunk is divided into patches, and local geometric descriptors characterizing the surface of the back are computed from each patch to form the features. We reduce the dimensionality using Principal Component Analysis, retaining 53 components. A multi-class classifier is built with a least-squares support vector machine (LS-SVM), which is a kernel classifier. For this study, a new kernel was designed in order to achieve a more robust classifier than those obtained with polynomial and Gaussian kernels. The proposed system was validated using data from 103 patients with different scoliosis curve types, diagnosed and classified by an orthopedic surgeon from the X-ray images. The average rate of successful classification was 93.3%, with a better prediction rate for the major thoracic and lumbar/thoracolumbar types.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Objective: To determine scoliosis curve types using non-invasive surface acquisition, without prior knowledge from X-ray data. Methods: Classification of scoliosis deformities according to curve type is used in the clinical management of scoliotic patients. In this work, we propose a robust system that can determine the scoliosis curve type from a non-invasive acquisition of the 3D back surface of the patient. The 3D image of the surface of the trunk is divided into patches, and local geometric descriptors characterizing the back surface are computed from each patch and constitute the features. We reduce the dimensionality using principal component analysis and retain 53 components using an overlap criterion combined with the total variance in the observed variables. A multi-class classifier is built with least-squares support vector machines (LS-SVM). The original LS-SVM formulation was modified by weighting the positive and negative samples differently, and a new kernel was designed in order to achieve a robust classifier. The proposed system is validated using data from 165 patients with different scoliosis curve types. The results of our non-invasive classification were compared with those obtained by an expert using X-ray images. Results: The average rate of successful classification was computed using a leave-one-out cross-validation procedure. The overall accuracy of the system was 95%. The correct classification rates per class were 96%, 84% and 97% for the thoracic, double major and lumbar/thoracolumbar curve types, respectively. Conclusion: This study shows that machine learning methods can find a relationship between the internal deformity and the back surface deformity in scoliosis. The proposed system uses non-invasive surface acquisition, which is safe for the patient as it involves no radiation. In addition, the design of a specific kernel improved classification performance.
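The leave-one-out procedure and the per-class rates reported above can be sketched as follows. This is only an illustration of the evaluation loop: a 1-nearest-neighbour rule stands in for the weighted LS-SVM with its custom kernel, and the two-dimensional feature vectors are hypothetical placeholders for the 53 retained components.

```python
from collections import Counter
from math import dist


def nearest_neighbour(train, query):
    """Stand-in classifier (1-NN); the paper uses a weighted LS-SVM instead."""
    return min(train, key=lambda item: dist(item[0], query))[1]


def leave_one_out(samples, classify):
    """Return overall accuracy and per-class correct classification rates."""
    correct, total = Counter(), Counter()
    for i, (features, label) in enumerate(samples):
        # Hold out sample i and train on all the others.
        train = samples[:i] + samples[i + 1:]
        total[label] += 1
        if classify(train, features) == label:
            correct[label] += 1
    overall = sum(correct.values()) / len(samples)
    per_class = {c: correct[c] / total[c] for c in total}
    return overall, per_class


# Hypothetical 2-D features standing in for the 53 retained components.
toy = [
    ((0.0, 0.1), "thoracic"),
    ((0.2, 0.0), "thoracic"),
    ((0.1, 0.3), "thoracic"),
    ((2.0, 2.1), "double major"),
    ((2.2, 1.9), "double major"),
    ((1.9, 2.3), "double major"),
    ((4.0, 0.0), "lumbar/thoracolumbar"),
    ((4.1, 0.3), "lumbar/thoracolumbar"),
    ((3.8, 0.1), "lumbar/thoracolumbar"),
]
overall, per_class = leave_one_out(toy, nearest_neighbour)
print(overall, per_class)
```

Because every patient is held out exactly once, each reported rate is computed on data the classifier did not see during training.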

Relevance:

10.00% 10.00%

Publisher:

Abstract:

In this paper, a new methodology is proposed for predicting scoliosis curve types from non-invasive acquisitions of the back surface of the trunk. One hundred and fifty-nine scoliosis patients had their back surface acquired in 3D using an optical digitizer. Each surface is then characterized by 45 local measurements of the back surface rotation. Using a semi-supervised algorithm, the classifier is trained with only 32 labeled and 58 unlabeled samples. Tested on 69 new samples, the classifier correctly classified 87.0% of the data. After the number of labeled training samples is reduced to 12, the behavior of the resulting classifier tends to be similar to the reference case, in which the classifier is trained only with the maximum number of available labeled samples. Moreover, the addition of unlabeled data guided the classifier towards more generalizable boundaries between the classes. These results provide a proof of feasibility for using a semi-supervised learning algorithm to train a classifier for the prediction of the scoliosis curve type when only a few training samples are labeled. This is a promising clinical finding, since it would allow the diagnosis and follow-up of scoliotic deformities without exposing the patient to X-ray radiation.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Scoliosis treatment strategy is generally chosen according to the severity and type of the spinal curve. Currently, the curve type is determined from X-rays, whose acquisition can be harmful to the patient. In this paper we propose a system that can predict the scoliosis curve type based on an analysis of the surface of the trunk. The surface is acquired and reconstructed in 3D using a non-invasive multi-head digitizing system. The deformity is described by the back surface rotation, measured on several cross-sections of the trunk. A classifier composed of three support vector machines was trained and tested using the data of 97 patients with scoliosis. A prediction rate of 72.2% was obtained, showing that using the trunk surface for a high-level scoliosis classification is feasible and promising.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Biometrics uses the physiological and behavioral characteristics of an individual to establish identity. Fingerprint-based authentication is the most advanced biometric authentication technology. Minutiae-based fingerprint identification offers a reasonable identification rate. The minutiae feature map consists of about 70-100 minutia points, and matching accuracy drops as the database grows. It is therefore essential to make the fingerprint feature code as small as possible so that identification becomes easier. In this research, a novel global singularity-based fingerprint representation is proposed. The fingerprint baseline, which is the joint line between the distal and intermediate phalanges in the fingerprint, is taken as the reference line. A polygon is formed from the singularities and the fingerprint baseline. The feature vector consists of the polygon's angles, sides, area and type, and the ridge counts between the singularities. A 100% recognition rate is achieved with this method. The method is compared with the conventional minutiae-based recognition method in terms of computation time, receiver operating characteristic (ROC) and feature vector length. Speech is a behavioural biometric modality and can be used to identify a speaker. In this work, the MFCCs of text-dependent speech are computed and clustered using the k-means algorithm. A backpropagation-based artificial neural network is trained to identify the clustered speech code. The performance of the neural network classifier is compared with that of the VQ-based minimum Euclidean distance classifier. Biometric systems that use a single modality are usually affected by problems such as noisy sensor data, non-universality and/or lack of distinctiveness of the biometric trait, unacceptable error rates, and spoof attacks. Multi-finger feature-level fusion for fingerprint recognition is developed, and its performance is measured in terms of the ROC curve. Score-level fusion of the fingerprint-based and speech-based recognition systems is performed, and 100% accuracy is achieved for a considerable range of matching thresholds.
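Score-level fusion can be sketched as follows. The abstract does not state which normalisation or combination rule was used, so min-max scaling with a weighted sum is an assumption made here for illustration, and all scores and names are hypothetical.

```python
def min_max(scores):
    """Rescale raw matcher scores to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuse(finger_scores, speech_scores, w_finger=0.5):
    """Weighted-sum fusion of two score lists (one entry per candidate)."""
    f = min_max(finger_scores)
    s = min_max(speech_scores)
    return [w_finger * a + (1.0 - w_finger) * b for a, b in zip(f, s)]


def accept(fused, threshold):
    """Indices of the candidates whose fused score reaches the threshold."""
    return [i for i, score in enumerate(fused) if score >= threshold]


# Hypothetical similarity scores for four enrolled identities.
finger = [12.0, 48.0, 20.0, 15.0]
speech = [0.30, 0.90, 0.35, 0.60]
fused = fuse(finger, speech)
print(accept(fused, threshold=0.6))
```

In this toy case only the candidate at index 1 is accepted for any threshold between roughly 0.3 and 1.0, which is the kind of wide operating range the abstract refers to.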

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Learning Disability (LD) is a classification that includes several disorders in which a child has difficulty learning in a typical manner, usually caused by an unknown factor or factors. LD affects about 15% of children enrolled in schools. Predicting learning disability is a complicated task, since LD must be identified from diverse features or signs. There is no cure for learning disabilities, and they are life-long. The problems of children with specific learning disabilities have been a cause of concern to parents and teachers for some time. The aim of this paper is to develop a new algorithm for imputing missing values and to determine the significance of the missing-value imputation method and the dimensionality reduction method for the performance of fuzzy and neuro-fuzzy classifiers, with specific emphasis on the prediction of learning disabilities in school-age children. In the basic assessment method for predicting LD, checklists are generally used; the data collected in this way depend heavily on the mood of the children and may contain redundant as well as missing values. Therefore, in this study we propose a new correlation-based algorithm for imputing the missing values, and Principal Component Analysis (PCA) for removing irrelevant attributes. We find that the preprocessing methods we applied improve the quality of the data and thereby increase the accuracy of the classifiers. The system is implemented in MathWorks MATLAB 7.10. The results of this study show that the developed missing-value imputation method is a valuable contribution to the prediction system and is capable of improving the performance of a classifier.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Learning Disability (LD) is a neurological condition that affects a child's brain and impairs the child's ability to carry out one or more specific tasks. LD affects about 15% of children enrolled in schools. Predicting LD is a vital and intricate task. The aim of this paper is to design an effective and powerful tool, using two intelligent methods, namely an Artificial Neural Network and an Adaptive Neuro-Fuzzy Inference System, for measuring the percentage of LD affecting school-age children. In this study, we propose soft computing methods for data preprocessing in order to improve the accuracy of the tool as well as of the classifier. The data preprocessing uses Principal Component Analysis for attribute reduction and a closest fit algorithm for imputing missing values. The main idea in developing the LD prediction tool is not only to predict the presence of LD in children but also to measure its percentage along with its class (low, minor or major). The system is implemented in MathWorks MATLAB 7.10. The results of this study show that the designed prediction tool is capable of measuring LD effectively.
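One common reading of closest fit imputation is to copy each missing value from the most similar case that has that attribute recorded. The sketch below follows that reading; the distance measure, the function names and the checklist data are assumptions, and the paper's own MATLAB implementation may differ.

```python
def distance(a, b):
    """Mean absolute difference over the attributes known in both cases."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return float("inf")
    return sum(abs(x - y) for x, y in shared) / len(shared)


def closest_fit_impute(cases):
    """Fill each None with the value from the most similar case that has it."""
    filled = [list(case) for case in cases]
    for i, case in enumerate(cases):
        for j, value in enumerate(case):
            if value is not None:
                continue
            # Candidate donors: other cases with attribute j recorded.
            donors = [c for k, c in enumerate(cases) if k != i and c[j] is not None]
            if donors:
                best = min(donors, key=lambda c: distance(case, c))
                filled[i][j] = best[j]
    return filled


# Hypothetical checklist answers (1 = sign present, 0 = absent, None = missing).
checklist = [
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [1, None, 0, 1],
    [0, 0, None, 0],
]
print(closest_fit_impute(checklist))
```

Here the third case borrows its missing answer from the first case and the fourth from the second, because those are the cases it agrees with on every shared attribute.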

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This paper presents a robust Content-Based Video Retrieval (CBVR) system. The system retrieves similar videos based on a local feature descriptor called SURF (Speeded Up Robust Features). The high dimensionality of SURF-like feature descriptors causes huge storage consumption when video information is indexed. To reduce the dimensionality of the SURF feature descriptor, the system employs a stochastic dimensionality reduction method and thus produces model data for the videos. On retrieval, the model data of the test clip is matched to similar videos using a minimum distance classifier. The performance of the system is evaluated using two different minimum distance classifiers during the retrieval stage. The experimental analyses show that the system has a retrieval performance of 78%. The paper also analyses the performance efficiency of the low-dimensional SURF descriptor.
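The retrieval step can be sketched as a minimum distance ranking with a pluggable metric. The abstract does not name the two classifiers that were compared, so the Euclidean and city-block distances below are assumptions, as are the video names and model vectors.

```python
from math import dist


def city_block(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_videos(models, query, metric=dist):
    """Return video ids ordered from the closest to the farthest model vector."""
    return sorted(models, key=lambda vid: metric(models[vid], query))


# Hypothetical low-dimensional model vectors, one per indexed video.
models = {
    "news": (0.9, 0.1, 0.2),
    "sports": (0.2, 0.8, 0.7),
    "cartoon": (0.1, 0.2, 0.9),
}
clip = (0.25, 0.7, 0.6)
print(rank_videos(models, clip)[0])
print(rank_videos(models, clip, metric=city_block)[0])
```

Passing the metric as a parameter makes it straightforward to evaluate the same index under different distance rules, as the paper does for its two classifiers.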

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This paper attempts to determine accurately the number of Premature Ventricular Contraction (PVC) cycles in a given electrocardiogram (ECG) using a wavelet constructed from multiple Gaussian functions. It is difficult to assess the ECGs of patients who are monitored continuously over a long period of time. The proposed classification method will therefore help doctors determine the severity of PVC in a patient. Principal Component Analysis (PCA) and a simple classifier are used in addition to the specially developed wavelet transform. The proposed wavelet is designed using multiple Gaussian functions which, when summed, resemble a normal ECG. The number of Gaussians used depends on the number of peaks present in a normal ECG. The developed wavelet satisfies all the properties of a traditional continuous wavelet. The new wavelet was optimized using a genetic algorithm (GA). ECG records from the Massachusetts Institute of Technology-Beth Israel Hospital (MIT-BIH) database were used for validation. On the 8694 ECG cycles used for evaluation, the classification algorithm achieved an accuracy of 97.77%. To compare the performance of the new wavelet, classification was also performed using standard wavelets such as Morlet, Meyer, bior3.9, db5, db3, sym3 and Haar. The new wavelet outperforms the rest.
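The construction of an ECG-shaped waveform from summed Gaussians can be sketched as follows. The amplitudes, centres and widths are hypothetical values chosen by hand, whereas the paper optimises its wavelet with a genetic algorithm; removing the mean and normalising the energy are added here as the usual requirements for a wavelet and are not taken from the paper.

```python
from math import exp, sqrt


def gaussian_sum(t, components):
    """Sum of Gaussian bumps; each component is (amplitude, centre, width)."""
    return sum(a * exp(-((t - c) ** 2) / (2.0 * w ** 2)) for a, c, w in components)


def make_wavelet(components, n=512, span=1.0):
    """Sample the Gaussian sum, then remove the mean and normalise the energy."""
    step = span / (n - 1)
    raw = [gaussian_sum(i * step, components) for i in range(n)]
    mean = sum(raw) / n
    centred = [v - mean for v in raw]  # zero mean on the sampling grid
    energy = sqrt(sum(v * v for v in centred) * step)
    return [v / energy for v in centred]  # unit energy on the sampling grid


# Hypothetical (amplitude, centre, width) values for the P, Q, R, S and T waves.
ecg_like = [
    (0.15, 0.20, 0.025),
    (-0.10, 0.37, 0.010),
    (1.00, 0.40, 0.012),
    (-0.20, 0.43, 0.010),
    (0.30, 0.65, 0.040),
]
psi = make_wavelet(ecg_like)
print(max(range(len(psi)), key=psi.__getitem__))  # index of the largest sample
```

One Gaussian per peak mirrors the statement that the number of Gaussians depends on the number of peaks in a normal ECG.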

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Consider the linear regression model y = X b + e under the usual conditions, and assume further that the parameter vector lies in an ellipsoid. An optimal estimator for the parameter vector is given by the minimax estimator. Following the decision-theoretic formulation of the minimax estimation problem, three routes to determining the minimax estimator are presented and related to one another: the Bayesian approach, spectral methods, and the representation of Hoffmann and Laeuter. An examination of models with three regressors and a common eigenvector leads to a structuring of the problem according to the multiplicity of the largest eigenvalue. Determining the minimax estimator in a case that has not yet been solved can be reduced to finding a zero of a nonlinear real-valued function. An example is found in which the zero cannot be expressed in radicals. The zero can be determined numerically by the nested-interval principle (bisection) or by Newton's method. By developing a fixed-point equation from the representation of Hoffmann and Laeuter, it was possible to find the desired solutions in a simulation.
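The two numerical root-finding routes mentioned above can be sketched as follows. The nonlinear function from the thesis is not given in the abstract, so the quintic x^5 - x - 1 is used purely as a stand-in: it is a standard example of a polynomial whose real zero cannot be expressed in radicals.

```python
def bisect(f, lo, hi, tol=1e-12):
    """Nested-interval search; requires f(lo) and f(hi) to differ in sign."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid  # the sign change lies in the left half
        else:
            lo, f_lo = mid, f_mid  # the sign change lies in the right half
    return (lo + hi) / 2.0


def newton(f, df, x0, tol=1e-12, max_iter=50):
    """Newton's method; convergence depends on a good starting point x0."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge")


# Illustrative function only (not the one from the thesis).
def f(x):
    return x ** 5 - x - 1


def df(x):
    return 5 * x ** 4 - 1


root_bisect = bisect(f, 1.0, 2.0)
root_newton = newton(f, df, x0=1.5)
print(root_bisect, root_newton)
```

Bisection needs only a sign change and always converges, while Newton's method needs the derivative and a reasonable starting value but typically gets there in far fewer iterations.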

Relevance:

10.00% 10.00%

Publisher:

Abstract:

The main focus of this work is the realisation of an application-oriented concept for fostering stochastic competencies in mathematics teaching that relate to decision-making and judgement under uncertainty. Of central importance is the everyday-relevant competence of dealing with problems involving conditional probabilities and applications of Bayes' theorem, which is referred to in a broad sense as "Bayesian reasoning". The historical and theoretical basis of the work is findings from cognitive psychology on human judgement under uncertainty: intuitive forms of probabilistic thinking are based on frequency conceptions (e.g. Piaget & Inhelder, 1975; Gigerenzer, 1991). My didactic analyses showed, however, that in conventional stochastics teaching, after a frequency-based introduction of the concept of probability (which, as is well known, admits many interpretations), dealing with uncertainty is taught only on the basis of numerical formats for probabilities (e.g. percentages, decimal fractions) and the corresponding rules. In my opinion, students' basic intuitions are thus unfortunately given insufficient attention. The "didactic concept of natural frequencies" developed in detail in this work therefore proposes consistently modelling probabilistic problems with frequency representations. On the basis of empirical laboratory findings and didactic analyses, a teaching unit "Authentisches Bewerten und Urteilen unter Unsicherheit" (Authentic Evaluation and Judgement under Uncertainty) for lower secondary level (Sekundarstufe I) was developed as part of this work (Wassner, Biehler, Schweynoch & Martignon, 2004, also published as Volume 5 of the KaDiSto series). On the one hand, the "didactic concept of natural frequencies" was implemented; on the other hand, an approach with strong real-world relevance was realised, in which so-called "more general educational aspects" such as preparation for life, independent problem-solving ability, critical use of reason, creation of meaning and motivational factors received substantial attention. The unit was also implemented as part of this work at lower secondary level (five 9th-grade classes, Gymnasium), and the course of instruction was then evaluated and analysed in detail. This work is the author's doctoral dissertation, which was supervised by Rolf Biehler at the University of Kassel. It is identical to the first publication in 2004 by Franzbecker Verlag, Hildesheim, which has consented to electronic publication as part of KaDiSto.

Relevance:

10.00% 10.00%

Publisher:

Abstract:

This teaching unit is based on two fundamental ideas about learning and teaching probability for beginners at lower secondary level (Sekundarstufe I). First, the authors are convinced that meaningful and worthwhile teaching in stochastics should take the more demanding route of applications in everyday life that are as authentic and concrete as possible. Accordingly, dressing up stochastic problems in realistic-looking contexts is not enough; instead, authentic problems should be worked through intensively, e.g. with the help of real media texts. Above all, students should learn to model real problems mathematically and to interpret and critically discuss the mathematical results obtained with respect to the real situation. A further distinctive feature compared with traditional approaches to probability is based on findings from cognitive psychology on human information processing. A series of studies has shown that people, and of course students too, have great difficulty dealing with probabilities (i.e. measures normalised to 1). The cognitive processing of frequencies (or ratios of natural numbers) turned out to be much easier and more conducive to understanding. This unit therefore dispenses with a traditional formal introduction of Bayes' rule and instead uses special frequency-based tools for finding solutions. The studies mentioned demonstrate the advantage of these frequency representations over traditional methods with regard to immediate and, in particular, longer-term learning success (for a comprehensive treatment of this topic, see C. Wassner (2004). Förderung Bayesianischen Denkens, Hildesheim: Franzbecker, http://nbn-resolving.org/urn:nbn:de:hebis:34-2006092214705). This text was first published in 2004 as an appendix to the above-mentioned work by Franzbecker, Hildesheim. The publisher has consented to electronic publication in the KaDiSto series.
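The frequency-based way of solving a Bayesian problem can be sketched as follows. The screening scenario and all of its numbers are hypothetical and are not taken from the teaching unit; the function simply splits a concrete population into counts, so that the answer is read off as a ratio of natural numbers rather than computed with the formal rule.

```python
from fractions import Fraction


def natural_frequency_tree(population, base_rate, hit_rate, false_alarm_rate):
    """Split a concrete population into counts instead of probabilities."""
    affected = population * base_rate
    unaffected = population - affected
    true_pos = affected * hit_rate
    false_pos = unaffected * false_alarm_rate
    return {
        "affected": affected,
        "unaffected": unaffected,
        "true positives": true_pos,
        "false positives": false_pos,
        "P(affected | positive)": true_pos / (true_pos + false_pos),
    }


# Hypothetical screening problem: 1000 people, 1% affected,
# 80% of the affected and 10% of the unaffected test positive.
tree = natural_frequency_tree(
    population=1000,
    base_rate=Fraction(1, 100),
    hit_rate=Fraction(8, 10),
    false_alarm_rate=Fraction(1, 10),
)
for name, value in tree.items():
    print(f"{name}: {value}")
```

Read as counts, 8 of the 107 people who test positive are actually affected, which gives the same result as Bayes' rule without manipulating normalised probabilities.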

Relevance:

10.00% 10.00%

Publisher:

Abstract:

Data mining means summarizing information from large amounts of raw data. It is one of the key technologies in many areas of the economy, science, administration and the internet. In this report we introduce an approach that uses evolutionary algorithms to breed fuzzy classifier systems. This approach was carried out as part of a structured procedure by the students Achler, Göb and Voigtmann as a contribution to the 2006 Data-Mining-Cup contest, yielding encouragingly positive results.